Add functionality: when the script finds a proper name it also makes a record of it somewhere #6

Open
danielboniface opened this issue Sep 29, 2015 · 3 comments

Comments

@danielboniface

The idea is to begin compiling, for each player and team, a database of the stories (canonical URLs only) that mention them.

That way, once we have this list, we can do something useful with it.

@freejoe76 freejoe76 self-assigned this Sep 29, 2015
@freejoe76
Contributor

How data ingestion will work with minimal system overhead:

  1. Every article load will send a request to a js file, passing it the article URL and an object containing the noun-matched items.
  2. The backend script will hash the URL and look for a file with that name in its filesystem. If the file doesn't exist, the script will create it and store the noun-matched items and the submitting URL in it. It will also append the hash to a list of queued hashes.
  3. A daemon will watch the queue list and process the noun-matched items in each queued file, adding them to the database.
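
A minimal sketch of steps 2 and 3, in Python for illustration only. The function names, the SHA-1 hash, the JSON file format, and the `queue.txt` filename are all assumptions; nothing in this thread fixes them. The daemon step here only returns (noun, URL) pairs and leaves the actual database insert to the caller.

```python
from __future__ import annotations

import hashlib
import json
from pathlib import Path


def record_article(data_dir: Path, url: str, nouns: list[str]) -> bool:
    """Store the noun matches for a URL once; return True if it was newly queued."""
    data_dir.mkdir(parents=True, exist_ok=True)
    # Hash choice (SHA-1) is an assumption; any stable hash of the URL would do.
    url_hash = hashlib.sha1(url.encode("utf-8")).hexdigest()
    record_path = data_dir / f"{url_hash}.json"
    if record_path.exists():
        return False  # already recorded, so repeat article loads cost one file lookup
    record_path.write_text(json.dumps({"url": url, "nouns": nouns}))
    # Append the hash to the list of queued hashes (hypothetical filename).
    with (data_dir / "queue.txt").open("a") as queue:
        queue.write(url_hash + "\n")
    return True


def drain_queue(data_dir: Path) -> list[tuple[str, str]]:
    """Daemon step: collect (noun, url) pairs for every queued hash, then empty the queue."""
    queue_path = data_dir / "queue.txt"
    if not queue_path.exists():
        return []
    pairs = []
    for url_hash in queue_path.read_text().split():
        record = json.loads((data_dir / f"{url_hash}.json").read_text())
        pairs.extend((noun, record["url"]) for noun in record["nouns"])
    queue_path.write_text("")
    return pairs
```

Because the hash file doubles as a "seen" marker, a URL is only queued the first time it is submitted. A real daemon would also need to guard against the queue being appended to while it is being drained; this sketch ignores that.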

@freejoe76
Contributor

How the publishing of these lists will work:

  1. Lists of related articles for each proper noun will be published at, say, http://extras.denverpost.com/app/nouner/related/broncos/peyton-manning.html for Peyton Manning.
  2. These lists will also be available in JavaScript form at a similar URL.
  3. Once a day, a cron job will run a backend script that pulls the article information from the database, assembles those pages, and writes them to flat files.

The database should include the proper noun, the article title, and the date the article first appeared. The date field is questionable, since some old articles will have a new date assigned to them.
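
A rough sketch of the table and the daily publishing step, in Python with SQLite for illustration only. The `mentions` table, its `team` and `url` columns, the `slugify` and `publish` helpers, and the `var related = ...` shape of the JavaScript file are assumptions; only the proper noun, title, and first-appeared date fields and the `team/name.html` URL pattern come from the plan above.

```python
import json
import re
import sqlite3
from html import escape
from pathlib import Path

# Hypothetical schema: noun, title, and first_seen are from the plan;
# team and url are added so the pages can be grouped and linked.
SCHEMA = """
CREATE TABLE IF NOT EXISTS mentions (
    team TEXT NOT NULL,
    noun TEXT NOT NULL,
    url TEXT NOT NULL,
    title TEXT NOT NULL,
    first_seen TEXT NOT NULL,
    UNIQUE (noun, url)
)
"""


def slugify(name: str) -> str:
    """Lowercase a name and join its alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def publish(db: sqlite3.Connection, out_dir: Path) -> list:
    """Write one HTML and one JS flat file per proper noun; return the paths written."""
    written = []
    groups = db.execute(
        "SELECT DISTINCT team, noun FROM mentions ORDER BY team, noun"
    ).fetchall()
    for team, noun in groups:
        rows = db.execute(
            "SELECT url, title, first_seen FROM mentions "
            "WHERE team = ? AND noun = ? ORDER BY first_seen DESC",
            (team, noun),
        ).fetchall()
        team_dir = out_dir / slugify(team)
        team_dir.mkdir(parents=True, exist_ok=True)
        items = "\n".join(
            f'<li><a href="{escape(url)}">{escape(title)}</a> ({escape(seen)})</li>'
            for url, title, seen in rows
        )
        html_path = team_dir / f"{slugify(noun)}.html"
        html_path.write_text(f"<h1>{escape(noun)}</h1>\n<ul>\n{items}\n</ul>\n")
        articles = [{"url": u, "title": t, "first_seen": s} for u, t, s in rows]
        js_path = team_dir / f"{slugify(noun)}.js"
        js_path.write_text(f"var related = {json.dumps(articles)};\n")
        written += [html_path, js_path]
    return written
```

The `UNIQUE (noun, url)` constraint is one way to keep `first_seen` stable: if a row is only inserted the first time a noun/URL pair is seen, a later re-dating of the article would not overwrite it.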

@freejoe76
Contributor

Note: If we move this noun-matching functionality to the backend, this won't work.

@freejoe76 freejoe76 removed their assignment Feb 6, 2017