Github mirror of "research/article-recommender/deploy" - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access for contributing)

wikimedia/research-article-recommender-deploy

Importing Wikidata item normalized ranks to MySQL

Requirements

On Debian, install the python-mysql.connector package.

How to import data

Test data can be found here.

Assuming TSV files are in the current directory, run the following commands in order:

  1. Import languages (recommendationapi_password.txt must contain the MySQL password for the user ‘recommendationapi’):

    python deploy.py import_languages 20181130 m2-master.eqiad.wmnet \
        3306 recommendationapi recommendationapi recommendationapi_password.txt \
        --language_file languages.tsv

  2. Import normalized ranks (make sure to change the source and target languages when importing other files):

    python deploy.py import_normalized_ranks 20181130 m2-master.eqiad.wmnet \
        3306 recommendationapi recommendationapi recommendationapi_password.txt \
        --normalized_ranks_file predictions-06032018-20181130/en-es.tsv \
        --source_language en --target_language es

  3. Create views so that the recommendation API can access the data:

    python deploy.py create_views 20181130 m2-master.eqiad.wmnet \
        3306 recommendationapi recommendationapi recommendationapi_password.txt

    (This doesn’t always work. TODO: fix)

  4. Optional. If you want to delete old data:

    python deploy.py cleanup 20181130 m2-master.eqiad.wmnet \
        3306 recommendationapi recommendationapi recommendationapi_password.txt
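
Step 2 has to be repeated for every language pair. The following is a minimal sketch of a helper that prints one import_normalized_ranks command per TSV file, so the commands can be reviewed before they are run. It is not part of deploy.py. The helper names (`parse_language_pair`, `build_import_command`) are hypothetical, and it assumes that every file is named `<source>-<target>.tsv`, as en-es.tsv is in the example above. The connection arguments are copied verbatim from the commands above. It needs Python 3.8 or later for `shlex.join`.

```python
import shlex
from pathlib import Path

# Positional arguments copied from the example commands in this README.
# Their order mirrors those examples; it is not checked against deploy.py.
COMMON_ARGS = [
    "20181130", "m2-master.eqiad.wmnet", "3306",
    "recommendationapi", "recommendationapi",
    "recommendationapi_password.txt",
]


def parse_language_pair(tsv_path):
    """Return (source, target) for a file named '<source>-<target>.tsv'.

    Assumption: neither language code contains a hyphen. Names with more
    than one hyphen are ambiguous and are rejected rather than guessed.
    """
    parts = Path(tsv_path).stem.split("-")
    if len(parts) != 2 or not all(parts):
        raise ValueError("cannot infer language pair from %r" % str(tsv_path))
    return parts[0], parts[1]


def build_import_command(tsv_path):
    """Build the argument list for one import_normalized_ranks run."""
    source, target = parse_language_pair(tsv_path)
    return [
        "python", "deploy.py", "import_normalized_ranks", *COMMON_ARGS,
        "--normalized_ranks_file", str(tsv_path),
        "--source_language", source,
        "--target_language", target,
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for tsv in sorted(Path("predictions-06032018-20181130").glob("*.tsv")):
        print(shlex.join(build_import_command(tsv)))
```

Printing instead of executing keeps the helper safe to run anywhere. File names that do not match the assumed pattern raise a ValueError and need to be imported by hand.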

Prepare environment for Oozie

  • SSH into stat1007 so that you get the same version of Python that the cluster has (this has not been verified).
  • Run ./bin/create-environment.sh
  • Copy over the generated zip file to the `artifacts` folder of analytics/refinery.
