Github mirror of "research/article-recommender/deploy" - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access for contributing)

README.org

Importing Wikidata item normalized ranks to MySQL

Requirements

On Debian: python-mysql.connector

How to import data

Test data can be found here.

Assuming TSV files are in the current directory, run the following commands in order:

  1. Import languages (recommendationapi_password.txt must contain the MySQL password for the user ‘recommendationapi’):

    python deploy.py import_languages 20181130 m2-master.eqiad.wmnet \
        3306 recommendationapi recommendationapi recommendationapi_password.txt \
        --language_file languages.tsv

  2. Import normalized ranks (make sure to change the source and target languages when importing other files):

    python deploy.py import_normalized_ranks 20181130 m2-master.eqiad.wmnet \
        3306 recommendationapi recommendationapi recommendationapi_password.txt \
        --normalized_ranks_file predictions-06032018-20181130/en-es.tsv \
        --source_language en --target_language es

  3. Create views so that the recommendation API can access the data:

    python deploy.py create_views 20181130 m2-master.eqiad.wmnet \
        3306 recommendationapi recommendationapi recommendationapi_password.txt

    (This doesn’t always work. TODO: fix)

  4. Optional. If you want to delete old data:

    python deploy.py cleanup 20181130 m2-master.eqiad.wmnet \
        3306 recommendationapi recommendationapi recommendationapi_password.txt
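
Step 2 has to be repeated once per language pair. The following is a minimal sketch (Python 3, not part of deploy.py) of how those commands could be generated. It assumes every file in the predictions directory is named <source>-<target>.tsv, as en-es.tsv is, and that neither language code contains a hyphen of its own. The helpers build_import_command and build_all are hypothetical names, and the positional arguments are copied verbatim from the commands above. It only prints the commands (a dry run) so they can be reviewed before anything touches the database.

```python
import shlex
from pathlib import Path

# Positional arguments copied verbatim from the commands in this README.
# Adjust them for your own deployment.
POSITIONAL_ARGS = [
    "20181130",
    "m2-master.eqiad.wmnet",
    "3306",
    "recommendationapi",
    "recommendationapi",
    "recommendationapi_password.txt",
]


def build_import_command(tsv_path):
    """Return the argument list for one import_normalized_ranks run.

    Assumption: the file name is <source>-<target>.tsv (e.g. en-es.tsv)
    and the source language code itself contains no hyphen.
    """
    path = Path(tsv_path)
    source, sep, target = path.stem.partition("-")
    if not (sep and source and target):
        raise ValueError("unexpected file name: " + path.name)
    return [
        "python", "deploy.py", "import_normalized_ranks",
        *POSITIONAL_ARGS,
        "--normalized_ranks_file", str(path),
        "--source_language", source,
        "--target_language", target,
    ]


def build_all(directory):
    """Build one command per *.tsv file in the directory, in sorted order."""
    return [build_import_command(p) for p in sorted(Path(directory).glob("*.tsv"))]


if __name__ == "__main__":
    # Dry run: print each command instead of executing it.
    for argv in build_all("predictions-06032018-20181130"):
        print(shlex.join(argv))
```

File names that do not match the assumed pattern raise a ValueError rather than being imported under the wrong languages.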

Prepare environment for Oozie

  • SSH into stat1007 so that you get the same Python version as the cluster (is that true?)
  • ./bin/create-environment.sh
  • Copy over the generated zip file to the `artifacts` folder of analytics/refinery.