
wikimedia/research-article-recommender


Article-recommender

Recommend Wikipedia articles for creation

Requirements

  • `pyspark` (assumed to be available on the cluster)

Output

The output looks something like this:

wikidata_id  normalized_rank
Q125576      0.930232
Q127418      0.928457
Q125576      0.927625
Q226697      0.919053
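
As a rough illustration of how such output could be consumed, here is a minimal sketch that parses the two columns and picks the highest-ranked item. It assumes the output is tab-separated with a header row, which this README does not state; the `read_recommendations` helper is hypothetical and not part of the package.

```python
import csv
import io

# Sample rows copied from the output above.
# ASSUMPTION: columns are tab-separated and the first row is a header.
SAMPLE = (
    "wikidata_id\tnormalized_rank\n"
    "Q125576\t0.930232\n"
    "Q127418\t0.928457\n"
    "Q125576\t0.927625\n"
    "Q226697\t0.919053\n"
)


def read_recommendations(stream):
    """Return a list of (wikidata_id, normalized_rank) tuples (hypothetical helper)."""
    reader = csv.DictReader(stream, delimiter="\t")
    return [(row["wikidata_id"], float(row["normalized_rank"])) for row in reader]


rows = read_recommendations(io.StringIO(SAMPLE))
# Pick the row with the largest normalized rank.
top = max(rows, key=lambda row: row[1])
print(top)
```

To read a real file instead of the sample, pass an open file handle to `read_recommendations`.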

Documentation

Code documentation adheres to the Google Style Python Docstrings[fn:1].

For other documentation see the doc folder.
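
For readers unfamiliar with the convention, a Google-style docstring looks roughly like the sketch below. The function `normalize_ranks` is a made-up example for illustration only; it is not taken from this repository and does not describe how the recommender computes `normalized_rank`.

```python
def normalize_ranks(ranks):
    """Scale raw scores so that the largest one becomes 1.0.

    Hypothetical example used only to show the docstring layout.

    Args:
        ranks (list of float): Non-negative raw scores.

    Returns:
        list of float: Each score divided by the maximum score.

    Raises:
        ValueError: If ``ranks`` is empty or its maximum is not positive.
    """
    if not ranks or max(ranks) <= 0:
        raise ValueError("ranks must contain at least one positive score")
    largest = max(ranks)
    return [score / largest for score in ranks]
```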

[fn:1] https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html

How to package and upload to PyPI

  • From the root directory run:
    • python3 setup.py sdist bdist_wheel
  • Generated files will be in ./dist/
  • Upload to PyPI:
    • twine upload dist/*
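
The two steps above could be wrapped in a small script, sketched here under the assumption that `twine` is installed and on the PATH. The helper names (`build_command`, `upload_command`, `release`) are hypothetical and not part of this repository.

```python
import glob
import os
import subprocess
import sys


def build_command():
    """Command equivalent to `python3 setup.py sdist bdist_wheel`."""
    return [sys.executable, "setup.py", "sdist", "bdist_wheel"]


def upload_command(dist_dir="dist"):
    """Command equivalent to `twine upload dist/*`, with the glob expanded here."""
    files = sorted(glob.glob(os.path.join(dist_dir, "*")))
    if not files:
        raise FileNotFoundError("no built distributions found in " + dist_dir)
    return ["twine", "upload"] + files


def release(dry_run=True):
    """Build, then upload. With dry_run=True the commands are only printed."""
    build = build_command()
    print(" ".join(build))
    if not dry_run:
        subprocess.run(build, check=True)
    # Expand dist/* only after the build step, so new files are picked up.
    upload = upload_command()
    print(" ".join(upload))
    if not dry_run:
        subprocess.run(upload, check=True)
```

Calling `release()` from the root directory prints the commands without running them; `release(dry_run=False)` executes them and stops at the first failure.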

How to test changes

  • Make changes, commit, upload to Gerrit.
  • Clone your patch to stat1007.
  • cd article-recommender
  • Optional: edit article_recommender/recommend.py and set TRAIN_RANGE_DAYS to 10, and TOP_LANGUAGES_COUNT to 1 for faster train times.
  • Generate recommendations, e.g.:
    spark2-submit --master yarn --deploy-mode client \
        article_recommender/recommend.py kk uz 20190401
  • If the job fails, find the application ID in the logs and use it to look up the cause of the error, e.g.: yarn logs -applicationId application_1553764233554_53161
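
Finding the application ID in a long log can be tedious, so the last step might be partly automated as sketched below. The sketch assumes that application IDs follow the `application_<digits>_<digits>` shape seen in the example above; the sample log line and both helper functions are hypothetical.

```python
import re

# ASSUMPTION: IDs look like application_<digits>_<digits>, as in the example above.
APP_ID_PATTERN = re.compile(r"application_\d+_\d+")


def find_application_ids(log_text):
    """Return the distinct application IDs in log_text, in order of appearance."""
    found = []
    for app_id in APP_ID_PATTERN.findall(log_text):
        if app_id not in found:
            found.append(app_id)
    return found


def yarn_logs_command(app_id):
    """Build the `yarn logs` command line for one application ID."""
    return ["yarn", "logs", "-applicationId", app_id]


# Made-up log line for illustration; real spark2-submit output will differ.
sample_log = "INFO Submitted application application_1553764233554_53161\n"
ids = find_application_ids(sample_log)
command = yarn_logs_command(ids[0])
print(" ".join(command))
```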

About

Github mirror of "research/article-recommender" - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access for contributing)
