CurbSafeCharmer/refill

reFill 2

reFill fixes bare URLs on Wikipedia articles, semi-automatically. It extracts bibliographical information from web pages referenced by bare URL citations, generates complete references and finally inserts them back onto the original page.
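The idea can be illustrated with a minimal sketch. This is not reFill's actual implementation: the regular expression, the function name `fill_bare_refs`, and the `fetch_metadata` callable are all assumptions made for illustration, and the real tool parses wikicode and page metadata far more carefully.

```python
import re

# Matches <ref>URL</ref> where the body is nothing but a URL (a "bare" reference).
BARE_REF = re.compile(r"<ref>\s*(https?://[^\s<\]|]+)\s*</ref>", re.IGNORECASE)


def fill_bare_refs(wikitext, fetch_metadata):
    """Replace each bare-URL <ref> with a {{cite web}} template.

    fetch_metadata is a hypothetical callable mapping a URL to a dict with
    optional "title" and "website" keys, or None if nothing was extracted.
    """
    def replace(match):
        url = match.group(1)
        meta = fetch_metadata(url)
        if not meta or not meta.get("title"):
            # No usable metadata: leave the reference exactly as it was.
            return match.group(0)
        fields = [f"url={url}", f"title={meta['title']}"]
        if meta.get("website"):
            fields.append(f"website={meta['website']}")
        return "<ref>{{cite web |" + " |".join(fields) + "}}</ref>"

    return BARE_REF.sub(replace, wikitext)
```

References that are not bare URLs, and bare URLs for which no title could be found, are left unchanged, so a human can review the result before saving.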

This README gives you all the details needed to set up a reFill instance for testing and development. If you only intend to use reFill, the manual may be more helpful.

Quick start

To set up a local instance, run the following steps in order:

  1. git clone https://github.com/CurbSafeCharmer/refill
  2. cd refill
  3. make setup
  4. make start
  5. Voila! reFill is now running on your machine.

Overview

The tool consists of three parts:

  • APIs: A set of APIs that allow the user to submit tasks and retrieve their results. Tasks may be initiated by a human user or a script.
  • Workers: Long-running processes that complete tasks received through the broker, orchestrated by Celery.
  • Web UI: A single-page web app powered by Vue.js. It's the reference implementation of an API consumer.
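The submit-then-poll flow between an API consumer and a worker can be sketched with the standard library alone. This is only an illustration of the pattern: the class `InMemoryTaskService` and its methods are hypothetical, an in-process queue and thread stand in for the broker and the Celery worker, and the state names merely mirror Celery's built-in `PENDING`, `SUCCESS`, and `FAILURE` states.

```python
import queue
import threading
import uuid


class InMemoryTaskService:
    """Stand-in for the API, broker, and result store in one process."""

    def __init__(self, handler):
        self._handler = handler          # the work a worker would perform
        self._tasks = queue.Queue()      # plays the role of the broker
        self._results = {}               # plays the role of the result backend
        self._lock = threading.Lock()
        threading.Thread(target=self._work, daemon=True).start()

    def submit(self, payload):
        """Queue a task and return its ID immediately."""
        task_id = str(uuid.uuid4())
        with self._lock:
            self._results[task_id] = {"state": "PENDING", "result": None}
        self._tasks.put((task_id, payload))
        return task_id

    def status(self, task_id):
        """Return a snapshot of the task's state and result."""
        with self._lock:
            return dict(self._results[task_id])

    def wait_idle(self):
        """Block until every queued task has been processed."""
        self._tasks.join()

    def _work(self):
        # Long-running worker loop: take a task, run it, store the outcome.
        while True:
            task_id, payload = self._tasks.get()
            try:
                outcome = {"state": "SUCCESS", "result": self._handler(payload)}
            except Exception as exc:
                outcome = {"state": "FAILURE", "result": str(exc)}
            with self._lock:
                self._results[task_id] = outcome
            self._tasks.task_done()
```

A consumer such as the Web UI or a script would call `submit`, keep the returned ID, and poll `status` until the state is no longer `PENDING`.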

Navigating the source

Most of the interesting work happens in backend/refill, where you can find the individual parsers that extract information from web pages.

  • backend: APIs and worker
    • app.py: Flask-based APIs
    • refill
      • dataparsers: Metadata parsers
      • formatters: Wikicode generators
      • transforms: Wikicode transformations
        • fillref.py: Complete bare references
        • fillexternal.py: Complete bare external links
        • mergeref.py: Merge duplicate citations
      • models: Models
        • citation.py: Citation
        • context.py: Task context
      • utils: Utilities
  • web: Web UI
    • libs: External libraries
      • wdiff.js: A modified version of wDiff by User:Cacycle, with additional code to display reFill markers
    • src: Source code
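To give a feel for what a wikicode transformation such as mergeref.py does, here is a minimal sketch of merging duplicate citations into named references. It is an assumption-laden illustration, not the code in the repository: the function `merge_duplicate_refs`, the `name_prefix` parameter, and the regex-based matching are all hypothetical.

```python
import re

# Matches unnamed references only; <ref name="..."> tags are left alone.
REF = re.compile(r"<ref>(.*?)</ref>", re.DOTALL)


def merge_duplicate_refs(wikitext, name_prefix="auto"):
    """Turn repeated identical <ref> bodies into one named ref plus reuses."""
    # First pass: count how often each reference body occurs.
    counts = {}
    for match in REF.finditer(wikitext):
        body = match.group(1).strip()
        counts[body] = counts.get(body, 0) + 1

    names = {}

    def replace(match):
        body = match.group(1).strip()
        if counts[body] < 2:
            return match.group(0)  # unique citation: keep as is
        if body not in names:
            # First occurrence keeps the full body and receives a name.
            names[body] = f"{name_prefix}{len(names) + 1}"
            return f'<ref name="{names[body]}">{body}</ref>'
        # Later occurrences become self-closing reuses of that name.
        return f'<ref name="{names[body]}" />'

    return REF.sub(replace, wikitext)
```

Two passes are used so that a citation appearing only once is never given a name it does not need.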

Hints

Result expiration

By default, task results in Celery expire after one day. If you are using a database result backend, be sure to keep celery beat running so that expired results are actually cleared. This is especially important if you are running a public instance.
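As a rough configuration sketch, the expiry window is controlled by Celery's `result_expires` setting. The app name, broker URL, and backend URL below are placeholders chosen for illustration and are not taken from reFill's own configuration.

```python
from datetime import timedelta

from celery import Celery

# Hypothetical app name and URLs; substitute the values your instance uses.
app = Celery(
    "refill_sketch",
    broker="redis://localhost:6379/0",
    backend="db+sqlite:///results.sqlite3",
)

# How long a stored task result is kept before it becomes eligible for
# cleanup. One day is Celery's default; it is set explicitly here for clarity.
app.conf.result_expires = timedelta(days=1)

# With a database backend, expired rows are only deleted by the built-in
# celery.backend_cleanup periodic task, which celery beat schedules:
#   celery -A <your_module> beat
```

Without a running beat process, the setting alone does not remove anything from a database backend, and the results table keeps growing.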

Contributing

Patches are always welcome! To contribute, simply create a fork of the repo, make your changes and submit a pull request. Your contributions are appreciated. It would be great to have some new maintainers!

Please report issues on Phabricator (https://phabricator.wikimedia.org/project/board/5013/).

Localization of the tool is powered by Intuition and handled on translatewiki.net. To start translating the tool, please register at translatewiki.net and request to become a Translator. You can also submit your translations manually via GitHub pull requests or even on-wiki.

Licensing

reFill is licensed under the BSD 2-Clause License. See LICENSE for details.

External libraries

This program uses wDiff by Cacycle, released into the public domain.

Licenses of NPM and PyPI dependencies may be viewed using third-party tools including license-checker (for NPM) and python-license-check (for PyPI).