Processor for page revisions in XML dumps #12

Closed
mkroetzsch opened this issue Feb 18, 2014 · 0 comments

A component should be provided for processing the page revisions found in one or more dump files in MediaWiki XML format. The component should process a file given as a Java file handler (not as a path!) and iterate over the revisions in that file. Other components that process revisions should be able to register as listeners, which are called back for every new revision. The iterator should also be able to parse multiple files (the latest dump plus the incremental daily dumps), obtained from the dump file management component of Issue #8.
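
To make the request concrete, here is a rough sketch of what such a processor could look like. It is only an illustration under assumptions: the names `RevisionListener`, `RevisionDumpProcessor`, `registerListener`, `processDump`, and `processDumps` are hypothetical and not an existing Wikidata Toolkit API, the "file handler" is assumed to be a plain `InputStream`, and the JDK's StAX parser is assumed as the XML reader. Only the page title, revision id, and revision text are extracted.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical callback interface (name and signature are assumptions). */
interface RevisionListener {
    void onRevision(String pageTitle, long revisionId, String text);
}

/** Hypothetical processor that streams revisions to registered listeners. */
class RevisionDumpProcessor {
    private final List<RevisionListener> listeners = new ArrayList<>();

    void registerListener(RevisionListener listener) {
        listeners.add(listener);
    }

    /** Processes several dumps in the given order, e.g. one dump followed by daily incrementals. */
    void processDumps(Iterable<? extends InputStream> dumps) throws XMLStreamException {
        for (InputStream dump : dumps) {
            processDump(dump);
        }
    }

    /** Reads one dump from a stream (not a path) and calls every listener once per revision. */
    void processDump(InputStream dump) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false); // dumps need no DTD processing
        XMLStreamReader reader = factory.createXMLStreamReader(dump);
        String pageTitle = null;
        long revisionId = -1;
        String text = "";
        boolean inRevision = false;
        boolean inContributor = false;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("title")) {
                        pageTitle = reader.getElementText();
                    } else if (name.equals("revision")) {
                        inRevision = true;
                        revisionId = -1;
                        text = "";
                    } else if (name.equals("contributor")) {
                        inContributor = true;
                    } else if (name.equals("id") && inRevision && !inContributor) {
                        // <id> also occurs under <page> and <contributor>; only the revision id is kept
                        revisionId = Long.parseLong(reader.getElementText().trim());
                    } else if (name.equals("text") && inRevision) {
                        text = reader.getElementText();
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("contributor")) {
                        inContributor = false;
                    } else if (name.equals("revision")) {
                        inRevision = false;
                        for (RevisionListener listener : listeners) {
                            listener.onRevision(pageTitle, revisionId, text);
                        }
                    }
                }
            }
        } finally {
            reader.close(); // closes the parser only; the caller still owns the stream
        }
    }
}
```

A consumer would register a listener, for example `processor.registerListener((title, id, text) -> ...)`, and then pass in the streams handed out by the dump file management component. Streaming with StAX rather than building a DOM keeps memory use independent of dump size, which seems necessary for full dumps.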

@mkroetzsch mkroetzsch added this to the Wikidata Toolkit 0.1 milestone Feb 18, 2014
@mkroetzsch mkroetzsch self-assigned this Mar 18, 2014