There should be a component to download and manage dump files in the format provided for Wikidata.org. It should access dumps from a specified location, find out which dumps are available, and fetch dumps as needed. Relevant types of dumps (current revisions, full, daily) should be distinguished and treated suitably. The component should provide access to any of these files transparently (without requiring accessing components to know about their location or compression format).
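The "transparent access" requirement could be approached by choosing a decompressing stream from the file name, so callers only ever see a plain `InputStream`. The following is a minimal sketch under stated assumptions: the class `DumpStreams` and its method `open` are hypothetical names, and gzip is used only because the JDK ships a decoder for it. Other formats would need additional handling.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper for illustration; not part of the component described here.
final class DumpStreams {
    private DumpStreams() {
    }

    // Wraps the raw stream so that the caller reads uncompressed data,
    // regardless of how the file is stored.
    static InputStream open(String fileName, InputStream raw) throws IOException {
        if (fileName.endsWith(".gz")) {
            return new GZIPInputStream(raw);
        }
        if (fileName.endsWith(".bz2")) {
            // The JDK has no bzip2 decoder; a real component would plug in
            // a third-party decoder at this point.
            throw new IOException("bzip2 is not handled in this sketch: " + fileName);
        }
        // Assumption: anything else is stored uncompressed.
        return raw;
    }
}
```

A caller would pass the file name together with a stream obtained from the Web or from disk, and would not need to know which of the two it was.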
There are several types of dump files provided: full dumps,
dumps of current versions, and incremental (daily) dumps.
They are found online and (after being downloaded once) also
locally. The class MediaWikiDumpFile represents such a dump
and provides transparent access to its contents (wherever
it comes from, whatever type it is).
The WmfDumpFileManager provides methods for creating
MediaWikiDumpFile objects from the data seen online or in local
files. Classes that implement the interface DumpFileProcessor
can ask WmfDumpFileManager to call them for all relevant dump
files in the right order. This is the preferred way of processing
all dumps.
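
The callback style described above can be sketched roughly as follows. Everything here is a simplified stand-in: `DumpType`, `DumpInfo`, `DumpProcessor` and `DumpOrdering` are hypothetical names, not the actual MediaWikiDumpFile, DumpFileProcessor or WmfDumpFileManager API. The commit message does not define "the right order"; this sketch assumes it means the most recent main dump first, followed by all newer daily dumps in chronological order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-ins for illustration only.
enum DumpType { FULL, CURRENT, DAILY }

final class DumpInfo {
    final DumpType type;
    final String date; // assumed format yyyyMMdd, so string order is date order

    DumpInfo(DumpType type, String date) {
        this.type = type;
        this.date = date;
    }
}

interface DumpProcessor {
    void processDump(DumpInfo dump);
}

final class DumpOrdering {
    private DumpOrdering() {
    }

    // Assumed ordering: latest dump of the requested main type, then every
    // daily dump that is newer than it, oldest first.
    static List<DumpInfo> relevantDumps(List<DumpInfo> available, DumpType mainType) {
        DumpInfo latestMain = null;
        for (DumpInfo d : available) {
            if (d.type == mainType
                    && (latestMain == null || d.date.compareTo(latestMain.date) > 0)) {
                latestMain = d;
            }
        }
        List<DumpInfo> dailies = new ArrayList<>();
        for (DumpInfo d : available) {
            if (d.type == DumpType.DAILY
                    && (latestMain == null || d.date.compareTo(latestMain.date) > 0)) {
                dailies.add(d);
            }
        }
        dailies.sort(Comparator.comparing((DumpInfo d) -> d.date));

        List<DumpInfo> result = new ArrayList<>();
        if (latestMain != null) {
            result.add(latestMain);
        }
        result.addAll(dailies);
        return result;
    }

    // Calls the processor once per relevant dump, in the order above.
    static void processAll(List<DumpInfo> available, DumpType mainType,
            DumpProcessor processor) {
        for (DumpInfo d : relevantDumps(available, mainType)) {
            processor.processDump(d);
        }
    }
}
```

With this shape, a processor never selects or sorts dump files itself; it only receives them one at a time.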
There are no tests yet. Testing will require suitable mock objects
to simulate the Web and the file system. The code provides places
to inject these mock objects.
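
One way such an injection point might look is sketched below. The names `WebResourceFetcher`, `MockWebResourceFetcher` and `DumpIndexReader` are hypothetical and chosen for illustration; the sketch only shows the general idea of hiding Web access behind an interface so that a test can substitute canned content. It uses `InputStream.readAllBytes()`, which assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction over Web access.
interface WebResourceFetcher {
    InputStream open(String url) throws IOException;
}

// Test double that serves canned content instead of contacting a server.
final class MockWebResourceFetcher implements WebResourceFetcher {
    private final Map<String, byte[]> pages = new HashMap<>();

    void setContents(String url, String contents) {
        pages.put(url, contents.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public InputStream open(String url) throws IOException {
        byte[] data = pages.get(url);
        if (data == null) {
            throw new IOException("No mock content registered for " + url);
        }
        return new ByteArrayInputStream(data);
    }
}

// Hypothetical consumer: it receives the fetcher through its constructor,
// which is the place where a test injects the mock.
final class DumpIndexReader {
    private final WebResourceFetcher fetcher;

    DumpIndexReader(WebResourceFetcher fetcher) {
        this.fetcher = fetcher;
    }

    String readIndex(String url) throws IOException {
        try (InputStream in = fetcher.open(url)) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

The same pattern would apply to the file system: an interface for directory access, with an in-memory implementation used only in tests.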
This contributes to issue #8.
We are talking about the dumps at http://dumps.wikimedia.org/ here. The goal of this task is to be able to import the data from Wikidata into our system. We plan to create RDF output dumps from that data, but this is another issue: #14.