rhysics/wikiTree

wikiTree

A framework to convert Wikipedia article edit histories into ROOT [1] trees for analysis.

File Reduction

wikiTree reduces large Wikipedia XML dump files to compact ROOT trees that are easier to analyze and manipulate. The table below compares the file sizes for two articles with 1000 revisions each.

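Article          | Revisions | XML dump size (bytes) | ROOT tree size (bytes) | Reduction
-----------------|-----------|-----------------------|------------------------|----------
Trains           | 1000      | 15301346              | 1153432                |
Particle Physics | 1000      | 16942429              | 1287648                |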
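
The reduction can be derived from the two size columns. The snippet below is a minimal sketch of that arithmetic, not part of wikiTree itself; the function names `reduction_percent` and `reduction_factor` are hypothetical, and the sizes are copied from the table.

```python
def reduction_percent(xml_bytes, root_bytes):
    """Percentage by which the ROOT tree is smaller than the XML dump."""
    return 100.0 * (1.0 - root_bytes / xml_bytes)


def reduction_factor(xml_bytes, root_bytes):
    """How many times larger the XML dump is than the ROOT tree."""
    return xml_bytes / root_bytes


# (XML dump size, ROOT tree size) in bytes, taken from the table.
SIZES = {
    "Trains": (15301346, 1153432),
    "Particle Physics": (16942429, 1287648),
}

for article, (xml_bytes, root_bytes) in SIZES.items():
    percent = reduction_percent(xml_bytes, root_bytes)
    factor = reduction_factor(xml_bytes, root_bytes)
    print(f"{article}: {percent:.1f}% smaller ({factor:.1f}x)")
```

For both articles the ROOT tree comes out a little over 92% smaller than the XML dump, a factor of roughly 13.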

Features

  • Converts Wikipedia XML dumps [2] into ROOT trees
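
To illustrate the first half of that conversion, the sketch below flattens a MediaWiki export document into one record per revision using only the Python standard library. It is not wikiTree's own code: the function names, the record fields (`rev_id`, `timestamp`, `text_bytes`), and the tiny `SAMPLE` document are all assumptions made for the example, and writing the records to a ROOT tree is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-revision export, shaped like Special:Export output.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.11/">
  <page>
    <title>Trains</title>
    <revision>
      <id>101</id>
      <timestamp>2020-01-01T00:00:00Z</timestamp>
      <contributor><username>Example</username><id>7</id></contributor>
      <text>Hello</text>
    </revision>
    <revision>
      <id>102</id>
      <parentid>101</parentid>
      <timestamp>2020-01-02T00:00:00Z</timestamp>
      <text>Hello, world</text>
    </revision>
  </page>
</mediawiki>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def child_text(element, name):
    """Text of the first direct child called `name`, or '' if absent."""
    for child in element:
        if local_name(child.tag) == name:
            return child.text or ""
    return ""


def extract_revisions(xml_string):
    """Return one flat record per <revision> element in the document."""
    root = ET.fromstring(xml_string)
    records = []
    for element in root.iter():
        if local_name(element.tag) != "revision":
            continue
        records.append({
            "rev_id": int(child_text(element, "id")),
            "timestamp": child_text(element, "timestamp"),
            "text_bytes": len(child_text(element, "text").encode("utf-8")),
        })
    return records


for record in extract_revisions(SAMPLE):
    print(record)
```

Matching on the local tag name rather than a fixed namespace keeps the sketch independent of the export schema version. Each record is a flat set of scalar fields, which is the shape a tree with one entry per revision would need.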

Quick Start

To use wikiTree, follow these steps:

Links

[1] https://root.cern/

[2] https://en.wikipedia.org/wiki/Special:Export

About

A tool that saves Wikipedia edit histories in the CERN ROOT TTree / RNTuple format.
