Welcome to the LMW-tree wiki!
For the latest news, see the K-tree project homepage.
See the ClueWeb09 clusters and the ClueWeb12 clusters for examples of clusters produced by the EM-tree algorithm. The ClueWeb09 dataset contains 500 million web pages and was clustered into 700,000 clusters. The ClueWeb12 dataset contains 733 million web pages and was clustered into 600,000 clusters. The document-to-cluster mappings and other related files are available at SourceForge.
All TODOs are now tracked in the GitHub issue tracker using labels.