hieusydo/IndexBuilder

IndexBuilder

CS 6913 (Web Search Engine) Assignment - NYU Tandon School of Engineering

Goal

Create an inverted index from CommonCrawl data.

What It Does

  • Builds the index with merge-sort-based indexing
  • Compresses the final index with variable-byte encoding, applied chunk by chunk
  • Achieves an end-to-end indexing rate of 430 documents per second
  • Stores the original documents in a database (SQLite3 or Redis)
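
To illustrate the compression step, here is a minimal Python sketch of variable-byte encoding applied chunk by chunk to a sorted list of document IDs. It is not taken from this repository: the function names, the `chunk_size` parameter, the low-order-group-first byte layout, and the choice to store each chunk's last document ID beside it are all assumptions made for the example. The repository's actual on-disk format may differ.

```python
from typing import List, Tuple


def vbyte_encode(n: int) -> bytes:
    """Encode one non-negative integer, 7 bits per byte, low-order group first.

    The high bit is set on every byte except the last one of a number.
    (Assumed convention; other layouts exist.)
    """
    if n < 0:
        raise ValueError("variable-byte encoding needs non-negative integers")
    out = bytearray()
    while n >= 128:
        out.append((n & 0x7F) | 0x80)  # 7 payload bits + continuation flag
        n >>= 7
    out.append(n)  # final byte, high bit clear
    return bytes(out)


def vbyte_decode(data: bytes) -> List[int]:
    """Decode a byte string produced by repeated vbyte_encode calls."""
    numbers, value, shift = [], 0, 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:          # more bytes follow for this number
            shift += 7
        else:                    # last byte of this number
            numbers.append(value)
            value, shift = 0, 0
    return numbers


def compress_postings(doc_ids: List[int],
                      chunk_size: int = 128) -> List[Tuple[int, bytes]]:
    """Split sorted doc IDs into chunks and gap-encode each chunk on its own.

    Returns (last_doc_id_in_chunk, compressed_bytes) pairs. The first ID of
    every chunk is stored as a gap from 0, so a chunk can be decoded without
    reading the chunks before it.
    """
    chunks = []
    for start in range(0, len(doc_ids), chunk_size):
        chunk = doc_ids[start:start + chunk_size]
        prev, buf = 0, bytearray()
        for doc_id in chunk:
            buf += vbyte_encode(doc_id - prev)
            prev = doc_id
        chunks.append((chunk[-1], bytes(buf)))
    return chunks


def decompress_chunk(data: bytes) -> List[int]:
    """Rebuild the doc IDs of one chunk by summing the decoded gaps."""
    doc_ids, total = [], 0
    for gap in vbyte_decode(data):
        total += gap
        doc_ids.append(total)
    return doc_ids
```

Storing the last document ID next to each chunk lets a query processor skip chunks that cannot contain the ID it is looking for, decompressing only the chunks it needs.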
