Building an inverted index using Python
README.md
download_wet_files.py
generate_page_table.py
generate_postings.py
index_construction.py
readme.pdf

Indexer

Building an inverted index using Python

Building an inverted index (or "indexing") is the second step in building a search engine; the first step is crawling. This code builds an inverted index compressed with variable-byte (varbyte) encoding, together with the corresponding lexicon and page table. The input data comes from CommonCrawl, in the form of WET files.
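
To make the three structures concrete, here is a minimal in-memory sketch. It is not taken from the repository's scripts: the function names (`varbyte_encode`, `varbyte_decode`, `build_index`, `lookup`), the whitespace tokenizer, the docID-gap encoding, and the lexicon and page table layouts are all assumptions made for illustration. The varbyte convention shown (7 data bits per byte, low-order group first, high bit set when more bytes follow) is one common variant; the actual code may use a different one.

```python
from collections import defaultdict


def varbyte_encode(n):
    """Encode one non-negative integer, 7 bits per byte, low-order group first.

    Assumed convention: the high bit is set when more bytes follow.
    """
    if n < 0:
        raise ValueError("varbyte encodes non-negative integers only")
    out = bytearray()
    while n >= 128:
        out.append((n & 0x7F) | 0x80)  # 7 data bits plus continuation flag
        n >>= 7
    out.append(n)  # final byte, high bit clear
    return bytes(out)


def varbyte_decode(data):
    """Decode a byte string of concatenated varbyte integers into a list."""
    numbers, current, shift = [], 0, 0
    for byte in data:
        current |= (byte & 0x7F) << shift
        if byte & 0x80:  # more bytes follow for this integer
            shift += 7
        else:
            numbers.append(current)
            current, shift = 0, 0
    return numbers


def build_index(pages):
    """Build (index bytes, lexicon, page table) from a list of (url, text).

    Hypothetical layouts:
      page_table[doc_id] = (url, number of terms on the page)
      lexicon[term]      = (byte offset, byte length, number of documents)
    """
    page_table = {}
    postings = defaultdict(list)
    for doc_id, (url, text) in enumerate(pages):
        terms = text.lower().split()  # naive tokenizer, for illustration only
        page_table[doc_id] = (url, len(terms))
        for term in set(terms):
            postings[term].append(doc_id)  # docIDs arrive in ascending order

    index = bytearray()
    lexicon = {}
    for term in sorted(postings):
        start, previous = len(index), 0
        for doc_id in postings[term]:
            index += varbyte_encode(doc_id - previous)  # store gaps, not docIDs
            previous = doc_id
        lexicon[term] = (start, len(index) - start, len(postings[term]))
    return bytes(index), lexicon, page_table


def lookup(term, index, lexicon):
    """Return the sorted docIDs for a term, or an empty list if it is absent."""
    if term not in lexicon:
        return []
    start, length, _ = lexicon[term]
    doc_ids, total = [], 0
    for gap in varbyte_decode(index[start:start + length]):
        total += gap  # undo the gap encoding
        doc_ids.append(total)
    return doc_ids
```

With the pages `[("http://a.example", "the cat sat"), ("http://b.example", "the dog"), ("http://c.example", "cat and dog")]`, `lookup("cat", index, lexicon)` returns `[0, 2]`. Storing gaps between consecutive docIDs rather than the docIDs themselves keeps most values small, which is where varbyte saves space.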

Refer to readme.pdf for more details.
