Skip to content

Positional Index vs Inverted Index

Le Thu Nguyen edited this page Apr 19, 2018 · 6 revisions

Previous

Positional index:

Explain:

•Each term in the vocabulary, store postings of the form docID (position1, position2,..) •Store each term in the form of

  <number of docs containing term;
  Doc1: position1, position2…;
  Doc2: position1, position2…;
  etc.>

Example:

To, 993:
(1,6:(7,18,33,72,..))

The word to has a document frequency 993 and occurs 6 times in document 1 at positions 7,18,33, 72 and etc.

  • Phrase queries, use a merge algorithm recursively at the document level.

Pros/Cons:

Fast indexing

Less efficient query's

Inverted index

Explain

  • A data structure that maps terms back to the parts of a document in which they occur
  • For each term T, store a list of all documents containing term T. It is a list of all unique word that appears in any document.

Example:

Document 1: The quick brown fox jumped
Document 2: Quick brown foxes

 Term	Doc1	Doc2
 The	x	
 quick	x	
 brown	x	x
 fox	x	
 jumped	x	
 Quick		x
 foxes		x

Pros/Cons:

Fast query

Slower indexing

Previous