Skip to content

AyaMosaad1/Searching_by_Inverted_index

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

44 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Searching_by_Inverted_index


Project description

inverted index was through building a trie that would take all words found in all documents and save them all in the same trie. At the end node of each word, we save the ids of all the documents that this word occurred in it. So now when we search for a word, we display this list of ids easily, as we only need to traverse one trie structure with the length of given word.

While implementing this algorithm, we stumbled across some corner cases that required a solution. For example, since we build the trie children to include the 26 letters of the alphabet, any word that’s not in English (Japanese... Arabic …etc ) was difficult to input to our trie and so unfortunately we couldn’t support it. If a word had (‘s) or any other time of abbreviation would be stored in its original form, ex: master’s is saved as master. When we input our files, we separate the string inside using not only space but also “ . ” and “@” so that we can handle as many corner cases as we can. Just for safety, we also remove any punctuation from our word before inserting it to our trie in case our conditions didn’t satisfy a particular word.


To run it :

1- clone https://github.com/AyaMosaad1/Searching_by_Inverted_index

2- cd to repo directory

3- install pip install pyqt5

4- Run Gui.py


Contributors :

  • Alaa Mohamed Abd El Rahman
  • Aya Mosaad Mohamed
  • Baraa Magdy El Sayed
  • Toka Abd El Hamid
  • Panse Yasser Mohamed

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •  

Languages