# Building an in-memory search index using a hash map

In [1]:
docs = [
    "the cat is under the table",
    "the dog is under the table",
    "cats and dogs smell roses",
    "Carla eats an apple",
]

In [2]:
matches = [doc for doc in docs if "table" in doc]
print(matches)

['the cat is under the table', 'the dog is under the table']
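The scan above visits every document on every query, and `in` on strings is a substring test rather than a whole-word match. The sketch below illustrates both points. The helper name `linear_search` is made up for this example and is not part of the notebook.

```python
docs = [
    "the cat is under the table",
    "the dog is under the table",
    "cats and dogs smell roses",
    "Carla eats an apple",
]


def linear_search(term, documents):
    """Return every document containing `term` as a substring.

    Hypothetical helper: it looks at all documents on each call,
    so the cost of a query grows with the size of the collection.
    """
    return [doc for doc in documents if term in doc]


# Same query as the cell above.
print(linear_search("table", docs))

# Substring matching: "cat" also matches the document containing "cats".
print(linear_search("cat", docs))
```

Because the match is on substrings, a query for `"cat"` returns the third document as well, which a word-based index would not do.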


### Building an inverted index
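#### An inverted index maps each word to the documents that contain it, so a lookup does not have to scan every document. It is used to retrieve data quickly, not only in search engines but also in databases.
#### Building an inverted index is the expensive step: every document has to be processed up front, and the index can only answer queries for the terms it has stored.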

In [3]:
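# Map each word to the list of positions (in `docs`) of the documents containing it.
index = {}

for i, doc in enumerate(docs):
    for word in doc.split():
        if word not in index:
            index[word] = [i]
        else:
            # A word that occurs twice in one document is appended twice.
            index[word].append(i)

# A single dictionary lookup replaces the scan over all documents.
# Note: index["..."] raises KeyError for a word that was never indexed.
results = index["table"]
result_documents = [docs[i] for i in results]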

In [4]:
print(result_documents)

['the cat is under the table', 'the dog is under the table']
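The list-based index above has two rough edges: a word repeated within a document (such as "the") is recorded more than once, and looking up an unknown word raises `KeyError`. The following is a minimal sketch of one way to handle both, assuming lowercase normalization is acceptable for the queries. The names `build_index` and `lookup` are hypothetical and not part of the notebook.

```python
from collections import defaultdict

docs = [
    "the cat is under the table",
    "the dog is under the table",
    "cats and dogs smell roses",
    "Carla eats an apple",
]


def build_index(documents):
    """Build a word -> list of document positions mapping (hypothetical helper)."""
    index = defaultdict(list)
    for i, doc in enumerate(documents):
        # Assumption: lowercasing is wanted so "Carla" and "carla" match.
        for word in doc.lower().split():
            postings = index[word]
            # Documents are visited in order, so a duplicate can only be
            # the last entry already stored for this word.
            if not postings or postings[-1] != i:
                postings.append(i)
    return index


def lookup(index, term):
    """Return the positions for `term`, or an empty list if it is unknown."""
    # .get() avoids both KeyError and creating an entry in the defaultdict.
    return index.get(term.lower(), [])


index = build_index(docs)
print(lookup(index, "table"))
print(lookup(index, "the"))
print(lookup(index, "zebra"))
```

Using `index.get` instead of `index[...]` keeps failed lookups from adding empty entries to the `defaultdict`.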


### Using an inverted index based on sets and querying it with set operations
#### Sets store each document position once per word, and multi-word queries become set operations such as intersection and union.

In [5]:
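# Map each word to the set of positions (in `docs`) of the documents containing it.
index = {}

for i, doc in enumerate(docs):
    for word in doc.split():
        if word not in index:
            index[word] = {i}
        else:
            index[word].add(i)

# Documents that contain both "cats" and "roses".
index['cats'].intersection(index['roses'])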

{2}
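The intersection above can be generalized to queries with any number of words. The sketch below shows one possible way to do it. The helpers `build_set_index`, `search_all`, and `search_any` are hypothetical names introduced for this example, and treating unknown words as empty sets is an assumption.

```python
docs = [
    "the cat is under the table",
    "the dog is under the table",
    "cats and dogs smell roses",
    "Carla eats an apple",
]


def build_set_index(documents):
    """Build a word -> set of document positions mapping (hypothetical helper)."""
    index = {}
    for i, doc in enumerate(documents):
        for word in doc.split():
            index.setdefault(word, set()).add(i)
    return index


def search_all(index, terms):
    """Positions of documents containing every term (AND query)."""
    # Assumption: an unknown term contributes an empty set, so the result is empty.
    postings = [index.get(term, set()) for term in terms]
    if not postings:
        return set()
    return set.intersection(*postings)


def search_any(index, terms):
    """Positions of documents containing at least one term (OR query)."""
    return set().union(*(index.get(term, set()) for term in terms))


index = build_set_index(docs)
print(search_all(index, ["cats", "roses"]))
print(search_all(index, ["under", "table"]))
print(search_any(index, ["cat", "dog"]))
```

The returned positions can be mapped back to text with `[docs[i] for i in sorted(result)]`, where sorting gives a stable order since sets are unordered.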