# Semantic search

## What is semantic search?
Semantic search is a very active area of NLP: the idea is to search by meaning rather than by exact keywords.

To understand what it is, let's define the problem.

![searching...](https://media.giphy.com/media/l46Cy1rHbQ92uuLXa/giphy.gif)

## The problem

What would you do if I asked you to build a search engine? OK, maybe that subject is too broad.
Let's be more precise.

### Case 1: predict the end of a word

If I give you the sentence `protect my house with an al` as input, and I ask you to predict the end of the word `al` based on this list of words:
```
room
ally
code
alien
alert
door
alarm
```

How would you tackle that?

Of course, you could try a naive approach: filter out all the words that don't start with `al`. The list would then be:
```
ally
alien
alert
alarm
```
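
The naive filter described above can be sketched in a few lines of Python. This is only an illustration: the function name `filter_by_prefix` is made up for this example, and taking the last token of the sentence as the prefix is an assumption.

```python
def filter_by_prefix(prefix, candidates):
    """Keep only the candidates that start with the given prefix."""
    return [word for word in candidates if word.startswith(prefix)]


words = ["room", "ally", "code", "alien", "alert", "door", "alarm"]
sentence = "protect my house with an al"

# Assumption: the unfinished word is always the last token of the sentence.
prefix = sentence.split()[-1]

matches = filter_by_prefix(prefix, words)
print(matches)  # ['ally', 'alien', 'alert', 'alarm']
```

Note that the filter keeps the original order of the list. It says nothing about which candidate fits the rest of the sentence best.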

Better, but still, how would you choose between those? How would you sort them?


### Case 2: predict which document you are searching for

If I give you the sentence `how to be the best programmer?`, how would you choose which document is the best to answer this question, based on a list of 2000 documents?

You could remove stop words and other unimportant words to get this shorter query: `how be best programmer`.

Better. Now we can search for the document in which these words occur the most.

But what if I tell you that the document you are looking for only uses the keywords `better developer`? 
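
Here is a minimal sketch of that keyword-counting approach and of how it fails on the synonym case. The stop-word list, the two documents and the function names are all invented for this example.

```python
import re

# Hypothetical stop-word list, just large enough for this example.
STOP_WORDS = {"to", "the", "a", "an", "of"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keywords(text):
    """Drop stop words and keep the remaining tokens."""
    return [token for token in tokenize(text) if token not in STOP_WORDS]


def keyword_score(query, document):
    """Count how many times the query keywords occur in the document."""
    doc_tokens = tokenize(document)
    return sum(doc_tokens.count(keyword) for keyword in keywords(query))


query = "how to be the best programmer?"

# Two made-up documents: doc_b is the one we actually want.
documents = {
    "doc_a": "The best programmer is the programmer who keeps learning.",
    "doc_b": "Ten habits that will make you a better developer.",
}

scores = {name: keyword_score(query, text) for name, text in documents.items()}
print(keywords(query))  # ['how', 'be', 'best', 'programmer']
print(scores)           # {'doc_a': 3, 'doc_b': 0}
```

`doc_b` gets a score of 0 even though it is about the same topic, because none of the query keywords appear in it word for word.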

You could try to build a list of synonyms for each word. Great, but now you have a lot of words to compare against a lot of documents, and your query will take ages to run.

A better solution would be to use embeddings to capture the meaning of each word instead of relying on its spelling. But the question still remains: how would you select the best document?
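
One common way to compare embeddings is cosine similarity. The sketch below uses hand-made 3-dimensional vectors that are invented purely for illustration (a real system would get its vectors from a trained model), and averaging the word vectors is just one simple assumption about how to represent a whole sentence.

```python
import math

# Toy embeddings, invented for this example. Words with similar
# meanings were deliberately given similar vectors.
TOY_EMBEDDINGS = {
    "best":       [0.9, 0.1, 0.0],
    "better":     [0.8, 0.2, 0.0],
    "programmer": [0.1, 0.9, 0.1],
    "developer":  [0.2, 0.8, 0.1],
    "cooking":    [0.0, 0.1, 0.9],
    "recipes":    [0.1, 0.0, 0.9],
}


def embed(text):
    """Average the vectors of the known words in the text."""
    vectors = [TOY_EMBEDDINGS[w] for w in text.lower().split() if w in TOY_EMBEDDINGS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if one is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


query = "best programmer"
documents = ["better developer", "cooking recipes"]

query_vector = embed(query)
ranked = sorted(
    documents,
    key=lambda doc: cosine_similarity(query_vector, embed(doc)),
    reverse=True,
)
print(ranked[0])  # better developer
```

The query and the top document do not share a single word, yet their vectors point in almost the same direction.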

### Case 3: understand a query

You type this in your favorite search engine: `how to kill parent with fork?`.

Obviously, you are not looking for a way to kill your lovely mom and dad with a fork. *(If you are, I will not judge you, but hey, you need help...)*

Let's try this query on Google. Do it. I'm waiting for you.

Indeed, it's all about Linux processes. But how can Google understand that? Because it uses semantic search. It doesn't only look at the meaning of each word but also at the **context** around it. Moreover, it also looks at what other users have clicked on after typing this query.

Phew, we are saved! Most Google users don't want to kill their parents with a fork...

How would you tackle that?

## Reminder

Don't forget to take into account that your user can make typos.
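
As a small illustration, Python's standard library can already suggest close matches for a misspelled word. This is only a sketch: the helper name `suggest` and the `cutoff` value of 0.7 are arbitrary choices for this example.

```python
import difflib


def suggest(word, vocabulary, cutoff=0.7):
    """Return up to 3 vocabulary words that look similar to the given word."""
    return difflib.get_close_matches(word, vocabulary, n=3, cutoff=cutoff)


words = ["room", "ally", "code", "alien", "alert", "door", "alarm"]

# "alarn" is a typo for "alarm".
print(suggest("alarn", words))  # ['alarm']
```

`difflib` compares character sequences, so it catches spelling mistakes but knows nothing about meaning.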

## Tools

In this domain, before you start building a specific model, check whether `Elasticsearch` can already solve your problem.
I'm not asking you to be able to build something with it, but you should at least understand what it is and what its purpose is.

## Conclusion 

OK, I think you are starting to understand the problem. We need to base our search on the **meaning** of the whole query, not only on the meaning of each word taken separately.

## The solution

I was planning to write a little explanation of this subject, but I found this fantastic article that explains everything you need to know in a nice way.

So let's [give it a shot](https://medium.com/modern-nlp/semantic-search-fuck-yeah-e371c0f639d)!
Let's also check [this one](https://medium.com/swlh/semantic-search-with-nlp-86084ca81247) out about the modern way of implementing it.
And if you want to see some code and learn about sentence similarity, [this one is for you](https://towardsdatascience.com/cutting-edge-semantic-search-and-sentence-similarity-53380328c655).

If you want to understand more deeply how it works, you will need a good grasp of the `LSTM` and `encoder-decoder` concepts. If they are new to you, ask Google for an answer.

## Other useful resources
* [Semantic search with SEO](https://ahrefs.com/blog/semantic-search/)
* [Semantic search with examples](https://www.searchenginewatch.com/2019/12/16/the-beginners-guide-to-semantic-search/)
* [Semantic search on the business side](https://www.bloomreach.com/en/blog/2019/06/semantic-search-explained-in-5-minutes.html)
* [Short demonstration with Google](https://towardsdatascience.com/semantic-search-73fa1177548f)
* [Build a semantic search engine](https://azati.ai/how-to-build-semantic-search-engine/)

You are now a master of search!

![search master](https://media.giphy.com/media/QBXgRASuwHDIJ2IAKI/giphy.gif)