# LangChain with Qdrant

In this lab, we will take a deeper dive into the Qdrant vector store and the different ways to interact with it. We'll look at how different search methods change the results and how those results can be passed to a large language model.

We'll start as usual by loading the values from the `.env` file in the root of this repository.

In [None]:
import os
from dotenv import load_dotenv

# Load environment variables
if load_dotenv():
    # Use a default so the concatenation does not fail if the variable is unset
    print("Found Azure OpenAI Endpoint: " + os.getenv("AZURE_OPENAI_ENDPOINT", "<not set>"))
else:
    print("No file .env found")
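
Several later cells read further settings from the environment, so it can help to check them all up front. The following is a minimal sketch, assuming the variable names used elsewhere in this lab are the complete set; `missing_env_vars` is a hypothetical helper, not part of `python-dotenv`.

```python
import os

# Variable names taken from the cells in this lab.
REQUIRED_VARS = [
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME",
    "OPENAI_EMBEDDING_API_VERSION",
    "AZURE_OPENAI_EMBEDDING_MODEL",
    "AZURE_OPENAI_COMPLETION_DEPLOYMENT_NAME",
]


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing settings: " + ", ".join(missing))
else:
    print("All required settings are present")
```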

In this lab, we'll use the data from the `movies.csv` file in this folder, which contains details of around 500 movies. We'll use a LangChain document loader to load all of the movie data into memory.

In [None]:
from langchain.document_loaders.csv_loader import CSVLoader

loader = CSVLoader(file_path='./movies.csv', source_column='original_title', encoding='utf-8', csv_args={'delimiter':',', 'fieldnames': ['id', 'original_language', 'original_title', 'popularity', 'release_date', 'vote_average', 'vote_count', 'genre', 'overview', 'revenue', 'runtime', 'tagline']})
data = loader.load()
# data = data[1:51] # You can uncomment this line to load a smaller subset of the data if you experience issues with token limits or timeouts later in this lab.
print('Loaded %s movies' % len(data))
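
To make the loading step more concrete, here is a rough approximation of how a CSV loader turns rows into documents, written with only the standard library. It is a sketch, not the `CSVLoader` implementation, and the sample row is made up for illustration. Note that when `fieldnames` is passed explicitly, Python's `csv.DictReader` treats the header line as a data row, which is presumably why the optional subset above starts at index 1.

```python
import csv
import io

# Made-up sample data for illustration only; not taken from movies.csv.
SAMPLE_CSV = "id,original_title,overview\n1,Example Movie,A placeholder plot.\n"
FIELDNAMES = ["id", "original_title", "overview"]


def rows_to_documents(csv_text, fieldnames, source_column):
    """Approximate a CSV loader: one document (a dict here) per CSV row."""
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=fieldnames)
    documents = []
    for row_number, row in enumerate(reader):
        # Each row becomes "column: value" lines of text.
        content = "\n".join(f"{key}: {value}" for key, value in row.items())
        documents.append(
            {
                "page_content": content,
                "metadata": {"source": row[source_column], "row": row_number},
            }
        )
    return documents


docs = rows_to_documents(SAMPLE_CSV, FIELDNAMES, source_column="original_title")
for doc in docs:
    print(doc["metadata"], "->", repr(doc["page_content"]))
```

The first document printed is built from the header line itself, because the column names were supplied separately.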

Next, we'll create instances of the Azure OpenAI embeddings and chat completion clients, each pointing at an existing deployment. The embeddings instance will be used to create the vector representation of the movies in the loaded CSV file, and the chat completion instance will be used to ask questions.

In [None]:
from langchain_openai import AzureChatOpenAI
from langchain_openai import AzureOpenAIEmbeddings

# Create an Embeddings Instance of Azure OpenAI
embeddings = AzureOpenAIEmbeddings(
    azure_deployment=os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME"),
    openai_api_version=os.getenv("OPENAI_EMBEDDING_API_VERSION"),
    model=os.getenv("AZURE_OPENAI_EMBEDDING_MODEL"),
)


# Create a Chat Completion Instance of Azure OpenAI
llm = AzureChatOpenAI(
    azure_deployment=os.getenv("AZURE_OPENAI_COMPLETION_DEPLOYMENT_NAME"),
)

## Start Qdrant Server Locally

**NOTE**: Please read the following carefully.

In previous labs, we ran the Qdrant vector store in memory, so its contents were lost once Qdrant stopped running. In this lab, we'll run Qdrant as a server that persists data to disk.

If you are running this lab in Codespaces or a VS Code local devcontainer, then Qdrant is already running and you can continue to the next section.

Otherwise, we need to start a local instance of Qdrant. The easiest way to do this is with Docker. If you don't have Docker on your device, consider running this workshop in a Codespace or a VS Code devcontainer.

In [None]:
# Start Qdrant Server
# On Windows, use the following command instead:
# !docker run -d --name qdrant -p 6333:6333 -p 6334:6334 -v "$(pwd)\qdrantstorage:/qdrant/storage" qdrant/qdrant
# On macOS and Linux, use the following command:
!docker run -d --name qdrant -p 6333:6333 -p 6334:6334 -v "$(pwd)/qdrantstorage:/qdrant/storage" qdrant/qdrant

# If you want to stop and cleanup the Qdrant server, uncomment and run the following commands:
# !docker stop qdrant
# !docker rm qdrant
# !rm -rf labs/03-orchestration/03-Qdrant/qdrantstorage
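
Before loading any data, it can be useful to confirm that the server is actually answering. The sketch below assumes Qdrant is listening on the REST port mapped above (`6333`) and calls the `/collections` endpoint; `qdrant_is_reachable` is a hypothetical helper written for this lab, not part of any Qdrant client library.

```python
import urllib.request


def qdrant_is_reachable(base_url="http://localhost:6333", timeout=2.0):
    """Return True if the Qdrant REST API answers on base_url."""
    try:
        # GET /collections lists the collections held by the server.
        with urllib.request.urlopen(f"{base_url}/collections", timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts and HTTP errors
        # (urllib's URLError is a subclass of OSError).
        return False


print("Qdrant reachable:", qdrant_is_reachable())
```

If this prints `False` shortly after starting the container, waiting a few seconds and running the cell again is usually worth trying before investigating further.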

## Load Movies into Qdrant

Now that we have the Qdrant server running and persisting data locally, let's load the movies into the vector store.

We'll configure LangChain to use Qdrant as the vector store, embed the loaded documents, and store the embeddings in the `my_movies` collection.

**NOTE**: Depending on the number of movies loaded and rate limiting, this might take a while.

In [None]:
from langchain.vectorstores import Qdrant

url = "http://localhost:6333"
qdrant = Qdrant.from_documents(
    data,
    embeddings,
    url=url,
    prefer_grpc=False,
    collection_name="my_movies",
)
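
If rate limiting or timeouts become a problem, one option is to send the documents in smaller batches rather than all at once. The following is a minimal sketch of the batching logic only; `batched` is a hypothetical helper, and the batch size of 4 is an arbitrary value chosen for the example.

```python
def batched(items, batch_size):
    """Yield consecutive slices of items, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Stand-in for the loaded documents; in the lab this would be `data`.
example_items = list(range(10))
for number, batch in enumerate(batched(example_items, batch_size=4), start=1):
    print(f"batch {number}: {len(batch)} items")
    # In the lab, each batch could be passed to qdrant.add_documents(batch),
    # optionally pausing between batches if the embedding endpoint rate-limits.
```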

## Vector Store Searching using Qdrant

Next, we'll look at how the choice of search method affects the results returned from a vector store.

The most common method is a similarity search, which returns the documents whose embeddings are closest to the embedding of the query text, using a metric such as cosine similarity.

In [None]:
vectorstore = qdrant

query = "Can you suggest similar movies to The Matrix?"

query_results = qdrant.similarity_search(query)

for doc in query_results:
    print(doc.metadata['source'])
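
To illustrate what a similarity search does under the hood, here is a small pure-Python sketch of cosine similarity and a top-k ranking. The 2-D vectors are toy stand-ins for real embeddings, and a vector database such as Qdrant uses far more efficient index structures than this linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Convention chosen here for zero vectors.
    return dot / (norm_a * norm_b)


def top_k(query_vector, doc_vectors, k=4):
    """Return indices of the k vectors most similar to the query, best first."""
    ranked = sorted(
        range(len(doc_vectors)),
        key=lambda i: cosine_similarity(query_vector, doc_vectors[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy 2-D vectors standing in for real embeddings.
toy_docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(top_k([1.0, 0.0], toy_docs, k=2))
```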

Another commonly used method is Maximal Marginal Relevance (MMR), which also starts from a similarity search but then re-ranks the candidates so that the final set of results is more diverse.

In [None]:
retriever = vectorstore.as_retriever(search_type="mmr")

query = "Can you suggest similar movies to The Matrix?"

for doc in retriever.get_relevant_documents(query):
    print(doc.metadata['source'])


Both examples return a list of movies. Both lists should be relevant to the query, but the MMR search should return a more diverse set. For example, the similarity search results might all come from the same genre, whereas the MMR results might span a wider range of genres.
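
The diversity effect can be seen in a small greedy MMR sketch. This is an illustrative implementation under simple assumptions (cosine similarity, toy 2-D vectors, a `lambda_mult` weight between relevance and redundancy); it is not the code LangChain or Qdrant runs.

```python
import math


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def mmr(query_vector, doc_vectors, k=4, lambda_mult=0.5):
    """Greedy maximal marginal relevance; returns selected indices in pick order."""
    selected = []
    candidates = list(range(len(doc_vectors)))
    while candidates and len(selected) < k:
        best_index, best_score = None, -math.inf
        for i in candidates:
            relevance = cosine(query_vector, doc_vectors[i])
            # Redundancy: similarity to the closest already-selected document.
            redundancy = max(
                (cosine(doc_vectors[i], doc_vectors[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_index, best_score = i, score
        selected.append(best_index)
        candidates.remove(best_index)
    return selected


toy_docs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]
query = [1.0, 0.2]
print(mmr(query, toy_docs, k=2, lambda_mult=1.0))  # relevance only
print(mmr(query, toy_docs, k=2, lambda_mult=0.3))  # favours diversity
```

With `lambda_mult=1.0` the two near-duplicate vectors are chosen, whereas a lower value swaps the second pick for the less similar third vector. In LangChain this trade-off is usually exposed through the retriever's `search_kwargs` (for example `k`, `fetch_k` and `lambda_mult`), though the accepted keys can vary by vector store and version.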

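## Vector Store Searching using LangChain Retriever

In this part, we'll use LangChain to run a vector search over the movies and pass the results to a large language model. This differs from the previous section, where we simply printed the results of a Qdrant search directly.

As you will have seen in the **02-Embeddings** lab, we can use the `VectorstoreIndexCreator` to quickly create a vector store index and a retriever. Note that, as configured below, `VectorstoreIndexCreator` embeds the documents from the loader into its own default vector store rather than reusing the `my_movies` collection; to query the Qdrant collection instead, `qdrant.as_retriever()` could be passed to the chain as the retriever.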


In [None]:
from langchain.indexes import VectorstoreIndexCreator
from langchain.chains import RetrievalQA

index_creator = VectorstoreIndexCreator(embedding=embeddings)
docsearch = index_creator.from_loaders([loader])

Now we can use a LangChain retrieval QA (question answering) chain to ask questions about the movies.

In [None]:
chain = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=docsearch.vectorstore.as_retriever(), input_key="question", return_source_documents=True)

query = "Do you have a column called popularity?"
response = chain.invoke({"question": query})
print(response['result'])

You can see that we've retrieved a set of search results from the vector store, and then used the LLM to parse the results and provide a natural language answer to the question.
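
Conceptually, the `"stuff"` chain type places all of the retrieved document texts into a single prompt alongside the question. The sketch below shows the idea with plain strings; the prompt wording and the `build_stuff_prompt` helper are illustrative assumptions, not the template LangChain actually uses.

```python
def build_stuff_prompt(question, documents):
    """Concatenate retrieved texts and the question into one prompt string."""
    context = "\n\n".join(documents)
    return (
        "Use the following context to answer the question. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Placeholder texts standing in for retrieved movie documents.
retrieved = ["original_title: Example Movie A", "original_title: Example Movie B"]
prompt = build_stuff_prompt("Which titles were retrieved?", retrieved)
print(prompt)
```

Because every retrieved document is included verbatim, the prompt grows with the number and size of the documents, which is one reason token limits can be hit with larger result sets.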

We can also see the documents that were retrieved from the search.

In [None]:
print(response['source_documents'])

We could ask a different question, such as "What is the name of the most popular movie?".

In [None]:
query = "If the popularity score is defined as a higher value being a more popular movie, what is the name of the most popular movie in the data provided?"
response = chain.invoke({"question": query})
print(response['result'])

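Again, instead of just returning the search results, we get a natural language answer to the question. Bear in mind that the LLM only sees the handful of documents returned by the retriever, so "most popular" here means most popular among the retrieved movies, not necessarily among all of the movies in the file.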

We can also see the set of documents that were retrieved. These differ from the previous set because the query text is different.

In [None]:
print(response['source_documents'])

## Next Section

📣 [Azure AI Search with Semantic Kernel and C#](../04-ACS/acs-sk-csharp.ipynb)

📣 [Azure AI Search with Langchain and Python](../04-ACS/acs-lc-python.ipynb)