# Build a Semantic Search Engine in 5 Minutes

This notebook accompanies the tutorial at https://qdrant.tech/documentation/tutorials-basics/search-beginners/

First, install the Qdrant Client for Python. This library allows you to interact with Qdrant from Python code.

In [None]:
!pip install qdrant-client

Next, create a client connection to your Qdrant cluster. Ensure that you have added `QDRANT_URL` and `QDRANT_API_KEY` as Colab secrets, since the next cell reads them with `userdata.get`.

In [None]:
from google.colab import userdata

QDRANT_URL = userdata.get("QDRANT_URL")
QDRANT_API_KEY = userdata.get("QDRANT_API_KEY")
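
If you run this notebook outside Google Colab, `google.colab` is not available. Here is a minimal sketch of one alternative, assuming you have exported the same two names as environment variables. The helper `read_setting` is hypothetical and not part of the Qdrant client.

```python
import os


def read_setting(name, env=None):
    """Return the value of `name` from the environment, or raise a clear error."""
    # `env` defaults to the process environment; a plain dict can be passed for testing.
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError(f"Missing setting: {name}")
    return value


# Usage outside Colab (assumes both variables are exported in your shell):
# QDRANT_URL = read_setting("QDRANT_URL")
# QDRANT_API_KEY = read_setting("QDRANT_API_KEY")
```

Failing early with a named error makes a missing credential easier to spot than a connection error later on.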

In [None]:
from qdrant_client import QdrantClient, models

client = QdrantClient(
    url=QDRANT_URL,
    api_key=QDRANT_API_KEY,
    cloud_inference=True
)

All data in Qdrant is organized within collections. Since you're storing books, let's create a collection named `my_books`.

In [None]:
COLLECTION_NAME = "my_books"

client.create_collection(
    collection_name=COLLECTION_NAME,
    vectors_config=models.VectorParams(
        size=384,  # Must match the embedding model's output size (384 for all-MiniLM-L6-v2)
        distance=models.Distance.COSINE,
    ),
)
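
The collection above compares vectors by cosine distance. To show what that metric measures, here is a small pure-Python sketch of cosine similarity on toy vectors. It is an illustration only, not how Qdrant computes scores internally, and the function name `cosine_similarity` is chosen for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors (the embeddings in this notebook have 384 dimensions).
print(cosine_similarity([1.0, 0.0, 0.0], [2.0, 0.0, 0.0]))  # same direction -> 1.0
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # orthogonal -> 0.0
```

Because only the direction of the vectors matters, two texts can score as similar even if their embeddings differ in length.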

The dataset consists of a list of science fiction books. Each entry has a name, author, publication year, and short description.

In [None]:
documents = [
    {
        "name": "The Time Machine",
        "description": "A man travels through time and witnesses the evolution of humanity.",
        "author": "H.G. Wells",
        "year": 1895,
    },
    {
        "name": "Ender's Game",
        "description": "A young boy is trained to become a military leader in a war against an alien race.",
        "author": "Orson Scott Card",
        "year": 1985,
    },
    {
        "name": "Brave New World",
        "description": "A dystopian society where people are genetically engineered and conditioned to conform to a strict social hierarchy.",
        "author": "Aldous Huxley",
        "year": 1932,
    },
    {
        "name": "The Hitchhiker's Guide to the Galaxy",
        "description": "A comedic science fiction series following the misadventures of an unwitting human and his alien friend.",
        "author": "Douglas Adams",
        "year": 1979,
    },
    {
        "name": "Dune",
        "description": "A desert planet is the site of political intrigue and power struggles.",
        "author": "Frank Herbert",
        "year": 1965,
    },
    {
        "name": "Foundation",
        "description": "A mathematician develops a science to predict the future of humanity and works to save civilization from collapse.",
        "author": "Isaac Asimov",
        "year": 1951,
    },
    {
        "name": "Snow Crash",
        "description": "A futuristic world where the internet has evolved into a virtual reality metaverse.",
        "author": "Neal Stephenson",
        "year": 1992,
    },
    {
        "name": "Neuromancer",
        "description": "A hacker is hired to pull off a near-impossible hack and gets pulled into a web of intrigue.",
        "author": "William Gibson",
        "year": 1984,
    },
    {
        "name": "The War of the Worlds",
        "description": "A Martian invasion of Earth throws humanity into chaos.",
        "author": "H.G. Wells",
        "year": 1898,
    },
    {
        "name": "The Hunger Games",
        "description": "A dystopian society where teenagers are forced to fight to the death in a televised spectacle.",
        "author": "Suzanne Collins",
        "year": 2008,
    },
    {
        "name": "The Andromeda Strain",
        "description": "A deadly virus from outer space threatens to wipe out humanity.",
        "author": "Michael Crichton",
        "year": 1969,
    },
    {
        "name": "The Left Hand of Darkness",
        "description": "A human ambassador is sent to a planet where the inhabitants are genderless and can change gender at will.",
        "author": "Ursula K. Le Guin",
        "year": 1969,
    },
    {
        "name": "The Three-Body Problem",
        "description": "Humans encounter an alien civilization that lives in a dying system.",
        "author": "Liu Cixin",
        "year": 2008,
    },
]


Upload the dataset to the `my_books` collection. Each book will be stored as a point with:
- a unique ID
- a vector generated by the `sentence-transformers/all-minilm-l6-v2` embedding model (available for free on Qdrant Cloud), based on the book's description. This is achieved by providing a `Document` object with the `model` name and the `text` to embed.
- a payload containing each of the fields in the dataset.

In [None]:
EMBEDDING_MODEL = "sentence-transformers/all-minilm-l6-v2"

client.upload_points(
    collection_name=COLLECTION_NAME,
    points=[
        models.PointStruct(
            id=idx,
            vector=models.Document(
                text=doc["description"],
                model=EMBEDDING_MODEL
            ),
            payload=doc
        )
        for idx, doc in enumerate(documents)
    ],
)
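
To picture what each uploaded point carries, the sketch below rebuilds the same ID, text, and payload pairing with plain dictionaries. The function `describe_points` and the key `text_to_embed` are hypothetical names for illustration; they are not part of the Qdrant client, and nothing here is sent to Qdrant.

```python
def describe_points(docs, model_name):
    """Pair each document with an ID, the text to embed, and its payload."""
    return [
        {
            "id": idx,                           # position in the list
            "text_to_embed": doc["description"],  # only the description is embedded
            "model": model_name,
            "payload": doc,                      # every field is stored as payload
        }
        for idx, doc in enumerate(docs)
    ]


# Two entries taken from the dataset above.
sample_documents = [
    {
        "name": "Dune",
        "description": "A desert planet is the site of political intrigue and power struggles.",
        "author": "Frank Herbert",
        "year": 1965,
    },
    {
        "name": "The War of the Worlds",
        "description": "A Martian invasion of Earth throws humanity into chaos.",
        "author": "H.G. Wells",
        "year": 1898,
    },
]

for point in describe_points(sample_documents, "sentence-transformers/all-minilm-l6-v2"):
    print(point["id"], point["payload"]["name"])
```

Note that the description appears twice per point: once as the text to embed, and once inside the payload, where it is returned with search results.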

Now that the data is stored in Qdrant, you can query it and receive semantically relevant results.

In [None]:
hits = client.query_points(
    collection_name=COLLECTION_NAME,
    query=models.Document(
        text="alien invasion",
        model=EMBEDDING_MODEL
    ),
    limit=3,
).points

for hit in hits:
    print(hit.payload, "score:", hit.score)
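
Conceptually, `limit=3` keeps the three highest-scoring points, best first. The sketch below shows that selection step on made-up scores. The names and numbers are placeholders for illustration, not output from Qdrant, and `top_k` is a hypothetical helper.

```python
import heapq

# Made-up similarity scores for illustration only; real scores come from Qdrant.
toy_scores = {"book_a": 0.12, "book_b": 0.57, "book_c": 0.33, "book_d": 0.48}


def top_k(scores, limit):
    """Return the `limit` highest-scoring (name, score) pairs, best first."""
    return heapq.nlargest(limit, scores.items(), key=lambda item: item[1])


print(top_k(toy_scores, 3))
# [('book_b', 0.57), ('book_d', 0.48), ('book_c', 0.33)]
```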

How about the best match among books from the 2000s? Qdrant lets you narrow down query results by applying a filter. To restrict the search to books published in or after the year 2000, filter on the `year` field in the payload. Before filtering on a payload field, create a payload index for that field:

In [None]:
client.create_payload_index(
    collection_name=COLLECTION_NAME,
    field_name="year",
    field_schema="integer",
)

Now you can apply a filter to the query:

In [None]:
hits = client.query_points(
    collection_name=COLLECTION_NAME,
    query=models.Document(
        text="alien invasion",
        model=EMBEDDING_MODEL
    ),
    query_filter=models.Filter(
        must=[models.FieldCondition(key="year", range=models.Range(gte=2000))]
    ),
    limit=1,
).points

for hit in hits:
    print(hit.payload, "score:", hit.score)
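
The range condition `gte=2000` keeps only points whose `year` is greater than or equal to 2000. As a rough pure-Python equivalent of that condition, the sketch below applies the same test to a subset of the dataset. The helper `matches_year_range` is hypothetical and only mimics the comparison; it does not call Qdrant.

```python
# A subset of the dataset above, reduced to the fields needed here.
books = [
    {"name": "The Time Machine", "year": 1895},
    {"name": "Neuromancer", "year": 1984},
    {"name": "The Hunger Games", "year": 2008},
    {"name": "The Three-Body Problem", "year": 2008},
]


def matches_year_range(payload, gte=None, lte=None):
    """Return True if payload['year'] lies within the inclusive bounds given."""
    year = payload["year"]
    if gte is not None and year < gte:
        return False
    if lte is not None and year > lte:
        return False
    return True


recent = [book["name"] for book in books if matches_year_range(book, gte=2000)]
print(recent)  # ['The Hunger Games', 'The Three-Body Problem']
```

In the full dataset these are also the only two books from 2000 onward, so the filtered query with `limit=1` returns whichever of them scores higher for "alien invasion".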