# TiDB Vector

> [TiDB](https://github.com/pingcap/tidb) is an open-source, cloud-native, distributed, MySQL-Compatible database for elastic scale and real-time analytics.

TiDB has introduced support for vector search. This notebook provides a detailed guide to using TiDB vector search in LlamaIndex.

## Setting up the environment

In [None]:
%pip install llama-index
%pip install tidbvec

In [None]:
import textwrap
import openai

from llama_index import SimpleDirectoryReader, StorageContext
from llama_index.indices.vector_store import VectorStoreIndex
from llama_index.vector_stores.tidb_vector import TiDBVector

Configure the OpenAI and TiDB connection settings that you will need.

In [None]:
# Use getpass so that secrets are not echoed in the notebook output
import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")
tidb_connection_url = getpass.getpass(
    "TiDB connection URL (format - mysql+pymysql://root@127.0.0.1:4000/test): "
)
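Before passing the URL on, it can help to check that it has the expected shape. The sketch below is only an illustration: it uses Python's standard `urllib.parse` to split a connection string in the format shown in the prompt above. The helper name `describe_connection_url` is hypothetical and is not part of LlamaIndex or TiDB.

```python
from urllib.parse import urlsplit


def describe_connection_url(url: str) -> dict:
    """Split a SQLAlchemy-style URL into its parts (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "driver": parts.scheme,  # e.g. "mysql+pymysql"
        "user": parts.username,  # e.g. "root"
        "host": parts.hostname,  # e.g. "127.0.0.1"
        "port": parts.port,  # e.g. 4000
        "database": parts.path.lstrip("/"),  # e.g. "test"
    }


# The example URL from the prompt text; replace it with your own URL.
info = describe_connection_url("mysql+pymysql://root@127.0.0.1:4000/test")
print(info)
```

If the host, port, or database comes back empty or `None`, the URL probably does not match the documented format.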

Prepare the data used in this showcase.

In [None]:
%pip install pymysql
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

In [None]:
documents = SimpleDirectoryReader("./data/paul_graham").load_data()
print("Document ID:", documents[0].doc_id)
for document in documents:
    document.metadata = {"book": "paul_graham"}

## Create a TiDB Vector Store

The code snippet below creates a table in TiDB, optimized for vector search, whose name is the value of `COLLECTION_NAME` (here, `paul_graham_test`). After the code runs successfully, you can view and access the `paul_graham_test` table directly in your TiDB database.

In [None]:
COLLECTION_NAME = "paul_graham_test"
tidbvec = TiDBVector(
    connection_string=tidb_connection_url,
    collection_name=COLLECTION_NAME,
    pre_delete_collection=False,
)

Create a query engine based on the TiDB vector store.

In [None]:
storage_context = StorageContext.from_defaults(vector_store=tidbvec)
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context, show_progress=True
)
query_engine = index.as_query_engine()

## Semantic similarity search

This section focuses on vector search basics and on refining results using metadata filters.

In [None]:
response = query_engine.query("What did the author do?")
print(textwrap.fill(str(response), 100))
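To make the idea behind the query above more concrete, here is a minimal, self-contained sketch of nearest-neighbor ranking. It assumes cosine distance as the metric and uses made-up three-dimensional vectors; the chunk names and numbers are illustrative only and do not come from the essay or from TiDB.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Toy "embeddings" (assumed values); real embeddings have many more dimensions.
chunks = {
    "essay about painting": [0.9, 0.1, 0.0],
    "essay about startups": [0.1, 0.9, 0.1],
    "essay about Lisp": [0.0, 0.2, 0.9],
}
query_vector = [0.2, 0.8, 0.0]

# Rank chunks from nearest to farthest and keep the two nearest.
ranked = sorted(chunks, key=lambda name: cosine_distance(chunks[name], query_vector))
print(ranked[:2])
```

In the notebook itself, the embedding and ranking steps are handled by the query engine and the vector store; this sketch only shows the kind of comparison involved.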

### Filter with metadata

Perform searches using metadata filters to retrieve a specific number of nearest-neighbor results that match the applied filters.
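As a rough illustration of what filtered search means, the sketch below first keeps only rows whose metadata matches and then returns the `k` nearest of those. It is plain Python, not the LlamaIndex or TiDB API: the rows, the precomputed distances, and the helper `filtered_top_k` are all hypothetical. Only the `book` metadata key mirrors the one set on the documents earlier.

```python
import heapq

# Hypothetical rows: each has metadata and a precomputed distance to the query.
rows = [
    {"text": "chunk A", "metadata": {"book": "paul_graham"}, "distance": 0.12},
    {"text": "chunk B", "metadata": {"book": "other_author"}, "distance": 0.05},
    {"text": "chunk C", "metadata": {"book": "paul_graham"}, "distance": 0.30},
    {"text": "chunk D", "metadata": {"book": "paul_graham"}, "distance": 0.08},
]


def filtered_top_k(rows, key, value, k):
    """Keep rows whose metadata[key] == value, then return the k nearest."""
    matching = (row for row in rows if row["metadata"].get(key) == value)
    return heapq.nsmallest(k, matching, key=lambda row: row["distance"])


top = filtered_top_k(rows, key="book", value="paul_graham", k=2)
print([row["text"] for row in top])
```

Note that "chunk B" is the nearest row overall but is excluded because its metadata does not match the filter.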