## FastAPI Tutorial

This tutorial requires `fastapi` and `uvicorn`. Install them by running:

`pip install fastapi uvicorn`

To start the FastAPI server, open a terminal and run the following command from the root directory of this project:

`uvicorn api.fastapi:app --host 0.0.0.0 --port 8000`
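
Before sending requests, it can help to confirm that the server is accepting connections. The following is a minimal sketch using only the standard library. The helper name `port_is_open` is hypothetical (it is not part of minDB or FastAPI), and the example assumes the server is reachable at `127.0.0.1` on port 8000, matching the command above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False


if __name__ == "__main__":
    # Assumption: the uvicorn command above was started on port 8000
    print(port_is_open("127.0.0.1", 8000))
```

This only checks that something is listening on the port; it does not verify that the minDB routes are available.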

### Set up the environment

Load the necessary packages and append the required paths to `sys.path`.

In [None]:
import requests
import pickle
import sys
import os

# Load in minDB from the local directory
current_dir = os.getcwd()
sys.path.append(current_dir + "/../")
sys.path.append(current_dir + "/../tests/integration/")

from mindb.mindb import minDB
from tests.data import helpers

In [None]:
# Load in the Fiqa test data
vectors, text, queries, _ = helpers.fiqa_test_data()
with open(current_dir + "/../tests/data/fiqa_queries_text.pickle", "rb") as f:
    query_text = pickle.load(f)
# The vectors must be a plain list when using FastAPI (NumPy arrays are not JSON serializable)
vectors = vectors.tolist()


### Create the minDB object

In [None]:
# Create a new minDB

db_name = "fast_api_test"
url = "http://0.0.0.0:8000/db/create"
response = requests.post(url, json={"name": db_name})
print(response.text)

### Add data to the minDB object

Data must be added to the minDB object in batches when going through FastAPI. We recommend a batch size of about 100. Pushing this number too high will cause the request to fail.

The data must also be a list. NumPy arrays are not a valid data type for FastAPI.

In [None]:
# Add the data to the minDB in batches of 100
batch_size = 100
data = [(vectors[i], {"text": text[i]}) for i in range(len(vectors))]

url = f"http://0.0.0.0:8000/db/{db_name}/add"

for i in range(0, 10000, batch_size):
    print(i)
    add_data = data[i:i+batch_size]
    response = requests.post(url, json={"add_data": add_data})
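
The batching step above can be sketched as a small, reusable generator. This is an illustrative helper, not part of minDB: the name `iter_batches` and the toy data are assumptions made for the example, and no request is sent here.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Toy data in the same (vector, metadata) shape as the cell above
toy_data = [([float(i), float(i) + 0.5], {"text": f"doc {i}"}) for i in range(250)]
batch_sizes = [len(batch) for batch in iter_batches(toy_data, 100)]
print(batch_sizes)  # [100, 100, 50]
```

Each batch could then be posted with `requests.post(url, json={"add_data": batch})` as in the cell above. Calling `response.raise_for_status()` after each request is one way to notice a failed batch early.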

### Train the minDB object

For this example, we use a PCA dimension of 256 and 32 compressed vector bytes, and we omit OPQ.

For more information on these parameters, see the [Tunable parameters](https://github.com/D-Star-AI/minDB/wiki/Tunable-parameters) page of the GitHub wiki.

In [None]:
# Train the minDB

url = f"http://0.0.0.0:8000/db/{db_name}/train"
response = requests.post(url, json={
    "use_two_level_clustering": False,
    "pca_dimension": 256,
    "compressed_vector_bytes": 32,
    "omit_opq": True
})
print(response.text)

### Query the trained index

Make a test query using the `query` endpoint. The query vector must be converted to a list before it is sent.

In [None]:
url = f"http://0.0.0.0:8000/db/{db_name}/query"
query_vector = queries[0].tolist()
response = requests.post(url, json={"query_vector": query_vector})

print("Query text:", query_text[0])
print("")
print(response.json()["metadata"][0]["text"])
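
To look at more than the first result, the parsed response can be passed through a small helper. This is a sketch under one assumption: that the response JSON is a dictionary with a `metadata` list whose entries carry a `text` field, which is how the cell above indexes it. The name `top_texts` and the sample payload are hypothetical.

```python
from typing import Any, Dict, List


def top_texts(payload: Dict[str, Any], k: int = 3) -> List[str]:
    """Return the `text` field of each of the first k metadata entries that has one."""
    entries = payload.get("metadata", [])[:k]
    return [entry["text"] for entry in entries if "text" in entry]


# Hypothetical payload shaped like the response indexed in the cell above
sample = {
    "metadata": [
        {"text": "first result"},
        {"text": "second result"},
        {},
        {"text": "fourth result"},
    ]
}
print(top_texts(sample, k=3))  # ['first result', 'second result']
```

With a live server, the same helper could be called as `top_texts(response.json(), k=5)`.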