## Getting Started

This tutorial shows how to use minDB, using the FiQA dataset from BEIR as an example. You can find more information about the BEIR datasets [here](https://github.com/beir-cellar/beir).

### Set up the environment

First, set up the environment by importing the required libraries and appending the local paths needed to import minDB and the test helpers.

In [None]:
import os
import sys
import numpy as np
import pickle

# Load in minDB from the local directory
current_dir = os.getcwd()
sys.path.append(current_dir + "/../")
sys.path.append(current_dir + "/../tests/integration/")

from mindb.mindb import minDB, load_db
from tests.data import helpers

### Load in test data

In [None]:
# Load in the Fiqa test data
vectors, text, queries, _ = helpers.fiqa_test_data()
with open(current_dir + "/../tests/data/fiqa_queries_text.pickle", "rb") as f:
    query_text = pickle.load(f)

print(len(vectors))
print(type(vectors[0][0]))

### Create the minDB object

In [None]:
# Create the minDB
db_name = "fiqa_test"
db = minDB(db_name)

### Load in the minDB

This step is optional. It only shows how to load a minDB object that has already been created.

In [None]:
# Optional: Load in the minDB object
db = load_db(db_name)

### Add data to the minDB

The data must be a list of tuples, where each tuple contains `(vector, metadata)`. In this example, the metadata is a dictionary holding the text of each item.

In [None]:
# Add the data to the minDB
add_data = [(vectors[i], {"text": text[i]}) for i in range(len(vectors))]
db.add(add_data)
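
The `(vector, metadata)` format described above can be sketched with toy data. This is a minimal illustration, not part of minDB: the names `toy_vectors`, `toy_text`, and `toy_data` are hypothetical stand-ins for the real embeddings and passages, and the checks are only a suggestion for validating a list before passing it to `db.add()`.

```python
import numpy as np

# Toy stand-ins for the real embeddings and passages (hypothetical data)
rng = np.random.default_rng(0)
toy_vectors = rng.random((3, 4), dtype=np.float32)
toy_text = ["first passage", "second passage", "third passage"]

# Pair each vector with a metadata dictionary, one tuple per item
toy_data = [(vec, {"text": txt}) for vec, txt in zip(toy_vectors, toy_text)]

# Basic sanity checks before handing the list to db.add()
assert len(toy_vectors) == len(toy_text)
for vec, meta in toy_data:
    assert vec.shape == (4,)
    assert "text" in meta

print(len(toy_data), toy_data[0][1])
```

Using `zip` instead of indexing by `range(len(vectors))` gives the same list and avoids index mismatches when the two inputs differ in length by accident.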

In [None]:
# Get info
print(db.vector_dimension)

### Train the Faiss index

For this example, we use a PCA dimension of 256 and 32 compressed vector bytes, and we omit OPQ.

For more information on these parameters, see the GitHub Wiki [here](https://github.com/SuperpoweredAI/spDB/wiki/Tunable-parameters).

In [None]:
# Train the minDB
db.train(True, pca_dimension=256, compressed_vector_bytes=32, omit_opq=True)
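
To get a feel for what these settings imply, the arithmetic can be sketched as follows. The original embedding dimension of 768 is an assumption for illustration (print `db.vector_dimension` for the real value), and the sub-vector calculation assumes `compressed_vector_bytes` corresponds to product-quantization codes of 8 bits each, which this tutorial does not confirm.

```python
# Assumed values for illustration; check db.vector_dimension for the real one
original_dimension = 768        # hypothetical embedding size
bytes_per_float32 = 4
pca_dimension = 256
compressed_vector_bytes = 32

# Storage per vector before compression and after PCA alone
raw_bytes = original_dimension * bytes_per_float32
after_pca_bytes = pca_dimension * bytes_per_float32

# Overall reduction from raw float32 storage to the compressed code
compression_ratio = raw_bytes / compressed_vector_bytes

# Assuming 8-bit product-quantization codes, each byte encodes one sub-vector
dims_per_subvector = pca_dimension // compressed_vector_bytes

print(raw_bytes, after_pca_bytes, compression_ratio, dims_per_subvector)
```

Under these assumptions, a smaller `compressed_vector_bytes` saves memory but packs more dimensions into each byte, which generally costs recall.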

### Query the trained index

Make a test query using the `db.query()` method. The result contains a `metadata` list, and the first entry holds the metadata of the top match.

In [None]:
# Make a test query
results = db.query(queries[0])
print("Query text:", query_text[0])
print()
print(results["metadata"][0]["text"])
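
Because a compressed index returns approximate matches, it can help to compare a few queries against an exact search. The sketch below is a brute-force baseline in NumPy, not part of minDB: `exact_top_k` is a hypothetical helper, and cosine similarity is an assumption, since this tutorial does not state which distance metric minDB uses.

```python
import numpy as np

def exact_top_k(query, vectors, k=5):
    """Return indices of the k vectors most similar to query (cosine)."""
    vectors = np.asarray(vectors, dtype=np.float32)
    query = np.asarray(query, dtype=np.float32)
    # Normalize so the dot product equals cosine similarity
    vectors_n = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    query_n = query / np.linalg.norm(query)
    scores = vectors_n @ query_n
    # argsort is ascending, so reverse it and keep the first k
    return np.argsort(scores)[::-1][:k]

# Toy data to show the call; swap in the real vectors and queries[0]
toy_vectors = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
toy_query = np.array([0.9, 0.1])
print(exact_top_k(toy_query, toy_vectors, k=2))
```

With the tutorial data, the returned indices could be used to look up `text[i]` and compare those passages with the ones returned by `db.query()`.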