# Lab #1
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/basic-operations-workshop/blob/main/lab1.ipynb)
1. Install the Pinecone client
2. Initialize the Pinecone client and create your first index
3. Insert vectors and get statistics about your index
4. Query for top_k=10 with a metadata filter
5. TEARDOWN: Delete the index

# 1. Install the Pinecone client
Use the following shell command to install the Pinecone client (with gRPC support) and python-dotenv:

In [1]:
!pip install -U "pinecone-client[grpc]" "python-dotenv"


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m23.1.2[0m[39;49m -> [0m[32;49m23.2.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


# 2. Initialize the Pinecone client and create your first index

* To use Pinecone, you must have an API key. To find your API key, open the [Pinecone console](https://app.pinecone.io/organizations/-NF9xx-MFLRfp0AAuCon/projects/us-east4-gcp:55a4eee/indexes) and click API Keys. This view also displays the environment for your project. Note both your API key and your environment.
* Create a `.env` file and make sure it defines the following properties:

```
PINECONE_API_KEY=[YOUR_PINECONE_API_KEY]
PINECONE_ENVIRONMENT=[YOUR_PINECONE_ENVIRONMENT]
PINECONE_INDEX_NAME=[YOUR_INDEX_NAME]
DIMENSIONS="768"
METRIC="euclidean"
```

* Creating the index takes roughly one minute. Once it completes, a list of all indexes in the project is printed.

In [2]:
import os

from dotenv import load_dotenv
load_dotenv('.env')

PINECONE_INDEX_NAME = os.environ['PINECONE_INDEX_NAME']
PINECONE_API_KEY = os.environ['PINECONE_API_KEY']
PINECONE_ENVIRONMENT = os.environ['PINECONE_ENVIRONMENT']
DIMENSIONS = int(os.environ['DIMENSIONS'])
METRIC = os.environ['METRIC']

# Print the values to verify them. The API key is masked so that the
# secret does not end up in the notebook output.
print(f"PINECONE_INDEX_NAME: {PINECONE_INDEX_NAME}")
print(f"PINECONE_ENVIRONMENT: {PINECONE_ENVIRONMENT}")
print(f"PINECONE_API_KEY: {'*' * 8}{PINECONE_API_KEY[-4:]}")
print(f"DIMENSIONS: {DIMENSIONS}")
print(f"METRIC: {METRIC}")

PINECONE_INDEX_NAME: james-williams
PINECONE_ENVIRONMENT: us-east4-gcp
PINECONE_API_KEY: ********d793
DIMENSIONS: 768
METRIC: euclidean
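
Before creating an index it can help to check the settings up front, since a typo in `.env` otherwise only surfaces as an API error. The following is a minimal sketch, not part of the lab itself: `validate_config` and `SUPPORTED_METRICS` are hypothetical names introduced here, and the example values are placeholders.

```python
# Hypothetical helper (not part of the lab): sanity-check the .env values
# before they are passed to Pinecone.
SUPPORTED_METRICS = {"cosine", "dotproduct", "euclidean"}

REQUIRED_SETTINGS = [
    "PINECONE_API_KEY",
    "PINECONE_ENVIRONMENT",
    "PINECONE_INDEX_NAME",
    "DIMENSIONS",
    "METRIC",
]


def validate_config(env):
    """Return (dimensions, metric) or raise ValueError with a readable message.

    `env` can be any mapping with a .get() method, such as os.environ or a dict.
    """
    missing = [name for name in REQUIRED_SETTINGS if not env.get(name)]
    if missing:
        raise ValueError(f"Missing settings: {', '.join(missing)}")

    try:
        dimensions = int(env["DIMENSIONS"])
    except ValueError:
        raise ValueError("DIMENSIONS must be an integer") from None
    if dimensions <= 0:
        raise ValueError("DIMENSIONS must be positive")

    metric = env["METRIC"].lower()
    if metric not in SUPPORTED_METRICS:
        raise ValueError(f"METRIC must be one of {sorted(SUPPORTED_METRICS)}")
    return dimensions, metric


# Placeholder values for illustration only.
example = {
    "PINECONE_API_KEY": "placeholder-key",
    "PINECONE_ENVIRONMENT": "us-east4-gcp",
    "PINECONE_INDEX_NAME": "my-index",
    "DIMENSIONS": "768",
    "METRIC": "euclidean",
}
print(validate_config(example))
```

In the notebook, the same function could be called as `validate_config(os.environ)` right after `load_dotenv`.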


In [3]:
import pinecone

pinecone.init(api_key=PINECONE_API_KEY, environment=PINECONE_ENVIRONMENT)
pinecone.create_index(PINECONE_INDEX_NAME, dimension=DIMENSIONS, metric=METRIC, pods=1, replicas=1, pod_type="s1.x1")
pinecone.list_indexes()



['james-williams']
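
Because index creation takes about a minute, code that starts the creation without blocking (or that runs in a separate process) may need to wait until the index is ready. The sketch below shows a generic polling loop; `wait_until` is a hypothetical helper, and the readiness check shown in the trailing comment is an assumption about the installed client version that should be verified before use.

```python
import time


def wait_until(is_ready, timeout_s=300.0, poll_s=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll is_ready() until it returns True or timeout_s seconds elapse.

    Returns True if the condition was met, False on timeout.
    `sleep` and `clock` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


# Demonstration with a stand-in check that succeeds on the third poll.
attempts = {"count": 0}


def ready_on_third_poll():
    attempts["count"] += 1
    return attempts["count"] >= 3


print(wait_until(ready_on_third_poll, timeout_s=10, poll_s=0))

# Assumed usage with the client from this lab (verify against your version):
# wait_until(lambda: pinecone.describe_index(PINECONE_INDEX_NAME).status["ready"])
```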

# 3. Insert vectors and get statistics about your index

* The upsert operation inserts a new vector in the index or updates the vector if a vector with the same ID is already present.
* The following code upserts a batch of 500 vectors with metadata into your index.

In [4]:
import numpy as np
import random
import time

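def generate_vectors(dimensions):
    """Build 500 constant-valued vectors (0.1, 0.2, ...) with a random category and a timestamp as metadata."""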
    vectors = []
    id_seed = 1
    value_seed = 0.1

    for _ in range(500):
        meta_data = {"category": random.choice(["one", "two", "three"]),
                     "timestamp": time.time()}
        embeddings = np.full(shape=dimensions, fill_value=value_seed).tolist()
        vectors.append({'id': str(id_seed),
                        'values': embeddings,
                        'metadata': meta_data})
        id_seed = id_seed + 1
        value_seed = value_seed + 0.1
    return vectors

index = pinecone.Index(PINECONE_INDEX_NAME)
index.upsert(generate_vectors(DIMENSIONS))
index.describe_index_stats()

{'dimension': 768,
 'index_fullness': 0.0,
 'namespaces': {'': {'vector_count': 500}},
 'total_vector_count': 500}
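
For larger datasets, a common pattern is to send vectors in smaller batches rather than in one request. The sketch below illustrates that pattern together with the insert-or-update behavior of upsert. It uses an in-memory dictionary as a stand-in for the index; `chunked`, `fake_upsert`, and the batch size of 100 are assumptions made for illustration, not part of the lab.

```python
def chunked(items, batch_size=100):
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# In-memory stand-in for an index, keyed by vector ID (illustration only).
store = {}


def fake_upsert(batch):
    """Insert each vector, or overwrite the stored one that shares its ID."""
    for vec in batch:
        store[vec["id"]] = vec


vectors = [{"id": str(i), "values": [0.1 * i]} for i in range(1, 251)]

for batch in chunked(vectors):
    fake_upsert(batch)

# Upserting the same IDs again overwrites the entries instead of adding new ones.
for batch in chunked(vectors):
    fake_upsert(batch)

print([len(b) for b in chunked(vectors)])  # [100, 100, 50]
print(len(store))                          # 250
```

With the real index from this lab, the loop body would be `index.upsert(vectors=batch)` instead of `fake_upsert(batch)`.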

# 4. Query for top_k=10 with a metadata filter

The following example queries the index for the 10 vectors that are most similar to the embedding and whose category equals "one". Because the index uses the euclidean metric, a lower score means a closer match.

In [7]:
embedding = np.full(DIMENSIONS, 0.5).tolist()

index.query(
    vector=embedding,
    top_k=10,
    include_values=False,
    include_metadata=True,
    filter={"category": {"$eq": "one"}},
)

{'matches': [{'id': '5',
              'metadata': {'category': 'one', 'timestamp': 1690907825.511887},
              'score': 0.0,
              'values': []},
             {'id': '4',
              'metadata': {'category': 'one', 'timestamp': 1690907825.511869},
              'score': 7.68005371,
              'values': []},
             {'id': '8',
              'metadata': {'category': 'one', 'timestamp': 1690907825.5119371},
              'score': 69.1201782,
              'values': []},
             {'id': '1',
              'metadata': {'category': 'one', 'timestamp': 1690907825.5117729},
              'score': 122.880013,
              'values': []},
             {'id': '11',
              'metadata': {'category': 'one', 'timestamp': 1690907825.5119848},
              'score': 276.480225,
              'values': []},
             {'id': '12',
              'metadata': {'category': 'one', 'timestamp': 1690907825.511997},
              'score': 376.319946,
              'values':
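
The scores in the output above are consistent with the squared Euclidean distance between the query and each stored vector: for example, ID 4 holds 0.4 in every component, and 768 × (0.4 − 0.5)² ≈ 7.68. The brute-force sketch below reproduces that behavior on a handful of vectors as a mental model of a filtered query. `brute_force_query` is a hypothetical helper, the categories are fixed by hand rather than random, and this is not how Pinecone executes queries internally.

```python
import numpy as np


def brute_force_query(vectors, query, top_k, category):
    """Keep vectors whose category matches, then rank by squared Euclidean distance."""
    q = np.asarray(query, dtype=float)
    matches = []
    for vec in vectors:
        # Equivalent of the {"category": {"$eq": category}} filter.
        if vec["metadata"].get("category") != category:
            continue
        diff = np.asarray(vec["values"], dtype=float) - q
        matches.append({"id": vec["id"], "score": float(diff @ diff)})
    # Smaller distance means more similar, so sort ascending.
    matches.sort(key=lambda m: m["score"])
    return matches[:top_k]


DIMENSIONS = 768

# Vectors 1, 4, 5 and 8 are placed in category "one"; vector 6 is in "two".
vectors = [
    {"id": str(i), "values": [0.1 * i] * DIMENSIONS, "metadata": {"category": "one"}}
    for i in (1, 4, 5, 8)
]
vectors.append(
    {"id": "6", "values": [0.6] * DIMENSIONS, "metadata": {"category": "two"}}
)

for match in brute_force_query(vectors, [0.5] * DIMENSIONS, top_k=10, category="one"):
    print(match["id"], round(match["score"], 2))
```

The printed order should be 5, 4, 8, 1, with scores close to the first four in the output above. Vector 6 is as close to the query as vector 4, but the filter excludes it.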

# 5. TEARDOWN: Delete the index

Free up project pod resources by deleting this index. It is no longer needed.

In [8]:
pinecone.delete_index(PINECONE_INDEX_NAME)