# 🚀 CRUD Operations in Weaviate

Welcome to the core of database management! In this notebook, we will explore the **CRUD** lifecycle (Create, Read, Update, Delete) within a Vector Database. 

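Unlike traditional databases, a vector database stores a vector alongside each object. When a class is configured with a vectorizer module (as ours will be), every time we create or update an object, Weaviate calls an ML model to transform our text into a high-dimensional vector. This allows for semantic retrieval later on.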
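To make "semantic retrieval" a little more concrete, here is a minimal sketch of how two vectors can be compared with cosine similarity. The three-dimensional vectors and their names are made up for illustration; real embeddings come from the model and have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy "embeddings" (not produced by any real model)
database_vec = [0.9, 0.1, 0.0]
storage_vec = [0.8, 0.2, 0.1]
banana_vec = [0.0, 0.1, 0.9]

# Vectors pointing in similar directions score closer to 1
print(cosine_similarity(database_vec, storage_vec))
print(cosine_similarity(database_vec, banana_vec))
```

Texts with related meanings should end up with vectors that score higher against each other than against unrelated texts, which is the property a semantic search relies on.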

### Step 1: Environment Setup
First, we fetch a tiny slice of Jeopardy data and initialize our embedded Weaviate instance.

In [None]:
import requests
import json

# Download the data
resp = requests.get('https://raw.githubusercontent.com/weaviate-tutorials/quickstart/main/data/jeopardy_tiny.json')
resp.raise_for_status()  # Fail early if the download did not succeed
data = json.loads(resp.text)  # Parse the JSON

# Preview the parsed data
print(type(data), len(data))
print(json.dumps(data[0], indent=2))

def json_print(obj):
    """Pretty-print any JSON-serializable object."""
    print(json.dumps(obj, indent=2))

In [None]:
# Start Weaviate in embedded mode and pass an OpenAI API key
# (read from the OPENAI_API_KEY environment variable when it is set)

import weaviate
from weaviate import EmbeddedOptions
import os

client = weaviate.Client(
    embedded_options=EmbeddedOptions(),
    additional_headers={
        "X-OpenAI-Api-Key": os.environ.get("OPENAI_API_KEY", "YOUR_KEY_HERE")
    }
)

In [None]:
# Start from a clean slate: drop the "Question" class if it already exists
if client.schema.exists("Question"):
    client.schema.delete_class("Question")

In [None]:
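# Define the class and the module that vectorizes its objects
class_obj = {
    "class": "Question",
    "vectorizer": "text2vec-openai",  # Embeddings are generated through OpenAI
}

client.schema.create_class(class_obj)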

### ✨ C is for CREATE

To create an object, we use the `data_object.create` method. We can provide a custom **UUID** (Universally Unique Identifier). If we don't, Weaviate will generate one for us. 
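If you would rather not hard-code an ID, one option is to derive a deterministic UUID from the object's content, so the same data always maps to the same ID. The sketch below uses only Python's standard `uuid` module; the helper name `deterministic_uuid` and the choice of namespace are assumptions made for illustration, not part of the notebook.

```python
import json
import uuid

def deterministic_uuid(properties, namespace=uuid.NAMESPACE_DNS):
    """Derive a version-5 UUID from an object's properties.

    Hypothetical helper: the properties are serialized with sorted keys,
    so key order does not change the resulting ID.
    """
    canonical = json.dumps(properties, sort_keys=True)
    return str(uuid.uuid5(namespace, canonical))

question = {
    "question": "This vector database is open-source and very cool",
    "answer": "Weaviate",
    "category": "Software",
}

# The same content always yields the same UUID
print(deterministic_uuid(question))
```

The resulting string could be passed as the `uuid` argument when creating the object, which makes re-running an import less likely to produce duplicates.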



In [None]:
#Create an object
client.data_object.create(
    data_object={
        "question": "This vector database is open-source and very cool",
        "answer": "Weaviate",
        "category": "Software"
    },
    class_name="Question",
    uuid="d466453b-e7b3-442f-b1ef-becac6b9c7e1" # Manually setting a UUID for easy retrieval
)
print("Object Created!")

### 📖 R is for READ

We can retrieve an object directly using its UUID. In a vector database, we can choose to return just the data properties, or include the **vector embedding**, the numerical representation of the text.
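Embeddings can run to well over a thousand numbers, so printing a whole object with its vector is rarely useful. A small helper like the one sketched here can summarize a returned object instead. It assumes the object is a plain dict with a `properties` key and an optional `vector` key; the function name `preview_object` and the sample data are hypothetical.

```python
def preview_object(obj, dims=5):
    """Summarize an object dict: its properties plus a short vector preview."""
    summary = {"properties": obj.get("properties", {})}
    vector = obj.get("vector")
    if vector is not None:
        # Report the dimensionality and only the first few values
        summary["vector_dims"] = len(vector)
        summary["vector_preview"] = vector[:dims]
    return summary

# Made-up object in the assumed shape (not real model output)
sample = {
    "properties": {"answer": "Weaviate"},
    "vector": [0.1, -0.2, 0.3, 0.4, -0.5, 0.6],
}

print(preview_object(sample))
```

When the object was fetched without its vector, the summary simply contains the properties.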



In [None]:
#Read the object that we just created using its ID
data_object = client.data_object.get_by_id(
    'd466453b-e7b3-442f-b1ef-becac6b9c7e1',
    class_name='Question'
)

json_print(data_object)

In [None]:
#Extract the vector for this object

data_object = client.data_object.get_by_id(
    'd466453b-e7b3-442f-b1ef-becac6b9c7e1',
    class_name='Question',
    with_vector=True
)

print("Object properties + first 5 dimensions of the vector:")
print(json.dumps(data_object['properties'], indent=2))
print(data_object['vector'][:5], "...")

### 🔄 U is for UPDATE

In Weaviate, you can perform a **Patch** (a partial update, which is what `data_object.update` does) or a full **Replace** (`data_object.replace`). When you update a property that feeds into vectorization (like the answer), Weaviate automatically recalculates the vector so your semantic search stays accurate!
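The difference between the two update styles is easiest to see on plain dictionaries. This sketch only models the semantics locally and does not talk to Weaviate; the function names `patch` and `replace` are illustrative stand-ins, not client methods.

```python
def patch(existing, changes):
    """Partial update: merge the changed properties into the existing ones."""
    merged = dict(existing)
    merged.update(changes)
    return merged

def replace(existing, new_properties):
    """Full replace: the new properties become the entire object."""
    return dict(new_properties)

stored = {
    "question": "This vector database is open-source and very cool",
    "answer": "Weaviate",
    "category": "Software",
}
changes = {"answer": "Weaviate (An open-source vector database)"}

print(patch(stored, changes))    # question and category are kept
print(replace(stored, changes))  # only the answer remains
```

With a replace, any property you leave out is lost, so a patch is usually the safer choice when only one field changes.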

In [None]:
#Update the object with a more detailed answer
client.data_object.update(
    data_object={
        "answer": "Weaviate (An open-source vector database)"
    },
    class_name="Question",
    uuid="d466453b-e7b3-442f-b1ef-becac6b9c7e1"
)
print("Object Updated!")

In [None]:
# Read the object back to confirm that the answer changed
data_object = client.data_object.get_by_id(
    'd466453b-e7b3-442f-b1ef-becac6b9c7e1',
    class_name='Question',
)

json_print(data_object)

### 🗑️ D is for DELETE

The final stage of the lifecycle. Deleting an object removes both its stored properties and its vector from the vector index (an HNSW graph by default).
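After a delete, an aggregate count is a quick way to confirm that nothing is left in the class. The helper sketched here pulls the count out of an aggregate response. The nested `data` / `Aggregate` / class name / `meta` / `count` layout is an assumption about the response shape, and both `extract_count` and the sample response are hypothetical, so compare them against the output you actually see.

```python
def extract_count(response, class_name):
    """Return the object count from an aggregate response dict.

    Assumes the shape:
    {"data": {"Aggregate": {<class_name>: [{"meta": {"count": N}}]}}}
    """
    results = response["data"]["Aggregate"][class_name]
    return results[0]["meta"]["count"]

# Hand-written sample in the assumed shape (not captured from a live instance)
sample_response = {
    "data": {"Aggregate": {"Question": [{"meta": {"count": 0}}]}}
}

print(extract_count(sample_response, "Question"))
```

A count of zero for the class indicates that the deleted object was the only one stored.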

In [None]:
#Delete the object using its ID
client.data_object.delete(
    uuid="d466453b-e7b3-442f-b1ef-becac6b9c7e1",
    class_name="Question"
)
print("Object Deleted!")

In [None]:
# Verify that the "Question" class is now empty

json_print(client.query.aggregate("Question").with_meta_count().do())