# Initial Setup

## Install Weaviate Python Client v4
> This notebook was created with Weaviate `1.23.6` and the Weaviate Client `4.4.rc1`

Run the command below to install the Weaviate Python Client v4, pinned to the release candidate used in this notebook.

In [None]:
!pip install --pre -I "weaviate-client==4.4.rc1"

## Deploy Weaviate

Weaviate offers three deployment options:
* Embedded
* Self-hosted - with Docker Compose
* Cloud deployment - [Weaviate Cloud Service](https://console.weaviate.cloud/)

# Time to Build

## Connect to Weaviate

* If you are new to OpenAI, register at [https://platform.openai.com](https://platform.openai.com/) and head to [https://platform.openai.com/api-keys](https://platform.openai.com/api-keys) to create your API key.
* If you are new to Cohere, register at [https://cohere.com](https://cohere.com) and head to [https://dashboard.cohere.com/api-keys](https://dashboard.cohere.com/api-keys) to create your API key.

In [None]:
import weaviate, os

# Connect with Weaviate Embedded
# client = weaviate.connect_to_embedded(
#     version="1.23.6",
#     headers={
#         "X-OpenAI-Api-Key": os.environ['OPENAI_API_KEY'], # Replace with your inference API key
#         # "X-Cohere-Api-Key": os.environ['COHERE_API_KEY'], # Replace with your inference API key
#     })

# Connect to the local instance deployed with Docker Compose
client = weaviate.connect_to_local(
    headers={
        "X-OpenAI-Api-Key": os.environ['OPENAI_API_KEY'], # Replace with your inference API key
        "X-Cohere-Api-Key": os.environ['COHERE_API_KEY'], # Replace with your inference API key
    }
)

client.is_ready()
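
The cell above reads both keys with `os.environ[...]`, which raises a `KeyError` if either variable is unset. As a minimal sketch, assuming you only want to send the keys you actually have, the headers could be built like this. `build_headers` and `KEY_HEADERS` are hypothetical names introduced here for illustration; they are not part of the Weaviate client.

```python
import os

# Hypothetical mapping (not part of the Weaviate client): environment
# variable name -> header name used in the connection cell above.
KEY_HEADERS = {
    "OPENAI_API_KEY": "X-OpenAI-Api-Key",
    "COHERE_API_KEY": "X-Cohere-Api-Key",
}

def build_headers(env=None):
    """Return inference headers for every API key that is set in `env`."""
    env = os.environ if env is None else env
    headers = {
        header: env[var]
        for var, header in KEY_HEADERS.items()
        if env.get(var)  # skip missing or empty variables
    }
    if not headers:
        raise RuntimeError("Set OPENAI_API_KEY and/or COHERE_API_KEY before connecting.")
    return headers
```

The result can then be passed to the connection call, for example `weaviate.connect_to_local(headers=build_headers())`.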

## Create a collection
[Weaviate Docs - collection creation and configuration](https://weaviate.io/developers/weaviate/configuration/schema-configuration)

In [None]:
import weaviate.classes as wvc

if client.collections.exists("Questions"):
    client.collections.delete("Questions")

# Create a collection here - with Cohere as a vectorizer
client.collections.create(
    name="Questions",
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_cohere()
)

## Import data

### Sample Data

In [None]:
import requests, json

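def load_data(path):
    # Download a JSON file and parse it into Python objects
    resp = requests.get(path)
    resp.raise_for_status()  # Fail early on HTTP errors instead of parsing an error page
    return resp.json()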

data_10 = load_data("https://raw.githubusercontent.com/weaviate-tutorials/multimodal-workshop/main/1-intro/jeopardy_tiny.json")

print(json.dumps(data_10, indent=2))

### Insert Many
[Weaviate Docs - insert many](https://weaviate.io/developers/weaviate/manage-data/import)

In [None]:
# Insert data
questions = client.collections.get("Questions")
questions.data.insert_many(data_10)

### Data preview

In [None]:
# Show data preview
questions = client.collections.get("Questions")
response = questions.query.fetch_objects(limit=4)

for item in response.objects:
    print(item.uuid, item.properties)

In [None]:
# Show data preview - with vectors
questions = client.collections.get("Questions")
response = questions.query.fetch_objects(
    limit=4,
    include_vector=True
)

for item in response.objects:
    print(item.properties)
    print(item.vector, '\n')

### Super quick query example

In [None]:
response = questions.query.near_text(
    "Afrikan animals",
    # "Zwierzęta afrykańskie", #African animals in Polish
    # "アフリカの動物", #African animals in Japanese
    limit=2
)

for item in response.objects:
    print(item.properties)
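
`near_text` embeds the query with the collection's vectorizer and returns the objects whose vectors lie closest to it, which is why a misspelled or translated query can still find relevant results. The ranking idea can be sketched in plain Python, assuming cosine similarity as the measure. The vectors below are invented three-dimensional toys, not real embeddings, and `cosine_similarity` is a helper written for this illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors, invented for illustration only
query = [0.9, 0.1, 0.0]
objects = {
    "elephant": [0.8, 0.2, 0.1],
    "trombone": [0.0, 0.1, 0.9],
}

# Rank object names from most to least similar to the query
ranked = sorted(
    objects,
    key=lambda name: cosine_similarity(query, objects[name]),
    reverse=True,
)
print(ranked)  # ['elephant', 'trombone']
```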

## Create a collection with OpenAI and Generative module

In [None]:
# New collection with the OpenAI vectorizer and generative model (1k objects are imported in the next section)

import weaviate.classes as wvc

if client.collections.exists("Questions"):
    client.collections.delete("Questions")

# Create a collection here - with OpenAI as the vectorizer and generative model
client.collections.create(
    name="Questions",
    vectorizer_config=wvc.config.Configure.Vectorizer.text2vec_openai(),
    generative_config=wvc.config.Configure.Generative.openai(model="gpt-4")
)

### Import data - 1k objects

In [None]:
data_1k = load_data("https://raw.githubusercontent.com/weaviate-tutorials/multimodal-workshop/main/1-intro/jeopardy_1k.json")

print(json.dumps(data_1k, indent=2))

In [None]:
# Insert data
questions = client.collections.get("Questions")
questions.data.insert_many(data_1k)
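
For larger datasets it may be preferable to send the objects in smaller groups rather than in one `insert_many` call. A minimal sketch of splitting a list into batches follows; `chunked` is a hypothetical helper, and the stand-in records replace `data_1k` so the example runs on its own.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Stand-in records; in the notebook this would be data_1k
sample = [{"id": i} for i in range(7)]
batches = list(chunked(sample, 3))
print([len(batch) for batch in batches])  # [3, 3, 1]
```

In the notebook, each batch could be passed to `questions.data.insert_many(batch)` inside a loop over `chunked(data_1k, 200)`. The batch size of 200 is an arbitrary choice for illustration, not a recommended value.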

### RAG Examples

* `single_prompt` - generate text per returned object
* `grouped_task` - generate a single text for all returned objects
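
The difference between the two modes can be illustrated in plain Python. This is only a sketch of the idea, not how Weaviate's generative module actually assembles its prompts: `fill_prompt` is a hypothetical helper, and the objects are made up.

```python
import re

def fill_prompt(template, properties):
    """Replace each {name} placeholder with the matching property value."""
    return re.sub(r"\{(\w+)\}", lambda m: str(properties[m.group(1)]), template)

# Made-up objects standing in for search results
results = [
    {"question": "A brass instrument with a slide", "answer": "trombone"},
    {"question": "A woodwind played with a double reed", "answer": "oboe"},
]

# single_prompt: one filled-in prompt (and one generation) per object
single_prompts = [fill_prompt("Write a tweet about: {answer}", obj) for obj in results]

# grouped_task: one prompt that covers all objects together
grouped_prompt = "Explain what this content is about.\n" + "\n".join(str(obj) for obj in results)

print(len(single_prompts))  # 2
print(single_prompts[0])    # Write a tweet about: trombone
```

With `single_prompt`, the placeholders in braces must match property names on the returned objects; with `grouped_task`, there is a single generated text for the whole result set.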

In [None]:
questions = client.collections.get("Questions")

response = questions.generate.near_text(
    query="musical instruments",
    limit=3,
    single_prompt="Write a short tweet about: {answer} that would match the following description: {question}"
)

for item in response.objects:
    print(item.properties)
    print(item.generated, '\n')

In [None]:
questions = client.collections.get("Questions")

response = questions.generate.near_text(
    query="african animals",
    limit=4,
    grouped_task="Explain what this content is about."
)

print(response.generated)