## Introduction to Weaviate - Demo 1

### Setup

<a target="_blank" href="https://colab.research.google.com/github/weaviate-tutorials/intro-workshop/blob/main/1_weaviate_examples.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

Define helper functions for printing query results.

In [None]:
def trunc_item(item_in):
    if len(str(item_in)) > 100:
        return str(item_in)[:100] + "..."
    else:
        return str(item_in)

def getprint(weaviate_result, truncate=True):
    # Print every property of every returned object, grouped by class name
    for k, results in weaviate_result["data"]["Get"].items():
        print(f"========== {k} Results: ==========")
        for r in results:
            for item_k, item_v in r.items():
                if truncate:
                    item_v = trunc_item(item_v)  # shorten long values only when requested
                print(f"{item_k}: {item_v}")
            print("\n")
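
The helper walks a nested `{"data": {"Get": {...}}}` dictionary. As a minimal sketch of that shape, the block below builds a hypothetical response by hand (the property values are invented, and `preview` and `summarize` are illustrative names, not part of the Weaviate client) and collects one line per property instead of printing directly.

```python
# Hypothetical response in the shape the helper expects; values are invented.
sample_result = {
    "data": {
        "Get": {
            "WikiCity": [
                {"city_name": "Example City", "wiki_summary": "x" * 150},
            ]
        }
    }
}


def preview(value, max_len=100):
    """Return str(value), cut to max_len characters plus an ellipsis."""
    text = str(value)
    return text[:max_len] + "..." if len(text) > max_len else text


def summarize(result, truncate=True):
    """Collect one 'Class.property: value' line per returned property."""
    lines = []
    for class_name, objects in result["data"]["Get"].items():
        for obj in objects:
            for key, value in obj.items():
                shown = preview(value) if truncate else str(value)
                lines.append(f"{class_name}.{key}: {shown}")
    return lines


for line in summarize(sample_result):
    print(line)
```

Returning the lines rather than printing them makes the formatting easy to check without a live Weaviate instance.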

In [None]:
import weaviate
import os
import json

auth = weaviate.AuthApiKey(api_key="weaviate-workshop")  # Note: Read-only key
client = weaviate.Client(
    "https://edu-demo.weaviate.network",
    auth_client_secret=auth,
#     additional_headers={  # After the demo, uncomment this and pass your own API credentials
#         "X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"],
#         "X-Cohere-Api-Key": os.environ["COHERE_API_KEY"]
#     }
)
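
When you later pass your own credentials, indexing `os.environ` directly raises a `KeyError` if a variable is unset. One possible approach, sketched below, is to build the headers dictionary only from the variables that are actually present. The header and variable names come from the commented-out block above; `build_headers` is a hypothetical helper, not part of the Weaviate client.

```python
import os

# Header name -> environment variable, as used in the commented-out block above.
HEADER_ENV_VARS = {
    "X-OpenAI-Api-Key": "OPENAI_API_KEY",
    "X-Cohere-Api-Key": "COHERE_API_KEY",
}


def build_headers(env=None):
    """Return only the headers whose environment variable is set and non-empty."""
    env = os.environ if env is None else env
    return {
        header: env[var]
        for header, var in HEADER_ENV_VARS.items()
        if env.get(var)
    }


# Example with an explicit mapping instead of the real environment.
print(build_headers({"OPENAI_API_KEY": "sk-placeholder"}))
```

The resulting dictionary could then be passed as `additional_headers` when creating the client.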

### Search

In [None]:
res = client.query.get(
    "WikiCity", ["city_name", "wiki_summary"]
).with_near_text({
    "concepts": ["Major European city"]
}).with_limit(5).do()

In [None]:
getprint(res)

In [None]:
res = client.query.get(
    "WikiArticle", ['title']
).with_near_text({
    "concepts": ["Formula 1 driver"]
}).with_limit(1).do()

In [None]:
getprint(res)

#### Linguistic flexibility

Vector search matches on meaning rather than exact keywords, so queries can be phrased freely.

That linguistic freedom holds in more than one sense of the word: it also works across languages.
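
As a toy illustration of the ranking behind a vector search, the sketch below scores a few hand-made vectors against a query vector using cosine similarity, one common similarity measure. The three-dimensional vectors and the example phrases are invented for illustration. Real embedding models produce vectors with hundreds of dimensions, and this is not Weaviate's implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up "embeddings": texts with similar meaning get similar vectors.
toy_vectors = {
    "package never arrived": [0.9, 0.1, 0.0],
    "item was not delivered": [0.8, 0.2, 0.1],
    "great battery life": [0.1, 0.9, 0.2],
}


def nearest(query_vector, vectors, limit=2):
    """Return the `limit` texts whose vectors are most similar to the query."""
    ranked = sorted(
        vectors.items(),
        key=lambda pair: cosine_similarity(query_vector, pair[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:limit]]


# Stands in for the embedding of a query such as "did not receive product".
query_vector = [0.85, 0.15, 0.05]
print(nearest(query_vector, toy_vectors))
```

None of the stored phrases shares the query's wording, yet the two delivery complaints rank above the unrelated review because their vectors point in a similar direction.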

In [None]:
res = client.query.get(
    "MultiLingualReview", 
    ['review_body', 'review_title', 'product_category']
).with_near_text({
    "concepts": ["did not receive product"]
}).with_limit(4).do()

In [None]:
getprint(res)

In [None]:
queries = [
    "did not receive product", "没有收到产品", 
    "Produkt nicht erhalten", "no recibi producto"
]

results_list = list()
for q in queries:
    res = client.query.get(
        "MultiLingualReview", 
        ['review_body', 'review_title', 'product_category']
    ).with_near_text(
        {"concepts": [q]}
    ).with_limit(4).do()
    results_list.append(res)

In [None]:
for q, res in zip(queries, results_list):
    print(f"Query: {q}")
    getprint(res)

With this Cohere model, Weaviate speaks...

**more than 100** languages.

## More than just search

With Weaviate, you can do more than just **retrieve** data. 

Weaviate + modern AI tools → **dynamic** data: answers extracted and new text generated from what you retrieve.

### Question answering

In [None]:
ask = {
  "question": "How many races has Lewis Hamilton won?",
  "properties": ["wiki_summary"]
}

res = (
  client.query
  .get("WikiArticle", [
      "title", 
      "_additional {answer {hasAnswer property result startPosition endPosition} }"
  ])
  .with_ask(ask)
  .with_limit(1)
  .do()
)

In [None]:
print(json.dumps(res["data"]["Get"]["WikiArticle"], indent=2))

In [None]:
ask = {
  "question": "Which cities have hosted the Olympics?",
  "properties": ["wiki_summary"]
}

res = (
  client.query
  .get("WikiCity", [
      "city_name", 
      "_additional {answer {hasAnswer property result} }"
  ])
  .with_ask(ask)
  .with_limit(20)
  .do()
)

In [None]:
res

In [None]:
for d in res["data"]["Get"]["WikiCity"]:
    answer = d["_additional"]["answer"]
    # Crude heuristic: skip answers containing "not" (e.g. "has not hosted ...")
    if answer["hasAnswer"] and "not" not in answer["result"]:
        print(d["city_name"])
        print(answer["result"])
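
The loop above assumes every object carries a populated `answer` block. A slightly more defensive variant is sketched below against a hypothetical, hand-written response. The city names and answers are invented, `extract_answers` is an illustrative helper, and the assumption that `result` is `None` when no answer is found should be checked against your own output.

```python
def extract_answers(result, class_name, label_property):
    """Return (label, answer text) pairs for objects where an answer was found."""
    pairs = []
    for obj in result["data"]["Get"].get(class_name, []):
        answer = (obj.get("_additional") or {}).get("answer") or {}
        text = answer.get("result")
        # Keep only objects flagged as answered and carrying non-empty text.
        if answer.get("hasAnswer") and text:
            pairs.append((obj[label_property], text))
    return pairs


# Hypothetical response; assumes `result` is None when there is no answer.
mock_result = {
    "data": {
        "Get": {
            "WikiCity": [
                {
                    "city_name": "City A",
                    "_additional": {
                        "answer": {"hasAnswer": True, "result": "hosted the games"}
                    },
                },
                {
                    "city_name": "City B",
                    "_additional": {
                        "answer": {"hasAnswer": False, "result": None}
                    },
                },
            ]
        }
    }
}

for label, text in extract_answers(mock_result, "WikiCity", "city_name"):
    print(label, "->", text)
```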

### Search + Generative model

Search + `generative-openai` module → **magic**: a language model works on the search results.

Transform each retrieved object, for example into a tweet:

In [None]:
res = client.query.get(
    "WikiCity", ["city_name", "wiki_summary"]
).with_near_text({
    "concepts": ["Popular European tourist destination"]
}).with_limit(5).with_generate(
    single_prompt="Write a tweet with a potentially surprising fact from {wiki_summary}"
).do()

In [None]:
for city in res["data"]["Get"]["WikiCity"]:
    print(city["_additional"]["generate"]["singleResult"], "\n")
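
With `single_prompt`, the `{wiki_summary}` placeholder is filled in from each returned object, so every object gets its own prompt and its own `singleResult`. The sketch below imitates only that substitution step on invented objects. `fill_prompt` is a hypothetical helper for illustration, not how the module is implemented.

```python
import re


def fill_prompt(template, obj):
    """Replace each {property} placeholder with that property's value from obj."""
    return re.sub(r"\{(\w+)\}", lambda match: str(obj[match.group(1)]), template)


template = "Write a tweet with a potentially surprising fact from {wiki_summary}"

# Invented objects standing in for search results.
objects = [
    {"city_name": "City A", "wiki_summary": "City A summary text."},
    {"city_name": "City B", "wiki_summary": "City B summary text."},
]

# One prompt per object, which is why each result has its own generated text.
prompts = [fill_prompt(template, obj) for obj in objects]
for prompt in prompts:
    print(prompt)
```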

Or reduce the work of aggregating and summarizing information across many results.

In [None]:
res = client.query.get(
    "MultiLingualReview", ['review_body', 'review_title', 'product_category']
).with_near_text({
    "concepts": ["unhappy with seller"]
}).with_limit(20).with_generate(
    grouped_task=(
        "What are some of the top reasons cited for being unhappy "
        "based on this passage? Do not cite any additional inferred ideas."
    )
).do()

In [None]:
for r in res["data"]["Get"]["MultiLingualReview"]:
    print(r["review_title"])

In [None]:
print(res["data"]["Get"]["MultiLingualReview"][0]["_additional"]["generate"]["groupedResult"], "\n")
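
In contrast to `single_prompt`, a `grouped_task` yields a single `groupedResult` for the whole result set, which is why the cell above reads it from the first object only. Conceptually, the task and all retrieved objects go to the model together. The sketch below shows one way such a combined prompt could be assembled. The exact format the module uses is not shown in this notebook, so `build_grouped_prompt` and the review data are purely illustrative.

```python
import json


def build_grouped_prompt(task, objects):
    """Combine one task with all retrieved objects into a single prompt string."""
    context = json.dumps(objects, ensure_ascii=False, indent=2)
    return f"{task}\n\n{context}"


task = "What are some of the top reasons cited for being unhappy based on this passage?"

# Invented reviews standing in for search results.
reviews = [
    {"review_title": "Never arrived", "review_body": "The parcel did not show up."},
    {"review_title": "Wrong size", "review_body": "The seller sent the wrong item."},
]

grouped_prompt = build_grouped_prompt(task, reviews)
print(grouped_prompt)
```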

### So... how does it all work?

Let's take a look.