## Introduction to Weaviate - Demo 1

This notebook is a quick demonstration of Weaviate's capabilities. Run through it to see some examples of what you can do with Weaviate.

### Setup

Connect to the demo instance and define helper functions.

In [1]:
import weaviate
import os
import json

client = weaviate.Client(
    "https://edu-demo.weaviate.network",
    auth_client_secret=weaviate.AuthApiKey(api_key="learn-weaviate"),  # Note: Read-only key
    additional_headers={  # Inference API keys, read from environment variables; substitute your own credentials
        "X-OpenAI-Api-Key": os.environ["OPENAI_APIKEY"],
        "X-Cohere-Api-Key": os.environ["COHERE_APIKEY"]
    }
)

In [2]:
def trunc_item(item_in):
    if len(str(item_in)) > 100:
        return str(item_in)[:100] + "..."
    else:
        return str(item_in)

def getprint(weaviate_result, truncate=True):
    for k, results in weaviate_result["data"]["Get"].items():
        print(f"========== {k} Results: ==========")
        for r in results:
            for item_k, item_v in r.items():
                if truncate:
                    item_v = trunc_item(item_v)
                # Print the value as-is so that truncate=False shows the full text
                print(f"{item_k}: {item_v}")
            print("\n")
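
To try the traversal logic without a live connection, the sketch below walks a hand-built dictionary shaped like the `{"data": {"Get": {...}}}` responses used throughout this demo. `flatten_get_result` and the sample values are hypothetical, and the `errors` check rests on the assumption that a failed query comes back with a top-level `errors` key instead of `data`.

```python
def flatten_get_result(weaviate_result):
    """Return (class_name, property, value) tuples from a Get-style response."""
    # Assumption: a failed query carries a top-level "errors" key.
    if "errors" in weaviate_result:
        raise ValueError(f"Query failed: {weaviate_result['errors']}")
    rows = []
    for class_name, objects in weaviate_result["data"]["Get"].items():
        for obj in objects:
            for prop, value in obj.items():
                rows.append((class_name, prop, value))
    return rows


# Hand-built sample (hypothetical values) in the same shape as the demo responses.
sample = {"data": {"Get": {"WikiCity": [
    {"city_name": "Paris", "wiki_summary": "Paris is the capital of France."},
    {"city_name": "London", "wiki_summary": "London is the capital of England."},
]}}}

rows = flatten_get_result(sample)
for class_name, prop, value in rows:
    print(f"{class_name}.{prop}: {value}")
```

Checking for errors before indexing into `data` gives a readable message rather than a `KeyError` when a query fails.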

### Search

In [3]:
res = client.query.get(
    "WikiCity", ["city_name", "wiki_summary"]
).with_near_text({
    "concepts": ["Major European city"]
}).with_limit(5).do()

In [4]:
getprint(res)

city_name: Paris
wiki_summary: Paris (English: ; French pronunciation: ​[paʁi] (listen)) is the capital and most populous city of F...


city_name: London
wiki_summary: London is the capital and largest city of England and the United Kingdom, with a population of just ...


city_name: Madrid
wiki_summary: Madrid ( mə-DRID, Spanish: [maˈðɾið]) is the capital and most populous city of Spain. The city has a...


city_name: Berlin
wiki_summary: Berlin ( bur-LIN, German: [bɛɐ̯ˈliːn] (listen)) is the capital and largest city of Germany by both a...


city_name: Budapest
wiki_summary: Budapest (UK: , US: ; Hungarian pronunciation: [ˈbudɒpɛʃt] (listen)) is the capital and most populou...




In [5]:
res = client.query.get(
    "WikiArticle", ['title']
).with_near_text({
    "concepts": ["Formula 1 driver"]
}).with_limit(1).do()

In [6]:
res

{'data': {'Get': {'WikiArticle': [{'title': 'Lewis Hamilton'}]}}}

#### Linguistic flexibility

Vector search allows for flexibility & linguistic freedom. 

... in more than one sense of the word.
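
As rough intuition for why this works: a near-text query is turned into a vector and compared with the vectors stored for each object, and the closest objects are returned. The sketch below ranks a few sentences by cosine similarity, one common choice of measure. The 3-dimensional vectors are made-up numbers, not output from any real model, and `cosine_similarity` and `rank` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query, vectors):
    """Return the texts ordered from most to least similar to the query."""
    scored = [(cosine_similarity(query, vec), text) for text, vec in vectors.items()]
    return [text for _, text in sorted(scored, reverse=True)]


# Made-up vectors for illustration only; real embeddings have hundreds of dimensions.
toy_vectors = {
    "Never received the product": [0.9, 0.1, 0.0],
    "Habe das Produkt noch nicht erhalten": [0.8, 0.2, 0.1],
    "Great sound quality": [0.1, 0.9, 0.2],
}
query_vector = [1.0, 0.0, 0.0]  # stands in for the embedded query text

ranking = rank(query_vector, toy_vectors)
print(ranking)
```

With a multilingual model, sentences that mean the same thing in different languages land close together, so the ranking does not depend on shared words.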

In [7]:
res = client.query.get(
    "MultiLingualReview", 
    ['review_body', 'review_title', 'product_category']
).with_near_text({
    "concepts": ["did not receive product"]
}).with_limit(4).do()

In [8]:
getprint(res)

product_category: grocery
review_body: Never received the product
review_title: Receiving the product


product_category: electronics
review_body: Habe das Produkt noch nicht erhalten
review_title: Produkt nicht da


product_category: electronics
review_body: Hallo, ich habe das Produkt bestellt, in den versand Details steht, dass es im Briefkasten hinterleg...
review_title: Produkt nicht erhalten


product_category: electronics
review_body: 未收到货，却显示货物已送达，也无法查询货物物流信息
review_title: 未收到货




In [9]:
queries = [
    "did not receive product", "没有收到产品", 
    "Produkt nicht erhalten", "no recibi producto"
]

results_list = list()
for q in queries:
    res = client.query.get(
        "MultiLingualReview", 
        ['review_body', 'review_title', 'product_category']
    ).with_near_text(
        {"concepts": [q]}
    ).with_limit(4).do()
    results_list.append(res)

In [10]:
for q, res in zip(queries, results_list):
    print(f"Query: {q}")
    getprint(res)

Query: did not receive product
product_category: grocery
review_body: Never received the product
review_title: Receiving the product


product_category: electronics
review_body: Habe das Produkt noch nicht erhalten
review_title: Produkt nicht da


product_category: electronics
review_body: Hallo, ich habe das Produkt bestellt, in den versand Details steht, dass es im Briefkasten hinterleg...
review_title: Produkt nicht erhalten


product_category: electronics
review_body: 未收到货，却显示货物已送达，也无法查询货物物流信息
review_title: 未收到货


Query: 没有收到产品
product_category: electronics
review_body: Habe das Produkt noch nicht erhalten
review_title: Produkt nicht da


product_category: grocery
review_body: Never received the product
review_title: Receiving the product


product_category: electronics
review_body: Hallo, ich habe das Produkt bestellt, in den versand Details steht, dass es im Briefkasten hinterleg...
review_title: Produkt nicht erhalten


product_category: electronics
review_body: 未收到货，却显示货物已送达，也无

With this Cohere model, Weaviate speaks...

**more than 100** languages.
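
One way to check how consistent the results are across query languages is to compare the sets of titles returned per query. The sketch below does this on hand-built dictionaries that copy the review titles from the first two queries in the output above; the fourth result of the second query is cut off there, so only its first three titles are included. `titles` and `overlap` are hypothetical helper names.

```python
def titles(weaviate_result, class_name="MultiLingualReview"):
    """Collect review titles from a Get-style response."""
    return [obj["review_title"] for obj in weaviate_result["data"]["Get"][class_name]]


def overlap(a, b):
    """Jaccard overlap between two lists of titles (1.0 means identical sets)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


# Titles copied from the output of the English query.
english = {"data": {"Get": {"MultiLingualReview": [
    {"review_title": "Receiving the product"},
    {"review_title": "Produkt nicht da"},
    {"review_title": "Produkt nicht erhalten"},
    {"review_title": "未收到货"},
]}}}

# Titles copied from the output of the Chinese query (fourth title not shown there).
chinese = {"data": {"Get": {"MultiLingualReview": [
    {"review_title": "Produkt nicht da"},
    {"review_title": "Receiving the product"},
    {"review_title": "Produkt nicht erhalten"},
]}}}

score = overlap(titles(english), titles(chinese))
print(f"Overlap: {score:.2f}")
```

In the notebook itself, the same helpers can be applied to any pair of entries in `results_list`.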

## More than just search

With Weaviate, you can do more than just **retrieve** data. 

Weaviate + modern AI tools → **dynamic** data.

### Question answering

In [11]:
ask = {
  "question": "How many races has Lewis Hamilton won?",
  "properties": ["wiki_summary"]
}

res = (
  client.query
  .get("WikiArticle", [
      "title", 
      "_additional {answer {hasAnswer property result startPosition endPosition} }"
  ])
  .with_ask(ask)
  .with_limit(1)
  .do()
)

In [12]:
print(json.dumps(res["data"]["Get"]["WikiArticle"], indent=2))

[
  {
    "_additional": {
      "answer": {
        "endPosition": 0,
        "hasAnswer": true,
        "property": "",
        "result": " 103",
        "startPosition": 0
      }
    },
    "title": "Lewis Hamilton"
  }
]


In [13]:
ask = {
  "question": "Which cities have hosted the Olympics?",
  "properties": ["wiki_summary"]
}

res = (
  client.query
  .get("WikiCity", [
      "city_name", 
      "_additional {answer {hasAnswer property result} }"
  ])
  .with_ask(ask)
  .with_limit(20)
  .do()
)

In [14]:
res

{'data': {'Get': {'WikiCity': [{'_additional': {'answer': {'hasAnswer': False,
       'property': None,
       'result': None}},
     'city_name': 'Kōbe'},
    {'_additional': {'answer': {'hasAnswer': True,
       'property': 'wiki_summary',
       'result': ' Sapporo hosted the 1972 Winter Olympics, the first Winter Olympics ever held in Asia'}},
     'city_name': 'Sapporo'},
    {'_additional': {'answer': {'hasAnswer': False,
       'property': None,
       'result': None}},
     'city_name': 'Lima'},
    {'_additional': {'answer': {'hasAnswer': False,
       'property': None,
       'result': None}},
     'city_name': 'Paris'},
    {'_additional': {'answer': {'hasAnswer': True,
       'property': '',
       'result': ' Tokyo has hosted the Olympics three times: in 1964, 2020 (postponed'}},
     'city_name': 'Tokyo'},
    {'_additional': {'answer': {'hasAnswer': False,
       'property': None,
       'result': None}},
     'city_name': 'Rome'},
    {'_additional': {'answer': {'hasAns

In [15]:
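for d in res["data"]["Get"]["WikiCity"]:
    # Keep only objects with an answer; the substring test is a crude heuristic
    # for skipping negative answers that contain "not"
    if d["_additional"]["answer"]["hasAnswer"] and "not" not in d["_additional"]["answer"]["result"]:
        print(d["city_name"])
        print(d["_additional"]["answer"]["result"])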

Sapporo
 Sapporo hosted the 1972 Winter Olympics, the first Winter Olympics ever held in Asia
Tokyo
 Tokyo has hosted the Olympics three times: in 1964, 2020 (postponed
Beijing
 Beijing and Shanghai
Rio de Janeiro
 Rio de Janeiro, Brazil; Tokyo, Japan; and Beijing, China.
Sydney
 Sydney has hosted the Olympics.
Seoul
 Seoul has hosted the 1988 Summer Olympics.
Los Angeles
 Los Angeles has hosted the Olympics in 1932 and 1984 and will host the 2028
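
The filtering step above can be wrapped in a small reusable helper. The sketch below runs on a hand-built dictionary that copies two entries from the response shown earlier, so it works without a connection; `collect_answers` is a hypothetical name, and it keeps every object flagged with `hasAnswer` without applying the substring heuristic.

```python
def collect_answers(weaviate_result, class_name, name_property):
    """Return {name: answer} for objects whose answer block has hasAnswer set."""
    answers = {}
    for obj in weaviate_result["data"]["Get"][class_name]:
        answer = obj["_additional"]["answer"]
        result = answer.get("result")
        # Guard against result being None when no answer was found
        if answer.get("hasAnswer") and result:
            answers[obj[name_property]] = result.strip()
    return answers


# Two entries copied from the response above.
sample = {"data": {"Get": {"WikiCity": [
    {"_additional": {"answer": {"hasAnswer": False, "property": None, "result": None}},
     "city_name": "Kōbe"},
    {"_additional": {"answer": {
        "hasAnswer": True,
        "property": "wiki_summary",
        "result": " Sapporo hosted the 1972 Winter Olympics, the first Winter Olympics ever held in Asia"}},
     "city_name": "Sapporo"},
]}}}

answers = collect_answers(sample, "WikiCity", "city_name")
for city, text in answers.items():
    print(f"{city}: {text}")
```

Stripping the result also removes the leading space that appears in the raw answers.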


### Search + Generative model

Search + `generative-openai` module → **magic**

Transform information like:

In [16]:
res = client.query.get(
    "WikiCity", ["city_name", "wiki_summary"]
).with_near_text({
    "concepts": ["Popular European tourist destination"]
}).with_limit(5).with_generate(
    single_prompt="Write a tweet with a potentially surprising fact from {wiki_summary}"
).do()

In [17]:
for wa in res["data"]["Get"]["WikiCity"]:
    print(wa["_additional"]["generate"]["singleResult"], "\n")

Did you know that Budapest has the largest thermal water cave system in the world? With around 80 geothermal springs, it's no wonder the city is known for its relaxing thermal baths! #Budapest #travel #funfact 

Did you know that Warsaw is home to the tallest building in the European Union? Varso Place stands at 310 meters tall and is just one of the many architectural wonders in this alpha global city. #Warsaw #architecture #EU 

Did you know that Paris is the fourth-most populated city in the European Union and the 30th most densely populated city in the world? With a population of over 2 million, it's no wonder it's known as the City of Light! #ParisFacts #CityofLight #SurprisingFacts 

Did you know that Berlin is home to the world's most visited zoo, the Zoological Garden? With over 3.3 million visitors annually, it's a must-see attraction for animal lovers visiting the city! 🦁🐯🐻 #BerlinFacts #TravelTrivia 

Did you know that Istanbul was founded in the 7th century BCE by Greek set

Or reduce the work of aggregating and summarizing information with a single grouped task:

In [18]:
res = client.query.get(
    "MultiLingualReview", ['review_body', 'review_title', 'product_category']
).with_near_text({
    "concepts": ["unhappy with seller"]
}).with_limit(20).with_generate(
    grouped_task=(
        # Trailing space keeps "unhappy" and "based" from running together
        "What are some of the top reasons cited for being unhappy "
        "based on this passage? Do not cite any additional inferred ideas."
    )
).do()

In [19]:
for r in res["data"]["Get"]["MultiLingualReview"]:
    print(r["review_title"])

Receiving the product
Do Not Buy from This Seller
El producto no ha llegado
Didn't recieve
Horrible
Keine Ware erhalten
Probably getting ripped off haven’t received yet
You shipped wrong size
Replaced item
Not working
DON'T BUY
Satisfied
Produkt nicht erhalten
Produkt nicht da
Defective
Reklamation
收到错误的商品
Entrega no correcta
never got the item
Precio engañoso


In [20]:
print(res["data"]["Get"]["MultiLingualReview"][0]["_additional"]["generate"]["groupedResult"], "\n")

The top reasons cited for being unhappy based on this passage are: not receiving the product, receiving a defective or wrong product, poor customer service, and not getting a refund or response from the seller. 
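
The two generate modes differ in how many prompts are involved: `single_prompt` is applied once per returned object, with `{property}` placeholders filled from that object, while `grouped_task` is applied once to the whole result set. The sketch below imitates that difference with plain string handling. It is a conceptual illustration only: `build_single_prompts`, `build_grouped_prompt`, and the sample objects are hypothetical, and this is not how the module constructs its prompts internally.

```python
import json


def build_single_prompts(template, objects):
    """One prompt per object, with {property} placeholders filled in."""
    prompts = []
    for obj in objects:
        prompt = template
        for key, value in obj.items():
            prompt = prompt.replace("{" + key + "}", str(value))
        prompts.append(prompt)
    return prompts


def build_grouped_prompt(task, objects):
    """One prompt covering the whole result set."""
    return task + "\n" + json.dumps(objects, ensure_ascii=False)


# Hypothetical sample objects standing in for search results.
objects = [
    {"city_name": "Budapest", "wiki_summary": "Budapest is the capital of Hungary."},
    {"city_name": "Warsaw", "wiki_summary": "Warsaw is the capital of Poland."},
]

single = build_single_prompts("Write a tweet with a fact from {wiki_summary}", objects)
grouped = build_grouped_prompt("Summarize these cities.", objects)

print(len(single), "single prompts")
print(single[0])
```

This matches how the results are read in the cells above: `singleResult` appears on every object, while `groupedResult` is read from the first object only.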



### So... how does it all work?

Let's take a look.