# Query data

Learn three search methods: **Vector** (semantic similarity), **Keyword** (BM25), and **Hybrid** (combines both).

## Connect to Weaviate

Connect to a Weaviate instance.

In [1]:
# Refresh credentials & load the Weaviate IP
from helpers import update_creds

AWS_ACCESS_KEY, AWS_SECRET_KEY, AWS_SESSION_TOKEN = update_creds()

%store -r WEAVIATE_IP

In [None]:
import weaviate

client = weaviate.connect_to_local(
    WEAVIATE_IP,
    headers = {
        "X-AWS-Access-Key": AWS_ACCESS_KEY,
        "X-AWS-Secret-Key": AWS_SECRET_KEY,
        "X-AWS-Session-Token": AWS_SESSION_TOKEN,
    }
)

client.is_ready()

True

### Get the collection

Get our collection of financial articles (5000 articles total).

In [None]:
# STUDENT TODO:
# Instantiate the "FinancialArticles" collection object
# BEGIN_SOLUTION
articles = client.collections.use("FinancialArticles")
# END_SOLUTION

In [None]:
# STUDENT TODO:
# Check the collection size (hint: use len())
# BEGIN_SOLUTION
len(articles)
# END_SOLUTION

5000

### Fetch objects

In [None]:
# STUDENT TODO:
# Use .query.fetch_objects() to fetch 5 objects from the collection
# And iterate through the response to print the article titles
# Hint: response.objects gives you a list of the objects; each object has a .properties dictionary
# BEGIN_SOLUTION
response = articles.query.fetch_objects(limit=5)

print(f"Returned object count: {len(response.objects)}")

for item in response.objects:
    print(item.properties["article_title"])
# END_SOLUTION

Returned object count: 5
Qualcomm, Jabil: Should Apple's Suppliers Look Forward To A Strong 2021?
Apple Implications Overshadow Qualcomm’s Earnings Beat
Apple's Steady Results Prove Why It's the Ultimate Warren Buffett Stock
Thursday’s Vital Data: Intel, Apple and Nvidia
Is Unity Stock a Buy as It Gets a Major Boost From Apple?


## Filters

Add conditions to narrow down search results before querying.
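
To build intuition for the `like` patterns used in this section, here is a minimal pure-Python sketch. It is not Weaviate's implementation: it assumes the property is split into lowercase word tokens and that `*` matches any run of characters while `?` matches exactly one, which is consistent with the results shown in this section. The `like_matches` helper is hypothetical.

```python
import re

def like_matches(text: str, pattern: str) -> bool:
    """Return True if any word token in `text` matches the wildcard `pattern`.

    Assumption: word-level, case-insensitive matching (an approximation,
    not Weaviate's actual code).
    """
    # Translate the wildcard pattern into a regex:
    # '*' -> any run of characters, '?' -> exactly one character.
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern.lower()
    )
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return any(re.fullmatch(regex, token) for token in tokens)

titles = [
    "ANALYSIS-Australia's push for faster EV uptake will be slow to charge",
    "BMW AG (BAMXF) Announces Ambitious EV Plans in China, Austria",
    "Is Unity Stock a Buy as It Gets a Major Boost From Apple?",
]

# Combine two conditions, mirroring the `&` operator on Filter objects.
matches = [t for t in titles if like_matches(t, "aust*") and like_matches(t, "ev")]
for title in matches:
    print(title)
```

Under these assumptions, `"ev"` without a wildcard matches only the whole token "EV", not longer words such as "every", while `"aust*"` matches both "Australia" and "Austria".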

### Fetch with filters

In [None]:
from weaviate.classes.query import Filter

# STUDENT TODO:
# Perform the same query as above, but filter the results to only include articles with "apple" in the title
# Hint: use Filter.by_property(<property_name>).like(<search_pattern>)
# BEGIN_SOLUTION
response = articles.query.fetch_objects(
    limit=5,
    filters=Filter.by_property("article_title").like("apple")
)
# END_SOLUTION

print(f"Returned object count: {len(response.objects)}")

for item in response.objects:
    print(item.properties["article_title"])

Returned object count: 5
Qualcomm, Jabil: Should Apple's Suppliers Look Forward To A Strong 2021?
Apple Implications Overshadow Qualcomm’s Earnings Beat
Apple's Steady Results Prove Why It's the Ultimate Warren Buffett Stock
Thursday’s Vital Data: Intel, Apple and Nvidia
Is Unity Stock a Buy as It Gets a Major Boost From Apple?


In [None]:
from weaviate.classes.query import Filter

# STUDENT TODO:
# Try another filter - fetch articles with "aust*" and "ev" in the title
# Hint: use & to combine multiple Filter conditions
# BEGIN_SOLUTION
response = articles.query.fetch_objects(
    limit=5,
    filters=(
        Filter.by_property("article_title").like("aust*") &
        Filter.by_property("article_title").like("ev")
    )
)
# END_SOLUTION

print(f"Returned object count: {len(response.objects)}")

for item in response.objects:
    print(item.properties["article_title"])

Returned object count: 2
ANALYSIS-Australia's push for faster EV uptake will be slow to charge
BMW AG (BAMXF) Announces Ambitious EV Plans in China, Austria


## Keyword Search

![images/search_keyword.png](images/search_keyword.png)

[Docs - keyword/bm25](https://weaviate.io/developers/weaviate/search/bm25)
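
To see roughly where BM25 scores come from, here is a small self-contained sketch of the standard BM25 formula. It is illustrative only: the tokenizer is a simple assumption, `k1=1.2` and `b=0.75` are commonly used defaults assumed here, and the three-title corpus is far smaller than the real collection, so the numbers will not match Weaviate's output.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Assumed tokenizer: lowercase alphanumeric word tokens.
    return re.findall(r"[a-z0-9]+", text.lower())

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document in `docs` against `query` with the BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Longer documents are penalized through the length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "5 Big Takeaways From Costco's Earnings Report",
    "MercadoLibre (MELI) to Report Q2 Earnings: What's in Store?",
    "Is Unity Stock a Buy as It Gets a Major Boost From Apple?",
]
scores = bm25_scores("earnings report", docs)
for title, score in sorted(zip(docs, scores), key=lambda pair: pair[1], reverse=True):
    print(f"{score:.3f}  {title}")
```

Both of the first two titles contain each query term once, but the shorter title scores higher because of length normalization. The third title contains neither term and scores zero.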

In [None]:
# STUDENT TODO:
# Construct a BM25 query to find articles about "earnings report"
# Query the "article_title" property
# Limit the results to 5 articles
# BEGIN_SOLUTION
response = articles.query.bm25(
    query="earnings report",
    query_properties=["article_title"],
    limit=5,
)
# END_SOLUTION

for item in response.objects:
    print(item.properties["article_title"])

5 Big Takeaways From Costco's Earnings Report
1 Number To Watch in Alphabet's Earnings Report
Here's What Stood Out in Starbucks' Earnings Report
MercadoLibre (MELI) to Report Q2 Earnings: What's in Store?
TripAdvisor (TRIP) to Report Q3 Earnings: What's in Store?


In [11]:
from weaviate.classes.query import MetadataQuery

response = articles.query.bm25(
    query="earnings report",
    query_properties=["article_title"],
    limit=5,
    return_metadata=MetadataQuery(score=True)
)

for item in response.objects:
    print(item.properties["article_title"])
    print(item.metadata.score)

5 Big Takeaways From Costco's Earnings Report
2.764371156692505
1 Number To Watch in Alphabet's Earnings Report
2.6476149559020996
Here's What Stood Out in Starbucks' Earnings Report
2.6476149559020996
MercadoLibre (MELI) to Report Q2 Earnings: What's in Store?
2.5403215885162354
TripAdvisor (TRIP) to Report Q3 Earnings: What's in Store?
2.5403215885162354


In [None]:
from weaviate.classes.query import MetadataQuery

# STUDENT TODO:
# Perform the same query, but search both the "article_title" and "article" properties,
# and also return the score metadata
# Hint: from weaviate.classes.query import MetadataQuery
# Use this in the return_metadata parameter: MetadataQuery(score=True)
# BEGIN_SOLUTION
response = articles.query.bm25(
    query="earnings report",
    query_properties=["article_title", "article"],
    limit=5,
    return_metadata=MetadataQuery(score=True)
)
# END_SOLUTION

for item in response.objects:
    print(item.properties["article_title"])
    print(item.metadata.score)

Drug Stocks' Earnings Previews: KERX, BXLT, BMRN, AGEN, IMGN
2.3630902767181396
Drug Stocks Earnings Slated for Feb 14: INCY, PRTA & More
2.322143316268921
Q4 Earnings Inform the Market - Ahead of Wall Street
2.313457727432251
Can PVH Corp.'s (PVH) Q2 Earnings Maintain Solid Trend?
2.306293249130249
Semiconductor Stocks Q1 Earnings Preview: SNDK, XLNX, TXN
2.304908514022827


## Vector search

![images/search_vector.png](images/search_vector.png)

[Docs - near_text](https://weaviate.io/developers/weaviate/search/similarity#an-input-medium)
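
Vector search ranks objects by the distance between the query vector and each stored vector. The sketch below assumes cosine distance (1 minus cosine similarity); the notebook does not show the collection's configured metric, so treat that as an assumption. The two-dimensional vectors are made up for illustration; real embeddings have hundreds of dimensions or more.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity: 0 for same direction, 1 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Toy vectors (hypothetical), keyed by a descriptive label.
query_vec = [1.0, 0.0]
vectors = {
    "same direction": [2.0, 0.0],
    "partly related": [1.0, 1.0],
    "orthogonal": [0.0, 3.0],
}

# Smaller distance means a closer match, so sort ascending.
ranked = sorted(vectors, key=lambda name: cosine_distance(query_vec, vectors[name]))
for name in ranked:
    print(f"{cosine_distance(query_vec, vectors[name]):.3f}  {name}")
```

Note that a lower distance means a better match, which is the opposite of BM25 scores, where higher is better.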

In [None]:
# STUDENT TODO:
# Perform a `near_text` query to find the top 5 articles
# with `title` target vector closest to "tech market trends"
# BEGIN_SOLUTION
response = articles.query.near_text(
    query="tech market trends",
    limit=5,
    target_vector="title"
)
# END_SOLUTION

for item in response.objects:
    print(item.properties["article_title"])

Technology Sector Update for 05/18/2023: GOOG, META, SNOW
Technology Sector Update for 07/28/2023: INTC, YOU, META
Technology Sector Update for 05/17/2023: IBM, GOOG, TSLA, TME
Technology Sector Update for 01/19/2018: ZAYO,IBM,CRM,LRAD,ALGN,IPHI,WD
Technology Sector Update for 01/05/2022: GOOG,GOOGL,AMZN,AAPL,FB,WATT,GLBE,SHOP,SHOP.TO


In [None]:
from weaviate.classes.query import MetadataQuery

# STUDENT TODO:
# Perform the same query, but also return the distance metadata
# Use this in the return_metadata parameter: MetadataQuery(distance=True)
# BEGIN_SOLUTION
response = articles.query.near_text(
    query="tech market trends",
    limit=5,
    target_vector="title",
    return_metadata=MetadataQuery(distance=True)
)
# END_SOLUTION

for item in response.objects:
    print(item.properties["article_title"])
    # STUDENT TODO:
    # Print the distance metadata for each item
    # BEGIN_SOLUTION
    print(item.metadata.distance)
    # END_SOLUTION

Technology Sector Update for 05/18/2023: GOOG, META, SNOW
0.5262356400489807
Technology Sector Update for 07/28/2023: INTC, YOU, META
0.5352218151092529
Technology Sector Update for 05/17/2023: IBM, GOOG, TSLA, TME
0.5485501289367676
Technology Sector Update for 01/19/2018: ZAYO,IBM,CRM,LRAD,ALGN,IPHI,WD
0.5497270822525024
Technology Sector Update for 01/05/2022: GOOG,GOOGL,AMZN,AAPL,FB,WATT,GLBE,SHOP,SHOP.TO
0.5543919801712036


## Search with filters

Combine vector search with filters for precise results.

In [None]:
from weaviate.classes.query import Filter

# STUDENT TODO:
# Perform a near_text query for "strategy" (target vector "title"), but filter the results to only include articles with "netflix" in the title
# Hint: use Filter.by_property(<property_name>).like(<search_pattern>)
# BEGIN_SOLUTION
response = articles.query.near_text(
    query="strategy",
    target_vector="title",
    limit=5,
    filters=Filter.by_property("article_title").like("netflix")
)
# END_SOLUTION

print(f"Returned object count: {len(response.objects)}")

for item in response.objects:
    print(item.properties["article_title"])

Returned object count: 5
3 Key Ways Netflix Has Changed Its Content Strategy
How I Knew to Stay Away from Netflix Before Earnings
Could This Be Netflix's Next Big Move?
Zacks Investment Ideas feature highlights: Microsoft, Netflix, Nvidia, Ionq and Oracle
The Zacks Analyst Blog Highlights: Time Warner, Netflix, Altria Group, Disney, CBS and Comcast


## Hybrid search

![images/search_hybrid.png](images/search_hybrid.png)

[Docs - hybrid](https://weaviate.io/developers/weaviate/search/hybrid)

In [None]:
# STUDENT TODO:
# Perform a hybrid query to find articles about "earnings report"
# Query the "article_title" property (the property used for BM25)
# Target the "title" vector (the vector used for near_text)
# Limit the results to 5 articles
# Use an alpha value of 0.7 to weight the vector search results more heavily
# BEGIN_SOLUTION
response = articles.query.hybrid(
    query="earnings report",
    query_properties=["article_title"],
    limit=5,
    target_vector="title",
    alpha=0.7
)
# END_SOLUTION

for item in response.objects:
    print(item.properties["article_title"])

New Quarterly Earnings Results for DIS, CMG, SNAP and More
Electronic Arts' (EA) Q3 Earnings and Revenues Increase Y/Y
Q4 Earnings Inform the Market - Ahead of Wall Street
Earnings Season: 3 Upcoming Reports Investors Can't Ignore
A Review of Big Techs’ Earnings Results


### Hybrid - Explain score

See how hybrid search combines vector and keyword scores.
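
One way to picture the combination is relative score fusion, one of the fusion strategies described in the Weaviate hybrid search docs: each result set is min-max normalized to the range 0 to 1, then the vector scores are weighted by `alpha` and the keyword scores by `1 - alpha`. The sketch below is an illustration under that assumption, not Weaviate's code; the document IDs and raw scores are made up.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping so the best score is 1 and the worst is 0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every result as a top result.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}

def relative_score_fusion(vector_scores, keyword_scores, alpha=0.7):
    """Blend normalized vector and keyword scores; alpha weights the vector side."""
    vec = min_max_normalize(vector_scores)
    kw = min_max_normalize(keyword_scores)
    fused = {
        doc_id: alpha * vec.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in set(vec) | set(kw)
    }
    # Highest fused score first.
    return dict(sorted(fused.items(), key=lambda pair: pair[1], reverse=True))

# Hypothetical raw scores: higher is better in both sets.
vector_scores = {"a": 0.9, "b": 0.5, "c": 0.1}
keyword_scores = {"b": 4.0, "d": 2.0}

fused = relative_score_fusion(vector_scores, keyword_scores, alpha=0.7)
for doc_id, score in fused.items():
    print(doc_id, round(score, 3))
```

With `alpha=0.7`, a document that is the top vector match but absent from the keyword results ends up with a fused score of 0.7, which matches the pattern in this notebook's top hybrid result (normalized score 0.7).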

In [None]:
from weaviate.classes.query import MetadataQuery

# STUDENT TODO:
# Perform the same query, but also return the score and explain_score metadata
# Use this in the return_metadata parameter: MetadataQuery(score=True, explain_score=True)
# BEGIN_SOLUTION
response = articles.query.hybrid(
    query="earnings report",
    query_properties=["article_title"],
    limit=5,
    target_vector="title",
    alpha=0.7,
    return_metadata=MetadataQuery(score=True, explain_score=True)
)
# END_SOLUTION

for item in response.objects:
    print(item.properties["article_title"])
    print(item.metadata.score)
    print(item.metadata.explain_score)

New Quarterly Earnings Results for DIS, CMG, SNAP and More
0.699999988079071

Hybrid (Result Set vector,hybridVector) Document 174bf34a-639b-5229-bd24-532c7f91ccac: original score 0.38914192, normalized score: 0.7
Electronic Arts' (EA) Q3 Earnings and Revenues Increase Y/Y
0.6494570374488831

Hybrid (Result Set vector,hybridVector) Document 7c868813-e065-5e4a-9261-cfd7f197fcf7: original score 0.38081086, normalized score: 0.64945704
Q4 Earnings Inform the Market - Ahead of Wall Street
0.6336445212364197

Hybrid (Result Set vector,hybridVector) Document 1b47ac7d-ac03-588e-a030-b18e9bb4a056: original score 0.37820446, normalized score: 0.6336445
Earnings Season: 3 Upcoming Reports Investors Can't Ignore
0.5480558276176453

Hybrid (Result Set vector,hybridVector) Document 841143ca-0896-51ef-b2c3-c4d2a4dfa1a5: original score 0.36409676, normalized score: 0.5480558
A Review of Big Techs’ Earnings Results
0.5287110805511475

Hybrid (Result Set vector,hybridVector) Document b49744c9-6f88-5943

## Close the client

Always close your connection when finished.
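
If a cell raises an error before the final `close()` call, the connection can be left open. A general Python pattern that guards against this is `contextlib.closing`, sketched below with a hypothetical `StandInClient` class so the example runs without a Weaviate instance. Whether to adopt this pattern in a notebook is a judgment call, since notebooks usually keep the client alive across cells.

```python
from contextlib import closing

class StandInClient:
    """Hypothetical stand-in for any client object that exposes close()."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

stand_in = StandInClient()
try:
    # closing() calls stand_in.close() when the block exits, even on error.
    with closing(stand_in) as active_client:
        raise RuntimeError("simulated query failure")
except RuntimeError:
    pass

print(stand_in.closed)
```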

In [16]:
client.close()