# Query data

Learn three search methods, **Vector** (semantic similarity), **Keyword** (BM25), and **Hybrid** (combines both), and how to narrow their results with filters.

## Connect to Weaviate

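Refresh the AWS credentials, load the stored Weaviate IP, then connect to the instance, passing the credentials as request headers.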

In [1]:
# Refresh credentials & load the Weaviate IP
from helpers import update_creds

AWS_ACCESS_KEY, AWS_SECRET_KEY, AWS_SESSION_TOKEN = update_creds()

%store -r WEAVIATE_IP

In [2]:
import weaviate

client = weaviate.connect_to_local(
    WEAVIATE_IP,
    headers = {
        "X-AWS-Access-Key": AWS_ACCESS_KEY,
        "X-AWS-Secret-Key": AWS_SECRET_KEY,
        "X-AWS-Session-Token": AWS_SESSION_TOKEN,
    }
)

client.is_ready()

True

### Helper function

Get a handle to the `FinancialArticles` collection (5,000 articles in total).

In [3]:
# STUDENT TODO:
# Instantiate the "FinancialArticles" collection object
# BEGIN_SOLUTION
articles = client.collections.use("FinancialArticles")
# END_SOLUTION

## Filters

Filters restrict which objects a query can return: only objects that match the conditions are considered.

### Fetch with filters

In [4]:
from weaviate.classes.query import Filter

# STUDENT TODO:
# Perform articles.query.fetch_objects, but add a filter so the results
# only include articles with "apple" in the title
# Hint: filters=Filter.by_property(<property_name>).like(<search_pattern>)
# BEGIN_SOLUTION
response = articles.query.fetch_objects(
    limit=5,
    filters=Filter.by_property("article_title").like("apple")
)
# END_SOLUTION

print(f"Returned object count: {len(response.objects)}")

for item in response.objects:
    print(item.properties["article_title"])

Returned object count: 5
Qualcomm, Jabil: Should Apple's Suppliers Look Forward To A Strong 2021?
Apple Implications Overshadow Qualcomm’s Earnings Beat
Apple's Steady Results Prove Why It's the Ultimate Warren Buffett Stock
Thursday’s Vital Data: Intel, Apple and Nvidia
Buying This Fund Is Like Buying Apple With a 12.1% Dividend


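Combine conditions with `&` (AND) or `|` (OR). In a `like` pattern, `*` matches zero or more characters.

In [5]: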
from weaviate.classes.query import Filter

response = articles.query.fetch_objects(
    limit=5,
    filters=(
        Filter.by_property("article_title").like("aust*") &
        Filter.by_property("article_title").like("ev")
    )
)

print(f"Returned object count: {len(response.objects)}")

for item in response.objects:
    print(item.properties["article_title"])

Returned object count: 2
ANALYSIS-Australia's push for faster EV uptake will be slow to charge
BMW AG (BAMXF) Announces Ambitious EV Plans in China, Austria
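
The two results above suggest that `like` is matched against individual lowercased words of the title rather than the raw string. The sketch below is a simplified, pure-Python model of that behavior, not Weaviate's implementation. It assumes word-style tokenization (lowercase, split on non-alphanumeric ASCII characters), and the helper names `tokenize` and `like` are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (simplified, ASCII only)."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

def like(text, pattern):
    """Return True if any token matches; * = any run of characters, ? = exactly one."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern.lower()
    )
    return any(re.fullmatch(regex, tok) for tok in tokenize(text))

titles = [
    "ANALYSIS-Australia's push for faster EV uptake will be slow to charge",
    "BMW AG (BAMXF) Announces Ambitious EV Plans in China, Austria",
    "Apple's Steady Results Prove Why It's the Ultimate Warren Buffett Stock",
]

# Same condition as the cell above: like("aust*") AND like("ev")
matches = [t for t in titles if like(t, "aust*") and like(t, "ev")]
for title in matches:
    print(title)
```

Under this model, `"ev"` only matches the standalone token `ev`, so a title containing "Prove" or "every" is not a match unless the pattern carries a wildcard such as `*ev*`.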


## Keyword search

![images/search_keyword.png](images/search_keyword.png)

[Docs - keyword/bm25](https://weaviate.io/developers/weaviate/search/bm25)

In [6]:
# STUDENT TODO:
# Keyword search for articles about "earnings report"
# Limit the results to 5 articles
# BEGIN_SOLUTION
response = articles.query.bm25(
    query="earnings report",
    limit=5,
)
# END_SOLUTION

for item in response.objects:
    print(item.properties["article_title"])

MercadoLibre (MELI) to Report Q2 Earnings: What's in Store?
Here's What Stood Out in Starbucks' Earnings Report
Can PVH Corp.'s (PVH) Q2 Earnings Maintain Solid Trend?
Fortune Brands (FBHS): What to Expect in Q3 Earnings?
Will Currency Volatility Keep Hurting Abbott's (ABT) Earnings?


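Retrieve the BM25 score for each result as well. This query also sets `query_properties`, so only `article_title` is searched and the ranking can differ from the previous cell.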

In [7]:
from weaviate.classes.query import MetadataQuery

response = articles.query.bm25(
    query="earnings report",
    query_properties=["article_title"],
    limit=5,
    return_metadata=MetadataQuery(score=True)
)

for item in response.objects:
    print(item.properties["article_title"])
    # STUDENT TODO - inspect keyword search "score"
    # BEGIN_SOLUTION
    print(item.metadata.score)
    # END_SOLUTION

Here's What Stood Out in Starbucks' Earnings Report
2.6926846504211426
MercadoLibre (MELI) to Report Q2 Earnings: What's in Store?
2.5838348865509033
TripAdvisor (TRIP) to Report Q3 Earnings: What's in Store?
2.5838348865509033
PACCAR (PCAR) to Report Q3 Earnings: Is a Beat in Store?
2.4834437370300293
Pre-Market Earnings Report for March 21, 2022 : PDD, MRNS, AXU
2.4834437370300293
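
To give a feel for where such scores come from, here is a minimal sketch of the textbook BM25 formula on a toy corpus of three titles. It is an illustration under stated assumptions, not Weaviate's code: the tokenizer is simplified, `k1=1.2` and `b=0.75` are assumed parameter values, and the numbers it produces will not match the scores above because the corpus statistics are different.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (simplified)."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document against the query with the standard BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            df = sum(1 for toks in tokenized if term in toks)  # document frequency
            if df == 0 or tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Longer-than-average documents are penalized through b
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "Here's What Stood Out in Starbucks' Earnings Report",
    "MercadoLibre (MELI) to Report Q2 Earnings: What's in Store?",
    "Tech Is a Long-Term Bullish Sector: Recent Meltdown Temporary",
]
scores = bm25_scores("earnings report", docs)
for title, score in zip(docs, scores):
    print(f"{score:.4f}  {title}")
```

Both of the first two titles contain each query term once, so the shorter title ranks higher; the third contains neither term and scores zero. Length normalization is also one plausible reason several results above share an identical score.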


## Vector search

![images/search_vector.png](images/search_vector.png)

[Docs - near_text](https://weaviate.io/developers/weaviate/search/similarity#an-input-medium)

In [8]:
# STUDENT TODO:
# Perform a `near_text` query to find the top 5 articles
# with `title` target vector closest to "tech market trends"
# BEGIN_SOLUTION
response = articles.query.near_text(
    query="tech market trends",
    limit=5,
    target_vector="title"
)
# END_SOLUTION

for item in response.objects:
    print(item.properties["article_title"])

Technology Sector Update for 07/28/2023: INTC, YOU, META
Technology Sector Update for 01/05/2022: GOOG,GOOGL,AMZN,AAPL,FB,WATT,GLBE,SHOP,SHOP.TO
Technology Sector Update for 12/12/2018: FB, INTC, PLAB
Technology Sector Update for 07/01/2022: META, ATVI, PING, MU
Tech Is a Long-Term Bullish Sector: Recent Meltdown Temporary


Request the vector distance for each result; a lower distance means a closer match.

In [9]:
from weaviate.classes.query import MetadataQuery

response = articles.query.near_text(
    query="tech market trends",
    limit=5,
    target_vector="title",
    return_metadata=MetadataQuery(distance=True)
)

for item in response.objects:
    print(item.properties["article_title"])
    print(item.metadata.distance)

Technology Sector Update for 07/28/2023: INTC, YOU, META
0.5352218151092529
Technology Sector Update for 01/05/2022: GOOG,GOOGL,AMZN,AAPL,FB,WATT,GLBE,SHOP,SHOP.TO
0.5543919801712036
Technology Sector Update for 12/12/2018: FB, INTC, PLAB
0.5621839165687561
Technology Sector Update for 07/01/2022: META, ATVI, PING, MU
0.5749168395996094
Tech Is a Long-Term Bullish Sector: Recent Meltdown Temporary
0.5751214027404785
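
The distances above are easier to read with the metric in mind. The sketch below assumes the collection uses cosine distance (one minus cosine similarity), which this notebook does not confirm; the vectors are hypothetical two-dimensional stand-ins for real embeddings.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity: 0 = same direction, 1 = orthogonal, 2 = opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

query = [1.0, 0.0]            # hypothetical query embedding
same_direction = [2.0, 0.0]   # same direction, different length
orthogonal = [0.0, 3.0]       # unrelated direction
opposite = [-1.0, 0.0]        # opposite direction

for name, vec in [("same", same_direction), ("orthogonal", orthogonal), ("opposite", opposite)]:
    print(name, round(cosine_distance(query, vec), 4))
```

Because only the direction matters, vector length has no effect on the result, and results are returned in order of increasing distance.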


## Search with filters

Combine vector search with a filter: only objects that pass the filter are ranked by vector similarity.

In [10]:
from weaviate.classes.query import Filter

response = articles.query.near_text(
    query="strategy",
    target_vector="title",
    limit=5,
    filters=Filter.by_property("article_title").like("netflix")
)

print(f"Returned object count: {len(response.objects)}")

for item in response.objects:
    print(item.properties["article_title"])

Returned object count: 5
Could This Be Netflix's Next Big Move?
The Zacks Analyst Blog Highlights: Time Warner, Netflix, Altria Group, Disney, CBS and Comcast
Netflix and Alphabet: What's New?
Netflix, Inc. and PayPal Holdings, Inc.: Long-Term Growth Stories Find Favor
Will Netflix (NFLX) Beat Earnings Estimates? - Analyst Blog
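
The query above still returns a full set of five results even though the filter is restrictive. One way to picture why is to compare filtering before ranking with filtering after ranking. The sketch below is a toy model under stated assumptions: the embeddings are hypothetical 2-D vectors, the filter is a plain substring check, and `pre_filter_search` and `post_filter_search` are illustrative names, not Weaviate functions.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))

# (title, hypothetical embedding)
docs = [
    ("Could This Be Netflix's Next Big Move?", [0.9, 0.1]),
    ("Netflix and Alphabet: What's New?", [0.6, 0.4]),
    ("Earnings Outlook Reflects Stability", [0.95, 0.05]),
    ("Technology Sector Update", [0.1, 0.9]),
]
query_vec = [1.0, 0.0]

def pre_filter_search(docs, query_vec, keyword, limit):
    """Keep only matching documents, then rank the survivors by distance."""
    allowed = [d for d in docs if keyword in d[0].lower()]
    ranked = sorted(allowed, key=lambda d: cosine_distance(d[1], query_vec))
    return [title for title, _ in ranked[:limit]]

def post_filter_search(docs, query_vec, keyword, limit):
    """Rank everything, take the top results, then drop non-matching ones."""
    ranked = sorted(docs, key=lambda d: cosine_distance(d[1], query_vec))
    return [title for title, _ in ranked[:limit] if keyword in title.lower()]

pre = pre_filter_search(docs, query_vec, "netflix", limit=2)
post = post_filter_search(docs, query_vec, "netflix", limit=2)
print(pre)
print(post)
```

In this toy data, filtering first fills the limit with two Netflix titles, while filtering afterwards loses a slot to a closer non-matching article and returns only one.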


## Hybrid search

![images/search_hybrid.png](images/search_hybrid.png)

[Docs - hybrid](https://weaviate.io/developers/weaviate/search/hybrid)

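The `alpha` parameter sets the balance between the two searches: `0` is pure keyword search and `1` is pure vector search.

In [11]: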
response = articles.query.hybrid(
    query="earnings report",
    query_properties=["article_title"],
    limit=5,
    target_vector="title",
    alpha=0.7
)

for item in response.objects:
    print(item.properties["article_title"])

New Quarterly Earnings Results for DIS, CMG, SNAP and More
Electronic Arts' (EA) Q3 Earnings and Revenues Increase Y/Y
Earnings Outlook Reflects Stability
A Slew of New Q2 Earnings Reports: SBUX, T, CMG, V & PYPL
Weekly Market Summary: Investors Digest Strong Earnings Reports


### Hybrid - Explain score

Request `explain_score` to see how each result's hybrid score is built from its vector and keyword contributions.

In [12]:
from weaviate.classes.query import MetadataQuery

response = articles.query.hybrid(
    query="earnings report",
    query_properties=["article_title"],
    limit=5,
    target_vector="title",
    alpha=0.7,
    return_metadata=MetadataQuery(score=True, explain_score=True)
)

for item in response.objects:
    print(item.properties["article_title"])
    print(item.metadata.score)
    print(item.metadata.explain_score)

New Quarterly Earnings Results for DIS, CMG, SNAP and More
0.699999988079071

Hybrid (Result Set vector,hybridVector) Document 174bf34a-639b-5229-bd24-532c7f91ccac: original score 0.38914192, normalized score: 0.7
Electronic Arts' (EA) Q3 Earnings and Revenues Increase Y/Y
0.6610763669013977

Hybrid (Result Set vector,hybridVector) Document 7c868813-e065-5e4a-9261-cfd7f197fcf7: original score 0.38081086, normalized score: 0.66107637
Earnings Outlook Reflects Stability
0.5437194108963013

Hybrid (Result Set keyword,bm25) Document f5b65499-8840-56fb-94d0-39f7e91acc86: original score 1.2311968, normalized score: 0.038721338 - 
Hybrid (Result Set vector,hybridVector) Document f5b65499-8840-56fb-94d0-39f7e91acc86: original score 0.34740448, normalized score: 0.5049981
A Slew of New Q2 Earnings Reports: SBUX, T, CMG, V & PYPL
0.5307629704475403

Hybrid (Result Set vector,hybridVector) Document 1f0f68b8-561e-581d-b65a-ed76c73ef0ba: original score 0.3529191, normalized score: 0.530763
Weekly M
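
The explanations above are consistent with a relative score fusion: each result set is min-max normalized, vector scores are weighted by `alpha`, keyword scores by `1 - alpha`, and the weighted values are summed per document. The sketch below illustrates that idea in plain Python. The function names and the toy scores are hypothetical, and mapping a result set with identical scores to `1.0` is an assumption of this sketch.

```python
def min_max(scores):
    """Rescale scores to [0, 1]; a set with identical scores maps to 1.0 (assumption)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def relative_score_fusion(vector_scores, keyword_scores, alpha):
    """Blend normalized vector and keyword scores; alpha weights the vector side."""
    fused = {}
    for doc, s in min_max(vector_scores).items():
        fused[doc] = fused.get(doc, 0.0) + alpha * s
    for doc, s in min_max(keyword_scores).items():
        fused[doc] = fused.get(doc, 0.0) + (1 - alpha) * s
    return dict(sorted(fused.items(), key=lambda kv: kv[1], reverse=True))

# Hypothetical raw scores keyed by document id
vector_scores = {"a": 0.9, "b": 0.5, "c": 0.1}
keyword_scores = {"b": 3.0, "d": 1.0}

fused = relative_score_fusion(vector_scores, keyword_scores, alpha=0.7)
print(fused)

# Cross-check against the third result above: its two printed contributions
# add up to its reported hybrid score.
third_result = 0.038721338 + 0.5049981
print(round(third_result, 7))
```

With `alpha=0.7`, the best vector match contributes at most 0.7 and the best keyword match at most 0.3, which is why the top result above, found only by the vector search, scores 0.7.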

## Close the client

Always close your connection when finished.

In [13]:
client.close()