# Query data

Learn three search methods: **Vector** (semantic similarity), **Keyword** (BM25), and **Hybrid** (combines both).

## Connect to Weaviate

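Refresh the AWS credentials, load the stored Weaviate IP address, then connect to the instance. The AWS keys are passed to Weaviate as request headers.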

In [None]:
# Refresh credentials & load the Weaviate IP
from helpers import update_creds

AWS_ACCESS_KEY, AWS_SECRET_KEY, AWS_SESSION_TOKEN = update_creds()

%store -r WEAVIATE_IP

In [None]:
import weaviate

client = weaviate.connect_to_local(
    host=WEAVIATE_IP,
    headers={
        "X-AWS-Access-Key": AWS_ACCESS_KEY,
        "X-AWS-Secret-Key": AWS_SECRET_KEY,
        "X-AWS-Session-Token": AWS_SESSION_TOKEN,
    }
)

client.is_ready()

### Helper function

Get our collection of financial articles (5000 articles total).

In [None]:
# STUDENT TODO:
# Instantiate the "FinancialArticles" collection object
# ADD YOUR CODE HERE

In [None]:
# STUDENT TODO:
# Check the collection size (hint: use len())
# ADD YOUR CODE HERE

### Fetch objects

In [None]:
# STUDENT TODO:
# Use .query.fetch_objects() to fetch 5 objects from the collection
# And iterate through the response to print the article titles
# Hint: response.objects gives you a list of the objects; each object has a .properties dictionary
# ADD YOUR CODE HERE

## Filters

Add conditions to narrow down search results before querying.
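
The `like` operator used in the exercises below takes wildcard patterns: `*` stands for zero or more characters and `?` for exactly one. The sketch below imitates that matching in plain Python with `fnmatch`, purely to build intuition. It assumes word-level matching on lowercased titles; the `titles` list and the `like` helper are hypothetical and are not part of the Weaviate client, and Weaviate itself evaluates filters against its own index rather than like this.

```python
from fnmatch import fnmatchcase

# Hypothetical sample titles, not taken from the FinancialArticles collection.
titles = [
    "Apple unveils new iPhone",
    "Australian EV sales surge",
    "Pineapple prices climb",
    "Austin startup raises funding",
]


def like(title, pattern):
    """Return True if any lowercased word in the title matches the wildcard pattern."""
    return any(fnmatchcase(word, pattern.lower()) for word in title.lower().split())


# Without a wildcard, the pattern has to match a whole word.
print([t for t in titles if like(t, "apple")])

# Surrounding wildcards also match words that merely contain "apple".
print([t for t in titles if like(t, "*apple*")])

# Requiring both conditions mirrors combining two Filter objects with `&`.
print([t for t in titles if like(t, "aust*") and like(t, "ev")])
```

In this toy model, `"apple"` skips "Pineapple prices climb" while `"*apple*"` includes it, which is worth keeping in mind when choosing a search pattern.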

### Fetch with filters

In [None]:
from weaviate.classes.query import Filter

# STUDENT TODO:
# Perform the same query as above, but filter the results to only include articles with "apple" in the title
# Hint: use Filter.by_property(<property_name>).like(<search_pattern>)
# ADD YOUR CODE HERE

print(f"Returned object count: {len(response.objects)}")

for item in response.objects:
    print(item.properties["article_title"])

In [None]:
from weaviate.classes.query import Filter

# STUDENT TODO:
# Try another filter - fetch articles with "aust*" and "ev" in the title
# Hint: use & to combine multiple Filter conditions
# ADD YOUR CODE HERE

print(f"Returned object count: {len(response.objects)}")

for item in response.objects:
    print(item.properties["article_title"])

## Keyword search

![images/search_keyword.png](images/search_keyword.png)

[Docs - keyword/bm25](https://weaviate.io/developers/weaviate/search/bm25)
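
To get a feel for what a BM25 score means, here is a minimal sketch of the classic Okapi BM25 formula over three made-up titles. The `docs` list and the `bm25_score` helper are hypothetical, and `k1=1.2` and `b=0.75` are commonly used defaults assumed here; Weaviate's own tokenization and scoring may differ in detail.

```python
import math
from collections import Counter

# Hypothetical mini-corpus, not taken from the FinancialArticles collection.
docs = [
    "quarterly earnings report beats expectations",
    "tech stocks rally after earnings",
    "central bank holds interest rates",
]
tokenized = [d.split() for d in docs]
avg_len = sum(len(t) for t in tokenized) / len(tokenized)


def bm25_score(query, doc_tokens, k1=1.2, b=0.75):
    """Score one tokenized document against a whitespace-separated query."""
    counts = Counter(doc_tokens)
    score = 0.0
    for term in query.split():
        # Number of documents that contain the term at least once.
        n = sum(1 for t in tokenized if term in t)
        # Rare terms get a higher inverse document frequency.
        idf = math.log((len(tokenized) - n + 0.5) / (n + 0.5) + 1)
        tf = counts[term]
        # Term frequency saturates with k1 and is normalized by document length via b.
        norm = tf + k1 * (1 - b + b * len(doc_tokens) / avg_len)
        score += idf * tf * (k1 + 1) / norm
    return score


scores = [bm25_score("earnings report", t) for t in tokenized]
for doc, s in sorted(zip(docs, scores), key=lambda pair: pair[1], reverse=True):
    print(f"{s:.3f}  {doc}")
```

The title containing both query terms ranks first, the one containing only "earnings" second, and the unrelated title scores zero.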

In [None]:
# STUDENT TODO:
# Construct a BM25 query to find articles about "earnings report"
# Query the "article_title" property
# Limit the results to 5 articles
# ADD YOUR CODE HERE

for item in response.objects:
    print(item.properties["article_title"])

In [None]:
from weaviate.classes.query import MetadataQuery

response = articles.query.bm25(
    query="earnings report",
    query_properties=["article_title"],
    limit=5,
    return_metadata=MetadataQuery(score=True)
)

for item in response.objects:
    print(item.properties["article_title"])
    print(item.metadata.score)

In [None]:
from weaviate.classes.query import MetadataQuery

# STUDENT TODO:
# Perform the same query, but also return the score metadata
# Hint: from weaviate.classes.query import MetadataQuery
# Use this in the return_metadata parameter: MetadataQuery(score=True)
# ADD YOUR CODE HERE

for item in response.objects:
    print(item.properties["article_title"])
    print(item.metadata.score)

## Vector search

![images/search_vector.png](images/search_vector.png)

[Docs - near_text](https://weaviate.io/developers/weaviate/search/similarity#an-input-medium)
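
Vector search ranks objects by the distance between their vector and the query vector, with smaller distances meaning closer matches. The sketch below assumes cosine distance (one minus cosine similarity) and uses made-up three-dimensional vectors; the `cosine_distance` helper and the `candidates` data are hypothetical, and a collection may be configured with a different metric.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity: 0 for the same direction, 2 for opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)


# Hypothetical query and candidate vectors.
query = [1.0, 0.0, 0.0]
candidates = {
    "same direction": [2.0, 0.0, 0.0],
    "orthogonal": [0.0, 3.0, 0.0],
    "opposite": [-1.0, 0.0, 0.0],
}

# Rank candidates from closest to farthest.
ranked = sorted(candidates, key=lambda name: cosine_distance(query, candidates[name]))
for name in ranked:
    print(name, round(cosine_distance(query, candidates[name]), 3))
```

Note that vector length does not matter here: `[2.0, 0.0, 0.0]` points the same way as the query, so its distance is zero.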

In [None]:
# STUDENT TODO:
# Perform a `near_text` query to find the top 5 articles
# with `title` target vector closest to "tech market trends"
# ADD YOUR CODE HERE

for item in response.objects:
    print(item.properties["article_title"])

In [None]:
from weaviate.classes.query import MetadataQuery

# STUDENT TODO:
# Perform the same query, but also return the distance metadata
# Use this in the return_metadata parameter: MetadataQuery(distance=True)
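# ADD YOUR CODE HERE

for item in response.objects:
    print(item.properties["article_title"])
    print(item.metadata.distance)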

## Search with filters

Combine vector search with filters for precise results.

In [None]:
from weaviate.classes.query import Filter

# STUDENT TODO:
# Perform the near_text query again, but filter the results to only include articles with "netflix" in the title
# Hint: use Filter.by_property(<property_name>).like(<search_pattern>)
# ADD YOUR CODE HERE

print(f"Returned object count: {len(response.objects)}")

for item in response.objects:
    print(item.properties["article_title"])

## Hybrid search

![images/search_hybrid.png](images/search_hybrid.png)

[Docs - hybrid](https://weaviate.io/developers/weaviate/search/hybrid)

In [None]:
# STUDENT TODO:
# Perform a hybrid query to find articles about "earnings report"
# Query the "article_title" property (the property used for BM25)
# Target the "title" vector (the vector used for near_text)
# Limit the results to 5 articles
# Use an alpha value of 0.7 to weight the vector search results more heavily
# ADD YOUR CODE HERE

for item in response.objects:
    print(item.properties["article_title"])

### Hybrid - Explain score

See how hybrid search combines vector and keyword scores.
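
As a rough mental model of that combination, the sketch below normalizes each result set's scores to the 0 to 1 range and then takes a weighted sum, with `alpha` as the weight on the vector side. This is a simplified illustration in the spirit of relative score fusion, not Weaviate's actual implementation; the `min_max` and `hybrid_scores` helpers and all score values are hypothetical.

```python
def min_max(scores):
    """Rescale a dict of scores to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def hybrid_scores(vector_scores, keyword_scores, alpha=0.7):
    """Weighted sum of normalized scores; alpha=1 is pure vector, alpha=0 is pure keyword."""
    v = min_max(vector_scores)
    k = min_max(keyword_scores)
    # An object missing from one result set contributes 0 from that side.
    return {
        obj_id: alpha * v.get(obj_id, 0.0) + (1 - alpha) * k.get(obj_id, 0.0)
        for obj_id in set(v) | set(k)
    }


# Hypothetical scores; higher means a better match in both dicts.
vector_scores = {"a": 0.9, "b": 0.5, "c": 0.1}
keyword_scores = {"a": 1.0, "c": 3.0, "d": 2.0}

fused = hybrid_scores(vector_scores, keyword_scores, alpha=0.7)
for obj_id, score in sorted(fused.items(), key=lambda pair: pair[1], reverse=True):
    print(obj_id, round(score, 3))
```

With `alpha=0.7`, object "a" wins on its vector score alone even though it has the weakest keyword score, which is the kind of trade-off the `explain_score` output helps you inspect.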

In [None]:
from weaviate.classes.query import MetadataQuery

# STUDENT TODO:
# Perform the same query, but also return the score and explain_score metadata
# Use this in the return_metadata parameter: MetadataQuery(score=True, explain_score=True)
# ADD YOUR CODE HERE

for item in response.objects:
    print(item.properties["article_title"])
    print(item.metadata.score)
    print(item.metadata.explain_score)

## Close the client

Always close your connection when finished.

In [None]:
client.close()