# Query data

Learn three search methods: **Vector** (semantic similarity), **Keyword** (BM25), and **Hybrid** (combines both).

## Connect to Weaviate

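Connect to a Weaviate Cloud instance using the cluster URL and API key stored in environment variables. The commented-out block shows the equivalent call for a custom deployment.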

In [1]:
import weaviate
from weaviate.classes.init import Auth
import os
# from weaviate.classes.init import AdditionalConfig, Timeout

# client = weaviate.connect_to_custom(
#     http_host="<http_host>",
#     http_port="<http_port>",
#     grpc_host="<grpc_host>",
#     grpc_port="<grpc_port>",
# )

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=os.environ["WCD_TEST_URL"],
    auth_credentials=Auth.api_key(os.environ["WCD_TEST_KEY"]),  # wrap the key with the imported Auth helper
)

client.is_ready()

True

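### Get the collection

Get our collection of financial articles and check how many objects it currently holds. The source dataset has 5000 articles, but the count below reflects whatever was imported into this instance.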

In [None]:
articles = client.collections.use("FinancialArticles")

In [3]:
len(articles)

100

## Vector search

![images/search_vector.png](images/search_vector.png)

[Docs - near_text](https://weaviate.io/developers/weaviate/search/similarity#an-input-medium)

In [4]:
response = articles.query.near_text(
    query="tech market trends",
    limit=5,
    target_vector="title"
)

for item in response.objects:
    print(item.properties["article_title"])

Wednesday Sector Laggards: Utilities, Technology & Communications
Technology Sector Update for 07/14/2023: ASML, WDC, META, OPRA
Does Nvidia Have a $1 Trillion Market Opportunity?
A Bull Market Is Coming: 2 Phenomenal Growth Stocks Insiders Are Buying Like There's No Tomorrow
Analog Devices (ADI) Q4 Earnings to Ride on End Market Growth


In [5]:
from weaviate.classes.query import MetadataQuery

response = articles.query.near_text(
    query="tech market trends",
    limit=5,
    target_vector="title",
    return_metadata=MetadataQuery(distance=True)
)

for item in response.objects:
    print(item.properties["article_title"])
    print(item.metadata.distance)

Wednesday Sector Laggards: Utilities, Technology & Communications
0.6580092310905457
Technology Sector Update for 07/14/2023: ASML, WDC, META, OPRA
0.6774057149887085
Does Nvidia Have a $1 Trillion Market Opportunity?
0.6901583075523376
A Bull Market Is Coming: 2 Phenomenal Growth Stocks Insiders Are Buying Like There's No Tomorrow
0.7020165920257568
Analog Devices (ADI) Q4 Earnings to Ride on End Market Growth
0.707457423210144
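
The `distance` values above are sorted from smallest to largest: a smaller distance means a closer match. As a minimal sketch of what such a number represents, assuming the collection uses cosine distance (the notebook does not state which metric is configured), the calculation for toy two-dimensional vectors looks like this. The vectors are made up for illustration and are not real embeddings.

```python
import math


def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity (0 = same direction, 2 = opposite)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Toy vectors (assumption: real title vectors have hundreds of dimensions)
query = [1.0, 0.0]
print(cosine_distance(query, [1.0, 0.0]))   # same direction -> 0.0
print(cosine_distance(query, [0.0, 1.0]))   # orthogonal -> 1.0
print(cosine_distance(query, [-1.0, 0.0]))  # opposite direction -> 2.0
```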


## Filters

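Add conditions to restrict which objects a query can return. The `like` operator matches text patterns and supports two wildcards: `*` (zero or more characters) and `?` (exactly one character). Conditions can be combined with `&`.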

### Fetch with filters

In [6]:
from weaviate.classes.query import Filter

response = articles.query.fetch_objects(
    limit=5,
    filters=Filter.by_property("article_title").like("apple")
)

print(f"Returned object count: {len(response.objects)}")

for item in response.objects:
    print(item.properties["article_title"])

Returned object count: 0


In [7]:
from weaviate.classes.query import Filter

response = articles.query.fetch_objects(
    limit=5,
    filters=(
        Filter.by_property("article_title").like("aust*") &
        Filter.by_property("article_title").like("ev")
    )
)

print(f"Returned object count: {len(response.objects)}")

for item in response.objects:
    print(item.properties["article_title"])

Returned object count: 0
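
To reason about why a `like` pattern does or does not match, it can help to emulate the wildcard rules locally. The sketch below is an approximation, not Weaviate's implementation: it assumes the pattern is compared against lowercased word tokens of the property, and the titles are hypothetical examples rather than objects from the collection.

```python
import re


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (assumed tokenization)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def like(pattern, text):
    """Return True if any token matches the pattern ('*' = any run, '?' = one char)."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern.lower()
    )
    return any(re.fullmatch(regex, token) for token in tokenize(text))


# Hypothetical title, used only to illustrate the combined filter above
title = "Austin EV maker expands production"
print(like("aust*", title) and like("ev", title))  # both conditions hold
print(like("ev", "Seven reasons to buy"))          # 'ev' inside 'seven' is not a whole token
```

Under these assumptions, a pattern without wildcards only matches a complete token, which is one reason a filter can return fewer objects than expected.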


### Search with filters

Combine vector search with a filter: only objects that pass the filter are ranked by similarity to the query.

In [8]:
from weaviate.classes.query import Filter

response = articles.query.near_text(
    query="strategy",
    target_vector="title",
    limit=5,
    filters=Filter.by_property("article_title").like("netflix")
)

print(f"Returned object count: {len(response.objects)}")

for item in response.objects:
    print(item.properties["article_title"])

Returned object count: 0


## Keyword search

![images/search_keyword.png](images/search_keyword.png)

[Docs - keyword/bm25](https://weaviate.io/developers/weaviate/search/bm25)

In [9]:
response = articles.query.bm25(
    query="earnings report",
    query_properties=["article_title"],
    limit=5,
)

for item in response.objects:
    print(item.properties["article_title"])

Microsoft says product chief Panay to leave, report says headed to Amazon
Cirrus Logic (CRUS) Q2 Earnings & Revenues Top, Shares Up
Ansys Q2 22 Earnings Conference Call At 8:30 AM ET
Analog Devices (ADI) Q4 Earnings to Ride on End Market Growth
PepsiCo (PEP) Q4 Earnings: Taking a Look at Key Metrics Versus Estimates


In [10]:
from weaviate.classes.query import MetadataQuery

response = articles.query.bm25(
    query="earnings report",
    query_properties=["article_title"],
    limit=5,
    return_metadata=MetadataQuery(score=True)
)

for item in response.objects:
    print(item.properties["article_title"])
    print(item.metadata.score)

Microsoft says product chief Panay to leave, report says headed to Amazon
1.4298129081726074
Cirrus Logic (CRUS) Q2 Earnings & Revenues Top, Shares Up
1.0003288984298706
Ansys Q2 22 Earnings Conference Call At 8:30 AM ET
0.92288738489151
Analog Devices (ADI) Q4 Earnings to Ride on End Market Growth
0.92288738489151
PepsiCo (PEP) Q4 Earnings: Taking a Look at Key Metrics Versus Estimates
0.8884954452514648
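
The `score` values come from BM25, which rewards documents that contain the query terms, gives more weight to rarer terms, and dampens the effect of long documents. The following is a minimal, self-contained sketch of the textbook BM25 formula, not Weaviate's implementation. The `k1` and `b` values are commonly used defaults and the three documents are made up, so the numbers will not match the output above.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with the classic BM25 formula."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            doc_freq = sum(1 for other in tokenized if term in other)
            if doc_freq == 0:
                continue  # term appears nowhere, contributes nothing
            idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
            freq = term_freq[term]
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * freq * (k1 + 1) / (freq + k1 * length_norm)
        scores.append(score)
    return scores


# Hypothetical titles for illustration
docs = [
    "quarterly earnings report released",
    "earnings call scheduled",
    "new phone launched",
]
for doc, score in zip(docs, bm25_scores("earnings report", docs)):
    print(f"{score:.3f}  {doc}")
```

The first title contains both query terms and scores highest, the second contains one, and the third scores zero.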


In [11]:
from weaviate.classes.query import MetadataQuery

response = articles.query.bm25(
    query="earnings report",
    query_properties=["article_title", "article"],
    limit=5,
    return_metadata=MetadataQuery(score=True)
)

for item in response.objects:
    print(item.properties["article_title"])
    print(item.metadata.score)

Cirrus Logic (CRUS) Q2 Earnings & Revenues Top, Shares Up
2.1827924251556396
Nokia (NOK) Solutions to Power Charter's Network Connectivity
2.061749219894409
Dollar Tree (DLTR) Stock Moves -0.48%: What You Should Know
2.014293909072876
Pre-Market Most Active for Feb 4, 2014 : JCP, ATMI, KORS, SIRI, ZNGA, BAC, XIV, FB, ARMH, ALU, NMR, DB
1.9457871913909912
Stocks End Mostly Lower in Late Selloff
1.8376561403274536


In [12]:
from weaviate.classes.query import MetadataQuery

response = articles.query.bm25(
    query="earnings report",
    query_properties=["article_title", "article^3"],
    limit=5,
    return_metadata=MetadataQuery(score=True)
)

for item in response.objects:
    print(item.properties["article_title"])
    print(item.metadata.score)

Nokia (NOK) Solutions to Power Charter's Network Connectivity
6.185247421264648
Dollar Tree (DLTR) Stock Moves -0.48%: What You Should Know
6.042881488800049
Pre-Market Most Active for Feb 4, 2014 : JCP, ATMI, KORS, SIRI, ZNGA, BAC, XIV, FB, ARMH, ALU, NMR, DB
5.8373613357543945
Stocks End Mostly Lower in Late Selloff
5.51296854019165
Cirrus Logic (CRUS) Q2 Earnings & Revenues Top, Shares Up
5.491669654846191
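
The `^3` suffix boosts matches in the `article` property. Comparing the scores printed by the last two cells makes the effect visible. The sketch below only divides the boosted score by the unboosted score for each title, using the values copied from the outputs above (titles shortened for readability).

```python
# Scores copied from the two outputs above: query_properties with "article" vs "article^3"
unboosted = {
    "Cirrus Logic": 2.1827924251556396,
    "Nokia": 2.061749219894409,
    "Dollar Tree": 2.014293909072876,
    "Pre-Market Most Active": 1.9457871913909912,
    "Stocks End Mostly Lower": 1.8376561403274536,
}
boosted = {
    "Cirrus Logic": 5.491669654846191,
    "Nokia": 6.185247421264648,
    "Dollar Tree": 6.042881488800049,
    "Pre-Market Most Active": 5.8373613357543945,
    "Stocks End Mostly Lower": 5.51296854019165,
}

# Ratio of boosted to unboosted score per title
ratios = {title: boosted[title] / unboosted[title] for title in unboosted}
for title, ratio in ratios.items():
    print(f"{title}: x{ratio:.3f}")
```

Four of the five scores are multiplied by almost exactly 3, while the Cirrus Logic score grows by roughly 2.5. One plausible reading, not confirmed by the source, is that the first four match only in the boosted `article` property, whereas the Cirrus Logic title also matches the query, so part of its score comes from the unboosted `article_title` property. That smaller gain explains why it drops from first to fifth place.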


## Hybrid search

![images/search_hybrid.png](images/search_hybrid.png)

[Docs - hybrid](https://weaviate.io/developers/weaviate/search/hybrid)

In [13]:
response = articles.query.hybrid(
    query="earnings report",
    query_properties=["article_title"],
    limit=5,
    alpha=0.7,
    target_vector="title"
)

for item in response.objects:
    print(item.properties["article_title"])

JD.com (JD) to Report Q3 Earnings: What's in the Offing?
Berkshire Hathaway Q2 Earnings Lag; to Ink its Biggest Deal - Analyst Blog
Applied Materials (AMAT) Surpasses Q1 Earnings and Revenue Estimates
Validea Guru Fundamental Report for AAPL - 2/25/2023
Fortinet's Q4 Earnings Lag, Rev Beats - Analyst Blog


### Hybrid - Explain score

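Request `explain_score` to see how hybrid search combines the keyword and vector scores. With `alpha=0.7`, the vector result set contributes up to 0.7 of the final score and the keyword (BM25) result set up to 0.3.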

In [14]:
from weaviate.classes.query import MetadataQuery

response = articles.query.hybrid(
    query="earnings report",
    query_properties=["article_title"],
    limit=5,
    alpha=0.7,
    target_vector="title",
    return_metadata=MetadataQuery(score=True, explain_score=True)
)

for item in response.objects:
    print(item.properties["article_title"])
    print(item.metadata.score)
    print(item.metadata.explain_score)

JD.com (JD) to Report Q3 Earnings: What's in the Offing?
1.0

Hybrid (Result Set keyword,bm25) Document de2e20e7-017b-5582-8610-1354b7ff9d37: original score 2.3183084, normalized score: 0.3 - 
Hybrid (Result Set vector,hybridVector) Document de2e20e7-017b-5582-8610-1354b7ff9d37: original score 0.29265332, normalized score: 0.7
Berkshire Hathaway Q2 Earnings Lag; to Ink its Biggest Deal - Analyst Blog
0.675672173500061

Hybrid (Result Set keyword,bm25) Document 3ebf30a7-badd-5e3b-9a08-066d0e18ca7a: original score 0.88849545, normalized score: 0.0065512634 - 
Hybrid (Result Set vector,hybridVector) Document 3ebf30a7-badd-5e3b-9a08-066d0e18ca7a: original score 0.2802862, normalized score: 0.6691209
Applied Materials (AMAT) Surpasses Q1 Earnings and Revenue Estimates
0.6292091012001038

Hybrid (Result Set keyword,bm25) Document d9d2b595-dd70-5eee-98e6-a865f97f2fcd: original score 1.0003289, normalized score: 0.029503489 - 
Hybrid (Result Set vector,hybridVector) Document d9d2b595-dd70-5eee
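
In the complete rows above, the final score equals the sum of the two normalized scores (for example, 0.3 + 0.7 = 1.0 for the top result). This is consistent with a relative score fusion: each result set is min-max normalized, then weighted by `1 - alpha` (keyword) and `alpha` (vector). The sketch below illustrates that idea under those assumptions. It is not Weaviate's implementation, the helper names are hypothetical, and the toy scores are made up, with higher meaning better in both sets.

```python
def min_max(scores):
    """Rescale scores to the 0..1 range within one result set."""
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - low) / (high - low) for doc_id, s in scores.items()}


def relative_score_fusion(keyword, vector, alpha):
    """Weight normalized keyword scores by (1 - alpha) and vector scores by alpha."""
    kw, vec = min_max(keyword), min_max(vector)
    fused = {
        doc_id: (1 - alpha) * kw.get(doc_id, 0.0) + alpha * vec.get(doc_id, 0.0)
        for doc_id in set(kw) | set(vec)
    }
    return dict(sorted(fused.items(), key=lambda item: item[1], reverse=True))


# Toy result sets (hypothetical IDs and scores)
keyword_scores = {"a": 2.0, "b": 1.0, "c": 0.5}
vector_scores = {"a": 0.9, "b": 0.8, "d": 0.1}
for doc_id, score in relative_score_fusion(keyword_scores, vector_scores, alpha=0.7).items():
    print(doc_id, round(score, 4))

# Check against the second result above: normalized keyword + normalized vector
print(abs((0.0065512634 + 0.6691209) - 0.675672173500061) < 1e-6)
```

An object that appears in only one result set still receives a score, but it can never exceed that set's weight.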

## Close the client

Always close your connection when finished.

In [15]:
client.close()
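
If a query raises an exception before `client.close()` runs, the connection stays open. A `try`/`finally` block avoids that. The sketch below shows the pattern with a hypothetical `StubClient` stand-in so it runs without a Weaviate instance; `run_queries` and `search` are illustrative names, not part of the Weaviate API.

```python
class StubClient:
    """Hypothetical stand-in for a client; it only tracks whether close() was called."""

    def __init__(self):
        self.closed = False

    def search(self):
        raise RuntimeError("simulated query failure")

    def close(self):
        self.closed = True


def run_queries(client):
    """Run queries and always close the client, even if a query fails."""
    try:
        return client.search()
    finally:
        client.close()


client = StubClient()
try:
    run_queries(client)
except RuntimeError as err:
    print(f"query failed: {err}")
print(client.closed)  # the connection was closed despite the error
```

The v4 Python client can also be used in a `with` statement, which closes the connection automatically when the block exits.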