In [1]:
import weaviate
client = weaviate.Client("http://localhost:8082")
client.schema.get()

{'classes': [{'class': 'Knowledge_chunk',
   'invertedIndexConfig': {'bm25': {'b': 0.75, 'k1': 1.2},
    'cleanupIntervalSeconds': 60,
    'stopwords': {'additions': None, 'preset': 'en', 'removals': None}},
   'moduleConfig': {'generative-openai': {},
    'text2vec-openai': {'model': 'ada',
     'modelVersion': '002',
     'type': 'text',
     'vectorizeClassName': True}},
   'multiTenancyConfig': {'enabled': False},
   'properties': [{'dataType': ['text'],
     'description': "This property was generated by Weaviate's auto-schema feature on Fri Jul 21 14:52:11 2023",
     'indexFilterable': True,
     'indexSearchable': True,
     'moduleConfig': {'text2vec-openai': {'skip': False,
       'vectorizePropertyName': False}},
     'name': 'source_title',
     'tokenization': 'word'},
    {'dataType': ['text'],
     'description': "This property was generated by Weaviate's auto-schema feature on Fri Jul 21 14:52:11 2023",
     'indexFilterable': True,
     'indexSearchable': True,
     'm

In [20]:
response = (
    client.query.get("Knowledge_chunk", ["body"])
    .with_limit(2)
    .with_bm25(query="search")
    .do()
)

In [21]:
response

{'data': {'Get': {'Knowledge_chunk': [{'body': "https://learn.microsoft.com/en-us/azure/search/vector-search-how-to-chunk-documents\n\nChunking large documents for vector search solutions in Cognitive Search\nArticle\n07/11/2023\n4 contributors\nIn this article\nWhy is chunking important?\nHow chunking fits into the workflow\nSimple example of how to create chunks with sentences\nTry it out: Chunking and vector embedding generation sample\nSee also\n Important\n\nVector search is in public preview under supplemental terms of use. It's available through the Azure portal, preview REST API, and alpha SDKs.\n\nThis article describes several approaches for chunking large documents so that you can generate embeddings for vector search. Chunking is only required if source documents are too large for the maximum input size imposed by models.\n\nWhy is chunking important?\nThe models"},
    {'body': 'example, when dealing with large documents, you might use variable-sized chunks, but also appen

Possible search syntax:

Search

What about a syntax like this, separating:
- Properties always available for Get
- Optional search operator (vector searches, bm25, hybrid)
- Optional boolean filter
- Chained 2nd operation (Generate / Ask)

```python
from weaviate.weaviate_classes import NearText, Filter, FilterOperator, Metadata

search_response = collection.search.get(
    # ===== Parameters always available for `GraphQL/get`
    properties=["chunk", "title"],
    limit=2,
    metadata=Metadata(vector=True),

    # ===== And optional parameters
    # e.g. `NearText`, `NearVector`, `BM25Search`, `HybridSearch`, etc.
    search_operator=NearText(
        query="multi tenancy",
        distance=0.85,
        # Also autocut, certainty, etc.
    ),

    # Add Boolean filters
    filter=Filter(
        operator=FilterOperator.LessThan,  # Enum
        path=["chunk_no"],
        value=5
    )
)
```
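
To make the proposal a bit more concrete, here is a minimal sketch of how `Filter` and `NearText` objects could be turned into the plain dictionaries that the current builder methods (`.with_where()` and `.with_near_text()`) accept. The class bodies, the `to_where()` / `to_near_text()` method names, and the type-to-key mapping are all assumptions made for illustration. None of this exists in the client today.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List, Optional


class FilterOperator(str, Enum):
    # Hypothetical enum; values mirror GraphQL `where` operator names.
    Equal = "Equal"
    LessThan = "LessThan"
    GreaterThan = "GreaterThan"


@dataclass
class Filter:
    # Hypothetical filter object from the proposal above.
    operator: FilterOperator
    path: List[str]
    value: Any

    def to_where(self) -> Dict[str, Any]:
        # Pick the typed value key from the Python type (assumed mapping).
        # bool is checked before int because bool is a subclass of int.
        if isinstance(self.value, bool):
            key = "valueBoolean"
        elif isinstance(self.value, int):
            key = "valueInt"
        elif isinstance(self.value, float):
            key = "valueNumber"
        else:
            key = "valueText"
        return {"path": self.path, "operator": self.operator.value, key: self.value}


@dataclass
class NearText:
    # Hypothetical search operator object from the proposal above.
    query: str
    distance: Optional[float] = None

    def to_near_text(self) -> Dict[str, Any]:
        args: Dict[str, Any] = {"concepts": [self.query]}
        if self.distance is not None:
            args["distance"] = self.distance
        return args


where = Filter(operator=FilterOperator.LessThan, path=["chunk_no"], value=5).to_where()
near_text = NearText(query="multi tenancy", distance=0.85).to_near_text()
```

With something like this, `collection.search.get(...)` could stay a thin layer that only assembles the existing query from typed arguments.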

For Generative / Ask, chaining makes sense to me, because generate is conceptually a two-step query: first retrieve the objects, then pass them to the generative step. What do you think?
```python
from weaviate.weaviate_classes import NearText

generative_response = collection.search.get(
    limit=2,
    properties=["chunk", "title"],
    search_operator=NearText(
        query="multi tenancy",
    ),
).with_generate(
    single_prompt="turn this into a country song verse",
    properties=["chunk"]
)
```

Two-step generative
In some situations it might be nice to run the generative step separately, on a search response we already have. Maybe we could do this with returned IDs? 🤔
```python
generative_response = search_response.with_generate(
    single_prompt="turn this into a country song verse",
    properties=["chunk"]
)
```
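
One way the "returned IDs" idea could work, as a rough sketch: the search response remembers the IDs it returned, and `with_generate` builds a second request restricted to exactly those objects. `SearchResponse`, `GenerateRequest`, and their fields are hypothetical names. The use of `valueText` for ID filters is also an assumption (older servers may expect `valueString`), and nothing here is sent to a server.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class GenerateRequest:
    # Hypothetical description of the second (generative) query.
    collection: str
    where: Dict[str, Any]
    single_prompt: str
    properties: List[str]


@dataclass
class SearchResponse:
    # Hypothetical response object that keeps the IDs of the returned objects.
    collection: str
    object_ids: List[str] = field(default_factory=list)

    def with_generate(self, single_prompt: str, properties: List[str]) -> GenerateRequest:
        if not self.object_ids:
            raise ValueError("no objects to generate from")
        # Restrict the second query to the objects found by the first one:
        # an `Or` over one `Equal` filter per ID.
        operands = [
            {"path": ["id"], "operator": "Equal", "valueText": object_id}
            for object_id in self.object_ids
        ]
        return GenerateRequest(
            collection=self.collection,
            where={"operator": "Or", "operands": operands},
            single_prompt=single_prompt,
            properties=properties,
        )


# Placeholder IDs, for illustration only.
search_response = SearchResponse("Knowledge_chunk", ["uuid-1", "uuid-2"])
generative_request = search_response.with_generate(
    single_prompt="turn this into a country song verse",
    properties=["chunk"],
)
```

The trade-off is a second round trip, but the first search does not have to be re-run with the same operator and filter.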

Aggregate

The `Get` query above should translate relatively well to aggregate, because `search_operator` and `filter` apply to both query types. The user then only needs to select the meta properties.
```python
from weaviate.weaviate_classes import NearText, Filter, FilterOperator, MetaProperty

search_response = collection.search.aggregate(
    # ===== Parameters always available for `GraphQL/aggregate`
    meta_properties=[  # *shrug* maybe something like this?
        {"title": [
            MetaProperty.Text.COUNT,
            MetaProperty.Text.Top_occurrences.VALUE,
            MetaProperty.Text.Top_occurrences.COUNT,
        ]},
        {"chunk_no": [
            MetaProperty.Int.COUNT,
            MetaProperty.Int.MEAN,
        ]},
    ],
    object_limit=1000,

    # ===== And optional parameters
    search_operator=NearText(
        query="multi tenancy",
        distance=0.85,
    ),
    filter=Filter(
        operator=FilterOperator.LessThan,
        path=["chunk_no"],
        value=5
    )
)
```
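
As a sanity check on the `meta_properties` shape, here is a small sketch of how it could be flattened into the field selection of a GraphQL `Aggregate` query. The `MetaProperty` layout (plain string constants, with a dotted path for nested fields) and the `build_selection` helper are assumptions for illustration. The only thing taken from GraphQL itself is that `topOccurrences` exposes `value` and `occurs`.

```python
from typing import Dict, List


class MetaProperty:
    # Hypothetical constants; values are GraphQL Aggregate field names.
    class Text:
        COUNT = "count"

        class Top_occurrences:
            VALUE = "topOccurrences.value"
            COUNT = "topOccurrences.occurs"

    class Int:
        COUNT = "count"
        MEAN = "mean"


def build_selection(meta_properties: List[Dict[str, List[str]]]) -> str:
    """Turn the proposed `meta_properties` list into a GraphQL field selection."""
    parts = []
    for entry in meta_properties:
        for prop, metas in entry.items():
            flat: List[str] = []
            nested: Dict[str, List[str]] = {}
            for meta in metas:
                if "." in meta:
                    # Dotted constants describe a nested field, e.g. topOccurrences.value
                    parent, child = meta.split(".", 1)
                    nested.setdefault(parent, []).append(child)
                else:
                    flat.append(meta)
            fields = flat + [
                f"{parent} {{ {' '.join(children)} }}"
                for parent, children in nested.items()
            ]
            parts.append(f"{prop} {{ {' '.join(fields)} }}")
    return " ".join(parts)


selection = build_selection([
    {"title": [
        MetaProperty.Text.COUNT,
        MetaProperty.Text.Top_occurrences.VALUE,
        MetaProperty.Text.Top_occurrences.COUNT,
    ]},
    {"chunk_no": [
        MetaProperty.Int.COUNT,
        MetaProperty.Int.MEAN,
    ]},
])
```

Here `selection` comes out as `title { count topOccurrences { value occurs } } chunk_no { count mean }`, which suggests the nested-constant idea maps onto the existing query shape without much glue.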



```python
data.search.aggregate(
    searchOperators=[
        NearText(query="italian pizza"),
        Filter(property="price", operator="GreaterThan", value=100)
    ],
    meta_properties=[
        {"title": [
            MetaProperty.Text.COUNT,
            MetaProperty.Text.Top_occurrences.VALUE,
            MetaProperty.Text.Top_occurrences.COUNT,
        ]},
        {"page_no": [
            MetaProperty.Int.COUNT,
            MetaProperty.Int.MEAN,
        ]},
    ],
    count=True,
    limit=10,
)
```