# Getting Started with Weaviate Python Library

##  How to use the python-client with a weaviate cluster?

###  Create a Weaviate instance/cluster.

Creating a Weaviate instance can be done in multiple ways. One option is to use a `docker-compose.yaml` file.

Another option is to create an account on [Weaviate Cloud Service](https://console.semi.technology/) (WCS) and create a cluster there. There are different options for clusters you can choose from. 

Install the Weaviate Python client.

In [1]:
import sys
!{sys.executable} -m pip install weaviate-client==2.5.0 

Defaulting to user installation because normal site-packages is not writeable


__UPDATE__ for version __3.0.0__.

```python
import sys
!{sys.executable} -m pip install weaviate-client==3.0.0
```

Import the package and create a cluster on WCS.

In [2]:
from getpass import getpass # hide password
import weaviate # to communicate to the Weaviate instance
from weaviate.tools import WCS

__UPDATE__ for version __3.0.0__.

```python
from getpass import getpass # hide password
import weaviate # to communicate to the Weaviate instance
from weaviate.wcs import WCS
```

In order to authenticate to WCS or to a Weaviate instance (if the instance has authentication enabled) we need to create an authentication object. At the moment two types of authentication credentials are supported:
* Password credentials: `weaviate.auth.AuthClientPassword(username='WCS_ACCOUNT_EMAIL', password='WCS_ACCOUNT_PASSWORD')`
* Token credentials: `weaviate.auth.AuthClientCredentials(client_secret=YOUR_SECRET_TOKEN)`



In [3]:
my_credentials = weaviate.auth.AuthClientPassword(username=input("User name: "), password=getpass('Password: '))

User name: renukaalai@gmail.com
Password: ········


In [None]:
my_wcs = WCS(my_credentials)

Now that we are connected to WCS, we can use the `create`, `delete`, `get_clusters` and `get_cluster_config` methods, and check the status of a cluster with the `is_ready` method.

The prototype of the `create` method:
```python
my_wcs.create(cluster_name:str=None, 
    cluster_type:str='sandbox',
    config:dict=None,
    wait_for_completion:bool=True) -> str
```
The return value is the URL of the created cluster.

*If you want to check the prototype and docstring of any methods in a notebook, run this command: `object.method?`. You can also use the `help()` function.*<br>
Ex: `WCS.is_ready?` or `my_wcs.is_ready?` or `help(WCS.is_ready)`.

In [16]:
cluster_name = 'my-first-weaviate-instance'
weaviate_url = my_wcs.create(cluster_name=cluster_name)
weaviate_url

'https://my-first-weaviate-instance.semi.network'

In [17]:
my_wcs.is_ready(cluster_name)

True

###  Connect to the cluster.

Connect to the created Weaviate instance with the `Client` object. The constructor looks like this:
```python
weaviate.Client(
    url:str,
    auth_client_secret:weaviate.auth.AuthCredentials=None,
    timeout_config:Union[Tuple[int, int], NoneType]=None,
)
```

The constructor has only one required argument, `url`, and two optional ones: `auth_client_secret`, used if the Weaviate instance has authentication enabled, and `timeout_config`, which sets the REST timeout configuration and is a tuple (retries, timeout in seconds).

In [18]:
client = weaviate.Client(weaviate_url)

Creating the client does not necessarily mean that the instance is fully set up; it might still be running some setup processes in the background. We can check the health of the Weaviate instance by calling the `.is_live` method, and check whether Weaviate is ready for requests by calling the `.is_ready` method.
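
Since the instance may still be starting up, it can help to poll the readiness check rather than call it once. The sketch below is not part of `weaviate-client`: `wait_until_ready` is a hypothetical helper that accepts any zero-argument callable returning a boolean, such as `client.is_ready`, and the default timeout and interval are arbitrary choices.

```python
import time
from typing import Callable


def wait_until_ready(check: Callable[[], bool],
                     timeout: float = 60.0,
                     interval: float = 1.0) -> bool:
    """Call `check` repeatedly until it returns True or `timeout` seconds pass.

    `check` is any zero-argument callable, e.g. `client.is_ready`.
    Exceptions raised by `check` (for example while the instance is still
    unreachable) are treated as "not ready yet".
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            if check():
                return True
        except Exception:
            pass  # not reachable yet, try again
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With a client at hand this could be called as `wait_until_ready(client.is_ready, timeout=120)`.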

In [19]:
client.is_ready()

True

###  Get Data and Analyze it.

We are going to use news articles to construct Weaviate data. For this we need the `newspaper3k` package.

In [20]:
!{sys.executable} -m pip install newspaper3k

Defaulting to user installation because normal site-packages is not writeable


**UPDATE:** If none of the articles were downloaded, it might be because nltk punkt tools were not downloaded. To fix it please run the cell below.

In [21]:
import nltk # it is a dependency of newspaper3k
nltk.download('punkt')

[nltk_data] Downloading package punkt to /home/renuka/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


True

In [22]:
import newspaper
import uuid
import json
from tqdm import tqdm

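def get_articles_from_newspaper(
        news_url: str, 
        max_articles: int=100
    ) -> list:
    """
    Download newspaper articles and return them as a list of dicts.
    Parameters
    ----------
    news_url : str
        Newspaper URL.
    max_articles : int
        Maximum number of articles to keep, by default 100.
    Returns
    -------
    list
        One dict per article with the keys 'id', 'title', 'summary' and 'authors'.
    """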
    
    objects = []
    
    # Build the actual newspaper    
    news_builder = newspaper.build(news_url, memoize_articles=False)
    
    if max_articles > news_builder.size():
        max_articles = news_builder.size()
    pbar = tqdm(total=max_articles)
    pbar.set_description(f"{news_url}")
    i = 0
    while len(objects) < max_articles and i < news_builder.size():
        article = news_builder.articles[i]
        try:
            article.download()
            article.parse()
            article.nlp()

            if (article.title != '' and \
                article.title is not None and \
                article.summary != '' and \
                article.summary is not None and\
                article.authors):

                # create a UUID for the article using its URL
                article_id = uuid.uuid3(uuid.NAMESPACE_DNS, article.url)

                # create the object
                objects.append({
                    'id': str(article_id),
                    'title': article.title,
                    'summary': article.summary,
                    'authors': article.authors
                })
                
                pbar.update(1)

        except Exception:
            # something went wrong with getting the article, ignore it
            pass
        i += 1
    pbar.close()
    return objects

In [23]:
data = []
data += get_articles_from_newspaper('https://www.theguardian.com/international')
data += get_articles_from_newspaper('http://cnn.com')

https://www.theguardian.com/international: 100%|██████████| 100/100 [01:43<00:00,  1.04s/it]
http://cnn.com: 100%|██████████| 100/100 [04:01<00:00,  2.42s/it]


### Create appropriate data types.

In the function `get_articles_from_newspaper` we keep the _title, summary_ and _authors_ of the article. We also compute a UUID (Universally Unique IDentifier) for each article.
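
Deriving the UUID from the article URL with `uuid.uuid3` makes it deterministic: the same URL always maps to the same UUID, so downloading an article twice does not yield two identifiers. A minimal sketch of that property (the URLs are made-up placeholders and `article_uuid` is a name chosen for this example):

```python
import uuid


def article_uuid(url: str) -> str:
    """Return a name-based (version 3) UUID for an article URL."""
    return str(uuid.uuid3(uuid.NAMESPACE_DNS, url))


# placeholder URLs, used only for illustration
first = article_uuid('https://example.com/news/article-1')
again = article_uuid('https://example.com/news/article-1')
other = article_uuid('https://example.com/news/article-2')

print(first == again)  # same URL -> same UUID
print(first == other)  # different URL -> different UUID
```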

With this information at hand we can already define a schema, that is, a data structure for each object type and how they are related. The schema is a nested dictionary.

So let's create the `Article` class schema. The article has a ___title, summary___ and ___authors___.


In [24]:
article_class_schema = {
    # name of the class
    "class": "Article",
    # a description of what this class represents
    "description": "An Article class to store the article summary and its authors",
    # class properties
    "properties": [
        {
            "name": "title",
            "dataType": ["string"],
            "description": "The title of the article", 
        },
        {
            "name": "summary",
            "dataType": ["text"],
            "description": "The summary of the article",
        },
        {
            "name": "hasAuthors",
            "dataType": ["Author"],
            "description": "The authors this article has",
        }
    ]
}

In the class schema, we create a class named `Article` with the description `An Article class to store the article summary and its authors`. 

We also define 3 properties: `title` - the title of the article, of data type `string` (case sensitive); `summary` - the summary of the article, of data type `text` (case insensitive); `hasAuthors` - the authors of the article, of data type `Author`. 

**NOTE 1:** Property names should always be in camelCase format and start with a lowercase word.<br>
**NOTE 2:** The property data type is always a list because it can accept more than one data type. 
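
The two notes can be turned into a quick local sanity check before a class schema is sent anywhere. This is only a sketch: `check_class_schema` is a hypothetical helper, it covers just the two rules above, and it does not replace the validation Weaviate performs itself.

```python
def check_class_schema(class_schema: dict) -> list:
    """Return a list of human-readable problems found in a class schema dict."""
    problems = []
    for prop in class_schema.get('properties', []):
        name = prop.get('name', '')
        # property names should start with a lowercase letter (camelCase)
        if not name or not name[0].islower():
            problems.append(f"property name {name!r} should start with a lowercase letter")
        # the data type must be a list, even when it holds a single type
        if not isinstance(prop.get('dataType'), list):
            problems.append(f"dataType of {name!r} should be a list")
    return problems


good = {'class': 'Article', 'properties': [{'name': 'hasAuthors', 'dataType': ['Author']}]}
bad = {'class': 'Article', 'properties': [{'name': 'HasAuthors', 'dataType': 'Author'}]}

print(check_class_schema(good))  # no problems reported
print(check_class_schema(bad))   # one problem per violated rule
```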



Now let's create the `Author` class schema in the same manner, but with properties `name` and `wroteArticles`.

In [25]:
author_class_schema = {
    "class": "Author",
    "description": "An Author class to store the author information",
    "properties": [
        {
            "name": "name",
            "dataType": ["string"],
            "description": "The name of the author", 
        },
        {
            "name": "wroteArticles",
            "dataType": ["Article"],
            "description": "The articles of the author", 
        }
    ]
}
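
The two classes reference each other (`hasAuthors` points to `Author`, `wroteArticles` points to `Article`), so every referenced class has to be defined somewhere in the schema. The sketch below combines trimmed-down copies of both classes into one full schema dict and checks that no reference points to an undefined class. `missing_reference_targets` is a hypothetical helper, and it assumes that class names are capitalized while primitive data types are not. Passing such a dict to `client.schema.create` (instead of a file path) is also an assumption here.

```python
# Trimmed-down copies of the two class schemas (descriptions left out).
article_class = {
    'class': 'Article',
    'properties': [
        {'name': 'title', 'dataType': ['string']},
        {'name': 'summary', 'dataType': ['text']},
        {'name': 'hasAuthors', 'dataType': ['Author']},
    ],
}
author_class = {
    'class': 'Author',
    'properties': [
        {'name': 'name', 'dataType': ['string']},
        {'name': 'wroteArticles', 'dataType': ['Article']},
    ],
}

# A full schema is a dict with a "classes" list,
# the same shape that `client.schema.get()` returns.
full_schema = {'classes': [article_class, author_class]}


def missing_reference_targets(schema: dict) -> set:
    """Return referenced class names that the schema does not define."""
    defined = {c['class'] for c in schema['classes']}
    referenced = {
        data_type
        for c in schema['classes']
        for prop in c['properties']
        for data_type in prop['dataType']
        if data_type[0].isupper()  # assumption: only class names are capitalized
    }
    return referenced - defined


print(missing_reference_targets(full_schema))  # empty set: all references resolve
```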

In [27]:
# helper function
def prettify(json_dict): 
    print(json.dumps(json_dict, indent=2))

In [28]:
prettify(client.schema.get())

{
  "classes": [
    {
      "class": "Article",
      "description": "Auto generated class",
      "invertedIndexConfig": {
        "bm25": {
          "b": 0.75,
          "k1": 1.2
        },
        "cleanupIntervalSeconds": 60,
        "stopwords": {
          "additions": null,
          "preset": "en",
          "removals": null
        }
      },
      "properties": [
        {
          "dataType": [
            "text"
          ],
          "description": "Auto generated property",
          "name": "title",
          "tokenization": "word"
        },
        {
          "dataType": [
            "text"
          ],
          "description": "Auto generated property",
          "name": "summary",
          "tokenization": "word"
        }
      ],
      "shardingConfig": {
        "virtualPerPhysical": 128,
        "desiredCount": 1,
        "actualCount": 1,
        "desiredVirtualCount": 128,
        "actualVirtualCount": 128,
        "key": "_id",
        "strategy": "hash"

In [32]:
schema = client.schema.get() # save schema
client.schema.delete_all() # delete all classes
prettify(client.schema.get())

{
  "classes": []
}


This looks exactly like the schema we created class by class and property by property. We can now save the schema to a file and, in the next session, import it directly by providing the file path.
```python
# save schema to file
with open('schema.json', 'w') as outfile: 
    json.dump(schema, outfile)
# remove current schema from Weaviate, removes all the data too
client.schema.delete_all()
# import schema using file path
client.schema.create('schema.json')
# print schema
print(json.dumps(client.schema.get(), indent=2))
```

###  Load data.



Importing data to Weaviate can be done in 4 different ways.

1. Adding object by object iteratively. This can be done using the `data_object` object attribute of the client.
2. In batches. This can be done by creating an appropriate batch request object and submitting it using the `batch` object attribute of the client. (__Only in__ `weaviate-client` __version <3.0.0.__)
3. Using a `Batcher` object from the `weaviate.tools` module. (__Only in__ `weaviate-client` __version <3.0.0.__)
4. __New__ `Batch` __class introduced in weaviate-client version 3.0.0.__

We are going to see all of them in action, but first let's underline the differences between them.

- Option 1 is the safest method for adding data objects and creating references because it validates each object before creating it, whereas importing data in batches skips most of the validation in favor of speed. This option requires one REST request per object, so it is slower than importing data in batches. It is recommended if you are not sure whether your data is valid.


- Option 2, as mentioned above, skips most of the data validation and requires only one REST request per batch. For this method you add as much data as you want to a batch request (there are 2 types: `ReferenceBatchRequest` and `ObjectsBatchRequest`) and then submit it using the `batch` object attribute of the client. This option requires you to import the data objects first, and then the references (make sure that the objects used in a reference are already imported before creating the reference). (__Only in__ `weaviate-client` __version <3.0.0.__)


- Option 3 relies on the batch requests from option 2, but with a `Batcher` you do not have to submit any batch requests yourself: it submits them automatically when it is full. (__Only in__ `weaviate-client` __version <3.0.0.__)

- Option 4: __New__ `Batch` __class introduced in weaviate-client version 3.0.0.__ The new `Batch` object does not need the `BatchRequests` from option 2 but uses them internally. The new class also supports 3 different cases of loading data in batches: a) Manually - the user has full control over when and how to add and create batches; b) Auto-create batches when full; c) Auto-create batches using dynamic batching, i.e. the batch size is adjusted every time a batch is created to avoid any `Timeout` errors.
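
The "submit automatically when full" behavior of options 3 and 4 b) can be pictured with a small buffer class. This is a sketch of the idea only, not the `Batcher` or `Batch` implementation: `AutoFlushBuffer` and its `submit` callback are hypothetical names, and the callback stands in for the REST request a real client would send.

```python
from typing import Callable, List


class AutoFlushBuffer:
    """Collect items and hand them to `submit` in chunks of `batch_size`."""

    def __init__(self, submit: Callable[[List[dict]], None], batch_size: int = 3):
        self._submit = submit
        self._batch_size = batch_size
        self._items: List[dict] = []

    def add(self, item: dict) -> None:
        self._items.append(item)
        if len(self._items) >= self._batch_size:
            self.flush()  # the buffer is full, submit it

    def flush(self) -> None:
        if self._items:
            self._submit(self._items)
            self._items = []  # start a new batch, the submitted list is not reused

    def __enter__(self) -> 'AutoFlushBuffer':
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        self.flush()  # submit whatever is left


submitted = []  # stands in for the requests a real client would send
with AutoFlushBuffer(submitted.append, batch_size=3) as buffer:
    for i in range(7):
        buffer.add({'title': f'article {i}'})

print([len(batch) for batch in submitted])  # [3, 3, 1]
```

Flushing in `__exit__` is the reason such an object is used in a `with` block or closed explicitly: otherwise the last, partially filled batch would never be submitted.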

###  Load data using `data_object` attribute

For this case let's take only one article (`data[0]`) and import it to Weaviate using the `data_object` attribute.

The way to do it is to create the objects first and then the references that link them.

Each data object should have the same format as defined in the schema.

In [34]:
prettify(data[0])

{
  "id": "83a3d904-3961-31a1-97d4-0117450c5bc4",
  "title": "Russia-Ukraine war latest: Zelenskiy says Sievierodonetsk seeing most difficult fighting so far in war \u2013 live",
  "summary": "11m ago 05.39 Fight for Sievierodonetsk will decide fate of eastern Ukraine - ZelenskiyUkraine\u2019s president, Volodymyr Zelenskiy, has said the battle for the eastern city of Sievierodonetsk will decide the fate of Donbas and is seeing probably the most difficult fighting since Russia\u2019s invasion began.\n\u201cSievierodonetsk remains the epicentre of the confrontation in Donbas,\u201d Zelenskiy said in a late-night address to the nation on Wednesday evening, claiming that Ukraine had inflicted \u201csignificant losses on the enemy\u201d.\nHeavy fighting continues in Ukraine's eastern Donbas Heavy fighting continues in Ukraine\u2019s eastern DonbasSerhiy Haidai, the governor of Luhansk, said most of the city was now in Russian hands and that it was no longer possible to rescue civilians str

In [35]:
article_object = {
    'title': data[0]['title'],
    'summary': data[0]['summary'].replace('\n', '') # remove newline character
    # we leave out the `hasAuthors` because it is a reference and will be created after we create the Authors
}
article_id = data[0]['id']

# validate the object
result = client.data_object.validate(
    data_object=article_object,
    class_name='Article',
    uuid=article_id
)

prettify(result)

{
  "error": [
    {
      "message": "invalid object: class 'Article' not present in schema"
    }
  ],
  "valid": false
}


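If the object passes the validation test (`"valid": true`), it is safe to create/import it. The output above shows a failed validation: the `Article` class was not present in the schema, which happens, for example, after `client.schema.delete_all()` has removed it.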

In [36]:
# create the object
client.data_object.create(
    data_object=article_object,
    class_name='Article',
    uuid=article_id # if not specified, Weaviate is going to create a UUID for you.
)

'83a3d904-3961-31a1-97d4-0117450c5bc4'

`client.data_object.create` returns the UUID of the object. If you specified one, that same UUID is returned; if you did not, Weaviate generates one for you and returns it.

We have added our first object to Weaviate!

Now we can actually "get" this object from Weaviate by its UUID using the `get_by_id` or `get` method. (`get` without a UUID returns the first 100 objects.)

In [37]:
prettify(client.data_object.get(article_id, with_vector=False))

{
  "class": "Article",
  "creationTimeUnix": 1654750619898,
  "id": "83a3d904-3961-31a1-97d4-0117450c5bc4",
  "lastUpdateTimeUnix": 1654750619898,
  "properties": {
    "summary": "11m ago 05.39 Fight for Sievierodonetsk will decide fate of eastern Ukraine - ZelenskiyUkraine\u2019s president, Volodymyr Zelenskiy, has said the battle for the eastern city of Sievierodonetsk will decide the fate of Donbas and is seeing probably the most difficult fighting since Russia\u2019s invasion began.\u201cSievierodonetsk remains the epicentre of the confrontation in Donbas,\u201d Zelenskiy said in a late-night address to the nation on Wednesday evening, claiming that Ukraine had inflicted \u201csignificant losses on the enemy\u201d.Heavy fighting continues in Ukraine's eastern Donbas Heavy fighting continues in Ukraine\u2019s eastern DonbasSerhiy Haidai, the governor of Luhansk, said most of the city was now in Russian hands and that it was no longer possible to rescue civilians stranded there.But

###  Load data using batches
(__Only in__ `weaviate-client` __version < 3.0.0.__)

Importing data in batches is very similar to adding object by object.

The first thing we have to do is to create a `BatchRequest` object for each object type: `DataObject` and `Reference`. They are named accordingly: `ObjectsBatchRequest` and `ReferenceBatchRequest`.



In [41]:
from weaviate import ObjectsBatchRequest, ReferenceBatchRequest

Let's create a function that adds a single article to the batch request.

In [42]:
def add_article(batch: ObjectsBatchRequest, article_data: dict) -> str:
    
    article_object = {
        'title': article_data['title'],
        'summary': article_data['summary'].replace('\n', '') # remove newline character
    }
    article_id = article_data['id']
    
    # add article to the object batch request
    batch.add(
        data_object=article_object,
        class_name='Article',
        uuid=article_id
    )
    
    return article_id

Let's now create a function that adds a single author to the batch request, if the author was not already created.

In [43]:
def add_author(batch: ObjectsBatchRequest, author_name: str, created_authors: dict) -> str:
    
    if author_name in created_authors:
        # return author UUID
        return created_authors[author_name]
    
    # generate a UUID for the Author from the author's name
    author_id = generate_uuid(author_name)
    
    # add author to the object batch request
    batch.add(
        data_object={'name': author_name},
        class_name='Author',
        uuid=author_id
    )
    
    created_authors[author_name] = author_id
    return author_id

And the last function for adding cross references.

In [44]:
def add_references(batch: ReferenceBatchRequest, article_id: str, author_id: str)-> None:
    # add references to the reference batch request
    ## Author -> Article
    batch.add(
        from_object_uuid=author_id,
        from_object_class_name='Author',
        from_property_name='wroteArticles',
        to_object_uuid=article_id
    )
    
    ## Article -> Author 
    batch.add(
        from_object_uuid=article_id,
        from_object_class_name='Article',
        from_property_name='hasAuthors',
        to_object_uuid=author_id
    )

Now we can iterate through the data and import it using batches.

In [45]:
from weaviate.tools import generate_uuid # in version 3.0.0 it is weaviate.util.generate_uuid5
from tqdm.notebook import trange

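objects_batch = ObjectsBatchRequest()
reference_batch = ReferenceBatchRequest()
# NOTE: `created_authors` is a dict (author name -> UUID) of the authors created so far;
# define `created_authors = {}` first if it does not exist yet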

for i in trange(1, 100):
    
    # add article to batch request
    article_id = add_article(objects_batch, data[i])
    
    for author in data[i]['authors']:
        
        # add author to batch request
        author_id = add_author(objects_batch, author, created_authors)
        
        # add cross references to the reference batch
        add_references(reference_batch, article_id=article_id, author_id=author_id)
    
    if i % 20 == 0:
        # submit the object batch request to weaviate, can be done with method '.create_objects'
        client.batch.create(objects_batch)
        
        # submit the reference batch request to weaviate, can be done with method '.create_references'
        client.batch.create(reference_batch)
        
        # batch requests are not reusable, so we create new ones
        objects_batch = ObjectsBatchRequest()
        reference_batch = ReferenceBatchRequest()


# submit any objects that are left
status_objects = client.batch.create(objects_batch)
status_references = client.batch.create(reference_batch)

  0%|          | 0/99 [00:00<?, ?it/s]

In order to import data in batches we should create a `BatchRequest` object for the data object type we want to import. A batch request object does not have a size limit, so you can submit it whenever it holds as many objects as you want. (Keep in mind that a batch with too many objects might result in a TimeOut error, so keep it to a reasonable size that your Weaviate instance can process.) We also keep track of the authors we already created so we do not create the same author over and over again.
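
The author bookkeeping can be tried out without a Weaviate instance. In the sketch below a plain list stands in for the objects batch, the author names are made up, and `get_or_create_author_id` is a hypothetical helper. It uses `uuid.uuid5` with the DNS namespace as a stand-in for the client's UUID helper, which is an assumption of this example.

```python
import uuid


def get_or_create_author_id(author_name: str, created_authors: dict, new_authors: list) -> str:
    """Return the UUID for `author_name`, queuing the author for import only once."""
    if author_name in created_authors:
        return created_authors[author_name]
    # name-based UUID; the namespace choice is an assumption made for this sketch
    author_id = str(uuid.uuid5(uuid.NAMESPACE_DNS, author_name))
    created_authors[author_name] = author_id
    new_authors.append({'name': author_name, 'id': author_id})
    return author_id


created_authors = {}
new_authors = []  # stands in for the objects batch
# made-up author lists of three articles
for authors in (['Ann Lee', 'Bo Kim'], ['Ann Lee'], ['Cy Roe', 'Bo Kim']):
    for name in authors:
        get_or_create_author_id(name, created_authors, new_authors)

print(len(new_authors))  # 3 unique authors queued, not 5
```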


### Load data using a Batcher object.
(__Only in__ `weaviate-client` __version <3.0.0.__)

The `Batcher` is a class that automatically submits objects to Weaviate, both `DataObject`s and `Reference`s. The `Batcher` can be found in the `weaviate.tools` module:
```python
Batcher(
    client : weaviate.client.Client,
    batch_size : int=512,
    verbose : bool=False,
    auto_commit_timeout : float=-1.0,
    max_backoff_time : int=300,
    max_request_retries : int=4,
    return_values_callback : Callable=None,
)
```

In [46]:
from weaviate.tools import Batcher

For a `Batcher` we only need to add the objects we want to import to Weaviate. The `Batcher` has a special method to add `objects` (`batcher.add_data_object`) and a special method to add `references` (`batcher.add_reference`). It also provides a `batcher.add` method with keyword arguments, which detects what kind of data you are trying to add. The `batcher.add` method makes it possible to reuse the `add_article`, `add_author` and `add_references` functions we defined above.

**NOTE:** The `Batcher.add` was introduced in `weaviate-client` version 2.3.0. 

Let's use the batcher to add the remaining articles and authors from `data`. Because the `Batcher` automatically submits objects to Weaviate, we need to ALWAYS `.close()` it after we are done to make sure we are submitting what remains in the `Batcher`.



In [47]:
# we still need the 'created_authors' so we do not add the same author twice
with Batcher(client, 30, True) as batcher:
    for i in trange(100, 200):
        
        # add article to batcher
        article_id = add_article(batcher, data[i]) # NOTE the 'batcher' object instead of 'objects_batch'

        for author in data[i]['authors']:

            # add author to batcher
            author_id = add_author(batcher, author, created_authors) # NOTE the 'batcher' object instead of 'objects_batch'

            # add cross references to the batcher
            add_references(batcher, article_id=article_id, author_id=author_id) # NOTE the 'batcher' object instead of 'reference_batch'

Batcher object created!


  0%|          | 0/100 [00:00<?, ?it/s]

Updated object batch successfully
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: n

Updated object batch successfully
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors__meta_

Updated object batch successfully
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles__meta_

Updated object batch successfully
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors__meta_

Updated object batch successfully
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: n

Updated object batch successfully
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no b

Updated object batch successfully
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'hasAuthors__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: no bucket for prop 'wroteArticles__meta_count' found"}]}, 'status': 'FAILED'}}
{'result': {'errors': {'error': [{'message': "ref batch: write inverted batch: write additions: n

###  Query data.

Now we have the data imported and ready to be queried. Data can be queried by using the `query` attribute of the client object (`client.query`).

The data is queried using GraphQL syntax, and can be done in three different ways:
- **GET**: query that gets objects from Weaviate. 
    Use `client.query.get(class_name, properties).OTHER_OPTIONAL_FILTERS.do()`

- **AGGREGATE**: query that aggregates data.
    Use `client.query.aggregate(class_name, properties).OTHER_OPTIONAL_FILTERS.do()`
    
- Or use a GraphQL query represented as a `str`. <br>
    Use `client.query.raw()`
    
**NOTE:** Both `.get` and `.aggregate` require the call of the `.do()` method to run the query. `.raw()` does NOT.
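
For the third option the GraphQL query is an ordinary string. The sketch below assembles a simple `Get` query by hand; `build_get_query` is a hypothetical helper written for illustration, not part of the client, and it only covers the class name, the properties and an optional limit. The resulting string is the kind of value one would pass to `client.query.raw()`.

```python
from typing import List, Optional


def build_get_query(class_name: str, properties: List[str], limit: Optional[int] = None) -> str:
    """Build a minimal GraphQL `Get` query string for one class."""
    arguments = f"(limit: {limit})" if limit is not None else ""
    fields = " ".join(properties)
    # doubled braces are literal braces inside an f-string
    return f"{{ Get {{ {class_name}{arguments} {{ {fields} }} }} }}"


query = build_get_query('Article', ['title', 'summary'], limit=5)
print(query)  # { Get { Article(limit: 5) { title summary } } }
```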

###  GET

In [53]:
result = client.query.get(class_name='Article', properties="title")\
    .do()
print(f"Number of articles returned: {len(result['data']['Get']['Article'])}")
result

Number of articles returned: 100


{'data': {'Get': {'Article': [{'title': 'Amazon natives hold on to tradition'},
    {'title': "Electric robots are mapping the seafloor, Earth's last frontier"},
    {'title': 'Scan gives Emma Raducanu hope she will be fit for Wimbledon'},
    {'title': "Nick Cannon says 'the stork is on the way' as he confirms he's having more children this year"},
    {'title': 'Moscow’s chief rabbi ‘in exile’ after resisting Kremlin pressure over war'},
    {'title': 'Nike is shutting down its Run Club app in China'},
    {'title': '‘The worst law on earth’: why the rich love London’s reputation managers'},
    {'title': "'This is Not America's Flag:' Artworks challenge what it means to be from the United States"},
    {'title': 'Six ways with Asian greens: ‘They’re almost like a cross between spinach and broccoli’'},
    {'title': 'Desert dancers highlight Andean culture'},
    {'title': 'I’m nearly 60. Here’s what I’ve learned about growing old so far'},
    {'title': "'Ms. Marvel' tackles a Musli

As we can see, the `result` contains only 100 articles. This is due to the default limit of 100. Let's change it.

In [59]:
result = client.query.get(class_name='Article', properties="title")\
    .with_limit(200)\
    .do()
print(f"Number of articles returned: {len(result['data']['Get']['Article'])}")
result

Number of articles returned: 200


{'data': {'Get': {'Article': [{'title': 'Amazon natives hold on to tradition'},
    {'title': "Electric robots are mapping the seafloor, Earth's last frontier"},
    {'title': 'Scan gives Emma Raducanu hope she will be fit for Wimbledon'},
    {'title': "Nick Cannon says 'the stork is on the way' as he confirms he's having more children this year"},
    {'title': 'Moscow’s chief rabbi ‘in exile’ after resisting Kremlin pressure over war'},
    {'title': 'Nike is shutting down its Run Club app in China'},
    {'title': '‘The worst law on earth’: why the rich love London’s reputation managers'},
    {'title': "'This is Not America's Flag:' Artworks challenge what it means to be from the United States"},
    {'title': 'Six ways with Asian greens: ‘They’re almost like a cross between spinach and broccoli’'},
    {'title': 'Desert dancers highlight Andean culture'},
    {'title': 'I’m nearly 60. Here’s what I’ve learned about growing old so far'},
    {'title': "'Ms. Marvel' tackles a Musli

We can do much more by stacking multiple methods. The available methods for `.get` are:
- `.with_limit` - set another limit of returned objects.
- `.with_near_object` - get objects that are similar to the object passed to this method.
- `.with_near_text` - get objects that are similar to the text passed to this method.
- `.with_near_vector` - get objects that are similar to the vector passed to this method.
- `.with_where` - get objects that are filtered using the `Where` filter

Also, instead of `.do()` one can use the `.build()` method, which returns the GraphQL query as a string. This string can be passed to the `.raw()` method.

**NOTE:** Only one `.with_near_*` can be used per query.

In [62]:
a=client.query.get(class_name='Article', properties="title")\
    .with_limit(5)\
    .with_near_text({'concepts': ['Fashion']})\
    .do()
a

{'errors': [{'locations': [{'column': 23, 'line': 1}],
   'message': 'Unknown argument "nearText" on field "Article" of type "GetObjectsObj". Did you mean "nearVector" or "nearObject"?',
   'path': None}]}
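
As the output above shows, a failed query does not raise an exception: the returned dict holds an `errors` key instead of `data`. A small wrapper can make that explicit before the result is used. This is a sketch: `extract_objects` is a hypothetical helper and the two sample dicts are shortened copies of responses shown in this notebook.

```python
def extract_objects(result: dict, class_name: str) -> list:
    """Return the objects of a `Get` response, or raise if the response holds errors."""
    if 'errors' in result:
        messages = '; '.join(error['message'] for error in result['errors'])
        raise RuntimeError(f"GraphQL query failed: {messages}")
    return result['data']['Get'][class_name]


# responses shaped like the ones shown in this notebook
ok_result = {'data': {'Get': {'Article': [{'title': 'Desert dancers highlight Andean culture'}]}}}
failed_result = {'errors': [{'locations': [{'column': 23, 'line': 1}],
                             'message': 'Unknown argument "nearText" on field "Article"',
                             'path': None}]}

print([article['title'] for article in extract_objects(ok_result, 'Article')])
try:
    extract_objects(failed_result, 'Article')
except RuntimeError as error:
    print(error)
```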

###  AGGREGATE

We can use `.aggregate` to count the number of objects that satisfy a specific condition.

In [57]:
# no filter, count all objects of class Article
client.query.aggregate(class_name='Article')\
    .with_meta_count()\
    .do()

{'data': {'Aggregate': {'Article': [{'meta': {'count': 200}}]}}}

In [58]:
# no filter, count all objects of class Author
client.query.aggregate(class_name='Author')\
    .with_meta_count()\
    .do()

{'data': {'Aggregate': {'Author': [{'meta': {'count': 231}}]}}}
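
The count sits a few levels deep in the response. A minimal sketch of reading it out, using the two responses shown above as sample data (`meta_count` is a name chosen for this example):

```python
def meta_count(result: dict, class_name: str) -> int:
    """Read the object count out of an `Aggregate` response with meta count enabled."""
    return result['data']['Aggregate'][class_name][0]['meta']['count']


# the two responses shown above
article_result = {'data': {'Aggregate': {'Article': [{'meta': {'count': 200}}]}}}
author_result = {'data': {'Aggregate': {'Author': [{'meta': {'count': 231}}]}}}

print(meta_count(article_result, 'Article'))  # 200
print(meta_count(author_result, 'Author'))    # 231
```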


Here are the methods that are supported by the `.aggregate`.

- `.with_meta_count` - sets meta count to True. Used to count objects per filtered group.
- `.with_fields` - fields to return by the aggregated query.
- `.with_group_by_filter` - set a `GroupBy` filter. 
- `.with_where` - aggregate objects using a `Where` filter. 
