# Getting Started with Weaviate Python Library

* 1. What is Weaviate?
* 2. Where can it be used?
* 3. What are the advantages?
* 4. What is Weaviate Python Client?
* 5. How to use the Weaviate Python Client with a Weaviate cluster?
  * 5.0. Create a Weaviate instance/cluster.
  * 5.1. Connect to the cluster.
  * 5.2. Get Data and Analyze it.
  * 5.3. Create appropriate data types.
  * 5.4. Load data.
  * 5.5. Query data.

## 1. What is Weaviate?

Weaviate is an open-source, cloud-native, modular, real-time vector search engine. It is built to scale your machine learning models. Because Weaviate is modular, you can use it with any machine learning model that does data encoding. Weaviate comes with optional modules for text, images, and other media types, which can be chosen based on your task and data. You can also use more than one module, depending on the variety of your data. More information [here](https://www.semi.technology/developers/weaviate/current/).

In this article we are going to use the _text_ module to see the most important functionalities and capabilities of Weaviate. The text module, also called _text2vec-contextionary_, captures the semantic meaning of text objects and places them in a concept hyperspace. This enables semantic search, in contrast to the 'word matching search' that other search engines do.
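
To make the contrast with word matching concrete, here is a minimal sketch of similarity search over vectors. The three-dimensional vectors are made up for illustration (real text modules produce much larger vectors), and `cosine_similarity` and `most_similar` are hypothetical helpers, not part of Weaviate.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made 3-dimensional "embeddings" (hypothetical values, for illustration only)
toy_vectors = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.85, 0.15, 0.05],
    "banana": [0.0, 0.2, 0.95],
}

def most_similar(query, vectors):
    """Return the other word whose vector is closest to the query word's vector."""
    candidates = [word for word in vectors if word != query]
    return max(candidates, key=lambda word: cosine_similarity(vectors[query], vectors[word]))

print(most_similar("car", toy_vectors))
```

"car" and "automobile" share no characters that a word matcher could use, yet their vectors point in almost the same direction, which is what a vector search engine exploits.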

For more information about Weaviate and SeMI Technologies, the company that builds Weaviate, visit the official [website](https://www.semi.technology/).

## 2. Where can it be used?

At the moment, Weaviate is used for cases such as:
  - semantic search,
  - similarity search,
  - image search,
  - recommendation engines,
  - e-commerce search,
  - cybersecurity threat analysis,
  - automated data harmonization,
  - anomaly detection,
  - data classification in ERP systems,

and many more.

## 3. What are the advantages?

To understand Weaviate's advantages, ask yourself these questions:

 - Is the quality of the results that your current search engine gives you good enough?
 - Is it too much work to bring your machine learning models to scale?
 - Do you need to classify large datasets fast and in near real time?
 - Do you need to scale your machine learning models to production size?
 
Weaviate is designed to address all of these needs.

## 4. What is Weaviate Python Client?

The Weaviate Python Client is a Python package that allows you to connect to and interact with a Weaviate instance. The Python client is NOT a Weaviate instance, but you can use it to create one on the [Weaviate Cloud Service](https://console.semi.technology/). It provides an API for importing data, creating schemas, doing classification, querying data, and more. We are going to go through most of these and explain how and when to use them.

The package is published to PyPI ([link](https://pypi.org/project/weaviate-client/)). Also, a CLI tool is available on PyPI ([link](https://pypi.org/project/weaviate-cli/)).

## 5. How to use the Weaviate Python Client with a Weaviate cluster?

In this section we are going to go through the process of creating a Weaviate instance, connecting to it and exploring some of its functionalities.

### 5.0. Create a Weaviate instance/cluster.

Creating a Weaviate instance can be done in multiple ways. It can be done using a `docker-compose.yaml` file that can be generated [here](https://www.semi.technology/developers/weaviate/current/getting-started/installation.html#customize-your-weaviate-setup). For this option you have to have `docker` and `docker-compose` installed, and space on your drive.

Another option is to create an account on [Weaviate Cloud Service console](https://console.semi.technology/) (WCS console) and create a cluster there. There are different options for clusters you can choose from. If you do not have an account go ahead and create one.

In this tutorial we are going to create a cluster on WCS directly from python (you will only need your WCS credentials).

The first thing we have to do is install the Weaviate Python Client. This can be done with `pip`.

In [None]:
import sys
!{sys.executable} -m pip install weaviate-client==2.3.1

Now let's import the package and create a cluster on WCS.

In [None]:
from getpass import getpass # hide password
import weaviate # to communicate to the Weaviate instance
from weaviate.tools import WCS

In order to authenticate to WCS or to a Weaviate instance (if the Weaviate instance has authentication enabled) we need to create an authentication object. At the moment two types of authentication credentials are supported:
* Password credentials: `weaviate.auth.AuthClientPassword(username='WCS_ACCOUNT_EMAIL', password='WCS_ACCOUNT_PASSWORD')`
* Token credentials: `weaviate.auth.AuthClientCredentials(client_secret=YOUR_SECRET_TOKEN)`

For WCS we will use the password credentials.

In [None]:
my_credentials = weaviate.auth.AuthClientPassword(username=input("User name: "), password=getpass('Password: '))

The `my_credentials` object contains your credentials, so be careful not to make it public.
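
One way to keep credentials out of a shared notebook is to read them from environment variables instead of typing or hardcoding them. The sketch below is only an illustration: the variable names `WCS_USERNAME` and `WCS_PASSWORD` and the helper `read_wcs_credentials` are hypothetical, not something WCS or the client defines.

```python
import os

def read_wcs_credentials(env=os.environ):
    """Read the WCS e-mail and password from environment variables.

    The variable names WCS_USERNAME and WCS_PASSWORD are hypothetical.
    """
    try:
        return env["WCS_USERNAME"], env["WCS_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"Environment variable {missing} is not set") from None

# Demo with a plain dict standing in for os.environ
username, password = read_wcs_credentials(
    {"WCS_USERNAME": "me@example.com", "WCS_PASSWORD": "s3cret"}
)
print(username)
```

The two values could then be passed to `weaviate.auth.AuthClientPassword(username=username, password=password)` exactly as in the cell above.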

In [None]:
my_wcs = WCS(my_credentials)

Now that we are connected to WCS, we can use the `create`, `delete`, `get_clusters` and `get_cluster_config` methods, and check the status of a cluster with the `is_ready` method.

Here is the prototype of the `create` method:
```python
my_wcs.create(cluster_name:str=None, 
    cluster_type:str='sandbox',
    config:dict=None,
    wait_for_completion:bool=True) -> str
```
The return value is the URL of the created cluster.

*If you want to check the prototype and docstring of any methods in a notebook, run this command: `object.method?`. You can also use the `help()` function.*<br>
Ex: `WCS.is_ready?` or `my_wcs.is_ready?` or `help(WCS.is_ready)`.

In [None]:
cluster_name = 'my-first-weaviate-instance'
weaviate_url = my_wcs.create(cluster_name=cluster_name)
weaviate_url

In [None]:
my_wcs.is_ready(cluster_name)

### 5.1. Connect to the cluster.

Now we can connect to the created weaviate instance with the `Client` object. The constructor looks like this:
```python
weaviate.Client(
    url:str,
    auth_client_secret:weaviate.auth.AuthCredentials=None,
    timeout_config:Union[Tuple[int, int], NoneType]=None,
)
```

The constructor has only one required argument, `url`, and two optional ones: `auth_client_secret`, used if the Weaviate instance has authentication enabled, and `timeout_config`, which sets the REST timeout configuration and is a tuple (retries, timeout seconds). For more information about the arguments, look at the docstring.

In [None]:
client = weaviate.Client(weaviate_url)

Being connected to Weaviate does not necessarily mean that it is all set up. It might still be running some setup processes in the background. We can check the health of the Weaviate instance by calling the `.is_live` method, and check whether Weaviate is ready for requests by calling the `.is_ready` method.

In [None]:
client.is_ready()
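
If the instance is still starting up, the readiness check can be repeated until it succeeds. The helper below is a generic polling sketch: `wait_until_ready` is a hypothetical name, and it accepts any zero-argument callable, so a stand-in function is used here instead of a live cluster.

```python
import time

def wait_until_ready(is_ready, timeout=60.0, interval=2.0):
    """Poll `is_ready()` until it returns True or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# Demo with a stand-in callable that becomes ready on the third call
calls = {"n": 0}

def fake_is_ready():
    calls["n"] += 1
    return calls["n"] >= 3

print(wait_until_ready(fake_is_ready, timeout=5.0, interval=0.01))
```

With a real client, the call would be `wait_until_ready(client.is_ready)`.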

### 5.2. Get Data and Analyze it.

We have set up the Weaviate instance, connected to it and made sure it is ready for requests. Now we can take a step back, get some data and analyze it.

This step, as with all machine learning models, is the most important one. Here we have to decide what is relevant, what is important and what data structures/types to use.

In this example we are going to use news articles to construct the Weaviate data. For this we are going to need the `newspaper3k` package.

In [None]:
!{sys.executable} -m pip install newspaper3k

In [None]:
import newspaper
import uuid
import json
from tqdm import tqdm

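def get_articles_from_newspaper(
        news_url: str, 
        max_articles: int=100
    ) -> list:
    """
    Download newspaper articles and return them as a list of dictionaries.

    Parameters
    ----------
    news_url : str
        URL of the newspaper website.
    max_articles : int
        Maximum number of articles to collect, by default 100.

    Returns
    -------
    list
        One dict per article with the keys 'id', 'title', 'summary' and 'authors'.
    """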
    
    objects = []
    
    # Build the actual newspaper    
    news_builder = newspaper.build(news_url, memoize_articles=False)
    
    if max_articles > news_builder.size():
        max_articles = news_builder.size()
    pbar = tqdm(total=max_articles)
    pbar.set_description(f"{news_url}")
    i = 0
    while len(objects) < max_articles and i < news_builder.size():
        article = news_builder.articles[i]
        try:
            article.download()
            article.parse()
            article.nlp()

            if (article.title != '' and
                article.title is not None and
                article.summary != '' and
                article.summary is not None and
                article.authors):

                # create a UUID for the article using its URL
                article_id = uuid.uuid3(uuid.NAMESPACE_DNS, article.url)

                # create the object
                objects.append({
                    'id': str(article_id),
                    'title': article.title,
                    'summary': article.summary,
                    'authors': article.authors
                })
                
                pbar.update(1)

        except Exception:
            # something went wrong while getting the article; skip it
            pass
        i += 1
    pbar.close()
    return objects

In [None]:
data = []
data += get_articles_from_newspaper('https://www.theguardian.com/international')
data += get_articles_from_newspaper('http://cnn.com')

### 5.3. Create appropriate data types.

In the function `get_articles_from_newspaper` we keep the _title_, _summary_ and _authors_ of the article. We also compute a UUID (Universally Unique IDentifier) for each article. All of these fields can be seen in the cell above.
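
Because `uuid.uuid3` is name-based, the same URL always yields the same UUID, so downloading an article twice does not produce two different identifiers. A small sketch with made-up URLs (`article_uuid` is a hypothetical wrapper around the call used in the function above):

```python
import uuid

def article_uuid(url: str) -> str:
    """Derive a UUID from the article URL, mirroring the uuid3 call used above."""
    return str(uuid.uuid3(uuid.NAMESPACE_DNS, url))

first = article_uuid("https://example.com/news/story-1")
again = article_uuid("https://example.com/news/story-1")
other = article_uuid("https://example.com/news/story-2")

print(first == again)   # same URL, same UUID
print(first == other)   # different URL, different UUID
```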

With this information at hand we can already define a schema, that is, a data structure for each object type and how the types are related. The schema is a nested dictionary.

So let's create the `Article` class schema. We know that the article has a ___title___, ___summary___ and ___authors___.

More about schemas and how to create them can be found [here](https://www.semi.technology/developers/weaviate/current/data-schema/schema-configuration.html) and [here](https://www.semi.technology/developers/weaviate/current/restful-api-references/schema.html#parameters).

In [None]:
article_class_schema = {
    # name of the class
    "class": "Article",
    # a description of what this class represents
    "description": "An Article class to store the article summary and its authors",
    # class properties
    "properties": [
        {
            "name": "title",
            "dataType": ["string"],
            "description": "The title of the article", 
        },
        {
            "name": "summary",
            "dataType": ["text"],
            "description": "The summary of the article",
        },
        {
            "name": "hasAuthors",
            "dataType": ["Author"],
            "description": "The authors this article has",
        }
    ]
}

In the class schema above we create a class named `Article` with the description `An Article class to store the article summary and its authors`. The description is there to explain to the user what this class is about.

We also define 3 properties: `title`, the title of the article, of data type `string` (case sensitive); `summary`, the summary of the article, of data type `text` (case insensitive); and `hasAuthors`, the authors of the article, of data type `Author`. `Author` is NOT a primitive data type; it is another class that we have to define. The list of primitive data types can be found [here](https://www.semi.technology/developers/weaviate/current/data-schema/datatypes.html).

**NOTE 1:** Property names should always be in camelCase format and start with a lowercase word.<br>
**NOTE 2:** The property data type is always a list because it can accept more than one data type. 

Specifying another class as a data type is called cross-referencing. This way you can link your data objects to each other and create a relation graph.

Now let's create the `Author` class schema in the same manner, but with the properties `name` and `wroteArticles`.

In [None]:
author_class_schema = {
    "class": "Author",
    "description": "An Author class to store the author information",
    "properties": [
        {
            "name": "name",
            "dataType": ["string"],
            "description": "The name of the author", 
        },
        {
            "name": "wroteArticles",
            "dataType": ["Article"],
            "description": "The articles of the author", 
        }
    ]
}

Now that we decided on the data structure, we can tell Weaviate what kind of data we will import. This can be done by accessing the `schema` attribute of the client. 

A schema can be created in two different ways:
1. using the `.create_class()` method, which creates only one class per call.
2. using the `.create()` method, which creates multiple classes at once (useful if you have the whole schema).

We can also check whether a schema is present, or whether a particular class schema is present, with the `.contains()` method.

For more about schema methods, click [here](https://www.semi.technology/developers/weaviate/current/restful-api-references/schema.html).

Because we defined each class separately, we should use the `.create_class()` method.

```python
client.schema.create_class(schema_class:Union[dict, str]) -> None
```
It also accepts a file path or a URL to a class definition file.

In [None]:
client.schema.create_class(article_class_schema)

As we can see, we cannot create a class property that references a non-existing data type. This does not mean that the class `Article` was not created at all. Let's get the schema from Weaviate and look at what was created.

In [None]:
# helper function
def prettify(json_dict): 
    print(json.dumps(json_dict, indent=2))

In [None]:
prettify(client.schema.get())

The configurations we did not specify are not mandatory and were set to the default values.

As we can see, only the `hasAuthors` property was not created. So let's create the `Author` class.

In [None]:
client.schema.create_class(author_class_schema)
prettify(client.schema.get())

Now we have both classes created, but we still do not have the `hasAuthors` property. No worries, it can be created at any time, using the schema's attribute `property` and its method `create`.
```python
client.schema.property.create(schema_class_name:str, schema_property:dict) -> None
```

In [None]:
client.schema.property.create('Article', article_class_schema['properties'][2])

Now let's get the schema and see if it is what we expect it to be.

In [None]:
prettify(client.schema.get())

Everything is exactly as we intended.
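
The order used above (classes first, cross-reference properties afterwards) can be worked out from the schema dictionaries themselves. The sketch below assumes that primitive data types start with a lowercase letter (`string`, `text`) while class names start with an uppercase letter (`Author`); `split_references` is a hypothetical helper, not part of the client.

```python
import copy

def split_references(classes):
    """Separate reference properties from primitive ones.

    Assumes primitive data types start with a lowercase letter ('string', 'text')
    and class names start with an uppercase letter ('Author').
    Returns (classes without reference properties, [(class name, property), ...]).
    """
    stripped, references = [], []
    for class_schema in classes:
        class_copy = copy.deepcopy(class_schema)
        class_copy["properties"] = []
        for prop in class_schema.get("properties", []):
            if any(data_type[:1].isupper() for data_type in prop["dataType"]):
                references.append((class_schema["class"], prop))
            else:
                class_copy["properties"].append(prop)
        stripped.append(class_copy)
    return stripped, references

# Trimmed-down versions of the two class schemas from this section
article = {"class": "Article", "properties": [
    {"name": "title", "dataType": ["string"]},
    {"name": "hasAuthors", "dataType": ["Author"]},
]}
author = {"class": "Author", "properties": [
    {"name": "name", "dataType": ["string"]},
    {"name": "wroteArticles", "dataType": ["Article"]},
]}

classes, references = split_references([article, author])
print([c["class"] for c in classes])
print([(name, prop["name"]) for name, prop in references])
```

Each entry of `classes` could then be passed to `client.schema.create_class`, followed by one `client.schema.property.create(class_name, prop)` call per entry of `references`.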

If you do not want to think about which class was created when, and which properties might fail (because they reference classes that do not exist yet), there is a solution: create the whole schema at once with the `create` method. So let's delete the schema from Weaviate and see how it works.

In [None]:
schema = client.schema.get() # save schema
client.schema.delete_all() # delete all classes
prettify(client.schema.get())

*Note that if we delete the schema or a class we delete all the objects associated with it.*

Now let's create it from the saved schema.

In [None]:
client.schema.create(schema)
prettify(client.schema.get())

This looks exactly like the schema we created class by class and property by property. We can now save the schema in a file and, in the next session, import it directly by providing the file path.
```python
# save schema to file
with open('schema.json', 'w') as outfile: 
    json.dump(schema, outfile)
# remove current schema from Weaviate, removes all the data too
client.schema.delete_all()
# import schema using file path
client.schema.create('schema.json')
# print schema
print(json.dumps(client.schema.get(), indent=2))
```

### 5.4. Load data.

Now that we have our data ready, and Weaviate is aware of what kind of data we have, we can add the `Article` and `Author` objects to the Weaviate instance.

Importing data to Weaviate can be done in 3 different ways.

1. Adding object by object iteratively. This can be done using the `data_object` object attribute of the client.
2. In batches. This can be done by creating an appropriate batch request object and submitting it using the `batch` object attribute of the client.
3. Using a `Batcher` object from the `weaviate.tools` module.

We are going to see all of them in action, but first let's underline the differences between them.

- Option 1 is the safest way to add data objects and create references, because it validates each object before creating it, whereas importing data in batches skips most of the validation in favor of speed. This option requires one REST request per object and is therefore slower than importing data in batches. It is recommended if you are not sure that your data is valid.


- Option 2, as mentioned above, skips most of the data validation and requires only one REST request per batch. For this method you add as much data as you want to a batch request (there are 2 types: `ReferenceBatchRequest` and `ObjectsBatchRequest`) and then submit it using the `batch` object attribute of the client. This option requires you to import the data objects first and the references afterwards (make sure that the objects used in a reference are already imported before creating the reference).


- Option 3 relies on the batch requests from option 2, but with a `Batcher` you do not have to submit any batch requests yourself; it submits them automatically when it is full.

### 5.4.1 Load data using `data_object` attribute

For this case let's take only one article (`data[0]`) and import it to Weaviate using the `data_object` attribute.

The way to do it is to first create the objects and then the references that link them.

Run `client.data_object.create?` in a notebook to get more information about the method, or `help(client.data_object.create)` in a Python shell.

Each data object should have the same format as defined in the schema.
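
A quick local check can catch keys that the schema does not define before anything is sent to Weaviate. This is only a rough sketch with a hypothetical helper, `unknown_properties`; it does not replace the server-side validation shown in the next cells.

```python
def unknown_properties(data_object: dict, class_schema: dict) -> list:
    """Return the keys of `data_object` that the class schema does not define."""
    defined = {prop["name"] for prop in class_schema["properties"]}
    return sorted(key for key in data_object if key not in defined)

# Trimmed-down Article schema (only the property names matter here)
article_schema = {"class": "Article", "properties": [
    {"name": "title"}, {"name": "summary"}, {"name": "hasAuthors"},
]}

good = {"title": "A headline", "summary": "A short summary"}
bad = {"title": "A headline", "authors": ["Jane Doe"]}

print(unknown_properties(good, article_schema))
print(unknown_properties(bad, article_schema))
```

This is why the raw dictionaries in `data`, which carry `id` and `authors` keys, are reshaped before they are imported.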

In [None]:
prettify(data[0])

In [None]:
article_object = {
    'title': data[0]['title'],
    'summary': data[0]['summary'].replace('\n', '') # remove newline character
    # we leave out the `hasAuthors` because it is a reference and will be created after we create the Authors
}
article_id = data[0]['id']

# validate the object
result = client.data_object.validate(
    data_object=article_object,
    class_name='Article',
    uuid=article_id
)

prettify(result)

The object passed the validation test, so it is now safe to create/import it.

In [None]:
# create the object
client.data_object.create(
    data_object=article_object,
    class_name='Article',
    uuid=article_id # if not specified, Weaviate is going to create a UUID for you.
)

`client.data_object.create` returns the UUID of the object. If you specified one, that same UUID is returned; if you did not, Weaviate generates one for you and returns it.

Congratulations, we have added our first object to Weaviate!

Now we can actually "get" this object from Weaviate by its UUID using the `get_by_id` or `get` method. (`get` without a UUID returns the first 100 objects.)

In [None]:
prettify(client.data_object.get(article_id, with_vector=False))

Now let's create the authors and the cross-references between the `Article` and its `Author`s.

References are added in the same manner, but using the `client.data_object.reference.add` method.

In [None]:
# keep track of the authors already imported/created and their respective UUID
# because the same author can write more than one article.
created_authors = {}

for author in data[0]['authors']:
    # create Author
    author_object = {
        'name': author,
        # we leave out the `wroteArticles` because it is a reference and will be created after we create the Author
    }
    author_id = client.data_object.create(
        data_object=author_object,
        class_name='Author'
    )
    
    # add author to the created_authors
    created_authors[author] = author_id
    
    # add references
    ## Author -> Article
    client.data_object.reference.add(
        from_uuid=author_id,
        from_property_name='wroteArticles',
        to_uuid=article_id
    )
    ## Article -> Author 
    client.data_object.reference.add(
        from_uuid=article_id,
        from_property_name='hasAuthors',
        to_uuid=author_id
    )
    

In the cell above we iterate through all authors of the article. For each iteration we first create the `Author` then we add the references: the reference from `Author` to `Article` - linked via the `wroteArticles` property of the `Author`, and reference from `Article` to `Author` - through the `hasAuthors` property of the `Article`.

Note that it is not required to have bi-directional references.

Now let's get the object and take a look at it.

In [None]:
prettify(client.data_object.get(article_id, with_vector=False))

As we can see, the reference is stored as a `beacon` and an `href`. We cannot see the `Author`'s data by getting the `Article` object from Weaviate. We can see it by _querying_ the data (see section **5.5 Query data.**) or by getting the object by its UUID (or `beacon`, or `href`).

In [None]:
from weaviate.util import get_valid_uuid # extract UUID from URL (beacon or href)

# extract the author references; let's take only the first one as an example (the article might have only one)
author = client.data_object.get(article_id, with_vector=False)['properties']['hasAuthors'][0]

# get and print data object by providing the 'beacon'
author_uuid = get_valid_uuid(author['beacon']) # can be 'href' too
prettify(client.data_object.get(author_uuid, with_vector=False))
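
To see what the UUID extraction amounts to, here is a standard-library sketch. It assumes the UUID is the last path segment of the beacon or href (for example `weaviate://localhost/<uuid>`); that layout and the helper name `uuid_from_beacon` are assumptions for illustration, and this is not how `get_valid_uuid` is necessarily implemented.

```python
import uuid

def uuid_from_beacon(beacon: str) -> str:
    """Return the UUID found in the last path segment of a beacon or href."""
    candidate = beacon.rstrip("/").rsplit("/", 1)[-1]
    return str(uuid.UUID(candidate))  # raises ValueError if it is not a UUID

# Example beacon (the UUID is made up)
beacon = "weaviate://localhost/36ddd591-2dee-4e7e-a3cc-eb86d30a4303"
print(uuid_from_beacon(beacon))
```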

So Far, so Good (... So What!)

There are more methods for Data Objects (`client.data_object`): `.delete`, `.exists`, `.replace` and `.update`.

Also there are some methods for references too (`client.data_object.reference`): `.add`, `.delete` and `.update`.

### 5.4.2 Load data using batches

Importing data in batches is very similar to adding object by object.

The first thing we have to do is to create a `BatchRequest` object for each object type: `DataObject` and `Reference`. They are named accordingly: `ObjectsBatchRequest` and `ReferenceBatchRequest`.

Let's create an object of each batch request type and import the next 99 articles to Weaviate.

**NOTE:** I want to bring to your attention again that importing/creating data in batches skips some validation steps and might lead to a corrupted graph.

In [None]:
from weaviate import ObjectsBatchRequest, ReferenceBatchRequest

Let's create a function that adds a single article to the batch request.

In [None]:
def add_article(batch: ObjectsBatchRequest, article_data: dict) -> str:
    
    article_object = {
        'title': article_data['title'],
        'summary': article_data['summary'].replace('\n', '') # remove newline character
    }
    article_id = article_data['id']
    
    # add article to the object batch request
    batch.add(
        data_object=article_object,
        class_name='Article',
        uuid=article_id
    )
    
    return article_id

Let's now create a function that adds a single author to the batch request, if the author was not already created.

In [None]:
def add_author(batch: ObjectsBatchRequest, author_name: str, created_authors: dict) -> str:
    
    if author_name in created_authors:
        # return author UUID
        return created_authors[author_name]
    
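    # generate a UUID for the Author from the author's name
    # (`generate_uuid` is imported from `weaviate.tools` before this function is called)
    author_id = generate_uuid(author_name)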
    
    # add author to the object batch request
    batch.add(
        data_object={'name': author_name},
        class_name='Author',
        uuid=author_id
    )
    
    created_authors[author_name] = author_id
    return author_id

And the last function, for adding cross-references.

In [None]:
def add_references(batch: ReferenceBatchRequest, article_id: str, author_id: str)-> None:
    # add references to the reference batch request
    ## Author -> Article
    batch.add(
        from_object_uuid=author_id,
        from_object_class_name='Author',
        from_property_name='wroteArticles',
        to_object_uuid=article_id
    )
    
    ## Article -> Author 
    batch.add(
        from_object_uuid=article_id,
        from_object_class_name='Article',
        from_property_name='hasAuthors',
        to_object_uuid=author_id
    )

Now we can iterate through the data and import data using batches.

In [None]:
from weaviate.tools import generate_uuid
from tqdm.notebook import trange

objects_batch = ObjectsBatchRequest()
reference_batch = ReferenceBatchRequest()

for i in trange(1, 100):
    
    # add article to batch request
    article_id = add_article(objects_batch, data[i])
    
    for author in data[i]['authors']:
        
        # add author to batch request
        author_id = add_author(objects_batch, author, created_authors)
        
        # add cross references to the reference batch
        add_references(reference_batch, article_id=article_id, author_id=author_id)
    
    if i % 20 == 0:
        # submit the object batch request to weaviate, can be done with method '.create_objects'
        client.batch.create(objects_batch)
        
        # submit the reference batch request to weaviate, can be done with method '.create_references'
        client.batch.create(reference_batch)
        
        # batch requests are not reusable, so we create new ones
        objects_batch = ObjectsBatchRequest()
        reference_batch = ReferenceBatchRequest()


# submit any objects and references that are left
status_objects = client.batch.create(objects_batch)
status_references = client.batch.create(reference_batch)

In order to import data in batches we should create a `BatchRequest` object for the data object type we want to import. A batch request object does not have a size limit, so you can submit it whenever it holds as many objects as you want. (Keep in mind that a batch with too many objects might result in a timeout error, so keep it to a reasonable size that your Weaviate instance can process.) We also keep track of the authors we already created so that we do not create the same author over and over again.
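
Keeping batches to a reasonable size comes down to slicing the data into fixed-size chunks. A minimal, library-independent sketch (`chunked` is a hypothetical helper, and the chunk size of 20 simply mirrors the loop above):

```python
def chunked(items, size):
    """Yield consecutive slices of `items` containing at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

articles = [f"article-{i}" for i in range(45)]
sizes = [len(chunk) for chunk in chunked(articles, 20)]
print(sizes)
```

Each chunk would get its own batch request object, which also takes care of the leftover items at the end.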

The call to `client.batch.create` returns the status of each object that was created. Check it if you want to be sure that everything worked. Also **NOTE** that even if Weaviate failed to create some objects, it does not mean that the batch submission itself failed; for more information read the documentation of `client.batch.create`.
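
Checking the returned status could look roughly like the sketch below. The response layout is an assumption (a list of entries where a failed entry carries `result -> errors -> error -> message`), and the sample is hand-written, so verify the shape against what your Weaviate version actually returns. `failed_entries` is a hypothetical helper.

```python
def failed_entries(batch_response):
    """Collect (id, messages) for every entry that reports errors.

    Assumes a failed entry looks like
    {'id': ..., 'result': {'errors': {'error': [{'message': ...}]}}};
    verify this against the response of your Weaviate version.
    """
    failures = []
    for entry in batch_response:
        errors = (entry.get("result") or {}).get("errors")
        if errors:
            messages = [err.get("message", "") for err in errors.get("error", [])]
            failures.append((entry.get("id"), messages))
    return failures

# Hand-written sample response (hypothetical values)
sample = [
    {"id": "a1", "result": {}},
    {"id": "b2", "result": {"errors": {"error": [{"message": "invalid property"}]}}},
]
print(failed_entries(sample))
```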

### 5.4.3 Load data using a Batcher object.

The `Batcher` is a class that automatically submits objects to weaviate, both `DataObject`s and `Reference`s. The `Batcher` can be found in the `weaviate.tools` module, and has the following constructor prototype:
```python
Batcher(
    client : weaviate.client.Client,
    batch_size : int=512,
    verbose : bool=False,
    auto_commit_timeout : float=-1.0,
    max_backoff_time : int=300,
    max_request_retries : int=4,
    return_values_callback : Callable=None,
)
```

See the documentation for an explanation of each argument.

Let's see how it works in action for the rest of the objects from `data` we extracted.

In [None]:
from weaviate.tools import Batcher

With a `Batcher` we only need to add the objects we want to import to Weaviate. The `Batcher` has a dedicated method to add objects (`batcher.add_data_object`) and a dedicated method to add references (`batcher.add_reference`). It also provides a `batcher.add` method that takes keyword arguments and detects what kind of data you are trying to add. The `batcher.add` method makes it possible to reuse the `add_article`, `add_author` and `add_references` functions we defined above.

**NOTE:** The `Batcher.add` was introduced in `weaviate-client` version 2.3.0. 

Let's use the batcher to add the remaining articles and authors from `data`. Because the `Batcher` automatically submits objects to Weaviate, we need to ALWAYS `.close()` it after we are done, to make sure that whatever remains in the `Batcher` is submitted.

If you are like me and sometimes forget to close objects, the `Batcher` can be used as a context manager, i.e. with a `with` statement. Let's see how that works.

In [None]:
# we still need 'created_authors' so that we do not add the same author twice
with Batcher(client, 30, True) as batcher:
    for i in trange(100, 200):
        
        # add article to batcher
        article_id = add_article(batcher, data[i]) # NOTE the 'batcher' object instead of 'objects_batch'

        for author in data[i]['authors']:

            # add author to batcher
            author_id = add_author(batcher, author, created_authors) # NOTE the 'batcher' object instead of 'objects_batch'

            # add cross references to the batcher
            add_references(batcher, article_id=article_id, author_id=author_id) # NOTE the 'batcher' object instead of 'reference_batch'

That is it for a Batcher.
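
The "submit automatically when full, and once more on close" behaviour can be illustrated with a toy class. This is not the library's `Batcher` implementation, only a sketch of the idea; `ToyBatcher` and its `submit` callback are hypothetical.

```python
class ToyBatcher:
    """Collects items and hands them to `submit` whenever `batch_size` is reached."""

    def __init__(self, submit, batch_size=3):
        self._submit = submit
        self._batch_size = batch_size
        self._items = []

    def add(self, item):
        self._items.append(item)
        if len(self._items) >= self._batch_size:
            self._flush()

    def _flush(self):
        if self._items:
            self._submit(list(self._items))
            self._items.clear()

    def close(self):
        # submit whatever is left over
        self._flush()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions

submitted = []
with ToyBatcher(submitted.append, batch_size=3) as toy:
    for number in range(7):
        toy.add(number)

print(submitted)
```

The last, partially filled batch is only submitted because leaving the `with` block calls `close()`, which is the reason closing matters.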

These are the 3 ways to import data to Weaviate. Choose the one that is appropriate for you and your project.

### 5.5. Query data.

Now we have the data imported and ready to be queried. Data can be queried by using the `query` attribute of the client object (`client.query`).

The data is queried using GraphQL syntax, and this can be done in three different ways:
- **GET**: a query that gets objects from Weaviate. More information [here](https://www.semi.technology/developers/weaviate/current/graphql-references/get.html)<br>
    Use `client.query.get(class_name, properties).OTHER_OPTIONAL_FILTERS.do()`

- **AGGREGATE**: query that aggregates data. More information [here](https://www.semi.technology/developers/weaviate/current/graphql-references/aggregate.html) <br>
    Use `client.query.aggregate(class_name, properties).OTHER_OPTIONAL_FILTERS.do()`
    
- Or use a GraphQL query represented as a `str`. <br>
    Use `client.query.raw()`
    
**NOTE:** Both `.get` and `.aggregate` require the call of the `.do()` method to run the query. `.raw()` does NOT.

Let's now get the `Article` objects with only their titles.

### 5.5.1 GET

In [None]:
result = client.query.get(class_name='Article', properties="title")\
    .do()
print(f"Number of articles returned: {len(result['data']['Get']['Article'])}")
result

As we can see, the `result` contains only 100 articles. This is due to the default limit of 100. Let's change it.

In [None]:
result = client.query.get(class_name='Article', properties="title")\
    .with_limit(200)\
    .do()
print(f"Number of articles returned: {len(result['data']['Get']['Article'])}")

We can do much more by stacking multiple methods. The available methods for `.get` are:
- `.with_limit` - set another limit of returned objects.
- `.with_near_object` - get objects that are similar to the object passed to this method.
- `.with_near_text` - get objects that are similar to the text passed to this method.
- `.with_near_vector` - get objects that are similar to the vector passed to this method.
- `.with_where` - get objects that are filtered using the `Where` filter, see this [link](https://www.semi.technology/developers/weaviate/current/graphql-references/filters.html#where-filter) for examples and explanation.

Also, instead of `.do()` one can use the `.build()` method, which returns the GraphQL query as a string. This string can be passed to the `.raw()` method.

**NOTE:** Only one `.with_near_*` can be used per query.
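
To get a feel for the kind of string that ends up being sent, here is a hand-rolled sketch that assembles a simple `Get` query. It is not the client's `.build()` implementation, and the real output may be formatted differently; `build_get_query` is a hypothetical helper.

```python
def build_get_query(class_name, properties, limit=None):
    """Return a GraphQL `Get` query string for one class."""
    if isinstance(properties, str):
        properties = [properties]
    arguments = f"(limit: {int(limit)})" if limit is not None else ""
    fields = " ".join(properties)
    return "{Get{" + class_name + arguments + "{" + fields + "}}}"

query = build_get_query("Article", ["title", "summary"], limit=5)
print(query)
```

A string like this is what `client.query.raw()` accepts.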

In [None]:
client.query.get(class_name='Article', properties="title")\
    .with_limit(5)\
    .with_near_text({'concepts': ['Fashion']})\
    .do()

With `Get` we can see the cross references of each object. We are going to use the `.raw()` method for this since it is not possible with any existing `.with_*` method.

In [None]:
query = """
{
  Get {
    Article(limit: 2) {
      title

      hasAuthors {         # the reference
        ... on Author {    # you always set the destination class
          name             # the property related to target class
        }
      }
    }
  }
}
"""

prettify(client.query.raw(query)['data']['Get']['Article'])
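
The nested result can be flattened into a simple title-to-authors mapping. The sample below is hand-written and only mirrors the shape indexed in the cell above (`['data']['Get']['Article']`); treating a missing or empty reference as `None` is an assumption, and `titles_with_authors` is a hypothetical helper.

```python
def titles_with_authors(response):
    """Map every article title to the list of its author names."""
    articles = response["data"]["Get"]["Article"]
    return {
        article["title"]: [author["name"] for author in (article.get("hasAuthors") or [])]
        for article in articles
    }

# Hand-written sample shaped like the query result above (made-up values)
sample_response = {"data": {"Get": {"Article": [
    {"title": "First story", "hasAuthors": [{"name": "Jane Doe"}, {"name": "John Roe"}]},
    {"title": "Second story", "hasAuthors": None},
]}}}

print(titles_with_authors(sample_response))
```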

### 5.5.2 AGGREGATE

We can use `.aggregate` to count the number of objects that satisfy a specific condition.

In [None]:
# no filter, count all objects of class Article
client.query.aggregate(class_name='Article')\
    .with_meta_count()\
    .do()

In [None]:
# no filter, count all objects of class Author
client.query.aggregate(class_name='Author')\
    .with_meta_count()\
    .do()
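
The count itself sits a few levels deep in the returned dictionary. The sketch below assumes the nesting `data -> Aggregate -> <class> -> [0] -> meta -> count` and uses a hand-written sample, so check it against the output of the cells above; `meta_count` is a hypothetical helper.

```python
def meta_count(response, class_name):
    """Read the object count out of an Aggregate response."""
    return response["data"]["Aggregate"][class_name][0]["meta"]["count"]

# Hand-written sample response (made-up count); the nesting is an assumption
sample_response = {"data": {"Aggregate": {"Article": [{"meta": {"count": 200}}]}}}
print(meta_count(sample_response, "Article"))
```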


Here are the methods that are supported by the `.aggregate`.

- `.with_meta_count` - sets meta count to True. Used to count objects per filtered group.
- `.with_fields` - fields to return by the aggregated query.
- `.with_group_by_filter` - set a `GroupBy` filter. See this [link](https://www.semi.technology/developers/weaviate/current/graphql-references/aggregate.html#groupby-filter) for more information about the filter.
- `.with_where` - aggregate objects using a `Where` filter. See this [link](https://www.semi.technology/developers/weaviate/current/graphql-references/filters.html#where-filter) for examples and explanation.


Of course when it comes to querying data, the possibilities are endless. Have fun experimenting with these capabilities.

Feel free to check out and contribute to weaviate-client on [GitHub](https://github.com/semi-technologies/weaviate-python-client).
