# Weaviate Transformation Agent - Workshop

### Prerequisites

1. Log in to your [Weaviate Cloud](https://console.weaviate.cloud) account (sign up if you don't have one yet)
1. Create a Weaviate Cloud [Sandbox](https://weaviate.io/developers/wcs/manage-clusters/create#sandbox-clusters) instance
1. Go to the 'Embedding' tab (on the left column) and enable `Weaviate Embeddings`
1. Take note of the `REST Endpoint` and an `Admin` `API Key`.
1. In the `.env` file in the root directory of this repository, set `WEAVIATE_CLOUD_URL` to the `REST Endpoint` and `WEAVIATE_CLOUD_API_KEY` to the `Admin` `API Key`.
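
Before running the notebook, it can help to confirm that both variables are actually set. The following is a minimal sketch: the helper name `missing_env_vars` is hypothetical, the URL in the demo is a placeholder, and the variable names are the ones read by the connection code further down in this notebook.

```python
import os

def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

# Names read by the connection cell later in this notebook
REQUIRED = ["WEAVIATE_CLOUD_URL", "WEAVIATE_CLOUD_API_KEY"]

# Demo with an explicit mapping (placeholder values) so it does not depend on your shell
example_env = {
    "WEAVIATE_CLOUD_URL": "https://example.weaviate.cloud",
    "WEAVIATE_CLOUD_API_KEY": "",
}
print(missing_env_vars(REQUIRED, example_env))
```

Calling `missing_env_vars(REQUIRED)` without a mapping checks the real environment, for example after `dotenv.load_dotenv()`.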

## Introduction

### Agenda

Let's talk about:
- What the Transformation Agent is
- What you can do with the Transformation Agent
- Some tips & tricks
- How to get started

### About the Transformation Agent

The *Weaviate Transformation Agent* is:

- A cloud-based service
- For transforming your data in a Weaviate instance
- Available to Weaviate Cloud users

It is also in **technical preview** (do **not** use it in production).

<center><img src="img/agents_tech_preview.png" width="60%"></center>

> ⚠️ The Weaviate Transformation Agent modifies data objects in Weaviate. **While the Agent is in technical preview, do not use it in a production environment.** 
> 
> The Agent may not work as expected, and the data in your Weaviate instance may be affected in unexpected ways.

**What the Transformation Agent is**

<center><img src="img/ta_obj.png" width="60%"></center>

The `TransformationAgent` can modify objects in a Weaviate collection to add new properties or update existing properties.

**What you can do with the Transformation Agent**

<center><img src="img/ta_overview.png" width="60%"></center>

You provide instructions to the `TransformationAgent` in natural language, along with the other required parameters (such as the property to create or update, and which existing properties to read).

## Preparation

Here, we are going to use the `transformation-agent-papers` subset of the [**weaviate/agents**](https://huggingface.co/datasets/weaviate/agents) dataset on Hugging Face.

It contains the titles and abstracts of 2,000 research papers.

First, we load the dataset & add it to Weaviate.

### Load dataset

In [1]:
from datasets import load_dataset

papers_dataset = load_dataset("weaviate/agents", "transformation-agent-papers", split="train")

In [2]:
print(papers_dataset.shape)
print(papers_dataset[0]["properties"].keys())

(2000, 2)
dict_keys(['abstract', 'title'])


In [3]:
for k, v in papers_dataset[0]["properties"].items():
    if len(v) > 100:
        v = v[:100] + "..."
    print(f"{k}: {v}")

abstract:   Astronomy is increasingly encountering two fundamental truths: (1) The field
is faced with the tas...
title: Discussion on "Techniques for Massive-Data Machine Learning in
  Astronomy" by A. Gray


Iterate through the data

In [4]:
columns = papers_dataset[0]["properties"].keys()

for i, item in enumerate(papers_dataset):
    if i < 2:
        properties = {
            col: item["properties"][col] for col in columns
        }
        print(properties)

{'abstract': "  Astronomy is increasingly encountering two fundamental truths: (1) The field\nis faced with the task of extracting useful information from extremely large,\ncomplex, and high dimensional datasets; (2) The techniques of astroinformatics\nand astrostatistics are the only way to make this tractable, and bring the\nrequired level of sophistication to the analysis. Thus, an approach which\nprovides these tools in a way that scales to these datasets is not just\ndesirable, it is vital. The expertise required spans not just astronomy, but\nalso computer science, statistics, and informatics. As a computer scientist and\nexpert in machine learning, Alex's contribution of expertise and a large number\nof fast algorithms designed to scale to large datasets, is extremely welcome.\nWe focus in this discussion on the questions raised by the practical\napplication of these algorithms to real astronomical datasets. That is, what is\nneeded to maximally leverage their potential to impro

### Ingest data into Weaviate

#### Connect to Weaviate

In [5]:
import os
import dotenv

dotenv.load_dotenv()

# Update the variables in the .env file with your own values
weaviate_url = os.getenv("WEAVIATE_CLOUD_URL")
weaviate_api_key = os.getenv("WEAVIATE_CLOUD_API_KEY")

In [6]:
weaviate_url

'https://1ree7zierqqrwwnif6b6ug.c0.europe-west3.gcp.weaviate.cloud'

In [7]:
import weaviate
from weaviate.classes.init import Auth

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=weaviate_url, auth_credentials=Auth.api_key(weaviate_api_key)
)

assert client.is_ready()



#### Set up a collection

**Important:** Make sure to enable `Weaviate Embeddings` in the Weaviate Cloud console.

[See above](#prerequisites)

In [8]:
from weaviate.classes.config import Configure, Property, DataType

collection_name = "ArxivPapersDemo"

# Delete the collection to (re)start fresh (comment out the next line to keep an existing collection)
client.collections.delete(collection_name)

if client.collections.exists(collection_name):
    # For re-running this tutorial, do nothing
    pass
else:
    client.collections.create(
        collection_name,
        description="A dataset that lists research paper titles and abstracts",
        properties=[
            Property(name="title", data_type=DataType.TEXT),
            Property(name="abstract", data_type=DataType.TEXT),
        ],
        vectorizer_config=[
            Configure.NamedVectors.text2vec_weaviate(
                name="default",
                source_properties=["title", "abstract"],
            )
        ]
    )

#### Add data to Weaviate

We loop through the data and add it to Weaviate. 

For the demo/workshop, we add only the first 50 rows for speed and simplicity.

In [9]:
papers_collection = client.collections.get(collection_name)
columns = papers_dataset[0]["properties"].keys()

with papers_collection.batch.fixed_size(100) as batch:
    for i, item in enumerate(papers_dataset):
        if i < 50:
            properties = {col: item["properties"][col] for col in columns}
            batch.add_object(properties=properties)


if papers_collection.batch.failed_objects:
    for fo in papers_collection.batch.failed_objects[:3]:
        print(fo.message)
        print(fo.object_)

In [10]:
len(papers_collection)

50

#### Inspect the collection 



In [12]:
response = papers_collection.query.fetch_objects(
    limit=3,
    include_vector=True
)

for o in response.objects:
    for k, v in o.properties.items():
        print(f"{k}: {v[:50]}")
    print()
    print(o.vector["default"][:10])  # No need to print the entire vector

abstract:   The problem of topic modeling can be seen as a g
title: A Spectral Algorithm for Latent Dirichlet Allocati

[-0.03759765625, -0.0188140869140625, 0.056060791015625, -0.041412353515625, -0.05413818359375, 0.004425048828125, 0.030029296875, 0.0279388427734375, 0.0162200927734375, 0.0528564453125]
abstract:   We propose in this paper an exploratory analysis
title: Exploratory Analysis of Functional Data via Cluste

[-0.036407470703125, 0.0197601318359375, -0.00771331787109375, 0.03326416015625, 0.01172637939453125, 0.0300140380859375, 0.0555419921875, -0.046905517578125, 0.00968170166015625, 0.04949951171875]
abstract:   When dealing with time series with complex non-s
title: Adapting to Non-stationarity with Growing Expert E

[0.006687164306640625, -0.0322265625, 0.036865234375, 0.0158538818359375, -0.0192108154296875, -0.028961181640625, 0.0218963623046875, 0.05517578125, -0.0018281936645507812, -0.016021728515625]


**Alternative: Use the `Explorer` cloud tool**

On Weaviate Cloud Console, click on the `Explorer` tab on the left column.

When you click on each object, you should see 2 properties:
- `title`
- `abstract`

As well as its `vectors`

## Using the original dataset


### Can you find what you need?

Can you find papers about a specific topic (e.g. machine learning)?

In [13]:
response = papers_collection.query.near_text(
    query="machine learning",
    limit=5
)

for o in response.objects:
    print(o.properties["title"])

Probabilistic Approach to Neural Networks Computation Based on Quantum
  Probability Model Probabilistic Principal Subspace Analysis Example
Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based
  Search
Discussion on "Techniques for Massive-Data Machine Learning in
  Astronomy" by A. Gray
Transfer Learning Using Feature Selection
Bayesian Active Learning for Classification and Preference Learning


Can you filter only for papers with a particular main topic? (e.g. classification)

In [14]:
## ???
## Not sure if we actually can do this with the current data
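
With only `title` and `abstract` available, the closest approximation is a keyword match on the abstract text. The sketch below uses made-up abstracts (not taken from the dataset) and a hypothetical helper `mentions` to illustrate why that is not the same as filtering by main topic: an incidental mention matches too.

```python
def mentions(text, keyword):
    """Case-insensitive substring match, similar in spirit to a wildcard text filter."""
    return keyword.lower() in text.lower()

# Made-up abstracts for illustration only
papers = [
    {"title": "Paper A", "abstract": "We propose a new classification algorithm for images."},
    {"title": "Paper B", "abstract": "We study clustering; classification is left to future work."},
    {"title": "Paper C", "abstract": "A survey of topic models."},
]

# Paper B matches as well, although classification is not its main topic
matches = [p["title"] for p in papers if mentions(p["abstract"], "classification")]
print(matches)
```

A dedicated property that records the main topic would make this kind of filter precise, which is what the Transformation Agent is used for in the tasks below.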

### Does your data meet your needs?

What if: 
- The data is in the wrong language?
- Each abstract is too long?

Would you want to perform a RAG query each time?




## Try the Weaviate Transformation Agent 

### Task 1: Create a `topics` property

Define the operation(s) that you want to perform on the data.

In [15]:
prompt_create_topics = """
Create a list of topic tags based on the abstract.
Topics should be distinct from each other. Provide a maximum of 5 topics.
Group similar topics under one topic tag.
"""

In [16]:
from weaviate.agents.classes import Operations

add_topics = Operations.append_property(
    property_name="topics",             # Property to create
    data_type=DataType.TEXT_ARRAY,      # Data type of the property
    view_properties=["abstract"],       # Existing properties to view for the operation
    instruction=prompt_create_topics,   # Instruction to the Transformation Agent
)

Instantiate the agent & start the operations

In [17]:
from weaviate.agents.transformation import TransformationAgent

ta = TransformationAgent(
    client=client,              # Weaviate client object
    collection=collection_name, # Collection name
    operations=[add_topics]     # List of transform operations
)

ta_response = ta.update_all()

What does the response look like?

In [18]:
ta_response

TransformationResponse(workflow_id='TransformationWorkflow-a9269a6a53231918db19c7450112bc8a')

The response contains the unique `workflow_id` of the operations. 

This does not mean that the operations are finished!

**The Transformation Agent is asynchronous**. You can check the status of the operation using the `workflow_id`.

In [19]:
ta.get_status(workflow_id=ta_response.workflow_id)

{'workflow_id': 'TransformationWorkflow-a9269a6a53231918db19c7450112bc8a',
 'status': {'batch_count': 1,
  'end_time': None,
  'start_time': '2025-03-25 14:59:53',
  'state': 'running',
  'total_duration': None,
  'total_items': 50}}

We can periodically check if the operation is done

In [20]:
def get_ta_status(agent_instance, workflow_id):
    # Rough code to check the status of the TA workflow
    import time
    from datetime import datetime

    while True:
        status = agent_instance.get_status(workflow_id=workflow_id)

        if status["status"]["state"] != "running":
            break

        # Calculate elapsed time from start_time
        start = datetime.strptime(status["status"]["start_time"], "%Y-%m-%d %H:%M:%S")
        elapsed = (datetime.now() - start).total_seconds()

        print(f"Waiting... Elapsed time: {elapsed:.2f} seconds")
        time.sleep(10)

    # Calculate total time
    if status["status"]["total_duration"]:
        total = status["status"]["total_duration"]
    else:
        start = datetime.strptime(status["status"]["start_time"], "%Y-%m-%d %H:%M:%S")
        end = datetime.now() if not status["status"]["end_time"] else datetime.strptime(status["status"]["end_time"], "%Y-%m-%d %H:%M:%S")
        total = (end - start).total_seconds()

    print(f"Total time: {total:.2f} seconds")
    print(status)

In [21]:
get_ta_status(agent_instance=ta, workflow_id=ta_response.workflow_id)

Waiting... Elapsed time: 7.41 seconds
Waiting... Elapsed time: 17.88 seconds
Waiting... Elapsed time: 28.55 seconds
Waiting... Elapsed time: 39.19 seconds
Waiting... Elapsed time: 49.84 seconds
Waiting... Elapsed time: 60.31 seconds
Total time: 67.83 seconds
{'workflow_id': 'TransformationWorkflow-a9269a6a53231918db19c7450112bc8a', 'status': {'batch_count': 1, 'end_time': '2025-03-25 15:01:01', 'start_time': '2025-03-25 14:59:53', 'state': 'completed', 'total_duration': 67.825215, 'total_items': 50}}
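
The loop above runs until the state changes, however long that takes. As a hedged variation, the sketch below adds a timeout. It assumes only the status shape shown above (`status["status"]["state"]`); the function name `wait_for_workflow` and its parameters are hypothetical, and the demo uses a fake status source rather than a real agent.

```python
import time

def wait_for_workflow(get_status, timeout_s=600.0, interval_s=10.0):
    """Poll get_status() until the state is no longer 'running' or the timeout expires.

    Returns the last status dict; raises TimeoutError if the workflow is still running.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status["status"]["state"] != "running":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Workflow still running after {timeout_s} seconds")
        time.sleep(interval_s)

# Demo with a fake status source that finishes on the third poll
fake_states = iter(["running", "running", "completed"])

def fake_get_status():
    return {"status": {"state": next(fake_states)}}

result = wait_for_workflow(fake_get_status, interval_s=0.0)
print(result["status"]["state"])
```

In this notebook, `get_status` could be `lambda: ta.get_status(workflow_id=ta_response.workflow_id)`.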


**How the Transformation Agent works**

<center><img src="img/ta_schematic.png" width="60%"></center>

The `TransformationAgent` connects to your Weaviate Cloud instance and uses LLMs to follow your instructions.
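
As a mental model only (this is not the service's actual implementation), an append operation can be pictured as a loop over objects: read the `view_properties`, pass them to an LLM together with the instruction, and write the result to the new property. In the sketch below, `fake_llm` and `apply_append_operation` are hypothetical stand-ins, and the objects are made up.

```python
def fake_llm(instruction, context):
    # Stand-in for a real LLM call: ignores the instruction and returns a word count
    return len(context["abstract"].split())

def apply_append_operation(objects, property_name, view_properties, instruction, llm):
    """Add property_name to each object, computed from the viewed properties only."""
    for obj in objects:
        # The model only sees the properties listed in view_properties
        context = {name: obj[name] for name in view_properties}
        obj[property_name] = llm(instruction, context)
    return objects

# Made-up objects for illustration only
objects = [
    {"title": "Paper A", "abstract": "We propose a new method."},
    {"title": "Paper B", "abstract": "A short survey."},
]

apply_append_operation(
    objects,
    property_name="abstract_word_count",
    view_properties=["abstract"],
    instruction="Count the words in the abstract.",
    llm=fake_llm,
)
print(objects[0])
```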

When the operation is complete, let's see what we can do with the data:

In [22]:
from weaviate.classes.query import Metrics

response = papers_collection.aggregate.over_all(
    return_metrics=Metrics("topics").text(
        top_occurrences_count=True,
        top_occurrences_value=True,
        min_occurrences=10
    )
)

for t in response.properties["topics"].top_occurrences:
    print(t)

TopOccurrence(count=37, value='Machine Learning')
TopOccurrence(count=7, value='Algorithms')
TopOccurrence(count=6, value='Artificial Intelligence')
TopOccurrence(count=6, value='Classification')
TopOccurrence(count=6, value='Data Analysis')
TopOccurrence(count=6, value='Optimization')
TopOccurrence(count=5, value='Mathematics')
TopOccurrence(count=4, value='Computer Science')
TopOccurrence(count=4, value='Graph Theory')
TopOccurrence(count=4, value='Reinforcement Learning')
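
For intuition, the top-occurrences metric on a text-array property behaves like counting each tag value across all objects. The sketch below is plain Python, not the Weaviate aggregation itself, and uses three example `topics` lists as input.

```python
from collections import Counter

# Example topic lists, one per object
topics_per_object = [
    ["Topic Modeling", "Clustering", "Latent Dirichlet Allocation (LDA)", "Mixture Models", "Singular Value Decomposition (SVD)"],
    ["Data Analysis", "Machine Learning", "Clustering", "Optimization", "Data Science"],
    ["Machine Learning", "Time Series", "Non-Stationarity", "Online Learning", "Algorithm Design"],
]

# Flatten all lists and count how often each tag value appears
counts = Counter(tag for topics in topics_per_object for tag in topics)

for value, count in counts.most_common(2):
    print(count, value)
```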


Try to filter for papers with particular topics:

In [23]:
from weaviate.classes.query import Filter

response = papers_collection.query.fetch_objects(
    limit=3,
    filters=Filter.by_property("topics").like("*machine*")
)

for o in response.objects:
    print(o.properties["title"])

Statistical Translation, Heat Kernels and Expected Distances
Adapting to Non-stationarity with Growing Expert Ensembles
Discussion on "Techniques for Massive-Data Machine Learning in
  Astronomy" by A. Gray


Inspect an object again:

In [24]:
response = papers_collection.query.fetch_objects(
    limit=3,
)

for o in response.objects:
    for k, v in o.properties.items():
        print(f"{k}: {v[:50]}")
    print()

abstract:   The problem of topic modeling can be seen as a g
title: A Spectral Algorithm for Latent Dirichlet Allocati
topics: ['Topic Modeling', 'Clustering', 'Latent Dirichlet Allocation (LDA)', 'Mixture Models', 'Singular Value Decomposition (SVD)']

abstract:   We propose in this paper an exploratory analysis
title: Exploratory Analysis of Functional Data via Cluste
topics: ['Data Analysis', 'Machine Learning', 'Clustering', 'Optimization', 'Data Science']

abstract:   When dealing with time series with complex non-s
topics: ['Machine Learning', 'Time Series', 'Non-Stationarity', 'Online Learning', 'Algorithm Design']
title: Adapting to Non-stationarity with Growing Expert E



### Task 2: Perform multiple operations

- Add a `paper_type` property (e.g. `survey`, `model`, `resource`)
- Add a boolean property `about_classification` (True/False)
- Update the existing `title` property to append a French translation

In [25]:
prompt_paper_type = """
Determine the primary type of paper based on the abstract. Assign exactly one of the following categories that best represents the paper's main contribution:

'survey':   Comprehensive review or meta-analysis of existing work in a field
'model':    Introduction of a new predictive model, statistical method, or algorithmic approach
'system':   Description of a new data pipeline, workflow, framework, or system architecture
'analysis': Focused on insights derived from analyzing data
'resource': Introduction of a new dataset, benchmark, or tool for data science
'other':    None of the above
"""

add_paper_type = Operations.append_property(
    property_name="paper_type",
    data_type=DataType.TEXT,
    view_properties=["abstract"],
    instruction=prompt_paper_type,
)

In [26]:
prompt_about_classification = """
Based on the abstract, determine whether the paper is
primarily about the machine learning field of classification.

Do not include papers that are only obliquely or vaguely about classification.
"""

add_about_classification_bool = Operations.append_property(
    property_name="about_classification",
    data_type=DataType.BOOL,
    view_properties=["abstract"],
    instruction=prompt_about_classification,
)

In [27]:
prompt_add_french_title_suffix = """
Update the title to ensure that it contains the French translation of itself in parentheses, after the original title.
"""

update_title = Operations.update_property(
    property_name="title",
    view_properties=["title"],
    instruction=prompt_add_french_title_suffix,
)

In [28]:
from weaviate.agents.transformation import TransformationAgent

ta = TransformationAgent(
    client=client,
    collection=collection_name,
    operations=[
        update_title,
        add_paper_type,
        add_about_classification_bool
    ],
)

ta_response = ta.update_all()

Note that this still returns one object, with one workflow ID, even though we are performing multiple operations.

In [29]:
ta.get_status(workflow_id=ta_response.workflow_id)

{'workflow_id': 'TransformationWorkflow-dc1108934d9d07748146df5d3c05dad3',
 'status': {'batch_count': 1,
  'end_time': None,
  'start_time': '2025-03-25 15:04:05',
  'state': 'running',
  'total_duration': None,
  'total_items': 50}}

Let's monitor the operation as before:

In [30]:
get_ta_status(agent_instance=ta, workflow_id=ta_response.workflow_id)

Waiting... Elapsed time: 72.02 seconds
Total time: 76.94 seconds
{'workflow_id': 'TransformationWorkflow-dc1108934d9d07748146df5d3c05dad3', 'status': {'batch_count': 1, 'end_time': '2025-03-25 15:05:22', 'start_time': '2025-03-25 15:04:05', 'state': 'completed', 'total_duration': 76.937397, 'total_items': 50}}


And again, inspect a few transformed objects:

In [31]:
response = papers_collection.query.fetch_objects(
    limit=3,
)

for o in response.objects:
    for k, v in o.properties.items():
        # Truncate long strings for display; leave lists and booleans as-is
        if isinstance(v, str) and len(v) > 50:
            v = v[:50] + "..."
        print(f"{k}: {v}")
    print()

abstract:   The problem of topic modeling can be seen as a g...
title: A Spectral Algorithm for Latent Dirichlet Allocati...
topics: ['Topic Modeling', 'Clustering', 'Latent Dirichlet Allocation (LDA)', 'Mixture Models', 'Singular Value Decomposition (SVD)']
paper_type: model
about_classification: False

abstract:   We propose in this paper an exploratory analysis...
title: Exploratory Analysis of Functional Data via Cluste...
topics: ['Data Analysis', 'Machine Learning', 'Clustering', 'Optimization', 'Data Science']
paper_type: analysis
about_classification: False

abstract:   When dealing with time series with complex non-s...
paper_type: model
title: Adapting to Non-stationarity with Growing Expert E...
topics: ['Machine Learning', 'Time Series', 'Non-Stationarity', 'Online Learning', 'Algorithm Design']
about_classification: False



We see it did, in fact, perform all the specified transformation operations.

We can now use these improved properties to perform new queries. 

- e.g. what paper types do we have?

In [32]:
from weaviate.classes.query import Metrics

response = papers_collection.aggregate.over_all(
    return_metrics=Metrics("paper_type").text(
        top_occurrences_count=True,
        top_occurrences_value=True,
        min_occurrences=10
    )
)

for t in response.properties["paper_type"].top_occurrences:
    print(t)

TopOccurrence(count=28, value='model')
TopOccurrence(count=17, value='analysis')
TopOccurrence(count=2, value='other')
TopOccurrence(count=1, value='survey')


How many objects are about classification?

In [33]:
from weaviate.classes.query import Filter

response = papers_collection.aggregate.over_all(
    filters=Filter.by_property("about_classification").equal(True),
)

response.total_count

11

In [34]:
from weaviate.classes.query import Filter

response = papers_collection.query.fetch_objects(
    filters=Filter.by_property("about_classification").equal(True),
    limit=10
)

for o in response.objects:
    print(o.properties["title"])

Bayesian Active Distance Metric Learning (Apprentissage de la distance métrique active bayésien)
Fast Inference in Sparse Coding Algorithms with Applications to Object Recognition (Inferences rapides dans les algorithmes de codage parcimonieux avec applications à la reconnaissance dobjet)
Statistical Translation, Heat Kernels and Expected Distances (Statistiques de traduction, noyaux de chaleur et distances attendues)
Bayesian Active Learning for Classification and Preference Learning (Apprentissage Actif Bayésien pour la Classification et l''Apprentissage des Préférences)
Optimizing F-measure: A Tale of Two Approaches (Optimiser la mesure F : Un roman de deux approches)
Transfer Learning Using Feature Selection (Apprentissage transfert à l'aide de la sélection de caractéristiques)
Mutual information for the selection of relevant variables in (Information mutuelle pour la sélection de variables pertinentes dans) spectrometric nonlinear modelling
Using Genetic Algorithms for Texts Class

What about intersections of multiple properties?

In [35]:
from weaviate.classes.query import Filter

response = papers_collection.aggregate.over_all(
    filters=(
        Filter.by_property("paper_type").equal("model") &
        Filter.by_property("about_classification").equal(True)
    )
)

response.total_count

4

Let's take a look at a few:

In [36]:
from weaviate.classes.query import Filter

response = papers_collection.query.near_text(
    query="vector",
    filters=(
        Filter.by_property("paper_type").equal("model") &
        Filter.by_property("about_classification").equal(True)
    )
)

for o in response.objects:
    print(o.properties["title"])

Using a Kernel Adatron for Object Classification with RCS Data (Utilisation d'une Adatron de noyau pour la classification d'objets avec des données RCS)
Fast Inference in Sparse Coding Algorithms with Applications to Object Recognition (Inferences rapides dans les algorithmes de codage parcimonieux avec applications à la reconnaissance dobjet)
Bayesian Active Distance Metric Learning (Apprentissage de la distance métrique active bayésien)
Bayesian Active Learning for Classification and Preference Learning (Apprentissage Actif Bayésien pour la Classification et l''Apprentissage des Préférences)


## Bonus: Use the Query Agent

The Weaviate [Query Agent](https://weaviate.io/developers/agents/query) is another agentic service on Weaviate Cloud. The Query Agent allows you to query your Weaviate instance using natural language.

In [37]:
from weaviate.agents.query import QueryAgent

qa = QueryAgent(
    client=client, collections=[collection_name]
)

Now, we can just tell the Query Agent to do the hard & boring stuff (syntax lookup!) for us.

In [38]:
# Perform a query
response = qa.run(
    """
    Find papers that are about classification. Tell me about some of them.
    Hint: There is a property called 'about_classification' that you can use.
    """,
)

# Print the response
response.display()





In [39]:
# Perform a query
response = qa.run(
    """
    How many papers are primarily about models?

    Hint: There is a property called 'paper_type' where the available values are: 'survey', 'model', 'system', 'analysis', 'resource', 'other'.
    """
)

# Print the response
response.display()





We can even ask it follow-up queries:

In [40]:
followup_response = qa.run(
    query="Can you select one or two of these papers and explain them in simple terms? I am not a data scientist.", context=response
)

followup_response.display()





Read more about the [Query Agent](https://weaviate.io/blog/query-agent) on our blog.

## Bonus: Current limitations

Remember that the Transformation Agent is being asked to update data objects for us. So, be very careful with the instructions you provide.

And currently, it is in technical preview. Do not use it in a production environment (*yet* 😉).

- Do not run multiple agents at the same time - this can cause conflicts (race conditions).
- There is a limit of 10,000 operations per day per Weaviate Cloud organization.

In [41]:
from weaviate.classes.config import Configure, Property, DataType

collection_name = "ArxivPapersDemo"

# Delete the collection to start fresh for this test
client.collections.delete(collection_name)

client.collections.create(
    collection_name,
    description="A dataset that lists research paper titles and abstracts",
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="abstract", data_type=DataType.TEXT),
    ],
    vectorizer_config=[
        Configure.NamedVectors.text2vec_weaviate(
            name="default",
            source_properties=["title", "abstract"],
        )
    ]
)

papers_collection = client.collections.get(collection_name)
columns = papers_dataset[0]["properties"].keys()

with papers_collection.batch.fixed_size(100) as batch:
    for i, item in enumerate(papers_dataset):
        if i < 5:
            properties = {col: item["properties"][col] for col in columns}
            batch.add_object(properties=properties)


if papers_collection.batch.failed_objects:
    for fo in papers_collection.batch.failed_objects[:3]:
        print(fo.message)
        print(fo.object_)

len(papers_collection)

5

In [42]:
from weaviate.agents.transformation import TransformationAgent

responses = []
new_languages = ["spanish", "german", "italian"]

for lang in new_languages:

    prompt_task = f"""
    Create a {lang} version of the abstract
    """

    task = Operations.append_property(
        property_name=f"test_{lang}_abstract",
        data_type=DataType.TEXT,
        view_properties=["abstract"],
        instruction=prompt_task,
    )

    ta = TransformationAgent(
        client=client,
        collection=collection_name,
        operations=[task],
    )

    ta_response = ta.update_all()
    responses.append(ta_response)

print(responses)

[TransformationResponse(workflow_id='TransformationWorkflow-ca697ed94ac4bc7e0291f59ea255c506'), TransformationResponse(workflow_id='TransformationWorkflow-13a30b14320af54f2e01c0e4b231d1ab'), TransformationResponse(workflow_id='TransformationWorkflow-8a4147bfa2644661491db6f94aecce32')]


In [43]:
for r in responses:
    get_ta_status(agent_instance=ta, workflow_id=r.workflow_id)

Waiting... Elapsed time: 4.16 seconds
Waiting... Elapsed time: 14.79 seconds
Waiting... Elapsed time: 25.40 seconds
Waiting... Elapsed time: 36.00 seconds
Waiting... Elapsed time: 46.49 seconds
Total time: 49.02 seconds
{'workflow_id': 'TransformationWorkflow-ca697ed94ac4bc7e0291f59ea255c506', 'status': {'batch_count': 1, 'end_time': '2025-03-25 15:10:44', 'start_time': '2025-03-25 15:09:55', 'state': 'completed', 'total_duration': 49.018744, 'total_items': 5}}
Total time: 52.80 seconds
{'workflow_id': 'TransformationWorkflow-13a30b14320af54f2e01c0e4b231d1ab', 'status': {'batch_count': 1, 'end_time': '2025-03-25 15:10:49', 'start_time': '2025-03-25 15:09:57', 'state': 'completed', 'total_duration': 52.80279, 'total_items': 5}}
Waiting... Elapsed time: 61.78 seconds
Total time: 62.41 seconds
{'workflow_id': 'TransformationWorkflow-8a4147bfa2644661491db6f94aecce32', 'status': {'batch_count': 1, 'end_time': '2025-03-25 15:11:00', 'start_time': '2025-03-25 15:09:58', 'state': 'completed', 

If these operations worked perfectly, all objects should have all new properties (`test_spanish_abstract`, `test_german_abstract`, `test_italian_abstract`). 

In [None]:
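response = papers_collection.query.fetch_objects(
    limit=50
)

# Collect every property name that appears on at least one object
properties = []
for o in response.objects:
    for p in o.properties:
        if p not in properties:
            print(f"Property {p} found in object UUID: {o.uuid}, adding to list")
            properties.append(p)

# Report objects where any of those properties is missing or empty
for o in response.objects:
    for p in properties:
        if o.properties.get(p) in (None, ""):
            print(f"Property {p} is empty in object UUID: {o.uuid}")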

Property abstract found in object UUID: 3122f2e1-2f2d-4e00-96e9-aac8fcda8beb, adding to list
Property title found in object UUID: 3122f2e1-2f2d-4e00-96e9-aac8fcda8beb, adding to list
Property test_german_abstract found in object UUID: 3122f2e1-2f2d-4e00-96e9-aac8fcda8beb, adding to list
Property test_spanish_abstract found in object UUID: 3122f2e1-2f2d-4e00-96e9-aac8fcda8beb, adding to list
Property test_italian_abstract found in object UUID: 3122f2e1-2f2d-4e00-96e9-aac8fcda8beb, adding to list
Property test_german_abstract is empty in object UUID: 3122f2e1-2f2d-4e00-96e9-aac8fcda8beb
Property test_german_abstract is empty in object UUID: a4904568-9a07-4fc0-a414-ff460326af50
Property test_german_abstract is empty in object UUID: e611c73a-a90f-4fdb-bd46-7db235e73528


But since we have very few objects, multiple agents worked on the same objects at the same time, and some of their updates were lost.

This shouldn't happen much in a real-world scenario, but it's something to keep in mind.
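
One way to avoid this is to pass all operations to a single agent, as Task 2 did. Another is to start each workflow only after the previous one has finished. The sketch below illustrates the second option with fakes that record the order of events; `run_sequentially`, `make_start`, and `fake_wait` are hypothetical names, not part of the Weaviate client.

```python
def run_sequentially(start_fns, wait_for):
    """Start each workflow only after the previous one has finished."""
    finished = []
    for start in start_fns:
        workflow_id = start()                   # kick off one workflow
        finished.append(wait_for(workflow_id))  # block until it is done
    return finished

# Demo with fakes that record the order of events
events = []

def make_start(name):
    def start():
        events.append(f"start {name}")
        return name
    return start

def fake_wait(workflow_id):
    events.append(f"done {workflow_id}")
    return workflow_id

run_sequentially([make_start(lang) for lang in ["spanish", "german", "italian"]], fake_wait)
print(events)
```

In this notebook, each `start` function could build its own agent and return `ta.update_all().workflow_id`, and `wait_for` could wrap the `get_ta_status` helper defined earlier.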

## Further resources

- Blog: ["Introducing the Weaviate Transformation Agent"](https://weaviate.io/blog/transformation-agent)
- Documentation: [Weaviate Transformation Agent](https://weaviate.io/developers/agents/transformation)