# CrewAI on VertexAI Reasoning Engine

|||
|----------|-------------|
| Author(s)   | Christos Aniftos |
| Reviewer(s) | |
| Last updated | 2024 08 28  |
| |   |

This demo adapts the default CrewAI project skeleton template so that it can use a Gemini model.
The demo was built with crewai version 0.51.0, so some of the changes described here might be outdated in future versions.
Because this repository already includes the template, **you will not need to create a new project**.

However, if you want to know more about starting a new CrewAI project from the template, see [Starting Your CrewAI Project](https://docs.crewai.com/getting-started/Start-a-New-CrewAI-Project-Template-Method/).

### Installing dependencies

CrewAI uses Poetry to manage dependencies. Let's install Poetry so we can use it for our CrewAI project as well.

In [None]:
!pip install poetry

Now let's install the project dependencies with Poetry.

In this repo we already added langchain-google-vertexai using `poetry add langchain-google-vertexai`. \
This allows the use of Vertex AI models through the LangChain model API.

Below we explain how we changed the LLM used by the crew agents to gemini-flash.

In [None]:
!poetry lock
!poetry install

Now let's look at some of the alterations we made to the original template so that the agents use Gemini as their LLM.
First, we need to create a function in `./src/crewai_gcp/crew.py` that defines the LLM. The function needs the `@llm` annotation.

Here you can see the LLM function we added in `./src/crewai_gcp/crew.py`, which returns a Vertex AI model object from LangChain.

In [None]:
!grep -n -A 3  '@llm' ./src/crewai_gcp/crew.py 

Additionally, for every agent in the configuration file `./src/crewai_gcp/config/agents.yaml` we set `llm` to `gemini_llm` (the same name as the function we added in crew.py).

By defining `llm`, CrewAI knows which LLM to use for each agent. If desired, you can use different LLM functions for different agents. For example, an agent performing more complicated tasks might benefit from gemini-pro, whereas agents with simpler operations might perform well with the lightweight gemini-flash.

In [None]:
!grep -n -C 2 'llm:' ./src/crewai_gcp/config/agents.yaml
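The name matching described above can be illustrated with a small, dependency-free sketch. This is not CrewAI's internal implementation: `AGENTS_CONFIG`, `DemoCrew`, `gemini_pro_llm` and `llm_for` are hypothetical names, and the LLM functions return plain strings instead of LangChain model objects. It only shows the idea of resolving the `llm` value of an agent to a function with the same name.

```python
# Illustrative only: a hypothetical stand-in for name-based LLM lookup,
# not CrewAI's actual implementation.

# Mirrors the shape of config/agents.yaml after parsing (agent names are made up).
AGENTS_CONFIG = {
    "researcher": {"role": "Senior Researcher", "llm": "gemini_pro_llm"},
    "reporting_analyst": {"role": "Reporting Analyst", "llm": "gemini_llm"},
}


class DemoCrew:
    """Holds one factory method per LLM name used in the config."""

    def gemini_llm(self) -> str:
        # In the real project this would return a LangChain Vertex AI model.
        return "gemini-flash"

    def gemini_pro_llm(self) -> str:
        # Hypothetical second LLM function for agents with harder tasks.
        return "gemini-pro"

    def llm_for(self, agent_name: str) -> str:
        """Resolve the agent's `llm` entry to the method with the same name."""
        llm_name = AGENTS_CONFIG[agent_name]["llm"]
        factory = getattr(self, llm_name, None)
        if factory is None:
            raise KeyError(f"No LLM function named {llm_name!r}")
        return factory()


crew = DemoCrew()
print(crew.llm_for("researcher"))
print(crew.llm_for("reporting_analyst"))
```

A misspelled name in the config fails at lookup time in this sketch, which is why the value under `llm:` and the function name need to match exactly.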

### Running our crew demo

Run the crewai_gcp project locally.

In [None]:
!poetry run crewai_gcp

### Prepare CrewAI interface for Reasoning Engine

To be able to run CrewAI on Reasoning Engine we need to create a class in crew_ai_app.py that defines `__init__`, `set_up` and `query` methods.\
Below you can see what we included in crew_ai_app.py.

##### Some remarks:

**Line 19**: We define what happens when Reasoning Engine starts. Here we initialise Traceloop.\
Traceloop provides an easy interface to trace our agent executions. We use `CloudTraceSpanExporter` to export traces to Google Cloud Trace.

**Line 27**:  `@workflow(name="CrewAI_Trace")` groups all trace spans together under the CrewAI_Trace workflow. You can change this name to whatever you want to call your trace grouping.

**Line 29**:  `CrewaiGcpCrew().crew().kickoff(inputs={"topic": question})` runs the crew for a given question. The response should be returned as a `str`.

In [None]:
!cat -n ./crew_ai_app.py
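For readers who only want the shape of the interface, here is a minimal, dependency-free sketch of a class with the same three methods. It is not the content of crew_ai_app.py: `EchoApp` and its `ready` flag are hypothetical, and the body of `query` simply formats a string where the real app would kick off the crew.

```python
class EchoApp:
    """Hypothetical stand-in with the __init__ / set_up / query shape."""

    def __init__(self, project: str, location: str) -> None:
        # Keep __init__ cheap: store plain configuration values only.
        self.project = project
        self.location = location
        self.ready = False

    def set_up(self) -> None:
        # Start-up logic goes here (the real app initialises tracing
        # at this point). This sketch only flips a flag.
        self.ready = True

    def query(self, question: str) -> str:
        if not self.ready:
            raise RuntimeError("call set_up() before query()")
        # The real app would run the crew here and return its output as str.
        return f"[{self.project}/{self.location}] report on: {question}"


app = EchoApp(project="my-project", location="us-central1")
app.set_up()
print(app.query(question="AI"))
```

Calling `query` with the keyword `question=` mirrors how the deployed engine is queried later in this notebook.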

In [None]:
project_id = "sa-org-project"
location = "us-central1"
staging_bucket = "gs://sa-org-project-reasoning-engine"

Test locally

In [None]:
from crew_ai_app import CrewAIApp

app = CrewAIApp(project=project_id, location=location)
app.set_up()
response_c = app.query("AI")


### Deploy on reasoning engine

In [None]:
import vertexai
from vertexai.preview import reasoning_engines

vertexai.init(project=project_id, location=location, staging_bucket=staging_bucket)

In [None]:

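# WARNING: this deletes every Reasoning Engine in the project and location, not only this demo's.
reasoning_engine_list = reasoning_engines.ReasoningEngine.list()
print(reasoning_engine_list)
for engine in reasoning_engine_list:  # `engine` avoids shadowing the stdlib `re` module
    engine.delete()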

In [None]:
# Create a remote app with reasoning engine.
# This may take 1-2 minutes to finish.
from crew_ai_app import CrewAIApp

reasoning_engine = reasoning_engines.ReasoningEngine.create(
    CrewAIApp(project=project_id, location=location),
    display_name="CrewAI Demo App",
    description="A CrewAI crew running on Reasoning Engine",
    requirements=[
        'cloudpickle==3',
        'crewai==0.51.0',
        'langchain-google-vertexai==1.0.7',
        'traceloop-sdk==0.26.5',
        'opentelemetry-exporter-gcp-trace==1.6.0',
    ],
    extra_packages=['./src', './crew_ai_app.py'],
)

In [None]:
response = reasoning_engine.query(question="Henry VIII")

In [None]:
print(response)

## Limitations:
This section describes the limitations we encountered as of August 20th 2024.

### Memory: 
By default the [memory system of CrewAI](https://docs.crewai.com/core-concepts/Memory/?h=memory) is disabled, and deploying CrewAI without memory works as intended.\
Enabling memory is not fully supported on Reasoning Engine at the moment because CrewAI uses a local directory\
for memory data storage.
When Reasoning Engine runs a single instance this works as intended, but the logic breaks\
with auto-scaling because new instances do not share the same local files.

This is not a problem with Reasoning Engine itself but a general challenge with local storage when your deployment benefits\
from auto-scaling. Resolving it requires support for external storage.\
You can find more details in this [feature request](https://github.com/crewAIInc/crewAI/issues/1218) we submitted to the CrewAI team.
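The effect of local storage under auto-scaling can be reproduced with a small sketch. It does not use CrewAI's memory classes: `LocalMemory` is a hypothetical file-backed store, and the two temporary directories stand in for the local disks of two separate instances.

```python
import json
import tempfile
from pathlib import Path


class LocalMemory:
    """Hypothetical file-backed memory: one JSON file per storage directory."""

    def __init__(self, storage_dir: Path) -> None:
        self.path = Path(storage_dir) / "memory.json"

    def _load(self) -> dict:
        return json.loads(self.path.read_text()) if self.path.exists() else {}

    def remember(self, key: str, value: str) -> None:
        data = self._load()
        data[key] = value
        self.path.write_text(json.dumps(data))

    def recall(self, key: str):
        return self._load().get(key)


# Each temporary directory plays the role of one instance's local disk.
with tempfile.TemporaryDirectory() as disk_a, tempfile.TemporaryDirectory() as disk_b:
    instance_a = LocalMemory(Path(disk_a))
    instance_b = LocalMemory(Path(disk_b))

    instance_a.remember("topic", "Henry VIII")

    same_instance = instance_a.recall("topic")        # written and read on A
    other_instance = instance_b.recall("topic")       # B never saw the write
    shared = LocalMemory(Path(disk_a)).recall("topic")  # a shared location works

print(same_instance)
print(other_instance)
print(shared)
```

The second instance returns nothing for a key the first one stored, while a store pointed at a common location sees it. An external storage backend would play the role of that common location.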

### Vertex AI Embeddings 
CrewAI depends on the [embedchain](https://github.com/mem0ai/mem0/tree/main/embedchain) library for generating embeddings.
embedchain uses an old LangChain import for VertexAIEmbeddings\
which is deprecated and [fails pydantic field validation](https://github.com/crewAIInc/crewAI/issues/1213).

A [PR was raised](https://github.com/mem0ai/mem0/pull/1717) to update to the newest LangChain VertexAIEmbeddings import, which solves the issue.


## Todo
1) Check if CrewAI supports streaming results that can be used for API streaming (FastAPI or Flask)
1) Check if parallel requests are supported in Reasoning Engine
1) Benchmark speed locally vs on Reasoning Engine
1) Return other generated artifacts, such as files, as part of the API response
1) Implement human in the loop and work out how that can be achieved when deployed on Reasoning Engine.\
What happens when multiple users are using the app? How do we use sessions?

