# A low-cost option: CrewAI with Groq, Llama3 and Mixtral

## The problem and the solution
* Problem: Multi-agent LLM apps are very expensive to run with OpenAI.
* Solution: A free alternative using Groq with Llama3 or Mixtral.
    * Not the same quality, but much cheaper.

## Caveats
* Keep in mind that the quality of Llama3 and Mixtral is still below the quality of ChatGPT-4.

## Process
* Clone the code of the basic CrewAI project.
* Change LLM to Llama3 or Mixtral with Groq.

## Intro to Groq
* Groq is an AI startup. **It is not the same as Grok, the open LLM from Elon Musk's xAI**.
* It has developed a new chip called the LPU (Language Processing Unit), which is specifically designed to run LLMs faster and more cheaply.
* It offers Groq Cloud, where you can try open-source LLMs like Llama3 or Mixtral.
* **It allows you to use Llama3 or Mixtral in your apps for free using a Groq API key**.

## How to get a free Groq API Key
* Log in to Groq Cloud: [https://console.groq.com/login](https://console.groq.com/login)
* Once logged in, click on API Keys (left sidebar).
* Create a new API key.
* Copy the API key and paste it into your `.env` file.
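
Once the key is in the `.env` file, the app has to load it into the environment before creating the LLM. The usual tool for this is `load_dotenv` from the `python-dotenv` package. The sketch below is a minimal standard-library stand-in, not a replacement for it: `load_env_file` is a hypothetical helper, and it assumes the key is stored under the name `GROQ_API_KEY`, which is the variable `langchain-groq` looks for.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Read KEY=VALUE lines from a .env file into os.environ.

    Returns the list of keys found. Values already present in the
    environment are kept (the file does not override them).
    """
    found = []
    env_path = Path(path)
    if not env_path.exists():
        return found
    for raw in env_path.read_text().splitlines():
        line = raw.strip()
        # Skip blank lines, comments and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        os.environ.setdefault(key, value)
        found.append(key)
    return found


if __name__ == "__main__":
    load_env_file()
    # Assumed variable name: GROQ_API_KEY.
    print("Groq key loaded:", "GROQ_API_KEY" in os.environ)
```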

## How to install Groq in your project
Very easy. LangChain has a module for it, so we install it the same way as any other LangChain module: with pip or, in a Poetry project, with Poetry. Use one of the following options:
* `pip install langchain-groq`
* `poetry add langchain-groq`

## How to use Groq in our LangChain or CrewAI project
Very easy. Just add the following line at the top of your file:
* `from langchain_groq import ChatGroq`

And then, in the code, if you want to use Llama3:

In [2]:
# llm = ChatGroq(
#     model="llama3-70b-8192"
# )

Or if you want to use Mixtral:

In [3]:
# llm = ChatGroq(
#     model="mixtral-8x7b-32768"
# )
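
To switch between the two models without editing the code each time, the two snippets can be folded into one small helper. This is only a sketch: `GROQ_MODELS` and `make_groq_llm` are hypothetical names, the model IDs are the two used above, and the `temperature` value is an arbitrary choice. It assumes `langchain-groq` is installed and that `GROQ_API_KEY` is set in the environment.

```python
# Short names mapped to the Groq model IDs used in this notebook.
GROQ_MODELS = {
    "llama3": "llama3-70b-8192",
    "mixtral": "mixtral-8x7b-32768",
}


def make_groq_llm(name="llama3", temperature=0.0):
    """Build a ChatGroq client for one of the models above."""
    # Imported here so the mapping can be used without the package installed.
    from langchain_groq import ChatGroq

    return ChatGroq(model=GROQ_MODELS[name], temperature=temperature)


if __name__ == "__main__":
    llm = make_groq_llm("mixtral")
    print(llm.invoke("Say hello in one short sentence.").content)
```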

## Running the app we see some interesting results before seeing a Rate Limit error message
* Remember, we are asking the app to research a new and complex topic (Agentic Behavior). There is not much info about it online yet.
* When the app does not find satisfactory results online with Tavily, Llama3 retries Tavily with a slightly different search query instead of falling back on its own knowledge of the subject (as ChatGPT-4 did).
* The app writes a blog post.
* But before completing the full cycle, we see this error message:
    * groq.RateLimitError: Error code: 429 - {'error': {'message': '**Rate limit reached** for model `llama3-70b-8192` in organization `org_01hwgcd1f3fqqr0xfmrxtyyqae` on tokens per minute (TPM): Limit 3500, Used 0, Requested ~3540. Please try again in 685.714285ms. Visit https://console.groq.com/docs/rate-limits for more information.', 'type': 'tokens', 'code': 'rate_limit_exceeded'}}
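
The numbers in this error message can be read programmatically. The sketch below uses a hypothetical helper, `parse_rate_limit`, which is not part of the Groq SDK, to pull the limit, the usage and the suggested wait out of a message shaped like the one above. It also checks an inference of ours, not documented Groq behavior: the suggested wait seems to match the time needed to refill the 40 tokens over the limit at a rate of 3500 tokens per minute.

```python
import re

# Excerpt of the error message shown above (organization id omitted).
MESSAGE = (
    "Rate limit reached for model `llama3-70b-8192` on tokens per minute (TPM): "
    "Limit 3500, Used 0, Requested ~3540. Please try again in 685.714285ms."
)


def parse_rate_limit(message):
    """Return the TPM numbers and suggested wait from the message, or None."""
    usage = re.search(r"Limit (\d+), Used (\d+), Requested ~?(\d+)", message)
    wait = re.search(r"try again in ([\d.]+)(ms|s)\b", message)
    if usage is None or wait is None:
        return None
    limit, used, requested = (int(g) for g in usage.groups())
    amount, unit = float(wait.group(1)), wait.group(2)
    retry_after_s = amount / 1000 if unit == "ms" else amount
    return {
        "limit": limit,
        "used": used,
        "requested": requested,
        "retry_after_s": retry_after_s,
    }


def refill_wait_s(limit, used, requested):
    """Seconds for a per-minute budget to refill the tokens over the limit.

    Assumption: the budget refills at a steady rate of `limit` tokens per minute.
    """
    overage = max(0, used + requested - limit)
    return overage / limit * 60


info = parse_rate_limit(MESSAGE)
print(info)
print(refill_wait_s(info["limit"], info["used"], info["requested"]))
```

A retry wrapper could sleep for `retry_after_s` before calling the model again, although this only handles the simple `ms` and `s` formats.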

## We run the app again
* Same result: Partial results and Rate Limit error.

## Looking at LangSmith, we see a larger number of tokens used
* Almost double what we used with ChatGPT-4
* But it is free

## Let's now change the research topic for a much simpler one and see the results
* New topic: "Last Real Madrid - Barcelona soccer match".

## Tested with ChatGPT-4, it works OK.
* Multi-agent apps work fine with ChatGPT-4 even if the agent and task definitions are not very well tuned. ChatGPT-4 has an impressive reasoning level.

## Tested with Llama3

* No luck. Same error: Rate limit.

## Let's try with Mixtral.

* No luck. Same error: Rate limit.

* Huge amount of tokens used. Still free. We will have to fine-tune the prompts.

## Our initial conclusions
* Groq is a good way to use Llama3 and Mixtral.
* But it is still not a good solution for multi-agent apps:
    * Since multi-agent apps use a lot of tokens, it is easy to hit Groq's rate limits.
    * The reasoning capability of Llama3 and Mixtral is still too low compared with ChatGPT-4.
* Homework for you:
    * Experiment with using several LLMs at the same time (example: one agent running with OpenAI, the other 3 agents running with Groq and Llama3).
    * Keep experimenting with Groq, Llama3 and Mixtral to find better approaches for multi-agent LLM apps.
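
As a starting point for the first homework item, here is one possible way to describe which LLM each agent should use. Everything in it is an assumption for illustration: the four role names, the choice of `gpt-4` for the OpenAI agent, and the helper name `build_llm`. It assumes `langchain-groq` and `langchain-openai` are installed and that both API keys are set in the environment.

```python
# Hypothetical plan: one agent on OpenAI, the other three on Groq with Llama3.
LLM_PLAN = {
    "manager": ("openai", "gpt-4"),
    "researcher": ("groq", "llama3-70b-8192"),
    "writer": ("groq", "llama3-70b-8192"),
    "editor": ("groq", "llama3-70b-8192"),
}


def build_llm(provider, model):
    """Create the chat model for one entry of LLM_PLAN."""
    if provider == "groq":
        from langchain_groq import ChatGroq

        return ChatGroq(model=model)
    if provider == "openai":
        from langchain_openai import ChatOpenAI

        return ChatOpenAI(model=model)
    raise ValueError(f"Unknown provider: {provider}")


if __name__ == "__main__":
    llms = {role: build_llm(*choice) for role, choice in LLM_PLAN.items()}
    # Each entry can then be passed to the matching agent, e.g. Agent(..., llm=llms["writer"]).
    print(sorted(llms))
```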

## You can take a look at Groq Rate limits here
* https://console.groq.com/settings/limits

## One possible way to prevent the Rate Limit error from Groq
* Set `max_iter` to a low number of iterations (2, 3 or 4) on every agent. For example:
    * `max_iter=2`
* Keep in mind that this fix will limit the functionality of the multi-agent app. We did not like it.
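
In code, this means passing `max_iter` when each agent is created. The sketch below is a rough illustration only: the role, goal and backstory texts are invented, and `agent_settings` and `build_agent` are hypothetical helpers. It assumes `crewai` is installed and that `llm` is a chat model such as a `ChatGroq` instance.

```python
def agent_settings(role, goal, backstory, max_iter=2):
    """Collect the keyword arguments for one agent, with a low iteration cap."""
    return {
        "role": role,
        "goal": goal,
        "backstory": backstory,
        "max_iter": max_iter,  # fewer iterations means fewer LLM calls and tokens
    }


def build_agent(settings, llm):
    """Create a CrewAI agent from the settings above."""
    # Imported here so the settings can be prepared without crewai installed.
    from crewai import Agent

    return Agent(llm=llm, **settings)


# Invented example values, for illustration only.
researcher_settings = agent_settings(
    role="Researcher",
    goal="Find recent information about the topic",
    backstory="An analyst who summarizes web search results.",
    max_iter=2,
)
print(researcher_settings["max_iter"])
```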

## Groq pricing for projects in Production
* [Groq pricing](https://wow.groq.com/).