# Introducing Groq
(The following summary was generated by Gemini 2.0 Flash with the prompt *"Give me a shortish rundown of the Groq LLM platform."*)


>[Groq](https://groq.com/) is a company that has developed a platform specifically designed for running Large Language Models (LLMs) with a focus on real-time performance. Here's a rundown of their key aspects:

>Core Technology:

>>**Language Processing Unit (LPU):** Groq's unique hardware, the LPU, is a specialized chip architecture designed to accelerate LLM inference. It prioritizes sequential processing, which is crucial for generating text word by word, unlike traditional GPUs that excel at parallel computations.
  
>>**Software-First Approach:** Groq emphasizes software optimization, allowing their compiler to manage hardware execution and data flow. This simplifies development and speeds up deployment.
  
>Key Advantages:

>>**Speed:** Groq claims to offer the fastest inference performance in the industry, measured in tokens per second per user. This translates to quicker response times and a more interactive user experience.

>>**Efficiency:** Their platform is designed to be energy-efficient, which is important for cost savings and environmental considerations.

>>**Scalability:** Groq's architecture is built to maintain high performance even as the scale of LLM deployments increases.

>Use Cases:

>Groq's platform is well-suited for applications that demand real-time LLM interactions, such as:

>>**Chatbots and Virtual Assistants:** Where quick responses are essential for a natural conversation.
  
>>**Real-time Video Analytics:** For processing video feeds and making immediate decisions.

>>**Predictive Analytics:** Where large datasets need to be analyzed rapidly.


## 1 A simple example

(Slightly modified version of https://console.groq.com/docs/quickstart).

I'm using Colab secrets to store API keys. See, for instance, [Gemini API: Authentication Quickstart](https://colab.research.google.com/github/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb).

In [1]:
!pip install -qU groq

In [2]:
import os
from google.colab import userdata
os.environ["GROQ_API_KEY"] = userdata.get("GROQ_API_KEY")

# If you're running this outside of Colab, use this instead:
#
# import getpass
#
# if not os.environ.get("GROQ_API_KEY"):
#     os.environ["GROQ_API_KEY"] = getpass.getpass("Enter API key for Groq: ")

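The two ways of loading the key in the cell above could be folded into one helper. This is only a sketch, not something from the Groq quickstart: `load_api_key` is a hypothetical name, and the code assumes that `google.colab` can be imported only inside Colab.

```python
import getpass
import os


def load_api_key(name: str = "GROQ_API_KEY") -> str:
    """Return an API key from the environment, Colab secrets, or a prompt.

    Hypothetical helper for illustration; not part of the Groq SDK.
    """
    key = os.environ.get(name)
    if not key:
        try:
            # Assumption: this import only succeeds inside Colab.
            from google.colab import userdata
            key = userdata.get(name)
        except ImportError:
            # Outside Colab, ask interactively without echoing the key.
            key = getpass.getpass(f"Enter API key for {name}: ")
    if not key:
        raise RuntimeError(f"No value found for {name}")
    # Cache the key so later cells can read it from os.environ.
    os.environ[name] = key
    return key
```

Calling `load_api_key()` once at the top of the notebook would then leave `GROQ_API_KEY` set in `os.environ` for the cells that follow.
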
In [3]:
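from groq import Groq

# The client uses the key that was stored in the environment above.
client = Groq(
    api_key=os.environ.get("GROQ_API_KEY"),
)

# Send a single user message and wait for the complete reply.
chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Explain the importance of fast language models",
        }
    ],
    model="llama-3.3-70b-versatile",
)

# The reply text is in the first choice.
print(chat_completion.choices[0].message.content)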

Fast language models are crucial in today's technology landscape, and their importance can be seen in various aspects. Here are some key reasons why fast language models are essential:

1. **Efficient Processing**: Fast language models can process and analyze large amounts of text data quickly, making them ideal for applications such as text classification, sentiment analysis, and language translation. This efficiency enables businesses and organizations to gain insights and make decisions faster.
2. **Real-time Applications**: Fast language models are necessary for real-time applications like chatbots, virtual assistants, and language translation software. These models can quickly respond to user queries, providing a seamless and interactive experience.
3. **Scalability**: As the amount of text data grows, fast language models can handle the increased volume and velocity, making them scalable and able to support large-scale applications.
4. **Improved User Experience**: Fast language 
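
If you plan to send more than one prompt, the call above can be wrapped in a small function. A minimal sketch: `ask` is a hypothetical name, the optional `system` argument is my addition, and `client` is assumed to be the `Groq` instance created earlier. It uses only the `chat.completions.create` call and the `choices[0].message.content` access shown in the quickstart cell.

```python
def ask(client, prompt, system=None, model="llama-3.3-70b-versatile"):
    """Send one prompt and return the reply text.

    Hypothetical wrapper; `client` is expected to be a `groq.Groq` instance.
    """
    messages = []
    if system:
        # An optional system message sets the assistant's behaviour.
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})

    chat_completion = client.chat.completions.create(
        messages=messages,
        model=model,
    )
    return chat_completion.choices[0].message.content
```

With the client from the cell above, `ask(client, "Explain the importance of fast language models")` would issue the same request as the quickstart example.
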

## 2 Using Groq with LangChain

A modified excerpt from https://python.langchain.com/docs/tutorials/llm_chain/.

In [4]:
!pip install -qU "langchain[groq]"

In [5]:
from langchain.chat_models import init_chat_model

model = init_chat_model("llama3-8b-8192", model_provider="groq")

In [6]:
from langchain_core.messages import HumanMessage, SystemMessage

messages = [
    SystemMessage("Translate the following from English into Italian"),
    HumanMessage("hi!"),
]

response = model.invoke(messages)

In [8]:
response

AIMessage(content='Ciao!', additional_kwargs={}, response_metadata={'token_usage': {'completion_tokens': 4, 'prompt_tokens': 24, 'total_tokens': 28, 'completion_time': 0.003333333, 'prompt_time': 0.005088586, 'queue_time': 0.09063891299999999, 'total_time': 0.008421919}, 'model_name': 'llama3-8b-8192', 'system_fingerprint': 'fp_6a6771ae9c', 'finish_reason': 'stop', 'logprobs': None}, id='run-e41cafcf-add0-4a82-988c-4846709b2f9e-0', usage_metadata={'input_tokens': 24, 'output_tokens': 4, 'total_tokens': 28})

In [7]:
response.content

'Ciao!'
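
The `response_metadata` printed above includes timing fields, so the "tokens per second" figure mentioned in the introduction can be estimated directly from a response. A rough sketch, assuming the `token_usage` times are in seconds (the output doesn't say so explicitly); `tokens_per_second` is a hypothetical helper name, and the numbers are copied from the response above.

```python
def tokens_per_second(response_metadata: dict) -> float:
    """Generation throughput: completion tokens divided by completion time."""
    usage = response_metadata["token_usage"]
    return usage["completion_tokens"] / usage["completion_time"]


# Values copied from the AIMessage printed above.
metadata = {
    "token_usage": {
        "completion_tokens": 4,
        "prompt_tokens": 24,
        "total_tokens": 28,
        "completion_time": 0.003333333,
        "prompt_time": 0.005088586,
        "queue_time": 0.09063891299999999,
        "total_time": 0.008421919,
    }
}

print(f"{tokens_per_second(metadata):.0f} tokens/s")  # prints "1200 tokens/s"
```

In the notebook itself you would pass `response.response_metadata` instead of the hard-coded dictionary. A four-token reply is far too short to serve as a benchmark, and in this particular response the reported `queue_time` is roughly ten times larger than `total_time`, so the figure says little about end-to-end latency.
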

## 3 Resources

- [Groq Cloud Console](https://console.groq.com/playground) bundles several tools: a playground for trying out prompts and models, API key generation, [documentation](https://console.groq.com/docs/overview), etc.
- The [Examples](https://console.groq.com/docs/examples) part of the documentation has a bunch of "simple python applications you can fork and run in [Replit](https://replit.com/) to get you started building with Groq".
- **Replit** is "an online integrated development environment (IDE) that provides a collaborative, browser-based platform for coding, learning, and building software" (Gemini again). See also [Replit on Wikipedia](https://en.wikipedia.org/wiki/Replit).
- [Groq API Cookbook](https://github.com/groq/groq-api-cookbook) is a GitHub repo with "a collection of example code and guides for Groq API".
- There's also a [developer community on Discord](https://discord.com/invite/groq).



