Note:
1. Get an API key from the [Groq console](https://console.groq.com/keys) and store it in the Colab `Secrets` panel under the name `GROQ_API_KEY`.
2. Install the Groq SDK: `!pip install groq`

---
Groq emerged as the first API provider to break the 100 tokens per second generation rate while running Meta’s Llama 2 70B model.

Groq currently hosts a variety of open-source large language models running on its LPUs (Language Processing Units) for public access. Demos of these models are available through Groq's website.

Ref: https://console.groq.com/docs/text-chat

In [2]:
!pip install groq

Collecting groq
  Downloading groq-0.13.1-py3-none-any.whl.metadata (14 kB)
Downloading groq-0.13.1-py3-none-any.whl (109 kB)
Installing collected packages: groq
Successfully installed groq-0.13.1


# Text completion API

In [14]:
from groq import Groq
from google.colab import userdata
import os
client = Groq(
    api_key=userdata.get("GROQ_API_KEY"),
)
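
The cell above only works inside Colab, because `userdata` comes from `google.colab`. As a minimal sketch, the key lookup could fall back to an environment variable when the notebook runs elsewhere. The helper name `load_groq_api_key` is hypothetical and not part of the Groq SDK.

```python
import os


def load_groq_api_key(name: str = "GROQ_API_KEY") -> str:
    """Return the key from Colab secrets, falling back to an environment variable.

    Hypothetical helper for illustration; not part of the Groq SDK.
    """
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import userdata

        key = userdata.get(name)
        if key:
            return key
    except Exception:
        # Not in Colab, or the secret is missing or not shared with this notebook.
        pass

    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} not found in Colab secrets or environment variables")
    return key
```

With this in place, the client could be created as `client = Groq(api_key=load_groq_api_key())`.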


Parameters:
1. **frequency_penalty**  
   - `number or null` (Optional)  
   - Defaults to 0  
   - Accepts a number between `-2.0 and 2.0`. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same lines.

2. **max_tokens**  
   - `integer or null` (Optional)  
   - The maximum number of tokens that can be generated in the chat completion.  
   - The total length of input tokens and generated tokens is limited by the model's context length.

3. **messages**  
   - `array` (Required)  
   - A list of messages comprising the conversation so far.

4. **model**  
   - `string` (Required)  
   - The ID of the model to use.

5. **presence_penalty**  
   - `number or null` (Optional)  
   - Defaults to 0  
   - Accepts a number between `-2.0 and 2.0`. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to discuss new topics.

6. **response_format**  
   - `object or null` (Optional)  
   - An object specifying the format in which the model must output.  
   - Setting to `{ "type": "json_object" }` enables JSON mode, which guarantees that the message generated by the model is valid JSON.  
   - **Important**: When using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message.

7. **stream**  
   - `boolean or null` (Optional)  
   - Defaults to `false`  
   - If set to `true`, partial message deltas will be sent. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a `data: [DONE]` message.

8. **temperature**  
   - `number or null` (Optional)  
   - Defaults to 1  
   - The sampling temperature to use, between 0 and 2. Higher values like `0.8` will make the output more random, while lower values like `0.2` will make it more focused and deterministic. Groq generally recommends altering this or `top_p`, but not both.

9. **tool_choice**  
   - `string / object or null` (Optional)  
   - Controls which (if any) tool is called by the model.  
   - `none` means the model will not call any tool and instead generates a message.  
   - `auto` means the model can pick between generating a message or calling one or more tools.  
   - `required` means the model must call one or more tools.  
   - Specifying a particular tool via `{ "type": "function", "function": { "name": "my_function" } }` forces the model to call that tool.  
   - `none` is the default when no tools are present. `auto` is the default if tools are present.

10. **tools**  
   - `array or null` (Optional)  
   - A list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A maximum of 128 functions is supported.

11. **top_logprobs**  
   - `integer or null` (Optional)  
   - Not yet supported by any of Groq's models. An integer between `0 and 20` specifying the number of most likely tokens to return at each token position, each with an associated log probability. `logprobs` must be set to `true` if this parameter is used.

12. **top_p**  
   - `number or null` (Optional)  
   - Defaults to `1`  
   - An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So `0.1` means only the tokens comprising the top 10% probability mass are considered. Groq generally recommends altering this or `temperature`, but not both.

13. **user**  
   - `string or null` (Optional)  
   - A unique identifier representing your end-user, which can help Groq monitor and detect abuse.
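
To make the parameter list more concrete, here is a rough sketch of a helper that checks a few of the documented ranges and assembles keyword arguments for `client.chat.completions.create`. The function `build_chat_request` is hypothetical (not part of the SDK), and the check that a message mentions "JSON" is only a crude heuristic for the JSON-mode requirement described above.

```python
from typing import Any, Dict, List, Optional


def build_chat_request(
    messages: List[Dict[str, Any]],
    model: str,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    max_tokens: Optional[int] = None,
    frequency_penalty: Optional[float] = None,
    presence_penalty: Optional[float] = None,
    json_mode: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: validate documented ranges and return request kwargs."""
    if not messages:
        raise ValueError("messages is required and must not be empty")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    for name, value in (
        ("frequency_penalty", frequency_penalty),
        ("presence_penalty", presence_penalty),
    ):
        if value is not None and not -2.0 <= value <= 2.0:
            raise ValueError(f"{name} must be between -2.0 and 2.0")

    # Heuristic only: JSON mode expects a system or user message asking for JSON.
    if json_mode and not any(
        "json" in str(m.get("content", "")).lower() for m in messages
    ):
        raise ValueError("JSON mode needs a message that asks for JSON output")

    request: Dict[str, Any] = {"messages": messages, "model": model}
    optional = {
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
        "frequency_penalty": frequency_penalty,
        "presence_penalty": presence_penalty,
    }
    # Omit unset parameters so the API applies its own defaults.
    request.update({k: v for k, v in optional.items() if v is not None})
    if json_mode:
        request["response_format"] = {"type": "json_object"}
    return request
```

The result could then be passed on as `client.chat.completions.create(**build_chat_request(...))`. Only one of `temperature` and `top_p` would normally be set, following the recommendation in the list.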


In [15]:

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "you are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Explain the importance of Yoga",
        }
    ],
    model="llama-3.3-70b-versatile",
)

print(chat_completion.choices[0].message.content)

Yoga is a holistic practice that originated in ancient India, combining physical postures, breathing techniques, and meditation to promote overall well-being. The importance of yoga can be understood on multiple levels, including physical, mental, emotional, and spiritual. Here are some key benefits of practicing yoga:

**Physical Benefits:**

1. **Flexibility and Balance**: Yoga helps increase flexibility, balance, and coordination by stretching and strengthening the muscles.
2. **Weight Management**: Yoga can help with weight management by building muscle, improving metabolism, and reducing stress.
3. **Improved Posture**: Yoga helps improve posture by strengthening the core and back muscles, reducing the risk of back pain and other musculoskeletal issues.
4. **Cardiovascular Health**: Yoga can help lower blood pressure, improve circulation, and reduce the risk of heart disease.

**Mental and Emotional Benefits:**

1. **Reduced Stress and Anxiety**: Yoga helps reduce stress and anxie

# Audio transcription API
Transcribes audio into the input language.

Parameters:
1. **file**  
   - `string` (Required)  
   - The audio file object (not the file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.

2. **language**  
   - `string` (Optional)  
   - The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency.

3. **model**  
   - `string` (Required)  
   - ID of the model to use. Only `whisper-large-v3` is currently available.

4. **prompt**  
   - `string` (Optional)  
   - An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.

5. **response_format**  
   - `string` (Optional)  
   - Defaults to `json`  
   - The format of the transcript output, in one of these options: `json`, `text`, or `verbose_json`.

6. **temperature**  
   - `number` (Optional)  
   - Defaults to `0`  
   - The sampling temperature, between `0` and `1`. Higher values like `0.8` will make the output more random, while lower values like `0.2` will make it more focused and deterministic. If set to `0`, the model will use log probability to automatically adjust the temperature until certain thresholds are met.

7. **timestamp_granularities**  
   - `array` (Optional)  
   - Defaults to `segment`  
   - The timestamp granularities to populate for this transcription. `response_format` must be set to `verbose_json` to use timestamp granularities. Either or both of these options are supported: `word` and `segment`.  
   - Note: There is no additional latency for segment timestamps, but generating word timestamps incurs additional latency.
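
The dependency between `timestamp_granularities` and `response_format` is easy to miss, so here is a small sketch that enforces it before a request is sent. `build_transcription_options` is a hypothetical helper, not an SDK function; it only assembles keyword arguments from the rules listed above.

```python
from typing import Any, Dict, List, Optional

RESPONSE_FORMATS = {"json", "text", "verbose_json"}
GRANULARITIES = {"word", "segment"}


def build_transcription_options(
    response_format: str = "json",
    timestamp_granularities: Optional[List[str]] = None,
    language: Optional[str] = None,
    temperature: float = 0.0,
) -> Dict[str, Any]:
    """Hypothetical helper: check the documented constraints and return kwargs."""
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"response_format must be one of {sorted(RESPONSE_FORMATS)}")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")

    options: Dict[str, Any] = {
        "response_format": response_format,
        "temperature": temperature,
    }
    if language:
        options["language"] = language  # ISO-639-1 code such as "en"

    if timestamp_granularities:
        unknown = set(timestamp_granularities) - GRANULARITIES
        if unknown:
            raise ValueError(f"unknown timestamp granularities: {sorted(unknown)}")
        if response_format != "verbose_json":
            raise ValueError("timestamp_granularities requires response_format='verbose_json'")
        options["timestamp_granularities"] = list(timestamp_granularities)
    return options
```

The options could then be spread into the call, for example `client.audio.transcriptions.create(file=(filename, data), model="whisper-large-v3", **options)`, assuming the installed SDK version accepts `timestamp_granularities`.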


In [16]:
filename = "transcription.mp3"

with open(filename, "rb") as file:
    transcription = client.audio.transcriptions.create(
        file=(filename, file.read()),  # (file name, raw bytes)
        model="whisper-large-v3",
        prompt="Specify context or spelling",  # Optional
        response_format="json",  # Optional
        language="en",  # Optional
        temperature=0.0,  # Optional
    )
    print(transcription.text)

 The fire that warms us can also consume us. It is not the fault of the fire.


# Audio translation API

Translates audio into English.

Parameters:
1. **file**  
   - `string` (Required)  
   - The audio file object (not the file name) to translate, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.

2. **model**  
   - `string` (Required)  
   - ID of the model to use. Only `whisper-large-v3` is currently available.

3. **prompt**  
   - `string` (Optional)  
   - An optional text to guide the model's style or continue a previous audio segment. The prompt should be in English.

4. **response_format**  
   - `string` (Optional)  
   - Defaults to `json`  
   - The format of the translation output, in one of these options: `json`, `text`, or `verbose_json`.

5. **temperature**  
   - `number` (Optional)  
   - Defaults to `0`  
   - The sampling temperature, between `0` and `1`. Higher values like `0.8` will make the output more random, while lower values like `0.2` will make it more focused and deterministic. If set to `0`, the model will use log probability to automatically adjust the temperature until certain thresholds are met.
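
Since both audio endpoints accept the same set of file formats, a quick local check can catch an unsupported file before it is uploaded. This is a minimal sketch: `check_audio_format` is a hypothetical helper, and it looks only at the file extension, not at the actual container or codec.

```python
from pathlib import Path

# Formats listed in the parameter description above.
SUPPORTED_AUDIO_FORMATS = {
    "flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm",
}


def check_audio_format(path: str) -> str:
    """Return the lower-cased extension, or raise if it is not a listed format."""
    extension = Path(path).suffix.lower().lstrip(".")
    if extension not in SUPPORTED_AUDIO_FORMATS:
        raise ValueError(
            f"{path!r} has extension {extension!r}; expected one of "
            f"{sorted(SUPPORTED_AUDIO_FORMATS)}"
        )
    return extension
```

For example, `check_audio_format("translation.m4a")` passes, while a `.txt` file raises a `ValueError`.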


In [17]:
filename = "translation.m4a"
with open(filename, "rb") as file:
    translation = client.audio.translations.create(
      file=(filename, file.read()),
      model="whisper-large-v3",
      prompt="Specify context or spelling",  # Optional
      response_format="json",  # Optional
      temperature=0.0  # Optional
    )
    print(translation.text)



 The capital of West Bengal is located on the banks of the Huggli River, 180 km from the border of the Bengal Khadi.


# Vision models

The Groq API offers fast, low-latency inference for multimodal models with vision capabilities that understand and interpret visual data from images. By analyzing the content of an image, these models can generate human-readable text that describes it or provides insights about it.

In [18]:
# Pass Images from URLs as Input

completion = client.chat.completions.create(
    model="llama-3.2-11b-vision-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What's in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://static.wixstatic.com/media/nsplsh_464847566f4f6c77476149~mv2_d_6016_4016_s_4_2.jpg/v1/crop/x_0,y_32,w_6016,h_3951/fill/w_708,h_465,al_c,q_80,usm_0.66_1.00_0.01,enc_avif,quality_auto/Image%20by%20Proxyclick%20Visitor%20Management%20System.jpg"
                    }
                }
            ]
        }
    ],
    temperature=1,
    max_tokens=1024,
    top_p=1,
    stream=False,
    stop=None,
)

print(completion.choices[0].message.content)

This image shows a modern office space with two levels connected by a staircase. The lower level has a long, narrow room with exposed brick walls and towering windows along its left wall. In this space, there are several long tables topped with rolled-up rugs, around which several empty black chairs can be seen. Six clay amphora vases make up the backsplash, and several computer monitor units and other gear can be seen against the right wall.

The staircase includes metal handrails and dark bronze posts supporting a second-story landing that is furnished with the amphora vases. Along the left side of the stairwell as seen in the foreground, a black metal and wood table are pushed against the brick wall. At the middle of the second floor, a group of employees or students in long-sleeved jeans and light blue shirts appear to be, listening to a presentation.

The atmosphere of the space is open, bright, and collaborative, conducive to creativity and productivity, making it an effective of
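
The request above nests the prompt and the image inside a single user message whose `content` is a list of typed parts. As a small sketch of that structure, the hypothetical helper `vision_message` below builds the same dictionary from a prompt and an image URL.

```python
from typing import Any, Dict


def vision_message(prompt: str, image_url: str) -> Dict[str, Any]:
    """Build a user message with one text part and one image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


# Placeholder URL for illustration only.
message = vision_message("What's in this image?", "https://example.com/office.jpg")
```

The message could then be sent with `client.chat.completions.create(model="llama-3.2-11b-vision-preview", messages=[message])`.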

In [19]:
# Pass Locally Saved Images as Input
import base64


# Function to encode the image
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

# Path to your image
image_path = "yoga.jpeg"

# Getting the base64 string
base64_image = encode_image(image_path)


chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{base64_image}",
                    },
                },
            ],
        }
    ],
    model="llama-3.2-11b-vision-preview",
)

print(chat_completion.choices[0].message.content)

This image depicts a man sitting in a lotus position, a yoga pose commonly used for meditation. He is wearing a white tank top and gray pants, with his legs crossed and hands raised in front of him. The background is a solid gray color, suggesting that this may be a promotional image for a yoga class or product.
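
The local-image cell hardcodes `image/jpeg` in the data URL, which would be wrong for a PNG or WebP file. One possible variation, sketched below, guesses the MIME type from the file name using the standard-library `mimetypes` module. The helper name `image_to_data_url` is hypothetical.

```python
import base64
import mimetypes


def image_to_data_url(image_path: str) -> str:
    """Encode a local image as a base64 data URL with a guessed MIME type."""
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"could not infer an image MIME type for {image_path!r}")
    with open(image_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string can be used as the `url` value of an `image_url` content part, in place of the f-string in the cell above.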
