<a href="https://colab.research.google.com/github/abdullahyasser0/Free-LLMS-api/blob/main/groq_api.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🔰 Introduction

In this notebook, we will explore how to integrate **Groq** into your code, along with usage examples.  
To get started:

1. Generate a free API key from [GroqCloud](https://console.groq.com/keys) (sign in or create an account).
2. For security, store your API key in **Colab secrets** instead of directly in the code.
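
Step 2 can be sketched as a small helper. This is only a sketch under stated assumptions: `load_api_key` is a hypothetical name (it is not part of the Groq SDK), and the fallback to a `GROQ_API_KEY` environment variable is an assumption for running the notebook outside Colab.

```python
import os


def load_api_key(name="GROQ_API_KEY"):
    """Return the API key from Colab secrets, else from the environment.

    `load_api_key` is a hypothetical helper, not part of the Groq SDK.
    """
    try:
        # Only importable inside Google Colab.
        from google.colab import userdata
        key = userdata.get(name)
        if key:
            return key
    except Exception:
        # Not in Colab, or the secret is missing or not shared with
        # this notebook; fall through to the environment variable.
        pass

    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} not found in Colab secrets or environment")
    return key
```

The result could then be passed to the client as `Groq(api_key=load_api_key())`.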


# Imports ⚓

In [1]:
pip install groq

Collecting groq
  Downloading groq-0.28.0-py3-none-any.whl.metadata (15 kB)
Downloading groq-0.28.0-py3-none-any.whl (130 kB)
Installing collected packages: groq
Successfully installed groq-0.28.0


In [3]:
from google.colab import userdata
from groq import Groq

In [None]:
client = Groq(api_key=userdata.get('GROQ_API_KEY'))

# Text Generation



In [4]:
chat_completion = client.chat.completions.create(
    messages=[
        # Set an optional system message. This sets the behavior of the
        # assistant and can be used to provide specific instructions for
        # how it should behave throughout the conversation.
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        # Set a user message for the assistant to respond to.
        {
            "role": "user",
            "content": "Explain the importance of fast language models",
        }
    ],

    # The language model which will generate the completion.
    model="llama-3.3-70b-versatile"
)

# Print the completion returned by the LLM.
print(chat_completion.choices[0].message.content)

Fast language models are crucial in today's technological landscape, and their importance can be understood from several perspectives:

1. **Efficient Processing**: Fast language models can process and analyze vast amounts of text data quickly, making them essential for applications that require real-time or near-real-time processing, such as:
	* Sentiment analysis
	* Text classification
	* Language translation
	* Chatbots and virtual assistants
2. **Improved User Experience**: Fast language models enable applications to respond rapidly to user input, providing a seamless and interactive experience. This is particularly important for:
	* Voice assistants
	* Messaging platforms
	* Customer service chatbots
	* Online search engines
3. **Scalability**: Fast language models can handle large volumes of data and user requests, making them scalable for:
	* Big data analytics
	* Natural Language Processing (NLP) tasks
	* Machine learning applications
	* Cloud-based services
4. **Energy Efficie
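
The `messages` list used above is also how multi-turn context is carried: each request sends the full history, including earlier assistant replies. The following is a minimal sketch of that idea. `append_turn` is a hypothetical helper, and the placeholder reply and follow-up question are illustrative only.

```python
def append_turn(messages, assistant_reply, next_user_message):
    """Return a new message list extended with one assistant/user exchange."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_message},
    ]


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain the importance of fast language models"},
]

# In the notebook, assistant_reply would be
# chat_completion.choices[0].message.content from the call above.
history = append_turn(
    history,
    assistant_reply="(previous model answer)",
    next_user_message="Summarize that in one sentence.",
)

# The extended list can be passed as messages=history to
# client.chat.completions.create(...) for the follow-up request.
print([m["role"] for m in history])
```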

# Text to Speech


---

Converts plain text to audio.

The first time you use it, you must accept the model terms. Use the link below; if you skip this step, the request fails with an error that includes the same link, which you can click to accept.


---


Model Terms Acceptance
To use PlayAI TTS, you must accept the model terms: [link](https://console.groq.com/playground?model=playai-tts)

In [7]:
import os
from groq import Groq

# English Model

speech_file_path = "speech.wav"
model = "playai-tts"  # use "playai-tts-arabic" for Arabic
voice = "Fritz-PlayAI"
text = "I love building and shipping new features for our users!"
response_format = "wav"

response = client.audio.speech.create(
    model=model,
    voice=voice,
    input=text,
    response_format=response_format
)

response.write_to_file(speech_file_path)

In [10]:
# Arabic Model

speech_file_path = "speechArabic.wav"
model = "playai-tts-arabic"  # Arabic TTS model
voice = "Nasser-PlayAI"  # one of "Nasser-PlayAI", "Khalid-PlayAI", "Amira-PlayAI", "Ahmad-PlayAI"
text = "انا احب اللغة العربية"  # "I love the Arabic language"
response_format = "wav"

response = client.audio.speech.create(
    model=model,
    voice=voice,
    input=text,
    response_format=response_format
)

response.write_to_file(speech_file_path)
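
To sanity-check a generated file, its duration can be read with the standard-library `wave` module. This is a sketch that assumes the returned file is uncompressed PCM WAV, which is what `wave` can read; that has not been verified against the API here. `wav_duration_seconds` is a hypothetical helper name.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
    return frames / float(rate)


# Example (after the cells above have run):
# print(wav_duration_seconds("speech.wav"))
```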

# Speech to Text

---

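Transcribes audio files (a WAV file in this example) to text. The example sets `language="en"` because the input is English; `whisper-large-v3-turbo` itself is a multilingual model.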

In [11]:
import os
import json
from groq import Groq


# Specify the path to the audio file
filename = "/content/speech.wav" # Replace with your audio file!

# Open the audio file
with open(filename, "rb") as file:
    # Create a transcription of the audio file
    transcription = client.audio.transcriptions.create(
      file=file, # Required audio file
      model="whisper-large-v3-turbo", # Required model to use for transcription
      prompt="Specify context or spelling",  # Optional
      response_format="verbose_json",  # Optional
      timestamp_granularities=["word", "segment"],  # Optional; requires response_format="verbose_json". Use "word", "segment" (default), or both
      language="en",  # Optional
      temperature=0.0  # Optional
    )
    # To print only the transcription text, you'd use print(transcription.text) (here we're printing the entire transcription object to access timestamps)
    print(json.dumps(transcription, indent=2, default=str))

"Transcription(text=' I love building and shipping new features for our users', task='transcribe', language='English', duration=3.1, words=[{'word': 'I', 'start': 0, 'end': 0.16}, {'word': 'love', 'start': 0.16, 'end': 0.36}, {'word': 'building', 'start': 0.36, 'end': 0.68}, {'word': 'and', 'start': 0.68, 'end': 0.86}, {'word': 'shipping', 'start': 0.86, 'end': 1.1}, {'word': 'new', 'start': 1.1, 'end': 1.32}, {'word': 'features', 'start': 1.32, 'end': 1.72}, {'word': 'for', 'start': 1.72, 'end': 1.94}, {'word': 'our', 'start': 1.94, 'end': 2.12}, {'word': 'users', 'start': 2.12, 'end': 2.5}], segments=[{'id': 0, 'seek': 0, 'start': 0, 'end': 3.1, 'text': ' I love building and shipping new features for our users', 'tokens': [50365, 286, 959, 2390, 293, 14122, 777, 4122, 337, 527, 5022, 50520], 'temperature': 0, 'avg_logprob': -0.0911512, 'compression_ratio': 0.9322034, 'no_speech_prob': 4.125372e-12}], x_groq={'id': 'req_01jyd246a6eybt02bpdnxmw24z'})"
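
The word-level timestamps in the output above can be turned into readable lines. This is a minimal sketch that assumes each entry is a dict with `word`, `start`, and `end` keys, as the printed output suggests; `format_word_timestamps` is a hypothetical helper, and the sample data is copied from the first three entries above.

```python
def format_word_timestamps(words):
    """Format word entries as 'start-end  word' lines."""
    lines = []
    for entry in words:
        lines.append(f"{entry['start']:5.2f}-{entry['end']:5.2f}  {entry['word']}")
    return lines


# First three entries copied from the output above.
sample_words = [
    {"word": "I", "start": 0, "end": 0.16},
    {"word": "love", "start": 0.16, "end": 0.36},
    {"word": "building", "start": 0.36, "end": 0.68},
]

for line in format_word_timestamps(sample_words):
    print(line)
```

In the notebook, the list would presumably come from `transcription.words` rather than a hand-copied sample.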


# Images and Vision

---

Describes a given image. The image can be a remote URL or a local file.


In [12]:
from groq import Groq
import os

completion = client.chat.completions.create(
    model="meta-llama/llama-4-scout-17b-16e-instruct",  # also supports "meta-llama/llama-4-maverick-17b-128e-instruct"
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What's in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/f/f2/LPU-v1-die.jpg"
                    }
                }
            ]
        }
    ],
    temperature=1,
    max_completion_tokens=1024,
    top_p=1,
    stream=False,
    stop=None,
)

print(completion.choices[0].message)


ChatCompletionMessage(content='The image presents a detailed view of a microchip, showcasing its intricate design and layout. The microchip is composed of various components, including:\n\n* **Grid-like patterns**: The majority of the microchip features grid-like patterns, which are likely used for routing and connecting different parts of the chip.\n* **Rectangular sections**: Two large rectangular sections are visible on either side of the microchip, each comprising a grid of small squares. These sections may represent memory or processing units.\n* **Decorative borders**: The edges of the microchip feature decorative borders, adding an aesthetic touch to the design.\n* **Smaller components**: At the bottom of the microchip, several smaller components are visible, which could be used for power management, input/output operations, or other supporting functions.\n\nThe overall design of the microchip suggests a high level of complexity and sophistication, with multiple layers and compo

Local image: encode the file as base64 and pass it as a data URL.

In [None]:
from groq import Groq
import base64
import os

# Function to encode the image
def encode_image(image_path):
  with open(image_path, "rb") as image_file:
    return base64.b64encode(image_file.read()).decode('utf-8')

# Path to your image
image_path = "sf.jpg"

# Getting the base64 string
base64_image = encode_image(image_path)

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{base64_image}",
                    },
                },
            ],
        }
    ],
    model="meta-llama/llama-4-scout-17b-16e-instruct",
)

print(chat_completion.choices[0].message.content)
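
The `encode_image` approach above hard-codes `image/jpeg` in the data URL. A possible variant guesses the MIME type from the file extension, so the same helper works for other image types. This is a sketch: `image_to_data_url` is a hypothetical name, and which image formats the model accepts is not confirmed here.

```python
import base64
import mimetypes


def image_to_data_url(image_path):
    """Encode a local image file as a base64 data URL."""
    # Guess the MIME type from the file extension (e.g. .png -> image/png).
    mime_type, _ = mimetypes.guess_type(image_path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine image type for {image_path!r}")
    with open(image_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string can be used as the `"url"` value of the `image_url` entry in place of the f-string above.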

# Reasoning

---

Reasoning models excel at complex problem-solving tasks that require step-by-step analysis, logical deduction, structured thinking, and solution validation. With Groq's inference speed, these models can deliver the fast reasoning that real-time applications need.

In [15]:
from groq import Groq

completion = client.chat.completions.create(
    model="deepseek-r1-distill-llama-70b",  # alternatives: "qwen/qwen3-32b", "qwen-qwq-32b"
    messages=[
        {
            "role": "user",
            "content": "How many r's are in the word strawberry?"
        }
    ],
    temperature=0.6,
    max_completion_tokens=1024,
    top_p=0.95,
    stream=True,
    reasoning_format="raw"
)

for chunk in completion:
    print(chunk.choices[0].delta.content or "", end="")

<think>
Okay, let's see. The user is asking how many times the letter 'r' appears in the word "strawberry". Hmm, first, I need to make sure I have the correct spelling of the word. Let me write it out: S-T-R-A-W-B-E-R-R-Y. Wait, is that right? Let me check again. S-T-R-A-W-B-E-R-R-Y. Yeah, that's the correct spelling. So, breaking it down letter by letter.

Starting with the first letter: S. No 'r' there. Next is T, still no. Then R. That's the first 'r'. Then A, no. W, no. B, no. E, no. Then R, that's the second 'r'. And another R right after, so that's the third 'r'. Then Y. So in total, there are three 'r's in "strawberry". Wait, let me count again to be sure. S-T-R-A-W-B-E-R-R-Y. The 'r's are in the third position, then after the 'e', there are two 'r's. So positions 3, 8, and 9. That's three 'r's. Hmm, I think that's correct. But sometimes people might miss one if they're not careful. Let me go through each letter one by one again.

1. S
2. T
3. R (1)
4. A
5. W
6. B
7. E
8. R (2)
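
With `reasoning_format="raw"`, the output above starts with a `<think>` block. To separate the reasoning from the final answer, the streamed chunks can be joined into one string and split on the tags. This is a sketch that assumes the reasoning ends with a closing `</think>` tag (the output shown here is cut off before it); `split_reasoning` is a hypothetical helper and the sample string is illustrative.

```python
def split_reasoning(text):
    """Split raw output into (reasoning, answer) around <think> tags."""
    open_tag, close_tag = "<think>", "</think>"
    start = text.find(open_tag)
    if start == -1:
        # No reasoning block at all: everything is the answer.
        return "", text.strip()
    end = text.find(close_tag, start)
    if end == -1:
        # Reasoning was cut off (e.g. by the token limit) before the closing tag.
        return text[start + len(open_tag):].strip(), ""
    reasoning = text[start + len(open_tag):end].strip()
    answer = (text[:start] + text[end + len(close_tag):]).strip()
    return reasoning, answer


raw = "<think>\nS-T-R-A-W-B-E-R-R-Y has r at 3, 8, 9.\n</think>\nThere are 3 r's."
reasoning, answer = split_reasoning(raw)
print(answer)
```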


# Agentic Tooling

---

Agentic tooling solves problems by taking action and using external tools, starting with web search and code execution, alongside the Llama 4 models and the Llama 3.3 70B model.

---

Both compound systems (`compound-beta` and `compound-beta-mini`) support the following tools:



*   Web Search via [Tavily](https://www.tavily.com/)
*   Code Execution via [E2B](https://e2b.dev/) (only Python is currently supported)





In [16]:
from groq import Groq


completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "What is the current weather in Tokyo?",
        }
    ],
    # Change model to compound-beta to use agentic tooling
    # model: "llama-3.3-70b-versatile",
    model="compound-beta",
)

print(completion.choices[0].message.content)
# Print all tool calls
# print(completion.choices[0].message.executed_tools)


The current weather in Tokyo is partly cloudy with a temperature of 29°C (84.2°F), wind speed of 28.4 kph (17.7 mph) from the SSW, humidity of 66%, and pressure of 1008.0 mb (29.77 in).
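
The commented-out `executed_tools` line above can be wrapped so that it also works for models that run no tools. This is a minimal sketch: `list_executed_tools` is a hypothetical helper, the stand-in object is for illustration only, and the structure of each tool entry is not assumed here.

```python
from types import SimpleNamespace


def list_executed_tools(message):
    """Return the message's executed tools as a list (empty if none)."""
    tools = getattr(message, "executed_tools", None)
    return list(tools) if tools else []


# Stand-in object for illustration; in the notebook, pass
# completion.choices[0].message instead.
fake_message = SimpleNamespace(content="...", executed_tools=None)
print(len(list_executed_tools(fake_message)))
```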
