# DECODER-ONLY MODELS

In [2]:
pip install groq

In [2]:
import os
from groq import Groq

In [3]:
from dotenv import load_dotenv

load_dotenv(override=True)  # Load variables from the .env file, overriding any already set in the environment

# The variables are now available through os.getenv()
groq_api_key = os.getenv("GROQ_API_KEY")
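
`os.getenv` returns `None` when the variable is missing, so a missing key would only surface later as an authentication error. A minimal sketch of an early check follows; the helper name `require_env` is hypothetical and is not part of `groq` or `python-dotenv`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file or environment")
    return value
```

It could be used in place of the plain lookup, for example `groq_api_key = require_env("GROQ_API_KEY")`.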

In [4]:
# Instantiate the Groq client with the key loaded above
client = Groq(
    api_key=groq_api_key,
)

In [5]:
# Define the messages for the chat completion
messages = [
    {
        "role": "user",
        "content": "Explain in 5 lines encoder-only models.",
    },
]

# Create a completion request with the Qwen model
completion = client.chat.completions.create(
    model="qwen/qwen3-32b",
    messages=messages
)

# Print the model's response
print(completion.choices[0].message.content)

<think>
Okay, I need to explain what encoder-only models are in five lines. Let's start by recalling what I know. Encoder-only models are types of neural networks used in NLP tasks. They focus on processing input data, like text, and creating representations that capture context. Unlike encoder-decoder models (like in transformers), they don't have a separate decoding part. Common examples are BERT and its variants. They're used for tasks like text classification or sentiment analysis. Hmm, I should make sure to mention the transformer architecture since they're based on that. Also, the self-attention mechanism is key here. I need to keep it concise, each line should be a separate point. Let me check the line count. Wait, five lines? Let me structure each line to cover the main aspects: definition, based on transformer encoders, self-attention, training with masks, and examples. Need to make sure each line is clear and flows logically. Maybe start with definition, then components, how 
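
The response above opens with a `<think>` block, where the Qwen model writes out its reasoning before the answer. A minimal sketch for separating the two, assuming the reasoning is always wrapped in a `<think>...</think>` pair as in this output; `split_reasoning` is a hypothetical helper, not part of the `groq` package.

```python
import re

# Non-greedy match so only the first <think>...</think> pair is captured
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Split a response into (reasoning, answer).

    If no complete <think>...</think> pair is found, the reasoning is empty
    and the whole text is returned as the answer.
    """
    match = THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

For example, `reasoning, answer = split_reasoning(completion.choices[0].message.content)` would leave only the final five lines in `answer`.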

In [6]:
# Create a completion request with the GPT-OSS model
completion_gpt_oss = client.chat.completions.create(
    model="openai/gpt-oss-120b",
    messages=messages
)

In [7]:
# Print the model's response
print(completion_gpt_oss.choices[0].message.content)

- Encoder‑only models consist solely of a stack of transformer encoders, without any decoder or generative head.  
- They process the entire input sequence simultaneously, allowing each token to attend to all others via self‑attention.  
- Pre‑training objectives (e.g., masked language modeling) teach the model to infer missing tokens from surrounding context.  
- At inference, the learned contextual embeddings are used for downstream tasks such as classification, tagging, or retrieval.  
- Examples include BERT, RoBERTa, ALBERT, and ELECTRA, which excel at understanding‑oriented NLP problems.
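
The two requests above differ only in the `model` argument, so the comparison can be wrapped in a loop. A sketch, assuming a `client` that exposes `chat.completions.create` as used earlier in this notebook; `ask_models` is a hypothetical helper name.

```python
from typing import Any


def ask_models(client: Any, models: list[str], messages: list[dict]) -> dict[str, str]:
    """Send the same messages to each model and return {model name: response text}."""
    answers = {}
    for model in models:
        completion = client.chat.completions.create(model=model, messages=messages)
        answers[model] = completion.choices[0].message.content
    return answers
```

With the client and messages defined earlier, `ask_models(client, ["qwen/qwen3-32b", "openai/gpt-oss-120b"], messages)` would return both answers keyed by model name.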
