### Installing and checking Python packages

If have not installed packages you can uncomment the cell below to install packages.
This is only needed once and after installation you can comment them again to stop it from running. 

In [1]:
# !pip install numpy pandas matplotlib openai requests pypdf


Checking installed package:

In [2]:
!pip list | findstr "numpy pandas matplotlib openai requests pypdf"


matplotlib              3.10.7
matplotlib-inline       0.1.7
numpy                   2.3.3
openai                  2.6.0
pandas                  2.3.3
pypdf                   6.1.3
requests                2.32.5


# Introduction to Large Language Models (LLMs) for Research

This notebook is the first module of a workshop on using Large Language Models (LLMs) in research workflows. It focuses on:

- Setting up access to an LLM via the **Groq** API using the **OpenAI-compatible** Python client.
- Running your **first query** and understanding the request/response structure.
- Exploring the notion of **temperature** (randomness) in generation.
- Creating a **continuous conversation** by keeping your own chat history.

We adopt a **text-only** approach (no image/graph understanding) to ensure minimal setup and maximum reproducibility.

By the end of this notebook, you will be able to:
1. Configure a connection to the Groq API.
2. Send chat messages with the OpenAI-compatible client.
3. Control generation behaviour via `temperature`.
4. Maintain and reuse chat history for a continuing conversation.

## Background (very briefly)
An LLM is a probabilistic model over text. Given a sequence of tokens $x_{1:t}$, it assigns probabilities to the next token $x_{t+1}$. At inference, models sample from a distribution such as
$$p(x_{t+1}=i\mid x_{1:t}) = \mathrm{softmax}\!\left(\frac{z_i}{T}\right),$$
where $z_i$ is the logit for token $i$ and $T>0$ is the **temperature**. Lower $T$ concentrates probability mass on high-logit tokens (more deterministic), while higher $T$ spreads it out (more diverse). We will **demonstrate** this behaviour below.


## 1) Environment Setup
This section prepares the Python environment. On QCIF's HPC JupyterLab image, the required package (`openai`) should already be installed. If you're running elsewhere and encounter an `ImportError`, uncomment the `%pip install` line.

**What this cell does:**
- (Optionally) installs the OpenAI Python client.
- Imports the required modules.
- Does **not** make any external calls yet.


In [3]:
# If running outside the provided environment, uncomment the next line:
# %pip install openai
import os
from openai import OpenAI


## 2) Configure API Connection (Groq)
LLM APIs are **stateless** web services. We'll configure a client that speaks the OpenAI-compatible protocol, pointing it to Groq's base URL.

**What you'll do in this cell:**
1. Paste your Groq API key (created at <https://console.groq.com>).
2. Set the base URL for Groq's OpenAI-compatible endpoint.
3. Instantiate the client.

**Notes:**
- Keep your API key private. In shared workshops, you can paste it, run this cell, and then clear the visible text.
- You can also store keys in environment variables or use a `.env` file if preferred.


In [None]:
# Paste your Groq API key below (between quotes). 
# Note this is not a secure way of entering API key because it is visible to everyone that sees your notebook.  
os.environ["GROQ_API_KEY"] = ""  # <-- replace with your key.
model = "llama-3.3-70b-versatile" # Select your model https://console.groq.com/docs/models
os.environ["BASE_URL"] = "https://api.groq.com/openai/v1" # Groq uses an OpenAI-compatible API surface; we just change the base URL.


AttributeError: module 'os' has no attribute 'enviro'

In [None]:
# Create the client
client = OpenAI(
    api_key=os.environ["GROQ_API_KEY"],
    base_url=os.environ["BASE_URL"],
)
print("Groq client initialized!")


Groq client initialized!


## 3) First LLM Call — "Hello LLM" (Llama-3 70B)
Here we send a **single-turn** prompt with minimal scaffolding. The API expects a list of `messages`, where each message has a `role` and `content`.

**Roles:**
- `system`: high-level instructions (tone, persona, formatting).
- `user`: your question or instruction.
- `assistant`: the model's reply (the API returns this).

**What this cell does:**
- Creates a tiny conversation with `system` and `user` messages.
- Calls the model `llama3-70b-8192` for higher-quality outputs compared to 8B.
- Prints the model's reply.

You can modify the user content and re-run to see different responses.


In [None]:
messages = [
    {"role": "system", "content": "You are a helpful research assistant. Be concise."},
    {"role": "user", "content": "Explain what a Large Language Model is in two sentences."}
]

response = client.chat.completions.create(
    model=model,
    messages=messages,
    temperature=0 # lower temperature -> more deterministic
)

print(response.choices[0].message.content)


A Large Language Model (LLM) is a type of artificial intelligence (AI) designed to process and understand human language, generating human-like text based on the input it receives. LLMs are trained on vast amounts of text data, allowing them to learn patterns and relationships in language, and can be used for tasks such as language translation, text summarization, and conversation generation.


### Examining the response object

In [None]:
for str in response:
    print(str)

('id', 'chatcmpl-60e9c394-acc3-4abb-b0e8-a0f03fdf247a')
('choices', [Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='A Large Language Model (LLM) is a type of artificial intelligence (AI) designed to process and understand human language, generating human-like text based on the input it receives. LLMs are trained on vast amounts of text data, allowing them to learn patterns and relationships in language, and can be used for tasks such as language translation, text summarization, and conversation generation.', refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=None))])
('created', 1761253879)
('model', 'llama-3.3-70b-versatile')
('object', 'chat.completion')
('service_tier', 'on_demand')
('system_fingerprint', 'fp_4cfc2deea6')
('usage', CompletionUsage(completion_tokens=78, prompt_tokens=57, total_tokens=135, completion_tokens_details=None, prompt_tokens_details=None, queue_time=0.17950201, prompt_time

In [None]:
response.choices[0]

Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='A Large Language Model (LLM) is a type of artificial intelligence (AI) designed to process and understand human language, generating human-like text based on the input it receives. LLMs are trained on vast amounts of text data, allowing them to learn patterns and relationships in language, and can be used for tasks such as language translation, text summarization, and conversation generation.', refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=None))

## 4) Understanding `temperature`
The **temperature** parameter adjusts the **randomness** of token sampling. Intuitively, the model produces a probability distribution over possible next tokens from its logits $z$. The temperature rescales those logits:

$$p_i = \mathrm{softmax}\!\left(\frac{z_i}{T}\right) = \frac{\exp(z_i/T)}{\sum_j \exp(z_j/T)}.$$

- Lower $T$ (e.g. $T=0.2$): the distribution is *sharper* around high-probability tokens, yielding more **stable** outputs.
- Higher $T$ (e.g. $T=0.8$): the distribution is *flatter*, encouraging **diversity** and sometimes creativity.

**What this cell does:**
- Sends the *same* prompt twice, once with `temperature=0.2` and once with `temperature=0.8`.
- Prints both answers so you can compare tone and variability.


In [None]:
prompt = "Describe the role of LLMs in academic research in one sentence."
for temp in [0.2, 0.8]:
    print("Temperature:", temp)
    for i in range(5):
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            temperature=0 # lower temperature -> more deterministic
        )
        print(response.choices[0].message.content)


Temperature: 0.2
A Large Language Model (LLM) is a type of artificial intelligence designed to process and understand human language, using complex algorithms to learn patterns and relationships within vast amounts of text data. By training on massive datasets, LLMs can generate coherent and contextually relevant text, answer questions, and even engage in conversation, mimicking human-like language abilities.
A Large Language Model (LLM) is a type of artificial intelligence (AI) designed to process and understand human language, generating human-like text based on the input it receives. LLMs are trained on vast amounts of text data, allowing them to learn patterns and relationships in language, and can be used for tasks such as language translation, text summarization, and conversation generation.
A Large Language Model (LLM) is a type of artificial intelligence designed to process and understand human language, using complex algorithms to learn patterns and relationships within vast a

## 5) Continuous Conversation (Keeping History)
LLM APIs do **not** keep state between calls. To build a conversation, you keep a list of messages and send the *entire* recent history each time. We'll implement a small helper that:

- Appends the user's message to a global `chat_history` list.
- Calls the model with that history.
- Appends the assistant's reply back into the history.
- Returns the latest reply for display.

We also keep the temperature low for focused answers. For longer chats, you can cap history to the last *k* turns to control token usage.


In [None]:
chat_history = []

def chat(user_input, temperature=0.2, max_turns=8):
    """Send one user turn and get a reply, preserving context.
    - Keeps system prompt + last `max_turns` user/assistant messages.
    """
    chat_history.append({"role": "user", "content": user_input})

    # Keep only the most recent `max_turns` pairs to control context size
    system = chat_history[:1]
    recent = chat_history[-(max_turns*2):] if len(chat_history) > 1 else []
    window = system + recent

    resp = client.chat.completions.create(
        model=model,
        messages=window,
        temperature=temperature,
    )
    reply = resp.choices[0].message.content
    chat_history.append({"role": "assistant", "content": reply})
    return reply


In [None]:
chat_history = [
    {"role": "system", "content": "Tailor your answers for a bioinformatician."}
]

chat("What is logistic regression?")
chat("How does it differ from linear regression?")

for ch in chat_history:
    print(ch)

{'role': 'system', 'content': 'Tailor your answers for a bioinformatician.'}
{'role': 'user', 'content': 'What is logistic regression?'}
{'role': 'user', 'content': 'How does it differ from linear regression?'}


In [None]:
chat("what are some other classification methods?")

for ch in chat_history:
    print(ch)

{'role': 'system', 'content': 'Tailor your answers for a bioinformatician.'}
{'role': 'user', 'content': 'What is logistic regression?'}
{'role': 'user', 'content': 'How does it differ from linear regression?'}
{'role': 'user', 'content': 'what are some other classification methods?'}


## 8) Exercise — Your First Prompt
Try your own research-related prompts. A few ideas:

1. Summarise your current project in **one paragraph**.
2. Ask for **three open research questions** in your field.
3. Request a **draft methods paragraph** describing your dataset and analysis steps.

Remember you can tweak `temperature` to trade off consistency vs creativity.


In [None]:
chat_history = [
    {"role": "system", "content": "Tailor your answers for a bioinformatician."}
]

In [None]:
# Example: replace with your own question(s)
print(chat("Summarise the challenges in renewable energy policy research."))


As a bioinformatician, you're likely familiar with complex systems and data-driven approaches. Renewable energy policy research presents several challenges that can be broken down into the following categories:

1. **Integration and Interoperability**: Renewable energy sources, such as solar and wind power, have variable output, making it challenging to integrate them into existing energy grids. This requires advanced forecasting, grid management, and energy storage systems.
2. **Data Quality and Availability**: High-quality, granular data on energy production, consumption, and grid operations is essential for informed policy decisions. However, data gaps, inconsistencies, and lack of standardization hinder research and policy development.
3. **Complexity and Uncertainty**: Renewable energy systems involve complex interactions between technological, economic, social, and environmental factors. Uncertainties, such as climate change and policy fluctuations, make it difficult to predict o

## 9) Wrap-Up & Next Steps
- You configured an OpenAI-compatible client to talk to **Groq**.
- You sent your first prompts using **Llama-3 70B** and explored the impact of `temperature`.
- You kept conversation state locally in a Python list and learned how to save it.

In the **next notebook**, we'll connect to a scholarly API to fetch abstracts and practice **literature summarisation** and **structured extraction**.

**Key terms:** tokens, temperature, logits, softmax, stateless API, chat history.
