# Basic LLM Calls and RAG Workflow
## Quick Start Guide for Large Language Models (LLMs)

For practice, I use OpenAI's GPT-4o-mini model for text-based Q&A.

All environment variables are set in the `.env` file, which is not included in this repository for security reasons.
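
As a rough illustration of what loading a `.env` file involves, the sketch below parses `KEY=VALUE` lines using only the standard library. The helper name `parse_env_text` and the sample values are hypothetical placeholders; in practice the `python-dotenv` package's `load_dotenv()` does this job and handles more edge cases.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines into a dict, skipping blank lines and # comments.

    Hypothetical helper for illustration only; python-dotenv covers more cases.
    """
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Placeholder contents of a .env file (not real credentials)
sample_env = """
# OpenAI settings
OPENAI_API_KEY="your-api-key"
OPENAI_API_BASE=https://api.openai.com/v1
"""

config = parse_env_text(sample_env)
print(config["OPENAI_API_BASE"])
```

The variable names `OPENAI_API_KEY` and `OPENAI_API_BASE` match the ones read with `os.getenv` in the cells that follow.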

The basic LLM call is made using the `openai` Python package, which is installed via pip. 

```python
from openai import OpenAI

# Set your OpenAI API key and base URL
client = OpenAI(api_key="your-api-key", base_url="your-base-url")

message = "What is the capital of France?"
# This is the final prompt sent to the model
prompt = "You are a helpful assistant. Please answer the question: " + message

response = client.chat.completions.create(
    model="gpt-4o-mini",
    temperature=0.7,  # controls randomness in the response
    max_tokens=100,  # maximum number of tokens in the response
    top_p=1.0,  # nucleus sampling; 1.0 keeps the full token distribution
    messages=[
        {"role": "user", "content": prompt}
    ]
)
print(response.choices[0].message.content)
```

In [None]:
#Optional
!pip install openai
!pip install python-dotenv

In [None]:
import os

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()  # read the variables in the local .env file into the environment

# Set API key and base URL globally
api_key = os.getenv("OPENAI_API_KEY") # Use correct env var name from your .env or manually set it
print("OpenAI API Key:", api_key[:10] + "..." if api_key else "Not found")  # Only show first 10 chars for security
base_url = os.getenv("OPENAI_API_BASE", "https://api.openai.com/v1")  # Use correct env var name from your .env
print("OpenAI Base URL:", base_url)

# Initialize OpenAI client with API key and base URL
client = OpenAI(
    api_key=api_key,
    base_url=base_url
)


In [None]:
# Example usage of the OpenAI client
message = "What is the capital of France?"

# This is the final prompt sent to the model
prompt = "You are a helpful assistant. Please answer the question: " + message

response = client.chat.completions.create(
    model="gpt-4o-mini",
    temperature=0.7,  # controls randomness in the response
    max_tokens=100,  # maximum number of tokens in the response
    top_p=1.0,
    messages=[
        {"role": "user", "content": prompt}
    ]
)

print(response.choices[0].message.content)
# Prints the model's reply, which should mention "Paris" for this question

Multimodal LLMs can also answer questions about non-text data, such as images, audio, and video. The next cell sends an image URL to GPT-4o.

In [None]:
# Answer question from picture with 4o and image
image_url ='https://static.independent.co.uk/s3fs-public/thumbnails/image/2017/03/28/13/kitten.jpg?quality=75&width=1250&crop=3%3A2%2Csmart&auto=webp'

response_pic = client.chat.completions.create(
    model="gpt-4o",
    temperature=0.7,
    max_tokens=300,
    top_p=1.0,
    messages=[
        {"role": "user", 
         "content": [{"type": "text", "text": "What is this a picture of?"}, 
                     # image_url must be an object with a "url" field, not a bare string
                     {"type": "image_url", "image_url": {"url": image_url}}]
            }
        ]
)
print(response_pic.choices[0].message.content)
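
For an image stored locally rather than hosted at a URL, one option is to embed it as a base64 data URL. The sketch below only builds the message and makes no API call. The helper name `image_part_from_bytes` and the placeholder bytes are hypothetical, and it assumes the endpoint accepts data URLs in the `image_url` field, as OpenAI's vision-capable chat models do.

```python
import base64


def image_part_from_bytes(image_bytes, mime_type="image/jpeg"):
    """Build an image content part that embeds the image as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


# Placeholder bytes; in practice read them with open(path, "rb").read()
fake_image = b"not-a-real-image"
part = image_part_from_bytes(fake_image)

# Same message shape as a URL-based request, with the data URL in place of the link
user_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is this a picture of?"},
        part,
    ],
}
print(user_message["content"][1]["image_url"]["url"][:40])
```

The resulting `user_message` can be passed in the `messages` list of `client.chat.completions.create` in the same way as the URL-based example.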

## LLM Calls with Multiple Rounds
Multi-round calls are useful for more complex interactions where the model needs to maintain context over multiple exchanges.

Unlike the single-round call, a multi-round call needs a list of messages that represents the conversation history. The list alternates between the user and assistant roles, and the whole list is sent with every request.

```python
from openai import OpenAI

# Set your OpenAI API key and base URL
client = OpenAI(api_key="your-api-key", base_url="your-base-url")

def multi_round_chat(messages):
    """Send the full conversation history and return the assistant's reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0.7,  # controls randomness in the response
        max_tokens=100,  # maximum number of tokens in the response
        top_p=1.0,
        messages=messages
    )
    return response.choices[0].message.content

messages = []  # conversation history
while True:
    user_input = input("You: ")
    if user_input.strip().lower() in {"quit", "exit"}:  # type "quit" or "exit" to stop
        break
    messages.append({"role": "user", "content": user_input})
    response = multi_round_chat(messages)
    print("Assistant:", response)
    messages.append({"role": "assistant", "content": response})
```

In [None]:
import os

from dotenv import load_dotenv
from IPython.display import Markdown, display
from openai import OpenAI

load_dotenv()  # read the variables in the local .env file into the environment

api_key = os.getenv("OPENAI_API_KEY") # Use correct env var name from your .env or manually set it
print("OpenAI API Key:", api_key[:10] + "..." if api_key else "Not found")  # Only show first 10 chars for security
base_url = os.getenv("OPENAI_API_BASE", "https://api.openai.com/v1")  # Use correct env var name from your .env
print("OpenAI Base URL:", base_url)

# Initialize OpenAI client with API key and base URL
client = OpenAI(
    api_key=api_key,
    base_url=base_url
)

In [None]:
def multi_round_chat(messages):
    """Send the full conversation history and return the assistant's reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0.7,  # controls randomness in the response
        max_tokens=400,  # maximum number of tokens in the response
        top_p=1.0,
        messages=messages
    )
    return response.choices[0].message.content

messages = []  # conversation history
for _ in range(3):  # run three rounds
    user_input = input("You: ")
    print("You:", user_input)
    messages.append({"role": "user", "content": user_input})

    response = multi_round_chat(messages)
    print("Assistant:")
    display(Markdown(response))
    print("-" * 50)
    messages.append({"role": "assistant", "content": response})


The code above allows for a multi-round conversation with the LLM. The user enters a message, and the assistant responds based on the full conversation history. The notebook cell runs for three rounds, while the earlier snippet loops until the user stops it. Either version can be viewed as a simple chat interface to the LLM.
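
To make the history handling easy to check without calling the API, the sketch below separates the loop from its input and reply sources. The names `run_chat`, `fake_reply`, and the stop words are hypothetical choices for this illustration; the stand-in reply function simply counts user turns.

```python
def run_chat(get_user_input, get_reply, stop_words=("quit", "exit")):
    """Run a chat loop until the user enters a stop word; return the history."""
    messages = []
    while True:
        user_input = get_user_input()
        if user_input.strip().lower() in stop_words:
            break
        messages.append({"role": "user", "content": user_input})
        reply = get_reply(messages)  # the reply function sees the full history
        messages.append({"role": "assistant", "content": reply})
    return messages


# Offline demo: scripted inputs and a stand-in for the LLM call
scripted_inputs = iter(["Hello", "What did I just say?", "quit"])


def fake_reply(messages):
    user_turns = [m for m in messages if m["role"] == "user"]
    return f"You have sent {len(user_turns)} message(s)."


history = run_chat(lambda: next(scripted_inputs), fake_reply)
for m in history:
    print(f"{m['role']}: {m['content']}")
```

In the notebook, `lambda: input("You: ")` and `multi_round_chat` could be passed in place of the scripted inputs and `fake_reply`.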


## Structure the Response
A large amount of data is stored in structured formats such as JSON or XML. LLMs can take structured data as input and can also return structured data in the response.
```python
house_data = [
    {
        "address": "123 Main St",
        "price": 500000,
        "bedrooms": 3,
        "bathrooms": 2,
        "features": ["garage", "garden"]
    },
    {
        "address": "456 Elm St",
        "price": 600000,
        "bedrooms": 4,
        "bathrooms": 3,
        "features": ["pool", "fireplace"]
    }
]
```
This is a simple example of structured data that can be passed to an LLM. With f-strings, you can format each record into readable text that includes the variable data. For example:
```python
description = f"The house at {house_data[0]['address']} has {house_data[0]['bedrooms']} bedrooms and is priced at ${house_data[0]['price']}. It features a {', '.join(house_data[0]['features'])}."

print(description)
```
This will output:
```
The house at 123 Main St has 3 bedrooms and is priced at $500000. It features a garage, garden.
```
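
Going the other way, when the model is asked to reply in JSON, the reply text still has to be parsed before use. The sketch below is one cautious way to do that; the helper name `parse_json_reply` and the sample reply are made up for illustration, and the fence handling assumes the model may wrap its JSON in a Markdown code fence.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker


def parse_json_reply(reply_text):
    """Parse a reply expected to contain JSON; return None if it does not parse."""
    text = reply_text.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line (may carry a language tag)
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


# Made-up reply, shaped like what a model might return
sample_reply = FENCE + 'json\n{"address": "123 Main St", "price": 500000}\n' + FENCE
house = parse_json_reply(sample_reply)
print(house["price"])
print(parse_json_reply("Sorry, I cannot help with that."))
```

Returning `None` on a parse failure lets the caller decide whether to retry the request or fall back to showing the raw text.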


In [None]:
# Example data for a real estate listing
house_data = [
    {"address": "123 Main St, Springfield", "price": 500000, "bedrooms": 3, "bathrooms": 2, "features": ["garage", "garden"]},
    {"address": "456 Elm St, Springfield", "price": 600000, "bedrooms": 4, "bathrooms": 3, "features": ["pool", "fireplace"]},
    {"address": "789 Oak St, Springfield", "price": 450000, "bedrooms": 2, "bathrooms": 1, "features": ["fenced yard", "new roof"]},
    {"address": "101 Pine St, Springfield", "price": 700000, "bedrooms": 5, "bathrooms": 4, "features": ["home office", "basement"]},
    {"address": "202 Maple St, Springfield", "price": 550000, "bedrooms": 3, "bathrooms": 2, "features": ["deck", "modern kitchen"]},
]
# Accessing the data with simple print and formatting
print("Real Estate Listings:")
for house in house_data:
    print(f"Address: {house['address']}, Price: ${house['price']}, Bedrooms: {house['bedrooms']}, Bathrooms: {house['bathrooms']}, Features: {', '.join(house['features'])}")
    print("-" * 50)  # Separator for readability

# Improved formatting with descriptive text
def house_info(houses):
    layout = ''
    for house in houses:
        layout += f"House located at {house['address']} is priced at ${house['price']}. It has {house['bedrooms']} bedrooms and {house['bathrooms']} bathrooms. Notable features include: {', '.join(house['features'])}.\n"
        layout += "=*=" * 20 + "\n"  # Separator for readability
    return layout
print("Formatted Real Estate Listings:")
formatted_info = house_info(house_data)
print(formatted_info)

## Putting It All Together (LLM + Structured Data)
You can combine the LLM calls with structured data to create a more complex application. For example, you can use the LLM to generate a summary of a dataset or to answer questions about the data.

The workflow can be summarized as follows:
1. **Load the data**: Load the structured data from a file or database.
2. **Process the data**: Use Python to process the data and prepare it for the LLM.
3. **Call the LLM**: Use the processed data as input to the LLM, either as part of the prompt or as a separate message in a multi-round conversation.
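
The three steps above can be sketched offline as follows. This is a minimal illustration rather than the notebook's own implementation: the JSON string stands in for a file or database, and the helper names `format_houses` and `build_messages` are hypothetical. The last step stops at building the messages, so no API call is made.

```python
import json

# Step 1: load the data (a JSON string stands in for a file or database here)
raw = (
    '[{"address": "123 Main St", "price": 500000, "bedrooms": 3},'
    ' {"address": "456 Elm St", "price": 600000, "bedrooms": 4}]'
)
houses = json.loads(raw)


# Step 2: process the data into text the model can read
def format_houses(houses):
    return "\n".join(
        f"- {h['address']}: ${h['price']:,}, {h['bedrooms']} bedrooms" for h in houses
    )


# Step 3: assemble the messages that would be sent to the LLM
def build_messages(houses, user_query):
    system_text = (
        "You are a real estate assistant. Answer using only these listings:\n"
        + format_houses(houses)
    )
    return [
        {"role": "system", "content": system_text},
        {"role": "user", "content": user_query},
    ]


messages = build_messages(houses, "Which house has four bedrooms?")
print(messages[0]["content"])
```

Placing the data in a system message keeps it separate from the user's question, which makes it easier to reuse the same data across several rounds.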

### Diagram
```mermaid
flowchart TD
    A[Load Data] --> B[Process Data]        
    B --> C[Call LLM]
    C --> D[Receive Response]
    D --> E[Display Result]
    E --> F[User Input]
    F --> C
```

In [17]:
# Initializing
import os

from dotenv import load_dotenv
from IPython.display import Markdown, display
from openai import OpenAI

load_dotenv()  # read the variables in the local .env file into the environment

api_key = os.getenv("OPENAI_API_KEY") # Use correct env var name from your .env or manually set it
print("OpenAI API Key:", api_key[:10] + "..." if api_key else "Not found")  # Only show first 10 chars for security
base_url = os.getenv("OPENAI_API_BASE", "https://api.openai.com/v1")  # Use correct env var name from your .env
print("OpenAI Base URL:", base_url)

# Initialize OpenAI client with API key and base URL
client = OpenAI(
    api_key=api_key,
    base_url=base_url
)

OpenAI API Key: sk-nzH4YCa...
OpenAI Base URL: https://xiaoai.plus/v1


In [None]:
# Load data
house_data = [
    {"address": "123 Main St, Springfield", "price": 500000, "bedrooms": 3, "bathrooms": 2, "features": ["garage", "garden"]},
    {"address": "456 Elm St, Springfield", "price": 600000, "bedrooms": 4, "bathrooms": 3, "features": ["pool", "fireplace"]},
    {"address": "789 Oak St, Springfield", "price": 450000, "bedrooms": 2, "bathrooms": 1, "features": ["fenced yard", "new roof"]},
    {"address": "101 Pine St, Springfield", "price": 700000, "bedrooms": 5, "bathrooms": 4, "features": ["home office", "basement"]},
    {"address": "202 Maple St, Springfield", "price": 550000, "bedrooms": 3, "bathrooms": 2, "features": ["deck", "modern kitchen"]},
]
def house_info(houses):
    layout = ''
    for house in houses:
        layout += f"House located at {house['address']} is priced at ${house['price']}. It has {house['bedrooms']} bedrooms and {house['bathrooms']} bathrooms. Notable features include: {', '.join(house['features'])}.\n"
        layout += "=*=" * 20 + "\n"  # Separator for readability
    return layout

# Build the prompt that combines the house data with the user's query
def sys_prompt(house_data, user_query):
    prompt = "You are a real estate assistant. Use the following house information to answer the user's queries:\n"
    prompt += house_info(house_data)
    prompt += "\nNow, answer the user's query based on the provided house information. If you don't know the answer, say 'I don't know'. \n"
    prompt += f"User Query: {user_query}\n"
    return prompt


# Call the LLM with the assembled prompt
def generate_llm_response(prompt, api_key=api_key, base_url=base_url):
    client = OpenAI(api_key=api_key, base_url=base_url)
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0.7,
        max_tokens=500,
        top_p=1.0,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

user_input = input("You: ")
print("You:", user_input)

# Generate response using the LLM with data
response = generate_llm_response(sys_prompt(house_data, user_input), api_key=api_key, base_url=base_url)
print("Assistant:")
display(Markdown(response))



You: any house in china
Assistant:


I don't know.