# Overview of the Artifact Processing Code

This code is designed to help you process multiple text artifacts using OpenAI’s API. It includes an example experiment that summarizes `.txt` files, handling errors and retries automatically. 

## What You Should Do

1. **Study the Code**: Read through the code to understand how it works, focusing on:
   - How files are added to a queue for processing.
   - How the example function (`summarize_future_history`) applies the experiment.
   - How errors and retries are handled.

2. **Run the Code as Is**: Before modifying, ensure the code works in your environment:
   - Place `.txt` files in the folder specified in the script (`data/future` by default).
   - Run the script and confirm the output matches the description.

3. **Modify for Your Experiments**:
   - Replace the example function with one tailored to your experiment.
   - Adapt the prompts, file types, or output format as needed.

## Support Resources

You are not alone! If you have questions or need help:
- **ChatGPT**: Ask for assistance with adapting or debugging your code.
- **Your Classmates**: Collaborate with teammates and even across teams to brainstorm and troubleshoot.
- **Team 6**: Reach out to the coding support team for technical help.
- **Your Instructor (Prof. Allen)**: I’m here to guide you and answer your questions.

Take it step by step, and remember: experimenting is part of the process. You've got this!


## Reading and Validating Your OpenAI API Key

This example script demonstrates how to securely read and validate your OpenAI API key from an external file. It is important to follow these instructions carefully to ensure that your key is correctly formatted and can be used with the OpenAI API.

**Steps to Use This Code**:

1. **Create a Text File for Your Key**:
   - Create a `.txt` file named `Team 00 API Key.txt`. Replace `00` with your team number (e.g., `Team 01 API Key.txt`).
   - If your file is stored in a subfolder (e.g., `keys/`), adjust the file path in the code accordingly.

2. **File Format**:
   - The first line of the file should be a comment or identifier (e.g., `team-00-key`).
   - The second line must contain your actual API key, which should begin with `sk-svcacct-`.
   - Example file content:
     ```
     team-00-key
     sk-svcacct-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
     ```

3. **How the Script Works**:
   - The script checks if the file exists at the specified path.
   - It reads the file and validates that:
     - The file contains at least two lines.
     - The key on the second line starts with the required prefix `sk-svcacct-`.

4. **Example Usage**:
   - Replace the file path in the script with your actual file path if different.
   - Run the script in a Python environment. If successful, the first and last characters of the key will be printed for confirmation. Example output:
     ```
     Validated API key: sk-svc...XXXXX
     ```

5. **Handling Errors**:
   - If the file is missing, incorrectly formatted, or the key is invalid, an error message will explain the issue:
     - `FileNotFoundError`: The file does not exist at the specified path.
     - `ValueError`: The file format is invalid, or the key does not start with `sk-svcacct-`.

6. **Security Best Practices**:
   - Never share your API key publicly or include it in source code that others can access.
   - Use this script to validate and load the key securely.

By following these instructions, you can ensure your OpenAI API key is properly loaded and ready for use in your final project.
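
As a small illustration of the "print only part of the key" idea described above, here is a minimal sketch of a masking helper. The name `mask_key` and its default lengths are assumptions made for this example, not part of the provided script, and the key shown is a placeholder.

```python
def mask_key(key, visible_start=8, visible_end=6):
    """Return a shortened form of `key` that is safe to print."""
    if len(key) <= visible_start + visible_end:
        # Too short to mask meaningfully, so hide every character
        return "*" * len(key)
    return f"{key[:visible_start]}...{key[-visible_end:]}"

# Placeholder key for illustration only (not a real key)
print(mask_key("sk-svcacct-EXAMPLEEXAMPLE123456"))
```

Printing a masked key lets you confirm that the right file was loaded without exposing the full secret in a notebook output that you might later share.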


In [9]:
import os

api_key_file_path = "Team 06 API Key.txt"  # Replace 06 with your team number
# api_key_file_path = "keys/Team 06 API Key.txt"  # Use this form instead if your key file is in a subfolder

def read_and_validate_api_key(file_path):
    """
    Reads and validates the OpenAI API key from the given file.
    
    Args:
        file_path (str): Path to the file containing the API key.
    
    Returns:
        str: The validated API key.
    
    Raises:
        ValueError: If the file format is invalid or the key doesn't meet the specifications.
        FileNotFoundError: If the file does not exist.
    """
    # Check if the file exists
    if not os.path.exists(file_path):
        raise FileNotFoundError(f"API key file '{file_path}' not found.")
    
    # Read the file
    with open(file_path, "r", encoding="utf-8") as f:
        lines = f.readlines()
    
    # Validate the file format
    if len(lines) < 2:
        raise ValueError(f"API key file '{file_path}' must have at least two lines.")
    
    # Skip the first line (e.g., "team-00-key") and validate the second line
    raw_key = lines[1].strip()  # Second line is the key
    if not raw_key.startswith("sk-svcacct-"):
        raise ValueError(f"Invalid API key format in file '{file_path}'. Key must start with 'sk-svcacct-'.")
    
    # Return the validated key
    return raw_key

# Example usage:
try:
    api_key = read_and_validate_api_key(api_key_file_path)
    print(f"Validated API key: {api_key[:8]}...{api_key[-6:]}")  # Output only part of the key for confirmation
except Exception as e:
    print(f"Error: {e}")


Validated API key: sk-svcac...XXXXXX


## Installing the OpenAI Package and Initializing the OpenAI Client

This guide explains how to install the latest version of the OpenAI package, initialize the OpenAI client, and check the package version. It is essential to use the **latest version** of the package to ensure compatibility with your project.

**Installing or Updating the OpenAI Package**:

To ensure you have the correct version of the OpenAI package, follow these steps:

- Install the OpenAI package if you don't have it:
  ```bash
  pip install openai
  ```

- Update to the latest version if you already have an older version installed:
  ```bash
  pip install --upgrade openai
  ```

**Code Explanation**:

The Python code in the code cell below sets up your connection to the OpenAI API and prints the installed package version so you can confirm it is up to date.

1. **Importing the OpenAI Object**:
   - `from openai import OpenAI` imports the **OpenAI object**, which is required for interacting with OpenAI's API.

2. **Creating the OpenAI Client**:
   - `client = OpenAI(api_key=api_key)` creates the client that allows your program to communicate with OpenAI's API using your validated API key.

3. **Printing the Package Version**:
   - The `openai_version_module.__version__` attribute retrieves the current version of the OpenAI package installed in your environment.
   - This is useful for ensuring that you are using the latest version. If not, run `pip install --upgrade openai` to update.

By following these steps and using the latest version of the OpenAI package, you can correctly set up the client and ensure compatibility with the new API.
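
If you want to check the installed version before importing the package, the sketch below uses the standard library's `importlib.metadata`. The helper names `installed_version` and `version_tuple` are assumptions for this example, and the `(1, 0, 0)` threshold reflects the release that introduced the `OpenAI` client class; adjust it if your project requires something newer.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None

def version_tuple(version_string):
    """Turn '1.59.9' into (1, 59, 9); stops at the first non-numeric part."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)

found = installed_version("openai")
if found is None:
    print("openai is not installed; run: pip install openai")
elif version_tuple(found) < (1, 0, 0):
    print(f"openai {found} is too old for the client interface; run: pip install --upgrade openai")
else:
    print(f"openai {found} is installed")
```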


In [12]:
from openai import OpenAI

# Initialize the OpenAI object using the validated API key
client = OpenAI(api_key=api_key)

# Print the OpenAI package version
import openai as openai_version_module
print(f"OpenAI package version: {openai_version_module.__version__}")

OpenAI package version: 1.59.9


## Test the API

This code is provided as a test case. It uses the OpenAI client to generate a Python program that prints "Hello world". Here's a quick breakdown:

1. **Function Purpose**:
   - `generate_hello_world(client)` sends a request to the OpenAI API asking it to generate Python code that displays "Hello world".

2. **How It Works**:
   - The `client.chat.completions.create` method sends the prompt to the GPT-4o-mini model.
   - The `messages` argument includes a system role ("You are a helpful assistant.") and a user query asking for the program.

3. **Return Value**:
   - The function extracts and returns the generated code from the API's response.

4. **Output**:
   - The generated code is printed to the console.

If the API request fails, an error is raised with a descriptive message. If it succeeds, it should output something like this:

```python
print("Hello world")
```
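
The model often wraps its answer in a Markdown code fence, as in the sample output above. If you need only the code itself, a helper along these lines can remove the fence. This is a minimal sketch: `strip_code_fences` is a hypothetical name, and it handles only a single fence that surrounds the whole reply.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example easy to read

def strip_code_fences(text):
    """Remove one surrounding Markdown code fence, if present."""
    lines = text.strip().splitlines()
    if len(lines) >= 2 and lines[0].startswith(FENCE) and lines[-1].strip() == FENCE:
        lines = lines[1:-1]  # drop the opening and closing fence lines
    return "\n".join(lines)

# Simulated reply in the same shape as the sample output
reply = FENCE + 'python\nprint("Hello world")\n' + FENCE
print(strip_code_fences(reply))
```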


In [15]:
def generate_hello_world(client):
    """
    Generate a Python program that prints 'Hello, World' using OpenAI API.
    
    Args:
        client: The initialized OpenAI API client.
        
    Returns:
        str: The Python program generated by the API.
    """
    try:
        # Send a query to the OpenAI API
        completion = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                { "role": "system", "content": "You are a helpful assistant." },
                { "role": "user", "content": "Write a program in Python that displays 'Hello world'. "
                 "Only output the code. Do not output any commentary." },
            ]
        )
        # Extract the generated program
        return completion.choices[0].message.content
    except Exception as e:
        raise RuntimeError(f"Error connecting to OpenAI API: {e}")

content = generate_hello_world(client)
print(content)


```python
print("Hello world")
```


## Function to Send a Query to OpenAI's API

This code defines a generic function, `generate_response`, to simplify sending queries to OpenAI's API. Here's a brief explanation:

**Function Purpose**:
- `generate_response(client, system_role, user_prompt, model="gpt-4o-mini")` generates a response from the OpenAI API based on the specified system role and user prompt.

**How It Works**:
1. Arguments:
   - `client`: The OpenAI API client used to send requests.
   - `system_role`: A description of the AI's role or context. This helps set the "personality" or expertise of the AI. For example:
     - You are a helpful assistant.
     - You are an expert in 18th-century literature.
   - `user_prompt`: The specific query or instruction you want the AI to address. For example:
     - Write a poem about the moon.
     - Explain the key events of the American Revolution in one paragraph.
   - `model`: The AI model to use, with the default being "gpt-4o-mini". This parameter can remain as is unless otherwise instructed.

2. API Request:
   - Combines the `system_role` and `user_prompt` into a query and sends it to the OpenAI API.

3. Response Handling:
   - Extracts and returns the response content generated by the AI.

**Example Usage**:
The example in the code cell below asks about the legacy of Tom Brady:
- System Role: You are a knowledgeable, helpful assistant specializing in sports information.
- User Prompt: Tell me about the legacy of Tom Brady.

The AI's response will include a fact aligned with the role and query provided.

**Error Handling**:
If there is an issue (e.g., network error or invalid API key), a descriptive error will be raised.

This function is flexible and reusable for various queries, making it easier to interact with OpenAI's API.

**Understanding Prompts**:

1. System Prompt (System Role):
   - Sets the tone, expertise, or behavior of the AI. For example:
     - To make the AI polite and helpful: You are a helpful assistant.
     - To make the AI an expert on a topic: You are an expert in quantum mechanics.
   - This is like providing the AI with a "job description."

2. User Prompt:
   - Specifies the exact question or task you want the AI to handle. For example:
     - Explain how solar panels generate electricity.
     - Summarize the book 'Pride and Prejudice' in one sentence.

The system and user prompts work together to guide the AI's response.
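
To make the relationship concrete, here is a minimal sketch of the `messages` list that the two prompts become before they are sent to the API. The helper name `build_messages` is an assumption for illustration; the notebook's functions build the same structure inline.

```python
def build_messages(system_role, user_prompt):
    """Return the two-message list used for a single-turn chat request."""
    return [
        {"role": "system", "content": system_role},  # the "job description"
        {"role": "user", "content": user_prompt},    # the specific task
    ]

messages = build_messages(
    "You are an expert in quantum mechanics.",
    "Explain how solar panels generate electricity.",
)
for message in messages:
    print(f"{message['role']}: {message['content']}")
```

Keeping the system prompt fixed while varying the user prompt is a simple way to run the same experiment over many inputs.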


In [18]:
def generate_response(client, system_role, user_prompt, model="gpt-4o-mini"):
    """
    Generate a response using OpenAI API based on the provided system role and user prompt.
    
    Args:
        client: The initialized OpenAI API client.
        system_role (str): The system's role or context, e.g., "You are a helpful assistant."
        user_prompt (str): The user's query or instruction.
        model (str): The name of the OpenAI model to use. Default is "gpt-4o-mini".
        
    Returns:
        str: The response generated by the OpenAI API.
    """
    try:
        # Send the query to the OpenAI API
        completion = client.chat.completions.create(
            model=model,
            messages=[
                { "role": "system", "content": system_role },
                { "role": "user", "content": user_prompt },
            ]
        )
        # Extract and return the response content
        return completion.choices[0].message.content
    except Exception as e:
        raise RuntimeError(f"Error connecting to OpenAI API: {e}")

# Example: Ask about Tom Brady's legacy
# You can replace the prompt here with anything you like!
try:
    system_role = "You are a knowledgeable, helpful assistant specializing in sports information."
    user_prompt = "Tell me about the legacy of Tom Brady."
    
    response = generate_response(client, system_role, user_prompt)
    print("Response:")
    print(response)
except Exception as error:
    print(f"Error: {error}")


Response:
Tom Brady is widely regarded as one of the greatest quarterbacks in the history of the NFL, and his legacy is marked by numerous achievements, records, and contributions to the sport. Here are some key aspects of his legacy:

### Championships and Records
1. **Super Bowl Success**: Brady won seven Super Bowl titles (XXVI, XXXVI, XXXVIII, XLIX, LI, LIII with the New England Patriots, and LV with the Tampa Bay Buccaneers), more than any other player in NFL history.
2. **Super Bowl MVPs**: He was named Super Bowl MVP five times, showcasing his ability to perform in critical moments.
3. **Career Statistics**: Brady retired as the all-time leader in several key statistical categories, including passing touchdowns (624), passing yards (89,214), and career wins (251 as a starting quarterback).

### Longevity and Performance
1. **Longevity**: Brady’s career spanned 23 seasons, from 2000 to 2022, and he played at an elite level well into his 40s, which is unprecedented for a quarterba

### Multiline Strings and String Concatenation in Python

**Multiline Strings**:
In Python, you can create strings that span multiple lines using triple quotes (`"""` or `'''`):

```python
"""This is a string
that spans multiple lines.
It is useful for long messages or formatted text."""
```

**String Concatenation**:
Strings can be combined (concatenated) using the `+` operator:

```python
part1 = "Hello"
part2 = "World"
combined = part1 + " " + part2  # Result: "Hello World"
```

**Best Practice for Long User Prompts**:
Use concatenation to dynamically build strings:

```python
user_shape = 'circle'
user_prompt = "Write a Python program " + \
              "that calculates the area of a " + user_shape + "."
```

**When to Use Each**:

- **Multiline strings** are best for **long, fixed messages.**
- **Concatenation** is helpful when parts of the string are **generated dynamically or come from variables.**

By understanding these concepts, you can effectively create and customize prompts for the AI.
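
The two approaches can also be combined. The sketch below assumes you want a multi-line prompt with one variable part: it uses `textwrap.dedent` to remove the indentation that a triple-quoted string picks up inside a function, then fills in the variable with `str.format`. The function name `build_area_prompt` is hypothetical.

```python
import textwrap

def build_area_prompt(shape):
    """Build a multi-line prompt with one dynamic value."""
    # dedent removes the common leading whitespace from every line
    template = textwrap.dedent("""\
        Write a Python program
        that calculates the area of a {shape}.
        Only output the code.""")
    return template.format(shape=shape)

prompt = build_area_prompt("circle")
print(prompt)
```

Filling in the value after dedenting avoids surprises when the inserted text itself spans several lines.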

## Summarizing a Future History with OpenAI

This example demonstrates how to use the OpenAI API to analyze and summarize a student artifact called a "future history." The provided function reads the text of the file, sends it to OpenAI's API with a specific prompt, and returns a one-sentence summary. You can copy and modify this function for your own experiments with OpenAI.

### How It Works

1. **The Function**:
   - `summarize_future_history` is designed to:
     - Read the content of a text file.
     - Use OpenAI's API to generate a one-sentence summary of a future history.

2. **The Prompts**:
   - **System Prompt**: Explains to the AI what a "future history" is and sets the task's context.
   - **User Prompt**: Provides the file's text and asks the AI to summarize it in one sentence.

3. **The Test Code**:
   - Reads a sample file (`data/future/future_history_nojo.txt` in the code cell further below).
   - Passes the file content to the function.
   - Outputs the generated summary.


**Setting up the next block of code**:

To have the model take the assignment's instructions and rubric into account, set up the two files that the next code cell reads:

1. On Moodle, under Final Project, open the link "Documents provided by client (updated 1/23 Thu)", which takes you to OneDrive.
2. Download `instructions-history.txt` and `rubric-history-1.txt`.
3. In the `code-and-data-quick-start` folder you have been using for this assignment, create a new folder named `information` (lowercase, to match the paths in the code).
4. Place both files in that folder.

The code cell below should then run.
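
Before running the cell, it can help to confirm that both files are where the code expects them. The following is a minimal sketch; the helper name `find_missing` is an assumption, and the paths are the ones used in the code cell below.

```python
from pathlib import Path

def find_missing(paths):
    """Return the entries of `paths` that do not exist as files."""
    return [p for p in paths if not Path(p).is_file()]

required = [
    "information/instructions-history.txt",
    "information/rubric-history-1.txt",
]
missing = find_missing(required)
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("Both files found.")
```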

In [57]:
def generate_understanding(client, system_role, instructions_content, rubric_content, model="gpt-4o-mini"):
    """
    Generate a response using OpenAI API to confirm understanding of two provided files.
    
    Args:
        client: The initialized OpenAI API client.
        system_role (str): The system's role or context, e.g., "You are a helpful assistant."
        instructions_content (str): The content of the instructions file.
        rubric_content (str): The content of the rubric file.
        model (str): The name of the OpenAI model to use. Default is "gpt-4o-mini".
        
    Returns:
        str: The response generated by the OpenAI API.
    """
    try:
        # Construct the prompt
        user_prompt = f"""
        I am providing you with two files to understand the context of the assignment. Give a brief understanding of what the assignment is. Here are the files:
        1. Instructions on what to accomplish in the future history:
        {instructions_content}

        2. Rubric on how the assignment is judged or graded:
        {rubric_content}

        Please confirm that you now understand the provided content.
        """

        # Send the query to the OpenAI API
        completion = client.chat.completions.create(
            model=model,
            messages=[
                { "role": "system", "content": system_role },
                { "role": "user", "content": user_prompt },
            ]
        )
        # Extract and return the response content
        return completion.choices[0].message.content
    except Exception as e:
        raise RuntimeError(f"Error connecting to OpenAI API: {e}")

# Example: Providing files for understanding
try:
    # System role
    system_role = "You are a helpful assistant that reads the files given to it and understands what is trying to be accomplished. This will be helpful when reviewing the future histories."

    # Read the content of the files
    instructions_history_path = "information/instructions-history.txt"
    rubrics_history_path = "information/rubric-history-1.txt"

    with open(instructions_history_path, "r", encoding="utf-8") as file:
        instructions_content = file.read()

    with open(rubrics_history_path, "r", encoding="utf-8") as file:
        rubric_content = file.read()

    # Generate the response
    response = generate_understanding(client, system_role, instructions_content, rubric_content, model="gpt-4o-mini")
    print("Response:")
    print(response)
except Exception as error:
    print(f"Error: {error}")

Response:
I understand the provided content. The assignment is for students to write a "Future History" narrative, imagining their first year of college has already taken place and writing a letter to a high school friend about that experience. The purpose of the exercise is to help students vividly envision their future by narrating it in the past tense, encouraging introspection and creativity. 

The rubric categorizes the assessment criteria into six areas, evaluating aspects such as the use of perspective, vividness and specificity of details, reflection on growth, organizational coherence, use of freewriting and revision, and overall writing quality. Each category is graded on a scale from Beginning to Exemplary, guiding students on how to enhance their writing to meet expectations. 

If you have any more specific questions or need further assistance, feel free to ask!


In [24]:
# Simplified function to summarize a future history
def summarize_future_history(client, file_path):
    """
    Summarize the content of a future history file using OpenAI API.
    
    Args:
        client: The initialized OpenAI API client.
        file_path (str): Path to the file to be summarized.
    
    Returns:
        str: The one-sentence summary of the future history.
    """
    try:
        # Read the file content
        with open(file_path, "r", encoding="utf-8") as f:
            file_content = f.read()

        # System prompt
        system_role = (
            "You are a helpful assistant. Your task is to read future histories written by students."
            " A 'future history' is a narrative written in past tense about events that the writer imagines will happen in the future."
        )

        # User prompt with the file content
        user_prompt = f"Summarize the following future history in one sentence:\n\n{file_content}"

        # Call the existing generate_response function
        response = generate_response(client, system_role, user_prompt, 'gpt-4o')

        return response.strip()

    except Exception as e:
        raise RuntimeError(f"Error summarizing file '{file_path}': {e}")

# Test code
try:
    file_path = "data/future/future_history_nojo.txt"  # Update with the correct file path
    summary = summarize_future_history(client, file_path)
    print(f"Summary of the future history for {file_path}:")
    print(summary)
except Exception as error:
    print(f"Error: {error}")


Summary of the future history for data/future/future_history_nojo.txt:
The narrative reflects on a transformative college experience at Centre College, highlighting academic achievements, athletic endeavors, meaningful relationships, and future plans for a gap year and graduate studies.


### Modifying the Code Above

You can adapt the code above for your own projects by following these steps:

1. **Change the File Path**:
   - Replace `data/future/future_history_nojo.txt` with the path to your own text file.

2. **Update the Prompts**:
   - Modify the system prompt to give the AI a new context or purpose. For example:
     - "You are an assistant helping students improve their creative writing."
   - Update the user prompt to ask for something different. For example:
     - "Rewrite the following text to make it more descriptive."

3. **Experiment with the Output**:
   - Try asking the AI for more detailed responses by changing the prompt to:
     - "Summarize the following future history in 3-5 sentences."
   - You can also switch the model from `gpt-4o` to another model if directed.

4. **Reuse for Different Tasks**:
   - Use the same structure to ask the AI to:
     - Provide feedback on writing.
     - Generate new content based on the provided text.
     - Analyze text for themes or patterns.

### Notes for Experiments

- Ensure the text you provide is appropriate for the task.
- You can experiment with prompt phrasing to refine the AI's responses.
- If the output isn’t what you expect, adjust the prompts or try breaking long files into smaller sections.

By modifying this example, you can explore a variety of tasks with OpenAI's API.
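
For the suggestion about breaking long files into smaller sections, one possible approach is to group paragraphs until a size limit is reached. The sketch below is an assumption about how you might do this: the name `split_into_chunks` and the 4000-character default are illustrative, not limits taken from the API.

```python
def split_into_chunks(text, max_chars=4000):
    """Group paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars becomes its own chunk.
    """
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        candidate = paragraph if not current else current + "\n\n" + paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)  # current chunk is full; start a new one
            current = paragraph
    if current:
        chunks.append(current)
    return chunks

sample = "aaa\n\nbbb\n\nccc"
print(split_into_chunks(sample, max_chars=8))
```

Each chunk could then be passed to your experiment function separately, and the partial results combined afterwards.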


## Summarizing `.txt` Files with OpenAI: How This Code Works

This script processes all `.txt` files in a specified folder (`data/future/` by default) and sends them to OpenAI’s API for analysis. It applies the `summarize_future_history` function, which generates a one-sentence summary for each artifact. Here’s how it works:

1. The script scans the folder for files with the `.txt` extension and adds them to a queue for processing.
2. Each file is processed by the `summarize_future_history` function, which reads the file content, sends it to OpenAI, and retrieves a summary.
3. If the API fails to process a file, the script retries up to three times with increasing delays (1, 2, and 4 seconds).
4. For each file, the script outputs the file name and its generated summary. Files that fail after all retries are skipped.

This setup ensures reliable processing of all `.txt` files in the folder, handling errors and skipping non-text files.
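
The retry idea in step 3 can be seen in isolation in the sketch below. It is a simplified variant, not the script's own mechanism: the script re-queues a failed file, whereas this hypothetical `retry_with_backoff` helper retries a single task in place. The `sleep` parameter is an assumption added so the example can run without actually waiting.

```python
import time

def retry_with_backoff(task, delays=(1, 2, 4), sleep=time.sleep):
    """Call task(); after each failure wait, then retry once per entry in delays."""
    for delay in delays:
        try:
            return task()
        except Exception as error:
            print(f"Attempt failed: {error}. Retrying in {delay} s.")
            sleep(delay)
    # Final attempt: any exception now propagates to the caller
    return task()

# Demonstration with a task that fails twice before succeeding
calls = {"count": 0}

def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"

waits = []
result = retry_with_backoff(flaky, sleep=waits.append)  # record delays instead of sleeping
print(result, waits)
```

With three delays the task is tried at most four times, which matches the script's one initial attempt plus three retries.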


In [30]:
import os
import time
from queue import Queue

# If necessary, replace `data/future` with the path to where your artifacts are located
artifacts_file_path = 'data/future/'

# Create a queue for files to be processed
file_queue = Queue()

# Add files to the queue
# Use os.listdir to list all files in the directory
for file_name in os.listdir(artifacts_file_path):
    file_path = os.path.join(artifacts_file_path, file_name)
    if os.path.isfile(file_path) and file_name.lower().endswith(".txt"):
        file_queue.put(file_path)

# Parameters for retrying failed files
max_retries = 3  # Maximum number of retries for a single file
retry_delays = [1, 2, 4]  # Delays between retries in seconds (exponential backoff)

# Track attempts for each file
attempts = {}

while not file_queue.empty():
    file_path = file_queue.get()
    attempts[file_path] = attempts.get(file_path, 0) + 1

    try:
        # Call the summarization function
        summary = summarize_future_history(client, file_path)

        # Output the result
        print(f"Summarizing file: {file_path}")
        print("Summary:")
        print(summary)
        print("\n" + "-" * 50 + "\n")  # Separator for readability

    except Exception as error:
        print(f"Error processing file '{file_path}': {error}")

        # Retry if the file hasn't reached the max retries
        if attempts[file_path] <= max_retries:
            print(f"Retrying file: {file_path} (retry {attempts[file_path]} of {max_retries})")
            file_queue.put(file_path)  # Add back to the queue

            # Add delay for retries
            time.sleep(retry_delays[min(attempts[file_path] - 1, len(retry_delays) - 1)])
        else:
            print(f"Max retries reached for file: {file_path}. Skipping.")


Summarizing file: data/future/future_history_4909.txt
Summary:
In Spring 2025, I grew personally and academically by stepping outside my comfort zone through challenging classes, overcoming routine struggles after my athletic season, becoming more involved in campus organizations, and achieving various goals while also nurturing my faith and relationships.

--------------------------------------------------

Summarizing file: data/future/future_history_7187.txt
Summary:
The writer describes their successful and enjoyable first year of college, highlighting involvement in a rock climbing club, maintaining good grades with the support of academically minded friends, and overcoming initial challenges in adjusting to college life.

--------------------------------------------------

Summarizing file: data/future/future_history_3123.txt
Summary:
The writer reflects on their fulfilling college experience, highlighting strong friendships, participation in volleyball and cultural events, excel

## How to Adapt This Script for Your Experiments

You can modify this script to apply different experiments to your artifacts. To keep your work organized, consider creating a new Jupyter notebook for each experiment. Here’s how to adapt it:

1. Replace the `summarize_future_history` function with one that performs your experiment. For example:
   - Analyzing text for themes.
   - Providing feedback on writing.
   - Rewriting text for a different purpose.

   Example:

   ```python
   def analyze_themes(client, file_path):
       # Custom function for analyzing themes (the prompts here are placeholders)
       with open(file_path, "r", encoding="utf-8") as f:
           text = f.read()
       system_role = "You are a helpful assistant that identifies themes in student writing."
       return generate_response(client, system_role, f"List the main themes in this text:\n\n{text}")
   ```

2. Change the folder path to point to your folder of artifacts:

   ```python
   artifacts_file_path = "path/to/your/files/"
   ```

3. Adjust the file type filter if you want to process files other than `.txt`:
   ```python
   if os.path.isfile(file_path) and file_name.lower().endswith(".md"):
   ```

4. Modify the output to suit your experiment. For example, save results to a file, where `result` is the value returned by your function:

   ```python
   with open("experiment_results.txt", "a", encoding="utf-8") as output_file:
       output_file.write(f"File: {file_path}\nResult:\n{result}\n\n")
   ```

5. Create separate Jupyter notebooks for each experiment. Include:
   - Your custom function.
   - This script, modified to call your function.

By keeping each experiment in its own notebook, you’ll have a clear record of your work and can easily manage different analyses with OpenAI’s API.
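
If you plan to compare results across experiments, a CSV file can be easier to work with than plain text. The sketch below is one possible approach using the standard `csv` module; the function name `save_results_csv`, the column names, and the example file names and summaries are all made up for illustration.

```python
import csv
import os
import tempfile

def save_results_csv(results, output_path):
    """Write (file_path, result) pairs to a CSV file with a header row."""
    with open(output_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["file", "result"])
        writer.writerows(results)

# Hypothetical results; in practice, collect these inside the processing loop
example_results = [
    ("data/future/example_a.txt", "A one-sentence summary."),
    ("data/future/example_b.txt", "Another one-sentence summary."),
]
output_path = os.path.join(tempfile.gettempdir(), "experiment_results.csv")
save_results_csv(example_results, output_path)
print(f"Wrote {len(example_results)} rows to {output_path}")
```

The example writes to a temporary folder so it does not clutter your project; point `output_path` at your notebook's folder to keep the results with your experiment.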
