# Workshop Module: Logging & Version Control for Computational Research

## Part 1: Logging Your Work 

### Why Logging Matters

* Reproducibility: What prompt did I use?
* Accountability: When did I run this?
* Version tracking: What changed since last run?

### Example Task: Log a Simple LLM Prompt

- Save log files in a `./logs` folder, one file per day.
- Each call appends one log line to `./logs/logfile_YYYY-MM-DD.jsonl`.
- Use the `.jsonl` (JSON Lines) format for experiment logs. It is easy to parse later, for example with pandas:
    ```python
    import pandas as pd
    df = pd.read_json("logs/logfile_2025-06-18.jsonl", lines=True)
    ```
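Because each day gets its own file, it can help to read all of them at once. The following is a minimal sketch using only the standard library; the helper name `load_logs` is hypothetical, and it assumes the `logfile_YYYY-MM-DD.jsonl` naming scheme described above.

```python
import json
from pathlib import Path


def load_logs(log_dir="logs"):
    """Read every logfile_*.jsonl in log_dir into a list of dicts (hypothetical helper)."""
    entries = []
    # Sorting by file name also sorts by date, because names use YYYY-MM-DD.
    for path in sorted(Path(log_dir).glob("logfile_*.jsonl")):
        with path.open(encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if line:  # skip blank lines
                    entries.append(json.loads(line))
    return entries
```

The resulting list can be passed to `pd.DataFrame(entries)` if a table is more convenient.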



In [4]:
from openai import OpenAI
from pathlib import Path

def llm_openai(prompt: str, llm_model: str) -> str:
    """Call the OpenAI Chat Completions API and return the output text."""

    # Read your OpenAI API key from a local file before running.
    # Keep ./openai.key out of version control.
    api_key = Path("./openai.key").read_text().strip()
    client = OpenAI(api_key=api_key)

    completion = client.chat.completions.create(
        model=llm_model,
        messages=[
            {
                "role": "user",
                "content": prompt
            }
        ],
        temperature=0,
    )   
    return completion.choices[0].message.content


def llm_faketest(prompt: str, llm_model: str) -> str:
    """
    Executes a prompt against a specified LLM model.
    This is a placeholder for a real API call.
    """
    print(f"Executing prompt with model: {llm_model}")
    return f"Fake output for: {prompt}"


In [None]:
import datetime
import json
import os

def llm_with_logging(prompt, llm_model, llm_func):
    """
    Execute an LLM prompt with the provided function, save the prompt, and log the interaction.

    Args:
        prompt (str): The prompt to send to the LLM.
        llm_model (str): The name of the LLM model to use.
        llm_func (callable): The function that performs the LLM call,
            e.g. llm_openai or llm_faketest. It is called as llm_func(prompt, llm_model).
    Returns:
        str: The output from the LLM.
    """
    log_dir = "logs"
    prompts_dir = "prompts"
    
    os.makedirs(log_dir, exist_ok=True)
    os.makedirs(prompts_dir, exist_ok=True)
    
    now = datetime.datetime.now()
    date_str = now.date().isoformat()
    timestamp_str = now.strftime("%Y%m%d-%H%M%S")
    
    output_file = os.path.join(log_dir, f"logfile_{date_str}.jsonl")
    prompt_file = os.path.join(prompts_dir, f"prompt_{timestamp_str}.txt")

    with open(prompt_file, "w") as f:
        f.write(prompt)
    print(f"Saved prompt to {prompt_file}")

    output = llm_func(prompt, llm_model)

    log_entry = {
        "timestamp": now.isoformat(),
        "model": llm_model,
        "prompt": prompt,
        "output": output,
    }

    with open(output_file, "a") as f:
        f.write(json.dumps(log_entry) + "\n")

    print(f"Logged interaction to {output_file}")
    return output



In [8]:
test_prompt = "Tell a quick fun fact about business."
llm_model = "gpt-4.1-nano"
output = llm_with_logging(test_prompt, "test-model", llm_faketest)
output = llm_with_logging(test_prompt, llm_model, llm_openai)



Saved prompt to prompts/prompt_20250618-105847.txt
Executing prompt with model: test-model
Logged interaction to logs/logfile_2025-06-18.jsonl
Saved prompt to prompts/prompt_20250618-105847.txt
Logged interaction to logs/logfile_2025-06-18.jsonl
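In the run above, both calls happened within the same second, so both wrote to `prompts/prompt_20250618-105847.txt` and the second call overwrote the first file. One possible way to avoid this is to add microseconds to the file name. The sketch below is only an illustration: `unique_prompt_path` is a hypothetical helper, not part of the original script.

```python
import datetime
import os


def unique_prompt_path(prompts_dir="prompts", now=None):
    """Build a prompt file path with microsecond resolution (hypothetical helper)."""
    now = now or datetime.datetime.now()
    # %f appends microseconds, so two calls within the same second get different names.
    stamp = now.strftime("%Y%m%d-%H%M%S-%f")
    return os.path.join(prompts_dir, f"prompt_{stamp}.txt")
```

Using this path in place of `prompt_file` inside `llm_with_logging` would keep one prompt file per call.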



## Part 2: Version Control with Git 

### Why Use Git

* Track code and prompt changes over time
* Collaborate with co-authors or advisors
* Restore earlier versions

### Quick Git Demo

- **Step 1**: Initialize a Git repo

    ```bash
    git init llm-project
    cd llm-project
    ```

- **Step 2**: Add your script

    ```bash
    cp ../log_llm_prompt.py .
    git add log_llm_prompt.py
    git commit -m "Initial LLM logging script"
    ```

- **Step 3**: Modify your prompt and commit again

    Edit the `prompt` line in `log_llm_prompt.py`, then:

    ```bash
    git diff                # See what changed
    git add log_llm_prompt.py
    git commit -m "Updated prompt to focus on financial trends"
    ```

- **Step 4**: View history

    ```bash
    git log --oneline
    ```
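Logging and version control work well together: if each log line records the commit that produced it, an output can be traced back to the exact code and prompt. The sketch below shows one way to look up the current commit from Python. The helper name `current_git_commit` is hypothetical, and the sketch assumes `git` is installed and on the `PATH`.

```python
import subprocess


def current_git_commit(repo_dir="."):
    """Return the current commit hash of repo_dir, or None if unavailable."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or repo_dir is not a repository with commits
        return None
    return result.stdout.strip()
```

For example, adding `"git_commit": current_git_commit()` to the `log_entry` dictionary in `llm_with_logging` would store the hash alongside each prompt and output.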


## Key Takeaways

* Log every prompt and output with a timestamp and the model version
* Include the exact model version or hash in logs when using hosted models such as OpenAI or Claude
* Use Git to track changes in your code and prompts
* These practices make your research more reproducible, especially when using LLMs or modeling pipelines
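As an illustration of the last point, the log entry could carry both the model name that was requested and the model identifier the provider reports back, plus a hash of the prompt. This is a sketch under assumptions: `build_log_entry`, `resolved_model`, and `prompt_sha256` are hypothetical names, and the resolved model is assumed to come from the API response (with the OpenAI client used earlier, the response object exposes a `model` field).

```python
import hashlib
import json


def build_log_entry(timestamp, requested_model, prompt, output, resolved_model=None):
    """Assemble a JSON-serializable log entry (hypothetical helper)."""
    return {
        "timestamp": timestamp,
        "model": requested_model,
        # Model identifier reported by the provider, if available (assumption).
        "resolved_model": resolved_model,
        # A short fingerprint makes it easy to spot when the prompt changed.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "prompt": prompt,
        "output": output,
    }
```

The returned dictionary can be written with `json.dumps(entry) + "\n"`, the same way `llm_with_logging` writes its entries.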

