# End of week 1 exercise

To demonstrate your familiarity with the OpenAI API and Ollama, build a tool that takes a technical question
and responds with an explanation. This is a tool that you will be able to use yourself during the course!

### Dual-Mode Explainer
This notebook runs two LLMs in response to technical/coding questions and shows their answers either side by side (HTML) or one after the other (Markdown).
* Simple variable toggles control the model configuration at runtime (e.g. temperature)
* Both models run in parallel for comparison, with some Markdown-to-HTML trickery
* Two question templates are provided: coding and general technical


In [None]:
# imports
import os          # OS utilities, environment variables, file paths
import re          # Regex-based markdown -> HTML conversion
import threading   # Run parallel threads for streaming (dual mode explainer)

from IPython.display import HTML, Markdown, display, update_display
from openai import OpenAI

# dual model constants
MODEL_GPT = 'gpt-4o-mini'
MODEL_LLAMA = 'gemma3:4b'


### Configuration for models
I want a set of default configuration values that I can override at runtime for side-by-side comparisons.

In [None]:
class Config:
    # OpenAI
    OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
    OPENAI_MODEL = MODEL_GPT
    OPENAI_TEMPERATURE = 0.7

    # Ollama
    OLLAMA_BASE_URL = "http://localhost:11434/v1"
    OLLAMA_MODEL = MODEL_LLAMA
    OLLAMA_TEMPERATURE = 0.3


### Shared Utilities for streaming and creating clients.
LLMs produce Markdown, which can be streamed, but only as a linear document. This is a problem when trying to display LLM answers side by side, because Markdown has no real concept of layout in the way HTML/CSS does. So I have a few simple functions that stream the results of two LLMs in parallel and write them, chunk by chunk, into an HTML table, with only basic Markdown formatting carried over.
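
The parallel part can be illustrated without any API calls. The following is a minimal sketch, assuming a hypothetical `fake_stream` generator in place of a real streaming response and a hypothetical `consume` worker; it only shows two threads filling their own column buffers at the same time.

```python
import threading
import time

def fake_stream(words, delay=0.01):
    """Hypothetical stand-in for a streaming LLM response: yields one chunk at a time."""
    for word in words:
        time.sleep(delay)
        yield word + " "

def consume(stream, column_id, columns, lock):
    """Accumulate the chunks for one column; the lock guards the shared dict."""
    for chunk in stream:
        with lock:
            columns[column_id] += chunk

columns = {"colA": "", "colB": ""}
lock = threading.Lock()
threads = [
    threading.Thread(target=consume, args=(fake_stream(["alpha", "beta"]), "colA", columns, lock)),
    threading.Thread(target=consume, args=(fake_stream(["one", "two", "three"]), "colB", columns, lock)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(columns)
```

Each thread writes only to its own key, so the final text per column is in order even though the chunks from the two streams interleave in time.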

An alternative would be to install a Markdown-to-HTML converter, or to write one, but the latter seems a world of hurt. HTML tags come in open/close pairs while Markdown is far simpler, so you would have to match a Markdown symbol for the opening tag and perhaps a newline for the closing tag. Then you hit #, ## and <h1></h1>, greedy regexes, and instead of one problem you have many. I am sure an AI agent could eventually work out the algorithm, even without unit tests, but I am moving on: this exercise does not need full formatting.
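
To make the greedy-regex concern concrete, here is a small sketch using only the standard `re` module. The sample strings are made up for illustration.

```python
import re

text = "**bold one** and **bold two**"

# Greedy: (.*) runs to the LAST pair of asterisks on the line
greedy = re.sub(r"\*\*(.*)\*\*", r"<b>\1</b>", text)

# Non-greedy: (.*?) stops at the FIRST closing pair
lazy = re.sub(r"\*\*(.*?)\*\*", r"<b>\1</b>", text)

# Mid-stream, a marker may not be closed yet; the pattern then leaves it alone
partial = re.sub(r"\*\*(.*?)\*\*", r"<b>\1</b>", "**not closed yet")

print(greedy)   # <b>bold one** and **bold two</b>
print(lazy)     # <b>bold one</b> and <b>bold two</b>
print(partial)  # **not closed yet
```

The last case matters for streaming: until the closing marker arrives, the raw symbols remain visible in the output.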

In [None]:
def create_client(api_key=None, base_url=None, model=None, temperature=None):
    """
    Create a client+model+params bundle as a dictionary.
    """
    return {
        "client": OpenAI(api_key=api_key, base_url=base_url),
        "model": model,
        "temperature": temperature
    }

def get_stream(client_bundle, question: str):
    return client_bundle["client"].chat.completions.create(
        model=client_bundle["model"],
        temperature=client_bundle["temperature"],
        messages=[
            {"role": "system", "content": "You are a technical explainer."},
            {"role": "user", "content": question}
        ],
        stream=True
    )


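def stream_answer_into_table(client_bundle, question: str, column_id: str, display_handle):
    """Stream one model's answer into the table cell with the given id."""
    stream = get_stream(client_bundle, question)

    response = ""
    for chunk in stream:
        response += chunk.choices[0].delta.content or ""
        html = markdown_to_html(response)
        # Escape characters that would otherwise end or break the JS template literal
        # (e.g. the backticks left over from fenced code blocks)
        safe_html = html.replace("\\", "\\\\").replace("`", "\\`").replace("${", "\\${")
        js = f"""
        <script>
        document.getElementById("{column_id}").innerHTML = `{safe_html}`;
        </script>
        """
        update_display(HTML(js), display_id=display_handle.display_id)

    return response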

def stream_answer_markdown(client_bundle, question: str) -> str:
    stream = get_stream(client_bundle, question)

    response = ""
    display_handle = display(Markdown(""), display_id=True)
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        update_display(Markdown(response), display_id=display_handle.display_id)

    return response


def markdown_to_html(md: str) -> str:
    html = md

    # HEADINGS
    html = re.sub(r'^#{3}\s*(.*)', r'<h3>\1</h3>', html, flags=re.MULTILINE)
    html = re.sub(r'^#{2}\s*(.*)', r'<h2>\1</h2>', html, flags=re.MULTILINE)
    html = re.sub(r'^#{1}\s*(.*)', r'<h1>\1</h1>', html, flags=re.MULTILINE)

    # HORIZONTAL RULE (---)
    html = re.sub(r'^\s*---+\s*$', r'<hr>', html, flags=re.MULTILINE)

    # BOLD
    html = re.sub(r'\*\*(.*?)\*\*', r'<b>\1</b>', html)

    # ITALIC
    html = re.sub(r'\*(.*?)\*', r'<i>\1</i>', html)

    # INLINE CODE
    html = re.sub(r'`(.*?)`', r'<code>\1</code>', html)

    # NEWLINES → <br>
    html = html.replace('\n', '<br>')

    return html

def stream_parallel_table(client_a, client_b, question):
    # Display initial empty table
    table_html = f"""
    <table style="width:100%; border-collapse:collapse;">
        <tr>
            <th>{client_a['model']}</th>
            <th>{client_b['model']}</th>
        </tr>
        <tr>
            <td id="colA" style="white-space:pre-wrap;"></td>
            <td id="colB" style="white-space:pre-wrap;"></td>
        </tr>
    </table>
    """
    handle = display(HTML(table_html), display_id=True)

    # Launch threads to stream each model into its column
    thread_a = threading.Thread(target=stream_answer_into_table, args=(client_a, question, "colA", handle))
    thread_b = threading.Thread(target=stream_answer_into_table, args=(client_b, question, "colB", handle))

    thread_a.start()
    thread_b.start()

    thread_a.join()
    thread_b.join()


### Allow dynamic override of config for comparisons
I would like to override specific properties, such as temperature, at runtime and compare the results.
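
One detail worth keeping in mind when overriding numeric settings: 0.0 is a valid temperature but is falsy in Python, so an `or` fallback silently discards it. A minimal sketch, assuming a hypothetical helper `pick` and a made-up default value:

```python
def pick(override, default):
    """Hypothetical helper: use the override unless it was not supplied (None)."""
    return default if override is None else override

DEFAULT_TEMPERATURE = 0.7  # illustrative default only

with_or = 0.0 or DEFAULT_TEMPERATURE           # 0.0 is falsy, so the default wins
with_pick = pick(0.0, DEFAULT_TEMPERATURE)     # the explicit 0.0 is kept
not_supplied = pick(None, DEFAULT_TEMPERATURE)

print(with_or, with_pick, not_supplied)  # 0.7 0.0 0.7
```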

In [None]:
def build_openai_client(config: Config, *, model=None, temperature=None):
    return create_client(
        api_key=config.OPENAI_API_KEY,
        base_url=None,
        model=model or config.OPENAI_MODEL,
        # Compare against None so an explicit temperature of 0 is not discarded
        temperature=temperature if temperature is not None else config.OPENAI_TEMPERATURE
    )

def build_ollama_client(config: Config, *, model=None, temperature=None):
    return create_client(
        api_key="ollama",
        base_url=config.OLLAMA_BASE_URL,
        model=model or config.OLLAMA_MODEL,
        # Compare against None so an explicit temperature of 0 is not discarded
        temperature=temperature if temperature is not None else config.OLLAMA_TEMPERATURE
    )


### A Coding specific question template

In [None]:
# question = """
# Please explain what this code does and why:
#     stream = client_bundle["client"].chat.completions.create(
#         model=client_bundle["model"],
#         temperature=client_bundle["temperature"],
#         messages=[
#             {"role": "system", "content": "You are a technical explainer."},
#             {"role": "user", "content": question}
#         ],
#         stream=True
#     )
# """

question = """
Please modify the following code so that, rather than stripping out markdown symbols, it wraps each symbol in the equivalent opening and closing HTML tags:
    def strip_markdown(md: str) -> str:
        symbols = ["**", "*", "`", "###", "##", "#"]
        result = md
        for sym in symbols:
            result = result.replace(sym, "")
        return result
"""

### An Architectural and Concepts question template



In [None]:
question = """
Please explain what this concept is and why it is important:
"Setting a model temperature for a specific coding question: within a coding / deterministic context, what effect does temperature variation have?"
"""

In [None]:
config = Config()
override_temperature = 0.9  # high temperature: more varied, less deterministic output
openai_client = build_openai_client(config, temperature=override_temperature)
ollama_client = build_ollama_client(config, temperature=override_temperature)

### Run Ollama and OpenAI clients in parallel
Compare the results of the two models side by side (dual-mode explainer).

Issues:
* Output is displayed as HTML, so the current implementation loses some formatting (headings, bullet points), which makes the answers harder to follow
* Markdown could be retained, or converted to HTML, by introducing a library such as markdown.
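
As an illustration of the bullet-point issue, bullets could be recovered with the standard library alone. This is only a sketch under stated assumptions: `bullets_to_html` is a hypothetical helper that is not part of this notebook, and it handles flat `* ` / `- ` lists only.

```python
import html
import re

def bullets_to_html(md: str) -> str:
    """Escape HTML, then wrap runs of '* ' or '- ' lines in <ul><li>...</li></ul>."""
    out, in_list = [], False
    for line in md.splitlines():
        match = re.match(r"^\s*[*-]\s+(.*)", line)
        if match:
            if not in_list:
                out.append("<ul>")
                in_list = True
            out.append(f"<li>{html.escape(match.group(1))}</li>")
        else:
            if in_list:
                out.append("</ul>")
                in_list = False
            out.append(html.escape(line))
    if in_list:
        out.append("</ul>")
    return "\n".join(out)

sample = "Options:\n* use <b> tags\n* keep it simple\nDone"
print(bullets_to_html(sample))
```

Escaping with `html.escape` before adding tags keeps any `<` or `>` in the model output from being interpreted as markup inside the table cell.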

In [None]:
# display side by side in table format
table_html = f"""
<style>
    h1, h2, h3, h4, h5, h6 {{
        margin-top: 0;
        margin-bottom: 0;
    }}

    p {{
        margin: 0;
    }}

    ul, ol {{
        margin-top: 0;
        margin-bottom: 0;
        padding-left: 20px;
    }}
</style>

<table style="width:100%; border-collapse:collapse;">
  <tr>
    <th style="text-align:left; vertical-align:top;"><h1>{openai_client["model"]} - Temp: {openai_client["temperature"]}</h1></th>
    <th style="text-align:left; vertical-align:top;"><h1>{ollama_client["model"]} - Temp: {ollama_client["temperature"]}</h1></th>
  </tr>
  <tr>
    <td id="colA" style="white-space:pre-wrap; text-align:left; vertical-align:top;"></td>
    <td id="colB" style="white-space:pre-wrap; text-align:left; vertical-align:top;"></td>
  </tr>
</table>
"""

handle = display(HTML(table_html), display_id=True)

stream_parallel_table(openai_client, ollama_client, question)

### Run sequentially with streams as pure markdown

In [None]:
display(Markdown(f"# Model: {openai_client['model']} — Temp: {openai_client['temperature']}"))
stream_answer_markdown(openai_client, question)

display(Markdown("---"))

display(Markdown(f"# Model: {ollama_client['model']} — Temp: {ollama_client['temperature']}"))
stream_answer_markdown(ollama_client, question)
