
# Designing Chemistry Questions for AI  
*Session 1: Using LLM APIs for Chemical Reasoning*

**What you'll do in this session**
- Understand what a Large Language Model (LLM) is and why chemists might use one.
- Run a **simple, click-to-run demo** that asks a chemistry question.
- Write your **own prompt** and see how wording changes answers.
- Reflect on what to **trust vs. verify** (textbook, PubChem, literature).

> No prior coding needed. We’ll use buttons and text boxes. Code cells are short and explained.



## 🧪 Introduction

**Large Language Models (LLMs)** like GPT-4 can answer questions in natural language. They are trained on vast amounts of text
and can *explain*, *compare*, and *summarize* chemistry topics.

**Quick examples**
- **Q:** What is the chemical formula of water?  
  **A:** H₂O  
- **Q:** Why is water polar?  
  **A:** Oxygen is more electronegative than hydrogen, so each O–H bond is polar, and the bent geometry keeps the two bond dipoles from cancelling.

**Important mindset**
- LLMs are powerful, but they can **hallucinate**: state incorrect information with full confidence.  
- You should **verify** numbers and claims with trusted sources (textbook, PubChem, literature).  
- **Prompt wording matters**: adding details or constraints can improve answers.




## ✅ Setup (Instructions)
1. Run the next **Code** cell to install packages and import basics.  
2. Then, add your **OpenAI API key** in the following cell (kept hidden).  
3. If you *don't* have a key, the notebook still runs in **demo mode** with safe placeholder answers.


In [None]:

# install and import required packages
# - openai → to connect to GPT models
# - ipywidgets → to build interactive buttons, text boxes
!pip -q install "openai>=1.0" ipywidgets  # the OpenAI client class imported below needs openai 1.0 or newer


import os # handle environment variables (API key storage)
from IPython.display import display, Markdown
import ipywidgets as widgets
from openai import OpenAI # OpenAI client

display(Markdown("> **Setup ready.** Next: add your OpenAI API key."))



## 🔑 Add Your OpenAI API Key (Hidden Input)

- Paste your API key in the box and click **Save Key**.  
- Your key is stored only in this session's memory.  
- If this step is skipped, the notebook will return a **demo placeholder** answer instead of contacting the API.


In [None]:

# Hidden password box for API key + a "Save Key" button
api_key_box = widgets.Password(
    description='API Key:',
    placeholder='sk-...',
    layout=widgets.Layout(width='40%')
)

# Button to save the key

save_btn = widgets.Button(description='Save Key', button_style='primary')

# Status message (will show "✅ saved" or "⚠️ error")

status = widgets.HTML("")

# Function runs when button is clicked

def save_key(_):
    if api_key_box.value.strip():   # if box is not empty
        os.environ["OPENAI_API_KEY"] = api_key_box.value.strip()
        status.value = "<span style='color:green'>✅ Key saved for this session.</span>"
    else:
        status.value = "<span style='color:#B00020'>⚠️ Please paste a valid key.</span>"

# Connect button to function
save_btn.on_click(save_key)

display(widgets.HBox([api_key_box, save_btn]))
display(status)
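
If the password widget does not render in your environment, a plain-text fallback is one option. The sketch below is not part of the original notebook flow: `store_key` is a hypothetical helper name, and it simply writes to the same `OPENAI_API_KEY` environment variable that the Save Key button uses.

```python
import os
from getpass import getpass  # standard library: prompts without echoing the input

def store_key(raw: str) -> bool:
    """Save a non-empty key to the environment; return True on success.

    Hypothetical helper for illustration only.
    """
    key = raw.strip()
    if not key:
        return False
    os.environ["OPENAI_API_KEY"] = key
    return True

# Interactive use (uncomment to be prompted for the key):
# store_key(getpass("OpenAI API key: "))
```

Like the widget version, this keeps the key in the running session rather than in the notebook text.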



## 🧩 Helper Functions (What this cell does)
The next code cell defines two small helpers:
- `get_client()` — creates an OpenAI client **if** a key is present.
- `ask_llm(prompt, model)` — sends your question to the model and returns the answer.  
  If no key is set, you’ll get a **demo placeholder** so the notebook still works for everyone.


In [None]:
# Get an OpenAI client object if key is available
def get_client():
    key = os.environ.get("OPENAI_API_KEY")
    return OpenAI(api_key=key) if key else None



# Ask the model a question
def ask_llm(prompt: str, model: str = "gpt-4o-mini") -> str:
    """Ask the model one question. Uses a fallback if no key is set."""
    client = get_client()
    if client is None:
        return ("*(Demo mode: no API key found.)*\n\n"
                "This is a placeholder answer. With a real key, the model would respond here.")
    try:
        completion = client.chat.completions.create(
            model=model,  # try 'gpt-4o-mini' (fast) or 'gpt-4o' (quality); 'gpt-3.5-turbo' also available
            messages=[{"role": "user", "content": prompt}]
        )
        # message.content can be None for some responses, so fall back to an empty string
        return (completion.choices[0].message.content or "").strip()
    except Exception as e:
        return f"*API error:* `{e}`"

display(Markdown("> **Helper ready.** Next: run the Demo section below."))
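
The helper above sends only a user message. One possible variation, sketched here under assumptions, adds a system message and a low `temperature` to make answers more cautious and repeatable. The names `build_messages` and `ask_with_role` and the wording of the system text are illustrative, not part of the notebook; the client is passed in explicitly so the sketch runs without a key.

```python
# Assumed system text; adjust the wording to suit your course.
DEFAULT_SYSTEM = "You are a careful chemistry tutor. Say so when you are unsure."

def build_messages(prompt: str, system: str = DEFAULT_SYSTEM) -> list:
    """Return a chat message list with a system instruction followed by the question."""
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": prompt},
    ]

def ask_with_role(client, prompt: str, model: str = "gpt-4o-mini",
                  temperature: float = 0.2) -> str:
    """Like ask_llm, but with a system message; `client` may be None (demo mode)."""
    if client is None:
        return "*(Demo mode: no API key found.)*"
    completion = client.chat.completions.create(
        model=model,
        messages=build_messages(prompt),
        temperature=temperature,  # lower values give less varied answers
    )
    return (completion.choices[0].message.content or "").strip()

print(ask_with_role(None, "Why is water polar?"))
```

In the notebook you could call `ask_with_role(get_client(), "Why is water polar?")` and compare the result with plain `ask_llm`.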



## 💡 Demo (Click-to-Run)

We will ask a fixed question and display the answer:

**Question:** *What is the difference between benzene and toluene?*

**What to look for**
- Does it mention that **toluene = benzene + methyl group (–CH₃)**?
- Does it mention **property differences** (e.g., boiling point, polarity)?

> Click the button below to run the demo.


In [None]:

demo_question = "What is the difference between benzene and toluene?"

# Dropdown menu for model choice

model_dropdown = widgets.Dropdown(
    options=[("gpt-4o-mini (fast)", "gpt-4o-mini"),
             ("gpt-4o (quality)", "gpt-4o"),
             ("gpt-3.5-turbo (legacy)", "gpt-3.5-turbo")],
    value="gpt-4o-mini",
    description="Model:"
)

demo_btn = widgets.Button(description="Get Answer", button_style="success")
demo_out = widgets.Output()

# Function: runs when button clicked

def on_demo_click(_):
    demo_out.clear_output()
    with demo_out:
        display(Markdown("⏳ Asking the model..."))
    ans = ask_llm(demo_question, model=model_dropdown.value)


    # Add a structure note (a handy fact)

    structure_note = (
        "**Structure note:**\n"
        "- **Benzene:** aromatic ring C₆H₆\n"
        "- **Toluene:** benzene ring with a methyl group (–CH₃), formula C₇H₈\n"
    )

    demo_out.clear_output()
    with demo_out:
        display(Markdown(
            f"### Demo Answer\n"
            f"**Q:** {demo_question}\n\n"
            f"{structure_note}\n\n"
            f"**Model’s explanation:**\n\n{ans}"
        ))

demo_btn.on_click(on_demo_click)
display(widgets.VBox([model_dropdown, demo_btn, demo_out]))
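
The "What to look for" checklist can also be applied mechanically. The following is a rough sketch, assuming simple substring matching is good enough for a first pass; `check_answer` and the `EXPECTED` term lists are hypothetical names chosen for this example.

```python
def check_answer(answer: str, expected: dict) -> dict:
    """Report, per checklist item, whether any of its search terms appears in the answer."""
    text = answer.lower()
    return {
        label: any(term.lower() in text for term in terms)
        for label, terms in expected.items()
    }

# Checklist items mapped to search terms (assumed spellings; extend as needed)
EXPECTED = {
    "methyl group": ["methyl", "ch3", "ch₃"],
    "boiling point": ["boiling point"],
    "polarity": ["polar"],
}

sample = "Toluene is benzene with a methyl group; it has a higher boiling point."
print(check_answer(sample, EXPECTED))
```

Substring matching is crude: "nonpolar" also counts as a hit for "polar", and a term appearing in the answer says nothing about whether the statement around it is correct. Treat the result as a prompt for closer reading, not as a grade.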


## ⚖️ Spot the Better Prompt (Side-by-Side)

We’ll compare two versions of the **same** chemistry question.  
Before running, decide which you think will produce a **clearer, more useful** answer.

**Example pair**
- **Prompt A:** "What is benzene?"
- **Prompt B:** "Explain benzene’s structure, uses, and hazards in 3 bullet points."



In [None]:

# You can edit these two prompts to try your own comparison.
PROMPT_A = "What is benzene?"
PROMPT_B = "Explain benzene’s structure, uses, and hazards in 3 bullet points."

spot_model = widgets.Dropdown(
    options=[("gpt-4o-mini (fast)", "gpt-4o-mini"),
             ("gpt-4o (quality)", "gpt-4o"),
             ("gpt-3.5-turbo (legacy)", "gpt-3.5-turbo")],
    value="gpt-4o-mini",
    description="Model:"
)

run_both_btn = widgets.Button(description="Run Both", button_style="success")
spot_out = widgets.Output()

def run_both(_):
    spot_out.clear_output()
    with spot_out:
        display(Markdown("⏳ Asking the model for **Prompt A** and **Prompt B**..."))
    ans_a = ask_llm(PROMPT_A, model=spot_model.value)
    ans_b = ask_llm(PROMPT_B, model=spot_model.value)
    spot_out.clear_output()
    with spot_out:
        display(Markdown(
            f"### Results\n"
            f"**Prompt A:** {PROMPT_A}\n\n"
            f"{ans_a}\n\n"
            f"---\n"
            f"**Prompt B:** {PROMPT_B}\n\n"
            f"{ans_b}\n"
        ))

run_both_btn.on_click(run_both)

display(widgets.VBox([spot_model, run_both_btn, spot_out]))
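
Prompt B differs from Prompt A by naming the aspects to cover and the output format. One way to make that pattern reusable is a small template function. This is a sketch only: `build_prompt` is a hypothetical helper, and the sentence template is just one of many reasonable phrasings.

```python
def build_prompt(topic: str, aspects: list, n_bullets: int) -> str:
    """Assemble a constrained prompt from a topic, the aspects to cover, and a bullet count."""
    if not aspects:
        raise ValueError("Give at least one aspect to cover.")
    if len(aspects) > 2:
        aspect_text = ", ".join(aspects[:-1]) + ", and " + aspects[-1]
    else:
        aspect_text = " and ".join(aspects)
    return f"Explain {topic}'s {aspect_text} in {n_bullets} bullet points."

print(build_prompt("benzene", ["structure", "uses", "hazards"], 3))
# prints: Explain benzene's structure, uses, and hazards in 3 bullet points.
```

Swapping the topic or the aspect list lets you test whether the same constraints help for other compounds.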



## ✍️ Exercise (Your Turn)

- **Design your own prompt** (free text, any chemistry question).  
- **Run** your prompt through the model.  
- **Share** interesting prompts/outputs with your group.

🔎 *Look for surprising or incorrect outputs. Can you refine your prompt to improve the answer?*  
Try adding details like *“explain briefly,” “compare typical boiling points,” or “give two bullet points.”*


In [None]:

prompt_box = widgets.Textarea(
    placeholder="Type your chemistry question here (e.g., Why does salt dissolve in water?)",
    layout=widgets.Layout(width="100%", height="90px")
)
model_picker = widgets.Dropdown(
    options=[("gpt-4o-mini (fast)", "gpt-4o-mini"),
             ("gpt-4o (quality)", "gpt-4o"),
             ("gpt-3.5-turbo (legacy)", "gpt-3.5-turbo")],
    value="gpt-4o-mini",
    description="Model:"
)
run_btn = widgets.Button(description="Ask", button_style="primary")
exercise_out = widgets.Output()

def on_run(_):
    exercise_out.clear_output()
    user_q = prompt_box.value.strip()
    if not user_q:
        with exercise_out:
            display(Markdown("*Please type a question above.*"))
        return
    with exercise_out:
        display(Markdown("⏳ Asking the model..."))
    ans = ask_llm(user_q, model=model_picker.value)
    exercise_out.clear_output()
    with exercise_out:
        display(Markdown(f"**Q:** {user_q}\n\n**A:**\n\n{ans}"))

run_btn.on_click(on_run)
display(widgets.VBox([prompt_box, model_picker, run_btn, exercise_out]))
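
To share prompts and outputs with your group, it can help to keep a running record. The sketch below appends each exchange to a CSV file using only the standard library; `log_exchange`, the column names, and the file name in the usage comment are assumptions made for this example.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

def log_exchange(path, prompt: str, answer: str, model: str) -> None:
    """Append one prompt/answer pair to a CSV file, writing a header row if the file is new."""
    path = Path(path)
    is_new = not path.exists()
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["timestamp_utc", "model", "prompt", "answer"])
        writer.writerow([datetime.now(timezone.utc).isoformat(), model, prompt, answer])

# Example use inside on_run, after the answer arrives (file name is hypothetical):
# log_exchange("prompt_log.csv", user_q, ans, model_picker.value)
```

The resulting file opens in any spreadsheet program, which makes it easy to compare how different wordings of the same question performed.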



## 📘 Reflection (Discuss / Note Down)
- Did the answer include anything you would **double-check** in a trusted source (textbook, PubChem, literature)?  
- What small change to your **prompt** improved the answer the most?  
- If the model gave a **numerical property**, how would you verify it?
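
For the last question, one approach is to pull the number out of the answer and compare it with a value you looked up yourself. The sketch below assumes temperatures written like "80.1 °C" and a tolerance of 1 °C; the function names are hypothetical, and the reference boiling points were entered by hand, so confirm them in PubChem or a handbook before relying on them.

```python
import re

def extract_celsius(text: str) -> list:
    """Find numbers followed by a °C unit and return them as floats."""
    pattern = r"(-?\d+(?:\.\d+)?)\s*°\s*C"
    return [float(m) for m in re.findall(pattern, text)]

def within_tolerance(claimed: float, reference: float, tol: float = 1.0) -> bool:
    """True if the claimed value is within `tol` of the reference value."""
    return abs(claimed - reference) <= tol

# Reference boiling points in °C at 1 atm (hand-entered; verify before use)
REFERENCE_BP_C = {"benzene": 80.1, "toluene": 110.6}

answer = "Benzene boils at about 80.1 °C, while toluene boils near 111 °C."
values = extract_celsius(answer)
print(values)
print(within_tolerance(values[0], REFERENCE_BP_C["benzene"]))
print(within_tolerance(values[1], REFERENCE_BP_C["toluene"]))
```

A check like this only flags disagreement. Deciding which value is right still means going back to the trusted source.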



## 🛠️ Troubleshooting
- If buttons don’t appear or do nothing, choose **Runtime → Restart runtime**, then re-run every cell from Setup onward (including the API key cell, since a restart clears the saved key).
- If you see `API error`, double-check your **API key** and internet connection.
- If you have **no key**, you will see a *demo placeholder* answer instead of a real model response.


In [None]:

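import os  # repeated here so this cell also works right after a runtime restart
# .get() treats an empty value the same as a missing key
print("Has API key? ", "✅" if os.environ.get("OPENAI_API_KEY") else "❌")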
try:
    import openai  # confirm package presence
    print("openai package import: ✅")
except Exception as e:
    print("openai package import: ❌", e)
