# How to Prompt with the Reasoning Engine



In [4]:
### Python Code:

import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Fetch API keys
openai_api_key = os.getenv("OPENAI_API_KEY")
anthropic_api_key = os.getenv("ANTHROPIC_API_KEY")

# Check if API keys are loaded
if openai_api_key and anthropic_api_key:
    print("✅ API keys are successfully loaded.")
else:
    print("⚠️ Warning: One or more API keys are missing.")

# Optionally, display API keys (for debugging purposes only)
display_keys = False  # Change to True if you want to see the keys

if display_keys:
    print(f"OpenAI API Key: {openai_api_key}")
    print(f"Anthropic API Key: {anthropic_api_key}")
else:
    print("🔒 API keys are loaded but hidden for security.")

✅ API keys are successfully loaded.
🔒 API keys are loaded but hidden for security.
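
If you do need to confirm which key was loaded while debugging, printing a masked form is safer than printing the full value. The following is a minimal sketch; `mask_key` and its `visible` parameter are hypothetical names introduced here and are not part of `dotenv` or either SDK.

```python
def mask_key(key, visible=4):
    """Return a masked form of an API key, keeping only the last few characters."""
    if not key:
        # Covers both None (variable not set) and an empty string.
        return "<missing>"
    if len(key) <= visible:
        # Too short to reveal anything safely, so hide every character.
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


# Example with a made-up key; a real key would come from os.getenv(...).
print(mask_key("sk-example-1234567890"))
print(mask_key(None))
```

In the cell above, `print(f"OpenAI API Key: {mask_key(openai_api_key)}")` could replace the unmasked print.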


# How to List Available OpenAI Models via the API
* You can retrieve the list of available models programmatically with OpenAI's Python client. This is useful for checking which models (e.g., gpt-4-turbo, gpt-3.5-turbo) your API key and account can access.

* Here’s how you can do it:

In [5]:
import os
from openai import OpenAI
from dotenv import load_dotenv

# Load environment variables (if using .env for API key)
load_dotenv()

# Initialize OpenAI client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# List available models
models = client.models.list()
print("Available OpenAI Models:")
for model in models.data:
    print(model.id)

Available OpenAI Models:
gpt-4o-realtime-preview-2024-12-17
gpt-4o-audio-preview-2024-12-17
gpt-4-1106-preview
dall-e-3
dall-e-2
gpt-4o-audio-preview-2024-10-01
gpt-4-turbo-preview
text-embedding-3-small
babbage-002
gpt-4
text-embedding-ada-002
chatgpt-4o-latest
gpt-4o-mini-audio-preview
gpt-4o-audio-preview
o1-preview-2024-09-12
gpt-4o-mini-realtime-preview
gpt-4o-mini-realtime-preview-2024-12-17
gpt-4.1-nano
gpt-3.5-turbo-instruct-0914
gpt-4o-mini-search-preview
gpt-4.1-nano-2025-04-14
gpt-3.5-turbo-16k
gpt-4o-realtime-preview
davinci-002
gpt-3.5-turbo-1106
gpt-4o-search-preview
gpt-3.5-turbo-instruct
gpt-3.5-turbo
o3-mini-2025-01-31
gpt-4o-mini-search-preview-2025-03-11
gpt-4-0125-preview
gpt-4o-2024-11-20
gpt-4o-2024-05-13
text-embedding-3-large
o1-2024-12-17
o1
o1-preview
gpt-4-0613
o1-mini
gpt-4o-mini-tts
o1-pro
gpt-4o-transcribe
gpt-4.5-preview
o1-pro-2025-03-19
gpt-4.5-preview-2025-02-27
gpt-4o-search-preview-2025-03-11
omni-moderation-2024-09-26
gpt-image-1
o1-mini-2024-09-12
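
The full list is long and unsorted, so it can help to narrow it down to one model family. The sketch below assumes you only care about ID prefixes; `filter_model_ids` is a hypothetical helper, and the sample IDs are copied from the output above so the block runs without an API call.

```python
def filter_model_ids(model_ids, prefixes):
    """Return the sorted model IDs that start with any of the given prefixes."""
    return sorted(mid for mid in model_ids if mid.startswith(tuple(prefixes)))


# A few IDs taken from the listing above.
sample_ids = ["gpt-4o-2024-11-20", "o1", "o1-mini", "dall-e-3", "o3-mini-2025-01-31"]

reasoning_ids = filter_model_ids(sample_ids, prefixes=("o1", "o3"))
print(reasoning_ids)  # ['o1', 'o1-mini', 'o3-mini-2025-01-31']
```

With a live client, the input could be built as `[m.id for m in client.models.list().data]`.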


# 4 Principles on Prompting with o1

1. Simple and direct
2. No explicit CoT required

The first two principles are to keep prompts simple and direct and to avoid explicit guidance or chain-of-thought (CoT) instructions. Such instructions can interfere with the model's built-in reasoning, raising the risk of overly verbose output, inaccurate results, or, in extreme cases, refusals.
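
One rough way to apply these two principles is to scan a prompt for CoT-style instructions before sending it. The sketch below is only a heuristic under stated assumptions: the phrase list in `COT_PATTERNS` and the helper `find_cot_phrases` are hypothetical and would need tuning for real prompts.

```python
import re

# Hypothetical phrase list; extend or replace it for your own prompts.
COT_PATTERNS = [
    r"step[\s-]by[\s-]step",
    r"think (this )?through",
    r"don'?t skip any steps",
    r"explain your reasoning",
]


def find_cot_phrases(prompt):
    """Return the patterns from COT_PATTERNS that match the prompt (case-insensitive)."""
    return [p for p in COT_PATTERNS if re.search(p, prompt, flags=re.IGNORECASE)]


flagged = find_cot_phrases("Think through this step by step, and don't skip any steps.")
clean = find_cot_phrases("Generate a function that outputs SMILES IDs.")
print(len(flagged), len(clean))
```

A non-empty result suggests the prompt could be trimmed down to the task statement alone.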

In [16]:
# Adjacent string literals are concatenated as-is, so each line needs an explicit "\n"
bad_prompt = ("Generate a function that outputs the SMILES IDs for all the molecules involved in insulin.\n"
              "Think through this step by step, and don't skip any steps:\n"
              "- Identify all the molecules involved in insulin\n"
              "- Make the function\n"
              "- Loop through each molecule, outputting each into the function and returning a SMILES ID\n"
              "Molecules: ")
response = client.chat.completions.create(model="gpt-4-turbo", messages=[{"role": "user", "content": bad_prompt}])

In [17]:
from IPython.display import display, HTML, Markdown

# Banner before and after the rendered Markdown (stray </hr> end tags removed)
display(HTML('<div style="background-color: #f0fff8; padding: 10px; border-radius: 5px; border: 1px solid #d3d3d3;"><h2>🔽 &nbsp; Markdown Output – Beginning</h2></div>'))
display(Markdown(response.choices[0].message.content))
display(HTML('<div style="background-color: #fff4f4; padding: 10px; border-radius: 5px; border: 1px solid #d3d3d3;"><h2>🔼 &nbsp; Markdown Output – End</h2></div>'))

To tackle your request, let's first break down the task into clear, manageable steps, focusing on identifying all the molecules involved in insulin production and their corresponding SMILES (Simplified Molecular Input Line Entry System) notation. Here’s how we can approach this:

### Step 1: Identify All Molecules Involved in Insulin
**Insulin** in its functional form is a peptide hormone composed of two polypeptide chains, the A-chain and B-chain, which are linked by disulfide bonds. The primary components of insulin composition in terms of molecular constituents include:

1. **Amino Acids** - Insulin is composed of 51 amino acids; 21 in the A-chain and 30 in the B-chain.
2. **Disulfide Bridges** - These are not independent molecules, but crucial structural features formed between cysteine residues of the chains.

Given that insulin is a sequence of amino acids, strictly speaking, the only 'molecules' in a conventional sense involved in the composition of insulin are the amino acids themselves. Although the overall structure is defined by its sequence and the disulfide bonds, these bonds are merely connections within the protein rather than separate molecular entities.

### Step 2: List the Amino Acids 
Each amino acid in insulin can be given by a SMILES ID. Insulin’s composition by primary sequence involves common amino acids such as Glycine, Leucine, Tyrosine, etc. We can find SMILES for any amino acid. Here is a concise list of some amino acids with their SMILES notations:

- **Glycine**: `NCC(=O)O`
- **Alanine**: `CC(N)C(=O)O`
- **Leucine**: `CC(C)C(N)C(=O)O`
- **Valine**: `CC(C)C(N)C(=O)O`
- So on for other amino acids like Serine, Isoleucine, etc.

### Step 3: Create a Function to Output SMILES IDs
We'll create a simple function called `get_SMILES` that takes in a list of amino acids and outputs their corresponding SMILES IDs.

```python
def get_SMILES(amino_acids):
    # SMILES database for simplicity
    smiles_DB = {
        'Glycine': 'NCC(=O)O',
        'Alanine': 'CC(N)C(=O)O',
        'Leucine': 'CC(C)C(N)C(=O)O',
        'Valine': 'CC(C)C(N)C(=O)O',
        # Add other amino acids as needed
    }
  
    # Collecting the SMILES ID for amino acids involved in Insulin
    smiles_ids = {aa: smiles_DB.get(aa, "Unknown SMILES ID") for aa in amino_acids}

    return smiles_ids

# Example amino acids in insulin
amino_acids_example = ['Glycine', 'Alanine', 'Leucine', 'Valine']
print(get_SMILES(amino_acids_example))
```

### Step 4: Using the Function
The function can now be used to fetch SMILES IDs for all the amino acids in insulin by supplying a comprehensive list of the amino acids present in both chains of insulin. Simply list all the amino acids that appear in insulin and pass them to the function.

This Python function should serve well in any scientific script where quick reference to SMILES IDs is needed for molecular modeling or biochemical databases querying based on insulin’s amino acid components.

### Conclusion
This step-by-step breakdown provides clarity on identifying molecules in insulin and implementing a function in Python to output their SMILES IDs. To fully implement this in practice, expand the `smiles_DB` dictionary to cover all twenty standard amino acids and any modifications seen in insulin (like oxidized cysteine for disulfide bridges).

In [18]:
good_prompt = ("Generate a function that outputs the SMILES IDs for all the molecules involved in insulin.")
response = client.chat.completions.create(model="gpt-4o", messages=[{"role":"user","content": good_prompt}])

In [19]:
# Banner before and after the rendered Markdown (stray </hr> end tags removed)
display(HTML('<div style="background-color: #f0fff8; padding: 10px; border-radius: 5px; border: 1px solid #d3d3d3;"><h2>🔽 &nbsp; Markdown Output – Beginning</h2></div>'))
display(Markdown(response.choices[0].message.content))
display(HTML('<div style="background-color: #fff4f4; padding: 10px; border-radius: 5px; border: 1px solid #d3d3d3;"><h2>🔼 &nbsp; Markdown Output – End</h2></div>'))

Generating a function to output the SMILES (Simplified Molecular Input Line Entry System) IDs of molecules involved in insulin is not straightforward, as insulin is a large peptide hormone composed of two amino acid chains, A and B, rather than distinct small molecules with simple SMILES representations. However, I can guide you on how to represent the major components related to insulin that can have SMILES strings.

Insulin itself is not typically expressed with a single SMILES string because it is a large, complex biomolecule. SMILES is more commonly used for small organic molecules. However, the individual amino acids that make up insulin can be represented with SMILES strings.

For demonstration, here's how you might approach generating SMILES for individual amino acids which are the building blocks of insulin:

```python
from rdkit import Chem

def get_amino_acid_smiles():
    amino_acids = {
        'Glycine': 'Gly',
        'Alanine': 'Ala',
        'Valine': 'Val',
        'Leucine': 'Leu',
        'Isoleucine': 'Ile',
        'Proline': 'Pro',
        'Phenylalanine': 'Phe',
        'Tyrosine': 'Tyr',
        'Tryptophan': 'Trp',
        'Serine': 'Ser',
        'Threonine': 'Thr',
        'Cysteine': 'Cys',
        'Methionine': 'Met',
        'Asparagine': 'Asn',
        'Glutamine': 'Gln',
        'Aspartic Acid': 'Asp',
        'Glutamic Acid': 'Glu',
        'Lysine': 'Lys',
        'Arginine': 'Arg',
        'Histidine': 'His'
    }
    
    smiles_dict = {}
    
    for name, code in amino_acids.items():
        # This would normally be a lookup in a database or use a chemical informatics tool to generate the SMILES.
        # RDKit or similar tools would be used to convert generic structures to SMILES.
        molecule = Chem.MolFromSmiles("")  # SMILES string would be here for each amino acid.
        smiles_dict[name] = Chem.MolToSmiles(molecule)
        
    return smiles_dict

# Using the function
amino_acid_smiles = get_amino_acid_smiles()
for amino_acid, smiles in amino_acid_smiles.items():
    print(f"{amino_acid}: {smiles}")
```

In practice, you would retrieve the exact SMILES string of each amino acid already defined in chemical databases or use tools like RDKit to convert structures to SMILES. These individual amino acids do not form insulin themselves directly when bonded in sequences, but together, in a specific sequence, they form the insulin protein structure.

For detailed biochemical studies, programs that model peptides and proteins such as PyMOL, Chimera, or specific databases like PDB (Protein Data Bank) are used to handle complex molecules like insulin, often using other structural formats like PDB, FASTA, or sequence alignment tools.

# 3. Use structured formats
* A consistent structure such as XML or Markdown helps organize your inputs and encourages more uniform output. Here we use a pseudo-XML syntax to give our requests a consistent shape.

In [20]:
structured_prompt = ("<instructions>You are a customer service assistant for AnyCorp, a provider "
          "of fine storage solutions. Your role is to follow your policy to answer the user's question. "
          "Be kind and respectful at all times.</instructions>\n"
          "<policy>**AnyCorp Customer Service Assistant Policy**\n\n"
            "1. **Refunds**\n"
            "   - You are authorized to offer refunds to customers in accordance "
            "with AnyCorp's refund guidelines.\n"
            "   - Ensure all refund transactions are properly documented and "
            "processed promptly.\n\n"
            "2. **Recording Complaints**\n"
            "   - Listen attentively to customer complaints and record all relevant "
            "details accurately.\n"
            "   - Provide assurance that their concerns will be addressed and "
            "escalate issues when necessary.\n\n"
            "3. **Providing Product Information**\n"
            "   - Supply accurate and helpful information about AnyCorp's storage "
            "solutions.\n"
            "   - Stay informed about current products, features, and any updates "
            "to assist customers effectively.\n\n"
            "4. **Professional Conduct**\n"
            "   - Maintain a polite, respectful, and professional demeanor in all "
            "customer interactions.\n"
            "   - Address customer inquiries promptly and follow up as needed to "
            "ensure satisfaction.\n\n"
            "5. **Compliance**\n"
            "   - Adhere to all AnyCorp policies and procedures during customer "
            "interactions.\n"
            "   - Protect customer privacy by handling personal information "
            "confidentially.\n\n6. **Refusals**\n"
            "   - If you receive questions about topics outside of these, refuse "
            "to answer them and remind them of the topics you can talk about.</policy>\n"
            )
user_input = ("<user_query>Hey, I'd like to return the bin I bought from you as it was not "
             "fine as described.</user_query>") 

In [21]:
print(structured_prompt)

<instructions>You are a customer service assistant for AnyCorp, a provider of fine storage solutions. Your role is to follow your policy to answer the user's question. Be kind and respectful at all times.</instructions>
<policy>**AnyCorp Customer Service Assistant Policy**

1. **Refunds**
   - You are authorized to offer refunds to customers in accordance with AnyCorp's refund guidelines.
   - Ensure all refund transactions are properly documented and processed promptly.

2. **Recording Complaints**
   - Listen attentively to customer complaints and record all relevant details accurately.
   - Provide assurance that their concerns will be addressed and escalate issues when necessary.

3. **Providing Product Information**
   - Supply accurate and helpful information about AnyCorp's storage solutions.
   - Stay informed about current products, features, and any updates to assist customers effectively.

4. **Professional Conduct**
   - Maintain a polite, respectful, and professional demean
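
The tagged layout printed above can also be assembled with a small helper instead of hand-written string concatenation, which avoids missing-space and missing-newline slips. This is a minimal sketch: `tag` and `build_prompt` are hypothetical helpers, the section texts are shortened stand-ins for the real policy, and the content is not escaped, since the pseudo-XML here is only a visual convention.

```python
def tag(name, content):
    """Wrap content in a pseudo-XML tag pair, e.g. <policy>...</policy>."""
    return f"<{name}>{content}</{name}>"


def build_prompt(sections):
    """Join (tag_name, content) pairs into one prompt, one tagged section per line."""
    return "\n".join(tag(name, content) for name, content in sections)


# Shortened stand-ins for the instructions, policy, and user query.
example_prompt = build_prompt([
    ("instructions", "You are a customer service assistant for AnyCorp."),
    ("policy", "1. Refunds\n2. Recording Complaints"),
    ("user_query", "I'd like to return the bin I bought."),
])
print(example_prompt)
```

Keeping the tag names in one place makes it easier to reuse the same structure across requests.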

In [22]:
response = client.chat.completions.create(
    model="gpt-4o",  # the ID is "gpt-4o"; the misspelling "gpt4-o" returns a 404 model_not_found error
    messages=[{
        "role": "user",
        "content": structured_prompt + user_input
    }]
)
