In [11]:
import warnings
warnings.filterwarnings("ignore")

from helper import get_openai_api_key
openai_api_key = get_openai_api_key()

In [16]:
import json
from IPython.display import display, Markdown, HTML
from openai import OpenAI

client = OpenAI(api_key=openai_api_key)
GPT_MODEL = 'gpt-4o'
GPT_MODEL_MINI = 'gpt-40-mini'
o1_MODEL = 'o1-mini'

In [None]:
no_CoT_prompt = ("Generate a function that outputs the SMILES IDs for all the molecules involved in insulin.")
response = client.chat.completions.create(model=GPT_MODEL,messages=[{"role":"user","content": no_CoT_prompt}])

In [18]:
display(HTML('<div style="background-color: #f0fff8; padding: 10px; border-radius: 5px; border: 1px solid #d3d3d3;"></hr><h2>🔽 &nbsp; Markdown Output – Beginning</h2></hr></div>'))
display(Markdown(response.choices[0].message.content))
display(HTML('<div style="background-color: #fff4f4; padding: 10px; border-radius: 5px; border: 1px solid #d3d3d3;"></hr><h2>🔼 &nbsp; Markdown Output – End</h2></hr></div>'))

Creating a function that outputs the SMILES (Simplified Molecular Input Line Entry System) notation for all molecules involved in insulin is quite challenging due to the complexity of insulin. Insulin is a peptide hormone composed of two peptide chains (A chain and B chain) connected by disulfide bonds. Each of these chains consists of a sequence of amino acids. Representing entire proteins or complex peptides like insulin using SMILES is not typical, as SMILES is generally used for small molecules.

However, if you're looking to represent the basic building blocks or a simplified version of insulin, you would first need the SMILES for each amino acid involved. Below is a Python function that provides SMILES strings for common amino acids. You could extend this to map the ones found in insulin, albeit insulin is better represented in its entirety through sequence databases rather than as SMILES due to its complexity.

```python
def get_amino_acid_smiles(amino_acid):
    # Dictionary mapping single-letter amino acid codes to SMILES
    amino_acid_smiles = {
        'A': 'CC(C(=O)O)N',        # Alanine
        'R': 'C(C(C(=O)O)N)CN=C(N)N', # Arginine
        'N': 'CC(C(=O)O)NC(=O)C',  # Asparagine
        'D': 'CC(C(=O)O)C(=O)O',   # Aspartic acid
        'C': 'C(C(C(=O)O)N)S',     # Cysteine
        'Q': 'CCC(C(=O)O)NC(=O)C', # Glutamine
        'E': 'CCC(C(=O)O)C(=O)O',  # Glutamic acid
        'G': 'C(C(=O)O)N',         # Glycine
        'H': 'C1=C(NC=N1)C(C(=O)O)N', # Histidine
        'I': 'CC(C)C(C(=O)O)N',    # Isoleucine
        'L': 'C(C(C(=O)O)N)CC',    # Leucine
        'K': 'CCCC(C(=O)O)N',      # Lysine
        'M': 'CSCC(C(=O)O)N',      # Methionine
        'F': 'CC(C(=O)O)N',        # Phenyalanine
        'P': 'C1CNC1C(=O)O',       # Proline
        'S': 'C(C(C(=O)O)N)O',     # Serine
        'T': 'C(C(C(=O)O)N)O',     # Threonine
        'W': 'C1=CC=C2C(=C1)C(=CN2)C(=O)O', # Tryptophan
        'Y': 'C1=CC=C(C=C1)CC(C(=O)O)N', # Tyrosine
        'V': 'CC(C(=O)O)N',        # Valine
    }
    
    return amino_acid_smiles.get(amino_acid.upper(), "Unknown")

# Example usage:
amino_acid = 'A'  # Alanine, for example
print(get_amino_acid_smiles(amino_acid))
```

For specific insulin structure, you'd need to map its amino acid sequence using the above dictionary. However, insulin is better represented by peptide sequences using FASTA format or in databases like PDB or UniProt rather than trying to force it into SMILES.