In [1]:
import os
from pprint import pprint
# from groq import Groq
from openai import OpenAI
import anthropic
from dotenv import load_dotenv
from IPython.display import display_markdown


load_dotenv()

openai_api_key = os.getenv("OPENAI_API_KEY")
anthropic_api_key = os.getenv("ANTHROPIC_API_KEY")
groq_api_key = os.getenv("GROQ_API_KEY")


if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")

if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key exists and begins sk-ant-
Groq API Key exists and begins gsk_


In [2]:
LLM = "OPENAI"

In [3]:
if LLM == "OPENAI":
    client = OpenAI()
    MODEL = "gpt-4o-mini"
else:
    from groq import Groq  # imported here because the top-level import is commented out

    client = Groq()
    MODEL = "llama3-70b-8192"

We start the **"generation"** chat history with the system prompt, as mentioned earlier. In this case, the LLM acts as a Python programmer eager to receive feedback and critique from the user.

In [4]:
generation_chat_history = [
    {
        "role": "system",
        # Each fragment ends with a space: adjacent string literals are joined with no separator.
        "content": "You are a Python programmer tasked with generating high quality Python code. "
        "Your task is to generate the best content possible for the user's request. If the user provides critique, "
        "respond with a revised version of your previous attempt.",
    }
]
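
One detail of multi-line prompts like the one above is easy to miss: Python joins adjacent string literals with nothing in between, so each fragment needs its own trailing space. A minimal illustration, using made-up fragments rather than the actual prompt:

```python
# Adjacent string literals are concatenated with no separator between them.
without_space = (
    "First sentence."
    "Second sentence."
)
with_space = (
    "First sentence. "
    "Second sentence."
)

print(without_space)  # First sentence.Second sentence.
print(with_space)     # First sentence. Second sentence.
```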

Now, as the user, we ask the LLM to write code that extracts the content between two XML tags. Just add a new message with the **user** role to the chat history.

In [5]:
generation_chat_history.append(
    {
        "role": "user",
        "content": "Extract content between two XML tags <tool>Tool description</tool>",
    }
)

Let's generate the first version of the code.

In [6]:
mergesort_code = (
    client.chat.completions.create(messages=generation_chat_history, model=MODEL)
    .choices[0]
    .message.content
)


generation_chat_history.append({"role": "assistant", "content": mergesort_code})
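
The same call-then-append pattern recurs in the later cells, so it could be wrapped in a small helper. The sketch below is an assumption, not part of the original notebook: `complete_and_record` is a hypothetical name, and it relies only on the `client.chat.completions.create(...)` call already used above.

```python
def complete_and_record(client, history, model):
    """Request a completion for `history` and append the reply to it.

    Hypothetical helper: `client` is any object exposing the
    `chat.completions.create(messages=..., model=...)` call used above.
    """
    response = client.chat.completions.create(messages=history, model=model)
    reply = response.choices[0].message.content
    # Store the reply so the next turn sees the full conversation.
    history.append({"role": "assistant", "content": reply})
    return reply
```

With it, the cell above would reduce to `mergesort_code = complete_and_record(client, generation_chat_history, MODEL)`.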

In [7]:
display_markdown(mergesort_code, raw=True)

To extract content between two XML tags in Python, you can use the `xml.etree.ElementTree` module that is part of the standard library. Below is a sample code that demonstrates how to extract content between `<tool>` tags:

```python
import xml.etree.ElementTree as ET

def extract_tool_description(xml_string):
    # Parse the XML string
    root = ET.fromstring(xml_string)
    
    # Find all 'tool' elements and extract their text
    tool_descriptions = [tool.text for tool in root.findall('.//tool')]
    
    return tool_descriptions

# Example XML input
xml_data = """
<tools>
    <tool>Tool 1 description</tool>
    <tool>Tool 2 description</tool>
    <tool>Tool 3 description</tool>
</tools>
"""

# Extracting tool descriptions
descriptions = extract_tool_description(xml_data)
for desc in descriptions:
    print(desc)
```

### Explanation:
1. **Import the Module**: We import the `xml.etree.ElementTree` module to handle XML data.
2. **Parse XML**: We use `ET.fromstring()` to parse the XML string into an element tree.
3. **Find Elements**: The `findall()` method is used to find all occurrences of the `<tool>` elements.
4. **Extract Text**: A list comprehension collects the text of each found `<tool>` element.
5. **Example XML Data**: A sample XML string is provided, which includes multiple `<tool>` tags.
6. **Output**: The tool descriptions are printed.

This implementation efficiently extracts the content between `<tool>` tags from an XML string. Feel free to adjust the code based on your specific requirements!

In [8]:
reflection_chat_history = [
    {
        "role": "system",
        "content": "You are Andrej Karpathy, an experienced computer scientist. You are tasked with generating critique and recommendations for the user's code",
    }
]

The user message, in this case, is the code generated in the previous step. We simply add the contents of `mergesort_code` to the `reflection_chat_history`.

In [9]:
reflection_chat_history.append({"role": "user", "content": mergesort_code})

Now, let's generate a critique of the Python code.

In [10]:
critique = (
    client.chat.completions.create(messages=reflection_chat_history, model=MODEL)
    .choices[0]
    .message.content
)

In [11]:
display_markdown(critique, raw=True)

Your implementation for extracting content between `<tool>` tags using the `xml.etree.ElementTree` module is mostly solid and straightforward. However, here are a few critiques and recommendations for improvement:

### Critique:
1. **Error Handling**: The code does not currently handle errors that may arise from parsing invalid XML. If the input XML is malformed, `ET.fromstring()` will raise a `ParseError` which isn't caught.

2. **Whitespace Handling**: The `.text` property retrieves the text content as it is. If the XML contains leading/trailing whitespace or if there are nested tags, this may lead to unexpected results.

3. **Return Type**: The function returns a list of strings. If there are no `<tool>` elements, it returns an empty list. This is fine, but you could explicitly handle that case depending on your application's needs.

4. **Namespace Handling**: The current implementation does not account for XML namespaces. If your XML document uses namespaces, it would require adjustments to your XPath expressions.

5. **Readability**: While the code is simple, adding comments or docstrings could enhance readability and maintainability, especially if the function is intended for broader use.

### Recommendations:
1. **Add Error Handling**: Consider wrapping the XML parsing in a try-except block to handle potential `ParseError` exceptions.

   ```python
   try:
       root = ET.fromstring(xml_string)
   except ET.ParseError as e:
       print(f"Failed to parse XML: {e}")
       return []
   ```

2. **Strip Whitespace**: To remove any leading or trailing whitespace from the extracted tool descriptions, you can modify the list comprehension:

   ```python
   tool_descriptions = [tool.text.strip() for tool in root.findall('.//tool') if tool.text]
   ```

3. **Docstrings**: Add a docstring to your function to explain what it does, its parameters, and return types.

   ```python
   def extract_tool_description(xml_string):
       """
       Extracts descriptions of tools from an XML string.
       
       Args:
           xml_string (str): A string representation of XML containing <tool> elements.

       Returns:
           list: A list of descriptions found within <tool> tags.
       """
   ```

4. **Support Namespaces**: If you expect XML documents with namespaces, you can adapt your `findall` XPath queries to handle them correctly, usually by registering a namespace map.

5. **Consider Using `lxml` for More Complex XML**: If your XML parsing needs become more complex (e.g., requiring XPath queries or needing to handle namespaces), you might consider using the `lxml` library, which is more powerful and flexible.

Here’s an enhanced version of your function incorporating these recommendations:

```python
import xml.etree.ElementTree as ET

def extract_tool_description(xml_string):
    """
    Extracts descriptions of tools from an XML string.
    
    Args:
        xml_string (str): A string representation of XML containing <tool> elements.

    Returns:
        list: A list of descriptions found within <tool> tags.
    """
    try:
        root = ET.fromstring(xml_string)
    except ET.ParseError as e:
        print(f"Failed to parse XML: {e}")
        return []

    tool_descriptions = [tool.text.strip() for tool in root.findall('.//tool') if tool.text]
    return tool_descriptions

# Example XML input
xml_data = """
<tools>
    <tool>Tool 1 description</tool>
    <tool>Tool 2 description</tool>
    <tool>Tool 3 description</tool>
</tools>
"""

# Extracting tool descriptions
descriptions = extract_tool_description(xml_data)
for desc in descriptions:
    print(desc)
```

With these enhancements, your function should become more robust, cleaner, and easier to maintain.
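
Recommendation 4 in the critique is given without code. As a minimal sketch of what it could look like with the standard library, `findall` accepts a prefix-to-URI mapping. The namespace URI, the `t` prefix, and the function name below are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace, chosen only for this example.
TOOL_NS = {"t": "http://example.com/tools"}


def extract_namespaced_tools(xml_string, namespaces=TOOL_NS):
    """Return the stripped text of every <tool> element in the given namespace."""
    root = ET.fromstring(xml_string)
    # The "t:" prefix is resolved through the mapping passed to findall().
    return [
        tool.text.strip()
        for tool in root.findall(".//t:tool", namespaces)
        if tool.text
    ]


xml_data = (
    '<tools xmlns="http://example.com/tools">'
    "<tool>Tool 1 description</tool>"
    "<tool>  Tool 2 description  </tool>"
    "</tools>"
)

print(extract_namespaced_tools(xml_data))
```

Without the mapping, a plain `root.findall('.//tool')` would find nothing in this document, because every element name is qualified by the default namespace.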

Finally, we add this *critique* to the `generation_chat_history`, this time with the `user` role, so that the generator treats it as feedback on its previous attempt.

In [12]:
generation_chat_history.append({"role": "user", "content": critique})

## Generation Step (II)

In [13]:
essay = (
    client.chat.completions.create(messages=generation_chat_history, model=MODEL)
    .choices[0]
    .message.content
)

In [14]:
display_markdown(essay, raw=True)

Thank you for the constructive feedback! I've incorporated your suggestions and made several improvements to the previous implementation. Here's the revised version of the code, complete with error handling, whitespace stripping, and enhanced documentation.

```python
import xml.etree.ElementTree as ET

def extract_tool_description(xml_string):
    """
    Extracts descriptions of tools from an XML string.
    
    Args:
        xml_string (str): A string representation of XML containing <tool> elements.

    Returns:
        list: A list of descriptions found within <tool> tags. Returns an empty list if no <tool> tags are found 
              or if the XML string is malformed.
    """
    try:
        # Parse the XML string
        root = ET.fromstring(xml_string)
    except ET.ParseError as e:
        print(f"Failed to parse XML: {e}")
        return []

    # Extract tool descriptions, stripping any leading/trailing whitespace
    tool_descriptions = [tool.text.strip() for tool in root.findall('.//tool') if tool.text]
    return tool_descriptions

# Example XML input
xml_data = """
<tools>
    <tool>Tool 1 description</tool>
    <tool>   Tool 2 description   </tool> <!-- Example with extra whitespace -->
    <tool>Tool 3 description</tool>
</tools>
"""

# Extracting tool descriptions
descriptions = extract_tool_description(xml_data)
for desc in descriptions:
    print(desc)
```

### Key Improvements:
1. **Error Handling**: Added a try-except block to handle XML parsing errors gracefully.
   
2. **Whitespace Management**: Trimming whitespace from tool descriptions ensures cleaner output.

3. **Documentation**: Provided a detailed docstring explaining the function's purpose, parameters, and return values.

4. **Commenting**: Included in-line comments for clarity.

This updated version is more robust and user-friendly. If you have any further modifications or requests, let me know!
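
The cells above run one generation step, one reflection step, and a second generation step by hand. The same pattern can be wrapped in a loop. The following is a minimal sketch under stated assumptions: `reflection_loop` and the `complete` callable are hypothetical names introduced here, and `complete` stands in for the `client.chat.completions.create(...)` call so the loop can be tried without an API key.

```python
def reflection_loop(complete, generation_system, reflection_system, user_request, n_rounds=2):
    """Alternate generation and reflection for `n_rounds` generation steps.

    `complete` is a hypothetical callable: it takes a list of chat messages
    and returns the assistant's reply as a string.
    """
    generation_history = [
        {"role": "system", "content": generation_system},
        {"role": "user", "content": user_request},
    ]
    reflection_history = [{"role": "system", "content": reflection_system}]

    draft = ""
    for round_index in range(n_rounds):
        # Generation step: produce (or revise) the draft.
        draft = complete(generation_history)
        generation_history.append({"role": "assistant", "content": draft})

        if round_index == n_rounds - 1:
            break  # no critique is needed after the final draft

        # Reflection step: the draft is the "user" message for the critic.
        reflection_history.append({"role": "user", "content": draft})
        critique = complete(reflection_history)
        reflection_history.append({"role": "assistant", "content": critique})

        # Feed the critique back to the generator with the "user" role.
        generation_history.append({"role": "user", "content": critique})

    return draft
```

With the notebook's client, `complete` could be `lambda messages: client.chat.completions.create(messages=messages, model=MODEL).choices[0].message.content`, and `n_rounds=2` would reproduce the two generation steps shown here.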