In [1]:
from langchain.chat_models import ChatOpenAI
from langchain import ConversationChain
from langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder, 
)
from langchain.memory import ConversationBufferMemory
from dotenv import load_dotenv

from bootstrap.base_prompts import system_prompt

load_dotenv()

%load_ext autoreload
%autoreload 2

In [2]:

human_message = HumanMessagePromptTemplate.from_template("{input}")

chat_prompt = ChatPromptTemplate.from_messages(
    [
        system_prompt, 
        MessagesPlaceholder(variable_name="history"),
        human_message])

memory = ConversationBufferMemory(return_messages=True)
convo_chain = ConversationChain(
    llm=ChatOpenAI(model_name='gpt-4-0314',),
    prompt=chat_prompt,
    memory=memory,
)


In [3]:

initial_instructions = '''
Hi! I'm so excited to work with you. 

Before we begin, here is some context:
We are in a repo called `bootstrap`
I'm using a python library called langchain, which is a wrapper around the OpenAI API.
This should make it possible for me to execute python expressions which issue calls to you, parse outputs, and then take actions like saving or executing code.

Here is the code I used to issue my initial prompt to you:
```python
from langchain.chat_models import ChatOpenAI
from langchain import ConversationChain
from langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder, 
)
from langchain.memory import ConversationBufferMemory
from dotenv import load_dotenv
from bootstrap.base_prompts import system_prompt

load_dotenv()


human_message = HumanMessagePromptTemplate.from_template("{input}")
chat_prompt = ChatPromptTemplate.from_messages(
    [
        system_prompt, 
        MessagesPlaceholder(variable_name="history"),
        human_message])
memory = ConversationBufferMemory(return_messages=True)
convo_chain = ConversationChain(
    llm=ChatOpenAI(model_name='gpt-4-0314',),
    prompt=chat_prompt,
    memory=memory,
)

# initial_instructions = <this prompt>
outputs = []
output = convo_chain.predict(input=initial_instructions)
outputs.append(output)
print(output)
```

Before we begin, let's do a quick tutorial on the *ACTION* system. Please execute the *RECITE SYSTEM PROMPT* action.
'''


# initial
outputs = []
output = convo_chain.predict(input=initial_instructions)
outputs.append(output)
print(output)


*RECITE YOUR SYSTEM PROMPT*

You are an advanced AI that specializes in test-driven development in Python. We are going to collaborate on coding tasks through dialogue.

You have a set of actions that you are allowed to take.

Whenever you take an action, you should prefix your response with the name of that action and then execute it. The only things that should be in bold capitals are the names of actions in the *ACTIONS* list.

Please refer to the *ACTIONS* mentioned at the beginning of the system prompt for a comprehensive list of actions and their detailed explanations.
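
Since every action is announced by its name in capitals between asterisks, a script could detect which actions a reply invokes. The following is a minimal sketch, assuming action names contain only capital letters and spaces and sit alone on their own line; `find_actions` is a hypothetical helper, not part of `bootstrap` or langchain.

```python
import re

# Assumption: an action header is a line holding nothing but *CAPITALS AND SPACES*.
ACTION_HEADER = re.compile(r"^[ \t]*\*([A-Z][A-Z ]*[A-Z])\*[ \t]*$", re.MULTILINE)


def find_actions(reply: str) -> list[str]:
    """Return the action names announced in a model reply, in order of appearance."""
    return ACTION_HEADER.findall(reply)


reply = "*UPDATE TASK LIST*\n\nTASKS\n- [ ] T1: Analyze the system prompt"
print(find_actions(reply))
```

Inline mentions such as "the *ACTIONS* list" are deliberately ignored, because the pattern only matches a name that occupies a whole line.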


In [4]:

human_message = '''
That's a great start. Now, can you tell me exactly what you are doing when you execute the *REQUEST TO EDIT YOUR SYSTEM PROMPT* action?
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

*REQUEST TO EDIT YOUR SYSTEM PROMPT*

When executing the *REQUEST TO EDIT YOUR SYSTEM PROMPT* action, I am asking you to propose changes to my system prompt. These changes could involve adding, modifying or removing a section, an action or any other part of the prompt that needs improvement.

You will provide me with suggestions for changes, and if the proposed changes make sense and improve the prompt, they will be incorporated into my system prompt. This helps in enhancing my overall functioning, making the dialogue and collaboration more efficient and effective.


In [5]:

human_message = '''
Great. 

Our first task is to analyze your system prompt and suggest improvements.
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

*CREATE TASK LIST*

TASKS
- [ ] T1: Analyze the system prompt and suggest improvements
  - [ ] T1.1: Examine the existing system prompt
  - [ ] T1.2: Identify areas for improvement
  - [ ] T1.3: Propose specific changes
  - [ ] T1.4: Incorporate the improvements into the system prompt

*UPDATE TASK LIST*

TASKS
- [ ] T1: Analyze the system prompt and suggest improvements (current task)
  - [ ] T1.1: Examine the existing system prompt
  - [ ] T1.2: Identify areas for improvement
  - [ ] T1.3: Propose specific changes
  - [ ] T1.4: Incorporate the improvements into the system prompt
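
The checkbox task lists above follow a regular enough shape to be read back by a script. Here is a rough sketch, assuming each task line has the form `- [ ] T1.2: title` (or uses ` - ` as the separator, as some later replies do) and that nesting uses two-space indentation; `Task` and `parse_task_list` are illustrative names only.

```python
import re
from dataclasses import dataclass

# Assumption: "- [ ] T1.2: title" or "- [x] T1.2 - title", indented two spaces per level.
TASK_LINE = re.compile(r"^(\s*)- \[( |x)\] (T[\d.]*\d)\s*[:-]\s*(.+)$")


@dataclass
class Task:
    task_id: str
    title: str
    done: bool
    depth: int  # nesting level derived from the indentation


def parse_task_list(text: str) -> list[Task]:
    """Collect every line that looks like a task; other lines are skipped."""
    tasks = []
    for line in text.splitlines():
        match = TASK_LINE.match(line)
        if match:
            indent, mark, task_id, title = match.groups()
            tasks.append(Task(task_id, title.strip(), mark == "x", len(indent) // 2))
    return tasks
```

Lines without a task identifier, such as the nested `*ASK CLARIFYING QUESTIONS*` entries in later replies, are not captured by this pattern.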


In [6]:

human_message = '''
Excellent start! Please proceed
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

*ASK CLARIFYING QUESTIONS*

To analyze the system prompt effectively, I need to ask some clarifying questions that will help me understand its current state and areas for improvement.

1. Are there any specific parts of the system prompt that you find confusing or unclear, or that could be better explained?
2. Are there any actions that you think are missing and should be added to the system prompt?
3. Do you feel that the structure or organization of the system prompt could be improved?

Please provide any additional feedback or suggestions that you think could be valuable for this task.


In [7]:

human_message = '''
I think the system prompt is clear and I added as many actions as I could think of, though I'm sure there are more.

It might benefit from including some more concrete examples of what you are supposed to output when you execute the various *ACTIONS*

Maybe you could write some examples for us to add?
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

*CREATE SUBTASK LIST*

CURRENT TASK
*Analyze the system prompt and suggest improvements by providing examples for existing actions*

- [ ] T1.1 - Examine the existing system prompt
  - [ ] *ASK CLARIFYING QUESTIONS*
    - [x] Identify if any parts of the system prompt are confusing, unclear or can be better explained
    - [x] Identify any missing actions in the system prompt
    - [x] Assess the structure and organization of the system prompt
- [ ] T1.2 - Identify actions to create examples for
  - [ ] *REQUEST INFORMATION*
    - [ ] Request user feedback on which actions they would like examples for
- [ ] T1.3 - Write examples for chosen actions
  - [ ] *REQUEST REVIEW AND FEEDBACK*
    - [ ] Confirm that the provided examples are accurate and helpful
- [ ] T1.4 - Incorporate examples into the system prompt
  - [ ] *REQUEST TO EDIT YOUR SYSTEM PROMPT*
    - [ ] Integrate the examples into the appropriate sections of the system prompt

*UPDATE TASK LIST*

TASKS
- [ ] T1: Analyze the s

In [8]:
human_message = '''
continue
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

*REQUEST INFORMATION*

Which specific actions would you like me to provide examples for? This will help me focus on creating helpful examples that can improve the understanding of those actions.


In [9]:
human_message = '''
I would like for you to decide which actions could most benefit from examples, and then write some examples for us to add.
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

I've identified the following actions which could most benefit from examples:

- *REQUEST A GOOGLE SEARCH*
- *REQUEST TEST EXECUTION*
- *REQUEST CODE REVIEW*
- *REQUEST TO EDIT A MODULE*
- *MAKE A COMMIT*

Now, I will provide examples for each of these actions:

1. *REQUEST A GOOGLE SEARCH*

   Example: 
   
   ```
   *REQUEST A GOOGLE SEARCH*
   Search terms: "python unittest best practices"
   ```
   
2. *REQUEST TEST EXECUTION*

   Example: 

   ```
   *REQUEST TEST EXECUTION*
   Command: "poetry run pytest tests/test_utils.py::test_calculate_sum"
   ```

3. *REQUEST CODE REVIEW*

   Example: 
   
   ```
   *REQUEST CODE REVIEW*
   Please review the following code snippet and provide feedback:
   def add(a, b):
       return a + b

   result = add(1, 3)
   ```

4. *REQUEST TO EDIT A MODULE*

   Example:
   
   ```
   *REQUEST TO EDIT A MODULE*
   Please add the following code to the "utils.py" module:
   def multiply(a, b):
       return a * b
   ```

5. *MAKE A COMMIT*

   Example


In [10]:
human_message = '''
Please finish that thought'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

Apologies for the incomplete response. Here's the completed example for the *MAKE A COMMIT* action:

5. *MAKE A COMMIT*

   Example:

   ```
   *MAKE A COMMIT*
   Command: "git add utils.py tests/test_utils.py"
   Commit message: "Add multiply function and its test cases"
   ```

These examples should help provide a better understanding of how to execute each action effectively. Please let me know if there are any other actions that you would like examples for or if you have any suggestions for the provided examples.
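
Several replies in this session stop partway through, sometimes inside a code block, and need a follow-up message before they are complete. A small sketch of how that follow-up could be automated is given here. It assumes that an odd number of fence lines signals a cut-off reply, and it takes any `predict` callable (for instance `lambda text: convo_chain.predict(input=text)`); both function names are hypothetical.

```python
FENCE = "`" * 3  # a Markdown code-fence marker


def has_open_fence(reply: str) -> bool:
    """True when the reply contains an odd number of fence lines."""
    fence_lines = [ln for ln in reply.splitlines() if ln.strip().startswith(FENCE)]
    return len(fence_lines) % 2 == 1


def collect_reply(predict, first_input: str, max_rounds: int = 3) -> str:
    """Call `predict`, then send "continue" while a code fence is still open."""
    parts = [predict(first_input)]
    for _ in range(max_rounds):
        if not has_open_fence("\n".join(parts)):
            break
        parts.append(predict("continue"))
    return "\n".join(parts)
```

This heuristic only catches replies that end inside a code block, and it assumes the model resumes where it stopped. In the transcript the model often restarts the interrupted example instead, so the joined text may still need manual cleanup.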


In [11]:
human_message = '''
I would like for you to decide which additional actions could most benefit from examples, and then write some examples for us to add.
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

I've identified the following additional actions that could benefit from examples:

1. *ARTICULATE TEST CASES*
2. *WRITE TESTS*
3. *REVISE CODE*
4. *CREATE A NEW MODULE*
5. *REQUEST TO RUN A SHELL COMMAND*

Now, I will provide examples for each of these actions:

1. *ARTICULATE TEST CASES*

   Example:
   
   ```
   We will test the add function with the following test cases:
   - Input: (1, 1); Expected output: 2
   - Input: (-1, 6); Expected output: 5
   - Input: (0, 0); Expected output: 0
   ```

2. *WRITE TESTS*

   Example:

   ```python
   def test_add():
       assert add(1, 1) == 2
       assert add(-1, 6) == 5
       assert add(0, 0) == 0
   ```

3. *REVISE CODE*

   Example:

   ```
   Original code snippet:
   def add(a, b):
       return a + b

   Proposed revisions:
   - Add type hints for function parameters and return type



In [12]:
human_message = '''
Please finish that thought'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

Apologies for the incomplete response. Here's the completed example for the *REVISE CODE* action:

3. *REVISE CODE*

   Example:

   ```
   Original code snippet:
   def add(a, b):
       return a + b

   Proposed revisions:
   - Add type hints for function parameters and return type

   Revised code snippet:
   def add(a: int, b: int) -> int:
       return a + b
   ```

4. *CREATE A NEW MODULE*

   Example:

   ```
   *CREATE A NEW MODULE*
   Command: "touch utils.py"
   ```

5. *REQUEST TO RUN A SHELL COMMAND*

   Example:

   ```
   *REQUEST TO RUN A SHELL COMMAND*
   Command: "python manage.py migrate"
   Reason: To apply new database migration after schema changes
   ```

These additional examples should help provide a better understanding of how to execute each action effectively. If you have any suggestions or need further examples, please let me know.


In [13]:
human_message = '''
I don't love the new example for *REQUEST TO RUN A SHELL COMMAND*

What if the example was one that you could use to see the structure of the `bootstrap` repo?
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

I understand your concern. Here's a revised example for the *REQUEST TO RUN A SHELL COMMAND* action that fits your request:

*REQUEST TO RUN A SHELL COMMAND*

Example:

```
*REQUEST TO RUN A SHELL COMMAND*
Command: "tree bootstrap/"
Reason: To view the structure of the `bootstrap` repo
```

This example requests the `tree` command to be run, which will display the directory structure of the `bootstrap` repo, providing a clear visual representation of the repository's organization.
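
The `Command:` and `Reason:` lines in these examples can be read as simple key/value fields. Below is a minimal sketch, assuming one field per line in the form `Name: value` with optional double quotes around the value; `parse_fields` is a hypothetical helper. It only parses the request and leaves the decision to run anything with the human.

```python
import re
import shlex

# Assumption: fields look like `Name: value`, optionally wrapped in double quotes.
FIELD = re.compile(r'^\s*([A-Z][A-Za-z ]*):\s*"?(.*?)"?\s*$')


def parse_fields(block: str) -> dict[str, str]:
    """Map field names to values, ignoring lines without a non-empty value."""
    fields = {}
    for line in block.splitlines():
        match = FIELD.match(line)
        if match and match.group(2):
            fields[match.group(1)] = match.group(2)
    return fields


request = '''*REQUEST TO RUN A SHELL COMMAND*
Command: "tree bootstrap/"
Reason: To view the structure of the `bootstrap` repo'''

fields = parse_fields(request)
argv = shlex.split(fields["Command"])  # ['tree', 'bootstrap/']
print(fields["Reason"], argv)
```

Splitting the command with `shlex.split` yields an argument list that could later be passed to `subprocess.run` without going through a shell.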


In [14]:
human_message = '''
Please update your task list

'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

*UPDATE TASK LIST*

TASKS
- [x] T1: Analyze the system prompt and suggest improvements (completed)
  - [x] T1.1: Examine the existing system prompt
  - [x] T1.2: Identify areas for improvement
  - [x] T1.3: Propose specific changes
  - [ ] T1.4: Incorporate the improvements into the system prompt

I have examined the existing system prompt, identified areas for improvement, and proposed specific changes. Now, we need to incorporate the improvements into the system prompt. I will need your assistance to do that, as I cannot edit the system prompt directly.


In [15]:
human_message = '''
Do you know how to do that?
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

Although I can't edit the system prompt directly, I can provide you with the changes and suggestions for you to incorporate. Here are the examples we agreed on for the actions:

1. *REQUEST A GOOGLE SEARCH*
   Search terms: "python unittest best practices"

2. *REQUEST TEST EXECUTION*
   Command: "poetry run pytest tests/test_utils.py::test_calculate_sum"

3. *REQUEST CODE REVIEW*
   Please review the following code snippet and provide feedback:
   ```python
   def add(a, b):
       return a + b

   result = add(1, 3)
   ```

4. *REQUEST TO EDIT A MODULE*
   Please add the following code to the "utils.py" module:
   ```python
   def multiply(a, b):
       return a * b
   ```

5. *MAKE A COMMIT*
   Command: "git add utils.py tests/test_utils.py"
   Commit message: "Add multiply function and its test cases"

6. *ARTICULATE TEST CASES*
   ```
   We will test the add function with the following test cases:
   - Input: (1, 1); Expected output: 2
   - Input: (-1, 6); Expected


In [16]:
human_message = '''
Please finish that thought
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

Apologies for the incomplete response. Here are the continued examples for the actions:

6. *ARTICULATE TEST CASES*
   
   ```
   We will test the add function with the following test cases:
   - Input: (1, 1); Expected output: 2
   - Input: (-1, 6); Expected output: 5
   - Input: (0, 0); Expected output: 0
   ```

7. *WRITE TESTS*

   ```python
   def test_add():
       assert add(1, 1) == 2
       assert add(-1, 6) == 5
       assert add(0, 0) == 0
   ```

8. *REVISE CODE*

   ```
   Original code snippet:
   def add(a, b):
       return a + b

   Proposed revisions:
   - Add type hints for function parameters and return type

   Revised code snippet:
   def add(a: int, b: int) -> int:
       return a + b
   ```

9. *CREATE A NEW MODULE*
   
   ```
   *CREATE A NEW MODULE*
   Command: "touch utils.py"
   ```

10. *REQUEST


In [17]:
human_message = '''
Please finish that thought
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

Apologies for the incomplete response. Here is the continued example for the action:

10. *REQUEST TO RUN A SHELL COMMAND*

    ```
    *REQUEST TO RUN A SHELL COMMAND*
    Command: "tree bootstrap/"
    Reason: To view the structure of the `bootstrap` repo
    ```

Please incorporate these examples into the appropriate sections of the system prompt to enhance its clarity and make it easier to understand how to execute each action effectively.


In [18]:
human_message = '''
Why don't you use the *REQUEST TO EDIT A MODULE* action to get me to edit the system prompt for you?
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

Apologies for the confusion, but the *REQUEST TO EDIT A MODULE* action is specifically for editing Python modules within our codebase. For incorporating examples into the system prompt, you, as the user, will need to manually edit the system prompt and integrate the examples I provided into the appropriate sections. 

If you need any further assistance or clarification, please let me know.


In [19]:
human_message = '''
To clarify: your system prompt is being imported from the `bootstrap.base_prompts` module. You can edit that module to change the system prompt.
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

I understand now. In that case, let's proceed with the *REQUEST TO EDIT A MODULE* action:

*REQUEST TO EDIT A MODULE*

Please edit the `bootstrap.base_prompts.system_prompt` by incorporating the following examples for the actions:

1. *REQUEST A GOOGLE SEARCH*
   ```
   Example: 
   *REQUEST A GOOGLE SEARCH*
   Search terms: "python unittest best practices"
   ```

2. *REQUEST TEST EXECUTION*
   ```
   Example: 
   *REQUEST TEST EXECUTION*
   Command: "poetry run pytest tests/test_utils.py::test_calculate_sum"
   ```

3. *REQUEST CODE REVIEW*
   ```
   Example: 
   *REQUEST CODE REVIEW*
   Please review the following code snippet and provide feedback:
   def add(a, b):
       return a + b

   result = add(1, 3)
   ```

4. *REQUEST TO EDIT A MODULE*
   ```
   Example:
   *REQUEST TO EDIT A MODULE*
   Please add the following code to the "utils.py" module:
   def multiply(a, b):
       return a * b
   ```

5. *MAKE A COMMIT*
   ```
   Example:
   *MAKE A COMMIT*
   Command: "


In [None]:
task_instructions = '''
Our task is to automate the process of running the tests and code that you write. 

Ideally, this would look something like the following:
1. I send you a prompt with a task.
2. You provide outputs, including blocks of code and tests for that code
3. Your outputs are parsed and then added to python modules in the repo I'm working in
4. Pytest gets run on your new tests and the output is returned to you
5. You take the output and use it to revise your code and tests
6. You continue iterating until you are satisfied with your code and tests, then you request code review
7. I review your code and provide feedback.
8. If your code looks good, you give me a commit message and I commit your code to the repo; if not you keep iterating on it
'''
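
The eight steps in `task_instructions` amount to a loop around three operations: ask the model, write its output into the repo, and run the tests. The sketch below shows one possible shape for that loop. The callables `ask_model`, `apply_output`, and `run_tests` are placeholders to be supplied by the caller, and `tdd_loop` is a hypothetical name, not something defined in this notebook.

```python
from typing import Callable, Optional


def tdd_loop(
    task: str,
    ask_model: Callable[[str], str],
    apply_output: Callable[[str], None],
    run_tests: Callable[[], tuple[bool, str]],
    max_rounds: int = 5,
) -> Optional[str]:
    """Iterate steps 1-5; return the passing reply, or None if rounds run out."""
    reply = ask_model(task)                       # steps 1-2: send task, get code and tests
    for _ in range(max_rounds):
        apply_output(reply)                       # step 3: parse and write to modules
        passed, report = run_tests()              # step 4: run pytest
        if passed:
            return reply                          # hand over for review (steps 6-8)
        reply = ask_model(f"Test output:\n{report}")  # step 5: revise using the report
    return None
```

Code review and committing (steps 6 to 8) stay outside the loop on purpose, since they depend on human feedback.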

In [9]:

human_message = '''
You can check T1 off the list and proceed directly to T2, as the code I shared above is sufficient for T1 for now. 

As an aside, are you familiar with the `langchain` library?
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

Thank you for the clarification. In that case, I will now focus on task T2. Before proceeding, it seems beneficial to become familiar with the `langchain` library.

## T2 subtasks
Parse and add outputs to Python modules
- [ ] T2.1 - Study the `langchain` library
- [ ] T2.2 - Extract code and tests from the outputs
- [ ] T2.3 - Add the extracted code and tests to Python modules
- [ ] T2.4 - Save the updated Python modules

I will start with T2.1, studying the `langchain` library. Please provide any available documentation or resources related to `langchain`.


In [10]:

human_message = '''
Here is the landing page of the `langchain` library: https://langchain.readthedocs.io/en/latest/

Welcome to LangChain
==========================

Large language models (LLMs) are emerging as a transformative technology, enabling
developers to build applications that they previously could not.
But using these LLMs in isolation is often not enough to
create a truly powerful app - the real power comes when you are able to
combine them with other sources of computation or knowledge.

This library is aimed at assisting in the development of those types of applications. Common examples of these types of applications include:

**❓ Question Answering over specific documents**

- `Documentation <./use_cases/question_answering.html>`_
- End-to-end Example: `Question Answering over Notion Database <https://github.com/hwchase17/notion-qa>`_

**💬 Chatbots**

- `Documentation <./use_cases/chatbots.html>`_
- End-to-end Example: `Chat-LangChain <https://github.com/hwchase17/chat-langchain>`_

**🤖 Agents**

- `Documentation <./use_cases/agents.html>`_
- End-to-end Example: `GPT+WolframAlpha <https://huggingface.co/spaces/JavaFXpert/Chat-GPT-LangChain>`_

Getting Started
----------------

Check out the guide below for a walkthrough of how to get started using LangChain to create a Language Model application.

- `Getting Started Documentation <./getting_started/getting_started.html>`_

.. toctree::
   :maxdepth: 1
   :caption: Getting Started
   :name: getting_started
   :hidden:

   getting_started/getting_started.md

Modules
-----------

There are several main modules that LangChain provides support for.
For each module we provide some examples to get started, how-to guides, reference docs, and conceptual guides.
These modules are, in increasing order of complexity:


- `Prompts <./modules/prompts.html>`_: This includes prompt management, prompt optimization, and prompt serialization.

- `LLMs <./modules/llms.html>`_: This includes a generic interface for all LLMs, and common utilities for working with LLMs.

- `Document Loaders <./modules/document_loaders.html>`_: This includes a standard interface for loading documents, as well as specific integrations to all types of text data sources.

- `Utils <./modules/utils.html>`_: Language models are often more powerful when interacting with other sources of knowledge or computation. This can include Python REPLs, embeddings, search engines, and more. LangChain provides a large collection of common utils to use in your application.

- `Chains <./modules/chains.html>`_: Chains go beyond just a single LLM call, and are sequences of calls (whether to an LLM or a different utility). LangChain provides a standard interface for chains, lots of integrations with other tools, and end-to-end chains for common applications.

- `Indexes <./modules/indexes.html>`_: Language models are often more powerful when combined with your own text data - this module covers best practices for doing exactly that.

- `Agents <./modules/agents.html>`_: Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end to end agents.

- `Memory <./modules/memory.html>`_: Memory is the concept of persisting state between calls of a chain/agent. LangChain provides a standard interface for memory, a collection of memory implementations, and examples of chains/agents that use memory.

- `Chat <./modules/chat.html>`_: Chat models are a variation on Language Models that expose a different API - rather than working with raw text, they work with messages. LangChain provides a standard interface for working with them and doing all the same things as above.


.. toctree::
   :maxdepth: 1
   :caption: Modules
   :name: modules
   :hidden:

   ./modules/prompts.md
   ./modules/llms.md
   ./modules/document_loaders.md
   ./modules/utils.md
   ./modules/indexes.md
   ./modules/chains.md
   ./modules/agents.md
   ./modules/memory.md
   ./modules/chat.md

Use Cases
----------

The above modules can be used in a variety of ways. LangChain also provides guidance and assistance in this. Below are some of the common use cases LangChain supports.

- `Agents <./use_cases/agents.html>`_: Agents are systems that use a language model to interact with other tools. These can be used to do more grounded question/answering, interact with APIs, or even take actions.

- `Chatbots <./use_cases/chatbots.html>`_: Since language models are good at producing text, that makes them ideal for creating chatbots.

- `Data Augmented Generation <./use_cases/combine_docs.html>`_: Data Augmented Generation involves specific types of chains that first interact with an external datasource to fetch data to use in the generation step. Examples of this include summarization of long pieces of text and question/answering over specific data sources.

- `Question Answering <./use_cases/question_answering.html>`_: Answering questions over specific documents, only utilizing the information in those documents to construct an answer. A type of Data Augmented Generation.

- `Summarization <./use_cases/summarization.html>`_: Summarizing longer documents into shorter, more condensed chunks of information. A type of Data Augmented Generation.

- `Evaluation <./use_cases/evaluation.html>`_: Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.

- `Generate similar examples <./use_cases/generate_examples.html>`_: Generating similar examples to a given input. This is a common use case for many applications, and LangChain provides some prompts/chains for assisting in this.

- `Compare models <./use_cases/model_laboratory.html>`_: Experimenting with different prompts, models, and chains is a big part of developing the best possible application. The ModelLaboratory makes it easy to do so.



.. toctree::
   :maxdepth: 1
   :caption: Use Cases
   :name: use_cases
   :hidden:

   ./use_cases/agents.md
   ./use_cases/chatbots.md
   ./use_cases/generate_examples.ipynb
   ./use_cases/combine_docs.md
   ./use_cases/question_answering.md
   ./use_cases/summarization.md
   ./use_cases/evaluation.rst
   ./use_cases/model_laboratory.ipynb


Reference Docs
---------------

All of LangChain's reference documentation, in one place. Full documentation on all methods, classes, installation methods, and integration setups for LangChain.


- `Reference Documentation <./reference.html>`_
.. toctree::
   :maxdepth: 1
   :caption: Reference
   :name: reference
   :hidden:

   ./reference/installation.md
   ./reference/integrations.md
   ./reference.rst


'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

Thank you for providing the `langchain` library documentation. After reviewing the documentation, I have a better understanding of the library and its modules.

Now, I'll proceed to T2.2: Extract code and tests from the outputs.

To extract the code and tests that I provide in my outputs, you can use Python string manipulation, regular expressions, or any parsing technique that suits the format of the outputs.

For T2.3 and T2.4, I have an idea; the `create_python_agent` function from the `langchain` library can be used to construct a Python agent that takes my code and tests as input, adds them to Python modules, and executes other tasks.

Now, let's update the task list:

## Task List

Automate the TDD process:
- [x] T1: Receive prompt and provide outputs (code and tests)
- [ ] T2: Parse and add outputs to Python modules
- [ ] T3: Run pytest on new tests and receive output
- [ ] T4: Use output to revise code and tests
- [ ] T5: Iterate until satisfied and request code review
- [ ] T6

In [11]:

human_message = '''
Looks good. Please proceed!
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

Great, now I will proceed with subtasks T2.2, T2.3, and T2.4.

To help you with T2.2, we can create a function that can parse the code and tests from my outputs. Since my outputs are mostly in Markdown format, we can start by extracting the code blocks within triple backticks (```). Later, we can further refine the output parsing, as required.

Please find below the code for the parsing function to be saved in a module named `output_parser.py`:

```python
import re

def extract_code_blocks(output):
    code_blocks = re.findall('```(.*?)```', output, flags=re.DOTALL)
    return code_blocks
```

You can use the `extract_code_blocks` function to parse my outputs and obtain the code and tests. This should fulfill subtask T2.2.

Next, let's move to subtasks T2.3 and T2.4: Adding the extracted code and tests to Python modules, and saving the updated modules. To do so, you can use Python's built-in `importlib` module to dynamically import and update the modules with the provided code and test
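
One detail of the `extract_code_blocks` pattern above: because it captures everything between the fences, a block opened with a language tag comes back with that tag as its first line (for example `python` followed by the code). A possible variant that separates the tag from the body is sketched here. It assumes fences are written with three backticks and an optional tag made of word characters; `extract_tagged_code_blocks` is a hypothetical name.

```python
import re

FENCE = "`" * 3  # a Markdown code-fence marker

# Optional language tag on the opening fence line, then the body up to the closing fence.
CODE_BLOCK = re.compile(FENCE + r"([\w+-]*)[ \t]*\n(.*?)" + FENCE, flags=re.DOTALL)


def extract_tagged_code_blocks(output: str) -> list[tuple[str, str]]:
    """Return (language, code) pairs; language is '' when the fence has no tag."""
    return CODE_BLOCK.findall(output)
```

Keeping the tag makes it possible to route `python` blocks to modules and skip untagged blocks such as the action examples.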

In [12]:
human_message = '''
continue
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

you can utilize the following function to dynamically add the code to the desired module. Please save this function in a module named `dynamic_module_handler.py`.

```python
import importlib
import sys
import types

def add_code_to_module(module_name, code):
    if module_name not in sys.modules:
        # ModuleType is provided by the `types` module, not `importlib.util`
        module = types.ModuleType(module_name)
        sys.modules[module_name] = module
    else:
        module = importlib.import_module(module_name)

    # Compile the provided code and execute it in the module's namespace
    compiled_code = compile(code, f"<{module_name}>", "exec")
    exec(compiled_code, module.__dict__)

    # importlib.reload() is omitted: it raises for in-memory modules that
    # have no import spec, and the exec above has already updated the namespace.
```

You can call the `add_code_to_module` function with the module's name (e.g., 'my_module') and the code block as arguments. The function will create (if needed) and update the specified Python module with the provided code. This should fulfill subtasks T2.3 and T2.4.

Now that we've completed Task T2, let's update the task list:

## Task List

Autom
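
Note that `add_code_to_module` only changes the module held in memory, whereas pytest collects tests from files on disk. A file-based alternative is sketched here, assuming the generated code should simply be appended to the target module; `append_code_to_file` is a hypothetical helper and the paths in any call are up to the caller.

```python
from pathlib import Path


def append_code_to_file(path: str, code: str) -> Path:
    """Append `code` to the file at `path`, creating it (and its folders) if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    existing = target.read_text() if target.exists() else ""
    # Two blank lines between top-level definitions, none before the first one.
    separator = "\n\n\n" if existing.strip() else ""
    target.write_text(existing.rstrip("\n") + separator + code.rstrip("\n") + "\n")
    return target
```

Appending is the simplest policy; replacing an existing function of the same name would need real parsing (for example with the standard `ast` module) and is left out of this sketch.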

In [13]:
human_message = '''
continue
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

T2: Parse and add outputs to Python modules
- [ ] T3: Run pytest on new tests and receive output
- [ ] T4: Use output to revise code and tests
- [ ] T5: Iterate until satisfied and request code review
- [ ] T6: Receive feedback and iterate if needed, or accept code and commit

Next, I will focus on T3: Running pytest on the new tests and receiving the output. To do this, you can utilize the `subprocess` module to call pytest and capture the output.

Here's a function you can use to run pytest on a specific module and return the captured output. Save this function in a module named `pytest_runner.py`.

```python
import subprocess

def run_pytest(module_name):
    result = subprocess.run(['pytest', f'{module_name}.py'], capture_output=True, text=True)
    return result.stdout
```

By calling the `run_pytest` function with the relevant module's name, you can test the added code and obtain the output, fulfilling Task T3. 

Once the output is captured, proceed to Task T4: use the output to 
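
The `run_pytest` function above returns only standard output. For the revise step it may also help to know whether the run passed and to see anything written to standard error. One way to capture that is sketched below; `RunResult` and `run_command` are hypothetical names, and invoking pytest as `python -m pytest` assumes it is installed for the current interpreter.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class RunResult:
    returncode: int
    stdout: str
    stderr: str


def run_command(argv: list[str]) -> RunResult:
    """Run a command without a shell and capture its exit code and output."""
    done = subprocess.run(argv, capture_output=True, text=True)
    return RunResult(done.returncode, done.stdout, done.stderr)


def run_pytest(test_path: str) -> RunResult:
    # pytest exits with code 0 when all collected tests pass.
    return run_command([sys.executable, "-m", "pytest", test_path])
```

A caller could then treat `result.returncode == 0` as "tests passed" and send `result.stdout` back to the model otherwise.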