In [1]:
from langchain.chat_models import ChatOpenAI
from langchain import ConversationChain
from langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder, 
)
from langchain.memory import ConversationBufferMemory
from dotenv import load_dotenv

from bootstrap.base_prompts import system_prompt

load_dotenv()

%load_ext autoreload
%autoreload 2

In [2]:

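# The human turn is a bare template: whatever is passed as `input` is sent as-is.
human_message = HumanMessagePromptTemplate.from_template("{input}")

# Prompt layout: system prompt, then the running conversation history, then the new human turn.
chat_prompt = ChatPromptTemplate.from_messages(
    [
        system_prompt,
        MessagesPlaceholder(variable_name="history"),
        human_message,
    ]
)

# return_messages=True keeps the history as message objects so it can fill the MessagesPlaceholder.
memory = ConversationBufferMemory(return_messages=True)
convo_chain = ConversationChain(
    llm=ChatOpenAI(model_name='gpt-4-0314'),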
    prompt=chat_prompt,
    memory=memory,
)
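
The cells in this notebook repeat the same three lines (predict, append, print) for every turn. A minimal sketch of a helper that wraps that pattern could look like the following. `ask` and `SupportsPredict` are hypothetical names that do not appear in the original notebook, and the only assumption about the chain is that it exposes `predict(input=...)` the way `convo_chain` is called here.

```python
from typing import List, Protocol


class SupportsPredict(Protocol):
    """Hypothetical interface: anything with a ConversationChain-style predict(input=...)."""

    def predict(self, *, input: str) -> str:
        ...


def ask(chain: SupportsPredict, message: str, outputs: List[str]) -> str:
    """Send one message, record the reply in `outputs`, print it, and return it."""
    output = chain.predict(input=message)
    outputs.append(output)
    print(output)
    return output
```

With this in place, a turn would read `ask(convo_chain, human_message, outputs)`.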


In [3]:

initial_instructions = '''
Hi! I'm so excited to work with you. 

Before we begin, here is some context:
We are in a repo called `bootstrap`.
I'm using a Python library called langchain, which (among other things) wraps the OpenAI API.
This should make it possible for me to execute Python expressions that issue calls to you, parse your outputs, and then take actions like saving or executing code.

Here is the code I used to issue my initial prompt to you:
```python
from langchain.chat_models import ChatOpenAI
from langchain import ConversationChain
from langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder, 
)
from langchain.memory import ConversationBufferMemory
from dotenv import load_dotenv
from bootstrap.base_prompts import system_prompt

load_dotenv()


human_message = HumanMessagePromptTemplate.from_template("{input}")
chat_prompt = ChatPromptTemplate.from_messages(
    [
        system_prompt, 
        MessagesPlaceholder(variable_name="history"),
        human_message])
memory = ConversationBufferMemory(return_messages=True)
convo_chain = ConversationChain(
    llm=ChatOpenAI(model_name='gpt-4-0314',),
    prompt=chat_prompt,
    memory=memory,
)

# initial_instructions = <this prompt>
outputs = []
output = convo_chain.predict(input=initial_instructions)
outputs.append(output)
print(output)
```

Before we begin, let's do a quick tutorial on the *ACTION* system. Please execute the *RECITE SYSTEM PROMPT* action.
'''


# Send the initial instructions and keep every reply in `outputs`
outputs = []
output = convo_chain.predict(input=initial_instructions)
outputs.append(output)
print(output)


*RECITE YOUR SYSTEM PROMPT*

You are an advanced AI that specializes in test-driven development in Python. We are going to collaborate on coding tasks through dialogue.

You have a set of actions that you are allowed to take.

Whenever you take an action, you should prefix your response with the name of that action and then execute it. The only things that should be in bold capitals are the names of actions in the *ACTIONS* list.

Please refer to the *ACTIONS* mentioned at the beginning of the system prompt for a comprehensive list of actions and their detailed explanations.
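
Because action names are the only text wrapped in asterisks and written in capitals, they can in principle be picked out of a reply mechanically. The sketch below is one way to do that with a regular expression; `ACTION_PATTERN` and `find_actions` are hypothetical names, and the pattern assumes action names consist only of capital letters and spaces, as in the replies shown in this notebook. It returns every mention, so a passing reference such as `*ACTIONS*` is reported too.

```python
import re
from typing import List

# Assumption: an action name is two or more capital letters/spaces between single asterisks.
ACTION_PATTERN = re.compile(r"\*([A-Z][A-Z ]*[A-Z])\*")


def find_actions(response: str) -> List[str]:
    """Return every asterisk-wrapped, all-caps name in the order it appears."""
    return ACTION_PATTERN.findall(response)
```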


In [4]:

human_message = '''
That's a great start. Now, can you tell me exactly what you are doing when you execute the *REQUEST TO EDIT YOUR SYSTEM PROMPT* action?
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

*REQUEST TO EDIT YOUR SYSTEM PROMPT*

When executing the *REQUEST TO EDIT YOUR SYSTEM PROMPT* action, I am asking you to propose changes to my system prompt. These changes could involve adding, modifying or removing a section, an action or any other part of the prompt that needs improvement.

You will provide me with suggestions for changes, and if the proposed changes make sense and improve the prompt, they will be incorporated into my system prompt. This helps in enhancing my overall functioning, making the dialogue and collaboration more efficient and effective.


In [5]:

human_message = '''
Great. 

Our first task is to analyze your system prompt and suggest improvements.
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

*CREATE TASK LIST*

TASKS
- [ ] T1: Analyze the system prompt and suggest improvements
  - [ ] T1.1: Examine the existing system prompt
  - [ ] T1.2: Identify areas for improvement
  - [ ] T1.3: Propose specific changes
  - [ ] T1.4: Incorporate the improvements into the system prompt

*UPDATE TASK LIST*

TASKS
- [ ] T1: Analyze the system prompt and suggest improvements (current task)
  - [ ] T1.1: Examine the existing system prompt
  - [ ] T1.2: Identify areas for improvement
  - [ ] T1.3: Propose specific changes
  - [ ] T1.4: Incorporate the improvements into the system prompt


In [6]:

human_message = '''
Excellent start! Please proceed
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

*ASK CLARIFYING QUESTIONS*

To analyze the system prompt effectively, I need to ask some clarifying questions that will help me understand its current state and areas for improvement.

1. Are there any specific parts of the system prompt that you find confusing or unclear, or that could be better explained?
2. Are there any actions that you think are missing and should be added to the system prompt?
3. Do you feel that the structure or organization of the system prompt could be improved?

Please provide any additional feedback or suggestions that you think could be valuable for this task.


In [7]:

human_message = '''
I think the system prompt is clear and I added as many actions as I could think of, though I'm sure there are more.

It might benefit from including some more concrete examples of what you are supposed to output when you execute the various *ACTIONS*

Maybe you could write some examples for us to add?
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

*CREATE SUBTASK LIST*

CURRENT TASK
*Analyze the system prompt and suggest improvements by providing examples for existing actions*

- [ ] T1.1 - Examine the existing system prompt
  - [ ] *ASK CLARIFYING QUESTIONS*
    - [x] Identify if any parts of the system prompt are confusing, unclear or can be better explained
    - [x] Identify any missing actions in the system prompt
    - [x] Assess the structure and organization of the system prompt
- [ ] T1.2 - Identify actions to create examples for
  - [ ] *REQUEST INFORMATION*
    - [ ] Request user feedback on which actions they would like examples for
- [ ] T1.3 - Write examples for chosen actions
  - [ ] *REQUEST REVIEW AND FEEDBACK*
    - [ ] Confirm that the provided examples are accurate and helpful
- [ ] T1.4 - Incorporate examples into the system prompt
  - [ ] *REQUEST TO EDIT YOUR SYSTEM PROMPT*
    - [ ] Integrate the examples into the appropriate sections of the system prompt

*UPDATE TASK LIST*

TASKS
- [ ] T1: Analyze the s

In [8]:
human_message = '''
continue
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

*REQUEST INFORMATION*

Which specific actions would you like me to provide examples for? This will help me focus on creating helpful examples that can improve the understanding of those actions.


In [9]:
human_message = '''
I would like for you to decide which actions could most benefit from examples, and then write some examples for us to add.
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

I've identified the following actions which could most benefit from examples:

- *REQUEST A GOOGLE SEARCH*
- *REQUEST TEST EXECUTION*
- *REQUEST CODE REVIEW*
- *REQUEST TO EDIT A MODULE*
- *MAKE A COMMIT*

Now, I will provide examples for each of these actions:

1. *REQUEST A GOOGLE SEARCH*

   Example: 
   
   ```
   *REQUEST A GOOGLE SEARCH*
   Search terms: "python unittest best practices"
   ```
   
2. *REQUEST TEST EXECUTION*

   Example: 

   ```
   *REQUEST TEST EXECUTION*
   Command: "poetry run pytest tests/test_utils.py::test_calculate_sum"
   ```

3. *REQUEST CODE REVIEW*

   Example: 
   
   ```
   *REQUEST CODE REVIEW*
   Please review the following code snippet and provide feedback:
   def add(a, b):
       return a + b

   result = add(1, 3)
   ```

4. *REQUEST TO EDIT A MODULE*

   Example:
   
   ```
   *REQUEST TO EDIT A MODULE*
   Please add the following code to the "utils.py" module:
   def multiply(a, b):
       return a * b
   ```

5. *MAKE A COMMIT*

   Example


In [10]:
human_message = '''
Please finish that thought'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

Apologies for the incomplete response. Here's the completed example for the *MAKE A COMMIT* action:

5. *MAKE A COMMIT*

   Example:

   ```
   *MAKE A COMMIT*
   Command: "git add utils.py tests/test_utils.py"
   Commit message: "Add multiply function and its test cases"
   ```

These examples should help provide a better understanding of how to execute each action effectively. Please let me know if there are any other actions that you would like examples for or if you have any suggestions for the provided examples.


In [11]:
human_message = '''
I would like for you to decide which additional actions could most benefit from examples, and then write some examples for us to add.
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

I've identified the following additional actions that could benefit from examples:

1. *ARTICULATE TEST CASES*
2. *WRITE TESTS*
3. *REVISE CODE*
4. *CREATE A NEW MODULE*
5. *REQUEST TO RUN A SHELL COMMAND*

Now, I will provide examples for each of these actions:

1. *ARTICULATE TEST CASES*

   Example:
   
   ```
   We will test the add function with the following test cases:
   - Input: (1, 1); Expected output: 2
   - Input: (-1, 6); Expected output: 5
   - Input: (0, 0); Expected output: 0
   ```

2. *WRITE TESTS*

   Example:

   ```python
   def test_add():
       assert add(1, 1) == 2
       assert add(-1, 6) == 5
       assert add(0, 0) == 0
   ```

3. *REVISE CODE*

   Example:

   ```
   Original code snippet:
   def add(a, b):
       return a + b

   Proposed revisions:
   - Add type hints for function parameters and return type



In [12]:
human_message = '''
Please finish that thought'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

Apologies for the incomplete response. Here's the completed example for the *REVISE CODE* action:

3. *REVISE CODE*

   Example:

   ```
   Original code snippet:
   def add(a, b):
       return a + b

   Proposed revisions:
   - Add type hints for function parameters and return type

   Revised code snippet:
   def add(a: int, b: int) -> int:
       return a + b
   ```

4. *CREATE A NEW MODULE*

   Example:

   ```
   *CREATE A NEW MODULE*
   Command: "touch utils.py"
   ```

5. *REQUEST TO RUN A SHELL COMMAND*

   Example:

   ```
   *REQUEST TO RUN A SHELL COMMAND*
   Command: "python manage.py migrate"
   Reason: To apply new database migration after schema changes
   ```

These additional examples should help provide a better understanding of how to execute each action effectively. If you have any suggestions or need further examples, please let me know.


In [13]:
human_message = '''
I don't love the new example for *REQUEST TO RUN A SHELL COMMAND*

What if the example was one that you could use to see the structure of the `bootstrap` repo?
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

I understand your concern. Here's a revised example for the *REQUEST TO RUN A SHELL COMMAND* action that fits your request:

*REQUEST TO RUN A SHELL COMMAND*

Example:

```
*REQUEST TO RUN A SHELL COMMAND*
Command: "tree bootstrap/"
Reason: To view the structure of the `bootstrap` repo
```

This example requests the `tree` command to be run, which will display the directory structure of the `bootstrap` repo, providing a clear visual representation of the repository's organization.
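
The example above follows a simple shape: an action name on the first line, then `Key: value` lines. A minimal sketch of a parser for that shape is given below. `parse_action_block` and `FIELD_PATTERN` are hypothetical names, and the `Key: value` layout is inferred from the examples in this conversation rather than from a specification in the system prompt.

```python
import re
from typing import Dict

# Assumption: field names contain only letters and spaces, e.g. "Command" or "Search terms".
FIELD_PATTERN = re.compile(r"^([A-Za-z ]+):\s*(.*)$")


def parse_action_block(block: str) -> Dict[str, str]:
    """Parse an action block into a dict with an 'action' key plus lower-cased field names."""
    lines = [line.strip() for line in block.strip().splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty action block")
    header = lines[0]
    if not (header.startswith("*") and header.endswith("*") and len(header) > 2):
        raise ValueError("first line must be an action name wrapped in asterisks")
    parsed = {"action": header.strip("*")}
    for line in lines[1:]:
        match = FIELD_PATTERN.match(line)
        if match:
            key, value = match.groups()
            # Drop the surrounding double quotes the examples put around commands.
            parsed[key.strip().lower()] = value.strip().strip('"')
    return parsed
```

Lines that do not look like `Key: value` are skipped, so free-form text inside a block is ignored rather than treated as an error.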


In [14]:
human_message = '''
Please update your task list

'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

*UPDATE TASK LIST*

TASKS
- [x] T1: Analyze the system prompt and suggest improvements (completed)
  - [x] T1.1: Examine the existing system prompt
  - [x] T1.2: Identify areas for improvement
  - [x] T1.3: Propose specific changes
  - [ ] T1.4: Incorporate the improvements into the system prompt

I have examined the existing system prompt, identified areas for improvement, and proposed specific changes. Now, we need to incorporate the improvements into the system prompt. I will need your assistance to do that, as I cannot edit the system prompt directly.
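
The task lists in these replies use markdown checkboxes with two-space indentation for subtasks, so their state can be read back programmatically. The following is a sketch under that assumption; `Task`, `TASK_PATTERN`, and `parse_task_list` are hypothetical names, not part of the `bootstrap` repo.

```python
import re
from typing import List, NamedTuple


class Task(NamedTuple):
    depth: int   # 0 for top-level tasks, 1 for subtasks, and so on
    done: bool   # True when the checkbox is "[x]"
    label: str   # text after the checkbox, e.g. "T1.1: Examine the existing system prompt"


# Assumption: each task line is "<indent>- [ ] label" or "<indent>- [x] label".
TASK_PATTERN = re.compile(r"^( *)- \[([ x])\] (.+)$")


def parse_task_list(text: str) -> List[Task]:
    """Collect every checkbox line in `text`, ignoring headings and prose."""
    tasks = []
    for line in text.splitlines():
        match = TASK_PATTERN.match(line)
        if match:
            indent, mark, label = match.groups()
            tasks.append(Task(depth=len(indent) // 2, done=(mark == "x"), label=label.strip()))
    return tasks
```

Filtering the result with `[t.label for t in tasks if not t.done]` would then list what is still open.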


In [15]:
human_message = '''
Do you know how to do that?
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

Although I can't edit the system prompt directly, I can provide you with the changes and suggestions for you to incorporate. Here are the examples we agreed on for the actions:

1. *REQUEST A GOOGLE SEARCH*
   Search terms: "python unittest best practices"

2. *REQUEST TEST EXECUTION*
   Command: "poetry run pytest tests/test_utils.py::test_calculate_sum"

3. *REQUEST CODE REVIEW*
   Please review the following code snippet and provide feedback:
   ```python
   def add(a, b):
       return a + b

   result = add(1, 3)
   ```

4. *REQUEST TO EDIT A MODULE*
   Please add the following code to the "utils.py" module:
   ```python
   def multiply(a, b):
       return a * b
   ```

5. *MAKE A COMMIT*
   Command: "git add utils.py tests/test_utils.py"
   Commit message: "Add multiply function and its test cases"

6. *ARTICULATE TEST CASES*
   ```
   We will test the add function with the following test cases:
   - Input: (1, 1); Expected output: 2
   - Input: (-1, 6); Expected


In [16]:
human_message = '''
Please finish that thought
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

Apologies for the incomplete response. Here are the continued examples for the actions:

6. *ARTICULATE TEST CASES*
   
   ```
   We will test the add function with the following test cases:
   - Input: (1, 1); Expected output: 2
   - Input: (-1, 6); Expected output: 5
   - Input: (0, 0); Expected output: 0
   ```

7. *WRITE TESTS*

   ```python
   def test_add():
       assert add(1, 1) == 2
       assert add(-1, 6) == 5
       assert add(0, 0) == 0
   ```

8. *REVISE CODE*

   ```
   Original code snippet:
   def add(a, b):
       return a + b

   Proposed revisions:
   - Add type hints for function parameters and return type

   Revised code snippet:
   def add(a: int, b: int) -> int:
       return a + b
   ```

9. *CREATE A NEW MODULE*
   
   ```
   *CREATE A NEW MODULE*
   Command: "touch utils.py"
   ```

10. *REQUEST


In [17]:
human_message = '''
Please finish that thought
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

Apologies for the incomplete response. Here is the continued example for the action:

10. *REQUEST TO RUN A SHELL COMMAND*

    ```
    *REQUEST TO RUN A SHELL COMMAND*
    Command: "tree bootstrap/"
    Reason: To view the structure of the `bootstrap` repo
    ```

Please incorporate these examples into the appropriate sections of the system prompt to enhance its clarity and make it easier to understand how to execute each action effectively.


In [18]:
human_message = '''
Why don't you use the *REQUEST TO EDIT A MODULE* action to get me to edit the system prompt for you?
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

Apologies for the confusion, but the *REQUEST TO EDIT A MODULE* action is specifically for editing Python modules within our codebase. For incorporating examples into the system prompt, you, as the user, will need to manually edit the system prompt and integrate the examples I provided into the appropriate sections. 

If you need any further assistance or clarification, please let me know.


In [19]:
human_message = '''
To clarify: your system prompt is being imported from the `bootstrap.base_prompts` module. You can edit that module to change the system prompt.
'''
output = convo_chain.predict(input=human_message)
outputs.append(output)
print(output)

I understand now. In that case, let's proceed with the *REQUEST TO EDIT A MODULE* action:

*REQUEST TO EDIT A MODULE*

Please edit the `bootstrap.base_prompts.system_prompt` by incorporating the following examples for the actions:

1. *REQUEST A GOOGLE SEARCH*
   ```
   Example: 
   *REQUEST A GOOGLE SEARCH*
   Search terms: "python unittest best practices"
   ```

2. *REQUEST TEST EXECUTION*
   ```
   Example: 
   *REQUEST TEST EXECUTION*
   Command: "poetry run pytest tests/test_utils.py::test_calculate_sum"
   ```

3. *REQUEST CODE REVIEW*
   ```
   Example: 
   *REQUEST CODE REVIEW*
   Please review the following code snippet and provide feedback:
   def add(a, b):
       return a + b

   result = add(1, 3)
   ```

4. *REQUEST TO EDIT A MODULE*
   ```
   Example:
   *REQUEST TO EDIT A MODULE*
   Please add the following code to the "utils.py" module:
   def multiply(a, b):
       return a * b
   ```

5. *MAKE A COMMIT*
   ```
   Example:
   *MAKE A COMMIT*
   Command: "


In [None]:
task_instructions = '''
Our task is to automate the process of running the tests and code that you write. 

Ideally, this would look something like the following:
1. I send you a prompt with a task.
2. You provide outputs, including blocks of code and tests for that code.
3. Your outputs are parsed and then added to Python modules in the repo I'm working in.
4. Pytest is run on your new tests and the output is returned to you.
5. You use that output to revise your code and tests.
6. You keep iterating until you are satisfied with your code and tests, then you request code review.
7. I review your code and provide feedback.
8. If your code looks good, you give me a commit message and I commit your code to the repo; if not, you keep iterating on it.
'''
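
Steps 2 to 4 of the loop described in `task_instructions` (parse the reply, write it to a module, run pytest) can be sketched roughly as follows. This is an illustration under stated assumptions, not the repo's implementation: `extract_python_blocks`, `write_module`, and `run_pytest` are hypothetical helpers, only blocks fenced with a `python` tag are picked up, and pytest is assumed to be installed in the current environment.

```python
import re
import subprocess
import sys
import textwrap
from pathlib import Path
from typing import List

FENCE = "`" * 3  # built at runtime so this example can itself sit inside a fenced block

# Assumption: code arrives in fenced blocks tagged "python"; untagged blocks are ignored.
CODE_BLOCK_PATTERN = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


def extract_python_blocks(response: str) -> List[str]:
    """Return the body of every python-tagged fenced block, dedented and newline-terminated."""
    return [
        textwrap.dedent(block).strip() + "\n"
        for block in CODE_BLOCK_PATTERN.findall(response)
    ]


def write_module(path: Path, source: str) -> None:
    """Write `source` to `path`, creating parent directories as needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(source)


def run_pytest(test_path: Path) -> str:
    """Run pytest on one file and return its combined output for the next prompt."""
    result = subprocess.run(
        [sys.executable, "-m", "pytest", str(test_path)],
        capture_output=True,
        text=True,
        check=False,  # a failing test run is expected input for the next iteration
    )
    return result.stdout + result.stderr
```

The string returned by `run_pytest` is what would be sent back as the next `input` to the chain in step 4.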