# TEMPORARY

In [1]:
from gpt_controller.util.models import *
# Collect the public attribute and method names of Task (no leading underscore)
method_list = [method for method in dir(Task) if not method.startswith('_')]
method_list

['complete',
 'conclusion',
 'function_content',
 'function_name',
 'get_context',
 'id',
 'pause',
 'print_conclusion',
 'start',
 'start_time',
 'status',
 'stop_time',
 'total_time']

# To Do
- [ ] make function to check a set of sentences with a predefined label through the system and see if it gives the correct output.
- [ ] make database folder with one .csv file for each run of the system, saved when the system closes no matter what error occurred.
- [ ] If the system reproduces a learned function, it should take the current context into account when adapting the function. But the decision_making context as well as the acting() context are called with the learned function context
- [ ] implement the process of calling a learned function. If the system starts to reproduce a learned function, whenever there is a function call comprised of a code sequence generated in a single task, tell chatGPT to adapt that function based on the current context.
- [ ] add simple functionality to each acting module
- [ ] finish `act()` function
- [ ] finish memorization of new task sequences
    - [ ] `act()` function triggers a new learning task to be added to the stack
    - [ ] `make_decision()` calls the finalization of the top learned task based on querying the system if it thinks it is done with the task.

- [ ] make prompt tester for `function_similarity.txt`.

If time allows:
- [ ] Expensive but fast workaround for memorizing function calls with arguments. Instead of trying to give chatGPT the json schema of the newly created function, just save the code snippet generated and, when trying to reproduce a task that contains a function call, just run the code snippet with the adapted variables inputted by chatGPT and continue in the function task sequence with the new task.



# To do presentation

Use Prezi for the presentation: https://prezi.com/p/create-prezi/
- [ ] make a mapping of the system
- [ ] 1: I wanted to approach the problem of zero-shot action generation for novel as well as known tasks.
- [ ] 2: chatGPT, through recent works (such as Microsoft's and other papers you read about it), has showcased reasoning capabilities through natural-language input.
- [ ] 3: Talk about ReAct design of the system
- [ ] 4: Talk about the overview state machine of the system
- [ ] 5: Present the cognition module and its design (give list of functions such as recalling, memorizing, as well as labelling and decision making)
    - Talk about what is being leveraged on chatGPT and why.
- [ ] 6: Present the acting modules as a collective general concept
- [ ] 7: Present the memorization module and its benefits
- [ ] 8: Present the decision making module and its design with a focus on what is leveraged on chatGPT

Finalize:
- [ ] : Optimizations that I see possible in the future are:
    - [ ] : Creating a pseudolanguage for the ReAct system that is more efficient for the system to parse as well as reducing token count (I think)
    - [ ] : Maybe using a different language model more targeted towards the context of a robotic system such as InstructGPT (although it is trained on smaller data sets)
    - [ ] : Using the approach seen in Microsoft's work: ask chatGPT to call a sequence of functions and, if successful, save that function instead of a task sequence. This reduces reaction time and decreases the number of chatGPT calls.
    - [ ] : And others that will be found either in the paper of the presentation or in the README of the github repository.
- [ ] : The way this system is made makes it fairly user friendly to extend as long as you maintain a set of rules. Some of the possible extensions are:
    - [ ] : Adding more acting modules to the system such as a module that can control a robotic arm or a module that can control a drone.
    - [ ] : Adding more cognition modules to the system such as a module that can understand images and `point` for example (using chatGPT4) or a module that can understand audio.
    - [ ] : Adding more decision making modules to the system such as a module that can understand the context of the conversation and make decisions based on that.
    - [ ] : And others that will be found either in the paper of the presentation or in the README of the github repository.
- [ ] : I hope that this presentation has given you a good overview of the ReAct system and I am open to any questions you might have.

# System Demo
- Paper: Towards interactive robotics: Zero-shot action sequence generation using reasoning through ChatGPT
- This system presents a take on a ReAct model (more info: https://react-lm.github.io/)
- Creator: Andrei Dragomir

Add a "fancy" version of the system demo here:
- Install the text-to-speech package with `pip install tts`.
- Follow this guide to plug in text-to-speech: https://www.youtube.com/watch?v=MYRgWwis1Jk
- Call TTS with `gpu=True` to use the GPU.


## Implementation

The system works on 3 forms of LLM conversations:

Reasoning: Where the system reasons based on its information about the environment, activity logs or other information it has access to together with its general knowledge.

When receiving a task to fulfill, generate goal_predicates.

The thinking process behind fulfilling a task (called _end goal_) is as follows:
- Step 1: Load knowledge about its own activity up to that point (only the conclusions of thought processes, not every detail).
- Step 2: Define the next step to take based on the current state of the environment (Perceive, Reason, Act, Communicate).
- Step 3: Load the functionality for the chosen type of action.
- Step 4: Call the functionality using chatGPT, with the task goal being the goal defined at step 2.
- Step 5: After the call completes, save the new state of the environment and the new activity log, regardless of the action's success.
- Step 6:
    - If the action was successful, check whether the end goal was reached by evaluating the goal predicates.
    - Else, repeat from step 1.
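
The six steps above can be read as a control loop. The following is only a minimal sketch of that loop, not the system's actual implementation: `StepType`, `AgentState`, `pursue_end_goal`, the handler signature and the `max_iterations` budget are all hypothetical names introduced for illustration, and the handlers stand in for the chatGPT calls of step 4. It also assumes that the loop repeats when the action succeeded but the goal predicates do not hold yet.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable


class StepType(Enum):
    PERCEIVE = auto()
    REASON = auto()
    ACT = auto()
    COMMUNICATE = auto()


@dataclass
class AgentState:
    # Hypothetical container for the environment state and the activity log
    environment: dict = field(default_factory=dict)
    activity_log: list[str] = field(default_factory=list)


def pursue_end_goal(
    state: AgentState,
    goal_predicates: list[Callable[[dict], bool]],
    choose_step: Callable[[AgentState], StepType],
    handlers: dict[StepType, Callable[[AgentState], tuple[bool, str]]],
    max_iterations: int = 20,
) -> bool:
    """Repeat steps 1-6 until every goal predicate holds or the budget runs out."""
    for _ in range(max_iterations):
        # Step 1 is implicit: state.activity_log holds the conclusions so far
        step = choose_step(state)              # Step 2: pick the next step type
        handler = handlers[step]               # Step 3: load the matching functionality
        success, conclusion = handler(state)   # Step 4: run it (an LLM call in the real system)
        state.activity_log.append(conclusion)  # Step 5: log regardless of success
        # Step 6: only a successful action can have reached the end goal
        if success and all(pred(state.environment) for pred in goal_predicates):
            return True
    return False


# Usage with a stub handler in place of a real acting module
def pick_up_cup(state: AgentState) -> tuple[bool, str]:
    state.environment["holding"] = "cup"
    return True, "Picked up the cup."


state = AgentState()
done = pursue_end_goal(
    state,
    goal_predicates=[lambda env: env.get("holding") == "cup"],
    choose_step=lambda s: StepType.ACT,
    handlers={StepType.ACT: pick_up_cup},
)
```

The iteration budget is an addition of this sketch; it keeps a failing action from looping forever.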

## Imports and initialization

In [2]:
import os
import tiktoken
import time
from gpt_controller.config import *
from gpt_controller.util.models import *
from gpt_controller.util.labels import *
from gpt_controller.cognition.machine import Machine

machine = Machine(None)

## Testing

### Manipulation

- Limitations:
  - Objects that have the attribute "is_accessible" set to true make the contained objects graspable.
- Policy:
  - To take an object from within another object, you must recursively reach that object, from the topmost container object down to the object that contains the required object.
  - If any object in the sequence fails to be accessed, give context as to which object is not accessible and why.
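
The policy above amounts to a top-down walk over nested containers. The sketch below illustrates that walk under stated assumptions: `SceneObject`, `find_container_path` and `try_reach` are hypothetical names, not part of `gpt_controller`, and only the `is_accessible` attribute comes from the text. The failure message names the blocking container; the reason why it is blocked is left out because the source does not define how that is represented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SceneObject:
    # Hypothetical stand-in for an object known to the system
    name: str
    is_accessible: bool = True
    contents: list["SceneObject"] = field(default_factory=list)


def find_container_path(root: SceneObject, target: str) -> Optional[list[SceneObject]]:
    """Return the chain of containers from `root` down to the one holding `target`."""
    for child in root.contents:
        if child.name == target:
            return [root]
        sub_path = find_container_path(child, target)
        if sub_path is not None:
            return [root] + sub_path
    return None


def try_reach(root: SceneObject, target: str) -> tuple[bool, str]:
    """Walk the container chain top-down and report the first blocked container."""
    path = find_container_path(root, target)
    if path is None:
        return False, f"'{target}' was not found inside '{root.name}'."
    for container in path:
        if not container.is_accessible:
            return False, f"'{container.name}' is not accessible, so '{target}' cannot be grasped."
    return True, f"'{target}' is graspable via " + " -> ".join(c.name for c in path) + "."


# Usage: a tomato inside a drawer inside an inaccessible fridge
fridge = SceneObject("fridge", is_accessible=False, contents=[
    SceneObject("drawer", contents=[SceneObject("tomato")]),
])
blocked = try_reach(fridge, "tomato")
fridge.is_accessible = True
reachable = try_reach(fridge, "tomato")
```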

### Perception

- Through perception you can find an object that is not yet in memory.
- You can also run specific checks on objects.

### Reasoning

### Prompts

#### Benchmarks

In [3]:
# Load every .txt prompt found under PROMPT_PATH.
# Returns a dictionary mapping prompt file names to prompt contents.
def get_prompt_deck() -> dict[str, str]:
    prompt_deck = {}
    for root, dirs, files in os.walk(PROMPT_PATH):
        for name in files:
            if name.endswith(".txt"):
                prompt_location = os.path.abspath(os.path.join(root, name))
                try:
                    with open(prompt_location, "r") as f:
                        prompt_deck[name] = f.read()
                except OSError as e:
                    # Report the failure and keep loading the remaining prompts
                    print("Error: Prompt {} could not be loaded with reason: {}".format(name, e))
    return prompt_deck
        
def num_tokens_prompt(prompt: str):
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.encoding_for_model(CHATGPT_MODEL)
    num_tokens = len(encoding.encode(prompt))
    return num_tokens

In [4]:
for name, prompt in get_prompt_deck().items():
    num_tokens = num_tokens_prompt(prompt)
    if num_tokens == 0:
        continue
    print("Prompt: {} \nNumber of tokens: {}\n".format(name, num_tokens))

Prompt: decision_making.txt 
Number of tokens: 400

Prompt: general_capabilities.txt 
Number of tokens: 101

Prompt: label_input.txt 
Number of tokens: 62

Prompt: memorize_object.txt 
Number of tokens: 253

Prompt: function_similarity.txt 
Number of tokens: 199

Prompt: get_goal_predicates.txt 
Number of tokens: 389

Prompt: question_about_context.txt 
Number of tokens: 373

Prompt: label_segment_input copy.txt 
Number of tokens: 289

Prompt: segment_input.txt 
Number of tokens: 76

Prompt: memorize_derivate_function.txt 
Number of tokens: 91



#### Targeted prompt tests

- This section provides tests for specific prompts that are used in the system.
- The tests are done by running the prompt through chatGPT and checking if the output is the expected one.

##### Labelling

- You can make sets of prompts for a specific type of label and run them all at once using the helper function `run_label_tests()`.
    - You must pass a list of `(label, prompt)` tuples that you want to test.
    - You must specify the expected label; its class is passed to `machine.label()` as the label type to choose from.
    - The function returns the accuracy of the tests as a fraction between 0 and 1.

In [5]:
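def run_label_tests(prompt_deck : list[tuple[Label, str]], expected_label : Label) -> float:
    # Nothing to score for an empty deck
    if not prompt_deck:
        return 0.0
    right_answers = 0
    for label, prompt in prompt_deck:
        # The class of expected_label is the label type passed to machine.label()
        provided_label = machine.label(prompt, type(expected_label))
        time.sleep(2)  # Pause between requests
        if provided_label == label:
            right_answers += 1
        else:
            # Report the label attached to this prompt as the expected one
            print("Prompt: {}\nExpected Label: {}\nProvided Label: {}\n".format(prompt, label, provided_label))
    return right_answers/len(prompt_deck)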

Task type labelling

In [6]:
task_set = [
    "Peel 5 potatoes.",
    "Stir the pot of soup.",
    "Chop onions.",
    "Pour ingredients into a mixing bowl.",
    "Wash fruits and vegetables.",
    "Measure ingredients for the recipe.",
    "Open the jar of jam.",
    "Grate cheese.",
    "Preheat the oven to a specific temperature.",
    "Set a timer for cooking or baking.",
    "Mix ingredients together in a bowl.",
    "Slice some bread.",
    "Pour liquids into containers.",
    "Clean countertop with the sponge.",
    "Load the dishwasher.",
    "Retrieve items from cabinets or shelves.",
    "Remove hot dishes from the oven.",
    "Garnish plates with herbs or sauces.",
    "Arrange food on a serving platter.",
    "Brew a cup of coffee or tea.",
    "Clean kitchen utensils.",
    "Wipe spills from the floor.",
    "Refill water bottles or containers.",
    "Organize pantry items.",
    "Dispose of food waste."
]


labelled_task_set = [(UserInputLabel.TASK, prompt) for prompt in task_set]

run_label_tests(labelled_task_set, UserInputLabel.TASK)

1.0

Generated task labelling

In [20]:
task_topic = "Make a salad"

task_sequence = [
    "Recall any knowledge about the location of the ingredients.",
    "Move to the refrigerator.",
    "Open the fridge",
    "Look for vegetables for the salad.",
    "Pick up a tomato",
    "Place it on the counter.",
    "Pick up a cucumber.",
    "Place it on the counter.",
    "Pick up a lettuce.",
    "Place it on the counter.",
    "Close the refrigerator.",
    "Move to the counter.",
    "Find a knife.",
    "Ask where the knife could be located.",
    "Pick up the knife.",
    "Slice the vegetables.",
    "Which vegetable should be sliced first?",
    "Slice the tomato.",
    "Slice the cucumber.",
    "Move to the sink."
]

for prompt in task_sequence:
    label = machine.label(prompt, TaskLabel)
    print("Prompt: {}\nLabel: {}\n".format(prompt, label))
    time.sleep(2)


Prompt: Recall any knowledge about the location of the ingredients.
Label: TaskLabel.COGNITION

Prompt: Move to the refrigerator.
Label: TaskLabel.NAVIGATION

Prompt: Open the fridge
Label: TaskLabel.MANIPULATION

Prompt: Find vegetables for the salad.
Label: TaskLabel.MANIPULATION

Prompt: Pick up a tomato
Label: TaskLabel.MANIPULATION

Prompt: Place it on the counter.
Label: TaskLabel.MANIPULATION

Prompt: Pick up a cucumber.
Label: TaskLabel.MANIPULATION

Prompt: Place it on the counter.
Label: TaskLabel.MANIPULATION

Prompt: Pick up a lettuce.
Label: TaskLabel.MANIPULATION

Prompt: Place it on the counter.
Label: TaskLabel.MANIPULATION

Prompt: Close the refrigerator.
Label: TaskLabel.MANIPULATION

Error: Getting completion (1/3) failed with reason: The server is overloaded or not ready yet.

Prompt: Move to the counter.
Label: TaskLabel.NAVIGATION

Prompt: Find a knife.
Label: TaskLabel.MANIPULATION

Prompt: Ask where the knife could be located.
Label: TaskLabel.INQUIRY



##### User input segmentation

In [18]:
input_string = '''Tell me what color the fork has then tell me what color the knife has. You should only use your cognitive abilities, without perceiving the environment to extract data'''

conversation = Conversation(ConversationType.LABELLING)
conversation.messages.append(Message(Role.SYSTEM, machine.load_prompt('segment_input.txt')))
conversation.messages.append(Message(Role.USER, input_string))

completion = machine.process(conversation.messages)
print(completion['content'])



You are a phrase splitting algorithm.
The user will give you a text input, and you will return a list of simple sentences derived from segmenting the input from the user.
The sentences should be understandable isolated so wherever pronouns are used to describe an object, replace the pronouns with the name of the object.
You must return a list of sentences in a form that is parsable in python into a list.
You must only return the splitting list, no prose.

Here's a general approach to this process:

1. Identify the main clause: Determine the main clause or the central part of the sentence that expresses a complete thought.
2. Look for coordinating conjunctions: Identify any coordinating conjunctions (such as "and," "but," "or," etc.) that can be used to divide the sentence into separate clauses.
3. Identify subordinating clauses: Identify any subordinating clauses (such as those introduced by words like "because," "although," "if," etc.) that can be treated as separate sentences or comb

##### Goal predicate generation

## Future Work

- Implementing a ReAct system for interfacing with chatGPT
    - ReAct is a prompting approach in which a language model interleaves reasoning traces with task-specific actions.
    - The observation returned by each action is fed back into the model's next reasoning step.

    **Inspiration**
    https://react-lm.github.io/

- Implement defensive JSON parsing for the chatGPT system
    - Use a JSON schema to validate the input JSON
    - Use a library like `LangChain` or `llmparser` to recover from slightly malformed JSON outputs
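
As a rough illustration of what defensive parsing could look like, the sketch below uses only the Python standard library. It is a stand-in, not a proposed final design: the simple required-field check replaces a real JSON schema validator, and the names `extract_json_object` and `validate_fields` and the `name`/`arguments` fields are hypothetical. It only recovers from one kind of malformed output, a valid JSON object wrapped in extra prose.

```python
import json
from typing import Any, Optional


def extract_json_object(text: str) -> Optional[dict[str, Any]]:
    """Parse the first JSON object found in `text`, ignoring surrounding prose."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores the rest
            value, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        start = text.find("{", start + 1)
    return None


def validate_fields(payload: dict[str, Any], required: dict[str, type]) -> list[str]:
    """Return a list of problems; an empty list means the payload is acceptable."""
    problems = []
    for key, expected_type in required.items():
        if key not in payload:
            problems.append(f"missing field '{key}'")
        elif not isinstance(payload[key], expected_type):
            problems.append(f"field '{key}' should be {expected_type.__name__}")
    return problems


# Usage: a reply where the model wrapped the JSON in prose
reply = 'Sure! Here is the call: {"name": "pick_up", "arguments": {"object": "knife"}} Let me know.'
payload = extract_json_object(reply)
problems = (
    validate_fields(payload, {"name": str, "arguments": dict})
    if payload is not None
    else ["no JSON object found"]
)
```

When `problems` is not empty, the list could be sent back to the model as a correction request instead of failing the task outright.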