# Interactive notebook for the gpt_controller package

## Imports

### Package imports

In [None]:
import os
import tiktoken
from statemachine import StateMachine
from gpt_controller.config import DB_URL, CHATGPT_MODEL, PROMPT_PATH, SELF_TRAIN
from gpt_controller.chat_gpt_interface.api_tools import InHouseAPI
from gpt_controller.cognition.machine import GPTControllerMachine

### Data imports

In [None]:
# TODO Load the sqlalchemy models from database
chatGPT_api = InHouseAPI()
prompt_deck = chatGPT_api.get_prompt_deck()
environment_deck = chatGPT_api.get_environment_deck()

## Overview

### 1. State Machine Diagram

In [None]:
state_machine = GPTControllerMachine()

state_machine

### 2. Prompt benchmarks

In [None]:

for prompt_label, prompt in prompt_deck.items():
    num_tokens = chatGPT_api.num_tokens_prompt(prompt)
    print("Prompt: {} \nNumber of tokens: {}\n".format(prompt_label, num_tokens))


In [None]:
# Report the token count of each environment description.
for environment_label, environment in environment_deck.items():
    num_tokens = chatGPT_api.num_tokens_prompt(environment)
    print("Environment: {} \nNumber of tokens: {}\n".format(environment_label, num_tokens))

## Prompt Testing

### 1. Text-based prompt testing

Load a specific completion context and test its output with:
    
```python
chatGPT_api.request_completion()
```

In [None]:

print(prompt_deck.keys())
# TODO #2 Make this work
chatGPT_api.request_completion(prompt_name="user_input_labelling.txt", user_request="Hello, how are you?")

## Self-training [#7](https://github.com/andrei-calin-dragomir/gpt-robot-controller/issues/7)

Self-training is a method of training a model on its own output: the model's output is fed back to it as input, and the process is repeated iteratively to improve the model's performance on a specific task.

### 1. Text-based self-training

#### Setup
    
- Set `SELF_TRAIN` to `True` in `config.py` to enable self-training.
- Set `PERSISTENT_ENVIRONMENTS` to `True` in `config.py` to enable persistent environments.
- Set `TOKEN_LIMIT` to the maximum number of tokens to use in one self-training run.

Self-training can run indefinitely, which can consume a large number of tokens. To prevent this, set `TOKEN_LIMIT` to a value you are comfortable with.
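A budget of this kind can be sketched as a small counter that is checked before each request. The `TokenBudget` class and its methods below are hypothetical and not part of `gpt_controller`; only the `TOKEN_LIMIT` name comes from the setup described above, and the value used is purely illustrative.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    """Hypothetical helper: tracks tokens spent against a fixed limit."""

    limit: int
    used: int = 0

    def charge(self, tokens: int) -> None:
        # Record the tokens consumed by one request/response pair.
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        self.used += tokens

    @property
    def exhausted(self) -> bool:
        # The run should stop once the limit is reached or exceeded.
        return self.used >= self.limit


TOKEN_LIMIT = 1000  # illustrative value, not a recommended default
budget = TokenBudget(limit=TOKEN_LIMIT)
budget.charge(400)
budget.charge(700)
print(budget.exhausted)
```

Note that a check of this form only fires after the limit has been crossed, so the final request may overshoot the budget by its own token count.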

#### Environment Descriptions

This method allows you to create a root description of the environment you want the machine to train in.
To do so, add an environment description as a `.txt` file in the `prompts/environment` folder in the root directory of the package.
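As a rough illustration of how a folder of `.txt` descriptions could be turned into a deck keyed by file name, consider the sketch below. The `load_environment_deck` function is hypothetical; the package's own `get_environment_deck()` may work differently, and a temporary folder stands in for `prompts/environment`.

```python
import tempfile
from pathlib import Path


def load_environment_deck(folder: Path) -> dict[str, str]:
    """Hypothetical loader: map each .txt file name to its text content."""
    return {
        path.name: path.read_text(encoding="utf-8")
        for path in sorted(folder.glob("*.txt"))
    }


# Demonstrate with a throwaway folder standing in for prompts/environment.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp)
    (env_dir / "kitchen.txt").write_text(
        "A kitchen with a table and a fridge.", encoding="utf-8"
    )
    deck = load_environment_deck(env_dir)

print(list(deck))
```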

#### Usage
```python
GPTControllerMachine().self_train(environment="environment_name.txt")
```

#### Inner workings
This generates a state machine with a Spawning state.
Whenever the machine is in the Idle state:

- If `TOKEN_LIMIT` is reached, it goes to the Finish state and then shuts down.
- Otherwise, it goes to the Spawning state and generates a new environment or a new task to train on.
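The Idle/Spawning/Finish cycle described here can be sketched in plain Python. This is not the `GPTControllerMachine` implementation: the `State` enum, `next_state`, and `run` are hypothetical names, the fixed `cost_per_task` stands in for real token usage, and the return from Spawning to Idle is an assumption.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    SPAWNING = auto()
    FINISH = auto()


def next_state(state: State, tokens_used: int, token_limit: int) -> State:
    """Hypothetical transition rule for the self-training loop."""
    if state is State.IDLE:
        return State.FINISH if tokens_used >= token_limit else State.SPAWNING
    if state is State.SPAWNING:
        # Assumption: after spawning and training on a task, control returns to Idle.
        return State.IDLE
    return State.FINISH  # Finish is terminal


def run(token_limit: int, cost_per_task: int) -> list[State]:
    state, used, trace = State.IDLE, 0, [State.IDLE]
    while state is not State.FINISH:
        state = next_state(state, used, token_limit)
        if state is State.SPAWNING:
            used += cost_per_task  # stand-in for the tokens a real task would consume
        trace.append(state)
    return trace


trace = run(token_limit=100, cost_per_task=60)
print([s.name for s in trace])
```

With a limit of 100 and a cost of 60 per task, the sketch spawns two tasks before the Idle check sends it to Finish.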

In [None]:
environment_deck.keys()

In [None]:
GPTControllerMachine().self_train(environment="environment_name.txt")