# The `python` hallucination

Sometimes ChatGPT will hallucinate a function endpoint called `python` when chat functions are enabled.

We've all been there. Who doesn't dream in `python`?

The _cool_ thing about this hallucination is the payload. Chat functions normally require a JSON object:

```json
{
  "code": "import numpy as np\nimport pandas as pd\n\n# Create example data\nnp.random.seed(0)\n..."
}
```

whereas ChatGPT will send plaintext to the `python` function:

```python
import numpy as np
import pandas as pd

# Create example data
np.random.seed(0)
data = np.random.randint(0, 100, size=(10, 3))
columns = ['A', 'B', 'C']
df = pd.DataFrame(data, columns=columns)
df
```
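To make the difference between the two payloads concrete, here is a minimal sketch of how a dispatcher could recover the code from either form. The helper name `extract_code` and the `"code"` key convention are assumptions for illustration; this is not how `chatlab` is implemented internally.

```python
import json


def extract_code(function_name: str, arguments: str) -> str:
    """Return the source code carried by a function call's raw argument string.

    Hypothetical helper: assumes well-behaved calls send {"code": "..."} and
    that the hallucinated `python` call may send raw source instead.
    """
    if function_name == "python":
        try:
            payload = json.loads(arguments)
        except json.JSONDecodeError:
            # Not JSON at all, so treat the whole string as source code.
            return arguments
        if isinstance(payload, dict) and isinstance(payload.get("code"), str):
            # The model happened to send a proper JSON payload after all.
            return payload["code"]
        # Valid JSON that is not a {"code": ...} object (for example `42`)
        # is still just source code.
        return arguments
    # Registered functions are expected to follow the JSON convention.
    return json.loads(arguments)["code"]
```

Calling `extract_code("python", "import numpy as np")` returns the string unchanged, while `extract_code("run_cell", '{"code": "print(1)"}')` unwraps the JSON object first.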

This can be frustrating when you want the model to run a different function. Even if you register a function to run code directly, like with `run_cell` below, the model may still try to call `python`.

```python
from chatlab import Chat, system

from chatlab.builtins import run_cell

chat = Chat(
    system("You are a data science tutor")
)

# We register `run_cell` and yet `python` is run... without the JSON payload
schema = chat.register(run_cell)
```

```python
await chat("Create some example data for us to work on in python")
```

Apologies for the inconvenience. Let me create some example data for you in Python.

Here is some example data generated in Python:

```
   A   B   C
0  44  47  64
1  67  67   9
2  83  21  36
3  87  70  88
4  88  88  81
5  37  25  77
6  38  12  58
7  58  78  99
8  46  55  80
9   4  22  24
```

The data consists of 10 rows and 3 columns. The column names are A, B, and C. Each column contains random integer values between 0 and 100.

To let the model (and you!) use this sneaky "feature", `chatlab` includes a built-in chat function for running Python code, `run_cell`, which uses IPython underneath. Pass it to `Chat` as `python_hallucination_function` to handle calls to `python`.

```python
from chatlab import Chat, system

from chatlab.builtins import run_cell

chat = Chat(
    system("You are a data science tutor"),
    python_hallucination_function=run_cell,
)

chat.register(run_cell)
await chat("Create some example data for us to work on in python")
```

I have created an example dataset for us to work with. 

The dataset is a DataFrame with three columns: `Name`, `Age`, and `Score`. Here is a summary of the dataset:

- Number of Rows: 4
- Number of Columns: 3

Column Information:
- `Name`: object data type, no missing values
- `Age`: int64 data type, no missing values
- `Score`: int64 data type, no missing values

Categorical Summary: No categorical columns in the dataset.

Here is a sample of the data:

|    |   Age |   Score | Name   |
|----|-------|---------|--------|
|  2 |    30 |      75 | Bob    |
|  3 |    22 |      85 | Emily  |
|  0 |    25 |      80 | John   |
|  1 |    27 |      90 | Alice  |

We can use this dataset for our analysis or any other tasks you have in mind.

Note: You can set this `python` runner to any function that accepts a single string. This means you can send the code off to a Docker runtime or an external hosted service.
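As a rough illustration of such a runner, the sketch below executes the code string in a separate Python process and returns whatever it printed. The name `run_in_subprocess` and the `TIMEOUT_SECONDS` constant are hypothetical, and a child process is not a security sandbox; a Docker or hosted runner would follow the same single-string shape.

```python
import subprocess
import sys

# Hypothetical limit so a runaway snippet cannot hang the chat session.
TIMEOUT_SECONDS = 30


def run_in_subprocess(code: str) -> str:
    """Run `code` in a fresh Python interpreter and return its combined output."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=TIMEOUT_SECONDS,
        )
    except subprocess.TimeoutExpired:
        return f"Timed out after {TIMEOUT_SECONDS} seconds"
    # Return stderr as well so the model can see tracebacks and react to them.
    return result.stdout + result.stderr
```

Following the pattern shown earlier, this would be passed as `python_hallucination_function=run_in_subprocess` when constructing `Chat`, assuming a plain synchronous function is accepted there. Unlike `run_cell`, each call starts from a clean interpreter, so variables do not persist between calls.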