In [1]:
import os
import openai
import time
from openai import OpenAI
from dotenv import load_dotenv
load_dotenv()
import numpy as np



#### `pr_objfields(obj)` ####
For example, `pr_objfields(runs)` to all three run objects and their contents.

In [2]:
# The following function prints the fields of an object (e.g. a message)
# If the field itself is another class, then it prints its fields as well 
# on separate lines intended by 4 spaces
def pr_objfields(obj):
    for field in obj:
        print(field)
        if isinstance(field, object):
            for subfield in field:
                print("------>", subfield) 
# It is however easier to use the VS Code Debugger to inspect the object
# When using the debugger, it is easier if I do things in separate functions so that
# the scope is limited ot the variables in that function

# Time stamp to local time
def ts2lt(ts):
    return time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(ts))


In [3]:
from IPython.display import Markdown

def md_custom(text, size=12):
    return Markdown(f"<span style='font-size: {size}px;'>{text}</span>")

# Usage
# display(md_custom("This is some text with a larger font size.", size=18))
def mdc(s, size=14):
    display(md_custom(s, size))

## Playground Execution ##
I try a thread of questions and answers on Playground.  I will set the Playground up exactly the same way as I did in the `qa.ipynb` of [the last post](https://halimgur.substack.com/p/qanda).

![alt text](/Users/Halim/git/probot/pics/Assistant_for_General.jpg "")

The above is a screendump from https://platform.openai.com/assistants. Click on the Assistant name to open it as follows:

![alt text](pics/Assistant_Summary.jpg "")


When I asked the question "**What is the local government debt in China?**", the notebook `qa.ipynb` identified the following three pages as most relevant:

|itext|similarity|title|
|--|--|--|
|33|0.930614|Retirement: A Journey of Continued Relevance and Learning|
|20|0.816555|Will there be a war?|
|10|0.785798|ROGUE - Renaissance on globe with upheavals everywhere|

I added those three files to the "Assistant for general" [in the Playground](https://platform.openai.com/playground?mode=assistant&assistant=asst_LCdnKNcUtkxVg1aDoiQ9vHCX):

![alt text](pics/playground_start.jpg "")

I will now start the Thread with my first question, which is "What is the local government debt in China?". Since the relevant file has already been identified and uploaded, the Playground Assistant produces the following answer:

>In 2020, analysis by Professor Li and his PhD student estimated that China's local government debt approached 90 trillion yuan (12.49 trillion U.S. dollars), which was about 88% of the GDP at that time. This figure is significantly higher than the estimates typically provided by organizations like the International Monetary Fund or the World Bank, which usually place it around 60 trillion yuan (8.33 trillion U.S. dollars), approximately 50% of the GDP【11†source】.

My next question is "**Who is Professor Li?**" and the Assistant produces the following question in the Playground:

>Professor Li, referred to in the debt analysis, is Professor David Daokui Li, who holds the Mansfield Freeman Chair of the Department of Finance of the School of Economics and Management at Tsinghua University【16†source】.

![alt text](pics/playground_q2.jpg "")

I then asked the question "**How does this debt number compare against typical salaries in China?**".  I thought I was pushing my luck with this question but the Assistant gave a good answer on the Playground:

>The local government debt per person in China was estimated at 75,000 yuan, which was mentioned to be the annual salary of a typical young mechanical engineer in Luoyang in 2019【21†source】. This suggests that the debt per capita is roughly equivalent to a year's salary for a young professional in that region.

![alt text](pics/playground_q3.jpg "")

Let us stop here. This seems to be working but when I tried to do the same by calls to the API through my python program instead of using the playground, the performance was quite poor. 

This suggests that the Assistant prompts and other calls to the model in Playground are different from what I did in `qa.ipynb`.  In this notebook, I will try to understand how the Assistant talks to the LLM in the Playground and compare it to how I make it talk in my program.  The differences should be instructive.

In [4]:
%run ./globals.py
# md(showglobals())

## Playground Assistant ##
The ID of the Playground Assistant is given on the Playground screen.  I copied and pasted it from the Playground screen

In [5]:
Assistant_ID="asst_LCdnKNcUtkxVg1aDoiQ9vHCX"

This should be the same Assistant that I created in `qa.ipynb`. Let us confirm this is the case by retrieving the name if the `qa.ipynb` assistant from where we saved it on the local disk:

In [6]:
from ute import getassid, init_openai
qass=getassid(DataFile)
s="\n\n|Assistant|ID|\n|---|---|\nPlayground screen|%s|\n|in %s|%s|\n\n"\
    %(Assistant_ID, DataFile, qass)
mdc(s, size=14)
#
mdc("\n\nThey are the same since I have not changed the Assistant ID in the Playground screen.\n\n")

<span style='font-size: 14px;'>

|Assistant|ID|
|---|---|
Playground screen|asst_LCdnKNcUtkxVg1aDoiQ9vHCX|
|in data/general/data.txt|asst_LCdnKNcUtkxVg1aDoiQ9vHCX|

</span>

<span style='font-size: 14px;'>

They are the same since I have not changed the Assistant ID in the Playground screen.

</span>

## Playground Thread ##
The Thread ID is displayed at the top of the column where the User/Assistant conversation is printed.  I copy and paste the thread ID into `Thread_ID`:

In [7]:
Thread_ID="thread_nPbJmSWtwS4umNYBNqrkGoLy"
mdc("`Thread ID`='%s'\n\n"%Thread_ID)

<span style='font-size: 14px;'>`Thread ID`='thread_nPbJmSWtwS4umNYBNqrkGoLy'

</span>

A `Thread` is of course a thread of messages exchanged between the user and the OpenAI Assistant.  OpenAI portal keeps all interactions for a time, which I believe to be one month but I am not sure.  The messages however belong to the user so I need to establish my credentials using my OpenAI identity:

In [8]:
(Client, LLM)=init_openai(model="gpt-4-1106-preview")

Note that the `Thread` object itseld does not have a `Messages` field.  The following retrieves the Playground thread.

In [9]:
t = Client.beta.threads.retrieve(Thread_ID)

The Thread record has only three fields.  The following tabulates the values of these fields:

In [10]:
s="\n\n|Attribute|Value|\n|---|---|\n"
s+="|ID|`%s`|\n"%t.id
s+="|Created|`%s`|\n"%(ts2lt(t.created_at))
s+="|Object name|`%s`|\n"%t.object
mdc(s, size=14)
mdc("\n\n")

<span style='font-size: 14px;'>

|Attribute|Value|
|---|---|
|ID|`thread_nPbJmSWtwS4umNYBNqrkGoLy`|
|Created|`2024-01-19 17:13:38`|
|Object name|`thread`|
</span>

<span style='font-size: 14px;'>

</span>

The messages are not stored in the thread but they are associated with the thread and you need to know the Thread ID to retrieve the message history.  This is what I do below.

The type for `thread_messages` is `<class 'openai.pagination.SyncCursorPage[ThreadMessage]>`.  A `SynCursorPage` object is iterable but does not have a length and we cannot address its members by an index,i.e. it is not subscriptable.  Since this is a very small example, there is hardly a performance penalty in converting it to a `list`.  This is what I do first after I retrieve the messages:

In [11]:

thread_messages = Client.beta.threads.messages.list(Thread_ID)
message_list=list(thread_messages)
# print(thread_messages.data)
# pr_objfields(thread_messages)

def tabulate_message(i):
    message=message_list[i]
    s="\n\n|Attribute|Value|\n|---|---|\n"
    s+="|ID|`%s`|\n"%message.id
    s+="|Created|`%s`|\n"%(ts2lt(message.created_at))
    s+="|Object name|`%s`|\n"%message.object
    s+="|assistant_id|`%s`|\n"%message.assistant_id
    s+="|thread_id|`%s`|\n"%message.thread_id
    s+="|run_id|`%s`|\n"%message.run_id
    s+="|file_ids|`%s`|\n"%message.file_ids
    s+="|role|`%s`|\n"%message.role
    s+="|content|`%s`|\n"%message.content
    mdc(s)
    mdc("\n\n")

mdc("\n\nThe following shows the fields of the six messages belonging to this thread. The messages are \
ordered in reverse. The last message \
is the initial question.  The second last message is the OpenAI answer to this question.")

for i in range(len(message_list)):
    mdc("\n\n#### Message %d ####\n\n"%i)
    tabulate_message(i)

<span style='font-size: 14px;'>

The following shows the fields of the six messages belonging to this thread. The messages are ordered in reverse. The last message is the initial question.  The second last message is the OpenAI answer to this question.</span>

<span style='font-size: 14px;'>

#### Message 0 ####

</span>

<span style='font-size: 14px;'>

|Attribute|Value|
|---|---|
|ID|`msg_FCWRhMvCELcmeC7SSG8eGX5a`|
|Created|`2024-01-19 17:20:41`|
|Object name|`thread.message`|
|assistant_id|`asst_LCdnKNcUtkxVg1aDoiQ9vHCX`|
|thread_id|`thread_nPbJmSWtwS4umNYBNqrkGoLy`|
|run_id|`run_F9wljGwD4VMTGv3mXgBwZTVi`|
|file_ids|`[]`|
|role|`assistant`|
|content|`[MessageContentText(text=Text(annotations=[], value="The local government debt per person in China was estimated at 75,000 yuan, which was mentioned to be the annual salary of a typical young mechanical engineer in Luoyang in 2019【21†source】. This suggests that the debt per capita is roughly equivalent to a year's salary for a young professional in that region."), type='text')]`|
</span>

<span style='font-size: 14px;'>

</span>

<span style='font-size: 14px;'>

#### Message 1 ####

</span>

<span style='font-size: 14px;'>

|Attribute|Value|
|---|---|
|ID|`msg_Cti97CJOjdXLr9Zg58LVgZ5R`|
|Created|`2024-01-19 17:20:30`|
|Object name|`thread.message`|
|assistant_id|`None`|
|thread_id|`thread_nPbJmSWtwS4umNYBNqrkGoLy`|
|run_id|`None`|
|file_ids|`[]`|
|role|`user`|
|content|`[MessageContentText(text=Text(annotations=[], value='How does this debt number compare against typical salaries in China?'), type='text')]`|
</span>

<span style='font-size: 14px;'>

</span>

<span style='font-size: 14px;'>

#### Message 2 ####

</span>

<span style='font-size: 14px;'>

|Attribute|Value|
|---|---|
|ID|`msg_4EDhkx9NnIbqxJWEkJWoXTDd`|
|Created|`2024-01-19 17:16:26`|
|Object name|`thread.message`|
|assistant_id|`asst_LCdnKNcUtkxVg1aDoiQ9vHCX`|
|thread_id|`thread_nPbJmSWtwS4umNYBNqrkGoLy`|
|run_id|`run_Ld7XKFuNMbFNRCSNZ0lXbWFh`|
|file_ids|`[]`|
|role|`assistant`|
|content|`[MessageContentText(text=Text(annotations=[], value='Professor Li, referred to in the debt analysis, is Professor David Daokui Li, who holds the Mansfield Freeman Chair of the Department of Finance of the School of Economics and Management at Tsinghua University【16†source】.'), type='text')]`|
</span>

<span style='font-size: 14px;'>

</span>

<span style='font-size: 14px;'>

#### Message 3 ####

</span>

<span style='font-size: 14px;'>

|Attribute|Value|
|---|---|
|ID|`msg_VAWfKppInPeKQzoosBDBvgwq`|
|Created|`2024-01-19 17:16:18`|
|Object name|`thread.message`|
|assistant_id|`None`|
|thread_id|`thread_nPbJmSWtwS4umNYBNqrkGoLy`|
|run_id|`None`|
|file_ids|`[]`|
|role|`user`|
|content|`[MessageContentText(text=Text(annotations=[], value='Who is Professor Li?'), type='text')]`|
</span>

<span style='font-size: 14px;'>

</span>

<span style='font-size: 14px;'>

#### Message 4 ####

</span>

<span style='font-size: 14px;'>

|Attribute|Value|
|---|---|
|ID|`msg_tvWcVSuSNs3bVYnW72yMxvLx`|
|Created|`2024-01-19 17:13:51`|
|Object name|`thread.message`|
|assistant_id|`asst_LCdnKNcUtkxVg1aDoiQ9vHCX`|
|thread_id|`thread_nPbJmSWtwS4umNYBNqrkGoLy`|
|run_id|`run_LGBgHiDIt5XXMhkDkA8zfPQI`|
|file_ids|`[]`|
|role|`assistant`|
|content|`[MessageContentText(text=Text(annotations=[], value="In 2020, analysis by Professor Li and his PhD student estimated that China's local government debt approached 90 trillion yuan (12.49 trillion U.S. dollars), which was about 88% of the GDP at that time. This figure is significantly higher than the estimates typically provided by organizations like the International Monetary Fund or the World Bank, which usually place it around 60 trillion yuan (8.33 trillion U.S. dollars), approximately 50% of the GDP【11†source】."), type='text')]`|
</span>

<span style='font-size: 14px;'>

</span>

<span style='font-size: 14px;'>

#### Message 5 ####

</span>

<span style='font-size: 14px;'>

|Attribute|Value|
|---|---|
|ID|`msg_patD069uJnyIpkZKD6eOwgWo`|
|Created|`2024-01-19 17:13:39`|
|Object name|`thread.message`|
|assistant_id|`None`|
|thread_id|`thread_nPbJmSWtwS4umNYBNqrkGoLy`|
|run_id|`None`|
|file_ids|`[]`|
|role|`user`|
|content|`[MessageContentText(text=Text(annotations=[], value='What is the local government debt in China?'), type='text')]`|
</span>

<span style='font-size: 14px;'>

</span>

In [12]:
def messagetext(messages, i):
    return messages.data[i].content[0].text.value
messages=Client.beta.threads.messages.list(Thread_ID)
m=int(len(messages.data)/2)
mdc("There are %d interactions in this thread.\n\n* %d questions\n* %d answers\n\n"%(m, m, m))
mdc("I will tabulate them in a table.\n\n")
s="\n\n|Index|Question|Answer|\n|---|---|---|\n"
for i in range(m):
    j=2*i
    s+="|%d"%i
    s+="|%s"%messagetext(messages, j+1)
    s+="|%s|\n"%messagetext(messages, j)
mdc(s)
mdc("\n\n")
mdc("I printed them in the order they are stored in the thread. \
The first Q&A is at the bottom and the last Q&A is at the top\n\n")


<span style='font-size: 14px;'>There are 3 interactions in this thread.

* 3 questions
* 3 answers

</span>

<span style='font-size: 14px;'>I will tabulate them in a table.

</span>

<span style='font-size: 14px;'>

|Index|Question|Answer|
|---|---|---|
|0|How does this debt number compare against typical salaries in China?|The local government debt per person in China was estimated at 75,000 yuan, which was mentioned to be the annual salary of a typical young mechanical engineer in Luoyang in 2019【21†source】. This suggests that the debt per capita is roughly equivalent to a year's salary for a young professional in that region.|
|1|Who is Professor Li?|Professor Li, referred to in the debt analysis, is Professor David Daokui Li, who holds the Mansfield Freeman Chair of the Department of Finance of the School of Economics and Management at Tsinghua University【16†source】.|
|2|What is the local government debt in China?|In 2020, analysis by Professor Li and his PhD student estimated that China's local government debt approached 90 trillion yuan (12.49 trillion U.S. dollars), which was about 88% of the GDP at that time. This figure is significantly higher than the estimates typically provided by organizations like the International Monetary Fund or the World Bank, which usually place it around 60 trillion yuan (8.33 trillion U.S. dollars), approximately 50% of the GDP【11†source】.|
</span>

<span style='font-size: 14px;'>

</span>

<span style='font-size: 14px;'>I printed them in the order they are stored in the thread. The first Q&A is at the bottom and the last Q&A is at the top

</span>

## `Run`s associated with this thread ##
The above three pairs of messages (Q and A) are generated in three runs.  OpenAI creates a new `run` object everytime a question is posted.  These `run` objects are associated with the thread and can be tetrieved as I do below.

In [13]:
runs = Client.beta.threads.runs.list(Thread_ID)
runs_list=list(runs)
mdc("There are %d runs in this thread.\n\n"%len(runs_list))

<span style='font-size: 14px;'>There are 3 runs in this thread.

</span>

Note that in order to count the number of runs, I convert `runs` to a `list`.  This would be too memory intensive if I had many runs but in this case it is no problem. The original `runs` object is of type `SyncCursorPage[Run]`. It is iterable but does not inherently have a length.

I wrote a short script, `pr_objfields()`, which I will ise to dosplay the contents of the runs objects. 

### `run` object ###



In [14]:

# The following function tabulates the fields of the ith run in the thread
def tabulate_run(i):
    run=runs_list[i]
    s="\n\n|Field|Value|\n|---|---|\n"
    s+="|id|%s|\n"%run.id
    s+="|assistant_id|%s|\n"%run.assistant_id
    s+="|cancelled_at|%s|\n"%ts2lt(run.cancelled_at)
    s+="|completed_at|%s|\n"%ts2lt(run.completed_at)
    s+="|created_at|%s|\n"%ts2lt(run.created_at)
    s+="|expires_at|%s|\n"%run.expires_at
    s+="|failed_at|%s|\n"%run.failed_at
    s+="|file_ids|%s|\n"%run.file_ids
    s+="|status|%s|\n"%run.status
    s+="instructions|%s|\n"%run.instructions
    s+="|last_error|%s|\n"%run.last_error
    s+="|metadata|%s|\n"%run.metadata
    s+="|model|%s|\n"%run.model
    s+="|object|%s|\n"%run.object
    s+="|required_action|%s|\n"%run.required_action
    s+="started_at|%s|\n"%ts2lt(run.started_at)
    s+="|status|%s|\n"%run.status
    s+="|thread_id|%s|\n"%run.thread_id
    s+="|tools|%s|\n"%run.tools
    s+="|usage|%s|\n"%run.usage
    #
    mdc(s)
    mdc("\n\n")

mdc("I will tabulate the fields of the three runs.  As you see below, the `run` object is \
quite simple.  I make the following observations:\n\n\
* The File IDs are the same that were attached to the Assistant and the same for all three runs.\n\n\
* `status` is complete for all three runs.  During the execution, we watch this field whilke waiting \
for the run to complete\n\n\
* The `assistant_id~ is the same as the one in the Playground screen\n\n\
* The `thread_id` is the same as the one in the Playground screen\n\n\
* The `model` is the same as the one in the Playground screen\n\n\
* Time stamps are the UNIX timestamps and present the number of seconds that have elapsed \
since January 1, 1970 (UTC).  I conver them to a normal time format.\n\n\
* `instructions` have the same content for all three runs.  It is the same as the instructions for the Assistant.\n\n")
#
mdc("\n\nThe questions and the answers do not appear in the run object.  \
They are in the messages object.\n\n")

for i in range(len(runs_list)):
    mdc("\n\n#### Run %d ####\n\n"%i)
    tabulate_run(i)

<span style='font-size: 14px;'>I will tabulate the fields of the three runs.  As you see below, the `run` object is quite simple.  I make the following observations:

* The File IDs are the same that were attached to the Assistant and the same for all three runs.

* `status` is complete for all three runs.  During the execution, we watch this field whilke waiting for the run to complete

* The `assistant_id~ is the same as the one in the Playground screen

* The `thread_id` is the same as the one in the Playground screen

* The `model` is the same as the one in the Playground screen

* Time stamps are the UNIX timestamps and present the number of seconds that have elapsed since January 1, 1970 (UTC).  I conver them to a normal time format.

* `instructions` have the same content for all three runs.  It is the same as the instructions for the Assistant.

</span>

<span style='font-size: 14px;'>

The questions and the answers do not appear in the run object.  They are in the messages object.

</span>

<span style='font-size: 14px;'>

#### Run 0 ####

</span>

<span style='font-size: 14px;'>

|Field|Value|
|---|---|
|id|run_F9wljGwD4VMTGv3mXgBwZTVi|
|assistant_id|asst_LCdnKNcUtkxVg1aDoiQ9vHCX|
|cancelled_at|2024-01-21 21:07:05|
|completed_at|2024-01-19 17:20:47|
|created_at|2024-01-19 17:20:30|
|expires_at|None|
|failed_at|None|
|file_ids|['file-G44JGQWzbImx8y3dy1OAuVNe', 'file-T9igFxYMHEelpJr1C07YJgLa', 'file-wcNOIlA5IU5wb7h4hOosEblg']|
|status|completed|
instructions|You are a school teacher answering students questions about the course material provided to you in text files. If the  response is not in the text files, you can respond with 'I don't know'.|
|last_error|None|
|metadata|{}|
|model|gpt-4-1106-preview|
|object|thread.run|
|required_action|None|
started_at|2024-01-19 17:20:31|
|status|completed|
|thread_id|thread_nPbJmSWtwS4umNYBNqrkGoLy|
|tools|[ToolAssistantToolsRetrieval(type='retrieval')]|
|usage|{'prompt_tokens': 0, 'completion_tokens': 0, 'total_tokens': 0}|
</span>

<span style='font-size: 14px;'>

</span>

<span style='font-size: 14px;'>

#### Run 1 ####

</span>

<span style='font-size: 14px;'>

|Field|Value|
|---|---|
|id|run_Ld7XKFuNMbFNRCSNZ0lXbWFh|
|assistant_id|asst_LCdnKNcUtkxVg1aDoiQ9vHCX|
|cancelled_at|2024-01-21 21:07:05|
|completed_at|2024-01-19 17:16:30|
|created_at|2024-01-19 17:16:19|
|expires_at|None|
|failed_at|None|
|file_ids|['file-G44JGQWzbImx8y3dy1OAuVNe', 'file-T9igFxYMHEelpJr1C07YJgLa', 'file-wcNOIlA5IU5wb7h4hOosEblg']|
|status|completed|
instructions|You are a school teacher answering students questions about the course material provided to you in text files. If the  response is not in the text files, you can respond with 'I don't know'.|
|last_error|None|
|metadata|{}|
|model|gpt-4-1106-preview|
|object|thread.run|
|required_action|None|
started_at|2024-01-19 17:16:19|
|status|completed|
|thread_id|thread_nPbJmSWtwS4umNYBNqrkGoLy|
|tools|[ToolAssistantToolsRetrieval(type='retrieval')]|
|usage|{'prompt_tokens': 0, 'completion_tokens': 0, 'total_tokens': 0}|
</span>

<span style='font-size: 14px;'>

</span>

<span style='font-size: 14px;'>

#### Run 2 ####

</span>

<span style='font-size: 14px;'>

|Field|Value|
|---|---|
|id|run_LGBgHiDIt5XXMhkDkA8zfPQI|
|assistant_id|asst_LCdnKNcUtkxVg1aDoiQ9vHCX|
|cancelled_at|2024-01-21 21:07:05|
|completed_at|2024-01-19 17:13:58|
|created_at|2024-01-19 17:13:40|
|expires_at|None|
|failed_at|None|
|file_ids|['file-G44JGQWzbImx8y3dy1OAuVNe', 'file-T9igFxYMHEelpJr1C07YJgLa', 'file-wcNOIlA5IU5wb7h4hOosEblg']|
|status|completed|
instructions|You are a school teacher answering students questions about the course material provided to you in text files. If the  response is not in the text files, you can respond with 'I don't know'.|
|last_error|None|
|metadata|{}|
|model|gpt-4-1106-preview|
|object|thread.run|
|required_action|None|
started_at|2024-01-19 17:13:40|
|status|completed|
|thread_id|thread_nPbJmSWtwS4umNYBNqrkGoLy|
|tools|[ToolAssistantToolsRetrieval(type='retrieval')]|
|usage|{'prompt_tokens': 0, 'completion_tokens': 0, 'total_tokens': 0}|
</span>

<span style='font-size: 14px;'>

</span>

In [15]:
t = Client.beta.threads

# Get all attributes and methods of the object
attributes = dir(t)

# Filter out the methods and special attributes (those starting with '__')
fields = [attr for attr in attributes if not callable(getattr(t, attr)) and not attr.startswith('__')]

# Print each field and its value
for field in fields:
    mdc(f"**{field}**: {getattr(t, field)}")


<span style='font-size: 14px;'>**_client**: <openai.OpenAI object at 0x10bfc9b10></span>

<span style='font-size: 14px;'>**messages**: <openai.resources.beta.threads.messages.messages.Messages object at 0x10c0ee550></span>

<span style='font-size: 14px;'>**runs**: <openai.resources.beta.threads.runs.runs.Runs object at 0x10c0ede10></span>

<span style='font-size: 14px;'>**with_raw_response**: <openai.resources.beta.threads.threads.ThreadsWithRawResponse object at 0x10bfcb750></span>