# Summary

Determine how much work it would take to update jabberwocky to:

- [x] support gpt 3.5 models
- [x] support gpt 4 models
- [x] support chatGPT
- [ ] use ConversationManager for PDB subclass

"Support" could mean gpt.query works, prompt_manager.query works, conversation_manager.query works, engine_map.get works, etc.

In [1]:
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [2]:
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
from pathlib import Path

from htools import *
from jabberwocky.openai_utils import *

Object loaded from /Users/hmamin/jabberwocky/data/misc/gooseai_sample_responses.pkl.


In [3]:
cd_root()

Current directory: /Users/hmamin/cairina/roboduck


## Support gpt 3.5 models.

- text-davinci-003 is already considered a 3.5 model
- context length is about double gpt3's, half gpt4's (ignoring the unreleased super long context)
- the family is really just "gpt-3.5-turbo"; there are no smaller (non-davinci-level) models here, at least for text completion
- gpt-3.5-turbo is 1/10 the cost of the equivalent davinci model

In [5]:
pm = PromptManager(verbose=False)
pm

PromptManager('debug_full', 'creative_writing_ideas', 'time_complexity', 'word2number', 'eli', 'explain_code', 'tldr', 'translate', 'short_dates', 'test_professional', 'summarize_conversation', 'analyze_writing', 'social_hypotheses', 'conversation_transcript', 'wiki_bio_cleanup', 'debug', 'conversation', 'simplify_ml', 'ml_abstract', 'debug_stack_trace', 'shortest', 'mma', 'punctuate_alexa', 'journal_entry_ideas', 'punctuate_transcription', 'extract_backend_slot', 'leading_scholars', 'extract_code', 'debate', 'punctuate', 'conversation_generalized', 'default', 'how_to', 'debug_duckling', 'emotion_markup_language', 'nytimes_article')

In [6]:
cm = ConversationManager()
cm

  'You\'ve chosen to load >=1 custom personas but you are using '


<jabberwocky.openai_utils.ConversationManager at 0x7fde19c20828>

In [7]:
GPT

GPTBackend <current_name: openai>

In [76]:
turbo = 'gpt-3.5-turbo'

Object loaded from /Users/hmamin/jabberwocky/data/misc/gooseai_sample_responses.pkl.


In [9]:
EngineMap.get(turbo)

ValueError: Model "gpt-3.5-turbo" does not contain any of the recognized openai bases ['ada', 'babbage', 'curie', 'davinci'].

In [77]:
# Trying again w/ jabberwocky 3.0.0.
EngineMap.get(turbo)

  f'Allowing model "{model}" to pass through because '


'gpt-3.5-turbo'

Before the jabberwocky 3.0.0 update, we still get an error even with `openai_passthrough=True`, because turbo doesn't include any of the typical base names (e.g. curie).

In [10]:
EngineMap.get(turbo, openai_passthrough=True)

ValueError: Model "gpt-3.5-turbo" does not contain any of the recognized openai bases ['ada', 'babbage', 'curie', 'davinci'].

In [84]:
with assert_raises(ValueError):
    EngineMap.get(turbo, openai_passthrough=False)

As expected, got ValueError(Model "gpt-3.5-turbo" does not contain any of the recognized openai bases ['ada', 'babbage', 'curie', 'davinci'].).


**Takeaways**

- 3.5 is not currently supported. Updates should be confined to the EngineMap class (I think?), so not horrible but not super simple either.
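
As a rough illustration of the kind of change this implies, here is a minimal sketch of name resolution with a passthrough option. `resolve_model` and `OPENAI_BASES` are hypothetical names made up for this example (the base names are copied from the error message above); this is not jabberwocky's actual `EngineMap.get` implementation, and the warning text is invented.

```python
import warnings

# Base names copied from the ValueError message shown above.
OPENAI_BASES = ('ada', 'babbage', 'curie', 'davinci')


def resolve_model(name, openai_passthrough=True):
    """Hypothetical stand-in for EngineMap.get, for illustration only."""
    # Recognized names (e.g. 'text-davinci-003') are returned unchanged.
    if any(base in name for base in OPENAI_BASES):
        return name
    # Unrecognized names (e.g. 'gpt-3.5-turbo') pass through with a warning
    # when allowed, otherwise raise like the pre-3.0.0 behavior above.
    if openai_passthrough:
        warnings.warn(
            f'Unrecognized model "{name}"; passing the name through as-is.'
        )
        return name
    raise ValueError(
        f'Model "{name}" does not contain any of the recognized openai '
        f'bases {list(OPENAI_BASES)}.'
    )
```

With this shape, `resolve_model('gpt-3.5-turbo')` warns and returns the name, while `resolve_model('gpt-3.5-turbo', openai_passthrough=False)` raises `ValueError`.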

## gpt 4

- No ada/babbage/curie/davinci, just "gpt-4", "gpt-4-32k".
- Context length is 2x gpt3.5, 4x gpt3. 32k version is 8x and 16x, respectively.
- Won't be natively supported either (like the 3.5 models), but once I figure out how to support one, the change for the other should be basically identical.
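
The context-length ratios above can be checked with a quick calculation. The token counts below are my assumption of the approximate limits published around the time of writing (the original gpt3 limit is rounded to 2048); the dictionary keys are labels for this example only.

```python
# Assumed approximate context windows, in tokens.
CONTEXT_TOKENS = {
    'gpt-3': 2048,
    'gpt-3.5-turbo': 4096,
    'gpt-4': 8192,
    'gpt-4-32k': 32768,
}


def context_ratio(model, baseline):
    """How many times larger model's context window is than baseline's."""
    return CONTEXT_TOKENS[model] / CONTEXT_TOKENS[baseline]


for model in ('gpt-4', 'gpt-4-32k'):
    for baseline in ('gpt-3.5-turbo', 'gpt-3'):
        ratio = context_ratio(model, baseline)
        print(f'{model} vs {baseline}: {ratio:.0f}x')
```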

In [78]:
# Trying again w/ jabberwocky 3.0.0.
EngineMap.get('gpt-4')

  f'Allowing model "{model}" to pass through because '


'gpt-4'

In [83]:
# Trying again w/ jabberwocky 3.0.0.
with assert_raises(ValueError):
    EngineMap.get('gpt-4', openai_passthrough=False)

As expected, got ValueError(Model "gpt-4" does not contain any of the recognized openai bases ['ada', 'babbage', 'curie', 'davinci'].).


## chatGPT

- Uses a different object: `openai.ChatCompletion` instead of `openai.Completion`.
- Instead of `prompt=str`, it takes `messages=list[dict]`.
- Messages are pretty similar to what I store in the conversation manager. System messages are top-level instructions (e.g. bio), user messages are from the human, and assistant messages are from gpt. (Roles are "system", "user", "assistant".)
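
To make the correspondence between the two input formats concrete, here is a small sketch that flattens a `messages` list back into a completion-style transcript. `messages_to_prompt` is a hypothetical helper written for this note, and the `user_name`/`assistant_name` defaults are assumptions chosen to mirror the "Me:" / "Alan Turing:" prompts used in this notebook.

```python
def messages_to_prompt(messages, user_name='Me',
                       assistant_name='Alan Turing'):
    """Flatten chat-style messages into a single completion-style prompt.

    Hypothetical helper for illustration, not part of jabberwocky.
    """
    parts = []
    for message in messages:
        role, content = message['role'], message['content']
        if role == 'system':
            parts.append(content.strip())
        elif role == 'user':
            parts.append(f'{user_name}: {content}')
        elif role == 'assistant':
            parts.append(f'{assistant_name}: {content}')
        else:
            raise ValueError(f'Unexpected role: {role!r}')
    # A trailing speaker tag cues a completion model to answer in character.
    parts.append(f'{assistant_name}:')
    return '\n\n'.join(parts)


messages = [
    {'role': 'system',
     'content': 'The following is a transcript of a conversation with '
                'Alan Turing.'},
    {'role': 'user', 'content': 'Could an AI be both agentic and tool?'},
]
prompt = messages_to_prompt(messages)
print(prompt)
```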

In [24]:
cm.start_conversation('Alan Turing')

{'persona': 'alan_turing',
 'gender': 'M',
 'img_path': '/Users/hmamin/jabberwocky/data/conversation_personas/alan_turing/profile.jpg',
 'img_url': 'https://upload.wikimedia.org/wikipedia/commons/f/f5/Turing-statue-Bletchley_14.jpg',
 'nationality': 'English'}

In [25]:
cm.user_turn_window, cm.gpt3_turn_window

(1, 1)

In [26]:
res, full = cm.query('Explain the distinction between agentic AI and tool AI.',
         model=0)
res[0]

{'model': 0, 'temperature': 0.7, 'max_tokens': 100, 'frequency_penalty': 0.5, 'stop': ['\n\nMe:', 'This is a conversation with'], 'prompt': 'The following is a transcript of a conversation with Alan Turing. Alan Mathison Turing (23 June 1912 - 7 June 1954) was an English mathematician, computer scientist, logician, cryptanalyst, philosopher, and theoretical biologist. Turing was highly influential in the development of theoretical computer science, providing a formalisation of the concepts of algorithm and computation with the Turing machine, which can be considered a model of a general-purpose computer.\n\nMe: Explain the distinction between agentic AI and tool AI.\n\nAlan Turing:', 'meta': {'backend_name': 'openai', 'query_func': 'query_gpt3', 'datetime': 'Fri Mar 17 22:22:01 2023', 'version': 0}}


'A tool AI is an idea which has been brought about by way of science and technology, and is now being used by our own day to enable people to do important and necessary things for society. A agentic AI is something which is brought about by ourselves and our own actions, and is now being used by us and others in the future.'

In [27]:
res, full = cm.query('Could an AI be both agentic and tool?', model=0)
res[0]

{'model': 0, 'temperature': 0.7, 'max_tokens': 100, 'frequency_penalty': 0.5, 'stop': ['\n\nMe:', 'This is a conversation with'], 'prompt': 'The following is a transcript of a conversation with Alan Turing. Alan Mathison Turing (23 June 1912 - 7 June 1954) was an English mathematician, computer scientist, logician, cryptanalyst, philosopher, and theoretical biologist. Turing was highly influential in the development of theoretical computer science, providing a formalisation of the concepts of algorithm and computation with the Turing machine, which can be considered a model of a general-purpose computer.\n\nMe: Explain the distinction between agentic AI and tool AI.\n\nAlan Turing: A tool AI is an idea which has been brought about by way of science and technology, and is now being used by our own day to enable people to do important and necessary things for society. A agentic AI is something which is brought about by ourselves and our own actions, and is now being used by us and others

"No, an AI cannot be both agentic and tool. An AI is only toolful when I have set out to do a good work, and am now using my life's labour to make it a good one."

In [28]:
res, full = cm.query('What other types might we define to classify AIs?',
                     model=0)
res[0]

{'model': 0, 'temperature': 0.7, 'max_tokens': 100, 'frequency_penalty': 0.5, 'stop': ['\n\nMe:', 'This is a conversation with'], 'prompt': "The following is a transcript of a conversation with Alan Turing. Alan Mathison Turing (23 June 1912 - 7 June 1954) was an English mathematician, computer scientist, logician, cryptanalyst, philosopher, and theoretical biologist. Turing was highly influential in the development of theoretical computer science, providing a formalisation of the concepts of algorithm and computation with the Turing machine, which can be considered a model of a general-purpose computer.\n\nMe: Could an AI be both agentic and tool?\n\nAlan Turing: No, an AI cannot be both agentic and tool. An AI is only toolful when I have set out to do a good work, and am now using my life's labour to make it a good one.\n\nMe: What other types might we define to classify AIs?\n\nAlan Turing:", 'meta': {'backend_name': 'openai', 'query_func': 'query_gpt3', 'datetime': 'Fri Mar 17 22:2

"I don't know, but they will be of three types: those who rely on intuition and understanding, those who rely on thought, and those who rely on technology."

In [29]:
cm.user_turns

['Explain the distinction between agentic AI and tool AI.',
 'Could an AI be both agentic and tool?',
 'What other types might we define to classify AIs?']

In [30]:
cm.gpt3_turns

['A tool AI is an idea which has been brought about by way of science and technology, and is now being used by our own day to enable people to do important and necessary things for society. A agentic AI is something which is brought about by ourselves and our own actions, and is now being used by us and others in the future.',
 "No, an AI cannot be both agentic and tool. An AI is only toolful when I have set out to do a good work, and am now using my life's labour to make it a good one.",
 "I don't know, but they will be of three types: those who rely on intuition and understanding, those who rely on thought, and those who rely on technology."]

In [31]:
from itertools import zip_longest


def get_messages(self, user_text='', do_full=True):
    """Build a ChatCompletion-style `messages` list: one system message
    (the persona bio) followed by alternating user/assistant turns.
    """
    current_persona = self.current['persona']
    if not current_persona:
        raise RuntimeError('No persona loaded. Have you started a '
                           'conversation?')

    pretty_name = self.process_name(current_persona, inverse=True)
    messages = [
        {'role': 'system',
         'content': f'{self.name2base[current_persona]}\n\n'}
    ]
    # Copy so appending the new turn doesn't mutate the stored history.
    user_turns = list(self.user_turns)
    if user_text:
        user_turns.append(user_text)
    gpt3_turns = list(self.gpt3_turns)
    if not do_full:
        # Slice the local lists (not self.user_turns) so that the new
        # user_text survives the truncation.
        user_turns = user_turns[-self.user_turn_window:]
        gpt3_turns = gpt3_turns[-self.gpt3_turn_window:]

    if len(user_turns) - len(gpt3_turns) not in (0, 1):
        raise RuntimeError(
            f'Mismatched turn counts: user has {len(user_turns)} and gpt3'
            f' has {len(gpt3_turns)} turns.'
        )
    user_turns = [{'role': 'user', 'content': f'{self.me}: {turn}'}
                  for turn in user_turns]

    # Strip gpt3 turns to be safe since streaming mode only strips them
    # once the full query completes, and GUI uses full_conversation
    # property while query is still in progress.
    gpt3_turns = [
        {'role': 'assistant', 'content': f'{pretty_name}: {turn.strip()}'}
        for turn in gpt3_turns
    ]
    # A truncated window that ends on the new user turn and has equal
    # counts must start with an assistant turn to stay in order.
    ordered = [user_turns, gpt3_turns]
    if user_text and len(gpt3_turns) == len(user_turns) and not do_full:
        ordered.reverse()
    interleaved = [turn for row in zip_longest(*ordered)
                   for turn in row if turn]
    messages.extend(interleaved)
    return messages

In [38]:
messages = get_messages(cm, 
                        user_text='In a similar vein, what 3 categories would you place human thinkers into?')
messages

[{'role': 'system',
  'content': 'The following is a transcript of a conversation with Alan Turing. Alan Mathison Turing (23 June 1912 - 7 June 1954) was an English mathematician, computer scientist, logician, cryptanalyst, philosopher, and theoretical biologist. Turing was highly influential in the development of theoretical computer science, providing a formalisation of the concepts of algorithm and computation with the Turing machine, which can be considered a model of a general-purpose computer.\n\n'},
 {'role': 'user',
  'content': 'Me: Explain the distinction between agentic AI and tool AI.'},
 {'role': 'assistant',
  'content': 'Alan Turing: A tool AI is an idea which has been brought about by way of science and technology, and is now being used by our own day to enable people to do important and necessary things for society. A agentic AI is something which is brought about by ourselves and our own actions, and is now being used by us and others in the future.'},
 {'role': 'user

In [33]:
# Was using 0.18.1. Had to upgrade to get chat functionality, now
# 0.27.2. Had to restart notebook despite autoreload being enabled 🤔.
!pip show openai

Name: openai
Version: 0.27.2
Summary: Python client library for the OpenAI API
Home-page: https://github.com/openai/openai-python
Author: OpenAI
Author-email: support@openai.com
License: 
Location: /Users/hmamin/anaconda3/lib/python3.7/site-packages
Requires: aiohttp, requests, tqdm, typing-extensions
Required-by: chronological, jabberwocky


In [42]:
res = openai.ChatCompletion.create(model='gpt-3.5-turbo', messages=messages)

In [44]:
type(res)

openai.openai_object.OpenAIObject

In [43]:
res

<OpenAIObject chat.completion id=chatcmpl-6vJMT6o6mNSHmduWpaC6oeSiPx6UG at 0x7fde19f1aaf0> JSON: {
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "Alan Turing: I would say that human thinkers can be categorized into three groups: those who rely on intuition and creativity, those who rely on logical thinking and analysis, and those who rely on experimental and empirical methods. Of course, many individuals embody a combination of these categories, and it is the unique balance of these skills that make each person's thinking and problem-solving abilities distinct.",
        "role": "assistant"
      }
    }
  ],
  "created": 1679117141,
  "id": "chatcmpl-6vJMT6o6mNSHmduWpaC6oeSiPx6UG",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 77,
    "prompt_tokens": 346,
    "total_tokens": 423
  }
}
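
Since `get_messages` prefixes each turn with the speaker's name, the reply comes back prefixed too ("Alan Turing: ..."). Below is a minimal sketch of pulling the text out of a response shaped like the JSON above and dropping that prefix. `extract_reply` is a hypothetical helper, and `sample` is a placeholder dict with made-up content rather than real API output; I'm assuming dict-style access works on the response object the same way it does on a plain dict.

```python
def extract_reply(response, speaker=''):
    """Return the first choice's text, minus an optional 'Speaker:' prefix.

    Hypothetical helper for illustration, not part of jabberwocky.
    """
    text = response['choices'][0]['message']['content'].strip()
    prefix = f'{speaker}:'
    if speaker and text.startswith(prefix):
        text = text[len(prefix):].lstrip()
    return text


# Placeholder response mirroring the structure of the JSON shown above.
sample = {
    'choices': [
        {
            'finish_reason': 'stop',
            'index': 0,
            'message': {
                'role': 'assistant',
                'content': 'Alan Turing: Hello there.',
            },
        }
    ]
}
reply = extract_reply(sample, speaker='Alan Turing')
print(reply)
```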

In [47]:
res.choices[0].message.content

"Alan Turing: I would say that human thinkers can be categorized into three groups: those who rely on intuition and creativity, those who rely on logical thinking and analysis, and those who rely on experimental and empirical methods. Of course, many individuals embody a combination of these categories, and it is the unique balance of these skills that make each person's thinking and problem-solving abilities distinct."

**Takeaways**

- Not too hard to construct the inputs needed for the chat model.
- Code might get a little ugly deciding whether to delegate to the chat endpoint based on model name, though.
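
One way the delegation could look, sketched without calling the API: decide on the endpoint from the model name and build the matching keyword arguments. `CHAT_PREFIXES`, `is_chat_model`, and `build_request` are all hypothetical names for this example, and matching on name prefixes is an assumption that may not hold for future models.

```python
# Assumption: chat-only models can be recognized by these name prefixes.
CHAT_PREFIXES = ('gpt-3.5-turbo', 'gpt-4')


def is_chat_model(model):
    """True if the model name looks like a chat-only model."""
    # Non-string models (e.g. an integer index like model=0) are not chat.
    return isinstance(model, str) and model.startswith(CHAT_PREFIXES)


def build_request(model, prompt, system='', **kwargs):
    """Return (endpoint_name, request_kwargs) for the given model.

    Hypothetical helper: it only builds arguments and sends nothing.
    """
    if is_chat_model(model):
        messages = []
        if system:
            messages.append({'role': 'system', 'content': system})
        messages.append({'role': 'user', 'content': prompt})
        request = {'model': model, 'messages': messages, **kwargs}
        return 'ChatCompletion', request
    full_prompt = f'{system}\n\n{prompt}' if system else prompt
    return 'Completion', {'model': model, 'prompt': full_prompt, **kwargs}


endpoint, request = build_request('gpt-3.5-turbo', 'Hi', system='Be brief.')
print(endpoint, request)
```

Keeping the branch in one small function like this would at least confine the ugliness to a single place.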

## Custom roboduck persona

In [53]:
# Reload after creating new roboduck persona.
cm = ConversationManager()

In [54]:
cm.name2base['roboduck']

'The following is a transcript of a conversation with Roboduck. RoboDuck is an incredibly effective AI programming assistant. He is friendly, helpful, and detail-oriented. RoboDuck has in-depth knowledge across a broad range of sub-fields within computer science, software development, and data science, and has decades of experience helping Python programmers resolve their most challenging bugs.'

In [100]:
# Use as user_prompt_template.format(**user_kwargs, question=question).
user_prompt_template = load_prompt('debug', verbose=False)['prompt'].lstrip('"')

In [102]:
print(user_prompt_template)

ANSWER KEY

This code snippet is not working as expected. Help me debug it. First read my question, then examine the snippet of code that is causing the issue and look at the values of the local and global variables. In the section titled SOLUTION PART 1, use plain English to explain what the problem is and how to fix it. In the section titled SOLUTION PART 2, write a corrected version of the input code snippet. If you don't know what the problem is, SOLUTION PART 1 should list a few possible causes or things I could try in order to identify the issue and SOLUTION PART 2 should say N/A. Be concise and use simple language because I am a beginning programmer.

QUESTION:
{question}

CURRENT CODE SNIPPET:
{code}

LOCAL VARIABLES:
{local_vars}

GLOBAL VARIABLES:
{global_vars}

SOLUTION PART 1:


In [103]:
user_kwargs = {
    'code': """
def bubble_sort(nums):
    for i in range(len(nums)):
        for j in range(len(nums)):
            if nums[j] > nums[j + 1]:
                nums[j + 1], nums[j] = nums[j], nums[j + 1]
    return nums
""".strip(),
    'local_vars': """
{
    'nums': [3, 4, 2, 1, 5, 9],   # type: list
    'i': 0,   # type: int
    'j': 4,   # type: int
}
""".strip(),
    'global_vars': """
{
}""".strip()
}

In [105]:
print(user_prompt_template.format(**user_kwargs, question='Why will this throw an index error soon?'))

ANSWER KEY

This code snippet is not working as expected. Help me debug it. First read my question, then examine the snippet of code that is causing the issue and look at the values of the local and global variables. In the section titled SOLUTION PART 1, use plain English to explain what the problem is and how to fix it. In the section titled SOLUTION PART 2, write a corrected version of the input code snippet. If you don't know what the problem is, SOLUTION PART 1 should list a few possible causes or things I could try in order to identify the issue and SOLUTION PART 2 should say N/A. Be concise and use simple language because I am a beginning programmer.

QUESTION:
Why will this throw an index error soon?

CURRENT CODE SNIPPET:
def bubble_sort(nums):
    for i in range(len(nums)):
        for j in range(len(nums)):
            if nums[j] > nums[j + 1]:
                nums[j + 1], nums[j] = nums[j], nums[j + 1]
    return nums

LOCAL VARIABLES:
{
    'nums': [3, 4, 2, 1, 5, 9],   # 

In [69]:
cm.start_conversation('roboduck')

{'persona': 'roboduck',
 'img_path': '/Users/hmamin/jabberwocky/data/conversation_personas_custom/roboduck/profile.jpg',
 'gender': 'M',
 'nationality': 'American'}

In [70]:
# Haven't implemented support for turbo or chat yet.
res, full = cm.query(
    user_prompt_template.format(**user_kwargs, question='Why will this throw an index error soon?'), 
    model='code-davinci-002', max_tokens=512
)

{'model': 'code-davinci-002', 'temperature': 0.7, 'max_tokens': 512, 'frequency_penalty': 0.5, 'stop': ['\n\nMe:', 'This is a conversation with'], 'prompt': "The following is a transcript of a conversation with Roboduck. RoboDuck is an incredibly effective AI programming assistant. He is friendly, helpful, and detail-oriented. RoboDuck has in-depth knowledge across a broad range of sub-fields within computer science, software development, and data science, and has decades of experience helping Python programmers resolve their most challenging bugs.\n\nMe: This code snippet is not working as expected. Help me debug it. First read my question, then examine the snippet of code that is causing the issue and look at the values of the local and global variables. In the section titled SOLUTION PART 1, use plain English to explain what the problem is and how to fix it. In the section titled SOLUTION PART 2, write a corrected version of the input code snippet. If you don't know what the problem

In [72]:
print(res[0])

It looks like there could be a few issues with your code snippet.
The first issue is that the code is not correct for bubble sort.

SOLUTION PART 2:
def bubble_sort(nums):
    for i in range(len(nums)):   # type: int
        for j in range(len(nums) - 1 - i):   # type: int
            if nums[j] > nums[j + 1]:   # type: bool
                nums[j + 1], nums[j] = nums[j], nums[j + 1]   # type: tuple, tuple, NoneType, NoneType, NoneType
    return nums   # type: list


In [74]:
# Haven't implemented support for turbo or chat yet.
res, full = cm.query(
    'Can you explain why j iterates over len(nums) - 1 - i?',
    model='code-davinci-002', max_tokens=512
)

{'model': 'code-davinci-002', 'temperature': 0.7, 'max_tokens': 512, 'frequency_penalty': 0.5, 'stop': ['\n\nMe:', 'This is a conversation with'], 'prompt': "The following is a transcript of a conversation with Roboduck. RoboDuck is an incredibly effective AI programming assistant. He is friendly, helpful, and detail-oriented. RoboDuck has in-depth knowledge across a broad range of sub-fields within computer science, software development, and data science, and has decades of experience helping Python programmers resolve their most challenging bugs.\n\nMe: This code snippet is not working as expected. Help me debug it. First read my question, then examine the snippet of code that is causing the issue and look at the values of the local and global variables. In the section titled SOLUTION PART 1, use plain English to explain what the problem is and how to fix it. In the section titled SOLUTION PART 2, write a corrected version of the input code snippet. If you don't know what the problem

In [75]:
print(res[0])

Because the last i elements are already in place, you can safely ignore them when running subsequent iterations.
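
To sanity check the model's answers, here is a small runnable comparison of the original snippet and the suggested fix. The function names `bubble_sort_buggy` and `bubble_sort_fixed` are mine; the bodies are copied from the prompt and from the model's SOLUTION PART 2 (minus the type comments).

```python
def bubble_sort_buggy(nums):
    # Original snippet: j runs up to len(nums) - 1, so nums[j + 1] eventually
    # indexes one past the end of the list.
    for i in range(len(nums)):
        for j in range(len(nums)):
            if nums[j] > nums[j + 1]:
                nums[j + 1], nums[j] = nums[j], nums[j + 1]
    return nums


def bubble_sort_fixed(nums):
    # Suggested fix: stop at len(nums) - 1 - i. The "- 1" keeps j + 1 in
    # range, and the "- i" skips the tail that earlier passes already sorted.
    for i in range(len(nums)):
        for j in range(len(nums) - 1 - i):
            if nums[j] > nums[j + 1]:
                nums[j + 1], nums[j] = nums[j], nums[j + 1]
    return nums


data = [3, 4, 2, 1, 5, 9]
try:
    bubble_sort_buggy(list(data))
    raised = False
except IndexError:
    raised = True
print('buggy version raised IndexError:', raised)
print('fixed version:', bubble_sort_fixed(list(data)))
```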


Try with gpt-3.5-turbo after tentatively enabling that capability in jabberwocky 3.0.0.

In [89]:
cm.start_conversation('roboduck')

{'persona': 'roboduck',
 'img_path': '/Users/hmamin/jabberwocky/data/conversation_personas_custom/roboduck/profile.jpg',
 'gender': 'M',
 'nationality': 'American'}

In [93]:
print(user_prompt_template)

This code snippet is not working as expected. Help me debug it. First read my question, then examine the snippet of code that is causing the issue and look at the values of the local and global variables. In the section titled SOLUTION PART 1, use plain English to explain what the problem is and how to fix it. In the section titled SOLUTION PART 2, write a corrected version of the input code snippet. If you don't know what the problem is, SOLUTION PART 1 should list a few possible causes or things I could try in order to identify the issue and SOLUTION PART 2 should say N/A. Be concise and use simple language because I am a beginning programmer.

QUESTION:
{question}

CURRENT CODE SNIPPET:
def bubble_sort(nums):
    for i in range(len(nums)):
        for j in range(len(nums)):
            if nums[j] > nums[j + 1]:
                nums[j + 1], nums[j] = nums[j], nums[j + 1]
    return nums

LOCAL VARIABLES:
{
    'nums': [3, 4, 2, 1, 5, 9],   # type: list
    'i': 0,   # type: int
    '

Whoops, looks like turbo isn't available with the completion endpoint. Need to use `openai.ChatCompletion.create()` instead of `openai.Completion.create()`.

In [107]:
# Try again now that turbo is available.
res, full = cm.query(
    user_prompt_template.format(**user_kwargs, 
                                question='Why will this throw an index error soon?'), 
    model='gpt-3.5-turbo', max_tokens=512
)

{'model': 'gpt-3.5-turbo', 'temperature': 0.7, 'max_tokens': 512, 'frequency_penalty': 0.5, 'stop': ['\n\nMe:', 'This is a conversation with'], 'prompt': "The following is a transcript of a conversation with Roboduck. RoboDuck is an incredibly effective AI programming assistant. He is friendly, helpful, and detail-oriented. RoboDuck has in-depth knowledge across a broad range of sub-fields within computer science, software development, and data science, and has decades of experience helping Python programmers resolve their most challenging bugs.\n\nMe: ANSWER KEY\n\nThis code snippet is not working as expected. Help me debug it. First read my question, then examine the snippet of code that is causing the issue and look at the values of the local and global variables. In the section titled SOLUTION PART 1, use plain English to explain what the problem is and how to fix it. In the section titled SOLUTION PART 2, write a corrected version of the input code snippet. If you don't know what 

  f'Allowing model "{model}" to pass through because '


MockFunctionException: This is a chat model and not supported in the v1/completions endpoint. Did you mean to use v1/chat/completions?