In [13]:
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [15]:
from langchain.prompts import PromptTemplate, \
    HumanMessagePromptTemplate, SystemMessagePromptTemplate
from langchain.schema import SystemMessage
from langchain.chat_models import ChatOpenAI
from langchain.chat_models.base import BaseChatModel
from langchain.schema import ChatResult, ChatGeneration
import matplotlib.pyplot as plt
import numpy as np
import os
import pandas as pd
from pathlib import Path

from htools import *
from jabberwocky.openai_utils import *

In [3]:
cd_root()

Current directory: /Users/hmamin/roboduck


In [4]:
os.environ['OPENAI_API_KEY'] = api_key = load_openai_api_key()

In [5]:
print(load_prompt('debug')['prompt'])

debug: Could try davinci text as well but codex is free for now. You may want to strip triple double-quotes from the end in case codex generates them (we don't include that as a stop phrase because codex might generate a docstring as part of a correct code snippet).
-------------------------------------------------------------------------------

"""ANSWER KEY

This code snippet is not working as expected. Help me debug it. First read my question, then examine the snippet of code that is causing the issue and look at the values of the local and global variables. In the section titled SOLUTION PART 1, use plain English to explain what the problem is and how to fix it. In the section titled SOLUTION PART 2, write a corrected version of the input code snippet. If you don't know what the problem is, SOLUTION PART 1 should list a few possible causes or things I could try in order to identify the issue and SOLUTION PART 2 should say N/A. Be concise and use simple language because I am a begin

In [6]:
system_prompt_text = """You are an incredibly effective AI programming assistant. You have in-depth knowledge across a broad range of sub-fields within computer science, software development, and data science, and your goal is to help Python programmers resolve their most challenging bugs.
"""
system_prompt = SystemMessage(content=system_prompt_text)

In [7]:
user_prompt_text = """This code snippet is not working as expected. Help me debug it. First read my question, then examine the snippet of code that is causing the issue and look at the values of the local and global variables. Your response must have exactly two parts. In the section titled SOLUTION PART 1, use plain English to explain what the problem is and how to fix it (if you don't know what the problem is, SOLUTION PART 1 should instead list a few possible causes or things I could try in order to identify the issue). In the section titled SOLUTION PART 2, write a corrected version of the input code snippet (if you don't know, SOLUTION PART 2 should say None). SOLUTION PART 2 must contain only python code - there must not be any English explanation outside of code comments or docstrings. Be concise and use simple language because I am a beginning programmer.

QUESTION:
{question}

CURRENT CODE SNIPPET:
{code}

LOCAL VARIABLES:
{local_vars}

GLOBAL VARIABLES:
{global_vars}"""
user_prompt_template = HumanMessagePromptTemplate.from_template(
    user_prompt_text
)

In [8]:
kwargs = {
    'question': 'Why will this throw an index error soon?',
    'code': """def bubble_sort(nums):
    for i in range(len(nums)):
        for j in range(len(nums)):
            if nums[j] > nums[j + 1]:
                nums[j + 1], nums[j] = nums[j], nums[j + 1]
    return nums""",
    'local_vars': """{
    'nums': [3, 4, 2, 1, 5, 9],   # type: list
    'i': 0,   # type: int
    'j': 4,   # type: int
}""",
    'global_vars': """{
}"""
}

In [9]:
messages = [
    system_prompt,
    user_prompt_template.format(**kwargs)
]

In [10]:
print('\n'.join(m.content for m in messages))

You are an incredibly effective AI programming assistant. You have in-depth knowledge across a broad range of sub-fields within computer science, software development, and data science, and your goal is to help Python programmers resolve their most challenging bugs.

This code snippet is not working as expected. Help me debug it. First read my question, then examine the snippet of code that is causing the issue and look at the values of the local and global variables. Your response must have exactly two parts. In the section titled SOLUTION PART 1, use plain English to explain what the problem is and how to fix it (if you don't know what the problem is, SOLUTION PART 1 should instead list a few possible causes or things I could try in order to identify the issue). In the section titled SOLUTION PART 2, write a corrected version of the input code snippet (if you don't know, SOLUTION PART 2 should say None). SOLUTION PART 2 must contain only python code - there must not be any English ex

In [358]:
chat = ChatOpenAI(temperature=.66, a='b', max_tokens=33, 
                  model_name='gpt-3.5-turbo')
chat

ChatOpenAI(verbose=False, callback_manager=<langchain.callbacks.shared.SharedCallbackManager object at 0x7fed643756d0>, client=<class 'openai.api_resources.chat_completion.ChatCompletion'>, model_name='gpt-3.5-turbo', model_kwargs={'temperature': 0.66, 'a': 'b'}, openai_api_key=None, request_timeout=60, max_retries=6, streaming=False, n=1, max_tokens=33)

In [359]:
chat.model_kwargs

{'temperature': 0.66, 'a': 'b'}

In [360]:
chat.max_tokens

33

In [58]:
res = chat(messages)
print(res.content)

SOLUTION PART 1:
The code will throw an index error soon because the inner loop is iterating up to the length of the list, which means that on the last iteration, `nums[j + 1]` will be out of range. To fix this, we need to change the range of the inner loop to `range(len(nums) - i - 1)`.

SOLUTION PART 2:

```
def bubble_sort(nums):
    for i in range(len(nums)):
        for j in range(len(nums) - i - 1):
            if nums[j] > nums[j + 1]:
                nums[j + 1], nums[j] = nums[j], nums[j + 1]
    return nums
```


In [60]:
messages.append(res)
messages.append(
    user_prompt_template.format(
        **{**kwargs, 
           'question': 'Can you revise your solution so you only find the length of nums once?'}
    )
)

In [62]:
res = chat(messages)
print(res.content)

SOLUTION PART 1:
The problem with the current code is that it is calling `len(nums)` twice in the inner loop, which is inefficient. To fix this, we can store the length of `nums` in a variable before the loop and use that variable instead.

SOLUTION PART 2:

```
def bubble_sort(nums):
    n = len(nums)
    for i in range(n):
        for j in range(n - i - 1):
            if nums[j] > nums[j + 1]:
                nums[j + 1], nums[j] = nums[j], nums[j + 1]
    return nums
```


Took a first stab at storing this prompt info in a file (.py for now). Try loading it.

In [617]:
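# Explicit imports for names used in the cells below (some of these may
# already be re-exported by the star imports above).
from functools import partial, wraps
from inspect import Parameter, signature
import importlib
from langchain.schema import AIMessage
from langchain.callbacks.base import CallbackManager, BaseCallbackHandler
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from roboduck.prompts import load_template
from roboduck.utils import colored
import re
import sys
import time
import warnings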

In [645]:
chat.chat.callback_manager.on_llm_new_token??

In [767]:
class LiveTypingCallbackHandler(StreamingStdOutCallbackHandler):
    # This parent class mostly just saves us from having to write a
    # bunch of boilerplate methods that do nothing, but it also makes sense
    # since this implements a specific subcase of streaming to stdout.
    always_verbose = True
    
    def __init__(self, color='green', sleep=.01):
        self.color = color
        self.sleep = sleep
    
    def on_llm_new_token(self, token, **kwargs):
        """Run on new LLM token. Only available when streaming is enabled."""
        for char in token:
            sys.stdout.write(colored(char, self.color))
            time.sleep(self.sleep)
        sys.stdout.flush()
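
The typing effect itself does not depend on langchain. A minimal sketch of the same idea that writes to any file-like stream, assuming plain ANSI escape codes for the color instead of roboduck's `colored` helper (the `type_out` name and its parameters are hypothetical):

```python
import io
import time


def type_out(token, stream, sleep=0.0, prefix='\x1b[32m', suffix='\x1b[0m'):
    """Write `token` to `stream` one character at a time.

    `prefix` and `suffix` are ANSI codes (green on, reset) wrapped around
    each character, roughly what a coloring helper would emit.
    """
    for char in token:
        stream.write(f'{prefix}{char}{suffix}')
        stream.flush()
        time.sleep(sleep)


buf = io.StringIO()
type_out('hi', buf)
typed = buf.getvalue()
```

Passing `sys.stdout` as the stream with a small positive `sleep` should reproduce the live-typing look in a terminal that supports ANSI colors.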

In [733]:
class DummyChatModel:
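    # We'd have to be a bit more rigid about expected init args if we wanted
    # to subclass from BaseChatModel. For now this is fine.
    
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
        self.verbose = getattr(self, 'verbose', True)
        # Default to no streaming so _generate works without this kwarg.
        self.streaming = getattr(self, 'streaming', False)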
        
    def __call__(self, messages, stop=None):
        return self._generate(messages, stop=stop).generations[0].message
    
    def _generate(self, messages, stop=None):
        res = messages[-1].content.upper()
        if self.streaming:
            tokens = [tok + ' ' for tok in res.split(' ')]
            tokens[-1] = tokens[-1].rstrip(' ')
            for token in tokens:
                self.callback_manager.on_llm_new_token(
                    token,
                    verbose=self.verbose,
                )
        message = AIMessage(content=res)
        return ChatResult(generations=[ChatGeneration(message=message)])
    
    async def _agenerate(self, messages, stop=None):
        warnings.warn(
            f'{type(self).__name__} doesn\'t provide a real _agenerate '
            'method. Calling synchronous generate() instead.'
        )
        return self._generate(messages, stop=stop)

In [692]:
# Old approach. Decorator is not as useful for our use case because we want to
# define the base function once but create several variants of it using
# different fields.
# def add_kwargs(*fields):
#     def decorator(func):
#         # In practice langchain checks for this anyway if we ask for a
#         # completion, but outside of that context we need typecheck
#         # because otherwise we could provide no kwargs and _func wouldn't
#         # complain. Just use generic type because we only care that a value is
#         # provided.
#         @typecheck(**{f: object for f in fields})
#         @wraps(func)
#         def wrapper(**kwargs):
#             return func(**kwargs)

#         sig = signature(wrapper)
#         params_ = {field: Parameter(field, Parameter.KEYWORD_ONLY)
#                    for field in fields}
#         wrapper.__signature__ = sig.replace(parameters=params_.values())
#         return wrapper
#     return decorator


# @add_kwargs('question', 'abc')
# def _reply(**kwargs):
#     return kwargs

In [714]:
def add_kwargs(func, fields, hide_fields=(), strict=False):
    # hide_fields must have default values in the existing function. They
    # will not show up in the new signature and the user will not be able to
    # pass in a value when calling the new function - it will always use the
    # default. To set different defaults, you can pass in a partial rather
    # than a function as the first arg here.
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    if hide_fields and not strict:
        raise ValueError(
            'You must set strict=True when providing one or more '
            'hide_fields. Otherwise the user can still pass in those args.'
        )
    sig = signature(wrapper)
    params_ = {k: v for k, v in sig.parameters.items()}
    
    # Remove any fields we want to hide.
    for field in hide_fields:
        if field not in params_:
            warnings.warn(f'No need to hide field {field} because it\'s not '
                          'in the existing function signature.')
        elif params_.pop(field).default == Parameter.empty:
            raise TypeError(
                f'Field "{field}" is not a valid hide_field because it has '
                'no default value in the original function.'
            )
            
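    # Use a default of None so a function without **kwargs hits the intended
    # TypeError below rather than an AttributeError from getattr.
    kwargs_param = params_.pop('kwargs', None)
    if getattr(kwargs_param, 'kind', None) != Parameter.VAR_KEYWORD:
        raise TypeError(f'Function {func} must accept **kwargs.')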
    new_params = {
        field: Parameter(field, Parameter.KEYWORD_ONLY)
        for field in fields
    }
    overlap = set(new_params) & set(params_)
    if overlap:
        raise RuntimeError(
            f'Some of the kwargs you tried to inject into {func} already '
            'exist in its signature. This is not allowed because it\'s '
            'unclear how to resolve default values and parameter type.'
        )

    params_.update(new_params)
    wrapper.__signature__ = sig.replace(parameters=params_.values())
    if strict:
        # In practice langchain checks for this anyway if we ask for a
        # completion, but outside of that context we need typecheck
        # because otherwise we could provide no kwargs and _func wouldn't
        # complain. Just use generic type because we only care that a value is
        # provided.
        wrapper = typecheck(wrapper, **{f: object for f in fields})
    return wrapper

def _reply(key, bar=-1, **kwargs):
    """Test docstring."""
    return key, bar, kwargs

reply = add_kwargs(_reply, ['statement', 'response'])
strict_reply = add_kwargs(_reply, ['statement', 'response'], strict=True)
with assert_raises(TypeError):
    no_key_reply = add_kwargs(_reply, ['statement', 'response'],
                              hide_fields=['key'], strict=True)
    
    
def reply_(key='aaa', bar=-1, **kwargs):
    """Test docstring."""
    return key, bar, kwargs


no_key_reply = add_kwargs(reply_, ['statement', 'response'],
                          hide_fields=['key'], strict=True)
diff_key_reply = add_kwargs(
    partial(reply_, key='zzz'), ['statement', 'response'],
    hide_fields=['key'], strict=True
)

As expected, got TypeError(Field "key" is not a valid hide_field because it has no default value in the original function.).


In [9]:
reply('a', bar=99, statement=44, response=-1)

('a', 99, {'statement': 44, 'response': -1})

In [10]:
reply('a', bar=99, statement=44)

('a', 99, {'statement': 44})

In [11]:
strict_reply('a', bar=99, statement=44, response=-1)

('a', 99, {'statement': 44, 'response': -1})

In [12]:
with assert_raises(TypeError):
    strict_reply('a', bar=99, statement=44)

As expected, got TypeError(missing a required argument: 'response').


In [13]:
no_key_reply(bar=99, statement=44, response=-1)

('aaa', 99, {'statement': 44, 'response': -1})

In [14]:
with assert_raises(TypeError):
    no_key_reply(bar=99, statement=44, response=-1, key=99)

As expected, got TypeError(got an unexpected keyword argument 'key').


In [15]:
diff_key_reply(bar=99, statement=44, response=-1)

('zzz', 99, {'statement': 44, 'response': -1})

In [16]:
with assert_raises(TypeError):
    diff_key_reply(bar=99, statement=44, response=-1, key='eeee')

As expected, got TypeError(got an unexpected keyword argument 'key').
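
The core trick in `add_kwargs` is overwriting `__signature__` on a wrapper so that introspection tools show the injected fields. A minimal sketch of just that step, using only the standard library (the `inject_kwargs` and `greet` names are hypothetical and not part of roboduck or htools):

```python
from functools import wraps
from inspect import Parameter, signature


def inject_kwargs(func, fields):
    """Return a wrapper whose signature lists `fields` as keyword-only
    parameters in place of the original **kwargs.
    """
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    sig = signature(func)
    # Drop **kwargs, then append one keyword-only parameter per field.
    params = [p for p in sig.parameters.values()
              if p.kind != Parameter.VAR_KEYWORD]
    params += [Parameter(f, Parameter.KEYWORD_ONLY) for f in fields]
    wrapper.__signature__ = sig.replace(parameters=params)
    return wrapper


def greet(name, **kwargs):
    return name, kwargs


greet_v2 = inject_kwargs(greet, ['statement', 'response'])
sig_str = str(signature(greet_v2))

# The new signature is cosmetic: the call still succeeds without `response`.
result = greet_v2('a', statement=1)

# Binding against the signature is one way to enforce it.
try:
    signature(greet_v2).bind('a', statement=1)
    bind_failed = False
except TypeError:
    bind_failed = True
```

Because the rewritten signature is not enforced at call time, some explicit validation step is still needed when missing fields should raise, which appears to be the role `typecheck` plays when `strict=True`.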


In [770]:
def extract_code(text, join_multi=True, multi_prefix_template='\n\n# {i}\n'):
    # multi_prefix_template is only used when join_multi=True and multiple
    # chunks are found.
    chunks = re.findall("(?s)```\n(.*?)\n```", text)
    if not join_multi:
        return chunks
    if len(chunks) > 1:
        chunks = [multi_prefix_template.format(i=i) + chunk 
                  for i, chunk in enumerate(chunks, 1)]
        chunks[0] = chunks[0].lstrip()
    return ''.join(chunks)
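
The pattern in `extract_code` expects a newline immediately after the opening backticks, so a reply whose fence carries a language tag such as `python` would not be matched as intended. A hedged sketch of a more tolerant pattern (the `extract_fenced` name is hypothetical, and the fence string is built with `'`' * 3` only to avoid nesting literal fences in this example):

```python
import re

FENCE = '`' * 3
# Opening fence, optional language tag, newline, lazily captured body,
# newline, closing fence.
PATTERN = re.compile(
    FENCE + r'[ \t]*[\w+-]*[ \t]*\n(.*?)\n' + FENCE,
    re.DOTALL
)


def extract_fenced(text):
    """Return a list of code chunks found inside fenced blocks."""
    return PATTERN.findall(text)


sample = (
    'Explanation.\n\n'
    + FENCE + 'python\na = 3\nb = 4\n' + FENCE
    + '\n\nMore text.\n\n'
    + FENCE + '\nprint(a + b)\n' + FENCE
)
chunks = extract_fenced(sample)
```

Like the original, this sketch does not match an empty fenced block, since it requires at least one newline-delimited body between the fences.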

In [747]:
class Chat:
    
    def __init__(self, system, user, chat_class=ChatOpenAI,
                 history=(), streaming=True, **kwargs):
        # kwargs can include both model_kwargs to pass to chat_class 
        # (things like temperature or top_p that affect completion directly)
        # and other miscellaneous kwargs like `callback_manager` or `verbose`.
        self.kwargs = dict(kwargs)
        self.kwargs.update(streaming=streaming)
        if 'callback_manager' not in self.kwargs:
            self.kwargs['callback_manager'] = CallbackManager(
                [LiveTypingCallbackHandler()]
            )
        self.chat = chat_class(**self.kwargs)
        self.system_message = SystemMessage(content=system)
        if isinstance(user, str):
            user = {'reply': user}
        self.user_templates = {
            k: HumanMessagePromptTemplate.from_template(v)
            for k, v in user.items()
        }
        self.default_user_key = next(iter(self.user_templates))
        self.default_user_fields = (self.user_templates[self.default_user_key]
                                    .input_variables)
        self._history = list(history) or [self.system_message]
        self._create_reply_methods()
        
    def _create_reply_methods(self):
        """Creates two options for user to send replies:
        1. call chat.reply(), using the key_ arg to determine which type of
        user_message is sent. The docstring shows the default user 
        message type's fields but if you set the key accordingly you can pass
        in fields for another message type. We choose not to infer key_
        because some user_message types may accept the same fields.
        2. call methods like chat.question() or chat.statement(), where
        'question' and 'statement' are the names of available user message
        types (i.e. the keys of the `user` dict in the prompt config file).
        You cannot pass in key_ to these methods.
        """
        for k, v in self.user_templates.items():
            # Keep the template key (k) separate from the method name so a
            # renamed method still looks up the right template.
            name = k
            if hasattr(self, name):
                warnings.warn(
                    f'Name collision: prompt defines user message type {k} '
                    f'but Chat class already has a method with that name. '
                    f'Method will be named {k}_ instead.'
                )
                name = k + '_'
            meth = add_kwargs(partial(self._reply, key_=k), 
                              fields=v.input_variables,
                              hide_fields=['key_'],
                              strict=True)
            setattr(self, name, meth)
        setattr(
            self, 
            'reply', 
            add_kwargs(self._reply, self.default_user_fields, strict=False)
        )
        
    @classmethod
    def from_config(cls, name, **kwargs):
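        template = load_template(name)
        # Caller-supplied kwargs override the defaults stored in the template.
        # Using pop with a default also handles templates with no 'kwargs'.
        kwargs = {**template.pop('kwargs', {}), **kwargs}
        return cls(**template, **kwargs)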
        
    def _user_message(self, *, key_='', **kwargs):
        key = key_ or self.default_user_key
        template = self.user_templates[key]
        return template.format(**kwargs)
    
    def _reply(self, *, key_='', **kwargs):
        user_message = self._user_message(key_=key_, **kwargs)
        self._history.append(user_message)
        try:
            response = self.chat(self._history)
        except Exception as e:
            self._history.pop(-1)
            raise e
        self._history.append(response)
        return response
    
    def history(self, sep='\n\n', speaker_prefix=True):
        """Return chat history as a single string."""
        res = []
        for row in self._history:
            reply = row.content
            if speaker_prefix:
                speaker = type(row).__name__.split('Message')[0]
                reply = f'{speaker}: {reply}'
            res.append(reply)
        return sep.join(res)
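
`_create_reply_methods` builds one method per user message type by combining `partial` and `setattr`. A stripped-down sketch of that pattern without langchain (the `MiniChat` class and its templates are hypothetical stand-ins, and `str.format` takes the place of `HumanMessagePromptTemplate`):

```python
from functools import partial


class MiniChat:
    """Toy stand-in that exposes one method per template key."""

    def __init__(self, templates):
        self.templates = dict(templates)
        self.default_key = next(iter(self.templates))
        for key in self.templates:
            # Rename on collision, but keep looking up the original key.
            name = key + '_' if hasattr(self, key) else key
            setattr(self, name, partial(self._reply, key_=key))

    def _reply(self, *, key_='', **kwargs):
        key = key_ or self.default_key
        return self.templates[key].format(**kwargs)

    def reply(self, *, key_='', **kwargs):
        return self._reply(key_=key_, **kwargs)


mini = MiniChat({
    'contextful': 'QUESTION:\n{question}\n\nCODE:\n{code}',
    'contextless': 'QUESTION:\n{question}',
    'reply': 'STATEMENT:\n{statement}',
})
contextless_out = mini.contextless(question='x')
renamed_out = mini.reply_(statement='hi')
default_out = mini.reply(question='q', code='c')
```

Unlike the real class, this sketch does not hide `key_` from the generated methods, so a caller could still override it. Hiding it is what `add_kwargs(..., hide_fields=['key_'], strict=True)` adds.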

In [774]:
chat.user_templates['contextful'].input_variables

['code', 'global_vars', 'local_vars', 'next_line', 'question']

In [775]:
chat.user_templates['contextless'].input_variables

['question']

In [748]:
chat = Chat.from_config('debug', chat_class=DummyChatModel)

In [751]:
tmp = chat._user_message(
    code='a = 3\nb = 4', question='Why?', local_vars='{3: 4}',
    global_vars='{True: False}', next_line='b = 4'
)

print(tmp.content)

This code snippet is not working as expected. Help me debug it. First read my question, then examine the snippet of code that is causing the issue and look at the values of the local and global variables. Your response must have exactly two sections, separated by an empty line. Do not include section titles of any kind. In section 1, use plain English to explain what the problem is and how to fix it (if you couldn't identify the problem, section 1 should instead list a few possible causes or things I could try in order to identify the issue). In section 2, write a corrected version of the input code snippet (if you don't know, section 2 should be empty). Section 2 must contain only python code - there must not be any English explanation outside of code comments or docstrings. Section 2 must be entirely enclosed in one pair of triple backticks ("```"). 

QUESTION:
Why?

CURRENT CODE SNIPPET:
a = 3
b = 4

NEXT LINE:
b = 4

LOCAL VARIABLES:
{3: 4}

GLOBAL VARIABLES:
{True: False}


In [752]:
tmp = chat._user_message(
    key_='contextless',
    question='Why?'
)

print(tmp.content)

QUESTION:
Why?


In [753]:
chat.contextless(question='x')

[32mQ[39m[32mU[39m[32mE[39m[32mS[39m[32mT[39m[32mI[39m[32mO[39m[32mN[39m[32m:[39m[32m
[39m[32mX[39m

AIMessage(content='QUESTION:\nX', additional_kwargs={})

In [754]:
with assert_raises(TypeError):
    chat.contextless(question='x', key_='contextful')

As expected, got TypeError(got an unexpected keyword argument 'key_').


In [756]:
chat.contextful(
    code='a = 3\nb = 4', question='Why?', local_vars='{3: 4}',
    global_vars='{True: False}', next_line='b = 4'
)

In [540]:
chat.reply(
    code='a = 3\nb = 4', question='Why?', local_vars='{3: 4}',
    global_vars='{True: False}', next_line='b = 4'
)

AIMessage(content="THIS CODE SNIPPET IS NOT WORKING AS EXPECTED. HELP ME DEBUG IT. FIRST READ MY QUESTION, THEN EXAMINE THE SNIPPET OF CODE THAT IS CAUSING THE ISSUE AND LOOK AT THE VALUES OF THE LOCAL AND GLOBAL VARIABLES. YOUR RESPONSE MUST HAVE EXACTLY TWO SECTIONS, SEPARATED BY AN EMPTY LINE. DO NOT INCLUDE SECTION TITLES OF ANY KIND. IN SECTION 1, USE PLAIN ENGLISH TO EXPLAIN WHAT THE PROBLEM IS AND HOW TO FIX IT (IF YOU COULDN'T IDENTIFY THE PROBLEM, SECTION 1 SHOULD INSTEAD LIST A FEW POSSIBLE CAUSES OR THINGS I COULD TRY IN ORDER TO IDENTIFY THE ISSUE). IN SECTION 2, WRITE A CORRECTED VERSION OF THE INPUT CODE SNIPPET (IF YOU DON'T KNOW, SECTION 2 SHOULD BE EMPTY). SECTION 2 MUST CONTAIN ONLY PYTHON CODE - THERE MUST NOT BE ANY ENGLISH EXPLANATION OUTSIDE OF CODE COMMENTS OR DOCSTRINGS.\n\nQUESTION:\nWHY?\n\nCURRENT CODE SNIPPET:\nA = 3\nB = 4\n\nNEXT LINE:\nB = 4\n\nLOCAL VARIABLES:\n{3: 4}\n\nGLOBAL VARIABLES:\n{TRUE: FALSE}", additional_kwargs={})

In [757]:
chat.reply(question='How are you?', key_='contextless')

[32mQ[39m[32mU[39m[32mE[39m[32mS[39m[32mT[39m[32mI[39m[32mO[39m[32mN[39m[32m:[39m[32m
[39m[32mH[39m[32mO[39m[32mW[39m[32m [39m[32mA[39m[32mR[39m[32mE[39m[32m [39m[32mY[39m[32mO[39m[32mU[39m[32m?[39m

AIMessage(content='QUESTION:\nHOW ARE YOU?', additional_kwargs={})

In [542]:
print(chat.history())

System: You are an incredibly effective AI programming assistant. You have in-depth knowledge across a broad range of sub-fields within computer science, software development, and data science, and your goal is to help Python programmers resolve their most challenging bugs. Be concise and use simple language.

Human: QUESTION:
x

AI: QUESTION:
X

Human: This code snippet is not working as expected. Help me debug it. First read my question, then examine the snippet of code that is causing the issue and look at the values of the local and global variables. Your response must have exactly two sections, separated by an empty line. Do not include section titles of any kind. In section 1, use plain English to explain what the problem is and how to fix it (if you couldn't identify the problem, section 1 should instead list a few possible causes or things I could try in order to identify the issue). In section 2, write a corrected version of the input code snippet (if you don't know, section 2

In [543]:
# This would make more sense if our messages included speakers, i.e. if a
# user_message looked like 'Me: {reply}' and gpt was prompted to reply like
# 'Robert Sapolsky: {reply}' (for example).
print(chat.history(speaker_prefix=False))

You are an incredibly effective AI programming assistant. You have in-depth knowledge across a broad range of sub-fields within computer science, software development, and data science, and your goal is to help Python programmers resolve their most challenging bugs. Be concise and use simple language.

QUESTION:
x

QUESTION:
X

This code snippet is not working as expected. Help me debug it. First read my question, then examine the snippet of code that is causing the issue and look at the values of the local and global variables. Your response must have exactly two sections, separated by an empty line. Do not include section titles of any kind. In section 1, use plain English to explain what the problem is and how to fix it (if you couldn't identify the problem, section 1 should instead list a few possible causes or things I could try in order to identify the issue). In section 2, write a corrected version of the input code snippet (if you don't know, section 2 should be empty). Section

Now try with a real OpenAI object.

In [758]:
chat = Chat.from_config('debug')
chat.chat

ChatOpenAI(verbose=False, callback_manager=<langchain.callbacks.base.CallbackManager object at 0x7fdb293cca30>, client=<class 'openai.api_resources.chat_completion.ChatCompletion'>, model_name='gpt-3.5-turbo', model_kwargs={'temperature': 0.0, 'top_p': 1.0, 'frequency_penalty': 0.0, 'presence_penalty': 0.0, 'logit_bias': {}, 'stop': ['QUESTION', 'SOLUTION PART 3']}, openai_api_key=None, request_timeout=60, max_retries=6, streaming=True, n=1, max_tokens=512)

In [760]:
chat.kwargs

{'model_name': 'gpt-3.5-turbo',
 'max_tokens': 512,
 'temperature': 0.0,
 'top_p': 1.0,
 'frequency_penalty': 0.0,
 'presence_penalty': 0.0,
 'logit_bias': {},
 'stop': ['QUESTION', 'SOLUTION PART 3'],
 'streaming': True,
 'callback_manager': <langchain.callbacks.base.CallbackManager at 0x7fdb293cca30>}

In [763]:
res = chat.reply(
    code='a = 3\nb = ([0, 1], [2, 3])\nb[1].append(a)',
    question='I thought tuples were immutable. Why doesn\'t appending a throw an error?',       
    local_vars='{"a": 3, "b": ([0, 1], [2, 3])}',
    global_vars='{"x": True}', next_line='b[1].append(a)'
)

[32mA[39m[32mp[39m[32mp[39m[32me[39m[32mn[39m[32md[39m[32mi[39m[32mn[39m[32mg[39m[32m [39m[32mt[39m[32mo[39m[32m [39m[32ma[39m[32m [39m[32mt[39m[32mu[39m[32mp[39m[32ml[39m[32me[39m[32m [39m[32mi[39m[32ms[39m[32m [39m[32mn[39m[32mo[39m[32mt[39m[32m [39m[32ma[39m[32ml[39m[32ml[39m[32mo[39m[32mw[39m[32me[39m[32md[39m[32m [39m[32mb[39m[32me[39m[32mc[39m[32ma[39m[32mu[39m[32ms[39m[32me[39m[32m [39m[32mt[39m[32mu[39m[32mp[39m[32ml[39m[32me[39m[32ms[39m[32m [39m[32ma[39m[32mr[39m[32me[39m[32m [39m[32mi[39m[32mm[39m[32mm[39m[32mu[39m[32mt[39m[32ma[39m[32mb[39m[32ml[39m[32me[39m[32m.[39m[32m [39m[32mH[39m[32mo[39m[32mw[39m[32me[39m[32mv[39m[32me[39m[32mr[39m[32m,[39m[32m [39m[32mi[39m[32mn[39m[32m [39m[32mt[39m[32mh[39m[32mi[39m[32ms[39m[32m [39m[32mc[39m[32mo[39m[32md[39m[32me[39m[32m [39m[32ms[39m[32mn[39m[32mi[39

In [764]:
print(extract_code(res.content))

# Corrected code snippet
a = 3
b = ([0, 1], [2, 3])
b = (b[0], b[1] + [a])


In [769]:
extract_code(res.content)

'# Corrected code snippet\na = 3\nb = ([0, 1], [2, 3])\nb = (b[0], b[1] + [a])'

Testing overriding stop kwargs to make sure it's working.

In [215]:
# manager = CallbackManager([StreamingStdOutCallbackHandler()])
manager = CallbackManager([LiveTypingCallbackHandler()])
# manager.on_llm_new_token('caterpillar', verbose=True)

In [221]:
chat = Chat.from_config('debug',
                        streaming=True,
                        callback_manager=manager)
chat.chat

ChatOpenAI(verbose=True, callback_manager=<langchain.callbacks.base.CallbackManager object at 0x7fdb28ae8280>, client=<class 'openai.api_resources.chat_completion.ChatCompletion'>, model_name='gpt-3.5-turbo', model_kwargs={'temperature': 0.0, 'top_p': 0.99, 'frequency_penalty': 0.2, 'presence_penalty': 0.0, 'logit_bias': {37811: -100, 27901: -50}, 'stop': ['QUESTION', 'SOLUTION PART 3']}, openai_api_key=None, request_timeout=60, max_retries=6, streaming=True, n=1, max_tokens=512)

In [222]:
res = chat.reply(code='foo *= 3', global_vars='{}', local_vars='{}', 
                 next_line='?', question='Why does this throw a name error?')

[32mS[0m[32mO[0m[32mL[0m[32mU[0m[32mT[0m[32mI[0m[32mO[0m[32mN[0m[32m [0m[32mP[0m[32mA[0m[32mR[0m[32mT[0m[32m [0m[32m1[0m[32m:[0m[32m
[0m[32mT[0m[32mh[0m[32me[0m[32m [0m[32mc[0m[32mo[0m[32md[0m[32me[0m[32m [0m[32mi[0m[32ms[0m[32m [0m[32mt[0m[32mr[0m[32my[0m[32mi[0m[32mn[0m[32mg[0m[32m [0m[32mt[0m[32mo[0m[32m [0m[32mm[0m[32mu[0m[32ml[0m[32mt[0m[32mi[0m[32mp[0m[32ml[0m[32my[0m[32m [0m[32mt[0m[32mh[0m[32me[0m[32m [0m[32mv[0m[32ma[0m[32ml[0m[32mu[0m[32me[0m[32m [0m[32mo[0m[32mf[0m[32m [0m[32ma[0m[32m [0m[32mv[0m[32ma[0m[32mr[0m[32mi[0m[32ma[0m[32mb[0m[32ml[0m[32me[0m[32m [0m[32mn[0m[32ma[0m[32mm[0m[32me[0m[32md[0m[32m [0m[32m"[0m[32mf[0m[32mo[0m[32mo[0m[32m"[0m[32m [0m[32mb[0m[32my[0m[32m [0m[32m3[0m[32m,[0m[32m [0m[32mb[0m[32mu[0m[32mt[0m[32m [0m[32mt[0m[32mh[0m[32me[0m[32m [0m[32mv[0m[32ma[0m

In [205]:
chat = Chat.from_config('debug',
                        {'max_tokens': 10, 'stop': ['SOLUTION PART 2']},
                        streaming=True,
                        callback_manager=manager)
chat.chat

ChatOpenAI(verbose=True, callback_manager=<langchain.callbacks.base.CallbackManager object at 0x7fdb286b7c40>, client=<class 'openai.api_resources.chat_completion.ChatCompletion'>, model_name='gpt-3.5-turbo', model_kwargs={'temperature': 0.0, 'top_p': 0.99, 'frequency_penalty': 0.2, 'presence_penalty': 0.0, 'logit_bias': {37811: -100, 27901: -50}, 'stop': ['SOLUTION PART 2']}, openai_api_key=None, request_timeout=60, max_retries=6, streaming=True, n=1, max_tokens=10)

In [164]:
chat.model_kwargs

{'model_name': 'gpt-3.5-turbo',
 'temperature': 0.0,
 'top_p': 0.99,
 'max_tokens': 10,
 'frequency_penalty': 0.2,
 'presence_penalty': 0.0,
 'logit_bias': {37811: -100, 27901: -50},
 'stop': ['SOLUTION PART 2']}

In [165]:
chat.chat.streaming

True

In [166]:
chat.chat.callback_manager.handlers

[<langchain.callbacks.streaming_stdout.StreamingStdOutCallbackHandler at 0x7fdb29132910>]

In [167]:
res = chat.reply(code='foo *= 3', global_vars='{}', local_vars='{}', 
                 next_line='?', question='Why does this throw a name error?')

SOLUTION PART 1:
The code is trying

In [168]:
res

AIMessage(content='SOLUTION PART 1:\nThe code is trying', additional_kwargs={})

In [170]:
print(chat.history())

System: You are an incredibly effective AI programming assistant. You have in-depth knowledge across a broad range of sub-fields within computer science, software development, and data science, and your goal is to help Python programmers resolve their most challenging bugs.

Human: This code snippet is not working as expected. Help me debug it. First read my question, then examine the snippet of code that is causing the issue and look at the values of the local and global variables. Your response must have exactly two parts. In the section titled SOLUTION PART 1, use plain English to explain what the problem is and how to fix it (if you don't know what the problem is, SOLUTION PART 1 should instead list a few possible causes or things I could try in order to identify the issue). In the section titled SOLUTION PART 2, write a corrected version of the input code snippet (if you don't know, SOLUTION PART 2 should say None). SOLUTION PART 2 must contain only python code - there must not be

In [111]:
handler = chat.chat.callback_manager.handlers[0]

In [112]:
handler.ignore_llm

False

In [113]:
handler.on_llm_new_token??

In [57]:
ChatOpenAI(stop=['a'])._default_params

{'model': 'gpt-3.5-turbo',
 'request_timeout': 60,
 'max_tokens': None,
 'stream': False,
 'n': 1,
 'stop': ['a']}

## Custom Chain for multi part response

Trying to see if we can/should use chains for streaming multi-part responses. (When not streaming, we could often just ask for a json/yaml response, though this often seems to mess up code formatting, at least with chatgpt (turbo model).) Two key considerations convinced me this might be necessary vs. just regex/str parsing the results:

- This assumes we don't want to print certain sep chars/section titles, but ONLY use them for parsing. When streaming, we can't just parse after the fact because we'd have already printed those bits by then.
- We also want to force the model to include every section in its response. This is less of a concern with the turbo/4 models, but previously it was common to see gpt leave out some sections. If we just request a static template output, that risk is higher. If we omit section titles in favor of some rare sep char to remove (e.g. "###"), it would be harder to validate which section(s), if any, are missing.
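For comparison, here is a rough sketch of the alternative to chains: filtering a token stream on the fly so section titles are used for parsing but never printed. It is only an illustration. `stream_sections` and the token list are hypothetical names made up for this example (not part of roboduck or langchain), and it assumes the titles arrive verbatim in the stream, possibly split across tokens.

```python
from typing import Iterable, Iterator, List, Tuple


def _safe_len(buffer: str, titles: List[str]) -> int:
    """Count leading chars of `buffer` that cannot be the start of a title."""
    for i in range(len(buffer)):
        tail = buffer[i:]
        if any(t.startswith(tail) or tail.startswith(t) for t in titles):
            return i
    return len(buffer)


def stream_sections(tokens: Iterable[str],
                    titles: List[str]) -> Iterator[Tuple[str, str]]:
    """Yield (current_title, text) chunks without ever yielding a title.

    Text is held back only while it could still turn out to be a title.
    """
    buffer = ''
    current = ''
    for token in tokens:
        buffer += token
        while buffer:
            # A complete title at the front of the buffer switches sections.
            hit = next((t for t in titles if buffer.startswith(t)), None)
            if hit is not None:
                current, buffer = hit, buffer[len(hit):]
                continue
            n = _safe_len(buffer, titles)
            if n == 0:
                # Buffer is an unfinished title: wait for more tokens.
                break
            yield current, buffer[:n]
            buffer = buffer[n:]
    if buffer:
        # Stream ended mid-title, so treat the leftover as ordinary text.
        yield current, buffer


# Hypothetical stream in which titles are split across token boundaries.
titles = ['SOLUTION PART 1:\n', 'SOLUTION PART 2:\n']
tokens = ['SOLU', 'TION PART 1:\nThe ', 'bug is X.\nSOL',
          'UTION PART 2:\nfoo = 3']
seen = set()
for title, text in stream_sections(tokens, titles):
    seen.add(title)
    print(text, end='')
print()
print('missing sections:', [t for t in titles if t not in seen])
```

Because each chunk is tagged with its title, this approach can still report missing sections after the fact. It cannot force the model to produce them, which is the part the chain approach is meant to address.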

In [254]:
from langchain.chains import LLMChain
from langchain.chains.base import Chain
from langchain.llms import OpenAI
from langchain.schema import LLMResult, AgentAction, AgentFinish
from typing import Dict, List, Any, Union

In [524]:
class MultiPartResponseChain(Chain):
    chains: List[Chain]
        
    @property
    def input_keys(self) -> List[str]:
        return self.chains[0].input_keys
    
    @property
    def output_keys(self) -> List[str]:
        return ['text']
    
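    def _call(self, inputs: Dict[str, str]) -> Dict[str, str]:
        # The first chain consumes the user's inputs. Every later chain
        # receives the full transcript so far as `partial_response`.
        parts = []
        prompt_str = self.chains[0].prompt.format(**inputs)
        res = self.chains[0](inputs)['text']
        prompt_str = prompt_str + res
        parts.append(res)
        # Debug output: transcript so far, then the per-part replies.
        print(prompt_str)
        print(parts)
        print('-' * 79)
        for chain in self.chains[1:]:
            res = chain.run(partial_response=prompt_str)
            parts.append(res)
            # New transcript = old transcript + this chain's question + reply.
            prompt_str = chain.prompt.format(partial_response=prompt_str) + res
            print(prompt_str)
            print(parts)
            print('-' * 79)
        return {'text': parts}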

In [530]:
# tmp_llm = OpenAI(model_name='text-curie-001', max_tokens=20)
tmp_chat = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0, 
                      max_tokens=20)

In [526]:
chain0 = LLMChain(
    prompt=PromptTemplate(input_variables=['animal'],
                          template='I want to know some facts about {animal}s. Answer each question on a new line with only a few words. What sound do they make?'), 
    llm=tmp_chat
)

chains = [chain0] + [
    LLMChain(
        prompt=PromptTemplate(input_variables=['partial_response'],
                              template='{partial_response}\n\nWhat color are they typically?'), 
        llm=tmp_chat
    ),
    LLMChain(
        prompt=PromptTemplate(input_variables=['partial_response'],
                              template='{partial_response}\n\nHow many years do they live on average?'), 
        llm=tmp_chat
    ),
    LLMChain(
        prompt=PromptTemplate(input_variables=['partial_response'],
                              template='{partial_response}\n\nWhat is their favorite food?'), 
        llm=tmp_chat
    )
]


# BELOW: old non-chat chains.
# chain0 = LLMChain(
#     prompt=PromptTemplate(input_variables=['animal'],
#                           template='I want to know some facts about {animal}s. Answer each question on a new line with only a few words. What sound do they make?'), 
#     llm=tmp_llm
# )

# chains = [chain0] + [
#     LLMChain(
#         prompt=PromptTemplate(input_variables=['partial_response'],
#                               template='{partial_response}\n\nWhat color are they typically?'), 
#         llm=tmp_llm
#     ),
#     LLMChain(
#         prompt=PromptTemplate(input_variables=['partial_response'],
#                               template='{partial_response}\n\nHow many years do they live on average?'), 
#         llm=tmp_llm
#     ),
#     LLMChain(
#         prompt=PromptTemplate(input_variables=['partial_response'],
#                               template='{partial_response}\n\nWhat is their favorite food?'), 
#         llm=tmp_llm
#     )
# ]

In [527]:
mchain = MultiPartResponseChain(chains=chains)

In [528]:
res = mchain.run(animal='duck')

I want to know some facts about ducks. Answer each question on a new line with only a few words. What sound do they make?

Ducks make a quack.
['\n\nDucks make a quack.']
-------------------------------------------------------------------------------
I want to know some facts about ducks. Answer each question on a new line with only a few words. What sound do they make?

Ducks make a quack.

What color are they typically?

Most ducks are brown, but some are white.
['\n\nDucks make a quack.', '\n\nMost ducks are brown, but some are white.']
-------------------------------------------------------------------------------
I want to know some facts about ducks. Answer each question on a new line with only a few words. What sound do they make?

Ducks make a quack.

What color are they typically?

Most ducks are brown, but some are white.

How many years do they live on average?

Ducks typically live about six years.
['\n\nDucks make a quack.', '\n\nMost ducks are brown, but some are white.',

In [529]:
res

['\n\nDucks make a quack.',
 '\n\nMost ducks are brown, but some are white.',
 '\n\nDucks typically live about six years.',
 '\n\nDucks love to eat bugs, worms, and other small animals.']
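The accumulation logic in `MultiPartResponseChain._call` can be sketched without langchain, which makes it easier to test offline. This is a simplified stand-in, not the class above: `run_parts` and `fake_llm` are hypothetical names, and the "model" is a plain callable that maps a prompt string to a reply string.

```python
from typing import Callable, List


def run_parts(first_prompt: str, follow_ups: List[str],
              llm: Callable[[str], str]) -> List[str]:
    """Ask one question at a time, feeding the growing transcript back in."""
    transcript = first_prompt
    parts = [llm(transcript)]
    transcript += parts[-1]
    for question in follow_ups:
        # Same shape as the '{partial_response}\n\n<question>' templates.
        transcript += f'\n\n{question}'
        reply = llm(transcript)
        parts.append(reply)
        transcript += reply
    return parts


def fake_llm(prompt: str) -> str:
    """Stand-in model: numbers its answer by how many questions it has seen."""
    return f'\n\nanswer {prompt.count("?")}'


print(run_parts('What sound do they make?',
                ['What color are they typically?',
                 'What is their favorite food?'],
                fake_llm))
```

Each call sees every earlier question and answer, so the cost of a run grows with the number of parts. That is the price paid for guaranteeing that every section gets its own response.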

## Signature surgery

Initial prototype for inserting fields into signature and docstring.

In [107]:
class Foo:
    def __init__(self, fields, name2fields):
        self.reply = self._make_func(self._reply, fields)
        for k, v in name2fields.items():
            setattr(self, k, self._make_func(self._reply, v))

    def _reply(self, **kwargs):
        print('Calling _reply')
        return {'kwargs': kwargs, 'completion': 'new text...'}
    
    def _make_func(self, func, fields):
        # In practice I think langchain checks for this anyway if we ask for a
        # completion, but outside of that context typecheck would be necessary
        # because otherwise we can provide no kwargs and _func won't complain. 
        @typecheck(**{f: str for f in fields})
        @wraps(func)
        def wrapper(**kwargs):
            return func(**kwargs)
        
        sig = signature(wrapper)
        params_ = {field: Parameter(field, Parameter.KEYWORD_ONLY)
                   for field in fields}
        wrapper.__signature__ = sig.replace(parameters=params_.values())
        return wrapper

In [100]:
f = Foo(
    ['a', 'dog', 'x'],
    {'question': ['fact', 'question', 'answer'],
     'statement': ['salutation', 'name']}
)

In [101]:
f.reply

<function __main__.Foo._reply(*, a, dog, x)>

In [102]:
f.question

<function __main__.Foo._reply(*, fact, question, answer)>

In [103]:
f.statement

<function __main__.Foo._reply(*, salutation, name)>

In [104]:
with assert_raises(TypeError):
    f.statement()

As expected, got TypeError(missing a required argument: 'salutation').


In [105]:
f.statement(salutation='hi', name='harry')

Calling _reply


{'kwargs': {'salutation': 'hi', 'name': 'harry'}, 'completion': 'new text...'}

In [106]:
f.question(fact='birds are sad', question='why?', answer='yes')

Calling _reply


{'kwargs': {'fact': 'birds are sad', 'question': 'why?', 'answer': 'yes'},
 'completion': 'new text...'}

## Debug scratch

See if we can use frames to identify whether we need to provide context for a user message (i.e. whether the frame has changed since we last provided it).
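One way the check could work, as a minimal sketch: remember an identifier for the last frame we sent context for and compare on each new message. `FrameTracker` is a hypothetical name, not part of roboduck. The sketch assumes CPython (`sys._getframe`) and that `id(frame)` is a good enough key; ids can be reused once a frame is freed, so the code object is included in the key to reduce false matches.

```python
import sys


class FrameTracker:
    """Remember which frame we last provided context for."""

    def __init__(self):
        self._last_key = None

    def needs_context(self, frame) -> bool:
        """Return True if `frame` differs from the last frame we saw."""
        # Store an id plus the code object, not the frame itself, so the
        # tracker does not keep the frame (and its locals) alive.
        key = (id(frame), frame.f_code)
        changed = key != self._last_key
        self._last_key = key
        return changed


def inner(tracker):
    return tracker.needs_context(sys._getframe())


def outer(tracker):
    frame = sys._getframe()
    first = tracker.needs_context(frame)    # new frame: send context
    second = tracker.needs_context(frame)   # same frame: skip context
    third = inner(tracker)                  # different frame: send again
    return first, second, third


print(outer(FrameTracker()))
```

A frame-level check like this says nothing about whether the local variables changed while execution stayed in the same frame, so on its own it may not be enough to decide when context is stale.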

In [4]:
from roboduck.debugger import duck

In [70]:
def binary_search(x, nums):
    if not nums:
        return -1
    duck(backend='repeat')
    mid = len(nums) // 2
    if x == nums[mid]:
        return x
    if x > nums[mid]:
        return binary_search(x, nums[mid + 1:])
    if x < nums[mid]:
        return binary_search(x, nums[:mid])

In [71]:
nums = [33, 44, 55, 66, 77, 88, 99, 111]

In [72]:
binary_search(3, nums)

> <ipython-input-70-8a37149461d4>(5)binary_search()
-> mid = len(nums) // 2
>>> l .
  1  	def binary_search(x, nums):
  2  	    if not nums:
  3  	        return -1
  4  	    duck(backend='repeat')
  5  ->	    mid = len(nums) // 2
  6  	    if x == nums[mid]:
  7  	        return x
  8  	    if x > nums[mid]:
  9  	        return binary_search(x, nums[mid + 1:])
 10  	    if x < nums[mid]:
 11  	        return binary_search(x, nums[:mid])
>>> y?
next line:     mid = len(nums) // 2
[Duck] """>>> n
> <ipython-input-70-8a37149461d4>(6)binary_search()
-> if x == nums[mid]:
>>> n
> <ipython-input-70-8a37149461d4>(8)binary_search()
-> if x > nums[mid]:
>>> n
> <ipython-input-70-8a37149461d4>(10)binary_search()
-> if x < nums[mid]:
>>> n
> <ipython-input-70-8a37149461d4>(11)binary_search()
-> return binary_search(x, nums[:mid])
>>> l .
  6  	    if x == nums[mid]:
  7  	        return x
  8  	    if x > nums[mid]:
  9  	        return binary_search(x, nums[

BdbQuit: 

In [13]:
binary_search(33, nums)

33

In [14]:
binary_search(39, nums)

-1

In [15]:
binary_search(111, nums)

111

In [16]:
binary_search(112, nums)

-1

In [73]:
def test():
    for i in range(5):
        print(i)
        duck(backend='repeat')

In [74]:
test()

0
> <ipython-input-73-943befe6b744>(2)test()
-> for i in range(5):
>>> i
0
>>> l .
  1  	def test():
  2  ->	    for i in range(5):
  3  	        print(i)
  4  	        duck(backend='repeat')
[EOF]
>>> y?
frmae_id 140192236901280
next line:     for i in range(5):
[Duck] """>>> n
> <ipython-input-73-943befe6b744>(3)test()
-> print(i)
>>> i
1
>>> y?
frmae_id 140192236901280
next line:         print(i)
[Duck] """>>> n
1
> <ipython-input-73-943befe6b744>(4)test()
-> duck(backend='repeat')
>>> i
1
>>> n
> <ipython-input-73-943befe6b744>(2)test()
-> for i in range(5):
>>> i
1
>>> l .
  1  	def test():
  2  ->	    for i in range(5):
  3  	        print(i)
  4  	        duck(backend='repeat')
[EOF]
>>> i
1
>>> y?
frmae_id 140192236901280
next line:     for i in range(5):
[Duck] """>>> n
> <ipython-input-73-943befe6b744>(3)test()
-> print(i)
>>> l .
  1  	def test():
  2  	    for i in ra

BdbQuit: 

## Test docstring/signature

In [16]:
from roboduck.debugger import DuckDB

In [9]:
foo(prompt_name='debug_full', chat_class='abc')

{'prompt_name': 'debug_full', 'chat_class': 'abc'}


In [12]:
@add_docstring(DuckDB.__init__)
def foo(**kwargs):
    print(kwargs)

In [None]:
foo()

## Scratch

In [27]:
from roboduck.langchain.chat import Chat

In [28]:
chat = Chat.from_config('debug')

In [29]:
msg = chat.user_message(
    code='foo.sense()',
    global_vars='{"a": 6}', local_vars='{True: False}', 
    next_line='return None', question='Why?'
)

In [30]:
print(msg.content)

I'm debugging some code that is not working as expected and I need your help. First read my question, then examine the problematic code snippet and the current program state. Your response must have exactly two sections (1. natural language explanation, and 2. code) separated by an empty line. It should appear to the user as a single section, however - do NOT include section titles of any kind. In section 1, use plain English to answer my question. If you don't know the answer, section 1 should instead list a few possible explanations or actions I could take in order to identify the issue. If it would contribute to a more helpful answer, use section 2 to provide a corrected version of the input code snippet (leave section 2 empty otherwise). If section 2 is not empty, it must must be entirely enclosed in one pair of triple backticks ("```") and contain only python code - it cannot include any English explanation outside of code comments or docstrings.

QUESTION:
Why?

CODE SNIPPET:
foo