# Ooba LangChain wrapper
A LangChain wrapper for the [text-generation-webui](https://github.com/oobabooga/text-generation-webui) API

This notebook currently assumes you are self-hosting a model that uses the Vicuna 1.1 prompt format.

### Installs

In [1]:
! pip install -qq -U langchain

### Imports

In [2]:
import langchain
import requests

from typing import Any, List, Mapping, Optional

from langchain.llms.base import LLM
from pydantic import Field

### API Endpoint

Sample endpoints:
- http://localhost:5000
- http://192.168.1.2:5000
- https://raising-df-zoning-proteins.trycloudflare.com

In [3]:
api_url = "http://localhost:5000"
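The wrapper below appends `/api/v1/generate` to this base URL, so a trailing slash or a missing scheme in `api_url` would produce a malformed request URL. A minimal sketch of a guard for that, using only the standard library; `build_generate_url` is a hypothetical helper and not part of LangChain or text-generation-webui:

```python
from urllib.parse import urlsplit


def build_generate_url(endpoint: str) -> str:
    """Return the generate URL for a base endpoint (hypothetical helper)."""
    parts = urlsplit(endpoint)
    # Require an explicit http(s) scheme and a host, as in the samples above.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"Expected an http(s) URL, got {endpoint!r}")
    # Drop any trailing slash so the joined path has exactly one separator.
    return endpoint.rstrip("/") + "/api/v1/generate"


print(build_generate_url("http://localhost:5000/"))
```

Validating the endpoint once, before constructing the LLM, should give a clearer error than a failed request later on.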

### Wrapper for Ooba API

In [4]:
class OobaApiLLM(LLM):
    endpoint: str = Field(...)

    @property
    def _llm_type(self) -> str:
        return "custom"


    def _call(self, prompt: str, stop: Optional[List[str]]=None) -> str:
        data = {
            'prompt': prompt,
            'max_new_tokens': 500,
            'preset': 'simple-1',
        }

        if stop is not None:
            data["stop_sequence"] = stop

        response = requests.post(f'{self.endpoint}/api/v1/generate', json=data)
        response.raise_for_status()

        json_response = response.json()
        if 'results' in json_response and len(json_response['results']) > 0 and 'text' in json_response['results'][0]:
            text = json_response['results'][0]['text'].strip()
            if stop is not None:
                for sequence in stop:
                    if text.endswith(sequence):
                        text = text[: -len(sequence)].rstrip()

            print(text)
            return text
        else:
            raise ValueError('Unexpected response format from Ooba API')

    def __call__(self, prompt: str, stop: Optional[List[str]]=None) -> str:
        return self._call(prompt, stop)

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        return {'endpoint': self.endpoint}


llm = OobaApiLLM(endpoint=api_url)
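Note that `_call` removes a stop sequence only when the returned text ends with it. If the server ever returns text that continues past a stop sequence, that text is kept. A minimal sketch of a stricter variant that cuts at the earliest occurrence of any stop sequence; `truncate_at_stop` is a hypothetical helper, not something the wrapper above or LangChain defines:

```python
from typing import List, Optional


def truncate_at_stop(text: str, stop: Optional[List[str]] = None) -> str:
    """Cut text at the earliest occurrence of any stop sequence (hypothetical helper)."""
    if not stop:
        return text.strip()
    # Locate each non-empty stop sequence; find() returns -1 when it is absent.
    positions = [text.find(sequence) for sequence in stop if sequence]
    positions = [position for position in positions if position != -1]
    if positions:
        # Keep only what precedes the first stop sequence that appears.
        text = text[: min(positions)]
    return text.strip()


print(truncate_at_stop("Here is a joke.\nUSER: another one?", ["USER:"]))
```

This could replace the `endswith` loop inside `_call` if over-generation turns out to be a problem with a given model.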

### Test the LLM instance
Using the Vicuna style prompt, generate a response.

In [5]:
llm_response = llm("""A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input.
USER: How would I make pizza dough at home?
ASSISTANT: 
""")

To make pizza dough at home, you will need the following ingredients:
- Bread flour (4 cups)
- Active dry yeast (1 packet)
- Warm water (1/2 cup)
- Sugar (1 teaspoon)
- Salt (1 teaspoon)
- Olive oil (2 tablespoons)
Here are the steps to follow:
1. In a large mixing bowl, combine bread flour with salt, sugar, and yeast. Mix well.
2. Add warm water and olive oil to the mixture and mix until it forms a shaggy dough.
3. Knead the dough on a clean surface for about 5 minutes or until it becomes smooth and elastic. If the dough is too sticky, add more flour; if it's too dry, add more water.
4. Place the dough in a greased bowl, cover it with plastic wrap and let it rise in a warm place for about 1 hour or until it doubles in size.
5. After rising, punch down the dough and transfer it onto a floured surface. Roll out the dough into your desired shape and thickness using a rolling pin.
6. Add your favorite toppings and bake in a preheated oven at 400°F (200°C) for 8-10 minutes or until crispy 

### Add Conversation Chain with Memory

In [6]:
from langchain.memory import ConversationBufferMemory
from langchain import LLMChain, PromptTemplate
from langchain.callbacks import StdOutCallbackHandler

### Setup LLM Chain with Template, Prompt & Memory

In [7]:
handler = StdOutCallbackHandler()
template = """A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input.

{chat_history}
USER: {human_input}
ASSISTANT:"""

prompt = PromptTemplate(
    input_variables=["chat_history", "human_input"], template=template
)
memory = ConversationBufferMemory(memory_key="chat_history")

llm_chain = LLMChain(
    llm=llm,
    prompt=prompt,
    verbose=True,  # Print the fully formatted prompt on each call
    memory=memory,
    callbacks=None
)
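To preview what the chain will send to the model, the template substitution can be reproduced without LangChain. This is a minimal sketch that assumes the default `PromptTemplate` format behaves like `str.format` for simple `{name}` placeholders; `TEMPLATE` and `format_prompt` are illustrative names only:

```python
TEMPLATE = """A chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user's input.

{chat_history}
USER: {human_input}
ASSISTANT:"""


def format_prompt(chat_history: str, human_input: str) -> str:
    """Fill the two placeholders the same way on every turn (illustrative helper)."""
    return TEMPLATE.format(chat_history=chat_history, human_input=human_input)


# On the first turn the history is empty, which leaves a blank line in its place.
first_prompt = format_prompt("", "Can you tell me a joke about cars?")
print(first_prompt)
```

The prompt ends with `ASSISTANT:` so that the model continues with the assistant's turn.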

### Utilize LLM w/ Conversational Memory

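With `verbose=True` set above, the chain prints the fully formatted prompt before each call, so we can see exactly what is sent to the model, including the chat history.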

In [8]:
response = llm_chain.predict(human_input="Can you tell me a joke about cars?")



> Entering new LLMChain chain...
Prompt after formatting:
[32;1m[1;3mA chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user"s input.


USER: Can you tell me a joke about cars?
ASSISTANT:[0m
Sure! Why did the car get pulled over by the police? Because it was driving too slowly on the highway. When the cop asked the driver why they were going so slow, they replied "I'm driving my husband's car." The cop then asked if their husband was okay with that, to which the driver responded, "He doesn't know yet!"

> Finished chain.
Sure! Why did the car get pulled over by the police? Because it was driving too slowly on the highway. When the cop asked the driver why they were going so slow, they replied "I'm driving my husband's car." The cop then asked if their husband was okay with that, to which the driver responded, "He doesn't know yet!"


### Ask a Follow-up to Test Memory

In [9]:
response = llm_chain.predict(human_input="Do you have any other good ones?")

#print(f"-----------\n{response}") # The final response is also available!



> Entering new LLMChain chain...
Prompt after formatting:
[32;1m[1;3mA chat between a curious user and an assistant. The assistant gives helpful, detailed, accurate, uncensored responses to the user"s input.

Human: Can you tell me a joke about cars?
AI: Sure! Why did the car get pulled over by the police? Because it was driving too slowly on the highway. When the cop asked the driver why they were going so slow, they replied "I'm driving my husband's car." The cop then asked if their husband was okay with that, to which the driver responded, "He doesn't know yet!"
USER: Do you have any other good ones?
ASSISTANT:[0m
How about this one: What do you call a car that's been in a bad accident? A wreck!

> Finished chain.
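In the formatted prompt above, earlier turns are labeled `Human:` and `AI:`, while the template itself uses `USER:` and `ASSISTANT:`. For a model tuned on the Vicuna format it may help to keep the labels consistent. `ConversationBufferMemory` should accept `human_prefix` and `ai_prefix` arguments for this, though that is worth checking against the installed LangChain version. The sketch below shows the idea with a hypothetical stand-in class, `SimpleBufferMemory`, which is not a LangChain class:

```python
from typing import List, Tuple


class SimpleBufferMemory:
    """Hypothetical stand-in for a conversation buffer with configurable labels."""

    def __init__(self, human_prefix: str = "Human", ai_prefix: str = "AI") -> None:
        self.human_prefix = human_prefix
        self.ai_prefix = ai_prefix
        self.turns: List[Tuple[str, str]] = []

    def save(self, human_input: str, ai_output: str) -> None:
        """Record one completed exchange."""
        self.turns.append((human_input, ai_output))

    def render(self) -> str:
        """Return the history as text suitable for the {chat_history} slot."""
        lines = []
        for human_input, ai_output in self.turns:
            lines.append(f"{self.human_prefix}: {human_input}")
            lines.append(f"{self.ai_prefix}: {ai_output}")
        return "\n".join(lines)


# Use the same labels as the Vicuna-style template.
memory_sketch = SimpleBufferMemory(human_prefix="USER", ai_prefix="ASSISTANT")
memory_sketch.save("Can you tell me a joke about cars?", "Sure! ...")
print(memory_sketch.render())
```

With matching prefixes, the history and the current turn follow one labeling scheme throughout the prompt.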
