<a href="https://colab.research.google.com/github/tykiww/llm_implementations/blob/main/langchain_exploration.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# LangChain Prompt Templating Practice

The goal is a structured prompt engine that keeps the model's persona stable.

For this example, I am creating a simple summarizer.

Inspiration for this came from Sam Witteveen's LangChain Basics videos on YouTube:
https://www.youtube.com/watch?v=J_0qvRt4LNk&list=PL8motc6AQftk1Bs42EW45kwYbyJ4jOdiZ

I am still getting the hang of using Google's flan-t5-xxl well, but that isn't the point of this exercise anyway.

In [10]:
# Install packages
!pip -q install langchain huggingface_hub torch pyyaml

### Setup

In [2]:
# connect to google drive to retrieve credentials.
import yaml
from google.colab import drive

def google_drive():
    drive.mount('/content/drive',
                force_remount=True)

def load_yaml(pathname):
    # safe_load only builds plain Python objects, which is all a credentials file needs.
    with open(pathname, "r") as f:
        config = yaml.safe_load(f)
    return config

google_drive()
config = load_yaml("drive/MyDrive/config.yaml")

Mounted at /content/drive


### Session Info

In [3]:
# what gpu (colab)?
from torch import cuda

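def gpu_info():
    # Print the device name and GPU count, or a note when no GPU is attached.
    if not cuda.is_available():
        print("No CUDA device available")
        return
    device = cuda.get_device_name()
    n_gpus = cuda.device_count()
    print(device + ',', n_gpus)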

gpu_info()

Tesla T4, 1


### Modeling

I like keeping things as a cohesive unit, so everything is built as methods on a single class.

#### <u>Model Object Setup</u>

In [4]:
import os, re
from langchain.llms import HuggingFaceHub
from langchain import PromptTemplate, FewShotPromptTemplate
from langchain.chains import LLMChain

In [86]:
class LLModelNavigator:
    def __init__(self, token, modelname, model_kwargs):
        self.token = token
        self.modelname = modelname
        self.model_kwargs = model_kwargs

    def find_curly(self, text):
        # Return every {placeholder} name in the template, in order of appearance.
        pattern = r'{(.*?)}'
        matches = re.findall(pattern, text)
        return matches

    def set_tokens(self):
        os.environ['HUGGINGFACEHUB_API_TOKEN'] = self.token

    def set_template(self, prompt_template):
        prompt = PromptTemplate(
            input_variables=self.find_curly(prompt_template),
            template=prompt_template)
        return prompt

    def fewshot_template(self, kwargs):
        # Text before the separator becomes the prefix, text after it the suffix.
        split = kwargs["text"].split(kwargs["text_separator"])

        prompt = FewShotPromptTemplate(
            examples=kwargs["examples"],
            example_prompt=kwargs["example_prompt"],
            prefix=split[0],
            suffix=split[1],
            input_variables=self.find_curly(kwargs["text"]),
            example_separator=kwargs["example_separator"]
        )
        return prompt

    def build_model(self, prompt):
        self.set_tokens()

        model = HuggingFaceHub(
            repo_id=self.modelname,
            model_kwargs=self.model_kwargs)
        chain = LLMChain(llm=model, prompt=prompt)

        return chain
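
The placeholder extraction in `find_curly` can be tried on its own, outside the class. A minimal sketch in plain Python: `find_fields` is a hypothetical alternative (not part of this notebook or of LangChain) built on the standard library's `string.Formatter`. It is shown because the regex version repeats a name that appears twice and would also pick up escaped `{{...}}` braces.

```python
import re
from string import Formatter


def find_curly(text):
    """Regex approach from the class above: every {...} span, in order."""
    return re.findall(r'{(.*?)}', text)


def find_fields(text):
    """Hypothetical alternative using the standard format-string parser.

    Skips escaped braces ({{ and }}) and drops duplicate names
    while keeping first-seen order.
    """
    names = []
    for _, field, _, _ in Formatter().parse(text):
        if field and field not in names:
            names.append(field)
    return names


template = "Word: {word}\nAntonym: {antonym}\nAgain: {word}"
print(find_curly(template))   # ['word', 'antonym', 'word']
print(find_fields(template))  # ['word', 'antonym']
```

For the templates used in this notebook both functions return the same names, so the regex version is enough here.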

#### <u>Model Object Build</u>

In [133]:
# set up model navigator
navigator = LLModelNavigator(
    token=config['tokens']['huggingface'],
    modelname="google/flan-t5-xxl", # swap in any similarly sized model repo
    model_kwargs={"temperature": 0.1,
                  "max_length":512,
                  "load_in_8bit":True,
                  "device_map": "auto"})


#### <u>General Templated Approach</u>

In [134]:
# create prompt in the navigator
prompt = navigator.set_template(prompt_template=
    """
    Summarize the following text:
    {answer}
    """)

# create chained model
llm = navigator.build_model(prompt)

In [135]:
# Not the greatest performance, but will do for now.

llm.run("""
 Peter and Elizabeth took a taxi to attend the night party in the city.
 While in the party, Elizabeth collapsed and was rushed to the hospital.
 Since she was diagnosed with a brain injury, the doctor told Peter to stay besides her until she gets well.
 Therefore, Peter stayed with her at the hospital for 3 days without leaving.
        """)

'Peter stayed with Elizabeth at the hospital for 3 days.'

#### <u>FewShot Templated Approach</u>

This one is far more interesting.

My guess is that the few-shot structure gives the model more concrete guidance.

In [136]:
# First, create the list of few shot examples.
examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
    {"word": "big", "antonym": "small"}
]

# Next, create a generic example prompt with it
example_prompt = navigator.set_template(prompt_template=
    """
    Word: {word}
    Antonym: {antonym}\n
    """)

# Finally, write the prompt text; the blank line separates the prefix from the suffix
prompt_text = """
Give the antonym of every input

Word: {input}\nAntonym:
"""

In [137]:
# Here is a superhero version for fun.
# Run this instead of the cell above for a different take.

# First, create the list of few shot examples.
examples = [
    {"superhero": "Superman", "villain": "Lex Luthor"},
    {"superhero": "Batman", "villain": "Joker"},
    {"superhero": "Spider-Man", "villain": "Green Goblin"}
]

# Next, create a generic example prompt with it
example_prompt = navigator.set_template(prompt_template=
    """
    Superhero: {superhero}
    Villain: {villain}\n
    """)

# Finally, write the prompt text; the blank line separates the prefix from the suffix
prompt_text = """
Give the villain of every superhero input

Superhero: {input}\nVillain:
"""

In [138]:
# Set up the prompt in the navigator
fewshot_prompt = navigator.fewshot_template(
    kwargs={
        "examples": examples,
        "example_prompt": example_prompt,
        "text": prompt_text,
        "text_separator": "\n\n",
        "example_separator": "\n\n"
    }
)

# create chained model
llm = navigator.build_model(fewshot_prompt)
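
To see roughly what text the model receives, the few-shot assembly can be approximated in plain Python. This is a sketch under assumptions, not LangChain's code: `render_fewshot` is a hypothetical helper that mirrors the prefix/suffix split in `fewshot_template` and then joins prefix, formatted examples, and suffix with the example separator. The template strings are trimmed versions of the ones above.

```python
def render_fewshot(examples, example_template, text, text_separator,
                   example_separator, **inputs):
    """Hypothetical helper: approximate the rendered few-shot prompt."""
    # Same split as fewshot_template: first chunk is the prefix, second the suffix.
    parts = text.split(text_separator)
    prefix, suffix = parts[0], parts[1]
    # Fill the example template once per example dictionary.
    shots = [example_template.format(**ex) for ex in examples]
    # Join the non-empty pieces, then fill in the user input.
    pieces = [prefix, *shots, suffix]
    joined = example_separator.join(p for p in pieces if p)
    return joined.format(**inputs)


examples = [
    {"superhero": "Superman", "villain": "Lex Luthor"},
    {"superhero": "Batman", "villain": "Joker"},
]

rendered = render_fewshot(
    examples,
    example_template="Superhero: {superhero}\nVillain: {villain}",
    text="Give the villain of every superhero input\n\nSuperhero: {input}\nVillain:",
    text_separator="\n\n",
    example_separator="\n\n",
    input="The Flash",
)
print(rendered)
```

The printed prompt starts with the instruction, lists each example pair, and ends with an open `Villain:` line for the model to complete.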


In [139]:
# not bad.
llm.run("The Flash").title()

'Zoom'