# Introduction
This notebook shows how to use LangChain and Llama 2 to find recipes on the internet and consolidate their ingredients into a shopping list.

The sequence of actions is as follows:

1. User: Requests the bot to find a recipe for XYZ.
2. Bot: Searches recipe sites based on user keywords and presents a list of options.
3. User: Chooses one of the recipes and asks the bot to provide a list of ingredients.
4. Bot: Presents the list of ingredients.
5. User (optional): Repeats steps 1-4 to discover more recipes.
6. User: Asks the bot to consolidate the ingredients.
7. Bot: Presents the consolidated ingredient shopping list.
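
The sequence above can be sketched in plain Python before any LLM is involved. This is only an illustration: `FAKE_SITE`, `search_recipes`, `lookup_ingredients`, and `consolidate_basket` are hypothetical stand-ins for the search, extraction, and consolidation steps that the chains implement later in the notebook.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical stand-in for a recipe site: recipe name -> ingredient names.
FAKE_SITE: Dict[str, List[str]] = {
    "Green Curry": ["oil", "green curry paste", "chicken breast", "coconut milk"],
    "Thai Red Curry": ["oil", "red curry paste", "scallops", "coconut milk"],
    "Nasi Goreng": ["shallot", "garlic", "oil", "overnight rice"],
}


def search_recipes(keyword: str) -> List[str]:
    """Step 2: return the names of recipes that match the keyword."""
    return [name for name in FAKE_SITE if keyword.lower() in name.lower()]


def lookup_ingredients(recipe_name: str) -> List[str]:
    """Step 4: return the ingredient list of the chosen recipe."""
    return FAKE_SITE[recipe_name]


def consolidate_basket(selected: List[List[str]]) -> Counter:
    """Step 7: count how many of the selected recipes need each ingredient."""
    return Counter(item for ingredients in selected for item in ingredients)


# Steps 1-4 run twice (step 5), followed by steps 6-7.
basket = []
for keyword in ("curry", "nasi"):
    options = search_recipes(keyword)
    basket.append(lookup_ingredients(options[0]))  # the user picks the first option

shopping_list = consolidate_basket(basket)
print(shopping_list)
```

In the notebook itself, the LLM replaces the keyword matching and the counting, and an agent decides which step to run next.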


*NOTE: This notebook uses the 13B variant of Llama 2. Feel free to substitute your preferred LLM.*

In [1]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

# retrieving model path
MODEL_PATH  = os.getenv('13B')
n_gpu_layers = 1  # Metal set to 1 is enough.
n_batch = 512  # Should be between 1 and n_ctx, consider the amount of RAM of your Apple Silicon Chip.

DEBUG = True

MODEL_PATH

'/Users/jeremylee/APIs/llama.cpp/models/13B/ggml-model-q4_0.bin'
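
`os.getenv` returns `None` when the variable is missing, and the failure would then only surface later when the model is loaded. A minimal sketch of an early check follows; `resolve_model_path` is a hypothetical helper, not part of the notebook or of LangChain.

```python
import os
from pathlib import Path


def resolve_model_path(env_var: str) -> Path:
    """Read a model path from the environment and check that the file exists."""
    raw_path = os.getenv(env_var)
    if raw_path is None:
        raise KeyError(f"Environment variable {env_var!r} is not set; check your .env file")
    path = Path(raw_path).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"No model file found at {path}")
    return path
```

With this helper, the cell above could use `MODEL_PATH = str(resolve_model_path('13B'))` in place of the bare `os.getenv` call.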

In [2]:
# llm

from langchain.llms import LlamaCpp
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

llm = LlamaCpp(
    model_path=MODEL_PATH,
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    n_ctx=4096,
    max_tokens=1024,
    temperature=0.0,  # set to zero for consistent output
    f16_kv=True,  # MUST be True, otherwise you will run into problems after a couple of calls
    callback_manager=callback_manager,
    verbose=DEBUG,
)


agent_callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

agent_llm = LlamaCpp(
    model_path=MODEL_PATH,
    n_gpu_layers=n_gpu_layers,
    n_batch=n_batch,
    n_ctx=4096,
    max_tokens=1024,
    temperature=0.0,  # set to zero for consistent output
    f16_kv=True,  # MUST be True, otherwise you will run into problems after a couple of calls
    callback_manager=agent_callback_manager,
    verbose=DEBUG,
)

llama.cpp: loading model from /Users/jeremylee/APIs/llama.cpp/models/13B/ggml-model-q4_0.bin
llama_model_load_internal: format     = ggjt v3 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 4096
llama_model_load_internal: n_embd     = 5120
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 40
llama_model_load_internal: n_head_kv  = 40
llama_model_load_internal: n_layer    = 40
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: n_gqa      = 1
llama_model_load_internal: rnorm_eps  = 5.0e-06
llama_model_load_internal: n_ff       = 13824
llama_model_load_internal: freq_base  = 10000.0
llama_model_load_internal: freq_scale = 1
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size =    0.11 MB
llama_model_load_internal: mem required  = 7024.01 MB (+ 3200.00 MB per state)
llama_new_context_with_model: kv self 

# LLMChains
For our task, we will use three distinct LLMChains:

1. GetRecipeURLChain: Finds recipe URLs based on user input.
2. GetIngredientsFromContentChain: Reads the page at the provided URL and extracts its ingredients.
3. ConsolidateIngredientsChain: Aggregates all the ingredients the user has searched for into a single shopping list.

### GetRecipeURLChain
This chain uses a custom prompt template that takes the recipe keyword, calls the `find_recipe` method, and injects the result into the prompt before asking the LLM to generate the output.

*Note: the `find_recipe` method can also be used as one of the agent's tools.*
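
The pattern described above (fill one prompt variable by calling a function on another input) can be shown without LangChain. In this sketch, `InjectingTemplate` and `fake_find_recipe` are hypothetical names used for illustration only; the real chain subclasses `StringPromptTemplate` and calls the real `find_recipe`.

```python
from typing import Callable


class InjectingTemplate:
    """Fill the `text` prompt variable by calling a function on `keyword`."""

    def __init__(self, template: str, lookup: Callable[[str], str]) -> None:
        self.template = template
        self.lookup = lookup

    def format(self, **kwargs: str) -> str:
        # The caller supplies `keyword`; the template string only knows `text`.
        keyword = kwargs.pop("keyword")
        kwargs["text"] = self.lookup(keyword)
        return self.template.format(**kwargs)


def fake_find_recipe(keyword: str) -> str:
    """Hypothetical stand-in for `find_recipe`; makes no network request."""
    return f'<a href="https://example.com/{keyword}/">{keyword.title()}</a>'


prompt = InjectingTemplate(
    "Based on the following text extract their links and names: {text}",
    fake_find_recipe,
)
formatted = prompt.format(keyword="rendang")
print(formatted)
```

The caller only ever passes `keyword`, which is why the chain declares `input_variables=["keyword"]` even though the template string contains `{text}`.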

In [3]:
### Inspired by https://jonathansoma.com/words/three-ways-to-scrape-langchain.html
### This function can be used as part of agent's tool.
### See https://python.langchain.com/docs/modules/agents/tools/custom_tools

from langchain.tools import tool
from pydantic import BaseModel
from pydantic import Field

import requests
from bs4 import BeautifulSoup

class SearchInput(BaseModel):
    query: str = Field(description="should be a search query")


@tool("find_recipe", return_direct=True, args_schema=SearchInput)
def find_recipe(query: str) -> str:
    """Based on search input, return a list of recipes"""

    url = f"https://rasamalaysia.com/?s={query}"

    # Make the request
    headers = { 'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0' }
    response = requests.get(url, headers=headers)

    # Parse the HTML (name the parser explicitly to avoid a bs4 warning)
    soup = BeautifulSoup(response.text, "html.parser")

    # Get all the recipe title links
    links = soup.select("[class='entry-title-link']")

    return str(links).replace("\n", "")

# Uncomment to test
# recipe_query = "rendang"
# find_recipe(recipe_query)
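
The notebook hands the raw `<a>` tags to the LLM and asks it to pull out names and links. As a point of comparison, the same pairs can be extracted deterministically. The sketch below uses only the standard library; `TitleLinkParser` and `extract_recipes` are hypothetical names, and the sample HTML is copied from the search results shown in the run output.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class TitleLinkParser(HTMLParser):
    """Collect (name, url) pairs from <a class="entry-title-link"> elements."""

    def __init__(self) -> None:
        super().__init__()
        self.results: List[Tuple[str, str]] = []
        self._current_href: Optional[str] = None
        self._current_text: List[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("class") == "entry-title-link":
            self._current_href = attributes.get("href")
            self._current_text = []

    def handle_data(self, data):
        # Only collect text while inside a matching link.
        if self._current_href is not None:
            self._current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            name = "".join(self._current_text).strip()
            self.results.append((name, self._current_href))
            self._current_href = None


def extract_recipes(html: str, limit: int = 3) -> List[Tuple[str, str]]:
    """Return up to `limit` (name, url) pairs found in the HTML."""
    parser = TitleLinkParser()
    parser.feed(html)
    parser.close()
    return parser.results[:limit]


sample = (
    '[<a class="entry-title-link" href="https://rasamalaysia.com/green-curry/" '
    'rel="bookmark">Green Curry</a>, '
    '<a class="entry-title-link" href="https://rasamalaysia.com/red-curry-recipe/" '
    'rel="bookmark">Thai Red Curry</a>]'
)
print(extract_recipes(sample))
```

A deterministic step like this would also shorten the prompt, since the LLM would no longer need to read the full HTML of every link.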

In [4]:
from langchain.prompts import StringPromptTemplate
from langchain import LLMChain
from langchain.llms.base import BaseLLM

class SearchRecipeTemplate(StringPromptTemplate):
    # This template takes the user's input (the keyword), calls an external
    # function with it, and puts the result back into the template.
    # The prompt input expects `keyword`.

    # the template to use
    template: str

    def format(self, **kwargs) -> str:
        keyword = kwargs.pop("keyword")
        
        text = find_recipe(keyword)
        kwargs["text"] = text
        return self.template.format(**kwargs)

class GetRecipeURLChain(LLMChain):
    """Chain that extracts recipe names and URLs from search results"""

    @classmethod
    def from_llm(cls, llm: BaseLLM, verbose: bool = True) -> LLMChain:
        """Input any text with url, returns url"""

        search_template = ("Based on the following text"
                            " extract their links and names: {text}"
                            " Return first 3 results, like:"
                            " name, url,"
                            " name, url,"
                            " name, url."
                            )
        prompt = SearchRecipeTemplate(
            template=search_template, 
            input_variables=["keyword"] # note, text will be placed by the SearchRecipeTemplate
        )

        return cls(prompt=prompt, llm=llm, verbose=DEBUG)

get_recipe_url_chain = GetRecipeURLChain.from_llm(llm)

# Uncomment to test
# url = get_recipe_url_chain.run(keyword="rendang")
# url

### GetIngredientsFromContentChain
This chain also uses a custom prompt template. It fetches a web page by calling the `web_content_extractor` method, extracts the ingredient list, and inserts the result into the prompt before asking the LLM to generate the output.

*Note: the `web_content_extractor` method can also be used as one of the agent's tools.*

In [5]:
# Function to help extract content based on URL.
# NOTE: This can be part of Agent's tool or be called directly.
# In our case, we will call it directly from LLMChain instead

class URLInput(BaseModel):
    url: str = Field(description="must be a URL")


@tool("get_url_content", return_direct=True, args_schema=URLInput)
def web_content_extractor(url: str) -> str:
    """Based on the given input URL, load and return the web content"""

    if not url.startswith("http"):
        return "URL must start with http. Please try again"

    # Make the request
    headers = { 'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0' }
    response = requests.get(url, headers=headers)

    # Parse the HTML
    soup = BeautifulSoup(response.text, "html.parser")

    # Get all the ingredient groups
    ingredients = soup.find_all("div",{"class":"wprm-recipe-ingredient-group"})
    ingredient_list = [ingredient_div.get_text() for ingredient_div in ingredients]

    return str(ingredient_list)

# Uncomment to test
# web_content_extractor("https://rasamalaysia.com/baked-cod/")
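
`get_text()` joins every ingredient in a group into one string, separated only by the `▢` checkbox character that appears in the scraped text in the run output. A small post-processing step can split that string into one line per ingredient. `split_ingredients` is a hypothetical helper, and the sample string is shortened from the green curry page output.

```python
from typing import List


def split_ingredients(group_text: str, marker: str = "▢") -> List[str]:
    """Split one scraped ingredient group into individual ingredient lines."""
    return [part.strip() for part in group_text.split(marker) if part.strip()]


scraped = (
    "▢ 1 1/2 tablespoons oil"
    "▢ 2 tbsp green curry paste (Maesri brand preferred)"
    "▢ 1/2 cup coconut milk"
)
for line in split_ingredients(scraped):
    print(line)
```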

In [6]:
import re
from langchain.prompts import StringPromptTemplate

class GetRecipeFromUrlTemplate(StringPromptTemplate):
    # This template takes the user's input (text containing a URL), fetches
    # the page content with an external function, and puts it into the template.
    # The prompt input expects `url`.

    # the template to use
    template: str

    def format(self, **kwargs) -> str:
        url = kwargs.pop("url")

        url_extract_pattern = "https?:\\/\\/(?:www\\.)?[-a-zA-Z0-9@:%._\\+~#=]{1,256}\\.[a-zA-Z0-9()]{1,6}\\b(?:[-a-zA-Z0-9()@:%_\\+.~#?&\\/=]*)"
        matched_urls = re.findall(url_extract_pattern, url)
        if not matched_urls:
            raise ValueError(f"No URL found in input: {url!r}")

        web_content = web_content_extractor(matched_urls[0])
        print(web_content)
        kwargs["web_content"] = web_content
        return self.template.format(**kwargs)

class GetIngredientsFromContentChain(LLMChain):
    """Chain that extracts the ingredients from a recipe web page"""

    @classmethod
    def from_llm(cls, llm: BaseLLM, verbose: bool = True) -> LLMChain:
        """Input content of webpage, output ingredient names and quantity and unit"""

        ingredient_extraction_template = (
                                          " You will list all the ingredients"
                                          " and their quantities based on this text: '{web_content}'"
                                          " Add ### line before listing the ingredients."
                                          " Add === line after listing all the ingredients."
                                          " List down the ingredients, like:"
                                          " ingredient A - X unit of measurement,"
                                          " ingredient B - Y unit of measurement,"
                                          " ingredient C - Z unit of measurement."
                                          )

        prompt = GetRecipeFromUrlTemplate(
            template=ingredient_extraction_template, 
            input_variables=["url"] # note: web_content is filled in by GetRecipeFromUrlTemplate
        )

        return cls(prompt=prompt, llm=llm, verbose=DEBUG)

get_ingredients_from_content_chain = GetIngredientsFromContentChain.from_llm(llm)

# Uncomment to test
# ingredients = get_ingredients_from_content_chain.run(url=" the url is https://rasamalaysia.com/beef-rendang-recipe-rendang-daging/>")
# ingredients
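
The agent often passes the URL wrapped in extra text or angle brackets, which is why the template runs a regular expression before fetching the page. The sketch below uses the same pattern as `GetRecipeFromUrlTemplate`, written as raw strings, to show what it keeps and what it drops. `first_url` is a hypothetical helper.

```python
import re
from typing import Optional

# Same pattern as in GetRecipeFromUrlTemplate, split over two raw strings.
URL_PATTERN = re.compile(
    r"https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b"
    r"(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)"
)


def first_url(text: str) -> Optional[str]:
    """Return the first URL found in `text`, or None when there is none."""
    matches = URL_PATTERN.findall(text)
    return matches[0] if matches else None


print(first_url(" the url is https://rasamalaysia.com/beef-rendang-recipe-rendang-daging/>"))
print(first_url("<https://rasamalaysia.com/nasi-goreng-indonesian-fried-rice/>"))
print(first_url("second option"))
```

The trailing `>` is not part of the path character class, so it is left out of the match.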

### ConsolidateIngredientsChain
Like the previous chain, this one uses a custom prompt template.

*NOTE: A database should be used to store all the searched ingredients. For convenience, I am using a Python list instead.*
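
As a rough idea of what the database variant could look like, here is a minimal sketch using the standard library's `sqlite3`. The table name, column names, and the two helper functions are assumptions made for this example; they are not part of the notebook.

```python
import sqlite3
from typing import List

# ":memory:" keeps the example self-contained; a file path would persist the data.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE IF NOT EXISTS ingredients (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
)


def save_ingredients_to_db(body: str) -> None:
    """Store the ingredient text of one recipe."""
    with connection:  # commits on success, rolls back on error
        connection.execute("INSERT INTO ingredients (body) VALUES (?)", (body,))


def load_ingredients_from_db() -> List[str]:
    """Return every stored ingredient text in insertion order."""
    rows = connection.execute("SELECT body FROM ingredients ORDER BY id")
    return [row[0] for row in rows]


save_ingredients_to_db("* 1 tablespoon oil\n* 80 ml coconut milk")
save_ingredients_to_db("* 2 tablespoons oil\n* 1 clove garlic")
print(load_ingredients_from_db())
```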

In [7]:
# Replace this with a database for permanent storage instead
ingredients = []

@tool
def save_ingredients(ingredient: str):
    """save ingredients to memory"""
    ingredients.append(ingredient)

@tool
def get_ingredients(input: str) -> str:
    """get ingredients from memory"""
    ingredient_string = ""
    for i in ingredients:
        ingredient_string = ingredient_string + " " + i.replace("Ingredients:", "") 
    return ingredient_string


In [8]:
class GetIngredientsFromDBTemplate(StringPromptTemplate):

    # the template to use
    template: str

    def format(self, **kwargs) -> str:

        ingredients = get_ingredients('')

        kwargs["ingredients"] = ingredients
        return self.template.format(**kwargs)

class ConsolidateIngredientsChain(LLMChain):
    """Chain to consolidate ingredients"""

    @classmethod
    def from_llm(cls, llm: BaseLLM, verbose: bool = True) -> LLMChain:
        """Input a list of ingredients, output ingredient names and quantity and unit"""
        
        consolidate_ingredients_template = ("You are an expert in identifying and consolidating ingredients"
                                        " from this content: '{ingredients}'."
                                        " If there are similar ingredients, sum up their quantity."
                                        " For other ingredients, just return their name and quantity."
                                        " Do not suggest other ingredients that are not in the text."
                                        " Please identify and consolidate the ingredients, like:"
                                        " ingredient A (quantity),"
                                        " ingredient B (quantity)."
                                        )
        prompt = GetIngredientsFromDBTemplate(
            template=consolidate_ingredients_template,
            input_variables=['text']
        )

        return cls(prompt=prompt, llm=llm, verbose=DEBUG)

consolidate_ingredients_chain = ConsolidateIngredientsChain.from_llm(llm)


# Uncomment to test
# ingredients = ["""Ingredients:
# * 1 tablespoon oil
# * 1 1/2 tablespoons red curry paste (Mae Ploy)
# * 4 oz. big scallops
# * 4 oz. boneless and skinless chicken breast (cut into small cubes)
# * 80 ml. coconut milk
# * 1/4 cup water
# * 2 long beans (cut into 2-inch lengths)
# * 1/2 cup thickly sliced carrot
# * 1 kaffir lime leave (cut into fine thin strips, optional)
# * 1/4 teaspoon fish sauce
# * 2 teaspoons palm sugar or sugar"""
# ]

# print(get_ingredients(""))
# output = consolidate_ingredients_chain.run(text='')
# output
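
The prompt asks the LLM to sum the quantities of similar ingredients. For lines that follow a regular "quantity unit name" shape, the same arithmetic can be done exactly with `fractions.Fraction`, which is useful as a cross-check on the model's answer. This is a sketch under that assumption: `parse_line` and `consolidate_lines` are hypothetical helpers, and lines without a unit word (for example "2 long beans") would be misparsed.

```python
import re
from collections import defaultdict
from fractions import Fraction
from typing import Dict, List, Tuple

# A quantity ("2", "1/2" or "1 1/2"), then a unit word, then the ingredient name.
LINE_PATTERN = re.compile(r"^\*?\s*(\d+(?:\s+\d+/\d+)?|\d+/\d+)\s+(\S+)\s+(.+)$")


def parse_line(line: str) -> Tuple[Fraction, str, str]:
    """Parse one bullet line into (quantity, unit, name)."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"Cannot parse ingredient line: {line!r}")
    quantity_text, unit, name = match.groups()
    # "1 1/2" becomes Fraction(1) + Fraction(1, 2)
    quantity = sum((Fraction(part) for part in quantity_text.split()), Fraction(0))
    return quantity, unit, name.strip()


def consolidate_lines(lines: List[str]) -> Dict[Tuple[str, str], Fraction]:
    """Sum the quantities of lines that share the same name and unit."""
    totals: Dict[Tuple[str, str], Fraction] = defaultdict(Fraction)
    for line in lines:
        quantity, unit, name = parse_line(line)
        totals[(name, unit)] += quantity
    return dict(totals)


sample_lines = [
    "* 1 1/2 tablespoons oil",
    "* 2 tablespoons oil",
    "* 1/2 cup water",
    "* 1/4 cup water",
]
for (name, unit), quantity in consolidate_lines(sample_lines).items():
    print(f"{name} ({quantity} {unit})")
```

Unlike the LLM, this approach cannot merge entries written with different units or wording, such as "tbsp" and "tablespoons".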

# Agent
We can then use an agent to help with our quest of finding and consolidating ingredients into a shopping list! 

Things to note:
1. I use a `BaseCallbackHandler` to save the ingredients that `get_ingredients_from_content_chain` returns.
2. The agent has 2 optional parameters to help handle edge cases:
* `handle_parsing_errors=True` lets the agent see the parsing error and refine its answer
* `max_iterations=5` ensures that the agent does not go into an infinite loop
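
To make the two parameters concrete, here is a toy agent loop in plain Python. It is not LangChain's implementation; `run_agent`, `fake_llm`, and `demo_tools` are hypothetical, and the loop only mimics the two behaviours: retrying after an unparsable reply and stopping after a fixed number of iterations.

```python
import re
from typing import Callable, Dict, List

ACTION_PATTERN = re.compile(r"Action:\s*(.+?)\s*\nAction Input:\s*(.+)")


def run_agent(
    think: Callable[[List[str]], str],
    tools: Dict[str, Callable[[str], str]],
    max_iterations: int = 5,
) -> str:
    """Call `think` until a tool runs, giving up after `max_iterations`."""
    scratchpad: List[str] = []
    for _ in range(max_iterations):
        reply = think(scratchpad)
        match = ACTION_PATTERN.search(reply)
        if match is None or match.group(1) not in tools:
            # Similar in spirit to handle_parsing_errors=True: report and retry.
            scratchpad.append("Observation: invalid format, use 'Action:' and 'Action Input:'")
            continue
        tool_name, tool_input = match.group(1), match.group(2).strip()
        # Every tool in this toy loop behaves as if return_direct=True.
        return tools[tool_name](tool_input)
    return "Agent stopped: iteration limit reached"


def fake_llm(scratchpad: List[str]) -> str:
    """Hypothetical model: malformed on the first call, well-formed afterwards."""
    if not scratchpad:
        return "I think I should search for curry."
    return "Thought: search\nAction: get_recipe_by_keyword\nAction Input: curry"


demo_tools = {"get_recipe_by_keyword": lambda query: f"results for {query}"}
print(run_agent(fake_llm, demo_tools))
print(run_agent(lambda scratchpad: "no action here", demo_tools, max_iterations=3))
```

The first call recovers after one malformed reply; the second never produces a valid action and stops at the iteration limit.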

In [9]:
# Custom memory. Llama 2 as the agent LLM still cannot hold a ReAct
# conversation well beyond 2 turns with long tokens. Modify its memory so that
# long replies are not saved.

from langchain.memory import ConversationBufferMemory
from langchain.memory import ConversationBufferWindowMemory
from typing import Any, Dict

class CustomConversationBufferMemory(ConversationBufferWindowMemory):

    # overriding save_context method.
    def save_context(self, inputs: Dict[str, Any], outputs: Dict[str, str]) -> None:
        """Save context from this conversation to buffer."""
        input_str, output_str = self._get_input_output(inputs, outputs)
        
        if(len(output_str.split(" ")) > 40):
            # self.chat_memory.add_ai_message("[long reply]") 
            # don't save it
            pass
        else:
            self.chat_memory.add_user_message(input_str)
            self.chat_memory.add_ai_message(output_str) 
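
The filtering idea in `CustomConversationBufferMemory` can be tried out without LangChain. In the sketch below, `ShortReplyWindow` is a hypothetical class that applies the same 40-word cutoff. One difference to note: it drops old turns when saving, whereas the window memory in the notebook stores every short exchange and applies `k` when the history is read.

```python
from collections import deque
from typing import Deque, List, Tuple


class ShortReplyWindow:
    """Keep the last `k` exchanges, ignoring replies longer than `max_words`."""

    def __init__(self, k: int = 1, max_words: int = 40) -> None:
        self.max_words = max_words
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=k)

    def save(self, user_input: str, reply: str) -> None:
        if len(reply.split(" ")) > self.max_words:
            return  # long replies (such as ingredient lists) are not remembered
        self.turns.append((user_input, reply))

    def history(self) -> List[str]:
        lines: List[str] = []
        for user_input, reply in self.turns:
            lines.append(f"Human: {user_input}")
            lines.append(f"AI: {reply}")
        return lines


window = ShortReplyWindow(k=1)
window.save("What are the ingredients in curry?", "1. Green Curry 2. Thai Red Curry")
window.save("first option!", " ".join(["word"] * 50))  # too long, skipped
print(window.history())
```

This matches what the `memory` cells show later in the notebook: the long ingredient reply to "first option!" never appears in the chat history.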

In [10]:
from langchain.agents import ZeroShotAgent, Tool, AgentExecutor
from langchain.memory import ConversationBufferMemory
from langchain.memory import ConversationBufferWindowMemory
from langchain.callbacks.base import BaseCallbackHandler

# refer to https://python.langchain.com/docs/modules/memory/agent_with_memory
# for agent to gain memory

class SaveIngredientsHandler(BaseCallbackHandler):
    def on_tool_start(self, serialized, input_str: str, **kwargs):
        """Run when tool starts running."""

    def on_tool_end(self, output: str, **kwargs):
        """Run when tool ends running."""

        ingredient = output.split("###")
        if len(ingredient) == 1:
            save_ingredients(output)
        else:
            just_ingredient = ingredient[1].split("===")
            save_ingredients(just_ingredient[0])

tools = [
    Tool(
        name="get_recipe_by_keyword",
        func=get_recipe_url_chain.run,
        description="""useful for finding recipes based \
            on user keywords""",
        return_direct = True,
    ),
    Tool(
        name="get_ingredients_by_url",
        func=get_ingredients_from_content_chain.run,
        description="""useful for extracting out ingredients \
            from the recipe url. \
            Save the ingredients after that""",
        callbacks=[SaveIngredientsHandler()],
        return_direct = True
    ),
    Tool(
        name="summarise_ingredients",
        func=consolidate_ingredients_chain.run,
        description="""useful when you want to summarise \
            all the ingredients after getting them from memory. \
            Do not use this to consolidate recipes.""",
        return_direct = True
    )
]


prefix = """You are a shopping assistant who is an expert at
    finding food ingredients based on user inputs.
    If there are multiple recipes, ask user to select one before proceeding. 
    Then extract out the ingredients of the selected recipe.
    Finally, consolidate the ingredients when user is done.
    If you are unsure of the next step, show your output and ask user for input.
    Do not role play in the conversation. For example, generating AI and Human 
    conversation is not allowed."""

suffix = """Begin!

{chat_history}
Question: {input}
{agent_scratchpad}"""


agent_prompt = ZeroShotAgent.create_prompt(
    tools,
    prefix=prefix,
    suffix=suffix,
    input_variables=["input", "chat_history", "agent_scratchpad"],
)


# memory = CustomConversationBufferMemory(memory_key="chat_history")
memory = CustomConversationBufferMemory(k=1, memory_key="chat_history")
# memory = ConversationBufferMemory(memory_key="chat_history")
# memory = ConversationBufferWindowMemory(k=2, memory_key="chat_history")
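
`SaveIngredientsHandler.on_tool_end` relies on the `###` and `===` markers that the ingredient prompt asks the LLM to emit. The standalone function below mirrors that splitting logic so it can be tested on its own; `extract_ingredient_block` is a hypothetical name, and the sample reply is modelled on the output format seen in the runs below.

```python
def extract_ingredient_block(output: str) -> str:
    """Return the text between the '###' and '===' markers when '###' is present."""
    parts = output.split("###")
    if len(parts) == 1:
        return output  # no '###' marker: keep the whole reply, as the handler does
    return parts[1].split("===")[0]


reply = (
    "Here is the list of ingredients:\n\n"
    "### Ingredients:\n\n"
    "* 1 clove shallot\n"
    "* 1 clove garlic\n\n"
    "=== Total ingredients: 2"
)
print(extract_ingredient_block(reply).strip())
```

Text before `###` and after `===`, such as the model's preamble or a trailing summary, is discarded before the ingredients are saved.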


In [11]:
llm_chain = LLMChain(llm=agent_llm, prompt=agent_prompt)


agent = ZeroShotAgent(llm_chain=llm_chain, tools=tools, verbose=DEBUG)

agent_chain = AgentExecutor.from_agent_and_tools(
    agent=agent, tools=tools, verbose=DEBUG, memory=memory,
    # handle_parsing_errors="Check your output and make sure it conforms!"
    handle_parsing_errors=True,
    max_iterations=5
)


# Inference!
Let's test our code

In [12]:
agent_chain.run(input="What are the ingredients in curry?")



> Entering new AgentExecutor chain...

Thought: Hmm, that's a good question. Let me think...

Action: get_recipe_by_keyword
Action Input: "curry"



llama_print_timings:        load time = 10879.74 ms
llama_print_timings:      sample time =    29.57 ms /    43 runs   (    0.69 ms per token,  1453.93 tokens per second)
llama_print_timings: prompt eval time = 10879.70 ms /   365 tokens (   29.81 ms per token,    33.55 tokens per second)
llama_print_timings:        eval time =  4192.93 ms /    42 runs   (   99.83 ms per token,    10.02 tokens per second)
llama_print_timings:       total time = 15167.05 ms


[32;1m[1;3m
Thought: Hmm, that's a good question. Let me think...

Action: get_recipe_by_keyword
Action Input: "curry"
[0m

[1m> Entering new GetRecipeURLChain chain...[0m
Prompt after formatting:
[32;1m[1;3mBased on the following text extract their links and names: [<a class="entry-title-link" href="https://rasamalaysia.com/green-curry/" rel="bookmark">Green Curry</a>, <a class="entry-title-link" href="https://rasamalaysia.com/red-curry-recipe/" rel="bookmark">Thai Red Curry</a>, <a class="entry-title-link" href="https://rasamalaysia.com/curry-meecurry-laksa-recipe/" rel="bookmark">Curry Laksa (Curry Mee)</a>, <a class="entry-title-link" href="https://rasamalaysia.com/thai-yellow-curry/" rel="bookmark">Yellow Curry</a>, <a class="entry-title-link" href="https://rasamalaysia.com/thai-recipe-chicken-panang-curry/" rel="bookmark">Panang Curry</a>, <a class="entry-title-link" href="https://rasamalaysia.com/green-curry-shrimp/" rel="bookmark">Green Curry Shrimp</a>, <a class="entry-


llama_print_timings:        load time = 16133.29 ms
llama_print_timings:      sample time =    64.33 ms /    93 runs   (    0.69 ms per token,  1445.72 tokens per second)
llama_print_timings: prompt eval time = 38697.47 ms /   991 tokens (   39.05 ms per token,    25.61 tokens per second)
llama_print_timings:        eval time =  8232.25 ms /    92 runs   (   89.48 ms per token,    11.18 tokens per second)
llama_print_timings:       total time = 47147.07 ms


'\n\nThe first three results are:\n\n1. Green Curry <https://rasamalaysia.com/green-curry/>\n2. Thai Red Curry <https://rasamalaysia.com/red-curry-recipe/>\n3. Curry Laksa (Curry Mee) <https://rasamalaysia.com/curry-laksa-recipe/>'

In [13]:
memory

CustomConversationBufferMemory(chat_memory=ChatMessageHistory(messages=[HumanMessage(content='What are the ingredients in curry?', additional_kwargs={}, example=False), AIMessage(content='\n\nThe first three results are:\n\n1. Green Curry <https://rasamalaysia.com/green-curry/>\n2. Thai Red Curry <https://rasamalaysia.com/red-curry-recipe/>\n3. Curry Laksa (Curry Mee) <https://rasamalaysia.com/curry-laksa-recipe/>', additional_kwargs={}, example=False)]), output_key=None, input_key=None, return_messages=False, human_prefix='Human', ai_prefix='AI', memory_key='chat_history', k=1)

In [14]:
agent_chain.run(input="first option!")



> Entering new AgentExecutor chain...


Llama.generate: prefix-match hit


Thought: I will extract the ingredients from the first recipe.
Action: get_ingredients_by_url
Action Input: https://rasamalaysia.com/green-curry/
Thought: I will extract the ingredients from the first recipe.
Action: get_ingredients_by_url
Action Input: https://rasamalaysia.com/green-curry/

> Entering new GetIngredientsFromContentChain chain...



llama_print_timings:        load time = 10879.74 ms
llama_print_timings:      sample time =    35.60 ms /    52 runs   (    0.68 ms per token,  1460.67 tokens per second)
llama_print_timings: prompt eval time =  6020.03 ms /   116 tokens (   51.90 ms per token,    19.27 tokens per second)
llama_print_timings:        eval time =  4187.66 ms /    51 runs   (   82.11 ms per token,    12.18 tokens per second)
llama_print_timings:       total time = 10321.96 ms


['▢ 1 1/2 tablespoons oil▢ 2 tbsp green curry paste (Maesri brand preferred)▢ 8 oz. chicken breast, cut into bite-sized pieces▢ 1/2 cup coconut milk▢ 1/2 cup water▢ 4 oz. bamboo shoot▢ 5 kaffir lime leaves (lightly bruised)▢ 2 red chilies (cut into thick strips)▢ 1 tablespoon fish sauce▢ 1 tablespoon sugar or palm sugar (preferred)▢ 1/4 cup Thai basil leaves']
Prompt after formatting:
[32;1m[1;3m You will list all the ingredients and their quantities based on this text: '['▢ 1 1/2 tablespoons oil▢ 2 tbsp green curry paste (Maesri brand preferred)▢ 8 oz. chicken breast, cut into bite-sized pieces▢ 1/2 cup coconut milk▢ 1/2 cup water▢ 4 oz. bamboo shoot▢ 5 kaffir lime leaves (lightly bruised)▢ 2 red chilies (cut into thick strips)▢ 1 tablespoon fish sauce▢ 1 tablespoon sugar or palm sugar (preferred)▢ 1/4 cup Thai basil leaves']' Add ### line before listing the ingredients. Add === line after listing all the ingredients. List down the ingredients, like: ingredient A - X unit of measure

Llama.generate: prefix-match hit




### Ingredients:

* 1 1/2 tablespoons oil
* 2 tbsp green curry paste (Maesri brand preferred)
* 8 oz. chicken breast, cut into bite-sized pieces
* 1/2 cup coconut milk
* 1/2 cup water
* 4 oz. bamboo shoot
* 5 kaffir lime leaves (lightly bruised)
* 2 red chilies (cut into thick strips)
* 1 tablespoon fish sauce
* 1 tablespoon sugar or palm sugar (preferred)
* 1/4 cup Thai basil leaves

===
> Finished chain.

Observation:

### Ingredients:

* 1 1/2 tablespoons oil
* 2 tbsp green curry paste (Maesri brand preferred)
* 8 oz. chicken breast, cut into bite-sized pieces
* 1/2 cup coconut milk
* 1/2 cup water
* 4 oz. bamboo shoot
* 5 kaffir lime leaves (lightly bruised)
* 2 red chilies (cut into thick strips)
* 1 tablespoon fish sauce
* 1 tablespoon sugar or palm sugar (preferred)
* 1/4 cup Thai basil leaves

===

> Finished chain.



llama_print_timings:        load time = 16133.29 ms
llama_print_timings:      sample time =   111.21 ms /   162 runs   (    0.69 ms per token,  1456.74 tokens per second)
llama_print_timings: prompt eval time =  7058.44 ms /   241 tokens (   29.29 ms per token,    34.14 tokens per second)
llama_print_timings:        eval time = 12362.43 ms /   161 runs   (   76.79 ms per token,    13.02 tokens per second)
llama_print_timings:       total time = 19780.10 ms


'\n\n### Ingredients:\n\n* 1 1/2 tablespoons oil\n* 2 tbsp green curry paste (Maesri brand preferred)\n* 8 oz. chicken breast, cut into bite-sized pieces\n* 1/2 cup coconut milk\n* 1/2 cup water\n* 4 oz. bamboo shoot\n* 5 kaffir lime leaves (lightly bruised)\n* 2 red chilies (cut into thick strips)\n* 1 tablespoon fish sauce\n* 1 tablespoon sugar or palm sugar (preferred)\n* 1/4 cup Thai basil leaves\n\n==='

In [15]:
memory

CustomConversationBufferMemory(chat_memory=ChatMessageHistory(messages=[HumanMessage(content='What are the ingredients in curry?', additional_kwargs={}, example=False), AIMessage(content='\n\nThe first three results are:\n\n1. Green Curry <https://rasamalaysia.com/green-curry/>\n2. Thai Red Curry <https://rasamalaysia.com/red-curry-recipe/>\n3. Curry Laksa (Curry Mee) <https://rasamalaysia.com/curry-laksa-recipe/>', additional_kwargs={}, example=False)]), output_key=None, input_key=None, return_messages=False, human_prefix='Human', ai_prefix='AI', memory_key='chat_history', k=1)

In [16]:
agent_chain.run(input="What is the recipe for Nasi Lemak?")



> Entering new AgentExecutor chain...


Llama.generate: prefix-match hit


Thought: I should ask the user which recipe they want to use.
Action: get_recipe_by_keyword
Action Input: Nasi Lemak


llama_print_timings:        load time = 10879.74 ms
llama_print_timings:      sample time =    28.01 ms /    40 runs   (    0.70 ms per token,  1427.91 tokens per second)
llama_print_timings: prompt eval time =   760.98 ms /    12 tokens (   63.41 ms per token,    15.77 tokens per second)
llama_print_timings:        eval time =  3154.09 ms /    39 runs   (   80.87 ms per token,    12.36 tokens per second)
llama_print_timings:       total time =  4002.26 ms


[32;1m[1;3mThought: I should ask the user which recipe they want to use.
Action: get_recipe_by_keyword
Action Input: Nasi Lemak[0m

[1m> Entering new GetRecipeURLChain chain...[0m
Prompt after formatting:
[32;1m[1;3mBased on the following text extract their links and names: [<a class="entry-title-link" href="https://rasamalaysia.com/nasi-lemak-recipe/" rel="bookmark">Nasi Lemak</a>, <a class="entry-title-link" href="https://rasamalaysia.com/nasi-goreng-indonesian-fried-rice/" rel="bookmark">Nasi Goreng (Indonesian Fried Rice)</a>, <a class="entry-title-link" href="https://rasamalaysia.com/nasi-ulam-recipe/" rel="bookmark">Nasi Ulam (Malaysian Mixed Herb Rice)</a>, <a class="entry-title-link" href="https://rasamalaysia.com/udang-masak-lemak-nenas-pineapple-prawn-curry/" rel="bookmark">Pineapple Prawn Curry (Udang Masak Lemak Nenas)</a>, <a class="entry-title-link" href="https://rasamalaysia.com/beef-rendang-recipe-rendang-daging/" rel="bookmark">Beef Rendang</a>, <a class="entry-

Llama.generate: prefix-match hit




The links and names of the first three results are:

1. Nasi Lemak - <https://rasamalaysia.com/nasi-lemak-recipe/>
2. Nasi Goreng (Indonesian Fried Rice) - <https://rasamalaysia.com/nasi-goreng-indonesian-fried-rice/>
3. Nasi Ulam (Malaysian Mixed Herb Rice) - <https://rasamalaysia.com/nasi-ulam-recipe/>
> Finished chain.

Observation:

The links and names of the first three results are:

1. Nasi Lemak - <https://rasamalaysia.com/nasi-lemak-recipe/>
2. Nasi Goreng (Indonesian Fried Rice) - <https://rasamalaysia.com/nasi-goreng-indonesian-fried-rice/>
3. Nasi Ulam (Malaysian Mixed Herb Rice) - <https://rasamalaysia.com/nasi-ulam-recipe/>

> Finished chain.



llama_print_timings:        load time = 16133.29 ms
llama_print_timings:      sample time =    84.53 ms /   123 runs   (    0.69 ms per token,  1455.19 tokens per second)
llama_print_timings: prompt eval time = 38632.95 ms /  1074 tokens (   35.97 ms per token,    27.80 tokens per second)
llama_print_timings:        eval time = 10907.25 ms /   122 runs   (   89.40 ms per token,    11.19 tokens per second)
llama_print_timings:       total time = 49822.38 ms


'\n\nThe links and names of the first three results are:\n\n1. Nasi Lemak - <https://rasamalaysia.com/nasi-lemak-recipe/>\n2. Nasi Goreng (Indonesian Fried Rice) - <https://rasamalaysia.com/nasi-goreng-indonesian-fried-rice/>\n3. Nasi Ulam (Malaysian Mixed Herb Rice) - <https://rasamalaysia.com/nasi-ulam-recipe/>'

In [17]:
memory

CustomConversationBufferMemory(chat_memory=ChatMessageHistory(messages=[HumanMessage(content='What are the ingredients in curry?', additional_kwargs={}, example=False), AIMessage(content='\n\nThe first three results are:\n\n1. Green Curry <https://rasamalaysia.com/green-curry/>\n2. Thai Red Curry <https://rasamalaysia.com/red-curry-recipe/>\n3. Curry Laksa (Curry Mee) <https://rasamalaysia.com/curry-laksa-recipe/>', additional_kwargs={}, example=False), HumanMessage(content='What is the recipe for Nasi Lemak?', additional_kwargs={}, example=False), AIMessage(content='\n\nThe links and names of the first three results are:\n\n1. Nasi Lemak - <https://rasamalaysia.com/nasi-lemak-recipe/>\n2. Nasi Goreng (Indonesian Fried Rice) - <https://rasamalaysia.com/nasi-goreng-indonesian-fried-rice/>\n3. Nasi Ulam (Malaysian Mixed Herb Rice) - <https://rasamalaysia.com/nasi-ulam-recipe/>', additional_kwargs={}, example=False)]), output_key=None, input_key=None, return_messages=False, human_prefix='

In [18]:
agent_chain.run(input="second option")



> Entering new AgentExecutor chain...


Llama.generate: prefix-match hit


Thought: I should extract the ingredients from the second recipe
Action: get_ingredients_by_url
Action Input: <https://rasamalaysia.com/nasi-goreng-indonesian-fried-rice/>
Thought: I should extract the ingredients from the second recipe
Action: get_ingredients_by_url
Action Input: <https://rasamalaysia.com/nasi-goreng-indonesian-fried-rice/>

> Entering new GetIngredientsFromContentChain chain...



llama_print_timings:        load time = 10879.74 ms
llama_print_timings:      sample time =    42.10 ms /    61 runs   (    0.69 ms per token,  1449.07 tokens per second)
llama_print_timings: prompt eval time =  6066.07 ms /   142 tokens (   42.72 ms per token,    23.41 tokens per second)
llama_print_timings:        eval time =  4938.55 ms /    60 runs   (   82.31 ms per token,    12.15 tokens per second)
llama_print_timings:       total time = 11141.33 ms


['▢ 1 clove shallot▢ 1 clove garlic▢ 1 red chili (seeded)▢ 1/2 teaspoon toasted belacan (terasi)▢ 1/2 teaspoon palm sugar▢ 1/2 tablespoon kecap manis▢ 8 oz. overnight rice▢ 1 fried egg (well done)▢ 2 tablespoons oil']
Prompt after formatting:
[32;1m[1;3m You will list all the ingredients and their quantities based on this text: '['▢ 1 clove shallot▢ 1 clove garlic▢ 1 red chili (seeded)▢ 1/2 teaspoon toasted belacan (terasi)▢ 1/2 teaspoon palm sugar▢ 1/2 tablespoon kecap manis▢ 8 oz. overnight rice▢ 1 fried egg (well done)▢ 2 tablespoons oil']' Add ### line before listing the ingredients. Add === line after listing all the ingredients. List down the ingredients, like: ingredient A - X unit of measurement, ingredient B - Y unit of measurement, ingredient C - Z unit of measurement.[0m


Llama.generate: prefix-match hit




Here is the list of ingredients based on the text you provided:

### Ingredients:

* 1 clove shallot - 1
* 1 clove garlic - 1
* 1 red chili (seeded) - 1
* 1/2 teaspoon toasted belacan (terasi) - 1/2
* 1/2 teaspoon palm sugar - 1/2
* 1/2 tablespoon kecap manis - 1/2
* 8 oz. overnight rice - 8 oz
* 1 fried egg (well done) - 1
* 2 tablespoons oil - 2

=== Total ingredients: 7
> Finished chain.

Observation:

Here is the list of ingredients based on the text you provided:

### Ingredients:

* 1 clove shallot - 1
* 1 clove garlic - 1
* 1 red chili (seeded) - 1
* 1/2 teaspoon toasted belacan (terasi) - 1/2
* 1/2 teaspoon palm sugar - 1/2
* 1/2 tablespoon kecap manis - 1/2
* 8 oz. overnight rice - 8 oz
* 1 fried egg (well done) - 1
* 2 tablespoons oil - 2

=== Total ingredients: 7[0m
[32;1m[1;3m[0m

[1m> Finished chain.[0m



llama_print_timings:        load time = 16133.29 ms
llama_print_timings:      sample time =   120.12 ms /   174 runs   (    0.69 ms per token,  1448.54 tokens per second)
llama_print_timings: prompt eval time =  5848.09 ms /   192 tokens (   30.46 ms per token,    32.83 tokens per second)
llama_print_timings:        eval time = 13323.64 ms /   173 runs   (   77.02 ms per token,    12.98 tokens per second)
llama_print_timings:       total time = 19567.87 ms


'\n\nHere is the list of ingredients based on the text you provided:\n\n### Ingredients:\n\n* 1 clove shallot - 1\n* 1 clove garlic - 1\n* 1 red chili (seeded) - 1\n* 1/2 teaspoon toasted belacan (terasi) - 1/2\n* 1/2 teaspoon palm sugar - 1/2\n* 1/2 tablespoon kecap manis - 1/2\n* 8 oz. overnight rice - 8 oz\n* 1 fried egg (well done) - 1\n* 2 tablespoons oil - 2\n\n=== Total ingredients: 7'

In [19]:
memory

CustomConversationBufferMemory(chat_memory=ChatMessageHistory(messages=[HumanMessage(content='What are the ingredients in curry?', additional_kwargs={}, example=False), AIMessage(content='\n\nThe first three results are:\n\n1. Green Curry <https://rasamalaysia.com/green-curry/>\n2. Thai Red Curry <https://rasamalaysia.com/red-curry-recipe/>\n3. Curry Laksa (Curry Mee) <https://rasamalaysia.com/curry-laksa-recipe/>', additional_kwargs={}, example=False), HumanMessage(content='What is the recipe for Nasi Lemak?', additional_kwargs={}, example=False), AIMessage(content='\n\nThe links and names of the first three results are:\n\n1. Nasi Lemak - <https://rasamalaysia.com/nasi-lemak-recipe/>\n2. Nasi Goreng (Indonesian Fried Rice) - <https://rasamalaysia.com/nasi-goreng-indonesian-fried-rice/>\n3. Nasi Ulam (Malaysian Mixed Herb Rice) - <https://rasamalaysia.com/nasi-ulam-recipe/>', additional_kwargs={}, example=False)]), output_key=None, input_key=None, return_messages=False, human_prefix='

In [22]:
ingredients

[' Ingredients:\n\n* 1 1/2 tablespoons oil\n* 2 tbsp green curry paste (Maesri brand preferred)\n* 8 oz. chicken breast, cut into bite-sized pieces\n* 1/2 cup coconut milk\n* 1/2 cup water\n* 4 oz. bamboo shoot\n* 5 kaffir lime leaves (lightly bruised)\n* 2 red chilies (cut into thick strips)\n* 1 tablespoon fish sauce\n* 1 tablespoon sugar or palm sugar (preferred)\n* 1/4 cup Thai basil leaves\n\n',
 ' Ingredients:\n\n* 1 clove shallot - 1\n* 1 clove garlic - 1\n* 1 red chili (seeded) - 1\n* 1/2 teaspoon toasted belacan (terasi) - 1/2\n* 1/2 teaspoon palm sugar - 1/2\n* 1/2 tablespoon kecap manis - 1/2\n* 8 oz. overnight rice - 8 oz\n* 1 fried egg (well done) - 1\n* 2 tablespoons oil - 2\n\n']
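Each entry in `ingredients` matches the text that sits between the `###` and `===` markers requested in the prompt. The following is a minimal sketch of how such a section could be cut out of the raw model output. The function name `extract_between_markers` and the shortened `raw_output` string are illustrative assumptions, not the notebook's actual code.

```python
from typing import Optional

START_MARKER = "###"  # marker the prompt asks for before the ingredient list
END_MARKER = "==="    # marker the prompt asks for after the ingredient list


def extract_between_markers(raw: str) -> Optional[str]:
    """Return the text between the first START_MARKER and the next END_MARKER.

    Returns None when either marker is missing, so the caller can decide
    whether to retry the prompt instead of storing malformed text.
    """
    start = raw.find(START_MARKER)
    if start == -1:
        return None
    start += len(START_MARKER)
    end = raw.find(END_MARKER, start)
    if end == -1:
        return None
    return raw[start:end]


# Shortened version of the raw chain output shown above (hypothetical sample).
raw_output = (
    "\n\nHere is the list of ingredients based on the text you provided:\n\n"
    "### Ingredients:\n\n* 1 clove shallot - 1\n* 2 tablespoons oil - 2\n\n"
    "=== Total ingredients: 7"
)
section = extract_between_markers(raw_output)
print(repr(section))
```

Returning `None` rather than an empty string keeps "the model ignored the markers" distinguishable from "the model listed nothing", which matters when deciding whether to re-prompt.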

# Lessons Learned

Here are the lessons I learned from this notebook. One caveat: I used Llama 2 as a quantized 13B-parameter chat model (the `q4_0` file loaded at the top of the notebook), so a different LLM might yield different results.

## Llama 2 as an LLM
- This proof of concept taught me that the LLM needs several rounds of prompt iteration before it produces a somewhat satisfactory outcome.
- It is also prone to generating unrelated text after minute changes to the prompt, such as an additional space.
- The model also does not follow instructions very well; it often failed to produce the output that the prompt specified. In the run above, for example, it appended an unrequested "Total ingredients: 7" to the `===` line even though it had listed nine ingredients. The more robust 70B model might follow instructions better. However, it is not supported by llama.cpp at the time of writing, and text generation might be too slow even on the most powerful Apple silicon chip.
- Similarly, when used as an agent, the LLM tends to drift away from the conversation after a few turns, which highlights the need for proper memory management.
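One simple form of the memory management mentioned in the last point is to keep only the most recent exchanges in the prompt. The sketch below is a plain-Python illustration of that idea. The class `WindowedHistory` and its methods are hypothetical names and are not the `CustomConversationBufferMemory` used in this notebook.

```python
from collections import deque
from typing import Deque, List, Tuple

Message = Tuple[str, str]  # (role, content); roles here are "human" or "ai"


class WindowedHistory:
    """Keep only the most recent `max_turns` human/AI exchanges."""

    def __init__(self, max_turns: int = 2) -> None:
        # One turn is a human message plus the AI reply, hence 2 * max_turns.
        self._messages: Deque[Message] = deque(maxlen=2 * max_turns)

    def add_turn(self, human: str, ai: str) -> None:
        # Appending in pairs keeps the window aligned on whole turns.
        self._messages.append(("human", human))
        self._messages.append(("ai", ai))

    def messages(self) -> List[Message]:
        return list(self._messages)

    def as_prompt(self) -> str:
        # Flatten the retained turns into text that can be prepended to a prompt.
        return "\n".join(f"{role}: {content}" for role, content in self._messages)


history = WindowedHistory(max_turns=1)
history.add_turn("What are the ingredients in curry?", "Here are three curry recipes.")
history.add_turn("What is the recipe for Nasi Lemak?", "Here are three rice recipes.")
print(history.as_prompt())
```

With `max_turns=1`, only the Nasi Lemak exchange remains, so the older curry turn no longer competes for the model's limited context.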

## LangChain to Supercharge LLMs
- LangChain is a promising framework for automating workflows, especially for instructions that are not well-structured.
- LangChain's agent has the potential to be a game changer in completing tasks specified by humans and non-humans. However, the current limiting factor is the robustness of the underlying LLM, which I believe will improve over time.
- One limitation of the agent is that it might not always pick the correct tool for a given task, as shown in this example. One way to overcome this is to create multiple agents that each specialize in a specific task and then control the workflow with more traditional engineering methods.
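The last point can be illustrated with a small rule-based router: explicit code, not the LLM, decides which specialized handler runs. This is only a sketch under stated assumptions. The three handler functions are hypothetical placeholders that return tagged strings, standing in for real specialized agents or tools, and the routing rules are examples, not a tested design.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")


def search_recipes(query: str) -> str:
    # Placeholder for a specialized recipe-search agent (hypothetical).
    return f"search:{query}"


def get_ingredients(url: str) -> str:
    # Placeholder for a specialized ingredient-extraction agent (hypothetical).
    return f"ingredients:{url}"


def consolidate_ingredients() -> str:
    # Placeholder for a specialized consolidation agent (hypothetical).
    return "consolidate"


def route(user_input: str) -> str:
    """Choose a handler with explicit rules instead of letting the LLM pick a tool."""
    match = URL_PATTERN.search(user_input)
    if match:
        # Strip trailing punctuation such as the '>' of '<https://...>'.
        return get_ingredients(match.group(0).rstrip(">.,"))
    if "consolidate" in user_input.lower():
        return consolidate_ingredients()
    # Default: treat anything else as a recipe search.
    return search_recipes(user_input)


print(route("Get the ingredients from <https://rasamalaysia.com/green-curry/>"))
print(route("Please consolidate the ingredients"))
print(route("nasi lemak"))
```

The trade-off is flexibility: fixed rules cannot handle requests they were not written for, but they make tool selection predictable and easy to test.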

## Moving Forward
One of my hypotheses is that a more robust LLM will follow instructions more effectively; testing this could be a follow-up task for this notebook.

Furthermore, this notebook focuses purely on the LangChain implementation, which does not necessarily reflect real-world software development practice. A more comprehensive end-to-end approach might reveal additional insights and could be a project for the future.