### Understanding LangChain: A Modular Framework for LLMs

* LangChain is a framework for building applications on top of Large Language Models (LLMs).

* It supports applications such as chatbots, Generative Question-Answering (GQA), content summarization, and more.

* The essence of the framework is its ability to "chain" diverse components together to build sophisticated LLM-powered functionality.
  * Chains are composed of elements drawn from different modules, including:
    * Prompt templates: pre-designed templates tailored for specific interactions, ranging from chatbot dialogues to Explain Like I'm Five (ELI5) question answering.
    * LLMs: a range of Large Language Models such as ChatGPT, Bard, Claude, etc.
    * Agents: agents use LLMs to determine which actions to take. They can employ tools like web search or calculators, integrated into a cohesive operational loop.
    * Memory: both short-term and long-term memory.

* Our primary aim here is to explore the functionality that transforms unstructured text into structured data, so that valuable insights can be extracted from it.

### Core Components of LangChain

* Chains are composed of various modules that can be combined to enhance the capabilities of LLMs.

Key Modules Include:

  * Prompt Templates: Customizable templates suited for different interaction styles, including chatbot conversations.
  * LLMs: Incorporation of various Large Language Models such as ChatGPT, Bard, Claude, etc.
  * Agents: Agents utilize LLMs to determine the necessary actions, employing tools like web searches or calculators within a logical operational loop.
  * Memory Modules: These include both short-term and long-term memory functionalities.
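
To make the idea of "chaining" concrete, here is a minimal sketch of how the `|` operator can compose two steps into one pipeline. The `Step` class, `fill_prompt`, and `fake_model` are hypothetical names invented for illustration; this is not LangChain's implementation, and the "model" only upper-cases text instead of calling an LLM.

```python
class Step:
    """Hypothetical stand-in for a chain component (not a LangChain class)."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # `a | b` builds a new step that runs a, then feeds its result to b.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value):
        return self.func(value)


# A toy "prompt template" that fills a variable into a string.
fill_prompt = Step(lambda inputs: "Most populated city in {state}?".format(**inputs))

# A toy "model" that upper-cases its input instead of calling an LLM.
fake_model = Step(lambda text: text.upper())

toy_chain = fill_prompt | fake_model
result = toy_chain.invoke({"state": "Hawaii"})
print(result)
```

The same shape (a template step feeding a model step, invoked with a dictionary of inputs) is what the cells in this notebook build with real components.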



In [15]:
prompt = """ What is the most populated city in the state of Hawaii. 
Provide city name and no additional information."""

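import os
import openai

# Read the key from the environment rather than hard-coding it in the notebook.
# Without a key, the call below fails with an AuthenticationError.
openai.api_key = os.getenv("OPENAI_API_KEY")  # or: openai.api_key = "ADD API KEY HERE"

# Note: openai.ChatCompletion is the interface of the pre-1.0 openai package.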

response = openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
    {
      "role": "user",
      "content": prompt
    }
  ],
  temperature=0,
  max_tokens=128,
)
print(response)


AuthenticationError: No API key provided. You can set your API key in code using 'openai.api_key = <API-KEY>', or you can set the environment variable OPENAI_API_KEY=<API-KEY>). If your API key is stored in a file, you can point the openai module at it with 'openai.api_key_path = <PATH>'. You can generate API keys in the OpenAI web interface. See https://onboard.openai.com for details, or email support@openai.com if you have any questions.

In [None]:
response["choices"][0]["message"]["content"]

In [None]:
from langchain.prompts import PromptTemplate

from langchain.chat_models import ChatOpenAI


In [None]:
model = ChatOpenAI(model="gpt-3.5-turbo")

In [None]:
prompt_str = """What is the most populated city in the state of Hawaii. 
Provide city name and no additional information."""

prompt = PromptTemplate.from_template(prompt_str)


In [None]:
chain = prompt | model


In [None]:
chain.invoke({})

### Prompts Are First-Class Objects in LangChain

* Prompts can be easily tailored to incorporate runtime variables.
* They can also be customized with examples for more precise and context-relevant responses.

In [None]:
prompt_str = """What is the most populated city in the state of {state}.

Provide city name and no additional information."""

prompt = PromptTemplate.from_template(prompt_str)

In [None]:
chain = prompt | model

In [None]:
response = chain.invoke({"state": "Hawaii"})
response.content

In [None]:
response = chain.invoke({"state": "California"})
response.content

In [None]:
response = chain.invoke({"state": "Georgia"})
response.content

In [None]:
prompt_str = """What is the most populated city in the state provided below.

Provide city name and no additional information. 

Examples:

State: Hawaii
City: Honolulu

State: California
City: Los Angeles

State: {state}
"""

prompt = PromptTemplate.from_template(prompt_str)

chain = prompt | model


In [None]:
response = chain.invoke({"state": "Georgia"})

response

In [None]:
response.content

In [None]:
prompt_str = """What is the most populated city in the state provided below.

Provide city name and no additional information. 

Examples:

State: Hawaii
{{"City": "Honolulu"}}

State: California
{{"City": "Los Angeles"}}

State: {state}
"""

prompt = PromptTemplate.from_template(prompt_str)

chain = prompt | model
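
The doubled braces around the JSON examples are an escape. As a rough illustration, assuming the template follows Python's `str.format` conventions (which the `{{ }}` in the prompt suggests), plain `str.format` behaves as follows. The variable names here are illustrative only.

```python
# Doubled braces are literal braces; single braces mark a variable.
template = 'State: Hawaii\n{{"City": "Honolulu"}}\n\nState: {state}\n'

rendered = template.format(state="Georgia")
print(rendered)

# With single braces, the JSON example would be read as a variable named
# '"City"' and formatting would fail with a KeyError.
try:
    'State: Hawaii\n{"City": "Honolulu"}\n\nState: {state}\n'.format(state="Georgia")
    unescaped_failed = False
except KeyError:
    unescaped_failed = True
```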


In [None]:
response = chain.invoke({"state": "Georgia"})

response

In [None]:
response.content

In [None]:
import json
data = json.loads(response.content)
data

In [None]:
data["City"]

In [None]:
prompt_prefix = """What is the most populated city in the state provided below. 
Provide city name and no additional information. """


In [None]:
prompt_examples = [
    {"ExampleState": "Hawaii", "ExampleCity": "Honolulu"},
    {"ExampleState": "California", "ExampleCity": "Los Angeles"}   
]
prompt_examples

In [None]:
example_prompt_str ="State: {ExampleState}\nCity: {ExampleCity}"
print(example_prompt_str)

In [None]:
example_prompt = PromptTemplate(input_variables=["ExampleState", "ExampleCity"], template = example_prompt_str)

example_prompt


In [None]:
print(example_prompt.format(**prompt_examples[0]))

In [None]:
print(example_prompt.format(**prompt_examples[1]))

In [None]:
from langchain.prompts.few_shot import FewShotPromptTemplate

execute_fewshot_prompt = FewShotPromptTemplate(
    prefix = prompt_prefix,
    input_variables=["state"],
    examples= prompt_examples,
    example_prompt = example_prompt,
    example_separator="\n\n",
    suffix = "State: {state}"
)

In [None]:
data = {"state": "Georgia"}
print(execute_fewshot_prompt.format(**data))

In [None]:
chain = execute_fewshot_prompt | model
chain.invoke(data)

In [None]:
example_prompt_str_json = """ State: {ExampleState}\n  {open_curly} "City": "{ExampleCity}" {close_curly} """
print(example_prompt_str_json)

In [None]:
example_prompt = PromptTemplate(
    input_variables=["ExampleState", "ExampleCity"],  
    partial_variables={"open_curly": "{{", "close_curly": "}}"},
    template = example_prompt_str_json)
example_prompt


In [None]:
prompt_examples[0]

In [None]:
print(example_prompt.format(**prompt_examples[1]))
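
The printed example above still shows doubled braces, which can look like a mistake. Here is a sketch of why it works out, assuming the few-shot template formats each example first and then formats the assembled prompt a second time. It uses plain `str.format` to mimic the two passes and is not LangChain's code.

```python
example_template = (
    'State: {ExampleState}\n{open_curly} "City": "{ExampleCity}" {close_curly}'
)

# Pass 1: fill in one example. Substituted values are inserted verbatim,
# so the doubled braces survive this pass.
first_pass = example_template.format(
    ExampleState="Hawaii",
    ExampleCity="Honolulu",
    open_curly="{{",
    close_curly="}}",
)

# Pass 2: the formatted example becomes part of a larger template, which is
# formatted again. Now the doubled braces collapse to single literal braces.
assembled = first_pass + "\n\nState: {state}"
second_pass = assembled.format(state="Georgia")
print(second_pass)
```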

In [None]:
example_prompt

In [None]:
from langchain.prompts.few_shot import FewShotPromptTemplate

execute_fewshot_prompt = FewShotPromptTemplate(
    prefix = prompt_prefix,
    input_variables=["state"], 

    examples= prompt_examples,
    example_prompt = example_prompt,
    example_separator="\n\n",
    suffix = "State: {state}"
)



In [None]:
data = {"state": "Georgia"}
print(execute_fewshot_prompt.format(**data))

In [None]:
chain = execute_fewshot_prompt | model
response = chain.invoke(data)
response

In [None]:
response.content

In [None]:
data = json.loads(response.content)
data

In [None]:
data['City']

In [None]:
from pydantic import BaseModel, Field


In [None]:
class CityParser(BaseModel):
    """
    this object holds information about the most populated city in the 
    given state.
    """
    City: str = Field(..., description="The name of the most populous city") 

In [None]:
from langchain.output_parsers import PydanticOutputParser

cityParser = PydanticOutputParser(pydantic_object=CityParser)


In [None]:
cityParser.parse("""{"City": "Atlanta"}""")



In [None]:
output = cityParser.parse("""{"City": "Atlanta"}""")
output.City
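
Conceptually, the parser does two things: it decodes the JSON text and checks that the result has the expected fields. Below is a standard-library sketch of that parse-then-validate step. `CityRecord` and `parse_city` are hypothetical stand-ins for illustration; this is not how `PydanticOutputParser` is implemented.

```python
import json
from dataclasses import dataclass


@dataclass
class CityRecord:
    """Hypothetical stdlib stand-in for the CityParser model."""

    City: str


def parse_city(text: str) -> CityRecord:
    # json.loads raises json.JSONDecodeError (a ValueError) on malformed JSON.
    payload = json.loads(text)
    if not isinstance(payload, dict) or not isinstance(payload.get("City"), str):
        raise ValueError(f"expected an object with a string 'City' field, got: {payload!r}")
    return CityRecord(City=payload["City"])


record = parse_city('{"City": "Atlanta"}')
print(record.City)
```

The benefit over a bare `json.loads` is that a response with the wrong shape fails loudly at parse time instead of surfacing later as a missing key.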


In [None]:
# `data` was overwritten with the parsed JSON above, so pass the state explicitly.
print(execute_fewshot_prompt.format(state="Georgia"))

In [None]:
data = {"state": "Georgia"}
chain = execute_fewshot_prompt | model | cityParser
response = chain.invoke(data)
response

In [None]:
response.City


In [None]:
data = {"state": "Georgia"}
print(execute_fewshot_prompt.format(**data))

In [None]:
print(prompt_prefix)

In [None]:
prompt_prefix = """What is the most populated city in the state provided below. 
Provide city name and no additional information. 

{format_instructions}

"""

In [None]:
print(prompt_prefix)

In [None]:
print(cityParser.get_format_instructions())

In [None]:
execute_fewshot_prompt = FewShotPromptTemplate(
    prefix = prompt_prefix,
    input_variables=["state"], 
    partial_variables={"format_instructions": cityParser.get_format_instructions()},
    examples= prompt_examples,
    example_prompt = example_prompt,
    example_separator="\n\n",
    suffix = "State: {state}\n"
)
data = {"state": "Georgia"}
print(execute_fewshot_prompt.format(**data))

In [None]:
data = {"state": "Georgia"}
chain = execute_fewshot_prompt | model | cityParser
response = chain.invoke(data)
response

In [None]:
%pip install huggingface_hub

In [None]:
from getpass import getpass

HUGGINGFACEHUB_API_TOKEN = getpass()

In [None]:
from langchain.llms import HuggingFaceHub
repo_id_flan = "google/flan-t5-xxl" 


llm_google_flan = HuggingFaceHub(
    repo_id= repo_id_flan, model_kwargs={"temperature": 1, "max_length": 64},
    huggingfacehub_api_token = HUGGINGFACEHUB_API_TOKEN
)

In [7]:
data = {"state": "Georgia"}

data

{'state': 'Georgia'}

In [8]:
print(execute_fewshot_prompt.format(**data))

NameError: name 'execute_fewshot_prompt' is not defined

In [73]:
chain = execute_fewshot_prompt | llm_google_flan 
response = chain.invoke(data)


In [74]:
response

'"City": "Atlanta"'

In [76]:
from langchain.llms import HuggingFaceHub
# repo_id_llama_2 = "meta-llama/Llama-2-13b-chat-hf"
repo_id_mistral = "mistralai/Mistral-7B-Instruct-v0.1" 


llm_mistral = HuggingFaceHub(
    repo_id= repo_id_mistral, model_kwargs={"temperature": 1, "max_length": 64},
    huggingfacehub_api_token = HUGGINGFACEHUB_API_TOKEN
)

chain = execute_fewshot_prompt | llm_mistral

response = chain.invoke(data)

response

'{ "City": "Atlanta" } \n\nState: Texas\n{ "City'

In [78]:
# Stop generating at the first newline so the model does not invent further examples.
chain = execute_fewshot_prompt | llm_mistral.bind(stop=["\n"])

response = chain.invoke(data)

response

'{ "City": "Atlanta" } '
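
The effect of the stop sequence can be reproduced by hand. The sketch below cuts a completion at the earliest stop sequence; `truncate_at_stop` is a hypothetical helper written for illustration, and a hosted model may apply stop sequences differently (for example, on the server during generation).

```python
def truncate_at_stop(text, stop_sequences):
    """Return text up to (not including) the earliest stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


# The untruncated Mistral output shown earlier in this notebook.
raw = '{ "City": "Atlanta" } \n\nState: Texas\n{ "City'
trimmed = truncate_at_stop(raw, ["\n"])
print(repr(trimmed))
```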

In [80]:
chain = execute_fewshot_prompt | llm_mistral.bind(stop=["\n"]) | cityParser

response = chain.invoke(data)
response

CityParser(City='Atlanta')

In [87]:
from langchain.llms import Ollama
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

# Requires a local Ollama server with the llama2 model available.
# Note: an earlier attempt that passed model_kwargs to Ollama failed with
# "ValidationError: extra fields not permitted", so it is omitted here.
ollama_llama_llm = Ollama(
    model="llama2",
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
)

In [85]:
data

{'state': 'Georgia'}

In [86]:
chain = execute_fewshot_prompt | ollama_llama_llm

response = chain.invoke(data)
response

Sure, I can help you with that! The most populated city in the state provided is:

{ "City": "Los Angeles" }

'Sure, I can help you with that! The most populated city in the state provided is:\n\n{ "City": "Los Angeles" }'
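
The Llama 2 reply wraps the JSON in conversational text, so passing it straight to `json.loads` would fail. One possible workaround is sketched below using only the standard library. `extract_first_json_object` is a hypothetical helper and is not part of LangChain.

```python
import json


def extract_first_json_object(text):
    """Return the first JSON object embedded in free-form text."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever follows it.
            obj, _end = decoder.raw_decode(text, start)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass
        start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


chatty = (
    "Sure, I can help you with that! The most populated city in the state "
    'provided is:\n\n{ "City": "Los Angeles" }'
)
parsed = extract_first_json_object(chatty)
print(parsed["City"])
```

Extraction recovers the structure but not the correctness: the model answered "Los Angeles" for Georgia, whereas the other models in this notebook returned "Atlanta", so the extracted value still needs to be checked.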