## LangChain Prompts & Example Selectors
- A prompt is the text sent to the language model for processing.
- It serves as an instruction or query that elicits a specific response from the model.
- A prompt can be a short question or a detailed set of instructions, depending on the desired output.

In [1]:
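# Load environment variables and configure the Gemini client.
from dotenv import load_dotenv, dotenv_values
import google.generativeai as genai
from IPython.display import Markdown, display
import pandas as pd
import os

load_dotenv()  # loads variables from a local .env file (e.g. GOOGLE_API_KEY)
my_api_key = os.getenv("GOOGLE_API_KEY")
genai.configure(api_key=my_api_key)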

### String PromptTemplates
These prompt templates are used to format a single string, and generally are used for simpler inputs. For example, a common way to construct and use a PromptTemplate is as follows:

In [2]:
# Import Prompt
from langchain_core.prompts import PromptTemplate
from langchain_google_genai.llms import GoogleGenerativeAI

prompt_template = PromptTemplate.from_template("Tell me a joke about {topic}")

prompt = prompt_template.invoke({"topic": "cats"})
llm = GoogleGenerativeAI(model="models/text-bison-001")
result = llm.invoke(prompt)
print(result)

What do you call a cat that always gets what it wants?

A purr-suasive cat!
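
To make the mechanics concrete, here is a minimal standard-library sketch of what formatting a string template amounts to: find the placeholder names, check that each one has a value, and substitute. The helpers `template_variables` and `render` are hypothetical names used only for illustration; they are not LangChain functions, and the real `PromptTemplate` does more (validation, partials, other template formats).

```python
from string import Formatter


def template_variables(template: str) -> list[str]:
    """Return the placeholder names found in a str.format-style template."""
    return sorted({name for _, name, _, _ in Formatter().parse(template) if name})


def render(template: str, values: dict[str, str]) -> str:
    """Fill the template, failing loudly if a variable is missing."""
    missing = [v for v in template_variables(template) if v not in values]
    if missing:
        raise KeyError(f"missing template variables: {missing}")
    return template.format(**values)


template = "Tell me a joke about {topic}"
print(template_variables(template))
print(render(template, {"topic": "cats"}))
```

The rendered string is what ultimately gets sent to the model; the template only describes how to build it from the input dictionary.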


### ChatPromptTemplates
These prompt templates are used to format a list of messages. A ChatPromptTemplate is itself a list of message templates, one per message. For example, a common way to construct and use a ChatPromptTemplate is as follows:

In [3]:
from langchain_core.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant"),
    ("user", "Tell me a joke about {topic}")
])

prompt = prompt_template.invoke({"topic": "cats"})
from langchain_google_genai.chat_models import ChatGoogleGenerativeAI
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0.1)  # "chat-bison@001"
result = llm.invoke(prompt)
print(result.content)


Why don't cats play poker? 

Because they always have an ace up their sleeve! 😹 



In [4]:
from langchain_core.messages import HumanMessage, SystemMessage

result = llm.invoke(
    [
        SystemMessage(content="Answer only yes or no."),
        HumanMessage(content="Is apple a fruit?"),
    ]
)
print(result.content)


Yes. 



### MessagesPlaceholder
MessagesPlaceholder inserts a list of messages at a specific position in a chat prompt. The ChatPromptTemplate above formatted two messages, each built from a string. When the caller should instead pass in a whole list of messages (for example, a chat history) to be slotted into a particular spot, use MessagesPlaceholder.

In [5]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage

prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant"),
    MessagesPlaceholder("msgs")
])

template = prompt_template.invoke({"msgs": [HumanMessage(content="hi!")]})
result = llm.invoke(template)
result.content

'Hi there! 👋  How can I help you today? 😊 \n'
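
The splicing behaviour can be sketched without LangChain. In the toy model below, a prompt is a list of `(role, template)` pairs, and a pair whose role is the hypothetical marker `PLACEHOLDER` names the input key that holds a list of messages. Both `PLACEHOLDER` and `build_messages` are made up for this sketch and only approximate what the library does.

```python
# Hypothetical marker: the second element of such a pair is the input key
# that holds a list of already-built messages.
PLACEHOLDER = "placeholder"


def build_messages(parts, values):
    """Format string templates and splice message lists in at placeholders."""
    messages = []
    for role, content in parts:
        if role == PLACEHOLDER:
            # Insert the caller-supplied messages at this exact position.
            messages.extend(values[content])
        else:
            messages.append((role, content.format(**values)))
    return messages


parts = [
    ("system", "You are a helpful assistant"),
    (PLACEHOLDER, "msgs"),
    ("user", "Summarize our chat about {topic}"),
]
history = [("user", "hi!"), ("assistant", "Hello! How can I help?")]
messages = build_messages(parts, {"msgs": history, "topic": "cats"})
for role, text in messages:
    print(f"{role}: {text}")
```

The point of the placeholder is that the number of inserted messages is not fixed by the template: the same prompt works for an empty history or a long one.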

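### Few Shot Template
A few-shot prompt shows the model several worked examples before the real question. The cell below defines the examples and the template used to format each one.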

In [6]:
from langchain_core.prompts import PromptTemplate

example_prompt = PromptTemplate.from_template("Question: {question}\n{answer}")

examples = [
    {
        "question": "Who lived longer, Muhammad Ali or Alan Turing?",
        "answer": """
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali
""",
    },
    {
        "question": "When was the founder of craigslist born?",
        "answer": """
Are follow up questions needed here: Yes.
Follow up: Who was the founder of craigslist?
Intermediate answer: Craigslist was founded by Craig Newmark.
Follow up: When was Craig Newmark born?
Intermediate answer: Craig Newmark was born on December 6, 1952.
So the final answer is: December 6, 1952
""",
    },
    {
        "question": "Who was the maternal grandfather of George Washington?",
        "answer": """
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
""",
    },
    {
        "question": "Are both the directors of Jaws and Casino Royale from the same country?",
        "answer": """
Are follow up questions needed here: Yes.
Follow up: Who is the director of Jaws?
Intermediate answer: The director of Jaws is Steven Spielberg.
Follow up: Where is Steven Spielberg from?
Intermediate answer: The United States.
Follow up: Who is the director of Casino Royale?
Intermediate answer: The director of Casino Royale is Martin Campbell.
Follow up: Where is Martin Campbell from?
Intermediate answer: New Zealand.
So the final answer is: No
""",
    },
]

In [None]:
from langchain_core.prompts import FewShotPromptTemplate

prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    suffix="Question: {input}",
    input_variables=["input"],
)

# prompt = prompt.invoke({"input": "Who was the father of Mary Ball Washington?"}).to_string()

prompt = prompt.invoke({"input": "Which mountain is higher - Mt Everest or Sandia?"}).to_string()

result = llm.invoke(prompt)
result.content
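
As a rough picture of what the few-shot template produces, the sketch below formats each example with the per-example template, appends the formatted suffix, and joins the pieces with a blank line, mirroring the blank-line separation visible in the printed prompts later in this notebook. `build_few_shot_prompt` is a hypothetical helper, and the example answers are shortened to their final line for brevity.

```python
def build_few_shot_prompt(examples, example_template, suffix,
                          prefix="", separator="\n\n", **inputs):
    """Join an optional prefix, the formatted examples, and the formatted suffix."""
    pieces = [prefix] if prefix else []
    pieces += [example_template.format(**example) for example in examples]
    pieces.append(suffix.format(**inputs))
    return separator.join(pieces)


# Shortened versions of two examples from the cell above.
toy_examples = [
    {"question": "Who lived longer, Muhammad Ali or Alan Turing?",
     "answer": "So the final answer is: Muhammad Ali"},
    {"question": "When was the founder of craigslist born?",
     "answer": "So the final answer is: December 6, 1952"},
]

few_shot_text = build_few_shot_prompt(
    toy_examples,
    example_template="Question: {question}\n{answer}",
    suffix="Question: {input}",
    input="Who was the father of Mary Ball Washington?",
)
print(few_shot_text)
```

Because the suffix ends with the new question and no answer, the model is nudged to continue in the same question-then-reasoning format as the examples.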

### Example selectors
When you have many examples, you may need to choose which ones to include in the prompt. An example selector is the class responsible for making that choice.



| Name | Description |
|:----:|:-----------:|
| Similarity | Uses semantic similarity between inputs and examples to decide which examples to choose. |
| MMR | Uses Max Marginal Relevance between inputs and examples to decide which examples to choose. |
| Length | Selects examples based on how many can fit within a certain length. |
| Ngram | Uses ngram overlap between inputs and examples to decide which examples to choose. |

In [12]:
!pip install langchain_chroma


#### Select by similarity
This object selects examples based on similarity to the inputs. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs.
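
The ranking step described above can be sketched with plain Python. The two-dimensional vectors below are invented stand-ins for real embeddings (a real model returns hundreds of dimensions), and `select_by_similarity` is a hypothetical helper, not the library's implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_by_similarity(query_vec, example_vecs, k=1):
    """Return the names of the k examples most similar to the query."""
    ranked = sorted(
        example_vecs,
        key=lambda name: cosine_similarity(query_vec, example_vecs[name]),
        reverse=True,
    )
    return ranked[:k]


# Made-up vectors: first axis ~ "emotion", second axis ~ "weather".
toy_vectors = {
    "happy": [0.9, 0.1],
    "sunny": [0.2, 0.9],
    "windy": [0.1, 0.8],
}
query = [0.8, 0.2]  # pretend embedding of "worried"

print(select_by_similarity(query, toy_vectors, k=1))
print(select_by_similarity(query, toy_vectors, k=2))
```

In the real selector the vectors come from the embedding model and the nearest-neighbour search is delegated to the vector store.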



In [14]:
from langchain_chroma import Chroma
from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate
from langchain_google_genai import GoogleGenerativeAIEmbeddings

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}",
)

# Examples of a pretend task of creating antonyms.
examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"},
    {"input": "windy", "output": "calm"},
]

In [31]:
example_selector = SemanticSimilarityExampleSelector.from_examples(
    # The list of examples available to select from.
    examples,
    # The embedding class used to produce embeddings which are used to measure semantic similarity.
    GoogleGenerativeAIEmbeddings(model="models/text-embedding-004"),
    # The VectorStore class that is used to store the embeddings and do a similarity search over.
    Chroma,
    # The number of examples to produce.
    k=1,
)
similar_prompt = FewShotPromptTemplate(
    # We provide an ExampleSelector instead of examples.
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Input: {adjective}\nOutput:",
    input_variables=["adjective"],
)

# Input is a feeling, so should select the happy/sad example
prompt = similar_prompt.format(adjective="worried") 
result = llm.invoke(prompt)
result.content

'Input: worried\nOutput: **calm** \n'

In [20]:
# You can add new examples to the SemanticSimilarityExampleSelector as well
similar_prompt.example_selector.add_example(
    {"input": "enthusiastic", "output": "apathetic"}
)
print(similar_prompt.format(adjective="passionate"))

Give the antonym of every input

Input: enthusiastic
Output: apathetic

Input: passionate
Output:


#### Select by maximal marginal relevance (MMR)
The MaxMarginalRelevanceExampleSelector selects examples based on a combination of which examples are most similar to the inputs, while also optimizing for diversity. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs, and then iteratively adding them while penalizing them for closeness to already selected examples.
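
The iterative procedure can be sketched as follows. This is a simplified, standard-library version of the general MMR idea, not LangChain's code: the vectors are invented, `mmr_select` is a hypothetical helper, and the trade-off weight `lambda_mult=0.5` is an assumed value chosen for the illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr_select(query, vectors, k=2, lambda_mult=0.5):
    """Greedily pick k items, trading relevance against redundancy."""
    selected = []
    candidates = list(vectors)
    while candidates and len(selected) < k:
        best_name, best_score = None, -math.inf
        for name in candidates:
            relevance = cosine(query, vectors[name])
            # Penalty: similarity to the closest already-selected item.
            redundancy = max(
                (cosine(vectors[name], vectors[s]) for s in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
        candidates.remove(best_name)
    return selected


# Made-up vectors: "happy" and "energetic" are near-duplicates.
toy_vectors = {
    "happy": [1.0, 0.2],
    "energetic": [1.0, 0.3],
    "windy": [1.0, -1.0],
}
query = [1.0, 0.0]

by_similarity = sorted(toy_vectors, key=lambda n: cosine(query, toy_vectors[n]), reverse=True)[:2]
print(by_similarity)
print(mmr_select(query, toy_vectors, k=2))
```

Pure similarity picks the two near-duplicates, while MMR keeps the most relevant item and then prefers one that differs from it.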

In [39]:
!pip install faiss-cpu


In [40]:

from langchain_core.example_selectors import MaxMarginalRelevanceExampleSelector
from langchain_community.vectorstores import FAISS

mmr_example_selector = MaxMarginalRelevanceExampleSelector.from_examples(
    # The list of examples available to select from.
    examples,
    # The embedding class used to produce embeddings which are used to measure semantic similarity.
    GoogleGenerativeAIEmbeddings(model="models/text-embedding-004"),
    # The VectorStore class that is used to store the embeddings and do a similarity search over.
    FAISS,
    # The number of examples to produce.
    k=2,
)
mmr_prompt = FewShotPromptTemplate(
    # We provide an ExampleSelector instead of examples.
    example_selector=mmr_example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Input: {adjective}\nOutput:",
    input_variables=["adjective"],
)

# Input is a feeling, so the happy/sad example should be selected first;
# MMR then adds a dissimilar example for diversity.
prompt = mmr_prompt.format(adjective="worried") 
prompt


'Give the antonym of every input\n\nInput: happy\nOutput: sad\n\nInput: windy\nOutput: calm\n\nInput: worried\nOutput:'

#### Select by length
This example selector selects which examples to use based on length. This is useful when you are worried about constructing a prompt that will go over the length of the context window. For longer inputs, it will select fewer examples to include, while for shorter inputs it will select more.
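
The budget logic can be sketched in a few lines. The sketch assumes length is counted by splitting on spaces and newlines, the same default word count quoted in the code comment in the cell below; `select_by_length` is a hypothetical helper that approximates the selector's behaviour.

```python
import re


def text_length(text: str) -> int:
    """Count words by splitting on newlines and spaces."""
    return len(re.split("\n| ", text))


def select_by_length(examples, template, user_input, max_length=25):
    """Add examples in order until the length budget is used up."""
    remaining = max_length - text_length(user_input)
    chosen = []
    for example in examples:
        cost = text_length(template.format(**example))
        if remaining - cost < 0:
            break
        chosen.append(example)
        remaining -= cost
    return chosen


antonyms = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"},
    {"input": "windy", "output": "calm"},
]
template = "Input: {input}\nOutput: {output}"

short_input = "big"
long_input = " ".join(["very"] * 20 + ["big"])  # 21 words

print(len(select_by_length(antonyms, template, short_input)))
print(len(select_by_length(antonyms, template, long_input)))
```

Each formatted example costs 4 words, so a 1-word input leaves room for all five examples within a budget of 25, while a 21-word input leaves room for only one.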

In [43]:
from langchain_core.example_selectors import LengthBasedExampleSelector
from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

# Examples of a pretend task of creating antonyms.
examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
    {"input": "energetic", "output": "lethargic"},
    {"input": "sunny", "output": "gloomy"},
    {"input": "windy", "output": "calm"},
]

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}",
)
example_selector = LengthBasedExampleSelector(
    # The examples it has available to choose from.
    examples=examples,
    # The PromptTemplate being used to format the examples.
    example_prompt=example_prompt,
    # The maximum length that the formatted examples should be.
    # Length is measured by the get_text_length function below.
    max_length=25,
    # The function used to get the length of a string, which is used
    # to determine which examples to include. It is commented out because
    # it is provided as a default value if none is specified.
    # get_text_length: Callable[[str], int] = lambda x: len(re.split("\n| ", x))
)
dynamic_prompt = FewShotPromptTemplate(
    # We provide an ExampleSelector instead of examples.
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Input: {adjective}\nOutput:",
    input_variables=["adjective"],
)
# An example with small input, so it selects all examples.
print(dynamic_prompt.format(adjective="big"))

Give the antonym of every input

Input: happy
Output: sad

Input: tall
Output: short

Input: energetic
Output: lethargic

Input: sunny
Output: gloomy

Input: windy
Output: calm

Input: big
Output:


#### Select by n-gram overlap
The NGramOverlapExampleSelector selects and orders examples based on which examples are most similar to the input, according to an ngram overlap score. The ngram overlap score is a float between 0.0 and 1.0, inclusive.

The selector allows a threshold score to be set. Examples with an ngram overlap score less than or equal to the threshold are excluded. The default threshold is -1.0, so by default no examples are excluded; they are only reordered. Setting the threshold to 0.0 excludes examples that have no ngram overlap with the input.
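
The threshold behaviour can be illustrated with a deliberately simple score. The sketch below uses a plain unigram-overlap ratio as a stand-in; LangChain's own score is computed differently (it depends on NLTK), so the numbers here are illustrative only. `ngrams`, `overlap_score`, and `select_by_overlap` are hypothetical helpers.

```python
import re


def ngrams(text, n=1):
    """Set of lowercase word n-grams in the text, ignoring punctuation."""
    tokens = re.findall(r"\w+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_score(query, example_text, n=1):
    """Fraction of the example's n-grams that also occur in the query (0.0 to 1.0)."""
    example_grams = ngrams(example_text, n)
    if not example_grams:
        return 0.0
    return len(ngrams(query, n) & example_grams) / len(example_grams)


def select_by_overlap(examples, query, threshold=-1.0):
    """Drop examples scoring <= threshold, then sort the rest by score."""
    scored = [(overlap_score(query, ex["input"]), ex) for ex in examples]
    kept = [(score, ex) for score, ex in scored if score > threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [ex for _, ex in kept]


translations = [
    {"input": "See Spot run.", "output": "Ver correr a Spot."},
    {"input": "My dog barks.", "output": "Mi perro ladra."},
    {"input": "Spot can run.", "output": "Spot puede correr."},
]
query = "Spot can run fast."

reordered = [ex["input"] for ex in select_by_overlap(translations, query)]
filtered = [ex["input"] for ex in select_by_overlap(translations, query, threshold=0.0)]
print(reordered)
print(filtered)
```

With the default threshold every example survives and only the order changes; with a threshold of 0.0 the example that shares no words with the input is dropped.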

In [51]:
from langchain_community.example_selectors.ngram_overlap import NGramOverlapExampleSelector
from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}",
)

# Examples of a fictional translation task.
examples = [
    {"input": "See Spot run.", "output": "Ver correr a Spot."},
    {"input": "My dog barks.", "output": "Mi perro ladra."},
    {"input": "Spot can run.", "output": "Spot puede correr."},
]

In [53]:
example_selector = NGramOverlapExampleSelector(
    # The examples it has available to choose from.
    examples=examples,
    # The PromptTemplate being used to format the examples.
    example_prompt=example_prompt,
    # The threshold at which the selector stops.
    # It is set to -1.0 by default.
    threshold=-1.0,
    # For negative threshold:
    # Selector sorts examples by ngram overlap score, and excludes none.
    # For threshold greater than 1.0:
    # Selector excludes all examples, and returns an empty list.
    # For threshold equal to 0.0:
    # Selector sorts examples by ngram overlap score,
    # and excludes those with no ngram overlap with input.
)
dynamic_prompt = FewShotPromptTemplate(
    # We provide an ExampleSelector instead of examples.
    example_selector=example_selector,
    example_prompt=example_prompt,
    prefix="Give the Spanish translation of every input",
    suffix="Input: {sentence}\nOutput:",
    input_variables=["sentence"],
)
# An example input with large ngram overlap with "Spot can run."
# and no overlap with "My dog barks."
print(dynamic_prompt.format(sentence="Spot can run fast."))

Give the Spanish translation of every input

Input: Spot can run.
Output: Spot puede correr.

Input: See Spot run.
Output: Ver correr a Spot.

Input: My dog barks.
Output: Mi perro ladra.

Input: Spot can run fast.
Output:


### Example Selectors Use Cases
The table below summarizes typical application scenarios and use cases for each type of example selector, highlighting where each one is most useful.

| **Example Selector** | **Scenario**                          | **Use Case**                                                                                          |
|----------------------|---------------------------------------|-------------------------------------------------------------------------------------------------------|
| **Similarity**       | Customer Support Chatbots             | Selecting semantically similar past queries and resolutions for relevant responses.                   |
|                      | Personalized Learning                 | Providing tailored learning material by selecting examples similar to student queries.                |
|                      | E-commerce Product Recommendations    | Recommending products based on semantic similarity to user browsing history.                          |
|                      | Mental Health Chatbot                 | Providing contextually appropriate and empathetic responses based on similarity to previous cases.    |
| **MMR**              | News Article Summarization            | Selecting relevant and diverse sentences to create comprehensive summaries.                           |
|                      | Content Recommendation Systems        | Balancing relevance and novelty in content recommendations.                                           |
|                      | Academic Research Paper Summarization | Ensuring summaries include both the most relevant and diverse points.                                 |
|                      | Playlist Generation                   | Creating music playlists that match user preferences while introducing variety.                       |
| **Length**           | SMS or Tweet Generation               | Ensuring generated content fits within character limits for platforms like SMS and Twitter.           |
|                      | Mobile App Responses                  | Selecting concise responses that fit within screen space constraints.                                 |
|                      | Real-Time Translation                 | Providing translations that are concise and fit the interface constraints.                            |
|                      | Automated Report Generation           | Adhering to length requirements for readable and relevant summaries or reports.                       |
| **Ngram**            | Legal Document Analysis               | Ensuring responses align with specific legal terminology and language.                                |
|                      | Plagiarism Detection                  | Identifying text with high similarity to existing documents for accurate plagiarism detection.        |
|                      | Code Autocompletion                   | Improving code suggestions by selecting snippets with high ngram overlap to the current input.        |
|                      | Historical Text Analysis              | Identifying recurring themes or concepts in historical documents.                                     |

