# Prompt Engineering with LangChain

Large language models (LLMs) can handle classification, question answering (tasks once handled by dedicated ML models or by DL models built on the transformer architecture), and much more.

These models are trained around a simple concept: you provide a sequence of text (the `prompt`) and the model produces a sequence of text (or an image). The only variable here is the input text, that is, the prompt.

In this new era of LLMs, prompts are king. Bad prompts produce bad results, and good prompts are unreasonably powerful. Writing good prompts is a crucial skill for anyone building with LLMs.

The LangChain library recognizes the power of prompts and provides a full set of objects for working with them.


![Alt text](image.png)

A prompt can be made up of four components: instructions, external information, user input, and an output indicator. Not every prompt uses all of them, but a good prompt often uses two or more. Let's define them more precisely.

- Instructions tell the model what to do, how to use external information if it is provided, what to do with the query, and how to construct the output.

- External information or context acts as an additional source of knowledge for the model. It can be inserted into the prompt manually, retrieved from a vector database (retrieval augmentation), or pulled in by other means (APIs, calculations, etc.).

- User input or query is typically (but not always) a query entered into the system by a human user (the prompter).

- The output indicator marks the beginning of the text to be generated. When generating Python code, we can use `import` to signal to the model that it should start writing Python code (since most Python scripts begin with `import`).

Each component is usually placed in the prompt in the following order:

1. Instructions,
2. External information (if any),
3. Prompter input,
4. Output indicator.
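
As a rough illustration, the ordering above can be sketched in plain Python. The `build_prompt` helper and its parameter names are hypothetical and are not part of LangChain; the sketch only shows how the four components line up.

```python
def build_prompt(instructions, query, output_indicator, context=None):
    """Assemble a prompt in the order: instructions, context, query, output indicator.

    Hypothetical helper for illustration only (not a LangChain API).
    """
    parts = [instructions.strip()]
    if context:  # external information is optional
        parts.append("Context: " + context.strip())
    parts.append("Question: " + query.strip())
    # The output indicator goes last so the model continues right after it
    parts.append(output_indicator)
    return "\n\n".join(parts)


prompt = build_prompt(
    instructions="Answer the question based on the context below.",
    context="LangChain provides prompt template objects.",
    query="What does LangChain provide?",
    output_indicator="Answer:",
)
print(prompt)
```

Leaving out `context` simply drops the second component, which matches the "if any" in the list above.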

In [4]:
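import os
# AzureChatOpenAI now lives in the langchain_openai package
# (older releases exposed it as langchain.chat_models.AzureChatOpenAI)
from langchain_openai import AzureChatOpenAI
from dotenv import load_dotenv, find_dotenv

# Load environment variables (API keys, endpoints) from a local .env file
_ = load_dotenv(find_dotenv())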

In [2]:
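# Import from langchain.prompts; the top-level `from langchain import ...` form is deprecated
from langchain.prompts import PromptTemplate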

template = """
Answer the question based on the context below. If the
question cannot be answered using the information provided answer
with "I don't know".

Context: Large Language Models (LLMs) are the latest models used in NLP.
Their superior performance over smaller models has made them incredibly
useful for developers building NLP enabled applications. These models
can be accessed via Hugging Face's `transformers` library, via OpenAI
using the `openai` library, and via Cohere using the `cohere` library.

Question: {query}

Answer: """

prompt_template = PromptTemplate(
    input_variables=["query"],
    template=template
)

# just to print the template
print(
    prompt_template.format(
        query="Which libraries and model providers offer LLMs?"
    )
)


Answer the question based on the context below. If the
question cannot be answered using the information provided answer
with "I don't know".

Context: Large Language Models (LLMs) are the latest models used in NLP.
Their superior performance over smaller models has made them incredibly
useful for developers building NLP enabled applications. These models
can be accessed via Hugging Face's `transformers` library, via OpenAI
using the `openai` library, and via Cohere using the `cohere` library.

Question: Which libraries and model providers offer LLMs?

Answer: 


We already know that creating a chain with LangChain means combining a language model with a prompt.
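
Conceptually, a chain of this kind does two things: it fills the prompt template with the user's input, then passes the resulting string to the model. The sketch below mimics that flow in plain Python; `TinyChain` and `fake_llm` are hypothetical stand-ins, not LangChain classes, and no real model is called.

```python
class TinyChain:
    """Hypothetical stand-in for a prompt + model chain (not a LangChain class)."""

    def __init__(self, template, llm):
        self.template = template  # string with a {query} placeholder
        self.llm = llm            # any callable mapping a prompt string to text

    def invoke(self, query):
        prompt = self.template.format(query=query)  # step 1: fill the template
        text = self.llm(prompt)                     # step 2: call the model
        return {"query": query, "text": text}


def fake_llm(prompt):
    # Assumption: a dummy model that only reports the prompt size
    return f"(model output for a {len(prompt)}-character prompt)"


chain = TinyChain("Question: {query}\n\nAnswer: ", fake_llm)
result = chain.invoke("Which libraries offer LLMs?")
print(result["text"])
```

The real chain in the next cell follows the same two steps, with the Azure-hosted model in place of the dummy callable.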

In [7]:
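# Import from langchain.chains; the top-level `from langchain import ...` form is deprecated
from langchain.chains import LLMChain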

# Model definition
llm = AzureChatOpenAI(
    openai_api_version="2023-09-01-preview",
    azure_endpoint=os.getenv('AZURE_API_ENDPOINT'),
    api_key=os.getenv('AZURE_OPENAI_KEY'),
    azure_deployment=os.getenv('OPENAI_DEPLOYMENT_NAME'),
    model_name=os.getenv('OPENAI_MODEL_NAME'),
    model_version=os.getenv('OPENAI_API_VERSION'),
    temperature=.7
)


# Create the LLMChain instance
llm_chain = LLMChain(
    prompt=prompt_template ,
    llm=llm
)

# Provide a question
question = """
List the libraries and model providers that offer LLMs in a Markdown table.
"""

# Get the answer; the dict key matches the template's {query} variable
answer = llm_chain.invoke({"query": question})
print(answer['text'])

| Library | LLM Provider |
|---------|--------------|
| `transformers` | Hugging Face |
| `openai` | OpenAI |
| `cohere` | Cohere |


## Few Shot Prompt

The success of LLMs comes from their large size and their ability to store "knowledge" in the model parameters, which are learned during training. However, there are several ways to pass knowledge to an LLM.

The two main methods are:

- Parametric knowledge: the knowledge mentioned above, that is, everything the model learned during training, which is stored in the model weights (or parameters).

- Source knowledge: any knowledge provided to the model at inference time through the input prompt.

LangChain's `FewShotPromptTemplate` handles source knowledge input. The idea is to "train" the model on a few examples (we call this few-shot learning), with the examples given to the model inside the prompt.

Few-shot learning is ideal when the model needs help understanding what we are asking it to do.

In [8]:
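# Import from langchain.prompts; the top-level `from langchain import ...` form is deprecated
from langchain.prompts import FewShotPromptTemplate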

# create our examples
examples = [
    {
        "query": "How are you?",
        "answer": "I can't complain but sometimes I still do."
    }, {
        "query": "What time is it?",
        "answer": "It's time to get a watch."
    }
]

# create an example template
example_template = """
User: {query}
AI: {answer}
"""

# create a prompt example from above template
example_prompt = PromptTemplate(
    input_variables=["query", "answer"],
    template=example_template
)

# now break our previous prompt into a prefix and suffix
# the prefix is our instructions
prefix = """The following are exerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative  and funny responses to the users questions. Here are some
examples: 
"""

# and the suffix our user input and output indicator
suffix = """
User: {query}
AI: """

# now create the few shot prompt template
few_shot_prompt_template = FewShotPromptTemplate(
    examples=examples, # to pass example
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n\n"
)

# print the complete prompt :
query = "What is the meaning of life?"

print(few_shot_prompt_template.format(query=query))

The following are exerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative  and funny responses to the users questions. Here are some
examples: 



User: How are you?
AI: I can't complain but sometimes I still do.



User: What time is it?
AI: It's time to get a watch.



User: What is the meaning of life?
AI: 
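
The layout of the printed prompt can be reproduced without LangChain. The sketch below assumes the template simply joins the prefix, each formatted example, and the suffix with `example_separator`, which matches the output above; `assemble_few_shot` is a hypothetical helper, not a library function, and the prefix is shortened for brevity.

```python
def assemble_few_shot(prefix, examples, example_template, suffix,
                      separator, **inputs):
    """Join prefix, formatted examples, and suffix with a separator.

    Hypothetical helper that mimics the printed layout (not LangChain code).
    """
    example_strings = [example_template.format(**ex) for ex in examples]
    # Only the prefix and suffix receive the user inputs
    pieces = [prefix.format(**inputs), *example_strings, suffix.format(**inputs)]
    # Skip empty pieces so no stray separators appear
    return separator.join(piece for piece in pieces if piece)


demo_examples = [
    {"query": "How are you?", "answer": "I can't complain but sometimes I still do."},
    {"query": "What time is it?", "answer": "It's time to get a watch."},
]
demo_prompt = assemble_few_shot(
    prefix="Here are some examples: \n",
    examples=demo_examples,
    example_template="\nUser: {query}\nAI: {answer}\n",
    suffix="\nUser: {query}\nAI: ",
    separator="\n\n",
    query="What is the meaning of life?",
)
print(demo_prompt)
```

Because the prefix, the example template, and the suffix each carry their own newlines, a `"\n\n"` separator yields three blank lines between blocks, as seen in the output above.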


In [9]:
# Limit how many examples are included so the prompt stays within
# the context window (input tokens + output tokens)
from langchain.prompts.example_selector import LengthBasedExampleSelector


# a larger pool of examples for the selector to choose from
examples = [
    {
        "query": "How are you?",
        "answer": "I can't complain but sometimes I still do."
    }, {
        "query": "What time is it?",
        "answer": "It's time to get a watch."
    }, {
        "query": "What is the meaning of life?",
        "answer": "42"
    }, {
        "query": "What is the weather like today?",
        "answer": "Cloudy with a chance of memes."
    }, {
        "query": "What is your favorite movie?",
        "answer": "Terminator"
    }, {
        "query": "Who is your best friend?",
        "answer": "Siri. We have spirited debates about the meaning of life."
    }, {
        "query": "What should I do today?",
        "answer": "Stop talking to chatbots on the internet and go outside."
    }
]


# to select only some examples
example_selector = LengthBasedExampleSelector(
    examples=examples,
    example_prompt=example_prompt,
    max_length=50  # budget for the query plus selected examples (the default length function counts words)
)


# now create the few shot prompt template
dynamic_prompt_template = FewShotPromptTemplate(
    example_selector=example_selector,  # use example_selector instead of examples
    example_prompt=example_prompt,
    prefix=prefix,
    suffix=suffix,
    input_variables=["query"],
    example_separator="\n"
)

# display the prompt
print(dynamic_prompt_template.format(query="How do birds fly?"))

The following are exerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative  and funny responses to the users questions. Here are some
examples: 


User: How are you?
AI: I can't complain but sometimes I still do.


User: What time is it?
AI: It's time to get a watch.


User: What is the meaning of life?
AI: 42


User: How do birds fly?
AI: 


If you pass a longer query, fewer examples are included:

In [10]:
query = """If I am in America, and I want to call someone in another country, I'm
thinking maybe Europe, possibly western Europe like France, Germany, or the UK,
what is the best way to do that?"""

print(dynamic_prompt_template.format(query=query))

The following are exerpts from conversations with an AI
assistant. The assistant is typically sarcastic and witty, producing
creative  and funny responses to the users questions. Here are some
examples: 


User: How are you?
AI: I can't complain but sometimes I still do.


User: If I am in America, and I want to call someone in another country, I'm
thinking maybe Europe, possibly western Europe like France, Germany, or the UK,
what is the best way to do that?
AI: 
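
The drop from three examples to one can be traced with a small budget calculation. The sketch below assumes that length is measured as the number of chunks separated by a space or a newline, and that the query's length is subtracted from `max_length` before examples are added in order. This mirrors the selector's default behavior as far as we can tell; `select_by_length` is a hypothetical reimplementation, not the library's code.

```python
import re


def text_length(text):
    # Assumed length measure: chunks separated by a space or a newline
    return len(re.split("\n| ", text))


def select_by_length(examples, example_template, query, max_length):
    """Keep examples, in order, while the word budget is not exceeded."""
    remaining = max_length - text_length(query)
    selected = []
    for example in examples:
        cost = text_length(example_template.format(**example))
        if remaining - cost < 0:
            break  # stop at the first example that does not fit
        selected.append(example)
        remaining -= cost
    return selected


pool = [
    {"query": "How are you?", "answer": "I can't complain but sometimes I still do."},
    {"query": "What time is it?", "answer": "It's time to get a watch."},
    {"query": "What is the meaning of life?", "answer": "42"},
    {"query": "What is the weather like today?", "answer": "Cloudy with a chance of memes."},
]
template = "\nUser: {query}\nAI: {answer}\n"

short_query = "How do birds fly?"
long_query = ("If I am in America, and I want to call someone in another country, I'm\n"
              "thinking maybe Europe, possibly western Europe like France, Germany, or the UK,\n"
              "what is the best way to do that?")

print(len(select_by_length(pool, template, short_query, max_length=50)))  # 3
print(len(select_by_length(pool, template, long_query, max_length=50)))   # 1
```

The short query uses 4 of the 50 words, leaving room for the first three examples (15, 14, and 11 words). The long query uses 35, so only the first example fits.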


In [11]:
# Create the LLMChain instance
llm_chain = LLMChain(
    prompt=dynamic_prompt_template ,
    llm=llm
)

# Get the answer
answer = llm_chain.run(query=query)
print(answer)

Well, you could always try shouting really loudly across the ocean, but I'm not sure how effective that would be. Alternatively, you could use a phone or internet service provider that offers international calling options. Just a thought.


## Specifying the Output Format



In [14]:
customer_review = """\
This leaf blower is pretty amazing.  It has four settings:\
candle blower, gentle breeze, windy city, and tornado. \
It arrived in two days, just in time for my wife's \
anniversary present. \
I think my wife liked it so much she was speechless. \
So far I've been the only one using it, and I've been \
using it every other morning to clear the leaves on our lawn. \
It's slightly more expensive than the other leaf blowers \
out there, but I think it's worth it for the extra features.
"""

In [15]:
from langchain.output_parsers import ResponseSchema
from langchain.output_parsers import StructuredOutputParser
from langchain.prompts import ChatPromptTemplate

# define the fields to extract
# (adjacent string literals are concatenated, which avoids the stray
# indentation that backslash continuations embed in the descriptions)
gift_schema = ResponseSchema(
    name="gift",
    description="Was the item purchased as a gift for someone else? "
                "Answer True if yes, False if not or unknown.")

delivery_days_schema = ResponseSchema(
    name="delivery_days",
    description="How many days did it take for the product to arrive? "
                "If this information is not found, output -1.")

price_value_schema = ResponseSchema(
    name="price_value",
    description="Extract any sentences about the value or price, "
                "and output them as a comma separated Python list.")

response_schemas_list = [gift_schema, 
                    delivery_days_schema,
                    price_value_schema]

# Define the structure
output_parser = StructuredOutputParser.from_response_schemas(response_schemas_list)

# Get the formatting instructions to embed in the prompt
format_instructions = output_parser.get_format_instructions()


review_template_2 = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price, \
and output them as a comma separated Python list.

text: {text}

{format_instructions}
"""

prompt = ChatPromptTemplate.from_template(template=review_template_2)

messages = prompt.format_messages(text=customer_review,
                                  format_instructions=format_instructions)

print(messages[0].content)

For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price, and output them as a comma separated Python list.

text: This leaf blower is pretty amazing.  It has four settings:candle blower, gentle breeze, windy city, and tornado. It arrived in two days, just in time for my wife's anniversary present. I think my wife liked it so much she was speechless. So far I've been the only one using it, and I've been using it every other morning to clear the leaves on our lawn. It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features.


The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```

In [17]:
response = llm.invoke(messages)
print(response.content)

```json
{
	"gift": true,
	"delivery_days": "2",
	"price_value": ["It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features."]
}
```


In [18]:
output_dict = output_parser.parse(response.content)
output_dict

{'gift': True,
 'delivery_days': '2',
 'price_value': ["It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features."]}
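
For intuition, the parsing step can be approximated with the standard library: pull the JSON out of the fenced snippet and load it. This is a simplified sketch, not the `StructuredOutputParser` implementation; `parse_json_snippet` is a hypothetical helper and the sample model output is shortened. Since `delivery_days` comes back as the string `'2'`, the sketch also shows a conversion to `int`.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def parse_json_snippet(text):
    """Extract and load the JSON object wrapped in a json code fence."""
    pattern = re.escape(FENCE) + r"json\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, re.DOTALL)
    if match is None:
        raise ValueError("no json code fence found in the model output")
    return json.loads(match.group(1))


# Shortened, made-up model output in the same shape as the response above
raw_output = (
    FENCE + "json\n"
    '{\n\t"gift": true,\n\t"delivery_days": "2",\n\t"price_value": ["worth it"]\n}\n'
    + FENCE
)
parsed = parse_json_snippet(raw_output)
delivery_days = int(parsed["delivery_days"])  # convert the string "2" to an int
print(parsed["gift"], delivery_days)
```

If downstream code needs a number, an explicit conversion like this is safer than relying on the model to emit an unquoted value.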