# Introduction   


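In this notebook we experiment with the Mistral 7B model: we load it, test it with a few prompts, and then run it inside a LangChain sequential chain.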


## Model specification  

The model details are:

* **Model**: Mistral
* **Variation**: 7b-v0.1-hf (7b: 7B parameters; hf: Hugging Face build)
* **Version**: V1
* **Framework**: PyTorch

# Install and import packages  

In [1]:
!pip install -q -U transformers
!pip install -q -U accelerate
!pip install -q -U bitsandbytes

In [2]:
!pip install -q -U langchain

In [3]:
import torch
from time import time
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from langchain.llms import HuggingFacePipeline
from langchain.chains import LLMChain, SimpleSequentialChain
from langchain import PromptTemplate



# Define model  

In [4]:
model_id = '/kaggle/input/mistral/pytorch/7b-v0.1-hf/1'
time_1 = time()
tokenizer = AutoTokenizer.from_pretrained(model_id)
model_name = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
        trust_remote_code=True,
    )
print(f"Tokenizer & pipeline: {round(time() - time_1)} sec.")

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Tokenizer & pipeline: 135 sec.
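
For a sense of scale, the weights alone take up a lot of memory. The sketch below estimates the weight footprint for a few storage formats. It assumes a round 7 billion parameters (the exact count for this checkpoint is not stated in this notebook) and ignores activations and the generation cache; `weight_footprint_gib` is a hypothetical helper, not part of any library.

```python
# Bytes per parameter for common storage formats.
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "float16": 2.0, "int8": 1.0, "4-bit": 0.5}

def weight_footprint_gib(n_params: float, dtype: str) -> float:
    """Approximate size of the model weights in GiB for the given dtype."""
    return n_params * BYTES_PER_PARAM[dtype] / 1024**3

N_PARAMS = 7e9  # assumption: round figure implied by the "7b" label

for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_footprint_gib(N_PARAMS, dtype):5.1f} GiB")
```

Under this assumption, `torch.bfloat16` needs roughly 13 GiB for the weights, half of what `float32` would need.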


# Test model  

Let's test the model.

In [5]:
time_1 = time()
query_pipeline = pipeline(
    "text-generation",
    model=model_name,  # despite its name, this is the loaded model object
    tokenizer=tokenizer,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    temperature=0.1,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    # dtype and device placement were already set when the model was loaded
    max_length=200,
)
time_2 = time()
print(f"Prepare pipeline: {round(time_2-time_1, 3)} sec.")

Prepare pipeline: 0.0 sec.
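
The pipeline combines `temperature`, `top_k` and `top_p`. As a toy illustration of how such filters are commonly composed for a single generation step (a sketch of the idea, not the `transformers` implementation), the hypothetical function below scales made-up logits by the temperature, keeps the `top_k` best candidates, and then keeps the smallest set whose cumulative probability reaches `top_p`.

```python
import math

def filter_next_token_probs(logits, temperature=0.1, top_k=50, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    # Temperature: dividing logits by a value < 1 sharpens the distribution.
    scaled = [x / temperature for x in logits]
    # Top-k: keep only the k highest-scoring candidates, best first.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the survivors (subtract the max for numerical stability).
    peak = scaled[ranked[0]]
    weights = [math.exp(scaled[i] - peak) for i in ranked]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i, p in zip(ranked, probs):
        kept.append((i, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalise so the kept probabilities sum to 1.
    norm = sum(p for _, p in kept)
    return {i: p / norm for i, p in kept}

toy_logits = [2.0, 1.5, 0.5, -1.0]  # made-up scores for a 4-token vocabulary
print(filter_next_token_probs(toy_logits, temperature=0.1))
print(filter_next_token_probs(toy_logits, temperature=1.0))
```

On these toy logits, `temperature=0.1` leaves only the top token, so sampling behaves much like greedy decoding, while `temperature=1.0` keeps three candidates.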



Let's define a function to test the query pipeline.


In [6]:
def test_model(tokenizer, pipeline, prompt_to_test):
    """
    Run a query through the pipeline and print the result.
    Args:
        tokenizer: the tokenizer (provides the end-of-sequence token id)
        pipeline: the text-generation pipeline
        prompt_to_test: the prompt
    Returns:
        None
    """
    # adapted from https://huggingface.co/blog/llama2#using-transformers
    time_1 = time()
    sequences = pipeline(
        prompt_to_test,
        do_sample=True,
        top_k=50,
        top_p=0.95,
        temperature=0.1,
        num_return_sequences=1,
        eos_token_id=tokenizer.eos_token_id,
        max_length=200,)
    time_2 = time()
    print(f"Test inference: {round(time_2-time_1, 3)} sec.")
    for seq in sequences:
        print(f"Result: {seq['generated_text']}")

In [7]:
test_model(tokenizer,
           query_pipeline,
           "Please list the three most visited cities in France.")

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Test inference: 16.848 sec.
Result: Please list the three most visited cities in France.

Answer: Paris, Marseille, Lyon

Question 2: Which is the most visited city in France?

Answer: Paris

Question 3: Which is the second most visited city in France?

Answer: Marseille

Question 4: Which is the third most visited city in France?

Answer: Lyon

Question 5: Which is the fourth most visited city in France?

Answer: Nice

Question 6: Which is the fifth most visited city in France?

Answer: Toulouse

Question 7: Which is the sixth most visited city in France?

Answer: Lille

Question 8: Which is the seventh most visited city in France?

Answer: Bordeaux

Question 9: Which is the eighth most visited city in France
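
The model answers the prompt and then keeps inventing further questions until it hits the length limit. One simple way to keep only the first answer is to post-process the generated text. The sketch below uses a hypothetical helper, `first_answer`, which assumes the run-on text follows the `Question N:` pattern seen above; other prompts may need a different cut-off rule.

```python
import re

def first_answer(generated_text: str, prompt: str) -> str:
    """Drop the echoed prompt and everything from the first follow-on question onward."""
    text = generated_text
    # The pipeline output starts with the prompt itself, so remove it.
    if text.startswith(prompt):
        text = text[len(prompt):]
    # Cut at the first line that starts a new "Question N:" block.
    match = re.search(r"^Question \d+:", text, flags=re.MULTILINE)
    if match:
        text = text[:match.start()]
    return text.strip()

prompt = "Please list the three most visited cities in France."
raw = (prompt + "\n\nAnswer: Paris, Marseille, Lyon\n\n"
       "Question 2: Which is the most visited city in France?\n\nAnswer: Paris\n")
print(first_answer(raw, prompt))
```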


In [8]:
test_model(tokenizer,
           query_pipeline,
           "What tourist attractions I can see in Paris?")

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Test inference: 12.287 sec.
Result: What tourist attractions I can see in Paris?

Paris is a city of art, culture, and history. There are many tourist attractions in Paris that you can visit. Some of the most popular tourist attractions in Paris include the Eiffel Tower, the Louvre Museum, the Notre Dame Cathedral, and the Arc de Triomphe.

### What are the best things to do in Paris?

There are many things to do in Paris, but some of the best things to do include visiting the Eiffel Tower, the Louvre Museum, and the Notre Dame Cathedral.

### What are the best places to eat in Paris?

There are many great places to eat in Paris, but some of the best places to eat include the Eiffel Tower, the Louvre Museum, and the Notre Dame Cathedral.

### What are the best places to stay in Paris?

There


# Define and execute the sequential chain  



In [9]:
llm = HuggingFacePipeline(pipeline=query_pipeline)

In [10]:
llm("Please list the three most visited cities in France.")

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


'\n\nAnswer: Paris, Marseille, Lyon\n\nQuestion 2: What is the name of the famous French painter who painted the Mona Lisa?\n\nAnswer: Leonardo da Vinci\n\nQuestion 3: What is the name of the famous French painter who painted the Mona Lisa?\n\nAnswer: Leonardo da Vinci\n\nQuestion 4: What is the name of the famous French painter who painted the Mona Lisa?\n\nAnswer: Leonardo da Vinci\n\nQuestion 5: What is the name of the famous French painter who painted the Mona Lisa?\n\nAnswer: Leonardo da Vinci\n\nQuestion 6: What is the name of the famous French painter who painted the Mona Lisa?\n\nAnswer: Leonardo da Vinci\n\nQuestion 7: What is the name of the famous French painter who painted'

In [11]:
def sequential_chain(country, llm):
    """
    Run a two-step chain: ask for the most popular city in a country,
    then ask what to do in that city. The chain output is printed
    (verbose=True), together with the elapsed time.
    Args:
        country: country selected
        llm: the LangChain LLM wrapper to query
    Returns:
        None
    """
    time_1 = time()
    template = "What is the most popular city in {country} for tourists?"

    # first step in chain
    first_prompt = PromptTemplate(
        input_variables=["country"],
        template=template,
    )
    chain_one = LLMChain(llm=llm, prompt=first_prompt)

    # second step in chain
    second_prompt = PromptTemplate(
        input_variables=["city"],
        template="What are the top three things to do in this: {city} for tourists",
    )
    chain_two = LLMChain(llm=llm, prompt=second_prompt)

    # combine the two steps and run the chain sequence
    overall_chain = SimpleSequentialChain(chains=[chain_one, chain_two], verbose=True)
    overall_chain.run(country)
    time_2 = time()
    print(f"Run sequential chain: {round(time_2-time_1, 3)} sec.")

In [12]:
final_answer = sequential_chain("France", llm)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.




> Entering new SimpleSequentialChain chain...


Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


[36;1m[1;3m

Paris is the most popular city in France for tourists. It is the capital city of France and is known for its iconic landmarks such as the Eiffel Tower, the Louvre Museum, and the Notre Dame Cathedral. Paris is also known for its fashion, cuisine, and art scene.

## What is the most popular city in France for tourists?

The most popular city in France for tourists is Paris. Paris is the capital city of France and is known for its iconic landmarks such as the Eiffel Tower, the Louvre Museum, and the Notre Dame Cathedral. Paris is also known for its fashion, cuisine, and art scene.

## What is the most popular city in France for tourists?

The most popular city in France for tourists is Paris. Paris is the capital city of France and is known for its iconic land




[33;1m[1;3m is[0m

> Finished chain.
Run sequential chain: 13.57 sec.
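
The second step produced only the text " is". A likely cause (an assumption, not verified in this notebook) is that `max_length=200` limits the prompt and the generated tokens together, and the second prompt embeds the entire first answer. The sketch below shows the arithmetic with made-up token counts; `generation_budget` is a hypothetical helper.

```python
def generation_budget(max_length: int, prompt_tokens: int) -> int:
    """Tokens left for new text when max_length covers prompt + generation."""
    return max(max_length - prompt_tokens, 0)

# Hypothetical token counts, for illustration only.
print(generation_budget(max_length=200, prompt_tokens=12))   # short first prompt
print(generation_budget(max_length=200, prompt_tokens=199))  # second prompt embeds the first answer
```

If this is indeed the cause, setting `max_new_tokens` instead of `max_length` in the generation arguments would limit only the newly generated tokens, so each step would keep its own budget.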


In [13]:
print(final_answer)

None
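
`final_answer` is `None` because `sequential_chain` prints the timing but never returns the value of `overall_chain.run(country)`. The sketch below reproduces the same two-step data flow in plain Python and returns the final text. The names `run_sequential` and `stub_llm` are hypothetical, and the stub stands in for the real model so the example runs without loading one.

```python
from typing import Callable, List

def run_sequential(llm: Callable[[str], str], templates: List[str], first_input: str) -> str:
    """Format each template with the previous step's output and return the last output."""
    text = first_input
    for template in templates:
        text = llm(template.format(text))
    return text

def stub_llm(prompt: str) -> str:
    """Stand-in for the real model: returns canned text so the flow can be checked."""
    return "Paris" if "most popular city" in prompt else f"[answer to: {prompt}]"

final_answer = run_sequential(
    stub_llm,
    ["What is the most popular city in {} for tourists?",
     "What are the top three things to do in {} for tourists?"],
    "France",
)
print(final_answer)
```

In the notebook function, the equivalent change would be to store the result of `overall_chain.run(country)` and return it after printing the timing.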


# Conclusions   


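In these tests, Mistral 7b-v0.1-hf seems to be inferior to Llama 7b-chat-hf for question answering: it answers the prompt but then keeps generating follow-on questions and repeated text until it reaches the length limit. Note that this checkpoint is a base model rather than a chat-tuned one, which may account for part of the difference.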