# Use Mistral LLM for Text Classification and Entity Recognition

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc">
   <ul class="toc-item">
      <li>
         <span><a href="#about-mistral-model" data-toc-modified-id="about-mistral-model"><span class="toc-item-num"></span>Introduction to Mistral model</a></span>
      </li>
   </ul>
   <ul class="toc-item">
      <li>
         <span><a href="#mistral_integration_in_arcgis_learn" data-toc-modified-id="mistral-integration-in-arcgis-learn"><span class="toc-item-num"></span>Mistral Implementation in <code>arcgis.learn</code></a></span>
      </li>
      <ul class="toc-item">
         <li><span><a href="#install_the_model_backbone" data-toc-modified-id="install_the_model_backbone"><span class="toc-item-num"></span>Install the model backbone</a></span></li>
      </ul>
   </ul>
   <ul class="toc-item">
      <li>
         <span><a href="#mistral_with_textclassifier_model" data-toc-modified-id="mistral_with_textclassifier_model"><span class="toc-item-num"></span>Mistral with the <code>TextClassifier</code> model</a></span>
      </li>
      <ul class="toc-item">
         <li><span><a href="#import_mistral_text_classifier" data-toc-modified-id="import_mistral_text_classifier"><span class="toc-item-num"></span> Import the <code>TextClassifier</code> class from the <code>arcgis.learn.text</code> module</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#initialize_textclassifier_model_databunch" data-toc-modified-id="initialize_textclassifier_model_databunch"><span class="toc-item-num"></span>Initialize the <code>TextClassifier</code> model with a databunch</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#initialize_textclassifier_model_nodatabunch" data-toc-modified-id="initialize_textclassifier_model_nodatabunch"><span class="toc-item-num"></span> Initialize the <code>TextClassifier</code> model without a databunch</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#classify_text_using_mistral_model" data-toc-modified-id="classify_text_using_mistral_model"><span class="toc-item-num"></span>Classify the text using the Mistral model</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#load_model" data-toc-modified-id="load_model"><span class="toc-item-num"></span>Load the model</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#save_model" data-toc-modified-id="save_model"><span class="toc-item-num"></span>Save the model</a></span></li>
      </ul>
   </ul>
    <ul class="toc-item">
      <li>
         <span><a href="#mistral_with_entityrecognizer_model" data-toc-modified-id="mistral_with_entityrecognizer_model"><span class="toc-item-num"></span>Mistral with an <code>EntityRecognizer</code> model</a></span>
      </li>
      <ul class="toc-item">
         <li><span><a href="#import_mistral_entity_recognizer" data-toc-modified-id="import_mistral_entity_recognizer"><span class="toc-item-num"></span> Import the <code>EntityRecognizer</code> class from the <code>arcgis.learn.text</code> module</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#initialize_entity_recognizer_model_databunch" data-toc-modified-id="initialize_entity_recognizer_model_databunch"><span class="toc-item-num"></span>Initialize the <code>EntityRecognizer</code> model with a databunch</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#initialize_entity_recognizer_model_nodatabunch" data-toc-modified-id="initialize_entity_recognizer_model_nodatabunch"><span class="toc-item-num"></span> Initialize the <code>EntityRecognizer</code> model without a databunch</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#extract_entities_using_mistral_model" data-toc-modified-id="extract_entities_using_mistral_model"><span class="toc-item-num"></span>Extract entities using the Mistral model</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#load_model_er" data-toc-modified-id="load_model_er"><span class="toc-item-num"></span>Load the model</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#save_model_er" data-toc-modified-id="save_model_er"><span class="toc-item-num"></span>Save the model</a></span></li>
      </ul>
   </ul>
    <ul class="toc-item">
      <li>
         <span><a href="#conclusion" data-toc-modified-id="conclusion"><span class="toc-item-num"></span>Conclusion</a></span>
      </li>
   </ul>
    <ul class="toc-item">
      <li>
         <span><a href="#reference" data-toc-modified-id="reference"><span class="toc-item-num"></span>References</a></span>
      </li>
   </ul>
</div>


## Introduction to Mistral model <a id="about-mistral-model"></a>

Mistral 7B is a decoder-only language model with roughly 7 billion parameters, designed to deliver both efficiency and high performance in real-world applications.

Mistral 7B uses Sliding Window Attention, in which each token attends only to a fixed window of preceding tokens. This lets the model be trained with an 8k context length and a fixed cache size while reaching a theoretical attention span of 128K tokens, because information can propagate one window further back with each layer. The model also uses Grouped Query Attention (GQA), which reduces the cache size and speeds up inference. Finally, its byte-fallback tokenizer represents characters outside its vocabulary as raw bytes, so no out-of-vocabulary tokens are needed.
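To make the sliding-window idea concrete, the following minimal sketch builds the attention mask for a toy sequence and computes the theoretical span. The window size of 4096 tokens and the 32 layers are taken from the published Mistral 7B configuration and are assumptions here, not values stated in this guide; the function names are illustrative only.

```python
def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when token i may attend to token j.

    A token sees itself and at most `window - 1` tokens before it,
    and never sees tokens that come after it (causal masking).
    """
    return [
        [j <= i and i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


def theoretical_span(window, num_layers):
    """Each layer lets information travel one more window back."""
    return window * num_layers


# Toy example: 6 tokens, window of 3.
mask = sliding_window_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))

# Assumed Mistral 7B values: window of 4096 tokens, 32 layers.
span = theoretical_span(window=4096, num_layers=32)
print(span, "tokens, about", span // 1024, "K")
```

Each row of the toy mask contains at most three marks, which is why the per-token attention cost and the cache size stay fixed no matter how long the sequence grows.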

These design features equip Mistral 7B for strong performance in language comprehension and generation tasks. This guide shows how to use the Mistral LLM for text classification and named entity recognition.

## Mistral Implementation in arcgis.learn <a id="mistral_integration_in_arcgis_learn"></a>

### Install the model backbone <a id="install_the_model_backbone"></a>

Follow these steps to download and install the Mistral model backbone:

1. Download the <a href="https://esri.maps.arcgis.com/home/item.html?id=969d2fc57c834295af1a4a42cfd51f68">Mistral model backbone</a>.

2. Extract the downloaded zip file.

3. Open the Anaconda Prompt and change to the folder that contains <code>arcgis_mistral_backbone-1.0.0-py_0.tar.bz2</code>.

4. Run:

 ```conda install --offline arcgis_mistral_backbone-1.0.0-py_0.tar.bz2```

## Mistral with the TextClassifier model <a id="mistral_with_textclassifier_model"></a>

### Import the TextClassifier class from the arcgis.learn.text module <a id="import_mistral_text_classifier"></a>

In [None]:
from arcgis.learn.text import TextClassifier

### Initialize the TextClassifier model with a databunch <a id="initialize_textclassifier_model_databunch"></a>

Prepare a databunch for the <code>TextClassifier</code> model using the <code>prepare_textdata</code> method in <code>arcgis.learn</code>.

In [None]:
from arcgis.learn import prepare_textdata

# Build a databunch from a CSV file with one text column and one label column
data = prepare_textdata("path_to_data_folder", task="classification", train_file="input_csv_file.csv",
                        text_columns="text_column", label_columns="label_column")
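As an illustration of the input that the call above expects, this sketch writes a small CSV file whose column names match the `text_columns` and `label_columns` arguments. The folder, file name, column names, and rows are placeholders chosen to mirror that call; real data will differ, and the exact requirements of `prepare_textdata` should be checked against the API reference.

```python
import csv
import os
import tempfile

# Placeholder rows: (text, label) pairs.
rows = [
    ("Good! It was a wonderful experience!", "positive"),
    ("The customer support was unhelpful", "negative"),
]

# A temporary folder stands in for "path_to_data_folder".
data_folder = tempfile.mkdtemp()
csv_path = os.path.join(data_folder, "input_csv_file.csv")

with open(csv_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["text_column", "label_column"])  # header row
    writer.writerows(rows)

# Read the file back to confirm its layout.
with open(csv_path, newline="", encoding="utf-8") as f:
    records = list(csv.DictReader(f))

print(records[0]["text_column"], "->", records[0]["label_column"])
```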

Once the data is prepared, the <code>TextClassifier</code> model object can be instantiated as below with the following parameters:

<code>data</code>: The databunch created using the <i>prepare_textdata</i> method.

<code>backbone</code>: To use Mistral as the model backbone, use <i>backbone="mistral"</i>.

<code>prompt</code>: Text string describing the task and its guardrails. This is an optional parameter.

In [None]:
classifier_model = TextClassifier(
    data=data,
    backbone="mistral",
    prompt="Classify all the input sentences into the defined labels, do not make up your own labels."
)

### Initialize the TextClassifier model without a databunch <a id="initialize_textclassifier_model_nodatabunch"></a>

A <code>TextClassifier</code> model with a Mistral backbone can also be created without a large dataset, using only a few examples.

Below are the parameters to be passed into <code>TextClassifier</code>:

<code>backbone</code>: To use Mistral as the model backbone, use <i>backbone="mistral"</i>.

<code>examples</code>: User-defined examples to provide to the Mistral model, in Python dictionary format:
```
{
    "label_1" :["input_text_example_1", "input_text_example_2", ...],
    "label_2" :["input_text_example_1", "input_text_example_2", ...],
    ...
}
```
    
<code>prompt</code>: Text string describing the task and its guardrails. This is an optional parameter.


In [None]:
classifier_model = TextClassifier(
    data=None,
    backbone="mistral",
    prompt="Classify all the input sentences into the defined labels, do not make up your own labels.",
    examples={
        "positive": ["Good! It was a wonderful experience!", "I really adore your work"],
        "negative": ["The customer support was unhelpful", "I don't like your work"]
    }
)
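When the few-shot examples start out as a flat list of labelled sentences, they can be regrouped into the dictionary shape described above. The helper below is a hypothetical convenience function, not part of `arcgis.learn`.

```python
from collections import defaultdict


def group_examples(labelled_texts):
    """Turn (text, label) pairs into {label: [text, ...]}."""
    grouped = defaultdict(list)
    for text, label in labelled_texts:
        grouped[label].append(text.strip())  # drop stray whitespace
    return dict(grouped)


pairs = [
    ("Good! It was a wonderful experience!", "positive"),
    ("The customer support was unhelpful", "negative"),
    ("I really adore your work", "positive"),
]

examples = group_examples(pairs)
print(examples)
```

The resulting dictionary has one key per label and can be supplied as the `examples` argument of `TextClassifier`.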

### Classify the text using the Mistral model <a id="classify_text_using_mistral_model"></a>

To classify text using the Mistral model, use the <code>predict</code> method of the <code>TextClassifier</code> class. The method accepts a single text string or a list of text strings.

In [None]:
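# Pass a single string or a list of strings
classifier_model.predict(["The customer support was unhelpful", "I really adore your work"])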
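Because every prediction runs through a large language model, long lists of text are often easier to handle in smaller batches. The sketch below assumes only what this guide states, namely that `predict` accepts a list of strings. `predict_in_batches` and `batch_size` are hypothetical names, and a stand-in class replaces the real model so that the snippet runs without the Mistral backbone installed.

```python
def predict_in_batches(model, texts, batch_size=8):
    """Call model.predict on successive slices of `texts`.

    Returns one result per batch, exactly as predict returned it,
    because the return format is not assumed here.
    """
    if isinstance(texts, str):
        texts = [texts]  # accept a single string as well
    results = []
    for start in range(0, len(texts), batch_size):
        results.append(model.predict(texts[start:start + batch_size]))
    return results


class EchoModel:
    """Stand-in for a TextClassifier; it only reports batch sizes."""

    def predict(self, batch):
        return len(batch)


batch_results = predict_in_batches(EchoModel(), ["a", "b", "c", "d", "e"], batch_size=2)
print(batch_results)  # [2, 2, 1]
```

With a real model, `EchoModel()` would be replaced by the `classifier_model` object created earlier.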

### Load the model <a id="load_model"></a>

To load a saved Mistral model, use the <code>from_model</code> method of the <code>TextClassifier</code> class.

In [None]:
# from_model is a class method: call it on the class and keep the returned model
classifier_model = TextClassifier.from_model(r'path_to_dlpk_file')

### Save the model <a id="save_model"></a>

The following method saves the model weights and creates a Deep Learning Package (.dlpk).

In [None]:
classifier_model.save("name_of_the_mistral_model")

## Mistral with an EntityRecognizer model <a id="mistral_with_entityrecognizer_model"></a>

### Import the EntityRecognizer class from the arcgis.learn.text module <a id="import_mistral_entity_recognizer"></a>

In [None]:
from arcgis.learn.text import EntityRecognizer

### Initialize the EntityRecognizer model with a databunch <a id="initialize_entity_recognizer_model_databunch"></a>

Prepare the databunch for the <code>EntityRecognizer</code> model using the <code>prepare_textdata</code> method in <code>arcgis.learn</code>.

In [None]:
from arcgis.learn import prepare_textdata
data = prepare_textdata("path_to_data_file", task="entity_recognition", dataset_type='ner_json')

Once the data is prepared, the <code>EntityRecognizer</code> model object can be instantiated with the following parameters:

<code>data</code>: The databunch created using the <code>prepare_textdata</code> method.

<code>backbone</code>: To use Mistral as the model backbone, use <i>backbone="mistral"</i>.

<code>prompt</code>: Text string describing the task and its guardrails. This is an optional parameter.

In [None]:
entity_recognizer_model = EntityRecognizer(
    data=data,
    backbone="mistral",
    prompt="Tag the input sentences in the named entity for the given classes, no other class should be tagged."
)

### Initialize the EntityRecognizer model without a databunch <a id="initialize_entity_recognizer_model_nodatabunch"></a>

An <code>EntityRecognizer</code> model with a Mistral backbone can also be created without a large dataset, using only a few examples.

Below are the parameters to be passed into <code>EntityRecognizer</code>:

<code>backbone</code>: To use Mistral as the model backbone, use <i>backbone="mistral"</i>.

<code>examples</code>: User-defined examples for the Mistral model, in Python list format:

```
[
    ("input_text_sentence", 
         {
             "class_1":["Named Entity", ...],
             "class_2": ["Named entity", ...],
             ...
         }
    ),
    ...
]
```

```
Note: The EntityRecognizer class, using the "Mistral" backbone, needs at least six examples to work effectively.
```
    
<code>prompt</code>: Text string describing the task and its guardrails. This is an optional parameter.


In [None]:
entity_recognizer_model = EntityRecognizer(
    data=None,
    backbone="mistral",
    prompt="Tag the input sentences in the named entity for the given classes, no other class should be tagged.",
    examples=[
        (
            'Jim stays in London',
            {
                'name': ['Jim'],
                'location': ['London']
            }
        ),
        ...  # placeholder: add further (sentence, entities) tuples; at least six are needed
    ]
)
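Before few-shot examples are passed to `EntityRecognizer`, it can help to check that they follow the list format described above. The checker below is a hypothetical helper, not part of `arcgis.learn`. The minimum of six comes from the note above, while the rule that every entity must occur verbatim in its sentence is an added assumption.

```python
def check_ner_examples(examples, minimum=6):
    """Return a list of problems found; an empty list means none."""
    problems = []
    if len(examples) < minimum:
        problems.append(f"expected at least {minimum} examples, got {len(examples)}")
    for index, example in enumerate(examples):
        if not (isinstance(example, tuple) and len(example) == 2):
            problems.append(f"example {index}: not a (sentence, entities) tuple")
            continue
        sentence, entities = example
        for label, values in entities.items():
            for value in values:
                if value not in sentence:
                    problems.append(f"example {index}: {value!r} ({label}) not in sentence")
    return problems


sample = [
    ("Jim stays in London", {"name": ["Jim"], "location": ["London"]}),
    ("Jim stays in London", {"name": ["James"]}),
]
for problem in check_ner_examples(sample):
    print(problem)
```

For this sample the checker reports two problems: too few examples, and the entity `'James'`, which does not appear in its sentence.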

### Extract entities using the Mistral model <a id="extract_entities_using_mistral_model"></a>

To extract named entities using the Mistral model, use the <code>extract_entities</code> method of the <code>EntityRecognizer</code> class. The method accepts a single text string or a list of text strings.

In [None]:
# Pass a single string or a list of strings
entity_recognizer_model.extract_entities(["Jim stays in London"])

### Load the model <a id="load_model_er"></a>

To load a saved Mistral model, use the <code>from_model</code> method of the <code>EntityRecognizer</code> class.

In [None]:
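# from_model is a class method: call it on the class and keep the returned model
entity_recognizer_model = EntityRecognizer.from_model(r'path_to_dlpk_file')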

### Save the model <a id="save_model_er"></a>

This method saves the model weights and creates a Deep Learning Package (.dlpk).

In [None]:
entity_recognizer_model.save("name_of_the_mistral_model")

## Conclusion <a id="conclusion"></a>

In this guide we demonstrated the steps to initialize and perform inference using the Mistral LLM as a backbone with the TextClassifier and EntityRecognizer models in arcgis.learn.

## References <a id="reference"></a>

Mistral-7B HuggingFace: <a href="https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2">https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2</a><br/>
Mistral-7B MistralAI: <a href="https://mistral.ai/news/announcing-mistral-7b/">https://mistral.ai/news/announcing-mistral-7b</a>