# Use Mistral LLM for Text Classification and Entity Recognition

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc">
   <ul class="toc-item">
      <li>
         <span><a href="#about-mistral-model" data-toc-modified-id="about-mistral-model"><span class="toc-item-num"></span>Introduction to Mistral model</a></span>
      </li>
   </ul>
   <ul class="toc-item">
      <li>
         <span><a href="#mistral_integration_in_arcgis_learn" data-toc-modified-id="mistral-integration-in-arcgis-learn"><span class="toc-item-num"></span>Mistral Implementation in <code>arcgis.learn</code></a></span>
      </li>
      <ul class="toc-item">
         <li><span><a href="#install_the_model_backbone" data-toc-modified-id="install_the_model_backbone"><span class="toc-item-num"></span>Install the model backbone</a></span></li>
      </ul>
   </ul>
   <ul class="toc-item">
      <li>
         <span><a href="#mistral_with_textclassifier_model" data-toc-modified-id="mistral_with_textclassifier_model"><span class="toc-item-num"></span>Mistral with <code>TextClassifier</code> model</a></span>
      </li>
      <ul class="toc-item">
         <li><span><a href="#import_mistral_text_classifier" data-toc-modified-id="import_mistral_text_classifier"><span class="toc-item-num"></span> Import the <code>TextClassifier</code> class from <code>arcgis.learn.text</code> module</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#initialize_textclassifier_model_databunch" data-toc-modified-id="initialize_textclassifier_model_databunch"><span class="toc-item-num"></span>Initialize <code>TextClassifier</code> model with databunch</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#initialize_textclassifier_model_nodatabunch" data-toc-modified-id="initialize_textclassifier_model_nodatabunch"><span class="toc-item-num"></span> Initialize <code>TextClassifier</code> model without databunch</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#classify_text_using_mistral_model" data-toc-modified-id="classify_text_using_mistral_model"><span class="toc-item-num"></span>Classify text using mistral model</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#load_model" data-toc-modified-id="load_model"><span class="toc-item-num"></span>Load the model</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#save_model" data-toc-modified-id="save_model"><span class="toc-item-num"></span>Save the model</a></span></li>
      </ul>
   </ul>
    <ul class="toc-item">
      <li>
         <span><a href="#mistral_with_entityrecognizer_model" data-toc-modified-id="mistral_with_entityrecognizer_model"><span class="toc-item-num"></span>Mistral with <code>EntityRecognizer</code> model</a></span>
      </li>
      <ul class="toc-item">
         <li><span><a href="#import_mistral_entity_recognizer" data-toc-modified-id="import_mistral_entity_recognizer"><span class="toc-item-num"></span> Import the <code>EntityRecognizer</code> class from <code>arcgis.learn.text</code> module</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#initialize_entity_recognizer_model_databunch" data-toc-modified-id="initialize_entity_recognizer_model_databunch"><span class="toc-item-num"></span>Initialize <code>EntityRecognizer</code> model with databunch</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#initialize_entity_recognizer_model_nodatabunch" data-toc-modified-id="initialize_entity_recognizer_model_nodatabunch"><span class="toc-item-num"></span> Initialize <code>EntityRecognizer</code> model without databunch</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#extract_entities_using_mistral_model" data-toc-modified-id="extract_entities_using_mistral_model"><span class="toc-item-num"></span>Extract entities using mistral model</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#load_model_er" data-toc-modified-id="load_model_er"><span class="toc-item-num"></span>Load the model</a></span></li>
      </ul>
       <ul class="toc-item">
         <li><span><a href="#save_model_er" data-toc-modified-id="save_model_er"><span class="toc-item-num"></span>Save the model</a></span></li>
      </ul>
   </ul>
    <ul class="toc-item">
      <li>
         <span><a href="#conclusion" data-toc-modified-id="conclusion"><span class="toc-item-num"></span>Conclusion</a></span>
      </li>
   </ul>
    <ul class="toc-item">
      <li>
         <span><a href="#reference" data-toc-modified-id="reference"><span class="toc-item-num"></span>References</a></span>
      </li>
   </ul>
</div>


## Introduction to Mistral model <a id="about-mistral-model"></a>

Mistral 7B is a decoder-only language model with roughly 7 billion parameters, designed to deliver both efficiency and high performance in real-world applications.

By using attention mechanisms such as Sliding Window Attention, Mistral 7B can be trained with an 8k context length and a fixed cache size, resulting in a theoretical attention span of 128K tokens. This allows the model to attend to the relevant parts of long text efficiently.
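
As a rough illustration of how a sliding window restricts what each token attends to, while stacked layers extend the effective reach, the sketch below builds a causal sliding-window mask in plain Python and computes the theoretical span as window size times layer count. The window of 4,096 tokens and the 32 layers are the values reported for Mistral 7B; the function names are illustrative and are not part of <code>arcgis.learn</code>.

```python
def sliding_window_mask(seq_len, window):
    """Return mask[i][j] == True when token i may attend to token j.

    A token attends to itself and to at most `window - 1` tokens before it.
    """
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


def theoretical_attention_span(window, num_layers):
    """Each stacked layer lets information travel up to `window` positions further back."""
    return window * num_layers


# Small example: 6 tokens, window of 3. "x" marks an allowed attention link.
mask = sliding_window_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))

# Values reported for Mistral 7B: window of 4,096 tokens and 32 layers.
span = theoretical_attention_span(window=4096, num_layers=32)
print(span)  # 131072, i.e. 128K tokens
```

Because the mask never looks further back than the window, the cache needed per layer stays fixed regardless of how long the input grows.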

The model also incorporates Grouped Query Attention (GQA), which reduces the cache size and speeds up inference. In addition, its byte-fallback tokenizer ensures that every character can be represented, eliminating the need for out-of-vocabulary tokens.
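
To make the cache saving from GQA concrete, the following back-of-the-envelope sketch compares the key/value cache needed per token when every query head has its own key/value head with the grouped layout. It assumes the published Mistral 7B configuration (32 layers, 32 query heads, 8 key/value heads, head dimension 128) and 16-bit values; the helper name is hypothetical.

```python
def kv_cache_bytes_per_token(num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Bytes of key/value cache stored for a single token.

    The factor 2 accounts for one key vector and one value vector
    per layer and per key/value head.
    """
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


# One key/value head per query head (standard multi-head attention).
mha_bytes = kv_cache_bytes_per_token(num_layers=32, num_kv_heads=32, head_dim=128)

# Grouped Query Attention: 8 key/value heads shared across the 32 query heads.
gqa_bytes = kv_cache_bytes_per_token(num_layers=32, num_kv_heads=8, head_dim=128)

print(mha_bytes)               # 524288 bytes per token
print(gqa_bytes)               # 131072 bytes per token
print(mha_bytes // gqa_bytes)  # 4
```

Under these assumptions the grouped layout needs a quarter of the cache, which is the saving the paragraph above refers to.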

Together, these enhancements give Mistral 7B strong performance, particularly in tasks related to language comprehension and generation.

## Mistral Implementation in arcgis.learn <a id="mistral_integration_in_arcgis_learn"></a>

### Install the model backbone <a id="install_the_model_backbone"></a>

Before performing text classification and entity extraction tasks using the mistral model backbone, you'll need to install it.

1. Download the <a href="https://esri.maps.arcgis.com/home/item.html?id=969d2fc57c834295af1a4a42cfd51f68">mistral model backbone</a>.

2. Extract the downloaded zip file.

3. Open the Anaconda Prompt and move to the folder that contains <code>arcgis_mistral_backbone-1.0.0-py_0.tar.bz2</code>.

4. Run:

 ```conda install --offline arcgis_mistral_backbone-1.0.0-py_0.tar.bz2```
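
After installation, it can be useful to confirm that the Python environment you are working in is the one that contains the ArcGIS API for Python. The sketch below only checks that the top-level <code>arcgis</code> package can be found; it does not verify the backbone package itself, and the helper name is illustrative.

```python
import importlib.util


def is_module_available(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# "arcgis" is the top-level package that provides arcgis.learn.
if is_module_available("arcgis"):
    print("arcgis is available in this environment")
else:
    print("arcgis was not found; activate the environment where it is installed")
```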

## Mistral with TextClassifier model <a id="mistral_with_textclassifier_model"></a>

### Import the TextClassifier class from arcgis.learn.text module <a id="import_mistral_text_classifier"></a>

In [None]:
from arcgis.learn.text import TextClassifier

### Initialize TextClassifier model with databunch <a id="initialize_textclassifier_model_databunch"></a>

Prepare a databunch for the <code>TextClassifier</code> model using the <code>prepare_textdata</code> method in <code>arcgis.learn</code>.

In [None]:
from arcgis.learn import prepare_textdata
# Build the databunch from the CSV file located in the data folder.
data = prepare_textdata("path_to_data_folder", task="classification", train_file="input_csv_file.csv",
                        text_columns="text_column", label_columns="label_column")
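
To illustrate the file layout implied by the call above, the sketch below writes a small CSV whose header matches the <code>text_columns</code> and <code>label_columns</code> values. The sample rows and the helper name are hypothetical and serve only to show one text column and one label column per row.

```python
import csv
import tempfile
from pathlib import Path


def write_training_csv(folder, rows, file_name="input_csv_file.csv"):
    """Write (text, label) rows to a CSV using the column names from the call above."""
    path = Path(folder) / file_name
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["text_column", "label_column"])  # header row
        writer.writerows(rows)
    return path


# Hypothetical sample rows, used only to illustrate the layout.
sample_rows = [
    ("The staff were friendly and helpful", "positive"),
    ("The delivery arrived two weeks late", "negative"),
]

data_folder = tempfile.mkdtemp()  # stands in for "path_to_data_folder"
csv_path = write_training_csv(data_folder, sample_rows)
print(csv_path.read_text(encoding="utf-8"))
```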

Once the data is prepared, a <code>TextClassifier</code> model object can be instantiated as shown below with the following parameters:

<code>data</code>: Databunch created using the <code>prepare_textdata</code> method.

<code>backbone</code>: To use Mistral as the model backbone, set <i>backbone="mistral"</i>.

<code>prompt</code>: Describes the task and the guardrails for it. This parameter is optional.

In [None]:
classifier_model = TextClassifier(
    data=data,
    backbone="mistral",
    prompt="Classify all the input sentences into the defined labels, do not make up your own labels."
)

### Initialize TextClassifier model without databunch <a id="initialize_textclassifier_model_nodatabunch"></a>

You can create a <code>TextClassifier</code> model with the mistral backbone from just a few examples, without needing a large dataset.

Below are the parameters to be passed into <code>TextClassifier</code>:

<code>backbone</code>: To use Mistral as the model backbone, set <i>backbone="mistral"</i>.

<code>examples</code>: User-defined examples for the mistral model. Provide them as a Python dictionary in the following format:<br/>
```
{
    "label_1" :["input_text_example_1", "input_text_example_2", ...],
    "label_2" :["input_text_example_1", "input_text_example_2", ...],
    ...
}
```
    
<code>prompt</code>: Describes the task and the guardrails for it. This parameter is optional.


In [None]:
classifier_model = TextClassifier(
    data=None,
    backbone="mistral",
    prompt="Classify all the input sentences into the defined labels, do not make up your own labels.",
    # A few labelled examples take the place of a training databunch.
    examples={
        "positive": ["Good! It was a wonderful experience!", "I really adore your work"],
        "negative": ["The customer support was unhelpful", "I don't like your work"]
    }
)
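
If your labelled samples are stored as (text, label) pairs rather than grouped by label, a small helper can reshape them into the dictionary format described above. This is a minimal sketch; the function name is hypothetical and not part of <code>arcgis.learn</code>.

```python
from collections import defaultdict


def build_classification_examples(labelled_texts):
    """Group (text, label) pairs into {label: [text, ...]}."""
    grouped = defaultdict(list)
    for text, label in labelled_texts:
        cleaned = text.strip()
        if cleaned:  # skip empty strings
            grouped[label].append(cleaned)
    return dict(grouped)


labelled_texts = [
    ("Good! It was a wonderful experience!", "positive"),
    ("The customer support was unhelpful", "negative"),
    ("I really adore your work", "positive"),
]

examples = build_classification_examples(labelled_texts)
print(examples)
```

The resulting dictionary can then be supplied as the <code>examples</code> argument.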

### Classify text using mistral model <a id="classify_text_using_mistral_model"></a>

To run inference with the mistral model, use the <code>predict</code> method of the <code>TextClassifier</code> class. The input to the method is a text string or a list of text strings.

In [None]:
# Pass a single text string or a list of text strings to classify.
classifier_model.predict(["input_text_1", "input_text_2"])

### Load the model <a id="load_model"></a>

To load a previously saved mistral model, use the <code>from_model</code> method of the <code>TextClassifier</code> class.

In [None]:
# from_model is called on the class and returns the loaded model.
classifier_model = TextClassifier.from_model(r'path_to_emd_file')

### Save the model <a id="save_model"></a>

The <code>save</code> method saves the model weights and creates a Deep Learning Package (.dlpk).

In [None]:
classifier_model.save("name_of_the_mistral_model")

## Mistral with EntityRecognizer model <a id="mistral_with_entityrecognizer_model"></a>

### Import the EntityRecognizer class from arcgis.learn.text module <a id="import_mistral_entity_recognizer"></a>

In [None]:
from arcgis.learn.text import EntityRecognizer

### Initialize EntityRecognizer model with databunch <a id="initialize_entity_recognizer_model_databunch"></a>

Prepare a databunch for the <code>EntityRecognizer</code> model using the <code>prepare_textdata</code> method in <code>arcgis.learn</code>.

In [None]:
from arcgis.learn import prepare_textdata
data = prepare_textdata("path_to_data_file", task="entity_recognition", dataset_type='ner_json')

Once the data is prepared, an <code>EntityRecognizer</code> model object can be instantiated as shown below with the following parameters:

<code>data</code>: Databunch created using the <code>prepare_textdata</code> method.

<code>backbone</code>: To use Mistral as the model backbone, set <i>backbone="mistral"</i>.

<code>prompt</code>: Describes the task and the guardrails for it. This parameter is optional.

In [None]:
entity_recognizer_model = EntityRecognizer(
    data=data,
    backbone="mistral",
    prompt="Tag the input sentences in the named entity for the given classes, no other class should be tagged."
)

### Initialize EntityRecognizer model without databunch <a id="initialize_entity_recognizer_model_nodatabunch"></a>

You can create an <code>EntityRecognizer</code> model with the mistral backbone from just a few examples, without needing a large dataset.

Below are the parameters to be passed into <code>EntityRecognizer</code>:

<code>backbone</code>: To use Mistral as the model backbone, set <i>backbone="mistral"</i>.

<code>examples</code>: User-defined examples for the mistral model. Provide them as a Python list in the following format:

```
[
    ("input_text_sentence", 
         {
             "class_1":["Named Entity", ...],
             "class_2": ["Named entity", ...],
             ...
         }
    ),
    ...
]
```

```
Note: The EntityRecognizer class, using the "Mistral" backbone, needs at least six examples to work effectively.
```
    
<code>prompt</code>: Describes the task and the guardrails for it. This parameter is optional.


In [None]:
entity_recognizer_model = EntityRecognizer(
    data=None,
    backbone="mistral",
    prompt="Tag the input sentences in the named entity for the given classes, no other class should be tagged.",
    examples=[
        (
            'Jim stays in London',
            {
                'name': ['Jim'],
                'location': ['London']
            }
        ),
        # ... add further examples here (at least six in total)
    ]
)
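
Before passing examples to the model, it may help to check that they follow the list format described above and meet the six-example minimum from the note. The sketch below is a hypothetical helper, not part of <code>arcgis.learn</code>; the check that each entity appears verbatim in its sentence is an assumption about what makes a useful example, not a documented requirement.

```python
MIN_EXAMPLES = 6  # minimum stated in the note above for the Mistral backbone


def validate_entity_examples(examples, min_examples=MIN_EXAMPLES):
    """Return a list of problems found; an empty list means no issues were detected."""
    problems = []
    if len(examples) < min_examples:
        problems.append(
            f"expected at least {min_examples} examples, got {len(examples)}"
        )
    for index, (sentence, entities) in enumerate(examples):
        for label, mentions in entities.items():
            for mention in mentions:
                # Assumption: every tagged entity should occur verbatim in its sentence.
                if mention not in sentence:
                    problems.append(
                        f"example {index}: {mention!r} ({label}) not found in sentence"
                    )
    return problems


examples = [("Jim stays in London", {"name": ["Jim"], "location": ["London"]})]
print(validate_entity_examples(examples))
```

With only the single example shown, the helper reports that more examples are needed; once the list is complete and consistent, it returns an empty list.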

### Extract entities using mistral model <a id="extract_entities_using_mistral_model"></a>

To run inference with the mistral model, use the <code>extract_entities</code> method of the <code>EntityRecognizer</code> class. The input to the method is a text string or a list of text strings.

In [None]:
# Pass a single text string or a list of text strings to tag.
entity_recognizer_model.extract_entities(["input_text_1", "input_text_2"])

### Load the model <a id="load_model_er"></a>

To load a previously saved mistral model, use the <code>from_model</code> method of the <code>EntityRecognizer</code> class.

In [None]:
# from_model is called on the class and returns the loaded model.
entity_recognizer_model = EntityRecognizer.from_model(r'path_to_emd_file')

### Save the model <a id="save_model_er"></a>

The <code>save</code> method saves the model weights and creates a Deep Learning Package (.dlpk).

In [None]:
entity_recognizer_model.save("name_of_the_mistral_model")

## Conclusion <a id="conclusion"></a>

This guide outlines the steps to initialize and perform inference with the <code>TextClassifier</code> and <code>EntityRecognizer</code> models using the mistral model backbone in <code>arcgis.learn</code>.

## References <a id="reference"></a>

Mistral-7B HuggingFace: <a href="https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2">https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2</a><br/>
Mistral-7B MistralAI: <a href="https://mistral.ai/news/announcing-mistral-7b/">https://mistral.ai/news/announcing-mistral-7b</a>