<a href="https://colab.research.google.com/github/marco-siino/ThingSpeak_ParsersGenerator/blob/main/Mistral_7B_Instruct_v0_3_ThingSpeak_Parsers_Generator.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Getting Started with `mistral-inference`

This notebook walks through running Mistral models locally. It covers:
- How to chat with Mistral 7B Instruct
- How to run Mistral 7B Instruct with function calling capabilities

We recommend using a GPU such as the A100 to run this notebook.

In [None]:
!pip install mistral-inference

## Download Mistral 7B Instruct

In [None]:
!wget https://models.mistralcdn.com/mistral-7b-v0-3/mistral-7B-Instruct-v0.3.tar

In [None]:
!DIR=$HOME/mistral_7b_instruct_v3 && mkdir -p $DIR && tar -xf mistral-7B-Instruct-v0.3.tar -C $DIR

In [None]:
!ls $HOME/mistral_7b_instruct_v3
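
Because the archive is extracted under `$HOME`, a quick check from Python can confirm that the files are where the loading code expects them. The sketch below assumes the folder holds `tokenizer.model.v3` (the file loaded later in this notebook) plus `params.json` and `consolidated.safetensors`. The last two names are assumptions about the archive layout, and `missing_model_files` is a hypothetical helper, not part of `mistral-inference`.

```python
from pathlib import Path

# tokenizer.model.v3 is loaded later in this notebook; the other two names
# are assumptions about what the archive contains.
EXPECTED_FILES = ("tokenizer.model.v3", "params.json", "consolidated.safetensors")


def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in expected if not (model_dir / name).is_file()]


if __name__ == "__main__":
    model_dir = Path.home() / "mistral_7b_instruct_v3"
    missing = missing_model_files(model_dir)
    if missing:
        print(f"Missing from {model_dir}: {', '.join(missing)}")
    else:
        print(f"All expected files found in {model_dir}")
```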

## Import libraries and load the model

In [1]:
import os
import random
import torch

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from rouge_score import rouge_scorer

from mistral_inference.model import Transformer
from mistral_inference.generate import generate

from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

# Select which GPU to use on this Leonardo node. PCI_BUS_ID ordering makes
# the device indices match those shown by nvidia-smi.
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
# Alternative: expose only one GPU to the process (set before CUDA is initialized).
#os.environ["CUDA_VISIBLE_DEVICES"] = "2"
torch.cuda.set_device(2)

# Load the tokenizer.
model_dir = os.path.join(os.path.expanduser("~"), "mistral_7b_instruct_v3")
mistral_tokenizer = MistralTokenizer.from_file(os.path.join(model_dir, "tokenizer.model.v3"))

# Load the model.
model = Transformer.from_folder(model_dir)
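
The commented-out `CUDA_VISIBLE_DEVICES` line is an alternative to `torch.cuda.set_device`. A minimal sketch of that route follows, assuming the variable is set before CUDA is initialized in the process. `restrict_to_gpu` is a hypothetical helper. With only one GPU visible, the process addresses it as device 0.

```python
import os


def restrict_to_gpu(index):
    """Expose a single physical GPU to this process via environment variables.

    Must run before CUDA is initialized (in practice, before any CUDA call
    from torch); afterwards the chosen GPU is addressed as device 0.
    """
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # same ordering as nvidia-smi
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)


restrict_to_gpu(2)
print(os.environ["CUDA_VISIBLE_DEVICES"])
```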

In [2]:
!nvidia-smi

Thu Jun 20 01:18:21 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  NVIDIA A100-SXM-64GB            On | 00000000:1D:00.0 Off |                    0 |
| N/A   42C    P0               60W / 461W|      3MiB / 65536MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM-64GB            On | 00000000:56:0

## Generate the Python code to perform the task

In [3]:
prompt = """
Write the Python code to convert the following JSON FILE using the following JSON SCHEMA:

JSON FILE:

{
    "channel": {
        "id": 1293177,
        "name": "San Diego - Estaci\u00f3n Meteorol\u00f3gica",
        "description": "San Diego, Cerro Largo, Uruguay\r\nEstaci\u00f3n Meteorol\u00f3gica Solar\r\n(Temp, Hum, Presion, Lluvia, Viento).\r\nESP8266, UNO R3, BME 680\r\nUpdate Interval - 15 seg\r\nhttps://clima.santiago.ovh/",
        "latitude": "-31.9939484",
        "longitude": "-53.9575388",
        "field1": "Temperatura C\u00b0",
        "field2": "Humedad %",
        "field3": "Pres. Atmosf\u00e9rica (hPa)",
        "field4": "Gas",
        "field5": "Viento",
        "field6": "Precipitaci\u00f3n (mm)",
        "field7": "Direcci\u00f3n del Viento",
        "field8": "UV",
        "created_at": "2021-01-30T16:32:32Z",
        "updated_at": "2024-06-18T14:05:34Z",
        "elevation": "136",
        "last_entry_id": 4502987
    },
    "feeds": [
        {
            "created_at": "2024-06-18T13:33:39Z",
            "entry_id": 4502888,
            "field1": "17.73",
            "field2": "93.32",
            "field3": "996.29",
            "field4": "103.33",
            "field5": "0.00",
            "field6": "0",
            "field7": "0",
            "field8": "0.74"
        }
    ]
}

JSON SCHEMA:

{
    "type": "object",
    "properties": {
      "sensorId": {
        "type": "string"
      },
      "timestamp": {
        "type": "string",
        "format": "date-time"
      },
      "temperature": {
        "type": "number"
      },
      "unit": {
        "type": "string",
        "enum": ["Celsius", "Fahrenheit"]
      }
    },
    "required": ["sensorId", "timestamp", "temperature", "unit"]
  }



"""
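
Since the prompt is an instruction followed by a JSON file and a JSON Schema, it can be assembled from a template when the same request has to be repeated for other inputs. A minimal sketch, where `build_prompt` and its arguments are hypothetical names and the wording mirrors the prompt above:

```python
import json

PROMPT_TEMPLATE = """
Write the Python code to convert the following JSON FILE using the following JSON SCHEMA:

JSON FILE:

{json_file}

JSON SCHEMA:

{json_schema}
"""


def build_prompt(source, schema):
    """Fill the template with pretty-printed JSON for the source data and target schema."""
    return PROMPT_TEMPLATE.format(
        json_file=json.dumps(source, indent=4),
        json_schema=json.dumps(schema, indent=4),
    )


# Small illustrative inputs, not the full channel data used in this notebook.
example_source = {"feeds": [{"created_at": "2024-06-18T13:33:39Z", "field1": "17.73"}]}
example_schema = {"type": "object", "required": ["timestamp", "temperature"]}
print(build_prompt(example_source, example_schema))
```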

In [4]:
# chat completion request
completion_request = ChatCompletionRequest(messages=[UserMessage(content=prompt)])
# encode message
tokens = mistral_tokenizer.encode_chat_completion(completion_request).tokens
# generate results
out_tokens, _ = generate([tokens], model, max_tokens=5000, temperature=0.0, eos_id=mistral_tokenizer.instruct_tokenizer.tokenizer.eos_id)
# decode generated tokens
result = mistral_tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0])
print(result+"\n\n")

To convert the given JSON file to the format specified by the JSON schema, we need to create a new JSON object that adheres to the schema's structure. Here's the Python code to achieve this:

```python
import json
import datetime

# Given JSON data
data = """
{
    "channel": {
        "id": 1293177,
        "name": "San Diego - Estación Meteorológica",
        "description": "...",
        "latitude": "-31.9939484",
        "longitude": "-53.9575388",
        "field1": "Temperatura C°",
        "field2": "Humedad %",
        "field3": "Pres. Atmosférica (hPa)",
        "field4": "Gas",
        "field5": "Viento",
        "field6": "Precipitación (mm)",
        "field7": "Dirección del Viento",
        "field8": "UV",
        "created_at": "2021-01-30T16:32:32Z",
        "updated_at": "2024-06-18T14:05:34Z",
        "elevation": "136",
        "last_entry_id": 4502987
    },
    "feeds": [
        {
            "created_at": "2024-06-18T13:33:39Z",
            "entry_id": 4502888,
```

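The reply mixes prose with a fenced Python block, so the code has to be separated out before it can be run in its own cell. One possible way to do that is sketched below. `extract_python_block` is a hypothetical helper, and the sketch assumes the block of interest is the first one in the reply and is tagged as `python`.

```python
import re

FENCE = "`" * 3  # triple backtick, built this way so the example stays fence-safe

# Capture everything between an opening python fence and the closing fence,
# or the end of the text if the reply was cut off before the fence closed.
_CODE_RE = re.compile(
    re.escape(FENCE) + r"python[ \t]*\n(.*?)(?:" + re.escape(FENCE) + r"|\Z)",
    re.DOTALL,
)


def extract_python_block(reply):
    """Return the first python-fenced block in reply, or None if there is none."""
    match = _CODE_RE.search(reply)
    return match.group(1).rstrip() if match else None


reply = f"Here's the code:\n\n{FENCE}python\nimport json\nprint(json.dumps({{}}))\n{FENCE}\n\nDone."
print(extract_python_block(reply))
```
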
In [5]:
import json
import datetime

# Given JSON data
data = """
{
    "channel": {
        "id": 1293177,
        "name": "San Diego - Estación Meteorológica",
        "description": "...",
        "latitude": "-31.9939484",
        "longitude": "-53.9575388",
        "field1": "Temperatura C°",
        "field2": "Humedad %",
        "field3": "Pres. Atmosférica (hPa)",
        "field4": "Gas",
        "field5": "Viento",
        "field6": "Precipitación (mm)",
        "field7": "Dirección del Viento",
        "field8": "UV",
        "created_at": "2021-01-30T16:32:32Z",
        "updated_at": "2024-06-18T14:05:34Z",
        "elevation": "136",
        "last_entry_id": 4502987
    },
    "feeds": [
        {
            "created_at": "2024-06-18T13:33:39Z",
            "entry_id": 4502888,
            "field1": "17.73",
            "field2": "93.32",
            "field3": "996.29",
            "field4": "103.33",
            "field5": "0.00",
            "field6": "0",
            "field7": "0",
            "field8": "0.74"
        }
    ]
}
"""

# Parse the given JSON data
json_data = json.loads(data)

# Extract the feed data
feed_data = json_data["feeds"][0]

# Convert the timestamp to datetime object
feed_timestamp = datetime.datetime.strptime(feed_data["created_at"], "%Y-%m-%dT%H:%M:%SZ")

# Create a new JSON object based on the schema
new_json = {
    "sensorId": json_data["channel"]["id"],
    "timestamp": feed_timestamp.strftime("%Y-%m-%dT%H:%M:%S"),
    "temperature": float(feed_data["field1"]),
    "unit": "Celsius"  # Assuming the temperature is in Celsius
}

# Print the new JSON object
print(json.dumps(new_json))

{"sensorId": 1293177, "timestamp": "2024-06-18T13:33:39", "temperature": 17.73, "unit": "Celsius"}
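
The printed object can be checked against the target schema before the generated parser is trusted. The sketch below hand-codes the relevant checks with the standard library instead of using a JSON Schema validator, so it covers only the keywords that appear in this schema. `check_reading` is a hypothetical helper. Applied to the output above, it reports that `sensorId` is an integer where the schema asks for a string, and that the timestamp lacks the UTC offset that the `date-time` format (RFC 3339) expects.

```python
import json
from datetime import datetime


def check_reading(reading):
    """Return a list of problems found when comparing reading with the target schema."""
    problems = []
    for key in ("sensorId", "timestamp", "temperature", "unit"):
        if key not in reading:
            problems.append(f"missing required field: {key}")

    if "sensorId" in reading and not isinstance(reading["sensorId"], str):
        problems.append("sensorId must be a string")

    if "temperature" in reading:
        value = reading["temperature"]
        # bool is a subclass of int in Python but is not a JSON number.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append("temperature must be a number")

    if "unit" in reading and reading["unit"] not in ("Celsius", "Fahrenheit"):
        problems.append("unit must be Celsius or Fahrenheit")

    if "timestamp" in reading:
        ts = reading["timestamp"]
        try:
            # Accept a trailing Z by rewriting it as an explicit UTC offset.
            parsed = datetime.fromisoformat(ts.replace("Z", "+00:00"))
        except (AttributeError, ValueError):
            problems.append("timestamp is not a valid date-time string")
        else:
            if parsed.tzinfo is None:
                problems.append("timestamp has no UTC offset")
    return problems


# The object printed by the generated code above.
generated = json.loads(
    '{"sensorId": 1293177, "timestamp": "2024-06-18T13:33:39", '
    '"temperature": 17.73, "unit": "Celsius"}'
)
for problem in check_reading(generated):
    print(problem)
```

Casting the channel id with `str(...)` and keeping the trailing `Z` in the timestamp would be enough to clear both messages.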
