# Classification - Fine-tuning Mistral-7b
In this demo notebook, we demonstrate how to use the SageMaker Python SDK to fine-tune and deploy the [Mistral 7B](mistralai/Mistral-7B-v0.1) model for intent classification.

This notebook has been tested using the Data Science 3.0 kernel.

First, we import the necessary utilities and set the SageMaker inference instance type. For this demonstration, we select the Mistral 7B model, version 2.*.


In [None]:
import utils
from sagemaker.jumpstart.estimator import JumpStartEstimator

# Autoreload the utils module so that ongoing changes to the file are picked up
%load_ext autoreload
%autoreload 1
%aimport utils

inference_instance_type = "ml.g5.2xlarge"
model_id, model_version = "huggingface-llm-mistral-7b", "2.*"
base_endpoint_name = model_id


Next, we initialize a base predictor using the `utils` module. This predictor will be used to evaluate the model's performance before and after fine-tuning.

In [None]:
base_predictor = utils.get_predictor(
    endpoint_name=base_endpoint_name,
    model_id=model_id,
    model_version=model_version,
    inference_instance_type=inference_instance_type,
)

Let's test our deployed endpoint.

In [None]:
query = "Your task is to write the name of the city for a country. ONLY write the city name, not anything else.\n\nWhat is the capital of France?\nResponse:"
response = utils.mistral(base_predictor, user=query, max_tokens=3)
print(utils.parse_output(response))
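
The `utils.mistral` and `utils.parse_output` helpers are not shown in this notebook. As a rough sketch of what such helpers might do, assume the endpoint follows the common Hugging Face text-generation convention: a request with `inputs` plus a `parameters` dict, and a response shaped like a list of `{"generated_text": ...}` items. The function names `build_payload` and `extract_text` below are hypothetical, and the actual `utils` implementation may differ.

```python
from typing import Any, Dict


def build_payload(prompt: str, max_tokens: int = 3) -> Dict[str, Any]:
    """Build a text-generation request body (assumed payload shape)."""
    return {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_tokens},
    }


def extract_text(response: Any) -> str:
    """Return the generated text from a list- or dict-shaped response."""
    # Some endpoints wrap the result in a single-element list.
    item = response[0] if isinstance(response, list) else response
    return item["generated_text"].strip()


# Hand-written stand-in for an endpoint response, used only to exercise the parser.
fake_response = [{"generated_text": " Paris\n"}]
payload = build_payload("What is the capital of France?\nResponse:")
print(payload["parameters"])
print(extract_text(fake_response))
```

Keeping `max_tokens` small, as in the cell above, limits the model to a short answer and makes the output easier to parse.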

Next, we specify the file paths for our intent dataset, which includes separate files for training and testing, and a template file for fine-tuning. We assume that the template file was already created in the previous step, when we fine-tuned the FlanT5-XL model.

In [None]:
intent_dataset_file = "data/intent_dataset.jsonl"
intent_dataset_train_file = "data/intent_dataset_train.jsonl"
intent_dataset_test_file = "data/intent_dataset_test.jsonl"
ft_template_file = "data/template.json"
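
For reference, JumpStart instruction tuning typically pairs a JSON Lines training file with a `template.json` that contains `prompt` and `completion` entries whose placeholders match the keys of each training record. The sketch below illustrates that relationship with made-up field names (`query`, `intent`), a made-up label, and made-up prompt wording; the template created in the earlier FlanT5-XL step may use different keys and text.

```python
import json

# Hypothetical template: placeholders must match the keys in each record.
template = {
    "prompt": "Classify the intent of the customer query.\n\nQuery: {query}\nIntent:",
    "completion": " {intent}",
}

# Hypothetical training record, as it would appear on one line of a .jsonl file.
record_line = json.dumps(
    {
        "query": "How do I reset my portal password?",
        "intent": "technical_support:password_reset",
    }
)

# Fill the template with the record to see what one training example looks like.
record = json.loads(record_line)
prompt = template["prompt"].format(**record)
completion = template["completion"].format(**record)

print(prompt)
print(completion)
```

Rendering one record this way is a quick check that every placeholder in the template has a matching key in the data before a training job is launched.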

Upload the training data and template files to S3.

In [None]:
train_data_location = utils.upload_train_and_template_to_s3(
    bucket_prefix="intent_dataset_mistral",
    train_path=intent_dataset_train_file,
    template_path=ft_template_file,
)

Fine-tune the model for 5 epochs, the same number of epochs used to fine-tune the FlanT5-XL model.

In [None]:
estimator = JumpStartEstimator(
    model_id=model_id,
    disable_output_compression=True,
    instance_type="ml.g5.24xlarge",
    role=utils.get_role_arn(),
)

# Instruction tuning is disabled by default; set instruction_tuned="True" to train on an instruction-tuning dataset
estimator.set_hyperparameters(
    instruction_tuned="True", epoch="5", max_input_length="1024"
)
estimator.fit({"training": train_data_location})

Deploy the fine-tuned model.

In [None]:
finetuned_endpoint_name = "mistral-7b-ft-infoext"
finetuned_model_name = finetuned_endpoint_name

In [None]:
finetuned_predictor = estimator.deploy(
    endpoint_name=finetuned_endpoint_name,
    model_name=finetuned_model_name,
)

Test the fine-tuned model with ambiguous queries.

In [None]:
ambiguous_queries = [
    {
        "query": "I want to change my coverage plan. But I'm not seeing where to do this on the online site. Could you please show me how?",
        "main_intent": "techincal_support",
        "sub_intent": "portal_navigation",
    },
    {
        "query": "I'm unhappy with the current benefits of my plan and I'm considering canceling unless there are better alternatives. What can you offer?",
        "main_intent": "customer_retention",
        "sub_intent": "free_product_upgrade",
    },
]
for query in ambiguous_queries:
    question = query["query"]
    print("query:", question, "\n")
    print(
        "expected intent:  ", f"{query['main_intent']}:{query['sub_intent']}"
    )

    prompt = utils.FT_PROMPT.format(query=question)
    response = utils.mistral(base_predictor, user=prompt, max_tokens=13)
    print("base model:  ", utils.parse_output(response))

    response = utils.mistral(finetuned_predictor, user=prompt, max_tokens=13)
    print("finetuned model:  ", utils.parse_output(response))
    print("-" * 80)

As we can see, the fine-tuned model classifies the ambiguous queries correctly.
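
The expected labels above take the form `main_intent:sub_intent`, while raw generated text can carry extra whitespace or trailing tokens. The following is a minimal sketch of how such output might be normalized into a pair of labels. The function `split_intent` is a hypothetical stand-in, not the implementation of `utils.mistral_output_intent_formatter`, which is not shown here.

```python
import re
from typing import Optional, Tuple

# Matches "main_intent:sub_intent" where both parts are lowercase snake_case labels.
INTENT_PATTERN = re.compile(r"([a-z_]+)\s*:\s*([a-z_]+)")


def split_intent(raw_output: str) -> Optional[Tuple[str, str]]:
    """Extract (main_intent, sub_intent) from raw model text, or None if absent."""
    match = INTENT_PATTERN.search(raw_output.strip().lower())
    if match is None:
        return None
    return match.group(1), match.group(2)


# Made-up raw outputs, used only to exercise the parser.
print(split_intent(" customer_retention:free_product_upgrade\n\nQuery"))
print(split_intent("I am not sure."))
```

Returning `None` for unparseable text lets a caller count such outputs as misclassifications instead of failing.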

Finally, we evaluate the fine-tuned model's performance on the test dataset.

In [None]:
test_dataset = utils.load_dataset(intent_dataset_test_file)

res = utils.evaluate_model(
    predictor=finetuned_predictor,
    llm=utils.mistral,
    dataset=test_dataset,
    prompt_template=utils.FT_PROMPT,
    response_formatter=utils.mistral_output_intent_formatter,
)
utils.print_eval_result(res, test_dataset)

As the evaluation results show, the fine-tuned model again yields significantly better results than in-context learning alone.
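
The metrics computed by `utils.evaluate_model` are not spelled out in this notebook. One plausible way to score two-level intent predictions, sketched here under that assumption, is to report accuracy separately for the main intent, the sub-intent, and the exact pair. The function `intent_accuracy` and the toy labels are hypothetical and are not results from this notebook.

```python
from typing import Dict, List, Tuple


def intent_accuracy(
    predictions: List[Tuple[str, str]], expected: List[Tuple[str, str]]
) -> Dict[str, float]:
    """Compute main-intent, sub-intent, and exact-match accuracy."""
    if len(predictions) != len(expected) or not expected:
        raise ValueError("predictions and expected must be non-empty and equal in length")
    total = len(expected)
    main_hits = sum(p[0] == e[0] for p, e in zip(predictions, expected))
    sub_hits = sum(p[1] == e[1] for p, e in zip(predictions, expected))
    exact_hits = sum(p == e for p, e in zip(predictions, expected))
    return {
        "main_intent": main_hits / total,
        "sub_intent": sub_hits / total,
        "exact_match": exact_hits / total,
    }


# Toy labels for illustration only; these are not results from the notebook.
expected = [("a", "x"), ("a", "y"), ("b", "z"), ("b", "w")]
predicted = [("a", "x"), ("a", "x"), ("b", "z"), ("c", "w")]
print(intent_accuracy(predicted, expected))
```

Reporting the three numbers side by side shows whether errors come from the coarse main intent or from the finer sub-intent.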
