# RabbitMQ worker - Finetune Job Sender

We have three use cases in our app for the queues:
1. The API triggers the creation of a new job. This will likely mean opening a new connection to RabbitMQ, sending a message to the queue, and then closing the connection.
2. The first worker will listen for a message, do a job, and then post to another queue.
3. The second worker will only listen for a message.

Use cases 2 and 3 can both use blocking connections, which stay open and keep listening for messages.

For refactoring, we can split this into the following building blocks:
* A base that creates a connection to RabbitMQ, creates a channel, and ensures the right queues are available.
* A component for the API to send a new message.
* A component for the first worker that listens for a message, does a job, and posts to another queue.
* A component for the second worker that listens for a message and then does a job.

We will have two queues:
* `start_fetch` - to trigger a new fetch job
* `data_processing` - to trigger data processing on the fetched data
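
The first worker described above (consume from `start_fetch`, do a job, post to `data_processing`) could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual worker: `run_fetch_job` and the `status` field are hypothetical placeholders, and the callback assumes pika's usual `(channel, method, properties, body)` consumer signature. The channel is passed in, so nothing here opens a connection by itself.

```python
import json

FETCH_QUEUE = "start_fetch"
PROCESSING_QUEUE = "data_processing"


def run_fetch_job(job):
    """Hypothetical placeholder for the real work done by the first worker."""
    return {"job_id": job.get("job_id"), "status": "fetched"}


def on_start_fetch(channel, method, properties, body):
    """Consumer callback: do the job, forward the result, then acknowledge."""
    job = json.loads(body)
    result = run_fetch_job(job)
    channel.basic_publish(
        exchange="",
        routing_key=PROCESSING_QUEUE,
        body=json.dumps(result),
    )
    # Acknowledge only after the follow-up message has been published.
    channel.basic_ack(delivery_tag=method.delivery_tag)


def run_first_worker(channel):
    """Block and consume from start_fetch; `channel` is a pika BlockingChannel."""
    channel.basic_consume(queue=FETCH_QUEUE, on_message_callback=on_start_fetch)
    channel.start_consuming()
```

The second worker would likely follow the same shape, with a callback that does its job and acknowledges the message without publishing anything further.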

## Base - Our connection building block

In [11]:
import pika


def open_channel():
    """Open a connection and a channel, declare both queues, and return the channel."""
    credentials = pika.PlainCredentials("DEV_USER", "CHANGE_ME")
    connection = pika.BlockingConnection(
        pika.ConnectionParameters(host="localhost", credentials=credentials)
    )
    channel = connection.channel()
    channel.queue_declare(queue="start_fetch", durable=True)
    channel.queue_declare(queue="data_processing", durable=True)
    return channel
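
Because `open_channel` returns only the channel, the caller has to reach the underlying connection through the channel to close it. One possible way to make that cleanup automatic is a context manager. The sketch below is an assumption, not part of the original notebook: `managed_channel` is a hypothetical helper, it receives the opener as an argument (for example `open_channel`), and it relies on the `connection` property of pika's `BlockingChannel`.

```python
from contextlib import contextmanager


@contextmanager
def managed_channel(opener):
    """Yield a channel from `opener()` and always close its connection."""
    channel = opener()
    try:
        yield channel
    finally:
        # Closing the connection also closes any channels opened on it.
        channel.connection.close()
```

With this in place, a sender could be written as `with managed_channel(open_channel) as channel: ...`, and the connection would be closed even if publishing raises an exception.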

## Component - The API sends to start worker 1

In [27]:
import json

finetune_data_sample = """
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}
"""
# file=open("training_data.jsonl", "rb"),
message = {
    "auth_code": "fabc123",
    "job_id": "fabc123",
    "task": "finetune",
    "data_content": finetune_data_sample,
    "data_format": "alpaca",
    "upload_files": "",
}
payload_json_string = json.dumps(message)
payload_json_string

'{"auth_code": "fabc123", "job_id": "fabc123", "task": "finetune", "data_content": "\\n{\\"messages\\": [{\\"role\\": \\"system\\", \\"content\\": \\"Marv is a factual chatbot that is also sarcastic.\\"}, {\\"role\\": \\"user\\", \\"content\\": \\"What\'s the capital of France?\\"}, {\\"role\\": \\"assistant\\", \\"content\\": \\"Paris, as if everyone doesn\'t know that already.\\"}]}\\n", "data_format": "alpaca", "upload_files": ""}'

In [29]:
QUEUE = "start_fetch"


def start_first_job():
    """Publish the job message that starts the first worker."""

    # Get a new channel from the base
    channel = open_channel()

    # Send the JSON payload to the queue
    channel.basic_publish(exchange="", routing_key=QUEUE, body=payload_json_string)

    print("Message sent to queue: {}".format(QUEUE))
    print(f"Payload: {message}")

    # Close cleanly: the channel first, then the connection it belongs to
    channel.close()
    channel.connection.close()


start_first_job()

Message sent to queue: start_fetch
Payload: {'auth_code': 'fabc123', 'job_id': 'fabc123', 'task': 'finetune', 'data_content': '\n{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What\'s the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn\'t know that already."}]}\n', 'data_format': 'alpaca', 'upload_files': ''}


## Steps to Fine-Tune ChatGPT

### Step 1: Define the Use Case

Defining the use case is the first step in fine-tuning ChatGPT. To get the most out of fine-tuning, it helps to have a firm grasp of the precise goal you want to achieve. ChatGPT can be fine-tuned for a variety of tasks, including sentiment analysis, language translation, and question answering.

### Step 2: Collect and Preprocess Data

Once the use case has been established, data collection and preparation can begin. The success of fine-tuning depends heavily on the quality and quantity of the data collected. The data must be accurate and suitable for the intended purpose. It is also crucial to clean, normalize, and tokenize the data before training.
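
As one illustration of the cleaning step, the sketch below filters records in the chat-style JSON Lines format used by the sample payload earlier in this notebook. The function name `clean_records` and the specific rules (the allowed roles, whitespace stripping, dropping malformed lines) are assumptions chosen for illustration, not a prescribed pipeline.

```python
import json

# Assumed set of valid roles, taken from the sample payload's format.
ALLOWED_ROLES = ("system", "user", "assistant")


def _normalize_message(msg):
    """Return a cleaned message dict, or None if the message is malformed."""
    if not isinstance(msg, dict):
        return None
    role, content = msg.get("role"), msg.get("content")
    if role not in ALLOWED_ROLES or not isinstance(content, str):
        return None
    return {"role": role, "content": content.strip()}


def clean_records(lines):
    """Parse JSONL lines and keep only well-formed chat records."""
    cleaned = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            continue  # skip records without a non-empty message list
        normalized = [_normalize_message(m) for m in messages]
        if any(m is None for m in normalized):
            continue  # skip records containing a malformed message
        cleaned.append({"messages": normalized})
    return cleaned
```

Dropping bad records silently keeps the sketch short; a real pipeline would probably log or count the rejected lines so that data problems stay visible.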

### Step 3: Prepare Data for Training

After collection and preprocessing, the next phase is to get the data ready for training. This means splitting the data into three distinct sets: training, validation, and test. The model is trained on the training set, checked against the validation set during training, and finally evaluated on the test set to determine how well it performs.
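
The three-way split can be sketched with the standard library alone. The function name `split_dataset`, the 80/10/10 default proportions, and the fixed seed are assumptions made for illustration; the right proportions depend on how much data is available.

```python
import random


def split_dataset(records, val_fraction=0.1, test_fraction=0.1, seed=42):
    """Shuffle records and split them into train, validation, and test lists."""
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")
    shuffled = list(records)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)
    validation = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, validation, test
```

Shuffling before slicing avoids a split that mirrors the order in which the data was collected, and the three lists never overlap because they are disjoint slices of the same shuffled list.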

### Step 4: Fine-Tune the Model

Once the data has been cleaned and organized, the next phase is to fine-tune the model. This means continuing to train the pre-trained ChatGPT model on the application-specific data. During training, backpropagation is used to update the model's weights so as to reduce the error between the expected and actual outputs.

### Step 5: Evaluate the Model

Once the model has been fine-tuned, its performance can be assessed on the test set. Accuracy, precision, recall, and the F1 score are some of the metrics that can be used for this purpose. If the model's performance falls short of expectations, it may be necessary to go back and adjust the use case definition, data collection, data preparation, or model hyperparameters.
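
For a binary classification use case such as sentiment analysis, the four metrics named above can be computed directly from the true and predicted labels. This is a minimal sketch: the function name `classification_metrics` and the convention of returning 0.0 when a denominator is zero are assumptions, and generative tasks would need different evaluation methods.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary task."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    # Fall back to 0.0 when a metric is undefined (zero denominator).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Precision and recall are reported alongside accuracy because accuracy alone can look good on imbalanced data even when the positive class is rarely predicted correctly.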

### Step 6: Deploy the Model

The last step is deployment, which happens once the model has been fine-tuned and its performance is deemed adequate. This is the process of integrating the model into its production environment. Building a RESTful API or a web app that lets users interact with the model is one such approach.