# Let's get started with RunGPU

RunGPU is a serverless platform that gives you immediate access to GPUs in the cloud.

RunGPU is simple to use: you don't have to set up any machines or configure any pods or containers.
You can start finetuning your model right away with a few lines of code.

A few basics: 

1. Pick a dataset of your choice. 
2. Pick any model from Hugging Face.
3. Build your own finetuning configuration or choose from one of our templates!

We provide sample code to help you get started with popular models, including Mistral, Llama 3, and Gemma.

You can request your client ID and client secret by sending an email to contact@rungpu.ai.

Let's start by setting your credentials and importing the library.



## Create a Client object for Authentication. 


In [None]:
client_id = "<Your RunGPU client ID>"
client_secret = "<Your RunGPU client secret>"

In [None]:
from rungpu import Client

client = Client(client_id, client_secret)
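Hard-coding secrets in a notebook makes them easy to leak. As a minimal sketch, assuming you export the credentials as environment variables, you could load them like this. The variable names `RUNGPU_CLIENT_ID` and `RUNGPU_CLIENT_SECRET` and the helper `load_credentials` are hypothetical; RunGPU does not define them.

```python
import os


def load_credentials(env=None):
    """Return (client_id, client_secret) read from environment variables.

    The variable names below are hypothetical; RunGPU does not define them.
    """
    env = os.environ if env is None else env
    names = ("RUNGPU_CLIENT_ID", "RUNGPU_CLIENT_SECRET")
    # Collect every missing or empty variable so the error lists them all.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]
```

You could then call `client_id, client_secret = load_credentials()` before constructing `Client(client_id, client_secret)`.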

# Let's get started with creating your dataset. 

In this example, we will use a dataset to train a chatbot model so that it learns information specific to our use case.


But let's start with a simpler question:

#### What is a dataset and why do you need it? 

A dataset is a document or corpus of text containing the knowledge you want your chatbot to learn for a specific use case.

Here's what a sample dataset in JSONL format looks like (one JSON object per line, with no enclosing brackets or trailing commas):

```
{"prompt": "What is the capital of France?", "response": "Paris"}
{"prompt": "Who wrote 'To Kill a Mockingbird'?", "response": "Harper Lee"}
{"prompt": "What is the boiling point of water?", "response": "100°C or 212°F"}
{"prompt": "Who painted the Mona Lisa?", "response": "Leonardo da Vinci"}
{"prompt": "What is the largest planet in our solar system?", "response": "Jupiter"}
{"prompt": "What year did the Titanic sink?", "response": "1912"}
```
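A JSON Lines file holds one JSON object per line. The sketch below uses only the standard library to write prompt/response pairs in that form. The helper names `to_jsonl` and `write_jsonl` are illustrative and are not part of the RunGPU library.

```python
import json

# Example records in the same prompt/response shape as the sample above.
records = [
    {"prompt": "What is the capital of France?", "response": "Paris"},
    {"prompt": "Who painted the Mona Lisa?", "response": "Leonardo da Vinci"},
]


def to_jsonl(rows):
    """Serialize rows as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(row, ensure_ascii=False) + "\n" for row in rows)


def write_jsonl(path, rows):
    """Write rows to a UTF-8 encoded .jsonl file."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(to_jsonl(rows))
```

Calling `write_jsonl("train.jsonl", records)` would produce a file you can upload to your storage of choice; the file name is only an example.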

You can now create a dataset configuration. 
The config below shows how to load the dataset file from an S3 bucket:




In [None]:

config = {
    "config": {
        "type": "s3",
        "provider": "AWS",
        "env_auth": "false",
        "access_key_id": "<aws_access_key_id>",
        "secret_access_key": "<aws_secret_access_key>",
        "region": "<aws_region>",
        "src_file": "/<bucket_name>/test.jsonl",

    }
}
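If you build this config in several notebooks, a small helper can catch unfilled placeholders before the request is sent. The following is a sketch only: `build_s3_config` is a hypothetical helper, not part of the RunGPU library, and it simply reproduces the keys shown in the example above.

```python
def build_s3_config(bucket, key, region, access_key_id, secret_access_key):
    """Build a dataset config dict with the same keys as the S3 example above."""
    fields = {
        "bucket": bucket,
        "key": key,
        "region": region,
        "access_key_id": access_key_id,
        "secret_access_key": secret_access_key,
    }
    for name, value in fields.items():
        # Reject empty values and unfilled "<...>" template placeholders.
        if not value or (value.startswith("<") and value.endswith(">")):
            raise ValueError(f"{name} is not set")
    return {
        "config": {
            "type": "s3",
            "provider": "AWS",
            "env_auth": "false",
            "access_key_id": access_key_id,
            "secret_access_key": secret_access_key,
            "region": region,
            # Same "/<bucket_name>/<file>" layout as the example above.
            "src_file": f"/{bucket}/{key.lstrip('/')}",
        }
    }
```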



### Google Drive Option
If you don't have access to cloud storage and your dataset is stored locally,
you can get started by uploading the file to Google Drive
and creating a shareable link to it. The config for that looks like this:

```
config = {'config':
            {'type':'google_drive',
             'src_file':'<shareable_gdrive_link>'
             }
         }
```

In [2]:
config = {'config':
            {'type':'google_drive',
             'src_file':'<shareable_gdrive_link>'
             }
         }

In [None]:
from rungpu import Dataset

dataset = Dataset(client=client, config=config)

# Create the Dataset for finetuning. 
dataset_response = dataset.create_dataset()


# Your Dataset ID.

Running the code cell above returns a response containing the details of the dataset you just created.

The dataset is stored securely on our servers for the purpose of finetuning the model (it can be removed later).

The response object returned by `create_dataset()` contains the `dataset_id` unique to your dataset. The dataset ID looks something like the following:

`rungpu_dataset_<random_unique_id>`

You will pass this ID as the `dataset_id` argument when you create the finetuning job. You can read it from the response like this:

In [None]:
dataset_id = dataset_response['dataset_id']
dataset_id

'rungpu_dataset_ba33febd-47a2-4b3d-b818-f0ec64610d7c'
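Before starting a paid training job, it can be worth checking that the value you extracted really looks like a dataset ID. The sketch below assumes the unique part is a UUID, which is inferred from the single example shown here and is not a documented guarantee; `looks_like_dataset_id` is a hypothetical helper.

```python
import re

# Pattern inferred from the single example ID shown above; it is an assumption,
# not a documented guarantee of the ID format.
DATASET_ID_PATTERN = re.compile(
    r"^rungpu_dataset_"
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def looks_like_dataset_id(value):
    """Return True if value matches the assumed rungpu_dataset_<uuid> shape."""
    return isinstance(value, str) and DATASET_ID_PATTERN.match(value) is not None
```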

# Let's Create your model and start Finetuning!

Here are some examples of models available on Hugging Face that you can try out:

- Llama-3-8b
- Llama-3-8b-instruct
- Mistral-7b-instruct-v0.1
- Mistral-7b-instruct-v0.2
- Mistral-7b
- Gemma-2b
- Gemma-2B-Instruct
- Gemma-7b
- Gemma-7B-Instruct
- distilbert-distilgpt2
- Mistral-7b-instruct-v0.3
- Code-llama-13b-instruct
- Llama-2-7b
- Llama-2-7b-chat
- Llama-2-13b
- Llama-2-13b-chat
- Mixtral-8x7B-Instruct-v0.1
- Phi-2
- Zephyr-7b-beta


We will now finetune `distilbert/distilgpt2`, quantized to 8-bit.

Note: You will need the Hugging Face path of the model (here, `distilbert/distilgpt2`).

An example of a finetuning configuration is given below: 


In [None]:
# Here's your finetuning config. 
ft_config = {
    "base_model": "distilbert/distilgpt2",
    "quant": 8,
    "num_steps": 1000,
    "strategy": "lora",
    "checkpoint_steps": 100,
    "training_size": 1000,
    "peft_config": {
        "lora": {
            "r": 16,
            "alpha": 16,
            "target_modules": [
                "q_proj",
                "k_proj",
                "v_proj",
                "o_proj",
                "gate_proj",
                "up_proj",
                "down_proj",
                "lm_head"
            ],
            "bias": "none",
            "lora_dropout": 0.05
        }
    }
}
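Two quantities follow directly from a config like this. In standard LoRA, the low-rank update is scaled by `alpha / r`, and the step counts determine how many checkpoints to expect. The sketch below assumes RunGPU passes `r` and `alpha` through to LoRA unchanged and saves one checkpoint every `checkpoint_steps` steps; `summarize_ft_config` is a hypothetical helper.

```python
def summarize_ft_config(cfg):
    """Derive two quantities implied by a finetuning config like the one above."""
    lora = cfg["peft_config"]["lora"]
    return {
        # Standard LoRA scales the low-rank update by alpha / r.
        "lora_scaling": lora["alpha"] / lora["r"],
        # Assumption: one checkpoint is saved every `checkpoint_steps` steps.
        "expected_checkpoints": cfg["num_steps"] // cfg["checkpoint_steps"],
    }


# Trimmed copy of the config above, keeping only the fields used here.
example_config = {
    "num_steps": 1000,
    "checkpoint_steps": 100,
    "peft_config": {"lora": {"r": 16, "alpha": 16}},
}
summary = summarize_ft_config(example_config)
```

For the values above, the scaling factor is 1.0 and 10 checkpoints would be expected under these assumptions.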

In [None]:
from rungpu import Finetune

# Create the Finetune object using the client, the dataset ID, and the finetuning configuration from above.
finetune = Finetune(client, dataset_id=dataset_id, config=ft_config)

# Call run_finetune() to kick off the finetuning job.
response = finetune.run_finetune()
response


# The Finetune `run_id`

The `run_finetune()` function starts the model training. The response object contains a `run_id` which is the unique identifier for this specific finetuning job run.

You can use `run_id` to track the progress of the training.


In [None]:
run_id = response['run_id']
run_id

'distilbert-distilgpt2-8-bit-7bf888be-2475-11ef-9ec8-b8ca3a5c98fc-rungpu_dataset_ba33febd-47a2-4b3d-b818-f0ec64610d7c'

# Check the status of your run. 

Create a `Status` object with the `run_id` and call its `get_status` function to check the status of training.

In [None]:
from rungpu import Status
st_obj = Status(run_id)
status = st_obj.get_status()
status

{'RunID': 'distilbert-distilgpt2-8-bit-7bf888be-2475-11ef-9ec8-b8ca3a5c98fc-rungpu_dataset_ba33febd-47a2-4b3d-b818-f0ec64610d7c',
 'time_elapsed': '2024-06-07 12:36:45.661287',
 'RunStatus': {'command': 'Finetune',
  'status': 'COMPLETED',
  'phase': 'MODEL_TRAINED',
  'client_id': 'arun_prasad',
  'model_id': 'distilbert-distilgpt2-8-bit-7bf888be-2475-11ef-9ec8-b8ca3a5c98fc',
  'base_model': 'distilbert/distilgpt2',
  'dataset_id': 'rungpu_dataset_ba33febd-47a2-4b3d-b818-f0ec64610d7c',
  'run_start': '2024-06-07 12:27:28.235440',
  'run_end': '2024-06-07 12:36:45.661287',
  'export_start': '2024-06-07 12:28:21.299823',
  'export_end': '2024-06-07 12:36:45.645127',
  'quantization': 8,
  'strategy': 'lora',
  'checkpoint_steps': 10,
  'training_steps': 100,
  'training_split': 1000,
  'peft_config': {'lora': {'r': 16,
    'alpha': 16,
    'target_modules': ['q_proj',
     'k_proj',
     'v_proj',
     'o_proj',
     'gate_proj',
     'up_proj',
     'down_proj',
     'lm_head'],
    'b
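Rather than re-running the status cell by hand, you could poll until the job finishes. This is a minimal sketch: `wait_for_run` is a hypothetical helper that accepts any zero-argument callable returning a dict shaped like the output above (for example `st_obj.get_status`). `'COMPLETED'` is taken from that output, while `'FAILED'` is an assumed terminal state.

```python
import time


def wait_for_run(get_status, poll_seconds=30.0, max_polls=120, sleep=time.sleep):
    """Poll get_status() until the run reaches a terminal state.

    get_status is any zero-argument callable returning a dict shaped like
    the status output above.
    """
    # 'COMPLETED' appears in the output above; 'FAILED' is an assumed value.
    terminal_states = {"COMPLETED", "FAILED"}
    for _ in range(max_polls):
        status = get_status()
        state = status["RunStatus"]["status"]
        if state in terminal_states:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"Run did not finish after {max_polls} polls")
```

The `sleep` parameter exists so the wait can be replaced in tests; in a notebook you would typically call `wait_for_run(st_obj.get_status)`.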

In [None]:
st_obj.progress()


ps is given, it will override any value given in num_train_epochs
The following columns in the training set don't have a corresponding argument in `PeftModelForCausalLM.forward` and have been ignored: text. If text are not expected by `PeftModelForCausalLM.forward`,  you can safely ignore this message.
***** Running training *****
  Num examples = 1,738
  Num Epochs = 1
  Instantaneous batch size per device = 2
  Total train batch size (w. parallel, distributed & accumulation) = 8
  Gradient Accumulation steps = 4
  Total optimization steps = 100
  Number of trainable parameters = 816,416
Saving model checkpoint to /home/rungpu/models//distilbert-distilgpt2/8-bit/distilbert-distilgpt2-8-bit-7bf888be-2475-11ef-9ec8-b8ca3a5c98fc-rungpu_dataset_ba33febd-47a2-4b3d-b818-f0ec64610d7c/checkpoint-10
Saving model checkpoint to /home/rungpu/models//distilbert-distilgpt2/8-bit/distilbert-distilgpt2-8-bit-7bf888be-2475-11ef-9ec8-b8ca3a5c98fc-rungpu_dataset_ba33febd-47a2-4b3d-b818-f0ec64610d7c/chec
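The numbers in the training log are related by simple arithmetic. The sketch below reproduces the total batch size and estimates how much of the dataset the run covers. It assumes a single device, which the log does not state but which is consistent with 2 × 4 = 8; both function names are illustrative.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Total examples consumed per optimization step.

    num_devices=1 is an assumption; the log does not report the device count.
    """
    return per_device_batch * grad_accum_steps * num_devices


def epochs_covered(optimization_steps, total_batch_size, num_examples):
    """Fraction of the dataset seen after the given number of steps."""
    return optimization_steps * total_batch_size / num_examples


# Values taken from the log above.
total_batch = effective_batch_size(per_device_batch=2, grad_accum_steps=4)
coverage = epochs_covered(optimization_steps=100, total_batch_size=total_batch,
                          num_examples=1738)
```

With these values, `total_batch` is 8 and `coverage` is roughly 0.46, so 100 steps see a little under half of the 1,738 examples.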

# Let's check if our finetuned model has made it to the cloud

In [1]:
signed_url = finetune.get_model(run_id)
signed_url


{'status': 'RUNNING', 'message': 'Your job is still running, try again later'}
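While the job is running, `get_model` returns a status dict instead of a URL, so it helps to tell the two cases apart before trying to download anything. The sketch below assumes that a finished job returns the signed URL as a plain string; the source does not show the completed response, so treat that shape as a guess. `model_url_if_ready` is a hypothetical helper.

```python
def model_url_if_ready(response):
    """Return the signed URL if the model is ready, otherwise None."""
    # Shape seen in the output above while the job is still running.
    if isinstance(response, dict) and response.get("status") == "RUNNING":
        return None
    # Assumption: a finished job returns the signed URL as a plain string.
    if isinstance(response, str):
        return response
    raise ValueError(f"Unexpected get_model response: {response!r}")
```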
