3. Fine tuning your LLM

Installation and imports

pip install panml
# Import panml
from panml.models import ModelPack

Using open source models (instruct-based fine tuning)

Create model pack to load model from HuggingFace Hub. See model options in library supported models.

lm = ModelPack(model='google/flan-t5-base', source='huggingface')

Or if using GPU

lm = ModelPack(model='google/flan-t5-base', source='huggingface', model_args={'gpu': True})

Note: model_args can additionally take any {key: value} pairs that are compatible with the HuggingFace AutoModel... from_pretrained classmethod, where key is the parameter name and value is the parameter value to assign.
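
For example, a standard HuggingFace from_pretrained argument such as torch_dtype can be forwarded alongside the panml-specific gpu flag. This is an illustrative sketch only; whether a given argument applies depends on the underlying model.

# Forwarding a from_pretrained argument via model_args (illustrative example)
lm = ModelPack(
    model='google/flan-t5-base',
    source='huggingface',
    model_args={'gpu': True, 'torch_dtype': 'auto'}  # 'torch_dtype' is a standard from_pretrained argument
)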

Selecting other models from HuggingFace Hub:

# Examples
lm = ModelPack(model='gpt2', source='huggingface')
lm = ModelPack(model='gpt2-xl', source='huggingface')
lm = ModelPack(model='EleutherAI/gpt-j-6B', source='huggingface')
lm = ModelPack(model='StabilityAI/stablelm-tuned-alpha-7b', source='huggingface')
lm = ModelPack(model='tiiuae/falcon-7b-instruct', source='huggingface')
...

Applying LoRA (Low Rank Adapter) configuration for fine tuning
Note: compatible with certain models only

model_args = {
    'peft_lora': {
        'inference_mode': False, 
        'r': 8, 
        'lora_alpha': 32, 
        'lora_dropout': 0.1,
        'load': False,
    }
}
lm = ModelPack(model='google/flan-t5-base', source='huggingface', model_args=model_args)
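
For reference, these settings map onto the standard PEFT LoraConfig parameters: r is the rank of the low rank adapter matrices, lora_alpha is the scaling factor, lora_dropout is the dropout applied to the adapter layers, and inference_mode=False sets the adapter up for training. The load flag is panml-specific: False creates a new adapter for fine tuning, while True (shown further below) loads an already fine tuned adapter.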

Fine tune the model with your own data in a Pandas dataframe, executed in an instruct-based training regime.
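
For illustration, a hypothetical dataframe with input_text and target_text columns might look like the following (substitute your own data):

import pandas as pd

# Hypothetical instruction/response pairs - replace with your own dataset
df = pd.DataFrame({
    'input_text': [
        'Summarise: The cat sat on the mat all day and watched the birds.',
        'Translate to French: Good morning.',
    ],
    'target_text': [
        'A cat spent the day on a mat watching birds.',
        'Bonjour.',
    ],
})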

# Specify train args
train_args = {
    'title': 'my_tuned_flan_t5',
    'num_train_epochs' : 1,
    'mlm': False,
    'optimizer': 'adamw_torch',
    'per_device_train_batch_size': 10,
    'per_device_eval_batch_size': 10,
    'warmup_steps': 20,
    'weight_decay': 0.01,
    'logging_steps': 10,
    'output_dir': './results',
    'logging_dir': './logs',
    'save_model': True,
}

# Prepare data
x = df['input_text']
y = df['target_text']

# Train model
lm.fit(x, y, train_args, instruct=True)

The fine tuned model is saved in the local directory: ./results/model_my_tuned_flan_t5

Loading the model from the local directory:

lm = ModelPack(model='./results/model_my_tuned_flan_t5', source='local')

Loading a LoRA (Low Rank Adapter) fine tuned model:

model_args = {
    'peft_lora': {
        'load': True,
    }
}
lm = ModelPack(model='./results/model_my_tuned_flan_t5', source='local', model_args=model_args)

Saving the model to a local directory:

# Specify save directory
lm.save(save_dir='./my_new_model')

# Or if save directory is not provided, the default directory is: "./results/model_<model name>"
lm.save()

Generate output with the fine tuned model

output = lm.predict('What is the meaning of happiness?', display_probability=True)
print(output['text'])

Note: The evaluation metric can be accessed via the "evaluation_result" attribute

lm.evaluation_result

Using open source models (autoregressive fine tuning)

Note: some LLMs are very large and may be difficult to load into a local environment. Please consider the size of the LLM when fine tuning with limited resources.

Create model pack to load model from HuggingFace Hub. See model options in library supported models.

lm = ModelPack(model='gpt2', source='huggingface')

Fine tune the model with your own data from a Pandas dataframe, executed in a self-supervised autoregressive training regime. Note: the model is saved in the path "./results/model_<title>/"
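
For illustration, a hypothetical dataframe of raw text for self-supervised training might look like the following; the target is simply the input text itself:

import pandas as pd

# Hypothetical raw text corpus - replace with your own dataset
df = pd.DataFrame({
    'input_text': [
        'The old man walked slowly along the shore as the sun set.',
        'She opened the letter and began to read in silence.',
    ],
})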

# Specify train args
train_args = {
    'title': 'my_tuned_gpt2',
    'num_train_epochs' : 1,
    'mlm': False,
    'optimizer': 'adamw_torch',
    'per_device_train_batch_size': 10,
    'per_device_eval_batch_size': 10,
    'warmup_steps': 20,
    'weight_decay': 0.01,
    'logging_steps': 10,
    'output_dir': './results',
    'logging_dir': './logs',
    'save_model': True,
}

# Prepare data
x = df['input_text']
y = x

# Train model
lm.fit(x, y, train_args, instruct=False)

Generate output with the fine tuned model

output = lm.predict('The old man walked slowly', display_probability=True)
print(output['text'])

Training datasets

There are numerous training datasets available for LLM optimisation. See some of them in the list here.