Extension of lora.py for supervised ML of configurable dataset formats with YAML-based configuration of parameters #235
Conversation
Will need to sync with ml-explore#213 when merged into main
Fixed import path, iteration calculation, and creation of configuration namespace from YAML.
Removed as much model-specific code as possible to clearly separate training logic from model-related concerns (tokenizer, model structure, model loading, generation, etc.). Added additional parameters: max_tokens, tokens_per_eval, and temp (for generation). Added a --dataset-summary option, which provides a summary of the training data and does nothing else. Fixed tqdm status to track iterations. Incorporated the latest LoRA training from mlx-examples main.
Incorporating the latest from llms/mistral/*, passing it off to a model-agnostic supervised LoRA training module
Uses the supervised LoRA framework, implementing all the HF model-specific bits. Modules for other kinds of models can be added in the same way, as model-specific modules that use supervised_lora.py
Another pass at separating model-specific bits from training logic. Still keeping an eye on #213 to see if there is any synergy
Just pushed the (proposed) final version of #213. Take a look and let me know how I can help us utilize our changes together!
That would be fantastic! Sorry, I only just saw this. However, I see this comment from today and will be consolidating #337 merges into this PR. I still think a separate, configuration-based LoRA example would be handy, even just for the purposes of LoRA hacking on this framework. Later this week, I'm also planning on adding the writing of loss/validation structured data for plotting purposes (perhaps as a separate PR). Let me know your updated thoughts regarding this and #213
Amazing job @chimezie 🚀
Based on lora.py and meant to perform supervised instruction fine-tuning for the various supported models, separating the model-specific logic from the common training and supporting logic (and management of training configuration parameters).
It breaks out argument parameters into a YAML file and allows arbitrary training data (and prompt) formats
A configuration .yaml file (the only command line argument) is expected to be in the following format:
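A minimal sketch of the expected shape follows. The parameter names mirror lora.py's command-line arguments, but the specific entries, the epochs key name, and all values shown here are illustrative assumptions rather than the definitive schema:

```yaml
# Illustrative only: entries mirror lora.py's argument names; values are examples.
parameters:
  model: "mistralai/Mistral-7B-v0.1"   # assumed example value
  train: true
  data: "data/"
  lora_layers: 16
  batch_size: 4
  learning_rate: 1.0e-5
  epochs: 2        # key name assumed; -1 ignores this and falls back to iters
  iters: -1        # if the epoch setting is -1 and iters is -1, one epoch runs
  max_tokens: 100
  temp: 0.8
```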
Each entry under parameters corresponds to the default argument name of one of the command-line arguments originally provided to lora.py. The default values are the same as those of the original arguments they are based on.
An epoch parameter was added, which, if provided, determines the number of iterations (the number needed for a full pass over the data, i.e., an epoch). If its value is -1, it is ignored and the iters parameter is used as before. If iters is -1, then one epoch is performed.
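The interplay between the epoch setting and iters can be sketched as follows. This is a minimal illustration of the rule stated above; resolve_iters is a hypothetical helper, not a function from the actual module:

```python
import math

def resolve_iters(num_records: int, batch_size: int,
                  epochs: int, iters: int) -> int:
    """Hypothetical sketch: derive total training iterations.

    One epoch is a full pass over the data, i.e.
    ceil(num_records / batch_size) iterations.
    """
    iters_per_epoch = math.ceil(num_records / batch_size)
    if epochs == -1:
        # Epoch setting ignored; use iters as before,
        # or a single epoch when iters is also -1.
        return iters if iters != -1 else iters_per_epoch
    return epochs * iters_per_epoch
```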
A module for a particular prompt syntax or training dataset format can be defined with a class that specializes TrainingRecordHandler, providing an instance of it to main along with a function for getting model and tokenizer instances and another for generating from a given prompt using the tokenizer. Otherwise, the iterating batch implementation performs (Q)LoRA fine-tuning with configurable parameters.
It needs to define its own get_input and get_output methods, which take a Python dictionary and return an instruction input and output as strings, respectively, with the appropriate prompt formatting.
This allows arbitrary dataset JSON formats and prompt formats to be handled separately from the training logic.
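A handler for, say, Alpaca-style instruction records might look like the sketch below. Only the get_input/get_output contract comes from the description above; the class name, record keys, and [INST] prompt template are illustrative assumptions:

```python
class AlpacaRecordHandler:
    """Hypothetical sketch of a TrainingRecordHandler specialization
    for records shaped like {"instruction": ..., "output": ...}."""

    def get_input(self, record: dict) -> str:
        # Wrap the raw instruction in the model's expected prompt format
        # (Mistral-style [INST] tags assumed here).
        return f"[INST] {record['instruction']} [/INST]"

    def get_output(self, record: dict) -> str:
        # The completion the model is trained to produce.
        return record["output"]
```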
Will probably need to reconcile with #213 once that PR is merged into main
See mistral_supervised.py for an example.
Checklist
Put an x in the boxes that apply.