ModelTrainingLocal

License: CC BY-NC 4.0 (Attribution-NonCommercial 4.0 International), https://creativecommons.org/licenses/by-nc/4.0/

For training on online Jupyter notebooks, see: https://github.com/Dampish0/ModelTrainingLocal

Hello! This is the code for training on a local PC. Any model works as long as it runs on PyTorch.

Training data is here (there are 2 datasets, don't get them confused!): https://huggingface.co/datasets/Dampish/QuickTrain/tree/main

If you use my code, give credit.

I'll keep it short: you can change the variables in the JSON file called "Data.json".

- "huggingface_access_token" is the token used to push to huggingface.co. If left blank, nothing is pushed; otherwise fill it in.
- "model" is the actual model, either from huggingface.co or a local model of your choice; using a Hugging Face model is easier.
- "model_tokenizer" is the base model you are fine-tuning. You need the original tokenizer to avoid issues; this one is simple.
- "Data" is the actual training data.
- "valid_data" is the validation data.
- "out_dir" is the directory where you want the output and/or what it will be named. It is created automatically if it doesn't exist.
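For reference, here is a minimal sketch of what Data.json might look like and how a script could read it with Python's json module. Only the field names above come from this README; the values (model name, file names) are made-up assumptions.

```python
import json

# Hypothetical example of Data.json -- the field names match the list above,
# the values are only illustrative assumptions.
example_config = {
    "huggingface_access_token": "",               # blank = do not push to huggingface.co
    "model": "EleutherAI/pythia-160m",            # assumed example; any PyTorch model works
    "model_tokenizer": "EleutherAI/pythia-160m",  # tokenizer of the base model being fine-tuned
    "Data": "train.json",                         # training data
    "valid_data": "valid.json",                   # validation data
    "out_dir": "my-finetuned-model",              # created automatically if it doesn't exist
}

with open("Data.json", "w") as f:
    json.dump(example_config, f, indent=4)

# Reading the config back, the way a training script might:
with open("Data.json") as f:
    config = json.load(f)
print(config["model"], "->", config["out_dir"])
```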

The only parameters you need to worry about are gradient_accumulation_steps and learning_rate. epoch is basically how long you want to train; it's easier to limit training with max_step than with the epoch count. cutoff_len is not important: it only changes how long an instruction can be before it is cut off (some models have a max limit of 1024 tokens).


GRADIENT_ACCUMULATION_STEPS: you want this between 4 and 32; the more accumulation steps you use, the better the training tends to be.

batch_size and micro_batch_size are exactly the same thing in this case. This number dictates how fast each training step goes, but it also uses a lot more VRAM as you increase it; the higher the better if you ask me. Generally use 4 for CPU and 8 for GPU; a higher batch means higher (V)RAM usage. The script automatically uses the largest micro_batch_size it can, so you generally don't need to touch this number; trying to maximize GPU usage by hand can lead to crashes and pain. If set to 0, micro_batch_size is chosen automatically; otherwise your number is used.
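If you are unsure what gradient accumulation actually does, here is an illustrative PyTorch sketch (a toy loop, not this repo's actual training code): gradients from several micro-batches are summed before a single optimizer step, so the effective batch size is micro_batch_size * gradient_accumulation_steps.

```python
import torch
from torch import nn

# Toy model and random data, purely to illustrate gradient accumulation.
model = nn.Linear(16, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss_fn = nn.CrossEntropyLoss()

gradient_accumulation_steps = 8   # effective batch = micro_batch_size * this
micro_batch_size = 4

optimizer.zero_grad()
for step in range(64):
    x = torch.randn(micro_batch_size, 16)
    y = torch.randint(0, 2, (micro_batch_size,))
    loss = loss_fn(model(x), y)
    # Scale the loss so the accumulated gradient matches one big batch on average.
    (loss / gradient_accumulation_steps).backward()
    if (step + 1) % gradient_accumulation_steps == 0:
        optimizer.step()          # one real parameter update per accumulation cycle
        optimizer.zero_grad()
```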

LEARNING_RATE: should be between 2e-5 and 5e-5; it could be more or less depending on the model.

CUTOFF_LEN: the maximum length of an input string before it gets cut off.
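As a rough sketch of how a cutoff length is typically applied when tokenizing with Hugging Face tokenizers (the model name is an assumption, and the repo may apply CUTOFF_LEN in its own way):

```python
from transformers import AutoTokenizer

# Assumed example tokenizer; truncation=True + max_length caps the token count.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m")
cutoff_len = 1024

encoded = tokenizer(
    "Some very long instruction text ... " * 200,
    truncation=True,
    max_length=cutoff_len,
)
print(len(encoded["input_ids"]))  # never exceeds cutoff_len
```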

MAX_STEP: easier to use than epochs. It sets a maximum number of training steps and ignores the epoch count. For example, 1000 max steps could equal 7.61 epochs; the exact epoch equivalent depends entirely on your dataset and batch settings.
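The rough relationship between max_step and epochs is just steps per epoch = dataset size / effective batch size. The numbers below are made up for illustration, not taken from the repo:

```python
# Illustrative arithmetic only -- dataset size and batch settings are assumptions.
dataset_size = 4200
micro_batch_size = 4
gradient_accumulation_steps = 8

steps_per_epoch = dataset_size / (micro_batch_size * gradient_accumulation_steps)
max_step = 1000
print(f"{max_step} steps ~= {max_step / steps_per_epoch:.2f} epochs")
# With these made-up numbers, 1000 steps is about 7.6 epochs.
```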

I would rather you do NOT use the last variables, as they are hard to work with unless you know what you are doing. Only play with CPU_MODE. I'm too lazy to explain the rest, so good luck.

Thanks for using my code!
