From 1a17cf282f4102983c96fc0d2cb588546b4ec94f Mon Sep 17 00:00:00 2001
From: JJ Asghar
Date: Fri, 18 Oct 2024 15:31:46 -0500
Subject: [PATCH 1/3] update the commands for training and data generation

---
 docs/lab-4/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/lab-4/README.md b/docs/lab-4/README.md
index 7ec190f..eb9d0c6 100644
--- a/docs/lab-4/README.md
+++ b/docs/lab-4/README.md
@@ -271,7 +271,7 @@ ilab model download
 2) Next we need to generate the data, this is done with the following command:
 
 ```bash
-ilab data generate --pipeline full --model ~/.cache/instructlab/models/mistral-7b-instruct-v0.2.Q4_K_M.gguf --model-family mixtral
+ilab data generate --pipeline full --model ~/.cache/instructlab/models/merlinite-7b-lab-Q4_K_M.gguf --model-family mixtral
 ```
 
 This can take some time, take note of the time in the right hand corner, this is building 1000 questions off of your initial 15.
@@ -281,7 +281,7 @@ This takes the granite model, leverages the tokenized version of it, and runs th
 hopefully you can take a lunch break or something while this is running.
 
 ```bash
-ilab model train --pipeline full --effective-batch size 64 --is-padding-free false --device mps --max-batch-len 4000 --model-dir instructlab/granite-7b-lab --tokenizer-dir models/granite-7b-lab --model-name instructlab/granite-7b-lab
+ilab model train --pipeline full --effective-batch-size 64 --is-padding-free false --device mps --max-batch-len 4000 --model-dir instructlab/granite-7b-lab --tokenizer-dir models/granite-7b-lab --model-name instructlab/granite-7b-lab
 ```
 
 4) When this is completed, you'll need to test this model, which is the following command:

From 3baf2a436b5b57b64f3b9fbb65c4210f484abcfb Mon Sep 17 00:00:00 2001
From: JJ Asghar
Date: Fri, 18 Oct 2024 15:53:34 -0500
Subject: [PATCH 2/3] taking out the sanity check and adding a better one

---
 docs/lab-4/README.md | 17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/docs/lab-4/README.md b/docs/lab-4/README.md
index eb9d0c6..85caff9 100644
--- a/docs/lab-4/README.md
+++ b/docs/lab-4/README.md
@@ -4,15 +4,8 @@ Now that you've set up InstructLab, lets get tuning the Granite Model.
 
 ## Sanity check
 
-First thing you should do is verify you can talk to the Granite model, go ahead and run
-the following commands to verify you can.
-
-```bash
-cd instructlab
-source venv/bin/activate
-ilab model chat
-/q
-```
+Take a moment to verify that you are not running `ilab model chat` or `ilab model serve` anywhere,
+as they will clash with the following commands for training and tuning the model.
 
 The Granite family of foundation models span an increasing variety of modalities, including language, code, time series, and science (e.g., materials) - with much more to come. We're building them with transparency and with focus on fulfilling rigorous enterprise requirements that are emerging for AI. If you'd like to learn more about the models themselves and how we build them, check out Granite Models.
 
@@ -39,7 +32,7 @@ Knowledge in the taxonomy tree consists of a few more elements than skills:
 
 Format of the `qna.yaml`:
 
-- `version`: The chache verion of the qna.yaml file, this is the format of the file used for SDG. The value must be the number 3.
+- `version`: The cache version of the `qna.yaml` file; this is the format of the file used for SDG. The value must be the number 3.
 - `created_by`: Your GitHub username.
 - `domain`: Specify the category of the knowledge.
 - `seed_examples`: A collection of key/value entries.
@@ -271,7 +264,7 @@ ilab model download
 2) Next we need to generate the data, this is done with the following command:
 
 ```bash
-ilab data generate --pipeline full --model ~/.cache/instructlab/models/merlinite-7b-lab-Q4_K_M.gguf --model-family mixtral
+ilab data generate
 ```
 
 This can take some time, take note of the time in the right hand corner, this is building 1000 questions off of your initial 15.
@@ -281,7 +274,7 @@ This takes the granite model, leverages the tokenized version of it, and runs th
 hopefully you can take a lunch break or something while this is running.
 
 ```bash
-ilab model train --pipeline full --effective-batch-size 64 --is-padding-free false --device mps --max-batch-len 4000 --model-dir instructlab/granite-7b-lab --tokenizer-dir models/granite-7b-lab --model-name instructlab/granite-7b-lab
+ilab model train
 ```
 
 4) When this is completed, you'll need to test this model, which is the following command:

From 4f17957fa9f3f360a4b2089906a02fb1f53fb04f Mon Sep 17 00:00:00 2001
From: JJ Asghar
Date: Fri, 18 Oct 2024 16:00:13 -0500
Subject: [PATCH 3/3] converted to --simple

---
 docs/lab-4/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/lab-4/README.md b/docs/lab-4/README.md
index 85caff9..e395bf3 100644
--- a/docs/lab-4/README.md
+++ b/docs/lab-4/README.md
@@ -274,7 +274,7 @@ This takes the granite model, leverages the tokenized version of it, and runs th
 hopefully you can take a lunch break or something while this is running.
 
 ```bash
-ilab model train
+ilab model train --pipeline simple
 ```
 
 4) When this is completed, you'll need to test this model, which is the following command:
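
Note (outside the patches, ignored by `git am`): after this series is applied, lab 4 runs download, data generation, and training in that order with the simplified commands. The sketch below shows the resulting sequence; `run` is a hypothetical dry-run helper that echoes each command instead of executing it, since the real `ilab` steps need the InstructLab venv and can take hours.

```bash
# Dry-run sketch of the lab-4 workflow as it stands after PATCH 3/3.
# `run` is a hypothetical helper that prints each command rather than
# executing it, so the sequence can be inspected without ilab installed.
run() { printf '+ %s\n' "$*"; }

run ilab model download
run ilab data generate
run ilab model train --pipeline simple
# hypothetical final step: the README's own test command follows step 4
run ilab model chat
```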