
Add tutorial for Llama-3-8B lora training and deployment #9359

Merged
merged 19 commits into NVIDIA:main from dev/shashank-llama3-nb on Jun 7, 2024

Conversation

shashank3959
Collaborator

  • Adds a notebook for Llama-3-8B LoRA PEFT with NeMo FW
  • Adds a notebook for sending multi-LoRA inference requests to NVIDIA NIM
  • Adds a README that includes instructions for context and setup

shashank3959 and others added 3 commits June 3, 2024 04:35
* Adds a notebook for Llama-3-8b LoRA PEFT with NeMo FW
* Adds a notebook for sending multi-LoRA inference requests to NIM
* Adds a README that includes instructions for context and setup

Signed-off-by: Shashank Verma <shashank3959@gmail.com>
@shashank3959 shashank3959 changed the title from “[DRAFT] Add tutorial for Llama-3-8B lora training and deployment” to “Add tutorial for Llama-3-8B lora training and deployment” on Jun 3, 2024
@shashank3959
Collaborator Author

@chrisalexiuk-nvidia @vinhngx could you please review?

Signed-off-by: Shashank Verma <shashank3959@gmail.com>
@chrisalexiuk-nvidia
Contributor

[Screenshot of the figure text in question]

There's a formatting mismatch here in the figure text.

@shashank3959
Collaborator Author

Adding @nealvaidya for any review comments covering the NIM part

@chrisalexiuk-nvidia
Contributor

[Screenshot taken 2024-06-05 at 5:02 PM]

First notebook workin' like a charm.

4 resolved review threads on tutorials/llm/llama-3/llama3-lora-nemofw.ipynb (outdated)
Contributor

@nealvaidya nealvaidya left a comment

A few things, mostly in the README. The NIM notebook looks good.

7 resolved review threads on tutorials/llm/llama-3/README.rst (outdated)
1 resolved review thread on tutorials/llm/llama-3/llama3-lora-deploy-nim.ipynb (outdated)
Signed-off-by: Shashank Verma <shashank3959@gmail.com>
Signed-off-by: Shashank Verma <shashank3959@gmail.com>
Comment on lines 88 to 89
`2. Multi-LoRA inference with NVIDIA NIM <./llama3-lora-deploy-nim.ipynb>`__
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collaborator

@jgerh jgerh Jun 6, 2024

`Deploy Multiple LoRA Inference Adapters with NVIDIA NIM <./llama3-lora-deploy-nim.ipynb>`__

Comment on edits: Remove the number, update the title to an imperative verb, and change this step to a Heading 2 by adding ---------------------- under the revised title.


This is a demonstration of deploying multiple LoRA adapters with NVIDIA
Collaborator

This procedure demonstrates how to deploy multiple LoRA adapters with NVIDIA . . .

Comment on lines 93 to 97
Hugging Face model formats. We will deploy the PubMedQA LoRA adapter
from the first notebook, alongside two other already trained LoRA adapters
(`GSM8K <https://github.com/openai/grade-school-math>`__,
`SQuAD <https://rajpurkar.github.io/SQuAD-explorer/>`__) that are
available on NVIDIA NGC as examples.
Collaborator

You will deploy the PubMedQA LoRA adapter
from the first notebook, alongside two previously trained LoRA adapters
(`GSM8K <https://github.com/openai/grade-school-math>`__,
`SQuAD <https://rajpurkar.github.io/SQuAD-explorer/>`__) that are
available on NVIDIA NGC as examples.
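As context for this description: with NIM, a client selects an adapter by passing its name in the ``model`` field of an OpenAI-compatible completions request. A minimal sketch follows; the server address and the adapter name ``llama3-8b-pubmedqa`` are assumptions, standing in for whatever names the actual deployment uses.

```shell
# Build the request payload; "llama3-8b-pubmedqa" is a placeholder for
# the name the adapter was given in the LoRA model store.
cat > payload.json <<'EOF'
{
  "model": "llama3-8b-pubmedqa",
  "prompt": "QUESTION: Does this adapter answer biomedical questions? ANSWER:",
  "max_tokens": 64
}
EOF

# Confirm the payload is well-formed JSON before sending it:
python3 -m json.tool payload.json > /dev/null && echo "payload ok"

# Send it to the locally running NIM endpoint (uncomment once the
# container from the setup step is serving):
#   curl -s http://localhost:8000/v1/completions \
#     -H "Content-Type: application/json" \
#     -d @payload.json
```

Switching adapters between requests is just a matter of changing the ``model`` field; the base model stays loaded.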

Comment on lines 99 to 103
``NOTE``: While it’s not necessary to complete the LoRA training and
obtain the adapter from the previous notebook (“Creating a LoRA adapter
with NeMo Framework”) to follow along with this one, it is recommended
if possible. You can still learn about LoRA deployment with NIM using
the other adapters downloaded from NGC.
Collaborator

@jgerh jgerh Jun 6, 2024

.. note::
   Although it’s not necessary that you complete the LoRA training and secure the adapter from the preceding notebook (“Creating a LoRA adapter with NeMo Framework”) to proceed with this one, it is advisable. Regardless, you can continue to learn about LoRA deployment with NIM using other adapters that you’ve downloaded from NVIDIA NGC.

Comment on lines 106 to 107
1. Download example LoRA adapters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Collaborator

  1. Download the example LoRA adapters.

Comment on edits: Change this step to a sentence and remove the Heading 3 markup (^^^^^).

The following steps assume that you have authenticated with NGC and downloaded the CLI tool, as mentioned in pre-requisites.
Collaborator

The following steps assume that you have authenticated with NGC and downloaded the CLI tool, as listed in the Requirements section.
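As background for this thread, pulling the example adapters with the NGC CLI follows the pattern sketched below. The registry path and version tag are placeholders, not the README's actual values — the tutorial's own commands are authoritative.

```shell
# Sketch only: authenticate once with the API key from ngc.nvidia.com
# (interactive), then pull a model version into a local directory.
#   ngc config set
#   ngc registry model download-version "<org>/<team>/<adapter>:<version>" --dest loras

# Destination directory for the downloaded adapters:
mkdir -p loras
ls -d loras
```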

Comment on lines 129 to 130
2. Prepare the LoRA model store
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Collaborator

  1. Prepare the LoRA model store.

Comment on edits: Change this step to a sentence and remove the Heading 3 markup (^^^^^).
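For reviewers following along: the model store this step builds is simply one directory per adapter. A minimal sketch, with illustrative (not the README's exact) directory and file names:

```shell
# One subdirectory per LoRA adapter; the directory name is the model
# name clients will pass at inference time (names are illustrative).
mkdir -p loras/llama3-8b-pubmedqa
mkdir -p loras/llama3-8b-math
mkdir -p loras/llama3-8b-squad

# Each subdirectory holds that adapter's weights, e.g. the .nemo file
# produced by the training notebook:
#   cp llama3_lora_pubmedqa.nemo loras/llama3-8b-pubmedqa/

ls loras
```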

Comment on lines 165 to 166
3. Set-up NIM
^^^^^^^^^^^^^
Collaborator

  1. Set Up NIM.

Comment on edits: Change this step to a sentence and remove the Heading 3 markup (^^^^^).

-p 8000:8000 \
nvcr.io/nim/meta/llama3-8b-instruct:1.0.0

The first time you run the command, it will download the model and cache it in ``$NIM_CACHE_PATH`` so subsequent deployments are even faster. There are several options to configure NIM other than the ones listed above, and you can find a full list in `NIM configuration <https://docs.nvidia.com/nim/large-language-models/latest/configuration.html>`__ documentation.
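For context, the fragment quoted above is the tail of the container launch command. A hedged reconstruction follows; flag values are placeholders and ``NIM_PEFT_SOURCE`` is the setting that points NIM at the LoRA model store — consult the README and NIM docs for the authoritative command.

```shell
# Illustration only; values below are placeholders.
export NIM_CACHE_PATH="$HOME/.cache/nim"   # model cache reused across runs
mkdir -p "$NIM_CACHE_PATH"

# docker run -it --rm --gpus all \
#   -e NGC_API_KEY \
#   -e NIM_PEFT_SOURCE=/home/nvs/loras \
#   -v "$NIM_CACHE_PATH:/opt/nim/.cache" \
#   -v "$(pwd)/loras:/home/nvs/loras" \
#   -p 8000:8000 \
#   nvcr.io/nim/meta/llama3-8b-instruct:1.0.0

echo "NIM cache at $NIM_CACHE_PATH"
```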
Collaborator

There are several options to configure NIM other than the ones listed above. You can find a full list in NIM configuration documentation.

Comment on lines 200 to 201
4. Start the notebook
^^^^^^^^^^^^^^^^^^^^^
Collaborator

  1. Start the notebook.

Comment on edits: Change this step to a sentence and remove the Heading 3 markup (^^^^^).

From another terminal, follow the same instructions as the previous
notebook to launch Jupyter Lab, and navigate to `this notebook <./llama3-lora-deploy-nim.ipynb>`__.

You may use the same NeMo
Collaborator

You can use the same NeMo

Collaborator

@jgerh jgerh left a comment

Completed review of README.rst

Signed-off-by: Shashank Verma <shashank3959@gmail.com>
@@ -1,63 +1,47 @@
Llama 3 LoRA fine-tuning and deployment with NeMo Framework and NVIDIA NIM
Collaborator

@jgerh jgerh Jun 6, 2024

Llama 3 LoRA Fine-Tuning and Deployment with NeMo Framework and NVIDIA NIM

Comment on edits: Missed this change. Title should be capitalized.

Collaborator Author

Updated

Collaborator

@jgerh jgerh left a comment

I missed one change. The title should be capitalized: “Llama 3 LoRA Fine-Tuning and Deployment with NeMo Framework and NVIDIA NIM”

Signed-off-by: Shashank Verma <shashank3959@gmail.com>
@jgerh
Collaborator

jgerh commented Jun 6, 2024

Approved comments

Collaborator

@oyilmaz-nvidia oyilmaz-nvidia left a comment

Approving based on other reviews.

@pablo-garay pablo-garay merged commit f1062b7 into NVIDIA:main Jun 7, 2024
130 of 131 checks passed
@shashank3959 shashank3959 deleted the dev/shashank-llama3-nb branch June 7, 2024 17:35
janekl pushed a commit that referenced this pull request Jun 12, 2024
* Add tutorial for Llama-3-8B lora training and deployment

* Adds a notebook for Llama-3-8b LoRA PEFT with NeMo FW
* Adds a notebook for sending multi-LoRA inference requests to NIM
* Adds a README that includes instructions for context and setup

Signed-off-by: Shashank Verma <shashank3959@gmail.com>

* Add inference for other LoRAs in deployment notebook

Signed-off-by: Shashank Verma <shashank3959@gmail.com>

* Fix typo in path in LoRA training notebook

Signed-off-by: Shashank Verma <shashank3959@gmail.com>

* Fix typos and add end-2-end diagram

Signed-off-by: Shashank Verma <shashank3959@gmail.com>

* Fix minor issue in architecture diagram

Signed-off-by: Shashank Verma <shashank3959@gmail.com>

* Convert README from .md to .rst

Signed-off-by: Shashank Verma <shashank3959@gmail.com>

* Minor updates to README

Signed-off-by: Shashank Verma <shashank3959@gmail.com>

* Fix typo in deployment notebook

Signed-off-by: Shashank Verma <shashank3959@gmail.com>

* Incorporate review suggestions

Signed-off-by: Shashank Verma <shashank3959@gmail.com>

* Minor updates to README

Signed-off-by: Shashank Verma <shashank3959@gmail.com>

* Remove access token

Invalidates and removes HF access token

Signed-off-by: Shashank Verma <shashank3959@gmail.com>

* Fix broken link to NIM docs

Signed-off-by: Shashank Verma <shashank3959@gmail.com>

* Fix minor typo in README parameter name

Signed-off-by: Shashank Verma <shashank3959@gmail.com>

* Fix grammar and inconsistencies in style and formatting

Signed-off-by: Shashank Verma <shashank3959@gmail.com>

* Capitalize Title

Signed-off-by: Shashank Verma <shashank3959@gmail.com>

---------

Signed-off-by: Shashank Verma <shashank3959@gmail.com>
Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
JesusPaz pushed a commit to JesusPaz/NeMo that referenced this pull request Jun 18, 2024 (same commit message as above)
rohitrango pushed a commit to rohitrango/NeMo that referenced this pull request Jun 25, 2024 (same commit message as above)
@ko3n1g ko3n1g mentioned this pull request Jul 18, 2024
8 participants