
[llm] Create separate Predictor for LLMs and enable flash attention on CUDA #3409

Merged
29 commits merged into master from llm-predictor on May 22, 2023

Conversation

@tgaddair (Collaborator) commented May 18, 2023

This PR introduces a new LlmPredictor class, used only during batch prediction, that runs text generation instead of simply outputting logits from the forward pass (as happens during training). At predict time we want the fully generated sequence, not just the logits.
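
In rough terms, the idea looks like the following minimal sketch, assuming a Hugging Face-style model that exposes a generate() method; the class and method names here are illustrative, not Ludwig's actual code:

import torch


class LlmPredictorSketch:
    """Illustrative only: at predict time, run generation rather than a bare forward pass."""

    def __init__(self, model, tokenizer, max_new_tokens=64):
        self.model = model
        self.tokenizer = tokenizer
        self.max_new_tokens = max_new_tokens

    @torch.no_grad()
    def predict(self, texts):
        # Training consumes per-step logits from forward() for the loss;
        # prediction instead decodes complete output sequences with generate().
        inputs = self.tokenizer(texts, return_tensors="pt", padding=True)
        output_ids = self.model.generate(**inputs, max_new_tokens=self.max_new_tokens)
        return self.tokenizer.batch_decode(output_ids, skip_special_tokens=True)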

Other fixes included in this PR:

  • Fix DeepSpeed inflight errors by moving unused modules and metrics out of the top-level LLM module.
  • Remove the hack that disabled eval mode when using DeepSpeed (fixed upstream by microsoft/DeepSpeed#3068, "[BUG] INFLIGHT parameters after evaluation").
  • Fix bfloat16 metrics computation.
  • Add support for fine-tuning without an adapter.
  • Use flash attention for training and prediction when the model is on CUDA (see the sketch below).

This PR also disables prompt_tuning as an adapter type for now, as generation mode does not currently work when this adapter is applied (see tests/integration_tests/test_llm.py::test_llm_finetuning_strategies[prompt_tuning_init_random-local]).
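
The PR description does not spell out the mechanism for the CUDA-only flash attention path; as an assumption about the general approach (not the PR's actual code), gating on device placement can be sketched with PyTorch 2.x primitives:

import torch
import torch.nn.functional as F


def sdpa(query, key, value):
    # Assumed approach: prefer the fused flash / memory-efficient kernels on CUDA
    # (flash attention requires fp16/bf16 inputs on supported GPUs) and fall back
    # to the plain math implementation everywhere else.
    if query.is_cuda:
        with torch.backends.cuda.sdp_kernel(
            enable_flash=True, enable_mem_efficient=True, enable_math=False
        ):
            return F.scaled_dot_product_attention(query, key, value)
    return F.scaled_dot_product_attention(query, key, value)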

@github-actions bot commented May 18, 2023

Unit Test Results

6 files ±0    6 suites ±0    1h 28m 20s ⏱️ +16m 55s
2 760 tests +2 727:  2 747 ✔️ passed +2 718,  12 💤 skipped +8,  1 ❌ failed +1
2 814 runs  +2 715:  2 796 ✔️ passed +2 709,  17 💤 skipped +5,  1 ❌ failed +1

For more details on these failures, see this check.

Results for commit bd0ddc2, compared against base commit cb37535.

♻️ This comment has been updated with latest results.

@tgaddair tgaddair requested a review from arnavgarg1 May 21, 2023 20:05
@tgaddair tgaddair marked this pull request as ready for review May 21, 2023 20:05
@tgaddair changed the title from "WIP: Create separate predictors for LLMs" to "[llm] Create separate Predictor for LLMs" May 21, 2023
Review comment on ludwig/models/llm.py, lines +40 to +45:
class DictWrapper:
"""Wrapper for a LudwigFeatureDict module that allows for iteration over keys.

The purpose of this class is to avoid exposing input and output features as modules of the LLM. This is because we
only wish to train the underlying model, and having these additional modules can confuse systems like DeepSpeed.
"""
A contributor replied:
Very clever way of getting around initializing all of the other modules
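
For context, the trick works because nn.Module.__setattr__ only registers Parameter and Module values as children; a plain Python object assigned as an attribute is invisible to module traversal, so DeepSpeed never visits the wrapped features. A minimal sketch of the mechanism (illustrative, not the PR's full implementation):

import torch.nn as nn


class DictWrapper:
    # Deliberately NOT an nn.Module: the wrapped dict stays hidden from
    # named_modules(), parameters(), and DeepSpeed's module traversal.
    def __init__(self, module_dict):
        self.module_dict = module_dict

    def keys(self):
        return self.module_dict.keys()

    def __iter__(self):
        return iter(self.module_dict.keys())


class LLM(nn.Module):
    def __init__(self, model, input_features):
        super().__init__()
        self.model = model  # the only registered child: the model being trained
        self.input_features = DictWrapper(input_features)  # hidden from traversal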

@arnavgarg1 changed the title from "[llm] Create separate Predictor for LLMs" to "[llm] Create separate Predictor for LLMs and enable flash attention on CUDA" May 22, 2023
@arnavgarg1 (Contributor) left a comment:

LGTM 🚀

@arnavgarg1 (Contributor) commented:
The failed test looks like a transient issue; it should be safe to merge.

@tgaddair tgaddair merged commit ec194d9 into master May 22, 2023
14 of 16 checks passed
@tgaddair tgaddair deleted the llm-predictor branch May 22, 2023 16:10