Fixed tensor parallelism splits #47

tgaddair · 2023-11-21T03:58:26Z

After we removed the layer abstraction over the LoRA weights, we introduced a transpose operation, which means the weights now need to be split on dim=1 rather than dim=0. Splitting on the old dimension broke tensor parallelism, affecting all deployments with more than one GPU.

This PR also fixes support for o_proj, which is row-parallel and needs to be split on dim=0. Previously, a bug prevented k_proj and o_proj from being picked up correctly, which is why this was missed.
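To make the two split dimensions concrete, here is a minimal sketch in plain Python. It is not the code from this PR: `ROW_PARALLEL`, `split_dim`, and `shard` are hypothetical names, and it assumes the transposed weight is laid out as (in_features, out_features), so a column-parallel module such as k_proj shards the output features (dim=1) while the row-parallel o_proj shards the input features (dim=0).

```python
# Illustrative sketch only. Helper names and the (in_features, out_features)
# layout of the transposed weight are assumptions, not taken from the PR diff.

ROW_PARALLEL = {"o_proj"}  # the PR description names o_proj as row-parallel


def split_dim(module_name):
    """Pick the dim to shard a transposed (in, out) weight on."""
    # Row-parallel layers shard input features (dim 0 after the transpose);
    # all other layers are treated as column-parallel here and shard
    # output features (dim 1 after the transpose).
    return 0 if module_name in ROW_PARALLEL else 1


def shard(matrix, dim, rank, world_size):
    """Return the equal-sized slice of a 2-D nested list owned by `rank`."""
    size = len(matrix) if dim == 0 else len(matrix[0])
    if size % world_size != 0:
        raise ValueError("dimension is not divisible by world_size")
    block = size // world_size
    start, stop = rank * block, (rank + 1) * block
    if dim == 0:
        return [row[:] for row in matrix[start:stop]]  # keep a block of rows
    return [row[start:stop] for row in matrix]  # keep a block of columns


# A 4 x 6 transposed weight: rows index in_features, columns out_features.
weight_t = [[r * 10 + c for c in range(6)] for r in range(4)]

k_shard = shard(weight_t, split_dim("k_proj"), rank=0, world_size=2)
o_shard = shard(weight_t, split_dim("o_proj"), rank=1, world_size=2)
```

With two GPUs, rank 0's `k_shard` keeps all 4 rows but only the first 3 columns, while rank 1's `o_shard` keeps the last 2 rows with all 6 columns. Splitting a transposed column-parallel weight on dim=0 instead would hand each rank the wrong slice, which is the failure mode described above.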

Closes #46.

geoffreyangus

LGTM

tgaddair added 4 commits November 20, 2023 16:57

Fixed tensor parallelism

422010e

Fixed o_proj

14f7a01

Comment

bf2c0f1

Makefile

5f75650

tgaddair requested a review from geoffreyangus November 21, 2023 03:58

tgaddair mentioned this pull request Nov 21, 2023

Sharded adapters not working #46

Closed

4 tasks

geoffreyangus approved these changes Nov 21, 2023

tgaddair merged commit 188834f into main Nov 21, 2023
1 check failed

tgaddair deleted the fix-tp branch November 21, 2023 04:36
