Reduce vocabulary size in test model, fix bug in routing when overlapped #45
Conversation
.github/workflows/run-tests.yaml
Outdated
run: |
  export HF_TAG=$(python -c "import os; print(os.environ.get('GITHUB_HEAD_REF') or os.environ.get('GITHUB_REF_NAME'))")
  python -c "from huggingface_hub import delete_repo; delete_repo(token='$BLOOM_TESTING_WRITE_TOKEN', \
    name='test-bloomd-560m-$HF_TAG', organization='bloom-testing')" || true
Please consider moving that to a small Python script in the repo. Having a Bash command running Python, whose results are passed to another Bash command running Python, is not good.
done, see tests/scripts/remove_old_models.py
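For reference, a minimal sketch of what such a cleanup script could look like. This is hypothetical: the actual tests/scripts/remove_old_models.py may differ, and the `repo_id` form assumes a newer huggingface_hub API than the legacy `name`/`organization` arguments used in the workflow step.

```python
import os


def target_repo_id() -> str:
    """Resolve the per-branch test repo name, mirroring the workflow's HF_TAG logic:
    prefer the PR branch name, fall back to the push ref."""
    hf_tag = os.environ.get("GITHUB_HEAD_REF") or os.environ.get("GITHUB_REF_NAME")
    return f"bloom-testing/test-bloomd-560m-{hf_tag}"


def main() -> None:
    try:
        # huggingface_hub is assumed to be installed in the CI environment
        from huggingface_hub import delete_repo

        delete_repo(repo_id=target_repo_id(), token=os.environ["BLOOM_TESTING_WRITE_TOKEN"])
    except Exception:
        pass  # mirrors the `|| true` in the original workflow step


if __name__ == "__main__":
    main()
```

This keeps the whole pipeline in one interpreter instead of chaining Bash and inline Python, as the reviewer requested.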
sleep 60  # wait for server to download layers
sleep 10  # wait for initial servers to declare blocks, then let server decide which blocks to serve

python -m cli.run_server --converted_model_name_or_path $MODEL_NAME --block_indices 0:6 \
Please consider running a loop instead, maybe using just --num_blocks without an explicit --block_indices (I think using load balancing for all servers is a better test).
If it gets more complicated than just repeating something N times, please move it to a Python script.
moved to #16
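For illustration, the suggested loop could be sketched as below. This is a hypothetical snippet: the server count, log paths, and `wait` at the end are assumptions, and a real CI job would poll server health rather than just waiting.

```shell
# Hypothetical CI step: start several identical servers and let each one pick
# its blocks via load balancing (--num_blocks) instead of fixed --block_indices.
NUM_SERVERS=4
for i in $(seq 0 $((NUM_SERVERS - 1))); do
  # each server serves 6 blocks of its own choosing; output goes to server_$i.log
  python -m cli.run_server --converted_model_name_or_path "$MODEL_NAME" \
    --num_blocks 6 &> "server_$i.log" &
done
wait  # wait for all background servers (in real CI, poll readiness instead)
```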
More than half of the test 560m model's parameters are in its embedding layer (250_880 x 1024). This adds 1-2 GB of RAM depending on the test scenario.

This PR reduces the vocabulary size to save memory during conversion, keeping only the first 50k tokens.

As a result, running CI with 4 servers uncovered a bug (https://github.com/bigscience-workshop/petals/runs/7879399020?check_suite_focus=true), which is now also fixed.
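The truncation described above could be sketched roughly as follows. This is hypothetical code: the actual conversion script and its tensor key names may differ, and torch is assumed.

```python
# Sketch of vocabulary truncation during model conversion: keep only the
# first 50k rows of the embedding matrix. The key name is an assumption.
import torch

VOCAB_KEEP = 50_000  # tokens retained, per the PR description


def truncate_embeddings(state_dict: dict, key: str = "word_embeddings.weight") -> dict:
    """Slice the embedding weight down to the first VOCAB_KEEP tokens."""
    weight = state_dict[key]  # shape: (vocab_size, hidden_size), e.g. (250_880, 1024)
    state_dict[key] = weight[:VOCAB_KEEP].clone()
    return state_dict


# Rough arithmetic behind the description: 250_880 * 1024 ≈ 257M parameters,
# about 1 GB in fp32; extra copies made during conversion can push the peak
# usage to the reported 1-2 GB.
```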
This PR also temporarily pins an older bnb (bitsandbytes) version, pending @TimDettmers's fix for cpuonly.