[BIT-484] Server/validator improvements (TextCausalLMNext) #852

opentaco · 2022-07-24T19:04:08Z

BIT-484 Server/validator improvements (TextCausalLMNext)

Various synapse improvements, highlights include:

Critical tokenizer fixes like setting server tokenizer padding_side.
Support for facebook/opt-* models and other non-fast tokenizer models.
Validator checks for NaNs in neuron_stats.
Gradual EMA from zero for neuron_stats used for weighting (to improve mean estimation).
Added normal validation loss to TextCausalLMNext.

Tested branch successfully for core_server and core_validator on CUDA, including running facebook/opt-13b.

coveralls · 2022-07-24T19:10:21Z

Pull Request Test Coverage Report for Build 0febfdd3-ca3e-4d28-b483-65e1b15a6c30

42 of 50 (84.0%) changed or added relevant lines in 1 file are covered.
1 unchanged line in 1 file lost coverage.
Overall coverage decreased (-0.2%) to 64.806%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
bittensor/utils/tokenizer_utils.py	42	50	84.0%

Files with Coverage Reduction	New Missed Lines	%
bittensor/utils/tokenizer_utils.py	1	85.81%

Totals
Change from base Build 5ab03dc9-731c-4622-bbe5-c614b3b857bc:	-0.2%
Covered Lines:	3924
Relevant Lines:	6055

💛 - Coveralls

Eugene-hu

LGTM! Amazing work!

bittensor/_neuron/text/core_server/nucleus_impl.py

bittensor/_neuron/text/core_validator/__init__.py

opentaco added 30 commits July 22, 2022 11:38

Fix topk not implemented on cpu for float16

7bab953

Unforce remote_train when cuda and float16

aec9a30

Ensure probability computations done in float32 for improved precision

b248cac

Set tokenizer.padding_side = "left" in core_server

4c7436c

Generation by default expects the most recent token on the right-hand side, with padding on the left. huggingface/transformers#10552
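
As a rough illustration of why the padding side matters for generation, here is a minimal pure-Python sketch. It is not the core_server code: `pad_batch` and `PAD_ID` are hypothetical names used only for this example. With a Hugging Face tokenizer the corresponding setting is `tokenizer.padding_side = "left"`, as in the commit title above.

```python
PAD_ID = 0  # hypothetical pad token id, for illustration only


def pad_batch(sequences, side="left", pad_id=PAD_ID):
    """Pad lists of token ids to a common length on the given side."""
    width = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        fill = [pad_id] * (width - len(seq))
        padded.append(fill + list(seq) if side == "left" else list(seq) + fill)
    return padded


batch = [[11, 12, 13], [21, 22]]
left = pad_batch(batch, side="left")
right = pad_batch(batch, side="right")

# With left padding, the last column holds each sequence's most recent token.
last_left = [row[-1] for row in left]
# With right padding, the shorter sequence ends in a pad token instead.
last_right = [row[-1] for row in right]
```

Reading the final position of every row only yields real tokens in the left-padded case, which is what next-token prediction relies on.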

Run model for TextCausalLMNext always

2e47182

Do not reuse model outputs from TextCausalLM, since the padding side differs.

Gradually increase weight fields from zero via EMA

1ddcd26

New axons will gradually increase in weighting as the number of successfully responded queries grows. This ensures that sufficient observations are averaged before weighting to address potentially noisy validation measures.
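
The gradual ramp-up described above can be sketched as an exponential moving average that starts at zero. This is only an illustration under assumptions: the smoothing factor `alpha=0.1` and the function name `ema_update` are hypothetical, and the validator's actual field names and constants are not shown in this PR description.

```python
def ema_update(previous, observation, alpha=0.1):
    """One exponential-moving-average step (alpha is an assumed value)."""
    return (1.0 - alpha) * previous + alpha * observation


stat = 0.0  # a new axon starts with zero weight
history = []
for _ in range(5):
    # Pretend every query succeeded with a measured value of 1.0.
    stat = ema_update(stat, 1.0)
    history.append(stat)
```

Starting from zero, the statistic after n identical observations of 1.0 is 1 - (1 - alpha)^n, so a new axon needs many successful responses before its weight approaches the measured value.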

Add vocab to tokenizer if it does not have attr

d9c60f1

Add vocab to tokenizer if it does not have attr

0512641

Define vocab_len as real tokenizer vocabulary length

27658df

Merge remote-tracking branch 'origin/BIT-484-7-TextCausalLMNext-dev' …

29e51e3

…into BIT-484-7-TextCausalLMNext-dev # Conflicts: # bittensor/_neuron/text/core_server/nucleus_impl.py

Define vocab_len as real tokenizer vocabulary length

d11b6cf

Try use_fast=False at tokenizer load if fast fails

aa44ed3
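
A minimal sketch of the fallback pattern named in this commit, assuming a loader callable that accepts a `use_fast` keyword (as `AutoTokenizer.from_pretrained` in Hugging Face transformers does). `load_tokenizer` and `fake_loader` are hypothetical names; the fake loader only simulates a model without a fast tokenizer so the example runs without downloads.

```python
def load_tokenizer(name, loader):
    """Try the fast tokenizer first; fall back to the slow one on failure."""
    try:
        return loader(name, use_fast=True)
    except Exception:  # broad catch is for the sketch only
        return loader(name, use_fast=False)


def fake_loader(name, use_fast=True):
    """Hypothetical stand-in for a loader whose fast path is unavailable."""
    if use_fast:
        raise ValueError(f"no fast tokenizer for {name}")
    return {"name": name, "use_fast": use_fast}


tokenizer = load_tokenizer("example/model", fake_loader)
```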

Add epsilon for log in loss computation

a964bde
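
For illustration, adding a small epsilon inside the logarithm keeps the loss finite when a probability is exactly zero. The value `EPS = 1e-8` and the name `neg_log_prob` are assumptions made for this sketch, not taken from the PR.

```python
import math

EPS = 1e-8  # assumed value, for illustration only


def neg_log_prob(p, eps=EPS):
    """Negative log-probability with an epsilon guard against log(0)."""
    return -math.log(p + eps)


loss_zero = neg_log_prob(0.0)  # finite, whereas math.log(0.0) would raise
loss_half = neg_log_prob(0.5)  # barely changed by the epsilon
```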

Sort validation table according to TextCausalLMNext

27a1b4c

TextCausalLMNext supports more models, such as OPT, than TextCausalLM, which requires fast tokenizers. Sorting the validation table by the more populated synapse gives a better view.

Avoid EMA of nan values in core_validator neuron_stats

d049163

Replace nan/inf losses with large loss in core_validator

4101ff5
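
The two NaN-related commits above can be sketched together as follows. This is an assumption-laden illustration, not the core_validator implementation: `LARGE_LOSS`, `sanitize_loss`, and `ema_skip_nan` are hypothetical names, and the constants are made up.

```python
import math

LARGE_LOSS = 20.0  # assumed replacement value for non-finite losses


def sanitize_loss(loss, large_loss=LARGE_LOSS):
    """Replace nan/inf losses with a large finite loss."""
    return loss if math.isfinite(loss) else large_loss


def ema_skip_nan(previous, observation, alpha=0.1):
    """EMA step that leaves the running value untouched on a NaN observation."""
    if math.isnan(observation):
        return previous
    return (1.0 - alpha) * previous + alpha * observation


clean = [sanitize_loss(x) for x in (2.5, float("nan"), float("inf"))]
kept = ema_skip_nan(0.5, float("nan"))
updated = ema_skip_nan(0.5, 1.0)
```

Without such guards, a single NaN would propagate through the moving average and contaminate every later value of that statistic.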

Swap validator table columns TextCausalLM <-> TextCausalLMNext

c62dbad

Ensure softmax computation done in float32 for improved precision

e7d44c0

Simplify remapping_token to include untensorized offset_mappings

355c328

Simplify encode_forward_causallm in core_server

97e7987

Remove left padding from batch probabilities for logit translation

d479000

Remove left padding from batch tokens for logit translation

9685a65
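
As a simplified picture of removing left padding before logit translation, the sketch below strips leading pad ids from a plain list of token ids. The real code presumably operates on batched tensors; `strip_left_padding` and the pad id `0` are hypothetical and used only for illustration.

```python
def strip_left_padding(tokens, pad_id):
    """Drop leading pad ids from one left-padded sequence."""
    start = 0
    while start < len(tokens) and tokens[start] == pad_id:
        start += 1
    return tokens[start:]


padded_batch = [[0, 0, 21, 22], [11, 12, 13, 14]]
unpadded = [strip_left_padding(row, pad_id=0) for row in padded_batch]
```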

Return internal model_output in encode_forward_causallm

4da7f67

Reuse model_output in encode_forward_causallmnext in core_server

f52b2f8

Move tokens to core_server device

57c7ced

Set std_sequence_len according to tokens_std in logit translation

85b302f

Check that model_output is not None before overwrite in axon

3318cfd

Otherwise, synapses such as TextSeq2Seq that return model_output=None would overwrite a previous, potentially non-None, model_output.

Revert model_output check in axon

090bc36

Log server loss and translated loss for TextCausalLM in core_server

a6df74d

Update target_phrases type

986799b

opentaco added 8 commits July 24, 2022 16:52

Add validation loss for TextCausalLMNext

8401f58

To complement the existing phrase cross entropy loss, and to allow for more direct comparison to TextCausalLM.

Move target_phrase to cpu in phrase_cross_entropy

8947fd9

Remove losses_val_nxt in textcausallmnext

6e43a55

Use floor probability when no validation match in phrase_cross_entropy

fc5351e
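
One way to picture the floor probability is the sketch below: when the target phrase is not among the phrases the server returned, a small floor probability is used so the loss stays finite. The names `phrase_loss` and `FLOOR_PROB`, the dictionary representation, and the floor value are all assumptions; the actual `phrase_cross_entropy` works on a different data layout.

```python
import math

FLOOR_PROB = 1e-6  # assumed floor value, for illustration only


def phrase_loss(topk_probs, target, floor=FLOOR_PROB):
    """Negative log-probability of the target, using a floor when unmatched."""
    prob = topk_probs.get(target, floor)
    return -math.log(max(prob, floor))


topk = {"the cat": 0.6, "a dog": 0.3}
hit = phrase_loss(topk, "the cat")  # target was among the returned phrases
miss = phrase_loss(topk, "a bird")  # no match, so the floor is used
```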

Update logger used for server translated loss in core_server

d792052

Use loguru info in core_validator instead of print

f798925

Update logger style in core_validator

f6ac4f4

Update logger style in core_validator

83c0a97

opentaco added 3 commits July 24, 2022 22:37

Set vocab_len for tokenizers to store real vocabulary size

dc88edd

Update test_tokenizer_utils.py

0822fc8

Omit attention_mask in TextCausalLM (update unittests)

cad4c0c

Transformer models like gerpt2 typically perform worse with a left-side attention mask, so the attention mask is omitted.

opentaco requested a review from isabella618033 July 25, 2022 13:37

Eugene-hu approved these changes Jul 26, 2022

Merge branch 'Synapse' into BIT-484-7-TextCausalLMNext-dev

f1301f5

opentaco merged commit 23b004f into Synapse Jul 26, 2022

ifrit98 deleted the BIT-484-7-TextCausalLMNext-dev branch May 24, 2023 14:41
