Synapse basic fixes #851

Merged: 5 commits merged from Synapse-basic-fixes into Synapse on Jul 26, 2022

Conversation

Eugene-hu (Contributor)

Fixes for:

  • btcli inspect
  • arguments for blacklisting and the priority queue
  • sample configs for the core validator

@coveralls commented on Jul 21, 2022

Pull Request Test Coverage Report for Build 5c7141be-7483-4e16-a8ac-1c5a36394350

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 64.822%

Totals:

  • Change from base Build f26da1e6-dfd8-47cd-aa1d-e3d1c88a6c96: 0.0%
  • Covered Lines: 3925
  • Relevant Lines: 6055

💛 - Coveralls

Eugene-hu merged commit 2a68580 into Synapse on Jul 26, 2022
Eugene-hu added a commit that referenced this pull request Aug 8, 2022
* Remove duplicate info print in validator epoch

* Add duration info print to validator forward

* Add duration info print to validator forward

* Change print details in validator forward

* Bit 458 combine advanced with template server (#792)

* init

* added local training

* .

* working axon backward

* model saving

* shape fix

* clean

* averaging grad and loss

* fix

* fixes for comments

* removed advanced server and template server

* Synapse fix dend test (#798)

* prelim fixes

* 2nd fix

* final fix

* constant fix

* validator fix

* core server self.set_fine_tuning_params() fix

* fix

* UI updates (#796)

* UI updates

* core server fixes

* turning off blacklisting for now

* generation fix

* UI and check updates

* small bug fixes

* constant import

* bug fixes

* Bit 490 synapse one fail drag all down (#810)

* init

* bug fixed

* generate bug fixes (#812)

* circle ci test fix (#813)

fixes

* validator fixes

* fixes for test_wallet

* Adds flag --wallet.reregister <bool> (#819)

* Adds flag --neuron.reregister <bool>
Default True

* created tests for new flag

* Revert "Adds flag --neuron.reregister <bool>"

This reverts commit 0736dfb.

* fix tests for new flag

* add new flag and implementation
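
As a point of reference, a hedged argparse sketch of what a boolean --wallet.reregister flag with a True default could look like; the str2bool helper and the config wiring are illustrative assumptions, not the repository's actual config code.

```python
import argparse

def str2bool(value: str) -> bool:
    # Accept common truthy spellings so "--wallet.reregister false" behaves as expected.
    return str(value).lower() in ("yes", "true", "t", "1")

parser = argparse.ArgumentParser()
parser.add_argument("--wallet.reregister", dest="wallet_reregister",
                    type=str2bool, default=True,
                    help="Re-register the wallet on chain if it is not registered (default: True).")

config = parser.parse_args([])   # no CLI args -> defaults
print(config.wallet_reregister)  # True
```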

* [WIP]Bit 490 synapse fix (#814)

* generate bug fixes

* dendrite backward disabled by default and padded sequence

* detach fix

* vocab size from 50257 to 50258

Co-authored-by: joeylegere <joeylegere@gmail.com>

* Synapse gpu fix (#822)

* temp fixes

* small fixes + blacklist

* Bit 490 synapse fix (#824)

* generate bug fixes

* dendrite backward disabled by default and padded sequence

* detach fix

* vocab size from 50257 to 50258

* fix generate sizing

* arg fix

* generate fix

* put new inputs on correct device

* tensors don't do .to inplace

* Null synapse (#827)

* default synapse last hidden state

* Synapse

* move synapse code within try catch

* args

* kwargs

* Unknown synapse

* unknown synapse

* codes update

* string update

* revert fix for comment

* proto updates

* seq2seq parameters update

* Synapse defaults fix (#832)

* default synapse last hidden state

* Synapse

* move synapse code within try catch

* args

* kwargs

* Unknown synapse

* unknown synapse

* codes update

* string update

* revert fix for comment

* proto updates

* seq2seq parameters update

* defaults update

* Remove template miner (#833)

* default synapse last hidden state

* Synapse

* move synapse code within try catch

* args

* kwargs

* Unknown synapse

* unknown synapse

* codes update

* string update

* revert fix for comment

* proto updates

* seq2seq parameters update

* defaults update

* deprecate template miner

* core_server update (blacklist changes + priority)

* removed tests for template miner

* remove update script

* seq2seq defaults updates (#836)

* defaults updates

* 1 minute timeout

* Bit 495 synapse more unit test (#828)

* added dend receptor test

* missing synapse test

* fix

* added axon text

* fix

* .

* small defaults update for seq2seq (#837)

* Fix uninitialized device variable in core_server

* Check for any available cuda device in core_server
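
For reference, a minimal sketch (not the core_server code itself) of initializing the device once, using any available CUDA device and falling back to CPU otherwise:

```python
import torch

# Pick any available CUDA device, otherwise fall back to CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(16, 16).to(device)   # placeholder model
inputs = torch.randn(4, 16, device=device)   # create tensors directly on the same device
outputs = model(inputs)
```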

* Allow for higher pytorch versions for CUDA 11.6 (sm_86) support

* Remove forced training when autocast and cuda

* Cast back to float32 in case autocast in logit translation

* Truncate logits down to vocab length

* put tensors on the device

* Validator hotfix (#844)

* .

* .

* Change tokenizer pad_token to eos_token

Define PAD Token = EOS Token = 50256, according to https://github.com/huggingface/transformers/blob/49c8c67fb815a277405f84dea4a66353e19fb347/tests/models/gpt2/test_modeling_gpt2.py#L532

Set padding_side = "left", since generative default expects most recent token on right-hand side with padding on left, according to huggingface/transformers#10552
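
For reference, a minimal sketch of this tokenizer setup with the Hugging Face transformers API; the model name and example strings are placeholders, not taken from this PR.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token    # PAD token = EOS token (id 50256 for GPT-2)
tokenizer.padding_side = "left"              # keeps the most recent token on the right-hand side

batch = tokenizer(["short prompt", "a slightly longer prompt"],
                  padding=True, return_tensors="pt")
print(batch["input_ids"].shape)              # padded on the left to the longest prompt
```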

* Simplify token padding in remapping_token_causallm

* Use pad_token in remapping_token_causallm in server

Note that tokenizer(padding=True, ...) is not used because unpadded offset_mapping is required for logit translation operations.

* Add tokenizer flags to remapping_token_causallm

To allow function to be used in various scenarios, including for causallm and generate.

* Combine remapping_token functions in server

Now a single remapping_token function serves all server forward functions.

* Update forward_generate with new token_remap

* Remove old remapping_token

* Adjust token_remap parameters in core_server

* Undo pad handling in validator forward

Now tokenizer.padding_side='left' ensures that the last position fulfils validation. Additionally, validation is usually performed on unpadded [batch_size, sequence_len] tokens.

* Tensorize token input_ids and attention_mask in core_server

* Fix legacy constructor device parameter issue

* Add GPT2 generate convention pad_token_id=eos_token_id

https://github.com/huggingface/transformers/blob/49c8c67fb815a277405f84dea4a66353e19fb347/tests/models/gpt2/test_modeling_gpt2.py#L532

* Maint reposition remapping_token function

* use fast fix

* Revert "use fast fix"

This reverts commit e34c032.

* Add TextCausalLMNext Synapse to proto

Specifies messaging of topk server token phrases with probabilities. Server last position token predictions are retokenized to token phrases with the bittensor tokenizer. Allows for zero translation loss CausalLM next generation between different tokenizers.

Also adds comment specifying proto compile command, which is useful to see for manual compilation instruction.

* Add code_to_synapse for text_causal_lm_next

* Add TextCausalLMNext Synapse

Specifies messaging of topk server token phrases with probabilities. Server last position token predictions are retokenized to token phrases with the bittensor tokenizer. Allows for zero translation loss CausalLM next generation between different tokenizers.

* Add text_causal_lm_next to Dendrite class

Specifies messaging of topk server token phrases with probabilities. Server last position token predictions are retokenized to token phrases with the bittensor tokenizer. Allows for zero translation loss CausalLM next generation between different tokenizers.

* Add synapse_causal_lm_next to Axon

Specifies messaging of topk server token phrases with probabilities. Server last position token predictions are retokenized to token phrases with the bittensor tokenizer. Allows for zero translation loss CausalLM next generation between different tokenizers.

* Add topk token phrases utilities

Tokenizer utilities adds functions to compact and unravel topk server token phrases (standard tokenized), to be used with TextCausalLMNext synapse.

* Add unit test for topk token phrases utilities

Unit test new tokenizer utility functions that compact and unravel topk server token phrases (standard tokenized), to be used with TextCausalLMNext synapse.

* Add phrase entropy for topk token phrases

Calculates the cross entropy of a phrase prediction against a target phrase, so that this is a multi-token extension of typical cross entropy calculated for next token prediction, to be used with TextCausalLMNext synapse.
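
The shape of that calculation, sketched under assumed names (the real tokenizer_utils function operates on the compacted topk tensors described later): score the target continuation by the probability of a matching predicted phrase, or by the floor probability when nothing matches, then take the negative log.

```python
import math

def phrase_cross_entropy_sketch(target_tokens, topk_phrases, floor_prob, eps=1e-9):
    # topk_phrases: list of (probability, token_list) predictions for one position.
    prob = floor_prob                               # fall back to the floor probability
    for p, phrase in topk_phrases:
        if phrase == target_tokens[:len(phrase)]:   # prediction matches the start of the target phrase
            prob = p
            break
    return -math.log(prob + eps)                    # epsilon guards the log

loss = phrase_cross_entropy_sketch([464, 2159], [(0.35, [464]), (0.10, [1212])], floor_prob=1e-6)
```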

* Add unit test for phrase entropy for topk token phrases

Adds unit test for calculating the cross entropy of a phrase prediction against a target phrase, so that this is a multi-token extension of typical cross entropy calculated for next token prediction, to be used with TextCausalLMNext synapse.

* Add axon tests for TextCausalLMNext

* Add dendrite tests for TextCausalLMNext

* Add forward-backward tests for TextCausalLMNext

* Add receptor tests for TextCausalLMNext

* Add receptor_pool tests for TextCausalLMNext

* Update receptor_pool tests for TextCausalLMNext

* Update receptor_pool tests for TextCausalLMNext

* Add encode_forward_causallmnext() to server

To be used for TextCausalLMNext synapse, already compacts/encodes response for transfer to remote dendrite with backward support.

Forward pass through the pretrained model and select topk tokenizer logits and retokenize with std_tokenizer, then compact new token phrases and probabilities into 1-D tensor [ > batch_size * 2 * topk + 1] prob + at least 1 token per phrase + floor_prob. The floor probability is the mean probability of token phrases not captured in topk, required since the server tokenizer vocab_size may not be known to the receiver/validator.

* Add forward_casual_lm_next() to server

To be used for TextCausalLMNext synapse, already compacts/encodes response for transfer to remote dendrite with backward support.

Forward pass through the pretrained model and select topk tokenizer logits and retokenize with std_tokenizer, then compact new token phrases and probabilities into 1-D tensor [ > batch_size * 2 * topk + 1] prob + at least 1 token per phrase + floor_prob. The floor probability is the mean probability of token phrases not captured in topk, required since the server tokenizer vocab_size may not be known to the receiver/validator.
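
A rough sketch of the top-k plus floor-probability idea (assumed names, standard PyTorch only): keep the k most likely next tokens and summarize the remaining vocabulary by its mean probability, so the receiver never needs to know the server's vocab_size.

```python
import torch

def topk_with_floor(last_position_logits: torch.Tensor, topk: int = 512):
    probs = torch.softmax(last_position_logits.float(), dim=-1)   # [batch_size, vocab_size]
    topk_probs, topk_ids = probs.topk(topk, dim=-1)               # [batch_size, topk]
    remaining = probs.shape[-1] - topk
    floor_prob = (1.0 - topk_probs.sum(dim=-1)) / remaining       # mean prob of uncaptured tokens
    return topk_probs, topk_ids, floor_prob

topk_probs, topk_ids, floor_prob = topk_with_floor(torch.randn(2, 50257))
```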

* Add causallmnext to core_server neuron config

* Add calc_loss_fct to neuron_utilities

* Add PositionalEncoding to neuron_utilities

* Import neuron_utilities and use its PositionalEncoding

* Add validation_len to core_validator neuron params

Number of tokens to hold out for phrase validation beyond the sequence context.

* Maint core_validator forward() formatting and docs

* Maint core_validator forward() num_servers -> num_endpoints

* Detach synapse responses and move to validator device

* Add synergy_table display to core_validator

Prints the synergy loss diff matrix with pairwise loss reduction due to synergy (original loss on diagonal)

* Add stats_table display to core_validator

Gathers data and constructs neuron statistics table and prints it.

* Add synapse_table display to core_validator

Prints the evaluation of the neuron responses to the validator request.

* Add unsuccess display to core_validator

Prints the return codes and response times of unsuccessful responses.

* Add shapley_base to core_validator

Calculate Shapley base values and neuron response validation measure statistics, given responses from a synapse.

* Add scaling_law_loss_to_params to core_validator

(OpenAI scaling laws) Kaplan, Jared, et al. "Scaling laws for neural language models." arXiv:2001.08361 (2020)
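
A hedged sketch of what "loss to parameters" can mean under the Kaplan et al. fit; the constants below are the published values (N_c ≈ 8.8e13, α_N ≈ 0.076), while any clamping or rescaling the validator applies is not reproduced here.

```python
def scaling_law_loss_to_params_sketch(loss_nats: float,
                                      n_c: float = 8.8e13,
                                      alpha_n: float = 0.076) -> float:
    # Kaplan et al. (2020): L(N) = (N_c / N) ** alpha_N, so N = N_c * L ** (-1 / alpha_N).
    return n_c * loss_nats ** (-1.0 / alpha_n)

print(f"{scaling_law_loss_to_params_sketch(3.0):.2e} effective parameters at a 3.0 nat loss")
```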

* Add shapley_synergy to core_validator

Calculates Shapley synergy for coalition size 2, measured performance above expected performance. Measured in effective number of model parameters, just like base Shapley values.

* Add textcausallm to core_validator

Calculate Shapley values and neuron response validation measure statistics, given TextCausalLM synapse responses.

* Add textcausallmnext to core_validator

Calculate Shapley values and neuron response validation measure statistics, given TextCausalLMNext synapse responses.

* Use synapse validation functions in core_validator

Replace monolith with individual synapse validation function use in forward().

* Add neuron statistics variables to core_validator

* Add neuron_stats_update to core_validator

Updates self.neuron_stats with new individual dictionaries per uid.

* Add calculate_weights to core_validator

Calculates neuron set-weights from weight_key mapped values. Defines weight_key as the neuron stats key used to obtain the mapped stat value (typically a Shapley value) that the final set-weights are calculated from.
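
A tiny sketch (assumed names, including the weight_key string) of turning per-uid stats into set-weights: pull the weight_key value for each uid and normalize over the uids being set.

```python
def calculate_weights_sketch(neuron_stats: dict, weight_key: str = "shapley_values_nxt"):
    values = {uid: stats.get(weight_key, 0.0) for uid, stats in neuron_stats.items()}
    total = sum(values.values()) or 1.0
    return {uid: value / total for uid, value in values.items()}

weights = calculate_weights_sketch({7: {"shapley_values_nxt": 2.0}, 9: {"shapley_values_nxt": 1.0}})
print(weights)   # uid 7 gets two thirds of the weight, uid 9 one third
```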

* Add __str__ and __repr__ for core_validator

Display UID, IP, wallet address summary for validator.

* Add weights_table to core_validator

Prints weights table given topk_uids and topk_weights.

* Use neuron_stats_update in core_validator

* Use stats_table to print stats update in core_validator

* Use calculate_weights and weights_table in core_validator

* Replace server_stats with neuron_stats in core_validator

* Fix rich print formatting in core_validator

* Update neuron_stats_columns with TextCausalLMNext synapse fields

* Move tensors to core_validator device

* Simplify shapley_synergy loss_diff_share

* Move tensors to same device in phrase_cross_entropy

* Use non-zero weight_key in core_validator

* Fix topk not implemented on cpu for float16

* Unforce remote_train when cuda and float16

* Ensure probability computations done in float32 for improved precision

* Set tokenizer.padding_side = "left" in core_server

Generative default expects most recent token on right-hand side with padding on left. huggingface/transformers#10552

* Run model for TextCausalLMNext always

Do not reuse model outputs from TextCausalLM, since the padding side differs.

* Gradually increase weight fields from zero via EMA

New axons will gradually increase in weighting as the number of successfully responded queries grows. This ensures that sufficient observations are averaged before weighting to address potentially noisy validation measures.
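
A minimal EMA sketch (the step size and bookkeeping are assumptions): a freshly observed uid starts at zero and each successful response moves its stat only a fraction of the way toward the newest measurement, so weights grow gradually as responses accumulate.

```python
def ema_update(neuron_stats: dict, uid: int, new_value: float, alpha: float = 0.05) -> float:
    previous = neuron_stats.get(uid, 0.0)                      # new axons start from zero
    neuron_stats[uid] = (1.0 - alpha) * previous + alpha * new_value
    return neuron_stats[uid]

stats = {}
for measurement in (1.0, 1.0, 1.0):
    ema_update(stats, uid=7, new_value=measurement)
print(stats[7])   # ~0.14 after three responses, still far below the measured 1.0
```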

* Add vocab to tokenizer if it does not have attr

* Add vocab to tokenizer if it does not have attr

* Define vocab_len as real tokenizer vocabulary length

* Define vocab_len as real tokenizer vocabulary length

* Try use_fast=False at tokenizer load if fast fails
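
The fallback described here, sketched against the Hugging Face API (illustrative only):

```python
from transformers import AutoTokenizer

def load_tokenizer(model_name: str):
    try:
        return AutoTokenizer.from_pretrained(model_name)                  # fast tokenizer first
    except Exception:
        return AutoTokenizer.from_pretrained(model_name, use_fast=False)  # slow Python fallback

tokenizer = load_tokenizer("gpt2")
```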

* Add epsilon for log in loss computation

* Sort validation table according to TextCausalLMNext

More models, like OPT, are supported by TextCausalLMNext than by TextCausalLM, which requires fast tokenizers. Sorting the validation table according to the more populated synapse provides a better view.

* Avoid EMA of nan values in core_validator neuron_stats

* Replace nan/inf losses with large loss in core_validator

* Swap validator table columns TextCausalLM <-> TextCausalLMNext

* Ensure softmax computation done in float32 for improved precision
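
For illustration, the upcast pattern this refers to (a generic PyTorch sketch, not the repository code):

```python
import torch

logits = torch.randn(2, 50257, dtype=torch.float16)   # half-precision model output
probs = torch.softmax(logits.float(), dim=-1)          # softmax computed in float32
probs = probs.to(logits.dtype)                         # cast back only if needed downstream
```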

* Simplify remapping_token to include untensorized offset_mappings

* Simplify encode_forward_causallm in core_server

* Remove left padding from batch probabilities for logit translation

* Remove left padding from batch tokens for logit translation

* Return internal model_output in encode_forward_causallm

* Reuse model_output in encode_forward_causallmnext in core_server

* Move tokens to core_server device

* Set std_sequence_len according to tokens_std in logit translation

* Check that model_output is not None before overwrite in axon

Otherwise some synapses like TextSeq2Seq with model_output=None will overwrite previous (potentially) non-None model_output.

* Revert model_output check in axon

* Log server loss and translated loss for TextCausalLM in core_server

* Update target_phrases type

* Add validation loss for TextCausalLMNext

To complement the existing phrase cross entropy loss, and to allow for more direct comparison to TextCausalLM.

* Move target_phrase to cpu in phrase_cross_entropy

* Remove losses_val_nxt in textcausallmnext

* Use floor probability when no validation match in phrase_cross_entropy

* Update logger used for server translated loss in core_server

* Use loguru info in core_validator instead of print

* Update logger style in core_validator

* Update logger style in core_validator

* Set vocab_len for tokenizers to store real vocabulary size

* Update test_tokenizer_utils.py

* Omit attention_mask in TextCausalLM (update unittests)

Transformer models like gerpt2 typically perform worse with a left-side attention mask, so it is turned off.

* Add message to synapse_callback return outputs

Allows direct loss evaluations and statistics to be passed through from the server model synapse execution to the axon logging output, useful to display server-side loss values for verification at validation.

* Add message to synapse_callback return outputs

Allows direct loss evaluations and statistics to be passed through from the server model synapse execution to the axon logging output, useful to display server-side loss values for verification at validation.

* [BIT 487] btcli regen coldkey with pubkey only (#831)

* fix no_prompt help dialog

* add function to regen coldkeypub from the pub part

* added new regen_coldkeypub to CLI

* add no_prompt option

* catch if the addr/pub key is valid first

* add None default

* fix utils import

* explicitly set ss58_format

* add tests for new regen_coldkeypub

* add integration test for wallet coldkeypub create

* add integration test for cli regen_coldkeypub

* fix type

* add wallet args

* fix config access

* don't specify bad function param

* fix test to only check coldkeypub

* remove printout inside validation check and rename

* fix uncaught name change

* Synapse basic fixes (#851)

* basic bug fixes

* sample config fixes

Co-authored-by: Ala Shaabana <shaabana@gmail.com>
Co-authored-by: joeylegere <joeylegere@gmail.com>

* Add support for non whitespace-preserving tokenizers

Prepends strings with a space, in the case of non whitespace-preserving tokenizers like BERT, which should mimic whitespace-preserving token strings more often than not.

Adds error catching on per-UID basis in shapley_base for synapse validation to be more robust.
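
A one-function sketch of the whitespace shim mentioned in the first point (the helper name is assumed): prepend a space so a BERT-style tokenizer sees token strings closer to what a whitespace-preserving tokenizer would produce.

```python
def prep_text(text: str) -> str:
    # GPT-2-style tokenizers encode the leading space into the token; BERT-style ones drop it.
    return text if text.startswith(" ") else " " + text

print(repr(prep_text("hello world")))   # ' hello world'
```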

* Move prep_tokenizer to tokenizer_utils.py

* Update warning message in core_validator synapse validation

* Disable coveralls (#855)

* Update README.md

* disable publishing coverage

* dont upload any coverage

Co-authored-by: Unconst <32490803+unconst@users.noreply.github.com>
Co-authored-by: Joey Legere <joey@opentensor.ai>

* Add set_std_token_phrases for caching std_tokenizer equivalent of tokenizer token strings

Sets std_token_phrases which are the tokenizer token strings tokenized with std_tokenizer, so the std_tokenizer equivalent of the tokenizer token strings. Used for converting model predictions/logits into std_tokenizer representations, for example in TextCausalLMNext.

* Update topk_token_phrases to output 2D tensor with gradients instead of compact tensor

Standardizes the TextCausalLMNext server model output to 2D tensor with gradients, thereafter the axon encoding will compact the 2D tensor into 1D by removing ignore_index padding. Standardization should allow for multithreaded decoding at dendrite receptor_pool into 2D tensor for validator to form matching gradient shape.

Select topk tokenizer logits/phrases and include std_token_phrases counterparts (std_tokenization of token text) in topk_tensor output of shape [batch_size * (topk + 1), max_len], where max len of all phrase lists (with prob in front) is max_{b,k}(len([prob_k, tok_0_k, tok_1_k, ...])).

The output topk_tensor also includes a floor_prob for each batch item. The floor probability is the mean probability of token phrases not captured in topk, required since the tokenizer vocab_size may not be known to the receiver.

* Add compact_topk_token_phrases to tokenizer_utils.py

Compact 2D topk_tensor [batch_size * (topk + 1), max_len] by removing ignore_index padding, and also offset tokens by 2 to preserve [0, 1] for probabilities to allow for proper unraveling demarcated by probability boundaries.

* Update topk_token_phrases usages

* Update unravel_topk_token_phrases to form 2D topk_tensor

Unravel topk token phrases input_tensor from 1-D to [batch_size * (topk + 1), max_len] topk_tensor, which includes topk token probabilities (prob_k) + floor_prob in first column with gradients attached, with std_tokens in remaining columns with ignore_index padding.

* Update phrase_cross_entropy to use topk_tensor as input

* Update phrase_cross_entropy usages

* Change topk_tensor shape to [batch_size, (topk + 1), max_len]

* Update TextCausalLMNext synapse to use forward_response encoding/decoding

* Add backward_request_gradient encoding/decoding for TextCausalLMNext with new dimensions

* Update TextCausalLMNext unit tests with new backward dimensions

* Update server to perform encoding in synapse

* Update check_len and topk_tensor.device in phrase_cross_entropy

* Update unit tests for TextCausalLMNext with new shapes

* Update check_backward_request_gradient for TextCausalLMNext with new shapes

* Update nill responses of TextCausalLMNext with new shapes

* Debug axon tests

* Debug axon tests

* Debug axon tests

* Update decode_backward_request_gradient shape

* Update nill responses for TextCausalLMNext synapses

* Update nill responses for TextCausalLMNext synapses

* Update test_receptor_neuron_mock_server shape for causallmnext

* Debug receptor_impl response deserialization exception

* Update causallmnext shape in receptor tests

* Update causallmnext shape in receptor tests

* Update receptor stub shapes for causallmnext

* Debug receptor_impl response deserialization exception

* Update assert message in unravel_topk_token_phrases

* Add template_forward_response_tensor for TextCausalLMNext

* Prepare tokenizer with std_token_phrases in unit tests

* Assert ResponseDeserializationException for failed causallmnext unravel

* Add and update causallmnext unit tests

* Require logging debug or trace to print validator tables

* Use cell values correctly to sort table rows in validator

* Add parameter count estimates to neuron_stats in validator

* Parameterize scaling_law_power for scaling_law_loss_to_params

* Move scaling_law_power into validation_func

* Add synapse name to synergy_table

* Limit step weight table to neurons validated in that step

* Exclude include_uids with no stats in weights_table

* Remove synapse name from synergy_table

* Convert set to list in weights_table

* Update weights_table status info

* Remove unused import from tokenizer_utils

* Remove unused name parameter in synergy_table

* Add argument --nucleus.scaling_law_power to core_validator

Power for modified scaling law, powered down to improve dynamic range, e.g. 3 → 6 nats for 0.5.

* Mark updated uids in weights_table

* Adjust uid marking in weights_table

* Remove unused imports in core_validator

* Mark uid row in stats_table with uid key

* Mark uid row in stats_table with uid key

* [BIT-532] Adding TextCausalLM user arg (#859)

* added TextCausalLM user arg

* Set synapse_keys based on neuron.validation_synapse

Co-authored-by: opentaco <93473497+opentaco@users.noreply.github.com>
Co-authored-by: opentaco <opentaco@protonmail.com>

* Update weights_table conditions for include_uids

* Bump Transformer Requirements to 4.20.1 (#863)

Bump the Hugging Face transformers version.

* Add validator progress status console output when debug/trace off

* Add validator progress status console output when debug/trace off

* Add validator progress status console output when debug/trace off

* Change status message style

* Add validator identifier message

* Change status message style

* Change status message style

* Change status message style

* Change status message style

* Modify message ordering

* Add responsive/queried stat to validator status

* Change status message style

* Synapse timeout fix (#869)

* detach logits, and detach graph

* loss timing

* turn off priority threadpool

* timing + loss.item

* syntax fix

* better timings

* additional timing within _forward function

* enable threadpool

* remove loss cal

* thread pool limitation

* remove loss cal

* thread pool update

* Null entry

* import bittensor

* cancel future + remove finetune

* remove timings, add mutex lock for synapse calls

* cleanup + comments

* removing detach from forward

* more cleanup + increase priority size

* test fix

* revert loss calculation

* revert off arguments

* Bittensor V3.0.0 (#870)

version 3.0.0

* Additional Checks and Protections (#861)

* additional checks

* doc string update

* blacklist time update

* Wandb Maintenance (#862)

(1) added weight metric to wandb (2) accumulate wandb commit

* Bit 537 remove thread queue (#871)

* removed thread queue

* data_corpus fix

* fix

* added timeout

Co-authored-by: opentaco <opentaco@protonmail.com>
Co-authored-by: isabella618033 <49876827+isabella618033@users.noreply.github.com>
Co-authored-by: Cameron Fairchild <cameron.fairchild@mail.utoronto.ca>
Co-authored-by: joeylegere <joeylegere@gmail.com>
Co-authored-by: Cameron Fairchild <cameron@opentensor.ai>
Co-authored-by: opentaco <93473497+opentaco@users.noreply.github.com>
Co-authored-by: Ala Shaabana <shaabana@gmail.com>
Co-authored-by: Unconst <32490803+unconst@users.noreply.github.com>
Co-authored-by: Joey Legere <joey@opentensor.ai>
camfairchild added a commit to camfairchild/bittensor that referenced this pull request Aug 8, 2022
ifrit98 deleted the Synapse-basic-fixes branch on May 24, 2023 14:41