In [1]:
import ecco
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoModel, AutoModelForSeq2SeqLM
hf_model_id = 'microsoft/deberta-v3-small'


# 1- load the model and the tokenizer
tokenizer = AutoTokenizer.from_pretrained(hf_model_id)
model = AutoModel.from_pretrained(hf_model_id)


Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Some weights of the model checkpoint at microsoft/deberta-v3-small were not used when initializing DebertaV2Model: ['mask_predictions.classifier.weight', 'lm_predictions.lm_head.LayerNorm.bias', 'mask_predictions.LayerNorm.weight', 'mask_predictions.dense.bias', 'lm_predictions.lm_head.dense.bias', 'mask_predictions.dense.weight', 'lm_predictions.lm_head.dense.weight', 'lm_predictions.lm_head.bias', 'mask_predictions.classifier.bias', 'lm_predictions.lm_head.LayerNorm.weight', 'mask_predictions.LayerNorm.bias']
- This IS expected if you are initializing DebertaV2Model from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DebertaV2Model from the checkpoint of a model that you expect to be exactly identical

## 2- identify model type
Identify the type of model. Is it `causal` like GPT, `mlm` like BERT, or `enc-dec` like T5? We learn this by reading about the model architecture. In this example, DeBERTa is based on BERT, so it is `mlm`.

## 3- identify layer names
We need the names of the embedding layer (to support attribution calculations) and the FFNN layer (to support neuron activation collection).

In [3]:
model

DebertaV2Model(
  (embeddings): DebertaV2Embeddings(
    (word_embeddings): Embedding(128100, 768, padding_idx=0)
    (LayerNorm): LayerNorm((768,), eps=1e-07, elementwise_affine=True)
    (dropout): StableDropout()
  )
  (encoder): DebertaV2Encoder(
    (layer): ModuleList(
      (0): DebertaV2Layer(
        (attention): DebertaV2Attention(
          (self): DisentangledSelfAttention(
            (query_proj): Linear(in_features=768, out_features=768, bias=True)
            (key_proj): Linear(in_features=768, out_features=768, bias=True)
            (value_proj): Linear(in_features=768, out_features=768, bias=True)
            (pos_dropout): StableDropout()
            (dropout): StableDropout()
          )
          (output): DebertaV2SelfOutput(
            (dense): Linear(in_features=768, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-07, elementwise_affine=True)
            (dropout): StableDropout()
          )
        )
        (intermediate): Deb

In [4]:
# Embeddings is:
model.embeddings.word_embeddings

Embedding(128100, 768, padding_idx=0)
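The config stores the embedding layer as a dotted path relative to the model object. As a quick sanity check, the path can be resolved attribute by attribute before it goes into the config. This is only a sketch: `resolve_dotted` is a hypothetical helper, not part of Ecco, and Ecco itself may look the layer up differently.

```python
from functools import reduce


def resolve_dotted(obj, path):
    """Follow a dotted attribute path such as 'embeddings.word_embeddings'.

    Raises AttributeError if any segment of the path does not exist.
    """
    return reduce(getattr, path.split("."), obj)


# Usage with the model loaded above (not executed here):
# resolve_dotted(model, "embeddings.word_embeddings")
```

If the path is misspelled, the call raises `AttributeError`, which is easier to diagnose than a failure later during attribution.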

In [15]:
# The FFNN is the one with the widest output dimension
# We take the parts of the name after the layer number. In this case it's 
# intermediate.dense
model.encoder.layer[0].intermediate.dense

Linear(in_features=768, out_features=3072, bias=True)
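The "widest output dimension" rule can also be applied programmatically rather than by reading the printed model. The sketch below assumes the layers of interest expose an `out_features` attribute (as `Linear` does in the printout above) and that the model is a base model without a vocabulary-sized head, which would otherwise be the widest layer. `widest_linear_suffix` is a hypothetical helper name.

```python
import re


def widest_linear_suffix(model):
    """Return (name_suffix, out_features) for the widest layer with `out_features`.

    The suffix is the module name with everything up to and including the
    first layer index removed, e.g. 'encoder.layer.0.' is dropped.
    """
    best_name, best_width = None, -1
    for name, module in model.named_modules():
        width = getattr(module, "out_features", None)
        # Keep the first module that strictly beats the current widest one
        if isinstance(width, int) and width > best_width:
            best_name, best_width = name, width
    if best_name is None:
        return None, None
    suffix = re.sub(r"^.*?\.\d+\.", "", best_name)
    return suffix, best_width


# Usage with the model loaded above (not executed here):
# widest_linear_suffix(model)
```

For the model printed above, this would be expected to point at `intermediate.dense` with 3072 output features, matching the manual inspection. Architectures whose feed-forward layers do not expose `out_features` need a different check.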

## Tokenizer

In [17]:
ids = tokenizer('tokenization')
ids

{'input_ids': [1, 10704, 4820, 2], 'token_type_ids': [0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1]}

In [19]:
tokenizer.convert_ids_to_tokens(ids['input_ids'])

['[CLS]', '▁token', 'ization', '[SEP]']

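The token that starts a word carries a leading `▁`, so `token_prefix` is `'▁'`.

The continuation piece `ization` carries no marker, so `partial_token_prefix` is `''`.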
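To make the two settings concrete, the sketch below labels each token as a word start or a continuation given a pair of prefixes. It illustrates the convention only and is not Ecco's implementation; `label_tokens` is a hypothetical helper, and special tokens such as `[CLS]` are left out of the input.

```python
def label_tokens(tokens, token_prefix, partial_token_prefix):
    """Strip prefix markers and label each token as 'word-start' or 'continuation'."""
    labelled = []
    for tok in tokens:
        if token_prefix and tok.startswith(token_prefix):
            labelled.append((tok[len(token_prefix):], "word-start"))
        elif partial_token_prefix and tok.startswith(partial_token_prefix):
            labelled.append((tok[len(partial_token_prefix):], "continuation"))
        elif token_prefix:
            # Word starts are marked, so an unmarked token continues a word
            labelled.append((tok, "continuation"))
        else:
            # Continuations are marked, so an unmarked token starts a word
            labelled.append((tok, "word-start"))
    return labelled


# DeBERTa-style tokens from the cell above, without the special tokens
deberta_labels = label_tokens(["▁token", "ization"], "▁", "")
print(deberta_labels)  # [('token', 'word-start'), ('ization', 'continuation')]

# BERT-style WordPiece tokens mark the continuation instead
bert_labels = label_tokens(["token", "##ization"], "", "##")
print(bert_labels)  # [('token', 'word-start'), ('ization', 'continuation')]
```

Both tokenizers split the word the same way here; they differ only in which piece carries the marker, and that difference is what the two config fields record.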

# Model config
If we're passing the model config directly to Ecco, it would look like this:

In [2]:

model_config = {
    'embedding': "embeddings.word_embeddings",
    'type': 'mlm',
    'activations': [r'intermediate\.dense'],  # This is a regex (raw string avoids an invalid-escape warning)
    'token_prefix': '▁',
    'partial_token_prefix': ''
}
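Because the `activations` entry is a regular expression, it helps to check which layer names it selects. The sketch below assumes search-style matching (`re.search`) against names in the form produced by `model.named_modules()`; the list is a hand-written sample for layer 0, not output captured from the model.

```python
import re

pattern = r'intermediate\.dense'  # same pattern as the 'activations' entry

# Sample layer names for layer 0 (hand-written for illustration)
layer_names = [
    'encoder.layer.0.attention.self.query_proj',
    'encoder.layer.0.attention.output.dense',
    'encoder.layer.0.intermediate.dense',
    'encoder.layer.0.output.dense',
]

matched = [name for name in layer_names if re.search(pattern, name)]
print(matched)  # ['encoder.layer.0.intermediate.dense']
```

The escaped dot matters: an unescaped `.` matches any character, so the pattern would be looser than intended. The other `dense` layers are excluded because their names lack the `intermediate` segment.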

In [3]:
ecco_model = ecco.from_pretrained(hf_model_id, 
                                  activations=True,
                                  model_config=model_config)

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Some weights of the model checkpoint at microsoft/deberta-v3-small were not used when initializing DebertaV2Model: ['mask_predictions.classifier.weight', 'lm_predictions.lm_head.LayerNorm.bias', 'mask_predictions.LayerNorm.weight', 'mask_predictions.dense.bias', 'lm_predictions.lm_head.dense.bias', 'mask_predictions.dense.weight', 'lm_predictions.lm_head.dense.weight', 'lm_predictions.lm_head.bias', 'mask_predictions.classifier.bias', 'lm_predictions.lm_head.LayerNorm.weight', 'mask_predictions.LayerNorm.bias']
- This IS expected if you are initializing DebertaV2Model from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DebertaV2Model from the checkpoint of a model that you expect to be exactly identical

In [5]:
text = ''' Now I ask you: \n what can be expected of man since he is a being endowed with strange qualities? Shower upon him every earthly blessing, drown him in a sea of happiness, so that nothing but bubbles of bliss can be seen on the surface; give him economic prosperity, such that he should have nothing else to do but sleep, eat cakes and busy himself with the continuation of his species, and even then out of sheer ingratitude, sheer spite, man would play you some nasty trick. He would even risk his cakes and would deliberately desire the most fatal rubbish, the most uneconomical absurdity, simply to introduce into all this positive good sense his fatal fantastic element. It is just his fantastic dreams, his vulgar folly that he will desire to retain, simply in order to prove to himself--as though that were so necessary-- that men still are men and not the keys of a piano, which the laws of nature threaten to control so completely that soon one will be able to desire nothing but by the calendar. And that is not all: even if man really were nothing but a piano-key, even if this were proved to him by natural science and mathematics, even then he would not become reasonable, but would purposely do something perverse out of simple ingratitude, simply to gain his point. And if he does not find means he will contrive destruction and chaos, will contrive sufferings of all sorts, only to gain his point! He will launch a curse upon the world, and as only man can curse (it is his privilege, the primary distinction between him and other animals), may be by his curse alone he will attain his object--that is, convince himself that he is a man and not a piano-key!
'''

inputs = ecco_model.tokenizer([text], return_tensors="pt")
output = ecco_model(inputs)

nmf_1 = output.run_nmf(n_components=8) 

nmf_1.explore()

<IPython.core.display.Javascript object>

In [6]:
output.config

{'embedding': 'embeddings.word_embeddings',
 'type': 'mlm',
 'activations': ['intermediate\\.dense'],
 'token_prefix': '▁',
 'partial_token_prefix': '',
 'tokenizer_config': {'token_prefix': '▁', 'partial_token_prefix': ''}}

## model-config.yaml
If we're adding to model-config.yaml, the entry looks like this:





In [23]:
"""

microsoft/deberta-v3-small:
    embedding: "embeddings.word_embeddings"
    type: 'mlm'
    activations:
        - 'intermediate\.dense'
    token_prefix: '▁'
    partial_token_prefix: ''

"""

'\n\nmicrosoft/deberta-v3-small:\n    embedding: "embeddings.word_embeddings"\n    type: \'mlm\'\n    activations:\n        - \'intermediate\\.dense\'\n    token_prefix: \'▁\'\n    partial_token_prefix: \'\'\n\n'