Add a helper function to display vectors of logits nicely #112

Open
neelnanda-io opened this issue Dec 19, 2022 · 9 comments
Labels
enhancement (New feature or request), good first issue (Good for newcomers), help wanted (Extra attention is needed)

Comments

@neelnanda-io
Collaborator

Often you want to look at vectors over the vocabulary (eg the logits at a specific position). This is >50,000 dimensions and this is hard to interpret! I want there to be nice utils to visualize a vector like this.

An MVP would be a function mapping this to a pandas dataframe, with the token index, token string value, logit, log prob and probability. Either for just the top K, or for the entire vocab.
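A minimal sketch of what that MVP could look like, using only the standard library so it runs anywhere. The names `logit_table` and `to_dataframe` are hypothetical (not part of TransformerLens), and it assumes the logits arrive as a flat sequence of floats alongside a same-length list of token strings.

```python
import math


def logit_table(logits, vocab, top_k=None):
    """Return one row per token: index, string, logit, log prob, probability.

    Assumptions: `logits` is a 1-D sequence of floats over the vocabulary and
    `vocab` is a sequence of token strings of the same length.
    """
    if len(logits) != len(vocab):
        raise ValueError("logits and vocab must have the same length")
    # Numerically stable log-softmax: subtract the max before exponentiating.
    max_logit = max(logits)
    log_z = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    rows = [
        {
            "token_index": i,
            "token": vocab[i],
            "logit": x,
            "log_prob": x - log_z,
            "prob": math.exp(x - log_z),
        }
        for i, x in enumerate(logits)
    ]
    # Highest logit first; keep the entire vocab when top_k is None.
    rows.sort(key=lambda r: r["logit"], reverse=True)
    return rows if top_k is None else rows[:top_k]


def to_dataframe(rows):
    """Convert the rows to a pandas DataFrame (pandas is imported lazily)."""
    import pandas as pd

    return pd.DataFrame(rows)


# Toy example with a four-token vocabulary.
toy_vocab = [" the", " a", "!", " an"]
toy_logits = [2.0, 0.5, -1.0, 0.5]
top_rows = logit_table(toy_logits, toy_vocab, top_k=2)
all_rows = logit_table(toy_logits, toy_vocab)
```

Calling `to_dataframe(top_rows)` would give the dataframe view; with a real model the logits would first need converting from a tensor to a list (or the same logic rewritten with tensor ops).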

But I expect there are many ways to make something nice here! One option is to imitate nostalgebraist's graphing style for plot_logit_lens in `transformer_utils` (link). This takes a layer x position x d_vocab tensor and visualises it as a layer x position heatmap, printing the string value of the top token in each cell and colouring each cell by the top token's value.

image
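The data behind such a heatmap can be sketched roughly as below. This is not nostalgebraist's implementation; `top_token_grid` and `render_text` are hypothetical names, the input is assumed to be a nested list shaped [layer][position][d_vocab], and the text rendering stands in for a real plot where the value would set the cell colour.

```python
def top_token_grid(logits, vocab):
    """For each (layer, position) cell, return (top token string, top value).

    Assumption: `logits` is a nested list shaped [layer][position][d_vocab].
    """
    grid = []
    for layer in logits:
        row = []
        for cell in layer:
            # Index of the largest entry over the vocabulary dimension.
            best = max(range(len(cell)), key=cell.__getitem__)
            row.append((vocab[best], cell[best]))
        grid.append(row)
    return grid


def render_text(grid):
    """Render the grid as text, one line per layer (last layer on top here)."""
    lines = []
    for layer_idx in reversed(range(len(grid))):
        cells = [f"{tok!r}:{val:+.1f}" for tok, val in grid[layer_idx]]
        lines.append(f"L{layer_idx} | " + "  ".join(cells))
    return "\n".join(lines)


# Toy example: 2 layers, 2 positions, 3-token vocabulary.
toy_vocab = [" cat", " dog", " sat"]
toy_logits = [
    [[0.1, 0.3, 0.2], [1.0, 0.0, 0.5]],  # layer 0
    [[2.0, 0.1, 0.0], [0.2, 0.1, 3.0]],  # layer 1
]
grid = top_token_grid(toy_logits, toy_vocab)
text = render_text(grid)
print(text)
```

In a plotting library, the second element of each tuple would drive the colour scale and the first would be the cell annotation.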

neelnanda-io added the enhancement, help wanted, and good first issue labels on Dec 19, 2022
@sheikheddy
Contributor

sheikheddy commented Mar 2, 2023

Okay, I'm going to put down some rough thoughts:

> Often you want to look at vectors over the vocabulary (eg the logits at a specific position). This is >50,000 dimensions and this is hard to interpret! I want there to be nice utils to visualize a vector like this.

A more explicit way to put it:

| Encoding name | OpenAI models |
| --- | --- |
| gpt2 (or r50k_base) | Most GPT-3 models (and GPT-2) |
| p50k_base | Code models, text-davinci-002, text-davinci-003 |
| cl100k_base | text-embedding-ada-002 |

Let's start with this snippet from https://github.com/openai/tiktoken:

from tiktoken.load import data_gym_to_mergeable_bpe_ranks


def gpt2():
    # Build the BPE merge ranks from the original GPT-2 vocabulary files.
    mergeable_ranks = data_gym_to_mergeable_bpe_ranks(
        vocab_bpe_file="https://openaipublic.blob.core.windows.net/gpt-2/encodings/main/vocab.bpe",
        encoder_json_file="https://openaipublic.blob.core.windows.net/gpt-2/encodings/main/encoder.json",
    )
    return {
        "name": "gpt2",
        # Total vocabulary size, including the one special token below.
        "explicit_n_vocab": 50257,
        "pat_str": r"""'s|'t|'re|'ve|'m|'ll|'d| ?\p{L}+| ?\p{N}+| ?[^\s\p{L}\p{N}]+|\s+(?!\S)|\s+""",
        "mergeable_ranks": mergeable_ranks,
        "special_tokens": {"<|endoftext|>": 50256},
    }

For clarity, here are a few assumptions:

  • There is a static dictionary mapping token ids (int) to token values (string).
  • Tokens earlier in the sequence affect the likelihood of tokens later (attention)
  • We are interested in how simple local interactions affect complex global structure.
  • The context window holds a model-dependent number of tokens (1024 for GPT-2, up to 4096 for some larger models), and each position has 50k+ possible tokens.
  • We usually look at logits at the final layer but in principle can check them at any output layer.
  • You can normalize with softmax(logits) to get probabilities, or with log_softmax(logits) to get logprobs.
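To make the first and last assumptions concrete, here is a small sketch with a toy vocabulary. The `encoder` dict is a made-up stand-in for a string → id mapping like the one in encoder.json, and the helper names are hypothetical. Note that softmax yields probabilities, and taking the log of those (or applying log-softmax directly) yields log-probabilities.

```python
import math

# Toy stand-in (an assumption) for a token string -> token id mapping.
encoder = {"Hello": 0, " world": 1, "!": 2, "<|endoftext|>": 3}
# The static id -> string dictionary from the first assumption.
decoder = {token_id: token for token, token_id in encoder.items()}


def log_softmax(logits):
    """Log-probabilities, computed stably by subtracting the max logit."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def softmax(logits):
    """Probabilities: the exponential of the log-probabilities."""
    return [math.exp(lp) for lp in log_softmax(logits)]


logits = [1.0, 3.0, 0.5, -2.0]
probs = softmax(logits)
logprobs = log_softmax(logits)

# Look up the most likely token's string via the static dictionary.
best_id = max(range(len(probs)), key=probs.__getitem__)
best_token = decoder[best_id]
```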

Here's a few ideas:

  • If you have a really long wire, or a really long strip of film, then it will take up a lot of horizontal space, and you will only be able to see a small slice at once.
  • In the same way we wrap up wires into coils, and filmstrips into spools of cassette tapes, we can take our string of positions and bend them into a circle like a paperclip.
  • There are two main types of loops: in the first one, you touch the two ends together to make the outline of a circle. You can use the space inside to draw lines that show connections between different sections. And you can use the space outside for hierarchy.
  • In the second one, you wind it up really tightly, like a rope, or DNA. For these, we can imagine marking each cell with a color according to state (this makes me think about Turing Machines for some reason).
  • You can also animate it to unwind at a variable rate, where the speed is controlled by uncertainty, faster when probability mass is concentrated, slower when it is more spread out. (This makes me think about Fourier decompositions and spectrograms). Ideally this would autoplay, like https://gource.io/.
  • See this https://ourworldindata.org/technology-long-run and https://socks-studio.com/2021/11/03/constructing-knowledge-through-geometry-ramon-llulls-figures-in-ars-magna-1305/ for inspiration.
  • Analogy to video editing: the timeline is your 1d position vector, and you layer different effects and masks and footage together into one final rendered composite. For interpretability, you run this process in reverse: go back from a final video to the source.
  • Finally, interactivity would be nice to have. Bret Victor has written a lot about this from a user design point of view, e.g. http://worrydream.com/MagicInk/#reducing_interaction. To make the programming a bit easier, I'd recommend borrowing heavily from existing component libraries.
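The "unwind at a variable rate" idea could be driven by entropy, roughly as sketched below. This is only one possible reading: the linear mapping, the speed bounds, and the names `entropy` and `playback_speed` are all assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def playback_speed(probs, min_speed=0.25, max_speed=4.0):
    """Map uncertainty to speed: concentrated mass -> fast, spread out -> slow.

    Assumption: a linear map from normalized entropy (0 = one-hot,
    1 = uniform) onto [max_speed, min_speed].
    """
    n = len(probs)
    if n < 2:
        return max_speed
    normalized = entropy(probs) / math.log(n)
    # Clamp to [0, 1] to guard against floating-point drift.
    normalized = min(max(normalized, 0.0), 1.0)
    return max_speed - normalized * (max_speed - min_speed)


confident = playback_speed([1.0, 0.0, 0.0, 0.0])  # all mass on one token
uncertain = playback_speed([0.25, 0.25, 0.25, 0.25])  # uniform over four tokens
in_between = playback_speed([0.7, 0.1, 0.1, 0.1])
```

Each position in the sequence would get its own speed from its next-token distribution, so the animation lingers where the model is unsure.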

All of this sounds a bit overkill for a helper function, but if fully realized, I think it'd be a really neat tool.

@sheikheddy
Contributor

I'll try to put a prototype up this weekend

@neelnanda-io
Collaborator Author

neelnanda-io commented Mar 2, 2023 via email

@sheikheddy
Contributor

Seems like this would be a contribution to https://github.com/alan-cooney/CircuitsVis/blob/main/python/circuitsvis/logits.py, not TransformerLens?

@neelnanda-io
Collaborator Author

neelnanda-io commented Mar 7, 2023 via email

@jbloomAus
Collaborator

@sheikheddy @neelnanda-io What's the plan here? Do we need an interactive visualization or will something else do?

@abdurraheemali

For a non-interactive visualization, flame graphs do pretty well: https://www.brendangregg.com/blog/2017-02-06/flamegraphs-vs-treemaps-vs-sunburst.html

(I'm @sheikheddy from an alt-account)


4 participants