
Conversation

@meghana-0211 commented Feb 22, 2025

Single Token Feature Detection

Adds functionality to detect and log features that primarily activate on single tokens, as requested in #87.

Changes

  • Added is_single_token field to LatentRecord
  • Implemented detection logic in constructors.py
  • Updated scorer and analysis code to track single token metrics
  • Added single token ratio to verbose logging output

Implementation Details

  • Features are considered "single token" if ≥80% of the top 50% of activations are concentrated on a single token (see the sketch after this list)
  • Thresholds are configurable via parameters
  • Information is propagated through the entire pipeline
  • Results are included in logging when verbose mode is enabled
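
As a rough illustration of this rule (not the exact code in the PR), a minimal sketch assuming activations is a (num_examples, seq_len) tensor of one latent's activations; the function name, parameter names, and the 0.5 mass-share cutoff are illustrative. Note that the review below questions whether this per-sequence interpretation matches the intent of #87.

import torch

def is_single_token_feature(
    activations: torch.Tensor,                 # (num_examples, seq_len) activations of one latent
    quantile_threshold: float = 0.5,           # keep the top 50% of examples by peak activation
    activation_ratio_threshold: float = 0.8,   # require >=80% of them to be "single token"
) -> bool:
    # Rank examples by their strongest activation and keep the top fraction.
    peak_acts = activations.max(dim=1).values
    top_k = max(1, int(len(peak_acts) * quantile_threshold))
    top_examples = activations[peak_acts.topk(top_k).indices]

    # Treat an example as "single token" if one position carries more than
    # half of its total activation mass (the 0.5 cutoff is an assumption).
    mass_share = top_examples / (top_examples.sum(dim=1, keepdim=True) + 1e-8)
    significant_positions = (mass_share > 0.5).sum(dim=1)

    single_token_ratio = (significant_positions == 1).float().mean().item()
    return single_token_ratio >= activation_ratio_threshold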

Closes #87

@CLAassistant commented Feb 22, 2025

CLA assistant check
All committers have signed the CLA.

@SrGonao (Collaborator) commented Feb 22, 2025

If you are willing to give it another shot, I had already started in #89. I think it should be done after caching and not when you are building the feature examples (although I think having an option to skip single token latents is nice). Maybe we can have both of them - my method as a diagnostics tool that is used when verbose is true, and your suggestion as maybe an argument (skip_single_token?) that can be used to not explain single token latents? @luciaquirke, @norabelrose maybe you have ideas?
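
A minimal sketch of how the suggested flag could be exposed, assuming a dataclass-style constructor config; the class name is illustrative, max_examples / min_examples mirror the constructor_cfg fields quoted below, and skip_single_token is the hypothetical argument proposed above.

from dataclasses import dataclass

@dataclass
class ConstructorConfig:
    max_examples: int
    min_examples: int
    # Hypothetical flag from the discussion above: if True, latents detected
    # as single-token would be skipped (not explained) during construction.
    skip_single_token: bool = False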

@SrGonao mentioned this pull request Feb 22, 2025
max_examples = constructor_cfg.max_examples
min_examples = constructor_cfg.min_examples

token_windows, act_windows = pool_max_activation_windows(

@luciaquirke (Contributor) Feb 25, 2025

It looks like this call has been duplicated by mistake. I don't believe this code will run. I have added documentation on how to run the tests to the README.md.

@luciaquirke (Contributor) Feb 25, 2025

@meghana-0211 try pytest . and python -m delphi.tests.e2e.

# For each example, check if activation is concentrated in a single position
max_activations = activations.max(dim=1).values
top_k = int(len(max_activations) * quantile_threshold)
top_indices = max_activations.topk(top_k).indices

@luciaquirke (Contributor) Feb 25, 2025

I believe that the activations are already in top k order (tbh they should probably be renamed to ordered_example_acts or similar to reflect this, starting in _top_k_pools). So we should be able to directly slice the first len(activations) * quantile_threshold activations.
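
A short sketch of the simplification being suggested, assuming the example activations are already sorted from strongest to weakest:

# If the examples are already ordered by activation strength, the explicit
# top-k over per-example maxima is redundant and a slice is enough:
top_k = int(len(activations) * quantile_threshold)
top_activations = activations[:top_k]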

# Calculate ratio of single token activations
single_token_ratio = (significant_activations == 1).float().mean().item()

return single_token_ratio >= activation_ratio_threshold

@luciaquirke (Contributor) Feb 25, 2025

it doesn't look like this method actually checks whether the activations are single token 🤔 maybe I'm just confused but I can't see it

@luciaquirke (Contributor) commented Feb 25, 2025

Looks like there may have been a miscommunication - a single token feature is supposed to mean that the feature fires on the same token every time, not that it fires on one token within each sequence. This code also won't run as is. That said, the code in result_analysis looks great and could be pulled verbatim into a future PR for this issue.

@SrGonao I'm curious what the issue is with doing this type of non-LLM calling data processing in the constructor, could you say a bit more about your reasoning?
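
For contrast with the per-sequence check quoted above, a minimal sketch of the definition described here (the latent fires on the same token every time); tokens and activations are assumed to be aligned (num_examples, seq_len) tensors, and the function name and 0.8 threshold are illustrative.

import torch

def fires_on_same_token(
    tokens: torch.Tensor,               # (num_examples, seq_len) token ids
    activations: torch.Tensor,          # (num_examples, seq_len) activations of one latent
    same_token_threshold: float = 0.8,  # illustrative cutoff
) -> bool:
    # Token id at the max-activating position of each example.
    peak_positions = activations.argmax(dim=1)
    peak_tokens = tokens.gather(1, peak_positions.unsqueeze(1)).squeeze(1)

    # Fraction of examples whose peak token is the single most common peak token.
    modal_count = torch.bincount(peak_tokens).max()
    same_token_fraction = modal_count.float() / len(peak_tokens)
    return same_token_fraction.item() >= same_token_threshold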

@SrGonao (Collaborator) commented Feb 25, 2025

I think if we want to get statistics as a diagnostics tool it should be done after caching.
If we want to skip single token latents it should be done in the constructor.

@luciaquirke (Contributor) commented

Yeah I think we probably just want statistics and not skipping single token explanation, to keep things simple for now. I trust that we shouldn't collect statistics before caching is complete, but am curious as to the reasoning.

@SrGonao (Collaborator) commented Feb 26, 2025

Sorry, by after caching I meant right after. I think slowing down the constructor leads to underutilization of the GPUs, so in my mind I would like to keep as much "work" that can be done beforehand out of the constructor. Maybe it doesn't make a difference though, I just thought it would be more natural.

@luciaquirke (Contributor) commented

Closing due to inactivity
