
CKA for representation similarity#48

Merged: Xmaster6y merged 7 commits into main from cka on Apr 15, 2026

@Xmaster6y (Owner) commented Apr 15, 2026

What does this PR do?

Key insights about the PR.

Linked Issues

Checklist

  • I have read the CONTRIBUTING guide.
  • I have added tests for my changes if needed.
  • I have updated the documentation if needed.

Summary by cubic

Add centered kernel alignment (CKA) to compare latent representations, with a new estimator, tests, and a docs notebook. Introduces tdhook.latent.representation_similarity.CkaEstimator (re-exported from tdhook.latent); related to #34.

  • New Features

    • New subpackage tdhook.latent.representation_similarity exporting CkaEstimator.
    • Computes linear CKA on (N, D) or (..., N, D); reads in_key_a/in_key_b, writes cka; validates shapes/device; promotes dtypes; returns nan on degenerate inputs; deterministic.
    • Docs: added a “Representation Similarity” notebook and methods card; Sphinx now keeps generated API files; .gitignore ignores docs/source/api/.
    • Tests: coverage for batching, invariances, custom keys, error paths, dtype promotion, and estimator repr.
  • Refactors

    • Attribution helpers and LRP rules: added docstrings, improved error messages, and simplified a mapper call path.

Written for commit c300361. Summary will update on new commits.
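For readers unfamiliar with the metric, linear CKA between two representations of the same N samples can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the code in `CkaEstimator`: the helper names `center_columns`, `cross_frobenius_sq`, and `linear_cka` are hypothetical, inputs are lists of rows rather than tensors, and only the single `(N, D)` case is covered. It mirrors the summary's note that degenerate inputs give `nan`.

```python
import math


def center_columns(x):
    """Subtract the per-feature (column) mean from an N x D matrix given as a list of rows."""
    n, d = len(x), len(x[0])
    means = [sum(row[j] for row in x) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in x]


def cross_frobenius_sq(x, y):
    """Return ||x^T y||_F^2 for x of shape (N, Dx) and y of shape (N, Dy)."""
    total = 0.0
    for i in range(len(x[0])):
        for j in range(len(y[0])):
            dot = sum(rx[i] * ry[j] for rx, ry in zip(x, y))
            total += dot * dot
    return total


def linear_cka(x, y):
    """Linear CKA: ||Yc^T Xc||_F^2 / (||Xc^T Xc||_F * ||Yc^T Yc||_F)."""
    if not x or len(x) != len(y):
        raise ValueError("x and y must be non-empty with the same number of samples")
    xc, yc = center_columns(x), center_columns(y)
    denom = math.sqrt(cross_frobenius_sq(xc, xc) * cross_frobenius_sq(yc, yc))
    if denom == 0.0:
        # Degenerate input (for example, constant features): similarity is undefined.
        return float("nan")
    return cross_frobenius_sq(xc, yc) / denom
```

With this sketch, `linear_cka(x, x)` is 1.0 up to floating-point error, and the value is unchanged when one input is rotated or uniformly rescaled, which are the invariances the tests in this PR are said to cover.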

codecov Bot commented Apr 15, 2026

Codecov Report

❌ Patch coverage is 98.50746% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 97.30%. Comparing base (b8e537b) to head (c300361).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/tdhook/attribution/gradient_helpers/helpers.py | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #48      +/-   ##
==========================================
+ Coverage   97.19%   97.30%   +0.11%     
==========================================
  Files          39       41       +2     
  Lines        2426     2489      +63     
==========================================
+ Hits         2358     2422      +64     
+ Misses         68       67       -1     


@Xmaster6y (Owner, Author) commented:

@cubic-dev-ai

cubic-dev-ai Bot commented Apr 15, 2026

@cubic-dev-ai

@Xmaster6y I have started the AI code review. It will take a few minutes to complete.

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 7 files

Confidence score: 4/5

  • This PR is likely safe to merge, but there is a moderate edge-case risk: torch.stack(cka_values) in src/tdhook/latent/representation_similarity/cka.py can raise at runtime when cka_values is empty.
  • The issue is reasonably well-supported (severity 5/10, confidence 9/10) and appears user-facing only when flattened batch size is 0, so impact is conditional rather than broad.
  • Pay close attention to src/tdhook/latent/representation_similarity/cka.py - handle empty-batch flows so cka_values is validated before stacking.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/tdhook/latent/representation_similarity/cka.py">

<violation number="1" location="src/tdhook/latent/representation_similarity/cka.py:46">
P2: `torch.stack(cka_values)` crashes for empty batches because `cka_values` can be empty when the flattened batch size is 0.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread src/tdhook/latent/representation_similarity/cka.py

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 7 files

Confidence score: 4/5

  • This PR looks safe to merge with minimal risk, but there is a user-facing reproducibility concern in docs/source/notebooks/methods/representation-similarity.ipynb.
  • The most significant issue is defaulting DEV=True, which can make Colab examples pull unreleased main code instead of the published package, leading to inconsistent results for readers.
  • Because the issue is moderate (4/10) and isolated to documentation notebook behavior rather than core runtime logic, the overall merge risk stays low.
  • Pay close attention to docs/source/notebooks/methods/representation-similarity.ipynb - ensure DEV defaults to False so users only opt into dev behavior explicitly.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="docs/source/notebooks/methods/representation-similarity.ipynb">

<violation number="1" location="docs/source/notebooks/methods/representation-similarity.ipynb:29">
P2: Defaulting `DEV` to `True` makes Colab run against unreleased `main` code by default. Set it to `False` so documentation examples are reproducible against the published package unless a user explicitly opts into dev mode.</violation>
</file>
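One way to make the published package the default is sketched below. The notebook's actual setup cell is not shown in this thread, so the helper `setup_command` and its parameters `package` and `repo_url` are hypothetical; the sketch only illustrates `dev` defaulting to `False` so that installing from `main` is an explicit opt-in.

```python
def setup_command(package: str, repo_url: str, dev: bool = False) -> str:
    """Return the pip command a notebook setup cell would run (hypothetical helper)."""
    if dev:
        # Opt-in only: install the unreleased code from the repository's main branch.
        return f"pip install git+{repo_url}@main"
    # Default: install the published release so examples stay reproducible.
    return f"pip install {package}"
```

Calling `setup_command(package, repo_url)` without `dev=True` always yields the plain release install, which is the behavior the review asks for.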

Comment thread docs/source/notebooks/methods/representation-similarity.ipynb

@cubic-dev-ai cubic-dev-ai Bot left a comment

0 issues found across 6 files (changes from recent commits).

Auto-approved: Isolated addition of CKA similarity estimator with comprehensive tests and documentation; existing code changes are limited to docstrings and trivial, safe refactors.

@cubic-dev-ai cubic-dev-ai Bot left a comment

0 issues found across 7 files (changes from recent commits).

Requires human review: Auto-approval blocked by 1 unresolved issue from previous reviews.

@cubic-dev-ai cubic-dev-ai Bot left a comment

0 issues found across 4 files (changes from recent commits).

Requires human review: Auto-approval blocked by 1 unresolved issue from previous reviews.

@Xmaster6y Xmaster6y merged commit bc7978d into main Apr 15, 2026
7 checks passed
@Xmaster6y Xmaster6y deleted the cka branch April 15, 2026 18:11