Issues: TransformerLensOrg/TransformerLens
Issues list
[Bug Report] Different results from HuggingFace when using the GPT2 small example
complexity-high
Very complicated changes for people to address who are quite familiar with the code
implementation-inaccuracy
Any issues related to our implementation being off from the official version
needs-investigation
Issues that need to be recreated, or investigated before work can be done
#685
opened Jul 27, 2024 by
nreHieW
1 task done
[Question] Why does Transformer Lens only support quantized LLaMA models?
#684
opened Jul 26, 2024 by
miguel-kjh
[Bug Report] Qwen model implementation is too inaccurate
complexity-high
Very complicated changes for people to address who are quite familiar with the code
implementation-inaccuracy
Any issues related to our implementation being off from the official version
needs-investigation
Issues that need to be recreated, or investigated before work can be done
#683
opened Jul 23, 2024 by
bryce13950
1 task done
[Proposal] Demo and Tutorial on Patchscopes and "Patching + Generation"
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
demo
Creating a demo or tutorial
#680
opened Jul 16, 2024 by
HenryCai11
1 task done
[Proposal] Allow tied embeddings
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
enhancement
New feature or request
#671
opened Jul 12, 2024 by
neelnanda-io
Does the run_with_cache method support data parallelism, and how can I do it?
#669
opened Jul 12, 2024 by
Yang-bug-star
[Proposal] Allow recent versions of beartype
complexity-simple
Simple issues, which may be good for beginners
tooling
Anything pertaining to outside tools used within the codebase
#665
opened Jul 10, 2024 by
jettjaniak
1 task done
[Bug Report] Pythia output inconsistent across batch sizes when use_split_qkv_input=True
bug
Something isn't working
complexity-high
Very complicated changes for people to address who are quite familiar with the code
implementation-inaccuracy
Any issues related to our implementation being off from the official version
#661
opened Jul 8, 2024 by
oliveradk
1 task done
[Bug Report] RMSNormPre in Transformer_lens is maybe different from Llama source code?
#657
opened Jul 6, 2024 by
wangyifei0047
Is it possible to use a locally downloaded model without accessing HF?
#655
opened Jul 4, 2024 by
ccp123456
[Proposal] Documentation: Map the Act Names to the Transformer
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
documentation
Improvements or additions to documentation
#644
opened Jun 21, 2024 by
JuVogt
1 task done
[Proposal] Remove the overhead caused by full_hook.__name__ = (hook.__repr__())?
#631
opened Jun 8, 2024 by
verlocks
[Proposal] Add support for Baichuan1 and Baichuan2
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
#622
opened Jun 3, 2024 by
StarrySeas1
[Bug Report] The output from HookedTransformer is not identical compared to Huggingface model for Llama 3
bug
Something isn't working
complexity-high
Very complicated changes for people to address who are quite familiar with the code
#615
opened May 28, 2024 by
iamsimha
1 task done
[Proposal] Merge utils and utilities
complexity-moderate
Moderately complicated issues for people who have intermediate experience with the code
refactor
Changing something with the code that will either affect external user, or contributors
#612
opened May 27, 2024 by
bryce13950
1 task done
[Feature Request] Add Stopping Criteria support
complexity-high
Very complicated changes for people to address who are quite familiar with the code
enhancement
New feature or request
#595
opened May 15, 2024 by
Butanium
Setup for fine-tuned Mistral model?
question
Further information is requested
#592
opened May 14, 2024 by
SiddhantOjha17
[Bug Report] TransformerLens's use of einsum leads to different training dynamics on TPUs
wontfix
This will not be worked on
#591
opened May 13, 2024 by
jqhoogland
[Proposal] Setup unit tests to cover model configurations
good first issue
Good for newcomers
testing
A task that needs to be completed in order to improve the current test coverage.
#588
opened May 11, 2024 by
bryce13950
1 task done
[Bug Report] Unable to Llama 3 70b on multigpu in 4bit
bug
Something isn't working
complexity-high
Very complicated changes for people to address who are quite familiar with the code
#569
opened May 3, 2024 by
winglian
[Bug Report] Grokking demo currently broken in Colab
bug
Something isn't working
demo
Creating a demo or tutorial
#543
opened Apr 16, 2024 by
bryce13950
1 task done
[Bug Report] Test coverage missing on add_hook in hook_points
bug
Something isn't working
testing
A task that needs to be completed in order to improve the current test coverage.
#540
opened Apr 15, 2024 by
bryce13950
[Bug Report] Residual Stack Not Adding Up
demo
Creating a demo or tutorial
documentation
Improvements or additions to documentation
#523
opened Mar 22, 2024 by
EitanGronich