This project aims to improve the adaptability of Large Language Models (LLMs) by examining and optimizing how factual associations are stored within autoregressive transformer models. The emphasis is on pinpointing and editing the locations where those associations are stored, so that models retain current, relevant information without requiring extensive retraining.
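A minimal causal-tracing sketch of how such locations can be pinpointed, in the spirit of Meng et al.'s ROME: corrupt the subject tokens' embeddings with noise, restore one layer's clean hidden state at one position, and measure how much of the correct answer's probability returns. This is an illustrative sketch only; it assumes Hugging Face `transformers` with GPT-2 (module names `transformer.wte`, `transformer.h`), and the prompt, noise scale, and subject token positions are placeholders, not this repository's actual configuration.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

prompt = "The Eiffel Tower is located in the city of"
answer_id = tokenizer(" Paris")["input_ids"][0]
inputs = tokenizer(prompt, return_tensors="pt")
subject_positions = [0, 1, 2]  # token positions of the subject (illustrative)

# Clean run: cache every layer's hidden states for later restoration.
with torch.no_grad():
    clean = model(**inputs, output_hidden_states=True)
clean_hidden = clean.hidden_states  # index 0 = embeddings, i+1 = output of block i

def prob_with_restore(restore_layer, restore_pos):
    """Corrupt subject embeddings, restore one clean hidden state, return P(answer)."""
    def corrupt_embeddings(module, inp, out):
        out = out.clone()
        for p in subject_positions:
            out[:, p, :] += 3.0 * torch.randn_like(out[:, p, :])  # noise scale is illustrative
        return out

    def restore_hidden(module, inp, out):
        hidden = out[0].clone()
        hidden[:, restore_pos, :] = clean_hidden[restore_layer + 1][:, restore_pos, :]
        return (hidden,) + out[1:]

    handles = [
        model.transformer.wte.register_forward_hook(corrupt_embeddings),
        model.transformer.h[restore_layer].register_forward_hook(restore_hidden),
    ]
    try:
        with torch.no_grad():
            logits = model(**inputs).logits[0, -1]
    finally:
        for h in handles:
            h.remove()
    return torch.softmax(logits, dim=-1)[answer_id].item()

# Sweep layers at the last subject token: layers whose restoration recovers the
# answer probability are candidate storage sites for the factual association.
for layer in range(model.config.n_layer):
    print(f"layer {layer:2d}: P(answer) = {prob_with_restore(layer, restore_pos=2):.3f}")
```

Layers where restoring the clean state recovers most of the answer probability are the candidate edit sites; an editing method would then update the MLP weights at those layers rather than retraining the whole model.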
Mechanistic analysis of a GPT-2–like model exploring the compositionality gap in transformers. Using Logit Lens and Causal Tracing, the study identifies a deep-layer bottleneck and mitigates it through dataset enhancement to improve logical reasoning.
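For reference, a minimal Logit Lens sketch: each intermediate hidden state is projected through the final layer norm and the unembedding matrix to reveal what token the model "predicts" at that depth, which is how per-layer bottlenecks can be surfaced. This assumes Hugging Face `transformers` GPT-2 module names (`transformer.ln_f`, `lm_head`); the prompt is a placeholder, and the repository's own analysis may differ in detail.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

prompt = "The Eiffel Tower is located in the city of"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# Decode the top token implied by each layer's hidden state at the final position.
for layer_idx, hidden in enumerate(outputs.hidden_states):
    h = model.transformer.ln_f(hidden[:, -1, :])   # final layer norm
    logits = model.lm_head(h)                      # unembedding projection
    top_token = tokenizer.decode(logits.argmax(dim=-1))
    print(f"layer {layer_idx:2d}: {top_token!r}")
```

Tracking where along the depth the correct continuation first dominates (or fails to) is one way to locate the deep-layer bottleneck described above.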