4,463 changes: 2,235 additions & 2,228 deletions content/.metadata.json

Large diffs are not rendered by default.

4 changes: 4 additions & 0 deletions content/CHANGELOG.md
@@ -1,5 +1,9 @@
# Changelog

## 2.1.156

- Fixed an issue when using Opus 4.8 where thinking blocks were modified, leading to API errors.

## 2.1.154

- Opus 4.8 is here! Now defaults to high effort · /effort xhigh for your hardest tasks
2 changes: 1 addition & 1 deletion content/blog/research/impact-software-development.md
@@ -7,7 +7,7 @@ Jobs that involve computer programming are a small sector of the modern economy,

In our [previous Economic Index research](https://www.anthropic.com/news/the-anthropic-economic-index), we found very disproportionate use of Claude by US workers in computer-related occupations: that is, there were many more conversations with Claude about computer-related tasks than one would predict from the number of people working in relevant jobs. It’s the same in [the educational context](https://www.anthropic.com/news/anthropic-education-report-how-university-students-use-claude): Computer Science degrees—which involve large amounts of coding—show highly disproportionate AI use.

To understand these changes in more detail, we conducted an analysis of 500,000 coding-related interactions across [Claude.ai](http://claude.ai/redirect/website.v1.fead9975-43b7-403a-972e-eb131ee4485a) (the “default” way that most people interact with Claude) and [Claude Code](https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/overview) (our new specialist coding “agent” that can independently accomplish chains of complex tasks using a variety of digital tools).
To understand these changes in more detail, we conducted an analysis of 500,000 coding-related interactions across [Claude.ai](http://claude.ai/redirect/website.v1.33a1fedf-5747-4b68-9502-0e04849443fe) (the “default” way that most people interact with Claude) and [Claude Code](https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/overview) (our new specialist coding “agent” that can independently accomplish chains of complex tasks using a variety of digital tools).

We found three key patterns:

@@ -141,7 +141,7 @@ Our views on the AI competition between the US and China.

### Terms and policies

Privacy choices* [Privacy policy](https://www.anthropic.com/legal/privacy)
* [Privacy policy](https://www.anthropic.com/legal/privacy)
* [Consumer health data privacy policy](https://www.anthropic.com/legal/consumer-health-data-privacy-policy)
* [Responsible disclosure policy](https://www.anthropic.com/responsible-disclosure-policy)
* [Terms of service: Commercial](https://www.anthropic.com/legal/commercial-terms)
4 changes: 2 additions & 2 deletions content/blog/research/interpretability-dreams.md
@@ -3,8 +3,6 @@ Title: Interpretability Dreams
URL Source: https://www.anthropic.com/research/interpretability-dreams

Markdown Content:
# Interpretability Dreams \ Anthropic

[Skip to main content](https://www.anthropic.com/research/interpretability-dreams#main-content)[Skip to footer](https://www.anthropic.com/research/interpretability-dreams#footer)

[](https://www.anthropic.com/)
@@ -160,3 +158,5 @@ Privacy choices* [Privacy policy](https://www.anthropic.com/legal/privacy)
* [](https://www.linkedin.com/company/anthropicresearch)
* [](https://x.com/AnthropicAI)
* [](https://www.youtube.com/@anthropic-ai)

# Interpretability Dreams \ Anthropic
2 changes: 1 addition & 1 deletion content/blog/research/mapping-mind-language-model.md
@@ -21,7 +21,7 @@ There was both an engineering challenge (the raw sizes of the models involved re

As for the scientific risk, the proof is in the pudding.

We successfully extracted millions of features from the middle layer of Claude 3.0 Sonnet, (a member of our current, state-of-the-art model family, currently available on [claude.ai](https://claude.ai/redirect/website.v1.b1ef2abc-07bf-4def-a05d-35e4a35fc8c1)), providing a rough conceptual map of its internal states halfway through its computation. This is the first ever detailed look inside a modern, production-grade large language model.
We successfully extracted millions of features from the middle layer of Claude 3.0 Sonnet, (a member of our current, state-of-the-art model family, currently available on [claude.ai](https://claude.ai/redirect/website.v1.6b08c2cb-d676-49ea-a3d5-4d257df65e79)), providing a rough conceptual map of its internal states halfway through its computation. This is the first ever detailed look inside a modern, production-grade large language model.

Whereas the features we found in the toy language model were rather superficial, the features we found in Sonnet have a depth, breadth, and abstraction reflecting Sonnet's advanced capabilities.

2 changes: 1 addition & 1 deletion content/blog/research/natural-language-autoencoders.md
@@ -53,7 +53,7 @@ For example, in the blackmail test discussed above, we used NLAs to understand w

Claude does not say it thinks it’s being tested in this transcript from our blackmail evaluation, but the NLA explanation reveals its suspicions.

Extending this analysis to other evaluations, NLAs suggest that Claude suspects it’s being tested more often than it lets on. For instance, in a test of whether Claude takes destructive actions while writing code—for example, deleting important files—NLA explanations show signs of evaluation awareness 16% of the time, even though Claude never explicitly verbalizes this. Similarly, on the popular coding benchmark SWE-bench Verified, which we use to determine how capable Claude is, NLAs indicate that Claude is aware it is being evaluated on 26% of problems. In contrast, NLA explanations indicate evaluation awareness on less than 1% of real [claude.ai](http://claude.ai/redirect/website.v1.6518b370-b0b5-4a69-8954-bfa93edc4441) usage that opted in for training.
Extending this analysis to other evaluations, NLAs suggest that Claude suspects it’s being tested more often than it lets on. For instance, in a test of whether Claude takes destructive actions while writing code—for example, deleting important files—NLA explanations show signs of evaluation awareness 16% of the time, even though Claude never explicitly verbalizes this. Similarly, on the popular coding benchmark SWE-bench Verified, which we use to determine how capable Claude is, NLAs indicate that Claude is aware it is being evaluated on 26% of problems. In contrast, NLA explanations indicate evaluation awareness on less than 1% of real [claude.ai](http://claude.ai/redirect/website.v1.bc7e7e7d-f778-4a22-9852-5b3feb477007) usage that opted in for training.

![Image 4](https://www.anthropic.com/_next/image?url=https%3A%2F%2Fwww-cdn.anthropic.com%2Fimages%2F4zrzovbb%2Fwebsite%2Ffc08005f32ff020f21e598cc72f58768af2bcc4f-1590x1170.png&w=3840&q=75)

36 changes: 18 additions & 18 deletions content/claude-code-manifest.json
@@ -1,12 +1,12 @@
{
"name": "@anthropic-ai/claude-code",
"version": "2.1.154",
"version": "2.1.156",
"author": {
"name": "Anthropic",
"email": "support@anthropic.com"
},
"license": "SEE LICENSE IN README.md",
"_id": "@anthropic-ai/claude-code@2.1.154",
"_id": "@anthropic-ai/claude-code@2.1.156",
"maintainers": [
{
"name": "zak-anthropic",
@@ -73,20 +73,20 @@
"claude": "bin/claude.exe"
},
"dist": {
"shasum": "46815e4dfae0da4cd7bb49f650f7c3f798e96f39",
"tarball": "https://registry.npmjs.org/@anthropic-ai/claude-code/-/claude-code-2.1.154.tgz",
"shasum": "32f21eb881e84f421195873842c68c367093d43e",
"tarball": "https://registry.npmjs.org/@anthropic-ai/claude-code/-/claude-code-2.1.156.tgz",
"fileCount": 7,
"integrity": "sha512-1LSc7Jtm7Jy/GgqzoC4jDtTeV4GphGdB2+r49QDJnDa4fDsJfxVHR4TQNETeyNtGw/uD1+voo7f2Vhjldyqe8w==",
"integrity": "sha512-DRIqsawy+n+LtNBaxOW+3JYLaehbCdEdc+mZjYv/zRnZ1bHeTetJBcV41TagNjL00hHjrlALdl76wmA4s/PVQQ==",
"signatures": [
{
"sig": "MEUCIFkL6YSLE7iKTkKBeNo3L67P4ZetJz58XixxXhK20rjcAiEApdebuVHImUSYJDmRGP7ZKKGZgdrxE+Qfwgr6wVVKFKg=",
"sig": "MEUCIAz7iPG3QRZaIOiHKoXT1QSJwRGEkNW2fLQJeEHoeTU8AiEAgtiITFRAZCekOQ17GEVlOht7C4javPiJRal5Ii4AJDI=",
"keyid": "SHA256:DhQ8wR5APBvFHLF/+Tc+AYvPOdTpcIDqOhxsBHRwC7U"
}
],
"unpackedSize": 145690
},
"type": "module",
"_from": "file:staged-npm/anthropic-ai-claude-code-2.1.154.tgz",
"_from": "file:staged-npm/anthropic-ai-claude-code-2.1.156.tgz",
"engines": {
"node": ">=18.0.0"
},
@@ -98,8 +98,8 @@
"name": "wolffiex",
"email": "wolffiex@anthropic.com"
},
"_resolved": "/home/runner/work/claude-cli-internal/claude-cli-internal/staged-npm/anthropic-ai-claude-code-2.1.154.tgz",
"_integrity": "sha512-1LSc7Jtm7Jy/GgqzoC4jDtTeV4GphGdB2+r49QDJnDa4fDsJfxVHR4TQNETeyNtGw/uD1+voo7f2Vhjldyqe8w==",
"_resolved": "/home/runner/work/claude-cli-internal/claude-cli-internal/staged-npm/anthropic-ai-claude-code-2.1.156.tgz",
"_integrity": "sha512-DRIqsawy+n+LtNBaxOW+3JYLaehbCdEdc+mZjYv/zRnZ1bHeTetJBcV41TagNjL00hHjrlALdl76wmA4s/PVQQ==",
"_npmVersion": "11.13.0",
"description": "Use Claude, Anthropic's AI assistant, right from your terminal. Claude can understand your codebase, edit files, run terminal commands, and handle entire workflows for you.",
"directories": {},
@@ -108,17 +108,17 @@
"_hasShrinkwrap": false,
"readmeFilename": "README.md",
"optionalDependencies": {
"@anthropic-ai/claude-code-linux-x64": "2.1.154",
"@anthropic-ai/claude-code-win32-x64": "2.1.154",
"@anthropic-ai/claude-code-darwin-x64": "2.1.154",
"@anthropic-ai/claude-code-linux-arm64": "2.1.154",
"@anthropic-ai/claude-code-win32-arm64": "2.1.154",
"@anthropic-ai/claude-code-darwin-arm64": "2.1.154",
"@anthropic-ai/claude-code-linux-x64-musl": "2.1.154",
"@anthropic-ai/claude-code-linux-arm64-musl": "2.1.154"
"@anthropic-ai/claude-code-linux-x64": "2.1.156",
"@anthropic-ai/claude-code-win32-x64": "2.1.156",
"@anthropic-ai/claude-code-darwin-x64": "2.1.156",
"@anthropic-ai/claude-code-linux-arm64": "2.1.156",
"@anthropic-ai/claude-code-win32-arm64": "2.1.156",
"@anthropic-ai/claude-code-darwin-arm64": "2.1.156",
"@anthropic-ai/claude-code-linux-x64-musl": "2.1.156",
"@anthropic-ai/claude-code-linux-arm64-musl": "2.1.156"
},
"_npmOperationalInternal": {
"tmp": "tmp/claude-code_2.1.154_1779982850242_0.32321079600865255",
"tmp": "tmp/claude-code_2.1.156_1779999172460_0.1439017309709112",
"host": "s3://npm-registry-packages-npm-production"
}
}
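
Reviewer note: the `dist.integrity` and `dist.shasum` values in the manifest above can be checked against a downloaded tarball. The following is a minimal sketch, assuming the registry's usual encoding (`integrity` as an SRI string, i.e. `sha512-` followed by the base64 of the raw SHA-512 digest, and `shasum` as a hex SHA-1). The function names are hypothetical and not part of any npm tooling.

```python
import base64
import hashlib


def sri_sha512(data: bytes) -> str:
    """Return an SRI-style string: 'sha512-' plus base64 of the raw SHA-512 digest."""
    digest = hashlib.sha512(data).digest()
    return "sha512-" + base64.b64encode(digest).decode("ascii")


def sha1_hex(data: bytes) -> str:
    """Return the hex SHA-1 digest, the format assumed for the `shasum` field."""
    return hashlib.sha1(data).hexdigest()


def matches_dist(data: bytes, integrity: str, shasum: str) -> bool:
    """True only if both digests of `data` equal the manifest values."""
    return sri_sha512(data) == integrity and sha1_hex(data) == shasum
```

To use it, read the `.tgz` file in binary mode and pass its bytes together with the two strings from the `dist` object. A mismatch on either digest suggests the tarball is not the one the manifest describes.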
2 changes: 1 addition & 1 deletion content/en/agents-and-tools/tool-use/overview.md
@@ -121,7 +121,7 @@ When you use `tools`, the API also automatically includes a special system promp

| Model | Tool choice | Tool use system prompt token count |
|--------------------------|------------------------------------------------------|---------------------------------------------|
| Claude Opus 4.8 | `auto`, `none`<hr />`any`, `tool` | 290 tokens<hr />410 tokens |
| <NextOpus /> | `auto`, `none`<hr />`any`, `tool` | 290 tokens<hr />410 tokens |
| Claude Opus 4.7 | `auto`, `none`<hr />`any`, `tool` | 675 tokens<hr />804 tokens |
| Claude Opus 4.6 | `auto`, `none`<hr />`any`, `tool` | 497 tokens<hr />589 tokens |
| Claude Opus 4.5 | `auto`, `none`<hr />`any`, `tool` | 496 tokens<hr />588 tokens |