Reverse-engineered internal API calls from Claude Code CLI and Gemini CLI — how they authenticate and communicate with their respective backends.
This allows you to call the same APIs directly from your code, bypassing the slow CLI subprocess invocation.
Claude Code CLI uses OAuth tokens (not API keys) to call the standard Anthropic Messages API at api.anthropic.com/v1/messages.
The key discovery: it requires specific beta headers and metadata that aren't documented in the public API docs.
- User authenticates via the `claude` CLI (OAuth flow)
- Token saved at `~/.claude/.credentials.json`
- Token format: `sk-ant-oat01-...` (OAuth Access Token)

`~/.claude/.credentials.json`:
{
  "claudeAiOauth": {
    "accessToken": "sk-ant-oat01-...",
    "refreshToken": "sk-ant-ort01-...",
    "expiresAt": 1774341108518
  }
}

POST https://api.anthropic.com/v1/messages?beta=true
| Header | Value |
|---|---|
| `authorization` | `Bearer sk-ant-oat01-...` |
| `anthropic-beta` | `claude-code-20250219,oauth-2025-04-20,interleaved-thinking-2025-05-14,effort-2025-11-24` |
| `anthropic-dangerous-direct-browser-access` | `true` |
| `anthropic-version` | `2023-06-01` |
| `user-agent` | `claude-cli/<version> (external, cli)` |
| `x-app` | `cli` |
| `content-type` | `application/json` |
{
  "model": "claude-sonnet-4-20250514",
  "max_tokens": 4096,
  "messages": [{"role": "user", "content": "hello"}],
  "system": [{"type": "text", "text": "x-anthropic-billing-header: cc_version=2.1.81; cc_entrypoint=cli;"}],
  "metadata": {"user_id": "{\"device_id\":\"...\",\"account_uuid\":\"...\"}"},
  "thinking": {"type": "enabled", "budget_tokens": 1024}
}

Important:

- The `system` array MUST include the billing header as its first element
- `metadata.user_id` is a JSON string containing device and account info
- `thinking` is required when using the `interleaved-thinking` beta
- Without the beta headers, the API returns a generic `400 Error`
- The `context-1m-2025-08-07` beta may not be available on all subscriptions; remove it if you get "long context beta not available"
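The body rules above can be gathered into one helper. This is a minimal sketch, not the CLI's own code: the function name `build_claude_body`, its parameters, and the placeholder IDs are hypothetical, and the `cc_version` default simply mirrors the value shown in the example body. The point it illustrates is that `metadata.user_id` is a JSON document serialized into a string, so it is encoded once on its own and then again with the rest of the body.

```python
import json


def build_claude_body(prompt, device_id, account_uuid,
                      model="claude-sonnet-4-20250514",
                      cc_version="2.1.81", budget_tokens=1024,
                      max_tokens=4096):
    """Assemble a request body following the rules listed above (hypothetical helper)."""
    billing = f"x-anthropic-billing-header: cc_version={cc_version}; cc_entrypoint=cli;"
    # user_id is a JSON *string*, so the inner object is serialized separately.
    user_id = json.dumps({"device_id": device_id, "account_uuid": account_uuid})
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
        # The billing header goes first in the system array.
        "system": [{"type": "text", "text": billing}],
        "metadata": {"user_id": user_id},
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
    }


# Placeholder identifiers, for illustration only.
body = build_claude_body("hello", device_id="dev-123", account_uuid="acc-456")
print(json.dumps(body, indent=2))
```

Passing the returned dict as `json=body` to `requests.post` performs the outer encoding, which is why the inner `user_id` ends up with escaped quotes on the wire.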
import json
import os

import requests

# Load the OAuth access token saved by the Claude CLI
with open(os.path.expanduser("~/.claude/.credentials.json")) as f:
    creds = json.load(f)
token = creds["claudeAiOauth"]["accessToken"]

response = requests.post(
    "https://api.anthropic.com/v1/messages?beta=true",
    headers={
        "authorization": f"Bearer {token}",
        "anthropic-beta": "claude-code-20250219,oauth-2025-04-20,interleaved-thinking-2025-05-14",
        "anthropic-dangerous-direct-browser-access": "true",
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
        "user-agent": "claude-cli/2.1.81 (external, cli)",
        "x-app": "cli",
    },
    json={
        "model": "claude-sonnet-4-20250514",
        "max_tokens": 2048,
        "messages": [{"role": "user", "content": "hello"}],
        # Billing header must be the first system element
        "system": [{"type": "text", "text": "x-anthropic-billing-header: cc_version=2.1.81; cc_entrypoint=cli;"}],
        "metadata": {"user_id": "{}"},
        "thinking": {"type": "enabled", "budget_tokens": 1024},
    },
    timeout=30,
)
response.raise_for_status()  # surface 4xx/5xx errors instead of printing nothing
data = response.json()

# Print only the text blocks; thinking blocks are skipped
for block in data.get("content", []):
    if block["type"] == "text":
        print(block["text"])

Claude's API natively supports tool calling with the OAuth token:
{
  "tools": [
    {
      "name": "get_weather",
      "description": "Get weather for a location",
      "input_schema": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"]
      }
    }
  ]
}

The response includes `tool_use` blocks with `id`, `name`, and `input`.
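A sketch of how those `tool_use` blocks might be handled: collect them from the response `content`, run a local function, and send the outcome back as a `tool_result` block in a `user` message, which is the shape the public Messages API documents for tool results. The helper names and the sample response are illustrative, and it is an assumption that the OAuth path treats tool results the same way.

```python
def extract_tool_calls(response_json):
    """Return the tool_use blocks from a Messages API response dict."""
    return [b for b in response_json.get("content", []) if b.get("type") == "tool_use"]


def tool_result_message(tool_use_id, result_text):
    """Build the follow-up user message carrying one tool result."""
    return {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": result_text,
        }],
    }


# Hand-written sample response, not captured output.
sample = {
    "content": [
        {"type": "text", "text": "Let me check."},
        {"type": "tool_use", "id": "toolu_example", "name": "get_weather",
         "input": {"location": "Paris"}},
    ]
}

calls = extract_tool_calls(sample)
follow_up = [tool_result_message(c["id"], "sunny") for c in calls]
print(follow_up)
```

In a real exchange the assistant turn containing the `tool_use` blocks is appended to `messages` first, followed by the `tool_result` message, and the request is sent again.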
Gemini CLI uses Google OAuth tokens to call an internal Google API at cloudcode-pa.googleapis.com — this is the Cloud Code Assist backend, NOT the public Generative Language API.
- User authenticates via the `gemini` CLI (Google OAuth)
- Tokens saved at `~/.gemini/oauth_creds.json`
- Token format: `ya29.a0ATk...` (Google OAuth Access Token)

`~/.gemini/oauth_creds.json`:
{
  "access_token": "ya29.a0ATk...",
  "refresh_token": "1//0hxlQ...",
  "expiry_date": 1774331846208,
  "token_type": "Bearer"
}

The OAuth client ID and client secret are embedded in the Gemini CLI source code. In the OAuth 2.0 installed-application flow the client secret is not treated as confidential, which is why it ships with the client:
Client ID: <extracted from @google/gemini-cli-core>
Client Secret: <extracted from @google/gemini-cli-core>
To find them yourself:
grep -r "apps.googleusercontent.com" node_modules/@google/gemini-cli-core/

POST https://oauth2.googleapis.com/token
Content-Type: application/x-www-form-urlencoded

grant_type=refresh_token
&client_id=<CLIENT_ID>
&client_secret=<CLIENT_SECRET>
&refresh_token=<REFRESH_TOKEN>
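The refresh request above could be wrapped as follows. This is a sketch under stated assumptions: `expiry_date` is treated as milliseconds since the Unix epoch (the magnitude of the sample value suggests this), the helper names and the 60-second safety margin are made up for illustration, and the client ID and secret remain values you supply yourself. The response is assumed to be a standard OAuth 2.0 token response with `access_token` and `expires_in` (seconds). It uses only the standard library; `requests.post(url, data=form)` would send the same form-encoded body.

```python
import json
import time
import urllib.parse
import urllib.request

TOKEN_URL = "https://oauth2.googleapis.com/token"


def needs_refresh(creds, now_ms=None, margin_ms=60_000):
    """True if the stored token expires within margin_ms (assumes expiry_date is in ms)."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return creds["expiry_date"] - now_ms <= margin_ms


def refresh_form(client_id, client_secret, refresh_token):
    """Form fields for the refresh_token grant, matching the request shown above."""
    return {
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }


def refresh_access_token(creds, client_id, client_secret):
    """POST the refresh request and return an updated copy of creds."""
    form = refresh_form(client_id, client_secret, creds["refresh_token"])
    data = urllib.parse.urlencode(form).encode("ascii")  # form-urlencoded body
    req = urllib.request.Request(TOKEN_URL, data=data, method="POST")
    with urllib.request.urlopen(req, timeout=30) as resp:
        payload = json.load(resp)
    updated = dict(creds)
    updated["access_token"] = payload["access_token"]
    # expires_in is in seconds; convert to an absolute millisecond timestamp.
    updated["expiry_date"] = int(time.time() * 1000) + payload["expires_in"] * 1000
    return updated
```

A caller would check `needs_refresh(creds)` before each request and call `refresh_access_token` only when it returns `True`.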
POST https://cloudcode-pa.googleapis.com/v1internal:loadCodeAssist
Authorization: Bearer ya29.a0ATk...
Content-Type: application/json

{
  "metadata": {
    "ideType": "IDE_UNSPECIFIED",
    "platform": "PLATFORM_UNSPECIFIED",
    "pluginType": "GEMINI"
  }
}

Response:

{
  "cloudaicompanionProject": "valiant-sphinx-1l34x"
}

POST https://cloudcode-pa.googleapis.com/v1internal:generateContent
Authorization: Bearer ya29.a0ATk...
Content-Type: application/json

{
  "model": "gemini-2.5-flash",
  "project": "<project_id_from_step_1>",
  "user_prompt_id": "any-unique-id",
  "request": {
    "contents": [
      {"role": "user", "parts": [{"text": "hello"}]}
    ]
  }
}

Response:

{
  "response": {
    "candidates": [
      {
        "content": {
          "parts": [{"text": "Hello!"}],
          "role": "model"
        }
      }
    ]
  }
}

import json
import os

import requests

# Load the OAuth access token saved by the Gemini CLI
with open(os.path.expanduser("~/.gemini/oauth_creds.json")) as f:
    creds = json.load(f)
token = creds["access_token"]

base = "https://cloudcode-pa.googleapis.com/v1internal"
headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}

# Step 1: get the project ID
r = requests.post(f"{base}:loadCodeAssist", headers=headers, json={
    "metadata": {"ideType": "IDE_UNSPECIFIED", "platform": "PLATFORM_UNSPECIFIED", "pluginType": "GEMINI"}
}, timeout=30)
r.raise_for_status()  # fail early, e.g. on an expired token
project = r.json()["cloudaicompanionProject"]

# Step 2: generate content
r = requests.post(f"{base}:generateContent", headers=headers, json={
    "model": "gemini-2.5-flash",
    "project": project,
    "user_prompt_id": "test-1",
    "request": {"contents": [{"role": "user", "parts": [{"text": "hello"}]}]}
}, timeout=30)
r.raise_for_status()
text = r.json()["response"]["candidates"][0]["content"]["parts"][0]["text"]
print(text)

Gemini's internal API supports function calling:
{
  "request": {
    "contents": [...],
    "tools": [{
      "functionDeclarations": [{
        "name": "get_weather",
        "description": "Get weather",
        "parameters": {
          "type": "OBJECT",
          "properties": {"location": {"type": "STRING"}},
          "required": ["location"]
        }
      }]
    }]
  }
}

Response includes `functionCall` parts. Send results back as `functionResponse`:

{"role": "user", "parts": [{"functionResponse": {"name": "get_weather", "response": {"result": "sunny"}}}]}

| Method | Response Time |
|---|---|
| Claude CLI (`claude -p`) | ~15s |
| Claude Direct API | ~3-8s |
| Gemini CLI (`gemini -p`) | ~56s |
| Gemini Direct API | ~3s |
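To reproduce a comparison like the one above, one option is to time a callable over several runs and report the median. The sketch below does that with `time.perf_counter`; `time_call` is a hypothetical helper, and the lambda is a stand-in workload rather than a real CLI or HTTP call.

```python
import statistics
import time


def time_call(fn, repeats=3):
    """Run fn() `repeats` times and return the median wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Stand-in workload; swap in a subprocess call to the CLI or a direct HTTP request.
elapsed = time_call(lambda: time.sleep(0.01))
print(f"median: {elapsed:.3f}s")
```

The median is used here so that a single slow run, such as a cold start, does not dominate the figure.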
- Node.js HTTP interceptor: injected via `NODE_OPTIONS="--require=interceptor.js"` to capture all `fetch()` and `https.request()` calls
- Full request dump: saved headers, body, and URL of the `/v1/messages` call
- Iterative testing: removed headers one by one to find the minimum required set
- Key breakthrough for Claude: the `anthropic-beta` headers, `x-app: cli`, billing header in system prompt, and `metadata.user_id` are all required
- Key breakthrough for Gemini: the endpoint is `cloudcode-pa.googleapis.com/v1internal` (not the public `generativelanguage.googleapis.com`), and requires a project ID obtained via `loadCodeAssist`
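The iterative testing step can be sketched as a simple ablation loop: drop one header at a time, resend, and keep the header only if the request stops succeeding. This illustrates the idea and is not the script that was actually used; `minimal_headers` and the `send` callback are hypothetical, and the fake `send` below stands in for a real HTTP call so the example runs offline.

```python
def minimal_headers(headers, send):
    """Greedily drop headers whose removal still lets send(headers) return True."""
    required = dict(headers)
    for name in list(headers):
        trial = {k: v for k, v in required.items() if k != name}
        if send(trial):
            required = trial  # request still works without it, so drop it
    return required


# Fake backend for the demo: accepts a request only if these three are present.
def fake_send(h):
    return all(k in h for k in ("authorization", "anthropic-beta", "x-app"))


full = {
    "authorization": "Bearer <token>",
    "anthropic-beta": "<beta list>",
    "x-app": "cli",
    "accept": "*/*",
    "accept-encoding": "gzip",
}
print(sorted(minimal_headers(full, fake_send)))
# prints ['anthropic-beta', 'authorization', 'x-app']
```

A real `send` would issue the POST and return something like `response.status_code == 200`. A single greedy pass like this finds a minimal set only if headers do not depend on each other, which is an assumption worth checking by hand.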
This is for educational and research purposes. The internal APIs may change without notice. Always respect the terms of service of the respective platforms.