context embeddings guide #1

Open
shortysal92 wants to merge 239 commits into add-colab-button from claude/add-claude-documentation-P4sMf
@shortysal92
Owner

No description provided.

PedramNavid and others added 30 commits October 21, 2025 16:13
Fix formatting of exclude_path entry for .pyc files
* feat: add cookbook-audit skill for automated notebook validation

Refactor notebook-review command to delegate validation to a new cookbook-audit skill.

Add comprehensive automated validation script (validate_notebook.py) that:
- Checks for hardcoded secrets and API keys
- Validates notebook structure and introductions
- Detects code quality issues (variable names, verbosity)
- Identifies deprecated API patterns and invalid models
- Converts notebooks to markdown for easier review

Add detailed audit rubric (SKILL.md) with:
- Structured audit workflow and report format
- Scoring framework across 4 dimensions (20 points total)
- Concrete examples of high and low-scoring audits
- Comprehensive checklist and content philosophy
- Style and structural requirements for cookbook notebooks

The validate_notebook.py script runs automated checks and generates
a markdown version of notebooks (saved to gitignored tmp/ folder) for
more efficient context usage during manual review.

* feat(security): add detect-secrets configuration and Anthropic credentials detector

Add baseline configuration for the detect-secrets library with a custom plugin
to detect Anthropic API keys and credentials in notebooks. Includes comprehensive
set of built-in detectors and heuristic filters to prevent secrets from being
committed to the repository.

feat(cookbook-audit): integrate detect-secrets for hardcoded credential detection

Enhanced the notebook validation to use detect-secrets for identifying
hardcoded API keys and credentials. The implementation:
- Runs detect-secrets-hook on notebooks with baseline configuration
- Automatically locates baseline at `scripts/detect-secrets/.secrets.baseline`
- Falls back to basic pattern matching if detect-secrets unavailable
- Provides detailed output for manual review of potential secrets

Updated documentation to reflect the automated secret scanning capability.
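
The fallback path mentioned above (basic pattern matching when detect-secrets is unavailable) might look roughly like the following sketch. It is not the code from `validate_notebook.py`: the function name `find_possible_keys` and the `sk-ant-` pattern are assumptions made for illustration, and a real check would likely cover more credential formats.

```python
import json
import re

# Assumption: keys look like "sk-ant-" followed by a long token. This pattern
# is illustrative and is not taken from validate_notebook.py.
KEY_PATTERN = re.compile(r"sk-ant-[A-Za-z0-9_\-]{20,}")


def find_possible_keys(notebook_json: str) -> list[tuple[int, str]]:
    """Return (cell_index, redacted_match) pairs for suspicious strings."""
    notebook = json.loads(notebook_json)
    findings = []
    for index, cell in enumerate(notebook.get("cells", [])):
        source = cell.get("source", "")
        # Notebook JSON stores source as either a string or a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        for match in KEY_PATTERN.finditer(text):
            # Report only a short prefix so the finding itself leaks nothing.
            findings.append((index, match.group()[:10] + "..."))
    return findings
```

Reporting a redacted prefix rather than the full match keeps the audit output safe to paste into a PR comment.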

* chore(workflows): remove unnecessary id-token permission

Remove id-token: write permission from Claude Code workflow files
as it is not needed for these operations. The workflows only require:
- contents: read (to checkout repository code)
- pull-requests: write (to comment on pull requests)

The id-token: write permission is used for OIDC authentication with
cloud providers (AWS, GCP, Azure) which these workflows do not use.

This follows the principle of least privilege and reduces the
security attack surface.

Affected workflows:
- claude-notebook-review.yml
- claude-link-review.yml

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* use relative paths and run ruff on notebook script

---------

Co-authored-by: Claude <noreply@anthropic.com>
Enhanced the classification cookbook with clearer explanations and better
pedagogical structure throughout:

- Added context sections before key code blocks (data loading, evaluation
  framework, random baseline, simple classifier, RAG, CoT)
- Included analysis of confusion matrices after each classification approach,
  explaining what the results reveal and motivating the next technique
- Added progressive accuracy tracking (10% → 70% → 94% → 97%) to show
  empirical improvement at each step
- Improved evaluation section with clearer motivation for Promptfoo and
  production-scale evaluation needs
- Added comprehensive Promptfoo results analysis explaining temperature
  effects and production recommendations
- Fixed evaluate() function to use max_workers and as_completed() for proper
  rate limit handling without artificial delays

These changes make the guide more action-oriented while building transferable
understanding of why each technique (RAG, chain-of-thought) improves
classification accuracy.
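
As a rough illustration of the `evaluate()` change described above, the sketch below bounds concurrency with `max_workers` and collects results with `as_completed()`. The signature, the `(text, expected_label)` example format, and the `classify` callable are assumptions for this sketch and may differ from the notebook's actual function.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def evaluate(examples, classify, max_workers=4):
    """Classify examples concurrently and return the accuracy.

    `examples` is a list of (text, expected_label) pairs and `classify` is
    any callable mapping text to a label; both are hypothetical stand-ins.
    """
    if not examples:
        return 0.0
    correct = 0
    # max_workers caps the number of in-flight calls, so no sleep() is needed.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(classify, text): expected for text, expected in examples
        }
        # as_completed yields each future as soon as its call finishes.
        for future in as_completed(futures):
            if future.result() == futures[future]:
                correct += 1
    return correct / len(examples)
```

Lowering `max_workers` is the knob to turn if the API starts returning rate-limit errors.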

🤖 Generated with my best friend, [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
* Create validate_brand.py

Missing validate_brand.py

* fix: improve validate_brand.py compatibility with repository standards

- Update example brand guidelines to use correct Acme Corporation standards matching SKILL.md
- Add comprehensive error handling to load_guidelines_from_json function
- Add get_acme_corporation_guidelines() helper function for default guidelines
- Fix test content to use proper brand name capitalization
- Improve documentation with detailed docstrings

These changes ensure consistency with apply_brand.py and the SKILL.md reference documentation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
* chore: upgrade dependencies and add development tooling

Update core dependencies to latest versions:
- anthropic 0.39.0 → 0.71.0
- ipykernel 6.29.5 → 7.1.0
- notebook 7.2.2 → 7.4.7
- numpy 1.26.4 → 2.3.4
- pandas 2.2.3 → 2.3.3

Add Makefile with common development tasks (format, lint, test, check).
Add pylint configuration for deeper code analysis.
Enhance ruff linting configuration with per-file rules for notebooks.

* Format and Lint all files/notebooks with ruff
* add ruff format/lint checks in repo and format all notebooks

* add github workflow, update GH actions so they only operate on changed files and restrict Claude workflow jobs to internal contributors

Updates all three workflows to:
- Accept pr_number as manual input parameter
- Dynamically resolve PR number from event context
- Fetch correct PR ref and base branch for manual triggers
- Pass GH_TOKEN for gh CLI commands during manual dispatch
- Change sounddevice requirement from >=0.5.2 to >=0.5.1 to fix installation issues (version 0.5.2 doesn't exist)
- Update sentence-by-sentence streaming cell to use mp3_44100_128 format instead of pcm_44100 (free tier compatible)
- Add pip upgrade cell to notebook for better package management
- Clean up notebook cell execution outputs

Co-Authored-By: ashprabaker <ashprabaker@anthropic.com>
Added a detailed "How to Use This Cookbook" section that guides users through:
- Step 1: Environment setup with API keys and dependencies
- Step 2: Working through the notebook to learn concepts
- Step 3: Running the production script for hands-on experience

Also expanded the "More About ElevenLabs" section with additional resources including Voice Library, API Playground, and SDK links.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
The script was using pcm_44100 format which requires ElevenLabs Pro tier,
causing WebSocket connections to close with error 1008. Fixed by:

- Changed TTS_OUTPUT_FORMAT from pcm_44100 to mp3_44100_128 (free tier)
- Added pydub dependency for MP3 decoding
- Updated AudioQueue.add() to decode MP3 chunks before playback
- Enhanced WebSocket close handler to log error details
- Updated docstring to reflect MP3 format usage

The script now works with free tier ElevenLabs accounts.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Improve link checker workflow efficiency:
- Detect changed markdown/notebook files in PRs
- Only convert and check changed files instead of entire repo
- Keep full scan for scheduled/manual runs
- Add fetch-depth: 0 for proper diff comparison

This reduces CI time for PRs while maintaining comprehensive checks on schedule.
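
The selection logic described in this commit can be sketched as below. The real workflow does this in GitHub Actions steps, so the Python function `select_link_check_targets` and its return convention (`None` meaning "scan everything") are purely illustrative assumptions.

```python
from pathlib import PurePosixPath

# File types the link checker cares about, per the commit message.
LINK_CHECK_SUFFIXES = {".md", ".ipynb"}


def select_link_check_targets(changed_paths, event_name):
    """Pick files to check, or return None to signal a full-repository scan."""
    # Assumption: scheduled and manual runs arrive under these event names.
    if event_name in {"schedule", "workflow_dispatch"}:
        return None
    return sorted(
        path
        for path in changed_paths
        if PurePosixPath(path).suffix.lower() in LINK_CHECK_SUFFIXES
    )
```

Filtering by suffix alone, rather than by directory, means a changed Markdown file is picked up wherever it lives in the repository.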

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
- Add virtual environment setup instructions
- Document required ElevenLabs API key permissions
- Add troubleshooting section covering common issues
- Add project ideas to inspire users
- Suppress MP3 decoding errors with try-except pattern
- Document audio popping as expected free-tier behavior

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…stant

Add ElevenLabs Low Latency Voice Assistant Integration
…stant

Update ElevenLabs Voice Assistant: Improve documentation and error handling
Previously, the workflow only checked markdown files in the skills/
directory and the root README.md. This caused failures when PRs
modified markdown files in other directories (e.g., third_party/).

Now all changed markdown files are included in the link check,
regardless of their directory location.

Fixes CI failure in PR #263 where third_party/ElevenLabs/README.md
changes were not being checked.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
* docs: restructure research agent notebook with comprehensive explanations and examples

Reorganized the notebook to improve pedagogical flow and add detailed guidance:

- Added introduction explaining why research agents are ideal agentic use cases
- Included prerequisites section with required knowledge and setup instructions
- Expanded stateless query example with output visualization and detailed explanations
- Clarified when to use stateless vs stateful agent patterns
- Enhanced production-ready improvements section with three key enhancements
- Added stateful agent example using ClaudeSDKClient with conversation memory
- Improved conclusion with clear learning outcomes and next steps
- Updated kernel specification to match current environment

* Update 00_The_one_liner_research_agent.ipynb
* feat(skills): add comprehensive style guide for cookbook audits

* fix(ci): resolve lychee "No links were found" error

Fixed four issues in the link checking workflow:

1. **Fixed bash variable scoping bug**: The while-read loop was running in a
   subshell, causing FILES variable assignments to be lost. Changed to use
   process substitution (< <(...)) to keep the loop in the current shell.

2. **Added failIfEmpty: false**: Added this parameter to both lychee-action
   invocations to prevent failures when no links are found (legitimate for
   some PRs).

3. **Skip lychee when no files**: Added has_files output and conditional
   check to skip the lychee step entirely when no markdown files are found.

4. **Exclude .claude/ directory**: Added .claude/ to lychee.toml exclude_path
   since it contains tooling/config files that don't need link checking.

These changes ensure the link checker works correctly for PRs that only
modify files in excluded directories or files without external links.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
- bump sdk from v0.0.20 to v0.1.6
- modernize code with sdk type abstractions
- add session management and tool filtering
- expand notebook 01 with more features
- improve narrative flow and explanations
- add utils for report tracking
* feat: add GitHub Action to post notebook diffs in PR comments

- Automatically detects changed .ipynb files in PRs
- Generates nbdime diffs for each changed notebook
- Posts formatted comment with collapsible diff sections
- Handles multiple notebooks and new files gracefully
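
One way to assemble such a comment is sketched below. The helper `build_diff_comment`, the heading text, and the placeholder wording are hypothetical; the action's actual formatting is not shown in this commit message.

```python
FENCE = "`" * 3  # built at runtime so this example nests cleanly in Markdown


def build_diff_comment(diffs):
    """Render {notebook_path: diff_text} as one Markdown comment body."""
    if not diffs:
        return "No notebook changes detected."
    sections = ["## Notebook diffs"]
    for path, diff_text in sorted(diffs.items()):
        # A new notebook may produce an empty diff; show a placeholder instead.
        body = diff_text.strip() or "(new file or no content changes)"
        sections.append(
            f"<details>\n<summary>{path}</summary>\n\n"
            f"{FENCE}diff\n{body}\n{FENCE}\n\n</details>"
        )
    return "\n\n".join(sections)
```

Each notebook gets its own `<details>` element, so a PR touching many notebooks still produces a comment that is short when collapsed.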

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: use --system flag for uv pip install in workflow

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: use uvx to run nbdiff without installation

Using uvx --from nbdime nbdiff creates an ephemeral environment
and avoids issues with system Python being externally managed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* feat: add nbdiff flags to ignore non-content changes

Adds flags to focus diff on actual content:
- --ignore-attachments: Skip attachment changes
- --ignore-metadata: Skip metadata changes
- --ignore-identifiers: Skip cell ID changes
- --ignore-details: Skip execution counts, etc.

This makes the diff more readable and focused on code/markdown changes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: use correct nbdiff ignore flag syntax

Use short flags (-A -M -I -D) instead of long form flags that don't exist.

- -A: ignore attachments
- -M: ignore metadata
- -I: ignore identifiers (cell IDs)
- -D: ignore details (execution counts, etc.)
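
Combined with the `uvx --from nbdime nbdiff` invocation from the earlier commit, the resulting command line could be built as in this sketch. The helper name `nbdiff_command` and the argument order (base notebook first, then the changed one) are assumptions; only the flags and the `uvx` invocation come from the commit messages.

```python
def nbdiff_command(base_path, head_path):
    """Build an argv list that diffs two notebooks, ignoring non-content changes."""
    ignore_flags = [
        "-A",  # ignore attachments
        "-M",  # ignore metadata
        "-I",  # ignore identifiers (cell IDs)
        "-D",  # ignore details (execution counts, etc.)
    ]
    # uvx runs nbdiff from an ephemeral environment, as described above.
    return ["uvx", "--from", "nbdime", "nbdiff", *ignore_flags, base_path, head_path]
```

Returning a list rather than a single string lets the caller pass it to `subprocess.run` without shell quoting concerns.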

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Adds a top-level CLAUDE.md covering the full repository structure,
development workflows, code conventions, CI/CD pipelines, environment
setup, and subsection-specific notes to help AI assistants navigate
and contribute to the Anthropic Cookbook effectively.
Replace the verbose 406-line file with a concise 75-line guide that
covers commands, non-obvious architecture decisions (dummy package,
shared slash commands for local/CI, intentional notebook outputs,
separate claude_agent_sdk project), and key conventions. Removes
generic practices and redundant file listings.