Merged
- Created new 'atlas quickstart' CLI command with 3 security review tasks
- Implemented metrics visualization table showing learning progression
- Added offline mode support (ATLAS_OFFLINE_MODE) for smoke testing
- Implemented backward compatibility for ATLAS_FAKE_LLM with deprecation warnings (Issue #110)
- Added graceful storage fallback when Postgres unavailable
- Enhanced quickstart with cost estimation and better error handling
- Added comprehensive test coverage (20+ test cases)
- Updated documentation: README.md, docs/sdk/quickstart.mdx, pypi.md
- Deprecated examples/quickstart.py in favor of CLI command
- Updated Docker entrypoint to use new CLI command

Features:
- 3 progressive security review tasks demonstrating learning
- Metrics table with improvement indicators (↑/↓)
- Learning insights generation
- Configurable task count (1-3 tasks)
- Storage integration with graceful fallback
- Offline mode for testing without API calls

Closes: Replace legacy quickstart with CLI command implementation

The script has been fully replaced by the 'atlas quickstart' CLI command. All functionality has been migrated to the new command with additional features:
- 3 progressive tasks instead of 2 passes
- Metrics visualization
- Better error handling
- Offline mode support

- Fix test_quickstart_command_registered to use --offline instead of --help
- Fix test_quickstart_missing_config_file to properly handle SystemExit

Both tests were failing due to incorrect test setup:
- First test used --help, which triggers SystemExit
- Second test didn't account for sys.exit() raising SystemExit instead of returning

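For context on why both tests tripped: argparse handles `--help` by printing usage and raising `SystemExit`, and `sys.exit()` raises the same exception rather than returning a value. The sketch below illustrates that behavior only. The parser, `run`, and `exit_code_of` are hypothetical stand-ins and are not taken from the Atlas CLI.

```python
import argparse
import os
import sys
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the real Atlas CLI parser.
    parser = argparse.ArgumentParser(prog="atlas")
    subparsers = parser.add_subparsers(dest="command")
    quickstart = subparsers.add_parser("quickstart")
    quickstart.add_argument("--offline", action="store_true")
    quickstart.add_argument("--config", default=None)
    return parser


def run(argv: List[str]) -> int:
    args = build_parser().parse_args(argv)  # --help raises SystemExit(0) here
    if args.config is not None and not os.path.exists(args.config):
        # Exits via sys.exit() instead of returning an error code.
        sys.exit(1)
    return 0


def exit_code_of(argv: List[str]) -> Optional[int]:
    """Return the SystemExit code raised by run(), or None if it returned normally."""
    try:
        run(argv)
    except SystemExit as exc:
        return exc.code
    return None
```

In a pytest suite the same check would typically be written with `pytest.raises(SystemExit)`, inspecting `excinfo.value.code` afterwards.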
- Increase text response truncation from 500 to 1500 characters
- Add smart JSON structure detection and display:
  * Shows nested structure with key counts
  * Displays truncated JSON snippet (first 500 chars)
  * Better visualization of complex JSON responses
- Save run artifacts for each task with full answers
- Add note pointing to run artifacts for full answers
- Display artifact directory path in completion message

This provides better developer experience:
- More content visible in CLI (1500 chars vs 500)
- JSON responses show structure overview + snippet
- Full answers always available in .atlas/runs/ artifacts

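The display rules described in this commit could look roughly like the sketch below. The two limits come from the commit message; the function names, the depth guard value, and the exact output layout are assumptions for illustration and do not reproduce the code in `atlas/cli/quickstart.py`.

```python
import json

TEXT_LIMIT = 1500         # plain-text characters shown in the CLI (per the commit message)
JSON_SNIPPET_LIMIT = 500  # JSON characters shown after the structure overview
MAX_DEPTH = 2             # assumed recursion guard for the structure overview


def describe_structure(value: dict, depth: int = 0) -> list:
    """Summarise a JSON object as indented lines with key and item counts."""
    indent = "  " * depth
    lines = []
    for key, child in value.items():
        if isinstance(child, dict):
            lines.append(f"{indent}{key}: object ({len(child)} keys)")
            if depth < MAX_DEPTH:
                lines.extend(describe_structure(child, depth + 1))
        elif isinstance(child, list):
            lines.append(f"{indent}{key}: array ({len(child)} items)")
        else:
            lines.append(f"{indent}{key}: {type(child).__name__}")
    return lines


def format_final_answer(answer: str) -> str:
    """Show a structure overview plus snippet for JSON objects, truncated text otherwise."""
    try:
        parsed = json.loads(answer)
    except (TypeError, ValueError):
        parsed = None

    if isinstance(parsed, dict):
        overview = "\n".join(describe_structure(parsed))
        snippet = json.dumps(parsed, indent=2)
        if len(snippet) > JSON_SNIPPET_LIMIT:
            snippet = snippet[:JSON_SNIPPET_LIMIT] + "..."
        return f"JSON object ({len(parsed)} keys)\n{overview}\n{snippet}"

    if len(answer) > TEXT_LIMIT:
        return answer[:TEXT_LIMIT] + "..."
    return answer
```

Anything cut off by either limit would still be recoverable from the per-task artifacts under `.atlas/runs/`, which is why the commit pairs truncation with artifact saving.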
- Add documentation comment explaining what's saved in artifacts
- Verify playbook entries are displayed via _render_learning_summary
- Add helpful note about deeper learning analysis when playbook entries exist
- Point users to scripts/eval_learning.py and docs/evaluation/learning_eval.md

Playbook entries (the key learning signal) are:
- Saved in artifacts: metadata.learning_state.metadata.playbook_entries
- Displayed in CLI: Active Playbook Entries section with cue hits/adoptions
- Full structure includes: cue, action, scope, provenance, impact metrics

This ensures users understand where to find learning data and how to analyze it.

Pull Request Overview
This PR replaces the standalone examples/quickstart.py script with a new atlas quickstart CLI command and renames the environment variable ATLAS_FAKE_LLM to ATLAS_OFFLINE_MODE while maintaining backward compatibility. The new quickstart command demonstrates Atlas learning capabilities through 3 progressive security review tasks with improved metrics visualization and user experience.
Key changes:
- New `atlas quickstart` CLI command with offline mode, configurable task count, and storage options
- Environment variable renamed from `ATLAS_FAKE_LLM` to `ATLAS_OFFLINE_MODE` with deprecation warning for the old name
- Comprehensive test coverage for offline mode functionality and CLI command features
Reviewed Changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| `atlas/cli/quickstart.py` | New CLI command implementation with 3 security tasks, metrics table, and learning insights |
| `atlas/utils/llm_client.py` | Updated to support both ATLAS_OFFLINE_MODE (new) and ATLAS_FAKE_LLM (deprecated) with warning |
| `tests/unit/utils/test_llm_client.py` | New test suite for offline mode variable handling and deprecation warnings |
| `tests/unit/cli/test_quickstart.py` | Comprehensive tests for quickstart command functionality |
| `atlas/cli/main.py` | Registered new quickstart subcommand |
| `docs/sdk/quickstart.mdx` | Updated documentation for new CLI command usage |
| `examples/quickstart.py` | Removed old script (replaced by CLI command) |
| `docs/guides/pypi.md` | Updated to reference ATLAS_OFFLINE_MODE with backward compatibility note |
| `docker/entrypoint.sh` | Updated to use new CLI command |
| `README.md` | Updated quickstart instructions |
| `examples/mcp_tool_learning/README.md` | Updated reference to new CLI command |
- Extract _has_playbook_entries() helper function to eliminate code duplication
- Add test_format_final_answer_json() - Test JSON structure display
- Add test_format_final_answer_text_truncation() - Test text truncation at 1500 chars
- Add test_artifact_saving() - Test that artifacts are saved per task
- Add test_learning_analysis_note_shown() - Test learning note appears when playbook entries exist
- Add test_has_playbook_entries() - Test helper function with various metadata structures

Addresses feedback from code review: adds comprehensive test coverage for the smart JSON handling and artifact saving features introduced in commits 8d4fe00 and 72c20f8.

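A helper of this kind might be sketched as follows. The nested path `learning_state.metadata.playbook_entries` follows the artifact layout quoted in an earlier commit message. The signature and the decision to treat malformed metadata as "no entries" are assumptions, since the real `_has_playbook_entries()` is not shown here.

```python
from collections.abc import Mapping
from typing import Any


def has_playbook_entries(metadata: Any) -> bool:
    """Return True when learning_state.metadata.playbook_entries is a non-empty list.

    Any missing key or unexpected type along the path counts as "no entries"
    (an assumption; the real helper may behave differently).
    """
    node = metadata
    for key in ("learning_state", "metadata", "playbook_entries"):
        if not isinstance(node, Mapping):
            return False
        node = node.get(key)
    return isinstance(node, list) and len(node) > 0
```

Walking the path one key at a time keeps the check tolerant of partially populated metadata, which is the case a test over "various metadata structures" would exercise.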
- Remove unused imports (os, sys) from test files
- Add comment to empty except clause explaining intent
- Fix JSON serialization to use _ensure_jsonable helper
- Ensure .env file is loaded early for API keys
- Fix cost estimate: reduce from $0.05 to $0.01 per task (actual ~$0.001-0.005)

- Convert _check_storage_available to async function
- Convert _ensure_storage to async function
- Fixes issue where storage was not detected when called from async context

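The commit does not spell out the root cause, but one common way a synchronous check misreports from async code is that `asyncio.run()` refuses to start inside an already running event loop. The sketch below illustrates that failure mode and the async alternative. `probe_database` is a hypothetical stand-in for a real connection attempt, and neither function is the Atlas implementation.

```python
import asyncio


async def probe_database() -> bool:
    # Hypothetical stand-in for a real connection attempt (for example, to Postgres).
    await asyncio.sleep(0)
    return True


def check_storage_sync() -> bool:
    """Blocking wrapper: reports "unavailable" whenever a loop is already running."""
    coro = probe_database()
    try:
        return asyncio.run(coro)
    except RuntimeError:
        # asyncio.run() cannot be called from a running event loop.
        coro.close()  # avoid a "coroutine was never awaited" warning
        return False


async def check_storage_async() -> bool:
    """Async variant: awaits the probe on the caller's event loop."""
    try:
        return await probe_database()
    except OSError:
        return False


async def main() -> tuple:
    # Called from async code, the sync wrapper fails while the async check succeeds.
    return check_storage_sync(), await check_storage_async()
```

Outside an event loop `check_storage_sync()` works as expected, which is how a bug like this can go unnoticed until the caller becomes async.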
… persistence
- Set learning_key_override in session_metadata for all tasks
- Ensures learning state persists across all 3 quickstart tasks
- Users can now see learning progression and persistence in action

- Standardize learning config across all example and template configs
- Add commented common options for discoverability (history_limit, pruning_config, apply_to_prompts)
- Remove redundant defaults (enforce_generality, provisional_acceptance, pruning_config) - all have sensible defaults
- Improve configuration documentation for ML engineers:
  - Add Key Concepts section with clear definitions
  - Add 'How it works' explanation for learning system
  - Define all terms before use (capability probe, empirical validation, playbook entries)
  - Add real-world tuning examples with context
  - Remove redundancy and improve flow
  - Add 'Why this matters' context
  - Improve SQL query explanations
  - Fix terminology consistency (playbook entries, not pamphlets)
  - Add schema constraints documentation
- All configs validated and passing tests

- Fix ExecutionContext metadata timing bug: capture metadata immediately after arun() completes
- Fix empty exception handler: add logging for serialization errors
- Fix storage check cleanup: add proper finally block and logging
- Fix learning key override precedence: override now takes precedence over existing key
- Remove cost estimation functionality from quickstart command

Addresses critical issues #1-4 from PR review (#114)

- Extract offline mode check utility to atlas/utils/env.py
- Remove all code comments from quickstart.py
- Extract magic numbers as constants
- Add comprehensive error handling in _format_final_answer()
- Standardize type hints (Dict -> dict) in llm_client.py
- Add explicit recursion depth guard constant

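An offline mode utility with the backward compatibility described in this PR could be sketched as below. The two environment variable names come from the PR. The function name, the set of accepted truthy values, and the warning text are assumptions, not the contents of `atlas/utils/env.py`.

```python
import os
import warnings
from typing import Mapping, Optional

_TRUTHY = {"1", "true", "yes", "on"}  # assumed set of accepted values


def is_offline_mode(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when offline (mock LLM) mode is requested.

    ATLAS_OFFLINE_MODE is the preferred variable. ATLAS_FAKE_LLM is still
    honored but emits a DeprecationWarning.
    """
    env = os.environ if env is None else env
    if env.get("ATLAS_OFFLINE_MODE", "").strip().lower() in _TRUTHY:
        return True
    if env.get("ATLAS_FAKE_LLM", "").strip().lower() in _TRUTHY:
        warnings.warn(
            "ATLAS_FAKE_LLM is deprecated; use ATLAS_OFFLINE_MODE instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return True
    return False
```

Accepting an optional mapping keeps the helper testable without mutating `os.environ`; callers would normally invoke it with no arguments.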
This pull request introduces a new `atlas quickstart` CLI command that replaces the old `examples/quickstart.py` script, making it easier for users to run a demonstration of Atlas learning capabilities. It also updates documentation and code to support a new offline mode (`ATLAS_OFFLINE_MODE`) for mock LLM responses, deprecating the previous `ATLAS_FAKE_LLM` environment variable. The changes improve usability, provide better error handling, and clarify the workflow for running and testing Atlas. Below are the most important changes grouped by theme:

CLI Improvements and Quickstart Command
- Added a new `atlas quickstart` CLI command that runs a demonstration with 3 security review tasks, visualizes metrics, estimates costs, and supports both online and offline modes. This command is now registered in the CLI parser and replaces the old script.
- Updated documentation (`README.md`, `docs/sdk/quickstart.mdx`, `docs/guides/pypi.md`, `examples/mcp_tool_learning/README.md`) to reference `atlas quickstart` instead of the deprecated script, and provide step-by-step instructions, expected output, troubleshooting, and feature highlights.

Offline Mode and Environment Variable Updates
- Introduced `ATLAS_OFFLINE_MODE=1` as the preferred way to run Atlas in mock mode for offline testing, with clear warnings for users still using the legacy `ATLAS_FAKE_LLM` variable. All code and documentation references now use `ATLAS_OFFLINE_MODE`.

Usability and Error Handling

Deprecation Notices
- Deprecated the `examples/quickstart.py` script and the `ATLAS_FAKE_LLM` environment variable, with warnings and documentation updates to guide users to the new workflow.

Documentation and Example Updates
Fixes: #109 #110