Comprehensive database of VTK Python test files with baseline images and generated documentation for MCP integration.
This repository provides a complete workflow to:
- Scrape Python tests from the VTK repository (909 tests)
- Map official VTK baseline images to tests (672 images)
- Extract VTK test data files for runnable tests
- Generate markdown documentation with baseline images and data files
- Export structured data for MCP RAG integration
- Comprehensive Scraping: Extract Python tests from VTK repository Testing directories
- VTK Class Detection: AST parsing + regex patterns for class/method extraction
- Module Organization: Categorize tests by VTK module (Filters/Core, IO/XML, etc.)
- Test Type Classification: Unit, integration, and visual test identification
- Baseline Discovery: Automatic detection of test reference images
- Image Association: Link baseline images to corresponding test files
- Image Cataloging: Download and organize baseline images locally
- Data File Extraction: Extract and copy VTK test data files (including directories)
- Visual Documentation: Include images and data files in generated markdown docs
- Test Documentation: Generate markdown files for each test
- Rich Content: Include test code, VTK classes used, and baseline images
- Organized Structure: Categorize docs by module and test type
- Cross-References: Link related tests and VTK classes
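The "AST parsing + regex patterns" approach to class and method detection can be sketched roughly as follows. This is a minimal illustration, not the scraper's actual code: the function name `extract_vtk_usage`, the regex, and the rule that a capitalized attribute call counts as a VTK method are all assumptions made for the example.

```python
import ast
import re

# Assumption: VTK class names look like "vtk" followed by a capital letter.
VTK_CLASS_RE = re.compile(r"\bvtk[A-Z]\w*")


def extract_vtk_usage(source):
    """Return VTK class and method names found in Python source text."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Regex fallback for files the parser rejects (e.g. legacy syntax).
        return {
            "vtk_classes": sorted(set(VTK_CLASS_RE.findall(source))),
            "vtk_methods": [],
        }

    classes, methods = set(), set()
    for node in ast.walk(tree):
        # vtk.vtkActor -> Attribute node; vtkActor (imported name) -> Name node
        if isinstance(node, ast.Attribute) and VTK_CLASS_RE.fullmatch(node.attr):
            classes.add(node.attr)
        elif isinstance(node, ast.Name) and VTK_CLASS_RE.fullmatch(node.id):
            classes.add(node.id)

        # obj.SetRadius(...) -> treat capitalized attribute calls as methods
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            name = node.func.attr
            if name[:1].isupper() and not VTK_CLASS_RE.fullmatch(name):
                methods.add(name)

    return {"vtk_classes": sorted(classes), "vtk_methods": sorted(methods)}
```

The AST pass avoids false positives from comments and strings, while the regex fallback keeps unparseable files in the database with at least their class names.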
{
  "id": "test_<hash>",
  "title": "Test file name",
  "vtk_module": "Filters/Core",
  "test_type": "visual",
  "code": "Full Python test code",
  "vtk_classes": ["vtkActor", "vtkPolyData"],
  "vtk_methods": ["SetInput", "Update"],
  "baseline_images": ["TestName.png"],
  "data_files": ["Data/beach.jpg", "Data/headsq/"],
  "test_functions": ["TestFunction1"]
}
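Records with this shape are stored one per line in a JSONL file, so they can be consumed with the standard library alone. The sketch below shows one way to parse such lines and select visual tests that have baseline images; the helper names `parse_jsonl` and `visual_tests_with_baselines` are hypothetical and not part of this repository.

```python
import json


def parse_jsonl(lines):
    """Parse an iterable of JSONL lines into a list of dicts, skipping blanks."""
    records = []
    for line in lines:
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records


def visual_tests_with_baselines(records):
    """Keep only visual tests that list at least one baseline image."""
    return [
        rec for rec in records
        if rec.get("test_type") == "visual" and rec.get("baseline_images")
    ]


if __name__ == "__main__":
    # Path taken from the project layout; adjust if the database lives elsewhere.
    with open("data/processed/vtk_python_tests.jsonl", encoding="utf-8") as fh:
        tests = parse_jsonl(fh)
    print(len(visual_tests_with_baselines(tests)), "visual tests with baselines")
```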
vtk-python-tests/
├── setup_venv.sh                       # One-command setup
├── setup.py                            # Project setup
├── build.py                            # Build workflow
├── clean.py                            # Cleanup script
├── scripts/
│   ├── scrape/
│   │   ├── scrape_vtk_tests.py         # VTK test scraper
│   │   └── common.py                   # Shared utilities
│   ├── setup_vtk_test_database.py      # Baseline image setup
│   └── generate_docs.py                # Documentation generator
├── data/
│   ├── processed/
│   │   └── vtk_python_tests.jsonl      # Main database (909 tests)
│   ├── images/baselines/               # Baseline images (672 images)
│   ├── vtk_source/                     # Downloaded VTK source code
│   └── vtk_release_data/               # Downloaded VTK test data files
└── docs/tests/                         # Generated documentation
    ├── index.md                        # Test index
    ├── BASELINE_IMAGES.md              # Baseline summary
    ├── Data/                           # Copied VTK test data files
    └── [Module]/[Test].md              # Individual test docs
./setup_venv.sh
python3 -m venv .venv
source .venv/bin/activate
python setup.py
python build.py # Complete workflow
python build.py --scrape-only # Create database only
python build.py --setup-only # Add baseline images only
python build.py --docs-only # Generate docs with data files
python clean.py # Remove docs/images
python clean.py --all # Remove everything
Each test gets a markdown file with:
- Test name and purpose
- VTK module and category
- Test type (unit/integration/visual)
- Full Python test code
- VTK classes used with descriptions
- VTK methods called
- Dependencies and imports
- Baseline images (for visual tests)
- Required VTK test data files
- Image descriptions and context
- Multiple test outputs if applicable
- Similar tests in the same module
- Tests using the same VTK classes
- Cross-references to related functionality
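The "tests using the same VTK classes" cross-references can be derived from the database with an inverted index from class name to test IDs. The following is a minimal sketch under that assumption; `related_by_class` is a hypothetical helper, and the real generator may rank or limit related tests differently.

```python
from collections import defaultdict


def related_by_class(records):
    """Map each test id to the sorted ids of tests sharing a VTK class with it."""
    # Inverted index: class name -> ids of tests that use it
    by_class = defaultdict(set)
    for rec in records:
        for cls in rec.get("vtk_classes", []):
            by_class[cls].add(rec["id"])

    related = {}
    for rec in records:
        others = set()
        for cls in rec.get("vtk_classes", []):
            others |= by_class[cls]
        others.discard(rec["id"])  # a test is not related to itself
        related[rec["id"]] = sorted(others)
    return related
```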
This repository integrates with the main VTK MCP system:
- Data Export: JSONL format compatible with MCP corpus
- Schema Alignment: Matches vtk-python-examples format
- External Integration: Used by integrate_external_repos.py
Purpose: Create initial test database from VTK repository
- Scrapes VTK Python test files from repository
- Extracts test metadata and VTK class usage
- Creates initial data/processed/vtk_python_tests.jsonl
Usage: python scripts/scrape/scrape_vtk_tests.py
Purpose: Complete VTK test database setup
- Downloads VTK 9.5.0 data files
- Maps baseline images to tests in database
- Copies baseline images to data/images/baselines/
- Updates database with baseline metadata
Usage: python scripts/setup_vtk_test_database.py
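Mapping baseline images to tests comes down to matching image file names against test names. The sketch below assumes a naming convention in which a test named TestName has a baseline TestName.png plus optional numbered alternates such as TestName_1.png; that convention and the helper name `map_baselines` are assumptions for illustration, and the actual script may use additional rules.

```python
import re
from collections import defaultdict

# Assumption: alternate baselines carry a numeric suffix, e.g. TestName_1.png
_VARIANT_SUFFIX = re.compile(r"_\d+$")


def map_baselines(test_names, image_names):
    """Return {test name: [baseline image file names]} for matching images."""
    known = set(test_names)
    mapping = defaultdict(list)
    for image in sorted(image_names):
        stem = image.rsplit(".", 1)[0]
        if stem not in known:
            # Try again without the numeric variant suffix
            stem = _VARIANT_SUFFIX.sub("", stem)
        if stem in known:
            mapping[stem].append(image)
    return dict(mapping)
```

Checking the full stem first means a test whose own name ends in a number is still matched to its primary image before the suffix is stripped.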
Purpose: Generate markdown documentation with baseline images and data files
- Reads test database with baseline info
- Creates markdown files with baseline images automatically included
- Extracts and copies VTK test data files to docs/tests/Data/
- Generates organized docs in docs/tests/
Usage: python scripts/generate_docs.py --clean --vtk-data-dir data/vtk_release_data
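To make the generation step concrete, here is a rough sketch of how one database record might be rendered into a markdown page with its baseline images and data files. The page layout, the function name `render_test_markdown`, and the `image_dir` parameter are illustrative assumptions; the pages produced by generate_docs.py may be structured differently.

```python
def render_test_markdown(record, image_dir):
    """Render one database record as a markdown page (illustrative layout only)."""
    fence = "`" * 3  # built at runtime so this example nests cleanly in markdown
    classes = ", ".join(record.get("vtk_classes", [])) or "none detected"
    lines = [
        f"# {record['title']}",
        "",
        f"- Module: {record['vtk_module']}",
        f"- Test type: {record['test_type']}",
        f"- VTK classes: {classes}",
        "",
    ]
    # Embed each baseline image, relative to the assumed image directory
    for image in record.get("baseline_images", []):
        lines.append(f"![{image}]({image_dir}/{image})")
    # List the data files the test needs in order to run
    if record.get("data_files"):
        lines.append("")
        lines.append("Required data files:")
        lines.extend(f"- {path}" for path in record["data_files"])
    # Full test source in a fenced python block
    lines += ["", f"{fence}python", record["code"].rstrip(), fence, ""]
    return "\n".join(lines)
```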
- setup_venv.sh - One-command environment setup
- setup.py - Project initialization (directories, dependencies)
- build.py - Complete workflow automation
- clean.py - Generated file cleanup
- Fork the repository
- Create feature branch
- Add tests for new functionality
- Submit pull request
This project follows VTK's licensing terms for derivative works.