feat: Implement validate command with comprehensive Excel file validation #30
Merged
Create validation/ module with:
- Module structure (__init__.py files)
- SchemaParser: Extract table definitions from schema.sql
- TableSchema: Data class for table requirements

SchemaParser features:
- Parse CREATE TABLE statements with regex
- Extract column names and types (INTEGER, TEXT, REAL, DATETIME)
- Detect primary keys
- Detect foreign keys with references

Foundational for all validation checks.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
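A minimal sketch of how regex-based extraction of columns and foreign keys might look, assuming a simple schema.sql with one definition per comma-separated item. The field names on `TableSchema`, the helper `split_top_level`, and the function `parse_schema` are hypothetical and not taken from the PR; primary-key detection is left out to keep the sketch short.

```python
import re
from dataclasses import dataclass, field


@dataclass
class TableSchema:
    # Field names are illustrative assumptions, not the PR's actual attributes.
    name: str
    columns: dict = field(default_factory=dict)       # column name -> SQL type
    foreign_keys: dict = field(default_factory=dict)  # column -> (table, column)


CREATE_TABLE_RE = re.compile(
    r"CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?(\w+)\s*\((.*?)\)\s*;",
    re.IGNORECASE | re.DOTALL,
)
COLUMN_RE = re.compile(
    r"^(\w+)\s+(INTEGER|TEXT|REAL|DATETIME)\b", re.IGNORECASE
)
FOREIGN_KEY_RE = re.compile(
    r"^FOREIGN\s+KEY\s*\((\w+)\)\s*REFERENCES\s+(\w+)\s*\((\w+)\)",
    re.IGNORECASE,
)


def split_top_level(body):
    """Split a CREATE TABLE body on commas that are not inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def parse_schema(sql):
    """Return {table name: TableSchema} for every CREATE TABLE statement."""
    tables = {}
    for name, body in CREATE_TABLE_RE.findall(sql):
        table = TableSchema(name=name)
        for item in split_top_level(body):
            fk = FOREIGN_KEY_RE.match(item)
            if fk:
                table.foreign_keys[fk.group(1)] = (fk.group(2), fk.group(3))
                continue
            col = COLUMN_RE.match(item)
            if col:
                table.columns[col.group(1)] = col.group(2).upper()
        tables[name] = table
    return tables
```

Items that match neither pattern (for example a table-level PRIMARY KEY clause) are simply skipped here.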
Add dataclasses for validation results:
- ValidationError: Individual error/warning with location
- FileValidationResult: Result per file
- ValidationResult: Overall project result

Models include:
- Error codes (COLUMN_MISSING, DUPLICATE_PK, etc.)
- Severity levels (error, warning)
- Location tracking (sheet, row, column)
- Helper properties for status checks

Foundation for all validation checks and reporting.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
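One way the three result models could be shaped, as a sketch only. The class names come from the commit message; every field name and the `is_valid` / `total_errors` helper properties are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ValidationError:
    code: str                      # e.g. "COLUMN_MISSING", "DUPLICATE_PK"
    message: str
    severity: str = "error"        # "error" or "warning"
    sheet: Optional[str] = None    # location tracking (all optional)
    row: Optional[int] = None
    column: Optional[str] = None


@dataclass
class FileValidationResult:
    path: str
    # Errors and warnings are kept as lists, so counts come from len().
    errors: List[ValidationError] = field(default_factory=list)
    warnings: List[ValidationError] = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        """A file passes when it has no errors; warnings do not fail it."""
        return not self.errors


@dataclass
class ValidationResult:
    files: List[FileValidationResult] = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        return all(f.is_valid for f in self.files)

    @property
    def total_errors(self) -> int:
        return sum(len(f.errors) for f in self.files)
```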
Add file_checks.py with checks for:
- File existence (FILE_NOT_FOUND)
- File readability (FILE_CORRUPT)
- File format validation (NOT_EXCEL_FILE)
- File size warning (LARGE_FILE for > 100MB)

These are the first validation checks that run before attempting to read any data from the files.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
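The file-level checks could be sketched with the standard library alone. This is an illustration under several assumptions: only `.xlsx` is treated as an Excel file, "readable" is approximated by testing that the file is a ZIP archive (which is what an .xlsx container is), and 100MB is taken as 100 * 1024 * 1024 bytes. The function name `check_file` and its return shape are hypothetical; the PR may well open the workbook with an Excel library instead.

```python
import zipfile
from pathlib import Path

# Assumption: "100MB" interpreted as mebibytes.
LARGE_FILE_BYTES = 100 * 1024 * 1024


def check_file(path, large_file_bytes=LARGE_FILE_BYTES):
    """Return a list of (code, severity) tuples for one file."""
    p = Path(path)
    if not p.is_file():
        return [("FILE_NOT_FOUND", "error")]
    if p.suffix.lower() != ".xlsx":
        return [("NOT_EXCEL_FILE", "error")]
    # An .xlsx file is a ZIP container; anything else cannot be opened.
    if not zipfile.is_zipfile(p):
        return [("FILE_CORRUPT", "error")]
    issues = []
    if p.stat().st_size > large_file_bytes:
        issues.append(("LARGE_FILE", "warning"))
    return issues
```

Each error returns early because later checks make no sense once an earlier one has failed; the size check only produces a warning.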
Add sheet_checks.py with checks for:
- Sheet existence (SHEET_MISSING)
- Sheet has data (SHEET_EMPTY)
- Extra sheets warning (EXTRA_SHEET)
- Row count helper

Validates that required sheets exist and contain data, warns about extra sheets that will be ignored during import.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add column_checks.py with checks for:
- Required columns present (COLUMN_MISSING)
- Extra columns warning (EXTRA_COLUMN)
- Column existence helper

Ensures all required columns from schema are present, warns about extra columns that will be ignored.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
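The required/extra column comparison amounts to two set differences. A small sketch, assuming exact, case-sensitive name matching (the PR may normalize names differently); `check_columns` is a hypothetical name.

```python
def check_columns(required, actual):
    """Compare schema columns with the sheet's header row.

    Returns the missing required columns (errors) and the extra columns
    (warnings), each in their original order.
    """
    required_set = set(required)
    actual_set = set(actual)
    return {
        "COLUMN_MISSING": [c for c in required if c not in actual_set],
        "EXTRA_COLUMN": [c for c in actual if c not in required_set],
    }
```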
Add type_checks.py with checks for:
- Type mismatches (TYPE_MISMATCH)
- Numeric range validation (NEGATIVE_VALUE)
- Sample reporting for first 5 issues

Validates that INTEGER, REAL columns contain appropriate data, warns about negative values in quantity/stock columns.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
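A sketch of the TYPE_MISMATCH check with capped sample reporting, operating on a plain list of cell values. The rules for what counts as a valid INTEGER or REAL, the skipping of empty cells, and the row numbering (data assumed to start on spreadsheet row 2, below a header) are all assumptions; the function names are hypothetical.

```python
def matches_type(value, sql_type):
    """Return True if a cell value is acceptable for the given SQL type."""
    if sql_type not in ("INTEGER", "REAL"):
        return True  # only numeric columns are checked in this sketch
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    if sql_type == "INTEGER":
        return isinstance(value, int) or value.is_integer()
    return True


def find_type_mismatches(values, sql_type, max_samples=5):
    """Return (total mismatches, first few (row, value) samples)."""
    total, samples = 0, []
    for index, value in enumerate(values):
        if value is None:
            continue  # empty cells are left to the null checks
        if not matches_type(value, sql_type):
            total += 1
            if len(samples) < max_samples:
                samples.append((index + 2, value))  # +2: header row, 1-based
    return total, samples
```

Counting every mismatch while storing only the first few keeps the report short without hiding the scale of the problem.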
Add pk_checks.py with checks for:
- Duplicate primary key values (DUPLICATE_PK)
- Null values in primary key (PK_NULL_VALUES)
- Sample reporting (up to 10 duplicate PKs with row locations)

Critical check to ensure data integrity before import. Shows exact row locations of duplicate values for easy fixing.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
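Duplicate detection with row locations can be done in one pass by grouping row numbers per key value. The sketch below assumes a single-column primary key given as a list of cell values, treats `None` and empty strings as null, and numbers rows from 2 (header on row 1); `check_primary_key` and the returned dictionary keys are hypothetical.

```python
from collections import defaultdict


def check_primary_key(values, max_samples=10):
    """Find duplicate and null primary-key values with their row numbers."""
    rows_by_value = defaultdict(list)
    null_rows = []
    for index, value in enumerate(values):
        row = index + 2  # assumption: header on row 1, data from row 2
        if value is None or value == "":
            null_rows.append(row)
        else:
            rows_by_value[value].append(row)

    duplicates = {v: rows for v, rows in rows_by_value.items() if len(rows) > 1}
    return {
        "duplicate_count": len(duplicates),
        # Report at most max_samples duplicated values, in first-seen order.
        "DUPLICATE_PK": dict(list(duplicates.items())[:max_samples]),
        "PK_NULL_VALUES": null_rows,
    }
```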
Add null_checks.py with check for:
- Null values in all columns (NULL_VALUES)
- Sample row reporting (up to 10 locations)

Warns about null values with specific row locations to help users quickly identify and fix missing data.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add validator.py with:
- validate_project(): Validate all Excel files
- validate_file(): Validate single file with all checks
- Integration of all check modules
- Strict mode support (warnings as errors)

Orchestrates all validation checks:
1. File-level checks (exists, format, readable)
2. Sheet-level checks (exists, not empty)
3. Column checks (required, extra)
4. Type checks (data types, ranges)
5. Primary key checks (uniqueness, nulls)
6. Null value checks

Returns structured ValidationResult with timing.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
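The orchestration described above can be pictured as running an ordered list of checks, collecting their findings, and timing the whole run. This is a sketch, not the PR's validator: `run_checks` is a hypothetical name, each check is assumed to be a zero-argument callable returning `(code, severity)` tuples, and strict mode is modelled as "any warning fails the run".

```python
import time


def run_checks(checks, strict=False):
    """Run checks in order and summarize the outcome.

    checks: iterable of callables, each returning (code, severity) tuples.
    strict: when True, warnings cause the run to fail as errors would.
    """
    started = time.perf_counter()
    errors, warnings = [], []
    for check in checks:
        for code, severity in check():
            if severity == "error":
                errors.append(code)
            else:
                warnings.append(code)
    passed = not errors and not (strict and warnings)
    return {
        "errors": errors,
        "warnings": warnings,
        "passed": passed,
        "seconds": time.perf_counter() - started,
    }
```

Passing the checks as a list keeps the order (file, sheet, column, type, primary key, null) explicit in one place.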
Add reporters.py with ValidationReporter class:
- print_result(): Main output method
- print_success(): Success message
- print_warnings(): Warnings summary
- print_errors(): Detailed error listing
- print_file_details(): Per-file detailed output

Provides clear, actionable output with:
- Color-coded status indicators ([OK], [X], [!])
- Error code identification
- Suggestions for each error
- Summary statistics
- Next steps guidance

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add validate command to CLI with:
- validate: Main validation command
- --strict flag: Treat warnings as errors
- Integration with Validator and ValidationReporter
- Exit code 1 on validation failure

Usage:
  wareflow validate           # Validate all files
  wareflow validate --strict  # Fail on warnings

Provides clear output and actionable error messages to help users fix Excel files before attempting import.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
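The command shape and exit-code behaviour can be sketched with `argparse` from the standard library. The PR does not say which CLI framework wareflow uses, so this is only an illustration; `build_parser`, `main`, and the injected `run_validation` callable are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser exposing `validate` with an optional --strict flag."""
    parser = argparse.ArgumentParser(prog="wareflow")
    subcommands = parser.add_subparsers(dest="command", required=True)
    validate = subcommands.add_parser(
        "validate", help="Validate all Excel files before import"
    )
    validate.add_argument(
        "--strict", action="store_true", help="Treat warnings as errors"
    )
    return parser


def main(argv, run_validation):
    """Return the process exit code: 0 on success, 1 on validation failure.

    run_validation is a stand-in for the real validator; it receives the
    strict flag and returns True when validation passed.
    """
    args = build_parser().parse_args(argv)
    passed = run_validation(strict=args.strict)
    return 0 if passed else 1
```

A real entry point would pass the result to `sys.exit`, for example `sys.exit(main(sys.argv[1:], run_validation))`.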
Fix AttributeError in reporters.py:
- Changed errors_count/warnings_count to len(errors)/len(warnings)
- FileValidationResult uses list attributes, not count attributes

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add support for inline PRIMARY KEY declarations:
- Pattern for: column_name TYPE PRIMARY KEY
- Handles: no_produit INTEGER PRIMARY KEY

Previous pattern only worked for: PRIMARY KEY (no_produit)

This fixes the issue where primary keys were not being detected, causing validation to miss duplicate PK errors.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
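To illustrate why the table-level pattern alone missed inline keys, here is a sketch with one regex per form. The patterns and the `find_primary_keys` helper are assumptions written for this example, not the PR's actual expressions; the inline pattern only covers a type immediately followed by PRIMARY KEY, so a declaration such as `id INTEGER NOT NULL PRIMARY KEY` would still be missed.

```python
import re

# Table-level form: PRIMARY KEY (no_produit) or PRIMARY KEY (a, b)
TABLE_LEVEL_PK = re.compile(r"PRIMARY\s+KEY\s*\(([^)]+)\)", re.IGNORECASE)

# Inline form: no_produit INTEGER PRIMARY KEY
INLINE_PK = re.compile(
    r"(\w+)\s+(?:INTEGER|TEXT|REAL|DATETIME)\s+PRIMARY\s+KEY",
    re.IGNORECASE,
)


def find_primary_keys(table_body):
    """Return primary-key column names found in a CREATE TABLE body."""
    keys = []
    for match in TABLE_LEVEL_PK.finditer(table_body):
        keys.extend(col.strip() for col in match.group(1).split(","))
    for match in INLINE_PK.finditer(table_body):
        if match.group(1) not in keys:
            keys.append(match.group(1))
    return keys
```

The table-level pattern requires an opening parenthesis after `KEY`, which is why it never matches the inline declaration.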
- Add v1.0.0.md: Complete architecture and design document
  - Dual interface strategy (CLI + GUI with CustomTkinter)
  - Analysis system design (core + custom YAML-based)
  - Plugin architecture with progressive enhancement
  - Project structure and workflows
- Add v0.2.0.md: Implementation roadmap for v0.2.0
  - Current state analysis (35% complete)
  - Missing features inventory (analyze, export, run)
  - 5-phase implementation plan (12-18 days)
  - Technical specifications with code examples
  - Testing strategy and risk assessment

These documents provide the strategic vision and tactical implementation plan for completing the core analytics engine.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Summary
Implements the validate command for comprehensive validation of Excel files before import, detecting errors and providing clear, actionable feedback.
Features
Validation Checks Implemented
File-Level Checks:
Sheet-Level Checks:
Column-Level Checks:
Data-Level Checks:
Command Usage
Output Examples
Success:
With Errors:
Implementation Details
Module Structure:
Technical Highlights:
Benefits
Dependencies
Testing
Tested with sample Excel files:
🤖 Generated with Claude Code