AI-powered file analysis tool with intelligent filename generation and comprehensive analysis capabilities.
# Install the project
pip install -e ".[dev]"
# Analyze an image (primary use case)
fa path/to/image.jpg
# Get JSON output
fa --json path/to/image.jpg
# Get markdown output
fa --md path/to/image.jpg
- AI Vision Analysis: Smart image analysis with FastVLM, BakLLaVA, and Qwen2-VL models
- Intelligent Filename Generation: AI-powered suggestions based on image content
- Multiple Output Formats: Text (default), JSON, and Markdown output
- Comprehensive Analysis: Metadata extraction, duplicate detection, OCR, malware scanning
- Advanced Tools: ripgrep content searching, binary analysis with binwalk
The File Analyzer includes advanced AI-powered filename generation that suggests meaningful names based on image content:
$ fa test_data/images/test.jpg
Recommended Filename: number-5.jpg
Description:
The image displays a stylized, minimalist design of the number '5'. It is composed
of two overlapping rectangles, with the top rectangle forming the upper part of the
number and the bottom rectangle forming the lower part. The rectangles are filled
with a solid, light beige color...
Tags: beige, design, flat design, minimalism, number, solid color
Analysis Time: 17.34 seconds
- Content Recognition: Detects specific content types (letters, numbers, icons, objects)
  - `letter-t.jpg` for text characters
  - `number-5.jpg` for numeric digits
  - `icon-star.png` for symbolic content
  - `duck-wizard.jpg` for descriptive content
- Tag Cleaning: Removes generic terms like "image", "photo", and "shooting" while preserving meaningful tags (see the sketch after this list)
- Semantic Analysis: Uses AI models to generate descriptive filenames from complex image content
- Fallback Logic: Graceful degradation when AI analysis fails or produces unclear results
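As an illustration of the tag-cleaning step, here is a simplified sketch; the stop-list is built from the generic terms mentioned above and is illustrative only (the real logic lives in the analyzer and may differ):

```python
# Simplified sketch of tag cleaning: drop generic terms, keep meaningful ones.
# The stop-list here is illustrative; the real implementation may use a larger set.
GENERIC_TAGS = {"image", "photo", "shooting"}

def clean_tags(tags):
    """Return lowercase, de-duplicated tags with generic terms removed."""
    cleaned = []
    for tag in tags:
        tag = tag.strip().lower()
        if tag and tag not in GENERIC_TAGS and tag not in cleaned:
            cleaned.append(tag)
    return cleaned

print(clean_tags(["Photo", "number", "image", "flat design", "minimalism"]))
# -> ['number', 'flat design', 'minimalism']
```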
├── src/ # Core source code (Python modules)
│ ├── analyzer.py # Main analyzer module
│ ├── vision.py # Vision analysis module
│ ├── json_utils.py # JSON processing utilities
│ ├── model_config.py # Model management configuration
│ ├── fastvlm_adapter.py # FastVLM model adapter
│ └── artifact_guard.py # Artifact path discipline
│
├── tools/ # Command-line tools and utilities
│ ├── analyze.sh # Main CLI wrapper
│ ├── vision_test.sh # Vision model testing
│ ├── json_test.sh # JSON output testing
│ ├── download_models.py # Model download utility
│ ├── setup_fastvlm.sh # FastVLM environment setup
│ ├── benchmark_fastvlm.py # Model benchmarking tools
│ └── ... # Other utility scripts
│
├── tests/ # Test scripts and validation
│ ├── test_path_enforcement.sh
│ ├── strict_example.sh
│ ├── test_fastvlm.sh
│ └── ... # Test harnesses
│
├── artifacts/ # Canonical storage for outputs
│ ├── analysis/ # Analysis results
│ ├── vision/ # Vision model outputs
│ ├── test/ # Test results
│ ├── benchmark/ # Performance benchmarks
│ └── tmp/ # Temporary files
│
├── artifact_guard_py_adapter.sh # Runtime path enforcement
└── libs/ # External libraries
└── ml-fastvlm/ # FastVLM vision library (CODE ONLY)
Model files are stored in ~/.local/share/fastvlm/ (see MODELS.md)
# Direct file analysis (main use case)
fa path/to/image.jpg # Smart analysis with filename suggestion
fa --json path/to/image.jpg # JSON output format
fa --md path/to/image.jpg # Markdown output format
fa --verbose path/to/image.jpg # Verbose debugging output
# Path handling
fa ./relative/path/image.jpg # Relative paths
fa /absolute/path/image.jpg # Absolute paths
fa ~/home/path/image.jpg # Tilde expansion
# Model management
fa model list # List available AI models
fa model download --size 0.5b # Download specific model
# Testing and validation
fa test # Run comprehensive test suite
fa validate # Validate configuration
fa benchmark # Performance benchmarks
# Legacy subcommands (still supported)
fa quick path/to/image.jpg # Alias for direct analysis
fa analyze vision path/to/image.jpg # Comprehensive analysis mode
For comprehensive file analysis with multiple tools:
# Run all analyses on a directory
./tools/analyze.sh -a ~/Documents
# Extract metadata and scan for duplicates
./tools/analyze.sh -m -d ~/Pictures
# Search for specific content
./tools/analyze.sh -s "password" ~/Downloads
# OCR images in a directory
./tools/analyze.sh -o ~/Screenshots
# Include only specific file types
./tools/analyze.sh -a -i "*.jpg" -i "*.png" ~/Pictures
# Analyze images with AI vision models
./tools/analyze.sh -V ~/Pictures
The File Analyzer supports multiple output formats for different use cases:
Human-readable format with intelligent filename suggestions:
Recommended Filename: letter-t.jpg
Description:
The image displays a simple, stylized icon of a letter 'T' in a bold, sans-serif font.
The icon is rendered in a solid, mustard yellow color with a slight shadow effect...
Tags: branding, icon, letter, minimalist, typography
Analysis Time: 17.18 seconds
Structured data format for programmatic use:
{
"recommended_filename": "letter-t.jpg",
"description": "The image displays a simple, stylized icon...",
"tags": ["branding", "icon", "letter", "minimalist", "typography"],
"metadata": {
"model": "fastvlm_1.5b",
"execution_time": 17.180412,
"timestamp": "2025-05-22T15:38:16.012597"
},
"original_file": "test_data/images/test.jpg"
}
Formatted output for documentation and reports:
# File Analysis: test.jpg
**Recommended Filename:** `letter-t.jpg`
## Description
The image displays a simple, stylized icon of a letter 'T' in a bold, sans-serif font...
## Tags
branding, icon, letter, minimalist, typography
## Metadata
- Model: fastvlm_1.5b
- Execution Time: 17.180412 seconds
Results are saved to the canonical artifacts directory structure:
- `artifacts/analysis/<context>_<unique_id>/summary.json`: Overall summary
- `artifacts/analysis/<context>_<unique_id>/metadata.json`: File metadata
- `artifacts/analysis/<context>_<unique_id>/duplicates.txt`: Duplicate files
- `artifacts/analysis/<context>_<unique_id>/ocr_results.json`: Text from images
- `artifacts/analysis/<context>_<unique_id>/malware_scan.txt`: Malware scan results
- `artifacts/analysis/<context>_<unique_id>/search_results.txt`: Content search results
- `artifacts/analysis/<context>_<unique_id>/binary_analysis.txt`: Binary analysis
- `artifacts/vision/<context>_<unique_id>/vision_analysis.json`: AI vision model analysis
- `artifacts/vision/<context>_<unique_id>/vision_metrics.json`: Vision analysis performance metrics
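For programmatic consumption, these per-run JSON files can be read directly. A minimal sketch, assuming a run directory that follows the layout above (the directory name and the fields inside `summary.json` are illustrative):

```python
import json
from pathlib import Path

# Hypothetical run directory following the <context>_<unique_id> convention
run_dir = Path("artifacts/analysis/my_analysis_abc123")

summary = json.loads((run_dir / "summary.json").read_text())
print(summary)

ocr_path = run_dir / "ocr_results.json"
if ocr_path.exists():
    ocr_results = json.loads(ocr_path.read_text())
    print(f"OCR entries: {len(ocr_results)}")
```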
The system supports custom configuration files in JSON format. Create a `config.json` in the current directory or specify a custom path with the `-c` option.
Example configuration:
{
"default_output_dir": "artifacts/analysis",
"max_threads": 4,
"max_ocr_images": 50,
"file_extensions": {
"images": [".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp", ".gif"]
},
"default_exclude_patterns": ["*.log", "*.tmp", "*.bak"],
"vision": {
"model": "fastvlm",
"max_images": 10,
"description_mode": "standard",
"output_format": "json",
"max_retries": 3
}
}
The system includes advanced AI vision models for image analysis with robust JSON output:
- FastVLM: Apple's efficient vision model (default, fastest on Apple Silicon)
  - Model variants: 0.5B (fastest), 1.5B (default), 7B (highest quality)
  - Performance: Up to 85x faster Time-to-First-Token than alternatives
- BakLLaVA: Mature vision language model with good performance
  - Works well on all platforms
  - More mature with better handling of complex scenes
- Qwen2-VL: Document analysis specialist
  - Optimized for text extraction from documents
  - Good performance on structured content
- describe (default): General image description with details and context
- detect: Object detection with locations and relationships
- document: Optimized for text extraction from documents/screenshots
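The mode is passed through to the vision model. A short sketch using the adapter interface described under Model Management below (assuming the same `create_adapter`/`predict` signature shown there):

```python
from src.fastvlm_adapter import create_adapter

adapter = create_adapter(model_size="1.5b")

# General description of a photo
description = adapter.predict(image_path="photo.jpg",
                              prompt="Describe this image.",
                              mode="describe")

# Text extraction from a screenshot or scanned document
document_text = adapter.predict(image_path="scan.png",
                                prompt="Extract the text from this document.",
                                mode="document")
```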
All vision analysis results are provided in a structured JSON format:
{
"/path/to/image.jpg": {
"description": "Detailed image description text...",
"tags": ["tag1", "tag2", "tag3"],
"metadata": {
"response_time": 1.25,
"model": "FastVLM 1.5B",
"timestamp": "2025-05-20 15:30:45",
"attempts": 1,
"mode": "describe"
}
}
}
The system implements robust JSON validation for reliable output:
- Advanced Extraction and Validation
  - Sophisticated JSON extraction with balanced bracket matching
  - Handles nested objects and arrays correctly
  - Multiple extraction strategies for different response patterns
  - Centralized validation through the dedicated `json_utils` module
- Automatic Retry Logic
  - Multiple retry attempts with progressively stronger JSON-forcing prompts
  - Ensures valid, well-structured output even when model responses vary
  - Graceful fallback to text output when JSON parsing fails completely
  - Detailed failure metadata for debugging purposes
- Extraction Capabilities
  - Can extract valid JSON even from partially correct text responses
  - Uses intelligent pattern matching to find embedded JSON objects
  - Supports specialized patterns for different analysis modes (describe, detect, document)
  - Preserves context and field relationships during extraction
- Performance Metrics
  - Tracks response time and other performance indicators
  - Records the number of retry attempts needed for valid JSON
  - Provides detailed metrics in a separate JSON file for benchmarking
  - Standardized metadata format across all output types
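As a rough illustration of the balanced-bracket idea, the sketch below scans a model response for the first parseable `{...}` object. This is a hypothetical helper; the real extraction lives in `src/json_utils.py` and handles more cases (for example, braces inside quoted strings):

```python
import json

def extract_first_json_object(text):
    """Return the first balanced, parseable {...} object in text, or None."""
    start = text.find("{")
    while start != -1:
        depth = 0
        for i in range(start, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    try:
                        return json.loads(text[start:i + 1])
                    except json.JSONDecodeError:
                        break  # balanced but not valid JSON; try the next brace
        start = text.find("{", start + 1)
    return None

raw = 'Sure! {"description": "A letter T icon", "tags": ["icon", "letter"]} Hope that helps.'
print(extract_first_json_object(raw))
```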
The File Analyzer uses a centralized model management system for handling AI model files. This system keeps large model files outside the git repository while providing a consistent interface for model access.
For complete details, see MODELS.md.
- Centralized storage: Models are stored in `~/.local/share/fastvlm/`
- Automatic download: Models are downloaded automatically when needed
- Multiple model sizes: Support for 0.5B (small), 1.5B (medium), and 7B (large) models
- CI integration: GitHub Actions workflows for model caching
# Set up the environment
./tools/setup_fastvlm.sh
# List available models
python tools/download_models.py list
# Download a specific model
python tools/download_models.py download --size 0.5b
# Get model information
python tools/download_models.py info --size 0.5b
The system includes a unified adapter interface for model access:
from src.fastvlm_adapter import create_adapter
# Create adapter
adapter = create_adapter(model_size="0.5b")
# Run prediction
result = adapter.predict(image_path="path/to/image.jpg",
prompt="Describe this image.",
mode="describe")
FastVLM is Apple's efficient vision language model designed specifically for Apple Silicon.
The simplest way to set up FastVLM is to use our setup script:
# Run the setup script
./tools/setup_fastvlm.sh
# This will:
# 1. Install MLX if needed
# 2. Clone the FastVLM repository
# 3. Download the 0.5B model by default
For more advanced options, see MODELS.md.
FastVLM includes several optimization features:
- Image Preprocessing: Automatically resizes and optimizes images
  - Description mode: 512x512 resolution (default)
  - Object detection: 384x384 resolution
  - Document analysis: 768x768 resolution
- Memory Optimization:
  - 4-bit quantization for efficient memory usage
  - Metal acceleration for Apple Silicon
  - Resolution customization based on analysis needs
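A minimal sketch of the per-mode resizing idea using Pillow (hypothetical helper; the resolutions are taken from the list above, and the actual preprocessing pipeline may differ):

```python
from PIL import Image

MODE_RESOLUTIONS = {
    "describe": (512, 512),   # description mode (default)
    "detect": (384, 384),     # object detection
    "document": (768, 768),   # document analysis
}

def preprocess_image(path, mode="describe"):
    """Resize an image to fit the target resolution for the given mode."""
    target = MODE_RESOLUTIONS.get(mode, (512, 512))
    with Image.open(path) as img:
        img = img.convert("RGB")
        img.thumbnail(target)  # shrinks in place, preserving aspect ratio
        return img.copy()
```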
Common issues and solutions:
- Out of Memory Errors: Use a smaller model or reduce batch size
- Slow Performance: Ensure Metal acceleration is enabled
- Model Loading Failures: Check model files with `tools/fastvlm_errors.py`
- Image Format Errors: Ensure images are in supported formats (JPG, PNG)
For more detailed error diagnostics, run:
./tools/fastvlm_errors.py
You can install the File Analysis System to your PATH using the provided install script:
# Install to ~/bin (default)
./install.sh
# Or specify a custom installation directory
./install.sh /usr/local/bin
This creates symbolic links to the tool in the specified directory. After installation, you can run the tool using:
- `analyze-files` - for the shell script wrapper
- `file-analyzer` - for the Python script directly
See INSTALL.md for detailed instructions on installing all required dependencies.
For vision analysis capabilities, additional dependencies are required:
# For FastVLM (recommended for Apple Silicon)
pip install mlx mlx-fastvlm
# For Qwen2-VL (good for document analysis)
pip install mlx-vlm
# For BakLLaVA
# Requires llama.cpp with BakLLaVA-1-Q4_K_M.gguf model
git clone https://github.com/Fuzzy-Search/realtime-bakllava
cd realtime-bakllava && make
The project includes a comprehensive test suite designed to prevent regressions:
# Run all tests
pytest
# Run with coverage reporting
pytest --cov=src
# Run specific test categories
pytest tests/test_cli_integration.py -v # CLI integration tests
pytest tests/test_fastvlm_json.py -v # Model output validation
pytest tests/test_vision_core.py -v # Vision model integration
# Run regression prevention tests
pytest tests/test_cli_integration.py::TestRegressionPrevention -v
CLI Integration Tests (`tests/test_cli_integration.py`):
- Complete user experience testing for all command patterns
- Output format validation (JSON, Markdown, text)
- Path handling (relative, absolute, tilde expansion)
- Error scenarios and edge cases
- Filename generation and tag cleaning validation
- Regression prevention for CLI argument parsing issues
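A sketch of the kind of end-to-end check these tests perform (illustrative only; it assumes `fa` is installed on PATH and that the JSON fields match the example shown earlier):

```python
import json
import subprocess

def test_json_output_has_expected_fields():
    result = subprocess.run(
        ["fa", "--json", "test_data/images/test.jpg"],
        capture_output=True, text=True, check=True,
    )
    payload = json.loads(result.stdout)
    assert "recommended_filename" in payload
    assert isinstance(payload.get("tags"), list)
```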
Model Output Tests (`tests/test_fastvlm_json_parsing.py`):
- Real captured model outputs for validation
- Token limit optimization testing (prevents JSON repetition)
- JSON repair functionality validation
- Malformed JSON handling and extraction
Core Functionality Tests:
- Vision model integration with JSON validation
- Metadata extraction and duplicate detection
- Content searching and file filtering
- Path validation and artifact discipline enforcement
- Performance metrics collection
All test scripts implement robust security measures:
- Path Validation and Clean-up:
  - Strict validation of output paths before any file operations
  - Use of the path guard pattern to prevent accidental writes to system directories
  - Safe directory cleaning with `find "$output_dir" -mindepth 1 -delete` instead of dangerous `rm -rf`
  - Multiple safety checks before performing destructive operations
- Output Directory Safety:
  - Canonical artifact paths with automatic validation
  - Directory existence checks before all operations
  - Explicit rejection of empty paths, root paths, and system directories
  - Guards against path traversal attacks
from src.analyzer import FileAnalyzer
from src.artifact_guard import get_canonical_artifact_path, PathGuard
# Get a canonical artifact path for output
output_dir = get_canonical_artifact_path("analysis", "my_analysis")
# Create analyzer with canonical output path
analyzer = FileAnalyzer("/path/to/analyze", output_dir=output_dir)
# Use PathGuard to enforce artifact discipline for all file operations
with PathGuard(output_dir):
analyzer.extract_metadata()
analyzer.find_duplicates()
results = analyzer.get_results()
This project uses a strict artifact management system to prevent file sprawl and ensure consistent output locations. See ARTIFACTS.md for details on the artifact system.
Key benefits:
- Consistent output paths with unique identifiers
- Runtime enforcement of path discipline
- Automatic manifest generation
- Centralized cleanup and management
- Protection against accidental writes to system directories
- Secure path validation to prevent path traversal vulnerabilities
When adding new capabilities to the File Analyzer system, please follow these conventions:
- Modular Design: Add new analysis types as separate methods in the `FileAnalyzer` class.
- CLI Integration: Update both `src/analyzer.py` and `tools/analyze.sh` with new options.
- Documentation: Update the README.md with descriptions and examples of the new feature.
- Tests: Add appropriate test cases in the tests directory.
- Configuration: Add relevant configuration options in config.json.
- Error Handling: Use consistent error handling with try/except blocks.
- Status Reporting: Update the `self.results` dictionary with status information.
- Output Format: For structured data, prefer JSON output with consistent fields.
- Dependencies: Document any new external dependencies in INSTALL.md.
- Performance: Use multithreading for CPU-bound operations when appropriate.
When adding new analysis types that produce JSON output, follow these conventions (a sketch follows this list):
- Include a `status` field ("success", "error", or "skipped")
- Include a `timestamp` field with the analysis time
- For ML models, include performance metrics
- Implement validation to ensure output is always valid JSON
- Use consistent field naming across different analysis types
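A sketch of what such a method might look like (hypothetical analysis type; in the real code this would be added to the existing `FileAnalyzer` class in `src/analyzer.py`):

```python
from datetime import datetime

class FileAnalyzer:  # illustrative stub; the real class lives in src/analyzer.py
    def __init__(self):
        self.results = {}

    def analyze_permissions(self):
        """Hypothetical analysis recording status, timestamp, and results."""
        entry = {"status": "success", "timestamp": datetime.now().isoformat()}
        try:
            entry["data"] = {"world_writable_files": []}  # placeholder result
        except Exception as exc:
            entry["status"] = "error"
            entry["error"] = str(exc)
        self.results["permissions"] = entry
        return entry
```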
Please adhere to the following directory structure when contributing:
- `src/`: Core Python modules and libraries
- `tools/`: Command-line tools and developer utilities
- `tests/`: Test scripts and validation harnesses
- `artifacts/`: Output directory for all generated files
All Bash scripts must source artifact_guard_py_adapter.sh and follow the canonical path discipline.
External libraries are stored in the `libs/` directory and should NEVER be modified directly. If changes are needed to library functionality, create wrapper functions instead.
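For example, rather than editing code under `libs/ml-fastvlm/`, project-specific behaviour can be layered on top of it. A sketch using the adapter interface documented above (the wrapper function name and defaults are hypothetical):

```python
from src.fastvlm_adapter import create_adapter  # adapter over libs/ml-fastvlm

def describe_image(image_path):
    """Project-level wrapper applying our default model size and prompt."""
    adapter = create_adapter(model_size="0.5b")
    return adapter.predict(image_path=image_path,
                           prompt="Describe this image.",
                           mode="describe")
```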