cube-js · paco-valdez · Oct 22, 2025
diff --git a/.gitignore b/.gitignore
@@ -160,3 +160,5 @@ cython_debug/
 #  and can be added to the global gitignore or merged into this file.  For a more nuclear
 #  option (not recommended) you can uncomment the following to ignore the entire idea folder.
 #.idea/
+.DS_Store
+
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -0,0 +1,136 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+**cube_dbt** is a Python package that converts dbt models and columns into Cube semantic layer definitions. It parses dbt manifest files and provides Jinja-compatible YAML output for integrating data models with Cube's semantic layer.
+
+## Common Development Commands
+
+```bash
+# Testing
+pdm run test                 # Run all tests (34 unit tests)
+pytest tests/ -v             # Run tests with verbose output
+pytest tests/test_dbt.py     # Run specific test file
+pytest -k "test_model"       # Run tests matching pattern
+
+# Development Setup
+pdm install                  # Install project with dev dependencies
+pdm install --prod           # Install production dependencies only
+pdm lock                     # Update pdm.lock file
+pdm update                   # Update all dependencies
+
+# Building & Publishing
+pdm build                    # Build distribution packages
+pdm publish                  # Publish to PyPI (requires credentials)
+
+# Development Workflow
+pdm run python -m cube_dbt   # Run the module directly
+python -c "from cube_dbt import Dbt; print(Dbt.version())"  # Check version
+```
+
+## High-Level Architecture
+
+The package is built around four core components, one per module, that work together:
+
+### Core Classes
+
+**Dbt (src/cube_dbt/dbt.py)**
+- Entry point for loading dbt manifest files
+- Supports file paths and URLs via `from_file()` and `from_url()` class methods
+- Implements chainable filtering API: `filter(paths=[], tags=[], names=[])`
+- Lazy initialization - models are only loaded when accessed
+- Handles manifest v1-v12 formats
+
+**Model (src/cube_dbt/model.py)**
+- Represents a single dbt model from the manifest
+- Key method: `as_cube()` - exports model as Cube-compatible YAML
+- Supports multiple primary keys via column tags
+- Provides access to columns, description, database, schema, and alias
+- Handles special characters in model names (spaces, dots, dashes)
+
+**Column (src/cube_dbt/column.py)**
+- Represents dbt columns with comprehensive type mapping
+- Maps 130+ database-specific types to 5 Cube dimension types:
+  - string, number, time, boolean, geo
+- Database support: BigQuery, Snowflake, Redshift, generic SQL
+- Primary key detection via `primary_key` tag in column metadata
+- Raises RuntimeError for unknown column types (fail-fast approach)
+
+**Dump (src/cube_dbt/dump.py)**
+- Custom YAML serialization utilities
+- Returns Jinja SafeString for template compatibility
+- Handles proper indentation for nested structures
+- Used internally by Model.as_cube() for output formatting
+
+### Key Design Patterns
+
+1. **Lazy Loading**: Models are loaded only when first accessed via `dbt.models` property
+2. **Fluent Interface**: Filter methods return `self` so calls can be chained: `dbt.filter(tags=['tag1']).filter(paths=['path1'])`
+3. **Factory Methods**: `Dbt.from_file()` and `Dbt.from_url()` for different data sources
+4. **Type Mapping Strategy**: Centralized database type to Cube type conversion in Column class
+
+### Data Flow
+
+```
+manifest.json → Dbt.from_file() → filter() → models → Model.as_cube() → YAML output
+                                                ↓
+                                            columns → Column.dimension_type()
+```
+
+## Testing Structure
+
+Tests use a real dbt manifest fixture (tests/manifest.json, ~397KB) with example models:
+
+- **test_dbt.py**: Tests manifest loading, filtering by paths/tags/names, version checking
+- **test_model.py**: Tests YAML export, primary key handling, special character escaping
+- **test_column.py**: Tests type mapping for different databases, primary key detection
+- **test_dump.py**: Tests YAML formatting and Jinja compatibility
+
+Run specific test scenarios:
+```bash
+pytest tests/test_column.py::TestColumn::test_bigquery_types -v
+pytest tests/test_model.py::TestModel::test_multiple_primary_keys -v
+```
+
+## Important Implementation Details
+
+### Primary Key Configuration
+Primary keys are defined using tags in dbt column metadata:
+```yaml
+# In dbt schema.yml
+columns:
+  - name: id
+    meta:
+      tags: ['primary_key']
+```
+
+### Type Mapping Behavior
+- Unknown types raise RuntimeError immediately (fail-fast)
+- Database-specific types are checked first, then generic SQL types
+- Default mappings can be found in `src/cube_dbt/column.py` TYPE_MAP dictionaries
+
+### Jinja Template Integration
+Output from `as_cube()` is returned as a Jinja SafeString, so templates can render it directly without escaping it a second time. The `safe` filter is only needed for strings that are not already marked safe.
+
+### URL Loading Authentication
+When using `Dbt.from_url()`, basic authentication is supported:
+```python
+dbt = Dbt.from_url("https://user:pass@example.com/manifest.json")
+```
+
+## Recent Changes (from git history)
+
+- Multiple primary key support (#15)
+- Documentation of package properties (#14)
+- Extended dbt contract data type support (#10)
+- Jinja escaping protection for as_cube() (#2)
+
+## Package Metadata
+
+- **Version**: Defined in `src/cube_dbt/__init__.py`
+- **Python Requirement**: >= 3.8
+- **Production Dependency**: PyYAML >= 6.0.1
+- **License**: MIT
+- **Build System**: PDM with PEP 517/518 compliance
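
Review note on the "Type Mapping Behavior" section of CLAUDE.md above: the lookup order it describes (database-specific types first, then generic SQL types, then a `RuntimeError`) can be illustrated with a minimal sketch. The dictionaries `DATABASE_TYPES` and `GENERIC_TYPES` and the function `resolve_dimension_type` are hypothetical names made up for this illustration; the actual mappings live in `src/cube_dbt/column.py` and may be organized differently.

```python
# Hypothetical lookup tables for illustration only; not the contents of column.py.
DATABASE_TYPES = {
    "bigquery": {"int64": "number", "timestamp": "time", "geography": "geo"},
    "snowflake": {"number": "number", "timestamp_ntz": "time"},
}
GENERIC_TYPES = {
    "varchar": "string",
    "integer": "number",
    "boolean": "boolean",
    "date": "time",
}


def resolve_dimension_type(data_type, database=""):
    """Resolve a column data type to a Cube dimension type (illustrative sketch)."""
    key = data_type.strip().lower()
    # Database-specific names are checked first.
    specific = DATABASE_TYPES.get(database.lower(), {})
    if key in specific:
        return specific[key]
    # Generic SQL names are the fallback.
    if key in GENERIC_TYPES:
        return GENERIC_TYPES[key]
    # Fail fast: an unknown type is an error, not a silent default.
    raise RuntimeError(f"Unknown column type: {data_type!r}")
```

With these assumed tables, `resolve_dimension_type("INT64", "bigquery")` resolves through the database-specific table, `resolve_dimension_type("VARCHAR", "bigquery")` falls through to the generic table, and an unlisted type raises `RuntimeError`.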
diff --git a/QUICK_REFERENCE.md b/QUICK_REFERENCE.md
@@ -0,0 +1,90 @@
+# cube_dbt Quick Reference
+
+## What is cube_dbt?
+A Python package that converts dbt models and columns into Cube semantic layer definitions. It parses dbt manifests and provides Jinja-compatible YAML output.
+
+## Install & Run Tests
+```bash
+pdm install              # Set up environment
+pdm run test             # Run all tests
+```
+
+## Basic Usage
+```python
+from cube_dbt import Dbt
+
+# Load and filter
+dbt = Dbt.from_file('manifest.json').filter(
+    paths=['marts/'],
+    tags=['cube'],
+    names=['model_name']
+)
+
+# Access models
+model = dbt.model('my_model')
+print(model.name)
+print(model.sql_table)
+print(model.columns)
+
+# Export to Cube (YAML)
+print(model.as_cube())
+print(model.as_dimensions())
+```
+
+## Project Structure
+```
+src/cube_dbt/
+├── dbt.py         - Dbt class (manifest loading & filtering)
+├── model.py       - Model class (cube export)
+├── column.py      - Column class (type mapping)
+├── dump.py        - YAML utilities (Jinja-safe)
+└── __init__.py    - Public exports
+
+tests/             - 34 unit tests, all passing
+```
+
+## Key Classes
+
+### Dbt
+- `from_file(path)` - Load a manifest from a local JSON file
+- `from_url(url)` - Load from remote URL
+- `filter(paths=[], tags=[], names=[])` - Chainable filtering
+- `.models` - Get all filtered models
+- `.model(name)` - Get single model
+
+### Model
+- `.name`, `.description`, `.sql_table` - Properties
+- `.columns` - List of Column objects
+- `.primary_key` - List of primary key columns
+- `.as_cube()` - Export as Cube definition (YAML)
+- `.as_dimensions()` - Export dimensions (YAML)
+
+### Column
+- `.name`, `.description`, `.type`, `.meta` - Properties
+- `.primary_key` - Boolean
+- `.as_dimension()` - Export dimension (YAML)
+
+Type mapping: BigQuery, Snowflake, Redshift → Cube types (number, string, time, boolean, geo)
+
+## Dependencies
+- Production: PyYAML >= 6.0.1, orjson >= 3.10.15
+  - Note: orjson is used for fast JSON parsing. If it is unavailable, the package may fall back to the standard-library `json` module.
+- Development: pytest >= 7.4.2
+- Python: >= 3.8
+
+## Common Tasks
+| Task | Command |
+|------|---------|
+| Run tests | `pdm run test` |
+| Run specific test | `pytest tests/test_dbt.py -v` |
+| Install deps | `pdm install` |
+| Lock deps | `pdm lock` |
+| Build package | `pdm build` |
+
+## Recent Changes
+- v0.6.2: Multiple primary key support
+- Data type support for dbt contracts
+- Jinja-safe template rendering
+
+## Publishing
+GitHub Actions auto-publishes to PyPI on release.
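
Review note on the chainable `filter()` and the `.models` property listed under "Key Classes" in QUICK_REFERENCE.md: a minimal sketch of how chaining and lazy loading can fit together. The class `LazyManifest`, its attributes, and its filter semantics (a model matches if it carries any requested tag and, when names are given, one of the requested names) are assumptions for illustration, not the cube_dbt implementation.

```python
class LazyManifest:
    """Hypothetical stand-in for a manifest wrapper; not the cube_dbt Dbt class."""

    def __init__(self, manifest):
        self._manifest = manifest
        self._tags = []
        self._names = []
        self._models = None  # built on first access to .models

    def filter(self, tags=(), names=()):
        # Record the criteria and drop any cached result.
        self._tags.extend(tags)
        self._names.extend(names)
        self._models = None
        return self  # returning self is what makes calls chainable

    @property
    def models(self):
        # Lazy loading: the manifest is only scanned when models are requested.
        if self._models is None:
            self._models = [
                node
                for node in self._manifest["nodes"].values()
                if node.get("resource_type") == "model"
                and (not self._tags or any(t in node.get("tags", []) for t in self._tags))
                and (not self._names or node["name"] in self._names)
            ]
        return self._models
```

Because `filter()` only records criteria, chaining several calls costs nothing until `.models` is read, and the result is cached until the next `filter()` call.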