Feature/samples for fao by Polichinel · Pull Request #39 · views-platform/views-models

Polichinel · 2026-03-17T16:22:13Z

No description provided.

Companion change to views-baseline PR #3. Standardizes the config key for trailing training window from "months" to "window_months" and adds the explicit "time_steps" key (forecast horizon, previously a hidden default of 36 in views-baseline). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
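As a minimal sketch of the key rename described in this commit, assuming hyperparameters are held in a plain dict: the helper name `migrate_window_keys` and the sample config are hypothetical, and the fallback of 36 is the previously hidden views-baseline default mentioned above.

```python
def migrate_window_keys(config: dict, default_time_steps: int = 36) -> dict:
    """Return a copy with 'months' renamed to 'window_months' and 'time_steps' made explicit."""
    migrated = dict(config)  # never mutate the caller's dict
    if "months" in migrated:
        if "window_months" in migrated:
            raise ValueError("config defines both 'months' and 'window_months'")
        migrated["window_months"] = migrated.pop("months")
    # Make the forecast horizon explicit instead of relying on a hidden default.
    migrated.setdefault("time_steps", default_time_steps)
    return migrated


# Hypothetical old-style config, for illustration only.
old = {"algorithm": "example_baseline", "months": 18}
new = migrate_window_keys(old)
```

Raising when both keys are present avoids silently picking one window length over the other.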

Add explicit `prediction_format: "dataframe"` to 54 models that previously relied on the implicit default. Prerequisite for making prediction_format a required key in pipeline-core (Phase 3A). Also includes baseline config updates and calibration log refreshes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…tectural risks

- Add mandatory config keys (time_steps, rolling_origin_stride) to 52+ models
- Replace exec() with importlib.util in create_catalogs.py for safer config loading
- Migrate 13 old-pattern models to ForecastingModelArgs CLI API
- Fix heat_waves/hot_stream forecasting offset bug (-2 → -1)
- Add comprehensive test suite (2029 tests):
  - Config completeness, structure, CLI pattern, partition consistency
  - Ensemble config validation and dependency checking
  - Red-team failure injection tests
- Add base_docs governance: 9 ADRs, 3 CICs, contributor protocols, standards
- Remove dead code from create_catalogs.py and purple_alien/main.py

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
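The exec() to importlib.util change can be sketched roughly as follows. This is not the code from create_catalogs.py: the helper name `load_config_module`, the demo file, and the `get_meta_config` function inside it are assumptions made for illustration. Note that the file's top-level code still runs; the gain is that it runs inside its own module namespace rather than in the caller's.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_config_module(path):
    """Load a Python config file as an isolated module object."""
    path = Path(path)
    spec = importlib.util.spec_from_file_location(path.stem, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load config from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file inside the new module's namespace
    return module


# Demo with a throwaway config file (contents are hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "config_meta.py"
    cfg_path.write_text(
        "def get_meta_config():\n"
        "    return {'name': 'demo_model', 'level': 'cm'}\n"
    )
    meta = load_config_module(cfg_path).get_meta_config()
```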

Remove files superseded by the test suite or no longer referenced:

- reports/archived/ (15 old training/sweep logs from Feb 2025)
- verify_architecture.py (one-off NBEATS debug investigation)
- compare_configs.py (hardcoded 3-model check, replaced by test suite)

Also add .ruff_cache/ to .gitignore and update ADR-001 ontology.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Remove unused imports (sys, re, pytest, REPO_ROOT, ALL_MODEL_DIRS) and rename ambiguous variable `l` to `line` in test_catalogs.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Shell script that trains and evaluates each model on calibration and validation partitions, logging results per model without aborting on failure. Supports --models, --partitions, and --timeout flags. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Models without requirements.txt fail to install packages when run.sh creates their conda environment. Generated from main.py imports:

- 15 models: views-r2darts2>=1.0.0,<2.0.0
- 12 models: views-baseline>=1.0.0,<2.0.0
- 6 models: views-stepshifter>=1.0.0,<2.0.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
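One way to derive such pins from a main.py is to parse its imports with the standard `ast` module. The sketch below is an assumption about how this could be done, not the script that generated the files: the helper name, the import-name-to-package mapping for views_baseline and views_stepshifter, and the sample source are all hypothetical.

```python
import ast

PIN = ">=1.0.0,<2.0.0"
# Import name -> distribution name (mapping assumed for illustration).
KNOWN_PACKAGES = {
    "views_r2darts2": "views-r2darts2",
    "views_baseline": "views-baseline",
    "views_stepshifter": "views-stepshifter",
}


def requirements_from_source(source: str) -> list[str]:
    """Return pinned requirement lines for known packages imported in source."""
    roots = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            roots.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            roots.add(node.module.split(".")[0])
    return sorted(f"{KNOWN_PACKAGES[r]}{PIN}" for r in roots if r in KNOWN_PACKAGES)


# Hypothetical main.py contents; the source is parsed, never imported.
example_main = "import warnings\nfrom views_stepshifter.manager import Manager\n"
reqs = requirements_from_source(example_main)
```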

Simplify integration test runner to activate a single conda env (default: views_pipeline) instead of trying to use per-model envs via run.sh. Adds --env and --exclude flags. Excludes purple_alien by default (needs views-hydranet env). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…oint_metrics

CoreConfigSniffer in views-pipeline-core v2.2.0 requires the new key names. Updates 44 models from the old generic keys to the type-specific format. Also removes CRPS from regression_point_metrics (not meaningful for point estimates).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The views-r2darts2 ReproducibilityGate requires core parameters (random_state, optimizer_cls, lr_scheduler_*, early_stopping_min_delta, gradient_clip_val, output_chunk_length/shift) that these models were missing. Values taken from working reference models of the same arch. Fixed: heat_waves, good_life, elastic_heart, teenage_dirtbag, dancing_queen, cool_cat. Also cleaned up 60 lines of commented-out sweep results in good_life. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The ReproducibilityGate checks both core and architecture-specific parameters. After fixing core params, 4 models still had missing arch-specific params:

- cool_cat (TiDEModel): use_reversible_instance_norm
- good_life (TransformerModel): d_model, nhead, dim_feedforward, activation, norm_type, use_reversible_instance_norm, detect_anomaly
- heat_waves (TFTModel): dropout, add_relative_index, use_static_covariates, norm_type, skip_interpolation, hidden_continuous_size
- elastic_heart (TSMixerModel): use_static_covariates, use_reversible_instance_norm

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
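The kind of check involved can be illustrated with a set difference. This is a sketch only and does not reproduce the ReproducibilityGate API: `missing_params` is a hypothetical helper, and the parameter sets below are small subsets taken from the names in these commit messages, not the full requirement lists.

```python
# Illustrative subsets only; the canonical lists live in views-r2darts2.
CORE_PARAMS = {"random_state", "optimizer_cls", "early_stopping_min_delta", "gradient_clip_val"}
ARCH_PARAMS = {
    "TiDEModel": {"use_reversible_instance_norm"},
    "TSMixerModel": {"use_static_covariates", "use_reversible_instance_norm"},
}


def missing_params(algorithm: str, hyperparameters: dict) -> list[str]:
    """Return the required parameter names absent from a model's hyperparameters."""
    required = CORE_PARAMS | ARCH_PARAMS.get(algorithm, set())
    return sorted(required - set(hyperparameters))


# Hypothetical config with all core params but no architecture-specific ones.
config = {
    "random_state": 1,
    "optimizer_cls": "Adam",
    "early_stopping_min_delta": 0.001,
    "gradient_clip_val": 1.0,
}
gaps = missing_params("TiDEModel", config)
```

With this config the core check passes while the architecture-specific check still reports a gap, which mirrors the two-stage fix described in the commits.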

- Fix leading space in bittersweet_symphony queryset name
- Rename targets→regression_targets in all 5 ensembles (wrap strings in lists)
- Rename metrics→regression_point_metrics in all 5 ensembles (remove CRPS; not meaningful for point estimates)
- Update ensemble test to expect new key names

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
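The two renames above amount to a small dict transformation. The sketch below assumes the ensemble config is a plain dict; the helper name `migrate_ensemble_keys`, the sample ensemble, and the metric name RMSLE are hypothetical placeholders.

```python
def migrate_ensemble_keys(config: dict) -> dict:
    """Rename deprecated 'targets'/'metrics' keys to their regression-specific names."""
    migrated = dict(config)
    if "targets" in migrated:
        targets = migrated.pop("targets")
        # A bare string becomes a one-element list.
        migrated["regression_targets"] = [targets] if isinstance(targets, str) else list(targets)
    if "metrics" in migrated:
        metrics = migrated.pop("metrics")
        # CRPS is dropped because it is not meaningful for point estimates.
        migrated["regression_point_metrics"] = [m for m in metrics if m != "CRPS"]
    return migrated


# Hypothetical old-style ensemble config.
old_ensemble = {"name": "demo_ensemble", "targets": "example_target", "metrics": ["RMSLE", "CRPS"]}
new_ensemble = migrate_ensemble_keys(old_ensemble)
```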

…atchpad

Align documentation structure with platform convention (views-r2darts2, views-hydranet pattern):

- Move ADRs, CICs, contributor_protocols, standards from base_docs/ to docs/
- Delete docs/internal/ (leftover scratchpad from prior session)
- Delete docs/model_catalog_old_pipeline.md (superseded by README catalogs)
- Create reports/ directory for future operational outputs
- Update all internal cross-references from base_docs/ to docs/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1. ReproducibilityGate param validation for all 15 darts models: checks 16 core params + architecture-specific params per algorithm (TFT: 15, Transformer: 13, TiDE: 14, TSMixer: 12, TCN: 8, BlockRNN: 8, NBEATS: 10). Prevents the exact bug fixed in 6 models.
2. Old-key regression tests: verify no model or ensemble uses the deprecated "targets" or "metrics" keys (must be "regression_targets" and "regression_point_metrics").
3. Ensemble-model level agreement: verify all constituent models in an ensemble have the same level (cm/pgm) as the ensemble itself.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
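The level-agreement check in item 3 could look roughly like this. It is a sketch under assumptions: the helper name, the "models" key listing an ensemble's constituents, and the model names are hypothetical; only the "level" values cm/pgm come from the commit message.

```python
def level_mismatches(ensemble_meta: dict, model_metas: dict) -> list[str]:
    """Return constituent models whose level differs from the ensemble's level."""
    expected = ensemble_meta["level"]
    return sorted(
        name for name in ensemble_meta["models"]
        if model_metas[name]["level"] != expected
    )


# Hypothetical metadata for one ensemble and its constituents.
ensemble = {"name": "demo_ensemble", "level": "cm", "models": ["model_a", "model_b"]}
models = {"model_a": {"level": "cm"}, "model_b": {"level": "pgm"}}
bad = level_mismatches(ensemble, models)
```

A test built on this would assert that the returned list is empty for every ensemble.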

Replace hardcoded DARTS_CORE_PARAMS and ARCH_PARAMS with dynamic import from views_r2darts2.infrastructure.reproducibility_gate. This eliminates the DRY violation — param requirements are now sourced from the canonical definition. Tests skip when views_r2darts2 is not installed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
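The skip-when-missing behaviour can be sketched with the standard library alone. The repository's tests use pytest; `unittest` is swapped in here only to keep the example dependency-free, and the test class and method names are hypothetical. The module path is the one named in the commit message.

```python
import importlib
import importlib.util
import unittest

# Probe for the optional dependency without importing it.
HAS_R2DARTS2 = importlib.util.find_spec("views_r2darts2") is not None


@unittest.skipUnless(HAS_R2DARTS2, "views_r2darts2 is not installed")
class TestGateParams(unittest.TestCase):
    def test_gate_module_importable(self):
        # Imported lazily so that collecting the test never fails on a missing package.
        gate = importlib.import_module(
            "views_r2darts2.infrastructure.reproducibility_gate"
        )
        self.assertIsNotNone(gate)
```

Sourcing the requirements from the installed package keeps the tests in step with the canonical definition, at the cost of the check being skipped in environments without views_r2darts2.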

The common/ directory was created then reverted — it no longer exists. Remove the Shared Infrastructure row from the ontology table. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The common/partitions.py centralization was attempted then reverted, but governance docs still referenced it. Updated all ADRs, protocols, and checklist to reflect the actual architecture: per-model self-contained partition files with test-enforced consistency. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Complete the 6 ranger baseline models (black, blue, green, pink, red, yellow) by committing their config_deployment, config_partitions, config_queryset, config_sweep, main.py, and run.sh files. Only config_meta, config_hyperparameters, and requirements.txt were previously committed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Allows running integration tests on only CM or PGM models:

    bash run_integration_tests.sh --level cm
    bash run_integration_tests.sh --level pgm

Extracts level from each model's config_meta.py via Python.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add docs/run_integration_tests.md with prerequisites, all flags, internal mechanics, log structure, exit codes, and caveats. Add Integration Testing section to README with quick-start examples and options table. Add doc pointer to top of run_integration_tests.sh. Uncomment *.txt in .gitignore to stop tracking model run log artifacts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

… package

Allows running integration tests for a single model library at a time (baseline, stepshifter, r2darts2, hydranet) by matching views-<name> in each model's requirements.txt. Combinable with --level. Also fixes exit-code precedence bug where failures without timeouts exited 0, and untracks 3 data log .txt files already covered by .gitignore.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
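The runner itself is a shell script; the selection logic it describes can be sketched in Python as follows. The function names, the in-memory `models` mapping, and the model names are hypothetical, and the regular expression is one possible way to match `views-<name>` at the start of a requirements line.

```python
import re


def uses_library(requirements_text: str, library: str) -> bool:
    """True if any requirements line starts with views-<library>."""
    pattern = re.compile(rf"^\s*views-{re.escape(library)}(?![\w-])", re.MULTILINE)
    return bool(pattern.search(requirements_text))


def select_models(models: dict, level=None, library=None) -> list[str]:
    """Filter model names by level and/or library; both filters can be combined."""
    selected = []
    for name, info in sorted(models.items()):
        if level is not None and info["level"] != level:
            continue
        if library is not None and not uses_library(info["requirements"], library):
            continue
        selected.append(name)
    return selected


# Hypothetical models with their level and requirements.txt contents.
models = {
    "model_a": {"level": "cm", "requirements": "views-baseline>=1.0.0,<2.0.0\n"},
    "model_b": {"level": "pgm", "requirements": "views-baseline>=1.0.0,<2.0.0\n"},
    "model_c": {"level": "cm", "requirements": "views-stepshifter>=1.0.0,<2.0.0\n"},
}
cm_baseline = select_models(models, level="cm", library="baseline")
```

The negative lookahead keeps a name such as `views-baseline` from matching a longer, differently named package that merely shares the prefix.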

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Runs the structural test suite (~2200 tests, <20s) on every push to main/development and on all PRs. Requires only views_pipeline_core. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Polichinel and others added 27 commits February 24, 2026 23:56

feat(purple_alien): align config_sweep with operational hyperparameters

f3c99b5

fix(purple_alien): add missing 'name' key to sweep_config

554f61a

fix(tests): resolve ruff lint errors in test suite

773c9f2

fix(docs): remove stale common/ reference from ADR-001

b580043

fix(docs): add --library flag to README integration testing table

d3e6805

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fix(configs): remove duplicate import in blank_space config_queryset

c846ca2

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fix(docs): remove hardcoded model count from integration test guide

b8709d8

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

feat(ci): add pytest workflow for push and PR gates

231a3ae

Polichinel merged commit f178d07 into development Mar 17, 2026
1 check passed

Polichinel deleted the feature/samples_for_fao branch March 17, 2026 19:23

Polichinel merged 27 commits into development from feature/samples_for_fao
