Releases · dssg/triage

27 Aug 05:15

shaycrk

v5.0.0

f15b86a

Dried Apricot

WARNING: BREAKING CHANGES!

Note that several changes in triage 5 break backwards compatibility with triage 4. If you are upgrading a project from an earlier version of triage, it is highly recommended that you first create a backup of your current database!

These breaking changes include:

Revision in the way the model_hash is calculated means that if you're re-running an experiment from an earlier version of triage, it will re-train your models and give them new model_ids even if the configuration hasn't changed.
The built_by_experiment column has been removed from triage_metadata.models in favor of tracking the specific run that built the model. The experiment_hash can still be obtained by joining to triage_metadata.triage_runs (née triage_metadata.experiment_runs). Should you need the data that was in this column at the time of migration, it can be found in triage_metadata.deprecated_models_built_by_experiment, but it will not be restored to the table upon database downgrade.
Changes in the structure of matrix metadata mean the matrix_hash is no longer backwards-compatible with older versions of triage (as with models, re-running an old config will result in matrices being re-created).
The random_seed column has been removed from triage_metadata.experiments in favor of tracking it at the run level as well. A database upgrade followed by a downgrade would lose this data (but it could be recovered from the runs table).
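
As a rough illustration of the join described above, the sketch below builds a toy in-memory SQLite database and looks up the experiment hash for each model through the runs table. The unqualified table names and the run_id and experiment_hash column names are placeholders chosen for this example, not the actual triage_metadata schema, so check your migrated database for the real column names before adapting it.

```python
import sqlite3

# Toy stand-in for the metadata tables. Column names here are
# hypothetical placeholders, not the real triage_metadata schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE triage_runs (
        id INTEGER PRIMARY KEY,
        experiment_hash TEXT
    );
    CREATE TABLE models (
        model_id INTEGER PRIMARY KEY,
        run_id INTEGER REFERENCES triage_runs (id)
    );
    INSERT INTO triage_runs VALUES (1, 'abc123'), (2, 'def456');
    INSERT INTO models VALUES (10, 1), (11, 1), (12, 2);
    """
)


def experiment_hash_by_model(connection):
    """Return {model_id: experiment_hash} by joining models to their run."""
    rows = connection.execute(
        """
        SELECT m.model_id, r.experiment_hash
        FROM models AS m
        JOIN triage_runs AS r ON r.id = m.run_id
        ORDER BY m.model_id
        """
    ).fetchall()
    return dict(rows)


print(experiment_hash_by_model(conn))
```

The same shape of query should carry over to PostgreSQL once the placeholder names are swapped for the real ones.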

New Functionality

Functionality for predicting forward, either with an existing model object or by retraining a new model with the most current data given a model_group_id (#631)
Utility for adding predictions to models previously trained/tested with save_predictions=False (#836)
Provisioner for easily setting up a postgresql database (via docker) that can be used with triage (#840)
More flexibility in parallelization for more resource-intensive model types, like random forests (#853)

Bug Fixes

Ensure model-level random seeds are re-used when the config and experiment-level random seed are unchanged (#848)
Remove the project path from the model_hash definition: the model_id shouldn't depend on where triage is being run (#830)
Ensure that feature groups are sorted in matrix metadata for consistency in downstream calculations (#833)
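
To show why the last two fixes matter, here is a minimal sketch of a location-independent, order-independent hash. It is not triage's actual hashing code: the config_hash and matrix_metadata_hash helpers, the project_path key, and the use of MD5 over canonical JSON are all assumptions made for illustration.

```python
import hashlib
import json


def config_hash(config, ignored_keys=("project_path",)):
    """Hash a config dict, skipping keys that only describe where it runs."""
    relevant = {k: v for k, v in config.items() if k not in ignored_keys}
    # sort_keys gives a canonical key order, so dict ordering cannot
    # change the resulting hash.
    canonical = json.dumps(relevant, sort_keys=True, default=str)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def matrix_metadata_hash(metadata):
    """Hash matrix metadata after sorting its (hypothetical) feature_groups list."""
    normalized = dict(metadata)
    # JSON lists keep their order, so sort explicitly before hashing.
    normalized["feature_groups"] = sorted(normalized.get("feature_groups", []))
    return config_hash(normalized)


local = {"class_path": "sklearn.tree.DecisionTreeClassifier",
         "parameters": {"max_depth": 3},
         "project_path": "/home/alice/project"}
remote = dict(local, project_path="s3://bucket/project")

print(config_hash(local) == config_hash(remote))
```

Under these assumptions, moving a project to a different directory or bucket leaves the hash unchanged, while any change to the model parameters produces a new one.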

Thanks To

@tweddielin, @thcrock, @ecsalomon, @kasunamare

Contributors

thcrock, kasunamare, and 2 other contributors

26 Aug 21:11

shaycrk

v4.4.0

5be45b9

AROY-D: The Second Box

Primarily a bugfix release for anyone working on triage 4. New functionality will be introduced with the 5.0 release.

Bug Fixes

Fix functionality of bias analysis using aequitas during experiment runs. Previously, the attributes for bias analysis were getting scrambled relative to the scores and labels when the latter were sorted for "best case" and "worst case" analyses, invalidating any results produced by these analyses. This release fixes the bug, ensures the same set of entities is provided for attributes and labels/scores, and adds a unit test to cover this issue. (#858)
Close database sessions during unit tests to avoid intermittent exceptions during test cleanup. (#851)
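
The class of bug fixed in #858 can be avoided by computing one ordering and applying it to every aligned sequence. The sketch below is a simplified illustration of that idea, not the code from the fix; the sort_together helper and its tie-breaking rule (positives first for the best case, negatives first for the worst case) are assumptions for this example.

```python
def sort_together(scores, labels, attributes, best_case=True):
    """Sort by descending score and reorder labels and attributes identically.

    Ties on score are broken by label: positives first when best_case is
    True, negatives first otherwise.
    """
    def key(i):
        tie_break = -labels[i] if best_case else labels[i]
        return (-scores[i], tie_break)

    # One permutation, applied to all three sequences, keeps each
    # entity's attribute attached to its own score and label.
    order = sorted(range(len(scores)), key=key)
    return (
        [scores[i] for i in order],
        [labels[i] for i in order],
        [attributes[i] for i in order],
    )


scores = [0.5, 0.9, 0.5]
labels = [0, 1, 1]
attributes = ["a", "b", "c"]

print(sort_together(scores, labels, attributes, best_case=True))
print(sort_together(scores, labels, attributes, best_case=False))
```

Sorting the scores and labels on their own and leaving the attributes in their original order would silently pair entities with the wrong group, which is the failure mode described above.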

22 Apr 18:01

shaycrk

v4.3.0

18238ea

AROY-D

New Functionality

Added connector for aequitas visualizations (#837)
Allow user-specified model grids to extend presets (#843)
Audition improvements, including baseline models and stable color schemes (#844)

Bug Fixes

Fixed building triage in docker container for dirty duck tutorial (#818, #820)
Improve audition's handling of multiple models with different random seeds (#823)

Refactoring/Documentation

Switched to github actions for CI testing (#825)
Update dependencies (#835)

09 Jul 16:05

nanounanue

v4.1.1

35db4ef

El "Patched" Paisano

Pre-release

Patched due to some inconsistencies between catwalk and the newest version of sklearn.

30 Jun 22:27

nanounanue

v4.1.0

e804cf3

El Paisano

Pre-release

What is in this release?

Now the schema is called triage_metadata instead of model_metadata (issue #700)
Replace flag now is passed to ModelTrainerTester (issue #784)
New folder structure for dirtyduck
New folder structure for triage in a docker
Fix an inconsistency in the command line option of the tutorial
Python version as columns in experiment run (issue #742)
Incorrect columns in individual_importances (issue #744)
Updated deprecated method calls (issues #734 and #754)
Long-standing issue with parsedatetime resolved (issue #721)
Several issues with dirtyduck solved (issues #750 #735 #736 #781)

Thanks to

@thcrock, @shaycrk, @adunmore, @nanounanue

14 Dec 00:12

nanounanue

v4.0.0

ea31788

Chengdu

New Functionality:

Evaluate on subsets [Resolves #535, #138] (#552)
Implement train/testing priority [Resolves #542] (#581)
Introduce experiment_runs table, beef up experiments table (#637)
Dirty duck (the whole enchilada) (#670)
Add compute best/worst/stochastic for each evaluation [Resolves #292] (#674)
Insert Ranks for Predictions [Resolves #357] (#671)
Support Python 3.7 [Resolves #683] (#684)
Bias Part 1: Protected groups generator (#680)
Bias part 2 (#688)
Added DummyClassifier to the SimpleClassifiers batch (#702)
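
For readers unfamiliar with best/worst-case evaluation, the sketch below shows one way such bounds can arise: when several entities share a score, the order chosen among them changes a top-k metric. This is an illustration under assumed semantics (tie-breaking by label), not the implementation from #674, and the precision_at_k helper is hypothetical.

```python
def precision_at_k(scores, labels, k, best_case=True):
    """Precision among the top k by score, with ties broken by label.

    best_case=True ranks positives ahead of negatives within a tie;
    best_case=False ranks negatives ahead of positives.
    """
    def key(i):
        tie_break = -labels[i] if best_case else labels[i]
        return (-scores[i], tie_break)

    top_k = sorted(range(len(scores)), key=key)[:k]
    return sum(labels[i] for i in top_k) / k


scores = [0.9, 0.5, 0.5, 0.1]
labels = [1, 1, 0, 0]

print(precision_at_k(scores, labels, k=2, best_case=True))   # 1.0
print(precision_at_k(scores, labels, k=2, best_case=False))  # 0.5
```

A stochastic variant would break ties at random, which should land somewhere between these two bounds.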

Bug Fixes:

config is a str, not a fd (#610)
Keep PyYAML pinned as v5 breaks our usage (#615)
Fix cohort in unit tests, remove old code, squash some warnings (#621)
Fix logging of which matrix was saved (#623)
Harden postmodeling against lack of predictions [Resolves #638] (#645)
Validate distinct feature group prefixes (#634)
fix imports in example postmodeling notebook (#646)
Fixed Audition's docs (#665)
MS Triage (#666)
Fixed broken links (#675)
Fix Travis deploy [Resolves #493] (#677)
Fix logging typos that only show up when splits are empty (#685)
Fixes Postmodeling Weird Error [Resolves #691] (#693)
Don't auto-upgrade db for new Experiments [Resolves #695] (#698)
Check for capital letters with validator [Closes #632] (#701)
check for empty protected_df (#709)
Fixing dirtyduck (#720)
Update MANIFEST.in (#723)

General Improvements:

Read database connections from process environment (#605)
Scheduled monthly dependency update for March (#619)
Use compressed CSVs [Resolves #498] (#626)
Faster train/test task generation (#628)
Remove support for entity-only matrix indices [Resolves #477] (#622)
Enable dburl env var in results_schema CLI [Resolves #636] (#639)
Run validation by default [Closes #635] (#642)
Add feature_importance metric to SLR (#587)
Scheduled monthly dependency update for April (#664)
Remove redundant imputation flag columns [Resolves #544] (#676)
write 5+ GiB (matrices) to S3Store (#687)
Add more user database management options to CLI [Resolves #697] (#699)
Scheduled monthly dependency update for May (#679)
Kit and adolfos amazing adventure (aka experiment config defaults) Closes #717 (#719)

Refactoring/Documentation:

Broaden test coverage to CLI and postmodeling (#618)
Update model_group_performance.py (#650)
Upgrade ohio (#678)
Remove site dir (#686)
Bump experiment to v7 (#689)
Config doc (#694)
Repo readme (#682)
Added QuickStart guide to documentation

20 Feb 00:28

thcrock

v3.3.0

7185327

Arepa

New functionality:

Postmodel Analysis (#482)
Stores Timechop image to disk (#590)
Add matrix uuid to evaluations tables [Resolves #591] (#593)
Experiment Profiling [Resolves #557] (#558)

Bug fixes:

Postmodel fixes (#604)
Fixes #598 (#600)
Series equality operator [Resolves #563] (#564)
Fix MatrixStore memory leak [Resolves #594]
Fix empty/columns check on HDFStore [Resolves #589] (#592)
Fix upgrade_db to use filehandle [Resolves #572]
Fix FromObj.maybe_materialize [Resolves #565] (#566)
support 5 GB multipart upload threshold via S3Fs (#546)

General Improvements:

Scheduled monthly dependency update for February (#588)
Namespace cohort and labels tables by their config [Resolves #574] (#576)
Only Build Features for Cohort [Resolves #513] (#567)
Colocate Testing with Training [Resolves #560] (#569)
Upgrade PyYAML to current security-patched release
Skip Prediction Saving [Resolves #559]
Scheduled monthly dependency update for January (#562)
Materialize Subquery From Objects [Resolves #554] (#555)
Skip already-evaluated models [Resolves #540] (#541)
Throw warning if unscaled logit is used [Resolves #508] (#548)
support in develop script for detection of pyenv installed via Homebrew
upgrade install-cli to better support non-GNU (MacOS)
Cohort Generation respects replace flag [Resolves #503]

Refactoring/Documentation:

Add Audition, Postmodeling, Dirty Duck references to docs (#599)
audition_config file
Audition config correct (#601)
Experiment Architecture Doc [Resolves #579] (#580)
docs: make proper list of experiment upgrading links
Cohort and Label Deep Dive [Resolves #492] (#577)
Disable individual importance in example experiment config (#568)
Tweak language in running document

10 Dec 16:49

thcrock

v3.2.0

00b52be

Flaming Hot Cheeto

New functionality:

Add additional feature group CV strategy (all-combinations) (#518)
Downcasting feature tables (#510)
Label Generation Replace Flag [Resolves #499]
Audition model group filter [Resolves #494] (#495)
development environment wizard (#511)

Bug Fixes:

Fix db engine check in Experiment [Resolves #538] (#539)
Allow >5GB matrices with S3 [Resolves #530]
refined test query to avoid unwarranted failure
Prevent experiment hanging when worker is killed by OS [Resolves #501] (#506)
develop script should install triage with the rq extra (#521)

General Improvements:

Scheduled monthly dependency update for December (#526)
Shorten log lines [Resolves #528]
Verbose config check (#483)
added pytest fixtures to simplify and clean up (architect) tests (#522)
Add HDF5 to CLI and doc [Resolves #496] (#497)

Refactoring/Documentation:

Move example yaml configuration files into subdirectory (#520)
Fix links in results_schema readme [Resolves #524] (#525)
Refactoring: Remove cohort options besides query [Resolves #504]

02 Nov 19:55

thcrock

v3.1.1

b22c3d5

Tim Tam

Flip featuretest CLI arguments to match the doc [Resolves #486] (#487)
Scheduled monthly dependency update for November (#485)
Associate Experiment with all models and matrices [Resolves #411] (#476)
Clean up Session Closing in Predictor [Resolves #478]
Downcast matrices [Resolves #372]
Update Contribution Guide [Resolves #425]
Initial run of Black for code formatting

05 Oct 16:45

thcrock

v3.0.0

a553361

Introducing the CLI

Many changes here, largely related to introducing the Command Line Interface ported from DirtyDuck

Refactor functionality to ExperimentBase class [Resolves #400]
Feature testing functionality [Resolves #420]
Storage refactoring, experiment CLI [Resolves #424]
Timechop visualization CLI [Resolves #437]
