Trava is a framework that helps you train models, compare them, and track parameters & metrics along the way. It works with tabular data only.
pip install trava
While working on a project, we often experiment with different models while looking at the same metrics. Some of those metrics can be represented as a single number, while others require graphs to make sense. It's also useful to save the metrics somewhere for future analysis, and the list goes on.
# in this case, sk and sk_proba are just wrappers around sklearn's metrics:
# sk_proba is for metrics computed on predicted probabilities,
# sk is for metrics computed on predicted labels.
# You can use any metric implementation you want.
# (sk, sk_proba, TravaSV, LoggerHandler, SplitResult etc. come from trava;
# see the guide notebook for the exact import paths.)
from sklearn.metrics import log_loss, roc_auc_score, recall_score, precision_score

scorers = [
    sk_proba(log_loss),
    sk_proba(roc_auc_score),
    sk(recall_score),
    sk(precision_score),
]
# let's log the metrics
logger_handler = LoggerHandler(scorers=scorers)
trava = TravaSV(results_handlers=[logger_handler])
# prepare your data
X_train, X_test, y_train, y_test = ...
split_result = SplitResult(X_train=X_train,
                           y_train=y_train,
                           X_test=X_test,
                           y_test=y_test)
from sklearn.naive_bayes import GaussianNB

trava.fit_predict(raw_split_data=split_result,
                  model_type=GaussianNB,  # we pass the model class and parameters separately
                  model_init_params={},   # to be able to track them properly
                  model_id='gnb')         # just a unique identifier for this model
The fit_predict call does roughly the same as:
gnb = GaussianNB()
gnb.fit(split_result.X_train, split_result.y_train)
gnb.predict(split_result.X_test)
But now you don't need to care how the metrics you declared are calculated. You just get them in your console! By the way, those metrics definitely need to be improved. :]
Model evaluation nb
* Results for gnb model *
Train metrics:
log_loss:
16.755867191506482
roc_auc_score:
0.7746522424770221
recall_score:
0.10468384074941452
precision_score:
0.9122448979591836
Test metrics:
log_loss:
16.94514025416013
roc_auc_score:
0.829444814485889
recall_score:
0.026041666666666668
precision_score:
0.7692307692307693
After training multiple models, you can get all metrics for all of them by calling:
trava.results
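For instance, a minimal sketch that reuses the trava instance and split_result from above (LogisticRegression is just an arbitrary second model picked for illustration):

from sklearn.linear_model import LogisticRegression

# train a second model on the same data, with the same scorers
trava.fit_predict(raw_split_data=split_result,
                  model_type=LogisticRegression,
                  model_init_params={'max_iter': 1000},
                  model_id='logreg')

# metrics for both 'gnb' and 'logreg' are now available
all_results = trava.results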
Get the full picture and more examples by going through the guide notebook!
Out of the box, the following results handlers are available (see the sketch after this list for how to combine them):
- LoggerHandler - logs metrics
- PlotHandler - plots metrics
- MetricsDictHandler - returns all metrics wrapped in a dict
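You can pass several handlers at once. A minimal sketch, assuming PlotHandler and MetricsDictHandler accept scorers the same way LoggerHandler does (check the guide notebook for the exact signatures):

# every handler computes the metrics it was given
trava = TravaSV(results_handlers=[
    LoggerHandler(scorers=scorers),
    PlotHandler(scorers=scorers),
    MetricsDictHandler(scorers=scorers),
])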
Experiment tracking is a must in data science, so you shouldn't neglect it. You can integrate any tracking framework with Trava! Trava comes with an MLflow tracker ready to go. It can automatically track:
- model's parameters
- any metric
- plots
- serialized models
# get a tracker instance
tracker = MLFlowTracker(scorers=scorers)
# initialize Trava with it
trava = TravaSV(tracker=tracker)
# fit your model as before
trava.fit_predict(raw_split_data=split_result,
                  model_type=GaussianNB,
                  model_id='gnb')
Done. All model parameters and metrics are now tracked! Tracking is also supported for:
- the cross-validation case, with nested tracking
- eval results of common boosting libraries (XGBoost, LightGBM, CatBoost)
Check out the detailed notebooks on how to track metrics & parameters and plots & serialized models.
- highly customizable training & evaluation processes (see the FitPredictor class in trava/fit_predictor.py and its subclasses)
- built-in train/test/validation split logic
- extensions for common boosting libraries (for early stopping with validation sets)
- tracks model parameters, metrics, plots, and serialized models; it's easy to integrate any tracking framework of your choice
- you can also evaluate metrics after the fit_predict call, in case you forgot to add some metric
- you can evaluate metrics even after your data, and even a trained model, have been unloaded (depends on the metric used, but true most of the time)
- only supervised learning problems are supported for now, yet there is potential to extend it to unsupervised problems
- unit-tested
- I use it every day for my own needs, so I care about its quality and reliability
Only sklearn-style models are supported for the time being (Trava uses the fit, predict, and predict_proba methods).
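In other words, any object exposing that interface should work. Here is a toy sketch of the expected shape (a hypothetical majority-class model, not part of Trava or sklearn):

import numpy as np

class MajorityClassModel:
    # a toy sklearn-style binary classifier that predicts the majority class

    def __init__(self):
        self._p1 = 0.5  # fraction of positive labels seen during fit

    def fit(self, X, y):
        # remember how frequent the positive class is in the training labels
        self._p1 = float(np.mean(y))
        return self

    def predict(self, X):
        # predict the majority class for every sample
        return np.full(len(X), int(self._p1 >= 0.5))

    def predict_proba(self, X):
        # one row per sample: [P(class 0), P(class 1)]
        return np.tile([1.0 - self._p1, self._p1], (len(X), 1))

Such a class could then be passed to fit_predict as model_type, just like GaussianNB above.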
- pandas
- numpy
- python 3.7
It's also convenient to use the library together with sklearn (e.g. for taking metrics from it), and a couple of the extensions are based on sklearn classes.