v0.1.0

@lamesjaidler lamesjaidler released this 17 Feb 16:14
3eec965

Added:

  • Added ParallelPipeline class to the pipeline module, with unit tests.
  • Added rules parameter to filter modules, allowing a given rule set to be filtered.
  • Added rules parameter to RBSOptimiser, allowing a given rule set to be filtered to the rules remaining after optimisation.
  • Added rules attribute to rule generators (which is a Rules class containing the generated rules).
  • Added example notebook for ParallelPipeline class.
  • Added advanced example notebook for BayesSearchCV class.
  • Added __repr__ to AgglomerativeClusteringReducer class.

Changed:

  • The get_params method in the LinearPipeline and ParallelPipeline classes now returns a dictionary of parameters and their values for each pipeline step.
  • Removed the num_cores parameter from the RBSPipeline and RBSOptimiser classes (it was ignored and had no effect).
  • Updated the LinearPipeline and BayesSearchCV example notebooks with diagrams.
  • Relaxed versions of packages required in setup.py.
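
The per-step get_params behaviour described above can be pictured with a minimal sketch. ToyStep and ToyPipeline are hypothetical stand-ins, not the library's classes, and the exact shape of the returned dictionary (step tag mapped to a dictionary of that step's parameters) is an assumption made for illustration.

```python
class ToyStep:
    """Hypothetical pipeline step that stores its parameters as attributes."""

    def __init__(self, **params):
        self.__dict__.update(params)

    def get_params(self):
        # Return a copy so callers cannot mutate the step by accident.
        return dict(vars(self))


class ToyPipeline:
    """Hypothetical pipeline holding (tag, step) pairs; not the library's class."""

    def __init__(self, steps):
        self.steps = steps

    def get_params(self):
        # Assumed layout: one entry per step, keyed by the step's tag.
        return {tag: step.get_params() for tag, step in self.steps}


pipeline = ToyPipeline(
    steps=[
        ("generator", ToyStep(n_rules=10)),
        ("optimiser", ToyStep(metric="f1", n_iter=30)),
    ]
)
params = pipeline.get_params()
print(params)
```

Keying the result by step tag keeps parameters with the same name in different steps from colliding.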

Improvements:

  • Classes in later stages of a LinearPipeline can now use the initial datasets if required (e.g. if a rule optimiser is placed after a rule generator, the rule optimiser can be configured to use the initial feature set to optimise the generated rules). See the use_init_data parameter in the LinearPipeline class.
  • ClassAccessor classes can now be used inside parameters that are mutable iterables (e.g. lists), so attributes from classes earlier in a pipeline can be passed to such parameters of classes later in the pipeline.
  • Users can now choose whether sample_weight is used when calculating the metric on the validation fold in the BayesSearchCV class (previously, sample_weight was always used if provided). See the sample_weight_in_val parameter in the BayesSearchCV class.
  • Rules classes can now be combined using + and sum().
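
As a rough illustration of how a container can support both + and sum() in Python, the sketch below uses a hypothetical ToyRules class. It is not the library's Rules class; the rule_strings dictionary and the merge-by-name behaviour are assumptions chosen to keep the example short.

```python
class ToyRules:
    """Hypothetical rule container; a stand-in, not the library's Rules class."""

    def __init__(self, rule_strings=None):
        # Assumed storage: rule name -> rule condition as a string.
        self.rule_strings = dict(rule_strings or {})

    def __add__(self, other):
        if not isinstance(other, ToyRules):
            return NotImplemented
        # Merge into a new object; on a name clash the right operand wins.
        return ToyRules({**self.rule_strings, **other.rule_strings})

    def __radd__(self, other):
        # sum() starts from the integer 0, so 0 + ToyRules must be handled.
        if isinstance(other, int) and other == 0:
            return ToyRules(self.rule_strings)
        return NotImplemented


a = ToyRules({"Rule1": "A > 1"})
b = ToyRules({"Rule2": "B <= 3"})
c = ToyRules({"Rule3": "C == 0"})

combined = a + b
total = sum([a, b, c])
print(list(total.rule_strings))
```

Implementing `__radd__` for the zero start value is what lets the built-in sum() work without passing an explicit start argument.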

Fixes:

  • Updated rule generators so rule names remain constant if the fit method is applied more than once after class instantiation.
  • int values are now properly parsed when converting from rule_strings to rule_dicts in the Rules class.
  • Added a check for the parameter name when updating a class parameter in a LinearPipeline or ParallelPipeline: if the parameter does not exist, an error is now raised (previously, the update failed silently).
  • Updated the RBSPipeline class to return the index of X_rules in the prediction.
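
The parameter-name check mentioned above can be sketched as follows. ToyStep and update_step_params are hypothetical names, and ValueError is an assumed choice, since the notes do not state which exception type the library raises.

```python
class ToyStep:
    """Hypothetical pipeline step with a single tunable parameter."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold


def update_step_params(steps, params):
    """Set parameters on steps, rejecting names the step does not define.

    steps: dict mapping step tag -> step object (assumed layout).
    params: dict mapping step tag -> {parameter name: new value}.
    """
    for tag, step_params in params.items():
        step = steps[tag]
        for name, value in step_params.items():
            # Without this check, setattr would silently create a new attribute.
            if not hasattr(step, name):
                raise ValueError(
                    f"Parameter `{name}` not found in step `{tag}`"
                )
            setattr(step, name, value)


steps = {"filter": ToyStep()}
update_step_params(steps, {"filter": {"threshold": 0.8}})

try:
    update_step_params(steps, {"filter": {"treshold": 0.9}})  # misspelled name
    error_message = None
except ValueError as err:
    error_message = str(err)
print(error_message)
```

Failing loudly on an unknown name catches typos such as `treshold` that would otherwise leave the real parameter unchanged.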