Releases: paypal/Iguanas

v0.1.4

@lamesjaidler lamesjaidler released this 14 Mar 16:40

Added:

  • Added Bounds class to metrics module.
  • Added examples in docstrings of all classes.
  • Added warnings module for custom warnings.
  • Added infer_dtypes parameter to rule generator classes. If set to True, column datatypes are inferred from the data (same as previous versions); if set to False, the datatypes from the dataset are used.

Changed:

  • Allow empty Rules class to be created.

Improvements:

  • Added the following operators to those supported by rule converters: greater_field, greater_or_equal_field, less_field, less_or_equal_field.
  • Parallelised DirectSearchOptimiser class.
  • Added try/except to ParallelPipeline to allow it to run even when a step fails to generate a rule set.
  • Added support for verbosity > 1 in BayesSearchCV.
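
The field-comparison operators listed above (greater_field, greater_or_equal_field, less_field, less_or_equal_field) can be pictured as comparisons between two columns rather than between a column and a constant. The sketch below is an illustration only: that interpretation, the FIELD_OPERATORS mapping and the apply_field_condition helper are assumptions, not part of the Iguanas API.

```python
import operator

import pandas as pd

# Assumed semantics: each "*_field" operator compares one column with another.
FIELD_OPERATORS = {
    "greater_field": operator.gt,
    "greater_or_equal_field": operator.ge,
    "less_field": operator.lt,
    "less_or_equal_field": operator.le,
}


def apply_field_condition(X, field, op_name, other_field):
    """Return a boolean Series for X[field] <op> X[other_field] (hypothetical helper)."""
    return FIELD_OPERATORS[op_name](X[field], X[other_field])


X = pd.DataFrame({"amount": [10, 50, 30], "limit": [20, 50, 10]})
flags = apply_field_condition(X, "amount", "greater_or_equal_field", "limit")
```

Here flags marks the rows where amount is at least as large as limit.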

v0.1.3

@lamesjaidler lamesjaidler released this 25 Feb 12:07

Improvements:

  • Added verbosity to LinearPipeline and ParallelPipeline classes.
  • Added check for duplicate columns in X or X_rules for relevant classes - an exception is now thrown if duplicate columns are found.
  • Updated rule optimiser classes to include non-optimisable rules in the final rule_strings attribute (this ensures that these rules remain in the set when a rule optimiser is used in a pipeline).
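
A duplicate-column check of the kind described above can be written in a few lines of pandas. This is a minimal sketch, not the library's implementation: the check_duplicate_columns name and the choice of ValueError are assumptions, since the release note only says that an exception is thrown.

```python
import pandas as pd


def check_duplicate_columns(X, name="X"):
    """Raise if the DataFrame has repeated column names (hypothetical helper)."""
    # Index.duplicated() flags every occurrence after the first.
    dupes = X.columns[X.columns.duplicated()].unique().tolist()
    if dupes:
        raise ValueError(f"{name} contains duplicate columns: {dupes}")


# Passes silently because all column names are unique.
check_duplicate_columns(pd.DataFrame({"a": [1], "b": [2]}))
```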

Changes:

  • In the rule optimiser, rules that use only all-null features are allocated to the zero variance group, rather than the non-optimisable group.
  • Updated the create_x0, create_bounds and create_initial_simplexes methods in the DirectSearchOptimiser class so that np.nan values are converted to 0.
  • Added pre- and post-optimisation methods to _base_pipeline.
  • Updated documentation and notebooks to reflect changes.
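
The np.nan-to-0 conversion mentioned above for create_x0, create_bounds and create_initial_simplexes can be illustrated with plain NumPy. The internals of those methods are not shown in these notes, so the nan_to_zero helper below is a hypothetical stand-in that only demonstrates the conversion itself.

```python
import numpy as np


def nan_to_zero(values):
    """Return a float array with every NaN replaced by 0 (hypothetical helper)."""
    arr = np.asarray(values, dtype=float)
    # np.where leaves non-NaN entries (including infinities) untouched.
    return np.where(np.isnan(arr), 0.0, arr)


x0 = nan_to_zero([1.5, np.nan, 3.0])
```

np.where is used here instead of np.nan_to_num because the latter also rewrites infinities by default.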

v0.1.2

@lamesjaidler lamesjaidler released this 23 Feb 10:57
0f8fce1

Changed:

  • Updated URLs in LICENSE to use https.

Fixes:

  • Updated _calc_tps_fps_tns_fns_numpy to use .to_numpy() when converting Pandas objects to numpy - fixes error seen when Pandas objects of Int dtype are used.
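
To illustrate the fix above: Pandas' nullable Int64 dtype is backed by an extension array, and calling .to_numpy() converts it explicitly to a plain NumPy array before any arithmetic. The sketch below is not the library's _calc_tps_fps_tns_fns_numpy; confusion_counts is a hypothetical function, and it assumes binary labels with no missing values.

```python
import numpy as np
import pandas as pd


def confusion_counts(y_true, y_pred):
    """Return (tps, fps, tns, fns) for binary Series (hypothetical helper)."""
    # Explicit conversion works for both int64 and nullable Int64 Series,
    # provided there are no missing values.
    yt = y_true.to_numpy(dtype=int)
    yp = y_pred.to_numpy(dtype=int)
    tps = int(np.sum((yt == 1) & (yp == 1)))
    fps = int(np.sum((yt == 0) & (yp == 1)))
    tns = int(np.sum((yt == 0) & (yp == 0)))
    fns = int(np.sum((yt == 1) & (yp == 0)))
    return tps, fps, tns, fns


y_true = pd.Series([1, 1, 0, 0, 1], dtype="Int64")
y_pred = pd.Series([1, 0, 1, 0, 1], dtype="Int64")
counts = confusion_counts(y_true, y_pred)
```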

v0.1.1

@lamesjaidler lamesjaidler released this 18 Feb 12:30

Fixes:

  • Fixed error thrown when converting rules with exponentials (e.g. 1e-05) present in a condition from string to dictionary format.
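
One way to see why exponentials need care is to look at the number pattern used when parsing a condition: a pattern that only allows digits and a decimal point will not match 1e-05. The sketch below is illustrative only. The condition layout (X['A']>1e-05), the simplified dictionary format and the parse_condition helper are assumptions, not the formats or functions Iguanas actually uses.

```python
import re

# The value group accepts an optional exponent such as e-05.
CONDITION = re.compile(
    r"X\['(?P<field>[^']+)'\]"
    r"(?P<op>>=|<=|==|!=|>|<)"
    r"(?P<value>[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)"
)


def parse_condition(condition):
    """Parse one condition string into a dict (hypothetical helper and format)."""
    match = CONDITION.fullmatch(condition.strip("()"))
    if match is None:
        raise ValueError(f"Cannot parse condition: {condition!r}")
    return {
        "field": match["field"],
        "operator": match["op"],
        "value": float(match["value"]),
    }


parsed = parse_condition("(X['A']>1e-05)")
```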

v0.1.0

@lamesjaidler lamesjaidler released this 17 Feb 16:14
3eec965

Added:

  • Added ParallelPipeline class to pipeline module & unit tests.
  • Added rules parameter to filter modules, allowing a given rule set to be filtered.
  • Added rules parameter to RBSOptimiser, allowing a given rule set to be filtered to the rules remaining after optimisation.
  • Added rules attribute to rule generators (which is a Rules class containing the generated rules).
  • Added example notebook for ParallelPipeline class.
  • Added advanced example notebook for BayesSearchCV class.
  • Added __repr__ to AgglomerativeClusteringReducer class.

Changed:

  • The get_params method in the LinearPipeline and ParallelPipeline classes now returns a dictionary of parameters and their values for each pipeline step.
  • Removed num_cores parameter from RBSPipeline and RBSOptimiser classes (as it was ignored anyway).
  • Updated LinearPipeline and BayesSearchCV example notebook with diagrams.
  • Relaxed versions of packages required in setup.py.

Improvements:

  • Classes in later stages of a LinearPipeline can now use the initial datasets if required (e.g. if a rule optimiser is placed after a rule generator, the rule optimiser can be configured to use the initial feature set to optimise the generated rules). See the use_init_data param in LinearPipeline class.
  • ClassAccessor classes can now be used in parameters that are mutable iterables - this allows attributes from classes earlier in a pipeline to be passed to parameters of classes later in the pipeline that are mutable iterables (e.g. lists).
  • Users can set whether they want to use the sample_weight when calculating the metric of the validation fold in the BayesSearchCV class (prior to this, the sample_weight was always used if provided). See the sample_weight_in_val parameter in the BayesSearchCV class.
  • Rules classes can now be combined using + and sum().
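
Supporting both + and sum() on a class generally means implementing __add__ and also __radd__, because sum() starts from the integer 0. The sketch below shows that general Python pattern on a hypothetical RuleSet class; it is not the Iguanas Rules class, and the way duplicate rule names are merged here is an assumption.

```python
class RuleSet:
    """Hypothetical stand-in for a rule container keyed by rule name."""

    def __init__(self, rule_strings=None):
        self.rule_strings = dict(rule_strings or {})

    def __add__(self, other):
        if not isinstance(other, RuleSet):
            return NotImplemented
        # Later rules overwrite earlier ones that share a name (assumption).
        return RuleSet({**self.rule_strings, **other.rule_strings})

    def __radd__(self, other):
        # sum() evaluates 0 + first_item, which lands here.
        if isinstance(other, int) and other == 0:
            return RuleSet(self.rule_strings)
        return NotImplemented


a = RuleSet({"Rule1": "(X['A']>1)"})
b = RuleSet({"Rule2": "(X['B']<2)"})
combined = a + b
total = sum([a, b])
```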

Fixes:

  • Updated rule generators so rule names remain constant if the fit method is applied more than once after class instantiation.
  • int values are now properly parsed when converting from rule_strings to rule_dicts in the Rules class.
  • Added a check for the parameter name when updating a class parameter in a LinearPipeline or ParallelPipeline - if the parameter does not exist, an error is thrown (prior to this, no error was thrown).
  • Updated the RBSPipeline class to return the index of X_rules in the prediction.

v0.0.2

@lamesjaidler lamesjaidler released this 20 Jan 12:45
6a9207e

Added:

  • Code of conduct
  • Issue templates
  • Contributing guidelines
  • rule_names check in rule generator/optimiser unit tests
  • GitHub workflow for CI/CD on Ubuntu and macOS

Changed:

  • Reduced rule generator naming methods to one method in _BaseGenerator
  • Rule generators calculate rule_lambdas at the end
  • Removed TypeVars from iguanas.utils.typing, replaced with strings. Added unit tests.

Improvements:

  • Reduced runtime of RBSPipeline
  • Updated wording of introduction in documentation
  • Updated test_bayes_search_cv to cover all methods

Fixes:

  • Bug in AgglomerativeClusteringReducer which duplicated column names in cols_to_keep when cols_to_drop was empty
  • Bug in BayesSearchCV which omitted all but one param from best_params when multiple params were included in search_space for a given pipeline step

v0.0.1

@lamesjaidler lamesjaidler released this 22 Dec 17:04

Initial release.