Add catboost to default initial assumptions #1081

MorrisNein · 2023-04-07T16:57:08Z

Added CatBoost to FEDOT initial assumptions for classification.

For the sake of stability, restricted tuning of the model's loss function.

codecov · 2023-04-07T17:13:23Z

Codecov Report

Merging #1081 (7322433) into master (8629d56) will decrease coverage by 0.39%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1081      +/-   ##
==========================================
- Coverage   88.93%   88.55%   -0.39%     
==========================================
  Files         132      132              
  Lines        9356     9356              
==========================================
- Hits         8321     8285      -36     
- Misses       1035     1071      +36

Impacted Files	Coverage Δ
...edot/api/api_utils/assumptions/task_assumptions.py	`78.94% <ø> (ø)`
fedot/core/pipelines/tuning/search_space.py	`100.00% <ø> (ø)`

... and 6 files with indirect coverage changes


MorrisNein · 2023-04-07T20:05:21Z

After running the integration tests, I got a CatBoostError in test_multi_modal_example.

@nicl-nno, wdyt? Should we give it a further chance?

The full log & traceback:

FAILED      [ 81%]
2023-04-07 22:52:15,256 - AssumptionsHandler - Memory consumption for fitting of the initial pipeline in main session: current 2.5 MiB, max: 16.2 MiB
2023-04-07 22:52:15,258 - ApiComposer - Initial pipeline was fitted in 3.0 sec.
2023-04-07 22:52:15,264 - ApiComposer - AutoML configured. Parameters tuning: False. Time limit: 10 min. Set of candidate models: ['qda', 'pca', 'scaling', 'bernb', 'knn', 'logit', 'dt', 'rf', 'normalization'].
2023-04-07 22:52:15,266 - ApiComposer - Pipeline composition started.
2023-04-07 22:52:50,299 - MultiprocessingDispatcher - 3 individuals out of 3 in previous population were evaluated successfully.
Generations:   0%|          | 1/10000 [01:15<?, ?gen/s]

test\integration\real_applications\test_examples.py:100 (test_multi_modal_example)
joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Users\petro\PycharmProjects\FEDOT\venv\lib\site-packages\joblib\externals\loky\process_executor.py", line 428, in _process_worker
    r = call_item()
  File "C:\Users\petro\PycharmProjects\FEDOT\venv\lib\site-packages\joblib\externals\loky\process_executor.py", line 275, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "C:\Users\petro\PycharmProjects\FEDOT\venv\lib\site-packages\joblib\_parallel_backends.py", line 620, in __call__
    return self.func(*args, **kwargs)
  File "C:\Users\petro\PycharmProjects\FEDOT\venv\lib\site-packages\joblib\parallel.py", line 288, in __call__
    return [func(*args, **kwargs)
  File "C:\Users\petro\PycharmProjects\FEDOT\venv\lib\site-packages\joblib\parallel.py", line 288, in <listcomp>
    return [func(*args, **kwargs)
  File "C:\Users\petro\PycharmProjects\FEDOT\venv\lib\site-packages\golem\core\optimisers\genetic\evaluation.py", line 258, in evaluate_single
    eval_res = super().evaluate_single(graph, uid_of_individual, with_time_limit, cache_key, logs_initializer)
  File "C:\Users\petro\PycharmProjects\FEDOT\venv\lib\site-packages\golem\core\optimisers\genetic\evaluation.py", line 168, in evaluate_single
    fitness, graph = adapted_evaluate(graph)
  File "C:\Users\petro\PycharmProjects\FEDOT\venv\lib\site-packages\golem\core\adapter\adapter.py", line 174, in adapted_fun
    result = fun(*adapted_args, **adapted_kwargs)
  File "C:\Users\petro\PycharmProjects\FEDOT\venv\lib\site-packages\golem\core\optimisers\genetic\evaluation.py", line 181, in _evaluate_graph
    fitness = self._objective_eval(domain_graph)
  File "C:\Users\petro\PycharmProjects\FEDOT\fedot\core\optimisers\objective\data_objective_eval.py", line 74, in evaluate
    raise ex
  File "C:\Users\petro\PycharmProjects\FEDOT\fedot\core\optimisers\objective\data_objective_eval.py", line 66, in evaluate
    prepared_pipeline = self.prepare_graph(graph, train_data, fold_id, self._eval_n_jobs)
  File "C:\Users\petro\PycharmProjects\FEDOT\fedot\core\optimisers\objective\data_objective_eval.py", line 112, in prepare_graph
    graph.fit(
  File "C:\Users\petro\PycharmProjects\FEDOT\fedot\core\pipelines\pipeline.py", line 191, in fit
    train_predicted = self._fit(input_data=copied_input_data)
  File "C:\Users\petro\PycharmProjects\FEDOT\fedot\core\pipelines\pipeline.py", line 111, in _fit
    train_predicted = self.root_node.fit(input_data=input_data)
  File "C:\Users\petro\PycharmProjects\FEDOT\fedot\core\pipelines\node.py", line 198, in fit
    input_data = self._get_input_data(input_data=input_data, parent_operation='fit')
  File "C:\Users\petro\PycharmProjects\FEDOT\fedot\core\pipelines\node.py", line 275, in _get_input_data
    input_data = self._input_from_parents(input_data=input_data, parent_operation=parent_operation)
  File "C:\Users\petro\PycharmProjects\FEDOT\fedot\core\pipelines\node.py", line 301, in _input_from_parents
    parent_results, _ = _combine_parents(parent_nodes, input_data,
  File "C:\Users\petro\PycharmProjects\FEDOT\fedot\core\pipelines\node.py", line 393, in _combine_parents
    prediction = parent.fit(input_data=input_data)
  File "C:\Users\petro\PycharmProjects\FEDOT\fedot\core\pipelines\node.py", line 202, in fit
    self.fitted_operation, operation_predict = self.operation.fit(params=self._parameters,
  File "C:\Users\petro\PycharmProjects\FEDOT\fedot\core\operations\operation.py", line 85, in fit
    self.fitted_operation = self._eval_strategy.fit(train_data=data)
  File "C:\Users\petro\PycharmProjects\FEDOT\fedot\core\operations\evaluation\evaluation_interfaces.py", line 234, in fit
    operation_implementation.fit(train_data.features, train_data.target)
  File "C:\Users\petro\PycharmProjects\FEDOT\venv\lib\site-packages\catboost\core.py", line 5128, in fit
    self._fit(X, y, cat_features, text_features, embedding_features, None, sample_weight, None, None, None, None, baseline, use_best_model,
  File "C:\Users\petro\PycharmProjects\FEDOT\venv\lib\site-packages\catboost\core.py", line 2355, in _fit
    self._train(
  File "C:\Users\petro\PycharmProjects\FEDOT\venv\lib\site-packages\catboost\core.py", line 1759, in _train
    self._object._train(train_pool, test_pool, params, allow_clear_pool, init_model._object if init_model else None)
  File "_catboost.pyx", line 4623, in _catboost._CatBoost._train
  File "_catboost.pyx", line 4672, in _catboost._CatBoost._train
_catboost.CatBoostError: C:/Program Files (x86)/Go Agent/pipelines/BuildMaster/catboost.git/catboost/libs/metrics/metric.cpp:6381: Max target greater than 1: 4
"""
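The last line of the traceback reports a target value of 4 reaching a metric that expects targets no greater than 1, which suggests that a binary-only loss was paired with multi-class labels. Below is a minimal sketch of a pre-fit guard for that situation. The helper `check_loss_matches_target` and the two loss groupings are hypothetical illustrations, not part of FEDOT or CatBoost.

```python
from typing import Iterable

# Assumption: a simplified grouping of CatBoost classification losses
# by the kind of target they accept. Not taken from FEDOT or CatBoost code.
BINARY_ONLY_LOSSES = {'Logloss', 'CrossEntropy'}
MULTICLASS_LOSSES = {'MultiClass', 'MultiClassOneVsAll'}


def check_loss_matches_target(loss_function: str, target: Iterable[int]) -> None:
    """Raise ValueError if a binary-only loss is used with more than two classes.

    Hypothetical helper: it only mimics the failure seen in the traceback
    so that the mismatch is caught before the model is fitted.
    """
    n_classes = len(set(target))
    if loss_function in BINARY_ONLY_LOSSES and n_classes > 2:
        raise ValueError(
            f"Loss '{loss_function}' expects a binary target, "
            f"but the target has {n_classes} classes"
        )


# A five-class target, like the one implied by "Max target greater than 1: 4".
labels = [0, 1, 2, 3, 4]
check_loss_matches_target('MultiClass', labels)  # passes silently
```

Calling `check_loss_matches_target('CrossEntropy', labels)` with the same labels raises a `ValueError` instead of failing deep inside the training call.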

MorrisNein · 2023-04-07T20:35:52Z

Googled a possible cause: most likely the problem is the CrossEntropy loss, which is not suitable for multi-class data.

Our tuning allows this loss function to be set for the CatBoost node:

FEDOT/fedot/core/pipelines/tuning/search_space.py

Lines 257 to 264 in 61e6a4f

    'catboost': {
        'max_depth': (hp.uniformint, [1, 11]),
        'learning_rate': (hp.loguniform, [np.log(0.01), np.log(0.2)]),
        'min_data_in_leaf': (hp.qloguniform, [0, 6, 1]),
        'border_count': (hp.uniformint, [2, 255]),
        'l2_leaf_reg': (hp.loguniform, [np.log(1e-8), np.log(10)]),
        'loss_function': (hp.choice, [['Logloss', 'CrossEntropy']])
    },

Not sure whether this can happen through mutation.

Upd. Indeed: mutation was switching the loss function to a binary one, which caused an error when fitting the node. Fixed by removing the loss_function parameter from the search space.
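The fix described here amounts to dropping one key from the CatBoost entry of the search space. A minimal sketch of that idea follows, under stated assumptions: the helper `without_params` is hypothetical, and plain strings such as `'uniformint'` stand in for the `hp.*` functions so the snippet runs without hyperopt. The actual change in the PR edits `search_space.py` directly.

```python
import copy
from typing import Any, Dict, Iterable

SearchSpace = Dict[str, Dict[str, Any]]


def without_params(search_space: SearchSpace,
                   operation: str,
                   params: Iterable[str]) -> SearchSpace:
    """Return a copy of the search space with the given parameters removed.

    Hypothetical helper for illustration; the original dict is left untouched.
    """
    pruned = copy.deepcopy(search_space)
    for name in params:
        # pop with a default so that a missing key is not an error
        pruned.get(operation, {}).pop(name, None)
    return pruned


# Abridged stand-in for the 'catboost' entry quoted above.
search_space = {
    'catboost': {
        'max_depth': ('uniformint', [1, 11]),
        'border_count': ('uniformint', [2, 255]),
        'loss_function': ('choice', [['Logloss', 'CrossEntropy']]),
    },
}

pruned = without_params(search_space, 'catboost', ['loss_function'])
print(sorted(pruned['catboost']))
```

With `loss_function` absent from the space, neither the tuner nor a parameter mutation that samples from it can assign a binary-only loss to a node trained on multi-class data.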

nicl-nno · 2023-04-07T20:54:50Z

Can this happen through mutation

Quite possibly.

aim-pep8-bot · 2023-04-07T21:21:07Z

Hello @MorrisNein! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2023-04-07 21:22:11 UTC

MorrisNein · 2023-04-08T20:42:01Z

All the other integration tests fail in exactly the same way as on master 😏
Apparently, there are no further CatBoost errors.
I suggest merging.


add catboost to default initial assumptions for classification

61e6a4f

MorrisNein self-assigned this Apr 7, 2023

MorrisNein changed the title ~~Add catboost to default initial assumptions for classification~~ Add catboost to default initial assumptions Apr 7, 2023

MorrisNein added 2 commits April 8, 2023 00:17

restrict mutating loss function in CatBoost

a1948cf

evaluate f1 as expected in the example

310aa68

pep8

7322433

MorrisNein requested a review from nicl-nno April 8, 2023 20:42

nicl-nno approved these changes Apr 8, 2023


MorrisNein merged commit dca4364 into master Apr 9, 2023

MorrisNein deleted the add_catboost_assumption branch April 9, 2023 09:20
