Timeseries surrogate #85

Merged: ShikovEgor merged 35 commits into main on Jan 20, 2024


ShikovEgor
Collaborator

What's new:

  • Unified construction of the meta-dataset file used to train the surrogate model on tabular and time-series data.
  • Reworked the training code.
  • Fixed quality metrics and data sampling to account for the differing number of pipelines per dataset.
  • Added checkpoints and configs for time series.
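
The description does not spell out how sampling accounts for the differing number of pipelines. As a hedged illustration only, one common approach is to pick a dataset uniformly first and then a pipeline within it, so datasets with many pipelines do not dominate training. The function and variable names below are hypothetical and are not taken from the GAMLET code.

```python
import random
from collections import Counter
from typing import Dict, List, Tuple


def sample_balanced(pipelines_by_dataset: Dict[str, List[str]],
                    n_samples: int,
                    seed: int = 0) -> List[Tuple[str, str]]:
    """Draw (dataset, pipeline) pairs so that every dataset is equally likely,
    regardless of how many pipelines it has."""
    rng = random.Random(seed)
    dataset_ids = sorted(pipelines_by_dataset)
    samples = []
    for _ in range(n_samples):
        dataset_id = rng.choice(dataset_ids)  # uniform over datasets
        pipeline_id = rng.choice(pipelines_by_dataset[dataset_id])  # uniform within the dataset
        samples.append((dataset_id, pipeline_id))
    return samples


# Hypothetical meta-dataset: one dataset has 100x more pipelines than the other.
meta = {"small": ["p0", "p1"], "large": [f"p{i}" for i in range(200)]}
balanced = sample_balanced(meta, n_samples=10_000)
counts = Counter(dataset_id for dataset_id, _ in balanced)
print(counts)  # each dataset is drawn roughly half of the time
```

Sampling uniformly over all (dataset, pipeline) pairs instead would draw "large" about 99% of the time, and a metric averaged over those pairs would be skewed the same way.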

@pep8speaks

pep8speaks commented Nov 28, 2023

Hello @ShikovEgor! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 1:1: F401 '.datasets_loader.DatasetsLoader' imported but unused
Line 2:1: F401 '.custom_datasets_loader.CustomDatasetsLoader' imported but unused
Line 3:1: F401 '.openml_datasets_loader.OpenMLDatasetsLoader' imported but unused
Line 4:1: F401 '.timeseries_dataset_loader.TimeSeriesDatasetsLoader' imported but unused

Line 1:1: F401 '.models_loader.ModelsLoader' imported but unused
Line 2:1: F401 '.fedot_pipelines_loader.FEDOTPipelinesLoader' imported but unused
Line 3:1: F401 '.knowledge_base_models_loader.KBTSModelsLoader' imported but unused
Line 3:1: F401 '.knowledge_base_models_loader.KnowledgeBaseModelsLoader' imported but unused
Line 3:1: F401 '.knowledge_base_models_loader.CompatKBModelsLoader' imported but unused
Line 4:1: F401 '.fedot_history_loader.FedotHistoryLoader' imported but unused

Line 45:29: W291 trailing whitespace
Line 46:48: W291 trailing whitespace
Line 47:51: W291 trailing whitespace
Line 48:53: W291 trailing whitespace
Line 49:73: W291 trailing whitespace
Line 50:48: W291 trailing whitespace
Line 71:1: E302 expected 2 blank lines, found 1
Line 105:45: W291 trailing whitespace
Line 106:45: W291 trailing whitespace
Line 107:31: E127 continuation line over-indented for visual indent
Line 107:47: W291 trailing whitespace
Line 108:31: E127 continuation line over-indented for visual indent
Line 109:31: E127 continuation line over-indented for visual indent
Line 129:1: E302 expected 2 blank lines, found 1
Line 130:121: E501 line too long (142 > 120 characters)
Line 154:1: W293 blank line contains whitespace
Line 162:45: W291 trailing whitespace
Line 163:31: E127 continuation line over-indented for visual indent
Line 163:47: W291 trailing whitespace
Line 164:31: E127 continuation line over-indented for visual indent
Line 165:31: E127 continuation line over-indented for visual indent
Line 204:1: W293 blank line contains whitespace

Line 1:1: F401 '.file_system.PathType' imported but unused
Line 1:1: F401 '.file_system.ensure_dir_exists' imported but unused
Line 1:1: F401 '.file_system.get_checkpoints_dir' imported but unused
Line 1:1: F401 '.file_system.get_configs_dir' imported but unused
Line 1:1: F401 '.file_system.get_data_dir' imported but unused
Line 1:1: F401 '.file_system.get_project_root' imported but unused
Line 3:1: F401 '.cache.CacheOperator' imported but unused
Line 3:1: F401 '.cache.get_cache_dir' imported but unused
Line 3:1: F401 '.cache.get_dataset_cache_path' imported but unused
Line 3:1: F401 '.cache.get_dataset_cache_path_by_id' imported but unused
Line 3:1: F401 '.cache.update_openml_cache_dir' imported but unused

Line 1:1: F401 '.data.GraphDataset' imported but unused
Line 1:1: F401 '.data.PairDataset' imported but unused
Line 1:1: F401 '.data.SingleDataset' imported but unused
Line 2:1: F401 '.dataset_generate.KnowledgeBaseToDataset' imported but unused
Line 2:1: F401 '.dataset_generate.dataset_from_id_without_data_loading' imported but unused
Line 2:1: F401 '.dataset_generate.dataset_from_id_with_data_loading' imported but unused

Line 2:1: F401 'random.choice' imported but unused
Line 231:89: W291 trailing whitespace

Line 15:1: F401 'gamlet.components.models_loaders.KBTSModelsLoader' imported but unused
Line 15:1: F401 'gamlet.components.models_loaders.CompatKBModelsLoader' imported but unused
Line 99:1: W293 blank line contains whitespace

Line 1:1: F401 '.train_surrogate_model.train_surrogate_model' imported but unused
Line 1:1: F401 '.train_surrogate_model.setup_loaders' imported but unused
Line 1:1: F401 '.train_surrogate_model.do_training' imported but unused
Line 1:59: W291 trailing whitespace
Line 4:36: E124 closing bracket does not match visual indentation
Line 5:1: F401 '.tune_surrogate_model.tune_surrogate_model' imported but unused

Line 210:1: W293 blank line contains whitespace
Line 211:1: W293 blank line contains whitespace
Line 283:62: W291 trailing whitespace

Line 4:1: F401 'typing.List' imported but unused
Line 8:1: F401 'torch' imported but unused
Line 12:1: F401 'pytorch_lightning.Trainer' imported but unused
Line 13:1: F401 'pytorch_lightning.callbacks.EarlyStopping' imported but unused
Line 13:1: F401 'pytorch_lightning.callbacks.ModelCheckpoint' imported but unused
Line 14:1: F401 'pytorch_lightning.loggers.TensorBoardLogger' imported but unused
Line 17:1: F401 'gamlet.surrogate.surrogate_model' imported but unused
Line 18:52: E231 missing whitespace after ','
Line 20:1: E302 expected 2 blank lines, found 1

Line 16:1: W293 blank line contains whitespace
Line 23:25: F821 undefined name 'CompatKBModelsLoader'
Line 23:46: F821 undefined name 'knowledge_base_directory'
Line 23:88: W291 trailing whitespace
Line 28:25: F821 undefined name 'KBTSModelsLoader'
Line 28:42: F821 undefined name 'knowledge_base_directory'
Line 30:64: W291 trailing whitespace
Line 31:1: W293 blank line contains whitespace
Line 43:1: E305 expected 2 blank lines after class or function definition, found 1

Comment last updated at 2024-01-15 13:31:38 UTC

@MorrisNein
Collaborator

@ShikovEgor, you need to rebase onto the main branch and fix the conflicts along the way.

There are also a lot of PEP 8 issues. I recommend using the hotkeys in your IDE to fix the formatting automatically and remove unused imports.

@MorrisNein
Collaborator

MorrisNein commented Dec 18, 2023

Also, please note the failing unit test. A file path needs to be rechecked somewhere and, if necessary, replaced with a path built using the functions from file_system.py.
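
A hedged sketch of the kind of fix suggested here: build paths from the project root instead of the current working directory, so a test resolves the same file wherever it is launched from. The linter output above shows that file_system.py exports `get_project_root` and `get_data_dir`, but their signatures do not appear in this thread, so the helpers below are locally defined stand-ins with hypothetical names, and the `setup.py` marker is an assumption.

```python
import tempfile
from pathlib import Path


def find_project_root(start: Path, marker: str = "setup.py") -> Path:
    """Hypothetical stand-in: walk upwards from `start` until `marker` is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No {marker!r} found at or above {start}")


def data_path(start: Path, *parts: str) -> Path:
    """Hypothetical stand-in: build a path under <project root>/data."""
    return find_project_root(start).joinpath("data", *parts)


# Demo in a throwaway directory tree that mimics a project layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / "setup.py").touch()
    nested = root / "tests" / "unit"
    nested.mkdir(parents=True)

    # The result does not depend on the process working directory.
    resolved = data_path(nested, "knowledge_base", "meta.csv")
    expected = root / "data" / "knowledge_base" / "meta.csv"
    print(resolved == expected)  # True
```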

@MorrisNein
Collaborator

MorrisNein commented Dec 18, 2023

And still, please cover the classes and functions you use with minimal unit tests that check they work on a minimal subset of data, which can be stored in tests/data. This will keep your code from dying out with each subsequent PR.

I am sure this is easier for the authors of the code than for anyone else.
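
As a rough sketch of what such a minimal test could look like: a tiny fixture file, one call to the code under test, and a few sanity assertions. The loader, the file name, and the column names here are hypothetical stand-ins, not the GAMLET API. In the repository the fixture would live in tests/data rather than being written on the fly.

```python
import csv
import tempfile
from pathlib import Path
from typing import Dict, List


def load_records(path: Path) -> List[Dict[str, str]]:
    """Hypothetical stand-in for the loader under test."""
    with path.open(newline="") as f:
        return list(csv.DictReader(f))


def check_loader_smoke(data_dir: Path) -> None:
    """Smoke check: the loader runs on a tiny file and returns sane rows."""
    records = load_records(data_dir / "tiny_knowledge_base.csv")
    assert len(records) == 2
    assert set(records[0]) == {"dataset_id", "pipeline_id", "fitness"}
    assert all(0.0 <= float(r["fitness"]) <= 1.0 for r in records)


# Stand-in for tests/data: a two-row fixture written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    fixture = Path(tmp) / "tiny_knowledge_base.csv"
    fixture.write_text("dataset_id,pipeline_id,fitness\nd0,p0,0.8\nd0,p1,0.6\n")
    check_loader_smoke(Path(tmp))
    smoke_check_passed = True
```

Under pytest the same check would typically be a `test_`-prefixed function that reads the committed fixture directly.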

@ShikovEgor
Collaborator Author

I can't fix this yet:
python scripts/main.py --train --config configs/train_surrogate_TS.yml
Traceback (most recent call last):
  File "/data/home/egor/GAMLET/scripts/main.py", line 5, in <module>
    from meta_automl.surrogate import training
  File "/data/home/egor/GAMLET/meta_automl/surrogate/training/__init__.py", line 1, in <module>
    from .train_surrogate_model import train_surrogate_model, test_ranking
  File "/data/home/egor/GAMLET/meta_automl/surrogate/training/train_surrogate_model.py", line 22, in <module>
    from meta_automl.data_preparation.surrogate_dataset import GraphDataset, PairDataset, SingleDataset
  File "/data/home/egor/GAMLET/meta_automl/data_preparation/surrogate_dataset/__init__.py", line 2, in <module>
    from .dataset_generate import (
  File "/data/home/egor/GAMLET/meta_automl/data_preparation/surrogate_dataset/dataset_generate.py", line 16, in <module>
    from meta_automl.data_preparation.feature_preprocessors import FeaturesPreprocessor
  File "/data/home/egor/GAMLET/meta_automl/data_preparation/feature_preprocessors/__init__.py", line 1, in <module>
    from .feature_preprocessors import FeaturesPreprocessor
  File "/data/home/egor/GAMLET/meta_automl/data_preparation/feature_preprocessors/feature_preprocessors.py", line 9, in <module>
    from meta_automl.data_preparation.meta_features_extractors.dataset_meta_features import DatasetMetaFeatures
  File "/data/home/egor/GAMLET/meta_automl/data_preparation/meta_features_extractors/__init__.py", line 2, in <module>
    from .openml_dataset_meta_features_extractor import OpenMLDatasetMetaFeaturesExtractor
  File "/data/home/egor/GAMLET/meta_automl/data_preparation/meta_features_extractors/openml_dataset_meta_features_extractor.py", line 7, in <module>
    from meta_automl.data_preparation.feature_preprocessors import FeaturesPreprocessor
ImportError: cannot import name 'FeaturesPreprocessor' from partially initialized module 'meta_automl.data_preparation.feature_preprocessors' (most likely due to a circular import) (/data/home/egor/GAMLET/meta_automl/data_preparation/feature_preprocessors/__init__.py)
(new) [egor@bb565d71c869 GAMLET]$

@MorrisNein
Collaborator

I can't fix this yet: ...

Done. But it would be better to remove FeaturesPreprocessor from OpenMLDatasetMetaFeaturesExtractor entirely.
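
To illustrate why removing that import breaks the cycle, here is a minimal, hedged reproduction with throwaway modules. The package and class names (`Preprocessor`, `Extractor`) are invented for the demo and only mirror the shape of the import chain in the traceback. Passing the preprocessor in as an argument is one possible way to decouple the two modules, not necessarily what the final code does.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# preprocessors.py imports the extractor module at import time.
PREPROCESSORS = (
    "from {pkg}.extractors import Extractor\n"
    "\n"
    "class Preprocessor:\n"
    "    pass\n"
)
# Cyclic variant: the extractor imports back from preprocessors.
EXTRACTOR_CYCLIC = (
    "from {pkg}.preprocessors import Preprocessor\n"
    "\n"
    "class Extractor:\n"
    "    preprocessor_cls = Preprocessor\n"
)
# Decoupled variant: no import; the caller passes a preprocessor in if needed.
EXTRACTOR_DECOUPLED = (
    "class Extractor:\n"
    "    def __init__(self, preprocessor=None):\n"
    "        self.preprocessor = preprocessor\n"
)


def try_import(root: Path, pkg: str, extractor_source: str) -> str:
    """Write a two-module package and report whether importing it succeeds."""
    pkg_dir = root / pkg
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "preprocessors.py").write_text(PREPROCESSORS.format(pkg=pkg))
    (pkg_dir / "extractors.py").write_text(extractor_source.format(pkg=pkg))
    importlib.invalidate_caches()
    try:
        importlib.import_module(f"{pkg}.preprocessors")
        return "ok"
    except ImportError as error:
        return type(error).__name__


with tempfile.TemporaryDirectory() as tmp:
    sys.path.insert(0, tmp)
    try:
        cyclic_result = try_import(Path(tmp), "demo_cyclic", EXTRACTOR_CYCLIC)
        decoupled_result = try_import(Path(tmp), "demo_decoupled", EXTRACTOR_DECOUPLED)
    finally:
        sys.path.remove(tmp)

print(cyclic_result, decoupled_result)  # ImportError ok
```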

@MorrisNein
Collaborator

Fixed the conflicts again.


codecov bot commented Dec 19, 2023

Codecov Report

Attention: 297 lines in your changes are missing coverage. Please review.

Comparison is base (4c4929b) 41.10% compared to head (415365d) 40.37%.

❗ Current head 415365d differs from pull request most recent head 7ff26a0. Consider uploading reports for the commit 7ff26a0 to get more accurate results

Files                                                     Patch %   Lines
gamlet/surrogate/training/train_surrogate_model.py        0.00%     112 Missing ⚠️
...nts/models_loaders/knowledge_base_models_loader.py     19.56%    74 Missing ⚠️
gamlet/data_preparation/surrogate_dataset/data.py         0.00%     32 Missing ⚠️
..._preparation/surrogate_dataset/dataset_generate.py     0.00%     26 Missing ⚠️
gamlet/surrogate/surrogate_model.py                       25.71%    26 Missing ⚠️
gamlet/surrogate/training/tune_surrogate_model.py         0.00%     21 Missing ⚠️
...tractors/openml_dataset_meta_features_extractor.py     75.00%    1 Missing ⚠️
...omponents/models_loaders/fedot_pipelines_loader.py     66.66%    1 Missing ⚠️
...let/data_preparation/surrogate_dataset/__init__.py     0.00%     1 Missing ⚠️
gamlet/surrogate/encoders/dataset_encoder.py              0.00%     1 Missing ⚠️
... and 2 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #85      +/-   ##
==========================================
- Coverage   41.10%   40.37%   -0.73%     
==========================================
  Files          60       60              
  Lines        2472     2576     +104     
==========================================
+ Hits         1016     1040      +24     
- Misses       1456     1536      +80     
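
The percentages in the diff follow from hits divided by lines. A quick check that uses only the numbers reported above (the helper name is made up for this sketch):

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, rounded to two decimals."""
    return round(100 * hits / lines, 2)


base = coverage_percent(1016, 2472)   # main
head = coverage_percent(1040, 2576)   # this PR
delta = round(head - base, 2)
print(base, head, delta)  # 41.1 40.37 -0.73
```

Hits rose by 24 while total lines rose by 104, so the ratio drops even though more lines are covered in absolute terms.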


@ShikovEgor ShikovEgor merged commit 2f41916 into main Jan 20, 2024
1 check passed