Encoding bug #407

Merged
merged 13 commits into from Sep 14, 2021

Collaborator

@MAGLeb MAGLeb commented Aug 30, 2021

  • add import/export for the "one_hot_encoding" operation
  • check whether the categories in the test data are present in the training data
  • fix issues with categorical expansion
  • fix test run time
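
The train/test category check from the list above can be illustrated with a minimal sketch. This is not the code from this PR: the helper name `unseen_categories` and the sample data are hypothetical, and the sketch only assumes that a categorical column is a plain list of values.

```python
def unseen_categories(train_column, test_column):
    """Return the categories that occur in test_column but not in train_column."""
    known = set(train_column)
    # Sort by string form so the result is deterministic.
    return sorted({value for value in test_column if value not in known}, key=str)


# Hypothetical sample data.
train = ["red", "green", "blue", "green"]
test = ["green", "purple", "red", "orange"]
print(unseen_categories(train, test))  # ['orange', 'purple']
```

A non-empty result means the encoder fitted on the training data will meet values it has never seen, so they need explicit handling at prediction time.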

Right now the preprocessing pipeline is:

  • convert the values in each column to a single type
  • fill missing values
  • one-hot encode categorical features
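
The three steps above can be sketched in plain Python. This is an illustration under stated assumptions, not FEDOT's implementation: all function names are hypothetical, missing values are assumed to be `None`, the fill strategy (mean for numeric, most frequent for categorical) is an assumption, and unseen test categories are encoded as all zeros.

```python
from collections import Counter


def unify_type(column):
    """Step 1: cast to float if every non-missing value parses as a number, else to str."""
    try:
        return [None if v is None else float(v) for v in column], "numeric"
    except (TypeError, ValueError):
        return [None if v is None else str(v) for v in column], "categorical"


def fit_fill_value(column, kind):
    """Step 2a: choose the fill value on training data (assumes one non-missing value)."""
    present = [v for v in column if v is not None]
    if kind == "numeric":
        return sum(present) / len(present)
    return Counter(present).most_common(1)[0][0]


def apply_fill(column, fill):
    """Step 2b: replace missing values with the fitted fill value."""
    return [fill if v is None else v for v in column]


def fit_one_hot(column):
    """Step 3a: remember the categories seen in the training data."""
    return sorted(set(column))


def apply_one_hot(column, categories):
    """Step 3b: encode with the training categories; unseen values become all zeros."""
    return [[1 if value == cat else 0 for cat in categories] for value in column]


# Hypothetical sample data.
train_raw = ["a", None, "b", "a"]
test_raw = ["b", "c", None]

train_col, kind = unify_type(train_raw)
fill = fit_fill_value(train_col, kind)
categories = fit_one_hot(apply_fill(train_col, fill))

test_col, _ = unify_type(test_raw)
encoded = apply_one_hot(apply_fill(test_col, fill), categories)
print(categories)  # ['a', 'b']
print(encoded)     # [[0, 1], [0, 0], [1, 0]]
```

Fitting the fill value and the category list on the training data only, then reusing them on the test data, keeps the encoded width identical between fit and predict.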

Closed issues:
#400
#399
#412

@MAGLeb MAGLeb requested a review from nicl-nno August 30, 2021 14:31
@pep8speaks

pep8speaks commented Aug 30, 2021

Hello @MAGLeb! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-09-14 10:33:12 UTC

@codecov

codecov bot commented Aug 30, 2021

Codecov Report

Merging #407 (ae28621) into master (55bc694) will decrease coverage by 0.51%.
The diff coverage is 81.05%.

@@            Coverage Diff             @@
##           master     #407      +/-   ##
==========================================
- Coverage   83.90%   83.38%   -0.52%     
==========================================
  Files         128      128              
  Lines        8684     8728      +44     
==========================================
- Hits         7286     7278       -8     
- Misses       1398     1450      +52     

Impacted Files                                           Coverage Δ
fedot/core/operations/operation.py                       96.51% <ø> (-0.40%) ⬇️
fedot/core/pipelines/pipeline.py                         88.13% <73.52%> (-3.08%) ⬇️
fedot/core/data/data.py                                  85.64% <100.00%> (-0.85%) ⬇️
...tations/data_operations/sklearn_transformations.py    85.20% <100.00%> (-0.07%) ⬇️
fedot/core/operations/operation_template.py              93.04% <100.00%> (+0.38%) ⬆️
fedot/core/composer/advisor.py                           75.75% <0.00%> (-21.22%) ⬇️
...implementations/data_operations/sklearn_filters.py    84.00% <0.00%> (-10.00%) ⬇️
...on_implementations/models/discriminant_analysis.py    88.63% <0.00%> (-9.10%) ⬇️
fedot/api/api_utils/composer.py                          85.31% <0.00%> (-8.40%) ⬇️
fedot/core/pipelines/tuning/hyperparams.py               93.61% <0.00%> (-6.39%) ⬇️
... and 6 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 55bc694...ae28621. Read the comment docs.

@Dreamlone
Collaborator

I have a question. Is the _fill_remaining_gaps function still needed?

Honestly, I thought it had already been removed after this PR on categorical features. Is there a reason we still apply gap filling at the operation level?

Resolved review threads (outdated):
  • cases/oil.py
  • fedot/core/data/data.py
  • fedot/core/pipelines/pipeline.py (3 threads)

@MAGLeb
Collaborator Author

MAGLeb commented Aug 31, 2021

> I have a question. Is the _fill_remaining_gaps function still needed?
>
> Honestly, I thought it had already been removed after this PR on categorical features. Is there a reason we still apply gap filling at the operation level?

It isn't needed; it can be removed.

@nicl-nno
Collaborator

nicl-nno commented Sep 1, 2021

#412

This error still occurs.

@MAGLeb
Collaborator Author

MAGLeb commented Sep 3, 2021

> > I have a question. Is the _fill_remaining_gaps function still needed?
> > Honestly, I thought it had already been removed after this PR on categorical features. Is there a reason we still apply gap filling at the operation level?
>
> It isn't needed; it can be removed.

Deleted

@MAGLeb
Collaborator Author

MAGLeb commented Sep 14, 2021

> #412
>
> This error still occurs.

Fixed this error. I checked it on every run.

@nicl-nno
Collaborator

image

Was the dataset truncated on purpose?

@nicl-nno nicl-nno linked an issue Sep 14, 2021 that may be closed by this pull request
@MAGLeb
Collaborator Author

MAGLeb commented Sep 14, 2021

> image
>
> Was the dataset truncated on purpose?

No, it was already truncated in this branch. I thought you had done that on purpose.

@nicl-nno
Collaborator

No, you can restore it to how it was.

Collaborator

@nicl-nno nicl-nno left a comment

Please restore scoring_train to how it was; otherwise everything looks fine.

@MAGLeb MAGLeb merged commit 07c14b1 into master Sep 14, 2021
@nicl-nno nicl-nno deleted the encoding_bug branch October 19, 2021 17:27