Fractional split duplication bug #24

Merged

xandaau (Collaborator) commented Feb 1, 2023

This bug was described in #10.

Splitter now checks for duplicates in id_column on each run. If any are found, an error is raised.

It seems logically correct that in almost all group-splitting problems we want a unique id for each object (row).
In special cases, one can create an additional column with a unique id, which is much simpler than handling issues with hashing, index search, etc. inside Splitter methods.
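
A minimal sketch of the idea described above, not the actual Splitter code: plain Python rows stand in for a dataframe, and the helper names `find_duplicated_ids`, `check_unique_ids`, and `add_row_id`, as well as the `row_id` column, are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Any, Dict, List


def find_duplicated_ids(rows: List[Dict[str, Any]], id_column: str) -> List[Any]:
    """Return the id values that occur more than once, in first-seen order."""
    counts = Counter(row[id_column] for row in rows)
    return [value for value, n in counts.items() if n > 1]


def check_unique_ids(rows: List[Dict[str, Any]], id_column: str) -> None:
    """Raise ValueError if id_column contains duplicated values (hypothetical helper)."""
    duplicated = find_duplicated_ids(rows, id_column)
    if duplicated:
        raise ValueError(f"Column '{id_column}' contains duplicated ids: {duplicated}")


def add_row_id(rows: List[Dict[str, Any]], new_column: str = "row_id") -> List[Dict[str, Any]]:
    """Workaround for special cases: attach a unique surrogate id to every row."""
    return [{**row, new_column: i} for i, row in enumerate(rows)]


rows = [
    {"user_id": 1, "metric": 10.0},
    {"user_id": 2, "metric": 12.5},
    {"user_id": 2, "metric": 11.0},  # duplicated id
]

try:
    check_unique_ids(rows, "user_id")
except ValueError as err:
    print(err)  # the split would be rejected at this point

# Special case: split on a freshly created unique column instead.
rows_with_id = add_row_id(rows)
check_unique_ids(rows_with_id, "row_id")  # passes silently
```

Failing early on duplicated ids keeps the split logic itself free of hashing or index-search special cases; callers who really need repeated ids can split on a surrogate column such as the one built by `add_row_id`.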

xandaau changed the base branch from main to dev February 1, 2023 15:38

codecov-commenter commented Feb 1, 2023

Codecov Report

Merging #24 (1600383) into dev (d6b47dd) will decrease coverage by 0.03%.
The diff coverage is 75.00%.

@@            Coverage Diff             @@
##              dev      #24      +/-   ##
==========================================
- Coverage   84.23%   84.20%   -0.03%     
==========================================
  Files          42       42              
  Lines        2988     2995       +7     
==========================================
+ Hits         2517     2522       +5     
- Misses        471      473       +2     
Impacted Files                         Coverage Δ
ambrosia/spark_tools/split_tools.py    76.31% <50.00%> (-0.72%) ⬇️
ambrosia/tools/split_tools.py          81.48% <80.00%> (-0.05%) ⬇️
ambrosia/splitter/handlers.py          92.85% <100.00%> (ø)

xandaau merged commit 789b3c1 into MobileTeleSystems:dev Feb 2, 2023
xandaau added a commit that referenced this pull request Feb 15, 2023
* tests actions on pull request added, test badge updated

* Python version dependency limited to be <3.11

* Preprocessing enhancement (#22)

* Preprocessing enhancement first iteration

* check_cols call corrected

* Robust classes improved, tests added

* cuped structure improved, tests namings improved

* Preprocessing usage example updated

* MLVR class save load dict methods implemented

* chain smoke test for preprocessor added

* fix test

* chain smoke test for preprocessor added

* files linted

* Docstrings improved

* code decomposition

* docstring and typing added

* Old preprocessor is removed

* docstrings improved

* Temp class for logging added

* docs parsing for preprocessing classes added and updated

* .rst file modified

* version updated

* Robust Preprocessors and Metric Transformers separation (#23)

* Fractional split duplication bug (#24)

* Added check for id uniqueness in splitter

* Dataframe id_column access fix

* duplicated function removed

* Fittable aggregate preprocessor (#25)

* _check_columns method renamed

* Fittable aggregatepreprocessor

* aggregation tests and class improvement

* fix binary theory with groups_ratio/alternative/stabilizing

* Update README.rst

Telegram channel link added

* add docstrings and project dependencies

* fix poetry version

* fix imports [isort] && fix tests

* fix linters[black]

* Docstrings improved, method check added

* fix bug forgot first error for binary design

* Multiple alpha power design for binary intervals methods

* Incorrect variables types changed

* Storable Preprocessor and VarReduction classes changes (#29)

* Storable Preprocessor and VarReduction classes changes

* Unused dict with preproc classes removed

* Alternatives namings changed in `scipy` way, notebooks updated (#31)

* Alternatives namings changed in scipy way, notebooks updated

* Imports fixed

* 0.3.0 version descr added to changelog

---------

Co-authored-by: Байрамкулов Аслан Магомедович <ambajramku@www.mts.ru>
Co-authored-by: Aslan Bayramkulov <aslan.bayramkulov96@gmail.com>
Co-authored-by: Artem Vasin <a.vasin@ira-labs.com>
xandaau added a commit that referenced this pull request Apr 13, 2023
* Not executable Preprocessor data agg bug fixed

---------

Co-authored-by: Байрамкулов Аслан Магомедович <ambajramku@www.mts.ru>
Co-authored-by: Aslan Bayramkulov <aslan.bayramkulov96@gmail.com>
Co-authored-by: Artem Vasin <a.vasin@ira-labs.com>
xandaau added a commit that referenced this pull request Apr 13, 2023
* Regression tests for duplicated ids for split

---------

Co-authored-by: Байрамкулов Аслан Магомедович <ambajramku@www.mts.ru>
Co-authored-by: Aslan Bayramkulov <aslan.bayramkulov96@gmail.com>
Co-authored-by: Artem Vasin <a.vasin@ira-labs.com>
xandaau added a commit that referenced this pull request Apr 13, 2023
* Methods for convenient dumping added

* Tests added for classes load and dump to yaml abilities

* Docstring unified

* linted

---------

Co-authored-by: Байрамкулов Аслан Магомедович <ambajramku@www.mts.ru>
Co-authored-by: Aslan Bayramkulov <aslan.bayramkulov96@gmail.com>
Co-authored-by: Artem Vasin <a.vasin@ira-labs.com>
xandaau added a commit that referenced this pull request Apr 13, 2023
* Theoretical tools params order is fixed
and parameters names are corrected

* Decorators for parallelism added

* Global namings for design result added

* Binary result tables view modified

* Parallelism parameters handling added

* Changed bootstrap parallel strategy and other parameters

* Changed parallelism,
added group_ratio parameter,
unified views of resulted tables

* Func name corrected

* Helpers modified to handle ``groups_size`` parameter

* Designer class and func modified to handle all changes made
Tests corrected

* Fixed broken import

* Bootstrap params modified, utility func moved to back tools

* as_numeric parameter invocation changed

* Binary approach output tables unified

* Multiprocessing handling wrap func changed

---------

Co-authored-by: Байрамкулов Аслан Магомедович <ambajramku@www.mts.ru>
Co-authored-by: Aslan Bayramkulov <aslan.bayramkulov96@gmail.com>
Co-authored-by: Artem Vasin <a.vasin@ira-labs.com>
xandaau deleted the fractional_split_duplication_bug branch April 13, 2023 21:30
xandaau added a commit that referenced this pull request Apr 16, 2023
* optional spark pkg

* fix Make for extras

* remove lock

* Linted

---------

Co-authored-by: Хакимов Артем Валерьевич <artem.khakimov@gmail.com>
Co-authored-by: Artem Khakimov <83234014+xandaau@users.noreply.github.com>
Co-authored-by: Байрамкулов Аслан Магомедович <ambajramku@www.mts.ru>
Co-authored-by: Aslan Bayramkulov <aslan.bayramkulov96@gmail.com>
Co-authored-by: Artem Vasin <a.vasin@ira-labs.com>
xandaau added a commit that referenced this pull request Apr 17, 2023
* optional spark pkg

* fix Make for extras

* remove lock

* spark tester

* Relative effect for ttest_ind spark

* Swap Spark Ttest groups for correct pvalue calc
Remove errors in merge

* Decrease effect in test

* Back tools import fixed

---------

Co-authored-by: Хакимов Артем Валерьевич <artem.khakimov@gmail.com>
Co-authored-by: Artem Khakimov <83234014+xandaau@users.noreply.github.com>
Co-authored-by: Байрамкулов Аслан Магомедович <ambajramku@www.mts.ru>
Co-authored-by: Aslan Bayramkulov <aslan.bayramkulov96@gmail.com>
Co-authored-by: Artem Vasin <a.vasin@ira-labs.com>
xandaau added a commit that referenced this pull request Apr 21, 2023
* Docs and tutorials changed, part I

* another doc and usage examples update

* First errors naming unified to Designer

* Deterministic bootstrap behaviour added

* Tester CI MHC change denied

* Configs added

* Agg watched data dumped

* docs .rst files updated

* .nblinks to usage examples added

* power func output corrected

* Test for groups ratio and alternatives added

* README is updated

* Usage examples updated

* CHANGELOG updated

* update

---------

Co-authored-by: Байрамкулов Аслан Магомедович <ambajramku@www.mts.ru>
Co-authored-by: Aslan Bayramkulov <aslan.bayramkulov96@gmail.com>
Co-authored-by: Artem Vasin <a.vasin@ira-labs.com>
xandaau added a commit that referenced this pull request Apr 21, 2023
xandaau added a commit that referenced this pull request Apr 21, 2023
xandaau added a commit that referenced this pull request Apr 21, 2023
xandaau added a commit that referenced this pull request Apr 21, 2023
xandaau added a commit that referenced this pull request Apr 21, 2023
* tests actions on pull request added, test badge updated

* Python version dependency limited to be <3.11

* Preprocessing enhancement (#22)

* Preprocessing enhancement first iteration

* check_cols call corrected

* Robust classes improved, tests added

* cuped structure improved, tests namings improved

* Preprocessing usage example updated

* MLVR class save load dict methods implemented

* chain smoke test for preprocessor added

* fix test

* chain smoke test for preprocessor added

* files linted

* Docstrings improved

* code decomposition

* docstring and typing added

* Old preprocessor is removed

* docstrings improved

* Temp class for logging added

* docs parsing for preprocessing classes added and updated

* .rst file modified

* version updated

* Robust Preprocessors and Metric Transformers separation (#23)

* Fractional split duplication bug (#24)

* Added check for id uniqueness in splitter

* Dataframe id_column access fix

* duplicated function removed

* Fittable aggregate preprocessor (#25)

* _check_columns method renamed

* Fittable aggregatepreprocessor

* aggregation tests and class improvement

* fix binary theory with groups_ratio/alternative/stabilizing

* Update README.rst

Telegram channel link added

* add docstrings and project dependencies

* fix poetry version

* fix imports [isort] && fix tests

* fix linters[black]

* Docstrings improved, method check added

* fix bug forgot first error for binary design

* Multiple alpha power design for binary intervals methods

* Incorrect variables types changed

* Storable Preprocessor and VarReduction classes changes (#29)

* Storable Preprocessor and VarReduction classes changes

* Unused dict with preproc classes removed

* Alternatives namings changed in scipy way, notebooks updated

* Imports fixed

* Theoretical tools params order is fixed
and parameters names are corrected

* Decorators for parallelism added

* Glabal namings for design result added

* Binary result tables view modified

* Parallelism parameters handling added

* Changed bootstarap parallel strategy and other parameters

* Changed parallelism,
added group_ratio parameter,
unified views of resulting tables

* Func name corrected

* Helpers modified to handle ``groups_size`` parameter

* Designer class and func modified to handle all changes made
Tests corrected

* Fixed broken import

* Bootstrap params modified, utility func moved to back tools

* as_numeric parameter invocation changed

* Binary approach output tables unified

* Multiprocessing handling wrap func changed

---------

Co-authored-by: Байрамкулов Аслан Магомедович <ambajramku@www.mts.ru>
Co-authored-by: Aslan Bayramkulov <aslan.bayramkulov96@gmail.com>
Co-authored-by: Artem Vasin <a.vasin@ira-labs.com>
xandaau added a commit that referenced this pull request Apr 21, 2023
* tests actions on pull request added, test badge updated

* Python version dependency limited to be <3.11

* Preprocessing enhancement (#22)

* Preprocessing enhancement first iteration

* check_cols call corrected

* Robust classes improved, tests added

* cuped structure improved, tests namings improved

* Preprocessing usage example updated

* MLVR class save load dict methods implemented

* chain smoke test for preprocessor added

* fix test

* chain smoke test for preprocessor added

* files linted

* Docstrings improved

* code decomposition

* docstring and typing added

* Old preprocessor is removed

* docstrings improved

* Temp class for logging added

* docs parsing for preprocessing classes added and updated

* .rst file modified

* version updated

* Robust Preprocessors and Metric Transformers separation (#23)

* Fractional split duplication bug (#24)

* Added check for id uniqueness in splitter

* Dataframe id_column access fix

* duplicated function removed

* Fittable aggregate preprocessor (#25)

* _check_columns method renamed

* Fittable aggregate preprocessor

* aggregation tests and class improvement

* fix binary theory with groups_ratio/alternative/stabilizing

* Update README.rst

Telegram channel link added

* add docstrings and project dependencies

* fix poetry version

* fix imports [isort] && fix tests

* fix linters[black]

* Docstrings improved, method check added

* fix bug forgot first error for binary design

* Multiple alpha power design for binary intervals methods

* Incorrect variables types changed

* Storable Preprocessor and VarReduction classes changes (#29)

* Storable Preprocessor and VarReduction classes changes

* Unused dict with preproc classes removed

* Alternatives namings changed in `scipy` way, notebooks updated (#31)

* Alternatives namings changed in scipy way, notebooks updated

* Imports fixed

* 0.3.0 version descr added to changelog

* optional spark pkg

* fix Make for extras

* remove lock

* Linted

---------

Co-authored-by: Хакимов Артем Валерьевич <artem.khakimov@gmail.com>
Co-authored-by: Artem Khakimov <83234014+xandaau@users.noreply.github.com>
Co-authored-by: Байрамкулов Аслан Магомедович <ambajramku@www.mts.ru>
Co-authored-by: Aslan Bayramkulov <aslan.bayramkulov96@gmail.com>
Co-authored-by: Artem Vasin <a.vasin@ira-labs.com>
xandaau added a commit that referenced this pull request Apr 21, 2023
* tests actions on pull request added, test badge updated

* Python version dependency limited to be <3.11

* Preprocessing enhancement (#22)

* Preprocessing enhancement first iteration

* check_cols call corrected

* Robust classes improved, tests added

* cuped structure improved, tests namings improved

* Preprocessing usage example updated

* MLVR class save load dict methods implemented

* chain smoke test for preprocessor added

* fix test

* chain smoke test for preprocessor added

* files linted

* Docstrings improved

* code decomposition

* docstring and typing added

* Old preprocessor is removed

* docstrings improved

* Temp class for logging added

* docs parsing for preprocessing classes added and updated

* .rst file modified

* version updated

* Robust Preprocessors and Metric Transformers separation (#23)

* Fractional split duplication bug (#24)

* Added check for id uniqueness in splitter

* Dataframe id_column access fix

* duplicated function removed

* Fittable aggregate preprocessor (#25)

* _check_columns method renamed

* Fittable aggregate preprocessor

* aggregation tests and class improvement

* fix binary theory with groups_ratio/alternative/stabilizing

* Update README.rst

Telegram channel link added

* add docstrings and project dependencies

* fix poetry version

* fix imports [isort] && fix tests

* fix linters[black]

* Docstrings improved, method check added

* fix bug forgot first error for binary design

* Multiple alpha power design for binary intervals methods

* Incorrect variables types changed

* Storable Preprocessor and VarReduction classes changes (#29)

* Storable Preprocessor and VarReduction classes changes

* Unused dict with preproc classes removed

* Alternatives namings changed in `scipy` way, notebooks updated (#31)

* Alternatives namings changed in scipy way, notebooks updated

* Imports fixed

* 0.3.0 version descr added to changelog

* optional spark pkg

* fix Make for extras

* remove lock

* spark tester

* Relative effect for ttest_ind spark

* Swap Spark Ttest groups for correct pvalue calc
Remove errors in merge

* Decrease effect in test

* Back tools import fixed

---------

Co-authored-by: Хакимов Артем Валерьевич <artem.khakimov@gmail.com>
Co-authored-by: Artem Khakimov <83234014+xandaau@users.noreply.github.com>
Co-authored-by: Байрамкулов Аслан Магомедович <ambajramku@www.mts.ru>
Co-authored-by: Aslan Bayramkulov <aslan.bayramkulov96@gmail.com>
Co-authored-by: Artem Vasin <a.vasin@ira-labs.com>
xandaau added a commit that referenced this pull request Apr 21, 2023
* tests actions on pull request added, test badge updated

* Python version dependency limited to be <3.11

* Preprocessing enhancement (#22)

* Preprocessing enhancement first iteration

* check_cols call corrected

* Robust classes improved, tests added

* cuped structure improved, tests namings improved

* Preprocessing usage example updated

* MLVR class save load dict methods implemented

* chain smoke test for preprocessor added

* fix test

* chain smoke test for preprocessor added

* files linted

* Docstrings improved

* code decomposition

* docstring and typing added

* Old preprocessor is removed

* docstrings improved

* Temp class for logging added

* docs parsing for preprocessing classes added and updated

* .rst file modified

* version updated

* Robust Preprocessors and Metric Transformers separation (#23)

* Fractional split duplication bug (#24)

* Added check for id uniqueness in splitter

* Dataframe id_column access fix

* duplicated function removed

* Fittable aggregate preprocessor (#25)

* _check_columns method renamed

* Fittable aggregate preprocessor

* aggregation tests and class improvement

* fix binary theory with groups_ratio/alternative/stabilizing

* Update README.rst

Telegram channel link added

* add docstrings and project dependencies

* fix poetry version

* fix imports [isort] && fix tests

* fix linters[black]

* Docstrings improved, method check added

* fix bug forgot first error for binary design

* Multiple alpha power design for binary intervals methods

* Incorrect variables types changed

* Storable Preprocessor and VarReduction classes changes (#29)

* Storable Preprocessor and VarReduction classes changes

* Unused dict with preproc classes removed

* Alternatives namings changed in scipy way, notebooks updated

* Imports fixed

* Docs and tutorials changed, part I

* another doc and usage examples update

* First errors naming unified to Designer

* Deterministic bootstrap behaviour added

* Tester CI MHC change denied

* Configs added

* Agg watched data dumped

* docs .rst files updated

* .nblinks to usage examples added

* power func output corrected

* Test for groups ratio and alternatives added

* README is updated

* Usage examples updated

* CHANGELOG updated

* update

---------

Co-authored-by: Байрамкулов Аслан Магомедович <ambajramku@www.mts.ru>
Co-authored-by: Aslan Bayramkulov <aslan.bayramkulov96@gmail.com>
Co-authored-by: Artem Vasin <a.vasin@ira-labs.com>