[ENH] minor fixes #1219

Merged
merged 46 commits into from
Jan 28, 2023

@samukweku (Collaborator) commented Dec 2, 2022

PR Description

Please describe the changes proposed in the pull request:

  • minor fixes for drop_constant_columns and get_dupes
  • improve performance of select when all entries are scalars of the same dtype
  • move is now more flexible: it accepts the select_columns syntax, so multiple columns/rows can be moved at once
  • avoid mutation in collapse_levels
  • impute now supports multiple columns, making it easy to deprecate fill_empty
  • fix the deprecation warning for np.bool8
  • simplify column selection by dtype (is_numeric_dtype, ...)
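
To illustrate the "multiple columns can be moved at once" item, here is a minimal sketch of the reordering idea in plain Python. It is not the code from this PR: `move_labels` is a hypothetical helper, it works on a list of labels rather than a DataFrame, and it assumes a single target label even though the PR also covers list targets.

```python
from typing import Hashable, List, Sequence


def move_labels(
    labels: Sequence[Hashable],
    source: Sequence[Hashable],
    target: Hashable,
    position: str = "before",
) -> List[Hashable]:
    """Hypothetical helper: place every label in `source` before/after `target`."""
    if position not in ("before", "after"):
        raise ValueError("position must be 'before' or 'after'")
    if target in source:
        raise ValueError("target must not be one of the moved labels")
    missing = [label for label in (*source, target) if label not in labels]
    if missing:
        raise KeyError(f"labels not found: {missing}")
    moved = set(source)
    # Labels that stay put keep their original relative order.
    remaining = [label for label in labels if label not in moved]
    anchor = remaining.index(target)
    if position == "after":
        anchor += 1
    # Splice the moved labels in as one block, in the order the caller gave.
    return remaining[:anchor] + list(source) + remaining[anchor:]


columns = ["a", "b", "c", "d", "e"]
new_order = move_labels(columns, source=["d", "a"], target="c", position="before")
print(new_order)  # ['b', 'd', 'a', 'c', 'e']
```

With pandas, a new label order like this could then be applied with `df[new_order]`; how the merged `move` does it internally may differ.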

Please tag maintainers to review.

@ericmjl (Member) commented Dec 2, 2022

codecov bot commented Dec 2, 2022

Codecov Report

Merging #1219 (46b3a1f) into dev (9245f76) will increase coverage by 0.07%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##              dev    #1219      +/-   ##
==========================================
+ Coverage   97.69%   97.77%   +0.07%     
==========================================
  Files          78       78              
  Lines        3768     3767       -1     
==========================================
+ Hits         3681     3683       +2     
+ Misses         87       84       -3     

Resolved review threads:
  • tests/functions/test_impute.py
  • janitor/functions/move.py
  • tests/functions/test_encode_categorical.py (outdated, two threads)
  • mkdocs.yml (outdated)

@samukweku (Collaborator, Author) commented:

Apologies, my rebase included files that shouldn't be in there ... I definitely have to go back to basics with git again.

@thatlittleboy (Contributor) left a comment:

asian-man-cubicle-thumbsup

@ericmjl ericmjl merged commit 161c290 into dev Jan 28, 2023
samukweku added a commit that referenced this pull request Jan 29, 2023
* minor fix for drop_constant_columns

* minor fix for get_dupes

* minor fix for collapse_levels, primarily for speed

* fix test fails

* fix test fails

* vectorise collapse_levels some more for performance sake

* allow for mutation

* leave collapse_levels as is

* Update test_collapse_levels.py

* Update test_collapse_levels.py

* Update test_collapse_levels.py

* restore collapse_levels to before

* shortcut if all entries are strings in a list in a select call

* use get_indexer_for for lists that contain only strings in select

* make more robust by checking on scalar, instead of just strings

* improve comments

* rebase

* more edits

* remove extra check

* shortcut for *

* exclude api/utils from mkdocs

* exclude api/utils from mkdocs

* simplify move

* avoid mutation in collapse_levels

* make move more robust with select syntax

* docs

* fix docstring

* replicate fill_empty in impute - reduce duplication

* add tests

* fix doctest

* fix docstrings

* defer copy in pivot_wider to pd.pivot

* fix np.bool8 deprecation

* simplify dtype column selection

* fix warning msg output for change_type

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* rebase

* expose _select_index

* add parameters

* use get_index_labels where possible

* add test for multiple columns

* make column selection more robust for sequences

* add test for set/dict selection

* add test for move - both source and target are lists

* exclude utils from docs

* fix test fails

---------

Co-authored-by: samuel.oranyeli <samuel.oranyeli@grow.inc>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
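
The commits "use get_indexer_for for lists that contain only strings in select" and "make more robust by checking on scalar" describe a fast path for label lookup. The sketch below shows one way such a shortcut could look with pandas; `positions_for_labels` is a hypothetical helper written for illustration, not the function added in this PR, and the error message is an assumption.

```python
import pandas as pd
from pandas.api.types import is_scalar


def positions_for_labels(index: pd.Index, labels: list) -> list:
    """Hypothetical helper: map a list of scalar labels to integer positions."""
    if not all(is_scalar(label) for label in labels):
        raise TypeError("this fast path only handles scalar labels")
    # One vectorised lookup instead of one lookup per label.
    positions = index.get_indexer_for(labels)
    # get_indexer_for marks labels it cannot find with -1.
    missing = [label for label, pos in zip(labels, positions) if pos == -1]
    if missing:
        raise KeyError(f"No match was returned for {missing}")
    return positions.tolist()


df = pd.DataFrame({"a": [1], "b": [2], "c": [3], "d": [4]})
positions = positions_for_labels(df.columns, ["c", "a"])
selected = df.iloc[:, positions]
print(list(selected.columns))  # ['c', 'a']
```

The result keeps the order in which the labels were requested. Non-scalar entries (slices, callables, regular expressions) would still need the slower general path.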
samukweku added a commit with the same commit message that referenced this pull request Feb 1, 2023
3 participants