Applying to iloc of pandas data frame #500

tetrar124 · 2019-02-10T18:19:17Z

Description

Bug fix for selecting columns and add new features for the name of the columns of pandas data frame.

Related issues or pull requests

Pull Request Checklist

Added a note about the modification or contribution to the ./docs/sources/CHANGELOG.md file (if applicable)
Added appropriate unit test functions in the ./mlxtend/*/tests directories (if applicable)
Modify documentation in the corresponding Jupyter Notebook under mlxtend/docs/sources/ (if applicable)
Ran nosetests ./mlxtend -sv and make sure that all unit tests pass (for small modifications, it might be sufficient to only run the specific test file, e.g., nosetests ./mlxtend/classifier/tests/test_stacking_cv_classifier.py -sv)
Checked for style issues by running flake8 ./mlxtend

rasbt#499 Bug fix for selecting columns from the list and it works not only a list but the index of pandas.

pep8speaks · 2019-02-10T18:19:19Z

Hello @tetrar124! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on February 13, 2019 at 07:57 Hours UTC

add error message and comment for using it

rasbt · 2019-02-10T20:27:57Z

Thanks for the PR. However, instead doing a try-except, I think the following could solve the problem better? (Haven't tested it, but what do you think? Does that cover your use case issue?)

        if hasattr(X, 'loc') or hasattr(X, 'iloc'):
            types = {type(i) for i in self.cols}
            if len(types) > 1:
                raise ValueError('Elements in `cols` should be all of the same data type.')
            if isinstance(cols[0], int):
                t = X.iloc[:, self.cols].values
            elif isinstance(cols[0], str):
                t = X.loc[:, self.cols].values
           else:
               raise ValueError('Elements in `cols` should be either `int` or `str`.')

tetrar124 · 2019-02-11T04:31:41Z

I checked your code on my environment, your code works well (I changed only cols[0] to self.cols[0]).
And I added 2 lines in this code for using a tuple with iloc because of using a tuple in the user's guide.
http://rasbt.github.io/mlxtend/user_guide/feature_selection/ColumnSelector/

        if hasattr(X, 'loc') or hasattr(X, 'iloc'):
            if type(self.cols) == tuple:
               self.cols = list(self.cols)
            types = {type(i) for i in self.cols}
            if len(types) > 1:
                raise ValueError('Elements in `cols` should be all of the same data type.')
            if isinstance(self.cols[0], int):
                t = X.iloc[:, self.cols].values
            elif isinstance(self.cols[0], str):
                t = X.loc[:, self.cols].values
            else:
               raise ValueError('Elements in `cols` should be either `int` or `str`.')

Thank you for your cooperation.

Update for pandas iloc

rasbt · 2019-02-12T08:27:05Z

Thanks for working on this PR! Just as a tip to make the debugging smoother/faster (since the CI is very slow), you can also test your code locally by running

nosetests mlxtend/feature_selection

to see if there are any issues with the code.

comment

coveralls · 2019-02-12T15:37:41Z

Coverage decreased (-0.009%) to 91.587% when pulling 09169ef on tetrar124:patch-4 into 6b57a73 on rasbt:master.

for coverage

mlxtend/feature_selection/column_selector.py

elif to if

tetrar124 · 2019-02-12T16:32:29Z

Sorry, I return elif to if. The reason for using elif is only for the coverage. I checked new lines with nose and flask8, and there are no errors other than 'line too long' from the new comment lines for ValueError.

rasbt · 2019-02-12T16:39:08Z

Ok thanks :)! Btw don't worry about minor changes regarding the coverage too much. Can you please make the line < 79 chars though (splitting it over two lines) -- we decided to strictly enforce PEP8 at some point so that the library won't become too messy over time by making exceptions.

rasbt · 2019-02-12T16:52:39Z

Could you also add an additional unit test for using a DataFrame with integer column names to make sure it works correctly? (I think we don't have that one yet) :)

only new lines in the parenthesis

tetrar124 · 2019-02-12T17:56:08Z

I made the data frame with columns whose names were only integer values, and I checked the type of these columns names showed int.
In this situation, there was no bug but this code has worked only as the mode of iloc.
(I think it was correct work by this code)

rasbt · 2019-02-13T07:56:51Z

Thanks for testing. We should add a unit test though so that we can ensure that the behavior doesn't break in future.

I just added a unit test and changelog entry to this PR. Let me know if you have any further comments. Otherwise, I think it's good to merge if/when the CI tests pass. Thanks for the PR!

bug fix and applying to the index of pandas

7a1652a

rasbt#499 Bug fix for selecting columns from the list and it works not only a list but the index of pandas.

tetrar124 added 2 commits February 11, 2019 03:58

add error message and comment

c542267

add error message and comment for using it

add error message and comment

1503094

add error message and comment for using it

tetrar124 changed the title ~~bug fix and applying to the index of pandas~~ bug fix and applying to the name of the columns of pandas data frame Feb 10, 2019

tetrar124 mentioned this pull request Feb 11, 2019

ColumnSelector has the bug to select columns #499

Closed

Update for pandas iloc

4b8b3e5

Update for pandas iloc

tetrar124 changed the title ~~bug fix and applying to the name of the columns of pandas data frame~~ Applying to iloc of pandas data frame Feb 12, 2019

Update for pandas iloc

c756e73

Update for pandas iloc

comment

d217a3a

comment

if to elif

6b738df

for coverage

rasbt approved these changes Feb 12, 2019

View reviewed changes

mlxtend/feature_selection/column_selector.py Outdated Show resolved Hide resolved

rasbt requested changes Feb 12, 2019

View reviewed changes

mlxtend/feature_selection/column_selector.py Show resolved Hide resolved

elif to if

ce7cb6e

elif to if

rasbt approved these changes Feb 12, 2019

View reviewed changes

For PEP8

6b4548b

only new lines in the parenthesis

add unit test and changelog note

f9f80d2

Merge branch 'master' into patch-4

09169ef

rasbt merged commit 1ca3059 into rasbt:master Feb 13, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Applying to iloc of pandas data frame #500

Applying to iloc of pandas data frame #500

tetrar124 commented Feb 10, 2019 •

edited

Loading

pep8speaks commented Feb 10, 2019 •

edited

Loading

rasbt commented Feb 10, 2019

tetrar124 commented Feb 11, 2019 •

edited

Loading

rasbt commented Feb 12, 2019

coveralls commented Feb 12, 2019 •

edited

Loading

tetrar124 commented Feb 12, 2019

rasbt commented Feb 12, 2019

rasbt commented Feb 12, 2019

tetrar124 commented Feb 12, 2019 •

edited

Loading

rasbt commented Feb 13, 2019

Applying to iloc of pandas data frame #500

Applying to iloc of pandas data frame #500

Conversation

tetrar124 commented Feb 10, 2019 • edited Loading

Description

Related issues or pull requests

Pull Request Checklist

pep8speaks commented Feb 10, 2019 • edited Loading

Comment last updated on February 13, 2019 at 07:57 Hours UTC

rasbt commented Feb 10, 2019

tetrar124 commented Feb 11, 2019 • edited Loading

rasbt commented Feb 12, 2019

coveralls commented Feb 12, 2019 • edited Loading

tetrar124 commented Feb 12, 2019

rasbt commented Feb 12, 2019

rasbt commented Feb 12, 2019

tetrar124 commented Feb 12, 2019 • edited Loading

rasbt commented Feb 13, 2019

tetrar124 commented Feb 10, 2019 •

edited

Loading

pep8speaks commented Feb 10, 2019 •

edited

Loading

tetrar124 commented Feb 11, 2019 •

edited

Loading

coveralls commented Feb 12, 2019 •

edited

Loading

tetrar124 commented Feb 12, 2019 •

edited

Loading