"errors" parameter has no effect on pd.DataFrame.astype if a dictionary is passed in as the dtype argument #25905

Closed
nandrea1 opened this issue Mar 28, 2019 · 2 comments

Comments

@nandrea1

commented Mar 28, 2019

Code Sample, a copy-pastable example if possible

Given the code sample below:

import pandas as pd

data_df = pd.DataFrame([{'col_a': '1', 'col_b': '16.5%', 'col_c': 'test'},
                        {'col_a': '2.2', 'col_b': '15.3', 'col_c': 'another_test'}
                       ])
type_dict = {'col_a': 'float64', 'col_b': 'float64', 'col_c': 'object'}
data_df = data_df.astype(dtype=type_dict, errors='ignore')

Problem description

Pandas raises ValueError: could not convert string to float: '16.5%' even though errors='ignore' was passed in. This is probably due to the implementation of astype in pandas/core/generic.py, lines 5658 - 5680: in the dict-like branch, the per-column call col.astype(dtype[col_name], copy=copy) does not forward the errors argument.

        if is_dict_like(dtype):
            if self.ndim == 1:  # i.e. Series
                if len(dtype) > 1 or self.name not in dtype:
                    raise KeyError('Only the Series name can be used for '
                                   'the key in Series dtype mappings.')
                new_type = dtype[self.name]
                return self.astype(new_type, copy, errors, **kwargs)
            elif self.ndim > 2:
                raise NotImplementedError(
                    'astype() only accepts a dtype arg of type dict when '
                    'invoked on Series and DataFrames. A single dtype must be '
                    'specified when invoked on a Panel.'
                )
            for col_name in dtype.keys():
                if col_name not in self:
                    raise KeyError('Only a column name can be used for the '
                                   'key in a dtype mappings argument.')
            results = []
            for col_name, col in self.iteritems():
                if col_name in dtype:
                    results.append(col.astype(dtype[col_name], copy=copy))
                else:
                    results.append(results.append(col.copy() if copy else col))
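
To make the suspected cause concrete, here is a simplified pure-Python model of the per-column loop quoted above, with `errors` forwarded to each column cast. It is only a sketch: `cast_column` and `astype_by_mapping` are hypothetical names, columns are plain lists, and target types are plain callables such as `float`. It is not the pandas implementation, nor the change made in #25888.

```python
def cast_column(values, caster, errors='raise'):
    """Cast every value in a column; on failure, honor the errors policy."""
    try:
        return [caster(v) for v in values]
    except (ValueError, TypeError):
        if errors == 'ignore':
            # Give back the original, uncast column.
            return values
        raise


def astype_by_mapping(columns, dtype, errors='raise'):
    """Cast the columns named in `dtype`, forwarding `errors` to each cast."""
    for col_name in dtype:
        if col_name not in columns:
            raise KeyError('Only a column name can be used for the '
                           'key in a dtype mappings argument.')
    results = {}
    for col_name, col in columns.items():
        if col_name in dtype:
            # The key difference from the quoted loop: errors is passed along.
            results[col_name] = cast_column(col, dtype[col_name], errors=errors)
        else:
            results[col_name] = col
    return results


columns = {
    'col_a': ['1', '2.2'],
    'col_b': ['16.5%', '15.3'],
    'col_c': ['test', 'another_test'],
}
casted = astype_by_mapping(columns, {'col_a': float, 'col_b': float},
                           errors='ignore')
```

In this model `col_a` is converted to floats, while `col_b` comes back unchanged because `'16.5%'` cannot be parsed and `errors='ignore'` suppresses the exception. Dropping the `errors=errors` argument from the inner call reproduces the behavior reported in this issue.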

Expected Output

One would expect the DataFrame to be cast successfully, leaving col_b as object dtype.
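
On versions affected by this issue, one possible workaround is to cast column by column and skip any column whose cast fails. The helper name `astype_ignore_errors` below is hypothetical and not part of pandas; this is a minimal sketch that only handles `ValueError` and `TypeError`.

```python
import pandas as pd


def astype_ignore_errors(df, dtype_map):
    """Cast each column in dtype_map; keep the original column if the cast fails."""
    out = df.copy()
    for col_name, target in dtype_map.items():
        try:
            out[col_name] = out[col_name].astype(target)
        except (ValueError, TypeError):
            # Mirror the intent of errors='ignore': leave this column untouched.
            pass
    return out


data_df = pd.DataFrame([
    {'col_a': '1', 'col_b': '16.5%', 'col_c': 'test'},
    {'col_a': '2.2', 'col_b': '15.3', 'col_c': 'another_test'},
])
type_dict = {'col_a': 'float64', 'col_b': 'float64', 'col_c': 'object'}
result = astype_ignore_errors(data_df, type_dict)
print(result.dtypes)
```

Here `col_a` becomes `float64`, and `col_b` keeps its original string values because `'16.5%'` cannot be converted.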

@WillAyd

Member

commented Mar 28, 2019

I believe this is what #25888 is trying to solve

johnklehm added a commit to johnklehm/pandas that referenced this issue Mar 29, 2019

Update the test data to reflect pandas-dev#25905. Add an explicit check for the absence of the exception we'd normally throw. Assert the resulting dataframe against a reference dataframe. Move the test case to test_dtypes.
@WillAyd

Member

commented Apr 1, 2019

Closed via #25888

@WillAyd WillAyd closed this Apr 1, 2019
