case insensitive bool parsing #1295

Closed
changhiskhan opened this issue May 23, 2012 · 3 comments

Comments

@changhiskhan
Contributor

commented May 23, 2012

From mailing list:

This is something that surprised me.
When reading from CSV, the values "True" and "False" are converted to bool, while "TRUE" and "FALSE" are left as object.
Using a converter {...: np.bool} I could take it into account.
Is it useful to take more possibilities into account, knowing that there are a lot of possibilities: True/False, Yes/No, ...?

In [19]: data = '''A;B
....: True;False
....: False;True'''
In [20]: df = pd.read_csv(StringIO(data), sep=';')
In [22]: df.dtypes
Out[22]:
A bool
B bool

In [23]: data = '''A;B
....: TRUE;FALSE
....: FALSE;TRUE'''

In [24]: df = pd.read_csv(StringIO(data), sep=';')

In [25]: df.dtypes
Out[25]:
A object
B object

In [27]: df = pd.read_csv(StringIO(data), sep=';', converters={'A':np.bool})

In [28]: df.dtypes
Out[28]:
A bool
B object
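
One caveat about the converter shown above: at the time, `np.bool` was an alias for Python's built-in `bool`, which tests truthiness, so every non-empty string (including "FALSE") converts to `True`. The column gets a bool dtype, but the values are not what the file says. A minimal sketch of a case-insensitive converter follows. The name `parse_bool` and the sets of accepted spellings are assumptions for illustration, not the values pandas adopted, and the standard-library `csv` module stands in for `read_csv` so the example is self-contained.

```python
import csv
from io import StringIO

# Hypothetical spelling sets, chosen for illustration only.
TRUE_STRINGS = {"true", "yes"}
FALSE_STRINGS = {"false", "no"}


def parse_bool(text):
    """Map a string to bool regardless of case; raise on anything unrecognized."""
    key = text.strip().lower()
    if key in TRUE_STRINGS:
        return True
    if key in FALSE_STRINGS:
        return False
    raise ValueError(f"not a recognized boolean: {text!r}")


# Same upper-case data as in the report above.
data = "A;B\nTRUE;FALSE\nFALSE;TRUE"

# Parse every cell with the case-insensitive converter.
rows = [
    {name: parse_bool(value) for name, value in row.items()}
    for row in csv.DictReader(StringIO(data), delimiter=";")
]

# The built-in bool only checks for a non-empty string.
naive = bool("FALSE")
```

The same function could be passed to `read_csv` per column, for example `converters={'A': parse_bool, 'B': parse_bool}`. Raising on unknown input, rather than falling back to truthiness, keeps a misspelled value from silently becoming `True`.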

@moleary

Contributor

commented Jul 25, 2012

Hi,

I'm new to pandas, but I'd like to get involved. Is this something I could take a crack at?

Mark

@changhiskhan

Contributor Author

commented Jul 25, 2012

Absolutely. If you haven't already done so, a good starting point would be to read the pandas developer page and the pandas examples on read_csv etc., and then dive into the pandas.io.parsers module.
We look forward to your pull request!

@moleary

Contributor

commented Jul 25, 2012

Okay, I'll take a look at it. Thanks.

moleary added a commit to moleary/pandas that referenced this issue Jul 26, 2012

@wesm wesm closed this Sep 18, 2012

yarikoptic added a commit to neurodebian/pandas that referenced this issue Sep 27, 2012
Merge commit 'v0.8.1-203-g67121af' into debian
* commit 'v0.8.1-203-g67121af': (193 commits)
  BUG: DataFrame column formatting issue in length-truncated column close pandas-dev#1906
  BUG: override min/max in DatetimeIndex to function as expected close pandas-dev#1895
  BUG: DataFrame mixed-type arithmetic column-wise, fix DataFrame.diff upcasting->object bug close pandas-dev#1896
  BUG: treat nobs=1 >= min_periods case in rolling_std/variance as 0 trivially. close pandas-dev#1884
  TST: skip to_file test if URLError occurs on some systems
  VB: resolve test name conflict and update make script
  DOC: minor change to build script to help auto build process
  DOC: fixed extlinks in sphinx conf
  TST: oops import in wrong place
  TST: skip test_console_encode if sys.stdin.encoding is None
  TST: unit test for pandas-dev#1902 and default to csv.QUOTE_MINIMAL
  Make it possible to set quoting for to_csv
  ENH: clean up pandas-dev#1691 changes, rls note
  ENH: add more possible bool values to read_csv pandas-dev#1295
  BUG: fix rolling_max/min for small inputs and large windows. Add a check that the min_period <= window size. Fixes pandas-dev#1897.
  Mention Ubuntu for NeuroDebian repository
  BUG: don't clobber color keyword in Series.plot, close pandas-dev#1890
  DOC: add intersphinx mapping for python library, close pandas-dev#1556
  BUG: fix mixed-integer .ix indexing bugs. close pandas-dev#1799
  BUG: unicode sheet name in to_excel pandas-dev#1828
  ...