
<a id='whatsnew-0141'></a>

# v0.14.1 (July 11, 2014)

{{ header }}

This is a minor release from 0.14.0 and includes a small number of API changes, several new features,
enhancements, and performance improvements along with a large number of bug fixes. We recommend that all
users upgrade to this version.

- Highlights include:  
  - New methods `select_dtypes()` to select columns
    based on the dtype and `sem()` to calculate the
    standard error of the mean.  
  - Support for dateutil timezones (see [docs](user_guide/timeseries.ipynb#timeseries-timezone)).  
  - Support for ignoring full line comments in the `read_csv()`
    text parser.  
  - New documentation section on [Options and Settings](user_guide/options.ipynb#options).  
  - Lots of bug fixes.  
- [Enhancements](#whatsnew-0141-enhancements)  
- [API Changes](#whatsnew-0141-api)  
- [Performance Improvements](#whatsnew-0141-performance)  
- [Experimental Changes](#whatsnew-0141-experimental)  
- [Bug Fixes](#whatsnew-0141-bug-fixes)  



<a id='whatsnew-0141-api'></a>

## API changes

- Openpyxl now raises a `ValueError` on construction of the openpyxl writer
  instead of warning on pandas import ([GH7284](https://github.com/pandas-dev/pandas/issues/7284)).  
- For `StringMethods.extract`, when no match is found, the result - only
  containing `NaN` values - now also has `dtype=object` instead of
  `float` ([GH7242](https://github.com/pandas-dev/pandas/issues/7242))  
- `Period` objects no longer raise a `TypeError` when compared using `==`
  with another object that is *not* a `Period`. Instead, such a comparison
  now returns `False`. ([GH7376](https://github.com/pandas-dev/pandas/issues/7376))  
- Previously, whether `offsets.apply`, `rollforward` and `rollback` reset the time
  or preserved it differed between offsets. With support for the `normalize` keyword
  in all offsets (see below), with a default value of `False` (preserve time), the
  behaviour changed for certain offsets (BusinessMonthBegin, MonthEnd, BusinessMonthEnd,
  CustomBusinessMonthEnd, BusinessYearBegin, LastWeekOfMonth, FY5253Quarter, Easter):  

```ipython
In [6]: from pandas.tseries import offsets

In [7]: d = pd.Timestamp('2014-01-01 09:00')

# old behaviour < 0.14.1
In [8]: d + offsets.MonthEnd()
Out[8]: pd.Timestamp('2014-01-31 00:00:00')
```


Starting from 0.14.1, all offsets preserve time by default. The old
behaviour can be obtained with `normalize=True`.  
Note that for the other offsets the default behaviour did not change.  
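
As a minimal sketch of the difference described above, the snippet below adds a `MonthEnd` offset to the same timestamp with and without `normalize=True`. The variable names `preserved` and `normalized` are illustrative only.

```python
import pandas as pd
from pandas.tseries import offsets

d = pd.Timestamp('2014-01-01 09:00')

# Default (normalize=False): the time of day is preserved.
preserved = d + offsets.MonthEnd()

# normalize=True: the result is reset to midnight, matching the
# behaviour of MonthEnd before 0.14.1.
normalized = d + offsets.MonthEnd(normalize=True)

print(preserved)
print(normalized)
```
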
- Add back `#N/A N/A` as a default NA value in text parsing (regression from 0.12) ([GH5521](https://github.com/pandas-dev/pandas/issues/5521))  
- Raise a `TypeError` on inplace-setting with a `.where` and a non `np.nan` value as this is inconsistent
  with a set-item expression like `df[mask] = None` ([GH7656](https://github.com/pandas-dev/pandas/issues/7656))  
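
The `Period` comparison change listed above can be illustrated with a short sketch. The string and integer operands are arbitrary examples of non-`Period` objects, chosen only for illustration.

```python
import pandas as pd

p = pd.Period('2014-07', freq='M')

# Comparing two equal Period objects still returns True.
same = p == pd.Period('2014-07', freq='M')

# Comparing with a non-Period object returns False instead of raising.
versus_str = p == 'not a period'
versus_int = p == 2014

print(same, versus_str, versus_int)
```
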



<a id='whatsnew-0141-enhancements'></a>

## Enhancements

- Add `dropna` argument to `value_counts` and `nunique` ([GH5569](https://github.com/pandas-dev/pandas/issues/5569)).  
- Add `select_dtypes()` method to allow selection of
  columns based on dtype ([GH7316](https://github.com/pandas-dev/pandas/issues/7316)). See [the docs](getting_started/basics.ipynb#basics-selectdtypes).  
- All offsets now support the `normalize` keyword to specify whether
  `offsets.apply`, `rollforward` and `rollback` reset the time (hour,
  minute, etc.) or not (default `False`, preserves time) ([GH7156](https://github.com/pandas-dev/pandas/issues/7156)):  

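```python
import pandas as pd
import pandas.tseries.offsets as offsets

# Default: the time of day (09:00) is preserved
day = offsets.Day()
day.apply(pd.Timestamp('2014-01-01 09:00'))

# normalize=True: the result is reset to midnight
day = offsets.Day(normalize=True)
day.apply(pd.Timestamp('2014-01-01 09:00'))
```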

- `PeriodIndex` is now represented in the same format as `DatetimeIndex` ([GH7601](https://github.com/pandas-dev/pandas/issues/7601))  
- `StringMethods` now work on empty Series ([GH7242](https://github.com/pandas-dev/pandas/issues/7242))  
- The file parsers `read_csv` and `read_table` now ignore full-line comments marked by
  the `comment` parameter, which accepts only a single character for the C reader.
  In particular, they allow for comments before file data begins ([GH2685](https://github.com/pandas-dev/pandas/issues/2685))  
- Raise `NotImplementedError` for simultaneous use of `chunksize` and `nrows`
  in `read_csv()` ([GH6774](https://github.com/pandas-dev/pandas/issues/6774)).  
- Tests for basic reading of public S3 buckets now exist ([GH7281](https://github.com/pandas-dev/pandas/issues/7281)).  
- `read_html` now sports an `encoding` argument that is passed to the
  underlying parser library. You can use this to read non-ascii encoded web
  pages ([GH7323](https://github.com/pandas-dev/pandas/issues/7323)).  
- `read_excel` now supports reading from URLs in the same way
  that `read_csv` does.  ([GH6809](https://github.com/pandas-dev/pandas/issues/6809))  
- Support for dateutil timezones, which can now be used in the same way as
  pytz timezones across pandas. ([GH4688](https://github.com/pandas-dev/pandas/issues/4688))  
  See [the docs](user_guide/timeseries.ipynb#timeseries-timezone).  
- Implemented `sem` (standard error of the mean) operation for `Series`,
  `DataFrame`, `Panel`, and `Groupby` ([GH6897](https://github.com/pandas-dev/pandas/issues/6897))  
- Add `nlargest` and `nsmallest` to the `Series` `groupby` whitelist,
  which means you can now use these methods on a `SeriesGroupBy` object
  ([GH7053](https://github.com/pandas-dev/pandas/issues/7053)).  
- The `apply`, `rollforward` and `rollback` methods of all offsets can now handle `np.datetime64`, which previously resulted in `ApplyTypeError` ([GH7452](https://github.com/pandas-dev/pandas/issues/7452))  
- `Period` and `PeriodIndex` can now contain `NaT` in their values ([GH7485](https://github.com/pandas-dev/pandas/issues/7485))  
- Support pickling `Series`, `DataFrame` and `Panel` objects with
  non-unique labels along *item* axis (`index`, `columns` and `items`
  respectively) ([GH7370](https://github.com/pandas-dev/pandas/issues/7370)).  
- Improved inference of datetime/timedelta with mixed null objects. Regression from 0.13.1 in interpretation of an object Index
  with all null elements ([GH7431](https://github.com/pandas-dev/pandas/issues/7431))  
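
Several of the enhancements above can be sketched together on a small frame. The data, column names and the commented CSV text below are made up for illustration. The calls themselves (`select_dtypes`, `sem`, `nlargest` on a grouped `Series`, `dropna=False`, and `comment='#'`) are the ones named in this section.

```python
import io
import pandas as pd

df = pd.DataFrame({
    'name': ['a', 'b', 'c', 'd'],
    'group': ['x', 'x', 'y', 'y'],
    'value': [1.0, 2.0, 3.0, 5.0],
    'count': [1, 2, 3, 4],
})

# select_dtypes: keep only the numeric columns.
numeric = df.select_dtypes(include=['number'])

# sem: standard error of the mean of one column.
value_sem = df['value'].sem()

# nlargest on a SeriesGroupBy: the largest value within each group.
top = df.groupby('group')['value'].nlargest(1)

# value_counts / nunique with dropna=False also count missing values.
s = pd.Series([1.0, 1.0, None])
counts = s.value_counts(dropna=False)
n_unique = s.nunique(dropna=False)

# read_csv with comment='#': full-line comments, including those
# before the header, are skipped.
text = "# exported by a hypothetical tool\na,b\n1,2\n# a note\n3,4\n"
parsed = pd.read_csv(io.StringIO(text), comment='#')

print(numeric.columns.tolist())
print(parsed)
```

Passing `include=['number']` is one convenient way to select all numeric dtypes at once; specific dtypes such as `'float64'` can be listed instead.
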



<a id='whatsnew-0141-performance'></a>

## Performance

- Improvements in dtype inference for numeric operations, yielding performance gains for the `int64`, `timedelta64` and `datetime64` dtypes ([GH7223](https://github.com/pandas-dev/pandas/issues/7223))  
- Improvements in Series.transform for significant performance gains ([GH6496](https://github.com/pandas-dev/pandas/issues/6496))  
- Improvements in DataFrame.transform with ufuncs and built-in grouper functions for significant performance gains ([GH7383](https://github.com/pandas-dev/pandas/issues/7383))  
- Regression in groupby aggregation of datetime64 dtypes ([GH7555](https://github.com/pandas-dev/pandas/issues/7555))  
- Improvements in MultiIndex.from_product for large iterables ([GH7627](https://github.com/pandas-dev/pandas/issues/7627))  



<a id='whatsnew-0141-experimental'></a>

## Experimental

- `pandas.io.data.Options` has a new method, `get_all_data`, and now consistently returns a
  MultiIndexed `DataFrame` ([GH5602](https://github.com/pandas-dev/pandas/issues/5602))  
- `io.gbq.read_gbq` and `io.gbq.to_gbq` were refactored to remove the
  dependency on the Google `bq.py` command line client. This submodule
  now uses `httplib2` and the Google `apiclient` and `oauth2client` API client
  libraries which should be more stable and, therefore, reliable than
  `bq.py`. See [the docs](user_guide/io.ipynb#io-bigquery). ([GH6937](https://github.com/pandas-dev/pandas/issues/6937)).  



<a id='whatsnew-0141-bug-fixes'></a>

## Bug fixes

- Bug in `DataFrame.where` with a symmetric shaped frame and a passed other of a DataFrame ([GH7506](https://github.com/pandas-dev/pandas/issues/7506))  
- Bug in Panel indexing with a MultiIndex axis ([GH7516](https://github.com/pandas-dev/pandas/issues/7516))  
- Regression in datetimelike slice indexing with a duplicated index and non-exact end-points ([GH7523](https://github.com/pandas-dev/pandas/issues/7523))  
- Bug in setitem with list-of-lists and single vs mixed types ([GH7551](https://github.com/pandas-dev/pandas/issues/7551))  
- Bug in time ops with non-aligned Series ([GH7500](https://github.com/pandas-dev/pandas/issues/7500))  
- Bug in timedelta inference when assigning an incomplete Series ([GH7592](https://github.com/pandas-dev/pandas/issues/7592))  
- Bug in groupby `.nth` with a Series and integer-like column name ([GH7559](https://github.com/pandas-dev/pandas/issues/7559))  
- Bug in `Series.get` with a boolean accessor ([GH7407](https://github.com/pandas-dev/pandas/issues/7407))  
- Bug in `value_counts` where `NaT` did not qualify as missing (`NaN`) ([GH7423](https://github.com/pandas-dev/pandas/issues/7423))  
- Bug in `to_timedelta` that accepted invalid units and misinterpreted ‘m/h’ ([GH7611](https://github.com/pandas-dev/pandas/issues/7611), [GH6423](https://github.com/pandas-dev/pandas/issues/6423))  
- Bug in line plot doesn’t set correct `xlim` if `secondary_y=True` ([GH7459](https://github.com/pandas-dev/pandas/issues/7459))  
- Bug in grouped `hist` and `scatter` plots use old `figsize` default ([GH7394](https://github.com/pandas-dev/pandas/issues/7394))  
- Bug in plotting subplots with `DataFrame.plot`, `hist` clears passed `ax` even if the number of subplots is one ([GH7391](https://github.com/pandas-dev/pandas/issues/7391)).  
- Bug in plotting subplots with `DataFrame.boxplot` with `by` kw raises `ValueError` if the number of subplots exceeds 1 ([GH7391](https://github.com/pandas-dev/pandas/issues/7391)).  
- Bug in subplots displays `ticklabels` and `labels` using different rules ([GH5897](https://github.com/pandas-dev/pandas/issues/5897))  
- Bug in `Panel.apply` with a MultiIndex as an axis ([GH7469](https://github.com/pandas-dev/pandas/issues/7469))  
- Bug in `DatetimeIndex.insert` doesn’t preserve `name` and `tz` ([GH7299](https://github.com/pandas-dev/pandas/issues/7299))  
- Bug in `DatetimeIndex.asobject` doesn’t preserve `name` ([GH7299](https://github.com/pandas-dev/pandas/issues/7299))  
- Bug in MultiIndex slicing with datetimelike ranges (strings and Timestamps), ([GH7429](https://github.com/pandas-dev/pandas/issues/7429))  
- Bug in `Index.min` and `max` doesn’t handle `nan` and `NaT` properly ([GH7261](https://github.com/pandas-dev/pandas/issues/7261))  
- Bug in `PeriodIndex.min/max` results in `int` ([GH7609](https://github.com/pandas-dev/pandas/issues/7609))  
- Bug in `resample` where `fill_method` was ignored if you passed `how` ([GH2073](https://github.com/pandas-dev/pandas/issues/2073))  
- Bug in `TimeGrouper` doesn’t exclude column specified by `key` ([GH7227](https://github.com/pandas-dev/pandas/issues/7227))  
- Bug in `DataFrame` and `Series` bar and barh plot raises `TypeError` when `bottom`
  and `left` keyword is specified ([GH7226](https://github.com/pandas-dev/pandas/issues/7226))  
- Bug in `DataFrame.hist` raises `TypeError` when it contains non numeric column ([GH7277](https://github.com/pandas-dev/pandas/issues/7277))  
- Bug in `Index.delete` does not preserve `name` and `freq` attributes ([GH7302](https://github.com/pandas-dev/pandas/issues/7302))  
- Bug in `DataFrame.query()`/`eval` where local string variables with the @
  sign were being treated as temporaries attempting to be deleted
  ([GH7300](https://github.com/pandas-dev/pandas/issues/7300)).  
- Bug in `Float64Index` which didn’t allow duplicates ([GH7149](https://github.com/pandas-dev/pandas/issues/7149)).  
- Bug in `DataFrame.replace()` where truthy values were being replaced
  ([GH7140](https://github.com/pandas-dev/pandas/issues/7140)).  
- Bug in `StringMethods.extract()` where a single match group Series
  would use the matcher’s name instead of the group name ([GH7313](https://github.com/pandas-dev/pandas/issues/7313)).  
- Bug in `isnull()` when `mode.use_inf_as_null == True` where isnull
  wouldn’t test `True` when it encountered an `inf`/`-inf`
  ([GH7315](https://github.com/pandas-dev/pandas/issues/7315)).  
- Bug in inferred_freq results in None for eastern hemisphere timezones ([GH7310](https://github.com/pandas-dev/pandas/issues/7310))  
- Bug in `Easter` returns incorrect date when offset is negative ([GH7195](https://github.com/pandas-dev/pandas/issues/7195))  
- Bug in broadcasting with `.div`, integer dtypes and divide-by-zero ([GH7325](https://github.com/pandas-dev/pandas/issues/7325))  
- Bug in `CustomBusinessDay.apply` raises `NameError` when `np.datetime64` object is passed ([GH7196](https://github.com/pandas-dev/pandas/issues/7196))  
- Bug in `MultiIndex.append`, `concat` and `pivot_table` don’t preserve timezone ([GH6606](https://github.com/pandas-dev/pandas/issues/6606))  
- Bug in `.loc` with a list of indexers on a single-multi index level (that is not nested) ([GH7349](https://github.com/pandas-dev/pandas/issues/7349))  
- Bug in `Series.map` when mapping a dict with tuple keys of different lengths ([GH7333](https://github.com/pandas-dev/pandas/issues/7333))  
- Bug where `StringMethods` did not work on empty Series ([GH7242](https://github.com/pandas-dev/pandas/issues/7242))  
- Fix delegation of read_sql to read_sql_query when query does not contain ‘select’ ([GH7324](https://github.com/pandas-dev/pandas/issues/7324)).  
- Bug where a string column name assignment to a `DataFrame` with a
  `Float64Index` raised a `TypeError` during a call to `np.isnan`
  ([GH7366](https://github.com/pandas-dev/pandas/issues/7366)).  
- Bug where `NDFrame.replace()` didn’t correctly replace objects with
  `Period` values ([GH7379](https://github.com/pandas-dev/pandas/issues/7379)).  
- Bug in `.ix` getitem should always return a Series ([GH7150](https://github.com/pandas-dev/pandas/issues/7150))  
- Bug in MultiIndex slicing with incomplete indexers ([GH7399](https://github.com/pandas-dev/pandas/issues/7399))  
- Bug in MultiIndex slicing with a step in a sliced level ([GH7400](https://github.com/pandas-dev/pandas/issues/7400))  
- Bug where negative indexers in `DatetimeIndex` were not correctly sliced
  ([GH7408](https://github.com/pandas-dev/pandas/issues/7408))  
- Bug where `NaT` wasn’t repr’d correctly in a `MultiIndex` ([GH7406](https://github.com/pandas-dev/pandas/issues/7406),
  [GH7409](https://github.com/pandas-dev/pandas/issues/7409)).  
- Bug where bool objects were converted to `nan` in `convert_objects`
  ([GH7416](https://github.com/pandas-dev/pandas/issues/7416)).  
- Bug in `quantile` ignoring the axis keyword argument ([GH7306](https://github.com/pandas-dev/pandas/issues/7306))  
- Bug where `nanops._maybe_null_out` doesn’t work with complex numbers
  ([GH7353](https://github.com/pandas-dev/pandas/issues/7353))  
- Bug in several `nanops` functions when `axis==0` for
  1-dimensional `nan` arrays ([GH7354](https://github.com/pandas-dev/pandas/issues/7354))  
- Bug where `nanops.nanmedian` doesn’t work when `axis==None`
  ([GH7352](https://github.com/pandas-dev/pandas/issues/7352))  
- Bug where `nanops._has_infs` doesn’t work with many dtypes
  ([GH7357](https://github.com/pandas-dev/pandas/issues/7357))  
- Bug in `StataReader.data` where reading a 0-observation dta failed ([GH7369](https://github.com/pandas-dev/pandas/issues/7369))  
- Bug in `StataReader` when reading Stata 13 (117) files containing fixed width strings ([GH7360](https://github.com/pandas-dev/pandas/issues/7360))  
- Bug in `StataWriter` where encoding was ignored ([GH7286](https://github.com/pandas-dev/pandas/issues/7286))  
- Bug in `DatetimeIndex` comparison doesn’t handle `NaT` properly ([GH7529](https://github.com/pandas-dev/pandas/issues/7529))  
- Bug in passing input with `tzinfo` to some offsets `apply`, `rollforward` or `rollback` resets `tzinfo` or raises `ValueError` ([GH7465](https://github.com/pandas-dev/pandas/issues/7465))  
- Bug in `DatetimeIndex.to_period`, `PeriodIndex.asobject`, `PeriodIndex.to_timestamp` doesn’t preserve `name` ([GH7485](https://github.com/pandas-dev/pandas/issues/7485))  
- Bug in `DatetimeIndex.to_period` and `PeriodIndex.to_timestamp` handle `NaT` incorrectly ([GH7228](https://github.com/pandas-dev/pandas/issues/7228))  
- Bug in `offsets.apply`, `rollforward` and `rollback` may return normal `datetime` ([GH7502](https://github.com/pandas-dev/pandas/issues/7502))  
- Bug in `resample` raises `ValueError` when target contains `NaT` ([GH7227](https://github.com/pandas-dev/pandas/issues/7227))  
- Bug in `Timestamp.tz_localize` resets `nanosecond` info ([GH7534](https://github.com/pandas-dev/pandas/issues/7534))  
- Bug in `DatetimeIndex.asobject` raises `ValueError` when it contains `NaT` ([GH7539](https://github.com/pandas-dev/pandas/issues/7539))  
- Bug in `Timestamp.__new__` doesn’t preserve nanosecond properly ([GH7610](https://github.com/pandas-dev/pandas/issues/7610))  
- Bug in `Index.astype(float)` where it would return an `object` dtype
  `Index` ([GH7464](https://github.com/pandas-dev/pandas/issues/7464)).  
- Bug in `DataFrame.reset_index` loses `tz` ([GH3950](https://github.com/pandas-dev/pandas/issues/3950))  
- Bug in `DatetimeIndex.freqstr` raises `AttributeError` when `freq` is `None` ([GH7606](https://github.com/pandas-dev/pandas/issues/7606))  
- Bug in `GroupBy.size` created by `TimeGrouper` raises `AttributeError` ([GH7453](https://github.com/pandas-dev/pandas/issues/7453))  
- Bug in single column bar plot is misaligned ([GH7498](https://github.com/pandas-dev/pandas/issues/7498)).  
- Bug in area plot with tz-aware time series raises `ValueError` ([GH7471](https://github.com/pandas-dev/pandas/issues/7471))  
- Bug in non-monotonic `Index.union` may preserve `name` incorrectly ([GH7458](https://github.com/pandas-dev/pandas/issues/7458))  
- Bug in `DatetimeIndex.intersection` doesn’t preserve timezone ([GH4690](https://github.com/pandas-dev/pandas/issues/4690))  
- Bug in `rolling_var` where a window larger than the array would raise an error ([GH7297](https://github.com/pandas-dev/pandas/issues/7297))  
- Bug with last plotted timeseries dictating `xlim` ([GH2960](https://github.com/pandas-dev/pandas/issues/2960))  
- Bug with `secondary_y` axis not being considered for timeseries `xlim` ([GH3490](https://github.com/pandas-dev/pandas/issues/3490))  
- Bug in `Float64Index` assignment with a non scalar indexer ([GH7586](https://github.com/pandas-dev/pandas/issues/7586))  
- Bug in `pandas.core.strings.str_contains` does not properly match in a case insensitive fashion when `regex=False` and `case=False` ([GH7505](https://github.com/pandas-dev/pandas/issues/7505))  
- Bug in `expanding_cov`, `expanding_corr`, `rolling_cov`, and `rolling_corr` for two arguments with mismatched index ([GH7512](https://github.com/pandas-dev/pandas/issues/7512))  
- Bug in `to_sql` taking the boolean column as text column ([GH7678](https://github.com/pandas-dev/pandas/issues/7678))  
- Bug in grouped hist doesn’t handle rot kw and sharex kw properly ([GH7234](https://github.com/pandas-dev/pandas/issues/7234))  
- Bug in `.loc` performing fallback integer indexing with `object` dtype indices ([GH7496](https://github.com/pandas-dev/pandas/issues/7496))  
- Bug (regression) in `PeriodIndex` constructor when passed `Series` objects ([GH7701](https://github.com/pandas-dev/pandas/issues/7701)).  



<a id='whatsnew-0-14-1-contributors'></a>

## Contributors

A total of 46 people contributed patches to this release.  People with a
“+” by their names contributed a patch for the first time.


- Andrew Rosenfeld  
- Andy Hayden  
- Benjamin Adams +  
- Benjamin M. Gross +  
- Brian Quistorff +  
- Brian Wignall +  
- DSM  
- Daniel Waeber  
- David Bew +  
- David Stephens  
- Jacob Schaer  
- Jan Schulz  
- John David Reaver  
- John W. O’Brien  
- Joris Van den Bossche  
- Julien Danjou +  
- K.-Michael Aye  
- Kevin Sheppard  
- Kyle Meyer  
- Matt Wittmann  
- Matthew Brett +  
- Michael Mueller +  
- Mortada Mehyar  
- Phillip Cloud  
- Rob Levy +  
- Schaer, Jacob C +  
- Stephan Hoyer  
- Thomas Kluyver  
- Todd Jennings  
- Tom Augspurger  
- TomAugspurger  
- bwignall  
- clham  
- dsm054 +  
- helger +  
- immerrr  
- jaimefrio  
- jreback  
- lexual  
- onesandzeroes  
- rockg  
- sanguineturtle +  
- seth-p +  
- sinhrks  
- unknown  
- yelite +  