
<a id='whatsnew-0210'></a>

# v0.21.0 (October 27, 2017)

{{ header }}

This is a major release from 0.20.3 and includes a number of API changes, deprecations, new features,
enhancements, and performance improvements along with a large number of bug fixes. We recommend that all
users upgrade to this version.

Highlights include:

- Integration with [Apache Parquet](https://parquet.apache.org/), including a new top-level `read_parquet()` function and `DataFrame.to_parquet()` method, see [here](#whatsnew-0210-enhancements-parquet).  
- New user-facing `pandas.api.types.CategoricalDtype` for specifying
  categoricals independent of the data, see [here](#whatsnew-0210-enhancements-categorical-dtype).  
- The behavior of `sum` and `prod` on all-NaN Series/DataFrames is now consistent and no longer depends on whether [bottleneck](http://berkeleyanalytics.com/bottleneck) is installed, and `sum` and `prod` on empty Series now return NaN instead of 0, see [here](#whatsnew-0210-api-breaking-bottleneck).  
- Compatibility fixes for pypy, see [here](#whatsnew-0210-pypy).  
- Additions to the `drop`, `reindex` and `rename` API to make them more consistent, see [here](#whatsnew-0210-enhancements-drop-api).  
- Addition of the new methods `DataFrame.infer_objects` (see [here](#whatsnew-0210-enhancements-infer-objects)) and `GroupBy.pipe` (see [here](#whatsnew-0210-enhancements-groupby-pipe)).  
- Indexing with a list of labels, where one or more of the labels is missing, is deprecated and will raise a KeyError in a future version, see [here](#whatsnew-0210-api-breaking-loc).  


Check the [API Changes](#whatsnew-0210-api-breaking) and [deprecations](#whatsnew-0210-deprecations) before updating.

## What’s new in v0.21.0

- [New features](#New-features)  
  - [Integration with Apache Parquet file format](#Integration-with-Apache-Parquet-file-format)  
  - [`infer_objects` type conversion](#`infer_objects`-type-conversion)  
  - [Improved warnings when attempting to create columns](#Improved-warnings-when-attempting-to-create-columns)  
  - [`drop` now also accepts index/columns keywords](#`drop`-now-also-accepts-index/columns-keywords)  
  - [`rename`, `reindex` now also accept axis keyword](#`rename`,-`reindex`-now-also-accept-axis-keyword)  
  - [`CategoricalDtype` for specifying categoricals](#`CategoricalDtype`-for-specifying-categoricals)  
  - [`GroupBy` objects now have a `pipe` method](#`GroupBy`-objects-now-have-a-`pipe`-method)  
  - [`Categorical.rename_categories` accepts a dict-like](#`Categorical.rename_categories`-accepts-a-dict-like)  
  - [Other enhancements](#Other-enhancements)  
- [Backwards incompatible API changes](#Backwards-incompatible-API-changes)  
  - [Dependencies have increased minimum versions](#Dependencies-have-increased-minimum-versions)  
  - [Sum/Prod of all-NaN or empty Series/DataFrames is now consistently NaN](#Sum/Prod-of-all-NaN-or-empty-Series/DataFrames-is-now-consistently-NaN)  
  - [Indexing with a list with missing labels is deprecated](#Indexing-with-a-list-with-missing-labels-is-deprecated)  
  - [NA naming changes](#NA-naming-changes)  
  - [Iteration of Series/Index will now return Python scalars](#Iteration-of-Series/Index-will-now-return-Python-scalars)  
  - [Indexing with a Boolean Index](#Indexing-with-a-Boolean-Index)  
  - [`PeriodIndex` resampling](#`PeriodIndex`-resampling)  
  - [Improved error handling during item assignment in pd.eval](#Improved-error-handling-during-item-assignment-in-pd.eval)  
  - [Dtype conversions](#Dtype-conversions)  
  - [MultiIndex constructor with a single level](#MultiIndex-constructor-with-a-single-level)  
  - [UTC Localization with Series](#UTC-Localization-with-Series)  
  - [Consistency of range functions](#Consistency-of-range-functions)  
  - [No automatic Matplotlib converters](#No-automatic-Matplotlib-converters)  
  - [Other API changes](#Other-API-changes)  
- [Deprecations](#Deprecations)  
  - [Series.select and DataFrame.select](#Series.select-and-DataFrame.select)  
  - [Series.argmax and Series.argmin](#Series.argmax-and-Series.argmin)  
- [Removal of prior version deprecations/changes](#Removal-of-prior-version-deprecations/changes)  
- [Performance improvements](#Performance-improvements)  
- [Documentation changes](#Documentation-changes)  
- [Bug fixes](#Bug-fixes)  
  - [Conversion](#Conversion)  
  - [Indexing](#Indexing)  
  - [I/O](#I/O)  
  - [Plotting](#Plotting)  
  - [Groupby/resample/rolling](#Groupby/resample/rolling)  
  - [Sparse](#Sparse)  
  - [Reshaping](#Reshaping)  
  - [Numeric](#Numeric)  
  - [Categorical](#Categorical)  
  - [PyPy](#PyPy)  
  - [Other](#Other)  
- [Contributors](#Contributors)  


<a id='whatsnew-0210-enhancements'></a>

## New features


<a id='whatsnew-0210-enhancements-parquet'></a>

### Integration with Apache Parquet file format

Integration with [Apache Parquet](https://parquet.apache.org/), including a new top-level `read_parquet()` function and `DataFrame.to_parquet()` method, see [here](user_guide/io.ipynb#io-parquet) ([GH15838](https://github.com/pandas-dev/pandas/issues/15838), [GH17438](https://github.com/pandas-dev/pandas/issues/17438)).

[Apache Parquet](https://parquet.apache.org/) provides a cross-language, binary file format for reading and writing data frames efficiently.
Parquet is designed to faithfully serialize and de-serialize `DataFrame`s, supporting all of the pandas
dtypes, including extension dtypes such as datetime with timezones.

This functionality depends on either the [pyarrow](http://arrow.apache.org/docs/python/) or [fastparquet](https://fastparquet.readthedocs.io/en/latest/) library.
For more details, see [the IO docs on Parquet](user_guide/io.ipynb#io-parquet).


<a id='whatsnew-0210-enhancements-infer-objects'></a>

### `infer_objects` type conversion

The `DataFrame.infer_objects()` and `Series.infer_objects()`
methods have been added to perform dtype inference on object columns, replacing
some of the functionality of the deprecated `convert_objects`
method. See the documentation [here](getting_started/basics.ipynb#basics-object-conversion)
for more details. ([GH11221](https://github.com/pandas-dev/pandas/issues/11221))

This method only performs soft conversions on object columns, converting Python objects
to native types, but not any coercive conversions. For example:
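A minimal sketch of the soft conversion, using a small made-up frame (the column names and values are illustrative only):

```python
import numpy as np
import pandas as pd

# 'B' holds Python ints stored in an object array;
# 'C' holds numeric-looking strings.
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': np.array([1, 2, 3], dtype='object'),
    'C': ['1', '2', '3'],
})

converted = df.infer_objects()
print(df.dtypes)
print(converted.dtypes)

# Strings are left alone by infer_objects(); an explicit,
# coercive conversion is needed for them.
numeric_c = pd.to_numeric(df['C'], errors='coerce')
print(numeric_c.dtype)
```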

Note that column `'C'` was not converted - only scalar numeric types
will be converted to a new type.  Other types of conversion should be accomplished
using the `to_numeric()` function (or `to_datetime()`, `to_timedelta()`).


<a id='whatsnew-0210-enhancements-attribute-access'></a>

### Improved warnings when attempting to create columns

New users are often puzzled by the relationship between column operations and
attribute access on `DataFrame` instances ([GH7175](https://github.com/pandas-dev/pandas/issues/7175)). One specific
instance of this confusion is attempting to create a new column by setting an
attribute on the `DataFrame`:

```ipython
In [1]: df = pd.DataFrame({'one': [1., 2., 3.]})
In [2]: df.two = [4, 5, 6]
```


This does not raise any obvious exceptions, but also does not create a new column:

```ipython
In [3]: df
Out[3]:
    one
0  1.0
1  2.0
2  3.0
```


Setting a list-like data structure into a new attribute now raises a `UserWarning` about the potential for unexpected behavior. See [Attribute Access](user_guide/indexing.ipynb#indexing-attribute-access).


<a id='whatsnew-0210-enhancements-drop-api'></a>

### `drop` now also accepts index/columns keywords

The `drop()` method has gained `index`/`columns` keywords as an
alternative to specifying the `axis`. This is similar to the behavior of `reindex`
([GH12392](https://github.com/pandas-dev/pandas/issues/12392)).

For example:
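The two spellings can be compared on a small illustrative frame (the data is made up for this sketch):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(8).reshape(2, 4),
                  columns=['A', 'B', 'C', 'D'])

by_axis = df.drop(['B', 'C'], axis=1)      # existing spelling
by_keyword = df.drop(columns=['B', 'C'])   # new keyword, same result
print(by_keyword)
```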


<a id='whatsnew-0210-enhancements-rename-reindex-axis'></a>

### `rename`, `reindex` now also accept axis keyword

The `DataFrame.rename()` and `DataFrame.reindex()` methods have gained
the `axis` keyword to specify the axis to target with the operation
([GH12392](https://github.com/pandas-dev/pandas/issues/12392)).

Here’s `rename`:
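A short sketch on an illustrative frame; the mapper functions are arbitrary choices for demonstration:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Apply the mapper to the column labels.
lowered = df.rename(str.lower, axis='columns')

# Apply a different mapper to the row labels.
relabelled = df.rename(lambda i: i * 10, axis='index')

print(lowered)
print(relabelled)
```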

And `reindex`:
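Again as a sketch on illustrative data, where labels missing from the original axis are filled with `NaN`:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Target the columns: 'C' does not exist, so it comes back as all-NaN.
cols = df.reindex(['A', 'C'], axis='columns')

# Target the rows: label 3 does not exist, so that row is all-NaN.
rows = df.reindex([0, 1, 3], axis='index')

print(cols)
print(rows)
```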

The “index, columns” style continues to work as before.

We *highly* encourage using named arguments to avoid confusion when using either
style.


<a id='whatsnew-0210-enhancements-categorical-dtype'></a>

### `CategoricalDtype` for specifying categoricals

`pandas.api.types.CategoricalDtype` has been added to the public API and
expanded to include the `categories` and `ordered` attributes. A
`CategoricalDtype` can be used to specify the set of categories and
orderedness of an array, independent of the data. This can be useful, for example,
when converting string data to a `Categorical` ([GH14711](https://github.com/pandas-dev/pandas/issues/14711),
[GH15078](https://github.com/pandas-dev/pandas/issues/15078), [GH16015](https://github.com/pandas-dev/pandas/issues/16015), [GH17643](https://github.com/pandas-dev/pandas/issues/17643)):
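For instance, the following sketch declares a category that never appears in the data (the values are illustrative):

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

s = pd.Series(['a', 'b', 'c', 'a'])  # 'd' never appears in the data

# The dtype carries the categories and orderedness on its own.
dtype = CategoricalDtype(categories=['a', 'b', 'c', 'd'], ordered=True)
cat = s.astype(dtype)
print(cat)
```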

One place that deserves special mention is in `read_csv()`. Previously, with
`dtype={'col': 'category'}`, the returned values and categories would always
be strings.
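A small sketch of the string-category behavior, using an in-memory CSV invented for illustration:

```python
from io import StringIO

import pandas as pd

data = 'A,B\na,1\nb,2\nc,3'

# With the plain 'category' string, the categories are parsed as strings.
string_cats = pd.read_csv(StringIO(data),
                          dtype={'B': 'category'}).B.cat.categories
print(string_cats)
```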

Notice the “object” dtype.

With a `CategoricalDtype` of all numerics, datetimes, or
timedeltas, we can automatically convert to the correct type

The values have been correctly interpreted as integers.

The `.dtype` property of a `Categorical`, `CategoricalIndex` or a
`Series` with categorical type will now return an instance of
`CategoricalDtype`. While the repr has changed, `str(CategoricalDtype())` is
still the string `'category'`. We’ll take this moment to remind users that the
*preferred* way to detect categorical data is to use
`pandas.api.types.is_categorical_dtype()`, and not `str(dtype) == 'category'`.

See the [CategoricalDtype docs](user_guide/categorical.ipynb#categorical-categoricaldtype) for more.


<a id='whatsnew-0210-enhancements-groupby-pipe'></a>

### `GroupBy` objects now have a `pipe` method

`GroupBy` objects now have a `pipe` method, similar to the one on
`DataFrame` and `Series`, that allows functions that take a
`GroupBy` to be composed in a clean, readable syntax. ([GH17871](https://github.com/pandas-dev/pandas/issues/17871))

For a concrete example on combining `.groupby` and `.pipe`, imagine having a
DataFrame with columns for stores, products, revenue and sold quantity. We’d like to
do a groupwise calculation of *prices* (i.e. revenue/quantity) per store and per product.
We could do this in a multi-step operation, but expressing it in terms of piping can make the
code more readable.

First we set the data:

Now, to find prices per store/product, we can simply do:
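A compact sketch of both steps, with a tiny hand-written dataset standing in for real sales data (all store names, product names and figures are invented):

```python
import pandas as pd

# Set up the data: one row per sale.
df = pd.DataFrame({
    'Store': ['S1', 'S1', 'S2', 'S2'],
    'Product': ['P1', 'P1', 'P1', 'P2'],
    'Revenue': [10.0, 30.0, 30.0, 80.0],
    'Quantity': [1, 3, 3, 4],
})

# The function passed to pipe receives the GroupBy object itself.
prices = (df.groupby(['Store', 'Product'])
            .pipe(lambda grp: grp.Revenue.sum() / grp.Quantity.sum()))

print(prices.unstack())
```

Because the lambda receives the whole `GroupBy`, both aggregations share one grouping step.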

See the [documentation](user_guide/groupby.ipynb#groupby-pipe) for more.


<a id='whatsnew-0210-enhancements-rename-categories'></a>

### `Categorical.rename_categories` accepts a dict-like

`rename_categories()` now accepts a dict-like argument for
`new_categories`. The previous categories are looked up in the dictionary’s
keys and replaced if found. The behavior of missing and extra keys is the same
as in `DataFrame.rename()`.
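A brief sketch with illustrative category names:

```python
import pandas as pd

c = pd.Categorical(['a', 'a', 'b'])

# Keys are the old categories, values are their replacements.
renamed = c.rename_categories({'a': 'eh', 'b': 'bee'})
print(renamed)
```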

To assist with upgrading pandas, `rename_categories` treats `Series` as
list-like. Typically, Series are considered to be dict-like (e.g. in
`.rename`, `.map`). In a future version of pandas `rename_categories`
will change to treat them as dict-like. Follow the warning message’s
recommendations for writing future-proof code.

```ipython
In [33]: c.rename_categories(pd.Series([0, 1], index=['a', 'c']))
FutureWarning: Treating Series 'new_categories' as a list-like and using the values.
In a future version, 'rename_categories' will treat Series like a dictionary.
For dict-like, use 'new_categories.to_dict()'
For list-like, use 'new_categories.values'.
Out[33]:
[0, 0, 1]
Categories (2, int64): [0, 1]
```



<a id='whatsnew-0210-enhancements-other'></a>

### Other enhancements

#### New functions or methods

- `nearest()` is added to support nearest-neighbor upsampling ([GH17496](https://github.com/pandas-dev/pandas/issues/17496)).  
- `Index` has added support for a `to_frame` method ([GH15230](https://github.com/pandas-dev/pandas/issues/15230)).  

#### New keywords

- Added a `skipna` parameter to `infer_dtype()` to
  support type inference in the presence of missing values ([GH17059](https://github.com/pandas-dev/pandas/issues/17059)).  
- `Series.to_dict()` and `DataFrame.to_dict()` now support an `into` keyword which allows you to specify the `collections.Mapping` subclass that you would like returned.  The default is `dict`, which is backwards compatible. ([GH16122](https://github.com/pandas-dev/pandas/issues/16122))  
- `Series.set_axis()` and `DataFrame.set_axis()` now support the `inplace` parameter. ([GH14636](https://github.com/pandas-dev/pandas/issues/14636))  
- `Series.to_pickle()` and `DataFrame.to_pickle()` have gained a `protocol` parameter ([GH16252](https://github.com/pandas-dev/pandas/issues/16252)). By default, this parameter is set to [HIGHEST_PROTOCOL](https://docs.python.org/3/library/pickle.html#data-stream-format)  
- `read_feather()` has gained the `nthreads` parameter for multi-threaded operations ([GH16359](https://github.com/pandas-dev/pandas/issues/16359))  
- `DataFrame.clip()` and `Series.clip()` have gained an `inplace` argument. ([GH15388](https://github.com/pandas-dev/pandas/issues/15388))  
- `crosstab()` has gained a `margins_name` parameter to define the name of the row / column that will contain the totals when `margins=True`. ([GH15972](https://github.com/pandas-dev/pandas/issues/15972))  
- `read_json()` now accepts a `chunksize` parameter that can be used when `lines=True`. If `chunksize` is passed, read_json now returns an iterator which reads in `chunksize` lines with each iteration. ([GH17048](https://github.com/pandas-dev/pandas/issues/17048))  
- `read_json()` and `to_json()` now accept a `compression` argument which allows them to transparently handle compressed files. ([GH17798](https://github.com/pandas-dev/pandas/issues/17798))  
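As one illustration of the new keywords, the sketch below uses `into` with `collections.OrderedDict` (the Series contents are made up):

```python
from collections import OrderedDict

import pandas as pd

s = pd.Series([1, 2], index=['a', 'b'])

# Ask for an OrderedDict instead of the default dict.
od = s.to_dict(into=OrderedDict)
print(od)
```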

#### Various enhancements

- Improved the import time of pandas by about 2.25x.  ([GH16764](https://github.com/pandas-dev/pandas/issues/16764))  
- Support for [PEP 519 – Adding a file system path protocol](https://www.python.org/dev/peps/pep-0519/) on most readers (e.g.
  `read_csv()`) and writers (e.g. `DataFrame.to_csv()`) ([GH13823](https://github.com/pandas-dev/pandas/issues/13823)).  
- Added a `__fspath__` method to `pd.HDFStore`, `pd.ExcelFile`,
  and `pd.ExcelWriter` to work properly with the file system path protocol ([GH13823](https://github.com/pandas-dev/pandas/issues/13823)).  
- The `validate` argument for `merge()` now checks whether a merge is one-to-one, one-to-many, many-to-one, or many-to-many. If a merge is found to not be an example of specified merge type, an exception of type `MergeError` will be raised. For more, see [here](user_guide/merging.ipynb#merging-validation) ([GH16270](https://github.com/pandas-dev/pandas/issues/16270))  
- Added support for [PEP 518](https://www.python.org/dev/peps/pep-0518/) (`pyproject.toml`) to the build system ([GH16745](https://github.com/pandas-dev/pandas/issues/16745))  
- `RangeIndex.append()` now returns a `RangeIndex` object when possible ([GH16212](https://github.com/pandas-dev/pandas/issues/16212))  
- `Series.rename_axis()` and `DataFrame.rename_axis()` with `inplace=True` now return `None` while renaming the axis inplace. ([GH15704](https://github.com/pandas-dev/pandas/issues/15704))  
- `api.types.infer_dtype()` now infers decimals. ([GH15690](https://github.com/pandas-dev/pandas/issues/15690))  
- `DataFrame.select_dtypes()` now accepts scalar values for include/exclude as well as list-like. ([GH16855](https://github.com/pandas-dev/pandas/issues/16855))  
- `date_range()` now accepts ‘YS’ in addition to ‘AS’ as an alias for start of year. ([GH9313](https://github.com/pandas-dev/pandas/issues/9313))  
- `date_range()` now accepts ‘Y’ in addition to ‘A’ as an alias for end of year. ([GH9313](https://github.com/pandas-dev/pandas/issues/9313))  
- `DataFrame.add_prefix()` and `DataFrame.add_suffix()` now accept strings containing the ‘%’ character. ([GH17151](https://github.com/pandas-dev/pandas/issues/17151))  
- Read/write methods that infer compression (`read_csv()`, `read_table()`, `read_pickle()`, and `to_pickle()`) can now infer from path-like objects, such as `pathlib.Path`. ([GH17206](https://github.com/pandas-dev/pandas/issues/17206))  
- `read_sas()` now recognizes much more of the most frequently used date (datetime) formats in SAS7BDAT files. ([GH15871](https://github.com/pandas-dev/pandas/issues/15871))  
- `DataFrame.items()` and `Series.items()` are now present in both Python 2 and 3 and are lazy in all cases. ([GH13918](https://github.com/pandas-dev/pandas/issues/13918), [GH17213](https://github.com/pandas-dev/pandas/issues/17213))  
- `pandas.io.formats.style.Styler.where()` has been implemented as a convenience for `pandas.io.formats.style.Styler.applymap()`. ([GH17474](https://github.com/pandas-dev/pandas/issues/17474))  
- `MultiIndex.is_monotonic_decreasing()` has been implemented. Previously it returned `False` in all cases. ([GH16554](https://github.com/pandas-dev/pandas/issues/16554))  
- `read_excel()` raises `ImportError` with a better message if `xlrd` is not installed. ([GH17613](https://github.com/pandas-dev/pandas/issues/17613))  
- `DataFrame.assign()` will preserve the original order of `**kwargs` for Python 3.6+ users instead of sorting the column names. ([GH14207](https://github.com/pandas-dev/pandas/issues/14207))  
- `Series.reindex()`, `DataFrame.reindex()`, `Index.get_indexer()` now support list-like argument for `tolerance`. ([GH17367](https://github.com/pandas-dev/pandas/issues/17367))  
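The `validate` argument to `merge()` listed above can be sketched as follows; the frames and column names are invented for illustration:

```python
import pandas as pd

left = pd.DataFrame({'key': [1, 2, 2], 'lval': [10, 20, 30]})
right = pd.DataFrame({'key': [1, 2], 'rval': [100, 200]})

# 'key' repeats on the left but is unique on the right: many-to-one holds.
merged = pd.merge(left, right, on='key', validate='many_to_one')

# Claiming one-to-one is wrong for this data, so a MergeError is raised.
try:
    pd.merge(left, right, on='key', validate='one_to_one')
    raised = False
except pd.errors.MergeError:
    raised = True

print(merged)
print(raised)
```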



<a id='whatsnew-0210-api-breaking'></a>

## Backwards incompatible API changes


<a id='whatsnew-0210-api-breaking-deps'></a>

### Dependencies have increased minimum versions

We have updated our minimum supported versions of dependencies ([GH15206](https://github.com/pandas-dev/pandas/issues/15206), [GH15543](https://github.com/pandas-dev/pandas/issues/15543), [GH15214](https://github.com/pandas-dev/pandas/issues/15214)).
If installed, we now require:

|Package|Minimum Version|Required|
|:------------:|:---------------:|:--------:|
|Numpy|1.9.0|X|
|Matplotlib|1.4.3||
|Scipy|0.14.0||
|Bottleneck|1.0.0||

Additionally, support has been dropped for Python 3.4 ([GH15251](https://github.com/pandas-dev/pandas/issues/15251)).


<a id='whatsnew-0210-api-breaking-bottleneck'></a>

### Sum/Prod of all-NaN or empty Series/DataFrames is now consistently NaN

>**Note**
>
>The changes described here have been partially reverted. See
the [v0.22.0 Whatsnew](v0.22.0#whatsnew-0220) for more.

The behavior of `sum` and `prod` on all-NaN Series/DataFrames no longer depends on
whether [bottleneck](http://berkeleyanalytics.com/bottleneck) is installed, and the return value of `sum` and `prod` on an empty Series has changed ([GH9422](https://github.com/pandas-dev/pandas/issues/9422), [GH15507](https://github.com/pandas-dev/pandas/issues/15507)).

Calling `sum` or `prod` on an empty or all-`NaN` `Series`, or columns of a `DataFrame`, will result in `NaN`. See the [docs](user_guide/missing_data.ipynb#missing-data-numeric-sum).

Previously WITHOUT `bottleneck` installed:

```ipython
In [2]: s.sum()
Out[2]: np.nan
```


Previously WITH `bottleneck`:

```ipython
In [2]: s.sum()
Out[2]: 0.0
```


New behavior, without regard to the bottleneck installation:

Note that this also changes the sum of an empty `Series`. Previously this always returned 0 regardless of a `bottleneck` installation:

```ipython
In [1]: pd.Series([]).sum()
Out[1]: 0
```


but for consistency with the all-NaN case, this was changed to return NaN as well:
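The NaN results for both the all-NaN and the empty case can be reproduced with the sketch below. One caveat: because this change was partially reverted in 0.22.0, the sketch passes `min_count=1`, a keyword from that later release which requests the NaN result; on 0.21.0 itself a plain `.sum()` returned NaN. The Series contents are illustrative.

```python
import numpy as np
import pandas as pd

all_nan = pd.Series([np.nan])
empty = pd.Series([], dtype='float64')

# min_count=1 (available from 0.22.0) requires at least one valid
# value; otherwise the result is NaN, as described in this section.
all_nan_sum = all_nan.sum(min_count=1)
empty_sum = empty.sum(min_count=1)

print(all_nan_sum, empty_sum)
```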


<a id='whatsnew-0210-api-breaking-loc'></a>

### Indexing with a list with missing labels is deprecated

Previously, selecting with a list of labels, where one or more labels were missing would always succeed, returning `NaN` for missing labels.
This will now show a `FutureWarning`. In the future this will raise a `KeyError` ([GH15747](https://github.com/pandas-dev/pandas/issues/15747)).
This warning will trigger on a `DataFrame` or a `Series` for using `.loc[]`  or `[[]]` when passing a list-of-labels with at least 1 missing label.
See the [deprecation docs](user_guide/indexing.ipynb#indexing-deprecate-loc-reindex-listlike).

Previous behavior

```ipython
In [4]: s.loc[[1, 2, 3]]
Out[4]:
1    2.0
2    3.0
3    NaN
dtype: float64
```


Current behavior

```ipython
In [4]: s.loc[[1, 2, 3]]
Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.

See the documentation here:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#deprecate-loc-reindex-listlike

Out[4]:
1    2.0
2    3.0
3    NaN
dtype: float64
```


The idiomatic way to select potentially not-found elements is via `.reindex()`.

Selection with all keys found is unchanged.
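Both cases can be sketched on a small illustrative Series with the default integer index:

```python
import pandas as pd

s = pd.Series([1, 2, 3])  # labels 0, 1, 2

# Label 3 is missing: reindex fills it with NaN without any warning.
with_missing = s.reindex([1, 2, 3])

# Every label exists, so list-based .loc selection behaves as before.
all_found = s.loc[[1, 2]]

print(with_missing)
print(all_found)
```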


<a id='whatsnew-0210-api-na-changes'></a>

### NA naming changes

To promote consistency across the pandas API, we have added top-level
functions `isna()` and `notna()` that are aliases for `isnull()` and `notnull()`.
The naming scheme is now more consistent with methods like `.dropna()` and `.fillna()`. Furthermore,
wherever `.isnull()` and `.notnull()` methods are defined, additional methods
named `.isna()` and `.notna()` are now available; this covers the classes `Categorical`,
`Index`, `Series`, and `DataFrame` ([GH15001](https://github.com/pandas-dev/pandas/issues/15001)).

The configuration option `pd.options.mode.use_inf_as_null` is deprecated, and `pd.options.mode.use_inf_as_na` is added as a replacement.
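A quick sketch of the new spellings next to the old ones, on made-up data:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan])

# Top-level function and method forms of the new names.
top_level = pd.isna(s)
method = s.isna()
not_missing = s.notna()

print(method)
print(not_missing)
```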


<a id='whatsnew-0210-api-breaking-iteration-scalars'></a>

### Iteration of Series/Index will now return Python scalars

Previously, when using certain iteration methods for a `Series` with dtype `int` or `float`, you would receive a `numpy` scalar, e.g. a `np.int64`, rather than a Python `int`. Issue ([GH10904](https://github.com/pandas-dev/pandas/issues/10904)) corrected this for `Series.tolist()` and `list(Series)`. This change makes all iteration methods consistent, in particular, for `__iter__()` and `.map()`; note that this only affects int/float dtypes. ([GH13236](https://github.com/pandas-dev/pandas/issues/13236), [GH13258](https://github.com/pandas-dev/pandas/issues/13258), [GH14216](https://github.com/pandas-dev/pandas/issues/14216)).

Previously:

```ipython
In [2]: type(list(s)[0])
Out[2]: numpy.int64
```


New behavior:
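A sketch with an illustrative integer Series:

```python
import pandas as pd

s = pd.Series([1, 2, 3])

# Iterating now yields plain Python ints rather than numpy scalars.
first_type = type(list(s)[0])
print(first_type)
```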

Furthermore this will now correctly box the results of iteration for `DataFrame.to_dict()` as well.

Previously:

```ipython
In [8]: type(df.to_dict()['a'][0])
Out[8]: numpy.int64
```


New behavior:
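A sketch with an illustrative frame:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': ['b', 'c']})

# Values inside the resulting dict are boxed as Python scalars.
value_type = type(df.to_dict()['a'][0])
print(value_type)
```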


<a id='whatsnew-0210-api-breaking-loc-with-index'></a>

### Indexing with a Boolean Index

Previously, when passing a boolean `Index` to `.loc`, if the index of the `Series`/`DataFrame` had boolean labels,
you would get a label-based selection, potentially duplicating result labels, rather than a boolean indexing selection
(where `True` selects elements). This was inconsistent with how a boolean numpy array indexes. The new behavior is to
act like a boolean numpy array indexer. ([GH17738](https://github.com/pandas-dev/pandas/issues/17738))

Previous behavior:

```ipython
In [59]: s.loc[pd.Index([True, False, True])]
Out[59]:
True     2
False    1
False    3
True     2
dtype: int64
```


Current behavior
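A sketch using an illustrative Series whose labels are themselves booleans:

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=[False, True, False])

# The boolean Index acts as a mask: positions 0 and 2 are selected.
masked = s.loc[pd.Index([True, False, True])]
print(masked)
```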

Furthermore, previously if you had an index that was non-numeric (e.g. strings), then a boolean Index would raise a `KeyError`.
This will now be treated as a boolean indexer.

Previous behavior:

```ipython
In [39]: s.loc[pd.Index([True, False, True])]
KeyError: "None of [Index([True, False, True], dtype='object')] are in the [index]"
```


Current behavior
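A sketch using an illustrative string-labelled Series:

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

# No KeyError: the boolean Index is treated as a mask.
masked = s.loc[pd.Index([True, False, True])]
print(masked)
```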


<a id='whatsnew-0210-api-breaking-period-index-resampling'></a>

### `PeriodIndex` resampling

In previous versions of pandas, resampling a `Series`/`DataFrame` indexed by a `PeriodIndex` returned a `DatetimeIndex` in some cases ([GH12884](https://github.com/pandas-dev/pandas/issues/12884)). Resampling to a multiplied frequency now returns a `PeriodIndex` ([GH15944](https://github.com/pandas-dev/pandas/issues/15944)). As a minor enhancement, resampling a `PeriodIndex` can now handle `NaT` values ([GH13224](https://github.com/pandas-dev/pandas/issues/13224))

Previous behavior:

```ipython
In [1]: pi = pd.period_range('2017-01', periods=12, freq='M')

In [2]: s = pd.Series(np.arange(12), index=pi)

In [3]: resampled = s.resample('2Q').mean()

In [4]: resampled
Out[4]:
2017-03-31     1.0
2017-09-30     5.5
2018-03-31    10.0
Freq: 2Q-DEC, dtype: float64

In [5]: resampled.index
Out[5]: DatetimeIndex(['2017-03-31', '2017-09-30', '2018-03-31'], dtype='datetime64[ns]', freq='2Q-DEC')
```


New behavior:

Upsampling and calling `.ohlc()` previously returned a `Series`, basically identical to calling `.asfreq()`. OHLC upsampling now returns a DataFrame with columns `open`, `high`, `low` and `close` ([GH13083](https://github.com/pandas-dev/pandas/issues/13083)). This is consistent with downsampling and `DatetimeIndex` behavior.

Previous behavior:

```ipython
In [1]: pi = pd.period_range(start='2000-01-01', freq='D', periods=10)

In [2]: s = pd.Series(np.arange(10), index=pi)

In [3]: s.resample('H').ohlc()
Out[3]:
2000-01-01 00:00    0.0
                ...
2000-01-10 23:00    NaN
Freq: H, Length: 240, dtype: float64

In [4]: s.resample('M').ohlc()
Out[4]:
         open  high  low  close
2000-01     0     9    0      9
```


New behavior:


<a id='whatsnew-0210-api-breaking-pandas-eval'></a>

### Improved error handling during item assignment in pd.eval

`pd.eval()` will now raise a `ValueError` when item assignment malfunctions, or
inplace operations are specified, but there is no item assignment in the expression ([GH16732](https://github.com/pandas-dev/pandas/issues/16732))

Previously, if you attempted the following expression, you would get a not very helpful error message:

```ipython
In [3]: pd.eval("a = 1 + 2", target=arr, inplace=True)
...
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`)
and integer or boolean arrays are valid indices
```


This is a very long way of saying numpy arrays don’t support string-item indexing. With this
change, the error message is now this:

```ipython
In [3]: pd.eval("a = 1 + 2", target=arr, inplace=True)
...
ValueError: Cannot assign expression output to target
```


It also used to be possible to evaluate expressions inplace, even if there was no item assignment:

```ipython
In [4]: pd.eval("1 + 2", target=arr, inplace=True)
Out[4]: 3
```


However, this input does not make much sense because the output is not being assigned to
the target. Now, a `ValueError` will be raised when such an input is passed in:

```ipython
In [4]: pd.eval("1 + 2", target=arr, inplace=True)
...
ValueError: Cannot operate inplace if there is no assignment
```



<a id='whatsnew-0210-api-breaking-dtype-conversions'></a>

### Dtype conversions

Previously, assignments, `.where()` and `.fillna()` with a `bool` value would coerce to the same type (e.g. int / float), or raise for datetimelikes. These will now preserve the bools with `object` dtypes. ([GH16821](https://github.com/pandas-dev/pandas/issues/16821)).

```ipython
In [5]: s[1] = True

In [6]: s
Out[6]:
0    1
1    1
2    3
dtype: int64
```


New behavior

Previously, an assignment to a datetimelike with a non-datetimelike would coerce the
non-datetimelike item being assigned ([GH14145](https://github.com/pandas-dev/pandas/issues/14145)).

```ipython
In [1]: s[1] = 1

In [2]: s
Out[2]:
0   2011-01-01 00:00:00.000000000
1   1970-01-01 00:00:00.000000001
dtype: datetime64[ns]
```


These now coerce to `object` dtype.

- Inconsistent behavior in `.where()` with datetimelikes which would raise rather than coerce to `object` ([GH16402](https://github.com/pandas-dev/pandas/issues/16402))  
- Bug in assignment against `int64` data with `np.ndarray` with `float64` dtype may keep `int64` dtype ([GH14001](https://github.com/pandas-dev/pandas/issues/14001))  



<a id='whatsnew-210-api-multiindex-single'></a>

### MultiIndex constructor with a single level

The `MultiIndex` constructors no longer squeeze a MultiIndex with all
length-one levels down to a regular `Index`. This affects all the
`MultiIndex` constructors. ([GH17178](https://github.com/pandas-dev/pandas/issues/17178))

Previous behavior:

```ipython
In [2]: pd.MultiIndex.from_tuples([('a',), ('b',)])
Out[2]: Index(['a', 'b'], dtype='object')
```


Length 1 levels are no longer special-cased. They behave exactly as if you had
length 2+ levels, so a `MultiIndex` is always returned from all of the
`MultiIndex` constructors:
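A sketch with the same single-element tuples as in the previous-behavior example:

```python
import pandas as pd

mi = pd.MultiIndex.from_tuples([('a',), ('b',)])

# A one-level MultiIndex is returned instead of a flat Index.
print(type(mi).__name__, mi.nlevels)
```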


<a id='whatsnew-0210-api-utc-localization-with-series'></a>

### UTC Localization with Series

Previously, `to_datetime()` did not localize datetime `Series` data when `utc=True` was passed. Now, `to_datetime()` will correctly localize `Series` with a `datetime64[ns, UTC]` dtype to be consistent with how list-like and `Index` data are handled. ([GH6415](https://github.com/pandas-dev/pandas/issues/6415)).

Previous behavior

```ipython
In [12]: pd.to_datetime(s, utc=True)
Out[12]:
0   2013-01-01
1   2013-01-01
2   2013-01-01
dtype: datetime64[ns]
```


New behavior
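A sketch with an illustrative Series of timestamp strings:

```python
import pandas as pd

s = pd.Series(['2013-01-01 00:00:00'] * 3)

# With utc=True the result is timezone-aware (UTC).
localized = pd.to_datetime(s, utc=True)
print(localized.dtype)
```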

Additionally, DataFrames with datetime columns that were parsed by `read_sql_table()` and `read_sql_query()` will also be localized to UTC only if the original SQL columns were timezone aware datetime columns.


<a id='whatsnew-0210-api-consistency-of-range-functions'></a>

### Consistency of range functions

In previous versions, there were some inconsistencies between the various range functions: `date_range()`, `bdate_range()`, `period_range()`, `timedelta_range()`, and `interval_range()`. ([GH17471](https://github.com/pandas-dev/pandas/issues/17471)).

One of the inconsistent behaviors occurred when the `start`, `end` and `periods` parameters were all specified, potentially leading to ambiguous ranges.  When all three parameters were passed, `interval_range` ignored the `periods` parameter, `period_range` ignored the `end` parameter, and the other range functions raised.  To promote consistency among the range functions, and avoid potentially ambiguous ranges, `interval_range` and `period_range` will now raise when all three parameters are passed.

Previous behavior:

```ipython
In [2]: pd.interval_range(start=0, end=4, periods=6)
Out[2]:
IntervalIndex([(0, 1], (1, 2], (2, 3]]
              closed='right',
              dtype='interval[int64]')

In [3]: pd.period_range(start='2017Q1', end='2017Q4', periods=6, freq='Q')
Out[3]: PeriodIndex(['2017Q1', '2017Q2', '2017Q3', '2017Q4', '2018Q1', '2018Q2'], dtype='period[Q-DEC]', freq='Q-DEC')
```


New behavior:

```ipython
In [2]: pd.interval_range(start=0, end=4, periods=6)
---------------------------------------------------------------------------
ValueError: Of the three parameters: start, end, and periods, exactly two must be specified

In [3]: pd.period_range(start='2017Q1', end='2017Q4', periods=6, freq='Q')
---------------------------------------------------------------------------
ValueError: Of the three parameters: start, end, and periods, exactly two must be specified
```


Additionally, the endpoint parameter `end` was not included in the intervals produced by `interval_range`.  However, all other range functions include `end` in their output.  To promote consistency among the range functions, `interval_range` will now include `end` as the right endpoint of the final interval, except if `freq` is specified in a way which skips `end`.

Previous behavior:

```ipython
In [4]: pd.interval_range(start=0, end=4)
Out[4]:
IntervalIndex([(0, 1], (1, 2], (2, 3]]
              closed='right',
              dtype='interval[int64]')
```


New behavior:
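
As a hedged illustration of the new behavior, the sketch below inspects the endpoints of the result instead of printing its repr, since the repr differs between pandas versions. The variable names are illustrative.

```python
import pandas as pd

# With the new behavior, `end` is the right endpoint of the final interval.
idx = pd.interval_range(start=0, end=4)
last = idx[-1]

print(len(idx), last.left, last.right)
```

Under the previous behavior the index stopped at `(2, 3]`, so code that relied on its length may need adjusting.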


<a id='whatsnew-0210-api-mpl-converters'></a>

### No automatic Matplotlib converters

Pandas no longer registers its `date`, `time`, `datetime`,
`datetime64`, and `Period` converters with matplotlib when pandas is
imported. Matplotlib plot methods (`plt.plot`, `ax.plot`, …) will not
nicely format the x-axis for `DatetimeIndex` or `PeriodIndex` values. You
must explicitly register these converters.
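
A hedged sketch of explicit registration follows. It assumes a pandas release that provides `pandas.plotting.register_matplotlib_converters()`, a public helper from later releases that is not named in the text above, and it skips registration when matplotlib is not installed. `HAVE_MPL` is an illustrative name.

```python
import importlib.util

# Only attempt registration when matplotlib is importable.
HAVE_MPL = importlib.util.find_spec("matplotlib") is not None

if HAVE_MPL:
    # Assumption: this helper exists in the installed pandas version.
    from pandas.plotting import register_matplotlib_converters

    # Adds pandas' date/time/Period converters to matplotlib's unit registry
    # so that plt.plot / ax.plot can format datetime-like x-axes.
    register_matplotlib_converters()
```

Calling the helper once, near the top of a plotting script, is typically sufficient.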

Pandas built-in `Series.plot` and `DataFrame.plot` *will* register these
converters on first-use ([GH17710](https://github.com/pandas-dev/pandas/issues/17710)).

>**Note**
>
>This change has been temporarily reverted in pandas 0.21.1.
>For more details, see [here](v0.21.1#whatsnew-0211-converters).


<a id='whatsnew-0210-api'></a>

### Other API changes

- The Categorical constructor no longer accepts a scalar for the `categories` keyword. ([GH16022](https://github.com/pandas-dev/pandas/issues/16022))  
- Accessing a non-existent attribute on a closed `HDFStore` will now
  raise an `AttributeError` rather than a `ClosedFileError` ([GH16301](https://github.com/pandas-dev/pandas/issues/16301))  
- `read_csv()` now issues a `UserWarning` if the `names` parameter contains duplicates ([GH17095](https://github.com/pandas-dev/pandas/issues/17095))  
- `read_csv()` now treats `'null'` and `'n/a'` strings as missing values by default ([GH16471](https://github.com/pandas-dev/pandas/issues/16471), [GH16078](https://github.com/pandas-dev/pandas/issues/16078))  
- `pandas.HDFStore`’s string representation is now faster and less detailed. For the previous behavior, use `pandas.HDFStore.info()`. ([GH16503](https://github.com/pandas-dev/pandas/issues/16503)).  
- Compression defaults in HDF stores now follow pytables standards. The default is no compression; if `complib` is missing and `complevel` > 0, `zlib` is used ([GH15943](https://github.com/pandas-dev/pandas/issues/15943))  
- `Index.get_indexer_non_unique()` now returns a ndarray indexer rather than an `Index`; this is consistent with `Index.get_indexer()` ([GH16819](https://github.com/pandas-dev/pandas/issues/16819))  
- Removed the `@slow` decorator from `pandas.util.testing`, which caused issues for some downstream packages’ test suites. Use `@pytest.mark.slow` instead, which achieves the same thing ([GH16850](https://github.com/pandas-dev/pandas/issues/16850))  
- Moved definition of `MergeError` to the `pandas.errors` module.  
- The signature of `Series.set_axis()` and `DataFrame.set_axis()` has been changed from `set_axis(axis, labels)` to `set_axis(labels, axis=0)`, for consistency with the rest of the API. The old signature is deprecated and will show a `FutureWarning` ([GH14636](https://github.com/pandas-dev/pandas/issues/14636))  
- `Series.argmin()` and `Series.argmax()` will now raise a `TypeError` when used with `object` dtypes, instead of a `ValueError` ([GH13595](https://github.com/pandas-dev/pandas/issues/13595))  
- `Period` is now immutable, and will now raise an `AttributeError` when a user tries to assign a new value to the `ordinal` or `freq` attributes ([GH17116](https://github.com/pandas-dev/pandas/issues/17116)).  
- `to_datetime()` when passed a tz-aware `origin=` kwarg will now raise a more informative `ValueError` rather than a `TypeError` ([GH16842](https://github.com/pandas-dev/pandas/issues/16842))  
- `to_datetime()` now raises a `ValueError` when format includes `%W` or `%U` without also including day of the week and calendar year ([GH16774](https://github.com/pandas-dev/pandas/issues/16774))  
- Renamed non-functional `index` to `index_col` in `read_stata()` to improve API consistency ([GH16342](https://github.com/pandas-dev/pandas/issues/16342))  
- Bug in `DataFrame.drop()` caused boolean labels `False` and `True` to be treated as labels 0 and 1 respectively when dropping indices from a numeric index. This will now raise a `ValueError` ([GH16877](https://github.com/pandas-dev/pandas/issues/16877))  
- Restricted DateOffset keyword arguments.  Previously, `DateOffset` subclasses allowed arbitrary keyword arguments which could lead to unexpected behavior.  Now, only valid arguments will be accepted. ([GH17176](https://github.com/pandas-dev/pandas/issues/17176)).  
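
Two of the changes above can be illustrated with a short sketch. The CSV text and variable names are made up for the example, and the behavior shown assumes a pandas version that includes these changes.

```python
import io

import pandas as pd

# 'null' and 'n/a' are parsed as missing values by default.
csv_text = "a,b\n1,null\nn/a,2\n"
frame = pd.read_csv(io.StringIO(csv_text))
missing_count = int(frame.isna().sum().sum())

# Period objects are immutable: assigning to `freq` raises AttributeError.
quarter = pd.Period("2017Q1")
try:
    quarter.freq = "M"
    period_is_mutable = True
except AttributeError:
    period_is_mutable = False

print(missing_count, period_is_mutable)
```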



<a id='whatsnew-0210-deprecations'></a>

## Deprecations

- `DataFrame.from_csv()` and `Series.from_csv()` have been deprecated in favor of `read_csv()` ([GH4191](https://github.com/pandas-dev/pandas/issues/4191))  
- `read_excel()` has deprecated `sheetname` in favor of `sheet_name` for consistency with `.to_excel()` ([GH10559](https://github.com/pandas-dev/pandas/issues/10559)).  
- `read_excel()` has deprecated `parse_cols` in favor of `usecols` for consistency with `read_csv()` ([GH4988](https://github.com/pandas-dev/pandas/issues/4988))  
- `read_csv()` has deprecated the `tupleize_cols` argument. Column tuples will always be converted to a `MultiIndex` ([GH17060](https://github.com/pandas-dev/pandas/issues/17060))  
- `DataFrame.to_csv()` has deprecated the `tupleize_cols` argument. MultiIndex columns will always be written as rows in the CSV file ([GH17060](https://github.com/pandas-dev/pandas/issues/17060))  
- The `convert` parameter has been deprecated in the `.take()` method, as it was not being respected ([GH16948](https://github.com/pandas-dev/pandas/issues/16948))  
- `pd.options.html.border` has been deprecated in favor of `pd.options.display.html.border` ([GH15793](https://github.com/pandas-dev/pandas/issues/15793)).  
- `SeriesGroupBy.nth()` has deprecated `True` in favor of `'all'` for its kwarg `dropna` ([GH11038](https://github.com/pandas-dev/pandas/issues/11038)).  
- `DataFrame.as_blocks()` is deprecated, as this is exposing the internal implementation ([GH17302](https://github.com/pandas-dev/pandas/issues/17302))  
- `pd.TimeGrouper` is deprecated in favor of `pandas.Grouper` ([GH16747](https://github.com/pandas-dev/pandas/issues/16747))  
- `cdate_range` has been deprecated in favor of `bdate_range()`, which has gained `weekmask` and `holidays` parameters for building custom frequency date ranges. See the [documentation](user_guide/timeseries.ipynb#timeseries-custom-freq-ranges) for more details ([GH17596](https://github.com/pandas-dev/pandas/issues/17596))  
- Passing `categories` or `ordered` kwargs to `Series.astype()` is deprecated, in favor of passing a [CategoricalDtype](#whatsnew-0210-enhancements-categorical-dtype) ([GH17636](https://github.com/pandas-dev/pandas/issues/17636))  
- `.get_value` and `.set_value` on `Series`, `DataFrame`, `Panel`, `SparseSeries`, and `SparseDataFrame` are deprecated in favor of using `.iat[]` or `.at[]` accessors ([GH15269](https://github.com/pandas-dev/pandas/issues/15269))  
- Passing a non-existent column in `.to_excel(..., columns=)` is deprecated and will raise a `KeyError` in the future ([GH17295](https://github.com/pandas-dev/pandas/issues/17295))  
- `raise_on_error` parameter to `Series.where()`, `Series.mask()`, `DataFrame.where()`, `DataFrame.mask()` is deprecated, in favor of `errors=` ([GH14968](https://github.com/pandas-dev/pandas/issues/14968))  
- Using `DataFrame.rename_axis()` and `Series.rename_axis()` to alter index or column *labels* is now deprecated in favor of using `.rename`. `rename_axis` may still be used to alter the name of the index or columns ([GH17833](https://github.com/pandas-dev/pandas/issues/17833)).  
- `reindex_axis()` has been deprecated in favor of `reindex()`. See [here](#whatsnew-0210-enhancements-rename-reindex-axis) for more ([GH17833](https://github.com/pandas-dev/pandas/issues/17833)).  
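
The sketch below shows the suggested replacements for two of these deprecations: `.at[]`/`.iat[]` in place of `.get_value`/`.set_value`, and a `CategoricalDtype` in place of `categories`/`ordered` kwargs. The data and variable names are illustrative.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

df = pd.DataFrame({"A": [1, 2, 3]}, index=["foo", "bar", "baz"])

# Instead of df.set_value("bar", "A", 20) and df.get_value(...):
df.at["bar", "A"] = 20       # label-based scalar access
first_value = df.iat[0, 0]   # position-based scalar access

# Instead of sizes.astype("category", categories=[...], ordered=True):
sizes = pd.Series(["small", "large", "small"])
size_dtype = CategoricalDtype(categories=["small", "medium", "large"], ordered=True)
sizes_cat = sizes.astype(size_dtype)

print(df.at["bar", "A"], first_value, sizes_cat.cat.ordered)
```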



<a id='whatsnew-0210-deprecations-select'></a>

### Series.select and DataFrame.select

The `Series.select()` and `DataFrame.select()` methods are deprecated in favor of using `df.loc[labels.map(crit)]` ([GH12401](https://github.com/pandas-dev/pandas/issues/12401))

```ipython
In [3]: df.select(lambda x: x in ['bar', 'baz'])
FutureWarning: select is deprecated and will be removed in a future release. You can use .loc[crit] as a replacement
Out[3]:
     A
bar  2
baz  3
```
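
The replacement can be sketched as follows. The frame is an assumption reconstructed from the output above (the `foo` row is hypothetical), and `crit` is an illustrative name for the selection function.

```python
import pandas as pd

# Assumed example frame; only the 'bar' and 'baz' rows appear in the output above.
df = pd.DataFrame({"A": [1, 2, 3]}, index=["foo", "bar", "baz"])


def crit(label):
    """Return True for the index labels to keep."""
    return label in ["bar", "baz"]


# Map the criterion over the index labels and use the result as a boolean mask.
selected = df.loc[df.index.map(crit)]
print(selected)
```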



<a id='whatsnew-0210-deprecations-argmin-min'></a>

### Series.argmax and Series.argmin

The behavior of `Series.argmax()` and `Series.argmin()` has been deprecated in favor of `Series.idxmax()` and `Series.idxmin()`, respectively ([GH16830](https://github.com/pandas-dev/pandas/issues/16830)).

For compatibility with NumPy arrays, `pd.Series` implements `argmax` and
`argmin`. Since pandas 0.13.0, `argmax` has been an alias for
`pandas.Series.idxmax()`, and `argmin` has been an alias for
`pandas.Series.idxmin()`. They return the *label* of the maximum or minimum,
rather than the *position*.

We’ve deprecated the current behavior of `Series.argmax` and
`Series.argmin`. Using either of these will emit a `FutureWarning`. Use
`Series.idxmax()` if you want the label of the maximum. Use
`Series.values.argmax()` if you want the position of the maximum. Likewise for
the minimum. In a future release `Series.argmax` and `Series.argmin` will
return the position of the maximum or minimum.
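
The difference between the label and the position can be seen in a small sketch with illustrative data:

```python
import pandas as pd

s = pd.Series([1, 5, 3], index=["a", "b", "c"])

label_of_max = s.idxmax()            # label of the maximum
position_of_max = s.values.argmax()  # integer position of the maximum

print(label_of_max, position_of_max)
```

Using these two spellings avoids depending on what `Series.argmax` returns in a given pandas version.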


<a id='whatsnew-0210-prior-deprecations'></a>

## Removal of prior version deprecations/changes

- `read_excel()` has dropped the `has_index_names` parameter ([GH10967](https://github.com/pandas-dev/pandas/issues/10967))  
- The `pd.options.display.height` configuration has been dropped ([GH3663](https://github.com/pandas-dev/pandas/issues/3663))  
- The `pd.options.display.line_width` configuration has been dropped ([GH2881](https://github.com/pandas-dev/pandas/issues/2881))  
- The `pd.options.display.mpl_style` configuration has been dropped ([GH12190](https://github.com/pandas-dev/pandas/issues/12190))  
- `Index` has dropped the `.sym_diff()` method in favor of `.symmetric_difference()` ([GH12591](https://github.com/pandas-dev/pandas/issues/12591))  
- `Categorical` has dropped the `.order()` and `.sort()` methods in favor of `.sort_values()` ([GH12882](https://github.com/pandas-dev/pandas/issues/12882))  
- `eval()` and `DataFrame.eval()` have changed the default of `inplace` from `None` to `False` ([GH11149](https://github.com/pandas-dev/pandas/issues/11149))  
- The function `get_offset_name` has been dropped in favor of the `.freqstr` attribute for an offset ([GH11834](https://github.com/pandas-dev/pandas/issues/11834))  
- pandas no longer tests for compatibility with hdf5-files created with pandas < 0.11 ([GH17404](https://github.com/pandas-dev/pandas/issues/17404)).  



<a id='whatsnew-0210-performance'></a>

## Performance improvements

- Improved performance of instantiating `SparseDataFrame` ([GH16773](https://github.com/pandas-dev/pandas/issues/16773))  
- `Series.dt` no longer performs frequency inference, yielding a large speedup when accessing the attribute ([GH17210](https://github.com/pandas-dev/pandas/issues/17210))  
- Improved performance of `set_categories()` by not materializing the values ([GH17508](https://github.com/pandas-dev/pandas/issues/17508))  
- `Timestamp.microsecond` no longer re-computes on attribute access ([GH17331](https://github.com/pandas-dev/pandas/issues/17331))  
- Improved performance of the `CategoricalIndex` for data that is already categorical dtype ([GH17513](https://github.com/pandas-dev/pandas/issues/17513))  
- Improved performance of `RangeIndex.min()` and `RangeIndex.max()` by using `RangeIndex` properties to perform the computations ([GH17607](https://github.com/pandas-dev/pandas/issues/17607))  



<a id='whatsnew-0210-docs'></a>

## Documentation changes

- Several `NaT` method docstrings (e.g. `NaT.ctime()`) were incorrect ([GH17327](https://github.com/pandas-dev/pandas/issues/17327))  
- The documentation has had references to versions < v0.17 removed and cleaned up ([GH17442](https://github.com/pandas-dev/pandas/issues/17442), [GH17404](https://github.com/pandas-dev/pandas/issues/17404) & [GH17504](https://github.com/pandas-dev/pandas/issues/17504))  



<a id='whatsnew-0210-bug-fixes'></a>

## Bug fixes

### Conversion

- Bug where assigning an `int` to datetime-like data could incorrectly convert the value to datetime-like ([GH14145](https://github.com/pandas-dev/pandas/issues/14145))  
- Bug where assigning an `np.ndarray` with `float64` dtype to `int64` data could keep the `int64` dtype ([GH14001](https://github.com/pandas-dev/pandas/issues/14001))  
- Fixed the return type of `IntervalIndex.is_non_overlapping_monotonic` to be a Python `bool` for consistency with similar attributes/methods.  Previously returned a `numpy.bool_`. ([GH17237](https://github.com/pandas-dev/pandas/issues/17237))  
- Bug in `IntervalIndex.is_non_overlapping_monotonic` when intervals are closed on both sides and overlap at a point ([GH16560](https://github.com/pandas-dev/pandas/issues/16560))  
- Bug in `Series.fillna()` which returned a frame when `inplace=True` and `value` was a dict ([GH16156](https://github.com/pandas-dev/pandas/issues/16156))  
- Bug in `Timestamp.weekday_name` returning a UTC-based weekday name when localized to a timezone ([GH17354](https://github.com/pandas-dev/pandas/issues/17354))  
- Bug in `Timestamp.replace` when replacing `tzinfo` around DST changes ([GH15683](https://github.com/pandas-dev/pandas/issues/15683))  
- Bug in `Timedelta` construction and arithmetic that would not propagate the `Overflow` exception ([GH17367](https://github.com/pandas-dev/pandas/issues/17367))  
- Bug in `astype()` converting to object dtype when passed extension type classes (`DatetimeTZDtype`, `CategoricalDtype`) rather than instances. Now a `TypeError` is raised when a class is passed ([GH17780](https://github.com/pandas-dev/pandas/issues/17780)).  
- Bug in `to_numeric()` in which elements were not always being coerced to numeric when `errors='coerce'` ([GH17007](https://github.com/pandas-dev/pandas/issues/17007), [GH17125](https://github.com/pandas-dev/pandas/issues/17125))  
- Bug in `DataFrame` and `Series` constructors where `range` objects are converted to `int32` dtype on Windows instead of `int64` ([GH16804](https://github.com/pandas-dev/pandas/issues/16804))  

### Indexing

- When called with a null slice (e.g. `df.iloc[:]`), the `.iloc` and `.loc` indexers return a shallow copy of the original object. Previously they returned the original object. ([GH13873](https://github.com/pandas-dev/pandas/issues/13873)).  
- When called on an unsorted `MultiIndex`, the `loc` indexer will now raise `UnsortedIndexError` only if proper slicing is used on non-sorted levels ([GH16734](https://github.com/pandas-dev/pandas/issues/16734)).  
- Fixes regression in 0.20.3 when indexing with a string on a `TimedeltaIndex` ([GH16896](https://github.com/pandas-dev/pandas/issues/16896)).  
- Fixed `TimedeltaIndex.get_loc()` handling of `np.timedelta64` inputs ([GH16909](https://github.com/pandas-dev/pandas/issues/16909)).  
- Fix `MultiIndex.sort_index()` ordering when `ascending` argument is a list, but not all levels are specified, or are in a different order ([GH16934](https://github.com/pandas-dev/pandas/issues/16934)).  
- Fixes bug where indexing with `np.inf` caused an `OverflowError` to be raised ([GH16957](https://github.com/pandas-dev/pandas/issues/16957))  
- Bug in reindexing on an empty `CategoricalIndex` ([GH16770](https://github.com/pandas-dev/pandas/issues/16770))  
- Fixes `DataFrame.loc` for setting with alignment and tz-aware `DatetimeIndex` ([GH16889](https://github.com/pandas-dev/pandas/issues/16889))  
- Avoids `IndexError` when passing an Index or Series to `.iloc` with older numpy ([GH17193](https://github.com/pandas-dev/pandas/issues/17193))  
- Allow unicode empty strings as placeholders in multilevel columns in Python 2 ([GH17099](https://github.com/pandas-dev/pandas/issues/17099))  
- Bug in `.iloc` when used with inplace addition or assignment and an int indexer on a `MultiIndex` causing the wrong indexes to be read from and written to ([GH17148](https://github.com/pandas-dev/pandas/issues/17148))  
- Bug in `.isin()` in which checking membership in empty `Series` objects raised an error ([GH16991](https://github.com/pandas-dev/pandas/issues/16991))  
- Bug in `CategoricalIndex` reindexing in which specified indices containing duplicates were not being respected ([GH17323](https://github.com/pandas-dev/pandas/issues/17323))  
- Bug in intersection of `RangeIndex` with negative step ([GH17296](https://github.com/pandas-dev/pandas/issues/17296))  
- Bug in `IntervalIndex` where performing a scalar lookup fails for included right endpoints of non-overlapping monotonic decreasing indexes ([GH16417](https://github.com/pandas-dev/pandas/issues/16417), [GH17271](https://github.com/pandas-dev/pandas/issues/17271))  
- Bug in `DataFrame.first_valid_index()` and `DataFrame.last_valid_index()` when no valid entry ([GH17400](https://github.com/pandas-dev/pandas/issues/17400))  
- Bug in `Series.rename()` when called with a callable, incorrectly alters the name of the `Series`, rather than the name of the `Index`. ([GH17407](https://github.com/pandas-dev/pandas/issues/17407))  
- Bug in `Series.str.get()` which raised `IndexError` instead of inserting NaNs when using a negative index. ([GH17704](https://github.com/pandas-dev/pandas/issues/17704))  
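
The null-slice change in the first item above can be checked with a minimal sketch. The frame is illustrative, and the result assumes a pandas version that includes this change.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]})

# A null slice returns a new (shallow-copied) object, not `df` itself.
same_object_iloc = df.iloc[:] is df
same_object_loc = df.loc[:] is df

# The data in the copy still compares equal to the original.
same_data = df.iloc[:].equals(df)

print(same_object_iloc, same_object_loc, same_data)
```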

### I/O

- Bug in `read_hdf()` when reading a timezone aware index from `fixed` format HDFStore ([GH17618](https://github.com/pandas-dev/pandas/issues/17618))  
- Bug in `read_csv()` in which columns were not being thoroughly de-duplicated ([GH17060](https://github.com/pandas-dev/pandas/issues/17060))  
- Bug in `read_csv()` in which specified column names were not being thoroughly de-duplicated ([GH17095](https://github.com/pandas-dev/pandas/issues/17095))  
- Bug in `read_csv()` in which non integer values for the header argument generated an unhelpful / unrelated error message ([GH16338](https://github.com/pandas-dev/pandas/issues/16338))  
- Bug in `read_csv()` in which memory management issues in exception handling, under certain conditions, would cause the interpreter to segfault ([GH14696](https://github.com/pandas-dev/pandas/issues/14696), [GH16798](https://github.com/pandas-dev/pandas/issues/16798)).  
- Bug in `read_csv()` when called with `low_memory=False` in which a CSV with at least one column > 2GB in size would incorrectly raise a `MemoryError` ([GH16798](https://github.com/pandas-dev/pandas/issues/16798)).  
- Bug in `read_csv()` when called with a single-element list `header` would return a `DataFrame` of all NaN values ([GH7757](https://github.com/pandas-dev/pandas/issues/7757))  
- Bug in `DataFrame.to_csv()` defaulting to ‘ascii’ encoding in Python 3, instead of ‘utf-8’ ([GH17097](https://github.com/pandas-dev/pandas/issues/17097))  
- Bug in `read_stata()` where value labels could not be read when using an iterator ([GH16923](https://github.com/pandas-dev/pandas/issues/16923))  
- Bug in `read_stata()` where the index was not set ([GH16342](https://github.com/pandas-dev/pandas/issues/16342))  
- Bug in `read_html()` where import check fails when run in multiple threads ([GH16928](https://github.com/pandas-dev/pandas/issues/16928))  
- Bug in `read_csv()` where automatic delimiter detection caused a `TypeError` to be thrown when a bad line was encountered rather than the correct error message ([GH13374](https://github.com/pandas-dev/pandas/issues/13374))  
- Bug in `DataFrame.to_html()` with `notebook=True` where DataFrames with named indices or non-MultiIndex indices had undesired horizontal or vertical alignment for column or row labels, respectively ([GH16792](https://github.com/pandas-dev/pandas/issues/16792))  
- Bug in `DataFrame.to_html()` in which there was no validation of the `justify` parameter ([GH17527](https://github.com/pandas-dev/pandas/issues/17527))  
- Bug in `HDFStore.select()` when reading a contiguous mixed-data table featuring VLArray ([GH17021](https://github.com/pandas-dev/pandas/issues/17021))  
- Bug in `to_json()` where several conditions (including objects with unprintable symbols, objects with deep recursion, overlong labels) caused segfaults instead of raising the appropriate exception ([GH14256](https://github.com/pandas-dev/pandas/issues/14256))  

### Plotting

- Bug in plotting methods using `secondary_y` and `fontsize` not setting secondary axis font size ([GH12565](https://github.com/pandas-dev/pandas/issues/12565))  
- Bug when plotting `timedelta` and `datetime` dtypes on y-axis ([GH16953](https://github.com/pandas-dev/pandas/issues/16953))  
- Line plots no longer assume monotonic x data when calculating xlims; they now show the entire lines even for unsorted x data. ([GH11310](https://github.com/pandas-dev/pandas/issues/11310), [GH11471](https://github.com/pandas-dev/pandas/issues/11471))  
- With matplotlib 2.0.0 and above, calculation of x limits for line plots is left to matplotlib, so that its new default settings are applied. ([GH15495](https://github.com/pandas-dev/pandas/issues/15495))  
- Bug in `Series.plot.bar` or `DataFrame.plot.bar` with `y` not respecting user-passed `color` ([GH16822](https://github.com/pandas-dev/pandas/issues/16822))  
- Bug causing `plotting.parallel_coordinates` to reset the random seed when using random colors ([GH17525](https://github.com/pandas-dev/pandas/issues/17525))  

### Groupby/resample/rolling

- Bug in `DataFrame.resample(...).size()` where an empty `DataFrame` did not return a `Series` ([GH14962](https://github.com/pandas-dev/pandas/issues/14962))  
- Bug in `infer_freq()` causing indices with 2-day gaps during the working week to be wrongly inferred as business daily ([GH16624](https://github.com/pandas-dev/pandas/issues/16624))  
- Bug in `.rolling(...).quantile()` which incorrectly used different defaults than `Series.quantile()` and `DataFrame.quantile()` ([GH9413](https://github.com/pandas-dev/pandas/issues/9413), [GH16211](https://github.com/pandas-dev/pandas/issues/16211))  
- Bug in `groupby.transform()` that would coerce boolean dtypes back to float ([GH16875](https://github.com/pandas-dev/pandas/issues/16875))  
- Bug in `Series.resample(...).apply()` where an empty `Series` modified the source index and did not return the name of a `Series` ([GH14313](https://github.com/pandas-dev/pandas/issues/14313))  
- Bug in `.rolling(...).apply(...)` with a `DataFrame` with a `DatetimeIndex`, a `window` of a timedelta-convertible and `min_periods >= 1` ([GH15305](https://github.com/pandas-dev/pandas/issues/15305))  
- Bug in `DataFrame.groupby` where index and column keys were not recognized correctly when the number of keys equaled the number of elements on the groupby axis ([GH16859](https://github.com/pandas-dev/pandas/issues/16859))  
- Bug in `groupby.nunique()` with `TimeGrouper` which cannot handle `NaT` correctly ([GH17575](https://github.com/pandas-dev/pandas/issues/17575))  
- Bug in `DataFrame.groupby` where a single level selection from a `MultiIndex` unexpectedly sorts ([GH17537](https://github.com/pandas-dev/pandas/issues/17537))  
- Bug in `DataFrame.groupby` where spurious warning is raised when `Grouper` object is used to override ambiguous column name ([GH17383](https://github.com/pandas-dev/pandas/issues/17383))  
- Bug where `TimeGrouper` behaved differently when passed as a list and as a scalar ([GH17530](https://github.com/pandas-dev/pandas/issues/17530))  

### Sparse

- Bug in `SparseSeries` raises `AttributeError` when a dictionary is passed in as data ([GH16905](https://github.com/pandas-dev/pandas/issues/16905))  
- Bug in `SparseDataFrame.fillna()` not filling all NaNs when frame was instantiated from SciPy sparse matrix ([GH16112](https://github.com/pandas-dev/pandas/issues/16112))  
- Bug in `SparseSeries.unstack()` and `SparseDataFrame.stack()` ([GH16614](https://github.com/pandas-dev/pandas/issues/16614), [GH15045](https://github.com/pandas-dev/pandas/issues/15045))  
- Bug in `make_sparse()` treating two numeric/boolean values with the same bit pattern as equal when the array `dtype` is `object` ([GH17574](https://github.com/pandas-dev/pandas/issues/17574))  
- `SparseArray.all()` and `SparseArray.any()` are now implemented to handle `SparseArray`; these were used but not implemented ([GH17570](https://github.com/pandas-dev/pandas/issues/17570))  

### Reshaping

- Joining/Merging with a non unique `PeriodIndex` raised a `TypeError` ([GH16871](https://github.com/pandas-dev/pandas/issues/16871))  
- Bug in `crosstab()` where non-aligned series of integers were cast to float ([GH17005](https://github.com/pandas-dev/pandas/issues/17005))  
- Bug in merging with categorical dtypes with datetimelikes incorrectly raised a `TypeError` ([GH16900](https://github.com/pandas-dev/pandas/issues/16900))  
- Bug when using `isin()` on a large object series and large comparison array ([GH16012](https://github.com/pandas-dev/pandas/issues/16012))  
- Fixes regression from 0.20: `Series.aggregate()` and `DataFrame.aggregate()` allow dictionaries as return values again ([GH16741](https://github.com/pandas-dev/pandas/issues/16741))  
- Fixes dtype of result with integer dtype input, from `pivot_table()` when called with `margins=True` ([GH17013](https://github.com/pandas-dev/pandas/issues/17013))  
- Bug in `crosstab()` where passing two `Series` with the same name raised a `KeyError` ([GH13279](https://github.com/pandas-dev/pandas/issues/13279))  
- `Series.argmin()`, `Series.argmax()`, and their counterparts on `DataFrame` and groupby objects work correctly with floating point data that contains infinite values ([GH13595](https://github.com/pandas-dev/pandas/issues/13595)).  
- Bug in `unique()` where checking a tuple of strings raised a `TypeError` ([GH17108](https://github.com/pandas-dev/pandas/issues/17108))  
- Bug in `concat()` where order of result index was unpredictable if it contained non-comparable elements ([GH17344](https://github.com/pandas-dev/pandas/issues/17344))  
- Fixes regression when sorting by multiple columns on a `datetime64` dtype `Series` with `NaT` values ([GH16836](https://github.com/pandas-dev/pandas/issues/16836))  
- Bug in `pivot_table()` where the result’s columns did not preserve the categorical dtype of `columns` when `dropna` was `False` ([GH17842](https://github.com/pandas-dev/pandas/issues/17842))  
- Bug in `DataFrame.drop_duplicates` where dropping with non-unique column names raised a `ValueError` ([GH17836](https://github.com/pandas-dev/pandas/issues/17836))  
- Bug in `unstack()` which, when called on a list of levels, would discard the `fill_value` argument ([GH13971](https://github.com/pandas-dev/pandas/issues/13971))  
- Bug in the alignment of `range` objects and other list-likes with `DataFrame` leading to operations being performed row-wise instead of column-wise ([GH17901](https://github.com/pandas-dev/pandas/issues/17901))  

### Numeric

- Bug in `.clip()` when `axis=1` and a list-like is passed for `threshold`; previously this raised `ValueError` ([GH15390](https://github.com/pandas-dev/pandas/issues/15390))  
- `Series.clip()` and `DataFrame.clip()` now treat NA values for upper and lower arguments as `None` instead of raising `ValueError` ([GH17276](https://github.com/pandas-dev/pandas/issues/17276)).  
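
The NA handling in `clip` can be sketched as below, assuming a pandas version that includes this change. The data is illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3])

# A NaN bound is treated like None, so only the upper bound is applied.
clipped = s.clip(lower=np.nan, upper=2)

print(clipped.tolist())
```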

### Categorical

- Bug in `Series.isin()` when called with a categorical ([GH16639](https://github.com/pandas-dev/pandas/issues/16639))  
- Bug in the categorical constructor with empty values and categories causing the `.categories` to be an empty `Float64Index` rather than an empty `Index` with object dtype ([GH17248](https://github.com/pandas-dev/pandas/issues/17248))  
- Bug in categorical operations with [Series.cat](user_guide/categorical.ipynb#categorical-cat) not preserving the original Series’ name ([GH17509](https://github.com/pandas-dev/pandas/issues/17509))  
- Bug in `DataFrame.merge()` failing for categorical columns with boolean/int data types ([GH17187](https://github.com/pandas-dev/pandas/issues/17187))  
- Bug in constructing a `Categorical`/`CategoricalDtype` when the specified `categories` are of categorical type ([GH17884](https://github.com/pandas-dev/pandas/issues/17884)).  



<a id='whatsnew-0210-pypy'></a>

### PyPy

- Compatibility with PyPy in `read_csv()` with `usecols=[<unsorted ints>]` and
  `read_json()` ([GH17351](https://github.com/pandas-dev/pandas/issues/17351))  
- Split tests into cases for CPython and PyPy where needed, which highlights the fragility
  of index matching with `float('nan')`, `np.nan` and `NaT` ([GH17351](https://github.com/pandas-dev/pandas/issues/17351))  
- Fix `DataFrame.memory_usage()` to support PyPy. Objects on PyPy do not have a fixed size,
  so an approximation is used instead ([GH17228](https://github.com/pandas-dev/pandas/issues/17228))  

### Other

- Bug where some inplace operators were not being wrapped and produced a copy when invoked ([GH12962](https://github.com/pandas-dev/pandas/issues/12962))  
- Bug in `eval()` where the `inplace` parameter was being incorrectly handled ([GH16732](https://github.com/pandas-dev/pandas/issues/16732))  



<a id='whatsnew-0-21-0-contributors'></a>

## Contributors

A total of 206 people contributed patches to this release.  People with a
“+” by their names contributed a patch for the first time.


- 3553x +  
- Aaron Barber  
- Adam Gleave +  
- Adam Smith +  
- AdamShamlian +  
- Adrian Liaw +  
- Alan Velasco +  
- Alan Yee +  
- Alex B +  
- Alex Lubbock +  
- Alex Marchenko +  
- Alex Rychyk +  
- Amol K +  
- Andreas Winkler  
- Andrew +  
- Andrew 亮  
- André Jonasson +  
- Becky Sweger  
- Berkay +  
- Bob Haffner +  
- Bran Yang  
- Brian Tu +  
- Brock Mendel +  
- Carol Willing +  
- Carter Green +  
- Chankey Pathak +  
- Chris  
- Chris Billington  
- Chris Filo Gorgolewski +  
- Chris Kerr  
- Chris M +  
- Chris Mazzullo +  
- Christian Prinoth  
- Christian Stade-Schuldt  
- Christoph Moehl +  
- DSM  
- Daniel Chen +  
- Daniel Grady  
- Daniel Himmelstein  
- Dave Willmer  
- David Cook  
- David Gwynne  
- David Read +  
- Dillon Niederhut +  
- Douglas Rudd  
- Eric Stein +  
- Eric Wieser +  
- Erik Fredriksen  
- Florian Wilhelm +  
- Floris Kint +  
- Forbidden Donut  
- Gabe F +  
- Giftlin +  
- Giftlin Rajaiah +  
- Giulio Pepe +  
- Guilherme Beltramini  
- Guillem Borrell +  
- Hanmin Qin +  
- Hendrik Makait +  
- Hugues Valois  
- Hussain Tamboli +  
- Iva Miholic +  
- Jan Novotný +  
- Jan Rudolph  
- Jean Helie +  
- Jean-Baptiste Schiratti +  
- Jean-Mathieu Deschenes  
- Jeff Knupp +  
- Jeff Reback  
- Jeff Tratner  
- JennaVergeynst  
- JimStearns206  
- Joel Nothman  
- John W. O’Brien  
- Jon Crall +  
- Jon Mease  
- Jonathan J. Helmus +  
- Joris Van den Bossche  
- JosephWagner  
- Juarez Bochi  
- Julian Kuhlmann +  
- Karel De Brabandere  
- Kassandra Keeton +  
- Keiron Pizzey +  
- Keith Webber  
- Kernc  
- Kevin Sheppard  
- Kirk Hansen +  
- Licht Takeuchi +  
- Lucas Kushner +  
- Mahdi Ben Jelloul +  
- Makarov Andrey +  
- Malgorzata Turzanska +  
- Marc Garcia +  
- Margaret Sy +  
- MarsGuy +  
- Matt Bark +  
- Matthew Roeschke  
- Matti Picus  
- Mehmet Ali “Mali” Akmanalp  
- Michael Gasvoda +  
- Michael Penkov +  
- Milo +  
- Morgan Stuart +  
- Morgan243 +  
- Nathan Ford +  
- Nick Eubank  
- Nick Garvey +  
- Oleg Shteynbuk +  
- P-Tillmann +  
- Pankaj Pandey  
- Patrick Luo  
- Patrick O’Melveny  
- Paul Reidy +  
- Paula +  
- Peter Quackenbush  
- Peter Yanovich +  
- Phillip Cloud  
- Pierre Haessig  
- Pietro Battiston  
- Pradyumna Reddy Chinthala  
- Prasanjit Prakash  
- RobinFiveWords  
- Ryan Hendrickson  
- Sam Foo  
- Sangwoong Yoon +  
- Simon Gibbons +  
- SimonBaron  
- Steven Cutting +  
- Sudeep +  
- Sylvia +  
- T N +  
- Telt  
- Thomas A Caswell  
- Tim Swast +  
- Tom Augspurger  
- Tong SHEN  
- Tuan +  
- Utkarsh Upadhyay +  
- Vincent La +  
- Vivek +  
- WANG Aiyong  
- WBare  
- Wes McKinney  
- XF +  
- Yi Liu +  
- Yosuke Nakabayashi +  
- aaron315 +  
- abarber4gh +  
- aernlund +  
- agustín méndez +  
- andymaheshw +  
- ante328 +  
- aviolov +  
- bpraggastis  
- cbertinato +  
- cclauss +  
- chernrick  
- chris-b1  
- dkamm +  
- dwkenefick  
- economy  
- faic +  
- fding253 +  
- gfyoung  
- guygoldberg +  
- hhuuggoo +  
- huashuai +  
- ian  
- iulia +  
- jaredsnyder  
- jbrockmendel +  
- jdeschenes  
- jebob +  
- jschendel +  
- keitakurita  
- kernc +  
- kiwirob +  
- kjford  
- linebp  
- lloydkirk  
- louispotok +  
- majiang +  
- manikbhandari +  
- matthiashuschle +  
- mattip  
- maxwasserman +  
- mjlove12 +  
- nmartensen +  
- pandas-docs-bot +  
- parchd-1 +  
- philipphanemann +  
- rdk1024 +  
- reidy-p +  
- ri938  
- ruiann +  
- rvernica +  
- s-weigand +  
- scotthavard92 +  
- skwbc +  
- step4me +  
- tobycheese +  
- topper-123 +  
- tsdlovell  
- ysau +  
- zzgao +  