Scheduled monthly dependency update for January #125

Merged (16 commits) on Jan 2, 2019

Conversation

@pyup-bot (Contributor) commented on Jan 1, 2019

Update matplotlib from 3.0.2 to 3.0.2.

Changelog

2.1.0

This is the second minor release in the Matplotlib 2.x series and the first
release with major new features since 1.5.

This release contains approximately 2 years worth of work by 275 contributors
across over 950 pull requests.  Highlights from this release include:

- support for string categorical values
- export of animations to interactive javascript widgets
- major overhaul of polar plots
- reproducible output for ps/eps, pdf, and svg backends
- performance improvements in drawing lines and images
- GUIs show a busy cursor while rendering the plot


along with many other enhancements and bug fixes.
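For instance, the string-categorical support means plotting functions accept strings directly as data; a minimal sketch (using the non-interactive Agg backend; the fruit names are illustrative):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, no GUI required
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# strings are mapped to category positions 0, 1, 2 automatically
ax.bar(["apple", "banana", "cherry"], [3, 1, 2])
fig.canvas.draw()
labels = [tick.get_text() for tick in ax.get_xticklabels()]
```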

2.0.0

This previews the new default style and many bug-fixes.  A full list of
the style changes will be collected for the final release.

In addition to the style change this release includes:
- overhaul of font handling/text rendering to be faster and clearer
- many new rcParams
- Agg based OSX backend
- optionally deterministic SVGs
- complete re-write of image handling code
- simplified color conversion
- specify colors in the global property cycle via `'C0'`,
`'C1'`... `'C9'`
- use the global property cycle more places (bar, stem, scatter)

There is a 'classic' style sheet which reproduces the 1.Y defaults:

import matplotlib.style as mstyle
mstyle.use('classic')
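The `'C0'`...`'C9'` aliases mentioned above resolve against the active property cycle; a minimal sketch:

```python
import matplotlib
from matplotlib.colors import to_rgba

# 'CN' aliases index into rcParams['axes.prop_cycle']
cycle = matplotlib.rcParams["axes.prop_cycle"].by_key()["color"]
first = to_rgba("C0")
second = to_rgba("C1")
```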

2.0.0rc2

This is the second and final planned release candidate for mpl v2.0

This release includes:
- Bug fixes and documentation changes
- Expanded API on plot_surface and plot_wireframe
- Pin font size at text creation time
- Suppress fc-cache warning unless it takes more than 5s

2.0.0rc1

This is the first release candidate for mpl v2.0

This release includes:
- A re-implementation of the way margins are handled during auto
scaling to allow artists to 'stick' to an edge of the Axes
- Improvements to the ticking with log and symlog scales
- Deprecation of the finance module.  This will be spun off into a stand-alone package
- Deprecation of the 'hold' machinery 
- Bumped the minimum numpy version to 1.7
- Standardization of hatch width and appearance across backends
- Made threshold for triggering 'offset' in `ScalarFormatter` configurable
and default to 4 (plotting against years should just work now)
- Default encoding for mp4 is now h264
- `fill_between` and `fill_betweenx` now use the color cycle
- Default alignment of bars changed from 'edge' to 'center'
- Bug and documentation fixes

2.0.0b4

Fourth and final beta release

2.0.0b3

Third beta for v2.0.0 release

This tag includes several critical bug fixes and updates the dash patterns.

1.5.3

This release contains a few critical bug fixes:
- eliminate fatal exceptions with Qt5.7
- memory leak in the contour code
- keyboard interaction bug with nbagg
- automatic integration with the ipython event loop (if running) which
fixes 'naive' integration for IPython 5+

1.5.2

Final planned release for the 1.5.x series.

1.5.1

First bug fix release for 1.5.x series.

1.5.0

This release of matplotlib has several major new features:
- Auto-redraw using the object-oriented API.
- Most plotting functions now support labeled data API.
- Color cycling has extended to all style properties.
- Four new perceptually uniform color maps, including the soon-to-be
default 'viridis'.
- More included style sheets.
- Many small plotting improvements.
- Proposed new framework for managing the GUI toolbar and tools.

1.4.3

This is the last planned bug-fix release in the 1.4 series.

Many bugs are fixed including:
- fixing drawing of edge-only markers in AGG
- fix run-away memory usage when using %inline or saving with
a tight bounding box with QuadMesh artists
- improvements to wx and tk gui backends

Additionally the webagg and nbagg backends were brought closer to
feature parity with the desktop backends with the addition of keyboard
and scroll events thanks to Steven Silvester.

1.4.2

Minor bug-fix release for 1.4 series
- regenerated pyplot.py

1.4.1

Bug-fix release for the 1.4 series.
- reverts the changes to interactive plotting so `ion` will work as
before in all cases
- fixed boxplot regressions
- fixes for finding freetype and libpng
- sundry unicode fixes (looking up user folders, importing
seaborn/pandas/networkx with macosx backend)
- nbagg works with python 3 + new font awesome
- fixed saving dialogue in QT5

Update multiprocess from 0.70.6.1 to 0.70.6.1.

Changelog

0.52

---------------

* On versions 0.50 and 0.51 Mac OSX `Lock.release()` would fail with
`OSError(errno.ENOSYS, "[Errno 78] Function not implemented")`.
This appears to be because on Mac OSX `sem_getvalue()` has not been
implemented.

Now `sem_getvalue()` is no longer needed.  Unfortunately, however,
on Mac OSX `BoundedSemaphore()` will not raise `ValueError` if it
exceeds its initial value.

* Some changes to the code for the reduction/rebuilding of connection
and socket objects so that things work the same on Windows and Unix.
This should fix a couple of bugs.

* The code has been changed to consistently use "camelCase" for
methods and (non-factory) functions.  In the few cases where this
has meant a change to the documented API, the old name has been
retained as an alias.

0.51

---------------

* In 0.50 `processing.Value()` and `processing.sharedctypes.Value()`
were related but had different signatures, which was rather
confusing.

Now `processing.sharedctypes.Value()` has been renamed 
`processing.sharedctypes.RawValue()` and
`processing.sharedctypes.Value()` is the same as `processing.Value()`.

* In version 0.50 `sendfd()` and `recvfd()` apparently did not work on
64bit Linux.  This has been fixed by reverting to using the CMSG_*
macros as was done in 0.40.  

However, this means that systems without all the necessary CMSG_*
macros (such as Solaris 8) will have to disable compilation of
`sendfd()` and `recvfd()` by setting `macros['HAVE_FD_TRANSFER'] = 0`
in `setup.py`.

* Fixed an authentication error when using a "remote" manager created
using `BaseManager.from_address()`.

* Fixed a couple of bugs which only affected Python 2.4.

0.50

---------------

* `ctypes` is now a prerequisite if you want to use shared memory --
with Python 2.4 you will need to install it separately.

* `LocalManager()` has been removed.

* Added `processing.Value()` and `processing.Array()`
which are similar to `LocalManager.SharedValue()` and
`LocalManager.SharedArray()`.  

* In the `sharedctypes` module `new_value()` and `new_array()` have
been renamed `Value()` and `Array()`.

* `Process.stop()`, `Process.getStoppable()` and
`Process.setStoppable()` have been removed.  Use
`Process.terminate()` instead.

* `procesing.Lock` now matches `threading.Lock` behaviour more
closely: now a thread can release a lock it does not own, and now
when a thread tries to acquire a lock it already owns, a deadlock
results instead of an exception.

* On Windows when the main thread is blocking on a method of `Lock`,
`RLock`, `Semaphore`, `BoundedSemaphore`, `Condition` it will no
longer ignore Ctrl-C.  (The same was already true on Unix.)  

This differs from the behaviour of the equivalent objects in
`threading` which will completely ignore Ctrl-C.

* The `test` sub-package has been replaced by lots of unit tests in a
`tests` sub-package.  Some of the old test files have been moved
over to a new `examples` sub-package.

* On Windows it is now possible for a non-console python program
(i.e. one using `pythonw.exe` instead of `python.exe`) to use
`processing`.  

Previously an exception was raised when `subprocess.py` tried to
duplicate stdin, stdout, stderr.

* Proxy objects should now be thread safe -- they now use thread local
storage.

* Trying to transfer shared resources such as locks, queues etc
between processes over a pipe or queue will now raise `RuntimeError`
with a message saying that the object should only be shared between
processes using inheritance.

Previously, this worked unreliably on Windows but would fail with an
unexplained `AssertionError` on Unix.

* The names of some of the macros used for compiling the extension
have changed.  See `INSTALL.txt` and `setup.py`.

* A few changes which (hopefully) make compilation possible on Solaris.

* Lots of refactoring of the code.

* Fixed reference leaks so that unit tests pass with "regrtest -R::"
(at least on Linux).
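The `Value()`/`Array()` shared-memory objects introduced above live on in today's stdlib `multiprocessing`; a minimal sketch (the `bump` helper name is illustrative):

```python
from multiprocessing import Process, Value, Array

def bump(counter, arr):
    with counter.get_lock():  # the Value carries its own lock
        counter.value += 1
    arr[0] = 99.0

if __name__ == "__main__":
    counter = Value("i", 0)        # shared C int
    arr = Array("d", [0.0, 1.5])   # shared array of C doubles
    p = Process(target=bump, args=(counter, arr))
    p.start()
    p.join()
    # the child's writes are visible here through the shared memory
```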

0.40

---------------

* Removed `SimpleQueue` and `PosixQueue` types.  Just use `Queue` instead.

* Previously if you forgot to use the ::

   if __name__ == '__main__':
       freezeSupport()
       ...

idiom on Windows then processes could be created recursively
bringing the computer to its knees.  Now `RuntimeError` will be
raised instead.

* Some refactoring of the code.

* A Unix specific bug meant that a child process might fail to start a
feeder thread for a queue if its parent process had already started
its own feeder thread.  Fixed.
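The guard idiom above is still required; in the stdlib `multiprocessing` the function is spelled `freeze_support()`. A minimal sketch:

```python
from multiprocessing import Process, freeze_support

def worker():
    print("hello from the child process")

if __name__ == "__main__":
    freeze_support()  # no-op unless running inside a frozen executable
    p = Process(target=worker)
    p.start()
    p.join()
```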

0.39

---------------

* One can now create one-way pipes by doing 
`reader, writer = Pipe(duplex=False)`.

* Rewrote code for managing shared memory maps.

* Added a `sharedctypes` module for creating `ctypes` objects allocated
from shared memory.  On Python 2.4 this requires the installation of
`ctypes`.

`ctypes` objects are not protected by any locks so you will need to
synchronize access to them (such as by using a lock).  However they
can be much faster to access than equivalent objects allocated using
a `LocalManager`.

* Rearranged documentation.

* Previously the C extension caused a segfault on 64 bit machines with
Python 2.5 because it used `int` instead of `Py_ssize_t` in certain
places.  This is now fixed.  Thanks to Alexy Khrabrov for the report.

* A fix for `Pool.terminate()`.

* A fix for cleanup behaviour of `Queue`.
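The one-way `Pipe(duplex=False)` API described above survives unchanged in today's stdlib `multiprocessing`; a minimal sketch (which also works within a single process):

```python
from multiprocessing import Pipe

# duplex=False yields a one-way channel: the first end can only
# receive, the second can only send
reader, writer = Pipe(duplex=False)
writer.send({"answer": 42})
message = reader.recv()
```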

0.38

---------------

* Have revamped the queue types.  Now the queue types are
`Queue`, `SimpleQueue` and (on systems which support it)
`PosixQueue`.

Now `Queue` should behave just like Python's normal `Queue.Queue`
class except that `qsize()`, `task_done()` and `join()` are not
implemented.  In particular, if no maximum size was specified when
the queue was created then `put()` will always succeed without
blocking.

A `SimpleQueue` instance is really just a pipe protected by a couple
of locks.  It has `get()`, `put()` and `empty()` methods but does
not support timeouts or non-blocking calls.

`BufferedPipeQueue()` and `PipeQueue()` remain as deprecated
aliases of `Queue()` but `BufferedPosixQueue()` has been removed.
(Not sure if we really need to keep `PosixQueue()`...)

* Previously the `Pool.shutdown()` method was a little dodgy -- it
could block indefinitely if `map()` or `imap*()` were used and did
not try to terminate workers while they were doing a task.

Now there are three new methods `close()`, `terminate()` and
`join()` -- `shutdown()` is retained as a deprecated alias of
`terminate()`.  Thanks to Gerald John M. Manipon for feature
request/suggested patch to `shutdown()`.

* `Pool.imap()` and `Pool.imap_unordered()` have gained a `chunksize`
argument which allows the iterable to be submitted to the pool in
chunks.  Choosing `chunksize` appropriately makes `Pool.imap()`
almost as fast as `Pool.map()` even for long iterables and cheap
functions.

* Previously on Windows when the cleanup code for a `LocalManager`
attempts to unlink the name of the file which backs the shared
memory map an exception is raised if a child process still exists
which has a handle open for that mmap.  This is likely to happen if
a daemon process inherits a `LocalManager` instance.

Now the parent process will remember the filename and attempt to
unlink the file name again once all the child processes have been
joined or terminated.  Reported by Paul Rudin.

* `types.MethodType` is registered with `copy_reg` so now instance
methods and class methods should be picklable.  (Unfortunately there is
no obvious way of supporting the pickling of staticmethods since
they are not marked with the class in which they were defined.)

This means that on Windows it is now possible to use an instance
method or class method as the target callable of a Process object.

* On Windows `reduction.fromfd()` now returns true instances of
`_socket.socket`, so there is no more need for the
`_processing.falsesocket` type.
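The `chunksize` behaviour described for `Pool.imap()` above carries over to today's stdlib pools; a minimal sketch using the thread-backed `multiprocessing.dummy.Pool` (same API, no subprocesses needed):

```python
from multiprocessing.dummy import Pool  # thread-backed Pool, same API

def square(x):
    return x * x

with Pool(4) as pool:
    # chunksize=3 submits the iterable in chunks of three tasks,
    # cutting per-item dispatch overhead; results keep input order
    squares = list(pool.imap(square, range(10), chunksize=3))
```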

0.37

---------------

* Updated metadata and documentation because the project is now hosted
at `developer.berlios.de/projects/pyprocessing`.

* The `Pool.join()` method has been removed.  `Pool.shutdown()` will
now join the worker processes automatically.

* A pool object no longer participates in a reference cycle so
`Pool.shutdown()` should get called as soon as its reference count
falls to zero.

* On Windows if `enableLogging()` was used at module scope then the
logger used by a child process would often get two copies of the
same handler.  To fix this, now specifying a handler type in
`enableLogging()` will cause any previous handlers used by the
logger to be discarded.

0.36

---------------

* In recent versions on Unix the finalizers in a manager process were
never given a chance to run before `os._exit()` was called, so old
unlinked AF_UNIX sockets could accumulate in '/tmp'.  Fixed.

* The shutting down of managers has been cleaned up.

* In previous versions on Windows trying to acquire a lock owned by a
different thread of the current process would raise an exception.
Fixed.

* In previous versions on Windows trying to use an event object for
synchronization between two threads of the same process was likely
to raise an exception.  (This was caused by the bug described
above.)  Fixed.

* Previously the arguments to `processing.Semaphore()` and
`processing.BoundedSemaphore()` did not have any defaults.  The
defaults should be 1 to match `threading`.  Fixed.

* It should now be possible for a Windows Service created by using
`pywin32` to spawn processes using the `processing` package.

Note that `pywin32` apparently has a bug meaning that `Py_Finalize()` 
is never called when the service exits so functions registered with
`atexit` never get a chance to run.  Therefore it is advisable to
explicitly call `sys.exitfunc()` or `atexit._run_exitfuncs()` at the
end of `ServiceFramework.DoSvcRun()`.  Otherwise child processes are
liable to survive the service when it is stopped.  Thanks to Charlie
Hull for the report.

* Added `getLogger()` and `enableLogging()` to support logging.

0.35

---------------

* By default processes are no longer stoppable using the `stop()`
method: one must call `setStoppable(True)` before `start()` in order
to use the `stop()` method.  (Note that `terminate()` will work
regardless of whether the process is marked as being "stoppable".)

The reason for this is that on Windows getting `stop()` to work
involves starting a new console for the child process and installing
a signal handler for the `SIGBREAK` signal.  This unfortunately
means that Ctrl-Break cannot be used to kill all processes of
the program.

* Added `setStoppable()` and `getStoppable()` methods -- see above.

* Added `BufferedQueue`/`BufferedPipeQueue`/`BufferedPosixQueue`.
Putting an object on a buffered queue will always succeed without
blocking (just like with `Queue.Queue` if no maximum size is
specified).  This makes them potentially safer than the normal queue
types provided by `processing` which have finite capacity and may
cause deadlocks if they fill.

`test/test_worker.py` has been updated to use `BufferedQueue` for
the task queue instead of explicitly spawning a thread to feed tasks
to the queue without risking a deadlock.

* Now when the NO_SEM_TIMED macro is set polling will be used to get
around the lack of `sem_timedwait()`.  This means that
`Condition.wait()` and `Queue.get()` should now work with timeouts
on Mac OS X.

* Added a `callback` argument to `Pool.apply_async()`.

* Added `test/test_httpserverpool.py` which runs a pool of http
servers which share a single listening socket.

* Previously on Windows the process object was passed to the child
process on the commandline (after pickling and hex encoding it).
This caused errors when the pickled string was too large.  Now if
the pickled string is large then it will be passed to the child
over a pipe or socket.

* Fixed bug in the iterator returned by `Pool.imap()`.

* Fixed bug in `Condition.__repr__()`.

* Fixed a handle/file descriptor leak when sockets or connections are
unpickled.

0.34

---------------

* Although in version 0.33 the C extension would compile on Mac OSX,
trying to import it failed with "undefined symbol: _sem_timedwait".
Unfortunately the `ImportError` exception was silently swallowed.

This is now fixed by using the `NO_SEM_TIMED` macro.  Unfortunately
this means that some methods like `Condition.wait()` and
`Queue.get()` will not work with timeouts on Mac OS X.  If you
really need to be able to use timeouts then you can always use the
equivalent objects created with a manager.  Thanks to Doug Hellmann
for report and testing.

* Added a `terminate()` method to process objects which is more
forceful than `stop()`.

* Fixed bug in the cleanup function registered with `atexit` which on
Windows could cause a process which is shutting down to deadlock
waiting for a manager to exit.  Thanks to Dominique Wahli for report
and testing.

* Added `test/test_workers.py` which gives an example of how to create
a collection of worker processes which execute tasks from one queue
and return results on another.

* Added `processing.Pool()` which returns a process pool object.  This
allows one to execute functions asynchronously.  It also has a
parallel implementation of the `map()` builtin.  This is still
*experimental* and undocumented --- see `test/test_pool.py` for
example usage.

0.33

---------------

* Added a `recvbytes_into()` method for receiving byte data into
objects with the writable buffer interface.  Also renamed the
`_recv_string()` and `_send_string()` methods of connection objects
to `recvbytes()` and `sendbytes()`.

* Some optimizations for the transferring of large blocks of data
using connection objects.

* On Unix `os.sysconf()` is now used by default to determine whether
to compile in support for posix semaphores or posix message queues.

By using the `NO_SEM_TIMED` and `NO_MQ_TIMED` macros (see
`INSTALL.txt`) it should now also be possible to compile in
(partial) semaphore or queue support on Unix systems which lack the
timeout functions `sem_timedwait()` or `mq_timedreceive()` and
`mq_timesend()`.

* `gettimeofday()` is now used instead of `clock_gettime()` making
compilation of the C extension (hopefully) possible on Mac OSX.  No
modification of `setup.py` should be necessary.  Thanks to Michele
Bertoldi for report and proposed patch.

* `cpuCount()` function added which returns the number of CPUs
in the system.

* Bugfixes to `PosixQueue` class.

0.32

---------------

* Refactored and simplified `_nonforking` module -- info about
`sys.modules` of parent process is no longer passed on to child
process.  Also `pkgutil` is no longer used.

* Allocated space from an mmap used by `LocalManager` will now be
recycled.

* Better tests for `LocalManager`.

* Fixed bug in `managers.py` concerning refcounting of shared objects.
Bug affects the case where the callable used to create a shared
object does not return a unique object each time it is called.
Thanks to Alexey Akimov for the report.

* Added a `freezeSupport()` function. Calling this at the appropriate
point in the main module is necessary when freezing a multiprocess
program to produce a Windows executable.  (Has been tested with
`py2exe`, `PyInstaller` and `cx_Freeze`.)

0.31

---------------

* Fixed one line bug in `localmanager.py` which caused shared memory maps
not to be resized properly.

* Added tests for shared values/structs/arrays to `test/test_processing`.

0.30

----------------

* Process objects now support the complete API of thread objects.

In particular `isAlive()`, `isDaemon()`, `setDaemon()` have been
added and `join()` now supports the `timeout` parameter.

There are also new methods `stop()`, `getPid()` and `getExitCode()`.

* Implemented synchronization primitives based on the Windows mutexes
and semaphores and posix named semaphores.  

* Added support for sharing simple objects between processes by using
a shared memory map and the `struct` or `array` modules.

* An `activeChildren()` function has been added to `processing` which
returns a list of the child processes which are still alive.

* A `Pipe()` function has been added which returns a pair of
connection objects representing the ends of a duplex connection over
which picklable objects can be sent.

* socket objects etc are now picklable and can be transferred between
processes.  (Requires compilation of the `_processing` extension.)

* Subclasses of `managers.BaseManager` no longer automatically spawn a
child process when an instance is created: the `start()` method must be
called explicitly.

* On Windows child processes are now spawned using `subprocess`.

* On Windows the Python 2.5 version of `pkgutil` is now used for
loading modules by the `_nonforking` module.  On Python 2.4 this
version of `pkgutil` (which uses the standard Python licence) is
included in `processing.compat`.

* The arguments to the functions in `processing.connection` have
changed slightly.

* Connection objects now have a `poll()` method which tests whether
there is any data available for reading.

* The `test/py2exedemo` folder shows how to get `py2exe` to create a
Windows executable from a program using the `processing` package.

* More tests.

* Bugfixes.

* Rearrangement of various stuff.
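The `poll()` method noted above is still part of the stdlib connection API; a minimal sketch:

```python
from multiprocessing import Pipe

a, b = Pipe()        # duplex pair of connection objects
empty = a.poll()     # False: nothing to read yet
b.send("ping")
ready = a.poll(1.0)  # optional timeout in seconds
reply = a.recv()
```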

0.21

---------------

* By default a proxy is now only able to access those methods of its
referent which have been explicitly exposed.

* The `connection` sub-package now supports digest authentication.

* Process objects are now given randomly generated 'inheritable'
authentication keys.

* A manager process will now only accept connections from processes
using the same authentication key.

* Previously `get_module()` from `_nonforking.py` was seriously messed
up (though it generally worked).  It is a lot saner now.

* Python 2.4 or higher is now required.
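The digest-authentication handshake described above is the same one today's `multiprocessing.connection` performs when an `authkey` is given; a sketch using two threads in one process (the key and message are illustrative; the OS picks a free port):

```python
import threading
from multiprocessing.connection import Client, Listener

received = []

with Listener(("localhost", 0), authkey=b"secret") as listener:
    def serve():
        # accept() runs the HMAC challenge/response before returning
        with listener.accept() as conn:
            received.append(conn.recv())

    t = threading.Thread(target=serve)
    t.start()
    with Client(listener.address, authkey=b"secret") as conn:
        conn.send("authenticated")
    t.join()
```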

0.20

---------------

* The `doc` folder contains HTML documentation.

* `test` is now a subpackage.  Running `processing.test.main()` 
will run test scripts using both processes and threads.

* `nonforking.py` has been renamed `_nonforking.py`.
`manager.py` has been renamed `managers.py`.
`connection.py` has become a sub-package `connection`

* `Listener` and `Client` have been removed from
`processing`, but still exist in `processing.connection`.

* The package is now *probably* compatible with versions of Python
earlier than 2.4.

* `set` is no longer a type supported by the default manager type.

* Many more changes.

0.12

---------------

* Fixed bug where the arguments to `processing.Manager()` were passed on
to `processing.manager.DefaultManager()` in the wrong order.

* `processing.dummy` is now a subpackage of `processing`
instead of a module.

* Rearranged package so that the `test` folder, `README.txt` and
`CHANGES.txt` are copied when the package is installed.

0.11

---------------

* Fixed bug on windows when the full path of `nonforking.py` contains a
space.

* On unix there is no longer a need to make the arguments to the
constructor of `Process` be picklable or for an instance of a
subclass of `Process` to be picklable when you call the start method.

* On unix proxies which a child process inherits from its parent can
be used by the child without any problem, so there is no longer a
need to pass them as arguments to `Process`.  (This will never be
possible on windows.)

Update pandas from 0.23.4 to 0.23.4.

Changelog

0.23.4

------------------------

This is a minor bug-fix release in the 0.23.x series and includes some small regression fixes
and bug fixes. We recommend that all users upgrade to this version.

.. warning::

   Starting January 1, 2019, pandas feature releases will support Python 3 only.
   See :ref:`install.dropping-27` for more.

.. contents:: What's new in v0.23.4
 :local:
 :backlinks: none

.. _whatsnew_0234.fixed_regressions:

Fixed Regressions
~~~~~~~~~~~~~~~~~

- Python 3.7 with Windows gave all missing values for rolling variance calculations (:issue:`21813`)

.. _whatsnew_0234.bug_fixes:

Bug Fixes
~~~~~~~~~

**Groupby/Resample/Rolling**

- Bug where calling :func:`DataFrameGroupBy.agg` with a list of functions including ``ohlc`` as the non-initial element would raise a ``ValueError`` (:issue:`21716`)
- Bug in ``roll_quantile`` caused a memory leak when calling ``.rolling(...).quantile(q)`` with ``q`` in (0,1) (:issue:`21965`)

**Missing**

- Bug in :func:`Series.clip` and :func:`DataFrame.clip` where a list-like threshold containing ``NaN`` was not accepted (:issue:`19992`)
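With the fix above, a ``NaN`` entry in a list-like threshold means "no bound" for that element; a minimal sketch:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 5.0, 10.0])
# NaN in the threshold leaves the corresponding element unclipped
clipped = s.clip(upper=[np.nan, 4.0, np.nan])
```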

0.23.3

----------------------

This release fixes a build issue with the sdist for Python 3.7 (:issue:`21785`)
There are no other changes.


.. _whatsnew_0232:

0.23.2

----------------------

This is a minor bug-fix release in the 0.23.x series and includes some small regression fixes
and bug fixes. We recommend that all users upgrade to this version.

.. note::

   Pandas 0.23.2 is the first pandas release that is compatible with
   Python 3.7 (:issue:`20552`)

.. warning::

   Starting January 1, 2019, pandas feature releases will support Python 3 only.
   See :ref:`install.dropping-27` for more.

.. contents:: What's new in v0.23.2
 :local:
 :backlinks: none

.. _whatsnew_0232.enhancements:

Logical Reductions over Entire DataFrame
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:meth:`DataFrame.all` and :meth:`DataFrame.any` now accept ``axis=None`` to reduce over all axes to a scalar (:issue:`19976`)

.. ipython:: python

   df = pd.DataFrame({"A": [1, 2], "B": [True, False]})
   df.all(axis=None)


This also provides compatibility with NumPy 1.15, which now dispatches to ``DataFrame.all``.
With NumPy 1.15 and pandas 0.23.1 or earlier, :func:`numpy.all` will no longer reduce over every axis:

.. code-block:: python

   >>> # NumPy 1.15, pandas 0.23.1
   >>> np.any(pd.DataFrame({"A": [False], "B": [False]}))
   A    False
   B    False
   dtype: bool

With pandas 0.23.2, that will correctly return False, as it did with NumPy < 1.15.

.. ipython:: python

   np.any(pd.DataFrame({"A": [False], "B": [False]}))


.. _whatsnew_0232.fixed_regressions:

Fixed Regressions
~~~~~~~~~~~~~~~~~

- Fixed regression in :meth:`to_csv` when handling file-like object incorrectly (:issue:`21471`)
- Re-allowed duplicate level names of a ``MultiIndex``. Accessing a level that has a duplicate name by name still raises an error (:issue:`19029`).
- Bug in both :meth:`DataFrame.first_valid_index` and :meth:`Series.first_valid_index` raised for a row index having duplicate values (:issue:`21441`)
- Fixed printing of DataFrames with hierarchical columns with long names (:issue:`21180`)
- Fixed regression in :meth:`~DataFrame.reindex` and :meth:`~DataFrame.groupby`
with a MultiIndex or multiple keys that contains categorical datetime-like values (:issue:`21390`).
- Fixed regression in unary negative operations with object dtype (:issue:`21380`)
- Bug in :meth:`Timestamp.ceil` and :meth:`Timestamp.floor` when timestamp is a multiple of the rounding frequency (:issue:`21262`)
- Fixed regression in :func:`to_clipboard` that defaulted to copying dataframes with space delimited instead of tab delimited (:issue:`21104`)


Build Changes
~~~~~~~~~~~~~

- The source and binary distributions no longer include test data files, resulting in smaller download sizes. Tests relying on these data files will be skipped when using ``pandas.test()``. (:issue:`19320`)

.. _whatsnew_0232.bug_fixes:

Bug Fixes
~~~~~~~~~

**Conversion**

- Bug in constructing :class:`Index` with an iterator or generator (:issue:`21470`)
- Bug in :meth:`Series.nlargest` for signed and unsigned integer dtypes when the minimum value is present (:issue:`21426`)

**Indexing**

- Bug in :meth:`Index.get_indexer_non_unique` with categorical key (:issue:`21448`)
- Bug in comparison operations for :class:`MultiIndex` where error was raised on equality / inequality comparison involving a MultiIndex with ``nlevels == 1`` (:issue:`21149`)
- Bug in :meth:`DataFrame.drop` behaviour is not consistent for unique and non-unique indexes (:issue:`21494`)
- Bug in :func:`DataFrame.duplicated` with a large number of columns causing a 'maximum recursion depth exceeded' (:issue:`21524`).

**I/O**

- Bug in :func:`read_csv` that caused it to incorrectly raise an error when ``nrows=0``, ``low_memory=True``, and ``index_col`` was not ``None`` (:issue:`21141`)
- Bug in :func:`json_normalize` when formatting the ``record_prefix`` with integer columns (:issue:`21536`)

**Categorical**

- Bug in rendering :class:`Series` with ``Categorical`` dtype in rare conditions under Python 2.7 (:issue:`21002`)

**Timezones**

- Bug in :class:`Timestamp` and :class:`DatetimeIndex` where passing a :class:`Timestamp` localized after a DST transition would return a datetime before the DST transition (:issue:`20854`)
- Bug in comparing :class:`DataFrame`s with tz-aware :class:`DatetimeIndex` columns with a DST transition that raised a ``KeyError`` (:issue:`19970`)

**Timedelta**

- Bug in :class:`Timedelta` where non-zero timedeltas shorter than 1 microsecond were considered False (:issue:`21484`)


.. _whatsnew_0231:

0.23.1

-----------------------

This is a minor bug-fix release in the 0.23.x series and includes some small regression fixes
and bug fixes. We recommend that all users upgrade to this version.

.. warning::

   Starting January 1, 2019, pandas feature releases will support Python 3 only.
   See :ref:`install.dropping-27` for more.

.. contents:: What's new in v0.23.1
 :local:
 :backlinks: none

.. _whatsnew_0231.fixed_regressions:

Fixed Regressions
~~~~~~~~~~~~~~~~~

**Comparing Series with datetime.date**

We've reverted a 0.23.0 change to comparing a :class:`Series` holding datetimes and a ``datetime.date`` object (:issue:`21152`).
In pandas 0.22 and earlier, comparing a Series holding datetimes and ``datetime.date`` objects would coerce the ``datetime.date`` to a datetime before comparing.
This was inconsistent with Python, NumPy, and :class:`DatetimeIndex`, which never consider a datetime and ``datetime.date`` equal.

In 0.23.0, we unified operations between DatetimeIndex and Series, and in the process changed comparisons between a Series of datetimes and ``datetime.date`` without warning.

We've temporarily restored the 0.22.0 behavior, so datetimes and dates may again compare equal, but will restore the 0.23.0 behavior in a future release.

To summarize, here's the behavior in 0.22.0, 0.23.0, 0.23.1:

.. code-block:: python

   >>> # 0.22.0... Silently coerce the datetime.date
   >>> Series(pd.date_range('2017', periods=2)) == datetime.date(2017, 1, 1)
   0     True
   1    False
   dtype: bool

   >>> # 0.23.0... Do not coerce the datetime.date
   >>> Series(pd.date_range('2017', periods=2)) == datetime.date(2017, 1, 1)
   0    False
   1    False
   dtype: bool

   >>> # 0.23.1... Coerce the datetime.date with a warning
   >>> Series(pd.date_range('2017', periods=2)) == datetime.date(2017, 1, 1)
   /bin/python:1: FutureWarning: Comparing Series of datetimes with 'datetime.date'.  Currently, the
   'datetime.date' is coerced to a datetime. In the future pandas will
   not coerce, and the values not compare equal to the 'datetime.date'.
   To retain the current behavior, convert the 'datetime.date' to a
   datetime with 'pd.Timestamp'.
   0     True
   1    False
   dtype: bool

In addition, ordering comparisons will raise a ``TypeError`` in the future.
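
As the warning text suggests, converting the ``datetime.date`` to a ``pd.Timestamp`` keeps the coercing comparison without triggering the ``FutureWarning``; a minimal sketch:

```python
import datetime

import pandas as pd

ser = pd.Series(pd.date_range('2017', periods=2))

# Comparing against a Timestamp (rather than a raw datetime.date)
# keeps the 0.22-style result and avoids the FutureWarning.
result = ser == pd.Timestamp(datetime.date(2017, 1, 1))
```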

**Other Fixes**

- Reverted the ability of :func:`~DataFrame.to_sql` to perform multivalue
  inserts as this caused regression in certain cases (:issue:`21103`).
  In the future this will be made configurable.
- Fixed regression in the :attr:`DatetimeIndex.date` and :attr:`DatetimeIndex.time`
  attributes in case of timezone-aware data: :attr:`DatetimeIndex.time` returned
  a tz-aware time instead of tz-naive (:issue:`21267`) and :attr:`DatetimeIndex.date`
  returned incorrect date when the input date has a non-UTC timezone (:issue:`21230`).
- Fixed regression in :meth:`pandas.io.json.json_normalize` when called with ``None`` values
  in nested levels in JSON, and to not drop keys with value as `None` (:issue:`21158`, :issue:`21356`).
- Bug in :meth:`~DataFrame.to_csv` causing an encoding error when compression and encoding are specified (:issue:`21241`, :issue:`21118`)
- Bug preventing pandas from being importable with -OO optimization (:issue:`21071`)
- Bug in :meth:`Categorical.fillna` incorrectly raising a ``TypeError`` when the individual categories are iterable and ``value`` is an iterable (:issue:`21097`, :issue:`19788`)
- Fixed regression in constructors coercing NA values like ``None`` to strings when passing ``dtype=str`` (:issue:`21083`)
- Regression in :func:`pivot_table` where an ordered ``Categorical`` with missing
  values for the pivot's ``index`` would give a mis-aligned result (:issue:`21133`)
- Fixed regression in merging on boolean index/columns (:issue:`21119`).
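
For example, the :meth:`~DataFrame.to_csv` fix means that specifying ``compression`` and ``encoding`` together now round-trips cleanly; a small sketch (file name is illustrative):

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({'a': ['é', 'ü']})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'out.csv.gz')
    # Specifying encoding and compression together no longer raises
    # an encoding error.
    df.to_csv(path, encoding='utf-8', compression='gzip', index=False)
    roundtrip = pd.read_csv(path, encoding='utf-8', compression='gzip')
```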

.. _whatsnew_0231.performance:

Performance Improvements
~~~~~~~~~~~~~~~~~~~~~~~~

- Improved performance of :meth:`CategoricalIndex.is_monotonic_increasing`, :meth:`CategoricalIndex.is_monotonic_decreasing` and :meth:`CategoricalIndex.is_monotonic` (:issue:`21025`)
- Improved performance of :meth:`CategoricalIndex.is_unique` (:issue:`21107`)


.. _whatsnew_0231.bug_fixes:

Bug Fixes
~~~~~~~~~

**Groupby/Resample/Rolling**

- Bug in :func:`DataFrame.agg` where applying multiple aggregation functions to a :class:`DataFrame` with duplicated column names would cause a stack overflow (:issue:`21063`)
- Bug in :func:`pandas.core.groupby.GroupBy.ffill` and :func:`pandas.core.groupby.GroupBy.bfill` where the fill within a grouping would not always be applied as intended due to the implementations' use of a non-stable sort (:issue:`21207`)
- Bug in :func:`pandas.core.groupby.GroupBy.rank` where results did not scale to 100% when specifying ``method='dense'`` and ``pct=True``
- Bug in :func:`pandas.DataFrame.rolling` and :func:`pandas.Series.rolling` which incorrectly accepted a 0 window size rather than raising (:issue:`21286`)

**Data-type specific**

- Bug in :meth:`Series.str.replace()` where the method throws `TypeError` on Python 3.5.2 (:issue:`21078`)
- Bug in :class:`Timedelta` where passing a float with a unit would prematurely round the float precision (:issue:`14156`)
- Bug in :func:`pandas.testing.assert_index_equal` which raised ``AssertionError`` incorrectly, when comparing two :class:`CategoricalIndex` objects with param ``check_categorical=False`` (:issue:`19776`)

**Sparse**

- Bug in :attr:`SparseArray.shape` which previously only returned the shape :attr:`SparseArray.sp_values` (:issue:`21126`)

**Indexing**

- Bug in :meth:`Series.reset_index` where appropriate error was not raised with an invalid level name (:issue:`20925`)
- Bug in :func:`interval_range` when ``start``/``periods`` or ``end``/``periods`` are specified with float ``start`` or ``end`` (:issue:`21161`)
- Bug in :meth:`MultiIndex.set_names` where error raised for a ``MultiIndex`` with ``nlevels == 1`` (:issue:`21149`)
- Bug in :class:`IntervalIndex` constructors where creating an ``IntervalIndex`` from categorical data was not fully supported (:issue:`21243`, :issue:`21253`)
- Bug in :meth:`MultiIndex.sort_index` which was not guaranteed to sort correctly with ``level=1``; this was also causing data misalignment in particular :meth:`DataFrame.stack` operations (:issue:`20994`, :issue:`20945`, :issue:`21052`)

**Plotting**

- New keywords ``sharex`` and ``sharey`` to turn on/off sharing of the x/y-axis by subplots generated with ``pandas.DataFrame().groupby().boxplot()`` (:issue:`20968`)

**I/O**

- Bug in IO methods specifying ``compression='zip'`` which produced uncompressed zip archives (:issue:`17778`, :issue:`21144`)
- Bug in :meth:`DataFrame.to_stata` which prevented exporting DataFrames to buffers and most file-like objects (:issue:`21041`)
- Bug in :meth:`read_stata` and :class:`StataReader` which did not correctly decode utf-8 strings on Python 3 from Stata 14 files (dta version 118) (:issue:`21244`)
- Bug in IO JSON :func:`read_json` reading empty JSON schema with ``orient='table'`` back to :class:`DataFrame` caused an error (:issue:`21287`)

**Reshaping**

- Bug in :func:`concat` where error was raised in concatenating :class:`Series` with numpy scalar and tuple names (:issue:`21015`)
- Bug in :func:`concat` warning message providing the wrong guidance for future behavior (:issue:`21101`)

**Other**

- Tab completion on :class:`Index` in IPython no longer outputs deprecation warnings (:issue:`21125`)
- Bug preventing pandas being used on Windows without C++ redistributable installed (:issue:`21106`)



.. _whatsnew_050:

0.23.0
------

This is a major release from 0.22.0 and includes a number of API changes,
deprecations, new features, enhancements, and performance improvements along
with a large number of bug fixes. We recommend that all users upgrade to this
version.

Highlights include:

- :ref:`Round-trippable JSON format with 'table' orient <whatsnew_0230.enhancements.round-trippable_json>`.
- :ref:`Instantiation from dicts respects order for Python 3.6+ <whatsnew_0230.api_breaking.dict_insertion_order>`.
- :ref:`Dependent column arguments for assign <whatsnew_0230.enhancements.assign_dependent>`.
- :ref:`Merging / sorting on a combination of columns and index levels <whatsnew_0230.enhancements.merge_on_columns_and_levels>`.
- :ref:`Extending Pandas with custom types <whatsnew_023.enhancements.extension>`.
- :ref:`Excluding unobserved categories from groupby <whatsnew_0230.enhancements.categorical_grouping>`.
- :ref:`Changes to make output shape of DataFrame.apply consistent <whatsnew_0230.api_breaking.apply>`.

Check the :ref:`API Changes <whatsnew_0230.api_breaking>` and :ref:`deprecations <whatsnew_0230.deprecations>` before updating.

.. warning::

   Starting January 1, 2019, pandas feature releases will support Python 3 only.
   See :ref:`install.dropping-27` for more.

.. contents:: What's new in v0.23.0
 :local:
 :backlinks: none
 :depth: 2

.. _whatsnew_0230.enhancements:

New features
~~~~~~~~~~~~

.. _whatsnew_0230.enhancements.round-trippable_json:

JSON read/write round-trippable with ``orient='table'``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A ``DataFrame`` can now be written to and subsequently read back via JSON while preserving metadata through usage of the ``orient='table'`` argument (see :issue:`18912` and :issue:`9146`). Previously, none of the available ``orient`` values guaranteed the preservation of dtypes and index names, amongst other metadata.

.. ipython:: python

   df = pd.DataFrame({'foo': [1, 2, 3, 4],
                      'bar': ['a', 'b', 'c', 'd'],
                      'baz': pd.date_range('2018-01-01', freq='d', periods=4),
                      'qux': pd.Categorical(['a', 'b', 'c', 'c'])
                      }, index=pd.Index(range(4), name='idx'))
   df
   df.dtypes
   df.to_json('test.json', orient='table')
   new_df = pd.read_json('test.json', orient='table')
   new_df
   new_df.dtypes

Please note that the string ``index`` is not supported with the round trip format, as it is used by default in ``write_json`` to indicate a missing index name.

.. ipython:: python
   :okwarning:

   df.index.name = 'index'

   df.to_json('test.json', orient='table')
   new_df = pd.read_json('test.json', orient='table')
   new_df
   new_df.dtypes

.. ipython:: python
   :suppress:

   import os
   os.remove('test.json')


.. _whatsnew_0230.enhancements.assign_dependent:


``.assign()`` accepts dependent arguments
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:func:`DataFrame.assign` now accepts dependent keyword arguments on Python 3.6 and later (see also `PEP 468
<https://www.python.org/dev/peps/pep-0468/>`_). Later keyword arguments may now refer to earlier ones if the argument is a callable. See the
:ref:`documentation here <dsintro.chained_assignment>` (:issue:`14207`)

.. ipython:: python

   df = pd.DataFrame({'A': [1, 2, 3]})
   df
   df.assign(B=df.A, C=lambda x: x['A'] + x['B'])

.. warning::

   This may subtly change the behavior of your code when you're
   using ``.assign()`` to update an existing column. Previously, callables
   referring to other variables being updated would get the "old" values.

   Previous Behavior:

   .. code-block:: ipython

      In [2]: df = pd.DataFrame({"A": [1, 2, 3]})

      In [3]: df.assign(A=lambda df: df.A + 1, C=lambda df: df.A * -1)
      Out[3]:
         A  C
      0  2 -1
      1  3 -2
      2  4 -3

   New Behavior:

   .. ipython:: python

      df.assign(A=df.A + 1, C=lambda df: df.A * -1)



.. _whatsnew_0230.enhancements.merge_on_columns_and_levels:

Merging on a combination of columns and index levels
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Strings passed to :meth:`DataFrame.merge` as the ``on``, ``left_on``, and ``right_on``
parameters may now refer to either column names or index level names.
This enables merging ``DataFrame`` instances on a combination of index levels
and columns without resetting indexes. See the :ref:`Merge on columns and
levels <merging.merge_on_columns_and_levels>` documentation section.
(:issue:`14355`)

.. ipython:: python

   left_index = pd.Index(['K0', 'K0', 'K1', 'K2'], name='key1')

   left = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                        'B': ['B0', 'B1', 'B2', 'B3'],
                        'key2': ['K0', 'K1', 'K0', 'K1']},
                       index=left_index)

   right_index = pd.Index(['K0', 'K1', 'K2', 'K2'], name='key1')

   right = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'],
                         'D': ['D0', 'D1', 'D2', 'D3'],
                         'key2': ['K0', 'K0', 'K0', 'K1']},
                        index=right_index)

   left.merge(right, on=['key1', 'key2'])

.. _whatsnew_0230.enhancements.sort_by_columns_and_levels:

Sorting by a combination of columns and index levels
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Strings passed to :meth:`DataFrame.sort_values` as the ``by`` parameter may
now refer to either column names or index level names.  This enables sorting
``DataFrame`` instances by a combination of index levels and columns without
resetting indexes. See the :ref:`Sorting by Indexes and Values
<basics.sort_indexes_and_values>` documentation section.
(:issue:`14353`)

.. ipython:: python

   # Build MultiIndex
   idx = pd.MultiIndex.from_tuples([('a', 1), ('a', 2), ('a', 2),
                                    ('b', 2), ('b', 1), ('b', 1)])
   idx.names = ['first', 'second']

   # Build DataFrame
   df_multi = pd.DataFrame({'A': np.arange(6, 0, -1)},
                           index=idx)
   df_multi

   # Sort by 'second' (index) and 'A' (column)
   df_multi.sort_values(by=['second', 'A'])


.. _whatsnew_023.enhancements.extension:

Extending Pandas with Custom Types (Experimental)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Pandas now supports storing array-like objects that aren't necessarily 1-D NumPy
arrays as columns in a DataFrame or values in a Series. This allows third-party
libraries to implement extensions to NumPy's types, similar to how pandas
implemented categoricals, datetimes with timezones, periods, and intervals.

As a demonstration, we'll use cyberpandas_, which provides an ``IPArray`` type
for storing ip addresses.

.. code-block:: ipython

   In [1]: from cyberpandas import IPArray

   In [2]: values = IPArray([
      ...:     0,
      ...:     3232235777,
      ...:     42540766452641154071740215577757643572
      ...: ])

``IPArray`` isn't a normal 1-D NumPy array, but because it's a pandas
:class:`~pandas.api.extensions.ExtensionArray`, it can be stored properly inside pandas' containers.

.. code-block:: ipython

   In [3]: ser = pd.Series(values)

   In [4]: ser
   Out[4]:
   0                         0.0.0.0
   1                     192.168.1.1
   2    2001:db8:85a3::8a2e:370:7334
   dtype: ip

Notice that the dtype is ``ip``. The missing value semantics of the underlying
array are respected:

.. code-block:: ipython

   In [5]: ser.isna()
   Out[5]:
   0     True
   1    False
   2    False
   dtype: bool

For more, see the :ref:`extension types <extending.extension-types>`
documentation. If you build an extension array, publicize it on our
:ref:`ecosystem page <ecosystem.extensions>`.

.. _cyberpandas: https://cyberpandas.readthedocs.io/en/latest/


.. _whatsnew_0230.enhancements.categorical_grouping:

New ``observed`` keyword for excluding unobserved categories in ``groupby``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Grouping by a categorical includes the unobserved categories in the output.
When grouping by multiple categorical columns, this means you get the cartesian product of all the
categories, including combinations where there are no observations, which can result in a large
number of groups. We have added a keyword ``observed`` to control this behavior; it defaults to
``observed=False`` for backward compatibility. (:issue:`14942`, :issue:`8138`, :issue:`15217`, :issue:`17594`, :issue:`8669`, :issue:`20583`, :issue:`20902`)

.. ipython:: python

cat1 = pd.Categorical(["a", "a", "b", "b"],
                      categories=["a", "b", "z"], ordered=True)
cat2 = pd.Categorical(["c", "d", "c", "d"],
                      categories=["c", "d", "y"], ordered=True)
df = pd.DataFrame({"A": cat1, "B": cat2, "values": [1, 2, 3, 4]})
df['C'] = ['foo', 'bar'] * 2
df

To show all values, the previous behavior:

.. ipython:: python

df.groupby(['A', 'B', 'C'], observed=False).count()


To show only observed values:

.. ipython:: python

df.groupby(['A', 'B', 'C'], observed=True).count()

For pivoting operations, this behavior is *already* controlled by the ``dropna`` keyword:

.. ipython:: python

   cat1 = pd.Categorical(["a", "a", "b", "b"],
                         categories=["a", "b", "z"], ordered=True)
   cat2 = pd.Categorical(["c", "d", "c", "d"],
                         categories=["c", "d", "y"], ordered=True)
   df = pd.DataFrame({"A": cat1, "B": cat2, "values": [1, 2, 3, 4]})
   df

.. ipython:: python

pd.pivot_table(df, values='values', index=['A', 'B'],
               dropna=True)
pd.pivot_table(df, values='values', index=['A', 'B'],
               dropna=False)


.. _whatsnew_0230.enhancements.window_raw:

Rolling/Expanding.apply() accepts ``raw=False`` to pass a ``Series`` to the function
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:func:`Series.rolling().apply() <pandas.core.window.Rolling.apply>`, :func:`DataFrame.rolling().apply() <pandas.core.window.Rolling.apply>`,
:func:`Series.expanding().apply() <pandas.core.window.Expanding.apply>`, and :func:`DataFrame.expanding().apply() <pandas.core.window.Expanding.apply>` have gained a ``raw=None`` parameter.
This is similar to :func:`DataFrame.apply`. If ``True``, this parameter sends a ``np.ndarray`` to the applied function; if ``False``, a ``Series`` will be passed. The
default is ``None``, which preserves backward compatibility, so it currently behaves like ``True``, sending a ``np.ndarray``.
In a future version the default will be changed to ``False``, sending a ``Series``. (:issue:`5071`, :issue:`20584`)

.. ipython:: python

s = pd.Series(np.arange(5), np.arange(5) + 1)
s

Pass a ``Series``:

.. ipython:: python

s.rolling(2, min_periods=1).apply(lambda x: x.iloc[-1], raw=False)

Mimic the original behavior of passing a ndarray:

.. ipython:: python

s.rolling(2, min_periods=1).apply(lambda x: x[-1], raw=True)


.. _whatsnew_0210.enhancements.limit_area:

``DataFrame.interpolate`` has gained the ``limit_area`` kwarg
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:meth:`DataFrame.interpolate` has gained a ``limit_area`` parameter to allow further control of which ``NaN`` s are replaced.
Use ``limit_area='inside'`` to fill only NaNs surrounded by valid values or use ``limit_area='outside'`` to fill only ``NaN`` s
outside the existing valid values while preserving those inside.  (:issue:`16284`) See the :ref:`full documentation here <missing_data.interp_limits>`.


.. ipython:: python

ser = pd.Series([np.nan, np.nan, 5, np.nan, np.nan, np.nan, 13, np.nan, np.nan])
ser

Fill one consecutive inside value in both directions

.. ipython:: python

ser.interpolate(limit_direction='both', limit_area='inside', limit=1)

Fill all consecutive outside values backward

.. ipython:: python

ser.interpolate(limit_direction='backward', limit_area='outside')

Fill all consecutive outside values in both directions

.. ipython:: python

ser.interpolate(limit_direction='both', limit_area='outside')

.. _whatsnew_0210.enhancements.get_dummies_dtype:

``get_dummies`` now supports ``dtype`` argument
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:func:`get_dummies` now accepts a ``dtype`` argument, which specifies a dtype for the new columns. The default remains ``uint8``. (:issue:`18330`)

.. ipython:: python

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})
pd.get_dummies(df, columns=['c']).dtypes
pd.get_dummies(df, columns=['c'], dtype=bool).dtypes


.. _whatsnew_0230.enhancements.timedelta_mod:

Timedelta mod method
^^^^^^^^^^^^^^^^^^^^

``mod`` (%) and ``divmod`` operations are now defined on ``Timedelta`` objects
when operating with either timedelta-like or with numeric arguments.
See the :ref:`documentation here <timedeltas.mod_divmod>`. (:issue:`19365`)

.. ipython:: python

 td = pd.Timedelta(hours=37)
 td % pd.Timedelta(minutes=45)
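
``divmod`` behaves analogously, returning the integer quotient and the remainder as a ``Timedelta``; a short sketch:

```python
import pandas as pd

td = pd.Timedelta(hours=37)

# divmod with a timedelta-like divisor: 37 hours is one full day
# plus a 13-hour remainder.
q, r = divmod(td, pd.Timedelta(hours=24))
```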

.. _whatsnew_0230.enhancements.ran_inf:

``.rank()`` handles ``inf`` values when ``NaN`` are present
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In previous versions, ``.rank()`` would assign ``inf`` elements ``NaN`` as their ranks. Now ranks are calculated properly. (:issue:`6945`)

.. ipython:: python

 s = pd.Series([-np.inf, 0, 1, np.nan, np.inf])
 s

Previous Behavior:

.. code-block:: ipython

 In [11]: s.rank()
 Out[11]:
 0    1.0
 1    2.0
 2    3.0
 3    NaN
 4    NaN
 dtype: float64

Current Behavior:

.. ipython:: python

 s.rank()

Furthermore, previously, if you ranked ``inf`` or ``-inf`` values together with ``NaN`` values, the calculation would not distinguish ``NaN`` from infinity when using the 'top' or 'bottom' argument.

.. ipython:: python

 s = pd.Series([np.nan, np.nan, -np.inf, -np.inf])
 s

Previous Behavior:

.. code-block:: ipython

 In [15]: s.rank(na_option='top')
 Out[15]:
 0    2.5
 1    2.5
 2    2.5
 3    2.5
 dtype: float64

Current Behavior:

.. ipython:: python

 s.rank(na_option='top')

These bugs were squashed:

- Bug in :meth:`DataFrame.rank` and :meth:`Series.rank` when ``method='dense'`` and ``pct=True`` in which percentile ranks were not being used with the number of distinct observations (:issue:`15630`)
- Bug in :meth:`Series.rank` and :meth:`DataFrame.rank` when ``ascending=False`` failed to return correct ranks for infinity if ``NaN`` were present (:issue:`19538`)
- Bug in :func:`DataFrameGroupBy.rank` where ranks were incorrect when both infinity and ``NaN`` were present (:issue:`20561`)


.. _whatsnew_0230.enhancements.str_cat_align:

``Series.str.cat`` has gained the ``join`` kwarg
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Previously, :meth:`Series.str.cat` did not -- in contrast to most of ``pandas`` -- align :class:`Series` on their index before concatenation (see :issue:`18657`).
The method has now gained a keyword ``join`` to control the manner of alignment, see examples below and :ref:`here <text.concatenate>`.

In v0.23, ``join`` defaults to ``None`` (meaning no alignment), but this default will change to ``'left'`` in a future version of pandas.

.. ipython:: python
:okwarning:

 s = pd.Series(['a', 'b', 'c', 'd'])
 t = pd.Series(['b', 'd', 'e', 'c'], index=[1, 3, 4, 2])
 s.str.cat(t)
 s.str.cat(t, join='left', na_rep='-')

Furthermore, :meth:`Series.str.cat` now works for ``CategoricalIndex`` as well (previously raised a ``ValueError``; see :issue:`20842`).
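
A minimal sketch of the ``CategoricalIndex`` case (the example data is illustrative):

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c'])
t = pd.Series(['x', 'y', 'z'], index=pd.CategoricalIndex([0, 1, 2]))

# Previously this raised a ValueError; a Series indexed by a
# CategoricalIndex is now handled, here with explicit left alignment.
result = s.str.cat(t, join='left')
```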

.. _whatsnew_0230.enhancements.astype_category:

``DataFrame.astype`` performs column-wise conversion to ``Categorical``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:meth:`DataFrame.astype` can now perform column-wise conversion to ``Categorical`` by supplying the string ``'category'`` or
a :class:`~pandas.api.types.CategoricalDtype`. Previously, attempting this would raise a ``NotImplementedError``. See the
:ref:`categorical.objectcreation` section of the documentation for more details and examples. (:issue:`12860`, :issue:`18099`)

Supplying the string ``'category'`` performs column-wise conversion, with only labels appearing in a given column set as categories:

.. ipython:: python

 df = pd.DataFrame({'A': list('abca'), 'B': list('bccd')})
 df = df.astype('category')
 df['A'].dtype
 df['B'].dtype


Supplying a ``CategoricalDtype`` will make the categories in each column consistent with the supplied dtype:

.. ipython:: python

 from pandas.api.types import CategoricalDtype
 df = pd.DataFrame({'A': list('abca'), 'B': list('bccd')})
 cdt = CategoricalDtype(categories=list('abcd'), ordered=True)
 df = df.astype(cdt)
 df['A'].dtype
 df['B'].dtype


.. _whatsnew_0230.enhancements.other:

Other Enhancements
^^^^^^^^^^^^^^^^^^

- Unary ``+`` is now permitted for ``Series`` and ``DataFrame`` as a numeric operator (:issue:`16073`)
- Better support for :meth:`~pandas.io.formats.style.Styler.to_excel` output with the ``xlsxwriter`` engine. (:issue:`16149`)
- :func:`pandas.tseries.frequencies.to_offset` now accepts leading '+' signs e.g. '+1h'. (:issue:`18171`)
- :func:`MultiIndex.unique` now supports the ``level=`` argument, to get unique values from a specific index level (:issue:`17896`)
- :class:`pandas.io.formats.style.Styler` now has method ``hide_index()`` to determine whether the index will be rendered in output (:issue:`14194`)
- :class:`pandas.io.formats.style.Styler` now has method ``hide_columns()`` to determine whether columns will be hidden in output (:issue:`14194`)
- Improved wording of ``ValueError`` raised in :func:`to_datetime` when ``unit=`` is passed with a non-convertible value (:issue:`14350`)
- :func:`Series.fillna` now accepts a Series or a dict as a ``value`` for a categorical dtype (:issue:`17033`)
- :func:`pandas.read_clipboard` updated to use qtpy, falling back to PyQt5 and then PyQt4, adding compatibility with Python3 and multiple python-qt bindings (:issue:`17722`)
- Improved wording of ``ValueError`` raised in :func:`read_csv` when the ``usecols`` argument cannot match all columns. (:issue:`17301`)
- :func:`DataFrame.corrwith` now silently drops non-numeric columns when passed a Series. Before, an exception was raised (:issue:`18570`).
- :class:`IntervalIndex` now supports time zone aware ``Interval`` objects (:issue:`18537`, :issue:`18538`)
- :func:`Series` / :func:`DataFrame` tab completion also returns identifiers in the first level of a :func:`MultiIndex`. (:issue:`16326`)
- :func:`read_excel()` has gained the ``nrows`` parameter (:issue:`16645`)
- :meth:`DataFrame.append` can now in more cases preserve the type of the calling dataframe's columns (e.g. if both are ``CategoricalIndex``) (:issue:`18359`)
- :meth:`DataFrame.to_json` and :meth:`Series.to_json` now accept an ``index`` argument which allows the user to exclude the index from the JSON output (:issue:`17394`)
- ``IntervalIndex.to_tuples()`` has gained the ``na_tuple`` parameter to control whether NA is returned as a tuple of NA, or NA itself (:issue:`18756`)
- ``Categorical.rename_categories``, ``CategoricalIndex.rename_categories`` and :attr:`Series.cat.rename_categories`
can now take a callable as their argument (:issue:`18862`)
- :class:`Interval` and :class:`IntervalIndex` have gained a ``length`` attribute (:issue:`18789`)
- ``Resampler`` objects now have a functioning :attr:`~pandas.core.resample.Resampler.pipe` method.
  Previously, calls to ``pipe`` were diverted to the ``mean`` method (:issue:`17905`).
- :func:`~pandas.api.types.is_scalar` now returns ``True`` for ``DateOffset`` objects (:issue:`18943`).
- :func:`DataFrame.pivot` now accepts a list for the ``values=`` kwarg (:issue:`17160`).
- Added :func:`pandas.api.extensions.register_dataframe_accessor`,
:func:`pandas.api.extensions.register_series_accessor`, and
:func:`pandas.api.extensions.register_index_accessor`, accessor for libraries downstream of pandas
to register custom accessors like ``.cat`` on pandas objects. See
:ref:`Registering Custom Accessors <extending.register-accessors>` for more (:issue:`14781`).

- ``IntervalIndex.astype`` now supports conversions between subtypes when passed an ``IntervalDtype`` (:issue:`19197`)
- :class:`IntervalIndex` and its associated constructor methods (``from_arrays``, ``from_breaks``, ``from_tuples``) have gained a ``dtype`` parameter (:issue:`19262`)
- Added :func:`pandas.core.groupby.SeriesGroupBy.is_monotonic_increasing` and :func:`pandas.core.groupby.SeriesGroupBy.is_monotonic_decreasing` (:issue:`17015`)
- For subclassed ``DataFrames``, :func:`DataFrame.apply` will now preserve the ``Series`` subclass (if defined) when passing the data to the applied function (:issue:`19822`)
- :func:`DataFrame.from_dict` now accepts a ``columns`` argument that can be used to specify the column names when ``orient='index'`` is used (:issue:`18529`)
- Added option ``display.html.use_mathjax`` so `MathJax <https://www.mathjax.org/>`_ can be disabled when rendering tables in ``Jupyter`` notebooks (:issue:`19856`, :issue:`19824`)
- :func:`DataFrame.replace` now supports the ``method`` parameter, which can be used to specify the replacement method when ``to_replace`` is a scalar, list or tuple and ``value`` is ``None`` (:issue:`19632`)
- :meth:`Timestamp.month_name`, :meth:`DatetimeIndex.month_name`, and :meth:`Series.dt.month_name` are now available (:issue:`12805`)
- :meth:`Timestamp.day_name` and :meth:`DatetimeIndex.day_name` are now available to return day names with a specified locale (:issue:`12806`)
- :meth:`DataFrame.to_sql` now performs a multi-value insert if the underlying connection supports it, rather than inserting row by row.
  ``SQLAlchemy`` dialects supporting multi-value inserts include ``mysql``, ``postgresql``, ``sqlite``, and any dialect with ``supports_multivalues_insert``. (:issue:`14315`, :issue:`8953`)
- :func:`read_html` now accepts a ``displayed_only`` keyword argument to control whether or not hidden elements are parsed (``True`` by default) (:issue:`20027`)
- :func:`read_html` now reads all ``<tbody>`` elements in a ``<table>``, not just the first. (:issue:`20690`)
- :meth:`~pandas.core.window.Rolling.quantile` and :meth:`~pandas.core.window.Expanding.quantile` now accept the ``interpolation`` keyword, ``linear`` by default (:issue:`20497`)
- zip compression is supported via ``compression=zip`` in :func:`DataFrame.to_pickle`, :func:`Series.to_pickle`, :func:`DataFrame.to_csv`, :func:`Series.to_csv`, :func:`DataFrame.to_json`, :func:`Series.to_json`. (:issue:`17778`)
- :class:`~pandas.tseries.offsets.WeekOfMonth` constructor now supports ``n=0`` (:issue:`20517`).
- :class:`DataFrame` and :class:`Series` now support the matrix multiplication (``@``) operator (:issue:`10259`) for Python >= 3.5
- Updated :meth:`DataFrame.to_gbq` and :meth:`pandas.read_gbq` signature and documentation to reflect changes from
the Pandas-GBQ library version 0.4.0. Adds intersphinx mapping to Pandas-GBQ
library. (:issue:`20564`)
- Added new writer for exporting Stata dta files in version 117, ``StataWriter117``.  This format supports exporting strings with lengths up to 2,000,000 characters (:issue:`16450`)
- :func:`to_hdf` and :func:`read_hdf` now accept an ``errors`` keyword argument to control encoding error handling (:issue:`20835`)
- :func:`cut` has gained the ``duplicates='raise'|'drop'`` option to control whether to raise on duplicated edges (:issue:`20947`)
- :func:`date_range`, :func:`timedelta_range`, and :func:`interval_range` now return a linearly spaced index if ``start``, ``stop``, and ``periods`` are specified, but ``freq`` is not. (:issue:`20808`, :issue:`20983`, :issue:`20976`)
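
A few of the enhancements above, sketched together (the data is illustrative, not taken from the release notes):

```python
import pandas as pd

# Matrix multiplication via the @ operator (Python >= 3.5); multiplying
# by the identity matrix returns the original values.
df = pd.DataFrame([[1, 2], [3, 4]])
identity = pd.DataFrame([[1, 0], [0, 1]])
product = df @ identity

# MultiIndex.unique with the level= argument.
idx = pd.MultiIndex.from_tuples([('a', 1), ('a', 2), ('b', 1)])
first_level = idx.unique(level=0)

# DataFrame.from_dict with orient='index' and explicit column names.
d = {'row1': [1, 2], 'row2': [3, 4]}
from_dict = pd.DataFrame.from_dict(d, orient='index', columns=['x', 'y'])
```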

.. _whatsnew_0230.api_breaking:

Backwards incompatible API changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. _whatsnew_0230.api_breaking.deps:

@coveralls

Coverage remained the same at 92.661% when pulling 58fff33 on pyup/scheduled-update-2019-01-01 into eeff318 on develop.

@coveralls

coveralls commented Jan 1, 2019


Coverage remained the same at 92.661% when pulling 2b62c50 on pyup/scheduled-update-2019-01-01 into eeff318 on develop.

Remove added pins
@jason-neal jason-neal merged commit 7e3b7fe into develop Jan 2, 2019
@jason-neal jason-neal deleted the pyup/scheduled-update-2019-01-01 branch January 2, 2019 05:36
jason-neal pushed a commit that referenced this pull request Apr 7, 2019
* Update astropy from 3.0.5 to 3.1.1

* Update coverage from 4.5.1 to 4.5.2

* Update hypothesis from 3.82.1 to 3.85.2

* Update joblib from 0.12.5 to 0.13.0

* Update matplotlib from 3.0.1 to 3.0.2

* Update mypy from 0.641 to 0.650

* Update pre-commit from 1.12.0 to 1.13.0

* Update pytest from 3.9.3 to 4.0.2

* Update scipy from 1.1.0 to 1.2.0



Former-commit-id: 6182748283ea0cda2fce6f50f778353f7a214084 [formerly 119821421f5d280e0e7cc167bf81a7aba284f479]
Former-commit-id: 6c01db6d772ad9ddfc1324dd8fcb343f62d67f92