Commit

Merge commit 'v0.16.2-42-g383865f' into debian
* commit 'v0.16.2-42-g383865f': (72 commits)
  BUG: provide categorical concat always on axis 0, pandas-dev#10430     numpy 1.10 makes this an error for 1-d on axis != 0
  DOC: update missing.rst with ref to groupby.rst
  BUG: Timedeltas with no specified units (and frac) should raise, pandas-dev#10426
  BUG: using .loc[:,column] fails when the object is a multi-index, pandas-dev#10408
  Removed scikit-timeseries migration docs from FAQ
  BUG: GH10395 bug in DataFrame.interpolate with axis=1 and inplace=True
  BUG: GH10392 bug where Table.select_column does not preserve column name
  TST: Use unicode literals in string test
  PERF: fix _get_level_indexer to accept an intermediate indexer result
  PERF: bench for pandas-dev#10287
  BUG: drop_duplicates drops name(s).
  ENH: Enable ExcelWriter to construct in-memory sheets
  BLD: remove support for 3.2, pandas-dev#9118
  PERF: timedelta and datetime64 ops improvements
  PERF: parse timedelta strings in cython pandas-dev#6755
  closes bug in reset_index when index contains NaT
  Check for size=0 before setting item Fixes pandas-dev#10193
  closes bug in apply when function returns categorical
  BUG: frequencies.get_freq_code raises an error against offset with n != 1
  CI: run doc-tests always
  ...
yarikoptic committed Jun 26, 2015
2 parents 2b157b7 + 383865f commit be8c77a
Showing 116 changed files with 3,143 additions and 1,370 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -41,6 +41,8 @@ doc/_build
dist
# Egg metadata
*.egg-info
.eggs

# tox testing tool
.tox
# rope
15 changes: 1 addition & 14 deletions .travis.yml
@@ -86,13 +86,6 @@ matrix:
- CLIPBOARD=xsel
- BUILD_TYPE=conda
- JOB_NAME: "34_slow"
- python: 3.2
env:
- NOSE_ARGS="not slow and not network and not disabled"
- FULL_DEPS=true
- CLIPBOARD_GUI=qt4
- BUILD_TYPE=pydata
- JOB_NAME: "32_nslow"
- python: 2.7
env:
- EXPERIMENTAL=true
@@ -103,13 +96,6 @@ matrix:
- BUILD_TYPE=pydata
- PANDAS_TESTING_MODE="deprecate"
allow_failures:
- python: 3.2
env:
- NOSE_ARGS="not slow and not network and not disabled"
- FULL_DEPS=true
- CLIPBOARD_GUI=qt4
- BUILD_TYPE=pydata
- JOB_NAME: "32_nslow"
- python: 2.7
env:
- NOSE_ARGS="slow and not network and not disabled"
@@ -180,6 +166,7 @@ before_script:

script:
- echo "script"
- ci/run_build_docs.sh &
- ci/script.sh
# nothing here, or failed tests won't fail travis

2 changes: 2 additions & 0 deletions README.md
@@ -1,6 +1,8 @@
# pandas: powerful Python data analysis toolkit

[![Build Status](https://travis-ci.org/pydata/pandas.svg?branch=master)](https://travis-ci.org/pydata/pandas)
[![Join the chat at https://gitter.im/pydata/pandas](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/pydata/pandas?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)

## What is it

14 changes: 8 additions & 6 deletions ci/build_docs.sh
@@ -1,7 +1,7 @@
#!/bin/bash


cd "$TRAVIS_BUILD_DIR"
echo "inside $0"

git show --pretty="format:" --name-only HEAD~5.. --first-parent | grep -P "rst|txt|doc"

@@ -16,18 +16,20 @@ if [ x"$DOC_BUILD" != x"" ]; then

# we're running network tests, let's build the docs in the meantime
echo "Will build docs"
conda install sphinx==1.1.3 ipython
conda install -n pandas sphinx=1.1.3 pygments ipython=2.4 --yes

source activate pandas

mv "$TRAVIS_BUILD_DIR"/doc /tmp
cd /tmp/doc

rm /tmp/doc/source/api.rst # no R
rm /tmp/doc/source/r_interface.rst # no R

echo ############################### > /tmp/doc.log
echo # Log file for the doc build # > /tmp/doc.log
echo ############################### > /tmp/doc.log
echo "" > /tmp/doc.log
echo ###############################
echo # Log file for the doc build #
echo ###############################

echo -e "y\n" | ./make.py --no-api 2>&1

cd /tmp/doc/build/html
2 changes: 1 addition & 1 deletion ci/requirements-2.7_32.txt
@@ -1,4 +1,4 @@
dateutil
python-dateutil
pytz
xlwt
numpy
2 changes: 1 addition & 1 deletion ci/requirements-2.7_64.txt
@@ -1,4 +1,4 @@
dateutil
python-dateutil
pytz
xlwt
numpy
2 changes: 1 addition & 1 deletion ci/requirements-2.7_LOCALE.txt
@@ -1,4 +1,4 @@
dateutil
python-dateutil
pytz=2013b
xlwt=0.7.5
openpyxl=1.6.2
2 changes: 1 addition & 1 deletion ci/requirements-2.7_SLOW.txt
@@ -1,4 +1,4 @@
dateutil
python-dateutil
pytz
numpy
cython
4 changes: 0 additions & 4 deletions ci/requirements-3.2.txt

This file was deleted.

2 changes: 1 addition & 1 deletion ci/requirements-3.3.txt
@@ -1,4 +1,4 @@
dateutil
python-dateutil
pytz=2013b
openpyxl=1.6.2
xlsxwriter=0.4.6
3 changes: 2 additions & 1 deletion ci/requirements-3.4.txt
@@ -1,8 +1,9 @@
dateutil
python-dateutil
pytz
openpyxl
xlsxwriter
xlrd
xlwt
html5lib
patsy
beautiful-soup
2 changes: 1 addition & 1 deletion ci/requirements-3.4_32.txt
@@ -1,4 +1,4 @@
dateutil
python-dateutil
pytz
openpyxl
xlrd
2 changes: 1 addition & 1 deletion ci/requirements-3.4_64.txt
@@ -1,4 +1,4 @@
dateutil
python-dateutil
pytz
openpyxl
xlrd
3 changes: 2 additions & 1 deletion ci/requirements-3.4_SLOW.txt
@@ -1,8 +1,9 @@
dateutil
python-dateutil
pytz
openpyxl
xlsxwriter
xlrd
xlwt
html5lib
patsy
beautiful-soup
2 changes: 1 addition & 1 deletion ci/requirements_all.txt
@@ -1,7 +1,7 @@
nose
sphinx
ipython
dateutil
python-dateutil
pytz
openpyxl
xlsxwriter
2 changes: 1 addition & 1 deletion ci/requirements_dev.txt
@@ -1,4 +1,4 @@
dateutil
python-dateutil
pytz
numpy
cython
10 changes: 10 additions & 0 deletions ci/run_build_docs.sh
@@ -0,0 +1,10 @@
#!/bin/bash

echo "inside $0"

"$TRAVIS_BUILD_DIR"/ci/build_docs.sh 2>&1 > /tmp/doc.log &

# wait until subprocesses finish (build_docs.sh)
wait

exit 0
11 changes: 2 additions & 9 deletions ci/script.sh
@@ -12,20 +12,13 @@ if [ -n "$LOCALE_OVERRIDE" ]; then
python -c "$pycmd"
fi

# conditionally build and upload docs to GH/pandas-docs/pandas-docs/travis
"$TRAVIS_BUILD_DIR"/ci/build_docs.sh 2>&1 > /tmp/doc.log &
# doc build log will be shown after tests

if [ "$BUILD_TEST" ]; then
echo "We are not running nosetests as this is simply a build test."
else
echo nosetests --exe -A "$NOSE_ARGS" pandas --with-xunit --xunit-file=/tmp/nosetests.xml
nosetests --exe -A "$NOSE_ARGS" pandas --with-xunit --xunit-file=/tmp/nosetests.xml
echo nosetests --exe -A "$NOSE_ARGS" pandas --doctest-tests --with-xunit --xunit-file=/tmp/nosetests.xml
nosetests --exe -A "$NOSE_ARGS" pandas --doctest-tests --with-xunit --xunit-file=/tmp/nosetests.xml
fi

RET="$?"

# wait until subprocesses finish (build_docs.sh)
wait

exit "$RET"
5 changes: 5 additions & 0 deletions ci/submit_ccache.sh
@@ -13,6 +13,11 @@ fi

if [ "$IRON_TOKEN" ]; then

# install the compiler cache
sudo apt-get $APT_ARGS install ccache p7zip-full
# iron_cache, pending py3 fixes upstream
pip install -I --allow-external --allow-insecure git+https://github.com/iron-io/iron_cache_python.git@8a451c7d7e4d16e0c3bedffd0f280d5d9bd4fe59#egg=iron_cache

rm -rf $HOME/ccache.7z

tar cf - $HOME/.ccache \
10 changes: 10 additions & 0 deletions doc/source/api.rst
@@ -53,6 +53,15 @@ JSON

read_json

.. currentmodule:: pandas.io.json

.. autosummary::
:toctree: generated/

json_normalize

.. currentmodule:: pandas

HTML
~~~~

@@ -563,6 +572,7 @@ strings and apply several methods to it. These can be acccessed like
Series.str.slice
Series.str.slice_replace
Series.str.split
Series.str.rsplit
Series.str.startswith
Series.str.strip
Series.str.swapcase
74 changes: 74 additions & 0 deletions doc/source/basics.rst
@@ -624,6 +624,77 @@ We can also pass infinite values to define the bins:
Function application
--------------------

To apply your own or another library's functions to pandas objects,
you should be aware of the three methods below. The appropriate
method to use depends on whether your function expects to operate
on an entire ``DataFrame`` or ``Series``, row- or column-wise, or elementwise.

1. `Tablewise Function Application`_: :meth:`~DataFrame.pipe`
2. `Row or Column-wise Function Application`_: :meth:`~DataFrame.apply`
3. Elementwise_ function application: :meth:`~DataFrame.applymap`
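
As a rough illustration of the three levels listed above, the following sketch applies one small function at each level. The frame, the column names, and the helper ``add_total`` are made up for this example. Because newer pandas releases rename ``applymap`` to ``DataFrame.map``, the sketch picks whichever of the two the installed version provides; that fallback is an assumption about the installed version, not part of this commit.

```python
import pandas as pd

# Hypothetical example data, not taken from the pandas docs.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})


def add_total(frame):
    """Tablewise: receives the whole DataFrame and returns a new one."""
    return frame.assign(total=frame["a"] + frame["b"])


# 1. Tablewise: the function sees the entire DataFrame.
tablewise = df.pipe(add_total)

# 2. Column-wise (default axis=0) and row-wise (axis=1): the function sees one Series at a time.
col_ranges = df.apply(lambda col: col.max() - col.min())
row_sums = df.apply(lambda row: row.sum(), axis=1)

# 3. Elementwise: the function sees one scalar at a time.
# Use DataFrame.map where it exists, otherwise fall back to applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap
as_text = elementwise(lambda value: f"<{value}>")
```

Choosing the level by what the function expects (a table, a row or column, or a single value) keeps the function itself free of pandas-specific looping.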

.. _basics.pipe:

Tablewise Function Application
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. versionadded:: 0.16.2

``DataFrames`` and ``Series`` can of course just be passed into functions.
However, if the function needs to be called in a chain, consider using the :meth:`~DataFrame.pipe` method.
Compare the following

.. code-block:: python

   # f, g, and h are functions taking and returning ``DataFrames``
   >>> f(g(h(df), arg1=1), arg2=2, arg3=3)

with the equivalent

.. code-block:: python

   >>> (df.pipe(h)
          .pipe(g, arg1=1)
          .pipe(f, arg2=2, arg3=3)
       )

Pandas encourages the second style, which is known as method chaining.
``pipe`` makes it easy to use your own or another library's functions
in method chains, alongside pandas' methods.
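
To make the comparison concrete, here is a minimal runnable sketch. The bodies of ``f``, ``g``, and ``h`` and the column names are invented for illustration; only their call signatures follow the snippets above. It checks that the nested call and the chained call produce the same frame.

```python
import pandas as pd


def h(df):
    """Keep only rows with a positive value in column 'x'."""
    return df[df["x"] > 0]


def g(df, arg1):
    """Add column 'y' as 'x' scaled by arg1."""
    return df.assign(y=df["x"] * arg1)


def f(df, arg2, arg3):
    """Add column 'z' computed from 'y' and the two arguments."""
    return df.assign(z=df["y"] * arg2 + arg3)


df = pd.DataFrame({"x": [-1, 2, 3]})

# Nested style: reads inside-out.
nested = f(g(h(df), arg1=1), arg2=2, arg3=3)

# Chained style: reads top to bottom in the order the steps run.
chained = (df.pipe(h)
             .pipe(g, arg1=1)
             .pipe(f, arg2=2, arg3=3))

assert nested.equals(chained)
```

Both forms run the same three functions in the same order; the chained form simply keeps each step next to its own arguments.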

In the example above, the functions ``f``, ``g``, and ``h`` each expected the ``DataFrame`` as the first positional argument.
What if the function you wish to apply takes its data as, say, the second argument?
In this case, provide ``pipe`` with a tuple of ``(callable, data_keyword)``.
``.pipe`` will route the ``DataFrame`` to the argument specified in the tuple.

For example, we can fit a regression using statsmodels. Their API expects a formula first and a ``DataFrame`` as the second argument, ``data``. We pass the ``(function, keyword)`` pair ``(sm.poisson, 'data')`` to ``pipe``:

.. ipython:: python

   import statsmodels.formula.api as sm

   bb = pd.read_csv('data/baseball.csv', index_col='id')

   (bb.query('h > 0')
      .assign(ln_h = lambda df: np.log(df.h))
      .pipe((sm.poisson, 'data'), 'hr ~ ln_h + year + g + C(lg)')
      .fit()
      .summary()
   )

The pipe method is inspired by Unix pipes and, more recently, by dplyr_ and magrittr_, which
have introduced the popular ``(%>%)`` (read pipe) operator for R_.
The implementation of ``pipe`` here is quite clean and feels right at home in Python.
We encourage you to view the source code (``pd.DataFrame.pipe??`` in IPython).

.. _dplyr: https://github.com/hadley/dplyr
.. _magrittr: https://github.com/smbache/magrittr
.. _R: http://www.r-project.org
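
For readers who want the idea without opening the source, the dispatch described above can be sketched in a few lines of plain Python. This is a simplified stand-alone function written for illustration, not a copy of the pandas method; the helper ``fit`` is a hypothetical stand-in for an API that takes a formula first and the data second.

```python
def pipe(obj, func, *args, **kwargs):
    """Call ``func`` with ``obj`` as its data argument.

    ``func`` is either a callable (``obj`` is passed as the first positional
    argument) or a ``(callable, data_keyword)`` tuple (``obj`` is passed
    under that keyword).
    """
    if isinstance(func, tuple):
        func, data_keyword = func
        if data_keyword in kwargs:
            # The caller supplied the data twice; refuse to guess.
            raise ValueError(
                f"{data_keyword!r} is both the pipe target and a keyword argument"
            )
        kwargs[data_keyword] = obj
        return func(*args, **kwargs)
    return func(obj, *args, **kwargs)


def fit(formula, data):
    """Hypothetical stand-in for a formula-first API."""
    return f"fit {formula!r} on {len(data)} rows"


rows = [1, 2, 3]

# Plain callable: equivalent to sum(rows).
first_positional = pipe(rows, sum)

# Tuple form: equivalent to fit("y ~ x", data=rows).
by_keyword = pipe(rows, (fit, "data"), "y ~ x")
```

The tuple form exists only to say where the data goes; every other positional and keyword argument is forwarded unchanged.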


Row or Column-wise Function Application
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Arbitrary functions can be applied along the axes of a DataFrame or Panel
using the :meth:`~DataFrame.apply` method, which, like the descriptive
statistics methods, takes an optional ``axis`` argument:
@@ -678,6 +749,7 @@ Series operation on each column or row:
   tsdf
   tsdf.apply(pd.Series.interpolate)

Finally, :meth:`~DataFrame.apply` takes an argument ``raw``, which is False by default;
in that case each row or column is converted into a Series before the function is applied. When
set to True, the passed function will instead receive an ndarray object, which
@@ -690,6 +762,8 @@ functionality.
functionality for grouping by some criterion, applying, and combining the
results into a Series, DataFrame, etc.

.. _Elementwise:

Applying elementwise Python functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
