Merge pull request #14 from petbox-dev/develop
Split methods for consistent type returns, improve constructor validations
dsfulf committed Jun 17, 2020
2 parents dbe102b + 8fbd0a3 commit d9a7768
Showing 12 changed files with 309 additions and 159 deletions.
10 changes: 5 additions & 5 deletions README.rst
@@ -152,7 +152,7 @@ A short example
[array([1, 2, 3, 4]), array(['one', 'two', 'one', 'two'], dtype=object)]
>>> print('Records:', '\n', tuple(t.to_records()))
Record:
Records:
((1, 'one'), (2, 'two'), (3, 'one'), (4, 'two'))
>>> gb = t.group_by(
@@ -197,9 +197,9 @@ Timings
In this case, lightweight also means performant. Beyond any additional
features added to the library, ``tafra`` should provide the necessary
base for organizing data structures for numerical processing. One of the
most important aspects is fast access to the data itself. By minizing
most important aspects is fast access to the data itself. By minimizing
abstraction to access the underlying ``numpy`` arrays, ``tafra`` provides
over an order of magnitude increase in performance.
an order of magnitude increase in performance.

- **Import note** If you assign directly to the ``Tafra.data`` or
``Tafra._data`` attributes, you *must* call ``Tafra._coalesce_dtypes``
@@ -307,6 +307,6 @@ preserve immutability.
2.5 ms ± 177 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %%timeit
>>> tdf = df.copy()
>>> tdf['x'] = df.groupby(['y', 'z'])[['x']].transform(sum)
... tdf = df.copy()
... tdf['x'] = df.groupby(['y', 'z'])[['x']].transform(sum)
2.81 ms ± 143 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2 changes: 2 additions & 0 deletions docs/api.rst
@@ -69,6 +69,7 @@ Methods
row_map
tuple_map
col_map
key_map
select
copy
update
@@ -166,6 +167,7 @@ Methods
.. automethod:: row_map
.. automethod:: tuple_map
.. automethod:: col_map
.. automethod:: key_map
.. automethod:: select
.. automethod:: copy
.. automethod:: update
15 changes: 8 additions & 7 deletions docs/numerical.rst
@@ -11,7 +11,7 @@ as `generator expressions <https://www.python.org/dev/peps/pep-0289/>`_ wherever
possible.

Additionally, because the :attr:``data`` contains values of ndarrays, the
``map`` functions may also take functions that operator on ndarrays. This means
``map`` functions may also take functions that operate on ndarrays. This means
that they are able to take `numba <http://numba.pydata.org/>`_ ``@jit``'ed
functions as well.

@@ -56,7 +56,7 @@ dtype int32 float64 float64 float64
====== ====== ======= ======= =======


Next, we define our hyperbolic function and the time array to evalute:
Next, we define our hyperbolic function and the time array to evaluate:

.. code-block:: python
@@ -100,7 +100,7 @@ versions, so this is the recommended way. Let's time each approach:
3.38 ms ± 129 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> pdcs = pd.DataFrame(dict(df.apply(mapper, axis=1).to_list())))
>>> %timeit pdcs = pd.DataFrame(dict(df.apply(mapper, axis=1).to_list()))
6.86 ms ± 408 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
@@ -136,19 +136,20 @@ possible:
.. code-block:: python
>>> from numba import jit
>>> jit_kw = {'fastmath': True}
>>> @jit
>>> @jit(**jit_kw)
... def tan_to_nominal(D: float) -> float:
... return -np.log1p(-D)
... return -math.log1p(-D)
>>> @jit
>>> @jit(**jit_kw)
... def sec_to_nominal(D: float, b: float) -> float:
... if b <= 1e-4:
... return tan_to_nominal(D)
...
... return ((1.0 - D) ** -b - 1.0) / b
>>> @jit
>>> @jit(**jit_kw)
... def hyp(qi: float, Di: float, bi: float, t: np.ndarray) -> np.ndarray:
... Dn = sec_to_nominal(Di, bi)
...
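
As a quick check on the two conversion helpers shown in the hunk above, here is a plain-Python sketch with the ``@jit`` decorator dropped so that it runs without numba. It assumes that removing the decorator changes only speed (``fastmath`` may alter floating-point rounding slightly). The sample inputs are illustrative and do not come from the documentation.

```python
import math


def tan_to_nominal(D: float) -> float:
    # Same body as the jitted helper in the diff, minus the decorator.
    return -math.log1p(-D)


def sec_to_nominal(D: float, b: float) -> float:
    # For very small b the general formula divides by a value near zero,
    # so the helper falls back to the b -> 0 limit.
    if b <= 1e-4:
        return tan_to_nominal(D)

    return ((1.0 - D) ** -b - 1.0) / b


# Illustrative inputs (assumptions, not values from the docs).
tan_value = tan_to_nominal(0.5)            # equals ln(2)
harmonic_value = sec_to_nominal(0.5, 1.0)  # (1 / 0.5 - 1) / 1
near_limit = sec_to_nominal(0.5, 2e-4)     # just above the cutoff

print(tan_value, harmonic_value, near_limit)
```

The value just above the ``1e-4`` cutoff stays close to the fallback value, which suggests the branch is there for numerical stability rather than to change the result.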
11 changes: 9 additions & 2 deletions docs/versions.rst
@@ -6,14 +6,21 @@ Version History
:noindex:


1.0.6
------

* Additional validations in constructor, primarily to evaluate Iterables of values
* Split ``col_map`` into ``col_map`` and ``key_map``, as the original function's return signature depended upon an argument.
* Fix some documentation typos


1.0.5
-----

* Add ``tuple_map`` method
* Refactor all iterators and ``..._map`` functions to improve performance
* Unpack ``np.ndarray`` if given as keys to constructor
* Add ``validate=False`` in ``__post_init__`` if inputs are **known** to be
valid to improve performance
* Add ``validate=False`` in ``__post_init__`` if inputs are **known** to be valid to improve performance


1.0.4
1 change: 1 addition & 0 deletions setup.py
@@ -1,6 +1,7 @@
"""
Tafra: a minimalist dataframe
Copyright (c) 2020 Derrick W. Turk and David S. Fulford
Author
------
16 changes: 15 additions & 1 deletion tafra/__init__.py
@@ -1,4 +1,18 @@
__version__ = '1.0.5'
"""
Tafra: a minimalist dataframe
Copyright (c) 2020 Derrick W. Turk and David S. Fulford
Author
------
Derrick W. Turk
David S. Fulford
Notes
-----
Created on April 25, 2020
"""
__version__ = '1.0.6'

from .base import Tafra, object_formatter
from .group import GroupBy, Transform, IterateBy, InnerJoin, LeftJoin
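
The 1.0.6 entry in ``docs/versions.rst`` describes splitting one method into two because its return signature depended on an argument. The sketch below illustrates that general pattern only. ``col_map_sketch`` and ``key_map_sketch`` are hypothetical stand-alone functions over a plain dict of lists. They are not tafra's methods, whose actual signatures do not appear in this diff.

```python
from typing import Any, Callable, Dict, Iterator, List, Tuple

Column = List[float]
ColumnFn = Callable[[str, Column], Any]


def col_map_sketch(data: Dict[str, Column], fn: ColumnFn) -> Iterator[Any]:
    # Hypothetical: always yields bare results, one per column.
    for name, values in data.items():
        yield fn(name, values)


def key_map_sketch(
    data: Dict[str, Column], fn: ColumnFn
) -> Iterator[Tuple[str, Any]]:
    # Hypothetical: always yields (column name, result) pairs,
    # so the output can be passed straight to dict().
    for name, values in data.items():
        yield name, fn(name, values)


data = {'x': [1.0, 2.0, 3.0], 'y': [4.0, 5.0, 6.0]}

totals = list(col_map_sketch(data, lambda name, values: sum(values)))
named_totals = dict(key_map_sketch(data, lambda name, values: sum(values)))

print(totals)
print(named_totals)
```

With two functions, each has a single return type that a type checker can verify, instead of a union that callers must narrow based on a flag.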
