Merge pull request #14 from petbox-dev/develop

Split methods for consistent type returns, improve constructor validations
petbox-dev · Jun 17, 2020 · d9a7768
2 parents dbe102b + 8fbd0a3
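
A minimal sketch of the idea behind this change, for readers skimming the diff: a method whose return shape depends on an argument is split into two methods, each with one return type, and the constructor validates its inputs up front. ``MiniFrame`` and its signatures are hypothetical stand-ins chosen for illustration; they are not the tafra implementation, and the real ``col_map`` / ``key_map`` signatures may differ.

```python
from typing import Any, Callable, Dict, Iterator, List, Tuple


class MiniFrame:
    """Hypothetical column store used only for illustration (not tafra)."""

    def __init__(self, data: Dict[str, List[Any]]) -> None:
        # Validate eagerly: every column must have the same length.
        lengths = {len(col) for col in data.values()}
        if len(lengths) > 1:
            raise ValueError(f"columns have differing lengths: {sorted(lengths)}")
        self._data = dict(data)

    def col_map(self, fn: Callable[[List[Any]], Any]) -> Iterator[Any]:
        """Always yields fn(column); never (name, value) pairs."""
        return (fn(col) for col in self._data.values())

    def key_map(self, fn: Callable[[List[Any]], Any]) -> Iterator[Tuple[str, Any]]:
        """Always yields (name, fn(column)) pairs."""
        return ((name, fn(col)) for name, col in self._data.items())


frame = MiniFrame({"x": [1, 2, 3], "y": [10, 20, 30]})
totals = list(frame.col_map(sum))        # values only
named_totals = dict(frame.key_map(sum))  # (name, value) pairs collected into a dict
```

With two methods, a type checker can give each call site a single return type, rather than a union that depends on a flag passed at runtime.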
Showing 12 changed files with 309 additions and 159 deletions.
diff --git a/README.rst b/README.rst
@@ -152,7 +152,7 @@ A short example
      [array([1, 2, 3, 4]), array(['one', 'two', 'one', 'two'], dtype=object)]
 
     >>> print('Records:', '\n', tuple(t.to_records()))
-    Record:
+    Records:
      ((1, 'one'), (2, 'two'), (3, 'one'), (4, 'two'))
 
     >>> gb = t.group_by(
@@ -197,9 +197,9 @@ Timings
 In this case, lightweight also means performant. Beyond any additional
 features added to the library, ``tafra`` should provide the necessary
 base for organizing data structures for numerical processing. One of the
-most important aspects is fast access to the data itself. By minizing
+most important aspects is fast access to the data itself. By minimizing
 abstraction to access the underlying ``numpy`` arrays, ``tafra`` provides
-over an order of magnitude increase in performance.
+an order of magnitude increase in performance.
 
 -   **Import note** If you assign directly to the ``Tafra.data`` or
     ``Tafra._data`` attributes, you *must* call ``Tafra._coalesce_dtypes``
@@ -307,6 +307,6 @@ preserve immutability.
     2.5 ms ± 177 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
 
     >>> %%timeit
-    >>> tdf = df.copy()
-    >>> tdf['x'] = df.groupby(['y', 'z'])[['x']].transform(sum)
+    ... tdf = df.copy()
+    ... tdf['x'] = df.groupby(['y', 'z'])[['x']].transform(sum)
     2.81 ms ± 143 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
diff --git a/docs/api.rst b/docs/api.rst
@@ -69,6 +69,7 @@ Methods
     row_map
     tuple_map
     col_map
+    key_map
     select
     copy
     update
@@ -166,6 +167,7 @@ Methods
     .. automethod:: row_map
     .. automethod:: tuple_map
     .. automethod:: col_map
+    .. automethod:: key_map
     .. automethod:: select
     .. automethod:: copy
     .. automethod:: update

diff --git a/docs/numerical.rst b/docs/numerical.rst
@@ -11,7 +11,7 @@ as `generator expressions <https://www.python.org/dev/peps/pep-0289/>`_ wherever
 possible.
 
 Additionally, because the :attr:``data`` contains values of ndarrays, the
-``map`` functions may also take functions that operator on ndarrays. This means
+``map`` functions may also take functions that operate on ndarrays. This means
 that they are able to take `numba <http://numba.pydata.org/>`_ ``@jit``'ed
 functions as well.
 
@@ -56,7 +56,7 @@ dtype  int32  float64 float64 float64
 ====== ====== ======= ======= =======
 
 
-Next, we define our hyperbolic function and the time array to evalute:
+Next, we define our hyperbolic function and the time array to evaluate:
 
 .. code-block:: python
 
@@ -100,7 +100,7 @@ versions, so this is the recommended way. Let's time each approach:
     3.38 ms ± 129 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
 
 
-    >>> pdcs = pd.DataFrame(dict(df.apply(mapper, axis=1).to_list())))
+    >>> %timeit pdcs = pd.DataFrame(dict(df.apply(mapper, axis=1).to_list()))
     6.86 ms ± 408 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
 
 
@@ -136,19 +136,20 @@ possible:
 .. code-block:: python
 
     >>> from numba import jit
+    >>> jit_kw = {'fastmath': True}
 
-    >>> @jit
+    >>> @jit(**jit_kw)
     ... def tan_to_nominal(D: float) -> float:
-    ...     return -np.log1p(-D)
+    ...     return -math.log1p(-D)
 
-    >>> @jit
+    >>> @jit(**jit_kw)
     ... def sec_to_nominal(D: float, b: float) -> float:
     ...     if b <= 1e-4:
     ...         return tan_to_nominal(D)
     ...
     ...     return ((1.0 - D) ** -b - 1.0) / b
 
-    >>> @jit
+    >>> @jit(**jit_kw)
     ... def hyp(qi: float, Di: float, bi: float, t: np.ndarray) -> np.ndarray:
     ...     Dn = sec_to_nominal(Di, bi)
     ...

diff --git a/docs/versions.rst b/docs/versions.rst
@@ -6,14 +6,21 @@ Version History
    :noindex:
 
 
+1.0.6
+------
+
+* Additional validations in constructor, primarily to evaluate Iterables of values
+* Split ``col_map`` into ``col_map`` and ``key_map``, as the original function's return signature depended upon an argument.
+* Fix some documentation typos
+
+
 1.0.5
 -----
 
 * Add ``tuple_map`` method
 * Refactor all iterators and ``..._map`` functions to improve performance
 * Unpack ``np.ndarray`` if given as keys to constructor
-* Add ``validate=False`` in ``__post_init__`` if inputs are **known** to be
-   valid to improve performance
+* Add ``validate=False`` in ``__post_init__`` if inputs are **known** to be valid to improve performance
 
 
 1.0.4

diff --git a/setup.py b/setup.py
@@ -1,6 +1,7 @@
 """
 Tafra: a minimalist dataframe
 
+Copyright (c) 2020 Derrick W. Turk and David S. Fulford
 
 Author
 ------

diff --git a/tafra/__init__.py b/tafra/__init__.py
@@ -1,4 +1,18 @@
-__version__ = '1.0.5'
+"""
+Tafra: a minimalist dataframe
+
+Copyright (c) 2020 Derrick W. Turk and David S. Fulford
+
+Author
+------
+Derrick W. Turk
+David S. Fulford
+
+Notes
+-----
+Created on April 25, 2020
+"""
+__version__ = '1.0.6'
 
 from .base import Tafra, object_formatter
 from .group import GroupBy, Transform, IterateBy, InnerJoin, LeftJoin