# Registering custom accessors

Accessors add features and methods to a regular DataFrame. They do not subclass or wrap `DataFrame`: the objects you work with are still plain DataFrames.
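A quick way to check this is to register a small accessor and inspect the type of the DataFrame afterwards. The sketch below is only illustrative: the accessor name `demo` and the method `n_cells` are hypothetical names made up for this example, not part of pandas.

```python
import pandas as pd


# "demo" is a hypothetical accessor name chosen for this sketch.
@pd.api.extensions.register_dataframe_accessor("demo")
class DemoAccessor:
    def __init__(self, pandas_obj):
        # keep a reference to the DataFrame the accessor is attached to
        self._obj = pandas_obj

    def n_cells(self):
        # total number of cells (rows * columns) in the DataFrame
        n_rows, n_cols = self._obj.shape
        return n_rows * n_cols


df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# The DataFrame is created the usual way and keeps its exact type.
print(type(df) is pd.DataFrame)

# The extra method is reached through the accessor namespace.
print(df.demo.n_cells())
```

The extra behaviour lives under the `df.demo` namespace, so it does not collide with existing DataFrame attributes.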

Official documentation: https://pandas.pydata.org/pandas-docs/stable/development/extending.html#registering-custom-accessors


 - https://towardsdatascience.com/pandas-dtype-specific-operations-accessors-c749bafb30a4
 - https://pandas.pydata.org/pandas-docs/stable/user_guide/basics.html#dt-accessor
 - https://towardsdatascience.com/ready-the-easy-way-to-extend-pandas-api-dcf4f6612615
 - https://pandas.pydata.org/pandas-docs/stable/reference/series.html#string-handling
 - https://pandas.pydata.org/pandas-docs/stable/reference/series.html#accessors
 - https://realpython.com/python-pandas-tricks/#3-take-advantage-of-accessor-methods
 - https://github.com/pandas-dev/pandas/blob/3e4839301fc2927646889b194c9eb41c62b76bda/pandas/core/arrays/categorical.py#L2356
 - https://github.com/pandas-dev/pandas/blob/3e4839301fc2927646889b194c9eb41c62b76bda/pandas/core/strings.py#L1766
 - https://github.com/hgrecco/pint-pandas/blob/master/pint_pandas/pint_array.py


In [1]:
import pandas as pd
import numpy as np

@pd.api.extensions.register_dataframe_accessor("geo")
class GeoAccessor:
    def __init__(self, pandas_obj):
        self._validate(pandas_obj)
        self._obj = pandas_obj

    @staticmethod
    def _validate(obj):
        # verify there is a column latitude and a column longitude
        if "latitude" not in obj.columns or "longitude" not in obj.columns:
            raise AttributeError("Must have 'latitude' and 'longitude'.")

    @property
    def center(self):
        # return the geographic center point of this DataFrame
        lat = self._obj.latitude
        lon = self._obj.longitude
        return (float(lon.mean()), float(lat.mean()))

    def plot(self):
        # placeholder: a real implementation would plot the data on a map,
        # e.g. using Cartopy; here we only print the center
        print(self.center)

In [2]:
ds = pd.DataFrame(
    {"longitude": np.linspace(0, 10), "latitude": np.linspace(0, 20)}
)
print(ds.geo.center)

ds.geo.plot()
# a real implementation would plot the data on a map; here it prints the center


(5.0, 10.0)
(5.0, 10.0)
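The same mechanism is available for Series through `pd.api.extensions.register_series_accessor`, which is the kind of accessor used in the next section. A minimal sketch, assuming a hypothetical accessor name `span` and a hypothetical property `width` (neither exists in pandas):

```python
import pandas as pd


# "span" is a hypothetical accessor name chosen for this sketch.
@pd.api.extensions.register_series_accessor("span")
class SpanAccessor:
    def __init__(self, pandas_obj):
        # keep a reference to the Series the accessor is attached to
        self._obj = pandas_obj

    @property
    def width(self):
        # difference between the largest and smallest value
        return self._obj.max() - self._obj.min()


s = pd.Series([3, 7, 12])

# The property is only reachable through the accessor namespace.
print(s.span.width)
print(hasattr(s, "width"))
```

As with the DataFrame accessor, the Series itself gains no new top-level attribute; everything is grouped under `s.span`.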


# Physipy series accessor

In [4]:
import pandas as pd
import numpy as np
from physipy import m
from physipandas import QuantityDtype

c = pd.Series(np.arange(10)*m, 
              dtype=QuantityDtype(m))


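# Quantity attributes are not exposed on the Series itself;
# each fallback below reaches them through the `.physipy` accessor.
print("-------- Use the physipy accessor")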
try:
    print(c.dimension)
except Exception as e:
    print("Raised ", e)
    print(c.physipy.dimension)
    
try:
    print(c._SI_unitary_quantity)
except Exception as e:
    print("Raised ", e)
    print(c.physipy._SI_unitary_quantity)
    
try:
    print(c.mean())
except Exception as e:
    print("Raised ", e)
    print(c.physipy.values.mean())
    
c.physipy.values.is_length()

QDTYPE : new with 1 m of type <class 'physipy.quantity.quantity.Quantity'>
returning  physipy[m] with .unit 1 m
calling qARRY with ()
trying from sequence  [<Quantity : 0 m>, <Quantity : 1 m>, <Quantity : 2 m>, <Quantity : 3 m>, <Quantity : 4 m>, <Quantity : 5 m>, <Quantity : 6 m>, <Quantity : 7 m>, <Quantity : 8 m>, <Quantity : 9 m>] 10 physipy[m]
ENTERING QARRAY with [0 1 2 3 4 5 6 7 8 9] m physipy[m] False
QARRAY : init with [0 1 2 3 4 5 6 7 8 9] m of type <class 'physipy.quantity.quantity.Quantity'>
data is set to [0 1 2 3 4 5 6 7 8 9] m with len 10 10 <class 'physipy.quantity.quantity.Quantity'>
len values 10
QDTYPE : new with 1 m of type <class 'physipy.quantity.quantity.Quantity'>
returning  physipy[m] with .unit 1 m
dtype is then set to  physipy[m]
Length of QuantityArray 10
-------- Use the physipy accessor
Raised  'Series' object has no attribute 'dimension'
L
Raised  'Series' object has no attribute '_SI_unitary_quantity'
1 m
Raised  cannot perform mean with type physipy[m]


True

In [2]:
type(c.physipy.values) == type(np.arange(10)*m)

True

In [7]:
print(c.values, type(c), type(c.values))

Length of QuantityArray 10
Length of QuantityArray 10
Length of QuantityArray 10
<QuantityArray>
[0 m, 1 m, 2 m, 3 m, 4 m, 5 m, 6 m, 7 m, 8 m, 9 m]
Length: 10, dtype: physipy[m] <class 'pandas.core.series.Series'> <class 'physipandas.extension.QuantityArray'>


In [10]:
print(len(c))
print(c.shape)
print(c.ndim)

10
Length of QuantityArray 10
(10,)
1


In [13]:
arr = pd.Series(np.arange(10))
print(len(arr))
print(arr.shape)
print(arr.ndim)

10
(10,)
1
