derivative method for Series and DataFrame #26680
Comments
My initial reaction is that this is too domain-specific to warrant
inclusion in pandas.
…On Wed, Jun 5, 2019 at 3:16 PM scls19fr ***@***.***> wrote:
Hello,
when dealing with sensor data (energy meter, volumetric meter, ...) and
having such data in a DataFrame (or in a Series), it can be convenient to
easily calculate the derivative of values (generally over local time).
This question has been asked on StackOverflow (at least 2 times):
https://stackoverflow.com/questions/39235712/calculate-local-time-derivative-of-series
https://stackoverflow.com/questions/26245242/time-differentiation-in-pandas/26246562
Maybe Series and DataFrame could have a derivative method.
I wrote some convenient methods (both for DataFrame and for Series) to
calculate derivative.
I currently monkey patch <https://en.wikipedia.org/wiki/Monkey_patch>
Pandas using the following code:
import pandas as pd


def _derivative_series(self):
    # Denominator: index spacing (in seconds for a DatetimeIndex).
    if isinstance(self.index, pd.DatetimeIndex):
        den = self.index.to_series(keep_tz=True).diff().dt.total_seconds()
    else:
        den = self.index.to_series().diff()
    # Numerator: difference between consecutive values.
    num = self.diff()
    return num.div(den, axis=0)


def _derivative_dataframe(self, var=None):
    if var is None:
        # Differentiate every column with respect to the index.
        if isinstance(self.index, pd.DatetimeIndex):
            den = self.index.to_series(keep_tz=True).diff().dt.total_seconds()
        else:
            den = self.index.to_series().diff()
        num = self.diff()
        return num.div(den, axis=0)
    else:
        # Differentiate every other column with respect to column `var`.
        if pd.api.types.is_datetime64_any_dtype(self[var]):
            den = self[var].diff().dt.total_seconds()
        else:
            den = self[var].diff()
        num = self.loc[:, self.columns != var].diff()
        result = num.div(den, axis=0)
        # Put `var` back and restore the original column order.
        result[var] = self[var]
        return result.loc[:, self.columns]


def monkey_patch_pandas(pd):
    pd.Series.derivative = _derivative_series
    pd.DataFrame.derivative = _derivative_dataframe
So you can now use the derivative method on a DataFrame with a DatetimeIndex:
import pandas as pd
monkey_patch_pandas(pd)
from io import StringIO
dat = """time,sensor1,sensor2
2019-05-27 13:49:47.703850+02:00,0.0,100.2
2019-05-27 13:49:47.827518+02:00,0.4,102.2
2019-05-27 13:49:47.974124+02:00,0.8,102.4
2019-05-27 13:49:48.097793+02:00,1.1,104.1
2019-05-27 13:49:48.222461+02:00,1.2,101.1
2019-05-27 13:49:48.355105+02:00,1.4,102.0"""
df = pd.read_csv(StringIO(dat), index_col='time', parse_dates=True)
print("df:")
print(df)
print("")
print("df.derivative():")
print(df.derivative())
print("")
It should also work on a Series with a DatetimeIndex:
print("df['sensor1'].derivative():")
print(df['sensor1'].derivative())
print("")
But you can also differentiate using a column name as the variable:
print("df.reset_index().derivative(var='time'):")
print(df.reset_index().derivative(var='time'))
The derivative method should also work fine on a DataFrame with a float index, like:
dat2 = """x,y1,y2
0.0,0.0,100.2
0.2,0.4,102.2
0.3,0.8,102.4
0.45,1.1,104.1
0.5,1.2,101.1
0.7,1.4,102.0"""
df = pd.read_csv(StringIO(dat2), index_col='x', parse_dates=True)
print("df:")
print(df)
print("")
print("df.derivative():")
print(df.derivative())
print("")
With a Series also:
print("df['y1'].derivative():")
print(df['y1'].derivative())
print("")
And given a variable name (a DataFrame column name):
print("df.reset_index().derivative(var='x'):")
print(df.reset_index().derivative(var='x'))
I wonder whether other Pandas users have similar use cases and whether adding
such a feature directly to the Pandas code could be valuable.
Kind regards
Differentiation and integration are not really domain-specific calculations. I am not a big fan of the monkey patch approach.
These are out of scope for pandas; can you not simply use the scipy methods directly? Alternatively, if someone wanted to make a package that added these as accessors, I think that could easily be done,
e.g. via the accessors API: http://pandas.pydata.org/pandas-docs/stable/development/extending.html#registering-custom-accessors
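For illustration, the accessor route suggested here could look roughly like the sketch below. It reuses the same backward-difference idea as the monkey-patched Series method (difference of values divided by difference of index), but registers it through `pd.api.extensions.register_series_accessor` instead of patching `pd.Series`. The accessor name `calculus` and the class name `CalculusAccessor` are hypothetical choices, not part of pandas. The `keep_tz` argument from the original snippet is left out because, as far as I know, newer pandas releases no longer accept it.

```python
import pandas as pd


# "calculus" is a hypothetical accessor name chosen for this sketch.
@pd.api.extensions.register_series_accessor("calculus")
class CalculusAccessor:
    def __init__(self, series):
        self._obj = series

    def derivative(self):
        """Backward difference of the values divided by the index spacing."""
        index = self._obj.index
        if isinstance(index, pd.DatetimeIndex):
            # Index spacing expressed in seconds.
            den = index.to_series().diff().dt.total_seconds()
        else:
            den = index.to_series().diff()
        return self._obj.diff().div(den)


s = pd.Series([0.0, 0.4, 0.8], index=[0.0, 0.2, 0.3])
print(s.calculus.derivative())
```

The first element of the result is NaN, as with the monkey-patched version, because there is no previous sample to difference against. A `register_dataframe_accessor` counterpart could be written the same way.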
Hello,
when dealing with sensor data (energy meter, volumetric meter, ...) and having such data in a DataFrame (or in a Series), it can be convenient to easily calculate the derivative of values (generally over local time), i.e. calculate power from energy consumption, for example, or flow rate from volume over time.
This question has been asked on StackOverflow (at least 2 times):
https://stackoverflow.com/questions/39235712/calculate-local-time-derivative-of-series
https://stackoverflow.com/questions/26245242/time-differentiation-in-pandas/26246562
Maybe Series and DataFrame could have a derivative method.
I wrote some convenience methods (both for DataFrame and for Series) to calculate the derivative.
I currently monkey patch Pandas using the following code:
So you can now use the derivative method on a DataFrame with a DatetimeIndex
It should also work on a Series with a DatetimeIndex
But you can also differentiate using a column name as the variable
The derivative method should also work fine on a DataFrame with a float index, like
With a Series also
And given a variable name (a DataFrame column name)
I wonder whether other Pandas users have similar use cases and whether adding such a feature directly to the Pandas code could be valuable.
Kind regards
PS: Another approach could be to make use of the numpy.gradient function.
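A minimal sketch of the numpy.gradient idea, assuming the index holds the independent variable (seconds elapsed for a DatetimeIndex, the raw values otherwise). The helper name `gradient_series` is hypothetical. Note that `numpy.gradient` uses central differences for interior points and one-sided differences at the two ends, so the numbers differ from the `diff()`-based version and there is no leading NaN.

```python
import numpy as np
import pandas as pd


def gradient_series(s):
    """Hypothetical helper: derivative of a Series with respect to its index."""
    index = s.index
    if isinstance(index, pd.DatetimeIndex):
        # Seconds elapsed since the first sample.
        x = (index - index[0]).total_seconds().to_numpy()
    else:
        x = index.to_numpy(dtype=float)
    # Passing the coordinate array handles unevenly spaced samples.
    values = np.gradient(s.to_numpy(dtype=float), x)
    return pd.Series(values, index=index, name=s.name)


x = np.array([0.0, 0.2, 0.3, 0.45])
s = pd.Series(3.0 * x + 1.0, index=x, name="y1")
print(gradient_series(s))
```

For this linear test series the estimated slope is 3 at every sample, up to floating-point rounding.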