The aim is to provide an API and minimal feature set that can be extended to cover all of the (useful) functionality provided by the Pandas resample method.
Long term, we will need to support the functionality provided by the arguments rule, closed, label, origin, and offset.
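For reference, here is a small Pandas-only sketch of how some of these arguments change the output. The sample data is invented for illustration; only the Pandas resample call itself is taken from the proposal.

```python
import pandas as pd

# Ten one-minute observations, 09:00 through 09:09, with values 0..9 (made-up data).
index = pd.date_range("2024-01-01 09:00", periods=10, freq="1min")
series = pd.Series(range(10), index=index)

# Buckets closed and labelled on the left: [09:00, 09:05), [09:05, 09:10).
left = series.resample("5min", closed="left", label="left").sum()
# sums: 10, 35

# Buckets closed and labelled on the right: (08:55, 09:00], (09:00, 09:05], (09:05, 09:10].
right = series.resample("5min", closed="right", label="right").sum()
# sums: 0, 15, 30

# offset shifts the bucket grid: [08:57, 09:02), [09:02, 09:07), [09:07, 09:12).
shifted = series.resample("5min", offset="2min").sum()
# sums: 1, 20, 24
```

The same input produces different bucket counts and labels depending on closed, label, and offset, which is why these arguments need to be carried through to the resampling clause.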
For non-trivial bucket boundaries (e.g. last Thursday of every month) we should leverage Pandas to generate the actual boundaries of interest to pass to the C++ layer. For simpler boundaries (e.g. minute bars) we can have a more compact representation, although this is not required for the MVP.
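As a rough illustration of the "last Thursday of every month" case, the sketch below lets Pandas generate the boundaries and converts them to integer nanoseconds. The function name last_thursday_boundaries is hypothetical, and treating timezone-naive timestamps as UTC is an assumption of the sketch, not a decided design.

```python
import pandas as pd
from pandas.tseries.offsets import LastWeekOfMonth


def last_thursday_boundaries(start, end):
    """Return the last Thursday of each month in [start, end] as int64 ns since epoch.

    Hypothetical helper: naive timestamps are assumed to be UTC.
    """
    # weekday=3 is Thursday (Monday is 0).
    edges = pd.date_range(start, end, freq=LastWeekOfMonth(weekday=3))
    return edges.to_numpy(dtype="datetime64[ns]").astype("int64").tolist()


boundaries = last_thursday_boundaries("2024-01-01", "2024-03-31")
# Three boundaries: 2024-01-25, 2024-02-29 and 2024-03-28.
```

The resulting list of integers is the kind of payload that could be handed to the C++ layer without it needing to understand calendar rules.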
Proposed MVP:
Do not support upsampling, only downsampling.
Present the resampled data back to the user, with no option to write the data back to another symbol directly.
Leverage Pandas to convert rule, origin, and offset into a list of pairs of UTC timestamps stored as int64_t nanoseconds since epoch representing the bucket boundaries.
Pass the closed and label arguments directly through to the clause constructor.
Use the QueryBuilder directly rather than adding syntactic-sugar methods to the Library or NativeVersionStore classes.
Use a "data-driven" approach to empty buckets, i.e. only include buckets in the output for which there was an index value in the appropriate range.
Have a single clause to handle resampling (as opposed to the 2-stage process for hash-based groupings) since the repartition would always be trivial for resampling.
Support static schema only.
No "named agg" equivalent, so only one aggregation is possible per input column.
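To make two of the MVP points concrete (boundary generation via Pandas, and the "data-driven" treatment of empty buckets), here is a minimal Python-side sketch. The helper names bucket_boundaries and non_empty_buckets are hypothetical and not part of any existing API. The sketch assumes fixed-width rules such as "1min", left-closed buckets, and timezone-naive timestamps interpreted as UTC.

```python
import numpy as np
import pandas as pd
from pandas.tseries.frequencies import to_offset


def _to_ns(idx):
    # Convert a DatetimeIndex to int64 nanoseconds since epoch.
    return idx.to_numpy(dtype="datetime64[ns]").astype(np.int64)


def bucket_boundaries(start, end, rule, origin="epoch", offset=None):
    """Return [(left_ns, right_ns), ...] covering start..end (hypothetical helper)."""
    # A two-point probe series lets Pandas work out the bucket grid,
    # including the effect of origin and offset.
    probe = pd.Series(0, index=pd.DatetimeIndex([start, end]))
    left = probe.resample(
        rule, closed="left", label="left", origin=origin, offset=offset
    ).size().index
    # Only valid for fixed-width rules (assumption of this sketch).
    right = left + to_offset(rule)
    return list(zip(_to_ns(left).tolist(), _to_ns(right).tolist()))


def non_empty_buckets(index_ns, pairs):
    """Keep only buckets [left, right) containing at least one index value."""
    index_ns = np.sort(np.asarray(index_ns, dtype=np.int64))
    kept = []
    for left, right in pairs:
        lo = np.searchsorted(index_ns, left, side="left")
        hi = np.searchsorted(index_ns, right, side="left")
        if hi > lo:
            kept.append((left, right))
    return kept


pairs = bucket_boundaries("2024-01-01 09:31:15", "2024-01-01 09:34:50", "1min")
data_index = pd.DatetimeIndex(
    ["2024-01-01 09:31:15", "2024-01-01 09:31:40", "2024-01-01 09:34:50"]
)
kept = non_empty_buckets(_to_ns(data_index), pairs)
```

Here pairs holds four one-minute buckets, while kept holds only the two that actually contain data. In the real design the filtering would presumably happen inside the clause; the Python version is only meant to pin down the intended semantics.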
Wow, this would be such an epic addition to this already awesome library (db), especially with the newly (nearly) added first, last, and count. I would be happy to help with testing. All the best.