Moving Averages #100

Closed
andreierdoss opened this issue Dec 15, 2013 · 12 comments

Comments

@andreierdoss

Hello,

I would like to know if it's possible to calculate moving averages on a set of data or if the data passed in has to already have the averages calculated?

I have been thinking about this, and I would need access to records outside of the current group or filter. If I pass in precalculated moving averages, that data becomes obsolete as soon as a filter is applied from a different chart. Ideally I would like to recalculate the moving averages on the new data set; I just don't know how to access it in the reduce functions.

If it's possible, can you please provide a short example or a hint in the right direction.

Thank you,

@JackStouffer

+1 It would be great if this was a start to other statistical methods as well.

@esjewett

You can certainly calculate averages using the custom reducer. Not sure how that would differ from moving averages? Not to speak for them, but in the past I believe the Crossfilter maintainers have said that specialized custom reducers should be maintained as separate projects and not baked into Crossfilter itself.
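For readers unfamiliar with custom reducers, here is a minimal sketch of a plain (non-moving) average in the shape that Crossfilter's `group.reduce(add, remove, initial)` expects. The record field `value` and the sample records are assumptions for illustration; the functions are exercised directly so the sketch runs without Crossfilter loaded.

```javascript
// Reducer triple in the shape group.reduce(add, remove, initial) expects.
// The `value` field is a hypothetical record property.
function reduceInitial() {
  return { count: 0, sum: 0, avg: 0 };
}

function reduceAdd(p, v) {
  p.count += 1;
  p.sum += v.value;
  p.avg = p.sum / p.count;
  return p;
}

function reduceRemove(p, v) {
  p.count -= 1;
  p.sum -= v.value;
  p.avg = p.count ? p.sum / p.count : 0; // avoid dividing by zero on an empty group
  return p;
}

// With Crossfilter loaded, this would be wired up roughly as:
//   var avgGroup = dateDim.group().reduce(reduceAdd, reduceRemove, reduceInitial);
// Here the reducers are applied by hand to hypothetical records.
var records = [{ value: 10 }, { value: 20 }, { value: 60 }];
var state = records.reduce(reduceAdd, reduceInitial());
console.log(state.avg); // 30

// Simulate a filter removing the last record from the group.
state = reduceRemove(state, records[2]);
console.log(state.avg); // 15
```

Each record lands in exactly one group here, which is why this pattern alone does not give a moving average: a moving average needs each record to count toward several neighbouring groups.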

@esjewett

I see. You need to use the groupAll functionality to do this. It takes arguments similar to reduce, but the functions are passed, and must return, all of the groups. This allows you to do groupings where a single record is relevant to multiple groups, like a moving average.
https://github.com/square/crossfilter/wiki/API-Reference#dimension_groupAll
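To make "a single record is relevant to multiple groups" concrete, here is a small library-free sketch. It is not Crossfilter's or Reductio's implementation; the function names, the ISO date strings, and the sample data are all assumptions. Each record is added to every date whose trailing window contains it, and each date's average is then taken over everything that landed in its group.

```javascript
// dates: sorted array of distinct date strings.
// A record on dates[idx] falls inside the trailing windows of dates[idx]
// through dates[idx + windowSize - 1], so those are the group keys it maps to.
function windowKeys(dates, idx, windowSize) {
  return dates.slice(idx, idx + windowSize);
}

function movingAverages(records, windowSize) {
  // ISO 'YYYY-MM-DD' strings sort correctly with the default string sort.
  var dates = Array.from(new Set(records.map(function (r) { return r.date; }))).sort();
  var indexOfDate = new Map(dates.map(function (d, i) { return [d, i]; }));
  var groups = new Map(dates.map(function (d) { return [d, { count: 0, sum: 0 }]; }));

  records.forEach(function (r) {
    windowKeys(dates, indexOfDate.get(r.date), windowSize).forEach(function (key) {
      var g = groups.get(key);
      g.count += 1;
      g.sum += r.value;
    });
  });

  return dates.map(function (d) {
    var g = groups.get(d);
    return { key: d, avg: g.sum / g.count };
  });
}

// Hypothetical sample data.
var sample = [
  { date: '2015-01-01', value: 10 },
  { date: '2015-01-02', value: 20 },
  { date: '2015-01-03', value: 30 },
  { date: '2015-01-04', value: 40 }
];
var result = movingAverages(sample, 2);
console.log(result.map(function (g) { return g.avg; })); // [ 10, 15, 25, 35 ]
```

The window here counts dates that actually appear in the data, not calendar days, and the first dates average over a shorter window because no earlier data exists.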

On Fri, Jan 23, 2015 at 7:51 PM, Jack Stouffer notifications@github.com
wrote:

According to http://stackoverflow.com/a/21707254/923933 moving averages
are not currently possible with the base api and would require new
functionality.



@esjewett

I've just updated reductio to support arbitrary aggregations across arbitrarily defined windows as well as multi-value classification scenarios. The example is just a count aggregation, but it should work with average and the more complex aggregations as well. If you'd like to try it out: https://github.com/esjewett/reductio#aggregations-groupall-aggregations

@JackStouffer

This library looks very useful, thanks. I will comment again when I have a working example of a moving average.

@JackStouffer

Here is the solution with the above library. Unfortunately, it proved far too slow with a large number of records. I guess that's one of the limitations of working in JS.

Also, bonus support for dc.js

// this assumes your data is in the form of
// (cs_data below is presumably the crossfilter instance built from it)

data = [
{value:1000, date: new Date('2015-01-01')}
...
]

var date_array = [];
var mapped_date_array = [];

var dateDim = cs_data.dimension(function(d) {return d.date;});
var shipments = dateDim.group().reduceSum(dc.pluck('value'));

// get a list of all of the dates used
var shipments_infinity = shipments.top(Infinity);
var i = 0;
for (i=0; i < shipments_infinity.length; i++) {
    date_array.push(shipments_infinity[i].key);
}
date_array.sort(function (date1, date2) {
    if (date1 > date2) return 1;
    if (date1 < date2) return -1;
    return 0;
})
// this is needed for indexOf, which uses strict equality and so compares
// Date objects by reference; compare their date strings instead
mapped_date_array = date_array.map(function(e) { return e.toDateString(); });

var reducer = reductio().groupAll(function(record) {
    var idx = mapped_date_array.indexOf(record.date.toDateString());

    if (record.date < date_array[30]) {
        // records in the first 30 dates are only assigned to their own date
        return [date_array[idx]];
    } else {
        var i = 0;
        var return_array = [];
        // we are finding the 30 day moving avg: assign this record to its own
        // date and the 30 preceding dates in date_array
        for (i = 30; i >= 0; i--) {
            return_array.push(date_array[idx - i]);
        }

        return return_array;
    }
}).count(true).sum(function(d) { return d.value; }).avg(true);

var shipments_moving_avg = dateDim.groupAll();
reducer(shipments_moving_avg);

// dc.js requires the all() function to exist, supplement this with underscore.js
_.extend(shipments_moving_avg, {all: function () {return this.value()} })

someChart.width(1000).height(300)
    .dimension(dateDim)
    .group(shipments)
    .stack(shipments_moving_avg, function (d) {return d.value.avg;});

@esjewett

Glad it sort of worked. How many records of this form are we talking about? If in the 1000s or low 10000s, we can probably optimize. I'm not sure Reductio does this as efficiently as possible, so if you have a test dataset I can look at it. Probably best to discuss that over in the Reductio issues. Over the low 10000s of records, it might just not be manageable.

@JackStouffer

Yeah, unfortunately my data contains 110,353 records and grows by about 50-100 every day. I simplified the data for this example, but the real data has eight columns, which brings the total size to about 16 MB.

@esjewett

Are you within an order of magnitude on performance? If you switch your data to use strings for the date dimension instead of Date objects and optimize the Reductio groupAll function, that might buy you a 10x improvement.
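One way to read the suggestion above, sketched under assumptions: convert each Date to a string key once, and replace the per-record `indexOf` scan over date strings with a precomputed lookup. The helper names `toDayKey` and `buildDayIndex` and the sample values are hypothetical, and nothing here has been benchmarked against the 110,353-record data set.

```javascript
// Hypothetical helper: turn a Date into a 'YYYY-MM-DD' key.
// toISOString() reports UTC, so this assumes dates are stored as UTC midnights.
function toDayKey(date) {
  return date.toISOString().slice(0, 10);
}

// Build the key -> position lookup once, instead of calling
// indexOf (a linear scan) for every record passed to the grouping function.
function buildDayIndex(sortedDayKeys) {
  var index = new Map();
  sortedDayKeys.forEach(function (key, i) { index.set(key, i); });
  return index;
}

// Hypothetical sample values.
var dayKeys = ['2015-01-01', '2015-01-02', '2015-01-03'];
var dayIndex = buildDayIndex(dayKeys);

var record = { value: 1000, date: new Date('2015-01-02T00:00:00Z') };
record.day = toDayKey(record.date); // precompute once, e.g. when loading the data
var idx = dayIndex.get(record.day);
console.log(record.day, idx); // 2015-01-02 1
```

With a string field such as `record.day` stored on each record, the dimension and the grouping function could both work on strings, which also sidesteps comparing Date objects by reference.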

@JackStouffer

I created a new issue for this discussion: crossfilter/reductio#12

@JackStouffer

Close with no comment?

@RandomEtc
Collaborator

Sorry - your last comment seemed to be directing people to a new discussion so I didn't have anything else to add.

See #151 for discussion of future of Crossfilter. I've updated square/crossfilter's README to indicate that the crossfilter org will be actively maintaining a fork. I figured you were already aware since you linked to reductio. Apologies for misunderstanding.

jdar pushed a commit to jdar/crossfilter that referenced this issue Nov 19, 2019
Important because it displays on NPM.