Adding write-time aggregation functions #35
Comments
If it were done simply has a simple
But I don't mind placing such code here either. Your call.
I like the idea and I think having t = TrackUsage(app, [PrintStorage(hooks=[summarizer]), AnotherStore()]) We could then support multiple storages for different things and allow them to have their own hooks. The hooks would use their parent storage's methods to do their work. It wouldn't be a hard change to make, but I feel strongly about not keeping that in the 1.x release
From #34:
The only time I'd say having |
Multiple storage merged in https://github.com/ashcrow/flask-track-usage/tree/2.0.dev0. Feel free to do work on top of this and I'll merge it in there.
Sounds like a great plan!
Prior to writing any code, I've started writing a rough draft of the docs. I do that sometimes to set up an overall feel of how it could work. Here is the first draft (on my fork): https://github.com/JohnAD/flask-track-usage/blob/2.0.dev0/docs/hooks.rst Sound like a good overall direction? I'm thinking that when "outside" hooks are written by an end user, a simple set of standard **kwargs is passed to the function. But when "internal hooks" are referenced (as documented on that
I'm about 30% done. It will not be possible, at least at first, to have summaries supported on all storage classes. I'm designing it to gracefully handle that. But for the first release of 2.0, the MongoEngineStorage class will definitely work (due to my own self-interest). Which other storage class would you like to see fully supporting all seven summaries?
The most used storage classes seem to be |
Adds time write time aggregate functions and hooks. Closes #35
Merged this work into the 2.0.0 branch!
Typically, on a web site, the usage data is written lightly and as fast as possible to a log. Then a secondary process (and possibly a secondary server) does statistical analysis on that data. This totally makes sense for storage in a traditional database such as SQL and certainly for an Apache-style text log.
Philosophically, however, MongoDB (and related NoSQL databases) can take a different approach. They are designed for scalability using a variety of techniques, including an emphasis on read optimization at the expense of write optimization. The lack of normalization, for example, makes it expensive to update (write) certain data because such updates might have to occur across many documents. But in exchange for that, a read of any one document in a collection need never reference another document because all the important information is already gathered.
Sorry to be so windy, but I want to justify my crazy idea. :) And that is this:
Rather than just write a single document to a collection on each page response, also allow aggregate updates on other documents at the same time.
For example, it is common in web log analysis to record the number of visits to each page over certain periods of time, say hourly, daily, and monthly. So, when flask-track-usage records a response, it could also upsert the url/datetime/period documents corresponding to it with incremented totals. In this example, it would update 3 additional documents.
This would be implemented as an option, of course. There would be scenarios where such aggregate work would be a bad idea or pointless.
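To make the bookkeeping concrete, here is a minimal sketch of the three extra updates per response. Everything in it is an assumption for illustration: the period names, the bucket formats, and the helper names `bucket_keys` and `record_hit` are hypothetical and not part of flask-track-usage, and a `Counter` stands in for the database.

```python
from collections import Counter
from datetime import datetime

# Hypothetical period names and bucket formats (assumed for this sketch).
PERIOD_FORMATS = {
    "hourly": "%Y-%m-%dT%H",
    "daily": "%Y-%m-%d",
    "monthly": "%Y-%m",
}


def bucket_keys(url, when):
    """Return one (period, url, bucket) key per period for a single hit."""
    return [
        (period, url, when.strftime(fmt))
        for period, fmt in PERIOD_FORMATS.items()
    ]


def record_hit(totals, url, when):
    """Increment the three aggregate counters touched by one response."""
    for key in bucket_keys(url, when):
        # In-memory stand-in for "upsert this document and increment its total".
        totals[key] += 1


totals = Counter()
record_hit(totals, "/index", datetime(2018, 3, 5, 14, 30))
record_hit(totals, "/index", datetime(2018, 3, 5, 15, 10))
```

Against MongoDB, the increment inside `record_hit` would presumably become an upsert with an `$inc` update on a document identified by the same three fields, so no read is needed before the write.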
One possibility is to have it done as a post-storage function call. For example:
Thoughts?