I am developing a native Piwik plugin to enable tracking of user interaction and pointer movements. The name and features of this plugin make it look very similar to ClickHeat (#73), but there are actually some major differences between the two.
ClickHeat is a standalone script/application while HeatMap is going to be a native Piwik plugin.
It will use Piwik's own tracking codes/API, data structures and procedures to provide the features.
This plugin aims to generate visual maps for pointer movements in addition to click maps. Additional event tracking and data representations will be implemented as per suggestions from the community.
As I develop this plugin, I would appreciate your guidance on these areas:
Which tracker method (and parameters) should be used to record pointer movement coordinates (as a serialized array)? Ideally, this should be done on the unload event. It seems that using existing methods such as logPageView or logGoal would duplicate other metrics. How do we work around this?
By using Piwik's own tracker code, is it possible to POST captured data instead of GET requests? There's a limit on URL length in Apache, and it seems like our captured data often hits that limit when tracked with GET requests.
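To illustrate the size problem, here is a minimal sketch in Python (not Piwik tracker code) of the decision involved: build the query string, and fall back to a POST body when the full URL would be too long. The 2000-character limit, the `hm_moves` parameter name and the example host are assumptions made for illustration only.

```python
import json
from urllib.parse import urlencode

# Assumed conservative limit for illustration; not an Apache or Piwik constant.
MAX_URL_LENGTH = 2000


def choose_method(base_url, params):
    """Return (method, url, body): GET if the full URL fits, otherwise POST."""
    query = urlencode(params)
    url = base_url + "?" + query
    if len(url) <= MAX_URL_LENGTH:
        return ("GET", url, None)
    # Same form-encoded payload, moved from the URL into the request body.
    return ("POST", base_url, query.encode("ascii"))


# 500 captured [x, y] points, serialized compactly as JSON.
points = [[x, x * 2] for x in range(500)]
method, url, body = choose_method(
    "http://example.org/piwik.php",
    {"idsite": 1, "hm_moves": json.dumps(points, separators=(",", ":"))},  # hm_moves is hypothetical
)
print(method)
```

With a payload of this size the sketch picks POST; a handful of points would still fit in a GET request.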
How about archiving (blob format)? Should generated maps be saved as images? Or should we store the raw data as compressed streams? In the latter case, will there be any performance hit? We have to run queries on huge amounts of data. When this data is in a MySQL table, it works fine. But how do we achieve the same performance (and memory utilization) when this data is stored in blob format?
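To make the "compressed stream" option concrete, here is a minimal sketch in Python (not Piwik archiving code) that serializes a list of coordinates and compresses it into a single blob. The JSON encoding and the helper names `pack_points` and `unpack_points` are assumptions for illustration. Note the trade-off it exposes: the whole blob must be decompressed before any point can be queried.

```python
import json
import zlib


def pack_points(points):
    """Serialize [x, y] pairs to compact JSON and compress them into one blob."""
    raw = json.dumps(points, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw, 9)


def unpack_points(blob):
    """Decompress a blob and restore the original list of [x, y] pairs."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Synthetic coordinates within a 1280x1024 screen.
points = [[x % 1280, (x * 7) % 1024] for x in range(10000)]
blob = pack_points(points)
restored = unpack_points(blob)

raw_size = len(json.dumps(points, separators=(",", ":")))
print(raw_size, len(blob))
```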
"visual maps for pointer movements": I think this is similar to ClickTale's 'mouse move heatmaps'? http://www.clicktale.com/product/mouse_move_heatmaps
I'm not going to discuss JS tracking here, as I am not sure what the requirements are.
Performance is critical for this type of data storage and processing. I'm not sure why you are considering doing this from scratch rather than modifying (or reusing) the GPL'ed ClickHeat. Doing it from scratch is a major piece of work, probably at least 3 weeks (mockups, spec, development, i18n, docs, etc.).
ClickHeat works fine currently (I believe?), apart from 2 main issues
Regarding clean UI integration in Piwik, it would be best to submit mockups of your vision that we could discuss here with the community.
Regarding scalability, I think that while raw data would be stored in MySQL tables (I think ClickHeat is using files at the moment), these data sets would need to be aggregated daily (and summed for weekly/monthly/yearly aggregates), similar to other data sets in Piwik. Aggregation would ideally reuse the existing algorithms (archiveProcessing) and add any required helpers to that class.
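As a rough illustration of "summed for weekly/monthly/yearly aggregates", here is a minimal sketch in Python. It is not the archiveProcessing code; it assumes each daily aggregate is a sparse mapping from a grid cell to a hit count, and the function name `sum_periods` is hypothetical.

```python
from collections import Counter


def sum_periods(daily_maps):
    """Sum sparse per-day maps of {(cell_x, cell_y): hits} into one period map."""
    total = Counter()
    for day in daily_maps:
        total.update(day)  # Counter.update adds counts instead of replacing them
    return dict(total)


monday = {(0, 0): 3, (1, 2): 1}
tuesday = {(0, 0): 2, (5, 5): 4}
week = sum_periods([monday, tuesday])
print(week)
```

Because the weekly map is built only from daily aggregates, the raw rows never need to be re-read for longer periods.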
You mention a problem with reading a click map data set from a blob. Would you store one integer (e.g. intensity) for each pixel on the screen? Maybe a lower resolution would be enough: if you average over 8 pixels, you would have, e.g., 1280*1024 / 8 * 20% hovered/clicked pixels = 32768 data points. If you store a simple array, this would be quite cheap and fast to read. Thoughts?
Maps should always be generated from the 'aggregated' data sets. Of course, these images would be cached to ensure reloading is fast. Generating and caching these images could also be done at archiving time to make UI access very fast.
Which segments do you want/need to process your data sets against? I.e., do you have a different map for each resolution, browser, country, etc., or just one single map? Segmenting is difficult and can quickly multiply the processing time by a factor of 10 or 100. Maybe at first no segmentation should be done (apart from DATE + URL)?
A feature to enable/disable recording would be nice. It is only necessary to record one day from time to time, which ensures the DB doesn't get bloated with data and the archiving process doesn't have to process mouse moves + clicks every day (this would potentially be pretty slow). This could be coupled with a simple threshold on the 'number of pageviews tracked for this URL', for example a maximum of 500 or 1000 sessions tracked per URL per day.
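The estimate above can be checked with a short sketch in Python. The 4x2 pixel cell shape is an assumption (any cell covering 8 pixels gives the same count), and `bin_points` is a hypothetical helper that stores only the cells that were actually hit.

```python
# Assumed cell shape: 4 x 2 = 8 pixels per cell.
CELL_W, CELL_H = 4, 2


def cell_of(x, y):
    """Map a pixel coordinate to its lower-resolution grid cell."""
    return (x // CELL_W, y // CELL_H)


def bin_points(points):
    """Count hits per cell, keeping only cells that were hovered or clicked."""
    counts = {}
    for x, y in points:
        key = cell_of(x, y)
        counts[key] = counts.get(key, 0) + 1
    return counts


total_cells = (1280 * 1024) // (CELL_W * CELL_H)  # 163840 cells
active_cells = total_cells * 20 // 100            # 20% of them: 32768 data points

cells = bin_points([(0, 0), (3, 1), (4, 0)])
print(total_cells, active_cells, cells)
```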
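A minimal sketch of such an on/off switch combined with a per-URL daily cap, written in Python for illustration only. The class name `RecordingLimiter` and the in-memory counter are assumptions; a real plugin would need to persist the counts somewhere shared.

```python
from collections import defaultdict


class RecordingLimiter:
    """Decide whether a session should be recorded for a given URL and day."""

    def __init__(self, max_sessions_per_url=500, enabled=True):
        self.max_sessions_per_url = max_sessions_per_url
        self.enabled = enabled
        self._counts = defaultdict(int)  # (url, day) -> sessions recorded so far

    def should_record(self, url, day):
        if not self.enabled:
            return False
        key = (url, day)
        if self._counts[key] >= self.max_sessions_per_url:
            return False  # daily cap reached for this URL
        self._counts[key] += 1
        return True


limiter = RecordingLimiter(max_sessions_per_url=2)
results = [limiter.should_record("/home", "2010-01-01") for _ in range(3)]
print(results)
```

The cap is counted per URL and per day, so a new day (or a different URL) starts from zero again.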
Also a similar feature to choose which URLs should be tracked would be great, ie. only track clicks/moves on homepage and product page. This could be done at the JS level or PHP level (better do it at JS level if possible).
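The URL filter itself could be as simple as the following sketch. It is written in Python only to show the logic (the same check would be done in JS in the tracker); the set `TRACKED_PATHS` and the function `is_tracked` are hypothetical names.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: only the homepage and the product page are tracked.
TRACKED_PATHS = {"/", "/product"}


def is_tracked(url):
    """Return True if clicks/moves on this URL should be recorded."""
    path = urlsplit(url).path or "/"  # treat an empty path as the homepage
    return path in TRACKED_PATHS


print(is_tracked("http://example.org/product?id=3"))
print(is_tracked("http://example.org/about"))
```

Matching on the path only means query strings such as `?id=3` do not create separate entries.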
GadgetMaster, any update regarding this ticket?