Piwik Analytics using Snowplow Tracker and Collector #6883
Snowplow's CloudFront collector is the most popular Snowplow collector. It is robust and scalable because it leverages Amazon's cloud infrastructure.
The Snowplow tracking pixel is served from CloudFront. The Snowplow tracker requests the pixel with a GET and appends the data to be logged to the query string of that request. With CloudFront logging enabled, each request (including the query string) is logged to S3 along with some additional data provided by CloudFront, e.g. the requester's IP address and URL.
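To make the pixel request concrete, here is a minimal sketch of how a tracker might build the GET URL. The distribution hostname, the `/i` path, and the parameter names (`e`, `page`, `url`) are illustrative assumptions, not the actual Snowplow tracker protocol.

```python
from urllib.parse import urlencode

# Hypothetical CloudFront distribution that serves the 1x1 tracking pixel.
PIXEL_URL = "https://d1234example.cloudfront.net/i"


def build_pixel_request(event: dict) -> str:
    """Return the pixel URL with the event data appended as a query string."""
    return f"{PIXEL_URL}?{urlencode(event)}"


# Illustrative page-view event; the field names are assumptions.
pixel_request = build_pixel_request(
    {"e": "pv", "page": "Home page", "url": "http://example.com/"}
)
```

Nothing beyond this request is needed on the collection side: CloudFront serves the pixel and writes the full query string to its access log.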
Hey there! Alex from the Snowplow project here.
From what I have read so far, it would be possible to import the standard logs generated by CloudFront into Piwik (PW). That means super-fast data collection.
Sending the data collected on a page straight back to Piwik for insertion into the database is probably a relatively expensive operation, unless you buffer what the browser sends. Instead, the data would be collected in CloudFront logs, then parsed and inserted into Piwik on demand, for example during a nightly process.
That means your server is not touched at all while users browse your website. Of course, there will be no real-time statistics, but I'm happy to trade that for a reliable and fast alternative. It is probably why most of us use Google Analytics: to leverage Google's cloud infrastructure. It's scary to know that every user action in the browser causes an insert into your database.
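The parsing half of the nightly import described above could look roughly like this. It is a sketch under assumptions: it relies on the `#Fields:` header line and tab-separated rows of CloudFront standard logs, the sample uses an abbreviated field list (real logs carry more columns), and the step that inserts rows into Piwik is left out entirely.

```python
from urllib.parse import parse_qs


def parse_cloudfront_log(lines):
    """Yield one dict per logged request, keyed by the names in the #Fields header."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            # The header lists the column names, separated by spaces.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#"):
            continue  # skip blank lines and other comment lines such as #Version
        row = dict(zip(fields, line.split("\t")))
        # CloudFront writes "-" when a request has no query string.
        query = row.get("cs-uri-query", "-")
        row["event"] = (
            {} if query == "-" else {k: v[0] for k, v in parse_qs(query).items()}
        )
        yield row


# Abbreviated, made-up sample; not a real log excerpt.
SAMPLE_LOG = [
    "#Version: 1.0",
    "#Fields: date time c-ip cs-method cs-uri-stem sc-status cs-uri-query",
    "2013-01-01\t12:00:00\t203.0.113.7\tGET\t/i\t200\te=pv&page=Home",
]

rows = list(parse_cloudfront_log(SAMPLE_LOG))
```

Each row then carries both the CloudFront-provided fields (such as `c-ip`) and the decoded tracker data under `event`, which is what a batch importer would map onto Piwik visits and actions.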