Setting up alternative data stores
HOME > SNOWPLOW SETUP GUIDE > Step 4: setting up alternative data stores

SnowPlow supports storing your data in several data stores:
| Storage | Description | Status |
|---|---|---|
| S3 | Data is stored in the S3 file system, where it can be analysed using EMR (e.g. Hive, Pig, Mahout) | Production-ready |
| Redshift | A columnar database offered as a service on AWS. Optimized for performing OLAP analysis. Scales to petabytes | Production-ready |
| Infobright | An open source columnar database accessible via the MySQL JDBC driver, so it is compatible with a wide range of analytics tools. Optimized for performing OLAP analysis. Scales to terabytes | Production-ready |
By setting up the EmrEtlRunner (in the previous step), you are already successfully loading your data into S3 where it is accessible to EMR for analysis.
If you wish to analyse your data using a wider range of tools (e.g. BI tools like ChartIO), you will want to load your data into a columnar database like Infobright to enable use of these tools.
The StorageLoader is an application that makes it simple to keep an updated copy of your data in multiple data stores, including Infobright. Setting up SnowPlow so that you can maintain a copy of your data in a database like Infobright is a two-step process:
- Create a database and table in Infobright for the data
- Set up the StorageLoader so that it regularly updates that table with the latest data from S3
Select the appropriate option below to walk through the steps necessary to set up SnowPlow with one of the following data stores:
- Set up Redshift to work with SnowPlow
- Set up Infobright to work with SnowPlow
- Set up SkyDB to work with SnowPlow (coming soon)
After you have set up one or more of the above databases, you need to:
- Set up the StorageLoader to regularly transfer SnowPlow data into your new store
Copyright © 2012-2014 Snowplow Analytics Ltd