Release 105 Pompeii (2018-05-07)
--------------------------------
Stream Enrich: bump to 0.16.1 (#3748)
Stream Enrich: ensure a one-to-one relationship between sink and record processor (#3745)
Stream Enrich: force jackson-databind to 2.9.3 (#3744)
Scala Common Enrich: update WeatherEnrichmentSpec (#3749)
Release 104 Stoplesteinan (2018-04-30)
--------------------------------------
Common: remove trailing hyphen from CHANGELOG entry for R103 (#3731)
EmrEtlRunner: fail fast when trying to skip staging or enrich in stream enrich mode (#3726)
EmrEtlRunner: factor out steps-generating function (#3718)
EmrEtlRunner: uncompress enriched files when copying to HDFS (#3719)
EmrEtlRunner: bump to 0.32.0 (#3723)
EmrEtlRunner: fix srcPattern for copying stream enriched data to HDFS (#3722)
EmrEtlRunner: check if whole enriched.good is non-empty in stream enrich mode (#3717)
Release 103 Paestum (2018-04-17)
--------------------------------
Scala Common Enrich: bump to 0.32.0 (#3673)
Scala Common Enrich: bump scala-maxmind-iplookups to 0.4.0 (#3675)
Scala Common Enrich: update IP Lookups Enrichment to support non-legacy database (#3672)
Scala Common Enrich: support extraction of IP addresses in the Forwarded header (#3475)
Scala Common Enrich: support IPv6 addresses in the IpAddressExtractor (#3474)
Scala Common Enrich: bump mandrill event versions to 1-0-1 (#3372)
Stream Enrich: bump to 0.16.0 (#3698)
Stream Enrich: bump scala-common-enrich to 0.32.0 (#3676)
Stream Enrich: force jackson-dataformat-cbor to 2.9.3 (#3701)
Spark Enrich: bump to 1.13.0 (#3705)
Spark Enrich: bump scala-common-enrich to 0.32.0 (#3674)
Spark Enrich: downgrade geoip2 to 2.5.0 (#3702)
Clojure Collector: bump to 2.0.0 (#3708)
Clojure Collector: make Flash access domains and secure configurable (#2914)
Clojure Collector: retrieve configuration only through JVM properties (#3709)
Clojure Collector: allow HTTP repositories (#3707)
Clojure Collector: add CI/CD (#3712)
Config: update database value in config/enrichments/ip_lookups.json (#3671)
EmrEtlRunner: update spark_enrich version in config.yml.sample to 1.13.0 (#3710)
Common: rename Caravel to Superset (#3595)
Common: redirect support request to discourse in CONTRIBUTING.md (#3478)
Release 102 Afontova Gora (2018-04-03)
--------------------------------------
EmrEtlRunner: bump to 0.31.0 (#3679)
EmrEtlRunner: add ability to skip load_manifest_check (#3680)
EmrEtlRunner: add CI/CD to update AMI bootstrap scripts (#3683)
EmrEtlRunner: add stream_config.yml.sample (#3685)
EmrEtlRunner: add support for shredding from Kinesis S3 Loader's enriched event output (#3606)
EmrEtlRunner: add bootstrap action to prepare AMI 5.x for Snowplow (#3601)
EmrEtlRunner: recover from RestClient::ServiceUnavailable when making status checks (#3539)
EmrEtlRunner: recover from RestClient::RequestTimeout when making status checks (#3468)
EmrEtlRunner: launch bootstrap action for AMI 5.x (#3609)
EmrEtlRunner: pass processing manifest config to RDB Shredder (#3619)
EmrEtlRunner: fail fast in build script (#3684)
EmrEtlRunner: fail fast on duplicated storage target id (#3652)
EmrEtlRunner: do not rescue on Exception (#3577)
Redshift: remove duplicate create events table comment (#3643)
Release 101 Neapolis (2018-03-21)
---------------------------------
Scala Stream Collector: bump to 0.13.0 (#3682)
Scala Stream Collector: add Google Cloud PubSub sink (#3047)
Scala Stream Collector: split into multiple artifacts according to targeted platform (#3621)
Scala Stream Collector: expose number of requests over JMX (#3637)
Scala Stream Collector: move cross domain configuration to enabled-style (#3556)
Scala Stream Collector: truncate events exceeding the configured maximum size into a BadRow (#3587)
Scala Stream Collector: remove string interpolation false positive warnings (#3623)
Scala Stream Collector: update config.hocon.sample to support Google Cloud PubSub (#3049)
Scala Stream Collector: customize useragent for GCP API calls (#3658)
Scala Stream Collector: bump kafka-clients to 1.0.1 (#3660)
Scala Stream Collector: bump aws-java-sdk to 1.11.290 (#3665)
Scala Stream Collector: bump scala-common-enrich to 0.31.0 (#3666)
Scala Stream Collector: bump SBT to 1.1.1 (#3629)
Scala Stream Collector: bump sbt-assembly to 0.14.6 (#3667)
Scala Stream Collector: use sbt-buildinfo (#3626)
Scala Stream Collector: extend copyright notice to 2018 (#3687)
Stream Enrich: bump to 0.15.0 (#3681)
Stream Enrich: add Google Cloud PubSub source (#3150)
Stream Enrich: add Google Cloud PubSub sink (#3149)
Stream Enrich: split into multiple artifacts according to targeted platform (#3645)
Stream Enrich: rename etl version from kinesis to stream-enrich (#3642)
Stream Enrich: make source / sink configuration a coproduct (#3555)
Stream Enrich: add ability to retrieve resolver and enrichments from Google Cloud Datastore (#3152)
Stream Enrich: update config.hocon.sample to support Google Cloud PubSub (#3151)
Stream Enrich: customize useragent for GCP API calls (#3193)
Stream Enrich: bump kafka-clients to 1.0.1 (#3661)
Stream Enrich: bump amazon-kinesis-client to 1.9.0 (#3663)
Stream Enrich: bump aws-java-sdk to 1.11.290 (#3662)
Stream Enrich: bump SBT to 1.1.1 (#3657)
Stream Enrich: bump sbt-assembly to 0.14.6 (#3664)
Stream Enrich: use sbt-buildinfo (#3627)
Stream Enrich: extend copyright notice to 2018 (#3686)
Common: install Ruby 2.4.3 before deploy (#3689)
Common: fix CHANGELOG entry for R97 (#3630)
Release 100 Epidaurus (2018-02-26)
----------------------------------
Redshift: widen se_label to 4,096 to support URLs etc (#196)
Redshift: widen sensitive columns in atomic.events to support pseudonymization (#3528)
Scala Common Enrich: add PII Enrichment (#3472)
Scala Common Enrich: apply automated code formatting (#3532)
Scala Common Enrich: bump commons-codec to 1.11 (#3638)
Scala Common Enrich: bump to 0.31.0 (#3598)
Scala Common Enrich: remove unused version member in Enrichment trait (#3541)
Scala Common Enrich: use automated code formatting (#3496)
Stream Enrich: bump scala-common-enrich to 0.31.0 (#3597)
Stream Enrich: bump to 0.14.0 (#3596)
Stream Enrich: use generated Settings for version in test (#3604)
Release 99 Carnac (2018-01-25)
------------------------------
Scala Common Enrich: bump to 0.30.0 (#3562)
Scala Common Enrich: add adapter for Google Analytics (#3560)
Scala Common Enrich: extend copyright notice to 2018 (#3574)
Spark Enrich: bump to 1.12.0 (#3565)
Spark Enrich: bump scala-common-enrich to 0.30.0 (#3563)
Spark Enrich: add tests for the Google Analytics adapter (#3561)
Spark Enrich: extend copyright notice to 2018 (#3573)
Spark Enrich: change Twitter repository url to https (#3593)
EmrEtlRunner: update spark_enrich version in config.yml.sample to 1.12.0 (#3566)
Common: extend copyright notice to 2018 in READMEs (#3575)
Release 98 Argentomagus (2018-01-05)
------------------------------------
Scala Stream Collector: bump to 0.12.0 (#3548)
Scala Stream Collector: make Flash access domains and secure configurable (#2915)
Scala Stream Collector: add URL redirect replacement macro (#3491)
Scala Stream Collector: allow use of the originating scheme during cookie bounce (#3512)
Scala Stream Collector: replace Location header with RawHeader to preserve double encoding (#3546)
Scala Stream Collector: bump nsq-java-client to 1.2.0 (#3519)
Scala Stream Collector: document the stdout sink better (#3515)
Scala Stream Collector: fix stdout sink configuration (#3550)
Scala Stream Collector: fix scaladoc for 'ipAndPartitionKey' (#3513)
Stream Enrich: bump to 0.13.0 (#3549)
Stream Enrich: bump scala-common-enrich to 0.29.0 (#3553)
Stream Enrich: bump nsq-java-client to 1.2.0 (#3520)
Scala Common Enrich: bump to 0.29.0 (#3552)
Scala Common Enrich: add validation of tracker-sent timestamps (#336)
Scala Common Enrich: add validation of collector_tstamp (#3416)
Redshift: update version of atomic.events to 0.9.0 (#3517)
Common: trigger the publishing of Stream Enrich when it is under test (#3557)
Release 97 Knossos (2017-12-18)
-------------------------------
Common: reenable publishLocal in travis for spark enrich tests to pass (#3516)
Common: rename AWS deployment credentials in .travis.yml (#3115)
EmrEtlRunner: add ability to skip RDB Loader consistency check (#3529)
EmrEtlRunner: bump to 0.30.0 (#3526)
EmrEtlRunner: uncompress gzipped raw files when copying to HDFS (#3525)
EmrEtlRunner: update spark_enrich version in config.yml.sample to 1.11.0 (#3002)
Scala Common Enrich: add adapter to pre-process Olark events (#1014)
Scala Common Enrich: add adapter to pre-process Mailgun webhooks (#2734)
Scala Common Enrich: add adapter to pre-process StatusGator webhooks (#2169)
Scala Common Enrich: add adapter to pre-process Unbounce webhooks (#2615)
Scala Common Enrich: add function to camelCase all JSON fields in Adapter (#3538)
Scala Common Enrich: bump user-agent-utils to 1.20 (#2930)
Scala Common Enrich: default port to 443 if scheme is https (#3483)
Scala Common Enrich: make enrichments.ExtractEventTypeSpec timezone-safe (#3481)
Scala Common Enrich: remove toSecond parameter in Adapter (#3534)
Scala Common Enrich: tolerate content type for GET requests sent to Clojure Collector (#2743)
Scala Common Enrich: bump to 0.28.0 (#2725)
Spark Enrich: add test for Mailgun Adapter (#2763)
Spark Enrich: add test for Olark Adapter (#2792)
Spark Enrich: add test for StatusGator Adapter (#2722)
Spark Enrich: add test for Unbounce Adapter (#2745)
Spark Enrich: bump to 1.11.0 (#3533)
Spark Enrich: fix tests that fail when running on an alternative iglu service (#3503)
Spark Enrich: fix tests that fail with error when running on a platform that doesn't have native-lzo (#3508)
Spark Enrich: improve error message in test to show index line (#3494)
Spark Enrich: bump scala-common-enrich to 0.28.0 (#2724)
Release 96 Zeugma (2017-11-21)
------------------------------
Scala Stream Collector: bump to 0.11.0 (#3433)
Scala Stream Collector: update config.hocon.sample to support NSQ (#3294)
Scala Stream Collector: add NSQ sink (#2093)
Scala Stream Collector: make Kinesis, Kafka and NSQ config a coproduct (#3449)
Scala Stream Collector: keep sending records when the Kinesis stream is resharding (#3453)
Stream Enrich: bump to 0.12.0 (#3432)
Stream Enrich: update config.hocon.sample to support NSQ (#3339)
Stream Enrich: add NSQ sink (#3337)
Stream Enrich: add NSQ source (#3336)
Common: decorrelate CI/CD for Scala Stream Collector and Stream Enrich (#3441)
Release 95 Ellora (2017-11-13)
------------------------------
Redshift: add migration script for 0.8.0 to 0.9.0 (#3440)
Redshift: widen domain_sessionidx column in atomic.events from smallint to integer (#1788)
Redshift: update atomic.events to use ZSTD compression (#3435)
EmrEtlRunner: bump to 0.29.0 (#3469)
EmrEtlRunner: reintroduce processing directory not empty no-op (#3458)
EmrEtlRunner: retrieve the correct latest run ID during archive_shredded step (#3436)
EmrEtlRunner: fix pagination issue when retrieving latest run id (#3434)
EmrEtlRunner: update rdb_loader version in config.yml.sample to 0.14.0 (#3418)
EmrEtlRunner: update rdb_shredder version in config.yml.sample to 0.13.0 (#3460)
EmrEtlRunner: update spark_enrich version in config.yml.sample to 1.10.0 (#3461)
EmrEtlRunner: bump AMI version in example config to 5.9.0 (#3465)
EmrEtlRunner: force bundler 1.15.4 during CI/CD (#3493)
Spark Enrich: overwrite output datasets (#3443)
Spark Enrich: bump to 1.10.0 (#3428)
Spark Enrich: add test for Cloudfront Sep 2016 (#3000)
Spark Enrich: bump scala-common-enrich to 0.27.0 (#3427)
Spark Enrich: bump Spark to 2.2.0 (#3466)
Scala Common Enrich: bump to 0.27.0 (#3429)
Scala Common Enrich: add support for new field in CloudFront access logs (#2933)
Config: add GCP mirror into config/iglu_resolver.json (#3430)
Storage: replace example Postgres storage target configuration with 1-1-0 (#3463)
Storage: replace example Redshift storage target configuration with 2-1-0 (#3462)
Data modeling: remove web model (#3471)
Release 94 Hill of Tara (2017-10-10)
------------------------------------
Stream Enrich: bump to 0.11.1 (#3454)
Stream Enrich: keep sending records when the Kinesis stream is resharding (#3452)
Release 93 Virunum (2017-10-03)
-------------------------------
Scala Stream Collector: bump to 0.10.0 (#3424)
Scala Stream Collector: replace spray by akka-http (#3299)
Scala Stream Collector: replace argot by scopt (#3298)
Scala Stream Collector: add support for cookie bounce (#2697)
Scala Stream Collector: allow raw query params (#3273)
Scala Stream Collector: add support for the Chinese Kinesis endpoint (#3335)
Scala Stream Collector: use the DefaultAWSCredentialsProviderChain for Kinesis Sink (#3245)
Scala Stream Collector: use Kafka callback based API to detect failures to send messages (#3317)
Scala Stream Collector: make Kafka sink more fault tolerant by allowing retries (#3367)
Scala Stream Collector: fix incorrect property used for kafkaProducer.batch.size (#3173)
Scala Stream Collector: configuration decoding with pureconfig (#3318)
Scala Stream Collector: stop making the assembly jar executable (#3410)
Scala Stream Collector: add config dependency (#3326)
Scala Stream Collector: upgrade to Java 8 (#3328)
Scala Stream Collector: bump Scala version to 2.11 (#3311)
Scala Stream Collector: bump SBT to 0.13.16 (#3312)
Scala Stream Collector: bump sbt-assembly to 0.14.5 (#3329)
Scala Stream Collector: bump aws-java-sdk-kinesis to 1.11 (#3310)
Scala Stream Collector: bump kafka-clients to 0.10.2.1 (#3325)
Scala Stream Collector: bump scala-common-enrich to 0.26.0 (#3305)
Scala Stream Collector: bump iglu-scala-client to 0.5.0 (#3309)
Scala Stream Collector: bump specs2-core to 3.9.4 (#3308)
Scala Stream Collector: bump scalaz-core to 7.0.9 (#3307)
Scala Stream Collector: bump joda-time to 2.9 (#3323)
Scala Stream Collector: remove commons-codec dependency (#3324)
Scala Stream Collector: remove snowplow-thrift-raw-event dependency (#3306)
Scala Stream Collector: remove joda-convert dependency (#3304)
Scala Stream Collector: remove mimepull dependency (#3302)
Scala Stream Collector: remove scalazon dependency (#3300)
Scala Stream Collector: run the unit tests systematically in Travis (#3409)
Stream Enrich: bump to 0.11.0 (#3425)
Stream Enrich: support AT_TIMESTAMP as initial position (#3360)
Stream Enrich: add ability to force re-download IP lookup databases on reboot (#3159)
Stream Enrich: add support for the Chinese Kinesis and DynamoDB endpoints (#3344)
Stream Enrich: replace argot by scopt (#3345)
Stream Enrich: use Kafka callback based API to detect failures to send messages (#2974)
Stream Enrich: make Kafka sink more fault tolerant by allowing retries (#2973)
Stream Enrich: make partition key for enriched event stream user-configurable (#1924)
Stream Enrich: fix incorrect property used for kafkaProducer.batch.size (#3380)
Stream Enrich: flush Kafka producer (#3342)
Stream Enrich: configuration decoding with pureconfig (#3394)
Stream Enrich: stop catching fatal errors (#1455)
Stream Enrich: stop making the assembly jar executable (#3411)
Stream Enrich: change package name (#3340)
Stream Enrich: add commons-codec dependency (#3349)
Stream Enrich: add json4s dependency (#3348)
Stream Enrich: upgrade to Java 8 (#3392)
Stream Enrich: bump Scala version to 2.11 (#3388)
Stream Enrich: bump SBT to 0.13.16 (#3382)
Stream Enrich: bump sbt-assembly to 0.14.5 (#3391)
Stream Enrich: bump kafka-clients to 0.10.2.1 (#3413)
Stream Enrich: bump config to 1.3.1 (#3412)
Stream Enrich: bump iglu-scala-client to 0.5.0 (#3387)
Stream Enrich: bump scalacheck to 1.11.3 (#3386)
Stream Enrich: bump scala-common-enrich to 0.26.0 (#3385)
Stream Enrich: bump specs2 to 2.3.13 (#3383)
Stream Enrich: bump scalaz-core to 7.0.9 (#3381)
Stream Enrich: bump amazon-kinesis-client to 1.8.1 (#3379)
Stream Enrich: bump aws-java-sdk to 1.11 (#3377)
Stream Enrich: remove scalaz-specs2 dependency (#3347)
Stream Enrich: remove scalazon dependency (#3341)
Stream Enrich: remove unused dependencies (#3346)
Stream Enrich: run the unit tests systematically in Travis (#3408)
Scala Common Enrich: bump to 0.26.0 (#3333)
Scala Common Enrich: drop Scala 2.10 (#3285)
Scala Common Enrich: replace akka-http with scalaj (#3330)
Scala Common Enrich: bump scala-uri to 0.5.0 (#2893)
Scala Common Enrich: bump scala-weather to 0.3.0 (#3334)
Kinesis Elasticsearch Sink: remove (#3275)
Release 92 Maiden Castle (2017-09-11)
-------------------------------------
EmrEtlRunner: release lock in case of no-op (#3396)
EmrEtlRunner: treat archive_enriched and archive_shredded as separate steps (#3401)
EmrEtlRunner: do not pass --skip shred to RDB Loader when skipping RDB Shredder (#3403)
EmrEtlRunner: if RDB Loader step hangs and is cancelled, logs are not retrieved (#3399)
EmrEtlRunner: ensure appropriate log level for RDB logs (#3369)
EmrEtlRunner: unlink downloaded RDB logs (#3363)
EmrEtlRunner: do not try to download non-existent RDB loader log files (#3405)
EmrEtlRunner: rescue the intermittent RestClient::SSLCertificateNotVerified error (#2572)
EmrEtlRunner: pass GZIP compression argument to S3DistCp as "gz" not "gzip" (#3415)
EmrEtlRunner: update rdb_loader version in config.yml.sample to 0.13.0 (#3418)
EmrEtlRunner: bump to 0.28.0 (#3404)
Documentation: fix broken links in storage/postgres's README.md (#3390)
RDB Shredder: remove (#3398)
RDB Loader: remove (#3393)
Release 91 Stonehenge (2017-08-17)
----------------------------------
EmrEtlRunner: use S3DistCp not Sluice for staging step (#276)
EmrEtlRunner: add an S3DistCp step for the _SUCCESS file produced by RDB Shredder (#3137)
EmrEtlRunner: add step to delete raw events from HDFS before shredding (#2545)
EmrEtlRunner: use S3DistCp to move raw files from S3 to HDFS for all collector formats (#3136)
EmrEtlRunner: add file- and Consul-based locking mechanism (#3352)
EmrEtlRunner: move current behavior into a `run` command (#3104)
EmrEtlRunner: add `lint` command which validates Iglu resolver and enrichments (#1946)
EmrEtlRunner: add backend for a `generate` command (#3105)
EmrEtlRunner: add --resume-from option (#3128)
EmrEtlRunner: remove support for --start and --end flags (#3132)
EmrEtlRunner: remove support for --process-enrich and --process-shred flags (#3365)
EmrEtlRunner: handle run= sub-folders if resuming from shred (#2693)
EmrEtlRunner: add "ongoing run" message on exit with return code 4 (#3129)
EmrEtlRunner: add "no logs to process" message on exit with return code 3 (#2644)
EmrEtlRunner: retrieve RDB loader logs only when it failed or the entire run was successful (#3361)
EmrEtlRunner: bump rspec to 3.5.0 (#3116)
EmrEtlRunner: bump to 0.27.0 (#3358)
Release 90 Lascaux (2017-07-26)
-------------------------------
Common: update CI/CD to push S3 artifacts to all regional Hosted Assets buckets (#3242)
Common: add CI/CD to deploy RDB Loader to Snowplow Hosted Assets (#3025)
Common: no longer bundle StorageLoader in Bintray download (#3024)
Storage: replace example Redshift storage target configuration with 2-0-0 (#3281)
Event Manifest Populator: bump to 0.1.1 (#3295)
Event Manifest Populator: support pre-R83 enriched events (#3293)
EmrEtlRunner: make targets loading consistent with enrichments (#3268)
EmrEtlRunner: expose arbitrary EMR configuration options (#3255)
EmrEtlRunner: add maximizeResourceAllocation option to EMR cluster configuration (#3253)
EmrEtlRunner: move max attempts configuration to EMR cluster configuration (#3246)
EmrEtlRunner: use Elasticity to specify Thrift-specific configuration (#3252)
EmrEtlRunner: bump elasticity version to 6.0.12 (#3249)
EmrEtlRunner: remove storage.download from config.yml.sample (#3265)
EmrEtlRunner: add rdb_loader to config.yml.sample (#3266)
EmrEtlRunner: add S3DistCp step to move enriched and shredded files to archive (#1777)
EmrEtlRunner: add RDB Loader step for each target (#3121)
EmrEtlRunner: bump to 0.26.0 (#3254)
RDB Loader: fix eventual consistency problem (#3113)
RDB Loader: load all runs from shredded, not just the first run found (#2962)
RDB Loader: remove compupdate step (#3178)
RDB Loader: add logging around database load, analyze and vacuum (#2935)
RDB Loader: use Redshift-specific driver to connect to Redshift (#1830)
RDB Loader: remove StorageLoader (#3026)
RDB Loader: accept storage target JSONs on command-line (#3022)
RDB Loader: rewrite StorageLoader in Scala, removing file archiving step (#3023)
Java Tracker: bump git submodule to 0.8.2 (#3260)
Ruby Tracker: bump git submodule to 0.6.1 (#3264)
.NET Tracker: bump git submodule to 1.0.2 (#3258)
Python Tracker: bump git submodule to 0.8.0 (#3263)
Golang Tracker: bump git submodule to 1.1.0 (#3259)
Node.js Tracker: bump git submodule to 0.3.0 (#3262)
Android Tracker: bump git submodule to 0.6.2 (#3257)
JavaScript Tracker: bump git submodule to 2.8.0 (#3261)
Release 89 Plain of Jars (2017-06-12)
-------------------------------------
Documentation: fix incorrect hyphen underlining for R88 (#3198)
Common: refactor CI/CD deploy scripts into one (#3100)
Common: update CI/CD to deploy Spark Enrich (#3069)
Common: refactor CI/CD "is release tag" scripts into one (#3101)
Common: update CI/CD to deploy RDB Shredder (#3038)
Common: fix travis build due to the changes to the precise image (#3210)
Common: build local Scala Common Enrich before publishing Kinesis-related artifacts (#3220)
Common: add Sonatype credentials to .travis.yml (#3217)
Common: bump Scala to 2.11 in .travis.yml (#3227)
Scala Common Enrich: bump to 0.25.0 (#3089)
Scala Common Enrich: bump scala-iglu-client to 0.5.0 (#3092)
Scala Common Enrich: remove scala-util (#3054)
Scala Common Enrich: get rid of deprecated erasure method calls (#3008)
Scala Common Enrich: bump scalaz to 7.0.9 (#3055)
Scala Common Enrich: bump scalding-args to 0.13.0 (#3058)
Scala Common Enrich: bump specs2 to 2.3.13 (#3059)
Scala Common Enrich: bump scalaz-specs2 to 0.2 (#3060)
Scala Common Enrich: bump scala-forex to 0.5.0 (#3057)
Scala Common Enrich: bump sbt to 0.13.13 (#3056)
Scala Common Enrich: bump Scala to 2.11.11 (#3007)
Scala Common Enrich: add Scala 2.11 cross-building (#3061)
Scala Common Enrich: make EnrichedEvent Serializable (#3081)
Scala Common Enrich: fix failing WeatherEnrichmentSpec expectation (#3205)
Scala Common Enrich: remove ScalazArgs (#3209)
Scala Common Enrich: upgrade to Java 8 (#3212)
Scala Common Enrich: add CI/CD (#3216)
Spark Enrich: bump to 1.9.0 (#3072)
Spark Enrich: rename from Scala Hadoop Enrich (#3064)
Spark Enrich: change the package from hadoop to spark (#3076)
Spark Enrich: bump sbt-assembly to 0.14.3 (#3078)
Spark Enrich: bump SBT to 0.13.13 (#3065)
Spark Enrich: port from Scalding to Spark (#3067)
Spark Enrich: bump scala-common-enrich to 0.25 (#3096)
Spark Enrich: bump Scalaz to 7.0.9 (#3097)
Spark Enrich: bump iglu-scala-client to 0.5.0 (#3098)
Spark Enrich: bump specs2-core to 2.3.13 (#3099)
Spark Enrich: bump Scala version to 2.11 (#3070)
Spark Enrich: upgrade to Java 8 (#2381)
Spark Enrich: fix SqlQueryEnrichmentCfLinesSpec (#3224)
Spark Enrich: fix CurrencyConversionTransactionSpec (#3225)
Spark Enrich: run the unit tests systematically in Travis (#3228)
EmrEtlRunner: bump to 0.25.0 (#3039)
EmrEtlRunner: update to run Spark Enrich instead of Scala Hadoop Enrich (#3066)
EmrEtlRunner: update to run RDB Shredder instead of Scala Hadoop Shred (#3033)
EmrEtlRunner: add ability to run Spark jobs (#641)
EmrEtlRunner: replace hadoop_shred in config.yml.sample with rdb_shredder (#3035)
EmrEtlRunner: bump elasticity version to 6.0.11 (#3053)
EmrEtlRunner: use the Scalding step provided by Elasticity (#3052)
EmrEtlRunner: replace hadoop_enrich in config.yml.sample with spark_enrich (#3068)
EmrEtlRunner: bump AMI version in example config to 5.5.0 (#3207)
RDB Shredder: bump to 0.12.0 (#3042)
RDB Shredder: rename from Scala Hadoop Shred (#3031)
RDB Shredder: move from 3-enrich to 4-storage (#3032)
RDB Shredder: change the package to storage from enrich (#3036)
RDB Shredder: port from Scalding to Spark (#3034)
RDB Shredder: bump scala-common-enrich to 0.25 (#3091)
RDB Shredder: bump iglu-scala-client to 0.5.0 (#3090)
RDB Shredder: bump specs2-core to 2.3.13 (#3093)
RDB Shredder: bump Scala version to 2.11 (#3071)
RDB Shredder: upgrade to Java 8 (#3213)
RDB Shredder: run the unit tests systematically in Travis (#3229)
StorageLoader: bump to 0.11.0 (#3214)
StorageLoader: add support for Spark-based Shredder's directory structure (#3044)
Release 88 Angkor Wat (2017-04-27)
----------------------------------
Documentation: fix incorrect release date for R87 (#3126)
Common: update copyright years in README (#3148)
Common: add CI/CD for EmrEtlRunner and StorageLoader (#3102)
Common: add CI/CD for Event Manifest Populator (#3170)
Common: add AWS staging credentials to .travis.yml (#3114)
Common: update script to sync ap-northeast-2 (Seoul) Snowplow Hosted Assets bucket (#3160)
Common: update READMEs markdown in accordance with CommonMark (#3157)
Event Manifest Populator: add Spark job to backpopulate DynamoDB duplicate storage (#3158)
Scala Common Enrich: fix failing WeatherEnrichmentSpec expectation (#3154)
Scala Common Enrich: bump to 0.24.1 (#3155)
Scala Hadoop Shred: bump sbt-assembly to 0.14.4 (#3140)
Scala Hadoop Shred: bump SBT to 0.13.13 (#2972)
Scala Hadoop Shred: bump to 0.11.0 (#3041)
Scala Hadoop Shred: remove explicit jackson-databind dependency (#3138)
Scala Hadoop Shred: add cross-batch natural deduplication (#2999)
Storage: add example storage target configuration JSONs (#2990)
StorageLoader: bump to 0.10.0 (#3109)
StorageLoader: remove Northern Virginia endpoint for Postgres load (#3143)
StorageLoader: handle return code of 4 for EmrEtlRunner in snowplow-runner-and-loader.sh (#3139)
StorageLoader: use storage target JSONs instead of targets section in config.yml (#2992)
StorageLoader: replace table configuration property with schema (#2458)
EmrEtlRunner: bump to 0.24.0 (#3040)
EmrEtlRunner: update hadoop_shred version in config.yml.sample to 0.11.0 (#3197)
EmrEtlRunner: add script to convert config.yml targets section into JSON format (#3135)
EmrEtlRunner: remove targets section from config.yml.sample (#2989)
EmrEtlRunner: no longer use sources property when loading Elasticsearch (#2993)
EmrEtlRunner: use storage target JSONs instead of targets section in config.yml (#2991)
Release 87 Chichen Itza (2017-02-21)
------------------------------------
EmrEtlRunner: bump to 0.23.0 (#2960)
EmrEtlRunner: bump JRuby version to 9.1.6.0 (#3050)
EmrEtlRunner: bump Elasticity to 6.0.10 (#3013)
EmrEtlRunner: remove AnonIpHash from contracts.rb (#2523)
EmrEtlRunner: remove UnmatchedLzoFilesError check (#2740)
EmrEtlRunner: use S3DistCp not Sluice for archive_raw step (#1977)
EmrEtlRunner: add warning about the array of in buckets in config.yml (#2462)
EmrEtlRunner: add dedicated return code of 4 for DirectoryNotEmptyError (#2546)
EmrEtlRunner: add support for specifying EBS for Hadoop workers (#2950)
EmrEtlRunner: add example EBS configuration to config.yml.sample (#3012)
EmrEtlRunner: catch Elasticity ThrottlingExceptions while waiting for EMR (#3028)
EmrEtlRunner: catch Elasticity ArgumentErrors while waiting for EMR (#3027)
StorageLoader: bump to 0.9.0 (#2961)
StorageLoader: bump JRuby version to 9.1.6.0 (#3051)
StorageLoader: fix typo in S3Tasks.download_events (#2888)
StorageLoader: update manifest table as part of Redshift load transaction (#2280)
Redshift: add manifest table (#2265)
Release 86 Petra (2016-12-20)
-----------------------------
Common: add AWS credentials to .travis.yml (#2963)
Common: add CI/CD for Scala Hadoop Enrich (#2982)
Common: add CI/CD for Scala Hadoop Shred (#2928)
Common: migrate Hadoop Event Recovery deployment to Release Manager (#2983)
Common: remove short-hostname addon from travis.yml (#2674)
Common: update script to sync us-east-2 (Ohio) Snowplow Hosted Assets bucket (#2986)
Common: update script to sync ca-central-1 (Montreal) Snowplow Hosted Assets bucket (#3004)
Common: update script to sync eu-west-2 (London) Snowplow Hosted Assets bucket (#3005)
Common: use AWS environment variables to sync Snowplow Hosted Assets buckets (#2985)
Scala Hadoop Shred: bump to 0.10.0 (#2979)
Scala Hadoop Shred: add general top-level exception handling (#2071)
Scala Hadoop Shred: get the CustomPartitionSourceTest working with Hadoop 2.4 (#1960)
Scala Hadoop Shred: fix omitted string interpolation (#2562)
Scala Hadoop Shred: deduplicate event_ids with different event_fingerprints (synthetic duplicates) (#24)
Scala Hadoop Shred: stop catching fatal errors (#1456)
EmrEtlRunner: update hadoop_shred version in config.yml.sample to 0.10.0 (#3003)
Data modeling: add drill fields to web block (#2956)
Data modeling: resolve issues with web model (#2954)
Data modeling: restrict table scan on deduplication queries (#2929)
Data modeling: add web model (#2925)
Data modeling: delete example models (#2836)
Data modeling: remove outdated recipes (#2626)
Release 85 Metamorphosis (2016-11-15)
-------------------------------------
Scala Stream Collector: bump to 0.9.0 (#2936)
Scala Stream Collector: add Kafka sink (#2937)
Scala Stream Collector: update config.hocon.sample to support Kafka (#2943)
Scala Stream Collector: move sink.kinesis.buffer to sink.buffer in config.hocon.sample (#2938)
Stream Enrich: bump to 0.10.0 (#2942)
Stream Enrich: add Kafka sink (#2939)
Stream Enrich: add Kafka source (#2941)
Stream Enrich: update config.hocon.sample to support Kafka (#2940)
Stream Enrich: fix incorrect parsing of S3 URLs (#2921)
Release 84 Steller's Sea Eagle (2016-10-07)
-------------------------------------------
Common: standardise sbt-assembly settings (#2900)
Common: refactor Kinesis release CI/CD (#2887)
Common: update script to sync ap-south-1 (Mumbai) Snowplow Hosted Assets bucket (#2903)
Scala Stream Collector: bump to 0.8.0 (#2886)
Scala Stream Collector: add scala_ into artifact filename in Bintray (#2843)
Scala Stream Collector: use nuid query parameter value to set the 3rd party network id cookie (#2512)
Scala Stream Collector: configurable cookie path (#2528)
Scala Stream Collector: call Config.resolve() to resolve environment variables in hocon (#2879)
Stream Enrich: bump to 0.9.0 (#2728)
Stream Enrich: bump Scala Tracker to 0.3.0 (#2898)
Stream Enrich: bump Scala Common Enrich to 0.24.0 (#2729)
Stream Enrich: tolerate trailing slashes for paths in IP Lookups Enrichment configuration (#2744)
Stream Enrich: call Config.resolve() to resolve environment variables in hocon (#2878)
Kinesis Elasticsearch Sink: bump to 0.8.0 (#2885)
Kinesis Elasticsearch Sink: bump Scala Tracker to 0.3.0 (#2899)
Kinesis Elasticsearch Sink: allow parametrized timeouts for jest client (#2897)
Kinesis Elasticsearch Sink: fix buffer configurations not being taken into account (#2895)
Kinesis Elasticsearch Sink: make error messages more helpful (#2896)
Kinesis Elasticsearch Sink: ensure field names do not contain any dots (#2894)
Kinesis Elasticsearch Sink: add support for Elasticsearch 2.x (#2525)
Kinesis Elasticsearch Sink: call Config.resolve() to resolve environment variables in hocon (#2880)
StorageLoader: remove all JSON Path files (#2905)
Redshift: remove all Redshift DDL for Iglu Central schemas (#2904)
Release 83 Bald Eagle (2016-09-06)
----------------------------------
Scala Tracker: bump git submodule to 0.3.0 (#2726)
ActionScript 3.0 Tracker: bump git submodule to 0.3.0 (#2727)
Scala Common Enrich: bump to 0.24.0 (#2715)
Scala Common Enrich: add SQL Query Enrichment (#2321)
Scala Common Enrich: add POST support to IgluAdapter (#1184)
Scala Hadoop Enrich: bump to 1.8.0 (#2716)
Scala Hadoop Enrich: bump Scala Common Enrich to 0.24.0 (#2717)
Scala Hadoop Enrich: add test for SQL Query Enrichment (#2718)
Scala Hadoop Enrich: make resolver config in JobSpecHelpers injectable (#2825)
EmrEtlRunner: bump to 0.22.0 (#2784)
EmrEtlRunner: bump Ruby version to 2.2.3 (#2869)
EmrEtlRunner: bump Sluice to 0.4.0 (#1708)
EmrEtlRunner: bump Contracts to 0.9 (#2789)
EmrEtlRunner: rebuild Gemfile.lock (#2872)
EmrEtlRunner: add version recognition of currently installed commons-codec (#2735)
EmrEtlRunner: update snowplow-ami4-bootstrap.sh to take optional commons-codec version argument (#2713)
EmrEtlRunner: fix bug with double compression in shred step if enrich skipped (#2586)
EmrEtlRunner: pass GZIP compression argument to S3DistCp as "gz" not "gzip" (#2679)
EmrEtlRunner: update hadoop_enrich version in config.yml.sample to 1.8.0 (#2756)
EmrEtlRunner: replace deprecated Dir.exists? with Dir.exist? (#2799)
EmrEtlRunner: fix contract for fatal_with (#2810)
EmrEtlRunner: use region-specific Snowplow Hosted Assets buckets (#2813)
EmrEtlRunner: disable contract on build_fix_filenames due to Contracts issue #238 (#2828)
Storage: add Kinesis S3 git submodule (#2706)
StorageLoader: bump to 0.8.0 (#2785)
StorageLoader: bump Ruby version to 2.2.3 (#2870)
StorageLoader: bump Sluice to 0.4.0 (#2786)
StorageLoader: bump Contracts to 0.9 (#2790)
StorageLoader: add explicit mime-types dependency (#2805)
StorageLoader: rebuild Gemfile.lock (#2871)
StorageLoader: use Northern Virginia endpoint not global endpoint for us-east-1 (#2748)
StorageLoader: replace module_function everywhere with self (#2801)
StorageLoader: fix broken contracts (#2461)
StorageLoader: write JSON path for com.amazon.aws.lambda/s3_notification_event (#2590)
StorageLoader: write JSON path for com.snowplowanalytics.snowplow/application_foreground/jsonschema/1-0-0 (#2857)
StorageLoader: write JSON path for com.snowplowanalytics.snowplow/application_background/jsonschema/1-0-0 (#2856)
StorageLoader: write JSON path for com.snowplowanalytics.snowplow/application_error/jsonschema/1-0-0 (#2855)
Redshift: add Redshift DDL for com.snowplowanalytics.snowplow/application_foreground/jsonschema/1-0-0 (#2854)
Redshift: add Redshift DDL for com.snowplowanalytics.snowplow/application_background/jsonschema/1-0-0 (#2853)
Redshift: add Redshift DDL for com.snowplowanalytics.snowplow/application_error/jsonschema/1-0-0 (#2852)
Redshift: add Redshift DDL for com.amazon.aws.lambda/s3_notification_event/jsonschema/1-0-0 (#2589)
Release 82 Tawny Eagle (2016-08-08)
-----------------------------------
Common: publish each Kinesis app individually to Bintray (#2492)
Kinesis Elasticsearch Sink: bump to 0.7.0 (#2816)
Kinesis Elasticsearch Sink: configure transport port (#2102)
Kinesis Elasticsearch Sink: add support for HTTP protocol (#2092)
Kinesis Elasticsearch Sink: unify logger configuration (#1699)
Release 81 Kangaroo Island Emu (2016-06-16)
-------------------------------------------
Documentation: fix broken link in Thrift Schemas' README.md (#2498)
Common: add encrypted S3 credentials to .travis.yml (#2673)
Common: delete publish-kinesis-release.bash (#2711)
Android Tracker: bump git submodule to 0.5.4 (#2710)
JavaScript Tracker: bump git submodule to 2.6.1 (#2708)
Objective-C Tracker: bump git submodule to 0.6.1 (#2709)
Golang Tracker: add git submodule (#2619)
Scala Common Enrich: bump to 0.23.1 (#2699)
Scala Common Enrich: bump commons codec to 1.10 (#2691)
Stream Enrich: bump to 0.8.1 (#2701)
Stream Enrich: bump Scala Common Enrich to 0.23.1 (#2700)
Hadoop Event Recovery: update README instructions (#2348)
Hadoop Event Recovery: add continuous deployment (#2692)
Hadoop Event Recovery: rename from Scala Hadoop Bad Rows (#2694)
Hadoop Event Recovery: allow source row to be transformed with JavaScript (#2223)
Hadoop Event Recovery: capitalize Snowplow correctly in copyright notices (#2641)
StorageLoader: write JSON path for com.clearbit/person (#2631)
StorageLoader: write JSON path for com.clearbit/company (#2632)
StorageLoader: write JSON path for com.amazon.aws.lambda/java_context (#2560)
Redshift: add Redshift DDL for com.clearbit/person/jsonschema/1-0-0 (#2633)
Redshift: add Redshift DDL for com.clearbit/company/jsonschema/1-0-0 (#2634)
Redshift: add Redshift DDL for com.amazon.aws.lambda/java_context/jsonschema/1-0-0 (#2559)
Release 80 Southern Cassowary (2016-05-30)
------------------------------------------
Common: add CI/CD for Kinesis apps (#2621)
Common: add Bintray credentials to .travis.yml (#2618)
Common: change Kinesis pipeline status from "Beta" to "Production-ready" in READMEs (#2629)
Config: update config/iglu_resolver.json version to 1-0-1 (#2479)
Scala Stream Collector: bump to 0.7.0 (#2595)
Scala Stream Collector: increase tolerance of timings in tests (#2614)
Scala Stream Collector: send nonempty response to POST requests (#2606)
Scala Stream Collector: crash when unable to find stream instead of hanging (#2583)
Scala Stream Collector: stop using deprecated Config.getMilliseconds method (#2570)
Scala Stream Collector: move example configuration file to examples folder (#2566)
Scala Stream Collector: upgrade the log level for reports of stream nonexistence from INFO to ERROR (#2384)
Scala Stream Collector: crash rather than hanging when unable to bind to the supplied port (#2551)
Scala Stream Collector: bump Spray version to 1.3.3 (#2522)
Scala Stream Collector: bump Scala version to 2.10.5 (#2565)
Scala Stream Collector: fix omitted string interpolation (#2561)
Stream Enrich: bump to 0.8.0 (#2596)
Stream Enrich: bump Common Enrich to 0.23.0 (#2612)
Stream Enrich: bump Iglu Scala Client to 0.4.0 (#2688)
Stream Enrich: add configuration setting for MaxRecords (#2610)
Stream Enrich: use nonEmpty method to check whether lists are empty (#2608)
Stream Enrich: refactor functions to avoid return keyword (#2607)
Stream Enrich: upgrade the log level for reports of stream nonexistence from INFO to ERROR (#2598)
Stream Enrich: crash when unable to find stream instead of hanging (#2584)
Stream Enrich: add standard copyright notice to AbstractSourceSpec.scala (#2580)
Stream Enrich: make logging more succinct in case of failure (#1723)
Stream Enrich: move example configuration file to examples folder (#2567)
Stream Enrich: remove src/main/resolver.json.sample (#1932)
Stream Enrich: use json4s to combine the enrichment configuration JSONs (#2259)
Kinesis Elasticsearch Sink: bump to 0.6.0 (#2597)
Kinesis Elasticsearch Sink: add configuration setting for MaxRecords (#2611)
Kinesis Elasticsearch Sink: crash when unable to find stream instead of hanging (#2585)
Kinesis Elasticsearch Sink: move example configuration file to examples folder (#2568)
Release 79 Black Swan (2016-05-12)
----------------------------------
Documentation: removed closes from CHANGELOG tickets for R78 (#2534)
Common: changed Vagrantfile to use NFS and extra CPU cores by default (#2482)
Config: removed duplicated enabled property in ua_parser_config.json (#2424)
Config: switched enabled to false in currency_conversion_config.json (#2327)
Config: switched enabled to false in weather_enrichment_config.json (#2326)
EmrEtlRunner: bumped AMI version in example config to 4.5.0 (#2604)
EmrEtlRunner: updated hadoop_enrich version in config.yml.sample to 1.7.0 (#2661)
EmrEtlRunner: updated hadoop_shred version in config.yml.sample to 0.9.0 (#2662)
Scala Common Enrich: bumped user-agent-utils version to latest (#2516)
Scala Common Enrich: transaction item quantity type changed to JInteger (#2157)
Scala Common Enrich: bumped to 0.23.0 (#2486)
Scala Common Enrich: improved OWM error if user doesn't have historical weather (#2325)
Scala Common Enrich: added API Request Enrichment (#2051)
Scala Common Enrich: bumped Iglu Scala Client to 0.4.0 (#2333)
Scala Common Enrich: added HTTP Header Extractor Enrichment (#1373)
Scala Hadoop Enrich: bumped to 1.7.0 (#2446)
Scala Hadoop Enrich: bumped Scala Common Enrich to 0.23.0 (#2485)
Scala Hadoop Enrich: bumped Iglu Scala Client to 0.4.0 (#2478)
Scala Hadoop Enrich: added test for API Request Enrichment (#2603)
Scala Hadoop Shred: bumped to 0.9.0 (#2480)
Scala Hadoop Shred: bumped Scala Common Enrich to 0.23.0 (#2481)
Scala Hadoop Shred: bumped Iglu Scala Client to 0.4.0 (#2449)
Release 78 Great Hornbill (2016-03-15)
--------------------------------------
Common: removed openjdk7 from .travis.yml (#2533)
Scala Common Enrich: bumped to 0.22.0
Scala Common Enrich: added handling for bad rows which are too long to print in full (#2419)
Kinesis: updated publish-kinesis-release.bash (#2477)
Scala Stream Collector: bumped to 0.6.0
Scala Stream Collector: added Scala Common Enrich as a library dependency (#2153)
Scala Stream Collector: added click redirect mode (#549)
Scala Stream Collector: configured the ability to use IP address as partition key (#2331)
Scala Stream Collector: converted bad rows to new format (#2006)
Scala Stream Collector: shared a single thread pool for all writes to Kinesis (#2369)
Scala Stream Collector: specified UTF-8 encoding everywhere (#2147)
Scala Stream Collector: made cookie name customizable, thanks @kazjote! (#2474)
Scala Stream Collector: added boolean collector.cookie.enabled setting (#2488)
Scala Stream Collector: made backoffPolicy fields macros (#2518)
Scala Stream Collector: updated AWS credentials to support iam/env/default not cpf (#1518)
Scala Kinesis Enrich: bumped to 0.7.0
Scala Kinesis Enrich: renamed to Stream Enrich (#2418)
Scala Kinesis Enrich: bumped Kinesis Client Library to 1.6.1 (#1823)
Scala Kinesis Enrich: bumped Scala Common Enrich to 0.21.0 (#2033)
Scala Kinesis Enrich: bumped Iglu Scala Client to 0.3.1 (#2080)
Scala Kinesis Enrich: configured the ability to use IP address as partition key (#2332)
Scala Kinesis Enrich: started emitting KCL metrics to CloudWatch (#2357)
Scala Kinesis Enrich: converted bad rows to new format (#1207)
Scala Kinesis Enrich: removed outdated comment about ClasspathPropertiesFileCredentialsProvider from sample config file (#1519)
Scala Kinesis Enrich: removed redundant documentation from README (#2032)
Scala Kinesis Enrich: updated test suite with valid self-describing JSONs (#2151)
Scala Kinesis Enrich: updated Scala Tracker to 0.2.0 and enabled EC2 context (#2109)
Scala Kinesis Enrich: updated to use new EtlPipeline (#1933)
Scala Kinesis Enrich: specified UTF-8 encoding everywhere (#2148)
Kinesis Elasticsearch Sink: bumped to 0.5.0
Kinesis Elasticsearch Sink: bumped Kinesis Client Library to 1.6.1 (#1824)
Kinesis Elasticsearch Sink: bumped Scala Common Enrich to 0.22.0 (#2152)
Kinesis Elasticsearch Sink: added mixed output mode (#2412)
Kinesis Elasticsearch Sink: added new canonical event fields (#2089)
Kinesis Elasticsearch Sink: moved the stream-type setting into the main sink configuration object (#2490)
Kinesis Elasticsearch Sink: made source and sink fields macros (#2519)
Kinesis Elasticsearch Sink: renamed Build object to match project (#2002)
Kinesis Elasticsearch Sink: converted bad rows to new format (#1208)
Kinesis Elasticsearch Sink: updated schema regular expression in line with Iglu Central (#1998)
Kinesis Elasticsearch Sink: cached the mapping of field name to field type (#2090)
Kinesis Elasticsearch Sink: specified UTF-8 encoding everywhere (#2149)
Kinesis Elasticsearch Sink: stopped sending timestamp instead of failure count (#1951)
Kinesis Elasticsearch Sink: made performance of conversion from TSV to JSON linear (#1847)
Kinesis Elasticsearch Sink: updated to latest version of EnrichedEvent (#2089)
Release 77 Great Auk (2016-02-28)
---------------------------------
Documentation: updated tracker status table (#1999)
Documentation: fixed incorrect entries in CHANGELOG (#2443)
Common: made optionality of Lingual and HBase in config.yml clearer (#2206)
Common: fixed OpenJDK build in Travis CI (#2447)
Scala Hadoop Enrich: bumped to 1.6.0
Scala Hadoop Enrich: bumped Scala Common Enrich to 0.21.0 (#2442)
Scala Common Enrich: bumped to 0.21.0
Scala Common Enrich: fixed exception for invalid API key in currency conversion (#2441)
Scala Common Enrich: fixed exception on same currency conversion (#2437)
Scala Common Enrich: switched from javax.script to org.mozilla.javascript for JavaScriptEnrichment (#2453)
Scala Hadoop Shred: bumped to 0.8.0
Scala Hadoop Shred: bumped Iglu Scala Client to 0.3.2 (#2319)
EmrEtlRunner: bumped to 0.21.0
EmrEtlRunner: attached monitoring tags to jobflow (#425)
EmrEtlRunner: now throwing exception if processing Thrift with --skip s3distcp or AMI 2.x.x (#1648)
EmrEtlRunner: added bootstrap action to prepare AMI >= 3.8.0 (#2320)
EmrEtlRunner: bumped Elasticity to 6.0.7 (#2400)
EmrEtlRunner: added support for Amazon EMR 4.x.x series (#1926)
EmrEtlRunner: prevented bad CLI options from throwing stack trace (#1930)
EmrEtlRunner: made error for nonempty processing bucket collector-agnostic (#1961)
EmrEtlRunner: bumped Ruby Tracker to 0.5.2 (#2143)
EmrEtlRunner: improved retry logic for EMR bootstrap timeouts (#2150)
EmrEtlRunner: excluded previously-built executables from the build (#2163)
EmrEtlRunner: added support for additional_info in EMR section of configuration (#2211)
EmrEtlRunner: added Elasticsearch stage to help message (#2323)
EmrEtlRunner: updated hadoop_enrich version in config.yml.sample to 1.6.0 (#2459)
EmrEtlRunner: updated hadoop_shred version in config.yml.sample to 0.8.0 (#2370)
EmrEtlRunner: removed snowplow-emr-etl-runner.sh (#2445)
StorageLoader: bumped to 0.7.0
StorageLoader: added support for supplying config file as Base64-encoded string (#2227)
StorageLoader: added ability to retrieve AWS credentials from EC2 role (#2226)
StorageLoader: excluded previously-built executables from the build (#2164)
StorageLoader: started printing stack trace for failures not caused by bad configuration (#2160)
StorageLoader: bumped Ruby Tracker to 0.5.2 (#2144)
StorageLoader: moved ANALYZE statements after VACUUM statements (#1361)
StorageLoader: added resolver config option to snowplow-runner-and-loader.sh (#2170)
StorageLoader: updated snowplow-runner-and-loader.sh to use JRuby binaries (#2233)
StorageLoader: removed snowplow-storage-loader.sh (#2444)
StorageLoader: wrote JSON Path file for com.optimizely/visitor_dimension event (#2436)
StorageLoader: wrote JSON Path file for com.optimizely/visitor_audience event (#2435)
StorageLoader: wrote JSON Path file for com.optimizely/visitor event (#2434)
StorageLoader: wrote JSON Path file for com.optimizely/variation event (#2433)
StorageLoader: wrote JSON Path file for com.optimizely/state event (#2432)
StorageLoader: wrote JSON Path file for com.optimizely/experiment event (#2431)
StorageLoader: wrote JSON Path file for io.augur.snowplow/identity_lite (#1958)
Redshift: wrote Redshift DDL for com.optimizely/visitor_dimension event (#2430)
Redshift: wrote Redshift DDL for com.optimizely/visitor_audience event (#2429)
Redshift: wrote Redshift DDL for com.optimizely/visitor event (#2428)
Redshift: wrote Redshift DDL for com.optimizely/variation event (#2427)
Redshift: wrote Redshift DDL for com.optimizely/state event (#2426)
Redshift: wrote Redshift DDL for com.optimizely/experiment event (#2425)
Redshift: added Redshift DDL for io.augur.snowplow/identity_lite (#1957)
Release 76 Changeable Hawk-Eagle (2016-01-26)
---------------------------------------------
Scala Hadoop Enrich: bumped to 1.5.1
Scala Hadoop Enrich: bumped Scala Common Enrich to 0.20.1 (#2338)
Scala Common Enrich: bumped to 0.20.1
Scala Common Enrich: now using only base MIME type in content-type check for SendGrid Adapter (#2328)
Scala Hadoop Shred: bumped to 0.7.0
Scala Hadoop Shred: fixed good tests' checks for empty paths (#2278)
Scala Hadoop Shred: now deduplicating event_id and event_fingerprint pairs (#2246)
Scala Hadoop Shred: fixed incorrect event in SchemaValidationFailed1Spec (#2355)
Scala Hadoop Shred: updated tests to check atomic-events output (#2264)
Scala Hadoop Shred: now only writes atomic-events if JSONs shred successfully (#2245)
Scala Hadoop Shred: removed empty SchemaValidationFailed2Spec (#2271)
Scala Hadoop Shred: fixed test suite issue with multiple input lines (#2270)
EmrEtlRunner: updated hadoop_enrich version in config.yml.sample to 1.5.1 (#2339)
EmrEtlRunner: changed "in" bucket example in config.yml.sample to s3://my-in-bucket (#2358)
EmrEtlRunner: updated archive bucket examples in config.yml (#2368)
EmrEtlRunner: updated hadoop_shred version in config.yml.sample to 0.7.0 (#2360)
StorageLoader: wrote JSON Paths file for com.google.analytics.enhanced-ecommerce/action (#2136)
StorageLoader: wrote JSON Paths file for com.google.analytics.enhanced-ecommerce/actionFieldObject (#2135)
StorageLoader: wrote JSON Paths file for com.google.analytics.enhanced-ecommerce/impressionFieldObject (#2134)
StorageLoader: wrote JSON Paths file for com.google.analytics.enhanced-ecommerce/productFieldObject (#2133)
StorageLoader: wrote JSON Paths file for com.google.analytics.enhanced-ecommerce/promotionFieldObject (#2132)
Redshift: added Redshift DDL for com.google.analytics.enhanced-ecommerce/promotionFieldObject (#2131)
Redshift: added Redshift DDL for com.google.analytics.enhanced-ecommerce/productFieldObject (#2130)
Redshift: added Redshift DDL for com.google.analytics.enhanced-ecommerce/impressionFieldObject (#2129)
Redshift: added Redshift DDL for com.google.analytics.enhanced-ecommerce/actionFieldObject (#2128)
Redshift: added Redshift DDL for com.google.analytics.enhanced-ecommerce/action (#2127)
Release 75 Long-Legged Buzzard (2016-01-02)
-------------------------------------------
Scala Hadoop Enrich: bumped to 1.5.0
Scala Hadoop Enrich: bumped Scala Common Enrich to 0.20.0 (#2200)
Scala Hadoop Enrich: added test for loading Urban Airship Connect ndjson files (#2168)
Scala Hadoop Enrich: added test for SendGrid Adapter (#2194)
Scala Common Enrich: bumped to 0.20.0
Scala Common Enrich: added JsonLoader for Urban Airship, Mixpanel et al (#2210)
Scala Common Enrich: added Adapter to pre-process Urban Airship events (#2167)
Scala Common Enrich: abstracted Mandrill `reformatParameters` function into Adapter (#2171)
Scala Common Enrich: added Adapter to pre-process SendGrid events (#1161)
EmrEtlRunner: bumped to 0.20.0
EmrEtlRunner: updated hadoop_enrich version in config.yml.sample to 1.5.0 (#2282)
EmrEtlRunner: added raw S3 -> HDFS step with group by (#2253)
EmrEtlRunner: added directory flattening code (#2232)
EmrEtlRunner: added support for ndjson loader format (#2251)
EmrEtlRunner: improved test coverage of runner.rb (#2250)
Redshift: added Redshift DDL for a com.sendgrid/processed event (#2172)
Redshift: added Redshift DDL for a com.sendgrid/dropped event (#2173)
Redshift: added Redshift DDL for a com.sendgrid/delivered event (#2174)
Redshift: added Redshift DDL for a com.sendgrid/deferred event (#2175)
Redshift: added Redshift DDL for a com.sendgrid/bounce event (#2176)
Redshift: added Redshift DDL for a com.sendgrid/open event (#2177)
Redshift: added Redshift DDL for a com.sendgrid/click event (#2178)
Redshift: added Redshift DDL for a com.sendgrid/spamreport event (#2179)
Redshift: added Redshift DDL for a com.sendgrid/unsubscribe event (#2180)
Redshift: added Redshift DDL for a com.sendgrid/group_unsubscribe event (#2181)
Redshift: added Redshift DDL for a com.sendgrid/group_resubscribe event (#2182)
Redshift: added Redshift DDL for com.urbanairship.connect/UNINSTALL event (#2283)
Redshift: added Redshift DDL for com.urbanairship.connect/TAG_CHANGE event (#2284)
Redshift: added Redshift DDL for com.urbanairship.connect/SEND event (#2285)
Redshift: added Redshift DDL for com.urbanairship.connect/RICH_READ event (#2286)
Redshift: added Redshift DDL for com.urbanairship.connect/RICH_DELIVERY event (#2287)
Redshift: added Redshift DDL for com.urbanairship.connect/RICH_DELETE event (#2288)
Redshift: added Redshift DDL for com.urbanairship.connect/REGION event (#2289)
Redshift: added Redshift DDL for com.urbanairship.connect/PUSH_BODY event (#2290)
Redshift: added Redshift DDL for com.urbanairship.connect/OPEN event (#2291)
Redshift: added Redshift DDL for com.urbanairship.connect/LOCATION event (#2292)
Redshift: added Redshift DDL for com.urbanairship.connect/IN_APP_MESSAGE_RESOLUTION event (#2293)
Redshift: added Redshift DDL for com.urbanairship.connect/IN_APP_MESSAGE_EXPIRATION event (#2294)
Redshift: added Redshift DDL for com.urbanairship.connect/IN_APP_MESSAGE_DISPLAY event (#2295)
Redshift: added Redshift DDL for com.urbanairship.connect/FIRST_OPEN event (#2296)
Redshift: added Redshift DDL for com.urbanairship.connect/CUSTOM event (#2297)
Redshift: added Redshift DDL for com.urbanairship.connect/CLOSE event (#2298)
StorageLoader: added JSON Path file for com.sendgrid/processed event (#2183)
StorageLoader: added JSON Path file for com.sendgrid/dropped event (#2184)
StorageLoader: added JSON Path file for com.sendgrid/delivered event (#2185)
StorageLoader: added JSON Path file for com.sendgrid/deferred event (#2186)
StorageLoader: added JSON Path file for com.sendgrid/bounce event (#2187)
StorageLoader: added JSON Path file for com.sendgrid/open event (#2188)
StorageLoader: added JSON Path file for com.sendgrid/click event (#2189)
StorageLoader: added JSON Path file for com.sendgrid/spamreport event (#2190)
StorageLoader: added JSON Path file for com.sendgrid/unsubscribe event (#2191)
StorageLoader: added JSON Path file for com.sendgrid/group_unsubscribe event (#2192)
StorageLoader: added JSON Path file for com.sendgrid/group_resubscribe event (#2193)
StorageLoader: added JSON Path file for com.urbanairship.connect/UNINSTALL event (#2299)
StorageLoader: added JSON Path file for com.urbanairship.connect/TAG_CHANGE event (#2300)
StorageLoader: added JSON Path file for com.urbanairship.connect/SEND event (#2301)
StorageLoader: added JSON Path file for com.urbanairship.connect/RICH_READ event (#2302)
StorageLoader: added JSON Path file for com.urbanairship.connect/RICH_DELIVERY event (#2303)
StorageLoader: added JSON Path file for com.urbanairship.connect/RICH_DELETE event (#2304)
StorageLoader: added JSON Path file for com.urbanairship.connect/REGION event (#2305)
StorageLoader: added JSON Path file for com.urbanairship.connect/PUSH_BODY event (#2306)
StorageLoader: added JSON Path file for com.urbanairship.connect/OPEN event (#2307)
StorageLoader: added JSON Path file for com.urbanairship.connect/LOCATION event (#2308)
StorageLoader: added JSON Path file for com.urbanairship.connect/IN_APP_MESSAGE_RESOLUTION event (#2309)
StorageLoader: added JSON Path file for com.urbanairship.connect/IN_APP_MESSAGE_EXPIRATION event (#2310)
StorageLoader: added JSON Path file for com.urbanairship.connect/IN_APP_MESSAGE_DISPLAY event (#2311)
StorageLoader: added JSON Path file for com.urbanairship.connect/FIRST_OPEN event (#2312)
StorageLoader: added JSON Path file for com.urbanairship.connect/CUSTOM event (#2313)
StorageLoader: added JSON Path file for com.urbanairship.connect/CLOSE event (#2314)
Data modeling: removed events enriched from web-recalculate (#2275)
Data modeling: added cookie-to-user-id map to web-recalculate (#2274)
Release 74 European Honey Buzzard (2015-12-22)
----------------------------------------------
Common: added encrypted OWM API key to .travis.yml (#2243)
Scala Hadoop Enrich: bumped to 1.4.0
Scala Hadoop Enrich: bumped Scala Common Enrich to 0.19.0 (#2255)
Scala Common Enrich: bumped to 0.19.0
Scala Common Enrich: added weather enrichment (#456)
Scala Common Enrich: fixed issue with BC timestamp in ExtractEventTypeSpec (#2257)
Scala Common Enrich: fixed currency conversion enrichment's test for invalid API key (#2258)
StorageLoader: wrote JSON path file for org.openweathermap/weather (#2240)
Redshift: added Redshift DDL for org.openweathermap/weather (#2241)
Release 73 Cuban Macaw (2015-12-04)
-----------------------------------
EmrEtlRunner: bumped to 0.19.0
EmrEtlRunner: added hadoop_elasticsearch to config.yml.sample (#2124)
EmrEtlRunner: added support for Elasticsearch in targets section of config (#826)
EmrEtlRunner: bumped Elasticity to 6.0.5 (#2026)
EmrEtlRunner: stopped skipping the whole job just because enrich and shred are being skipped (#2049)
Scala Common Enrich: bumped Iglu Scala Client to 0.3.1 (#2079)
Scala Common Enrich: bumped version to 0.18.0
Scala Common Enrich: moved ScalazArgs into shared library (#2010)
Scala Common Enrich: removed executable bit from Scala source files (#2022)
Scala Common Enrich: removed JSON length checks (#2041)
Scala Common Enrich: removed truncation code (#2044)
Scala Common Enrich: stopped attempting to catch fatal errors (#2045)
Scala Hadoop Enrich: bumped to 1.3.0
Scala Hadoop Enrich: bumped Scala Common Enrich to 0.18.0 (#2015)
Scala Hadoop Enrich: added Iglu Scala Client as an explicit dependency (#2115)
Scala Hadoop Enrich: added .forceToDisk to speed up run (#859)
Scala Hadoop Enrich: started using Scala Common Enrich's version of ScalazArgs (#2013)
Scala Hadoop Shred: bumped to 0.6.0
Scala Hadoop Shred: added .forceToDisk to common to speed up run (#2039)
Scala Hadoop Shred: bumped Iglu Scala Client to 0.3.1 (#2081)
Scala Hadoop Shred: bumped Scala Common Enrich to 0.18.0 (#2016)
Scala Hadoop Shred: applied truncation logic to atomic-events TSV (#2042)
Scala Hadoop Shred: processed enriched events for atomic.events removing JSON fields (#1731)
Scala Hadoop Shred: started using Scala Common Enrich's version of ScalazArgs (#2014)
Storage: fixed README's link to architecture image, thanks @miike! (#2156)
Hadoop Elasticsearch Sink: added (#824)
StorageLoader: bumped to 0.6.0
StorageLoader: added tcpKeepAlive=true to JDBC for long-running COPYs via NAT (#2145)
StorageLoader: fixed setup guide link in README, thanks @diamondo25! (#2025)
StorageLoader: loaded atomic.events from shredded folder (#1795)
Postgres: bumped atomic.events to 0.7.0
Postgres: added migration script for 0.6.0 to 0.7.0 (#2047)
Postgres: removed JSON fields from atomic.events (#1949)
Redshift: bumped atomic.events to 0.8.0
Redshift: added migration script for 0.4.0 to 0.8.0 (#2155)
Redshift: added migration script for 0.5.0 to 0.8.0 (#2119)
Redshift: added migration script for 0.6.0 to 0.8.0 (#2120)
Redshift: added migration script for 0.7.0 to 0.8.0 (#2048)
Redshift: removed JSON fields from atomic.events (#1849)
Data Modeling: added separators to custom fingerprint in deduplication queries (#2198)
Data Modeling: renamed dvce_tstamp to dvce_created_tstamp in basic recipes (#2166)
Data Modeling: removed JSON fields from deduplication queries (#2197)
Release 72 Great Spotted Kiwi (2015-10-15)
------------------------------------------
Documentation: added Scala Tracker to 1-trackers/README.md (#2114)
Common: added forwarding of port 3000 to Vagrantfile for Clojure Collector (#2011)
Unity Tracker: added git submodule (#2113)
Clojure Collector: bumped to 1.1.0
Clojure Collector: added URI redirect ability (#1102)
Clojure Collector: added basic README for the java-servlet (#2012)
Scala Common Enrich: bumped to 0.17.0
Scala Common Enrich: added cookie extractor enrichment, thanks @kazjote! (#2072)
Scala Common Enrich: converted SnowplowAdapter from object to package (#2040)
Scala Common Enrich: added Adapter to pre-process URI redirect events (#1103)
Scala Hadoop Enrich: bumped to 1.2.0
Scala Hadoop Enrich: bumped Scala Common Enrich to 0.17.0 (#2027)
Redshift: added Redshift DDL for a com.snowplowanalytics.snowplow/uri_redirect event (#1104)
Redshift: added Redshift DDL for com.amazon.aws.ec2/instance_identity_document (#2086)
Redshift: added Redshift DDL for org.ietf/http_cookie (#2096)
StorageLoader: wrote JSON Path file for com.snowplowanalytics.snowplow/uri_redirect event (#1105)
StorageLoader: wrote JSON path file for org.ietf/http_cookie (#2097)
StorageLoader: wrote JSON path file for com.amazon.aws.ec2/instance_identity_document (#2085)
Deduplication: added SQL queries to deduplicate without event fingerprint (#2110)
Deduplication: updated SQL queries to use event fingerprint (#2091)
Release 71 Stork-Billed Kingfisher (2015-10-02)
-----------------------------------------------
Enrich: added example event fingerprint enrichment configuration JSON (#1990)
EmrEtlRunner: bumped to 0.18.0
EmrEtlRunner: updated AMI version in config.yml.sample to 3.7.0 (#1959)
EmrEtlRunner: updated combine_configurations.rb to add ssl_mode: disable (#1996)
Scala Common Enrich: bumped to 0.16.0
Scala Common Enrich: added derived_tstamp enrichment (#1550)
Scala Common Enrich: added validation that v_collector is set (#1600)
Scala Common Enrich: added validation that collector_tstamp is set and valid (#1611)
Scala Common Enrich: added event_vendor/name/format/version to enriched event, thanks @danisola! (#1800)
Scala Common Enrich: ported JSON schema from Scala Hadoop Shred, thanks @danisola! (#1637)
Scala Common Enrich: bumped referer-parser to 0.3.0 (#1839)
Scala Common Enrich: changed etl_tstamp in EnrichmentManager from String to Joda DateTime (#1841)
Scala Common Enrich: added support for four new fields in CloudFront access logs (#1865)
Scala Common Enrich: bumped user-agent-utils to 1.16 (#1905)
Scala Common Enrich: changed BadRow class to use ProcessingMessages (#1936)
Scala Common Enrich: ensured that all timestamp fields are nonnegative (#1938)
Scala Common Enrich: started catching all exceptions in EtlPipeline (#1954)
Scala Common Enrich: added event_fingerprint enrichment (#1965)
Scala Common Enrich: bumped Iglu Scala Client to 0.3.0 (#1989)
Scala Common Enrich: renamed dvce_tstamp to dvce_created_tstamp (#1995)
Scala Common Enrich: started extracting true_tstamp from querystring (#1968)
Scala Hadoop Enrich: bumped to 1.1.0
Scala Hadoop Enrich: bumped Scala Common Enrich to 0.16.0 (#1807)
Scala Hadoop Enrich: updated tests to expect bad row JSONs with timestamps and processing messages (#1751)
Scala Hadoop Enrich: updated to use new EtlPipeline (#1931)
Scala Hadoop Enrich: bad rows for Thrift payloads now contain the original Thrift record (#1950)
Scala Hadoop Enrich: simplified validation projection code (#1986)
Scala Hadoop Shred: bumped to 0.5.0
Scala Hadoop Shred: updated tests to expect bad row JSONs with timestamps and processing messages (#1953)
Scala Hadoop Shred: added clojars.org as a resolver (#1952)
Scala Hadoop Shred: bumped Scala Common Enrich to 0.16.0 (#1935)
Scala Hadoop Shred: started using BadRow case class from Scala Common Enrich (#1914)
Scala Hadoop Shred: upgraded to Hadoop 2.4 (#1720)
Scala Hadoop Shred: bumped Iglu Scala Client to 0.3.0 (#1221)
Redshift: bumped atomic.events to 0.7.0
Redshift: added migration script for 0.6.0 to 0.7.0 (#1988)
Redshift: added migration script for 0.5.0 to 0.7.0 (#2058)
Redshift: added event_vendor/name/format/version to atomic.events (#1801)
Redshift: updated wd_access_log_1.sql with 4 new fields and renamed "x_edge_request_type" to "x_edge_request_id" (#1940)
Redshift: added event_fingerprint to atomic.events (#1971)
Redshift: added true_tstamp to atomic.events (#1984)
Redshift: renamed dvce_tstamp to dvce_created_tstamp (#1993)
Redshift: added comment containing table version to atomic.events (#2020)
Redshift: added migration script for wd_access_log_1.sql 1-0-3 to 1-0-4 (#2029)
Postgres: bumped atomic.events to 0.6.0
Postgres: added migration script for 0.5.0 to 0.6.0 (#1987)
Postgres: added event_vendor/name/format/version to atomic.events (#1802)
Postgres: added event_fingerprint to atomic.events (#1970)
Postgres: added true_tstamp to atomic.events (#1985)
Postgres: renamed dvce_tstamp to dvce_created_tstamp (#1994)
Postgres: added comment containing table version to atomic.events (#2021)
StorageLoader: bumped to 0.5.0
StorageLoader: exposed sslmode connection option for loading Postgres and Redshift, thanks @dennisatspaceape! (#1980)
StorageLoader: updated wd_access_log_1.json with 4 new fields (#1941)
Data Modeling: updated web-incremental so failure is recoverable (#1974)
Data Modeling: renamed dvce_tstamp to dvce_created_tstamp (#2024)
Release 70 Bornean Green Magpie (2015-08-19)
--------------------------------------------
Common: added Ruby script to generate unified config.yml and iglu-resolver.json from runner.yml and loader.yml (#1774)
Common: added postgres.yml to up.playbooks (#1767)
Common: added Vagrant push script to publish Ruby apps (#1784)
Enrich: moved enrichments folder out of EmrEtlRunner (#1574)
Enrich: changed campaign_attribution.json configuration to true (#1608)
EmrEtlRunner & StorageLoader: unified the config file format (#878)
EmrEtlRunner & StorageLoader: added support for compressing enriched events, thanks @danisola! (#1265)
EmrEtlRunner & StorageLoader: now supports environment variables in YML config files, thanks @epantera! (#1215)
EmrEtlRunner: bumped to 0.17.0
EmrEtlRunner: added retry logic for EMR bootstrap timeouts (#354)
EmrEtlRunner: added Snowplow event tracking (#678)
EmrEtlRunner: added tags for monitoring to config.yml (#1163)
EmrEtlRunner: improved hierarchy in config.yml (#1447)
EmrEtlRunner: added Snowplow tracking to config.yml (#1448)
EmrEtlRunner: moved Iglu resolver into dedicated CLI argument (#1542)
EmrEtlRunner: renamed archive step to archive_raw (#1543)
EmrEtlRunner: bumped Sluice to 0.2.2 (#1566)
EmrEtlRunner: removed use of symbols for properties in YAML configuration (#1572)
EmrEtlRunner: allowed nil for config.yml's bootstrap field (#1575)
EmrEtlRunner: simplified trailing slash code now that nils are supported (#1588)
EmrEtlRunner: pinned Contracts to 0.7 (#1590)
EmrEtlRunner: now fails job if odd number of lzo files in processing (#1728)
EmrEtlRunner: added an early check that shredded is empty (#1749)
EmrEtlRunner: allowed config to be passed in via stdin (#1772)
EmrEtlRunner: added Rake task to build app (#1786)
EmrEtlRunner: moved Logging module into new Monitoring module (#1797)
EmrEtlRunner: ensured that _SUCCESS file is written last for enriched events in S3 (#1808)
EmrEtlRunner: replaced m1.small with m1.medium in config.yml, thanks @danrama! (#1826)
EmrEtlRunner: recovered from 500 error while checking job status (#1828)
EmrEtlRunner: recovered from IOError while checking job status (#1881)
EmrEtlRunner: changed .ruby-version to "jruby" (#1888)
EmrEtlRunner: now only accepts an array of in buckets (#1910)
EmrEtlRunner: validated output_compression configuration using contract (#1820)
EmrEtlRunner: handled exception when the connection times out when checking the cluster, thanks @danisola! (#1599)
EmrEtlRunner: bumped Elasticity to 6.0.3 (#1939)
Deduplication: added timetracking and updated schema name (#1962)
StorageLoader: bumped to 0.4.0
StorageLoader: allowed config to be passed in via stdin (#1773)
StorageLoader: added ability to bundle as a JRuby fat jar (#675)
StorageLoader: started loading Postgres via stdin, thanks @mrwalker! (#624)
StorageLoader: added Snowplow event tracking (#679)
StorageLoader: updated to use EmrEtlRunner's expanded config.yml (#1191)
StorageLoader: pinned Contracts to 0.7 (#1497)
StorageLoader: moved "include Contracts" (#1499)
StorageLoader: renamed archive step to archive_enrich (#1544)
StorageLoader: bumped Sluice to 0.2.2 (#1567)
StorageLoader: removed use of symbols for properties in YAML configuration (#1573)
StorageLoader: added Rake task to build app (#1787)
StorageLoader: scrubbed credentials from stderr (#1918)
StorageLoader: added test suite (#1919)
StorageLoader: ensured that _SUCCESS file is written last for enriched events archived to S3 (#1814)
StorageLoader: started automatically converting "s3n" to "s3" in copy statements (#1937)
StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring.batch/emr_job_started (#1875)
StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring.batch/emr_job_succeeded (#1876)
StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring.batch/emr_job_failed (#1877)
StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring.batch/emr_job_status (#1878)
StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring.batch/jobflow_step_status (#1879)
StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring.batch/load_succeeded (#1884)
StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring.batch/load_failed (#1885)
StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring.batch/application_context (#1942)
Redshift: added Redshift DDL for com.snowplowanalytics.monitoring.batch/emr_job_started (#1870)
Redshift: added Redshift DDL for com.snowplowanalytics.monitoring.batch/emr_job_succeeded (#1871)
Redshift: added Redshift DDL for com.snowplowanalytics.monitoring.batch/emr_job_failed (#1872)
Redshift: added Redshift DDL for com.snowplowanalytics.monitoring.batch/emr_job_status (#1873)
Redshift: added Redshift DDL for com.snowplowanalytics.monitoring.batch/jobflow_step_status (#1874)
Redshift: added Redshift DDL for com.snowplowanalytics.monitoring.batch/load_succeeded (#1882)
Redshift: added Redshift DDL for com.snowplowanalytics.monitoring.batch/load_failed (#1883)
Redshift: added Redshift DDL for com.snowplowanalytics.monitoring.batch/application_context (#1943)
Release 69 Blue-Bellied Roller (2015-07-24)
-------------------------------------------
Incremental SQL Model: added the new incremental queries (#1857)
Incremental SQL Model: changed how query performance is tracked (#1855)
Incremental SQL Model: added new setup queries (#1853)
Incremental SQL Model: added migration queries (#1852)
Incremental SQL Model: updated the SQL runner playbook (#1851)
Incremental SQL Model: updated diagram (#1850)
Deduplication: added a step that deduplicates events (#1866)
Incremental SQL Model: replaced RANK with ROW_NUMBER (#1867)
Mobile SQL Model: added sessionization and DAU queries (#1891)
SQL Models: renamed full and incremental to allow for more models (#1892)
StorageLoader: wrote JSON path file for com.snowplowanalytics.snowplow/client_session (#1922)
Redshift: added Redshift DDL for com.snowplowanalytics.snowplow/client_session (#1921)
Release 68 Turquoise Jay (2015-07-23)
-------------------------------------
EmrEtlRunner: bumped to 0.16.0
EmrEtlRunner: bumped Elasticity to 6.0.2 (#1903)
EmrEtlRunner: named the processing bucket in its associated "is not empty" error (#1911)
EmrEtlRunner: made in bucket an array (#1750)
EmrEtlRunner: determined path to Hadoop enrich based on its version (#1789)
EmrEtlRunner: added unit test for add_trailing_slashes function (#1904)
Release 67 Bohemian Waxwing (2015-07-13)
----------------------------------------
Common: added NFS and CORE configuration to Vagrantfile to enhance performance (#1831)
Scala Stream Collector: bumped to 0.5.0
Scala Stream Collector: stdout bad sink now prints to stderr (#1799)
Scala Stream Collector: added splitter for large event arrays (#941)
Scala Stream Collector: increased maximum record size from 50kB to 1MB (#1753)
Scala Stream Collector: added tests for splitting large requests (#1683)
Scala Stream Collector: updated bad rows to include timestamp (#1681)
Scala Stream Collector: handled case where IP is not present (#1680)
Scala Stream Collector: did some reorganisation and refactoring of the project (#1678)
Scala Stream Collector: added json4s dependency (#1673)
Scala Stream Collector: added bad stream (#1502)
Scala Common Enrich: bumped to 0.15.0
Scala Common Enrich: fixed JavascriptScriptEnrichmentSpec test to pass openjdk7 (#1793)
Scala Common Enrich: bumped scala-maxmind-iplookups to 0.3.0 (#1771)
Scala Common Enrich: bumped Scala Forex to 0.3.0 (#1770)
Scala Common Enrich: updated bad rows to include timestamp (#1577)
Scala S3 Sink: removed project from repo (#1672)
Scala Kinesis Enrich: bumped to 0.6.0
Scala Kinesis Enrich: bumped to Scala Common Enrich 0.15.0 (#1685)
Scala Kinesis Enrich: tries to send 503 records (#1756)
Scala Kinesis Enrich: made back-off fields macros (#1745)
Scala Kinesis Enrich: increased maximum record size to 1MB (#1736)
Scala Kinesis Enrich: logging all bad rows (#1722)
Scala Kinesis Enrich: exception installing MaxMind file must terminate (#1711)
Scala Kinesis Enrich: sending Snowplow heartbeat (#1406)
Scala Kinesis Enrich: allowed records of over 1MB when running in local mode (#1663)
Scala Kinesis Enrich: fixed error when fetching MaxMind file from s3:// URI (#1645)
Scala Kinesis Enrich: sending a warning via Snowplow if no enrichment JSONs are retrieved from DynamoDB (#1621)
Scala Kinesis Enrich: sending failure to sink event to Kinesis to Snowplow (#1798)
Scala Kinesis Enrich: etl_tstamp should be Redshift Formatted not raw (#1842)
Kinesis Elasticsearch Sink: bumped to 0.4.0
Kinesis Elasticsearch Sink: removed Scala Common Enrich as an assembly dependency (#1819)
Kinesis Elasticsearch Sink: bumped to Scala Common Enrich 0.15.0 (#1811)
Kinesis Elasticsearch Sink: allowed use of AWS creds instead of DefaultAWSCredentialsProviderChain (#1803)
Kinesis Elasticsearch Sink: app no longer hangs without shutting down (#1743)
Kinesis Elasticsearch Sink: updated the Elasticsearch version (#1734)
Kinesis Elasticsearch Sink: sent event to Snowplow on heartbeat (#1706)
Kinesis Elasticsearch Sink: added Scala Tracker dependency (#1705)
Kinesis Elasticsearch Sink: sending event to Snowplow when unable to write to Elasticsearch (#1704)
Kinesis Elasticsearch Sink: sending event to Snowplow on shutdown (#1703)
Kinesis Elasticsearch Sink: sending event to Snowplow on initialization (#1702)
Kinesis Elasticsearch Sink: initialized bad stream eagerly rather than lazily (#1677)
Kinesis Elasticsearch Sink: updated amazon-kinesis-connectors to 1.1.2 (#1675)
Kinesis Elasticsearch Sink: specifying character encoding in SnowplowElasticsearchTransformer (#1654)
Kinesis Elasticsearch Sink: updated bad rows to include timestamp (#1578)
Kinesis Elasticsearch Sink: moved location fields into elasticsearch section (#1517)
Kinesis Elasticsearch Sink: corrected shredding example in comment (#1276)
Redshift: added Redshift DDL for com.snowplowanalytics.monitoring/application_warning (#1809)
Redshift: added Redshift DDL for com.snowplowanalytics.monitoring/heartbeat (#1764)
Redshift: added Redshift DDL for com.snowplowanalytics.monitoring/sink_write_failed (#1763)
Redshift: added Redshift DDL for com.snowplowanalytics.monitoring/application_initialized (#1762)
Redshift: added Redshift DDL for com.snowplowanalytics.monitoring/application_shutdown (#1761)
Redshift: added Redshift DDL for com.snowplowanalytics.monitoring/stream_write_failed (#1844)
Redshift: added Redshift DDL for com.snowplowanalytics.snowplow/web_page (#1835)
Redshift: added migration script for 0.3.0 to 0.6.0 (#1832)
Redshift: added migration script for 0.4.0 to 0.6.0 (#1833)
StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring/application_warning (#1810)
StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring/heartbeat (#1760)
StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring/sink_write_failed (#1759)
StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring/application_initialized (#1758)
StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring/application_shutdown (#1757)
StorageLoader: wrote JSON path file for com.snowplowanalytics.monitoring/stream_write_failed (#1843)
StorageLoader: wrote JSON path file for com.snowplowanalytics.snowplow/web_page (#1836)
Release 66 Oriental Skylark (2015-06-16)
----------------------------------------
Documentation: replaced Hive ETL references with Kinesis Enrich in Scala Hadoop Enrich's README (#1671)
Documentation: fixed links in Scala Common Enrich's README.md, thanks @bigsnarfdude! (#1669)
Scala Tracker: added git submodule (#1724)
Scala Hadoop Enrich: bumped to 1.0.0
Scala Hadoop Enrich: renamed build to snowplow-hadoop-enrich (#1718)
Scala Hadoop Enrich: updated dependencies to Hadoop 2.4 (#1716)
Scala Hadoop Enrich: bumped Scala Common Enrich to 0.14.0 (#1700)
Scala Hadoop Enrich: updated Core2015RefreshSpec to include JavascriptScriptEnrichment (#1746)
Scala Common Enrich: bumped to 0.14.0
Scala Common Enrich: added JavaScript scripting enrichment (#378)
Scala Common Enrich: made IpLookupsEnrichment error message more informative (#1426)
Scala Common Enrich: commons-codec dependency is no longer test-only (#1712)
Scala Common Enrich: bumped commons-lang3 to 3.4 (#1713)
Scala Common Enrich: made mkt_ and refr_ fields TSV safe, thanks @jasonbosco! (#1643)
Scala Common Enrich: updated JodaTime dependency to 2.2 (#1748)
Scala Common Enrich: now handles null message in stripInstanceEtc (#1622)
EmrEtlRunner: bumped to 0.15.0
EmrEtlRunner: now using new scala-hadoop-enrich jar path in Hosted Assets (#1719)
EmrEtlRunner: updated ami_version in config.yml to 3.6.0 (#1651)
EmrEtlRunner: added bootstrap action to prepare AMI 3.x for Snowplow (#1714)
EmrEtlRunner: now setting buffer for processing thrift in core-site.xml (#1715)
EmrEtlRunner: added S3DistCp step for thrift files in processing (#1647)
EmrEtlRunner: added example javascript_script_config to enrichments folder (#1755)
StorageLoader: wrote JSON Path file for com.mparticle.snowplow/app_event (#1688)
StorageLoader: wrote JSON Path file for com.mparticle.snowplow/social_event (#1690)
StorageLoader: wrote JSON Path file for com.mparticle.snowplow/transaction_event (#1692)
StorageLoader: wrote JSON Path file for a com.mparticle.snowplow/session_context (#1694)
Redshift: added Redshift DDL for a com.mparticle.snowplow/app_event (#1686)
Redshift: added Redshift DDL for a com.mparticle.snowplow/social_event (#1689)
Redshift: added Redshift DDL for a com.mparticle.snowplow/transaction_event (#1691)
Redshift: added Redshift DDL for a com.mparticle.snowplow/session_context (#1693)
Data Modeling: removed restrictions in sessions and visitors-source (#1725)
Release 65 Scarlet Rosefinch (2015-05-08)
-----------------------------------------
Scala Stream Collector: bumped to 0.4.0
Scala Stream Collector: bumped Scalazon to 0.11 (#1504)
Scala Stream Collector: added support for PutRecords API (#1227)
Scala Stream Collector: added CORS support (#1165)
Scala Stream Collector: added CORS-style support for ActionScript3 Tracker (#1331)
Scala Stream Collector: added ability to disable third-party cookies (#1363)
Scala Stream Collector: removed automatic creation of stream (#1464)
Scala Stream Collector: added macros to config.hocon.sample (#1471)
Scala Stream Collector: logged the name of the stream to which records are written (#1503)
Scala Stream Collector: added shutdown hook to send stored events (#1535)
Scala Stream Collector: added configurable exponential backoff with jitter (#1592)
Scala Kinesis Enrich: bumped to 0.5.0
Scala Kinesis Enrich: bumped Scala Common Enrich to 0.13.1 (#1618)
Scala Kinesis Enrich: bumped Scalazon to 0.11 (#1492)
Scala Kinesis Enrich: bumped Kinesis Client Library to 1.2.1 (#1580)
Scala Kinesis Enrich: added ability to retrieve resolver and enrichments from DynamoDB (#1289)
Scala Kinesis Enrich: added support for PutRecords API (#1418)
Scala Kinesis Enrich: removed automatic creation of streams (#1465)
Scala Kinesis Enrich: fixed checkpointing (#1467)
Scala Kinesis Enrich: logged the name of the stream to which records are written (#1493)
Scala Kinesis Enrich: added macros to config.hocon.sample (#1513)
Scala Kinesis Enrich: moved Iglu resolver to dedicated CLI argument (#1534)
Scala Kinesis Enrich: updated README examples with new configuration (#1549)
Scala Kinesis Enrich: stopped retrying in the case of a ShutdownException or InvalidStateException (#1552)
Scala Kinesis Enrich: stopped ignoring region setting for DynamoDB table (#1576)
Scala Kinesis Enrich: updated test suite to accommodate changes (#1581)
Scala Kinesis Enrich: added Clojars as a resolver (#1586)
Scala Kinesis Enrich: added configurable exponential backoff with jitter (#1591)
Scala Kinesis Enrich: randomize partition keys for bad events (#1631)
Scala Kinesis Enrich: stopped sending records of over 50kB (#1649)
Kinesis Elasticsearch Sink: bumped to 0.3.0
Kinesis Elasticsearch Sink: made DynamoDB region configurable (#1583)
Kinesis Elasticsearch Sink: added macros to config.hocon.sample (#1515)
Kinesis Elasticsearch Sink: changed "connector" to "sink" in config (#1474)
Kinesis Elasticsearch Sink: stopped failing silently for inputs with fewer than 24 tab-separated fields (#1584)
Kinesis Elasticsearch Sink: stopped analyzing text fields by default (#1624)
Kinesis Elasticsearch Sink: removed automatic creation of bad stream (#1626)
Kinesis Elasticsearch Sink: randomized partition keys for failed records (#1633)
Kinesis LZO S3 Sink: bumped to 0.2.0
Kinesis LZO S3 Sink: removed automatic creation of stream (#1529)
Kinesis LZO S3 Sink: changed "connector" to "sink" in config (#1473)
Kinesis LZO S3 Sink: made DynamoDB region configurable (#1582)
Kinesis LZO S3 Sink: added macros to config.hocon.sample (#1472)
Kinesis LZO S3 Sink: changed the configuration to use the S3 region instead of the full endpoint URI (#1327)
Release 64 Palila (2015-04-16)
------------------------------
Common: added top-level data modeling folder (#1523)
Common: updated root README to include data modeling (#1612)
ActionScript 3.0 Tracker: added git submodule (#1546)
EmrEtlRunner: bumped to 0.14.0
EmrEtlRunner: bumped Elasticity to 4.0.5 (#758)
EmrEtlRunner: added support for specifying EMR service role (#1595)
EmrEtlRunner: added support for specifying EMR jobflow role (#1232)
Scala Common Enrich: bumped to 0.13.1
Scala Common Enrich: prevented UaParserEnrichment from creating a new Parser on every event (#1616)
Scala Hadoop Enrich: bumped to 0.14.1
Scala Hadoop Enrich: bumped Scala Common Enrich to 0.13.1 (#1617)
Redshift: bumped atomic.events to 0.6.0
Redshift: added migration script for 0.5.0 to 0.6.0 (#1606)
Redshift: increased mkt_clickid to varchar(128) (#1605)
Redshift: removed legacy cubes (#1613)
Postgres: bumped atomic.events to 0.5.0
Postgres: added migration script for 0.4.0 to 0.5.0 (#1604)
Postgres: increased mkt_clickid to varchar(128) (#1603)
Postgres: removed legacy cubes (#1614)
Postgres: added user_id field to migration script for 0.3.0 to 0.4.0 (#1620)
Data Modeling: updated reference data.iso_country_codes so DISTSTYLE is ALL (#1393)
SQL Runner: added basic sessions / visits / page views model that can be pivoted on directly from any BI tool (#1273)
Looker: simplified LookML model and made it consistent with Redshift data models (#1522)
Release 63 Red-Cheeked Cordon-Bleu (2015-04-02)
-----------------------------------------------
Common: updated kinesis push to remove sub-folders from zipfile (#1378)
EmrEtlRunner: added example configuration JSONs for new enrichments (#1545)
Scala Common Enrich: bumped to 0.13.0
Scala Common Enrich: bumped referer-parser to 0.2.3 (#670)
Scala Common Enrich: converted transactions from given currency to base currency (#370)
Scala Common Enrich: bumped CampaignAttributionEnrichment version to 0.2.0 (#1338)
Scala Common Enrich: added mkt_clickid and mkt_network fields to POJO (#1073)
Scala Common Enrich: added derived_contexts field to POJO (#787)
Scala Common Enrich: added geo_timezone field to POJO (#787)
Scala Common Enrich: added etl_tags field to POJO (#1247)
Scala Common Enrich: added currency fields to POJO (#1316)
Scala Common Enrich: changed enrichment configuration to use SchemaCriterion rather than SchemaKey (#1353)
Scala Common Enrich: extracted original IP address from CollectorPayload headers (#1372)
Scala Common Enrich: extracted dvce_sent_tstamp from stm field (#1383)
Scala Common Enrich: added dvce_sent_tstamp to POJO (#1384)
Scala Common Enrich: added refr_domain_userid and refr_dvce_sent_tstamp to POJO (#1449)
Scala Common Enrich: added domain_sessionid field to POJO (#1538)
Scala Common Enrich: added derived_tstamp field to POJO (#1557)
Scala Common Enrich: populated refr_ fields based on page_url querystring (#1461)
Scala Common Enrich: populated domain_sessionid field based on "sid" parameter (#1541)
Scala Common Enrich: parsed the page URI in the EnrichmentManager (#1463)
Scala Common Enrich: added ua-parser enrichment (#62)
Scala Common Enrich: added ability to disable user-agent-utils enrichment (#792)
Scala Common Enrich: used Netaporter to parse querystrings if httpclient fails, thanks @danisola! (#1429)
Scala Hadoop Enrich: bumped to 0.14.0
Scala Hadoop Enrich: bumped Scala Common Enrich to 0.13.0 (#1340)
Scala Hadoop Enrich: added integration tests for currency conversion enrichment (#1430)
Scala Hadoop Enrich: added tests for other new EnrichedEvent fields (#1337)
Scala Hadoop Shred: bumped to 0.4.0
Scala Hadoop Shred: bumped Scala Common Enrich to 0.13.0 (#1343)
Scala Hadoop Shred: bumped json4sJackson to 3.2.11 (#1344)
Scala Hadoop Shred: extracted JSONs from derived_contexts field (#786)
Scala Hadoop Shred: updated to reflect new enriched event format (#1332)
Scala Kinesis Enrich: bumped to 0.4.0
Scala Kinesis Enrich: bumped Scala Common Enrich to 0.13.0 (#1369)
Scala Kinesis Enrich: emitted updated EnrichedEvent (#1368)
Scala Kinesis Enrich: unified logger configuration, thanks @kazjote! (#1367)
Redshift: bumped atomic.events to 0.5.0
Redshift: added migration script for 0.4.0 to 0.5.0 (#1335)
Redshift: added refr_domain_userid and refr_dvce_tstamp to atomic.events (#1450)
Redshift: added dvce_sent_tstamp column (#1385)
Redshift: added foreign key constraint to all Redshift shredded tables (#1365)
Redshift: changed JSON field encodings to lzo (#1350)
Redshift: added etl_tags column (#1245)
Redshift: added column for mkt_clickid and mkt_network (#1093)
Redshift: widened domain_userid column to hold UUID (#1090)
Redshift: added Redshift DDL for ua_parser_context (#789)
Redshift: added new derived_contexts field (#784)
Redshift: updated ip_address to support IPv6 addresses (#656)
Redshift: added new currency fields (#366)
Redshift: added domain_sessionid column (#1539)
Redshift: widened structured event, URL, and referer fields (#1553)
Redshift: added derived_tstamp column (#1558)
Postgres: bumped atomic.events to 0.4.0
Postgres: added migration script for 0.3.0 to 0.4.0 (#1347)
Postgres: added refr_domain_userid and refr_dvce_tstamp to atomic.events (#1451)
Postgres: added dvce_sent_tstamp column (#1386)
Postgres: added column for geo_timezone (#1336)
Postgres: added etl_tags column (#1246)
Postgres: removed primary key constraint on event_id (#1187)
Postgres: added column for mkt_clickid and mkt_network (#1092)
Postgres: widened domain_userid column to hold UUID (#1091)
Postgres: added new derived_contexts field (#785)
Postgres: updated ip_address to support IPv6 addresses (#655)
Postgres: added new currency fields (#365)
Postgres: added domain_sessionid column (#1540)
Postgres: widened structured event, URL, and referer fields (#1554)
Postgres: added derived_tstamp column (#1559)
StorageLoader: wrote JSON Path file for ua_parser_context (#790)
Kinesis Elasticsearch Sink: bumped to 0.2.0
Kinesis Elasticsearch Sink: added new EnrichedEvent fields (#1345)
Kinesis Elasticsearch Sink: stopped verifying number of fields in enriched event (#1333)
Kinesis Elasticsearch Sink: changed organization to com.snowplowanalytics in BuildSettings (#1279)
Kinesis Elasticsearch Sink: renamed application.conf.example to config.hocon.sample (#1244)
Release 62 Tropical Parula (2015-03-17)
---------------------------------------
Common: updated `vagrant up` to work with latest Peru version (#1475)
Ruby Tracker: bumped git submodule to 0.4.1 (#1488)
Python Tracker: bumped git submodule to 0.6.0 (#1487)
PHP Tracker: bumped git submodule to 0.2.1 (#1486)
JavaScript Tracker: bumped git submodule to 2.3.0 (#1485)
Java Tracker: bumped git submodule to 0.7.0 (#1484)
Objective-C Tracker: renamed from iOS Tracker and bumped git submodule to 0.3.2 (#1483)
EmrEtlRunner: bumped to 0.13.0
EmrEtlRunner: fixed copy to staging for Tomcat7 logs with hyphen after .txt (#1480)
EmrEtlRunner: added missing :archive: in BucketHash (#1475)
EmrEtlRunner: added support for custom bootstrap actions, thanks @danisola! (#1405)
EmrEtlRunner: removed time_diff as a dependency (#1352)
EmrEtlRunner: fixed breaking get_assets spec (#1287)
EmrEtlRunner: now tolerating more exception types in EmrJob's wait_for (#358)
EmrEtlRunner: bumped Contracts to 0.7 (#1498)
EmrEtlRunner: moved `include Contracts` into classes and modules (#1438)
Release 61 Pygmy Parrot (2015-03-02)
------------------------------------
Common: bumped VERSION file to r61-pygmy-parrot
Common: added Gradle to up.playbooks (#1270)
Common: added .travis.yml file and Travis button to repo (#1359)
Common: added Release button to README (#1428)
Common: added License button to README (#1427)
Clojure Collector: bumped to 1.0.0
Clojure Collector: updated access-valve to depend on Tomcat 8 classes (#1203)
Clojure Collector: updated .ebextensions to depend on Tomcat 8 (#1202)
Clojure Collector: added ability to disable third-party cookies (#1362)
Clojure Collector: added CORS support (#1146)
Clojure Collector: added CORS-style support for ActionScript3 Tracker (#1330)
Clojure Collector: added support for /:vendor/:version to HEAD (#1166)
Clojure Collector: now using UTF-8 for character encoding throughout (#1354)
Scala Common Enrich: bumped to 0.12.0
Scala Common Enrich: updated SnowplowAdapter to accept "charset=UTF-8" (#1424)
Scala Common Enrich: Base64 decoding does not specify UTF-8 charset (#1403)
Scala Common Enrich: removed incorrect extra layer of URL decoding from non-Base64-encoded JSONs (#1396)
Scala Common Enrich: added support for ti_nm for transaction item name as well as ti_na (#1401)
Scala Common Enrich: added CloudfrontAccessLogAdapter (#1282)
Scala Common Enrich: made timestamp field of CollectorPayload an Option (#1417)
Scala Hadoop Enrich: bumped to 0.13.0
Scala Hadoop Enrich: bumped Scala Common Enrich to 0.12.0 (#1395)
Scala Hadoop Enrich: added test for non-Base64-encoded JSON (#1394)
Scala Hadoop Enrich: updated tests to include Unicode (#1390)
Scala Hadoop Enrich: added integration test for CloudfrontAccessLogAdapter (#1423)
Scala Hadoop Bad Rows: removed .travis.yml (#1382)
EmrEtlRunner: bumped to 0.12.0
EmrEtlRunner: now appending region name to Clojure Collector log files (#1379)
EmrEtlRunner: added support for moving and archiving timestamped Clojure Collector log files (#1400)
EmrEtlRunner: now appending rather than prepending instance names to Clojure Collector log files (#1404)
EmrEtlRunner: changed Clojure Collector log timestamp format to match CloudFront logs (#1398)
EmrEtlRunner: added dedicated return code for no files to process (#1397)
EmrEtlRunner: now allowing tsv/*/* and json/*/* as :etl:collector_format (#1284)
EmrEtlRunner: now performing S3DistCp from processing for tsv/com.amazon.aws.cloudfront/* (#1431)
EmrEtlRunner: added output directory empty check prior to staging step (#1151)
StorageLoader: updated shell script to only run StorageLoader if EmrEtlRunner found files (#1399)
StorageLoader: wrote JSON Path file for a com.snowplowanalytics.snowplow/flash_context (#1305)
StorageLoader: wrote JSON Path file for a com.snowplowanalytics.snowplow/timing event (#1388)
StorageLoader: wrote JSON Path file for a com.amazon.aws.cloudfront/wd_access_log event (#1285)
StorageLoader: wrote JSON Path file for a com.google.analytics/cookies context (#1409)
StorageLoader: wrote JSON Path file for a com.snowplowanalytics.snowplow/desktop_context (#1421)
Redshift: added Redshift DDL for a com.snowplowanalytics.snowplow/timing event (#1387)
Redshift: added Redshift DDL for a com.snowplowanalytics.snowplow/flash_context (#1304)
Redshift: added Redshift DDL for a com.amazon.aws.cloudfront/wd_access_log event (#1286)
Redshift: added Redshift DDL for a com.google.analytics/cookies context (#1408)
Redshift: added Redshift DDL for a com.snowplowanalytics.snowplow/desktop_context (#1420)
Release 60 Bee Hummingbird (2015-02-03)
---------------------------------------
Common: added VERSION file in root to assist vagrant push (#1293)
Common: added vagrant push scripting to publish Kinesis apps (#1288)
Common: added lzo.yml to up.playbooks (#1325)
Thrift Raw Event: bumped Thrift version to 0.9.1 (#1225)
Thrift Raw Event: added collector-payload-1 and schema-sniffer-1 (#1322)
Thrift Raw Event: created a subproject for each Thrift class (#1298)
Thrift Raw Event: updated README and project description to reflect new structure (#1300)
Thrift Raw Event: renamed to thrift-schemas (#1299)
Scala Stream Collector: bumped to 0.3.0
Scala Stream Collector: started sending CollectorPayloads instead of SnowplowRawEvents (#1226)
Scala Stream Collector: added support for POST requests (#187)
Scala Stream Collector: added support for any {api-vendor}/{api-version} for GET and POST (#652)
Scala Stream Collector: stopped decoding URLs (#1217)
Scala Stream Collector: changed 1x1 pixel response to use a stable GIF (#1260)
Scala Stream Collector: renamed default.conf to config.hocon.sample (#1243)
Scala Stream Collector: started using ThreadLocal to handle Thrift serialization, thanks @denismo and @pkallos! (#1254)
Scala Stream Collector: added healthcheck for load balancers, thanks @duncan! (#1360)
EmrEtlRunner: bumped to 0.11.0
EmrEtlRunner: added "thrift" collector format (#1301)
EmrEtlRunner: implemented time_diff manually (#1310)
EmrEtlRunner: fixed failure reporting when jobflow step(s) created_at is nil (#1351)
Scala Common Enrich: bumped to 0.11.0
Scala Common Enrich: added schema-sniffer-1 and collector-payload-1 dependencies (#1296)
Scala Common Enrich: bumped user-agent-utils version to 1.14 (#1224)
Scala Common Enrich: changed EnrichedEvent field name to ip_organization (#1145)
Scala Common Enrich: changed "thrift" to "thrift-raw" in Loader object (#1302)
Scala Common Enrich: added tests for getLoader function (#558)
Scala Hadoop Enrich: bumped to 0.12.0
Scala Hadoop Enrich: bumped Scala Common Enrich to 0.11.0 (#1294)
Scala Hadoop Enrich: added collector-payload-1 and snowplow-thrift-raw-event as test dependencies (#1248)
Scala Hadoop Enrich: added support for processing Thrift raw events, thanks @pkallos! (#538)
Scala Hadoop Enrich: added tests to Hadoop Enrich for processing Thrift raw events (#559)
Scala Kinesis Enrich: bumped to 0.3.0
Scala Kinesis Enrich: bumped Scala Common Enrich to 0.11.0 (#1295)
Scala Kinesis Enrich: renamed default.conf to config.hocon.sample (#1242)
Kinesis Elasticsearch Sink: added LICENSE-2.0.txt (#1329)
Kinesis LZO S3 Sink: added. Version 0.1.0, thanks @pkallos! (#1016)
Version 0.9.14 (2014-12-31)
---------------------------
Common: added dedicated Vagrant setup (#1266)
Common: added Quickstart section to README (#1268)
Common: added script to sync region-specific Snowplow Hosted Assets buckets (#1269)
CloudFront Collector: replaced 1x1 pixel with stable GIF (#1259)
Clojure Collector: bumped to 0.9.1
Clojure Collector: increased Tomcat's HTTP header tolerance to 64kB (#1249)
Clojure Collector: changed 1x1 pixel response to use a stable GIF (#1258)
EmrEtlRunner: bumped to 0.10.0
EmrEtlRunner: removed hyphen from the pattern match for Clojure Collector logs (#1194)
EmrEtlRunner: on job failure, log overall jobflow and individual step statuses (#1153)
Scala Common Enrich: bumped to 0.10.0
Scala Common Enrich: bumped Scala Iglu Client to 0.2.0 (#1222)
Scala Common Enrich: updated SnowplowAdapter to accept payload_data versions above 1-0-0 (#1220)
Scala Common Enrich: updated SnowplowAdapter to make charset=utf-8 optional (#1257)
Scala Common Enrich: added Adapter to pre-process Pingdom events (#1164)
Scala Common Enrich: added Adapter to pre-process PagerDuty events (#1158)
Scala Common Enrich: added Adapter to pre-process Mandrill events (#1061)
Scala Hadoop Enrich: bumped to 0.11.0
Scala Hadoop Enrich: bumped Scala Common Enrich to 0.10.0 (#1223)
Scala Hadoop Enrich: added test job for PingdomAdapter (#1176)
Scala Hadoop Enrich: added test job for PagerdutyAdapter (#1175)
Scala Hadoop Enrich: added test job for MandrillAdapter (#1171)
Scala Hadoop Enrich: added test job for more relaxed payload_data schema matching (#1235)
Scala Hadoop Shred: bumped to 0.3.0
Scala Hadoop Shred: bumped Scala Common Enrich to 0.10.0 (#1236)
Scala Hadoop Shred: bumped Iglu Scala Client to 0.2.0 (#1230)
Scala Hadoop Shred: loosened match criteria for unstructured events and contexts (#1231)
StorageLoader: wrote JSON Path file for com.pingdom/incident_notify_of_close event (#1182)
StorageLoader: wrote JSON Path file for com.pingdom/incident_assign event (#1181)
StorageLoader: wrote JSON Path file for com.pingdom/incident_notify_user event (#1251)
StorageLoader: wrote JSON Path file for com.pagerduty/incident event (#1177)
StorageLoader: wrote JSON Path file for com.mandrill/message_sent event (#1059)
StorageLoader: wrote JSON Path file for com.mandrill/message_bounced event (#1058)
StorageLoader: wrote JSON Path file for com.mandrill/message_opened event (#1057)
StorageLoader: wrote JSON Path file for com.mandrill/message_marked_as_spam event (#1056)
StorageLoader: wrote JSON Path file for com.mandrill/message_delayed event (#1055)
StorageLoader: wrote JSON Path file for com.mandrill/message_soft_bounced event (#1054)
StorageLoader: wrote JSON Path file for com.mandrill/message_clicked event (#1053)
StorageLoader: wrote JSON Path file for com.mandrill/message_rejected event (#1052)
StorageLoader: wrote JSON Path file for com.mandrill/recipient_unsubscribed event (#1051)
Redshift: added Redshift DDL for a com.pingdom/incident_notify_of_close event (#1180)
Redshift: added Redshift DDL for a com.pingdom/incident_assign event (#1179)
Redshift: added Redshift DDL for a com.pingdom/incident_notify_user event (#1252)
Redshift: added Redshift DDL for a com.pagerduty/incident event (#1178)
Redshift: added Redshift DDL for a com.mandrill/message_sent event (#1050)
Redshift: added Redshift DDL for a com.mandrill/message_bounced event (#1049)
Redshift: added Redshift DDL for a com.mandrill/message_opened event (#1048)
Redshift: added Redshift DDL for a com.mandrill/message_marked_as_spam event (#1047)
Redshift: added Redshift DDL for a com.mandrill/message_delayed event (#1046)
Redshift: added Redshift DDL for a com.mandrill/message_soft_bounced event (#1045)
Redshift: added Redshift DDL for a com.mandrill/message_clicked event (#1044)
Redshift: added Redshift DDL for a com.mandrill/message_rejected event (#1043)
Redshift: added Redshift DDL for a com.mandrill/recipient_unsubscribed event (#1042)
Redshift: removed trailing commas from com.mailchimp SQL table definitions (#1174)
Version 0.9.13 (2014-12-01)
---------------------------
Scala Common Enrich: bumped to 0.9.1
Scala Common Enrich: added error handling for NetAPorter URI parsing (#1216)
Scala Kinesis Enrich: bumped to 0.2.1
Scala Kinesis Enrich: bumped Scala Common Enrich to 0.9.1
Scala Kinesis Enrich: fixed conflict with Specs2 version, thanks @knservis! (#1213)
Scala Hadoop Enrich: bumped to 0.10.1
Scala Hadoop Enrich: bumped Scala Common Enrich to 0.9.1
Common: deleted test-file in repository root (#1219)
Version 0.9.12 (2014-11-26)
---------------------------
Scala Stream Collector: bumped to 0.2.0
Scala Stream Collector: changed organization to "com.snowplowanalytics" (#1168)
Scala Stream Collector: made the --config option mandatory (#1128)
Scala Stream Collector: added ability to set AWS credentials from environment variables (#1116)
Scala Stream Collector: now enforcing Java 7 for compilation (#1068)
Scala Stream Collector: increased request character limit to 32768 (#987)
Scala Stream Collector: improved performance by using Future, thanks @pkallos! (#580)
Scala Stream Collector, Scala Kinesis Enrich: made endpoint configurable, thanks @sambo1972! (#978)
Scala Stream Collector, Scala Kinesis Enrich: added support for IAM roles, thanks @pkallos! (#534)
Scala Stream Collector, Scala Kinesis Enrich: replaced stream list with describe to tighten permissions, thanks @pkallos! (#535)
Scala Kinesis Enrich: bumped to 0.2.0
Scala Kinesis Enrich: bumped Scala Common Enrich to 0.9.0
Scala Kinesis Enrich: changed organization to "com.snowplowanalytics" (#1167)
Scala Kinesis Enrich: made the --config option mandatory (#1126)
Scala Kinesis Enrich: updated instructions in README (#1125)
Scala Kinesis Enrich: added ability to set AWS credentials from environment variables (#1117)
Scala Kinesis Enrich: now enforcing Java 7 for compilation (#1067)
Scala Kinesis Enrich: replaced printlns with Java Logger (#521)
Scala Kinesis Enrich: started sending bad records to a separate stream (#463)
Scala Kinesis Enrich: added page_url and page_referrer back into enrichment output (#686)
Scala Kinesis Enrich: stopped opening a new file for each enriched event, thanks @pkallos! (#714)
Scala Common Enrich: bumped to 0.9.0
Scala Common Enrich: added BadRow from Scala Hadoop Enrich (#1118)
Scala Common Enrich: added ability to override collector-set nuid with tracker-set tnuid (#1095)
Scala Common Enrich: made URI parsing more permissive using NetAPorter's URI library, thanks @rupeshmane! (#1172)
Scala Hadoop Enrich: bumped to 0.10.0
Scala Hadoop Enrich: bumped Scala Common Enrich to 0.9.0
Scala Hadoop Enrich: moved BadRow into Scala Common Enrich (#1119)
Scala Hadoop Enrich: updated README with new Snowplow capitalization (#1127)
Kinesis Elasticsearch Sink: added. Version 0.1.0
Version 0.9.11 (2014-11-10)
---------------------------
Clojure Collector: bumped to 0.9.0
Clojure Collector: added support for /:vendor/:version to GET (#1131)
Scala Common Enrich: bumped to 0.8.0
Scala Common Enrich: bumped json4s to 3.2.11 (#1141)
Scala Common Enrich: bumped Scala Iglu Client to 0.1.1 (#1140)
Scala Common Enrich: removed check that POST request has body and content-type (#1132)
Scala Common Enrich: moved payload API detection into CollectorApi.parse (#1113)
Scala Common Enrich: fixed bug in CljTomcatLoader expecting request body to be "_" instead of "-" (#1112)
Scala Common Enrich: added Adapter to pre-process CallRail events (#1108)
Scala Common Enrich: added Adapter to pre-process MailChimp events (#1086)
Scala Common Enrich: added Adapter to pre-process Iglu-compatible events (#1060)
Scala Hadoop Enrich: bumped to 0.9.0
Scala Hadoop Enrich: added job test for unrecognized api name/version (#1115)
Scala Hadoop Enrich: updated DiscardableCfLinesSpec given /not-ice.png is no longer discarded (#1114)
Scala Hadoop Enrich: added test job for MailchimpAdapter (#1159)
Scala Hadoop Enrich: added test job for CallrailAdapter (#1160)
Redshift: removed not null constraint on change_form's value column (#1162)
Redshift: added Redshift DDL for a com.callrail/call_complete event (#1110)
Redshift: added Redshift DDL for a com.mailchimp/campaign_sending_status event (#1085)
Redshift: added Redshift DDL for a com.mailchimp/cleaned_email event (#1084)
Redshift: added Redshift DDL for a com.mailchimp/email_address_change event (#1083)
Redshift: added Redshift DDL for a com.mailchimp/profile_update event (#1082)
Redshift: added Redshift DDL for a com.mailchimp/unsubscribe event (#1081)
Redshift: added Redshift DDL for a com.mailchimp/subscribe event (#1080)
StorageLoader: wrote JSON Path file for com.callrail/call_complete event (#1109)
StorageLoader: wrote JSON Path file for com.mailchimp/campaign_sending_status event (#1079)
StorageLoader: wrote JSON Path file for com.mailchimp/cleaned_email event (#1078)
StorageLoader: wrote JSON Path file for com.mailchimp/email_address_change event (#1077)
StorageLoader: wrote JSON Path file for com.mailchimp/profile_update event (#1076)
StorageLoader: wrote JSON Path file for com.mailchimp/unsubscribe event (#1075)
StorageLoader: wrote JSON Path file for com.mailchimp/subscribe event (#1074)
Version 0.9.10 (2014-11-06)
---------------------------
StorageLoader: wrote JSON Path file for PerformanceTiming (#1147)
StorageLoader: wrote JSON Path file for social_interaction (#1029)
StorageLoader: wrote JSON Path file for site_search (#1027)
StorageLoader: wrote JSON Path file for change_form (#1025)
StorageLoader: wrote JSON Path file for submit_form (#1023)
StorageLoader: wrote JSON Path file for remove_from_cart (#1021)
StorageLoader: wrote JSON Path file for add_to_cart (#1019)
Redshift: converted all Redshift DDLs to use tabs (#1034)
Redshift: added Redshift DDL for PerformanceTiming (#1032)
Redshift: added Redshift DDL for social_interaction (#1030)
Redshift: added Redshift DDL for site_search (#1028)
Redshift: added Redshift DDL for change_form (#1026)
Redshift: added Redshift DDL for submit_form (#1024)
Redshift: added Redshift DDL for remove_from_cart (#1022)
Redshift: added Redshift DDL for add_to_cart (#1020)
Version 0.9.9 (2014-10-27)
--------------------------
.NET Tracker: added git submodule. Version 0.1.0 (#1000)
PHP Tracker: added git submodule. Version 0.1.0 (#1013)
Clojure Collector: bumped to 0.8.0
Clojure Collector: fixed regression in log record format caused by #854 (#992)
Clojure Collector: correctly handles multiple IPs in X-Forwarded-For (#970)
StorageLoader: bumped to 0.3.3
StorageLoader: selecting Snowplow's hosted-assets bucket based on region (#1012)
EmrEtlRunner: bumped to 0.9.2
EmrEtlRunner: no rows to process now returns 0, not 1 (#1018)
EmrEtlRunner: fixed bug where --process-enrich doesn't work, thanks @kingo55! (#1089)
EmrEtlRunner: now checking that output directories are empty before running (#1124)
Scala Common Enrich: bumped to 0.7.0
Scala Common Enrich: bumped scala-maxmind-iplookups to 0.2.0 (#1002)
Scala Common Enrich: added support for non-GA campaign attribution: phase 1 (#402)
Scala Common Enrich: rewrote AttributionEnrichments tests as RefererParserEnrichment tests (#974)
Scala Common Enrich: allowed but downcased a-f characters in incoming event_id (#1006)
Scala Common Enrich: extracted useragent from ua parameter (#1011)
Scala Common Enrich: fixed issue where unset integer fields throw an NPE (#570)
Scala Common Enrich: fixed issue where unset double fields throw an NPE (#1062)
Scala Common Enrich: added tests for ConversionUtils.stringToJInteger (#1064)
Scala Common Enrich: now enforcing Java 7 for compilation (#1065)
Scala Hadoop Enrich: bumped to 0.8.0
Scala Hadoop Enrich: bumped Scala Common Enrich to 0.7.0 (#995)
Scala Hadoop Enrich: added test for empty integer and double fields to ensure no NPE thrown (#1063)
Scala Hadoop Enrich: now enforcing Java 7 for compilation (#1066)
Scala Hadoop Enrich: updated test jobs to reflect updated useragent parsing (#1070)
Version 0.9.8 (2014-09-18)
--------------------------
iOS Tracker: added git submodule. Version 0.1.1 (#982)
Android Tracker: added git submodule. Version 0.1.1 (#983)
Clojure Collector: bumped to 0.7.0
Clojure Collector: merged snowplow/tomcat-cf-access-log-valve into Snowplow as clojure-collector/access-valve (#898)
Clojure Collector: bumped access-valve to 0.1.0
Clojure Collector: changed access-valve's package path to com.snowplowanalytics.snowplow.collectors.clojure.accessvalve (#924)
Clojure Collector: changed access-valve to use Gradle (#899)
Clojure Collector: changed access-valve to publish to war-resources/.ebextensions (#900)
Clojure Collector: updated access-valve and added web.xml to log request body and content type (#901)
Clojure Collector: fixed empty querystring in access-valve (#938)
Clojure Collector: fixed IP address forwarding for VPC-based environments (#854)
Clojure Collector: added support for API vendor and version in routing (#925)
Clojure Collector: added support for POST as well as GET (#654)
Scala Stream Collector: fixed broken link to `thrift-raw-event`, thanks @bamos! (#955)
Scala Common Enrich: bumped to 0.6.0
Scala Common Enrich: split out Clojure and CloudFront Collector event processing (#943)
Scala Common Enrich: added CljTomcatLoaderSpec tests (#963)
Scala Common Enrich: filtering non-GETs from CloudfrontLoader (#944)
Scala Common Enrich: replaced all Argonaut code with json4s (#945)
Scala Common Enrich: renamed CanonicalOutput to EnrichedEvent (#964)
Scala Common Enrich: replaced CanonicalInput and TrackerPayload with CollectorPayload and RawEvent (#946)
Scala Common Enrich: updated EnrichmentManager to process RawEvent not CanonicalInput (#903)
Scala Common Enrich: added Snowplow Tp2 Adapter to convert event JSON to NEL of RawEvents (#904)
Scala Common Enrich: geo-IP lookup now supports ip parameter on querystring (#961)
Scala Common Enrich: IP address anonymization now works with ip parameter on querystring (#960)
Scala Hadoop Enrich: bumped to 0.7.0
Scala Hadoop Enrich: bumped to Scala Common Enrich 0.6.0 (#940)
Scala Hadoop Enrich: updated to support generating multiple enriched events from one raw payload (#902)
StorageLoader: wrote JSON Path file for mobile_context (#776)
StorageLoader: wrote JSON Path file for geolocation_context (#962)
Redshift: added Redshift DDL for mobile_context (#542)
Redshift: added Redshift DDL for geolocation_context (#950)
Version 0.9.7 (2014-09-02)
--------------------------
Ruby Tracker: bumped git submodule to 0.3.0 (#939)
Java Tracker: bumped git submodule to 0.5.1 (#948)
Node.js Tracker: added git submodule. Version 0.1.0 (#949)
Trackers: fixed broken git submodule links, thanks @OAGr! (#957)
EmrEtlRunner: bumped to 0.9.1
EmrEtlRunner: fixed @jobflow.ec2_subnet_id not being set due to incorrect guard, thanks @rslifka! (#956)
EmrEtlRunner: fixed bugs in --process-bucket (#973)
EmrEtlRunner: renamed --process-bucket option to --process-enrich (#972)
EmrEtlRunner: changed -s option for --skip to -x to prevent clash with -s for --start (#975)
EmrEtlRunner: now allows shredding without prior enrichment (#927)
StorageLoader: bumped to 0.3.2
StorageLoader: removed EMPTYASNULL for loading JSONs (#942)
StorageLoader: added missing targetUrl field to ad_impression JSON Path file, thanks @gisripa! (#951)
StorageLoader: made providing jsonpath_assets optional (#958)
StorageLoader: added support for cross-region Redshift COPY (#971)
Hive Storage: bumped table-def.q to 0.2.0
Hive Storage: added and removed fields to synchronize with 0.9.6's enriched event format (#965)
Scala Hadoop Shred: bumped to version 0.2.1
Scala Hadoop Shred: fixed multiple JSONs not being shredded for a single row (#968)
Scala Hadoop Shred: strengthened test suite (#967)
Version 0.9.6 (2014-07-26)
--------------------------
Java Tracker: bumped git submodule to 0.4.0 (#892)
EmrEtlRunner: bumped to 0.9.0
EmrEtlRunner: passed etl_tstamp into Hadoop Enrich as an argument (#396)
EmrEtlRunner: removed enrichment-specific code (#811)
EmrEtlRunner: removed enrichment-specific parameters from config.yml.sample (#809)
EmrEtlRunner: replaced enrichment-specific arguments from EmrEtlRunner (#808)
EmrEtlRunner: removed %3D code following Scalding upgrade (#849)
EmrEtlRunner: fixed contract on partition_by_run (#894)
EmrEtlRunner: updated Bash script to support enrichments path (#916)
StorageLoader: bumped to 0.3.1
StorageLoader: now looking in eu-west-1 region for s3://snowplow-hosted-assets (#895)
StorageLoader: updated combined Bash script to support enrichments path (#917)
Scala Hadoop Enrich: bumped to 0.6.0
Scala Hadoop Enrich: bumped Scala to 2.10.4 (#912)
Scala Hadoop Enrich: bumped Scalding to 0.11.1 (#911)
Scala Hadoop Enrich: bumped Hadoop to 1.2.1 (#913)
Scala Hadoop Enrich: bumped to Scala Common Enrich 0.5.0 (#788)
Scala Hadoop Enrich: passed etl_tstamp into Scala Common Enrich (#817)
Scala Hadoop Enrich: removed event_vendor and ue_name and renamed ue_properties to unstruct_event (#835)
Scala Hadoop Enrich: removed %3D handling for compatibility with old Scalding Args (#850)
Scala Hadoop Enrich: added ability to download additional MaxMind databases (#885)
Scala Hadoop Enrich: added runHadoop and Tool.main tests (#914)
Scala Common Enrich: bumped to 0.5.0
Scala Common Enrich: bumped user-agent-utils version, thanks @pkallos! (#662)
Scala Common Enrich: bumped referer-parser to 0.2.2 (#864)
Scala Common Enrich: bumped httpclient to 4.3.3 (#897)
Scala Common Enrich: bumped scala-maxmind-geoip to scala-maxmind-iplookups 0.1.0 (#882)
Scala Common Enrich: stored etl_tstamp in new field in CanonicalOutput (#818)
Scala Common Enrich: removed event_vendor and ue_name and renamed ue_properties to unstruct_event (#836)
Scala Common Enrich: made referer parsing configurable with list of internal domains (#857)
Scala Common Enrich: migrated configurable enrichments to new EnrichmentRegistry (#858)
Scala Common Enrich: added validation of enrichments JSON (#807)
Scala Common Enrich: replaced "anon_ip_quartets" with "anon_ip_octets" everywhere (#547)
Scala Common Enrich: added ability to extract event_id from querystring (#723)
Scala Common Enrich: extracted CanonicalInput's userId as network_userid, thanks @pkallos! (#855)
Scala Common Enrich: added MaxMind region_name field (#873)
Scala Common Enrich: added IP -> ISP lookup (#861)
Scala Common Enrich: added IP -> organization lookup (#887)
Scala Common Enrich: added IP -> domain lookup (#886)
Scala Common Enrich: added IP -> net speed lookup (#889)
Scala Common Enrich: added validation for transaction ID (#428)
Scala Common Enrich: renamed Tests to Specs for consistency (#618)
Scala Hadoop Shred: bumped to 0.2.0
Scala Hadoop Shred: bumped to Scala Common Enrich 0.5.0 (#918)
Scala Hadoop Shred: trailing empty fields no longer cause shredding for that row to fail (#921)
Scala Hadoop Shred: updated column offsets for enriched events TSV (#915)
Redshift: bumped atomic.events to 0.4.0
Redshift: added migration script for 0.3.0 to 0.4.0
Redshift: added etl_tstamp to atomic.events (#819)
Redshift: removed event_vendor and ue_name and renamed ue_properties to unstruct_event (#834)
Redshift: added new MaxMind fields (#871)
Redshift: applied runlength encoding to all fields keyed off IP address (#883)
Redshift: migration script added for 0.3.0 to 0.4.0 (#838)
Postgres: bumped atomic.events to 0.3.0
Postgres: added migration script for 0.2.0 to 0.3.0
Postgres: added etl_tstamp to atomic.events (#820)
Postgres: removed event_vendor and ue_name and renamed ue_properties to unstruct_event (#833)
Postgres: added new MaxMind fields (#871)
Postgres: migration script added for 0.2.0 to 0.3.0 (#837)
Version 0.9.5 (2014-07-09)
--------------------------
Ruby Tracker: added git submodule. Version 0.1.0 (#645)
Java Tracker: added git submodule. Version 0.2.0 (#843)
JavaScript Tracker: bumped git submodule to 2.0.0 (#635)
Python Tracker: bumped Python Tracker git submodule to 0.4.0 (#634)
Scala Hadoop Shred: added. Version 0.1.0
EmrEtlRunner: bumped to 0.8.0
EmrEtlRunner: updated S3DistCp steps to use new S3DistCpStep from Elasticity (#629)
EmrEtlRunner: added --skip s3distcp option (#313)
EmrEtlRunner: added ability to start Lingual in EmrEtlRunner (#623)
EmrEtlRunner: added ability to start HBase in EmrEtlRunner (#622)
EmrEtlRunner: improved load performance by switching ETL to write out to HDFS (#278)
EmrEtlRunner: now invoking Scala Hadoop Shredder after main job (#644)
EmrEtlRunner: added :iglu: section to config.yml for Scala Hadoop Shred (#814)
EmrEtlRunner: updated to run Scala Hadoop Shred following Hadoop Enrich (#815)
EmrEtlRunner: added --skip shred option (#659)
StorageLoader: bumped to 0.3.0
StorageLoader: bumped Sluice to 0.2.1 (#881)
StorageLoader: added initial Ruby.contracts support (#391)
StorageLoader: updated config.yml to support shredding (#897)
StorageLoader: added ACCEPTINVCHARS to StorageLoader (#411)
StorageLoader: wrote JSON Path files for ad_* events (#642)
StorageLoader: wrote JSON Path file for link_click (#599)
StorageLoader: wrote JSON Path file for screen_view (#643)
StorageLoader: wrote JSON Path file for schema.org's WebPage (#772)
StorageLoader: added :jsonpath_assets: setting for StorageLoader (#606)
StorageLoader: added ability to load custom tables using JSON Paths (#607)
StorageLoader: added --skip shred option (#660)
StorageLoader: added :in: hint on StorageLoader configuration, thanks @joaolcorreia! (#755)
Redshift: added Redshift DDL for ad_* events (#639)
Redshift: added Redshift DDL for link_click events (#600)
Redshift: added Redshift DDL for screen_view events (#640)
Redshift: added Redshift DDL for schema.org's WebPage (#771)
Looker Analytics: wrote LookML for ad_* events (#605)
Looker Analytics: wrote LookML for screen_view events (#637)
Looker Analytics: wrote LookML for link_click events (#636)
Looker Analytics: wrote LookML for schema.org's WebPage (#770)
Looker Analytics: updated LookML to use liquid templating (#851)
Version 0.9.4 (2014-05-30)
--------------------------
Redshift: added reference_data.country_codes (#779)
Postgres: added reference_data.country_codes (#781)
Looker Analytics: added new 'traffic_pulse' dashboard with globally configurable drill-down variables (#765)
Looker Analytics: removed Snowplow website-specific dimensions and metrics: base model is now company-generic (#764)
Looker Analytics: cleaner joining of data sets in Looker model (#763)
Looker Analytics: dimensions and metrics renamed to make it clearer for an analyst getting started with the data (#761)
Looker Analytics: added distkeys and sortkeys to derived tables to speed up query times (#696)
Looker Analytics: derived tables now auto-generated when new data is loaded into atomic.events (#688)
Looker Analytics: 'visits' renamed to 'sessions' (#762)
Looker Analytics: LookML models versioned using SchemaVer (#766)
Version 0.9.3 (2014-05-21)
--------------------------
EmrEtlRunner: bumped to 0.7.0
EmrEtlRunner: bumped Sluice to 0.2.1 (#405)
EmrEtlRunner: bumped Elasticity to 3.0.4 (#665)
EmrEtlRunner: replaced hadoop_version setting with ami_version setting (#701)
EmrEtlRunner: fixed handling of region, placement and ec2_subnet_id (#754)
EmrEtlRunner: fixed regression where 0 files staged still kicks off EMR (#409)
EmrEtlRunner: stopped Sluice file operation threads being killed by folders (#401)
EmrEtlRunner: fixed disabling of Cascading error catching (#721)
EmrEtlRunner: renamed Clojure Collector log files in processing bucket to support multiple instances (#717)
EmrEtlRunner: added initial Ruby.contracts support into EmrEtlRunner (#392)
EmrEtlRunner: updated to use the Ruby Logger (#194)
EmrEtlRunner: updated so it's embeddable in other applications (#128)
EmrEtlRunner: added ability to bundle as a JRuby fat jar (#674)
EmrEtlRunner: added initial unit tests (#672)
Clojure Collector: bumped to 0.6.0
Clojure Collector: fixed load balancer IP address getting stored in logs (#719)
Documentation: removed all Snowplow tracking from READMEs, thanks @acinader! (#720)
Documentation: fixed (slightly) inconsistent EmrEtlRunner documentation, thanks @pvdb! (#749)
Version 0.9.2 (2014-04-30)
--------------------------
Scala Hadoop Enrich: bumped to 0.5.0
Scala Hadoop Enrich: bumped to Scala Common Enrich 0.4.0 (#699)
Scala Hadoop Enrich: bumped SBT to 0.13.2 (#702)
Scala Hadoop Enrich: bumped to using sbt-assembly 0.11.2 (#704)
Scala Common Enrich: bumped to 0.4.0
Scala Common Enrich: upgraded to support new and future CloudFront file formats (#698)
Scala Common Enrich: bumped SBT to 0.13.2 (#703)
Scala Hadoop Bad Rows: added. Version 0.1.0
Hive Storage: bumped table-def.q to 0.1.0
Hive Storage: added new unstructured fields to Hive table definition (#709)
Hive Storage: added raw page_url and page_referrer into Hive table (#710)
Hive Storage: added name_tracker field to Hive table (#711)
Version 0.9.1 (2014-04-11)
--------------------------
Scala Hadoop Enrich: bumped to 0.4.0
Scala Hadoop Enrich: bumped to Scala Common Enrich 0.3.0 (#497)
Scala Hadoop Enrich: renamed AnonQuartets to AnonOctets (#498)
Scala Hadoop Enrich: renamed all Snowplow Hadoop Tests to Specs (#515)
Scala Hadoop Enrich: added page_url and page_referrer back into ETL's output (#483)
Scala Common Enrich: bumped to 0.3.0
Scala Common Enrich: bumped Argonaut to 6.0.3 (#620)
Scala Common Enrich: added app and mob as valid platform codes, thanks @kinabalu! (#524)
Scala Common Enrich: added support for remaining platform codes (#516)
Scala Common Enrich: updated POJO in Scalding ETL to include new unstructured fields (#362)
Scala Common Enrich: updated POJO in Scalding ETL to include name_tracker field (#595)
Scala Common Enrich: extracted evn from Tracker Protocol (#604)
Scala Common Enrich: extracted tna from Tracker Protocol (#616)
Scala Common Enrich: extracted and validated unstructured events (#142)
Scala Common Enrich: extracted and validated custom contexts (#426)
Scala Common Enrich: reformatted incoming event and context JSONs (#589)
Scala Common Enrich: made sure a JSON errors if it exceeds the allowed length (#567)
EmrEtlRunner: bumped to 0.6.0
EmrEtlRunner: bumped Elasticity to 3.0.2 (#587)
EmrEtlRunner: allowed AWS VPC selection in EmrEtlRunner (#581)
EmrEtlRunner: set :visible_to_all_users to true for EMR jobs, thanks @smugryan! (#560)
Redshift: atomic-def script bumped to 0.3.0
Redshift: migration script added for 0.2.2 to 0.3.0
Redshift: added new unstructured fields to Redshift table definition (#361)
Redshift: changed distkey to be event_id, not domain_userid (#584)
Redshift: added raw page_url and page_referrer into Redshift table (#591)
Redshift: added name_tracker field to Redshift table (#594)
Redshift: converted Redshift varchar(38) for event IDs to char(36) (#282)
Postgres: atomic-def script bumped to 0.2.0
Postgres: migration script added for 0.1.x to 0.2.0
Postgres: added new unstructured fields to Postgres table definition (#359)
Postgres: added raw page_url and page_referrer into Postgres table (#592)
Postgres: added name_tracker field to Postgres table (#593)
Postgres: converted varchar(36) for event IDs to char(36) (#596)
StorageLoader: bumped to 0.2.0
StorageLoader: added TIMEFORMAT 'auto' to StorageLoader to handle outlier dvce_timestamps (#427)
JavaScript Tracker: bumped git submodule to 1.0.1 (#585)
Python Tracker: added git submodule pointing to 0.1.0 (#586)
Version 0.9.0 (2014-02-04)
--------------------------
Thrift Raw Event: added. Version 0.1.0
Thrift Raw Event: specified Thrift IDL for new raw event schema (#430)
Scala Stream Collector: added. Version 0.1.0
Scala Stream Collector: implemented new spray-can (Akka Http) Scala stream collector (#432)
Scala Kinesis Enrich: added. Version 0.1.0
Scala Kinesis Enrich: implemented initial Kinesis-based enrichment (#460)
Scala Common Enrich: bumped to 0.2.0
Scala Common Enrich: added Thrift SnowplowRawEvent as a dependency to common-enrich (#475)
Scala Common Enrich: added ability to read Thrift SnowplowRawEvent (Thrift) (#462)
Scala Common Enrich: renamed CloudFront to Cloudfront in code (#495)
Scala Common Enrich: renamed AnonQuartets to AnonOctets (#491)
Scala Common Enrich: added raw -> CanonicalInput tests (#484)
Scala Common Enrich: updated GET payload extraction to handle empty payloads (#502)
Git submodules: changed git:// protocol in .gitmodules to https:// (#512)
NodeJS Collector: removed contrib-nodejs-collector from 2-collectors (#474)
JavaScript Tracker: bumped JS Tracker submodule to 0.13.1 release (#511)
Version 0.8.13 (2014-01-08)
---------------------------
Looker Analytics: added. Version 0.1.0
Looker Analytics: created Snowplow metadata model for Looker BI (www.looker.com) (#472)
Version 0.8.12 (2014-01-07)
---------------------------
Hadoop ETL: bumped to 0.3.6
Hadoop ETL: bumped to SBT 0.13.0 (#404)
Hadoop ETL: bumped to using sbt-assembly 0.10.1 (#421)
Hadoop ETL: bumped to Scala 2.10.3 (#423)
Hadoop ETL: bumped to Scalding 0.8.11 (#422)
Hadoop ETL: upgraded useragent utils to 1.11 & moved to Maven dependency (#416)
Hadoop ETL: added test running back into sbt-assembly step (#420)
Hadoop ETL: updated copyright messages to be Snowplow not SnowPlow, and to 2014 not 2013 (#419)
Hadoop ETL: added ValidatedString as a type to package.scala (#328)
Hadoop ETL: added missing validation to stringToJByte (#408)
Hadoop ETL: missing page URI no longer interpreted as bad row (#399)
Hadoop ETL: updated CfRegex to reflect Cfcs(Cookie) can be empty (#410)
Hadoop ETL: numeric fields in tr_ and ti_ now parsed to doubles, not madeTsvSafe strings (#400)
Hadoop ETL: moved ETL core into separate project scala-enrich-common (#417)
Scala Common Enrich: updated ETL versioning to include host and common versions (#448)
Postgres: bumped cube-pages.sql to 0.1.1
Postgres: minor fix: cube_pages.complete referenced non-existent table cube_pages.basic, thanks @mrwalker! (#414)
Version 0.8.11 (2013-10-22)
---------------------------
Hadoop ETL: bumped to 0.3.5
Hadoop ETL: added Argonaut 6.0 as a dependency (#342)
Hadoop ETL: added fromTimestamp to EventEnrichments (#340)
Hadoop ETL: added makeTsvSafe to ConversionUtils (#338)
Hadoop ETL: added JsonUtils (#323)
Hadoop ETL: added support for 3 and 4 return values from MapTransformer (#324)
Hadoop ETL: updated GetJsonPayload to use Argonaut and renamed to JsonPayload (#339)
Hadoop ETL: added ability to mask IP addresses in ETL (#309)
Hadoop ETL: refr_ and page_ fields now stored raw (#374)
Hadoop ETL: defensively fixed raw spaces in page and referer URLs (#346)
Hadoop ETL: fixed regression, single-encoded %s logic didn't account for % itself (#347)
Hadoop ETL: added unit tests for fixTabsNewlines (#332)
Hadoop ETL: tests now report the failing CanonicalOutput field (#325)
Hadoop ETL: now handling all fields double-encoded as per CloudFront post-14-September (#348)
Hadoop ETL: added support for 21 Oct CloudFront access log format (#384)
Hadoop ETL: added truncation to refr_term (#379)
Hadoop ETL: added truncation to se_label (#394)
Hadoop ETL: made all prior ME.identity fields TSV-safe (#395)
EmrEtlRunner: bumped to 0.5.0
EmrEtlRunner: bumped Sluice to 0.1.5 (#96)
EmrEtlRunner: bumped Elasticity to 2.6 (#345)
EmrEtlRunner: enabled EMR Job Flow debugging for easier access to logs (#279)
EmrEtlRunner: ETL job no longer fails if there's no data for last run period (#296)
EmrEtlRunner: empty processing dir check now works if dir contains 1 file (#326)
EmrEtlRunner: added ability to mask IP addresses in ETL (#309)
EmrEtlRunner: made the examples match what you get from git out of the box, thanks @shermozle (#331)
StorageLoader: bumped to 0.1.1
StorageLoader: bumped Sluice to 0.1.5 (#96)
StorageLoader: fixed "\" in fields acts as an escape character for Postgres, thanks @kingo55 (#329)
StorageLoader: added ability to --skip analyze (#335)
StorageLoader: moved VACUUM SORT ONLY to a --include step (#321)
StorageLoader: added COMPROWS to config and --include compupdate option (#344)
StorageLoader: changed Postgres VACUUM FULL to VACUUM (#357)
StorageLoader: added TRUNCATECOLUMNS for Redshift load (#360)
StorageLoader: added FILLRECORD to our Redshift COPY command (#380)
Postgres: fixed error in `recipes_basic.technology_mobile` recipe (#397)
Version 0.8.10 (2013-10-18)
---------------------------
Redshift: bumped atomic.events to 0.2.2
Redshift: added migration script for 0.2.1 to 0.2.2
Redshift: moved events table to a new atomic schema in atomic-def.sql (#301)
Redshift: added SQL DDL to define Redshift recipes (#297)
Redshift: added SQL DDL to define Redshift cubes (#298)
Postgres: bumped atomic.events to 0.1.1
Postgres: added migration script for 0.1.0 to 0.1.1
Postgres: renamed table-def file to atomic-def.sql
Postgres: moved NOT NULL constraint on event field to event_vendor field (#318)
Postgres: added SQL DDL to define Postgres recipes (#303)
Postgres: added SQL DDL to define Postgres cubes (#302)
Documentation: fixed wrong path to no-js-tracker subdirectory, thanks @gregakespret (#343)
Documentation: improved "Find out more" table in README, thanks @dideler (#353)
Version 0.8.9 (2013-09-05)
--------------------------
Hadoop ETL: bumped to 0.3.4
Hadoop ETL: updated to handle singly-encoded %s in CloudFront querystring field (#333)
Version 0.8.8 (2013-08-04)
--------------------------
JavaScript Tracker: moved into own repo (#277)
Hadoop ETL: bumped to 0.3.3
Hadoop ETL: URL-decodes "%3D" to "=" to allow Hive-style directory names as arguments (#305)
Hadoop ETL: bumped referer-parser to 0.1.1 to fix java.lang.NullPointerException (#314)
EmrEtlRunner: bumped to 0.4.0
EmrEtlRunner: bumped Sluice to 0.0.7 (#299)
EmrEtlRunner: removed :snowplow: section from config.yml.sample (#289)
EmrEtlRunner: simplified EmrEtlRunner and its config (#287)
EmrEtlRunner: added run= to timestamped ETL folder names (#294)
EmrEtlRunner: updated "Jobflow started" stdout message to include jobflow ID (#315)
Hive ETL: removed folder 3-enrich/hive-etl as no longer supported (#286)
Hive storage: updated hive-storage scripts to work with current Redshift-format flatfile (#290)
Infobright: removed folder 4-storage/infobright as not currently supported (#285)
Postgres: add Postgres table definition in atomic schema (#160)
StorageLoader: bumped to 0.1.0
StorageLoader: bumped Sluice to 0.0.7 (#300)
StorageLoader: removed code to delete Hive ETL's empty event files (#306)
StorageLoader: fixed bug where download path has to be set (even when using Redshift) (#280)
StorageLoader: optimized ANALYZE and VACUUM commands (#283)
StorageLoader: added MAXERROR as StorageLoader configuration value for Redshift (#273)
StorageLoader: added support for loading Postgres (#161)
StorageLoader: removed Infobright loading capability (#307)
StorageLoader: added support for loading into multiple storage targets (#311)
Version 0.8.7 (2013-07-07)
--------------------------
JavaScript Tracker: bumped to 0.12.0
JavaScript Tracker: fixed document reference to use documentAlias (#247)
JavaScript Tracker: fixed bug with setCustomUrl (#267)
JavaScript Tracker: changed ev_ to se_ for structured events (#197)
JavaScript Tracker: fixed Firefox failure when "Always ask" set for cookies (#163)
JavaScript Tracker: fixed bug in page ping functionality detected in IE 8 (#260)
JavaScript Tracker: replaced forEach as not supported in IE 6-8 (#295)
EmrEtlRunner: fixed bug in config.yml.sample (#291)
Arduino tracker: added git submodule link (#292)
Version 0.8.6 (2013-06-03)
--------------------------
Hadoop ETL: bumped to 0.3.2
Hadoop ETL: bumped Scalding to 0.8.5
Hadoop ETL: bumped Scala version to 2.10.0
Hadoop ETL: bumped scala-maxmind-geoip to 0.0.5 to work with Scala 2.10.0
Hadoop ETL: bumped SBT from 0.12.1 to 0.12.3
Hadoop ETL: bumped Specs2 to 1.14
Hadoop ETL: replaced Bytes in CanonicalOutput with JBytes (#254)
Hadoop ETL: disabled "corruption" detection in ETL overriding custom URLs with longer collector referer URLs (#268)
EmrEtlRunner: bumped to 0.3.0
EmrEtlRunner: updated config.yml.sample to support spot task instances
EmrEtlRunner: let EmrEtlRunner use spot task instances (#193)
EmrEtlRunner: consolidate small files prior to running ETL job (#207)
Version 0.8.5 (2013-05-24)
--------------------------
Hadoop ETL: bumped to 0.3.1
Hadoop ETL: now supports downloading GeoLiteCity.dat from public S3 URL if needed, thanks @petervanwesep (part of #258)
Hadoop ETL: added Twitter Maven Repo as a resolution repo, thanks @rgabo (#239)
Hadoop ETL: stripping control characters in addition to tabs and newlines (#259)
Hadoop ETL: fixed issue with large values for se_value (#263)
Hadoop ETL: renamed ev_ fields in CanonicalOutput to se_
Hadoop ETL: extractResolution renamed and fails gracefully if view dimensions exceed Integer max size (#264)
EmrEtlRunner: bumped to 0.2.1
EmrEtlRunner: returns public S3 URL to GeoLiteCity.dat file if hosted by Snowplow, thanks @petervanwesep (part of #258)
Redshift: table-def script bumped to 0.2.1
Redshift: migration script added for 0.2.0 to 0.2.1
Redshift: bumped se_value from a float to a double
Redshift: increased size of `_urlport` fields, thanks @petervanwesep (#266)
Infobright: bumped setup_ and verify_infobright.sql to 0.0.9
Infobright: added migration script 0.0.8->0.0.9
Infobright: increased size of `_urlport` fields, thanks @petervanwesep (#266)
Version 0.8.4 (2013-05-16)
--------------------------
Hadoop ETL: bumped to 0.3.0
Hadoop ETL: added geo-ip lookup to Scalding ETL
Hadoop ETL: bumped referer-parser from 0.1.0-M6 to 0.1.0
Hadoop ETL: removed truncation of page_referrer (#236)
Hadoop ETL: added truncation of referer path/qs/fragment (#235)
Hadoop ETL: removing tabs found in referer search terms (#234)
Hadoop ETL: fixed client timestamp so it's not incorrectly localised - thanks @rgabo (#238)
Hadoop ETL: added parsing of collector version `cv` (#243)
Hadoop ETL: bumped Scalaz from 7.0.0-M9 to 7.0.0
Hadoop ETL: removed .gets from extractPageUri (#249)
EmrEtlRunner: bumped to 0.2.0
EmrEtlRunner: now passes MaxMind .dat file into Scalding ETL (#213)
EmrEtlRunner: improve messages when ETL job starts and fails (#230)
Redshift: table-def script bumped to 0.2.0
Redshift: migration script added for 0.1.0 to 0.2.0
Redshift: added geo-ip fields to Redshift table definition (#226)
Redshift: rename ev_ fields to se_ for structured events (#227)
Version 0.8.3 (2013-05-14)
--------------------------
JavaScript Tracker: bumped to 0.11.2
JavaScript Tracker: added unstructured events, thanks @rgabo, @tarsolya, @lackac (#198)
JavaScript Tracker: remove leading ampersand in querystring (#188)
Clojure Collector: bumped to 0.5.0
Clojure Collector: upgraded to use Tomcat AccessLogValve 0.0.4 (#240)
Clojure Collector: now logging Clojure Collector and Tomcat AccessLogValve versions (#239)
Common: completed splitting custom event type into: unstructured and structured events (#133)
Version 0.8.2 (2013-05-08)
--------------------------
Clojure Collector: bumped to 0.4.0
Clojure Collector: remove duplicate of wrap-request-logging in middleware.clj (#221)
Clojure Collector: check/potentially bump lein-ring dependency in project.clj (#222)
Clojure Collector: simplify building Clojure Collector, thanks @butlermh (#223, #225)
Clojure Collector: fix Tomcat log bug of missing cs(Referer) (#220)
Version 0.8.1 (2013-04-12)
--------------------------
Hadoop ETL: bumped to 0.2.0
Hadoop ETL: break referer_url into constituent parts (part of #175)
Hadoop ETL: remove raw referrer_url (as no space in Redshift table defn) (part of #175)
Hadoop ETL: added referer parsing (#176)
Redshift: table-def script bumped to 0.1.0
Redshift: migration script added for 0.0.1 to 0.1.0
Redshift: add/update referer fields in Redshift table definition (#204)
Redshift: fix bug where mkt_source and mkt_medium are getting swapped around (#215)
Common: replaced embedded architecture images with CloudFront-hosted images
Common: completed rename of 3-etl to 3-enrich (#99)
Common: "SnowPlow" -> "Snowplow" in 1st and 2nd level READMEs
Version 0.8.0 (2013-04-03)
--------------------------
Hadoop ETL: added. Version 0.1.0 (#177)
Hadoop ETL: truncate 6 "high risk" fields for Redshift (raw useragent, page title etc) (#192)
Hadoop ETL: ev_value now extracted as a float (#201)
EmrEtlRunner: bumped to 0.1.0
EmrEtlRunner: updated to work with new config.yml fields (part of #178)
EmrEtlRunner: added support for Hadoop ETL (part of #178)
EmrEtlRunner: added run ID and human-friendly job name (#100)
EmrEtlRunner: added run IDs to output folders (Hadoop ETL only) (#79)
EmrEtlRunner: changed .rvmrc to .ruby-version, thanks @richo (part of #190)
StorageLoader: changed .rvmrc to .ruby-version, thanks @richo (part of #190)
StorageLoader: added final missing /Gemfile to BUNDLE_GEMFILE in Bash script, thanks @frutik (#206)
Common: started rename of 3-etl to 3-enrich (part of #99)
Version 0.7.6 (2013-03-03)
--------------------------
HiveQL: redshift-etl.q added. Version 0.0.1 (#174)
HiveQL: hive-rolling-etl.q renamed to hive-etl.q and bumped to 0.5.7
HiveQL: non-hive-rolling-etl.q renamed to mysql-infobright-etl.q and bumped to 0.0.8 (part of #172)
EmrEtlRunner: bumped to 0.0.9
EmrEtlRunner: renamed :snowplow: variable names and added new Redshift one in config.yml (part of #172)
EmrEtlRunner: updated to support Redshift as a storage format (#173)
EmrEtlRunner: added missing /Gemfile to BUNDLE_GEMFILE in Bash script
StorageLoader: bumped to 0.0.5
StorageLoader: added Redshift-specific fields to config.yml (part of #159)
StorageLoader: added Redshift load support into StorageLoader (part of #159)
StorageLoader: added missing /Gemfile to BUNDLE_GEMFILE in Bash scripts
Redshift: table-def.sql script added. Version 0.0.1 (#158)
Infobright: bumped setup_ and verify_infobright.sql to 0.0.8
Infobright: widened useragent field (#184)
Infobright: added migration script 0.0.7->0.0.8
Serde: fixed and enabled broken tests (#14). Version unchanged
Version 0.7.5 (2013-02-25)
--------------------------
JavaScript Tracker: bumped to 0.11.1
JavaScript Tracker: fixed bug with cookie secure flag killing user ID cookies (#181)
Version 0.7.4 (2013-02-22)
--------------------------
JavaScript Tracker: bumped to 0.11.0
JavaScript Tracker: introduced setAppId() and deprecated setSiteId() (#168)
JavaScript Tracker: 1st party user ID now transmitted as duid (domain uid) (part of #150)
JavaScript Tracker: now sends dtm - the client timestamp (#149)
JavaScript Tracker: deprecated and disabled attachUserId()
JavaScript Tracker: deprecated getVisitorId() and getVisitorInfo() - use getDomainUserId() and getDomainUserInfo() instead
JavaScript Tracker: add setUserId which sets the uid field (#167)
JavaScript Tracker: SnowPlow cookies no longer tied to site ID (#148)
Clojure Collector: bumped to 0.3.0
Clojure Collector: now append nuid (network aka 3rd party) user ID, not uid (#150)
Serde: bumped to 0.5.5
Serde: renamed tstamp field to dtm
Serde: dt and tm split into dvce_x and collector_x (#149)
Serde: extract new nuid and duid fields (#150)
Serde: renamed visit_id to domain_sessionidx (#171)
HiveQL: hive-rolling-etl.q bumped to 0.5.6
HiveQL: non-hive-rolling-etl.q bumped to 0.0.7
HiveQL: dt and tm split into dvce_x and collector_x (#149)
HiveQL: now extracts uid, nuid and duid (#150)
HiveQL: renamed visit_id to domain_sessionidx (#171)
Infobright: bumped setup_infobright.sql to 0.0.7
Infobright: renamed dt and tm to dvce_x and collector_x (#149)
Infobright: now supports uid, nuid and duid (#150)
Infobright: renamed visit_id to domain_sessionidx (#171)
Infobright: added migration script 0.0.6 CloudFront collector -> 0.0.7
Infobright: added migration script 0.0.6 Clojure collector -> 0.0.7
Version 0.7.3 (2013-02-15)
--------------------------
JavaScript Tracker: bumped to 0.10.0
JavaScript Tracker: updated copyright notices
JavaScript Tracker: removed deprecated setAccount(), setTracker(), setHeartBeatTimer() - BREAKING CHANGE (#86)
JavaScript Tracker: added document charset to querystring (#138)
JavaScript Tracker: page ping no longer killed by 1 heartbeat w/o activity (#132)
JavaScript Tracker: added document & viewport dimensions (#94)
JavaScript Tracker: introduced trackStructEvent and deprecated trackEvent (#143)
JavaScript Tracker: cleaned up getRequest code to use improved requestStringBuilder
JavaScript Tracker: fixed logImpression (was using wrong argument names) (#162)
JavaScript Tracker: added scroll offsets to page ping (#127)
Serde: bumped to 0.5.4
Serde: updated copyright notices
Serde: structured events now logged as "struct" not "custom" - DATA CHANGE
Serde: added setting of new event_vendor field (to com.snowplowanalytics) (#144)
Serde: added extraction of doc charset (#138)
Serde: added extraction of document & viewport dimensions (#94)
Serde: added extraction of scroll offsets for enhanced page ping (#127)
Serde: added extraction of URL components (#105)
HiveQL: hive-rolling-etl.q bumped to 0.5.5
HiveQL: non-hive-rolling-etl.q bumped to 0.0.6
HiveQL: updated copyright notices
HiveQL: now supports charset, document & viewport, URL components, event_vendor and enhanced page ping
Infobright: bumped setup_infobright.sql to 0.0.6
Infobright: updated copyright notices
Infobright: added migration scripts (0.0.4->.6; 0.0.5->.6)
Infobright: added charset, document & viewport, URL components, event_vendor and enhanced page ping
Version 0.7.2 (2013-01-29)
--------------------------
No-JavaScript Tracker: added. Version 0.1.0
JavaScript Tracker: bumped to 0.9.1
JavaScript Tracker: fixed bug where secure flag not being set on cookies sent via HTTPS
Clojure Collector: bumped to 0.2.0
Clojure Collector: fixed Tomcat config issue of times being recorded in 12-hour clock
Serde: added NoJsTrackerTest
Serde: fixed CljTomcatFormatTest
Version 0.7.1 (2013-01-22)
--------------------------
EmrEtlRunner: bumped to 0.0.8
EmrEtlRunner: updated copyright notices
EmrEtlRunner: added .rvmrc file (part of #121, #84)
EmrEtlRunner: removed .gemspec file
EmrEtlRunner: added dependencies to Gemfile and re-generated Gemfile.lock
StorageLoader: bumped to 0.0.4
StorageLoader: updated copyright notices
StorageLoader: added .rvmrc file (part of #121, #84)
StorageLoader: removed .gemspec file
StorageLoader: added dependencies to Gemfile and re-generated Gemfile.lock
Documentation: updated to use `bundle install` (#122)
Version 0.7.0 (2013-01-04)
--------------------------
Clojure Collector: added. Version 0.1.0
HiveQL: hive-rolling-etl.q bumped to 0.5.4
HiveQL: non-hive-rolling-etl.q bumped to 0.0.5
HiveQL: v_collector now set via Hive variable, not Serde (#118)
EmrEtlRunner: bumped to 0.0.7
EmrEtlRunner: bumped to using Sluice 0.0.6
EmrEtlRunner: added "Complete" message at end of run (part of #97)
EmrEtlRunner: validates "clj-tomcat" as collector format (#119)
EmrEtlRunner: passes collector format through to HiveQL (#119)
EmrEtlRunner: support for log files generated by Clojure Collector on Tomcat (#117)
Serde: added broken CljTomcatFormatTest
StorageLoader: bumped to 0.0.3
StorageLoader: bumped to using Sluice 0.0.6
StorageLoader: added "Complete" message at end of run (part of #97)
StorageLoader: --skip argument now supports a list (#81)
Infobright: bumped setup_infobright.sql to 0.0.5
Infobright: added migration script (0.0.4 -> 0.0.5)
Infobright: user_id field widened to 38 chars to support UUID
Version 0.6.5 (2012-12-26)
--------------------------
JavaScript Tracker: bumped to 0.9.0
JavaScript Tracker: each event now sent with an event type `e` (#63)
JavaScript Tracker: refactoring of event definition code
JavaScript Tracker: added attachUserId(boolean) method (#92)
JavaScript Tracker: removed configCustomData from logImpression (#115)
JavaScript Tracker: cleaned up activity tracking (page pings)
JavaScript Tracker: added a combine only option to snowpak.sh
Serde: bumped to 0.5.3
Serde: now extracts event type (`e`) from querystring (#63)
Serde: now attaches UUID event_id to each event (#89)
Serde: added support for IP address override in querystring (#90)
Serde: no longer dies on corrupted querystring (#114)
HiveQL: hive-rolling-etl.q bumped to 0.5.3
HiveQL: non-hive-rolling-etl.q bumped to 0.0.4
HiveQL: event and event_id now extracted from Serde (#63, #89)
EmrEtlRunner: updated config file template
Version 0.6.4 (2012-12-20)
--------------------------
HiveQL: renamed table-def.q to non-hive-format-table-def.q
HiveQL: added hive-format-table-def.q (#111)
Infobright: bumped setup_infobright.sql to 0.0.4
Infobright: added migration script (0.0.3 -> 0.0.4)
Infobright: now supports long br_langs and urls (#107)
Infobright: removed lookup from fields which slow a large load (#107)
Version 0.6.3 (2012-12-18)
--------------------------
JavaScript Tracker: bumped to 0.8.2
JavaScript Tracker: fixed regressions from splitting JS into multiple files (#103)
HiveQL: hive-rolling-etl.q bumped to 0.5.2
HiveQL: added missing comma in hive-rolling-etl.q (#112)
Version 0.6.2 (2012-11-29)
--------------------------
JavaScript Tracker: bumped to 0.8.1
JavaScript Tracker: fixed bug with trailing comma (#102)
JavaScript Tracker: removed console.log when not debugging (#101)
JavaScript Tracker: removed minified sp.js from version control (added .gitignore to keep it out)
SnowCannon: bumped submodule to latest shermozle/SnowCannon commit
Version 0.6.1 (2012-11-28)
--------------------------
JavaScript Tracker: bumped to 0.8.0
JavaScript Tracker: rename ice.png to i - BREAKING CHANGE (#29)
JavaScript Tracker: added setCollectorCf() and deprecated setAccount() (#32)
JavaScript Tracker: Tracker constructor now supports Cf or Url (part of #44)
JavaScript Tracker: getTrackerCf() and -Url() added, getTracker() deprecated (part of #44)
JavaScript Tracker: added tracker version (`tv`) to querystring (#41)
JavaScript Tracker: added color depth tracking (part of #69)
JavaScript Tracker: added timezone tracking (part of #69)
JavaScript Tracker: added user fingerprinting (#70)
JavaScript Tracker: broke out .js into multiple files (#55)
EmrEtlRunner: bumped to 0.0.6
EmrEtlRunner: --skip takes multiple args (part of #83, supersedes #80)
EmrEtlRunner: add --process-bucket to process a bucket directly (part of #83)
StorageLoader: bumped to 0.0.2
StorageLoader: changed the data file encloser to NULL (#88)
Serde: bumped to 0.5.2
Serde: now extracts color depth, timezone and fingerprint fields
Serde: added useragent into ETL (#68)
Serde: now extracts platform field
HiveQL: hive-rolling-etl.q bumped to 0.5.1
HiveQL: non-hive-rolling-etl.q bumped to 0.0.3
HiveQL: now extracts color depth, timezone and fingerprint fields
HiveQL: now includes raw useragent as a separate field (#68)
HiveQL: platform field no longer a placeholder
HiveQL: event_name field renamed to event (prep for #89)
HiveQL: added event_id as a placeholder
Infobright: bumped setup_infobright.sql to 0.0.3
Infobright: added migration script (0.0.1/2 -> 0.0.3)
Infobright: now includes color depth, timezone and fingerprint fields
Infobright: now includes raw useragent (#68)
Infobright: event_name field renamed to event
Infobright: added event_id as a placeholder (prep for #89)
Version 0.6.0 (2012-11-12)
--------------------------
EmrEtlRunner: bumped to 0.0.5
EmrEtlRunner: bumped gem dependencies to match StorageLoader (including Sluice 0.0.4)
EmrEtlRunner: renamed snowplow-emr-etl.sh to snowplow-emr-etl-runner.sh
StorageLoader: added. Ruby app to load SnowPlow events into local databases etc
Serde: bumped to 0.5.1
Serde: changed all Booleans to Bytes for non-Hive output
HiveQL: bumped non-hive-rolling-etl.q to 0.0.2
HiveQL: changed non-hive-rolling-etl.q to use the two _bt Byte fields
Infobright: bumped setup_infobright.sql to 0.0.2
Infobright: changed booleans to tinyint(1)s (non-breaking change)
Version 0.5.2 (2012-11-05)
--------------------------
EmrEtlRunner: bumped to 0.0.4
EmrEtlRunner: fixed reference to old version of Hive deserializer in config.yml (fixes #71)
EmrEtlRunner: fixed bug using sub-folders with the Processing Bucket (fixes #72)
EmrEtlRunner: can now skip move-files-to-Processing-Bucket or EMR stages (fixes #58)
EmrEtlRunner: S3 filecopy code now moved to Sluice, an external Ruby gem
Version 0.5.1 (2012-10-31)
--------------------------
Data model: stubbed new event_name and platform fields
Infobright: added setup scripts and docs into 4-storage/infobright (fixes #57)
Infobright: added version handling (v_tracker, v_collector, v_etl)
HiveQL: removed hive-exact-etl.q as no longer supported
HiveQL: added non-hive-rolling-etl.q for Infobright- (and other db-)friendly event file format
HiveQL: added version handling (v_tracker, v_collector, v_etl) (fixes #42)
Serde: bumped to 0.5.0
Serde: updated to avoid throwing exceptions on a bad field, fixes #52 (thanks @mtibben!)
Serde: moved some exception handling closer to the throw point, pull req #66 (thanks @mtibben!)
Serde: added continue_on_unexpected_error (thanks @mtibben!)
Serde: tabs are changed to 4 spaces, fixes #61
Serde: browser features are now also available as individual fields, for non-hive-rolling-etl.q to use
Serde: added version handling (v_tracker, v_collector, v_etl)
EmrEtlRunner: bumped to 0.0.3
EmrEtlRunner: moved 3 .rb files in lib/ into lib/snowplow-emr-etl-runner
EmrEtlRunner: added/updated configuration options (:etl: section and hiveql versioning params)
Version 0.5.0 (2012-10-24)
--------------------------
Tidied up folder structure inside 3-etl/
Serde: assembles to /target, not to /upload any more (and jars won't be committed to Git)
EmrEtlRunner: added. Ruby application to run Hive ETL process on Amazon EMR
Version 0.4.10 (2012-10-10)
---------------------------
SnowCannon: bumped submodule to latest shermozle/SnowCannon commit
HiveQL: moved app_id to end of table for backwards compatibility
HiveQL: fixed bug where HiveQL was pointing to serde 0.4.8, not the new serde 0.4.9
Version 0.4.9 (2012-10-01)
--------------------------
Serde: fixed bug where row not nulled if a critical field un-parseable
Serde: added support for new application ID (#33)
Serde: added deserialization of ecommerce fields, plus tests (#34, #51)
Serde: test suite enhancements (adding Scala helper objects)
Serde: added tests including #13 and #10
snowplow.js: bumped to 0.7.0
snowplow.js: renamed said to aid for application ID
Version 0.4.8 (2012-09-14)
--------------------------
Serde: added support for /i as well as /ice.png (issue #35)
Serde: added support for new (2012-09-12) CloudFront format
Serde: handles Cf bucket with Forward Query String = yes (issue #39)
Serde: made marketing attribution parsing more robust
Version 0.4.7 (2012-09-05)
--------------------------
snowplow.js: bumped to version 0.6
snowplow.js: added setSiteId functionality
snowplow.js: added ecommerce tracking
Version 0.4.6 (2012-08-18)
--------------------------
snowplow.js: added setCollectorUrl functionality
Version 0.4.5 (2012-08-03)
--------------------------
Serde: upgraded httpclient and tweaked URL code (issue #15)
Serde: now extracting our 5 marketing fields (issue #12)
Serde: added support for client-timestamp (issue #18)
Serde: now stripping line breaks (issue #23)
Version 0.4.4 (2012-07-28)
--------------------------
Restructured into 5 sub-systems
Updated README to explain sub-systems
Version 0.4.3 (2012-07-02)
--------------------------
Removed status code checks from Serde
Serde now outputs into /upload folder (to be uploaded by SnowPlow::Etl Ruby gem)
Version 0.4.2 (2012-06-19)
--------------------------
Moved serde into /hive from own repo
Version 0.4.1 (2012-06-16)
--------------------------
Updated serde to 0.4.4
Moved documentation to wiki
Version 0.4.0 (2012-05-30)
--------------------------
Improved names of querystring params
Added page-url to QS as fallback
Added Hive Deserializer as submodule
Documentation updates
Version 0.3.0 (2012-05-18)
--------------------------
Mostly documentation
Version 0.2.0
-------------
Formalised minification process
Version 0.1.0
-------------
Initial release of SnowPlow.js