Merge c28f793 into 43e11b5
istreeter committed Jul 8, 2021
2 parents 43e11b5 + c28f793 commit ce6f8cd
Showing 44 changed files with 315 additions and 93 deletions.
28 changes: 28 additions & 0 deletions CHANGELOG
@@ -1,3 +1,31 @@
Version 0.2.0 (2021-07-08)
--------------------------
Improve configuration defaults and examples (#41)
Remove unused name and id fields from config hocon (#38)
Ignore missing comment on atomic events table (#36)
Add PubSub configuration to the example config file (#34)
Rename storage to output in config file (#33)
Use jsonb instead of varchar for complex enums (#31)
Add ability to disable CloudWatch metric reportings (#11)
Remove requirement to always use Kinesis Enhanced Fan Out (#14)
Bump postgresql to 42.2.0 (#15)
Bump schema-ddl to 0.13.1 (#27)
Bump log4s to 1.10.0 (#26)
Bump fs2 to 2.5.6 (#25)
Bump doobie to 0.13.4 (#24)
Bump fs2-google-pubsub to 0.17.0 (#23)
Bump fs2 aws to 3.1.1 (#22)
Bump circe to 0.14.1 (#30)
Bump cats-effect to 2.5.1 (#29)
Bump decline to 2.0.0 (#28)
Bump commons-codec to 1.15 (#21)
Use hocon for cli config instead of self-describing json (#20)
Build docker images for amd64 and arm64 architectures (#19)
Use AdoptOpenJDK 11 as docker base image (#18)
Publish to maven central instead of bintray (#17)
Attach jar files to github release (#10)
Fix type derivation of enum fields (#12)

Version 0.1.1 (2020-11-10)
--------------------------
Add support for nullable fields (#6)
11 changes: 6 additions & 5 deletions README.md
@@ -9,20 +9,20 @@

Assuming [Docker][docker] is installed:

-1. Add own [`config.json`][config] (specify connection and stream details)
+1. Add own `config.hocon` to specify connection and stream details, using [the examples][config] as a guide and [the docs site][config-docs] as a reference.
2. Add own [`resolver.json`][resolver] (all schemas must be on [Iglu Server 0.6.0+][iglu-server])
3. Run the Docker image:

```bash
$ docker run --rm -v $PWD/config:/snowplow/config \
snowplow/snowplow-postgres-loader:latest \
--resolver /snowplow/config/resolver.json \
---config /snowplow/config/config.json
+--config /snowplow/config/config.hocon
```

## Copyright and License

-Snowplow Postgres Loader is copyright 2020 Snowplow Analytics Ltd.
+Snowplow Postgres Loader is copyright 2020-2021 Snowplow Analytics Ltd.

Licensed under the **[Apache License, Version 2.0][license]** (the "License");
you may not use this software except in compliance with the License.
@@ -33,8 +33,9 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

-[config]: https://github.com/snowplow-incubator/snowplow-postgres-loader/blob/master/config/config.json
-[resolver]: https://github.com/snowplow-incubator/snowplow-postgres-loader/blob/master/config/resolver.json
+[config]: ./config/
+[resolver]: ./config/resolver.json
+[config-docs]: https://docs.snowplowanalytics.com/docs/pipeline-components-and-applications/loaders-storage-targets/snowplow-postgres-loader/postgres-loader-configuration-reference/

[docker]: https://www.docker.com/
[iglu-server]: https://github.com/snowplow-incubator/iglu-server
2 changes: 2 additions & 0 deletions build.sbt
@@ -56,6 +56,7 @@ lazy val loader = project
.settings(BuildSettings.addExampleConfToTestCp)
.settings(BuildSettings.assemblySettings)
.settings(BuildSettings.dynVerSettings)
.settings(BuildSettings.testSettings)
.settings(
libraryDependencies ++= Seq(
Dependencies.circeConfig,
@@ -71,3 +72,4 @@ lazy val loader = project
.enablePlugins(JavaAppPackaging, DockerPlugin, BuildInfoPlugin)

addCompilerPlugin("com.olegpy" %% "better-monadic-for" % "0.3.1")

28 changes: 28 additions & 0 deletions config/README.md
@@ -0,0 +1,28 @@
## Example snowplow-postgres-loader configuration files

These example hocon files can be provided with the `--config` command line argument.

* [`config.kinesis.minimal.hocon`](./config.kinesis.minimal.hocon): A minimal config file containing
only the required fields when streaming from AWS kinesis.
* [`config.pubsub.minimal.hocon`](./config.pubsub.minimal.hocon): A minimal config file containing
only the required fields when streaming from GCP PubSub.
* [`config.kinesis.reference.hocon`](./config.kinesis.reference.hocon): A complete config file demonstrating
all relevant fields when streaming from AWS kinesis
* [`config.pubsub.reference.hocon`](./config.pubsub.reference.hocon): A complete config file demonstrating
all relevant fields when streaming from GCP PubSub.

#### Iglu resolver config

[`resolver.json`](./resolver.json) is an example iglu resolver file, to be provided with the `--resolver`
command line argument, required in addition to the main hocon file.

All repositories listed in the resolver must be Iglu Servers version 0.6.0 or above.


#### Links

See [the lightbend config readme](https://github.com/lightbend/config/blob/main/README.md) for examples
of how to provide config parameters using environment variable substitutions or system properties.

See [the snowplow docs site](https://docs.snowplowanalytics.com/docs/pipeline-components-and-applications/loaders-storage-targets/snowplow-postgres-loader/postgres-loader-configuration-reference/)
for the full snowplow-postgres-loader configuration reference.
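
The substitution mechanism referenced above is what lets the minimal example files read the database password from `${POSTGRES_PASSWORD}`. A minimal sketch of supplying that variable when launching the Docker image, assuming the invocation shown in the top-level README; the `loader_cmd` helper is hypothetical, the password is a placeholder, and the command is only printed rather than executed:

```bash
# Hypothetical helper: prints (does not run) the loader command for one example config file.
loader_cmd() {
  # "-e POSTGRES_PASSWORD" with no value forwards the caller's variable into the container,
  # where the config's ${POSTGRES_PASSWORD} substitution can pick it up.
  printf '%s ' docker run --rm -e POSTGRES_PASSWORD \
    -v "$PWD/config:/snowplow/config" \
    snowplow/snowplow-postgres-loader:latest \
    --resolver /snowplow/config/resolver.json
  printf '%s\n' "--config /snowplow/config/$1"
}

# Placeholder secret; in practice this would come from a secrets manager or the CI environment.
export POSTGRES_PASSWORD='change-me'

# Print the command for the minimal Kinesis example; swap in config.pubsub.minimal.hocon for PubSub.
loader_cmd config.kinesis.minimal.hocon
```

Forwarding the variable by name keeps the secret out of both the config file and the printed command line.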
18 changes: 18 additions & 0 deletions config/config.kinesis.minimal.hocon
@@ -0,0 +1,18 @@
# The minimum required config options for loading from kinesis
{
"input": {
"type": "Kinesis"
"streamName": "enriched-events"
"region": "eu-central-1"
}

"output" : {
"type": "Postgres"
"host": "localhost"
"database": "snowplow"
"username": "postgres"
"password": ${POSTGRES_PASSWORD}
"schema": "atomic"
}
}

@@ -1,9 +1,4 @@
{
-# Human-readable storage target name, used only for logging
-"name": "Acme Ltd. Snowplow Postgres"
-# Machine-readable unique identifier
-"id": "5c5e4353-4eeb-43da-98f8-2de6dc7fa947"

"input": {
# Enable the Kinesis source
"type": "Kinesis"
@@ -24,13 +19,6 @@
# }
}

-# Alternative input block for PubSub source
-# "input": {
-# "type": "PubSub"
-# "projectId": "my-project"
-# "subscriptionId": "my-subscription"
-# }

"output" : {
"type": "Postgres"
# PostgreSQL host ('localhost' for enabled SSH Tunnel)
@@ -59,4 +47,5 @@
# "cloudWatch": false
}
}

}
17 changes: 17 additions & 0 deletions config/config.pubsub.minimal.hocon
@@ -0,0 +1,17 @@
# The minimum required config options for loading from PubSub
{
"input": {
"type": "PubSub"
"projectId": "my-project"
"subscriptionId": "my-subscription"
}

"output" : {
"type": "Postgres"
"host": "localhost"
"database": "snowplow"
"username": "postgres"
"password": ${POSTGRES_PASSWORD}
"schema": "atomic"
}
}
33 changes: 33 additions & 0 deletions config/config.pubsub.reference.hocon
@@ -0,0 +1,33 @@
{
"input": {
# Enable the Pubsub source
"type": "PubSub"
# Your GCP project id
"projectId": "my-project"
# Your GCP PubSub subscription id
"subscriptionId": "my-subscription"
}

"output" : {
"type": "Postgres"
# PostgreSQL host ('localhost' for enabled SSH Tunnel)
"host": "localhost"
# PostgreSQL database port
"port": 5432
# PostgreSQL database name
"database": "snowplow"
# PostgreSQL user to load data
"username": "postgres"
# PostgreSQL password, either plain text or from an environment variable
"password": "mysecretpassword"
"password": ${?POSTGRES_PASSWORD}
# PostgreSQL database schema
"schema": "atomic"
# JDBC ssl mode
"sslMode": "REQUIRE"
}

# Kind of data stored in this instance. Either ENRICHED_EVENTS or JSON
"purpose": "ENRICHED_EVENTS"

}
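
The paired `password` lines above rely on HOCON's optional substitution: `${?POSTGRES_PASSWORD}` replaces the plain-text value only when the variable is defined. As a rough shell analogue of that default-with-override behaviour (the `effective_password` helper is hypothetical and is not part of the loader):

```bash
# Rough shell analogue of the two "password" lines in the reference config:
# the literal acts as a default, and the environment variable overrides it when defined.
effective_password() {
  # ${VAR-default} falls back only when VAR is unset, loosely mirroring HOCON's ${?VAR}.
  printf '%s\n' "${POSTGRES_PASSWORD-mysecretpassword}"
}

unset POSTGRES_PASSWORD
effective_password            # prints mysecretpassword

POSTGRES_PASSWORD='from-env'
effective_password            # prints from-env
```

Keeping a literal default is convenient for local testing, while deployments can override it without editing the file.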
17 changes: 2 additions & 15 deletions config/resolver.json
@@ -5,27 +5,14 @@
"cacheTtl": 600,
"repositories": [
{
-"name": "Iglu Central",
+"name": "Iglu Central Mirror",
"priority": 1,
"vendorPrefixes": [
"com.snowplowanalytics"
],
"connection": {
"http": {
-"uri": "http://iglucentral.com"
-}
-}
-},
-
-{
-"name": "Iglu Central - Mirror 01",
-"priority": 1,
-"vendorPrefixes": [
-"com.snowplowanalytics"
-],
-"connection": {
-"http": {
-"uri": "http://mirror01.iglucentral.com"
+"uri": "http://my-iglu-mirror.example.com"
}
}
}
@@ -1,5 +1,5 @@
/*
-* Copyright (c) 2020 Snowplow Analytics Ltd. All rights reserved.
+* Copyright (c) 2020-2021 Snowplow Analytics Ltd. All rights reserved.
*
* This program is licensed to you under the Apache License Version 2.0,
* and you may not use this file except in compliance with the Apache License Version 2.0.
@@ -1,5 +1,5 @@
/*
-* Copyright (c) 2020 Snowplow Analytics Ltd. All rights reserved.
+* Copyright (c) 2020-2021 Snowplow Analytics Ltd. All rights reserved.
*
* This program is licensed to you under the Apache License Version 2.0,
* and you may not use this file except in compliance with the Apache License Version 2.0.
@@ -35,6 +35,7 @@ import com.snowplowanalytics.iglu.schemaddl.migrations.FlatSchema

import com.snowplowanalytics.snowplow.analytics.scalasdk.Event
import com.snowplowanalytics.snowplow.badrows.{BadRow, Failure, FailureDetails, Payload, Processor}
+import com.snowplowanalytics.snowplow.postgres.storage.definitions.EventsTableName
import Entity.Column
import Shredded.{ShreddedSelfDescribing, ShreddedSnowplow}

@@ -103,7 +104,7 @@ object transform {
.map { cols =>
val columns = cols.collect { case Some(c) => c }
val tableName = data.schema match {
-case Atomic => "events"
+case Atomic => EventsTableName
case other => StringUtils.getTableName(SchemaMap(other))
}
ShreddedSelfDescribing(Entity(tableName, data.schema, columns))
@@ -1,5 +1,5 @@
/*
-* Copyright (c) 2020 Snowplow Analytics Ltd. All rights reserved.
+* Copyright (c) 2020-2021 Snowplow Analytics Ltd. All rights reserved.
*
* This program is licensed to you under the Apache License Version 2.0,
* and you may not use this file except in compliance with the Apache License Version 2.0.
@@ -30,7 +30,6 @@ import com.snowplowanalytics.iglu.schemaddl.migrations.{SchemaList => DdlSchemaL

import com.snowplowanalytics.snowplow.badrows.FailureDetails
import com.snowplowanalytics.snowplow.postgres.shredding.schema.fetch
-import com.snowplowanalytics.snowplow.postgres.shredding.transform.Atomic
import com.snowplowanalytics.snowplow.postgres.streaming.IgluErrors
import com.snowplowanalytics.snowplow.postgres.streaming.sink.Insert

@@ -90,16 +89,10 @@ object ddl {

DdlSchemaList.fromSchemaList(list, fetch[F](resolver)).leftMap(IgluErrors.of).map { list =>
val statement = generator(list)
-val tableName = getTableName(origin)
+val tableName = StringUtils.getTableName(SchemaMap(origin))
statement.update().run.void *>
sql.commentTable(schema, tableName, list.latest)
}
}
}

-def getTableName(schemaKey: SchemaKey): String =
-schemaKey match {
-case Atomic => "events"
-case other => StringUtils.getTableName(SchemaMap(other))
-}
}
