/
datamill-co.yml
202 lines (186 loc) · 7.64 KB
/
datamill-co.yml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
capabilities: []
description: PostgreSQL database loader
dialect: postgres
domain_url: https://www.postgresql.org/
keywords:
- database
label: PostgreSQL
logo_url: /assets/logos/loaders/postgres.png
maintenance_status: active
name: target-postgres
namespace: target_postgres
pip_url: singer-target-postgres
prereq: |
#### Dependencies
`target-postgres` [requires](https://www.psycopg.org/docs/install.html#runtime-requirements) the
[`libpq` library](https://www.postgresql.org/docs/current/libpq.html) to be available on your system.
If you've installed PostgreSQL, you should already have it, but you can also install it by itself using the
[`libpq-dev` package](https://pkgs.org/download/libpq-dev) on Ubuntu/Debian or the
[`libpq` Homebrew formula](https://formulae.brew.sh/formula/libpq) on macOS.
quality: silver
repo: https://github.com/datamill-co/target-postgres
settings:
- description: |
Whether the Target should create column indexes on the important columns
used during data loading.
These indexes will make data loading slightly slower
but the deduplication phase much faster. Defaults to on for better baseline performance.
kind: boolean
label: Add Upsert Indexes
name: add_upsert_indexes
value: true
- description: Raw SQL statement(s) to execute as soon as the connection to Postgres
is opened by the target. Useful for setup like `SET ROLE` or other connection
state that is important.
label: After Run SQL
name: after_run_sql
- description: |
How often, in rows received, to count the buffered rows and bytes to
check if a flush is necessary.
There's a slight performance penalty to checking
the buffered records count or bytesize, so this controls how often this is polled
in order to mitigate the penalty. This value is usually not necessary to set as
the default is dynamically adjusted to check reasonably often.
kind: integer
label: Batch Detection Threshold
name: batch_detection_threshold
- description: Raw SQL statement(s) to execute as soon as the connection to Postgres
is opened by the target. Useful for setup like `SET ROLE` or other connection
state that is important.
label: Before Run SQL
name: before_run_sql
- description: Include `true` in your config to disable [Singer Usage Logging](https://github.com/datamill-co/target-postgres#usage-logging).
kind: boolean
label: Disable Collection
name: disable_collection
value: false
- description: Include `false` in your config to disable `target-postgres` from crashing
on invalid records
kind: boolean
label: Invalid Records Detection
name: invalid_records_detect
value: true
- description: Include a positive value `n` in your config to allow for `target-postgres`
to encounter at most `n` invalid records per stream before giving up.
kind: integer
label: Invalid Records Threshold
name: invalid_records_threshold
value: 0
- description: The level for logging. Set to `DEBUG` to get things like queries executed,
timing of those queries, etc.
kind: options
label: Logging Level
name: logging_level
options:
- label: Debug
value: DEBUG
- label: Info
value: INFO
- label: Warning
value: WARNING
- label: Error
value: ERROR
- label: Critical
value: CRITICAL
value: INFO
- description: The maximum number of rows to buffer in memory before writing to the
destination table in Postgres
kind: integer
label: Max Batch Rows
name: max_batch_rows
value: 200000
- description: |
The maximum number of bytes to buffer in memory before writing to the destination table in Postgres. Default: 100MB in bytes
kind: integer
label: Max Buffer Size
name: max_buffer_size
value: 104857600
- description: Whether the Target should create tables which have no records present
in Remote.
kind: boolean
label: Persist Empty Tables
name: persist_empty_tables
value: false
- label: Database
name: postgres_database
- label: Host
name: postgres_host
value: localhost
- kind: password
label: Password
name: postgres_password
sensitive: true
- kind: integer
label: Port
name: postgres_port
value: 5432
- aliases:
- schema
description: |
Note that `$MELTANO_EXTRACT__LOAD_SCHEMA` [will expand to](https://docs.meltano.com/guide/configuration.html#expansion-in-setting-values) the value of the [`load_schema` extra](https://docs.meltano.com/concepts/plugins#load-schema-extra) for the extractor used in the pipeline, which defaults to the extractor's namespace, e.g. `tap_gitlab` for [`tap-gitlab`](/extractors/gitlab).
Name of the schema where the tables will be created. If `schema_mapping`
is not defined then every stream sent by the tap is loaded into this schema.
label: Schema
name: postgres_schema
value: $MELTANO_EXTRACT__LOAD_SCHEMA
- description: Only used if a SSL request w/ a client certificate is being made
label: SSL Cert
name: postgres_sslcert
value: ~/.postgresql/postgresql.crt
- description: Used for authentication of a server SSL certificate
label: SSL CRL
name: postgres_sslcrl
value: ~/.postgresql/root.crl
- description: Only used if a SSL request w/ a client certificate is being made
label: SSL Key
name: postgres_sslkey
value: ~/.postgresql/postgresql.key
- description: Refer to the [libpq
docs](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS)
for more information about SSL.
label: SSL Mode
name: postgres_sslmode
value: prefer
- description: Used for authentication of a server SSL certificate
name: postgres_sslrootcert
value: ~/.postgresql/root.crt
- label: Username
name: postgres_username
- description: |
Whether the Target should emit `STATE` messages to stdout for further
consumption.
In this mode, which is on by default, STATE messages are buffered
in memory until all the records that occurred before them are flushed according
to the batch flushing schedule the target is configured with.
kind: boolean
label: State Support
name: state_support
value: true
settings_group_validation:
- - postgres_database
- postgres_host
- postgres_password
- postgres_port
- postgres_schema
- postgres_username
target_schema: $TARGET_POSTGRES_POSTGRES_SCHEMA
usage: |
## Troubleshooting
### Error: `psycopg2.ProgrammingError: syntax error at or near "-"`
This error message indicates that the extractor you are using this loader with generates
stream names that include the source database schema in addition to the table name: `<schema>-<table>`, e.g. `public-accounts`.
This is not supported by [this variant](#alternative-variants) of `target-postgres`.
Instead, use the [`transferwise` variant](https://meltano.com/plugins/loaders/postgres--transferwise.html) which was made to be used with extractors that behave this way.
### Error: `pg_config executable not found` or `libpq-fe.h: No such file or directory`
This error message indicates that the [`libpq`](https://www.postgresql.org/docs/current/libpq.html) dependency is missing.
To resolve this, refer to the ["Dependencies" section](#dependencies) above.
### `ld`, `clang`, `lssl`, or `linker` Errors
If you have an error message like:
* `ld: library not found for -lssl`
* `clang: error: linker command failed with exit code 1`
* `error: command 'clang' failed with exit status 1`
These error messages indicates that there is a problem installing OpenSSL.
This [Stack Overflow answer](https://stackoverflow.com/questions/26288042/error-installing-psycopg2-library-not-found-for-lssl)
has specific recommendations on setting the `LDFLAGS` and/or `CPPFLAGS` environment variables.
Set those prior to running `meltano add loader target-postgres`.
variant: datamill-co