Releases: pganalyze/collector
v0.36.0
- Config parsing improvements:
- Fail fast when pganalyze section is missing in config file
- Ignore duplicates in db_name config setting
- Previously this could cause malformed snapshots that would be submitted
correctly but could not be processed
- Validate db_url parsing to avoid collector crash with invalid URLs
- Include pganalyze-collector-setup program (see 0.35 release notes) in supported packages
- Rename <unidentified queryid> query text placeholder to <query text unavailable>
- This makes it clearer what the underlying issue is
- Revert to using <truncated query> instead of <unparsable query> in some situations
- When a query is cut off due to the pg_stat_activity limit being reached,
show <truncated query>, to make it clear that increasing track_activity_query_size
would solve the issue
- Ignore I/O stats for AWS Aurora utility statements
- AWS Aurora appears to report incorrect blk_read_time and blk_write_time values
for utility statements (i.e., non-SELECT/INSERT/UPDATE/DELETE); we zero these out for now
- Fix log-based EXPLAIN bug where query samples could be dropped if EXPLAIN failed
- Add U140 log event (inconsistent range bounds)
- e.g.: ERROR: range lower bound must be less than or equal to range upper bound
- Fix issue where incomplete schema information in snapshots was not marked correctly
- This could lead to schema objects disappearing and being re-created
- Fix trailing newline handling for GCP and self-hosted log streams
- This could lead to queries being poorly formatted in the UI, or some queries
with single-line comments being ignored
- Include additional collector configuration settings in snapshot metadata for diagnostics
- Ignore "insufficient privilege" queries w/o queryid
- Previously, these could all be aggregated together yielding misleading stats
v0.35.0
- Add new "pganalyze-collector-setup" program that streamlines collector installation
- This is initially targeted at self-managed servers to make it easier to set up
the collector and required configuration settings for a locally running Postgres
server
- To start, this supports the following environments:
- Postgres 10 and newer, running on the same server as the collector
- Ubuntu 14.04 and newer
- Debian 10 and newer
- Collector test: Show server URLs to make it easier to access the servers in
pganalyze after the test
- Collector test + reload: In case of errors, return exit code 1
- Ignore manual vacuums if the collector can't access pg_stat_progress_vacuum
- Don't run log test for Heroku, instead provide info message
- Also fixes "Unsupported log_line_prefix setting: ' sql_error_code = %e '"
error on Heroku Postgres
- Add pganalyze system user to adm group in Debian/Ubuntu packages
- This gives the collector permission to read Postgres log files in a default
install, simplifying Log Insights setup
- Handle NULL parameters for query samples correctly
- Add a skip_if_replica / SKIP_IF_REPLICA option (#117)
- You can use this to configure the collector in a no-op mode on
replicas (we only check whether the monitored database is a replica), and
automatically switch to active monitoring when the database is no
longer a replica.
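As a sketch of how this might look in the collector's INI-style config file (the server section name and connection settings are illustrative):

```ini
[pganalyze]
api_key = your_api_key

[replica1]
db_host = 10.0.0.5
db_name = mydb
db_username = pganalyze
# No-op while this server is a replica; monitoring resumes after promotion:
skip_if_replica = true
```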
- Stop building packages for CentOS 6 and Ubuntu 14.04 (Trusty)
- Both of these systems are now end of life, and the remaining survivor
of the CentOS 6 line (Amazon Linux 1) will be EOL on December 31st 2020.
v0.34.0
- Check and report problematic log collection settings
- Some Postgres settings almost always cause a drastic increase in log
volume for little actual benefit. They tend to cause operational problems
for the collector (due to the load of additional log parsing) and the
pganalyze service itself (or indeed, likely for any service that would
process collector snapshots), and do not add any meaningful insights.
Furthermore, we found that these settings are often turned on
accidentally.
- To avoid these issues, add some client-side checks in the collector to
disable log processing if any of the problematic settings are on.
- The settings in question are:
- log_min_duration_statement less than 10ms
- log_statement set to 'all'
- log_duration set to 'on'
- log_error_verbosity set to 'verbose'
- If any of these are set to these unsupported values, all log collection will be
disabled for that server. The settings are re-checked every full snapshot, and can be
explicitly re-checked with a collector reload.
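For reference, a postgresql.conf fragment that stays within these limits might look like this (the specific values are illustrative):

```ini
log_min_duration_statement = 1000  # 1s; values below 10ms disable log collection
log_statement = 'none'             # 'all' disables log collection
log_duration = off                 # 'on' disables log collection
log_error_verbosity = default      # 'verbose' disables log collection
```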
- Log Insights improvements
- Self-managed server: Process logs every 3 seconds, instead of on-demand
- Self-managed server: Improve handling of multi-line log events
- Google Cloud SQL: Always acknowledge Pub/Sub messages, even if the collector doesn't handle them
- Optimize stitching logic for reduced CPU consumption
- Explicitly close temporary files to avoid running out of file descriptors
- Multiple changes to improve debugging in support situations
- Report collector config in full snapshot
- This reports certain collector config settings (except for passwords/keys/credentials)
to the pganalyze servers to help with debugging.
- Print collector version at beginning of test for better support handling
- Print collection status and Postgres version before submitting snapshots
- Change panic stack trace logging from Verbose to Warning
- Report collector config in full snapshot
- Add support for running the collector on ARM systems
- Note that we don't provide packages yet, but with this the collector
can be built on ARM systems without any additional patches.
- Introduce API system scope fallback
- This fallback is intended to allow changing the API scope, either based
on user configuration (e.g. moving the collector between different
cloud provider accounts), or because of changes in the collector identity
system logic.
- The new "api_system_scope_fallback" / PGA_API_SYSTEM_SCOPE_FALLBACK config
variable is intended to be set to the old value of the scope. When the
pganalyze backend receives a snapshot with a fallback scope set, and there
is no server created with the regular scope, it will first search the
servers with the fallback scope. If found, that server's scope will be
updated to the (new) regular scope. If not found, a new server will be
created with the regular scope. The main goal of the fallback scope is to
avoid creating a duplicate server when changing the scope value.
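As an illustration, a migration between cloud provider accounts might be configured like this (the scope values are hypothetical, and this assumes the regular scope is set via the api_system_scope setting):

```ini
[pganalyze]
api_key = your_api_key
# New scope going forward:
api_system_scope = aws-account-b
# Old scope, so the backend can find the existing server and update its scope:
api_system_scope_fallback = aws-account-a
```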
- Use new fallback scope mechanism to change scope for RDS databases
- Previously we identified RDS databases by their ID and region only, but
the ID does not have to be unique within a region; it only has to be
unique within the same AWS account in that region. Thus, adjust the
scope to include both the region and AWS Account ID (if configured or
auto-detected), and use the fallback scope mechanism to migrate existing
servers.
- Add support for GKE workload identity (Yash Bhutwala, #91)
- Add support for assuming AWS instance roles
- Set the role to be assumed using the new aws_assume_role / AWS_ASSUME_ROLE
configuration setting. This is useful when the collector runs in a different
AWS account than your database.
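A sketch of a cross-account setup (the role ARN, server section, and connection details are illustrative):

```ini
[mydb]
db_host = mydb.abc123.us-east-1.rds.amazonaws.com
db_name = mydb
db_username = pganalyze
# Role in the AWS account that owns the RDS instance:
aws_assume_role = arn:aws:iam::123456789012:role/pganalyze-collector
```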
v0.33.1
- Ignore internal admin databases for GCP and Azure
- This avoids collecting data from these internal databases, which produces
unnecessary errors when using the all databases setting.
- Add log_line_prefix check to GCP self-test
- Schema stats handling: Avoid crash due to nil pointer dereference
- Add support for "%m [%p]: [%l-1] db=%d,user=%u " log_line_prefix
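In postgresql.conf, that newly supported prefix corresponds to:

```ini
log_line_prefix = '%m [%p]: [%l-1] db=%d,user=%u '
```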
v0.33.0
- Add helper for log-based EXPLAIN access and use if available
- This lets us avoid granting the pganalyze user any access to the data,
following the principle of least privilege
- See https://github.com/pganalyze/collector#setting-up-log-explain-helper
- Avoid corrupted snapshots when OIDs get reused across databases
- This would have shown as data not being visible in pganalyze,
particularly for servers with many databases where tables were
dropped and recreated often
- Locked relations: Ignore table statistics, handle other exclusive locks
- Tables being rewritten would cause the relation statistics query to
fail due to statement timeout (caused by the lock being held)
- Non-relation locks held in AccessExclusiveLock mode would cause all
relation information to disappear, except for the top-level relation
information. This is due to the behaviour of NOT IN when the list
contains NULLs (never being true, even if an item doesn't match the
list). The top-level relation information was using a LEFT JOIN that
doesn't suffer from this problem. This likely caused problems reported
as missing index information, or indices showing as being recently
created even though they've existed for a while.
- Improvements to table partitioning reporting
- Enable additional settings to work correctly when used in Heroku/Docker
- DB_NAME
- DB_SSLROOTCERT_CONTENTS
- DB_SSLCERT_CONTENTS
- DB_SSLKEY_CONTENTS
v0.32.0
- Add ignore_schema_regexp / IGNORE_SCHEMA_REGEXP configuration option
- This is like ignore_table_pattern, but pushed down into the actual
stats-gathering queries to improve performance. This should work much
better on very large schemas
- We use a regular expression instead of the current glob-like matching
since the former is natively supported in Postgres
- We now warn on usage of the deprecated ignore_table_pattern field
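A hypothetical example, excluding per-tenant partition schemas in the collector config (the pattern and server section are illustrative):

```ini
[mydb]
db_host = localhost
db_name = mydb
db_username = pganalyze
# Skip schemas matching this Postgres regular expression during stats collection:
ignore_schema_regexp = ^(partitions_|tenant_)
```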
- Add warning for too many tables being collected (and recommend ignore_schema_regexp)
- Allow keeping of unparsable query texts by setting filter_query_text: none
- By default we replace everything with <unparsable query> (renamed
from the previous <truncated query> for clarity), to avoid leaking
sensitive data that may be contained in query texts that couldn't be
parsed and that Postgres itself doesn't mask correctly (e.g. utility
statements)
- However, in some situations it may be desirable to have the original
query texts instead, e.g. when the collector parser is outdated
(right now the parser is Postgres version 10, and some newer Postgres 12
query syntax fails to parse)
- To support this use case, a new "filter_query_text" / FILTER_QUERY_TEXT
option is introduced which can be set to "none" to keep all query texts.
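Keeping the original query texts could then be configured like this (the server section is illustrative):

```ini
[mydb]
db_host = localhost
db_name = mydb
db_username = pganalyze
# Keep query texts even when the collector fails to parse them
# (be aware this may expose sensitive data from unparsable statements):
filter_query_text = none
```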
- EXPLAIN plans / Query samples: Support log line prefix without %d and %u
- Whilst not recommended, in some scenarios changing the log_line_prefix
is difficult, and we want to make it easy to get EXPLAIN data even in
those scenarios
- In case the log_line_prefix is missing the database (%d) and/or the user
(%u), we simply use the user and database of the collector connection
- Log EXPLAIN: Run on all monitored databases, not just the primary database
- Add support for stored procedures (new with Postgres 11)
- Handle Postgres error checks using Go 1.13 error helpers
- This is more correct going forward, and adds a required type check for
the error type, since the database methods can also return net.OpError
- Fixes "panic: interface conversion: error is *net.OpError, not *pq.Error"
- Collect information on table partitions
- Relation parents as well as partition boundary (if any)
- Partitioning strategy in use
- List of partitioning fields and/or expression
- Log Insights: Track TLS protocol version as a log line detail
- This allows verification of which TLS versions were used to connect to the
database over time
- Log Insights: Track host as detail for connection received event
- This allows more detailed analysis of which IPs/hostnames have connected
to the database over time
- Example collector config: Use collect all databases option in the example
- This improves the chance that this is set up correctly from the
beginning, without requiring a backwards incompatible change in the
collector
v0.31.0
- Add Log Insights support for Azure Database for PostgreSQL
- Log Insights: Avoid unnecessary "Timeout" error when there are other failures
- Log EXPLAIN: Don't run EXPLAIN logic when there are no query samples
- Improve non-fatal error messages to clarify the collector still works
- Log grant failure: Explain root cause better (plan doesn't support it / fair use limit reached)
v0.30.0
- Track local replication lag in bytes
- RDS: Handle end of log files correctly
- High-frequency query collection: Avoid race condition, run in parallel
- This also resolves a memory leak in the collector that was causing
increased memory usage over time for systems that have a lot of
pg_stat_statements query texts (causing the full snapshot to take
more than a minute, which triggered the race condition)
v0.29.0
- Package builds: Use Golang 1.14.3 patch release
- This fixes golang/go#37436 which was causing
"mlock of signal stack failed: 12" on Ubuntu systems
- Switch to simpler tail library to fix edge case bugs for self-managed systems
- The hpcloud library has been unmaintained for a while, and whilst
the new choice doesn't have much activity, in tests it has shown
to work better, as well as having significantly fewer lines of code
- This also should make "--test" work reliably for self-managed systems
(before this, it returned "Timeout" most of the time)
- Index statistics: Don't run relation_size on exclusively locked indices
- Previously the collector was effectively hanging when it encountered an
index that has an ExclusiveLock held (e.g. due to a REINDEX)
- Add another custom log line prefix: "%m %r %u %a [%c] [%p] "
- RDS fixes
- Fix handling of auto-detection of AWS regions outside of us-east-1
- Remember log marker from previous runs, to avoid duplicate log lines
- Add support for Postgres 13
- This adds support for running against Postgres 13, which otherwise breaks
due to backwards-incompatible changes in pg_stat_statements
- Note that there are many other new statistics views and metrics that
will be added separately
v0.28.0
- Add "db_sslkey" and "db_sslcert" options to use SSL client certificates
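A sketch of client certificate authentication in the config file (paths, host, and server section are illustrative; db_sslmode and db_sslrootcert are pre-existing settings):

```ini
[mydb]
db_host = db.internal.example.com
db_name = mydb
db_username = pganalyze
db_sslmode = verify-full
db_sslrootcert = /etc/pganalyze/ca.pem
# New in this release: client certificate and key for SSL authentication
db_sslcert = /etc/pganalyze/client.crt
db_sslkey = /etc/pganalyze/client.key
```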
- Add Ubuntu 20.04 packages
- Update to Go 1.14, latest libpq
- Ensure that we set system type correctly for Heroku full snapshots
- Detect cloud providers based on hostnames from DB_URL / db_url as well
- Previously this was only detected for the DB_HOST / db_host setting, and that is unnecessarily restrictive
- Note that this means your instance may show up under a new ID in pganalyze after upgrading to this version
- Log Explain
- Ignore pg_start_backup queries
- Support EXPLAIN for queries with parameters
- Log Insights improvements
- Experimental: Google Cloud SQL log download
- Remove unnecessary increment of log line byte end position
- Make stream-based log processing more robust
- Add direct "http_proxy" & similar collector settings for Proxy config
- This avoids problems in some environments where it's not clear whether
the environment variables are set. The environment variables HTTP_PROXY,
http_proxy, HTTPS_PROXY, https_proxy, NO_PROXY and no_proxy continue to
function as expected.
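For example (the proxy URL is illustrative):

```ini
[pganalyze]
api_key = your_api_key
# Route outbound requests to the pganalyze API through this proxy:
http_proxy = http://proxy.internal.example.com:3128
```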
- Fix bug in handling of state mutex in activity snapshots
- This may have been the cause of "unlock of unlocked mutex" errors
when having multiple servers configured.