Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 Bug Report: on config reload the database connection is not reestablished with new dynamic credentials #17466

Open
2 tasks done
jgrumboe opened this issue Apr 20, 2023 · 12 comments
Labels
bug Something isn't working help wanted Help/Contributions wanted from community members

Comments

@jgrumboe
Copy link
Contributor

jgrumboe commented Apr 20, 2023

📜 Description

We use Hashicorp Vault to create dynamic database credentials for backstage and consul-template to write the app-config.yaml file containing these.
Whenever consul-template fetches new credentials and writes them to app-config.yaml, backstage catches the changes and reloads the config files, but the database connection is not reestablished with the new credentials. That leads to "error: password authentication failed" errors in processing the entities because the database connection still uses the old credentials (fetched when the process was started).

I found issue #7213, where someone described the same use case, and it was closed by #9608. This is why I thought I should be working.
In our case, the credentials are directly in the main app-config.yaml, not in an $include file.

👍 Expected behavior

After reloading config files and recognizing that credentials are changing, I would expect the database connection to be newly initialized.

👎 Actual Behavior with Screenshots

I searched for the current database username in the app-config.yaml, once right after startup.
image

So, "v-admin-di-backstag-VkCjzxAQSiymMsuCStdB-1681996840" was the initial username.

Then I waited for the TTL to expire (after 20 minutes for testing purpose) and reread the config file.
image

The username changed to "v-admin-di-backstag-7O3BEEOx8mzoxSUtAF3h-1681999388".

Then I checked the loglines of the backstage container.
2023-04-20T15:49:05.822+02:00 | {"dd":{"env":"staging","service":"backstage","version":"20230420-1.12.1-17f53315"},"level":"info","message":"Found 4 new secrets in config that will be redacted","service":"backstage"}
indicates the reload of the config. (Happens regularly in the logfile.)

2023-04-20T16:10:17.545+02:00 | {"code":"28P01","dd":{"env":"staging","service":"backstage","version":"20230420-1.12.1-17f53315"},"file":"auth.c","length":147,"level":"warn","line":"338","message":"Processing of group:default/rbmh_digital_video_player_engineer failed password authentication failed for user \"v-admin-di-backstag-VkCjzxAQSiymMsuCStdB-1681996840\"","name":"error","plugin":"catalog","routine":"auth_failed","service":"backstage","severity":"FATAL","stack":"error: password authentication failed for user \"v-admin-di-backstag-VkCjzxAQSiymMsuCStdB-1681996840\"\n at Parser.parseErrorMessage (/usr/src/app/node_modules/pg-protocol/dist/parser.js:287:98)\n at Parser.handlePacket (/usr/src/app/node_modules/pg-protocol/dist/parser.js:126:29)\n at Parser.parse (/usr/src/app/node_modules/pg-protocol/dist/parser.js:39:38)\n at Socket.<anonymous> (/usr/src/app/node_modules/pg-protocol/dist/index.js:11:42)\n at Socket.emit (node:events:513:28)\n at Socket.emit (/usr/src/app/node_modules/dd-trace/packages/datadog-instrumentations/src/net.js:61:25)\n at addChunk (node:internal/streams/readable:315:12)\n at readableAddChunk (node:internal/streams/readable:289:9)\n at Socket.Readable.push (node:internal/streams/readable:228:10)\n at TCP.onStreamRead (node:internal/stream_base_commons:190:23)","type":"plugin"}
indicates that still the old username is in use.

👟 Reproduction steps

If you're not using Hashicorp Vault to generate dynamic credentials, it may also work if you have a database with two sets of usernames/credentials for backstage.
Starting backstage with the first set of credentials in the app-config.yaml, and changing the config file to the second set after some minutes.

📃 Provide the context for the Bug.

No response

🖥️ Your Environment

OS: Darwin 22.4.0 - darwin/x64
node: v16.18.0
yarn: 1.22.19
cli: 0.22.5 (installed)
backstage: 1.12.1

Dependencies:
@backstage/app-defaults 1.2.1
@backstage/backend-app-api 0.4.1
@backstage/backend-common 0.13.5, 0.18.3
@backstage/backend-dev-utils 0.1.1
@backstage/backend-plugin-api 0.5.0
@backstage/backend-tasks 0.5.0
@backstage/catalog-client 1.4.0
@backstage/catalog-model 1.2.1
@backstage/cli-common 0.1.12
@backstage/cli 0.22.5
@backstage/config-loader 1.1.9
@backstage/config 1.0.7
@backstage/core-app-api 1.6.0
@backstage/core-components 0.12.5
@backstage/core-plugin-api 1.5.0
@backstage/errors 1.1.5
@backstage/eslint-plugin 0.1.2
@backstage/integration-aws-node 0.1.2
@backstage/integration-react 1.1.11
@backstage/integration 1.4.3
@backstage/plugin-api-docs 0.9.1
@backstage/plugin-app-backend 0.3.43
@backstage/plugin-auth-backend 0.18.1
@backstage/plugin-auth-node 0.2.12
@backstage/plugin-catalog-backend-module-github 0.2.6
@backstage/plugin-catalog-backend-module-msgraph 0.5.2
@backstage/plugin-catalog-backend 1.8.0
@backstage/plugin-catalog-common 1.0.12
@backstage/plugin-catalog-graph 0.2.28
@backstage/plugin-catalog-import 0.9.6
@backstage/plugin-catalog-node 1.3.4
@backstage/plugin-catalog-react 1.4.0
@backstage/plugin-catalog 1.9.0
@backstage/plugin-events-node 0.2.4
@backstage/plugin-github-actions 0.5.16
@backstage/plugin-home 0.4.32
@backstage/plugin-org 0.6.6
@backstage/plugin-permission-common 0.7.4
@backstage/plugin-permission-node 0.7.6
@backstage/plugin-permission-react 0.4.11
@backstage/plugin-proxy-backend 0.2.37
@backstage/plugin-scaffolder-backend 1.12.0
@backstage/plugin-scaffolder-common 1.2.6
@backstage/plugin-scaffolder-node 0.1.1
@backstage/plugin-scaffolder-react 1.2.0
@backstage/plugin-scaffolder 1.12.0
@backstage/plugin-search-backend-module-pg 0.5.4
@backstage/plugin-search-backend-node 1.1.4
@backstage/plugin-search-backend 1.2.4
@backstage/plugin-search-common 1.2.2
@backstage/plugin-search-react 1.5.1
@backstage/plugin-search 1.1.1
@backstage/plugin-tech-radar 0.6.2
@backstage/plugin-techdocs-backend 1.6.0
@backstage/plugin-techdocs-module-addons-contrib 1.0.11
@backstage/plugin-techdocs-node 1.6.0
@backstage/plugin-techdocs-react 1.1.4
@backstage/plugin-techdocs 1.6.0
@backstage/plugin-user-settings 0.7.1
@backstage/release-manifests 0.0.9
@backstage/test-utils 1.2.6
@backstage/theme 0.2.18
@backstage/types 1.0.2
@backstage/version-bridge 1.0.3

👀 Have you spent some time to check if this bug has been raised before?

  • I checked and didn't find similar issue

🏢 Have you read the Code of Conduct?

Are you willing to submit PR?

None

@jgrumboe jgrumboe added the bug Something isn't working label Apr 20, 2023
@jgrumboe
Copy link
Contributor Author

#7466 is probably linked and has an idea about how knex could be configured to support credential rotation.

@freben freben added the help wanted Help/Contributions wanted from community members label Apr 21, 2023
@freben
Copy link
Member

freben commented Apr 21, 2023

Borderline feature request rather than bug - it was never properly implemented to reestablish a connection pool when credentials changed :) I put a help wanted label on it for now.

Agreed that #7466 seems to have the right idea. Maybe coupled with some config.subscribe in the DatabaseManager class to react only if the database part of the config actually had any changes in it.

@Rugvip
Copy link
Member

Rugvip commented Apr 24, 2023

There's some existing code that does something similar that might be used as a reference: https://github.com/backstage/backstage/blob/master/packages/backend-common/src/database/connectors/sqlite3.ts

Simples might be to refactor the config loading to be done on-demand inside the connection config callback function. No need to actually observe for config changes then either I would think.

@jgrumboe
Copy link
Contributor Author

@Rugvip Could you point me to the line of code you're referring to? Maybe because I'm a real beginner in Typescript, I can't find anything looking like "reloading"/"subscribing" in the sqlite code.

@freben
Copy link
Member

freben commented May 15, 2023

@jgrumboe Note how this function is passed in as the connection, instead of a string or {} literal. I think the idea is that that function is called when requesting a connection, and there might be your chance to read the config all over again there every single time if you like.

guentherwieser added a commit to redbullmediahouse/backstage that referenced this issue Jun 20, 2023
Note: Fails to work correctly as the config provided to Knex is based on a deep copy
Signed-off-by: Günther Wieser <guenther.wieser@creative-it.com>
@guentherwieser
Copy link

guentherwieser commented Jun 20, 2023

Hi!

I've added a PR to show what we have tried so far.

We were able to detect config changes and invalidate existing Knex connections. Knex then creates new connections. Unfortunately, due to certain limitations which I will describe later, these new connections still are based on the original config.

Reasons:

  • The config supplied to the connection factory (connection.ts) from DatabaseManager.ts is a full copy of the config instance passed to the DatabaseManager - but it lacks some fundamental features like being able to subscribe to changes, and it won't get updated to the new config values when the config changes
  • This config is passed on to the database-specific connection factory, e.g. postgres.ts - thus, we can tell Knex that the connection it has created should be dropped and a new one should be created, but we cannot easily push the new config into this part of the code
  • The reason why we cannot push the changed config into the code easily is that DatabaseManager, connection factory, and database specific connection factory do A LOT OF STUFF with the connection settings

These code pieces create plugin-specific config copies if needed, cache the factories for these plugins, merge certain pre-defined properties with the ones from the user-created config, and on top of that execute database-specific code (e.g. to use a specific role with Postgres).

Long story short: to support the reload of the connection config, we would need to rewrite the DatabaseManager, the connection factory, and the database-specific factories so that all these property mergings and code executions that are needed to correctly setup a new connection, are decoupled from the Knex-specific parts. So the factories wouldn't eventually create the Knex database object but would provide all means so that we can send a Knex config to the Knex factory that allows us to retrieve a new connection based on the current config.

How Knex can be set up to dynamically retrieve the current connection config, and how the invalidation of an existing connection works, is described here (below the database-specific examples): https://knexjs.org/guide/#configuration-options

We have added these already in the code of the PR, so the connection config for Knex is already an executed function instead of properties, and the expirationChecker() function is already working (currently, in the PR it is not based on config changes but more or less a timeout). But as said, the connection function will always create the "same" connection again and again as currently there is no practical way to have access (in a meaningful way with all the magic that needs to be applied) to the changed config object.

I know that's a lot :-) My question would be if anyone sees an easier way, or has any other valuable input on our approach?

@github-actions
Copy link
Contributor

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions
Copy link
Contributor

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Oct 20, 2023
@mbenson
Copy link
Contributor

mbenson commented Oct 20, 2023

Despite the lack of progress, can we agree that the issue itself is valuable, persists and is therefore not "stale" per se? All of which to say I hope the maintainers will again remove the stale label that the poor, innocent bot applied.

@github-actions github-actions bot removed the stale label Oct 20, 2023
freben added a commit that referenced this issue Nov 9, 2023
… to move toward #17466 it became clear that the entire config reading layer needed to be concentrated in one place. So this is what this pull request does! Nothing functionally changed; it just starts out by moving all of the config parsing in DatabaseManager out to config.ts. I'll continue on this journey, but the PR would be much too horribly large if this wasn't split into parts :)

Signed-off-by: Fredrik Adelöw <freben@gmail.com>
freben added a commit that referenced this issue Nov 9, 2023
… to move toward #17466 it became clear that the entire config reading layer needed to be concentrated in one place. So this is what this pull request does! Nothing functionally changed; it just starts out by moving all of the config parsing in DatabaseManager out to config.ts. I'll continue on this journey, but the PR would be much too horribly large if this wasn't split into parts :)

Signed-off-by: Fredrik Adelöw <freben@gmail.com>
freben added a commit that referenced this issue Nov 10, 2023
… to move toward #17466 it became clear that the entire config reading layer needed to be concentrated in one place. So this is what this pull request does! Nothing functionally changed; it just starts out by moving all of the config parsing in DatabaseManager out to the config folder. I'll continue on this journey, but the PR would be much too horribly large if this wasn't split into parts :)

Signed-off-by: Fredrik Adelöw <freben@gmail.com>
freben added a commit that referenced this issue Nov 10, 2023
… to move toward #17466 it became clear that the entire config reading layer needed to be concentrated in one place. So this is what this pull request does! Nothing functionally changed; it just starts out by moving all of the config parsing in DatabaseManager out to the config folder. I'll continue on this journey, but the PR would be much too horribly large if this wasn't split into parts :)

Signed-off-by: Fredrik Adelöw <freben@gmail.com>
freben added a commit that referenced this issue Nov 10, 2023
… to move toward #17466 it became clear that the entire config reading layer needed to be concentrated in one place. So this is what this pull request does! Nothing functionally changed; it just starts out by moving all of the config parsing in DatabaseManager out to the config folder. I'll continue on this journey, but the PR would be much too horribly large if this wasn't split into parts :)

Signed-off-by: Fredrik Adelöw <freben@gmail.com>
freben added a commit that referenced this issue Nov 11, 2023
… to move toward #17466 it became clear that the entire config reading layer needed to be concentrated in one place. So this is what this pull request does! Nothing functionally changed; it just starts out by moving all of the config parsing in DatabaseManager out to the config folder. I'll continue on this journey, but the PR would be much too horribly large if this wasn't split into parts :)

Signed-off-by: Fredrik Adelöw <freben@gmail.com>
freben added a commit that referenced this issue Nov 17, 2023
… to move toward #17466 it became clear that the entire config reading layer needed to be concentrated in one place. So this is what this pull request does! Nothing functionally changed; it just starts out by moving all of the config parsing in DatabaseManager out to the config folder. I'll continue on this journey, but the PR would be much too horribly large if this wasn't split into parts :)

Signed-off-by: Fredrik Adelöw <freben@gmail.com>
Copy link
Contributor

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Dec 19, 2023
@vinzscam vinzscam removed the stale label Dec 21, 2023
Copy link
Contributor

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Feb 19, 2024
@tudi2d tudi2d removed the stale label Feb 22, 2024
Copy link
Contributor

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Apr 22, 2024
@freben freben removed the stale label Apr 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working help wanted Help/Contributions wanted from community members
Projects
None yet
Development

No branches or pull requests

8 participants