Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add support for refreshKeys #12156

Merged
merged 35 commits into from Jul 7, 2022

Conversation

kissmikijr
Copy link
Contributor

@kissmikijr kissmikijr commented Jun 20, 2022

Hey, I just made a Pull Request!

This is a potential implementation of the #12155 Nothing set in stone, would like some feedback on it!

✔️ Checklist

  • A changeset describing the change and affected packages. (more info)
  • Added or updated documentation
  • Tests for new functionality and regression tests for bug fixes
  • Screenshots attached (for UI changes)
  • All your commits have a Signed-off-by line in the message. (more info)

@kissmikijr kissmikijr requested a review from a team as a code owner June 20, 2022 19:59
@github-actions github-actions bot added awaiting-review area:catalog Related to the Catalog Project Area labels Jun 20, 2022
@github-actions
Copy link
Contributor

github-actions bot commented Jun 20, 2022

Changed Packages

Package Name Package Path Changeset Bump Current Version
@backstage/plugin-catalog-backend plugins/catalog-backend minor v1.2.1-next.2

@github-actions
Copy link
Contributor

Thanks for the contribution!
All commits need to be DCO signed before they are reviewed. Please refer to the the DCO section in CONTRIBUTING.md or the DCO status for more info.

Comment on lines 543 to 553
await Promise.all(
keys.map(k => {
return tx<DbRefreshKeysRow>('refresh_keys')
.insert({
entity_ref: stringifyEntityRef(k.entity),
key: k.key,
})
.onConflict(['entity_ref', 'key'])
.ignore();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't able to perform this with tx.batchInsert when I want to use an onConflict. Would it be better to check first if they exist and only do an insert if don't?

*/

exports.up = async function up(knex) {
await knex.schema.createTable('refresh_keys', table => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this table should have a dedicated id field ?

Copy link
Member

@jhaals jhaals left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this!

Not sure if I should comment on the RFC or the PRFC might be good to combine them next time 😅

const tx = txOpaque as Knex.Transaction;
const { keys } = options;

await Promise.all(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we want to create a new set of refresh keys for the given entity in the database so that this method actually replacing the existing keys for that entity to not get into a situation where we potentially have an ever growing list of keys.

Given that this is probably better named setRefreshKeys

@@ -81,6 +81,14 @@ export type ReplaceUnprocessedEntitiesOptions =
type: 'delta';
};

export type RefreshKeyOptions = {
keys: { key: String; entity: Entity }[];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be entityRef instead of a full entity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be, however in this case it would mean we'd need to emit the entityRefs from the processors, so the entityRef construction would happen inside all of the processors. With the current implementation they just emit an entity, and in the processing engine we can construct the refs based on the emitted entity.

I'm fine with both of them, maybe leaning towards the emission of entityRefs just for the sake of reducing the data thatis sent :D

@kissmikijr
Copy link
Contributor Author

kissmikijr commented Jun 22, 2022

Not sure if I should comment on the RFC or the PRFC might be good to combine them next time 😅

Yeah I agree, I'll definietly combine them next time, it is a littlebit awkward this way. :/

@kissmikijr kissmikijr changed the title [PRFC] - add support fot refreshKeys [PRFC] - add support for refreshKeys Jun 22, 2022
@kissmikijr
Copy link
Contributor Author

@jhaals With this my intention was to understand the problem space and try to explore the way it can be done, if you think in a general sense it looks okey, I'd move forward with adding tests, and making sure it passes the ci and move it to a regular PR.

Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
@kissmikijr kissmikijr force-pushed the store-refresh-keys-to-entities branch from 6f915db to 3125f54 Compare June 28, 2022 08:23
@kissmikijr kissmikijr requested a review from jhaals June 28, 2022 21:53
@benjdlambert benjdlambert added this to In progress in Mammalian Mathematician Jun 29, 2022
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
@kissmikijr kissmikijr requested a review from Rugvip July 5, 2022 22:33
Comment on lines 138 to 146
processingResult.refresh(
`url:${relativeUrl({
key: resolverKey,
value: resolverValue,
baseUrl: location.target,
read,
resolveUrl,
})}`,
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is gonna work for a lot of resolver implementations, but not all. I think it's worth an investigation into what it would mean to let the resolver return or emit refresh keys. Think for example an OpenAPI-aware resolver that follows $ref URLs that we then add to the set of refresh keys

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, that's gonna be more future-proof for sure! Let me see how would it look.

},
})) {
emit(parseResult);
emit(
processingResult.refresh(
`${LOCATION_TYPE}://${normalizedFilePath}`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for flip-flopping, but we're leaning back towards using the serialized location format for refresh keys after all. Feels like that's the more future proof design that will let us add new behavior without awkwardness.

Think for example a refresh key that simply references a GitHub repo, we'd probably do something like github-repo:https://github.com/backstage/backstage for that since we need pretty much all parts of that URL anyway. If we just use the scheme it would need to either be the bare https://github.com/backstage/backstage, but there's a potential risk for overlap there. Another more explicit option is then github-repo://github.com/backstage/backstage, but that's where it starts getting real awkward, especially of we have values that don't really fit as URLs.

So with that, let's switch this back to file:<path> for now, as in drop the // 😅

@@ -93,6 +93,8 @@ export class UrlReaderProcessor implements CatalogProcessor {
value: parseResults as CatalogProcessorEntityResult[],
});
}

emit(processingResult.refresh(`${location.type}:${location.target}`));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So right now leaning towards what's currently here, the same <type>:<target> format as we use for locations.

Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
@kissmikijr kissmikijr requested a review from a team as a code owner July 6, 2022 14:34
Signed-off-by: Kiss Miklos <miklos@roadie.io>
@kissmikijr
Copy link
Contributor Author

@Rugvip Addressed the comments, opted to add the emit to the resolvers as the easiest solution right now. I can see some tests failed, but they pass locally, can it be that it is flaky?

@kissmikijr kissmikijr requested a review from Rugvip July 6, 2022 15:10
Copy link
Member

@Rugvip Rugvip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feeling we're getting pretty close to :shipit: 😁

@@ -567,6 +578,7 @@ export type PlaceholderResolverParams = {
baseUrl: string;
read: PlaceholderResolverRead;
resolveUrl: PlaceholderResolverResolveUrl;
emit: CatalogProcessorEmit;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, think this makes sense! 👍

@freben could you have a look as well to make sure you think this fits alright with how the placeholder processor is put together?

I'm thinking we should also call out in the changeset that custom placeholder resolvers should be updated to emit refresh keys.

Comment on lines 76 to 77
refresh(options: RefreshOptions): Promise<void>;
refreshByRefreshKeys(options: RefreshByRefreshKeysOptions): Promise<void>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about making this an addition to the existing RefreshOptions rather than a new method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we won't expose the refreshKeys on the DefualtRefreshService we won't need to add the options to the existing RefreshOptions either, so I'll hold on to this and do it based on how we decide.

Comment on lines 52 to 56
async refreshByRefreshKeys(options: RefreshByRefreshKeysOptions) {
await this.database.transaction(async tx => {
await this.database.refreshByRefreshKeys(tx, options);
});
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel we need to do something different here tbh, just passing through without authorization is pretty unexpected. What I'm thinking is that we'd keep refreshing by keys internal, so it's not a thing you can request via the API. The only way to refresh via keys would be some form of direct runtime access for example via the entity providers API.

Have you considered this step yet? didn't see it in the RFC. How do you end up wanting to pipe refresh data into the catalog? Maybe we could even skip this part in the PR and leave it to a followup

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I agree my initial idea was to just expose it on the refresh service and have its own endpoint. I can accept that we do not want to do this.

The other thing that seems kinda easy is to add the refreshing to the EntityProvider, they have access to the DefaultProcessingDatabase so should be easy to expose a new method that calls the appropriate refresh function. However, I don't think EntityProviders should handle refreshes, for me they should care about creating/deleting the entities. So I'd probably not put this behaviour on to the EntityProviders.

I think refreshing belongs to the catalog. So I am thinking about something like that the CatalogBuilder could return a refresh function and then everyone could decide how they want to use that to refresh their stuff.

stg like:

export default async function createPlugin(
  env: PluginEnvironment,
): Promise<Router> {
  const builder = await CatalogBuilder.create(env);
  builder.addProcessor(new ScaffolderEntitiesProcessor());
  const { processingEngine, router, refresh } = await builder.build();
  
  await processingEngine.start();
  
  refresh(['refreshKey-1', 'refreshKey-2'])

  return router;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually removed the possibility to trigger refreshes via the RefreshService to be able to tackle it in a separate PR if we decide it that way.

@kissmikijr kissmikijr changed the title [PRFC] - add support for refreshKeys feat: add support for refreshKeys Jul 7, 2022
@Rugvip Rugvip added the needs discussion Bring up for discussion during next sync label Jul 7, 2022
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Signed-off-by: Kiss Miklos <miklos@roadie.io>
Mammalian Mathematician automation moved this from Review in progress to Reviewer approved Jul 7, 2022
Copy link
Member

@Rugvip Rugvip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks! 👍

Shall we :shipit: as is and leave the triggering of refreshes as a followup?

I think the builder addition could work, especially for use-cases like your own. Looking at #11611 this is very likely to be an extension point though, so I almost doesn't matter how we implement it for now as it'll be migrated to that anyway.

I would like us to explore the EntityProvider approach too though, making it possible for a provider to trigger refreshes through the connection. Or maybe even a new form of abstraction like some form of refresh sources

@kissmikijr
Copy link
Contributor Author

Shall we :shipit: as is and leave the triggering of refreshes as a followup?

I'd say so let's merge this one and I'll create a follow-up issue where we can discuss the different approaches for triggering the refreshes. I can explore both and create some PRs with the different approaches to discuss which fits better.

@Rugvip Rugvip merged commit 9a03e9c into backstage:master Jul 7, 2022
Mammalian Mathematician automation moved this from Reviewer approved to Done Jul 7, 2022
@Rugvip
Copy link
Member

Rugvip commented Jul 7, 2022

🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:catalog Related to the Catalog Project Area needs discussion Bring up for discussion during next sync
Projects
Development

Successfully merging this pull request may close these issues.

None yet

3 participants