
Time resource shouldn't share version history #2908

Closed
vito opened this issue Dec 4, 2018 · 12 comments · Fixed by #3133
Labels: enhancement, release/documented, resiliency

vito (Member) commented Dec 4, 2018

What challenge are you facing?

With #2386 we identified that the time resource would be adversely affected by a shared history, because it'll fire off a ton of builds at once if everyone configures the same interval.

What would make this better?

In concourse/rfcs#1 we're planning to make this feature (shared history) opt-out, by allowing the resource's info script to include some configuration that disables it.

We don't want to do all of concourse/rfcs#1 just yet, so for now let's just hack in some way for v1 resources to opt-out.

Implementation note discussed with @clarafu: this could work by generating a random hash for resources that have opted out and using it to produce a unique resource config for each such resource. Everything else would keep working as-is: versions remain tied to resource configs, and we'd just have different resource configs.
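A minimal sketch of that idea, assuming resource configs are deduplicated by a hash of their source (the table and column names here are illustrative, not the actual schema): mixing a per-resource random nonce into the hash means no other resource can ever resolve to the same resource config, so its version history is effectively private.

```sql
-- Illustrative only: give an opted-out resource a one-off source hash so it
-- can never share a resource_configs row (and thus never share versions).
-- gen_random_uuid() requires the pgcrypto extension on older Postgres.
UPDATE resource_configs
SET source_hash = md5(source_hash || gen_random_uuid()::text)
WHERE id = (SELECT resource_config_id FROM resources WHERE resources.id = $1);
```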

clarafu (Contributor) commented Jan 2, 2019

After lots of discussion, we have decided to change the database schema in a few ways that differ from the original plan. For this version, we will no longer use the /info script to determine whether or not a resource should have shared version history; instead, the resource_metadata.json file (which already exists) within each resource type's tarball will return a unique_version_history field alongside the existing fields (such as privileged). We might still implement /info for the Resources V2 epic, but we didn't want to worry about that for now because it might involve more architectural changes. One thing to note is that the worker version will need to be bumped to make sure that all the workers register with the latest change.
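For illustration, the time resource's resource_metadata.json might then look something like this (privileged already exists per the comment above; any other fields shown are assumptions):

```json
{
  "type": "time",
  "privileged": false,
  "unique_version_history": true
}
```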

We will then save the boolean value into a unique_version_history column in the base_resource_types table. This introduces a little weirdness: whichever worker registers last sets the unique_version_history value.

A unique_versions_resource_id column will be added to the resource_configs table to mark resource configs that have unique version history. If unique_versions_resource_id is NULL, resources that point to that resource config share version history. If it is not NULL, that resource config belongs exclusively to the resource identified by unique_versions_resource_id (meaning unique versions).

TODO:

  • add a migration for the two new columns (unique_versions_resource_id, a foreign key to the resources table's id column with ON DELETE CASCADE, and the unique_version_history boolean), plus a unique constraint on (source, unique_versions_resource_id) for the resource_configs table
  • populate unique_version_history from the WorkerResourceType (from resource_metadata.json)
  • in findOrCreate in resource_configs.go, use unique_version_history to determine whether versions are shared, creating a new resource config per resource id when they are not
  • add the option to specify unique_version_history for a resource in the pipeline config
  • add the unique_version_history field to the time resource's resource_metadata.json
  • bump the worker version
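The migration in the first bullet might look roughly like this (a sketch; the column names come from the comment above, but the types and the constraint name are assumptions):

```sql
ALTER TABLE base_resource_types
  ADD COLUMN unique_version_history boolean NOT NULL DEFAULT false;

ALTER TABLE resource_configs
  ADD COLUMN unique_versions_resource_id integer
    REFERENCES resources (id) ON DELETE CASCADE;

-- at most one resource config per (source, owning resource) pair
ALTER TABLE resource_configs
  ADD CONSTRAINT resource_configs_source_unique_versions_resource_id_key
    UNIQUE (source, unique_versions_resource_id);
```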

vito added this to the Resources v2 milestone Jan 7, 2019
clarafu (Contributor) commented Jan 7, 2019

The feature was implemented with the unique_version_history property on the resource config, but the issue asks for it to be on the resource_type. This should be easy: when we construct the resource config for a resource that uses a uniquely configured resource_type, we can construct the resource config descriptor with its uniqueness value taken from the resourceTypes object.

topherbullock (Member) commented

Here's how it works now:

  • base resources can be marked as having unique (non-global) version histories
  • specific resource types in users' pipelines can be marked as having unique_version_history (similar to privileged)

TODO:

  • look into moving the uniqueness of the version history down to the resource_config_versions table; currently it lives on the resource_config, which forces us to pass it around everywhere we deal with versions.

vito (Member, Author) commented Jan 9, 2019

@clarafu and I chatted about moving uniqueness down to the versions table rather than on the resource config. This would make it easier to feature-flag this (#3013) without causing an increase in check containers (caused by everyone having unique resource configs).

We initially proposed just adding a resource_id column to resource_config_versions, but thinking about it again we would need to be careful to know when to use unique vs. when to use global versions in the various queries that run against the resource_config_versions table. For example, if a resource is unique, the radar and scheduler will need to always look for resource_id = ?, but if not, it'll always need to look for resource_id IS NULL. The trouble is the scheduler doesn't inherently know whether it's unique; that's determined by the radar.
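Concretely, the two lookup shapes would differ something like this (a sketch, assuming versions carry a nullable resource_id column as proposed):

```sql
-- unique history: scope version lookups to the owning resource
SELECT version FROM resource_config_versions
WHERE resource_config_id = $1 AND resource_id = $2;

-- global history: shared rows have no owner
SELECT version FROM resource_config_versions
WHERE resource_config_id = $1 AND resource_id IS NULL;
```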

We didn't have this problem when the uniqueness was on resource_configs because the resource pointed directly to its already-unique resource config.

Maybe we need that join table after all? 🤔 Then the resource would just point to its entry in the join table, and versions would point back to it.

resource_version_ownership:

| id  | resource_config_id | resource_id |
| 123 | 456                | 789         | # for unique
| 101 | 112                | NULL        | # for global
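As DDL, the proposed join table might look like this (a sketch; the column names come from the example above, while types and cascade behavior are assumptions):

```sql
CREATE TABLE resource_version_ownership (
  id serial PRIMARY KEY,
  resource_config_id integer NOT NULL
    REFERENCES resource_configs (id) ON DELETE CASCADE,
  resource_id integer
    REFERENCES resources (id) ON DELETE CASCADE  -- NULL means global history
);
```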

vito (Member, Author) commented Jan 9, 2019

idk

vito (Member, Author) commented Jan 9, 2019

The join table kind of scares me because we would need a guarantee that there's only one NULL entry for a given resource config. And I don't think UNIQUE constraints work on NULL values; Postgres treats NULLs as distinct from each other, so a plain UNIQUE constraint would allow duplicate NULL rows. (We would have to make sure.)

topherbullock (Member) commented Jan 9, 2019

Would we need the null values for each global resource config? Could we check that table before defaulting to global versions for a given resource config?

vito (Member, Author) commented Jan 9, 2019

@topherbullock Yeah, the thinking was that each resource config would have an entry with NULL for resource_id for the global versions. But there should only ever be one, so we'd need upsert semantics.

Not sure what you mean by defaulting to global versions
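In Postgres, given a partial unique index covering the global (resource_id IS NULL) rows, the upsert could be sketched like this (illustrative only; the index predicate in the conflict target must match the partial index):

```sql
-- requires: CREATE UNIQUE INDEX ... ON resource_version_ownership
--           (resource_config_id) WHERE resource_id IS NULL;
INSERT INTO resource_version_ownership (resource_config_id, resource_id)
VALUES ($1, NULL)
ON CONFLICT (resource_config_id) WHERE resource_id IS NULL DO NOTHING;
```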

vito (Member, Author) commented Jan 9, 2019

And I don't think UNIQUE constraints work on NULL values.

Looks like this could be done with partial indexes. I think something like this would work:

CREATE UNIQUE INDEX resource_version_ownership_global 
ON resource_version_ownership (resource_config_id)
WHERE resource_id IS NULL;

CREATE UNIQUE INDEX resource_version_ownership_resource_id 
ON resource_version_ownership (resource_config_id, resource_id)
WHERE resource_id IS NOT NULL;

clarafu (Contributor) commented Jan 10, 2019

A summary of where this issue is going now: we will be adding a join table named resource_version_ownership, which has three columns: id, resource_id, and resource_config_id. It will have a row for every resource config that is shared between resources and a row for every resource config that is unique to a single resource. An example of the table would look like this:

resource_version_ownership:

| id  | resource_config_id | resource_id |
| 123 | 456                | 789         | # for unique
| 101 | 112                | NULL        | # for global

This join table will allow the creation of resource configs to depend directly on a pipeline resource. It will be populated when radar calls SetResourceConfig. Resource config versions will then point to resource_version_ownership ids instead of resource_config ids, because versions will depend on the ownership of each resource config.

Resource type versioning will also be affected, because we currently save custom resource types' version history (we treat resource types the same as resources, in that both have a resource config and resource config versions). But it does not make much sense for resource types to have unique version history; it would lead to a very weird situation like this:

resource_types:
- name: some-type
  type: docker
  source: ...
  unique_version_history: true

- name: some-other-type
  type: some-type
  source: ...

Since users can't even see resource types' version history (and probably don't know it exists), resource types will always default to shared.

TODO:

  • add migration for creating table
  • figure out how to consolidate unique between base resource type and custom types
  • in SetResourceConfig, select whether an ownership row already exists; if it does and its resource_config_id differs from the current one, delete it and then insert. If it does not exist, insert it.
  • make resource_config_versions point to resource_version_ownership_id instead of resource_config_id (changes to resource_config.go (SaveVersions, FindVersion, etc.), pipeline.go (LoadVersionsDB), etc.)
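The SetResourceConfig bullet above could be sketched as follows (illustrative SQL only; the real implementation would live in Go and run inside a transaction):

```sql
BEGIN;
-- drop a stale ownership row if the resource now resolves to a different config
DELETE FROM resource_version_ownership
WHERE resource_id = $1 AND resource_config_id <> $2;

-- insert the current ownership row if it isn't already there
INSERT INTO resource_version_ownership (resource_config_id, resource_id)
VALUES ($2, $1)
ON CONFLICT DO NOTHING;
COMMIT;
```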

jchesterpivotal (Contributor) commented Jan 18, 2019

My considered view is that NULLs are satan, or at least have satan on speed dial. There's no way to prove that a NULL wasn't inserted by buggy code and no way to repair the data if that's the case. Plus they cause wonkiness and edge cases in all sorts of queries and query plans.

(Disclaimer: Based on just passing by, I don't have deep context etc etc)

topherbullock (Member) commented

@jchesterpivotal I think as long as the schema ensures that combinations of values (assuming NULL is a "value") always amount to a sensible state, using the indexes @vito mentioned above and constraints where necessary, we'll maintain the integrity of the data and mitigate most of your (very valid) concerns.

We embrace NULL for GC of containers and volumes, where a foreign key constraint acts as a reference counter for a row, and when the set of reference fields is NULL we know it can be removed.

I'm satisfied so long as that NULL is useful, and the guard rails are there in the schema; but the devil is often in the details.

vito added a commit that referenced this issue Jan 29, 2019
Finish up tasks before v5.0.0
vito added the release/documented label (Documentation and release notes have been updated.) and removed the release/no-impact label (This is an issue that never affected released versions, i.e. a regression caught prior to shipping.) Feb 19, 2019