Create values-schema mechanism to update values to conform to schema changes #259

Closed
7 of 28 tasks
0-sv opened this issue Dec 22, 2020 · 12 comments · Fixed by #684


0-sv commented Dec 22, 2020

User story

As a developer making breaking changes to the values-schema, I want to have a mechanism that will copy the corresponding values to their new locations.

Problem

We break the schema when:

  1. We move properties around (more frequent)
  2. We delete props (more frequent)
  3. We mutate (change property type or value shape, less frequent)

otomi-core should have a mechanism that migrates values to conform to the new spec.

Proposed solution

A schema will get a version (semver, patches are additions, minors are breaking changes) and a changes property holding information about breaking changes:

  1. The new location (if changed) by mapping old to new
  2. The list of deleted props
  3. The new type (if changed) by giving a go template for transformation (see schema change below)

Workflow:

Every time a developer makes a breaking change to the schema, recording these changes as part of the DoD, the otomi migrate-values command should be run to massage existing values into the new structure.

Schema change:

Sample part:

```yaml
changes:
  - version: 0.23.7
    locations:
      charts.bla.someProp: someNewRootProp.someProp
    deletions:
      - charts.bla.someOtherProp
    mutations:
      # image tag went from semver to glob
      charts.bla.image.tag: 'printf "v%s"'
  - version: 0.23.8
```
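
For illustration, a minimal sketch of how one such change entry could be applied to a parsed values object. This is an assumption about the eventual implementation, not actual otomi-core code: it uses lodash's get/set/unset for path handling and, for brevity, plain functions where the proposal uses go templates for mutations.

```typescript
import { get, set, unset } from 'lodash'

// One entry of the `changes` list from the sample above (hypothetical shape).
// Mutations are simplified to functions here; the proposal uses go templates.
interface Change {
  version: string
  locations?: Record<string, string> // old path -> new path
  deletions?: string[]
  mutations?: Record<string, (oldValue: any) => any>
}

// Apply a single breaking change to a values object, in place.
function applyChange(values: object, change: Change): void {
  for (const [oldPath, newPath] of Object.entries(change.locations ?? {})) {
    set(values, newPath, get(values, oldPath))
    unset(values, oldPath)
  }
  for (const path of change.deletions ?? []) unset(values, path)
  for (const [path, transform] of Object.entries(change.mutations ?? {})) {
    if (get(values, path) !== undefined) set(values, path, transform(get(values, path)))
  }
}

// The 0.23.7 entry from the sample, with 'printf "v%s"' as an equivalent function:
const values = { charts: { bla: { someProp: 1, someOtherProp: 2, image: { tag: '1.2.3' } } } }
applyChange(values, {
  version: '0.23.7',
  locations: { 'charts.bla.someProp': 'someNewRootProp.someProp' },
  deletions: ['charts.bla.someOtherProp'],
  mutations: { 'charts.bla.image.tag': (tag) => `v${tag}` },
})
// values is now { charts: { bla: { image: { tag: 'v1.2.3' } } }, someNewRootProp: { someProp: 1 } }
```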

Tasks:

core:

  • Enrich values-schema.yaml
    • changes property
    • a lint function to ensure that making a breaking change will be checked before commit
  • otomi migrate-values-rev to show the full history of previous versions from version control.
  • Create otomi migrate-values script that migrates otomi-values forward based on changes defined, and sets latest version in values.
    • Connect otomi-tasks to bin/migrate-values.sh and bin/otomi migrate-values
    • Load each otomi-values file as JSON, with fileName as argument, to start per-file modification
    • Create a mechanism that can pin an otomi-values repository to a certain version so the modification can take place (see the version-selection sketch after this list)
      • getNewVersion() mock
      • getNewVersion()
      • getOldVersion() mock
      • getOldVersion()
    • otomi-values modification
      • displacements()
      • deletions()
      • mutations()
  • CLI functionality

api:

  • Add extra call to tools server just before deploy: migrate-values
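
Version pinning (the getNewVersion/getOldVersion tasks above) could then boil down to selecting the change entries that lie between the values' pinned version and the schema's current version. A sketch under the same assumptions as before, reusing the hypothetical Change shape and the npm semver package:

```typescript
import semver from 'semver'

// Hypothetical helpers from the task list: getOldVersion() would read the
// pinned version from the otomi-values repo, getNewVersion() the current
// version from values-schema.yaml.
function applicableChanges(changes: Change[], oldVersion: string, newVersion: string): Change[] {
  return changes
    .filter((c) => semver.gt(c.version, oldVersion) && semver.lte(c.version, newVersion))
    .sort((a, b) => semver.compare(a.version, b.version))
}

// Migrating values pinned at 0.23.6 to schema 0.23.8 would apply both sample entries in order:
// applicableChanges(changes, '0.23.6', '0.23.8').forEach((c) => applyChange(values, c))
```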

Definition Of Done

  • Tasks are done
  • (Unit) Tests are added to code
  • Refactoring:
    • Rewire exported functions
    • Type safety for typescript
  • (Architecture) Design Record(s) have been added as adr/*.md and appended to list in adr/_index.md
  • Specs and demo files have been updated to reflect code changes
  • Documentation has been updated (docs/lifecycle-management/versioning)
  • Functionality, code, and/or documentation has been peer reviewed
  • Relevant team members have been notified
@0-sv 0-sv added the Story Scrum story label Dec 22, 2020
@0-sv 0-sv self-assigned this Dec 22, 2020

0-sv commented Dec 22, 2020

Copied from unassigned issues to link it with pull requests for changes in otomi-core.


0-sv commented Dec 24, 2020

A schema will get a version (semver, patches are additions, minors are breaking changes)

Let's just refer to SemVer 2.0.0 then for clarity? https://semver.org

@Morriz Morriz added this to the February 2021 milestone Jan 27, 2021
@0-sv 0-sv added the on hold Waiting for another party to do something label Jan 27, 2021
@0-sv 0-sv removed the on hold Waiting for another party to do something label Feb 23, 2021
@0-sv 0-sv modified the milestones: February 2021, March 2021 Feb 23, 2021
@0-sv 0-sv added the on hold Waiting for another party to do something label Mar 9, 2021
@Morriz Morriz removed this from the March 2021 milestone Mar 9, 2021

Morriz commented Mar 9, 2021

Where is the REQ discussion for this, @svatwork?


0-sv commented Mar 10, 2021

This is the first one: https://github.com/redkubes/otomi-core/discussions/344

@Morriz Morriz added Epic and removed Epic labels Mar 10, 2021

Morriz commented Mar 23, 2021

thanks for all the hard work and congrats on the lessons learned, but this darling will be closed in favor of our simplifying efforts ;)

@Morriz Morriz closed this as completed Mar 23, 2021
@Morriz Morriz changed the title from "Create values-schema mechanism that supports backwards compatibility" to "Create values-schema mechanism to update values to conform to schema changes" Apr 11, 2021
@Morriz Morriz reopened this Apr 11, 2021
@Morriz Morriz removed the on hold Waiting for another party to do something label Apr 11, 2021

Morriz commented Apr 11, 2021

Reopening as we need this to migrate values forward automatically.


Morriz commented Apr 23, 2021

Parking this once more, as it seems more complex given the different files involved, and knowing we don't have much need for it yet.


0-sv commented May 4, 2021

I have some new insights. I think we should perhaps reconsider how we approach this problem.

I was reading Designing Data-Intensive Applications, and the problem domain is "schema evolution".

Schema evolution has solutions in other serialisation/encoding formats. Thrift/Protobuf's solution is described like so:

As you can see from the examples, an encoded record is just the concatenation of its encoded fields. Each field is identified by its tag number (the numbers 1, 2, 3 in the sample schemas) and annotated with a datatype (e.g., string or integer). If a field value is not set, it is simply omitted from the encoded record. From this you can see that field tags are critical to the meaning of the encoded data. You can change the name of a field in the schema, since the encoded data never refers to field names, but you cannot change a field's tag, since that would make all existing encoded data invalid.

You can add new fields to the schema, provided that you give each field a new tag number. If old code (which doesn't know about the new tag numbers you added) tries to read data written by new code, including a new field with a tag number it doesn't recognize, it can simply ignore that field. The datatype annotation allows the parser to determine how many bytes it needs to skip. This maintains forward compatibility: old code can read records that were written by new code.

What about backward compatibility? As long as each field has a unique tag number, new code can always read old data, because the tag numbers still have the same meaning. The only detail is that if you add a new field, you cannot make it required. If you were to add a field and make it required, that check would fail if new code read data written by old code, because the old code will not have written the new field that you added. Therefore, to maintain backward compatibility, every field you add after the initial deployment of the schema must be optional or have a default value.

Removing a field is just like adding a field, with backward and forward compatibility concerns reversed. That means you can only remove a field that is optional (a required field can never be removed), and you can never use the same tag number again (because you may still have data written somewhere that includes the old tag number, and that field must be ignored by new code).

Datatypes and schema evolution 

What about changing the datatype of a field? That may be possible (check the documentation for details), but there is a risk that values will lose precision or get truncated. For example, say you change a 32-bit integer into a 64-bit integer. New code can easily read data written by old code, because the parser can fill in any missing bits with zeros. However, if old code reads data written by new code, the old code is still using a 32-bit variable to hold the value. If the decoded 64-bit value won't fit in 32 bits, it will be truncated.

A curious detail of Protocol Buffers is that it does not have a list or array datatype, but instead has a repeated marker for fields (which is a third option alongside required and optional). As you can see in Figure 4-4, the encoding of a repeated field is just what it says on the tin: the same field tag simply appears multiple times in the record. This has the nice effect that it's okay to change an optional (single-valued) field into a repeated (multi-valued) field. New code reading old data sees a list with zero or one elements (depending on whether the field was present); old code reading new data sees only the last element of the list.

Thrift has a dedicated list datatype, which is parameterized with the datatype of the list elements. This does not allow the same evolution from single-valued to multi-valued as Protocol Buffers does, but it has the advantage of supporting nested lists.

(Two screenshots from the book attached.)

I.e., keys can be referenced by tags, and their meaning does not depend on where they are nested in the schema.

The Apache Avro and Parquet projects have also rethought schema evolution but are specific to their problem domains.

Draw your own conclusions, but it seems kinda hacky to move properties around by nesting path to retain their meaning.
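
To make that point concrete, a toy sketch (not a proposal for otomi) of the tag-based identity the quote describes: stored data refers to stable numeric tags, the schema maps tags to names, and renaming a field only touches the schema side.

```typescript
// tag -> value, as written by any schema version
type Encoded = Map<number, unknown>

// Schema v1 named tag 1 "someProp"; schema v2 renames it (same tag, new name).
const schemaV1: Record<number, string> = { 1: 'someProp' }
const schemaV2: Record<number, string> = { 1: 'renamedProp' }

function decode(record: Encoded, schema: Record<number, string>): Record<string, unknown> {
  const out: Record<string, unknown> = {}
  for (const [tag, value] of record) {
    const name = schema[tag]
    if (name !== undefined) out[name] = value // unknown tags are skipped: forward compatibility
  }
  return out
}

// Data written under v1 decodes fine under v2 without any migration step:
const record: Encoded = new Map<number, unknown>([[1, 'hello']])
console.log(decode(record, schemaV1)) // { someProp: 'hello' }
console.log(decode(record, schemaV2)) // { renamedProp: 'hello' }
```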


Morriz commented May 6, 2021

since we are only evolving forward I have no problem with our chosen approach... this is about automating the migration from one version to a newer one, never having to know what the state was before


Morriz commented May 31, 2021

Can you do this after the KinD delivery, @svatwork?


0-sv commented Jun 23, 2021

To be honest, I don't think this story is realistic and I hope we can have a chat about whether this will really solve a problem.


Morriz commented Jun 23, 2021

why wouldn't it be? I observe that we create, delete and move props, and transform values. Can we not cover those use cases in the approach we designed?
