Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Schema Downgrading System #1387

Merged
merged 119 commits into from
Sep 12, 2022

Conversation

ssteinbach
Copy link
Collaborator

@ssteinbach ssteinbach commented Aug 25, 2022

Overview

As OpenTimelineIO makes its way into vendor tools, it is likely that multiple versions of OTIO will exist in different applications that would like to interoperate bidirectionally within a workflow. Therefore, we are adding a system that enables a newer version of the OpenTimelineIO library to write files with schemas that are compatible with older versions of the library.

See also: #1295.

C++

  • Introduces the concept of a schema_version_map, a mapping of schema names to desired schema versions
  • Adds the CORE_VERSION_MAP, a compiled in constant mapping of "label" to schema_version_map. Labels are currently associated with releases of the OpenTImelineIO library (ie. "0.15.0"). Intended to describe sets of schema versions that are compatible with specific releases of OpenTimelineIO.
  • Adds register_downgrade_function to pair with register_upgrade_function. Like upgrade functions, downgrade functions take an AnyDictionary and operate on it in place.
  • Adds schema_version_targets arguments to to_json_string and to_json_file family of functions/methods. These are optional<schema_version_map> and allow the user to optionally provide a set of schema version targets for downgrading during serialization
  • during serialization, if a downgrade needs to happen, uses the CloningEncoder to build an AnyDictionary of the object, which is then passed through the downgrade functions and into the serializer.
  • Adds type_version_map method to TypeRegistry for querying the schema names and versions of all currently registered types
  • Adds the io_perf_test and upgrade_downgrade_example example C++ programs
  • Adds some python-like convenience methods to AnyDictionary: get_default, set_default and has_key
  • use std::unordered_map instead of std::map inside the writer for a number of internal data structures where order is not relevant
  • when converting built-in but schema'd types to AnyDictionary, like RationalTime, don't store them as concrete types but store them as AnyDictionary. Otherwise they can't be downgraded.
  • add override tagging to Encoder/Writer virtual methods
  • serialization performance improvements when not downgrading. Avoided copies, const& etc. 20mb OTIO serializes with no downgrading in 0.03s (was 0.04s).

Python

  • Adds python bindings for registering downgrade functions, exposed via the downgrade_function_from decorator, to echo the upgrade_function_for function.
  • Adds python bindings for querying the full schema/version map and CORE_VERSION_MAP
  • Adds version_manifest field to the plugin manifest system, allowing you to define custom family/label/schema version sets to target
  • Adds OTIO_DEFAULT_TARGET_VERSION_FAMILY_LABEL environment variable for telling the python API that you want to use that as a default downgrade target.
  • Adds autogen_version_map python script that generates the CORE_VERSION_MAP.cpp file for the c++ api

Performance

There is a performance cost to using the downgrading system. On a 20mb OTIO file where all clips are downgraded, the difference in serialization time was a slowdown of about 4x (0.03s -> 0.12s with a release build on an M1 Mac). Deserialization is not impacted at all.

As noted above, the baseline serialization performance was also improved, even when not downgrading.

Performance Scaling With File Size

  • all times in seconds
  • dg = with downgrading
  • →str = to string
  • →file = to file
OTIO size read →str dg→str →file dg→file
20mb 0.18 0.04 0.14 0.14 0.25
86mb 0.77 0.15 0.59 0.64 1.02
257mb 2.35 0.47 1.75 1.83 3.07
1.0gb 9.72 1.85 7.06 7.71 12.3

Follow up issues

Further Design Discussions

  • In the current system, if adding a new attribute to a schema and the new attribute's default value is the correct one with no further questions if it isn't present, then we don't add an upgrade function and do not increment the schema version (for example: the enabled flag). For downgrading, its true that the parser ignores extra keys, so we could leave enabled in objects and read them in old versions, however if someone goes in and edits the value of enabled the old APIs won't know what to do with them and it is possible to create data that would be interpreted differently in the old system (which neither models nor reads the .enabled flag and therefore assumes everything is enabled) and the newer system, which would read that as being enabled false.
  • The existing system (in particular the unit test system, but the plugin system should be tested) is built around the fact that if you double register something, it is a no-op instead of an error. Long term we should probably address this, but it might be a bit of a retrofit for the TypeRegistry architecture
  • Right now the core schemas are everything that is present just in the C++ core, and the version is tied to the release of the software library. We probably want to separate the two versions of things and establish an "Otio interoperability version" which is a set of schema versions, independent from the versions of the library, and also discuss how schemas are included or not in that list.
  • Should examples be compiled by default? Right now they are not. It could help to ensure that they stay current. Maybe even better than that would be to move forward with the discussed-previously idea of splitting the C++ and python projects up and setting the C++ library up with its own unit test suite.
  • Should we offer a mode where the schema/up/downgrade functions are internal only and not exposed as a public interface? Would vendors want something like that?

@ssteinbach ssteinbach added this to the Public Beta 15 milestone Aug 25, 2022
@ssteinbach ssteinbach added this to In progress in Interoperation Aug 26, 2022
examples/io_perf_test.cpp Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented Sep 4, 2022

Codecov Report

Merging #1387 (d97241b) into main (5b418db) will decrease coverage by 0.19%.
The diff coverage is 78.04%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1387      +/-   ##
==========================================
- Coverage   86.27%   86.08%   -0.20%     
==========================================
  Files         196      199       +3     
  Lines       19871    20252     +381     
  Branches     2309     2333      +24     
==========================================
+ Hits        17144    17434     +290     
- Misses       2161     2246      +85     
- Partials      566      572       +6     
Flag Coverage Δ
py-unittests 86.08% <78.04%> (-0.20%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/opentimelineio/serializableObject.cpp 62.06% <0.00%> (-1.47%) ⬇️
src/opentimelineio/typeRegistry.h 100.00% <ø> (ø)
...entimelineio-bindings/otio_serializableObjects.cpp 91.57% <0.00%> (ø)
src/py-opentimelineio/opentimelineio/__init__.py 100.00% <ø> (ø)
...neio/opentimelineio/console/autogen_version_map.py 25.64% <25.64%> (ø)
tests/test_adapter_plugin.py 86.84% <50.00%> (-0.58%) ⬇️
src/opentimelineio/typeRegistry.cpp 76.59% <52.00%> (-5.76%) ⬇️
src/opentimelineio/serialization.cpp 80.17% <74.67%> (-2.41%) ⬇️
src/py-opentimelineio/opentimelineio/versioning.py 83.33% <83.33%> (ø)
tests/test_serializable_object.py 91.87% <83.82%> (-6.44%) ⬇️
... and 14 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5b418db...d97241b. Read the comment docs.

src/opentimelineio/anyDictionary.h Outdated Show resolved Hide resolved
src/opentimelineio/anyDictionary.h Outdated Show resolved Hide resolved
src/opentimelineio/typeRegistry.cpp Outdated Show resolved Hide resolved
src/opentimelineio/anyDictionary.h Outdated Show resolved Hide resolved
docs/tutorials/versioning-schemas.md Show resolved Hide resolved
src/opentimelineio/typeRegistry.h Show resolved Hide resolved
tests/test_serializable_object.py Outdated Show resolved Hide resolved
@darbyjohnston
Copy link
Contributor

I'm not qualified to review the schema logic, just left a couple notes on the C++ code.

From your list of questions:

Performance

Is it bad? Have you tried profiling it?

discussed-previously idea of splitting the C++ and python projects

I like that idea.

@ssteinbach
Copy link
Collaborator Author

Yes, currently performance is about a 10x slowdown when downgrading. I have some leads on improving it though, so stay tuned for later in the week!

Signed-off-by: ssteinbach <ssteinbach@users.noreply.github.com>
Signed-off-by: ssteinbach <ssteinbach@users.noreply.github.com>
Signed-off-by: ssteinbach <ssteinbach@users.noreply.github.com>
Signed-off-by: ssteinbach <ssteinbach@users.noreply.github.com>
Signed-off-by: ssteinbach <ssteinbach@users.noreply.github.com>
- get_default, set_default, has_key

Signed-off-by: ssteinbach <ssteinbach@users.noreply.github.com>
Signed-off-by: ssteinbach <ssteinbach@users.noreply.github.com>
@ssteinbach ssteinbach marked this pull request as ready for review September 8, 2022 05:32
Copy link
Collaborator

@rogernelson rogernelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the approach! A few c++ comments here and there, mostly minor.

examples/io_perf_test.cpp Outdated Show resolved Hide resolved
examples/io_perf_test.cpp Outdated Show resolved Hide resolved
examples/io_perf_test.cpp Show resolved Hide resolved
examples/io_perf_test.cpp Show resolved Hide resolved
examples/io_perf_test.cpp Show resolved Hide resolved
src/opentimelineio/CORE_VERSION_MAP.cpp Show resolved Hide resolved
src/opentimelineio/anyDictionary.h Show resolved Hide resolved
src/opentimelineio/anyDictionary.h Show resolved Hide resolved
src/opentimelineio/typeRegistry.cpp Show resolved Hide resolved
src/opentimelineio/typeRegistry.cpp Outdated Show resolved Hide resolved
@ssteinbach
Copy link
Collaborator Author

I added a performance scaling section to the pr text, just noting here in case folks following it are interested.

@meshula meshula merged commit f7d2946 into AcademySoftwareFoundation:main Sep 12, 2022
@jminor jminor added this to In progress in Schema Development via automation Sep 23, 2022
MichaelPlug pushed a commit to MichaelPlug/OpenTimelineIO that referenced this pull request Aug 5, 2023
* refactor versioning tests into their own class
* DRY cleanup in the serializer before other stuff
* DRY reduction in the json FILE serializer
* Add io_perf_test to repo
* Add a call w/ downgrade manifest to io_perf_test
* add anydictionary convenience functions
* add .cache to gitignore
* add override tags
* add perf tests for no-downgrade scenarios
* autogen version info struct
* add exceptions for overwriting up/downgrade fn
* add exception when double registering a type
* move schema version types into typeRegistry
* add version manifest plugin
* lint pass
* comment formatting for RTD
* add upgrade_downgrade_example in C++
* Add notes to environment variables markdown.
* Improve error handling and text for env var errors

- the OTIO_DEFAULT_TARGET_VERSION_FAMILY_LABEL has checking to make sure
  the format is correct and that the version/label requested are
  present.
- Adds a custom exception that gets raised if there is a problem
- Adds a unit test for testing this behavior

* Performance tuning (cache the child cloning encoder across the entire serialization, use const & as much as possible)

Signed-off-by: ssteinbach <ssteinbach@users.noreply.github.com>
Co-authored-by: meshula <meshula@users.noreply.github.com>
Signed-off-by: Michele Spina <michelespina96@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Interoperation
  
In progress
Schema Development
  
In progress
Development

Successfully merging this pull request may close these issues.

None yet

6 participants