Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposed C++ API Documentation #391

Closed
wants to merge 22 commits into from
Closed

Proposed C++ API Documentation #391

wants to merge 22 commits into from

Conversation

davidbaraff
Copy link
Contributor

This adds a proposal for the new C++ strategy for OTIO. Sample header files to be left on my forked branch and not put in the true PixarAnimationStudios/OpenTimelineIO repository.

@codecov-io
Copy link

codecov-io commented Nov 28, 2018

Codecov Report

Merging #391 into master will decrease coverage by 0.23%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #391      +/-   ##
==========================================
- Coverage   87.15%   86.92%   -0.24%     
==========================================
  Files          67       63       -4     
  Lines        6734     5604    -1130     
==========================================
- Hits         5869     4871     -998     
+ Misses        865      733     -132
Impacted Files Coverage Δ
opentimelineio_contrib/adapters/fcpx_xml.py 92.19% <0%> (-1.56%) ⬇️
opentimelineio/schema/stack.py 90% <0%> (-1.43%) ⬇️
...elineio_contrib/adapters/tests/test_aaf_adapter.py 97.16% <0%> (-0.75%) ⬇️
opentimelineio/adapters/cmx_3600.py 91.57% <0%> (-0.04%) ⬇️
opentimelineio/algorithms/__init__.py 100% <0%> (ø) ⬆️
opentimelineio_contrib/adapters/extern_rv.py 0% <0%> (ø) ⬆️
opentimelineio_contrib/adapters/xges.py
opentimelineio/algorithms/timeline_algo.py
...elineio_contrib/adapters/aaf_adapter/aaf_writer.py
...lineio_contrib/adapters/tests/test_xges_adapter.py
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 88089b1...3947d88. Read the comment docs.

docs/cxx-proposal/cxx.rst Outdated Show resolved Hide resolved
Since OTIO originated as Python (and has an extensive test suite, in Python), our starting position
is that existing Python code (adapters, plugins, schemadefs) must continue to work, as currently
written. Python code in the `core` or `schema` directories will of course be rewritten, but Python code
outside those modules should not be aware of any change.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have already made a few pre-emptive changes to the Python API to ease the transition to C++, so this statement is stronger than we really plan. I don't want to soften this too much, but we should be honest by noting that some changes are happening, but only when necessary or desirable (e.g. the immutability of TimeRange, etc.) and that we are doing this ahead of time to give people time to adjust.

docs/cxx-proposal/cxx.rst Show resolved Hide resolved
docs/cxx-proposal/cxx.rst Outdated Show resolved Hide resolved
clip2->metadata()["other_clip"] = clip1;

will work just fine: writing/reading or simply cloning ``clip1`` would yield
a new ``clip1`` that pointed to a new ``clip2`` and vice versa.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have a larger discussion about either enforcing the tree structure at the schema level, or explicitly allowing non-tree structures.

We suspect that some future schemas/features will want to use this (e.g. Effects that use multiple clips as sources, or linked audio & video clips on two different tracks) but so far we have been very happy with the simplicity of the tree structure as an aid for making OTIO useful and easy to learn.

In any case, if/when we choose to allow for non-tree structures, we should be explicit and intentional about it.

docs/cxx-proposal/cxx.rst Outdated Show resolved Hide resolved
docs/cxx-proposal/cxx.rst Outdated Show resolved Hide resolved
docs/cxx-proposal/cxx.rst Outdated Show resolved Hide resolved
docs/cxx-proposal/cxx.rst Show resolved Hide resolved
@davidbaraff
Copy link
Contributor Author

davidbaraff commented Nov 29, 2018 via email

@davidbaraff
Copy link
Contributor Author

davidbaraff commented Nov 29, 2018 via email

@davidbaraff
Copy link
Contributor Author

davidbaraff commented Nov 29, 2018 via email

@jminor jminor changed the title Just doc changes Proposed C++ API Documentation Nov 29, 2018
@jminor
Copy link
Collaborator

jminor commented Nov 29, 2018

Thanks for the changes. This looks great. I will send this out to a few people for comments and then we can open it up for wider discussion.

The need for an "optional" (i.e. a container class that holds either no value
or some specific value, for a given type T) is currently small, but does occur
in one key place (schemas which need to hold either a ``TimeRange`` or
indicate their time range is undefined).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A "no value" sentinel value in TimeRange is arguably preferable to optional. The argument is that an algebra can be defined on TimeRange. If optional is involved, then there is necessarily a layer of checking and conditional structure that must be invoked before deciding to access a TimeRange; whereas a sentinel value would allow the "no value" condition to participate in an algebra, via identity, associativity, commutativity, etc.

the count of the instance is zero, and if so deletes the object in question.
- An ``any`` instance holding a ``SerializableObject*`` actually holds a
``Retainer<SerializableObject>``. That is, blind data safely retains schema instances.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the mechanism for safety under concurrency? On the face of it, if I return a pointer, and Retainer the return value, isn't there a window where the pointer could be deleted before it is Retainer'd? Or is the idea that objects subject to this regime are owned at a higher scope - the schema from which the object was brokered holds the ultimate reference that in the usual case prevents this? Secondarily, I imagine the reference count is fenced and/or atomic?

@meshula
Copy link
Collaborator

meshula commented Nov 29, 2018

I don't see anything in this proposal about exceptions. I have comments in another issue about not being in favor of exception heavy designs, so I won't belabor the point here, but would hope to see a strategy for error handling laid out in the proposal.

Overall, the proposal looks nice to me.

@davidbaraff
Copy link
Contributor Author

davidbaraff commented Nov 29, 2018 via email

@davidbaraff
Copy link
Contributor Author

davidbaraff commented Nov 29, 2018 via email

@davidbaraff
Copy link
Contributor Author

davidbaraff commented Nov 29, 2018 via email

@davidbaraff
Copy link
Contributor Author

davidbaraff commented Nov 30, 2018 via email

@davidbaraff
Copy link
Contributor Author

davidbaraff commented Nov 30, 2018 via email

@meshula
Copy link
Collaborator

meshula commented Nov 30, 2018

Thanks for the thoughtful responses.

Edit: conversation about sentinel values moved to https://github.com/PixarAnimationStudio/OpenTimelineIO/issues/394

it will return false and set *err_msg accordingly

This is easy to bind to other languages. I'm wondering if there is a case where it's actually a good idea to pass nullptr for err_msg? One could simply force that the user must supply a string that might be populated with an error message. This would likely result in end user code that actually reports problems. I'd probably encourage a result enumeration instead of a bool return.

The scenario I am imagining is that without an enumerated return value, applications might resort to parsing the string in order to handle certain types of problems. For example, an application might desire a different resolution strategy for each of these cases: (a) invalid range specified, (b) referenced file not found at indicated URI.

With an enumerated result return, I wouldn't care if the err_msg is a pointer or a reference.

concurrency ... Never mind, I figured it out.

You've probably thought of this already, but the canonical use case is MVC pretty much needs the UI on one thread and processing on another. OpenTimelineIO is no doubt going to be traversed simultaneously for rendering a timeline, and performing some operation, like splitting a ranged item. One might imagine that OTIO operations will always be trivially fast and therefore don't need to be scaled, which is an invitation for someone somewhere to put 100,000 items intersecting at one time point :)

@davidbaraff
Copy link
Contributor Author

davidbaraff commented Nov 30, 2018 via email

@meshula
Copy link
Collaborator

meshula commented Nov 30, 2018

Agreed. Moved the sentinel value comment to a new issue to keep the conversation focused.

Change from using plain string for error messages to a more structured form.
@eidmi
Copy link

eidmi commented Dec 5, 2018

From the SG Autodesk team, here are our comments. Note that several of those come from comparing to our own current C++ implementation, therefore might be more relevant to our usage of OTIO than others.

Overall, very clean and modern code. We like it!

  1. STL is used in the public interface (map, vector and string). That means if clients use it, they will be tied to the same compiler versions
    • I assume you are probably following the VxF reference platform (https://www.vfxplatform.com/).
    • We currently don't have that restriction with our implementation. It would be sad to inherit it just for this.

  2. The C++ implementation will not make use of C++ exceptions
    • This would be very clear if the methods were marked as noexcept (currently they are not).

  3. A function which can “fail” will indicate this by taking an argument std::string* err_msg which it will set with a readable error message string if the pointer is non-null.
    • I would prefer passing an int instead since this makes it much easier if we are checking for specific known errors.
    • There could be a separate namespace/class for translating the codes to a human-readable string -- it would also make localization much easier on the client side.

  4. If the caller created a schema object (by calling new, or equivalently, by obtaining the instance via a deserialize call) they are responsible for calling possibly_delete() when they are done with the instance, or by giving the pointer to someone else to hold".
    • I assume using Retainer instead of assigning to a raw pointer would mean that I would not need to call possibly_delete?
    • Can I used a shared_ptr with the deleter function set to call possibly_delete?

  5. There's a lot of levels of indirection. Clip, for example, has the following inheritance:

• Clip -> Item -> Composable -> SerializableObjectWithMetadata -> SerializableObject
• Not an issue per se, but that much abstraction tends to lead to performance issues. Using a template serializers that handles the various type would be compile time evaluated and it would also greatly simplify the current somewhat complex memory management model if polymorphism could be eliminated from the public interface.
• I imagine it's done this way though since it makes it easier to map to Python, so it may not be possible in that context.

  1. This is just a style preference, but instead of a constructor with a bunch of default values, especially for invalid values like this:

public:
explicit TimeTransform(RationalTime offset = RationalTime(), double scale = 1, double rate = -1)
: _offset(offset), _scale(scale), _rate(rate)
{
}

I prefer this:

public:
using InvalidRate = -1;

private:
RationalTime _offset;
double _scale{ 1 };
double _rate{ InvalidRate };

  1. Where is the Timeline object? Global start time?

  2. In our implementation, we’re exposing the fact that we are based on rapidjson. On your side, you say your implementation depends on it, but I did not see any rapidjson object in the API.
    • It is convenient to use rapidjson to compare a whole document using rapidjson::Document::operator==.
    • You seem to rely on the fact that your underlying data container is an std::map and always sorts its entries based on the key. Not very convenient to compare OTIO strings that could come from another parser.
    • "SerializableObject::is_equivalent_to()" mentions in its doc that it uses "==" for everything except schema comparison, but I don't see any operator== in their API.

  3. If base classes like SerializableObject are not meant to be persisted without derivation, shouldn't they be pure virtual and NOT have a schema name/version for those classes?

TX,

The SG Autodesk team

@davidbaraff
Copy link
Contributor Author

davidbaraff commented Dec 5, 2018 via email

@meshula
Copy link
Collaborator

meshula commented Dec 5, 2018

The ErrorStatus object addresses what I was looking for, thanks!

@davidbaraff
Copy link
Contributor Author

davidbaraff commented Dec 5, 2018 via email

@jminor
Copy link
Collaborator

jminor commented Dec 6, 2018

@eidmi, we have some questions about your questions

  • (1) Yes, we have been targeting the VFX platform 2016+ for the Python version of OTIO. Can we get more context about the issue you're raising? If we don't expose STL (map, vector, string) what is a reasonable alternative? Can you point us to an example of a library that works the way you propose?

  • (3) Does @davidbaraff's recent change address your concern about error codes?

  • (5) We are using polymorphism to some advantage. Is the performance issue mostly related to virtual methods?

  • (7) Oops, we forgot that in the docs. We'll add that shortly.

I'll let @davidbaraff comment on the other items.

@davidbaraff
Copy link
Contributor Author

davidbaraff commented Dec 6, 2018 via email

@rogernelson
Copy link
Collaborator

rogernelson commented Dec 6, 2018

Thanks for the responses! These questions were an amalgamation of questions coming from Eric and myself. I'll respond to the questions I asked here best I can. I'll let Eric respond to the other ones.

(1) Yes, we have been targeting the VFX platform 2016+ for the Python version of OTIO. Can we get more context about the issue you're raising? If we don't expose STL (map, vector, string) what is a reasonable alternative? Can you point us to an example of a library that works the way you propose?

The issue is that exposing STL in headers requires that anyone linking with the library use the same compiler and compiler version, since STL is implementation specific (different ABI) and you can get weird random behaviour if this is not the case. We also have libraries that could benefit from using otio, but thus far we've avoid imposing any restrictions on compilers. There are essentially two ways of doing this. One way is to put the code that uses STL in the headers (so the client compiles it with their compiler) -- rapidjson is an example of this. The other is to wrap any STL with a private-implementation idiom (ex. return your own Vector to clients that only has a void* in the header and exposes vector methods. To implement the class, reinterpret_cast back to an std::vector in your implementation and call the corresponding method). This way, the STL you need is eseentially part of your library instead of the STL implementation. For what it's worth our particular libraries use a combination of both of those techniques. Neither option is particularly appealing I realize, which is why things like the VxF platform exist.

There's a good paper here on the challenges of writing libraries (ELF libraries, so posix systems):
https://software.intel.com/sites/default/files/m/a/1/e/dsohowto.pdf

(3) Does @davidbaraff's recent change address your concern about error codes?

Yes, thanks!

(5) We are using polymorphism to some advantage. Is the performance issue mostly related to virtual methods?

Yes, I generally prefer to write libraries where clients provide their own "polymorphic" behaviour by specializing a template for their own structure/class. Not only because its faster (compile time over runtime), but also because it allows the library to keep all inheritance as a private implementation detail. This allows you the flexibility of changing interfaces without having to worry about breaking client code. And it's also nice not to have to impose that the client inherits from the library's class structure, particularly when the client may want to use a given class with multiple libraries that do this. By removing inheritance from the equation, it also allows for value semantics instead of reference semantics, which can be quite convenient when they're used for multithreading.

But again, this is a preference that comes from writing code that has to fit into large complex hierarchies and not a hard requirement.

There's a talk/code example on this approach here, if you are interested in this subject:
https://github.com/sean-parent/sean-parent.github.io/blob/master/papers-and-presentations.md#value-semantics-and-concept-based-polymorphism

We haven't thought of a good way to combine them given that we want two separate libraries.

Combining error codes is always a bit painful. One simple solution worth considering is starting the otio error codes at an offset. Something like

enum class Outcome {
OK,
INVALID_TIMECODE_RATE = 0x100,
...
}

At least you don't end up with overlapping error codes that way (besides OK set to zero, which is a good thing). Also note that I used an enum class instead on enum. Since you're writing new code from scratch, I would encourage it's use over an enum.

• Can I used a shared_ptr with the deleter function set to call possibly_delete?
I believe so, if the fact that other parts of the code (e.g. a Python bridge) are not using shared_ptr (but just dealing with raw pointers and Retainers) doesn’t present an issue. So yes, if that doesn’t mess up the reference counting aspects, then you’d be free to use shared_ptr.

Great. I was just asking because I was wondering if there was something "special" that the Retainer was doing. If this is the case, you might be able to simplify by the Retainer class by implementing it with a shared_ptr with a custom deleter function. You could potentially also add a ScopeGuard that does the same trick with a unique_ptr.

  1. we’re exposing the fact that we are based on rapidjson. On your side, you say your implementation depends on it, but I did not see any rapidjson object in the API.
    There isn’t any in the API. It’s a completely internal detail. We were just saying for now we’re depending on it, in order to build the library.

This was Eric's question so he might expand a bit on this, but this is good for me. The one point I would like to make would be to encourage you to define RAPIDJSON_NAMESPACE so you don't violate the ODR with clients linking with your library that are also using rapidjson.

@eric-desruisseaux-adsk
Copy link
Contributor

Thanks for the quick and clear answers!

When you get down to the level of comparing two C++ any instances that each hold something, if both hold a SerializableObejct, you pull the objects out and use is_equivalent_to() to compare them. Otherwise, if the any() holds a primimtive type, you pull the primitive types out, and do == on the primitive types (assuming the types match). For AnyVector/AnyDictionary you recurse through the elements. So the == is coming from comparisons between bool, int, double, RationalTime.

So, from a user point-of-view, if I have 2 instances of Timeline objects (each one being the root node of a complete OTIO tree) and I want to compare both trees, I can use the SerializableObject::is_equivalent_to function to do timeline1.is_equivalent_to(timeline2)? If this function is recursive, I don't see how it can reach each children's primitive. For instance, how would it reach the LinearTimeWarp's "_time_scalar" property? The only virtual functions in the LinearTimeWarp class are read_from/write_to and those are for serialization, so probably not used in the implementation of the (non-virtual) is_equivalent_to function. And I don't see a way for the derived classes (eg.: LinearTimeWarp) to push its primitives in a base class container (of anys). I might be missing something here though. :)

Also, my questions/concerns about exposing (or not) rapidjson Document and us using it to compare OTIO trees are actually answered by the fact that we could do the comparison using 2 C++ trees (agani, with SerializableObject::is_equivalent_to function). The fact that you use std::map internally is not a problem at all.

@davidbaraff
Copy link
Contributor Author

davidbaraff commented Dec 6, 2018 via email

@ssteinbach
Copy link
Collaborator

Just to make it easy for people to find, the headers linked by the documentation that @davidbaraff is adding can be found at this link:
https://github.com/davidbaraff/OpenTimelineIO/tree/sample-c%2B%2B-headers/proposed-c%2B%2B-api/otio

@ssteinbach
Copy link
Collaborator

Lets integrate these documentation changes back into the master branch now that we've landed the C++ branch into master.

@ssteinbach ssteinbach added this to the Public Beta 12 milestone Aug 7, 2019
@ssteinbach ssteinbach added this to To Do in C++ API via automation Aug 7, 2019
@jminor
Copy link
Collaborator

jminor commented Dec 3, 2019

Addressed by #610

@jminor jminor closed this Dec 3, 2019
@jminor jminor added this to In progress in Documentation via automation Oct 12, 2020
@jminor jminor moved this from In progress to To do in Documentation Oct 12, 2020
@meshula meshula moved this from To do to Done in Documentation Nov 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: To Do
Development

Successfully merging this pull request may close these issues.

None yet

8 participants