Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore same values when using copyToRealm #6224

Merged
merged 61 commits into from
Nov 2, 2018
Merged

Conversation

cmelchior
Copy link
Contributor

@cmelchior cmelchior commented Oct 8, 2018

This PR changes the implementation of copyToRealm to use Object Stores Object::create method.
This allows us to ignore fields with the same value. Currently this generates a lot of extra set instructions when syncing Realms which has a negative performance impact on the server.

A side-effect of this is also that using this method fine-grained notifications will not report any changes for this field. However this is only true if using copyToRealm or copyToRealmOrUpdate

Implementation is done by adding a new OsObjectBuilder class that will build up a list of data on the C++ side before creating the whole thing. From the user API side, we have a new ImportFlag enum.

Enums are being used instead of booleans or int flags since that mirros the API used by RealmObjectSchema for specifying field properties. It also has the upside of being more typesafe (plus I strongly suspect more flags in the future).

// New enum
enum ImportFlag {
  DO_NOT_SET_SAME_VALUES
}

// Usage
realm.copyToRealm(object, ImportFlag.DO_NOT_SET_SAME_VALUES)`
realm.copyToRealmOrUpdate(object, ImportFlag.DO_NOT_SET_SAME_VALUES)`

For the first iteration only copyToRealm/copyToRealmOrUpdate will use this method. We should consider adding the same flags to other methods like and createOrUpdateFromJSon*. insertOrUpdate is designed to be super performant, so it is unclear if we can support ImportFlags in this method without a lot of extra effort.

Performance

Performance in the new implementation is pretty much on par with the current performance. When diffing is enabled performance decrease about 20% which is unavoidable as all properties needs to be compared with what is on disk.

5.8.0
image

5.7.0
image

Diffed vs. non-diffed performance in 5.8
image

Relevant files to review

  • java_object_accessor.hpp contains the majority of the C++ code
  • io_realm_internal_objectstore_OsObjectBuilder.cpp contains the JNI boundary
  • OsObjectBuilder contains most of the Java code, but isn't really very interesting.
  • See all the proxy classes for the change to the output from the annotation processor

TODO

  • Cleanup Changelog
  • Cleanup Javadoc
  • Javadoc in OsObjectBuilder
  • Finish unit tests of OsObjectBuilder
  • Cleanup OsObjectBuilder
  • Modify annotation processor to use OsObjectBuilder
  • Update annotation processor unit tests.
  • Performance test
  • Add unit test for notifications triggering correctly
  • Any input to name of flag? Other suggestions: DO_NOT_WRITE_SAME_VALUES, IGNORE_SAME_VALUES? Others.. The current choice is somewhat arbitrary
  • Send flags to C++
  • Hook up into Object Store (using a WIP branch)
  • Fix JavaContext::print() crashing on CI
  • Remove use of util::Any or figure another way to get performance back to normal.

return BinaryData();

auto& value = any_cast<OwnedBinaryData&>(v);
const auto& data = value.get();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this safe?

@cmelchior
Copy link
Contributor Author

Ready for review @nhachicha @kneth

Copy link
Collaborator

@nhachicha nhachicha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job ❤️ I think we need to update insertOrUpdate even though it's faster, it's still not optimised to generate fewer SET instructions for Sync

@cmelchior
Copy link
Contributor Author

All comments addressed @nhachicha.

While it is true that ideally this should be added to all ways of importing data, it isn't clear it is worth the effort currently. It is, no matter what, outside the scope of this PR. I have created #6263 that tracks adding it to the rest of our API's

@cmelchior
Copy link
Contributor Author

Ups. Some other comments where hiding. All should be fixed now

Copy link
Collaborator

@nhachicha nhachicha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏when CI is happy

@cmelchior cmelchior merged commit 9c4c5e6 into master Nov 2, 2018
@bmunkholm bmunkholm deleted the cm/copy-ignore-same-values branch September 29, 2020 07:44
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 15, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants