Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework the sync tests #8435

Merged
merged 10 commits into from
Dec 19, 2023
Merged

Rework the sync tests #8435

merged 10 commits into from
Dec 19, 2023

Conversation

tgoyne
Copy link
Member

@tgoyne tgoyne commented Dec 13, 2023

This unfortunately turned into a giant unreviewable mess, but I don't really know how to split it up. An overview of the changes:

Each XCTestCase subclass now creates a server-side App which is used for most of the tests in that test case. Apps are now never shared between test cases, and do not share Mongo collections with other apps. Together this means that state from one test case should never bleed over to other test cases.

This required rearranging many of the tests, as we had test cases with both PBS and FLX tests. While doing this I also discovered several tests which were just plain in the wrong test case due to files having multiple multi-thousand-line test cases and people thought they were adding tests to the one defined at the top. I split these files into multiple files, which I think makes things much more managable but unfortunately results in the diff being unhelpful.

Each test case now explicitly defines which set of classes it uses, and only Rules for those classes is created in the server app. This cuts the time required to create apps roughly in half and helps offset the fact that we're now creating more apps. It also gets rid of the weird things like the hardcoded list of flx-compatible types in RealmServer.

Creating an app now waits for the initial sync to complete. With PBS this is required to avoid some very strange bugs. With FLX it mostly ensures that if this times out for some reason we get a test failure there, rather than later in some very confusing place in the middle of a test.

Client reset tests now use the API endpoint for FLX in addition to PBS. This makes them dramatically faster (several seconds instead of 30+).

FLX tests now consistently follow the pattern of using one of the object fields as a partition key to query on rather than querying for all objects of a type. Some tests already did this, while others tried to clear the data first (which did not always work if the server was in the middle of processing old requests), and some just plain broke if tests were run in the wrong order.

In the very early days of sync, opening the same Realm URL twice required two different processes, so all our sync tests spawned child processes. That hasn't been true for a very long time, but the tests stuck around and some more were written in that style due to mimicking the existing tests. I've ported almost all of them over to operating in a single process, which makes them both simpler and much faster (5s to .5s in many cases).

The tests are now run with developer mode off. This was initially required due to the change where opening with a class subset is now considered a breaking change in developer mode, but now that test cases explicitly specify their types that isn't a problem any more. However, it does let us once again test subscriptions failing due to an unqueryable field, and that test revealed that we were using the wrong error domain for that error.

I added some new helper functions for the things I discovered I was going to have to change in literally hundreds of places. Creating a temporary user is now just self.createUser() rather than separate steps of creating credentials and logging in. self.name is now used as the tag value for partitions and user names and such rather than #function or NSStringFromSelector(_cmd), which makes it so that it doesn't have to be explicitly passed into helper functions. There were a number of places where this previously was done incorrectly and #function was used inside helper functions, which didn't achieve the desired effect.

@tgoyne tgoyne self-assigned this Dec 13, 2023
@tgoyne tgoyne force-pushed the tg/rework-sync-tests branch 3 times, most recently from 656fa80 to b22a853 Compare December 14, 2023 23:41
@tgoyne tgoyne changed the base branch from master to tg/write-transaction-notification December 14, 2023 23:55
@tgoyne tgoyne marked this pull request as ready for review December 15, 2023 02:15
@tgoyne tgoyne mentioned this pull request Dec 15, 2023
@tgoyne tgoyne force-pushed the tg/write-transaction-notification branch from 407d57b to a382ae5 Compare December 17, 2023 17:31
Copy link
Contributor

@dianaafanador3 dianaafanador3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really great work into making the sync test suite better. just some minor comments. And as a plus, I'm missing a little bit more documentation/comments into our test cases common code, I guess if you take a look at this PR everything makes sense, but if you are someone new this make feel like magic, or you can even miss adding this for a new sync test file.

@@ -0,0 +1,715 @@
////////////////////////////////////////////////////////////////////////////
//
// Copyright 2016 Realm Inc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This date doesn't correspond to the date of creation of this file or test at least, in this and the other files created

@@ -1228,4 +1252,7 @@ public class RealmServer: NSObject {
}
}

extension String: Error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

while true {
let complete = try session.apps[appServerId].sync.progress.get()
.map { resp in
guard let resp = resp as? Dictionary<String, Any?> else { return false }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just have all this conditions in the same guard?

- parameter type: The type of the object to be queried.
- parameter query: A query which will be used to modify the existing query.
*/
public func updateQuery<T: Object>(toType type: T.Type, where query: (Query<T>) -> Query<Bool>) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why adding this, if the above API can be used for this

@@ -166,6 +182,20 @@ import Realm.Private
self.predicate = query?(Query()).predicate ?? NSPredicate(format: "TRUEPREDICATE")
}

/**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same in here, doesn't the above API already can be used for this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment in the body says why: optional function parameters are always escaping, but this function doesn't escape.

Copy link
Contributor

@dianaafanador3 dianaafanador3 Dec 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, didn't check the comment within the code, only the doctoring above


let realm = try Realm(configuration: config1)
try realm.write {
try write { realm in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use populateData, if we are checking for count and not for specific data

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is checking the number of sections, which depends on the number of unique first names.

@tgoyne tgoyne force-pushed the tg/write-transaction-notification branch from a382ae5 to bdd98d3 Compare December 19, 2023 04:39
@tgoyne tgoyne force-pushed the tg/rework-sync-tests branch 2 times, most recently from e2d0e0f to f2d95b6 Compare December 19, 2023 17:14
@tgoyne
Copy link
Member Author

tgoyne commented Dec 19, 2023

I'm going to take a pass at writing some explanations of wtf the sync tests are doing, because you're absolutely right that there's a lot of complexity that's completely undocumented.

@dianaafanador3
Copy link
Contributor

I'm going to take a pass at writing some explanations of wtf the sync tests are doing, because you're absolutely right that there's a lot of complexity that's completely undocumented.

Don't worry, I think is something I mentioned only as a bonus point

…hange

Core has allowed this for a while, but we had our own validation which made it not work.
The RLMSyncSubscriptionSet was retained until the task was deallocated even
after the task completed, which is not strictly incorrect but made a test
unreliable.
This compiles because the value can be implicitly converted to an optional, but
it isn't actually checking anything. Even if the non-optional has an invalid
nil value, it ends up as `.some(invalid)` and will pass the test.
We'll never get a success after an error, so not fulfilling the expectation
just pointless waits for a timeout.
This unfortunately turned into a giant unreviewable mess, but I don't really
know how to split it up. An overview of the changes:

Each XCTestCase subclass now creates a server-side App which is used for most
of the tests in that test case. Apps are now never shared between test cases,
and do not share Mongo collections with other apps. Together this means that
state from one test case should never bleed over to other test cases.

This required rearranging many of the tests, as we had test cases with both PBS
and FLX tests. While doing this I also discovered several tests which were just
plain in the wrong test case due to files having multiple multi-thousand-line
test cases and people thought they were adding tests to the one defined at the
top. I split these files into multiple files, which I think makes things much
more managable but unfortunately results in the diff being unhelpful.

Each test case now explicitly defines which set of classes it uses, and only
Rules for those classes is created in the server app. This cuts the time
required to create apps roughly in half and helps offset the fact that we're
now creating more apps. It also gets rid of the weird things like the hardcoded
list of flx-compatible types in RealmServer.

Creating an app now waits for the initial sync to complete. With PBS this is
required to avoid some very strange bugs. With FLX it mostly ensures that if
this times out for some reason we get a test failure there, rather than later
in some very confusing place in the middle of a test.

Client reset tests now use the API endpoint for FLX in addition to PBS. This
makes them dramatically faster (several seconds instead of 30+).

FLX tests now consistently follow the pattern of using one of the object fields
as a partition key to query on rather than querying for all objects of a type.
Some tests already did this, while others tried to clear the data first (which
did not always work if the server was in the middle of processing old
requests), and some just plain broke if tests were run in the wrong order.

In the very early days of sync, opening the same Realm URL twice required two
different processes, so all our sync tests spawned child processes. That hasn't
been true for a very long time, but the tests stuck around and some more were
written in that style due to mimicking the existing tests. I've ported almost
all of them over to operating in a single process, which makes them both
simpler and much faster (5s to .5s in many cases).

The tests are now run with developer mode off. This was initially required due
to the change where opening with a class subset is now considered a breaking
change in developer mode, but now that test cases explicitly specify their
types that isn't a problem any more. However, it does let us once again test
subscriptions failing due to an unqueryable field, and that test revealed that
we were using the wrong error domain for that error.

I added some new helper functions for the things I discovered I was going to
have to change in literally hundreds of places. Creating a temporary user is
now just `self.createUser()` rather than separate steps of creating credentials
and logging in. `self.name` is now used as the tag value for partitions and
user names and such rather than `#function` or `NSStringFromSelector(_cmd)`,
which makes it so that it doesn't have to be explicitly passed into helper
functions. There were a number of places where this previously was done
incorrectly and `#function` was used inside helper functions, which didn't
achieve the desired effect.
@tgoyne tgoyne force-pushed the tg/write-transaction-notification branch from bdd98d3 to 535097a Compare December 19, 2023 19:21
@tgoyne tgoyne changed the base branch from tg/write-transaction-notification to master December 19, 2023 20:22
@tgoyne tgoyne merged commit a73034c into master Dec 19, 2023
124 of 126 checks passed
@tgoyne tgoyne deleted the tg/rework-sync-tests branch December 19, 2023 21:36
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 14, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants