[xamarin-ios][sg-ssl enabled] cbl replicator does not resume after SG restart #1140

Closed
euniceatcouchbase opened this issue Apr 17, 2019 · 9 comments

@euniceatcouchbase

euniceatcouchbase commented Apr 17, 2019

Library Version

2.5.0-190

.NET Runtime

Xamarin-iOS

Operating System / Device Details

iOS 12.2

Log Output

Attached.
sgOffline-dotnet-console-log.txt
test_default_conflict_withConflicts_and_sgOffline.txt
x-android-2.1-dotnet-console.txt
x-android-2.1-pytest-console.txt
x-android-2.5-dotnet-console.txt
x-android-2.5-pytest-console.txt
x-ios-2-1-2-with-sg-2-1-3-dotnet-console-log.txt
x-ios-2-1-2-with-sg-2-1-3-pytest-log.txt
x-ios-2-1-2-with-sg-2.5-dotnet-console-log.txt
x-ios-2-1-2-with-sg-2.5-pytest-log.txt

Expected behavior

With xattrs = true, no-conflicts = false, and sg-ssl = true, start replication with push_pull and continuous=true on xamarin-ios.

Stop SGW, update and delete some docs on CBL, then restart SGW. The replicator is expected to resume: its activityLevel may be set to "offline" right after the SGW restart, but it should then become active, and replication should start automatically.

Actual behavior

The replicator's activityLevel remains in the "stopped" state with the error "Couchbase.Lite.CouchbaseLiteException: CouchbaseLiteException (LiteCoreDomain / 26): Connection closed."

Steps To Reproduce

setups:
1-CBS 6.0.1-2037
1-SGW 2.5.0-271 with xattrs=true, no-conflicts=false, sg-ssl=true
TestServer Xamarin-iOS 2.5.0-190

steps:

  1. Create docs in SG.
  2. Create two conflicts with 2-hex in SG.
  3. Update a doc in SG so that one of the conflicting branches gets a new revision.
  4. Start replication with push_pull and continuous=true.
  5. Stop SG.
  6. Update and delete docs in CBL.
  7. Start SG and wait until replication is idle.
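
Step 7 amounts to polling the replicator until it reports idle. The following is a minimal sketch of such a wait, not the mobile-testkit implementation: `wait_until_idle` and the `get_activity_level` callable are hypothetical names, and the level strings assume the Couchbase Lite activity levels (stopped, offline, connecting, idle, busy) are exposed to the test as lowercase text.

```python
import time


def wait_until_idle(get_activity_level, timeout=30.0, interval=0.5):
    """Poll until the replicator reports "idle"; fail fast on "stopped".

    get_activity_level is a hypothetical callable that returns the current
    activity level as a lowercase string.
    """
    deadline = time.monotonic() + timeout
    seen = []
    while time.monotonic() < deadline:
        level = get_activity_level()
        if not seen or seen[-1] != level:
            seen.append(level)  # record each state transition once
        if level == "idle":
            return seen
        if level == "stopped":
            # This sketch treats "stopped" as terminal: no retry is expected.
            raise RuntimeError(f"replicator stopped; states seen: {seen}")
        time.sleep(interval)
    raise TimeoutError(f"replicator not idle after {timeout}s; states seen: {seen}")
```

Returning the list of observed states lets a test assert that the replicator passed through "offline" rather than "stopped" after the SG restart.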

Reproduction Project

test command:
pytest -s -rsx -k test_default_conflict_withConflicts_and_sgOffline --timeout 1800 --liteserv-version=2.5.0-190 --liteserv-host=localhost --liteserv-port=8080 --xattrs --sg-ssl --sync-gateway-version=2.5.0-271 --mode=cc --server-version=6.0.1-2037 --liteserv-platform=xamarin-ios --create-db-per-test=cbl-test testsuites/CBLTester/CBL_Functional_tests/TestSetup_FunctionalTests

@euniceatcouchbase euniceatcouchbase changed the title [xamarin-ios] cbl replicator does not resume after SG restart [xamarin-ios][regression][sg-ssl enabled] cbl replicator does not resume after SG restart Apr 17, 2019
@euniceatcouchbase
Author

Attached more logs from xamarin-ios 2.1 and xamarin-android 2.5 with the same test configurations; all of those tests passed.
The issue is a regression on the sg-ssl enabled xamarin-ios platform only.

@borrrden borrrden added this to the Iridium milestone Apr 17, 2019
@borrrden
Member

I think this is just the case of the behavior being different, but the test doesn't actually realize that 2.1 is also failing.

I see the exact same behavior in the 2.1 logs except that an actual error is being set on the replicator instead of just stopping with no error. That is the reason that the test suite flags it as a failure in 2.5 but not 2.1.

If you check the logs for the 2.1 run you will see also that the replicator never leaves "stopped". I'm curious how this test can pass on 2.1, but it takes me ages to set up this environment correctly to run the test (Someone one of these days has to make a developer mode that skips all the cruft and just RUNS THE TEST )

@euniceatcouchbase
Author

To test 2.1 I had to use a 2.1 branch of the mobile-testkit repo. According to @sridevi-15, the 2.1 test framework code did not check errors, which I guess is why the 2.1 test passed.
What's your suggestion for this scenario? If this is expected behavior, we can add handling to the test code that checks for error 26 (the connection error) along with the replicator being in the stopped state.
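
As a rough illustration of the check proposed here, the sketch below parses the domain and code out of an error message shaped like the one in the report and tests for "stopped" plus LiteCoreDomain 26. The helper names `parse_cbl_error` and `is_connection_closed_stop` are hypothetical and not part of mobile-testkit; the message format is assumed from the single example in this issue.

```python
import re

# Matches the "(Domain / code)" part of messages such as
# "CouchbaseLiteException (LiteCoreDomain / 26): Connection closed."
_ERROR_RE = re.compile(r"\((\w+)\s*/\s*(\d+)\)")


def parse_cbl_error(message):
    """Return (domain, code) from an error message, or None if absent."""
    match = _ERROR_RE.search(message or "")
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def is_connection_closed_stop(activity_level, error_message):
    """True when the replicator stopped with LiteCoreDomain error 26."""
    return (
        activity_level == "stopped"
        and parse_cbl_error(error_message) == ("LiteCoreDomain", 26)
    )
```

A test could use this to distinguish the known connection-closed stop from other failures instead of ignoring errors entirely.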

@borrrden
Member

How are you checking the 2.1 version? When I roll back the test server app to 2.1, comment out all the 2.5-specific code, and run the test, I get the same failure. It doesn't look like a regression, and furthermore the weird thing is that it only happens if Sync Gateway is running on a Unix platform. If it is running on Windows, the test passes.

@euniceatcouchbase
Author

euniceatcouchbase commented Apr 17, 2019

I use the feature/cbl-2.1.2-tests branch. I tested against both SGW 2.1 and 2.5 (logs attached).

@borrrden
Member

I think probably a lot has changed between then and now, including test logic. It's not expected behavior, but it's not a regression and I don't think it is serious enough to block the release. So if it were up to me, it would be fixed in Cobalt, and this would be added as a known issue to Iridium.

CC: @djpongh @rajagp

@borrrden borrrden added bug and removed regression labels Apr 17, 2019
@djpongh djpongh changed the title [xamarin-ios][regression][sg-ssl enabled] cbl replicator does not resume after SG restart [xamarin-ios][sg-ssl enabled] cbl replicator does not resume after SG restart Apr 17, 2019
@djpongh
Contributor

djpongh commented Apr 17, 2019

Agree with @borrrden. Tagged to add to release notes.

@djpongh djpongh added the icebox label Apr 19, 2019
@borrrden
Member

Looks like this error is thrown when an SSL connection aborts. This is a pretty generic way to throw that so I hope that no other situation also throws this same exception.

To fix this for Cobalt: need to add an iOS specific translation for this exception to a transient error code (preferably only if TLS is active but that might not be possible).
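
The idea can be sketched in a few lines. This is a conceptual model only, not the Couchbase Lite implementation: `ReplicatorError`, `translate_error`, and `next_activity_level` are hypothetical names, and keying the translation on LiteCoreDomain / 26 is an assumption based on the error reported in this issue. It shows why the classification matters: a transient error sends the replicator to "offline" to retry, while anything else leaves it "stopped".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicatorError:
    domain: str
    code: int
    transient: bool


def translate_error(domain, code, platform, tls_active):
    """Mark the generic connection-closed error as transient on iOS with TLS.

    Assumption: the aborted SSL connection surfaces as LiteCoreDomain / 26.
    """
    is_connection_closed = (domain, code) == ("LiteCoreDomain", 26)
    transient = is_connection_closed and platform == "ios" and tls_active
    return ReplicatorError(domain, code, transient)


def next_activity_level(error):
    """Transient errors lead to "offline" (retry later); others to "stopped"."""
    return "offline" if error.transient else "stopped"
```

If TLS state cannot be detected at the point of translation, the `tls_active` condition would have to be dropped, which widens the set of errors treated as transient.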

@djpongh djpongh removed the icebox label Apr 22, 2019
@djpongh djpongh closed this as completed Apr 22, 2019
@djpongh djpongh reopened this Apr 22, 2019
@borrrden borrrden modified the milestones: Iridium, Cobalt May 7, 2019
Sandychuang8 pushed a commit that referenced this issue May 18, 2019
borrrden pushed a commit that referenced this issue May 21, 2019
* Add an iOS specific translation for this exception to a transient error code
@Sandychuang8
Contributor

Since the PR is approved and merged into master, I will close this issue here. @borrrden created a Jira ticket (https://issues.couchbase.com/browse/CBL-2) for this issue so we can follow up on test fixes from there.
