Skip to content
This repository has been archived by the owner on Jan 15, 2021. It is now read-only.

Random testThaliPullReplicationFromNotificationCoordinated test failure #1744

Closed
chapko opened this issue Jan 21, 2017 · 8 comments
Closed
Assignees

Comments

@chapko
Copy link
Contributor

chapko commented Jan 21, 2017

Logs from devices:

Branch: iOS
Commit: 4b97950

Reproducible on desktop

@chapko chapko self-assigned this Jan 21, 2017
@chapko
Copy link
Contributor Author

chapko commented Feb 3, 2017

Turns out cancelling replication does not always emits 'complete' event. We should also listen to 'cancel' event. I'm still trying to figure out how to reliably reproduce this issue.

@yaronyg
Copy link
Member

yaronyg commented Feb 3, 2017

Have we tried to update our release of PouchDB to see if perhaps this has been addressed? If you do get a repro then it would be good to file this as a bug with PouchDB even as we work around it.

@chapko
Copy link
Contributor Author

chapko commented Feb 3, 2017 via email

@chapko
Copy link
Contributor Author

chapko commented Feb 6, 2017

Actually this is not a PouchDB bug. Something is wrong with ForverAgent. I tried to run test using default https agent and forever-agent:

  • https.Agent 0/100 failures
  • ForeverAgent.SSL 23/100 failures.

@chapko
Copy link
Contributor Author

chapko commented Feb 6, 2017

The bug itself is that when we are trying to make request our callback is never called. It is reproducible with our fork of forever-agent and with original forever-agent.

@yaronyg
Copy link
Member

yaronyg commented Feb 6, 2017

Then it's time to file a bug with forever-agent. I wonder if anyone will notice. :) In the meantime we should obviously fix it in our fork.

@chapko
Copy link
Contributor Author

chapko commented Feb 7, 2017

I've finally found the problem and this is a bug in forever-agent.

Imagine the scenario when we are making 3 serial requests to the same http server using the same ForeverAgent instance:

  1. We create http.ClientRequest object via http.request(...). It asks agent for socket connection. Agents creates new net.Socket object and provides it to the http.ClientRequest object.
  2. Request writes body into the provided socket. Once we receive reply via this socket, parser creates http.IncomingMessage stream. As soon as this stream ends, the underlying socket emits 'free' event marking this socket as "free".
  3. ForeverAgent listens to "free" event and stores these sockets to reuse them later.
  4. Following requests to the same server would result in agent providing used sockets.

So basically if we have "free" socket it can be "borrowed" by some request. As soon as the corresponding response (IncomingMessage) emits 'end' event, our sockets is "free" again.

And everything works fine until we destroy socket.

Destroyed socket emits 'close' event. Agent listens to this event to remove closed sockets. And it works as intended, destroyed socket is removed from the agent, but the problem is that agent still listens to the 'free' event on removed socket. And if we destroy socket in the middle of request and if there is a corresponding response object, response will be ended emitting 'end' event marking the socket as "free" again (1 2 3).

The next part was to figure out if it is agent's job to filter out destroyed sockets or agent should not listen to 'free' event after removing socket or it is node's job to not emit 'free' event on destroyed socket. Nodejs after 0.10 (not sure what exact) version implemented ForeverAgent functionality natively and it just checks if the socket is destroyed before providing it to the request objects. And the latest node has the same problem with ForeverAgent (that means it still emits 'free' events after destroying and default agent still listens to the 'free' event after socket has been removed). So the obvious solution is to check if the socket is destroyed when agent receives 'free' event.

I don't think ForeverAgent is still maintained, nodejs 0.10 and 0.12 are abandoned since the December 2016, and node 4+ supports this functionality natively. Anyway I will provide a pull request to the original repo.

@yaronyg
Copy link
Member

yaronyg commented Feb 7, 2017

@chapko Thanks! This is a painful but necessary catch and yet another reason why we have to figure out how to get off JXcore.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants