FTP hangs on connect #27

Closed
cammellos opened this issue Jul 29, 2016 · 8 comments · Fixed by #31

Comments

@cammellos
Contributor

Hi,
thanks for the library!

We are facing an issue when connecting to an FTP server.

The connection seems to get stuck in the open function: it waits for a server reply and never times out.

It is quite difficult to replicate as it's intermittent, but my understanding is that the FtpClient actually connects and then hangs while receiving data from the server.

We are using the with-ftp method and setting a data-timeout-ms, but it's only set after open has run.

As a temporary fix, I have modified the library to set a data timeout before actually connecting to the server. We are now checking whether that fixes the issue, which might take some time as we don't have a way to replicate this condition locally.

It looks like it might be this issue:
http://stackoverflow.com/questions/2125350/commons-net-ftp-deadlock

I am happy to send a PR if we are able to verify that setting the data timeout before connecting solves the issue, but I might need some input on how you would like to change the open function signature.

At the moment we changed it to:

(defn open
  ;; the one-argument arity passes an empty options map so the defaults apply
  ([url] (open url "UTF-8" {}))
  ([url control-encoding
    {:keys [data-timeout-ms
            control-keep-alive-timeout-sec
            control-keep-alive-reply-timeout-ms]
     :or {data-timeout-ms -1
          control-keep-alive-timeout-sec 300
          control-keep-alive-reply-timeout-ms 1000}}]
   ;; body elided in this snippet
   ))

But we are happy to change it.

I have attached the thread stacktrace.

In the meantime, I was wondering if you have any input. Have you come across a similar problem?

stack.txt

Again, thanks for the library and sorry for the lengthy message!
Cheers,
andrea

@miner
Owner

miner commented Jul 29, 2016

I haven't seen this problem before. It sounds like you want to call .setDataTimeout in open before .connect to see if that fixes the problem. That's worth a try.

However, if you're stuck in actually opening the connection, you probably need to use .setConnectTimeout (inherited from SocketClient).

I'm happy to accept changes to fix this, especially if you can provide a good test case.

@cammellos
Contributor Author

Hi Steve,
thanks for the reply,

I can confirm that it's not the connection timing out.

I have tried setting setDataTimeout, but the problem persisted.

I have now tried calling setDefaultTimeout before connecting and so far so good, fingers crossed :)

I will do a bit of investigation, and once I can confirm that this fixes it I will start working on a PR.

thanks!
andrea
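
For readers following along, here is a minimal sketch of the failure mode described above, using only java.net rather than Commons Net or this library. It assumes the hang is a blocking read on a socket that connected successfully but never receives a greeting, which is what the thread suggests but does not prove. The names silent-server, read-greeting, and demo are hypothetical. A read timeout (SO_TIMEOUT) set on the socket turns the indefinite wait into a SocketTimeoutException; as far as the Commons Net documentation describes it, setDefaultTimeout is the value applied to the socket when it is opened, which would explain why it helped where setDataTimeout did not.

```clojure
(import '(java.net ServerSocket Socket InetAddress SocketTimeoutException))

(defn silent-server
  "Opens a listening socket on an ephemeral loopback port. The OS completes
  incoming TCP handshakes from the backlog, but nothing ever writes a
  greeting, mimicking a server that accepts a connection and stays silent."
  ^ServerSocket []
  (ServerSocket. 0 1 (InetAddress/getLoopbackAddress)))

(defn read-greeting
  "Connects to port on loopback and tries to read one byte, giving up after
  timeout-ms. Returns :timed-out if no data arrived in time, otherwise the
  byte read (or -1 at end of stream)."
  [port timeout-ms]
  (with-open [sock (Socket. (InetAddress/getLoopbackAddress) (int port))]
    ;; Without this call the read below would block indefinitely.
    (.setSoTimeout sock (int timeout-ms))
    (try
      (.read (.getInputStream sock))
      (catch SocketTimeoutException _ :timed-out))))

(defn demo
  "Connects to the silent server and waits at most 200 ms for a greeting."
  []
  (with-open [server (silent-server)]
    (read-greeting (.getLocalPort server) 200)))
```

Calling (demo) connects without trouble and then gives up on the read after 200 ms, so a connect timeout alone would not catch this case.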


@OskarD

OskarD commented Sep 22, 2016

@miner @cammellos any progress on this?

We would like to deploy this fix since it seems to throw an exception instead of hanging when it loses connection to the FTP. This happened in our production environment, and we didn't find out until several days later that our service had stopped. Is the snapshot safe to use or is it incomplete?

@cammellos
Contributor Author

Hi,
we have been using our fix in production for the past month and have had no issues with it. I will try to push a pull request this weekend.


@rutchkiwi

@cammellos I'm also experiencing this issue, thanks for your investigation into it! Did your fix work in production, and how did you fix it? Any help is appreciated, if I can fix this I might try and make that PR as well!

@cammellos
Contributor Author

cammellos commented Apr 25, 2017 via email

@rutchkiwi

Thanks a lot Andrea! Either the PR or the gist would be massively appreciated! :)

@miner
Owner

miner commented Apr 27, 2017

Fixed in release version 0.3.9.
