This issue was moved to a discussion.

Connection Error, RemoteDisconnected #144

Closed
lisavader opened this issue Apr 8, 2021 · 18 comments
Comments

@lisavader

Hi,

When I try to download data for a relatively large number of genomes, e.g.:
ncbi-genome-download bacteria -t 562 -l complete -F assembly-report

I get the following error message:
ERROR: Download from NCBI failed: ConnectionError(ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')))

I don't get this issue when downloading only one or a few genomes.
Looking at similar issues, it seems that a ConnectionError is usually caused by the user's own connection rather than by ncbi-genome-download. However, because the connection is closed by the remote end here, I'm not sure that applies.

If anyone could help me out that'd be greatly appreciated!

Best,
Lisa

@ilasadar

Same problem (04/10/2021):

ncbi-genome-download --formats fasta bacteria --parallel 4
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_011742285.2'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_017815795.1'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_017815575.1'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_009498175.3'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_017815655.1'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_017815675.1'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_017815835.1'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_017869345.1'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_017815595.1'
WARNING: Skipping entry, as it has no ftp directory listed: 'GCF_017815615.1'
ERROR: Download from NCBI failed: ConnectionError(ProtocolError('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer')))

@Daikuang

Similar problem
ERROR: Download from NCBI failed: ConnectionError(ProtocolError('Connection aborted.', OSError(0, 'Error')))

@tantony3

I've also been having the same issue for a week:

ERROR: Download from NCBI failed: ConnectionError(ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')))

@kblin
Owner

kblin commented Apr 13, 2021

Hm, it looks like NCBI might have introduced some kind of connection limit. I'm not aware of any documentation on this from the NCBI side, and unlike with the Entrez API, there's no way to provide e.g. an API key to get a less strict rate limit. I'll see if I can reproduce this and debug it a bit further.

@kblin
Owner

kblin commented Apr 13, 2021

OK, it looks like I'm getting the ERROR: Download from NCBI failed: ConnectionError(ProtocolError('Connection aborted.', OSError(0, 'Error')),) one myself here. I'll see if I can find out what's happening.

@kblin
Owner

kblin commented Apr 13, 2021

Now I got the ERROR: Download from NCBI failed: ConnectionError(ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',)),) one. Unfortunately, there's not much to find out about this: the connection is closed not at the HTTP GET request level but one level below that, so nothing is communicated about what the issue is.
I'm currently trying to add a rate-limiting step to see if that fixes it, but this will slow things down considerably.
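
For readers unfamiliar with the idea, a rate-limiting step of this kind can be sketched as below. This is a minimal illustration, not the code that was tried in ncbi-genome-download; the `RateLimiter` class and its parameters are hypothetical names introduced here.

```python
import time


class RateLimiter:
    """Allow at most one call per `min_interval` seconds (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        # clock and sleep are injectable so the limiter can be tested without waiting
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until at least `min_interval` seconds have passed since the last call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


# Usage sketch: call limiter.wait() before each request.
# limiter = RateLimiter(min_interval=1.0)  # roughly 1 request per second
# for url in urls:
#     limiter.wait()
#     fetch(url)  # fetch() stands in for whatever performs the download
```

Calling `wait()` before every request spaces requests at least `min_interval` seconds apart, at the cost of a proportionally longer run.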

@wshuai294

I hit the same bug, and I'm looking forward to a solution.

ERROR: Download from NCBI failed: ConnectionError(ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',)),)

@kblin
Owner

kblin commented Apr 14, 2021

Nope, it still happens even at just 1 request per second; it just takes longer to get there.
As this already happens while downloading the checksum files, you can't even cache those and restart easily, so I'm also struggling to find a good workaround.

@kblin
Owner

kblin commented Apr 14, 2021

Having said that, I hear from a couple of colleagues that other connections to the NCBI FTP servers also die with the same issues, regardless of whether HTTPS is used (as in ncbi-genome-download) or old-fashioned FTP. So maybe there are just some networking issues on the NCBI side at the moment?

@lisavader
Author

Thank you for looking into the issue! Let's hope it's only a temporary NCBI connection problem.

@naturepoker

> Having said that, I hear from a couple of colleagues that also other connections to the NCBI FTP servers die with the same issues, regardless of if the HTTPS protocol is being used (like for ncbi-genome-download) or if old-fashioned FTP is being used. So maybe there's just some networking issues at the NCBI side of things at the moment?

I can attest that even pure FTP downloads were getting cut off more or less randomly as of June 9th, 2021. It looks like NCBI introduced some sort of arbitrary cutoff for shutting down connections. One would wonder if they can't just communicate with the research community directly about what's needed...

@npsonis

npsonis commented Mar 16, 2022

Hi, still an issue today

@kblin
Owner

kblin commented Mar 17, 2022

This is on the NCBI side of things, though. Not much we can do about this on the client side.

@evezeyl

evezeyl commented Mar 27, 2022

Same problem. It would be nice to have a resume command. I'm not sure whether everything starts from scratch when we relaunch, but the ability to resume would be perfect.
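
One user-side way to approximate a resume, sketched here under assumptions, is to rerun the command with increasing pauses until it succeeds. The helper name `run_with_retries` and its defaults are hypothetical, and the sketch assumes the tool exits with a non-zero status when the download fails, which should be checked against the installed version.

```python
import subprocess
import time


def run_with_retries(cmd, max_attempts=5, base_delay=30.0,
                     run=subprocess.run, sleep=time.sleep):
    """Rerun `cmd` until it exits with status 0 or the attempts run out.

    Returns the number of attempts used; raises RuntimeError if every attempt fails.
    Assumption: a failed download is reported through a non-zero exit status.
    """
    for attempt in range(1, max_attempts + 1):
        result = run(cmd)
        if result.returncode == 0:
            return attempt
        if attempt < max_attempts:
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(
        f"command still failing after {max_attempts} attempts: {cmd}"
    )


# Usage sketch, with the command from the original report:
# run_with_retries(["ncbi-genome-download", "bacteria", "-t", "562",
#                   "-l", "complete", "-F", "assembly-report"])
```

This does not address the dropped connections themselves; it only automates the relaunching.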

@kblin
Owner

kblin commented Mar 28, 2022

ncbi-genome-download doesn't re-download files that are correctly downloaded and current. But in order to check that, it does need to fetch all checksum files again on startup, and if you're downloading a lot of records that can also take a while.
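
To illustrate the kind of check described, here is a minimal sketch of deciding whether a local file needs to be fetched again by comparing its MD5 against a checksum listing. It is not ncbi-genome-download's actual implementation: the function names are hypothetical, and the listing format (`<md5>  ./<filename>` per line) is an assumption.

```python
import hashlib
from pathlib import Path


def parse_md5_listing(text):
    """Parse lines of the assumed form '<md5>  ./<filename>' into {filename: md5}."""
    checksums = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        digest, name = parts
        name = name.strip()
        if name.startswith("./"):
            name = name[2:]
        checksums[name] = digest.lower()
    return checksums


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def needs_download(path, expected_md5):
    """Return True if the file is missing or its MD5 differs from the expected one."""
    path = Path(path)
    return not path.is_file() or md5_of(path) != expected_md5.lower()
```

The expected checksums have to come from the server, which is why the checksum files are fetched again on every start even when the data files themselves are skipped.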

@chasemc

chasemc commented Nov 8, 2022

Hitting this today

@chasemc

chasemc commented Nov 8, 2022

NLM had a bunch of website issues a couple of days ago; maybe there's also something going on with the FTP:
ERROR: Download from NCBI failed: ConnectionError(ProtocolError('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')))

@kblin
Owner

kblin commented Nov 8, 2022

Again, this is an issue on the NCBI side, nothing ncbi-genome-download can do about it.

Repository owner locked and limited conversation to collaborators Nov 8, 2022
@kblin kblin converted this issue into discussion #198 Nov 8, 2022

10 participants