Drop ftp:// urls from metalinks #99

Closed
nirik opened this Issue Jun 22, 2015 · 11 comments

Member

nirik commented Jun 22, 2015

FTP causes issues with many firewalls and is, in general, a horrible protocol. We should stop offering FTP URLs in metalinks.

We might want to check/contact any mirrors that have only ftp urls and ask them to fix it or update to add a http{s} url.

adrianreber Nov 9, 2015

Member

I had a look at the URLs in the database. Across all mirrors, 37 categories (out of 2009 in total) have FTP-only links:

  1 ftp-redirect
  1 http-forbidden (403)
  4 private
  5 no-dns
  7 http-works
  8 http-not-found (404)
 11 no-http

So it looks like we could disable FTP in MirrorManager. I will provide a patch that prevents FTP URLs from being added in the future. Then we can remove the existing FTP URLs from the database.
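For anyone who wants to reproduce a count like this, here is a minimal sketch. It is not MirrorManager's actual query: the `(category_id, url)` input shape and the name `ftp_only_categories` are assumptions made for illustration, and "FTP-only" is taken to mean that every URL of a category uses the `ftp` scheme.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def ftp_only_categories(rows):
    """Return the categories whose URLs all use the ftp scheme.

    `rows` is an iterable of (category_id, url) pairs. This shape is a
    hypothetical stand-in for whatever the real database returns.
    """
    schemes = defaultdict(set)
    for category_id, url in rows:
        schemes[category_id].add(urlsplit(url).scheme.lower())
    # A category is FTP-only when ftp is the only scheme seen for it.
    return sorted(cid for cid, seen in schemes.items() if seen == {"ftp"})


# Made-up sample data, not taken from the Fedora database.
rows = [
    (1, "ftp://mirror.example.org/fedora"),
    (1, "http://mirror.example.org/fedora"),
    (2, "ftp://ftp.example.net/pub/fedora"),
    (3, "rsync://mirror.example.com/fedora"),
]
print(ftp_only_categories(rows))
```

With the sample rows, only category 2 is reported, because category 1 also has an HTTP URL and category 3 has no FTP URL at all.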

pypingou Nov 9, 2015

Member

@adrianreber do we want to make this configurable in case other systems want to support FTP?

adrianreber Nov 9, 2015

Member

@pypingou good point. Other MM users might want to use FTP.
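A configurable check could look roughly like the sketch below. It is only an illustration of the idea, not the patch itself: `ALLOWED_SCHEMES` and `is_allowed_url` are hypothetical names, and the default set of schemes is an assumption.

```python
from urllib.parse import urlsplit

# Hypothetical default. A deployment that still wants FTP would add "ftp".
ALLOWED_SCHEMES = frozenset({"http", "https", "rsync"})


def is_allowed_url(url, allowed_schemes=ALLOWED_SCHEMES):
    """Return True if the URL's scheme is in the configured allow-list."""
    scheme = urlsplit(url).scheme.lower()
    return scheme in allowed_schemes


print(is_allowed_url("ftp://mirror.example.org/fedora"))
print(is_allowed_url("ftp://mirror.example.org/fedora",
                     allowed_schemes=ALLOWED_SCHEMES | {"ftp"}))
```

Keeping the allowed schemes in one setting means Fedora can reject FTP while other MM users opt back in without touching the validation code.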


henrysher pushed a commit to henrysher/fedora-infra-ansible that referenced this issue Dec 17, 2015

First step to disable FTP in MirrorManager
As discussed in

fedora-infra/mirrormanager2#99

This is the first step to remove FTP from MirrorManager. With this
change it is no longer possible to enter FTP URLs into MM.

Signed-off-by: Adrian Reber <adrian@lisas.de>

@ralphbean ralphbean added the medium label Jan 11, 2016

nirik Mar 7, 2016

Member

What's the status here? Can we drop these yet?

adrianreber Mar 8, 2016

Member

We probably could. New mirrors can only be added without FTP URLs, and there has been no negative feedback so far. When I look at mirrors with problems, I manually delete FTP URLs when I see them. It would be nice to remove those URLs all at once, but right now I see no problem with removing them gradually. If anybody wants to remove all the FTP URLs right now, I have no objections.

nirik Mar 8, 2016

Member

So, we just need to go into the DB and remove all the items containing FTP URLs?

I'm for doing this sooner rather than later. I get a pretty constant stream of people asking me why we have ftp:// urls and when they are going to go away.

@pypingou any thoughts?
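As a rough idea of what a one-shot cleanup might look like, here is a sketch against an in-memory SQLite database. The table `mirror_url` and its columns are hypothetical and are not MirrorManager's real schema; the real database and its tables would need their own statement.

```python
import sqlite3


def drop_ftp_urls(conn):
    """Delete every row whose URL starts with ftp:// and return the count.

    Assumes a hypothetical table mirror_url(id, url).
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute("DELETE FROM mirror_url WHERE url LIKE 'ftp://%'")
    return cur.rowcount


# Made-up sample data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mirror_url (id INTEGER PRIMARY KEY, url TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO mirror_url (url) VALUES (?)",
    [
        ("ftp://mirror.example.org/fedora",),
        ("http://mirror.example.org/fedora",),
        ("ftp://ftp.example.net/pub/fedora",),
        ("rsync://mirror.example.com/fedora",),
    ],
)
removed = drop_ftp_urls(conn)
remaining = [row[0] for row in conn.execute("SELECT url FROM mirror_url ORDER BY id")]
print(removed, remaining)
```

Running the delete inside a single transaction keeps the cleanup all-or-nothing, which seems preferable to removing URLs one by one.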

mdomsch Mar 8, 2016

Member

FYI, the crawler crawls via rsync if available, falling back to FTP if available, and finally HTTP, priority being the fastest and least intrusive way of getting the list of files. Removing FTP will slow down the crawler for any mirror that doesn't offer rsync but has offered FTP.

adrianreber Mar 10, 2016

Member

Unfortunately, that order (RSYNC > FTP > HTTP) has not been true for quite some time (I think). Even MirrorManager1 preferred HTTP over FTP, according to

https://git.fedorahosted.org/cgit/mirrormanager.git/tree/server/crawler_perhost#n393

But then there is also the logic to crawl first per category (RSYNC), then per directory (FTP), and then per file (HTTP). So it is still possible that mirrors are crawled using FTP, but I haven't seen it very often. I think that in cases where crawling takes too much time, RSYNC is the only sane option. Especially as the crawlers seem to be behind some kind of NAT, using FTP to crawl might become (or already is) problematic.
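To make the preference order concrete, here is a small sketch of choosing a crawl URL by scheme. It is one reading of the comments above (RSYNC first, then HTTP, then FTP), not the crawler's real code: `CRAWL_RANK` and `pick_crawl_url` are hypothetical names, and ranking `https` the same as `http` is an assumption.

```python
from urllib.parse import urlsplit

# Lower rank means more preferred. Treating https like http is an assumption.
CRAWL_RANK = {"rsync": 0, "http": 1, "https": 1, "ftp": 2}


def pick_crawl_url(urls):
    """Return the most preferred URL to crawl, or None if none is usable."""
    candidates = [u for u in urls if urlsplit(u).scheme.lower() in CRAWL_RANK]
    if not candidates:
        return None
    # min() keeps the first URL when several share the best rank.
    return min(candidates, key=lambda u: CRAWL_RANK[urlsplit(u).scheme.lower()])


print(pick_crawl_url([
    "ftp://mirror.example.org/fedora",
    "http://mirror.example.org/fedora",
]))
```

With this ordering, a mirror that offers both FTP and HTTP is already crawled over HTTP, so dropping its FTP URL would not change which method is picked.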

adrianreber Apr 12, 2016

Member

Debian is also planning to remove FTP mirrors:

https://lists.debian.org/debian-mirrors/2016/04/msg00000.html

adrianreber May 31, 2016

Member

All FTP URLs have been removed from Fedora's MirrorManager DB. Adding new FTP URLs to the DB is no longer possible. See commit above.
