List of mirror URLs for IERS download #20
Comments
Pinging @taldcroft, @mhvk, @eteq, and @adrn. Oopsie... turns out I did not have enough foresight when I implemented the mirror in astropy/astropy#8308 😬
Seems like a good idea. My only question is whether we are already doing something similar elsewhere, just to be sure we don't end up with two different mechanisms (but the suggested scheme sounds good).
At least as of today, the https:// URL (https://maia.usno.navy.mil/ser7/finals2000A.all) is required for access; http:// (without the 's') no longer responds.
@lpsinger, is it less flaky with the latest astropy dev, now that astropy/astropy#8993 is merged?
@pllim, no, it's still really flaky. Our project (growth-astro/growth-too-marshal) has to download the IERS files in its CI tests, and even with the https URL, it's failing more often than it is succeeding. I think that we really do need a list of fallback URLs. Also, astroplan needs to be able to use the fallbacks. |
Chiming in here as well (in addition to astropy/astroplan#356 (comment)). We have no problem providing our own mirror, mitigating the flakiness issue, but there still seem to be some consistency issues with how it is detected by the library. This might be specific to astroplan itself rather than astropy, but wanted to at least link the issues. |
I don't think that you should provide your own mirror unless you can guarantee that you can update it whenever a new copy is issued. |
Agreed with @lpsinger. I am also not sure it is really necessary to download IERS-A in every CI run: it is a hefty file, and why test someone else's server (or add unnecessary load to it)? It seems to me that for testing one's own code, it is much more important to check that a given state leads to the right outcome. Anyway, I'm still in favour of the simple solution proposed at the top: just make a list of URLs, starting with only those sites that already claim to provide up-to-date copies.
The issue with Astroplan is that it directly checks for the existence of the file in the cache, and the filename in the cache would depend on which URL is successfully downloaded. Can Astropy instead manage this? |
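To illustrate why a direct cache lookup becomes fragile, here is a minimal sketch that assumes a download cache keyed by a hash of the URL. The `cache_key` helper and the dict-based cache are hypothetical stand-ins, not astropy's actual implementation. The same file fetched from a fallback URL ends up under a different key, so a caller that only checks the key for the primary URL will not find it.

```python
import hashlib


def cache_key(url: str) -> str:
    """Hypothetical cache key: hex digest of the URL string."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


primary = "https://maia.usno.navy.mil/ser7/finals2000A.all"
fallback = "https://datacenter.iers.org/data/9/finals2000A.all"

# Simulate a cache that was populated from the fallback because the
# primary host was unreachable at download time.
cache = {cache_key(fallback): b"...file contents..."}

# A caller that only looks up the primary URL does not find the file,
# even though an identical copy is sitting in the cache.
found_via_primary = cache_key(primary) in cache
found_via_fallback = cache_key(fallback) in cache
print(found_via_primary, found_via_fallback)
```

This is why it would help for the library that performs the download to also own the lookup, rather than having downstream packages guess which URL succeeded.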
The issue for us is described in astropy/astroplan#410, namely that the military domains are banned from certain countries and the ftp download doesn't work with the astropy setup (because it is redirecting to ftps). The https://datacenter.iers.org/data/9/finals2000A.all should work but this limits us to only one server and as far as I can tell all of the URLs listed above are hosted on single-server instances that have proved to be far from reliable. It is trivial to host a 3MB file in a cloud-based CDN and it is trivial to set up something that keeps that file up to date (indeed, it took me five minutes using a combination of Google Cloud Storage, Cloud Functions, and Cloud Scheduler).
Yes, this is our bigger issue and a reliable global override of |
I'm also 👍 to the list of URLs solution as in the OP and @mhvk's comment.
In principle I agree, but logistically there's a lot more to it if it's supposed to be permanent. "Easy right now" is not the same thing as "easy to keep running for the next 10 years". After all, that explicitly is the purpose of the IERS itself (the service, not the file)! That said, combined with the list of URLs this isn't so bad, because it becomes one of several options and thus adds some fault tolerance.
While I think this PR is good, it doesn't help my problem as the last two urls are invalid and the top two urls are still military domains. How do I actually change the value of I've tried to follow the guides in the page on the config system but there is nothing about IERS in there and I can't seem to enter a value in any single file that will persist. The documentation is far from clear. Sorry for posting here, but we don't really need more IERS issues/PRs floating around. |
@wtgee , before I propose a solution for you, how do you envision you want to customize your URL? Is it per session? Are you okay with modifying Addendum: IERS_A can be changed using the config system, but not IERS_B. I don't know why. |
This is to be used on some telescopes that run in a mostly automated fashion (https://projectpanoptes.org). We control the full install of the machine and the software itself runs in Docker images, so I have no problem modifying any of the config files. We mostly have a single session (i.e. the control daemon) running the main telescope, but various other scripts (e.g. weather plotting, some processing) also need to access this data separately from the main control. I can easily have a cron job update the entire system every week (or whatever). I just don't want to have to manually specify the URL in every single script that imports astropy (mostly astroplan). This could also be related to @lpsinger's comment about astroplan: astropy/astroplan#356 (comment), which @bmorris3 is aware of.
As mentioned, our other issue is that we already have placements in a few countries that cannot access US military domains, so the primary and mirror URLs are a no-go for us (although this will be a minority of installs in the long run). As also mentioned, we have no problem setting up our own mirror of the data. The issue is just getting this to work globally in the software in a consistent fashion (as below).
I don't get how to do this permanently. I tried modifying the config file in various places but nothing seemed to persist permanently. Thanks! |
@wtgee , what version of |
Custom IERS A URLs per-session example
Disclaimer: The "mirror" URLs here are not real mirrors. Do not use them for production.
>>> from astropy.utils import iers
>>> from astropy.utils.iers import conf as iers_conf
>>> iers_conf.iers_auto_url
'https://maia.usno.navy.mil/ser7/finals2000A.all'
>>> iers_conf.iers_auto_url = 'https://astroconda.org/aux/astropy_mirror/iers_a_1/finals2000A.all'
>>> iers_conf.iers_auto_url_mirror = 'https://astroconda.org/aux/astropy_mirror/iers_a_2/finals2000A.all'
>>> table = iers.IERS_Auto.open() # Note the URL
Downloading https://astroconda.org/aux/astropy_mirror/iers_a_1/finals2000A.all
Custom IERS A URLs from config file example
Disclaimer: The "mirror" URLs here are not real mirrors. Do not use them for production.
And to test this properly, clear your cache if you want. Cache is in In your
Then, start a fresh session after you have modified the file above, or do the following to force a reload:
>>> from astropy.utils.iers import conf as iers_conf
>>> iers_conf.iers_auto_url
'https://maia.usno.navy.mil/ser7/finals2000A.all'
>>> iers_conf.reload()
>>> iers_conf.iers_auto_url
'https://astroconda.org/aux/astropy_mirror/iers_a_1/finals2000A.all'
Now
@lpsinger et al. , as a stop-gap solution that can be backported, how about changing the primary |
While I'm all about this in theory, that particular domain isn't working because of the https redirect. Following your example and setting both domains to ensure usage:
>>> from astropy.utils import iers
>>> from astropy.utils.iers import conf as iers_conf
>>> iers_conf.iers_auto_url
'https://datacenter.iers.org/data/9/finals2000A.all'
>>> iers_conf.iers_auto_url_mirror
'https://datacenter.iers.org/data/9/finals2000A.all'
>>> table = iers.IERS_Auto.open()
WARNING: failed to download https://datacenter.iers.org/data/9/finals2000A.all and https://datacenter.iers.org/data/9/finals2000A.all, using local IERS-B: HTTP Error 403: Forbidden;HTTP Error 403: Forbidden [astropy.utils.iers.iers]
>>> table # Note it is IERS_B
<IERS_B length=20405>
Edit: just a note that the above is not from one of the restricted-domain countries. The NASA ftp site used to have the same issues with an ftps redirect, but that does appear to be working for me now.
This works great (with my custom mirrors), thanks for both examples. FWIW, I just had The remaining issues are astroplan specific and should be solved by @lpsinger's astropy/astroplan#418 |
Great! Until someone has time to work on refactoring for List O' Mirrors support, at least this is not blocking your work anymore. 🤞 |
astropy/astropy#9182 allows you to call download_file with remote_url set to the "authoritative" location of the data, and whatever is downloaded will be stored in the cache under this URL. But you can provide a list of URLs from which that data can actually be obtained, in order, and it need not include the "authoritative" location. Does this solve your problem? It would be possible, in principle, to add a mechanism in the config file where users could specify translations for URLs - that is, a user who knew they couldn't, or might not be able to, access a certain URL could provide a list of backup URLs, and these would be used even for download calls internal to astropy. |
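The idea described above (cache under the "authoritative" URL, but fetch from an ordered list of sources) can be sketched without any network access. Everything here is hypothetical and for illustration only: `download_with_fallbacks`, its parameters, the dict-based cache, and `fake_fetch` are not astropy's API, and the real `download_file` may behave differently.

```python
from typing import Callable, Dict, Iterable


def download_with_fallbacks(
    authoritative_url: str,
    sources: Iterable[str],
    fetch: Callable[[str], bytes],
    cache: Dict[str, bytes],
) -> bytes:
    """Try each source in order; store the result under authoritative_url."""
    if authoritative_url in cache:
        return cache[authoritative_url]
    errors = []
    for url in sources:
        try:
            data = fetch(url)
        except OSError as exc:  # urllib's URLError and HTTPError derive from OSError
            errors.append(f"{url}: {exc}")
            continue
        # The cache key is always the authoritative URL, whichever source worked.
        cache[authoritative_url] = data
        return data
    raise OSError("all sources failed: " + "; ".join(errors))


def fake_fetch(url: str) -> bytes:
    """Stand-in downloader that simulates one unreachable host."""
    if "maia.usno.navy.mil" in url:
        raise OSError("HTTP Error 403: Forbidden")
    return b"IERS-A data"


cache: Dict[str, bytes] = {}
authoritative = "https://maia.usno.navy.mil/ser7/finals2000A.all"
mirrors = [authoritative, "https://datacenter.iers.org/data/9/finals2000A.all"]
data = download_with_fallbacks(authoritative, mirrors, fake_fetch, cache)
print(list(cache))
```

Because the key never changes, downstream code that checks the cache for the authoritative URL keeps working regardless of which source actually delivered the file.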
I'm not sure if this is the right thread, but there's a big red warning on both |
@bhazelton - It was briefly mentioned in a Slack thread, and we're asking about USNO recommendations, but it definitely deserves its own issue. Would you be interested in opening one?
sure, I wrote it up in astropy/astropy#9427. Feel free to add anything I missed. |
Should we move this to https://github.com/astropy/astropy-iers-data ? |
Perhaps, though if we add something, it likely still has to be documented in |
Transferred from astropy/astropy
Both of the USNO hosts (maia.usno.navy.mil, toshi.nofs.navy.mil) for retrieving the IERS Bulletin A dataset are flaky. Instead of a primary URL specified by the configuration setting utils.iers.iers_auto_url and a mirror URL specified by utils.iers.iers_auto_url_mirror, I suggest a single list of URLs.
Here is what I propose for backward compatibility:
1. Allow iers_auto_url and iers_auto_url_mirror to be either a string or a list of strings.
2. Form the list of URLs to try by concatenating iers_auto_url and iers_auto_url_mirror and removing duplicates.
3. Deprecate the iers_auto_url_mirror configuration setting for removal in a future release.
Here is a possible default URL list:
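As a rough sketch of the backward-compatible merging proposed above, the two settings could be normalized to lists, concatenated, and de-duplicated while preserving priority order. The helper names `as_list` and `combined_urls` are hypothetical, and the URLs are simply ones mentioned in this thread, not a recommended default list.

```python
from typing import List, Sequence, Union

UrlSetting = Union[str, Sequence[str]]


def as_list(value: UrlSetting) -> List[str]:
    """Accept either a single URL string or a sequence of URL strings."""
    return [value] if isinstance(value, str) else list(value)


def combined_urls(iers_auto_url: UrlSetting,
                  iers_auto_url_mirror: UrlSetting) -> List[str]:
    """Concatenate both settings and drop duplicates, keeping first occurrences."""
    merged = as_list(iers_auto_url) + as_list(iers_auto_url_mirror)
    # dict preserves insertion order, so earlier (higher-priority) URLs win.
    return list(dict.fromkeys(merged))


urls = combined_urls(
    "https://maia.usno.navy.mil/ser7/finals2000A.all",
    [
        "https://datacenter.iers.org/data/9/finals2000A.all",
        "https://maia.usno.navy.mil/ser7/finals2000A.all",
    ],
)
print(urls)
```

Existing configurations with one string per setting keep working unchanged, while new configurations can put the whole list in iers_auto_url.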