coordinates: ICRSCoordinates should accept coordinate strings with no space #1115
Comments
I think the best approach will be to create a new parser that imports productions from the existing angle parser. (I don't know if this is even possible with PLY). I'll have a look at it when I have a chance (getting out from under the matplotlib release candidate cycle). |
There was a long discussion about this at the time of the last coordinates revision - the agreement was to not put this in the So I agree with @mdboom that we will want to use a proper parser, but I think it probably should not go in |
Ah, I don't remember that discussion. What was the reasoning? Seems to me that, e.g. |
I'm actually rather neutral on the topic, but my understanding of the decision is that the class constructors were to have relatively straightforward syntax, without too much "magical" parsing - basically anything that can be just split off and delegated to the The idea, I think, was to avoid "parsing creep" - that is, there are certainly other slightly more obscure ways of writing down coordinates, and we don't want to just slowly add more and more formats until there's no distinction between the constructor and parser. Another thing that occurs to me now, though: What do we do if a user does |
My view on this as a non-astronomer is this: if there are file formats that include coordinates in this way (without the space and the J prefix etc), then we should try to support that. Otherwise, supporting Python/Numpy numbers is preferable to supporting all kinds of magical strings. Once we make a determination on that, I'm happy to look into building a coordinate parser if we decide to go that way. |
My similar but slightly different take on @mdboom's version is this: if there are papers that give coordinates in this way, then we should support it (in addition to file formats, as @mdboom says). By that metric this should be in there, because you do sometimes see "J061800.25+203142.5" or similar in a paper without the coordinates given in some other format (usually as a passing reference to some particular object, but still). But I would still say this should all go in the "higher-level" class we have on the drawing board for v0.3 (which would subsume the parsing function we discussed earlier), rather than at the level of |
Don't forget that format ambiguities are common in well-known catalogs, e.g. IGR J01234-2338 (is it 0h 12m 34s or 01h 23.4m? answer is the latter), or PKS 0123+082 (that's 8.2 degrees, not 82 degrees). I think the original case from @mdboom is unambiguous, but if you actually go through all papers and source catalogs in astronomy you will find that astronomers have been creative with source naming when it comes to including the coordinates in the name. This mostly applies to older formats with fewer digits as shown above. |
That's a good point, and is part of why I'm thinking this should go in the higher-level class/parser function. Then we have one place where you add code to do things like checking if "PKS" is at the start of the string, and if so, interpret the coordinates appropriately. Then the (likely rather confusing) parsing code can be kept separate of the actual coordinate-handling code. |
@adrn - this issue isn't relevant as-is, because ICRSCoordinates no longer exists. For 1.0, this sort of capability will probably be subsumed into But the issue still technically exists in that it is on |
I think we need a more general and pluggable infrastructure to handle parsing string coordinates. We already have the case of SunPy wanting to parse strings like |
I agree with @taldcroft here - this was one of the major motivations for That said, @mhvk may be right that the easiest way to proceed is to start with the parser and then add it to |
Closed by #2920 |
Sorry, this issue is not completely addressed by #2920 (there was a comment there that said it closed this issue). The strings here are still not parsed, and I'm not sure if we decided we should or not. There was a comment above by @taldcroft that we should support IAU names at least. Removing the 1.0 milestone. |
I've looked at these formats while doing #2920. The first one is easy; one really needs to use the unit, as both Regarding the second, I wasn't sure whether to include them with a hard-wired fixed length, as HHMMSS.ss+DDMMSS.s and (D)DDMMSS.ss+DDMMSS.s. The current example is OK, but it can be quite ambiguous when one says btw is there an official IAU format/name list somewhere? I only found examples of supported and avoidable formats, but nothing comprehensive. |
@bsipocz @astrofrog This IAU spec document states that RA is always specified in HMS, so now I don't think there is a unit ambiguity. In this document, it doesn't look like there is much flexibility and this could just be implemented as a regex... The relevant text from the linked document:
I think this regex captures everything outlined in the document:
e.g., try it on these examples:
J061800.25+203142.5
SDSS J061800.25+203142.5
RX J1426.8+6950
PSR J1302-6350
PN G001.2-00.3
QSO B004848-4242.8
QSO B004848-4242
SDSS 061800.25+203142.5
B004848-4242.8
061800.25+203142.5
and these should fail:
QSO 00484-4242
GRO J317-85 |
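The regex from this comment is not reproduced in the thread as captured here. For readers who want something concrete, the following is a hypothetical sketch written only to accept and reject the examples listed above. The name `IAU_NAME`, the helper `looks_like_iau_name`, and the group names are all made up for illustration; this is not the commenter's pattern, not astropy code, and it deliberately handles only even digit counts for equatorial positions, so it does not cover the full IAU specification.

```python
import re

# Hypothetical pattern: reproduces only the accept/reject lists in the comment above.
IAU_NAME = re.compile(
    r"""
    (?:(?P<acronym>[A-Za-z]+)\s+)?                # optional catalogue acronym, e.g. "SDSS"
    (?:
        (?P<flag>[JB])?                           # optional equinox flag (J or B)
        (?P<ra>\d{4}(?:\d{2})?(?:\.\d+)?)         # HHMM or HHMMSS, optional decimals
        (?P<sign>[+-])
        (?P<dec>\d{2}(?:\d{2}){0,2}(?:\.\d+)?)    # DD, DDMM or DDMMSS, optional decimals
      |
        G(?P<lon>\d{3}(?:\.\d+)?)                 # galactic longitude, LLL.l
        (?P<bsign>[+-])
        (?P<lat>\d{2}(?:\.\d+)?)                  # galactic latitude, BB.b
    )
    """,
    re.VERBOSE,
)


def looks_like_iau_name(name: str) -> bool:
    """Return True if the whole string matches the illustrative pattern."""
    return IAU_NAME.fullmatch(name) is not None
```

With this sketch, `looks_like_iau_name("RX J1426.8+6950")` is true, while `looks_like_iau_name("GRO J317-85")` is false because the RA field has fewer than four digits.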
Yes, we should still have a parser! But be sure to also capture odd numbers of digits to get coordinates like
|
I'll have to look into implementing this in a parser -- I have no experience with that and don't understand @mdboom's original comment!
|
@adrn - Not before the weekend, but then I probably can do it. |
Pinging on this! @bsipocz I can give it a try if you are swamped. |
I'm good with the general idea of this now. I just have to laugh/cry when I see the contortions that ensue from this insistence on sexagesimal. Is it the intent here to fully implement this IAU spec?

Coordinates using an even number of digits (in either RA or Dec), fewer than seven, are expressed in the sexagesimal system. The sequences HHMM.mm or DD.dd where mm and dd are decimal parts of a minute or degree, respectively, should be avoided. If the number of digits is odd and fewer than six, the right-most digit represents a decimal part of hours, degrees or minutes (as, e.g., in the PKS-style HHMM+DDd or in IRAS source designation HHMMm+DDMM) and not tens of minutes or seconds (e.g., the formats HHMMS or +DDM should be avoided). If the number of digits is more than six, the digits in excess of six are decimal parts of seconds of time for RA or of angle for Dec; explicit use of the decimal points is encouraged (e.g., HHMMSS.ss or DDMMSS.s). |
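To make the digit-count rules in the quoted text concrete, here is a minimal sketch of how a single unsigned RA or Dec field could be turned into a decimal value in its leading unit (hours for RA, degrees for Dec). The function name `iau_field_to_value` is hypothetical and this is not astropy's implementation; sign handling, range checks, and galactic coordinates are left out, and an odd digit count combined with an explicit decimal point is rejected as outside the sketch.

```python
def iau_field_to_value(field: str) -> float:
    """Interpret an unsigned IAU-style digit field, e.g. '01234' or '061800.25'.

    Returns the value in the leading unit (hours for RA, degrees for Dec).
    Hypothetical helper for illustration only.
    """
    whole, _, frac = field.partition(".")
    if not whole.isdigit() or (frac and not frac.isdigit()):
        raise ValueError(f"not a digit field: {field!r}")

    if len(whole) > 6:
        # Digits beyond the sixth are decimal parts of seconds.
        whole, frac = whole[:6], whole[6:] + frac
    elif len(whole) % 2 == 1:
        # Odd and fewer than six: the right-most digit is a decimal part
        # of the last unit (e.g. HHMMm or DDd).
        if frac:
            raise ValueError(f"odd digit count with a decimal point: {field!r}")
        whole, frac = whole[:-1], whole[-1]

    if not whole:
        raise ValueError(f"too few digits: {field!r}")

    # Split into two-digit sexagesimal components; the last one carries the decimals.
    pairs = [whole[i:i + 2] for i in range(0, len(whole), 2)]
    components = [float(p) for p in pairs[:-1]]
    components.append(float(pairs[-1] + "." + (frac or "0")))

    return sum(c / 60 ** i for i, c in enumerate(components))
```

Applied to the catalogue names mentioned earlier in the thread, `iau_field_to_value("01234")` reads the RA of IGR J01234-2338 as 01h 23.4m, and `iau_field_to_value("082")` reads the Dec of PKS 0123+082 as 8.2 degrees.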
The contortions mostly come about because both radio and X-ray astronomers had so many coordinates of poor quality... Anyway, I do think if we implement the specification, we might as well implement it fully. (Though we cannot avoid that, surely, people regularly will continue to assume coordinates are rounded rather than truncated...) |
Reviving this thread! @bsipocz are you still interested in trying this? |
@adrn - Indeed, this has slipped through multiple times. I'll try to get back to it this week. (And I wish GitHub would finally introduce a personal todo list to help avoid cases like this...) |
👋 O hai @bsipocz |
ouch |
This now works through |
The ICRSCoordinates initializer should probably accept:
Right now these fail for a number of reasons:
I started working on fixing this, but got lost in the "new" pyparsing code...