-
-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes MissingSchema exception when doing a fetch_result() in a TAP async job with a relative result URL #192
Conversation
…ync job with a relative result URL. Some TAP services return a JobSummary containing relative URLs for results, leading to an exception like the following: `MissingSchema: Invalid URL './21530/./results/result': No schema supplied. Perhaps you meant http://./21530/./results/result?` If the result URL is relative, we have to prepend the async TAP request URL to it before fetching the results. Fixes astropy#191.
That looks very reasonable, and I like it. I don't think the build errors are your fault - one is the docs build (there's another PR open about that) and the other one looks like a votable change which also seems unrelated. I'm going to wait on getting any answers back from the list if this is part of the spec or not, and I might ask you to rebase this after I fix some more of the issues to get your green check. Thanks much for the PR! |
On Thu, Oct 31, 2019 at 03:16:26PM -0700, Christine Banek wrote:
That looks very reasonable, and I like it. I don't think the build
errors are your fault - one is the docs build (there's another PR
open about that) and the other one looks like a votable change
which also seems unrelated. I'm going to wait on getting any
answers back from the list if this is part of the spec or not, and
I might ask you to rebase this after I fix some more of the issues
to get your green check.
Well... I'm a big fan of the golden rule of interoperability ("Be
strict in what you produce and lenient in what you accept"), so I'll
certainly not veto the PR. On the other hand, I've just argued on
the IVOA grid list
(http://mail.ivoa.net/pipermail/grid/2019-November/003041.html) that
allowing relative URIs in UWS results has a couple of pitfalls. Since
I can't see a strong reason to allow them, I'd therefore rather just
outlaw them.
*If* people agree with me on grid@, I'd say relative URIs (when we
accept them in the spirit of the golden rule) should raise a warning
-- you see, I hope that one of these days implementors will use pyVO
to validate their implemenations, and then it shouldn't be our fault
if they don't realise they're violating the spec.
…-- Markus
|
I wrote to the sysadmin at CEFCA, and they promptly changed the returned data to contain absolute URLs. They also fixed a lot of other issues regarding to VO standards I reported. As commented by @msdemlei, I believe this should be correct course of actions when there are discrepancies between implementations. |
Apologies for languishing on this one for a while. I'm going to get some more time to work on pyvo issues next week. I think the general consensus is that the UWS spec does allow for relative URLs, but not in a particularly explicit way. To quote @bmajor from his excellent email: I don't see text in the spec that would indicate that relative URLs are not allowed either, and they are legitimate values for 'anyURI'. https://www.w3.org/TR/xmlschema11-2/#anyURI But since they're not mentioned in UWS I'd say the regular rules determining the base of relative URLs should be applied (https://www.w3.org/TR/WD-html40-970917/htmlweb.html section 5.1.2), in which case they would be relative to the result-list endpoint. One would get the result that contains the relative URL from: /{jobs}/{job-id}/results/{myresult} Which would contain (for example) ./rel-url/to/result And resolve to /{jobs}/{job-id}/results/rel-url/to/result This seems to agree with the sentence in section 2.2.2.3 of UWS "A sensible default for their URIs is to make them children of /{jobs}/{job-id}/results, but this is not required.". And that all makes sense to me, although it seems like I would rather not allow relative URLs, for the reasons that @msdemlei mentions are some of the pitfalls, such as storing UWS results and then trying to look them up again, and forgetting where the base part of the URL is where the result was gotten from (therefore, being unable to reconstruct the full URL). Pyvo doesn't really support this as a feature though, and I'm less worried about some of these because inevitably, results won't live forever either. They will be archived or deleted after a while, so the URL will almost certainly never last forever. Now to stop arguing with myself, I think the right thing to do is allow the relative URLs, since it is technically allowed, and at least some people are probably doing this, as you were. But maybe add a log message to say that this isn't exactly recommended. I don't know what log zone to put this in, as messages that check for protocol errors that won't be fixed in the near term can be annoying. Perhaps we could create a new zone that is for protocol uneasyness - times where pyvo can work around questionable server or protocol usage, but allowing people to know. Of course, the question of should we change the UWS spec to outlaw or not recommend is another matter that can be left to another day. Again, sorry for the delay on this one. |
Okay sorry for the delay again, if you could rebase on top of the current master, you should be able to get the green build. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code addition works with a service supplying a relative result.href
and the changes to result_uri
look good.
The self._job.result_uris
will still be relative in this case, but these are not used for TAP as only a single result is expected. It would be nice to apply the fix to ensure result_uris
are a list of absolute URLs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the previous discussion, I think this is good to go. The comment about result_uris()
is a good cleanup item, but I'm convinced it can be addressed later since the behavior for that is no different than before, and I don't think it's likely to be called soon. We can open an issue for that.
Hello,
You have emailed the wrong person. I have the @bmajor GitHub account
…On Thu, Dec 12, 2019, 2:24 PM Tom ***@***.***> wrote:
Merged #192 <#192> into master.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#192?email_source=notifications&email_token=AAYSSY7CILVEXU25IWKW73DQYK22ZA5CNFSM4JHTEIE2YY3PNVWWK3TUL52HS4DFWZEXG43VMVCXMZLOORHG65DJMZUWGYLUNFXW5KTDN5WW2ZLOORPWSZGOVORQRXI#event-2879588573>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAYSSY6LZYN3JTESN6PBDDLQYK22ZANCNFSM4JHTEIEQ>
.
|
Some TAP services return a JobSummary containing relative URLs for results, leading to an exception like the following:
MissingSchema: Invalid URL './21530/./results/result': No schema supplied. Perhaps you meant http://./21530/./results/result?
If the result URL is relative, we have to prepend the async TAP request URL to it before fetching the results.
Fixes #191.