Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes MissingSchema exception when doing a fetch_result() in a TAP async job with a relative result URL #192

Merged
merged 1 commit into from
Dec 12, 2019

Conversation

streeto
Copy link
Contributor

@streeto streeto commented Oct 31, 2019

Some TAP services return a JobSummary containing relative URLs for results, leading to an exception like the following:

MissingSchema: Invalid URL './21530/./results/result': No schema supplied. Perhaps you meant http://./21530/./results/result?

If the result URL is relative, we have to prepend the async TAP request URL to it before fetching the results.

Fixes #191.

…ync job with a relative result URL.

Some TAP services return a JobSummary containing relative URLs for results, leading to an exception like the following:

`MissingSchema: Invalid URL './21530/./results/result': No schema supplied. Perhaps you meant http://./21530/./results/result?`

If the result URL is relative, we have to prepend the async TAP request URL to it before fetching the results.

Fixes astropy#191.
@cbanek
Copy link
Contributor

cbanek commented Oct 31, 2019

That looks very reasonable, and I like it. I don't think the build errors are your fault - one is the docs build (there's another PR open about that) and the other one looks like a votable change which also seems unrelated. I'm going to wait on getting any answers back from the list if this is part of the spec or not, and I might ask you to rebase this after I fix some more of the issues to get your green check.

Thanks much for the PR!

@cbanek cbanek added this to the v1.1 milestone Oct 31, 2019
@msdemlei
Copy link
Contributor

msdemlei commented Nov 4, 2019 via email

@streeto
Copy link
Contributor Author

streeto commented Nov 7, 2019

I wrote to the sysadmin at CEFCA, and they promptly changed the returned data to contain absolute URLs. They also fixed a lot of other issues regarding to VO standards I reported. As commented by @msdemlei, I believe this should be correct course of actions when there are discrepancies between implementations.

@cbanek
Copy link
Contributor

cbanek commented Nov 8, 2019

Apologies for languishing on this one for a while. I'm going to get some more time to work on pyvo issues next week.

I think the general consensus is that the UWS spec does allow for relative URLs, but not in a particularly explicit way. To quote @bmajor from his excellent email:

I don't see text in the spec that would indicate that relative URLs are not allowed either, and they are legitimate values for 'anyURI'.

https://www.w3.org/TR/xmlschema11-2/#anyURI

But since they're not mentioned in UWS I'd say the regular rules determining the base of relative URLs should be applied (https://www.w3.org/TR/WD-html40-970917/htmlweb.html section 5.1.2), in which case they would be relative to the result-list endpoint. One would get the result that contains the relative URL from:

/{jobs}/{job-id}/results/{myresult}

Which would contain (for example) ./rel-url/to/result

And resolve to

/{jobs}/{job-id}/results/rel-url/to/result

This seems to agree with the sentence in section 2.2.2.3 of UWS "A sensible default for their URIs is to make them children of /{jobs}/{job-id}/results, but this is not required.".

And that all makes sense to me, although it seems like I would rather not allow relative URLs, for the reasons that @msdemlei mentions are some of the pitfalls, such as storing UWS results and then trying to look them up again, and forgetting where the base part of the URL is where the result was gotten from (therefore, being unable to reconstruct the full URL).

Pyvo doesn't really support this as a feature though, and I'm less worried about some of these because inevitably, results won't live forever either. They will be archived or deleted after a while, so the URL will almost certainly never last forever.

Now to stop arguing with myself, I think the right thing to do is allow the relative URLs, since it is technically allowed, and at least some people are probably doing this, as you were. But maybe add a log message to say that this isn't exactly recommended. I don't know what log zone to put this in, as messages that check for protocol errors that won't be fixed in the near term can be annoying. Perhaps we could create a new zone that is for protocol uneasyness - times where pyvo can work around questionable server or protocol usage, but allowing people to know.

Of course, the question of should we change the UWS spec to outlaw or not recommend is another matter that can be left to another day.

Again, sorry for the delay on this one.

@cbanek
Copy link
Contributor

cbanek commented Nov 13, 2019

Okay sorry for the delay again, if you could rebase on top of the current master, you should be able to get the green build.

Copy link
Contributor

@stargaser stargaser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code addition works with a service supplying a relative result.href and the changes to result_uri look good.

The self._job.result_uris will still be relative in this case, but these are not used for TAP as only a single result is expected. It would be nice to apply the fix to ensure result_uris are a list of absolute URLs.

Copy link
Contributor

@tomdonaldson tomdonaldson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the previous discussion, I think this is good to go. The comment about result_uris() is a good cleanup item, but I'm convinced it can be addressed later since the behavior for that is no different than before, and I don't think it's likely to be called soon. We can open an issue for that.

@tomdonaldson tomdonaldson merged commit 78e3d52 into astropy:master Dec 12, 2019
@bmajor
Copy link

bmajor commented Dec 12, 2019 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TAPService run_async() fails when the result URI is relative
6 participants