import-url: allow queries in URL #3432
This basically breaks … You should either add full support for params or work around it. If you are going to implement it, then you should keep the old class for anything but HTTP URLs, for efficiency.
OK, so this is still a bug then. We need to support queries because some sites need them for downloadable links (I seem to recall Dropbox, for example, returns an HTML page if you leave out …
    obj.fill_parts(scheme, host, user, port, path)
    obj.params = params
    obj.query = query
    obj.fragment = fragment
I didn't bother overriding fill_parts as it's not used publicly anywhere else
Then don't. It is used via .replace(), .__div__(), and .parent. If the inherited implementation works, you may skip it, though.
Those use from_parts (which would use this overridden method). The point is nothing else explicitly uses fill_parts directly so there's no need to override fill_parts for now.
You keep saying you don't need to override, but you do override. Not sure I understand what you are up to here.
I mean technically the correct way to do things would be:
    def fill_parts(self, scheme, host, user, port, path, params, query, fragment):
        super().fill_parts(scheme, host, user, port, path)
        self.params = params
        self.query = query
        self.fragment = fragment
but this isn't required right now as fill_parts is really a private method not used anywhere else.
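For reference, the override pattern discussed in this thread can be sketched with toy classes. BaseInfo and HTTPInfo below are hypothetical stand-ins, not DVC's actual URLInfo classes, and the default values for the extra arguments are an assumption. The sketch shows that a from_parts-style constructor picks up the overridden fill_parts, and that super() already binds self:

```python
# Toy stand-ins for illustration only; these are NOT DVC's real classes.
class BaseInfo:
    def fill_parts(self, scheme, host, user, port, path):
        self.scheme = scheme
        self.host = host
        self.user = user
        self.port = port
        self.path = path

    @classmethod
    def from_parts(cls, *args, **kwargs):
        # Build an instance without __init__ and delegate to fill_parts,
        # so a subclass override of fill_parts is used automatically.
        obj = cls.__new__(cls)
        obj.fill_parts(*args, **kwargs)
        return obj


class HTTPInfo(BaseInfo):
    def fill_parts(self, scheme, host, user, port, path,
                   params="", query="", fragment=""):
        # super() already binds self, so self must not be passed again.
        super().fill_parts(scheme, host, user, port, path)
        self.params = params
        self.query = query
        self.fragment = fragment


info = HTTPInfo.from_parts("https", "example.com", None, None, "/file", query="x=1")
print(info.path, info.query)  # /file x=1
```

With this shape, anything that goes through from_parts keeps the extra components, which is the case both sides of the thread care about.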
Slightly more thorough testing
@shcheklein / @jorgeorpinel / @Suor ready for review
Suor left a comment
Some simplifications are possible. Sorry for late review.
    def __init__(self, url):
        p = urlparse(url)
        stripped = p._replace(params=None, query=None, fragment=None)
        super().__init__(stripped.geturl())
Why do we use restringification? May use .from_parts() or .fill_parts().
It is more future-proof to call super() and reuse the parent's logic. Otherwise we'd need
     p = urlparse(url)
    -stripped = p._replace(params=None, query=None, fragment=None)
    -super().__init__(stripped.geturl())
    +assert p.password is None
    +self.fill_parts(p.scheme, p.hostname, p.username, p.port, p.path)
and not call super(), which is allowed but not great practice.
First, we can ignore all of this as long as it works. Some tech debt, but might be resolved later.
Needing to parse, restringify and parse is a kludge.
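The two constructor styles debated in this thread can be compared with a minimal standard-library sketch. The helper names split_extras and direct_parts and the example URL are hypothetical, and empty strings are used instead of None when stripping components, which is an assumption made here for portability across Python versions:

```python
from urllib.parse import urlparse


def split_extras(url):
    """Re-stringify approach: parse, blank out the extras, rebuild the URL."""
    p = urlparse(url)
    stripped = p._replace(params="", query="", fragment="")
    # The stripped string would then be parsed again by a parent constructor.
    return stripped.geturl(), (p.params, p.query, p.fragment)


def direct_parts(url):
    """Direct approach: parse once and hand the components over as-is."""
    p = urlparse(url)
    return p.scheme, p.hostname, p.username, p.port, p.path


url = "https://example.com/data/file.csv?token=abc#top"
base, extras = split_extras(url)
print(base)               # https://example.com/data/file.csv
print(extras)             # ('', 'token=abc', 'top')
print(direct_parts(url))  # ('https', 'example.com', None, None, '/data/file.csv')
```

The first helper parses the URL and expects it to be parsed again downstream, which is the "parse, restringify and parse" cost mentioned above; the second avoids that but bypasses whatever the parent constructor does.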
Looks like disallowing URLs containing queries happened in b83564d (#2707), i.e. dvc>0.66.1... Not sure why. Seems fine to me to remove the assert not p.query restriction.
URLInfo for http URLs (via subclass HTTPURLInfo(URLInfo))
dvc import-url -v https://www.dropbox.com/?test data