Correctly handle GCS paths that contain '?' char #460

Merged
mpenkov merged 1 commit into piskvorky:master on Apr 4, 2020

Conversation

chakruperitus
Contributor

@chakruperitus chakruperitus commented Apr 4, 2020

Correctly handle GCS paths that contain '?' char

Motivation

GCS (and S3) object paths may contain the '?' character. urllib.parse.urlsplit treats '?' as the start of the query string and splits the path there, so files with '?' in their path could not be opened through the GCS implementation. The S3 implementation already handled this correctly.
This PR moves the existing utility function from s3 to utils so that both the s3 and gcs implementations can call it.
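
For readers unfamiliar with the problem, here is a minimal sketch of the behaviour described above and of one way to work around it. It is not the code from this PR: the function name `safe_urlsplit_sketch`, the placeholder string, and the example URL are assumptions made for illustration only.

```python
import urllib.parse

# Hypothetical placeholder; assumed never to occur in a real object path.
_QUESTION_MARK_PLACEHOLDER = "__QUESTION_MARK__"


def safe_urlsplit_sketch(url):
    """Split a URL while keeping any '?' as part of the path.

    Illustrative only: the '?' characters are masked before calling
    urlsplit and restored in the resulting path afterwards.
    """
    masked = url.replace("?", _QUESTION_MARK_PLACEHOLDER)
    parts = urllib.parse.urlsplit(masked)
    restored_path = parts.path.replace(_QUESTION_MARK_PLACEHOLDER, "?")
    return parts._replace(path=restored_path)


url = "gs://bucket/dir/file?.txt"  # made-up example path

# Plain urlsplit cuts the path at '?' and moves the rest into the query.
naive = urllib.parse.urlsplit(url)
print(naive.path, naive.query)  # /dir/file .txt

# The sketch keeps the '?' inside the path.
safe = safe_urlsplit_sketch(url)
print(safe.netloc, safe.path)  # bucket /dir/file?.txt
```

Masking and restoring the character keeps the rest of urlsplit's parsing (scheme, bucket name) intact, which is why a shared helper of this kind can serve both the s3 and gcs code paths.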

Checklist

Before you create the PR, please make sure you have:

  • Picked a concise, informative and complete title
  • Clearly explained the motivation behind the PR
  • Linked to any existing issues that your PR will be solving
  • Included tests for any new functionality
  • Checked that all unit tests pass

Move _safe_urlsplit to utils and use it from the s3 and gcs
implementations to support paths with the '?' character
@mpenkov
Collaborator

mpenkov commented Apr 4, 2020

Good catch, thank you! And congratulations on your first PR to smart_open! 🥇

@mpenkov mpenkov merged commit ea45989 into piskvorky:master Apr 4, 2020
Development

Successfully merging this pull request may close these issues.

Cannot read files with '?' character in the path when using google storage
2 participants