Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correctly encode URL strings before use by Fetch URL #6140

Open
tfmorris opened this issue Nov 5, 2023 · 3 comments · May be fixed by #6142
Open

Correctly encode URL strings before use by Fetch URL #6140

tfmorris opened this issue Nov 5, 2023 · 3 comments · May be fixed by #6142
Assignees
Labels
fetch urls About fetching URLs in a project Type: Feature Request Identifies requests for new features or enhancements. These involve proposing new improvements.
Milestone

Comments

@tfmorris
Copy link
Member

tfmorris commented Nov 5, 2023

Currently users are required to know what URLs might need encoding and correctly encode them before using them with the "Edit Column" -> "Add column by fetching URLs..." operation. I can't think of a circumstance where the user benefits from using the string directly, as is, so I would suggest that we attempt to encode it as a URL in the same way that a web browser would for addresses entered in its browser bar.

Proposed solution

Before attempting to use strings as URLs for the Fetch URL operation, encode them in the same way the a web browser would for text that a user typed/pasted into the browser address bar. These strings are cell values as transformed by the GREL expression the user enters (which defaults to simply value).

Alternatives considered

  • Leaving it as is, which seems sub-optimal
  • It may be desirable to have this be optional and switched on by default, so that users can disable it if need be, just to be extra cautious
@tfmorris tfmorris added Type: Feature Request Identifies requests for new features or enhancements. These involve proposing new improvements. Status: Pending Review Indicates that the issue or pull request is awaiting review by project maintainers or collaborators labels Nov 5, 2023
@tfmorris tfmorris self-assigned this Nov 5, 2023
@thadguidry
Copy link
Member

thadguidry commented Nov 5, 2023

@tfmorris Thanks for writing this issue. I would like to see the problem stated or edited so that it's more clear of the direct issue and which function or feature of OpenRefine. In my mind, users currently can use Add Column based on... and then use GREL .escape("url") correct? Then given that method of first creating URLs in that manner, then secondary running Fetch on that created column of escaped/encoded URLs, then it would be the same as browsers perform, correct or not quite?

@thadguidry
Copy link
Member

thadguidry commented Nov 5, 2023

Rereading your other created issues, so maybe proposal should begin with:

Change the internal Fetch function so that it encodes the value...

Which will bring clarity to WHAT and WHERE things are proposing to change. (I realize of course you likely were head down typing all these issues up quickly and assumed more but let's be explicit to help any new contributors looking at these, no matter self-assigned or not as a best practice; and thanks!)

@tfmorris tfmorris added the fetch urls About fetching URLs in a project label Nov 6, 2023
@tfmorris
Copy link
Member Author

tfmorris commented Nov 6, 2023

@thadguidry Even if value.escape("url") worked (which it doesn't), my assertion is that the user shouldn't have to know what values need escaping or how they need to be escaped (which is complex).

I did some copy editing to the issue to hopefully make it clearer. Let me know if it's still too confusing.

tfmorris added a commit to tfmorris/OpenRefine that referenced this issue Nov 6, 2023
tfmorris added a commit to tfmorris/OpenRefine that referenced this issue Nov 8, 2023
@tfmorris tfmorris added this to the 3.8 milestone Nov 15, 2023
tfmorris added a commit to tfmorris/OpenRefine that referenced this issue Nov 26, 2023
@tfmorris tfmorris removed the Status: Pending Review Indicates that the issue or pull request is awaiting review by project maintainers or collaborators label Jan 5, 2024
tfmorris added a commit to tfmorris/OpenRefine that referenced this issue Jan 5, 2024
tfmorris added a commit to tfmorris/OpenRefine that referenced this issue Feb 2, 2024
@tfmorris tfmorris modified the milestones: 3.8, 3.9 May 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fetch urls About fetching URLs in a project Type: Feature Request Identifies requests for new features or enhancements. These involve proposing new improvements.
Projects
None yet
2 participants