Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pasteHandler() incorrectly treats some strings as links #46287

Open
kadamwhite opened this issue Dec 2, 2022 · 4 comments
Open

pasteHandler() incorrectly treats some strings as links #46287

kadamwhite opened this issue Dec 2, 2022 · 4 comments
Labels
[Feature] Raw Handling Related to the ability to convert content to blocks, paste handling, etc [Type] Bug An existing feature does not function as intended

Comments

@kadamwhite
Copy link
Contributor

kadamwhite commented Dec 2, 2022

Description

When selecting some content and trying to paste other content into its place, certain strings are interpreted as links which should not be. This causes the selected text to be converted to an invalid link, instead of replacing the selected text with the clipboard contents.

This occurs because the check for whether a text isURL is to run it through the browser's URL constructor, and any non-error result is treated as proof the string is a URL. Passing a string formatted Word: Other Words into this constructor results in a valid URL object, with not-really-URL contents. For example the string Movie: The Revenge of Sequels II in the example below gets parsed as such:

URL {
    href: "movie: The Revenge of Sequels II"
    origin: "null"
    pathname: " The Revenge of Sequels II"
    protocol: "movie:"
}

movie: is interpreted as a protocol, and The Revenge of Sequels II is treated as the pathname. (Note the leading space in that string.) This is valid on the part of URL, but the likelihood of a user consciously pasting a valid URL with a leading space in the path name seems excruciatingly low.

I propose that we should also check the first character of the pathname on the generated URL object, and assume the content is not a URL if that character is whitespace:

diff --git a/packages/url/src/is-url.js b/packages/url/src/is-url.js
index 1fda847947..34fcc30a2d 100644
--- a/packages/url/src/is-url.js
+++ b/packages/url/src/is-url.js
@@ -17,8 +17,8 @@ export function isURL( url ) {
 	// A URL can be considered value if the `URL` constructor is able to parse
 	// it. The constructor throws an error for an invalid URL.
 	try {
-		new URL( url );
-		return true;
+		const parsed = new URL( url );
+		return parsed.pathname[ 0 ] !== ' ';
 	} catch {
 		return false;
 	}

It's possible I am overlooking some category of URLs where a leading space in the pathname would still be valid, but this heuristic works with every link type I can think of.

Step-by-step reproduction instructions

  1. Insert a text block (paragraph, heading, etcetera)
  2. Fill that block with the text "Replace me"
  3. Copy the string Movie: The Revenge of Sequels II to the clipboard
  4. Select the text "Replace me" in the editor
  5. Paste the copied text (Movie: The Revenge of Sequels II) over the text "Replace me"
  6. Observe that, instead of being replaced, "Replace me" is converted into a hyperlink with the destination Movie: The Revenge of Sequels II

Screenshots, screen recording, code snippet

issue reproduction

Environment info

  • WordPress 6.1
  • No plugins installed
  • macOS 12.6.1, tested in Chrome and Firefox

Please confirm that you have searched existing issues in the repo.

Yes

Please confirm that you have tested with all plugins deactivated except Gutenberg.

Yes

@Thelmachido
Copy link

I can replicate this issue

WordPress 6.1.1
Gutenberg 14.6.1
TT3 Theme

@Thelmachido Thelmachido added [Block] Paragraph Affects the Paragraph Block [Type] Bug An existing feature does not function as intended labels Dec 6, 2022
@t-hamano
Copy link
Contributor

t-hamano commented Dec 10, 2022

I think protocols are not limited to http or https. Therefore, I think it is correct behavior that isURL() determines movie: as a URL with a custom protocol.

Also, the code you suggested determines that URLs with protocols such as mailto: and tel: are not URLs. My sense is that many users expect these URLs to be pasted as links.

runkit

However, since I think few users will paste URLs with irregular custom protocols, it might be a good idea to determine major protocols as correct URLs.

@t-hamano t-hamano added [Feature] Raw Handling Related to the ability to convert content to blocks, paste handling, etc and removed [Type] Bug An existing feature does not function as intended [Block] Paragraph Affects the Paragraph Block labels Dec 10, 2022
@t-hamano
Copy link
Contributor

I noticed that the same issue was discussed in #24895. And PR #28534 is the solution to this problem.

Also, as noted in this comment, an approach using isValidHref function seems to have been attempted.

@kadamwhite
Copy link
Contributor Author

@t-hamano Thank you for finding the older issue PR, I had not been able to locate that in my own searching!

I do believe that it's unusual for an href to have a space in it. Apple's documentation around tel: links, for example, gives all examples in the format tel:1-.... where there is no space after the protocol. The linked PR by @mrclay checks /^\s/.test( parsed.pathname ) and assumes any pathname starting with whitespace is not valid, which matches my own assumptions.

This issue definitely describes the same problem as #24895.

@jordesign jordesign added the [Type] Bug An existing feature does not function as intended label Jul 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Feature] Raw Handling Related to the ability to convert content to blocks, paste handling, etc [Type] Bug An existing feature does not function as intended
Projects
None yet
Development

No branches or pull requests

4 participants