Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Short URL converted to lowercase when custom protocol is used #2876

Closed
4 tasks done
Dreamerset opened this issue Mar 18, 2021 · 8 comments
Closed
4 tasks done

Short URL converted to lowercase when custom protocol is used #2876

Dreamerset opened this issue Mar 18, 2021 · 8 comments
Assignees
Labels
bug Something isn't working

Comments

@Dreamerset
Copy link

Dreamerset commented Mar 18, 2021

  • I've read the Troubleshooting First Steps
  • This request isn't a duplicate of an existing issue, opened or closed
  • I've read the docs and followed them (if applicable)
  • This is not a personal support request that should be posted on the YOURLS Discourse community

Describe the bug
Short URL for custom protocols are converted to lowercase

To Reproduce
Prerequisites:
Make sure your config.php includes:

/** URL shortening method: 36 or 62 */    
define( 'YOURLS_URL_CONVERT', 62 ); 

and

/** Allow more protocols if needed here. */  
$yourls_allowedprotocols = array( 'http://', 'https://', 'custom://' );
  1. Go to https://my_short_url.domain/admin/index.php
  2. Shorten: custom://SomeText to https://my_short_url.domain/SomeText

Expected behavior
The resulted short URL is: https://my_short_url.domain/SomeText
The resulted original URL is: custom://SomeText

Actual behavior
The resulted short URL is: https://my_short_url.domain/SomeText
The resulted original URL is: custom://sometext

Versions
YOURLS v 1.8.1

@Dreamerset Dreamerset added the bug Something isn't working label Mar 18, 2021
@dgw dgw changed the title Short URL only in small caps when custom protocol is used Short URL converted to lowercase when custom protocol is used Mar 18, 2021
@ozh ozh self-assigned this Mar 20, 2021
@ozh
Copy link
Member

ozh commented Mar 21, 2021

First of all, thanks for taking the time to file details. Much appreciated when we can simply read an issue rather than try to understand it :)

Regarding the issue itself : indeed, and this is intended behavior. The URL is not converted to lowercase -- just its "authority" part.

A "URI" has various components, see Uniform_Resource_Identifier#Syntax : scheme:[//authority]path[?query][#fragment]

When the //authority part is present, it is case insensitive, see https://tools.ietf.org/html/rfc3986#section-3.2.2
For instance:

In your example, custom://SomeText and custom://sometext should be the same, so we're lowercasing it to avoid considering same URI as different entries (eg submit http://github.com and then http://GitHub.com, it should state that the URL is already existing).

In your example again, custom://this/SoMeTeXt and custom://this/sometext will be correctly stored, the part of the URI after the authority being preserved as case sensitive (SoMeTeXt and sometext)

If you want to know more about how and why it works this way:

What was your actual use case ?

@ozh ozh closed this as completed Mar 21, 2021
@Dreamerset
Copy link
Author

Thank you for your detailed response.
Our use case is in the area of deep links. We share url over arbitrary messaging service (over which we don't have control) and want it clickable in the messaging service receiving UI as deep link on mobile devices. Sending scheme:someText is not clickable while https://my.domain/someText is. When the original url behind https://my.domain/someText is scheme://someText it works just fine.
After reading your detailed response we understood that YOURLS behavior is indeed the expected one. To achieve our goal we should structure properly the custom protocol original link. Instead of scheme://someText it should be scheme:///someText.

@ozh
Copy link
Member

ozh commented Mar 22, 2021

But then, if you don't know what messaging service is used, how can you craft the correct deep link ? ie whatsapp://something or messenger://something ?

@Dreamerset
Copy link
Author

The proprietary protocol is fixed - custom:// and it triggers specific application on the mobile if available. The issue in sending it plain is that it is not clickable in the UI of the receiving messaging app. When it is not clickable you can't start the app associated with the custom:// protocol.
Example with messages sent over SMS (short message service)
Message content: custom://someText results on the receiving mobile as plain text message you can only read.
However if the message content is: https://my_short_url.domain/someText with original url custom://someText then the received message contains a clickable link. The receiving user can click on it and after opening momentarily the available browser the associated with the custom:// protocol app will open using the someText for its purposes.
It is completely possible that we just don't know enough and there is a better way to send custom protocol urls over arbitrary messaging service and have them clickable on the receiving UI.

@Dreamerset
Copy link
Author

Dreamerset commented Mar 23, 2021

Few more words on:

In your example, custom://SomeText and custom://sometext should be the same, so we're lowercasing it to avoid considering same URI as different entries (eg submit http://github.com and then http://GitHub.com, it should state that the URL is already existing).

Writing in the address bar http://github.com or http://GitHub.com will result in the same behavior. So for YOURLS the reason of lowercasing the //authority part is to avoid considering same URI as different entries. I believe this validation could be achieved without altering the //authority part of the original url by only upper/lowercasing in the moment of searching/validating the records but not in the persistence stage of the process.

Is my line of thinking correct or am I missing something again?

@ozh
Copy link
Member

ozh commented Mar 27, 2021

Since authority is case insensitive, it's much much more efficient to store lowercase once and perform operations without worrying (during searches, adding new URLs, displaying stats, etc...) , than store "as is" and perform all operations with a lowercasing overhead of the entire database everytime

@Dreamerset
Copy link
Author

Thank you for your response ozh. Fully support the operations performance efficiency.
This is just for the sake of completeness without implementation expectation.
The motivation to store authority "as is" might be part of the marketing strategy.
Markus Winand suggests an elegant solution without performance impact here for various DB including MySQL at the bottom of his article.

@ozh
Copy link
Member

ozh commented Mar 28, 2021

The thing is, we cannot just do case insensitive search: http://GITHUB.com/omg and http://github.com/OMG are not the same URLs.

This said, the "part of the marketing strategy" point is valid (or, more generally speaking, "the end user should be able to do whatever they want to do"). If you think there is a missing filter somewhere that would allow such user freedom (maybe at this end of this function?), feel free to submit a pull request

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants