URL not handled/encoded correctly #1303
Comments
Seems to be encoding - decoding URL case.
|
I'm not sure how to fix this properly in the code but I've implemented a crude workaround by creating a plugin:
|
It works using the above code. Code: |
@LeoColomb I just can't understand: when I go to yourls/admin/ to add such an encoded URL, it is stored correctly. |
Thanks to @adigitalife! His solution may not be the most elegant one, but it works! |
I was surprised to hear about this issue so long after 1.6 was released but glad to find a temporary fix. I created and activated the following plugin per adigitalife's info on this thread. Thanks, and will be watching for a more permanent fix. Meanwhile, should this plugin be placed in the official plugin list here to help others? https://github.com/YOURLS/YOURLS/wiki/Plugin-List |
I've just committed something that should fix that. Please try to break and report :) Re-open this issue if needed. |
Very nice, ozh. Is this the one commit you made to address this? 59dff6b |
@innov8ion yes |
Definitely, there is an issue with encoding. It may need a full rewrite. |
Not sure if this is related, but when I try to shorten a link with a percent sign in it, for example - |
@markwaters I think it's tricky if the actual URL contains '%23' because it's actually the same as '#', so it gets converted automatically. There's been a discussion that maybe YOURLS shouldn't do any kind of encoding at all. I'm not sure what the current status of that is. |
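To make the '%23' versus '#' point concrete, here is a small Java sketch using the standard `java.net.URI` parser. It is not YOURLS code; the class name and the example URLs are hypothetical. It only illustrates that once '%23' has been decoded to '#', a URL parser treats everything after it as a fragment rather than as part of the path.

```java
import java.net.URI;

// Hypothetical demo class, not part of YOURLS.
final class EncodedHashDemo {
    // Returns the path exactly as written in the URL (escapes left intact).
    static String rawPath(String url) {
        return URI.create(url).getRawPath();
    }

    // Returns the fragment as written, or null if the URL has no fragment.
    static String rawFragment(String url) {
        return URI.create(url).getRawFragment();
    }

    public static void main(String[] args) {
        String encoded = "http://example.com/issue%2342"; // '%23' is an escaped '#'
        String decoded = "http://example.com/issue#42";   // same text after decoding

        // With the escape intact, '%2342' stays inside the path.
        System.out.println(rawPath(encoded) + " | " + rawFragment(encoded));
        // After decoding, '42' becomes a fragment and leaves the path.
        System.out.println(rawPath(decoded) + " | " + rawFragment(decoded));
    }
}
```

The two strings look equivalent to a human, but they address different resources, which is why automatic conversion of '%23' can break a link.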
I have a similar problem with Microsoft OneDrive links. They often include %2... and %3... in their hashed IDs, which are then converted to other characters, like /, +, " ", etc. |
Hello, I still have the same issue. I have tried to install the plugin "Fix Long URLs", to no avail. |
It depends on the exact characters in the URL that you have a problem with. My plugin only looks at some specific characters. If your problematic characters are not in the plugin, you'll need to modify the plugin to add them. |
thank you !! |
I've just read through this (current and open) thread and it seems to be evolving with different problems. I am after a better understanding of the workings of YOURLS and have a few questions about this topic.
Why Encode?
If an API or a Plugin or Microsoft OneDrive provides a Long URL that does not work, is that YOURLS's job to fix it? If I typed
Decoding
For the exact same reason that YOURLS should not encode, YOURLS should not decode. If I typed
Database etc.
Database inputs should be screened for SQL injections, etc. However, this is not at all the same as URL encoding. If a program (or human) provides the wrong information, should we not just make a note that the program has a bug, so the Plugin or API or Bookmark program can be fixed?
The Real Solution
It seems to me the real solution for these errors
Real Reasons
Can any YOURLS Developer explain why YOURLS should try to fix wrong inputs from external sources? Why not fix the problem at the source Plugin, Bookmark, or API? |
@ozh @LeoColomb - Please review this logic?
Suggested Solution
Second Suggestion
If moving this out of Core Code Bloat and into a Plugin is not desired, would it be acceptable to add a Core Option to disable all |
There is a need to encode or decode because depending on the context, URLs supplied are, or are not, raw text.
|
I think I ran into this issue when adding this long URL: https://worcade.stackstorage.com/s/MzCWEihRfYldw5X?dir=/Terms of Service
If I leave the automatically generated shortcode, the shortlink works. If I customize the shortcode to 'terms', the shortlink breaks... Jorn suggested a workaround: I now shortened the link with bit.ly and then "shortened" the link with our custom domain using YOURLS. |
@ozh I just saw your post today. It seems we can use a little logic to sort this out at the point YOURLS receives the long URL.
Decode Everything
First off, how about decoding everything upon receipt? If the text box is not encoded, the decoded output would be the same. Someone might copy/paste an encoded URL into the text box too. So it seems everything should be decoded upon receipt by YOURLS.
Double encoding
The problem seems to be double encoding without decoding. Double decoding or decoding unencoded text is not a problem.
Encode Everything
By encoding everything just before saving in the database, it reverses the first decode and makes the URL Internet ready. Encoding an unencoded URL would have no effect on its functionality.
Examples
PS
|
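The decode-on-receipt, encode-before-saving proposal above can be tried out with a short Java sketch. This is only an illustration under stated assumptions: it uses Java's `URLDecoder` and `URLEncoder`, which follow form-encoding rules (a space becomes `+`), so the exact output may differ from whatever functions YOURLS uses internally. The class and method names are hypothetical.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of YOURLS.
final class DecodeThenEncodeDemo {
    static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // The proposal: decode on receipt, encode just before saving.
    static String normalize(String s) {
        return encode(decode(s));
    }

    public static void main(String[] args) {
        // Encoded and unencoded input end up identical after normalizing.
        System.out.println(normalize("Terms%20of%20Service"));
        System.out.println(normalize("Terms of Service"));

        // Decoding is not always harmless: text that legitimately contains
        // an escaped percent sign changes again on every extra decode.
        System.out.println(decode("%2520"));         // one decode
        System.out.println(decode(decode("%2520"))); // two decodes
    }
}
```

For simple inputs such as the "Terms of Service" value, both forms normalize to the same string, which supports the proposal. The last two lines show the case worth testing before relying on it: a value whose decoded form still looks like an escape.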
Update
I'll leave this here in case it helps someone. I've discovered the issue, and it's unrelated to this thread. The problem was the "Keep Query String" plugin, which was activated on prod, but not on dev. Carry on. :)
Just to add to this issue - I haven't read the entire thread, but hopefully this adds to the discussion. I have two configurations, both essentially identical, for our dev and prod systems. The prod system produces URLs in which URL parameters have slashes encoded as The dev system produces links that work just fine. That is, slashes in URL parameters are completely unencoded - they appear as The difference between these two systems is minor. Both systems run CentOS and an essentially identical stack. Both systems run version 1.7.3 of YOURLS. I'm thinking it could be a difference in DB configurations or maybe due to the fact that the dev server runs Apache and the prod server runs Litespeed. The most curious wrinkle on this entire mess is that the
Test cases
We see the same behavior via the GUI and when using the API. longurl (input): stored longurl in the database on both dev and prod: converted shorturl (output) on dev (functions as desired): converted shorturl (output) on prod (double-encodes slashes): |
@rinogo - That's good intel Rich,
The URL spaces were encoded as stored in the database, compared to the input. At some point, the domain name changes and adds a Maybe we should start gathering data on sites that are failing? Example: Are all sites that fail running under CentOS? Debian? RedHat? FreeBSD? or Ubuntu? What sites are NOT failing? I run without this problem in: The error does seem to be after the long URL is retrieved from the DB. Perhaps if we discover what the sites that fail have in common, it will lead to why some sites fail? |
Hi, @PopVeKind! I think that's a great idea! I can double-check my setup, but I think everything is working fine since I discovered that the discrepancy was due to a plugin that was installed on prod but not on dev. (I added an update to my post after the fact - perhaps you're working off of email notifications). Regardless, I'll keep an eye on it; if we have problems, I'll definitely report back here. Thanks to you all (contributors and users alike) for making YOURLS awesome! |
I worked around the whole issue with encoding (or at least I hope so) by using a plugin that just uses the URL as it is sent by the user of the API or the form or wherever:
I can't believe this issue has been open for 7 years on a project whose sole purpose it is to handle URLs. If it's so important for the server to handle all the bad input, at the very least it should offer an option to turn off this behavior from the API if they're a well behaved client. |
@jadkik in what context does this work? Adding URL manually + bookmarklet + adding "https://sho.rt/" in front of any URL ? |
I am trying to understand this and I have a question. Does anyone have a problem when entering the long URL into the I have been reading over this URL encoding issue and do not see this complaint. If this always works ok, I believe I need to focus on the differences between this method and the other methods.
Today the thought came to me that this is not a single problem, but a group of interrelated problems. My goal is to build a server to reproduce these faults on and then test to see where the code goes astray. I will need actual faults, methods, and configurations to reproduce these problems. @tbsampson your result is what I would expect by running your original through urldecode. Did you submit your code via the GET method? The PHP manual says, @dfbasis Your call also looks like it was run through @jadkik Your filter looks interesting, which method were you using when you had the failure, and which method did this filter fix for you? @adigitalife your filter is interesting because it does not have the problem of decoding %2F, %3B, or %3D that @tbsampson is describing (not necessarily an encoding issue). |
Test URL from: #1303 (comment) |
Hello. Do we have a procedure to follow when encoding issues occur? Some examples: gets stored in the DB as this: gets stored in the DB as this: I do not know how to work around it. Could anybody recommend something, please? |
@ozh @PopVeKind sorry for not getting back to you on this. Here is my use case. Only via the API. I am still on The main problem is with the plus-sign and at-sign in my case. YOURLS tends to replace it with an actual plus instead of a
With the plugin:
Properly-encoded URL:
Without the plugin:
Properly-encoded URL:
Usage
The properly encoded URL is generated and sent like this:
import java.net.URLEncoder;
// OkHttp 3 classes used below
import okhttp3.FormBody;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;
// ...
String longUrl = baseUrl + "/page/confirm/" +
URLEncoder.encode(confirmCode, "UTF-8") +
"?email=" + URLEncoder.encode(userEmail, "UTF-8");
// ...
OkHttpClient client = new OkHttpClient();
RequestBody formBody = new FormBody.Builder()
.add("format", "json")
.add("action", "shorturl")
.add("username", "username")
.add("password", "password")
.add("url", longUrl)
.build();
Request request = new Request.Builder()
.url("http://short.exam.pl/yourls-api.php")
.post(formBody)
.build(); |
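As a rough illustration of the plus-sign and at-sign problem described in this comment, the sketch below uses only the Java standard library (no OkHttp) to show what happens to an encoded e-mail parameter if the whole long URL is decoded once before being stored. The class name, the example address, and the base URL are hypothetical, and the "server decodes once more" step is an assumption about the failure mode, not a description of the YOURLS source.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of YOURLS or of the client above.
final class PlusSignDemo {
    static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    // Builds the long URL the same way the client snippet above does.
    static String buildLongUrl(String baseUrl, String email) {
        return baseUrl + "?email=" + encode(email);
    }

    public static void main(String[] args) {
        String email = "user+tag@example.com";
        String longUrl = buildLongUrl("http://example.com/page/confirm", email);
        System.out.println(longUrl); // '+' sent as %2B, '@' sent as %40

        // Assumed failure mode: the shortener decodes the URL before storing it.
        String stored = decode(longUrl);
        System.out.println(stored); // now contains a literal '+'

        // The target site then form-decodes the query value, where '+' means space.
        System.out.println(decode("user+tag@example.com"));
    }
}
```

The address that reaches the target site has a space where the plus sign was, which would match the symptom of a confirmation link that no longer identifies the right user.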
@jadkik So with your last post on this thread, did you find a solution? I'm not following what you read. Is there a plugin I can load that will prevent YOURLS from taking tags like %20 and changing them to the actual character? I just updated to 1.7.9 and now many of my links are breaking. |
@jamiers99 I had posted that plugin which worked for me: #1303 (comment) It worked for my use case, but it may break some other things like bookmark and the editing on the UI (haven't tested). |
Closing: 2691 and associated PR will supersede |
What steps will reproduce the problem?
What is the expected output? What do you see instead?
What I expect is that YOURLS should be smart enough to know that '%20' is a space and should be left alone. Instead, what's happening is that YOURLS is converting the '%' to '%25' and therefore, the '%20' becomes '%2520'.
Perhaps there could be a check before cleaning up the long URL to detect '%20' or whatever else it could be (e.g. '%5C' for backslash, etc).
This is a COPY of Issue 1303: %20 in long URL not handled correctly, filed on Google Code before the project was moved to GitHub.
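The kind of check suggested here could look roughly like the following Java sketch. It is not the YOURLS implementation: the class, the method names, and the regular expression are assumptions made for illustration. It contrasts a cleanup that escapes every '%' (producing '%2520') with one that escapes only a '%' that is not already followed by two hex digits.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of YOURLS.
final class PercentEscapeGuard {
    // Matches a '%' that is NOT followed by two hex digits, i.e. a bare percent sign.
    private static final Pattern BARE_PERCENT = Pattern.compile("%(?![0-9A-Fa-f]{2})");

    // Naive cleanup: escapes every '%', so "%20" turns into "%2520".
    static String escapeEveryPercent(String url) {
        return url.replace("%", "%25");
    }

    // Guarded cleanup: leaves existing escapes such as "%20" or "%5C" alone.
    static String escapeBarePercentOnly(String url) {
        return BARE_PERCENT.matcher(url).replaceAll("%25");
    }

    public static void main(String[] args) {
        String longUrl = "http://example.com/my%20file.html";
        System.out.println(escapeEveryPercent(longUrl));    // http://example.com/my%2520file.html
        System.out.println(escapeBarePercentOnly(longUrl)); // http://example.com/my%20file.html
        System.out.println(escapeBarePercentOnly("http://example.com/100%")); // http://example.com/100%25
    }
}
```

A check like this is a heuristic: a URL that really contains the literal text "%20" (a percent sign followed by "20") is indistinguishable from an escaped space, so the guard cannot be correct for every input.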