URL Blocked for https #153

Open
jiyath opened this Issue Nov 25, 2013 · 13 comments

5 participants
Contributor

yankidank commented Dec 6, 2013

I noticed this when using an https YouTube URL.

@ghost ghost assigned dbezborodov Dec 6, 2013

@yankidank yankidank closed this Dec 18, 2013

Contributor

yankidank commented Dec 18, 2013

I didn't get this bug fix into 2.0.1, so it won't be included until 2.0.2 unless you apply the change manually.

A user on Pligg brought up this issue. Another user pointed out that the exclamation mark is what causes it.

After carefully dissecting the URL validation pattern in /submit.php, I made some additions to validate the hash-bang (#!) and corrected the order of some characters that was wrong.

The new pattern validates well. I tested it on a lot of URLs, for http(s) and ftp, and it validated them properly:

"http://foo.com/photos.php?id=1111111#!/photo.php?pid=719397&id=1111111&fbid=111111110";
"http://foo.com/?ref=home#!/pages/foo-foo/888888888?v=wall&ref=ts";
"http://foo.com/foo.foo#!/profile.php?id=100000&v=wall";
"http://fairuziyet.com/song.php?song=%D8%AD%D8%A8%D9%8A%D8%A8%D9%8A%20%D9%82%D8%A7%D9%84%20%D8%A7%D9%86%D8%B7%D8%B1%D9%8A%D9%86%D9%8A&alpha=%D8%AD&page_no=1";
"http://fairuziyet.com/song.php?song=نحنا والقمر جيران&albnum=1&playnum=1&musinum=1&page_no=2"
"http://1337.net"
"http://a.b-c.de" (only validates in the test but not on pligg)
"http://foo.bar/?q=Test%20URL-encoded%20stuff" (only validates in the test but not on pligg)
"http://j.mp"
"ftp://foo.bar/baz"
"http://code.google.com/events/#&product=browser"
"http://☺.damowmow.com/" (only validates in the test but not on pligg)
"https://secure.wikimedia.org/wikipedia/en/wiki/Alan_Turing#Early_computers_and_the_Turing_test";

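A test run like the one described above can be reproduced with a small loop around preg_match(). This is only a sketch: the pattern here is a hypothetical stand-in, not the one proposed in this thread, and the URL list is a subset of the ones quoted above plus one deliberately bad input.

```php
<?php
// Hypothetical stand-in pattern, NOT the one from this thread.
// Using ~ as the delimiter means the slashes in "://" need no escaping.
$pattern = '~^(https?|ftp)://[^\s/?#]+\S*$~iu';

// A few of the test URLs listed above, plus one obviously bad input.
$urls = array(
    'http://foo.com/foo.foo#!/profile.php?id=100000&v=wall',
    'http://1337.net',
    'ftp://foo.bar/baz',
    'https://secure.wikimedia.org/wikipedia/en/wiki/Alan_Turing#Early_computers_and_the_Turing_test',
    'not a url',
);

$results = array();
foreach ($urls as $url) {
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    $results[$url] = preg_match($pattern, $url);
}

foreach ($results as $url => $matched) {
    echo ($matched === 1 ? 'valid   ' : 'invalid ') . $url . "\n";
}
```

Swapping a candidate pattern into $pattern makes it easy to see which URLs it rejects before touching submit.php. A result of false instead of 0 or 1 means the pattern itself failed to compile.
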
1- Open /submit.php
2- Look for function do_submit1() { (it should be on line 136 if you have not modified the code)
3- Add $url = utf8_decode($url);
right after $url = html_entity_decode($url);
This makes sure the encoded URLs are properly decoded.
4- Comment out the pattern on line 197.
Do not delete any code; just comment it out in case you want to revert to it.
5- Add this new pattern below the commented-out one:
$pattern = '/^(([\w]+:)?//)?(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+)?@)?(\d\w?)+[\w]{2,4}(:[\d]+)?(([/#!+-~.,\d\w]+|%[a-fA-f\d]{2,2}))(??(&?([-+.,\d\w]|%[a-fA-f\d]{2,2})=?))?(#([-+.,/\d\w]|%[a-fA-f\d]{2,2})_)?$/';

Contributor

yankidank commented Aug 7, 2014

The provided $pattern flags every URL as a false positive for me ("URL is invalid or blocked"). I'm going to add in the url_decode function to the file, and hopefully we can figure out a better regex to use for the pattern later on.

@yankidank yankidank reopened this Aug 7, 2014

Weird, it works for me. Anyway, excellent idea what you suggested. Thanks for all your efforts!

jiyath commented Aug 10, 2014

This $pattern is not working for any links.


"yankidank commented 3 days ago
The provided $pattern flags every URL as a false positive for me ("URL is invalid or blocked"). I'm going to add in the url_decode function to the file, and hopefully we can figure out a better regex to use for the pattern later on."

jiyath commented Aug 11, 2014

This was already added in commit c4e85b4.
But normal links are not working in this commit.

shklyar commented Aug 18, 2014

Replace the pattern with this:
$pattern = '/^http?:\/\/(([a-z0-9-]+\.)+[a-z]{2,6}|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(:[0-9]+)?(\/?|\/\S+)$/iu';
It works for all links except https.
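The https exception most likely comes from the http? at the start of that pattern, where the ? makes the "p" optional instead of allowing an "s". A minimal sketch of the difference, using only the scheme prefix (the variable names and example.com URLs are made up for illustration):

```php
<?php
// Scheme prefixes only; the rest of the pattern is left out.
// ~ is used as the delimiter so "://" needs no escaping.
$original = '~^http?://~i';   // "?" applies to the "p": matches "htt://" or "http://"
$adjusted = '~^https?://~i';  // "?" applies to the "s": matches "http://" or "https://"

$plain  = 'http://example.com/';
$secure = 'https://example.com/';

$originalPlain  = preg_match($original, $plain);
$originalSecure = preg_match($original, $secure);
$adjustedPlain  = preg_match($adjusted, $plain);
$adjustedSecure = preg_match($adjusted, $secure);

echo "original, http:  $originalPlain\n";
echo "original, https: $originalSecure\n";
echo "adjusted, http:  $adjustedPlain\n";
echo "adjusted, https: $adjustedSecure\n";
```

If that reading is right, changing http? to https? would be enough to let https links through this pattern. That is an assumption, not something tested against submit.php.
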

jiyath commented Aug 20, 2014

Use this pattern:
$pattern = '/(http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/';

@jiyath jiyath referenced this issue Aug 20, 2014

@yankidank yankidank - Fix for URL validation.
- Removed the utf8_decode function being used on $url, which doesn't seem necessary
69b6d57
Contributor

yankidank commented Aug 20, 2014

I'm afraid @jiyath's pattern isn't restrictive enough and allows users to escape things that they shouldn't be allowed to escape, like submit step 2's URL preview area.

Regular expressions are like my kryptonite, so any other suggestions are welcome, although we do have a mostly working pattern being used now.
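One way to read the "not restrictive enough" concern: the pattern has no ^ or $ anchors, and its (\S+) group accepts any non-whitespace character, so markup can ride along with a URL. The sketch below assumes that reading. $loose is the suggested pattern rewritten with a ~ delimiter, and $strict is a hypothetical tighter variant for comparison, not the pattern Pligg actually uses.

```php
<?php
// The suggested pattern, rewritten with ~ as delimiter (assumed equivalent).
$loose = '~(http|https)://(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(/|/([\w#!:.?+=&%@!\-/]))?~';

// Hypothetical stricter variant: anchored at both ends, with an explicit
// whitelist of characters for the host and for the path/query/fragment.
$strict = '~^https?://[\w.-]+(:[0-9]+)?(/[\w#!:.?+=&%@\-/]*)?$~';

$injected = 'http://example.com/"><script>alert(1)</script>';
$normal   = 'https://example.com/page?id=1#!/x';

// (\S+) swallows the quote, angle brackets and script tag.
$looseInjected  = preg_match($loose, $injected);
// The whitelist has no quote or angle bracket, and $ forbids leftovers.
$strictInjected = preg_match($strict, $injected);
$strictNormal   = preg_match($strict, $normal);

echo "loose,  injected: $looseInjected\n";
echo "strict, injected: $strictInjected\n";
echo "strict, normal:   $strictNormal\n";
```

The whitelist in $strict is deliberately narrow and would reject some of the URLs from earlier in this thread (non-ASCII hosts, for example), so treat it as a starting point. Escaping the URL with htmlspecialchars() before printing it in the preview area would also help regardless of which pattern is used.
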

jiyath commented Aug 20, 2014

OK. thanks.
