Regex bug with line endings handling #162

Closed
RusKnyaz opened this issue Sep 28, 2019 · 10 comments

@RusKnyaz

Execute the code:

var rheaders = /^(.*?):[ \t]*([^\r\n]*)$/mg;
var headersString = 'X-AspNetMvc-Version: 4.0\r\nX-Powered-By: ASP.NET\r\n\r\n';
var arr = [];
var match; // declared explicitly to avoid creating an implicit global
while ( match = rheaders.exec( headersString ) ) {
    arr.push( match[ 1 ].toLowerCase() ); // header name, lower-cased
    arr.push( match[ 2 ] );               // header value
}

Expected:
arr is the array ["x-aspnetmvc-version", "4.0", "x-powered-by", "ASP.NET"]
Observed:
arr is empty.

@RusKnyaz RusKnyaz changed the title Regex bug Regex bug with line endings handling Sep 28, 2019
@paulbartrum
Owner

paulbartrum commented Sep 29, 2019

Looks like the issue is that in .NET the $ anchor matches only before \n (in multiline mode), so the \r stays in the match:

new Regex("^.*$", RegexOptions.Multiline).Matches("one\r\ntwo\r\n")[0].Value
// returns "one\r"

Whereas in JavaScript it matches before \r or \n:

'one\r\ntwo\r\n'.match(/^.*$/m)[0]
// returns "one"
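For reference, a small sketch of the JavaScript side of this difference. ECMAScript defines four line terminators (LF, CR, U+2028, U+2029), and in multiline mode $ is expected to stop before any of them. The variable names here are only illustrative:

```javascript
// ECMAScript line terminators: LF, CR, LINE SEPARATOR, PARAGRAPH SEPARATOR.
var terminators = ['\n', '\r', '\u2028', '\u2029'];

// For each terminator, take the first match of /^.*$/m on "one<terminator>two".
// "." never matches a line terminator, and "$" (with the m flag) matches
// right before one, so the terminator itself is not part of the match.
var firstLines = terminators.map(function (t) {
    return ('one' + t + 'two').match(/^.*$/m)[0];
});

console.log(firstLines);
```

Each entry should be "one" with no trailing terminator, which is the behaviour a .NET-backed engine has to reproduce.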

@paulbartrum
Owner

The workaround is to change your regular expression:

var rheaders = /^(.*?):[ \t]*([^\r\n]*)\r?$/mg;
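A quick way to check the workaround, sketched as a standalone function. The name parseHeaders is hypothetical (it is not part of jQuery or Jurassic); the loop is the one from the original report with the adjusted pattern:

```javascript
// parseHeaders is a hypothetical helper name used only for this sketch.
function parseHeaders(headersString) {
    // The optional \r before $ absorbs the CR of a CRLF pair, so the pattern
    // also works in engines where $ only matches before \n.
    var rheaders = /^(.*?):[ \t]*([^\r\n]*)\r?$/mg;
    var arr = [];
    var match;
    while ((match = rheaders.exec(headersString))) {
        arr.push(match[1].toLowerCase()); // header name, lower-cased
        arr.push(match[2]);               // header value, without the CR
    }
    return arr;
}

var result = parseHeaders('X-AspNetMvc-Version: 4.0\r\nX-Powered-By: ASP.NET\r\n\r\n');
console.log(result);
```

Because the \r is consumed outside the second capture group, the header values stay clean for both CRLF and LF-only input.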

@RusKnyaz
Author

Unfortunately I cannot use a workaround because this code is from jQuery, and there is probably a ton of code on web pages that uses regular expressions like this. Please look at the similar issue in Jint.

@Taritsyn
Contributor

@RusKnyaz The Jurassic, Jint and NiL.JS engines implement regular expressions with the System.Text.RegularExpressions.Regex class, which is not fully compatible with ECMAScript (see the “Regular expression parsing error” issue).

@RusKnyaz
Author

@Taritsyn I know. It is possible to fix, though. Please look at the link I provided in my previous comment.

@paulbartrum
Owner

I've checked in a fix, let me know if it works for you :-)

@paulbartrum
Owner

@RusKnyaz You'll notice that my fix is more complicated than the Jint one. It seems the fix in Jint is not correct. 'one\r\ntwo'.match(/^.*$/mg).toString() should return "one,,two" but in Jint it returns "one\r,two".

@kpreisser
Collaborator

kpreisser commented Oct 1, 2019

> @RusKnyaz You'll notice that my fix is more complicated than the Jint one. It seems the fix in Jint is not correct. 'one\r\ntwo'.match(/^.*$/mg).toString() should return "one,,two" but in Jint it returns "one\r,two".

It seems the fix in Jurassic is also not 100% correct 😉
E.g. ('one\\\r'.match(/^.*\\$/mg) || []).toString() should return "one\\" but returns "".

@paulbartrum
Owner

Ha, you're right, darn it.

@paulbartrum
Owner

I checked in a fix for the escaping issue.
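To illustrate why escaping matters when rewriting $ for a .NET-backed engine, here is a minimal sketch of a pattern translator. This is not Jurassic's actual code: the function name translateMultilineDollar and the replacement lookahead are assumptions made for the example, and it only covers the multiline case. It produces a .NET-style pattern string (\z is .NET's end-of-input anchor), so it is tested here on the string output only.

```javascript
// Hypothetical helper, not taken from Jurassic or Jint.
// Rewrites each unescaped "$" outside a character class into a lookahead
// for an ECMAScript line terminator or the end of input.
function translateMultilineDollar(pattern) {
    var replacement = '(?=[\\n\\r\\u2028\\u2029]|\\z)';
    var out = '';
    var inClass = false;
    for (var i = 0; i < pattern.length; i++) {
        var ch = pattern[i];
        if (ch === '\\' && i + 1 < pattern.length) {
            // Copy the escape sequence as one unit. This keeps "\\" followed
            // by "$" from being misread as an escaped dollar sign.
            out += ch + pattern[i + 1];
            i++;
        } else if (ch === '[') {
            inClass = true;
            out += ch;
        } else if (ch === ']') {
            inClass = false;
            out += ch;
        } else if (ch === '$' && !inClass) {
            out += replacement;
        } else {
            out += ch;
        }
    }
    return out;
}

// /^.*\\$/ : an escaped backslash followed by a real "$" anchor.
console.log(translateMultilineDollar('^.*\\\\$'));
// /a\$/ : an escaped dollar sign, which must be left alone.
console.log(translateMultilineDollar('a\\$'));
```

The first pattern gets its anchor rewritten while the second is returned unchanged. A translator that looked only at the single character before $ would treat both the same way.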

4 participants