Harden url fetcher and don't crash on non-ASCII urls #219

Merged
merged 1 commit into from Mar 28, 2016
19 changes: 15 additions & 4 deletions src/plugins/irc-events/link.js
@@ -16,10 +16,9 @@ module.exports = function(irc, network) {
}

var links = [];
-var split = data.message.split(" ");
+var split = data.message.replace(/\x02|\x1D|\x1F|\x16|\x0F|\x03(?:[0-9]{1,2}(?:,[0-9]{1,2})?)?/g, "").split(" ");
Member

For the record and for archive purposes: this is the same check as this one, which strips colors, bold, and other formatting codes from messages.
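
To illustrate why the stripping matters for link detection, here is a minimal sketch that applies the same pattern from the diff to a made-up message. The helper name `stripFormatting` and the sample text are assumptions for illustration only; they do not appear in the PR.

```javascript
// Same pattern as in the diff. It removes bold (\x02), italic (\x1D),
// underline (\x1F), reverse (\x16), reset (\x0F), and color codes
// (\x03 followed by an optional "fg" or "fg,bg" pair of 1-2 digit numbers).
var FORMATTING = /\x02|\x1D|\x1F|\x16|\x0F|\x03(?:[0-9]{1,2}(?:,[0-9]{1,2})?)?/g;

// Hypothetical helper, named here only for the example.
function stripFormatting(message) {
  return message.replace(FORMATTING, "");
}

// Hypothetical message: a bold URL colored with foreground 04, background 01.
var raw = "see \x02\x0304,01https://example.com/\x0F now";

// Without stripping, the second word starts with control codes,
// so a "starts with http" check would miss it.
var words = stripFormatting(raw).split(" ");
console.log(words); // [ 'see', 'https://example.com/', 'now' ]
```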

_.each(split, function(w) {
-var match = w.indexOf("http://") === 0 || w.indexOf("https://") === 0;
-if (match) {
+if (/^https?:\/\//.test(w)) {
Member

I'm not going to block this, but in general I'm not in favor of obfuscating a couple of checks with a regex. It's shorter, yes, but regexes are arguably harder to understand and, when not backed by tests, tend to let bugs slip through more easily.
The previous check was a tiny bit longer but more straightforward to read (granted, storing the result in that match variable was unnecessary).
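
As a rough sketch of the kind of test that could back either form, the snippet below compares the old `indexOf` check with the new regex on a handful of made-up inputs. The function names and sample strings are assumptions for illustration and are not part of the PR.

```javascript
// Old form from the diff: two explicit prefix checks.
function isLinkIndexOf(w) {
  return w.indexOf("http://") === 0 || w.indexOf("https://") === 0;
}

// New form from the diff: one anchored regex.
function isLinkRegex(w) {
  return /^https?:\/\//.test(w);
}

// Hypothetical inputs covering both schemes, a non-HTTP scheme,
// a prefix that is not at position 0, an uppercase scheme, and an empty word.
var samples = [
  "http://a.example",
  "https://a.example",
  "ftp://a.example",
  "xhttp://a.example",
  "HTTP://a.example",
  ""
];

// Both checks are case-sensitive and anchored at the start,
// so they should agree on every sample.
var allAgree = samples.every(function(w) {
  return isLinkIndexOf(w) === isLinkRegex(w);
});
console.log(allAgree); // true
```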

links.push(w);
}
});
@@ -44,7 +43,7 @@ module.exports = function(irc, network) {
msg: msg
});

-var link = links[0];
+var link = escapeHeader(links[0]);
fetch(link, function(res) {
parse(msg, link, res, client);
});
@@ -103,6 +102,8 @@ function fetch(url, cb) {
try {
var req = request.get({
url: url,
+maxRedirects: 5,
+timeout: 5000,
headers: {
"User-Agent": "Mozilla/5.0 (compatible; The Lounge IRC Client; +https://github.com/thelounge/lounge)"
}
@@ -150,3 +151,13 @@ function fetch(url, cb) {
cb(data);
}));
}

// https://github.com/request/request/issues/2120
// https://github.com/nodejs/node/issues/1693
// https://github.com/alexeyten/descript/commit/50ee540b30188324198176e445330294922665fc
function escapeHeader(header) {
Member

A comment explaining this, in addition to the links, would have been nice for our future selves.
Something to keep in mind when we start adding tests for these.

return header
.replace(/([\uD800-\uDBFF][\uDC00-\uDFFF])+/g, encodeURI)
.replace(/[\uD800-\uDFFF]/g, "")
.replace(/[\u0000-\u001F\u007F-\uFFFF]+/g, encodeURI);
}
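
In the spirit of the review comment above, here is one reading of what the three `replace` steps do, with a few made-up URLs. The step-by-step comments are an interpretation of the code, not an explanation from the PR author, and the sample URLs are hypothetical.

```javascript
function escapeHeader(header) {
  return header
    // 1. Percent-encode valid surrogate pairs (characters above U+FFFF, such as emoji).
    .replace(/([\uD800-\uDBFF][\uDC00-\uDFFF])+/g, encodeURI)
    // 2. Drop any surrogate left unpaired; encodeURI throws a URIError on those.
    .replace(/[\uD800-\uDFFF]/g, "")
    // 3. Percent-encode control characters, DEL, and all remaining non-ASCII characters.
    .replace(/[\u0000-\u001F\u007F-\uFFFF]+/g, encodeURI);
}

// Non-ASCII letter: encoded as UTF-8 percent escapes.
var accented = escapeHeader("https://example.com/caf\u00E9");
console.log(accented); // https://example.com/caf%C3%A9

// Emoji (a surrogate pair in UTF-16): handled by step 1.
var emoji = escapeHeader("https://example.com/\uD83D\uDE00");
console.log(emoji); // https://example.com/%F0%9F%98%80

// Lone surrogate: removed by step 2 instead of crashing in step 3.
var lone = escapeHeader("https://example.com/\uD83Dx");
console.log(lone); // https://example.com/x

// Line break: encoded by step 3, so it cannot end up raw in a header value.
var newline = escapeHeader("https://example.com/a\nb");
console.log(newline); // https://example.com/a%0Ab
```

The order matters under this reading: encoding pairs first keeps step 2 from deleting valid characters, and removing lone surrogates before step 3 keeps `encodeURI` from throwing.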