JavaScript Injection using event attributes #148

Qendolin · 2022-10-21T17:09:36Z

Script injection is still very easy to achieve.
As mentioned in #66 (comment) you can inject JavaScript via on* attributes.
[aaa onclick=alert('Hacked')]Click Me[/aaa]

I don't think the solution is to blacklist certain attributes. IMO it shouldn't even be possible to set any attributes unless explicitly enabled in a preset. Even a style attribute can be used maliciously.

Furthermore it is not save to allow the creation of an arbitrary html tag. For example some websites might use custom elements which themselves might be exploitable.

I'ld strongly prefer it this library was safe by default.

The text was updated successfully, but these errors were encountered:

JiLiZART · 2022-10-22T21:31:14Z

This library has a layered structure. Raw @bbob/parser can be used to transform bbcodes to AST, or @bbob/core for plugin support.
I don't include whole html5 attributes validation and tags spec cause the main goal of this library is to be tiny and raw so you can use it to create your own solutions.

Anyway library was safe by default is a good point. I will try to investigate how I can achieve this without bloating this library.

Qendolin · 2022-10-23T01:24:10Z

I think the biggest problem is creating HTML as a string. Using the DOM is a lot safer as it prevents any kind of escape attempt.

For my site I've implemented a custom renderNode like this:

const renderNode = (node, { stripTags = false }) => {
	if (!node) return document.createTextNode('');
	const type = typeof node;

	if (type === 'string' || type === 'number') {
		return document.createTextNode(node);
	}

	if (Array.isArray(node)) {
		return renderNodes(node, { stripTags });
	}

	if (type === 'object') {
		if (stripTags === true) {
			return renderNodes(node.content, { stripTags });
		}

		const element = document.createElement(node.tag);
		for (const name in node.attrs) {
			try {
				element.setAttribute(name, node.attrs[name]);
			} catch (ignored) {
				// Ignore invalid attribute names
			}
		}

		if (node.content === null) {
			return element;
		}

		element.append(...renderNodes(node.content));
		return element;
	}

	return document.createTextNode('');
};

const renderNodes = (nodes, { stripTags = false }) => {
	return nodes.flatMap((node) => renderNode(node, { stripTags }));
};

const toHTML = (source, plugins, options) =>
	core(plugins).process(source, {
		...options,
		render: (nodes, opts) => {
			// Document fragment is just a container element, so that the function only returns a single element
			const frag = document.createDocumentFragment();
			frag.append(...renderNodes(nodes, opts));
			return frag;
		}
	}).html;

To make attributes and tag names whitlist-based I've also created my own processor

/**
 * @param name the property name
 * @param value the property value, will be escaped
 * @returns The css style definition
 */
const renderStyle = (name, value) => {
	value = value.replaceAll(`"`, ``).replaceAll(`'`, '').replaceAll(`;`, '');
	return `${name}: ${value};`;
};

// Contains allowed styles
const styleMap = {
	color: (val) => renderStyle('color', val)
}

/**
 * @param attrs 
 * @returns The constructed style attribute
 */
const getStyleFromAttrs = (attrs) => {
	let style = '';
	for (const name in styleMap) {
		if (name in attrs) {
			style += styleMap[name](attrs[name]);
		}
	}
	return { style };
};

/**
 * @param attrs 
 * @returns A style with the color of the unique attribute
 */
const getColorStyle = (attrs) => {
	return getStyleFromAttrs({
		color: getUniqAttr(attrs) ?? 'inherit'
	});
};

/**
 * @param node 
 * @param render the render function
 * @returns the url from the unique attribute or the node content
 */
const getUrlHref = (node, render) => {
	let url = getUniqAttr(node.attrs);
	url ??= render(node.content, { stripTags: true })?.textContent ?? '';
	return url;
};

/**
 * @param node 
 * @param render the render function
 * @returns the url from the node cpntent
 */
const getImgSrc = (node, render) => {
	if (typeof node.content == 'string') return node.content;
	return render(node.content, { stripTags: true })?.textContent ?? '';
};

/**
 * Creates a trusted node.
 * Tag must be a trusted tag
 * Attrs must only contain trusted attributes
 */
const toNode = (tag, attrs, content) => ({
	tag,
	attrs,
	content,
	trusted: true
});

export const definitions = {
	b: (node) => toNode('b', {}, node.content),
	i: (node) => toNode('i', {}, node.content),
	u: (node) => toNode('span', { style: `text-decoration: underline;` }, node.content),
	s: (node) => toNode('s', {}, node.content),
	br: (node) => toNode('br', {}, node.content),
	url: (node, { render }) =>
		toNode(
			'a',
			{
				href: getUrlHref(node, render),
				// Also an important privacy improvement
				rel: 'noreferrer noopener nofollow',
				target: '_blank'
			},
			node.content
		),
	img: (node, { render }) =>
		toNode(
			'img',
			{
				src: getImgSrc(node, render),
				// For privacy
				referrerpolicy: 'no-referrer',
				// For cross origin images
				crossorigin: 'anonymous',
				width: node.attrs.width,
				height: node.attrs.height
			},
			[]
		),
	quote: (node) => toNode('blockquote', {}, node.content),
	style: (node) => toNode('span', getStyleFromAttrs(node.attrs), node.content),
	color: (node) => toNode('span', getColorStyle(node.attrs), node.content),
	// ... and more (I left some out for brevity)
};

function process(tags, tree, core, options) {
	tree.walk((node) => {
		if (isTagNode(node)) {
			// Create a trusted node from whitelisted tags
			if (tags[node.tag]) {
				return tags[node.tag](node, core, options);
			}
			if (node.trusted === true) {
				// trusted nodes can be any tag or contain attributes that are not whitelisted
				return node;
			}
			// Convert all non-whitelisted tags to spans and remove all attributes
			return {
				tag: 'span',
				attrs: {},
				content: node.content,
				trusted: true
			};
		}
		return node;
	});
}

The most important differences are:

It converts any unknown tag to a <span> element
It removes all attributes that are not used by the tag definition
It escapes user specified values
Even if user specified values aren't escaped properly they can't cause harm because we are using the DOM to create the elements instead of string concatenation
Creating elements using the DOM is faster than setting innerHTML
It improves user privacy by setting rel and referrerpolicy to appropriate values

To incorporate findings also denoted in: JiLiZART/BBob#148 exchanged renderer to DOM based rendering to benefit from automatic proper escaping. Still using the HTML renderer as optional fallback.

Replaced `render` provided by bbob-html by some DOM based approach. This provides much more robustness regarding proper escaping of attributes or textual contents, for example. This incorporates some findings reported for: * JiLiZART/BBob#148 where a similar solution got suggested. The previously introduced `htmlSanitizer` is now obsolete.

JiLiZART self-assigned this Nov 13, 2022

JiLiZART added the enhancement New feature or request label Nov 13, 2022

JiLiZART added the P1 Moderate Issue label Oct 16, 2023

mmichaelis mentioned this issue Oct 25, 2023

Misleading name/behavior for escapeHTML: Should denote "use for attribute values" #203

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JavaScript Injection using event attributes #148

JavaScript Injection using event attributes #148

Qendolin commented Oct 21, 2022

JiLiZART commented Oct 22, 2022

Qendolin commented Oct 23, 2022 •

edited

Loading

JavaScript Injection using event attributes #148

JavaScript Injection using event attributes #148

Comments

Qendolin commented Oct 21, 2022

JiLiZART commented Oct 22, 2022

Qendolin commented Oct 23, 2022 • edited Loading

Qendolin commented Oct 23, 2022 •

edited

Loading