Sanitize HTML pasted into ckeditor #5571

Merged
merged 5 commits into from
Nov 25, 2022

Conversation

tomasr8
Member

@tomasr8 tomasr8 commented Nov 16, 2022

closes #5560

Probably the easiest solution is to use an external sanitizer and sanitize the pasted content before it gets converted into ckeditor's document model. Otherwise we'd probably have to use writers or directly manipulate the model.

This library seems nice because it can also filter attributes, tags, etc.
If we don't want to add another library and only care about the styles, we could roll our own sanitizer that uses DOMParser to filter the styles (DOMParser is also used internally by ckeditor when pasting):

export function sanitizeStyles(
  fragment,
  allowedStyles = ['color', 'background-color', 'font-size']
) {
  const allowed = new Set(allowedStyles);
  // Visit every element in the fragment, depth-first.
  const queue = [...fragment.children];
  while (queue.length) {
    const element = queue.pop();
    const style = element.style;
    // Collect the names first; removing while iterating would shift the indices.
    const toRemove = [];
    for (let i = 0; i < style.length; i++) {
      const name = style.item(i);
      if (!allowed.has(name)) {
        toRemove.push(name);
      }
    }
    for (const name of toRemove) {
      style.removeProperty(name);
    }
    queue.push(...element.children);
  }
  return fragment;
}
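To make the whitelist check easier to try outside a browser, here is a minimal sketch of the same idea applied to a single inline `style` string instead of DOM elements. `filterInlineStyle` is a hypothetical helper, not part of this PR, and it assumes a naive split on `;`, which would mishandle semicolons inside `url(...)` or quoted values.

```javascript
// Hypothetical helper (not from the PR): keeps only whitelisted declarations
// from one inline style string.
function filterInlineStyle(
  styleText,
  allowedStyles = ['color', 'background-color', 'font-size']
) {
  const allowed = new Set(allowedStyles);
  return styleText
    .split(';') // naive: does not handle ';' inside url(...) or quotes
    .map(decl => decl.trim())
    .filter(Boolean)
    .filter(decl => {
      const idx = decl.indexOf(':');
      if (idx === -1) {
        return false; // not a "name: value" declaration
      }
      const name = decl.slice(0, idx).trim().toLowerCase();
      return allowed.has(name);
    })
    .join('; ');
}

// Example input resembling styles pasted from Word.
const pasted = 'color: red; mso-margin-top-alt: auto; font-size: 12pt; line-height: 107%';
console.log(filterInlineStyle(pasted)); // prints "color: red; font-size: 12pt"
```

The DOM-based version above is likely the safer choice in the editor itself, since the browser parses the declarations for it; the string version is only meant to illustrate the filtering rule.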

@tomasr8
Member Author

tomasr8 commented Nov 16, 2022

The CSS whitelist is just an example; we probably want to allow more styles than that.

@tomasr8 tomasr8 force-pushed the ckeditor-sanitize-css branch from 73bfef9 to ebd43f1 Compare November 16, 2022 15:04
@tomasr8 tomasr8 force-pushed the ckeditor-sanitize-css branch 2 times, most recently from cca8b8e to 4a4eba6 Compare November 25, 2022 08:41
Filters the pasted HTML and removes all CSS styles
which are not whitelisted.

Uses sanitize-html's defaults to sanitize tags, attributes, etc.
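As a rough illustration of the approach described in the commit message, the sketch below builds an options object in the shape sanitize-html accepts. Its `allowedStyles` option maps a tag (or `'*'`) to property names, each with a list of regular expressions for permitted values. `buildSanitizeOptions` and the style names are assumptions for illustration, not the code merged in this PR.

```javascript
// Hypothetical helper (not from the PR): builds sanitize-html options that
// keep the given attribute defaults and whitelist only the listed CSS properties.
function buildSanitizeOptions(allowedStyleNames, defaultAttributes = {}) {
  const anyValue = [/^.*$/]; // assumption: accept any value for a whitelisted property
  const styles = {};
  for (const name of allowedStyleNames) {
    styles[name] = anyValue;
  }
  return {
    // The style attribute must itself be allowed, or all inline styles are dropped.
    allowedAttributes: {...defaultAttributes, '*': ['style']},
    allowedStyles: {'*': styles},
  };
}

const options = buildSanitizeOptions(['color', 'background-color', 'font-size']);
console.log(Object.keys(options.allowedStyles['*']));

// Intended usage, assuming sanitize-html is installed:
//   const sanitizeHtml = require('sanitize-html');
//   const clean = sanitizeHtml(
//     pastedHtml,
//     buildSanitizeOptions(names, sanitizeHtml.defaults.allowedAttributes)
//   );
```

Because `allowedTags` is left out, sanitize-html would fall back to its default tag list, which matches the "defaults for tags, attributes, etc." wording above.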
@ThiefMaster ThiefMaster force-pushed the ckeditor-sanitize-css branch from 4a4eba6 to 3677a18 Compare November 25, 2022 11:40
@ThiefMaster ThiefMaster force-pushed the ckeditor-sanitize-css branch from 3677a18 to 876dd2b Compare November 25, 2022 11:46
@ThiefMaster ThiefMaster added this to the v3.2 milestone Nov 25, 2022
@ThiefMaster ThiefMaster merged commit 26b60a3 into indico:master Nov 25, 2022
@ThiefMaster ThiefMaster deleted the ckeditor-sanitize-css branch November 25, 2022 12:14
Development

Successfully merging this pull request may close these issues.

Investigate how we can reduce the amount of CSS garbage when people paste from word into ckeditor
2 participants