
MDL-49686 atto: Process style and class attributes in sub-functions

To ensure we only clean style and class attributes, we first select the
contents of those attributes and "replace" them using handler functions.
Those functions scan the actual attribute values for the classes or
styles that we want to exclude.

The first-level regex has 3 groups: group1 selects everything in the
tag leading up to the attribute value, group2 holds the attribute value,
and group3 holds the trailing quote mark. We work on group2, then return
the combination of group1, group2, and group3.
merrill-oakland committed Mar 27, 2015
1 parent 0582523 commit 737df5ca1b7e00202c2cb100bcb5237898d68eaa
@@ -863,12 +863,20 @@ EditorClean.prototype = {

 // Run some more rules that care about quotes and whitespace.
 rules = [
-    // Remove MSO-blah, MSO:blah in style attributes. Only removes one or more that appear in succession.
-    {regex: /(<[^>]*?style\s*?=\s*?"[^>"]*?)(?:[\s]*MSO[-:][^>;"]*;?)+/gi, replace: "$1"},
-    // Remove MSO classes in class attributes. Only removes one or more that appear in succession.
-    {regex: /(<[^>]*?class\s*?=\s*?"[^>"]*?)(?:[\s]*MSO[_a-zA-Z0-9\-]*)+/gi, replace: "$1"},
-    // Remove Apple- classes in class attributes. Only removes one or more that appear in succession.
-    {regex: /(<[^>]*?class\s*?=\s*?"[^>"]*?)(?:[\s]*Apple-[_a-zA-Z0-9\-]*)+/gi, replace: "$1"},
+    // Get all style attributes so we can work on them.
+    {regex: /(<[^>]*?style\s*?=\s*?")([^>"]*)(")/gi, replace: function(match, group1, group2, group3) {
+        // Remove MSO-blah, MSO:blah style attributes.
+        group2 = group2.replace(/(?:^|;)[\s]*MSO[-:](?:&[\w]*;|[^;"])*/gi,"");
+        return group1 + group2 + group3;
+    }},
+    // Get all class attributes so we can work on them.
+    {regex: /(<[^>]*?class\s*?=\s*?")([^>"]*)(")/gi, replace: function(match, group1, group2, group3) {
+        // Remove MSO classes.
+        group2 = group2.replace(/(?:^|[\s])[\s]*MSO[_a-zA-Z0-9\-]*/gi,"");
+        // Remove Apple- classes.
+        group2 = group2.replace(/(?:^|[\s])[\s]*Apple-[_a-zA-Z0-9\-]*/gi,"");
+        return group1 + group2 + group3;
+    }},
     // Remove OLE_LINK# anchors that may litter the code.
     {regex: /<a [^>]*?name\s*?=\s*?"OLE_LINK\d*?"[^>]*?>\s*?<\/a>/gi, replace: ""},
     // Remove empty spans, but not ones from Rangy.
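
The two-level approach described in the commit message can be tried outside the editor with a small standalone sketch. The four regexes below are copied from the added lines of this commit. The function names (cleanStyleAttributes, cleanClassAttributes, cleanWordMarkup) and the sample markup are hypothetical and exist only for illustration; the real code applies its rules array inside EditorClean, which is not reproduced here.

```javascript
// Outer regex captures: group1 = tag text up to the opening quote,
// group2 = the attribute value, group3 = the closing quote.
// Only group2 is edited, so the rest of the tag is left untouched.
function cleanStyleAttributes(html) {
    return html.replace(/(<[^>]*?style\s*?=\s*?")([^>"]*)(")/gi, function(match, group1, group2, group3) {
        // Remove MSO-blah, MSO:blah declarations wherever they appear in the value.
        group2 = group2.replace(/(?:^|;)[\s]*MSO[-:](?:&[\w]*;|[^;"])*/gi, "");
        return group1 + group2 + group3;
    });
}

function cleanClassAttributes(html) {
    return html.replace(/(<[^>]*?class\s*?=\s*?")([^>"]*)(")/gi, function(match, group1, group2, group3) {
        // Remove MSO classes.
        group2 = group2.replace(/(?:^|[\s])[\s]*MSO[_a-zA-Z0-9\-]*/gi, "");
        // Remove Apple- classes.
        group2 = group2.replace(/(?:^|[\s])[\s]*Apple-[_a-zA-Z0-9\-]*/gi, "");
        return group1 + group2 + group3;
    });
}

// Hypothetical wrapper: style attributes first, then class attributes,
// matching the order of the rules in the commit.
function cleanWordMarkup(html) {
    return cleanClassAttributes(cleanStyleAttributes(html));
}

// Sample input (made up) where the unwanted entries are NOT adjacent.
var pasted = '<p class="intro MsoNormal body Apple-style-span" ' +
    'style="color:red; mso-margin-top-alt:auto; font-weight:bold; mso-bidi-font-size:10pt">Hi</p>';

console.log(cleanWordMarkup(pasted));
// prints: <p class="intro body" style="color:red; font-weight:bold">Hi</p>
```

Because the inner replacements run with the g flag over the attribute value alone, they remove every matching entry, not just a single run of adjacent ones, which is the limitation noted in the comments on the removed rules.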
