Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New WebIDL syntax with explicit name/namespace #181

Closed
evilpie opened this issue Nov 16, 2022 · 22 comments · Fixed by #208
Closed

New WebIDL syntax with explicit name/namespace #181

evilpie opened this issue Nov 16, 2022 · 22 comments · Fixed by #208
Assignees
Milestone

Comments

@evilpie
Copy link

evilpie commented Nov 16, 2022

We have been working on a new syntax that doesn't involve any micro syntax for namespaces, but instead either supports just a string (with an implicit default namespace) or an object/dictionary with an explicit name and namespace.

dictionary ElementWithNamespace {
  required DOMString name;
  required DOMString _namespace;
}
typedef (DOMString or ElementWithNamespace) SanitizerElement;

enum Star {
  "*"
};
typedef (Star or sequence<SanitizerElement>) ElementsOrStar;

dictionary SanitizerAttribute {
  required DOMString name; // We aren't sure how to specify |any| attribute yet.
// maybe: optional bool defaultAttribute;
  DOMString? _namespace = null; // specifying null/undefined means "use the null namespace" 
  required ElementsOrStar elements;
}

dictionary SanitizerConfig {
  sequence<SanitizerElement> allowElements;
  sequence<SanitizerElement> blockElements;
  sequence<SanitizerElement> dropElements;
  sequence<SanitizerAttribute> allowAttributes;
  sequence<SanitizerAttribute> dropAttributes;
  // ....
}

1. Use case: Elements with no namespace (HTML)

{
  allowElements: [ "div", "em", "span", "i"],
}

2. Use case: Mixing elements from different namespaces

{
  allowElements: [ "div", "span", { name: "text", namespace:  "http://www.w3.org/2000/svg" }],
}

3. Use case: Attribute + elements

{
  allowAttributes: [{name: "title", elements: ["i", "b", "s", "u"]}]
}

4. Use case: Allow some attribute on any element (e.g. global attributes)

{
  allowAttributes: [{name: "title", elements: "*"}]
}

5. Use case: Allowing all attributes on a specific element

For this we aren't sure about the syntax.

{
  allowAttributes: [{name: "*", elements: ["span"]}],
  // or
  allowAttributes: [{defaultAttributes: true, elements: ["blah"]}) 
}

6. Use case: Non-null attribute namespace

{
  allowAttributes: [{name: "href", namespace: "http://www.w3.org/1999/xlink", 
     elements: [{ name: "text", namespace: "http://www.w3.org/2000/svg" }]
  }]
}
@mozfreddyb
Copy link
Collaborator

CC @annevk

@domenic
Copy link

domenic commented Nov 16, 2022

Seems good, but consider using the canonical property names localName and namespaceURI instead of inventing new ones.

@annevk
Copy link
Collaborator

annevk commented Nov 16, 2022

I think this is okay given that for MutationRecord we use attributeName and attributeNamespace. I'd even go so far as saying I'd favor sticking to "name" and "namespace" for new APIs, making it clearer that's what element and attribute identity is.


I don't see ElementName defined.

What's the reason for keeping attributes and elements separate?

@evilpie
Copy link
Author

evilpie commented Nov 16, 2022

I don't see ElementName defined.

ElementName was supposed to be SanitizerElement. Frankly all the names are just something I quickly came up with, we should definitely spend some more time thinking about them before putting them in the spec.

@mozfreddyb
Copy link
Collaborator

What's the reason for keeping attributes and elements separate?

A couple of reasons. We think it's making a bit more ergonomic for a couple of use cases:

  • specifying fewer allowed elements but fall back on the default list of allowed attributes
  • allowing an attribute for all (or multiple) elements at the same time.

If elements+attributes were combined in a joint structure, attributes would always have to be copied or specified twice.

@otherdaniel
Copy link
Collaborator

I like the new proposal. I'll try to spec it out next week, and maybe start with a trial implementation so I can try it out.

@petervanderbeken
Copy link

dictionary ElementWithNamespace {
  required DOMString name;
  required DOMString namespace;

I think this would need to be _namespace (namespace is a terminal symbol), but I also think following Node/Element and using localName and namespaceURI would make more sense.

I would also make the namespace URI nullable. It'd be nice to make it optional, with a default that's the same as what we would use if you're using the DOMString member of SanitizerElement, although I guess that's hard because it would depend on the type of the document?

required DOMString name; // We aren't sure how to specify |any| attribute yet.
// maybe: optional bool defaultAttribute;

I think using Star here would make sense. I find having both name and defaultAttribute (allAttributes?) a bit weird, you'd have to add prose about the interaction between them.

chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Nov 24, 2022
Remove SanitizerAPINamespacesForTesting flag and support code, to
prepare for a new and different namespace-d content in Sanitizer
proposal.

Ref: WICG/sanitizer-api#181
Bug: 1225606
Change-Id: I13ad92898b28d56224c7a227b21390e927e86a17
@mozfreddyb
Copy link
Collaborator

required DOMString name; // We aren't sure how to specify |any| attribute yet.
// maybe: optional bool defaultAttribute;

I think using Star here would make sense. I find having both name and defaultAttribute (allAttributes?) a bit weird, you'd have to add prose about the interaction between them.

This is indeed a thing where we were a bit unsure. If I remember correctly, @otherdaniel also preferred a variant of name: '*' and we're not totally against that.
I had a personal preference for a named switch, in order to make it more explicit and not raise questions like Does * fall back to elsewhere-allowed stuff? Does it directly fall back to the specified default attributes?

That being said, we're open to feedback from folks with more experience in designing & shipping new WebAPIs. We do not intend to go against the grain if there are existing patterns and are keen to learn more.

@petervanderbeken what's your take on explicit/implicit here?
@annevk any opinions about wildcards / explicit flags?

@otherdaniel
Copy link
Collaborator

I toyed around a bit with this and think this is a workable proposal. One item I stumbled on is that error handling becomes a bit weird. E.g. if the name in the name/namespace dictionary is required you get weird error handling: Putting some rubbish into the list of elements usually passes (by ignoring said rubbish); but if you put in a dictionary without "name" key in it, then WebIDL can't make sense of it and the Sanitizer construction fails with an exception. That struck me as rather unexpected.

If possible, I'd like some feedback on "web-by" error handling, too. So far, we just ignore un-parseable config items, meaning that in the worst case our safe-by-default defaults will apply. The idea is to not block off future extensions to the config. Should we instead throw on any unknown config items? Or does it depend on the type of item?

I would also make the namespace URI nullable. It'd be nice to make it optional, with a default that's the same as what we would use if you're using the DOMString member of SanitizerElement, although I guess that's hard because it would depend on the type of the document?

I'm in favour. It wouldn't depend on the document type IMHO, but on whether it's an element or attribute. HTML namespace should be the default for elements; the empty namespace for attributes.

This is indeed a thing where we were a bit unsure. If I remember correctly, @otherdaniel also preferred a variant of name: '*' and we're not totally against that.

Yes, I think name: '*' (or to just consider an absent name key to match any name) would be a bit more author friendly. But arguably it's not a big difference.

@evilpie
Copy link
Author

evilpie commented Dec 2, 2022

I like Peter's idea of using a DOMString or dictionary with name+namespace for the attribute name. However the duplication of name does look a bit strange to me: {allowAttributes: [{name: {name: "id", namespace: null}, elements: "*"}]}

chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Dec 2, 2022
Remove SanitizerAPINamespacesForTesting flag and support code, to
prepare for a new and different namespace-d content in Sanitizer
proposal.

Ref: WICG/sanitizer-api#181
Bug: 1225606
Change-Id: I13ad92898b28d56224c7a227b21390e927e86a17
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4051041
Reviewed-by: Yifan Luo <lyf@chromium.org>
Reviewed-by: Koji Ishii <kojii@chromium.org>
Reviewed-by: Anders Hartvoll Ruud <andruud@chromium.org>
Commit-Queue: Daniel Vogelheim <vogelheim@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1078534}
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Dec 2, 2022
Remove SanitizerAPINamespacesForTesting flag and support code, to
prepare for a new and different namespace-d content in Sanitizer
proposal.

Ref: WICG/sanitizer-api#181
Bug: 1225606
Change-Id: I13ad92898b28d56224c7a227b21390e927e86a17
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4051041
Reviewed-by: Yifan Luo <lyf@chromium.org>
Reviewed-by: Koji Ishii <kojii@chromium.org>
Reviewed-by: Anders Hartvoll Ruud <andruud@chromium.org>
Commit-Queue: Daniel Vogelheim <vogelheim@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1078534}
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Dec 11, 2022
…pacesForTesting flag., a=testonly

Automatic update from web-platform-tests
[Sanitizer API] Remove SanitizerAPINamespacesForTesting flag.

Remove SanitizerAPINamespacesForTesting flag and support code, to
prepare for a new and different namespace-d content in Sanitizer
proposal.

Ref: WICG/sanitizer-api#181
Bug: 1225606
Change-Id: I13ad92898b28d56224c7a227b21390e927e86a17
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4051041
Reviewed-by: Yifan Luo <lyf@chromium.org>
Reviewed-by: Koji Ishii <kojii@chromium.org>
Reviewed-by: Anders Hartvoll Ruud <andruud@chromium.org>
Commit-Queue: Daniel Vogelheim <vogelheim@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1078534}

--

wpt-commits: ad194932af87b0503d32282520fcd10c8cc22243
wpt-pr: 37139
BruceDai pushed a commit to BruceDai/wpt that referenced this issue Dec 13, 2022
Remove SanitizerAPINamespacesForTesting flag and support code, to
prepare for a new and different namespace-d content in Sanitizer
proposal.

Ref: WICG/sanitizer-api#181
Bug: 1225606
Change-Id: I13ad92898b28d56224c7a227b21390e927e86a17
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4051041
Reviewed-by: Yifan Luo <lyf@chromium.org>
Reviewed-by: Koji Ishii <kojii@chromium.org>
Reviewed-by: Anders Hartvoll Ruud <andruud@chromium.org>
Commit-Queue: Daniel Vogelheim <vogelheim@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1078534}
@otherdaniel
Copy link
Collaborator

One addition to consider: To support the use case of "grab bag of elements and attributes", it would be nice to specify attributes as a simple list, just like the elements. E.g.:

  allowElements: ["div", "span"],
  allowAttributes: ["class", "style"]

This largely aligns with the existing proposal, but requires defining SanitizerAttribute as (DOMString or ...) , just like SanitizerElement already is. (I wouldn't mind making elements entirely optional so that this would also hold for namespaced attributes. But given how few of those exist that doesn't make much of a difference.)

jamienicol pushed a commit to jamienicol/gecko that referenced this issue Dec 14, 2022
…pacesForTesting flag., a=testonly

Automatic update from web-platform-tests
[Sanitizer API] Remove SanitizerAPINamespacesForTesting flag.

Remove SanitizerAPINamespacesForTesting flag and support code, to
prepare for a new and different namespace-d content in Sanitizer
proposal.

Ref: WICG/sanitizer-api#181
Bug: 1225606
Change-Id: I13ad92898b28d56224c7a227b21390e927e86a17
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4051041
Reviewed-by: Yifan Luo <lyf@chromium.org>
Reviewed-by: Koji Ishii <kojii@chromium.org>
Reviewed-by: Anders Hartvoll Ruud <andruud@chromium.org>
Commit-Queue: Daniel Vogelheim <vogelheim@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1078534}

--

wpt-commits: ad194932af87b0503d32282520fcd10c8cc22243
wpt-pr: 37139
@annevk
Copy link
Collaborator

annevk commented Dec 20, 2022

{allowAttributes: [{name: {name: "id", namespace: null}, elements: "*"}]}

Why not flatten this out as {allowAttributes: [{name: "id", namespace: null, elements: "*"}]}?

@evilpie
Copy link
Author

evilpie commented Jan 6, 2023

Why not flatten this out as {allowAttributes: [{name: "id", namespace: null, elements: "*"}]}?

Good question. That was actually in my original proposal. I thought that Peter was suggesting we remove the flattening and also use either a DOMString or an object with name + namespace like for elements, but maybe I am just imaging that now. I am open to both solutions.

Aside: My personal opinion is that as long as we can use just a string for what I suspect is the most common use case, the explicit syntax with namespaces doesn't matter much to me. At that point I would even accept name: { localName: "id", namespaceURI: null }.

@evilpie
Copy link
Author

evilpie commented Jan 25, 2023

We just had a sanitizer meeting and I think we all comfortable with using syntax proposed in the initial #181 (comment).
That means:

  • We use the keys name and namespace.
  • For (allow|block)Elements it can either be a string or an object with name/namespace.
  • For (allow|block)Attributes we always use an object because we need the elements key, so we just put name and namespace at the same level. (aka the "flattened syntax") namespace is optional and defaults to null, meaning the null namespace.

@annevk
Copy link
Collaborator

annevk commented Jan 25, 2023

(If elements defaults to "*" you could allow allowAttributes: ["id"] to globally allow the id attribute, but maybe that's best after some analysis of what people actually do in the wild.)

@mozfreddyb
Copy link
Collaborator

We discussed this again in the call and agree on this syntax.
As a side note, @annevk mentioned we can future-proof the syntax for future improvements by using element-name validation that already exists in createElement

@benbucksch
Copy link

benbucksch commented Sep 20, 2023

@annevk wrote:

If elements defaults to "*", you could allow allowAttributes: ["id"] to globally allow the id attribute

+1

@evilpie
Copy link
Author

evilpie commented Oct 13, 2023

Updated WebIDL based on what was discussed in #199. This doesn't quite match #196, because that seems to contain the star "*" syntax again, which AFAIK didn't really go into.

dictionary SanitizerElementNamespace {
  required DOMString name;
  DOMString? _namespace = "http://www.w3.org/1999/xhtml";
};

// Used by "elements"
dictionary SanitizerElementNamespaceWithAttributes : SanitizerElementNamespace {
  sequence<SanitizerAttribute> attributes;
  sequence<SanitizerAttribute> removeAttributes;
};

typedef (DOMString or SanitizerElementNamespace) SanitizerElement;
typedef (DOMString or SanitizerElementNamespaceWithAttributes) SanitizerElementWithAttributes;

dictionary SanitizerAttributeNamespace {
  required DOMString name;
  DOMString? _namespace = null;
};
typedef (DOMString or SanitizerAttributeNamespace) SanitizerAttribute;

dictionary SanitizerConfig {
  sequence<SanitizerElementWithAttributes> elements;
  sequence<SanitizerElement> removeElements;
  sequence<SanitizerElement> replaceWithChildrenElements;

  sequence<SanitizerAttribute> attributes;
  sequence<SanitizerAttribute> removeAttributes;

  boolean customElements;
  boolean unknownMarkup; // Name TBD!
  boolean comments;
};

Edit: 16 October. Removed Sanitizer SanitizerAttributeNamespaceWithElements. Removed allow prefixes. DOMString? element namespace.
15 November: #200 resolved, s/flattenElements/replaceWithChildrenElements/.

@annevk
Copy link
Collaborator

annevk commented Oct 14, 2023

Thank you @evilpie for putting that together!

  1. namespace should probably always be DOMString? even though that is silly, for compatibility with namespaceURI.
  2. I thought we had decided in the meeting that SanitizerAttributeNamespaceWithElements wasn't needed for the MVP as it was just syntactic sugar.
  3. I think the remaining allow* members should drop the allow prefix.
  4. Not sure we have precedent for "markup" in APIs. unknownNodes or unknownElementsAndAttributes might be cleaner?

@evilpie
Copy link
Author

evilpie commented Oct 16, 2023

  1. namespace should probably always be DOMString? even though that is silly, for compatibility with namespaceURI.

So what would happen for null? Is it just treated as the HTML namespace, so the same as undefined? AFAIK elements always have a namespaceURI.

  1. I thought we had decided in the meeting that SanitizerAttributeNamespaceWithElements wasn't needed for the MVP as it was just syntactic sugar.

I forget about that... I just looked at our notes and we did decide that! I've updated the IDL.

  1. I think the remaining allow* members should drop the allow prefix.

Makes sense to me. I just kept that.

  1. Not sure we have precedent for "markup" in APIs. unknownNodes or unknownElementsAndAttributes might be cleaner?

unknownElementsAndAttributes is quite the mouthful. Maybe we could quickly discuss that during our next Sanitizer meeting.

@evilpie
Copy link
Author

evilpie commented Oct 16, 2023

Anne pointed out that it's apparently possible to create elements with a null namespace in XML or explicitly via the API document.createElementNS. We should probably allow it for future compatibility.

@annevk
Copy link
Collaborator

annevk commented Oct 16, 2023

unknown* also depends on #188.

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Nov 29, 2023
…io,freddyb

The new proposed WebIDL is at WICG/sanitizer-api#181 (comment).
Sadly we don't have a real spec for this yet. Luckily this is mostly
just a renaming. The biggest change is that every <element> in elements: now has a list of allowed and removed attributes.

Differential Revision: https://phabricator.services.mozilla.com/D193642
jamienicol pushed a commit to jamienicol/gecko that referenced this issue Nov 30, 2023
…io,freddyb

The new proposed WebIDL is at WICG/sanitizer-api#181 (comment).
Sadly we don't have a real spec for this yet. Luckily this is mostly
just a renaming. The biggest change is that every <element> in elements: now has a list of allowed and removed attributes.

Differential Revision: https://phabricator.services.mozilla.com/D193642
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this issue Nov 30, 2023
…io,freddyb

The new proposed WebIDL is at WICG/sanitizer-api#181 (comment).
Sadly we don't have a real spec for this yet. Luckily this is mostly
just a renaming. The biggest change is that every <element> in elements: now has a list of allowed and removed attributes.

Differential Revision: https://phabricator.services.mozilla.com/D193642

UltraBlame original commit: 8c7aa601f08a38022d337a34788b7e75e5ca4aed
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this issue Nov 30, 2023
…io,freddyb

The new proposed WebIDL is at WICG/sanitizer-api#181 (comment).
Sadly we don't have a real spec for this yet. Luckily this is mostly
just a renaming. The biggest change is that every <element> in elements: now has a list of allowed and removed attributes.

Differential Revision: https://phabricator.services.mozilla.com/D193642

UltraBlame original commit: 8c7aa601f08a38022d337a34788b7e75e5ca4aed
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this issue Nov 30, 2023
…io,freddyb

The new proposed WebIDL is at WICG/sanitizer-api#181 (comment).
Sadly we don't have a real spec for this yet. Luckily this is mostly
just a renaming. The biggest change is that every <element> in elements: now has a list of allowed and removed attributes.

Differential Revision: https://phabricator.services.mozilla.com/D193642

UltraBlame original commit: 8c7aa601f08a38022d337a34788b7e75e5ca4aed
@mozfreddyb mozfreddyb added this to the v1 milestone Jan 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants