text/xml no longer XML? #790

Open
eriksiegel opened this Issue Mar 21, 2019 · 9 comments


eriksiegel commented Mar 21, 2019

Didn't we decide somewhere that media type text/xml is no longer considered XML but text? Can't find it and the spec still says text/xml is XML.

Does anybody have a better memory than me?

gimsieke commented Mar 21, 2019

I don’t think that we took this decision. I don’t remember that there was any controversy about treating text/xml as XML.

eriksiegel commented Mar 21, 2019

I remember we discussed this because I was specifying the text steps and found it impossible to write @content-types in such a way that it really only accepts text. Specifying content-types="text/*" would include text/xml as well...

gimsieke commented Mar 21, 2019

I don’t think that text/* includes text/xml but maybe we should state this more explicitly.

In the definition list after this anchor, there is:

Here are some examples of content types for matching:

text/*, any kind of text document.

And “text document” explicitly excludes text/xml and text/html.

So the * wildcard apparently does not work exactly as a replacement for “any characters”. This is also seen in the other example,

*/*+xml, any XML content type.

So this does not literally (in a conventional wildcard sense) match, for example, application/xml, but it is still supposed to match, in a content-type sense, any XML media type.

I think we should explicitly define the wildcards for common media types, instead of listing them as mere examples.
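
To make the mismatch concrete, here is a minimal sketch that reads `*` as a conventional glob, using Python's standard `fnmatch` module. The helper name `glob_matches` is made up for illustration, and this is not how any XProc processor is known to match content types; it only shows what a literal reading would do.

```python
from fnmatch import fnmatchcase

def glob_matches(pattern: str, media_type: str) -> bool:
    """Literal glob match: '*' stands for any run of characters."""
    # fnmatchcase takes the name first, then the pattern.
    return fnmatchcase(media_type, pattern)

# Read literally, text/* also accepts text/xml ...
text_xml_hit = glob_matches("text/*", "text/xml")
# ... and */*+xml rejects application/xml, which has no "+xml" suffix,
# while still accepting a type such as image/svg+xml.
app_xml_hit = glob_matches("*/*+xml", "application/xml")
svg_hit = glob_matches("*/*+xml", "image/svg+xml")
```

Under this reading both examples from the spec behave differently from their prose descriptions, which is why spelling out the wildcard semantics seems worthwhile.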

xml-project commented Mar 21, 2019

@eriksiegel I see your point: if you want to say that a particular step needs a text document on the input port, content-types="text/*" cannot be used, because it includes text/xml. And there is no way to exclude "text/xml" and "text/html", which are XML documents.
Does it make sense to extend our syntax like this: "text/* ~text/xml ~text/html", where "~" reads as "but not"?

Conal-Tuohy commented Mar 25, 2019

Another option would be to define an additional document property to record the document type, and not be forced to rely on the IANA media types. e.g. there could be a type document property with values from text, json, xml, binary (or whatever the types are).
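
A rough sketch of how such a property could be derived from a media type is below. The function name `document_type` and the classification rules are assumptions for illustration only; the comment above does not fix the set of values, and the separate "html" value in particular is not one of those it lists.

```python
def document_type(media_type: str) -> str:
    """Map a media type to a coarse document type (hypothetical rules)."""
    # Drop parameters such as "; charset=utf-8" before classifying.
    base = media_type.split(";", 1)[0].strip().lower()
    if base in ("text/xml", "application/xml") or base.endswith("+xml"):
        return "xml"
    if base == "text/html":
        return "html"  # assumption: an extra value beyond those listed above
    if base == "application/json" or base.endswith("+json"):
        return "json"
    if base.startswith("text/"):
        return "text"
    return "binary"
```

A step could then declare that it wants documents whose type is "text" without having to carve exceptions out of text/*.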

ndw commented Apr 4, 2019

Two thoughts at the 4 April editorial call:

  1. Add a notion of negating mime types, "text/* -text/xml"
  2. Add a notion of shortcuts, a bare word without a "/" is a shortcut for a longer definition.

So, "text" could be a shortcut for "text/* -text/xml -text/html" for example.
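
The two ideas can be sketched together as follows. This is only an illustration under stated assumptions: the names `SHORTCUTS`, `expand`, and `accepts` are hypothetical, wildcards are matched as plain globs, and a negated pattern is assumed to win regardless of where it appears in the list. None of that is settled by the comment above.

```python
from fnmatch import fnmatchcase

# Hypothetical shortcut table; only the "text" entry comes from the comment above.
SHORTCUTS = {"text": "text/* -text/xml -text/html"}

def expand(spec: str) -> list[str]:
    """Replace bare words (no "/") that are known shortcuts by their definition."""
    tokens = []
    for token in spec.split():
        if "/" not in token and token in SHORTCUTS:
            tokens.extend(SHORTCUTS[token].split())
        else:
            tokens.append(token)
    return tokens

def accepts(spec: str, media_type: str) -> bool:
    """True if media_type matches an included pattern and no negated one."""
    tokens = expand(spec)
    includes = [t for t in tokens if not t.startswith("-")]
    excludes = [t[1:] for t in tokens if t.startswith("-")]
    # Assumption: any matching negation rejects the type, whatever its position.
    if any(fnmatchcase(media_type, p) for p in excludes):
        return False
    return any(fnmatchcase(media_type, p) for p in includes)
```

With this, `accepts("text", "text/plain")` holds while `accepts("text", "text/xml")` does not, which is the behaviour the text steps need.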

ndw commented Apr 4, 2019

Norm also proposed that the processor should change text/xml to application/xml on load and text/html to application/html+xml.

ndw commented Apr 18, 2019

Per 18 Apr editorial call, the two comments above are the accepted proposal.

Conal-Tuohy commented Apr 19, 2019

Can I assume the media type you meant in your comment above, @ndw, was application/xhtml+xml?
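
For reference, the on-load rewrite could look roughly like the sketch below. The function name `normalize_on_load` and the `REWRITES` table are hypothetical. The rewrite table is passed in as an argument because the target for text/html is exactly what this question leaves open; the entry shown for it is the value asked about here, not a confirmed decision.

```python
def normalize_on_load(content_type: str, rewrites: dict[str, str]) -> str:
    """Rewrite the base media type, keeping any parameters such as charset."""
    base, sep, params = content_type.partition(";")
    key = base.strip().lower()
    # Types without an entry in the table are passed through unchanged.
    return rewrites.get(key, base.strip()) + sep + params

# The text/xml entry is as stated in the thread; the text/html target is the
# one asked about above and is unconfirmed (assumption).
REWRITES = {
    "text/xml": "application/xml",
    "text/html": "application/xhtml+xml",
}
```

After such a rewrite, a literal text/* pattern would no longer see text/xml documents at all, which complements the negation syntax rather than replacing it.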
