URL objects #26

Closed
annevk opened this issue Jul 16, 2015 · 106 comments

@annevk
Member

annevk commented Jul 16, 2015

Please don't use URL objects in data structures: https://url.spec.whatwg.org/#url-apis-elsewhere

(I would also suggest not referencing all of these out-of-date forks of WHATWG Standards. That doesn't seem helpful to readers.)

@zolkis
Contributor

zolkis commented Aug 11, 2015

The problem we have in the NFC spec is that we expose incoming data types as

typedef (DOMString or URL or Blob or JSON) NFCData;

Now if URL becomes USVString, then in JS it will be a normal string, indistinguishable from the text type. That means if we get rid of URL objects, we need to carry the type of the original information as a property (like type: "url", value: "www.mydomain.com" vs type: "text", value: "www.mydomain.com"), which is not very web'ish either, and then this will roll on and change the whole API. <scratching head>
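
To make the concern concrete, here is a minimal sketch. The `{ type, value }` record shape is the hypothetical one from the comment above, not an agreed API, and `describe` is a helper invented for the example. It shows that a URL delivered as a plain string cannot be told apart from text, so the kind has to travel alongside the value:

```javascript
// Hypothetical payloads as a page might receive them if URLs were plain strings.
const asUrl = "www.mydomain.com";
const asText = "www.mydomain.com";

// Both are primitive strings; nothing in the value says which one was a URL.
const indistinguishable = typeof asUrl === typeof asText && asUrl === asText;

// Carrying the kind explicitly (illustrative shape only) restores the distinction.
const urlRecord = { type: "url", value: "www.mydomain.com" };
const textRecord = { type: "text", value: "www.mydomain.com" };

function describe(record) {
  return record.type === "url"
    ? `link to ${record.value}`
    : `text "${record.value}"`;
}
```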

@annevk
Member Author

annevk commented Aug 11, 2015

That looks like a problematic structure either way. You cannot distinguish JSON from a string.

@zolkis
Contributor

zolkis commented Aug 11, 2015

In the current implementation, JSON is exposed as Object (property bag) to clients, URL as a URL object, Blob as Object. When a client gets an Object, it can use instanceof to tell whether it was a Blob; if it was not a Blob, then it was JSON. This works now, but I don't know which evil is worse: using a URL object, or carrying the type along in a property. Any best practice on this nowadays?
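
The dispatch described here could look roughly like the following sketch. `classifyNFCData` is a hypothetical helper, not part of any spec, and it assumes an environment where `URL` and `Blob` are globals (browsers, recent Node.js):

```javascript
// Sketch of instanceof-based dispatch over the NFCData union.
function classifyNFCData(data) {
  if (typeof data === "string") return "text";
  if (data instanceof URL) return "url";
  if (data instanceof Blob) return "blob";
  // Any other object is assumed to be a parsed JSON property bag.
  if (data !== null && typeof data === "object") return "json";
  throw new TypeError("unsupported NFC data");
}
```

Note that a JSON payload that parses to a string would land in the "text" branch, which is the ambiguity raised earlier in the thread.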

@annevk
Member Author

annevk commented Aug 11, 2015

What's the underlying representation you're trying to model?

@kenchris
Contributor

NFC allows you to store different kinds of data. For write, we could just have multiple functions, but I am not sure what would be the best model for read. Push API seems to have http://www.w3.org/TR/push-api/#idl-def-PushMessageData which has different getters.

@annevk
Member Author

annevk commented Aug 11, 2015

Right, what Push does is based on Fetch. Still, what types of data do we have and how are they exposed underneath?

@kenchris
Contributor

It can store values like URL (slightly "compressed" by NFC, like the new Eddystone URLs from the Physical Web), string, and binary data (with a MIME type). We also use a special record for storing JSON.

http://w3c.github.io/web-nfc/#web-nfc-payload

@annevk
Member Author

annevk commented Aug 11, 2015

It seems push does not have metadata to include the type. For Request and Response objects in Fetch that is expected to be delivered through the Content-Type header and necessary for the correct extraction of certain types. If you want something analogous I would expect:

interface NFCData {
  readonly attribute NFCDataType type;
  ArrayBuffer arrayBuffer();
  ...
  USVString url();
}
enum NFCDataType { "url", "string", "opaque" };
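
As a rough illustration of how such an interface might behave from script, here is an in-memory stand-in. The class name, the constructor signature, and the choice to throw a `TypeError` from `url()` are all assumptions made for this sketch; only the `type`, `arrayBuffer()`, and `url()` members come from the IDL above:

```javascript
// Illustrative stand-in for the NFCData interface sketched above.
class NFCDataSketch {
  #bytes;
  constructor(type, bytes) {
    this.type = type;    // "url", "string", or "opaque"
    this.#bytes = bytes; // Uint8Array holding the raw payload
  }
  arrayBuffer() {
    // Return a copy so callers cannot mutate the stored payload.
    return this.#bytes.slice().buffer;
  }
  url() {
    if (this.type !== "url") throw new TypeError("payload is not a URL");
    return new TextDecoder().decode(this.#bytes);
  }
}

const data = new NFCDataSketch(
  "url",
  new TextEncoder().encode("https://example.com/")
);
```

Every record can be read as raw bytes, while `url()` is only meaningful when `type` says so.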

@domenic
Contributor

domenic commented Aug 11, 2015

+1 to @annevk's design. Alternately, a design like { type: "url", data: ... } { type: "text", data: ... } seems fine. Anne's design is better to communicate that NFC's metadata is not relevant and you want the ability to deserialize in multiple ways as appropriate for the use case. The { type, data } seems a bit better to give a more direct mapping to NFC (although in that case just exposing TNF and MIME type would be more appropriate).

@kenchris
Contributor

What @annevk suggests is pretty aligned with what I was suggesting originally :-) so I am on board with that. @zolkis what is your take?

@domenic
Contributor

domenic commented Aug 11, 2015

Also I think the proper name for Blob-with-mimetype is File?

@annevk
Member Author

annevk commented Aug 11, 2015

No, Blob-with-filename is File. Blob always has a MIME type (though it may be unknown).
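
A quick way to see the Blob half of this, assuming an environment with a global `Blob` (the variable names are only for illustration):

```javascript
// A Blob's type is always a string; it is empty when no type was given.
const untyped = new Blob(["payload"]);
const typed = new Blob(["payload"], { type: "image/png" });

const untypedType = untyped.type; // empty string: type unknown
const typedType = typed.type;     // the MIME type passed to the constructor
```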

@domenic
Contributor

domenic commented Aug 11, 2015

Right, sorry, I was doing so well for early-morning and then I had to go and blow it :P. I'll go get breakfast and coffee now...

@zolkis
Contributor

zolkis commented Aug 11, 2015

@domenic

The { type, data } seems a bit better to give a more direct mapping to NFC (although in that case just exposing TNF and MIME type would be more appropriate).

There is some overlap in NFC regarding TNF and MIME types vs mappable JS types, so for Web NFC we have already done a slight mapping here. Also, we didn't mean to expose the low level NFC details in this spec - for that we still have the older WG spec which follows low level details very well. This spec is more like "rethinking NFC usage for the Web", and the argument would favor @annevk's design (indeed we have iterated through a similar design with @kenchris earlier). I'd rather start with

enum NFCDataType { "url", "string", "json", "blob" };
interface NFCData {
    readonly attribute NFCDataType type;
    Blob blob();
    JSON json();  // same as 'Object json();'
    USVString url();
    DOMString text();
    // ArrayBuffer arrayBuffer(); // skip this if we don't expose the low level content
}

since in that case we have some indication how the content should be handled by the client. Functions may throw type exceptions. But I might miss the details behind the "opaque" type idea.
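
One way to picture "functions may throw type exceptions" is the sketch below. `makeNFCData` and its string-backed storage are hypothetical and only model the behaviour; the member names and type values follow the IDL above:

```javascript
// Hypothetical in-memory model: each accessor throws a TypeError when it
// does not match the record's declared type.
function makeNFCData(type, text) {
  const guard = (expected, read) => () => {
    if (type !== expected) {
      throw new TypeError(`cannot read a "${type}" record as "${expected}"`);
    }
    return read();
  };
  return {
    type,
    url: guard("url", () => text),
    text: guard("string", () => text),
    json: guard("json", () => JSON.parse(text)),
    blob: guard("blob", () => new Blob([text])),
  };
}

const record = makeNFCData("json", '{"door":"open"}');
```

A client would branch on `record.type` first and call only the matching accessor.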

Thank you all for helping in this.

@domenic
Contributor

domenic commented Aug 11, 2015

That design sounds pretty good. Although I would include arrayBuffer() if you are including blob(). Otherwise you are just telling users that you hate them and want to subject them to the pain that is FileReader :P.

@zolkis
Contributor

zolkis commented Aug 11, 2015

Indeed, even the examples used FileReader :). If we skip the idea of exposing low level content, then we can dedicate arrayBuffer() to Blob(). Or, if we ever wanted to expose that, we could choose a special name for that purpose.

@kenchris
Contributor

So I guess @annevk was suggesting using "opaque", which could then be read with blob() or arrayBuffer() ?

@annevk
Member Author

annevk commented Aug 11, 2015

Yeah, that was my idea. I'm not sure I'd expose Blob if you can do ArrayBuffer. Doesn't make much sense to offer the asynchronous variant if you offer the bytes synchronously already. If it's about the type that a Blob can expose you really ought to expose that somehow. Does NFC carry a MIME type?
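
The synchronous/asynchronous difference can be sketched as follows. The payload and function names are illustrative, and `Blob.prototype.arrayBuffer` is assumed to be available as it is in current engines (at the time of this thread, FileReader was the usual route):

```javascript
// "hello" in UTF-8, standing in for bytes read from a tag.
const buffer = Uint8Array.from([104, 101, 108, 108, 111]).buffer;

// Synchronous path: an ArrayBuffer can be decoded immediately.
function readSync(buf) {
  return new TextDecoder().decode(buf);
}

// Asynchronous path: reading a Blob's bytes always goes through a promise.
async function readViaBlob(blob) {
  return new TextDecoder().decode(await blob.arrayBuffer());
}

const syncResult = readSync(buffer);
const pending = readViaBlob(new Blob([buffer])); // a Promise, not a string
```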

@kenchris
Contributor

Yes, it does for binary data

@annevk
Member Author

annevk commented Aug 11, 2015

So was your idea to expose that through Blob? It seems exposing it directly on NFCData would be simpler?

@kenchris
Contributor

Yeah that is an option. Just 'DOMString mimeType()' ?

@annevk
Member Author

annevk commented Aug 11, 2015

What is the MIME type if the underlying data is a URL or a string? Just blank? It seems you could just offer type and either it's the blessed "url" or "string", and a full blown MIME type otherwise. Also, is the underlying MIME type restricted to 0x00 to 0x7F? In that case you want ByteString.
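
Whether a given type string stays within the 0x00 to 0x7F range is easy to check from script. The helper below is hypothetical and only illustrates the range test, not any Web IDL conversion:

```javascript
// Hypothetical check: true when every code unit is in 0x00 to 0x7F.
function isSevenBit(value) {
  for (let i = 0; i < value.length; i++) {
    if (value.charCodeAt(i) > 0x7f) return false;
  }
  return true;
}
```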

@kenchris
Contributor

Are there other existing examples of APIs like this in the web platform?

@annevk
Member Author

annevk commented Aug 11, 2015

DataTransfer has something similar with magic "text" (better than string actually) and "url" values that map to text/plain and text/uri-list.
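
The DataTransfer behaviour referred to here amounts to a small normalization step, sketched below. `normalizeFormat` is a name chosen for this example, and `toLowerCase()` is a simplification of the ASCII lowercasing the HTML Standard describes:

```javascript
// The legacy names "text" and "url" map to MIME types; everything else
// is passed through in lowercase.
function normalizeFormat(format) {
  const lower = format.toLowerCase();
  if (lower === "text") return "text/plain";
  if (lower === "url") return "text/uri-list";
  return lower;
}
```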

@annevk
Member Author

annevk commented Aug 11, 2015

You could either do that or keep them distinct, with a type property getter and a mimeType property getter. If the former returns "opaque", the latter will not be the empty string. Kind of depends on how you expect developers will want to branch their code. Seemed reasonable to me that you'd just have:

if(data.type === "url") {
  ...
} else if(data.type === "text") {
  ...
} else if(data.type === "image/png") {
  ...
} else {
  reportError()
}

@zolkis
Contributor

zolkis commented Aug 11, 2015

Since JSON, text and URL could be mapped to MIME types (application/json, text/plain, and text/url) indeed we could go with

interface NFCData {
    readonly attribute DOMString mimeType;
    ArrayBuffer arrayBuffer();
    JSON json();  // same as 'Object json();'
    USVString url();
    DOMString text();
}

@kenchris
Contributor

JSON and text can be mapped to different mime types, I like what @annevk suggested more

@annevk
Member Author

annevk commented Aug 11, 2015

Actually, JSON needs to be any and it seems I did that wrong in Fetch. true or 1 is valid JSON.

@zolkis
Contributor

zolkis commented Aug 11, 2015

JSON and text can be mapped to different mime types

But it's a bit awkward to have 2 type specifiers, like "type" and "mimeType", and I think we could manage with one - this still permits having different MIME types for text and JSON.

@kenchris
Contributor

What do you mean with valid mime type; are we actually going to check the type against a list or just making sure it consists of a '/' pattern?

@annevk
Member Author

annevk commented Aug 12, 2015

I think you need to write out the mapping algorithms and JavaScript since I see some conflicts around JSON and text and such.

Also, why is it okay to lose precision for TNF 1, 3, and 4?

@zolkis
Contributor

zolkis commented Aug 12, 2015

I think precision is needed when the client has to do the conversion. For the use cases of Web NFC the rest seem to be fine - it is one of the purposes of the API to map NFC to types known in the browsers. But I have no problems exposing the exact underlying information, when available. Could you give one example for conflict for JSON and text, respectively?

@zolkis
Contributor

zolkis commented Aug 12, 2015

@kenchris

I guess you want application/*+json as well

Yes, indeed.

What do you mean with valid mime type; are we actually going to check the type against a list or just making sure it consists of a '/' pattern?

Implementations will have to do sanity checks on MIME type data: bounds, structure, etc for security reasons. I didn't mean validating against a full list of MIME types - those then would need to be listed in the spec or referenced externally. What is a good compromise there?
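
A structural sanity check of the kind described, without validating against a list of registered types, might look like this sketch. The function name and the 255-character bound are assumptions made for illustration, and parameters such as `; charset=utf-8` are deliberately not handled:

```javascript
// Hypothetical structural check: "type/subtype" built from HTTP token
// characters, with an arbitrary upper bound on length.
const TOKEN = "[!#$%&'*+\\-.^_`|~0-9A-Za-z]+";
const MIME_SHAPE = new RegExp(`^${TOKEN}/${TOKEN}$`);

function looksLikeMimeType(value) {
  return (
    typeof value === "string" &&
    value.length <= 255 &&
    MIME_SHAPE.test(value)
  );
}
```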

@annevk
Member Author

annevk commented Aug 12, 2015

Well, text is a string and JSON can be too. As for your other point, it's not clear to me why we don't have "smart-poster-url" and "url" for instance. Or why we would hide everything TNF=4.

@domenic
Contributor

domenic commented Aug 12, 2015

I am just afraid that regular developers don't really know the mime type of, say URL. Should we use text/uri-list ? or text/x-uri ? or ...

I am not sure of your mental model here. For reading, it will be perfectly obvious to developers that { type: "text/uri-list", data: "https://example.com" } is a URL. (Although the crappy "URI" name is unfortunate.) For writing, I doubt developers are going to guess at the type they should use---they'll almost certainly look at docs or the spec anyway. So I don't think this is an issue.

If we want to proceed with including simplified helper types, and don't want to lose type information, we need to use a variant of Anne's earlier proposal, something like this:

This proposal sounds good to me. Although, reading further, I guess not to implementers?

Do we prefer mime instead of mimeType? Any existing examples in the web platform?

I have been investigating this for a different spec, and so far everything I've seen in public APIs has been type.


    USVString type;  // "url", "text", "json", "opaque", or a MIME type

If we can't have kind + type, then I really don't see any value in this over just using a MIME type directly and only. The mapping would be:

  • Writing:
    • text: authors supply "text/*" and data "any text". Convert data to a string using usual Web IDL/ECMAScript ToString conversion.
    • URLs: authors supply "text/uri-list" (ugh) and data "http://valid.url". Convert data to a string, then validate that it's a valid absolute URL; throw if it's not.
    • JSON: authors supply "application/json" or "application/*+json". JSON.stringify the supplied value (throw if that fails).
    • anything else: authors supply any other non-special mime type plus a BufferSource (= ArrayBuffer or typed array). Any other type throws.
  • Reading
    • TNF=1 Type Text => "text/plain", deserialize to DOMString
    • TNF=1 Type URI, TNF=1 Type Smart Poster, and TNF=3 Absolute URI => "text/uri-list", deserialize to USVString
    • TNF=2 mime type "application/json" or "application/*+json", JSON.parse it (throw if can't be parsed)
    • everything else => mime type as-is, deserialize to ArrayBuffer.
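
The mapping above can be sketched in script as follows. `encodeRecord`, `decodeRecord`, and the `{ type, bytes }` shape are hypothetical; `new URL()` is used as an approximation of "valid absolute URL", and the NDEF-level step that turns TNF values into a MIME type on read is left out:

```javascript
function isJsonType(type) {
  return type === "application/json" ||
    (type.startsWith("application/") && type.endsWith("+json"));
}

// Write side: pick a serialization from the MIME type the author supplied.
function encodeRecord(type, data) {
  const encoder = new TextEncoder();
  if (type === "text/uri-list") {
    const text = String(data);
    new URL(text); // throws TypeError unless text parses as an absolute URL
    return { type, bytes: encoder.encode(text) };
  }
  if (type.startsWith("text/")) {
    return { type, bytes: encoder.encode(String(data)) };
  }
  if (isJsonType(type)) {
    const json = JSON.stringify(data); // throws on cycles
    if (json === undefined) throw new TypeError("value is not serializable as JSON");
    return { type, bytes: encoder.encode(json) };
  }
  if (data instanceof ArrayBuffer) return { type, bytes: new Uint8Array(data) };
  if (ArrayBuffer.isView(data)) {
    return { type, bytes: new Uint8Array(data.buffer, data.byteOffset, data.byteLength) };
  }
  throw new TypeError("binary types require a BufferSource");
}

// Read side, for records whose MIME type has already been determined.
function decodeRecord({ type, bytes }) {
  const text = () => new TextDecoder().decode(bytes);
  if (type === "text/plain" || type === "text/uri-list") return text();
  if (isJsonType(type)) return JSON.parse(text()); // throws if unparsable
  return bytes.slice().buffer; // everything else stays binary
}
```

With a single MIME type field there is no separate kind to validate, which is the main appeal of this variant.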

All this said I am not an expert and appreciate you guys are doing some hard work to try to map something that is not web-friendly to something that is. My gut instinct would be to expose things on the lowest level possible and then leave the nice web-friendly stuff for later specs or for author libraries, but I understand the effort to do something nicer and will trust you all to do so. I think everyone is on the same page about not using actual URL objects, which was our main concern, and at this point we're all just kind of throwing in ideas as to what seems like a nice API.

@zolkis
Contributor

zolkis commented Aug 12, 2015

We don't hide TNF=4, but expose it as ArrayBuffer without MIME type. Did you mean why do we hide the NDEF fields from developers (e.g. the TNF value), IOW why don't we present a low level interface? That is done by the earlier WG spec pretty well. Should we expose these NFC-specific terms/details for any web page?

One assumption in Web NFC was that we don't have to be compliant with legacy NFC tags - when we started the CG people wanted to have a web-specific NFC, even so much as total separation between NFC content available for web pages and low level interfaces available through native platforms. This was a security issue from the start. Not having to expose all low level details allowed us to simplify the interfaces, and focus on the use cases by not having to care too much about exact formats.

Well, text is a string and JSON can be too.

It is possible to write a string as "application/json" or "text/plain", and read a string as "application/json" or "text/html". I'm not sure I understand the problem. Here I don't see we lose type information that has been written using Web NFC (or even by "legacy" writers).

@zolkis
Contributor

zolkis commented Aug 12, 2015

Thanks @domenic , all this sounds good, I would be fine with all of it.

@zolkis
Contributor

zolkis commented Aug 12, 2015

With the exception that there are cases when we don't have MIME type information (for instance External Type records). For that we need default MIME type (e.g. "application/octet-stream") or just a placeholder like "opaque".

@domenic
Contributor

domenic commented Aug 12, 2015

"application/octet-stream" seems reasonable.

@annevk
Member Author

annevk commented Aug 12, 2015

This was a security issue from the start.

I guess @domenic and I are not familiar with those which makes it a bit harder to design something here.

I do think "url" is better than "text/uri-list" since these NFC fields don't actually accept or return a list of URLs and there's no clear MIME type for a single URL.

Soonish we'll also have <script type=module> which is also not a MIME type but much nicer than something like <script type=text/module+javascript> so I don't think that purity matters much. It's not like these types are parsed anyway, they're just treated as a single entity.

@kenchris
Contributor

You work on the URL spec @annevk, don't you think we could suggest a real URL mime type? Do you know the process for that?

@domenic
Contributor

domenic commented Aug 12, 2015

I guess I was motivated by hiding everything behind the facade of a mime type so that authors have both a uniform mental model and can use it as a MIME type if necessary (e.g. fetch with content-type header set appropriately). @annevk doesn't think that's as important. I'm not sure either of us feels strongly so maybe you guys could make the decision?

@kenchris
Contributor

It seems that Robin at least has experience with creating new mime types:

x-www-form-urlencoded application/x-www-form-urlencoded [W3C][Robin_Berjon]
http://www.iana.org/assignments/media-types/media-types.xhtml

@kenchris
Contributor

@darobin ^

@domenic
Contributor

domenic commented Aug 12, 2015

A slight variant would be to use the word "kind" as in "a web NFC record's kind can be one of: url, text, json, or a MIME type." That seems slightly better than using "type" but allowing things that are not MIME types.

@annevk
Member Author

annevk commented Aug 12, 2015

If you really wanted it badly feel free to squat "text/url" and I'll add it to the URL Standard and deal with the IETF noise.

@annevk
Member Author

annevk commented Aug 12, 2015

(@domenic, but then what about <script type=module>?)

@domenic
Contributor

domenic commented Aug 12, 2015

@annevk I guess the analogy you're making is that there's already one place in HTML, or at least there will be, where the word "type" is used but we plan to allow other values than MIME types? shrug. It's a hack, and doesn't seem terribly relevant.

@zolkis
Contributor

zolkis commented Aug 12, 2015

What matters for back-compatibility is a consistency of mapping to and from NDEF records. Having additional values for "type", which may be deprecated later, is indeed not terribly relevant. Of course a clean design is preferable :), but with the current constraints I don't have problems addressing URLs as type == 'url' or 'text/url' or 'text/uri-list' with the note that in this spec it actually denotes a single URL.

@annevk
Member Author

annevk commented Aug 12, 2015

Using kind and making it a couple special values or a MIME type seems the least problematic to all involved. I would still be interested in hearing about the security issues that preclude sharing some of the underlying data. #2 is quite vague on the matter and it seems @sicking and @jyasskin are not really in agreement as to what is dangerous and what is not...

@sicking

sicking commented Aug 12, 2015

The best way to expose urls is through normal strings. URL objects are IMHO a great library for parsing a URL in order to extract data from it (like the domain name, or query parameters). If someone wants to do that with the data read from an NFC tag, they can easily do that using application logic.
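
In application code that could be as small as the following sketch; the string stands in for a value read from a tag and is purely illustrative:

```javascript
// A URL delivered as a plain string...
const fromTag = "https://example.com/menu?table=12";

// ...is parsed only by the code that actually needs its components.
const parsed = new URL(fromTag);
const host = parsed.hostname;
const table = parsed.searchParams.get("table");
```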

I'm not sure what relevance the security model has here?

@annevk
Member Author

annevk commented Aug 12, 2015

We kind of strayed from the original subject.

@zolkis
Contributor

zolkis commented Aug 12, 2015

@sicking

The best way to expose urls is through normal strings.

We agree, for URL's, data is always a url-string. The discussion is whether

  • type = "text/uri-list" (MIME type), or "url" (our invented value for this)
  • or we add a new property kind = "url" (invented value) and type is irrelevant for URL (for types other than URL it contains the MIME type).

Introducing kind (as a base type) + having type for MIME type was criticised by implementors as bringing complexity in validating the possible combinations, and the alternative of diluting type with extra non-MIME values like "url" is kind of ugly. The third, using purely MIME types would be fine but we don't have a MIME type for URL's yet (we could use "text/uri-list" with web nfc-specific constraint that it denotes a single URL).
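
For comparison, the same single-URL record under the three options might look like this. Property names follow the thread; the object shapes and the `isUrlRecord` helper are illustrative only:

```javascript
// Pure MIME type, with "text/uri-list" constrained to one URL.
const optionA = { type: "text/uri-list", data: "https://example.com/" };
// Invented non-MIME value carried in type.
const optionB = { type: "url", data: "https://example.com/" };
// Separate kind property; type is not meaningful for URLs.
const optionC = { kind: "url", data: "https://example.com/" };

// A reader that wanted to accept all three would need to check each spelling.
function isUrlRecord(record) {
  return record.kind === "url" ||
    record.type === "url" ||
    record.type === "text/uri-list";
}
```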

I'm not sure what relevance the security model has here?

Anne asked why we hide low level NFC information coming from NDEF records, i.e. why not expose the lowest level of data to web pages (and leave beautification to libraries). I answered that this was an outcome of the security related discussion which was outlined in #2. So in this API we try to focus on web-specific use cases of NFC (using types known in the browser), rely on the browser security mechanisms, and additionally use a data storage format which is specific to Web NFC (by adding a special record to the NDEF messages and storing the writing scope in the message). Nevertheless, we can still choose to expose selected low-level information, but then it will be hard to draw the line between low level and high level APIs, with the former being needed to comply with NDEF standards.

@sicking

sicking commented Aug 13, 2015

We agree, for URL's, data is always a url-string. The discussion is whether

  • type = "text/uri-list" (MIME type), or "url" (our invented value for this)
  • or we add a new property kind = "url" (invented value) and type is irrelevant for URL (for types other than URL it contains the MIME type).

I don't care strongly. Though mimetypes are generally a pretty crappy way of exposing types. For exactly this reason. Lots of stuff doesn't have a mimetype. And lots of things have several different mimetypes.

From a security point of view I think we can expose the low-level information once we know that a tag has opted in to being "secure to expose to the web". At least for reading.

@zolkis
Contributor

zolkis commented Aug 13, 2015

@sicking

Though mimetypes are generally a pretty crappy way of exposing types. For exactly this reason. Lots of stuff doesn't have a mimetype. And lots of things have several different mimetypes.

Fully agree. And it would be a pity to complicate the API because of insufficient coverage of MIME types.

From a security point of view I think we can expose the low-level information once we know that a tag has opted in to being "secure to expose to the web". At least for reading.

Yes. And tags express that by being a "web-nfc" tag (containing the special record/writing scope).
We will then revisit what low-level information is worth exposing.

@annevk

If you really wanted it badly feel free to squat "text/url" and I'll add it to the URL Standard and deal with the IETF noise.

That would be one good solution, IMO. So the value of type would be a MIME type, plus "text/url" which may become a MIME type later. Good enough for me.

Using kind and making it a couple special values or a MIME type seems the least problematic to all involved.

Did you mean using one property like above, but instead of type let's call it kind and let it have MIME types plus a couple of special values like "url" (or even "text/url" FWIW)?
Or did you mean the additional property kind along with type which is meant to be a standard MIME type?

@darobin
Member

darobin commented Aug 13, 2015

My recommendation if you go the text/url route is to just squat it and implement it. Only at the very end of the process, if you have no other choice, then register it with IANA. The registration process is cryptic and poorly designed, it makes many other standard processes look fun. The registration form is buggy. The expert reviewers are, well... just to give an example the reviewer for my registration of application/xhtml+xml in 2014 objected because I was not referring to XHTML 2.0.

Once you've accumulated enough fait accompli that you can't be derailed by that sort of feedback, only then should you bother. And even then, if I had to do it again I would probably suggest starting a MIME type registry at the WHATWG instead.

zolkis added a commit to zolkis/web-nfc that referenced this issue Sep 5, 2015
…ad events. Fixed terminology related issues. Renamed send() to pushMessage(). Add timeout to push options. Issues addressed: w3c#2, w3c#3, w3c#22, w3c#26, w3c#28, w3c#30, w3c#31, w3c#32, w3c#33, w3c#35, w3c#36, w3c#38, w3c#39, w3c#40.
zolkis closed this as completed Sep 23, 2015