Clarify the usage of base64 encodings in DataSchema/ContentType #912

Closed
relu91 opened this issue Jun 12, 2020 · 13 comments
Labels
Propose closing Problem will be closed shortly if there is no veto.

Comments

@relu91
Member

relu91 commented Jun 12, 2020

This issue comes from #869 (comment). I decided to bring it up here because it was not closely related to the PR itself, but it contains some hints for improving the spec. I'll summarize it below.

Basically, JSON Schema prescribes how to describe binary data using a JSON string. However, it is not clear how to use this feature in a TD. In the comment, I proposed two possible options.
First:

// image property affordance
"image": {
	"description": "image",
	"type": "string",
	"forms": [
		{
			"op": "readproperty",
			"href": "coaps://mylamp.example.com/lastPicture",
			"cov:methodName": "GET",
			"contentType": "image/png",
			"contentEncoding": "base64"
		}
	]
}

Second:

// image property affordance
"image": {
	"description": "image",
	"type": "string",
	"contentMediaType": "image/png",
	"contentEncoding": "base64",
	"forms": [
		{
			"op": "readproperty",
			"href": "coaps://mylamp.example.com/lastPicture",
			"cov:methodName": "GET",
			"contentType": "application/json"
		}
	]
}

As I described in the comment, I prefer the second one because it makes it possible to use the same pattern with other form contentTypes (as hinted here by @zolkis). Also, @egekorkan commented positively on this in #869 (comment).

Consequently, my proposal is to add an example (maybe inspired by the snippets above) to describe this feature. Returning images in base64 encoding is quite a common pattern on the web, so an example might guide TD designers in the right direction. We could even formalize the usage of contentMediaType and contentEncoding in StringDataSchema, as we are doing for minLength, maxLength and multipleOf in #896.
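
To make the second option more concrete, here is a minimal Python sketch of how a consumer might interpret such a property. It is only an illustration under stated assumptions: the helper name decode_string_value and the IMAGE_SCHEMA dictionary are hypothetical, and the fallback to application/octet-stream is a choice made for this sketch, not something the TD spec prescribes.

```python
import base64
import json
from typing import Tuple

# Hypothetical data schema fragment mirroring the second option above.
IMAGE_SCHEMA = {
    "type": "string",
    "contentMediaType": "image/png",
    "contentEncoding": "base64",
}


def decode_string_value(schema: dict, payload: str) -> Tuple[str, bytes]:
    """Parse an application/json payload and decode it as the schema describes.

    Returns the declared media type and the raw bytes.
    """
    value = json.loads(payload)
    if schema.get("type") != "string" or not isinstance(value, str):
        raise ValueError("expected a JSON string")
    if schema.get("contentEncoding") != "base64":
        raise ValueError("this sketch only handles base64")
    # validate=True rejects characters outside the base64 alphabet.
    raw = base64.b64decode(value, validate=True)
    # Assumption of this sketch: fall back to a generic binary media type.
    media_type = schema.get("contentMediaType", "application/octet-stream")
    return media_type, raw


# Usage: the payload is a JSON string holding the base64 text of some bytes
# (here just the 8-byte PNG signature, standing in for a full image).
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
payload = json.dumps(base64.b64encode(PNG_SIGNATURE).decode("ascii"))
media_type, raw = decode_string_value(IMAGE_SCHEMA, payload)
```

In this reading, the form contentType says how to parse the message body, while the two schema keywords say how to interpret the string found inside it.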

@zolkis

zolkis commented Jun 12, 2020

We could even specifically formalize the usage of contentType and contentCoding in StringDataSchema

+1

@sebastiankb
Contributor

regarding JSON Schema spec:

  • contentType --> contentMediaType
  • contentCoding --> contentEncoding

@sebastiankb
Contributor

Both terms will also be introduced in PR #896.

@relu91
Member Author

relu91 commented Jul 22, 2020

regarding JSON Schema spec:

  • contentType --> contentMediaType
  • contentCoding --> contentEncoding

Thanks, I updated the first comment for clarity.

@sebastiankb
Contributor

I think this approach (option 2) makes sense if the TD reflects a purely JSON-based API. Otherwise, the question arises why option 1 would not be sufficient, especially when a single data value (the image) is submitted as a property.

I think the specification should make very clear when contentMediaType and contentEncoding should be used.

@relu91
Member Author

relu91 commented Aug 10, 2020

I think the specification should make very clear when contentMediaType and contentEncoding should be used.

I agree, and we should probably add some practical examples.

I think this approach (option 2) makes sense if the TD reflects a purely JSON-based API. Otherwise, the question arises why option 1 would not be sufficient, especially when a single data value (the image) is submitted as a property.

The problem with option 1 is that we do not have a clear way of applying DataSchemas to binary data formats like image/png. As in the example above, it feels quite odd to say that the content of an image/png file is a string. People can still use it, but I think the semantics are far clearer with option two.

Furthermore, option two can still be applied to different form contentTypes. As hinted by @zolkis in this comment, if we define a common mapping for other content types, we will also be able to use the contentEncoding and contentMediaType attributes with text/plain or application/xml. For example:

// image property affordance
"image": {
	"description": "image",
	"type": "string",
	"contentMediaType": "image/png",
	"contentEncoding": "base64",
	"forms": [
		{
			"op": "readproperty",
			"href": "coaps://mylamp.example.com/lastPicture",
			"cov:methodName": "GET",
			"contentType": "text/plain"
		}
	]
}

Notice that the content described by this affordance will be something like the following:

cWHyTD/LwZqs6rBIG5fXQHaUyA1QEQfZoXVO3FypCw
LaRFCLh+EqSbAxXrk5zRx38+SvM7VhtOfn/fmeYf5hleZ
GlUwipiPchWINy1DBGtlNReVY1UOJTD0R2teohMlE/jO3ipep5xqf5Cqrqw==

Whereas if the form contentType is application/json the content will be (notice the double quotes needed to satisfy JSON grammar rules):

"cWHyTD/LwZqs6rBIG5fXQHaUyA1QEQfZoXVO3FypCw
LaRFCLh+EqSbAxXrk5zRx38+SvM7VhtOfn/fmeYf5hleZ
GlUwipiPchWINy1DBGtlNReVY1UOJTD0R2teohMlE/jO3ipep5xqf5Cqrqw=="
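
The difference between the two wire formats can be sketched in Python as follows. The serialize helper is hypothetical, and the text/plain branch assumes the "common mapping" mentioned above (the bare base64 text as the body), which is not defined anywhere yet.

```python
import base64
import json


def serialize(raw: bytes, content_type: str) -> str:
    """Produce the message body for a base64-encoded string value."""
    encoded = base64.b64encode(raw).decode("ascii")
    if content_type == "application/json":
        # JSON grammar requires the string to be wrapped in double quotes.
        return json.dumps(encoded)
    if content_type == "text/plain":
        # Assumed mapping: the base64 text is sent as-is.
        return encoded
    raise ValueError("unsupported content type in this sketch: " + content_type)


sample = b"hello"
plain_body = serialize(sample, "text/plain")       # aGVsbG8=
json_body = serialize(sample, "application/json")  # "aGVsbG8="
```

Only the framing changes between the two bodies; the base64 text, and therefore the decoded bytes, are identical.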

@sebastiankb
Contributor

The problem with option 1 is that we do not have a clear way of applying DataSchemas to binary data formats like image/png. As in the example above, it feels quite odd to say that the content of an image/png file is a string. People can still use it, but I think the semantics are far clearer with option two.

Yes, the use of the string type is not reasonable. But why is a type used at all? The type is optional and can be skipped. In this case, I would expect the client to read the contentType to get more context.

@egekorkan
Contributor

I think that string needs to be specified, since the new keywords apply to the string type. The examples provided here also use string. Otherwise, it would be like saying that maxLength is for strings, so we don't need to say string when we use maxLength.

@relu91
Member Author

relu91 commented Aug 13, 2020

Yes, the use of the string type is not reasonable. But why is a type used at all? The type is optional and can be skipped. In this case, I would expect the client to read the contentType to get more context.

Yes, specifying only the form contentType and encoding remains a valid option. Maybe the example above with text/plain was not the best. In fact, in that case you could just model the property like the following:

// image property affordance
"image": {
	"description": "image",
	"forms": [
		{
			"op": "readproperty",
			"href": "coaps://mylamp.example.com/lastPicture",
			"cov:methodName": "GET",
			"contentType": "image/png",
			"contentEncoding": "base64"
		}
	]
}

However, the issue was only about the case where a TD designer wants to use DataSchema to model binary data. This way, he/she can design more complex scenarios (e.g. a JSON document with a property that contains an encoded image). So maybe a better example for this feature would be something like the following:

// image property affordance
"image": {
	"description": "image",
	"type": "object",
	"properties": {
		"content": {
			"type": "string",
			"contentMediaType": "image/png",
			"contentEncoding": "base64"
		},
		// .... other properties with image metadata (e.g. size, timestamp etc.)
	},
	"forms": [
		{
			"op": "readproperty",
			"href": "coaps://mylamp.example.com/lastPicture",
			"cov:methodName": "GET",
			"contentType": "application/json"
		}
	]
}

Nevertheless, it is important to consider that right now, without introducing these two new terms, we cannot correctly represent JSON documents like this one (which may exist out there):

"WHyTD/LwZqs6rBIG5fXQHaUyA1QEQfZoXVO3FypCwLaRFCLh+EqSbAxXrk5zRx38+SvM7VhtOfn/fmeYf5hleZGlUwipiPchWINy1DBGtlNReVY1UOJTD0R2teohMlE/jO3ipep5xqf5Cqrqw=="
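
For the object-shaped example above, a consumer could walk the data schema and decode only the string properties that declare a base64 contentEncoding. The following Python sketch shows one way to do that; decode_encoded_strings and the "timestamp" property are hypothetical names chosen for illustration, and only object and string types are handled.

```python
import base64
from typing import Any


def decode_encoded_strings(schema: dict, value: Any) -> Any:
    """Return a copy of value with base64-annotated strings decoded to bytes."""
    if schema.get("type") == "object" and isinstance(value, dict):
        props = schema.get("properties", {})
        # Properties without a schema entry are passed through unchanged.
        return {
            key: decode_encoded_strings(props.get(key, {}), item)
            for key, item in value.items()
        }
    if (
        schema.get("type") == "string"
        and schema.get("contentEncoding") == "base64"
        and isinstance(value, str)
    ):
        return base64.b64decode(value, validate=True)
    return value


# Hypothetical schema: an encoded image plus one piece of metadata.
schema = {
    "type": "object",
    "properties": {
        "content": {
            "type": "string",
            "contentMediaType": "image/png",
            "contentEncoding": "base64",
        },
        "timestamp": {"type": "string"},
    },
}
document = {
    "content": base64.b64encode(b"\x00\x01\x02").decode("ascii"),
    "timestamp": "2020-08-13T00:00:00Z",
}
decoded = decode_encoded_strings(schema, document)
```

The plain "timestamp" string is left untouched because its schema carries no contentEncoding, which is the distinction the two new terms are meant to express.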

@takuki
Contributor

takuki commented Aug 31, 2020

As is the case with MIME usage, both discrete and multipart usages are legitimate.

In my opinion, both examples are equally important.
As @relu91 shows, we can use an example that uses multipart for example 2.

@sebastiankb
Contributor

sebastiankb commented Sep 9, 2020

Regarding the first example from @relu91, I would expect the following form:

// image property affordance
"image": {
	"description": "image",
	"forms": [
		{
			"op": "readproperty",
			"href": "coaps://mylamp.example.com/lastPicture",
			"cov:methodName": "GET",
			"contentType": "image/png;base64"
		}
	]
}

contentEncoding would not be needed.

@sebastiankb
Contributor

Let's also check https://tools.ietf.org/html/rfc6839 for the correct format.

@sebastiankb
Contributor

PR is merged

@sebastiankb sebastiankb added the Propose closing Problem will be closed shortly if there is no veto. label Oct 5, 2020