glTF 2.0 syntax changes and JSON encoding restrictions

These are the reasons behind the glTF 2.0 syntax changes (objects to arrays) and the new JSON encoding restrictions. While each of them may have only a medium or small impact on robustness or performance, the WG believes that their combination justifies this breaking change.

## Syntax change
### 1. Incorrect usage of JSON (from [Vulkan loader](https://github.com/KhronosGroup/Vulkan-Samples/blob/master/samples/apps/atw/scenes/scene_gltf.h#L68))
Instead of using arrays whose elements carry a name property, glTF uses objects with arbitrarily named members.

Normally, JSON objects are well defined, with member names known in advance. glTF therefore requires a JSON parser that can not only look up object members by name but also iterate over them as if they were array elements. Not all JSON parsers support this.

JSON parsers that do support this behavior are typically not optimized for objects with many members.
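
As a rough illustration of the difference, the sketch below puts the same data in both layouts. The fragments are simplified and their values are made up for this example; only the "bufferView_25"-style ID comes from the discussion here, and the two helper functions are hypothetical.

```javascript
// Object form: members are keyed by arbitrary IDs, references are strings.
const objectForm = {
  buffers: { buffer_0: { byteLength: 1024 } },
  bufferViews: {
    bufferView_25: { buffer: "buffer_0", byteOffset: 0, byteLength: 512 },
    bufferView_26: { buffer: "buffer_0", byteOffset: 512, byteLength: 512 },
  },
};

// Array form: members are array elements, references are integer indices.
const arrayForm = {
  buffers: [{ byteLength: 1024 }],
  bufferViews: [
    { buffer: 0, byteOffset: 0, byteLength: 512 },
    { buffer: 0, byteOffset: 512, byteLength: 512 },
  ],
};

// Object form: the loader must enumerate member names it cannot know in advance.
function totalViewBytesObjectForm(gltf) {
  let total = 0;
  for (const id of Object.keys(gltf.bufferViews)) {
    total += gltf.bufferViews[id].byteLength;
  }
  return total;
}

// Array form: plain element iteration, no name enumeration needed.
function totalViewBytesArrayForm(gltf) {
  let total = 0;
  for (const view of gltf.bufferViews) {
    total += view.byteLength;
  }
  return total;
}
```

In JavaScript both loops are easy to write; the point made above concerns parsers in other environments, where enumerating the members of an object may be unsupported or slow.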

### 2. Specifics of web browsers' JSON parsers
Modern JS engines try to build hidden-class-based representations for objects they have already seen.

While that makes perfect sense for actual glTF objects like "accessor" or "node", there is no point in creating an object type for the "accessors" dictionary. Moreover, when the JS runtime sees too many properties, it may fall back to a slower map-based internal representation (this applies to V8).

This process consumes more client memory and cycles than creating a dense JS array of elements of the same type.

When an engine loads different glTF assets one after another, the JS runtime can reuse previously created classes for objects like "accessor", but top-level dictionaries will always have a unique signature, so they may not be optimized.
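
Hidden classes are not observable from script, so the following is only a loose sketch of the idea: it compares the property-name signature of objects, which is roughly what such classes are derived from. The `keySignature` helper and both sample assets are hypothetical.

```javascript
// Property-name signature of an object: a rough stand-in for its "shape".
function keySignature(obj) {
  return Object.keys(obj).join(",");
}

// Two made-up assets in object form; accessor IDs differ between them.
const assetA = {
  accessors: {
    accessor_21: { bufferView: "bufferView_25", byteOffset: 0, componentType: 5126, count: 24, type: "VEC3" },
    accessor_23: { bufferView: "bufferView_25", byteOffset: 288, componentType: 5126, count: 24, type: "VEC3" },
  },
};
const assetB = {
  accessors: {
    accessor_4: { bufferView: "bufferView_1", byteOffset: 0, componentType: 5126, count: 36, type: "VEC3" },
    accessor_7: { bufferView: "bufferView_1", byteOffset: 432, componentType: 5126, count: 36, type: "VEC3" },
  },
};

// Individual accessors share one signature across assets...
const accessorSignatureA = keySignature(assetA.accessors.accessor_21);
const accessorSignatureB = keySignature(assetB.accessors.accessor_4);

// ...while each top-level dictionary has a signature of its own.
const dictionarySignatureA = keySignature(assetA.accessors);
const dictionarySignatureB = keySignature(assetB.accessors);
```

Here `accessorSignatureA` equals `accessorSignatureB`, whereas the two dictionary signatures differ, which mirrors why a runtime can reuse work for "accessor" objects but not for the "accessors" dictionary.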

### 3. Asset size reduction
Array-based JSON tends to consume less disk and transfer space and less client RAM, so the JSON parser has fewer bytes to process.

Exporters and converters have to generate unique strings (because IDs share a global scope); collada2gltf does this by prefixing the index with the type, as in "bufferView_25". No such step is needed for arrays. Objects that have a meaningful string name (nodes, meshes, and/or materials) usually carry it in two places: as an ID and as a "name" property.

Minifying JSON by renaming all IDs to as-short-as-possible unique strings (like "Zq", "Rw", "w$", etc.) certainly reduces file size, but eliminates the readability benefit of string-based IDs.

E.g., minifying a big asset (an actual scene with roughly 1300 nodes, meshes, and roughly 4200 accessors) reduces the JSON to 77% of its original size (collada2gltf output). Combined with removing "name" properties, it drops to 70%. Converting the same collada2gltf output to array form and removing names reduces the file size to 66% of the original.
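
A conversion of this kind can be sketched as follows. This is a minimal illustration, not the tool used for the figures above: it handles only `buffers` and `bufferViews`, and every helper name and sample value is hypothetical.

```javascript
// Turns an ID-keyed dictionary into an array plus an ID -> index map.
function dictionaryToArray(dict) {
  const indexOf = new Map();
  const items = [];
  for (const [id, value] of Object.entries(dict)) {
    indexOf.set(id, items.length);
    items.push({ ...value }); // shallow copy so the input stays untouched
  }
  return { items, indexOf };
}

// Converts a (very small) object-form asset to array form,
// rewriting string references into integer indices.
function convertToArrayForm(gltf) {
  const buffers = dictionaryToArray(gltf.buffers);
  const views = dictionaryToArray(gltf.bufferViews);
  for (const view of views.items) {
    view.buffer = buffers.indexOf.get(view.buffer);
  }
  return { buffers: buffers.items, bufferViews: views.items };
}

// Size of the serialized JSON in UTF-8 bytes.
function jsonByteLength(value) {
  return new TextEncoder().encode(JSON.stringify(value)).length;
}

const sampleDictionaryAsset = {
  buffers: { buffer_0: { byteLength: 1024 } },
  bufferViews: {
    bufferView_25: { buffer: "buffer_0", byteOffset: 0, byteLength: 512 },
    bufferView_26: { buffer: "buffer_0", byteOffset: 512, byteLength: 512 },
  },
};
const convertedAsset = convertToArrayForm(sampleDictionaryAsset);
const bytesBefore = jsonByteLength(sampleDictionaryAsset);
const bytesAfter = jsonByteLength(convertedAsset);
```

For this toy input `bytesAfter` is smaller than `bytesBefore` because both the ID keys and the string references disappear; the actual ratio for a real asset depends on its contents.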

### 4. Parsing performance
The difference in web browsers' parsing performance should be noticeable on big assets (about 1 MB or more).

A call to JSON.parse() is synchronous and causes small hangs even with Web Workers (when data is transferred between threads). So an application that constantly renders and loads assets over time (think mapping applications, Cesium's 3D Tiles) may benefit even from small performance gains.

Obviously, we can't easily measure the internals of a JSON.parse() call and say exactly how much each processing stage costs, but there's a clear win with arrays. Actual processing time varies by browser vendor and by CPU.

Here are samples of the duration (measured with performance.now) of the first call to JSON.parse() and of the size of the parsed object in the heap (as reported by the browser's tools) afterwards.

The first number is with objects (names removed, IDs minified, 1011478 bytes), the second with arrays (also no names, 957153 bytes).
<table>
 <tr>
 <th rowspan="3">User-Agent</th>
 <th colspan="2" rowspan="2">Memory, MB</th>
 <th colspan="4">Time, ms</th>
 </tr>
 <tr>
 <td colspan="2">Core i5-6600, W10x64</td>
 <td colspan="2">Celeron 847, W10x64</td>
 </tr>
 <tr>
 <td>Objects</td>
 <td>Arrays</td>
 <td>Objects</td>
 <td>Arrays</td>
 <td>Objects</td>
 <td>Arrays</td>
 </tr>
 <tr>
 <td>MS Edge 38</td>
 <td>2.77</td>
 <td>2.16</td>
 <td>12</td>
 <td>7</td>
 <td>37</td>
 <td>28</td>
 </tr>
 <tr>
 <td>Mozilla Firefox 51</td>
 <td>2.52</td>
 <td>1.93</td>
 <td>14</td>
 <td>10</td>
 <td>60</td>
 <td>45</td>
 </tr>
 <tr>
 <td>Google Chrome 56</td>
 <td>4.02</td>
 <td>3.67</td>
 <td>14</td>
 <td>10</td>
 <td>60</td>
 <td>45</td>
 </tr>
</table>
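
A timing like the ones tabulated above could be taken roughly as shown below. This is a sketch under assumptions: the synthetic asset generator and the function names are hypothetical, and the original test asset is not reproduced here. Heap size is not readable from script in a portable way, so only the duration is measured.

```javascript
// Times a single JSON.parse() call; returns elapsed milliseconds and the result.
function timeParse(jsonText) {
  const start = performance.now();
  const parsed = JSON.parse(jsonText);
  const elapsedMs = performance.now() - start;
  return { elapsedMs, parsed };
}

// Synthetic stand-in for an asset: `count` accessors in array form.
function makeSyntheticArrayAsset(count) {
  const accessors = [];
  for (let i = 0; i < count; i++) {
    accessors.push({ bufferView: i, byteOffset: 0, componentType: 5126, count: 24, type: "VEC3" });
  }
  return JSON.stringify({ accessors });
}

const syntheticText = makeSyntheticArrayAsset(4200);
const timing = timeParse(syntheticText);
```

Only the first call on a freshly loaded page is comparable to the numbers above; repeated calls in the same session may be affected by warm-up in the engine.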

## JSON encoding restrictions
JSON has some loosely specified string-related details that could be avoided by enforcing additional restrictions on glTF encoding:

- JSON allows different representations of the same string, e.g., `"%" == "\u0025"`. JSON parsers must handle that.
- JSON allows the full Unicode character set. But even one non-ASCII character can reduce parsing performance in V8 by a factor of two, because V8 has a special case for strings in which all characters are "one-byte".
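
The first point can be checked directly with JSON.parse(). The variable names below are only for illustration:

```javascript
// The JSON texts "%" and "\u0025" decode to the same one-character string.
const plain = JSON.parse('"%"');
const escaped = JSON.parse('"\\u0025"');
const sameString = plain === escaped;

// The same applies to property names: an escaped key is still "buffer".
const escapedKeyDoc = JSON.parse('{"\\u0062uffer": 0}');
const keyIsBuffer = Object.keys(escapedKeyDoc)[0] === "buffer";
```

Both `sameString` and `keyIsBuffer` are `true`, so a loader that compares raw bytes without decoding escapes would miss such a key.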

To reduce the possible impact of incorrect string handling in custom loaders, the following restrictions are proposed:
- All glTF-critical strings (i.e., property names and enums) must be defined in the spec.
- They must use plain text encoding (i.e., "buffer" instead of "\u0062\u0075\u0066\u0066\u0065\u0072") and must be limited to ASCII characters only.
- A glTF asset must use ASCII/UTF-8 encoding (i.e., non-ASCII characters are allowed only in app-specific strings).

With these restrictions, we can be sure that a minimal glTF loader doesn't need proper Unicode support.
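
As a rough sketch of how an exporter or validator might check part of this, the hypothetical helper below walks an already parsed document and reports property names containing anything outside printable ASCII. It does not detect escaped spellings such as "\u0062uffer", since those are gone after parsing; that would require inspecting the raw text.

```javascript
// Matches strings made only of printable ASCII characters.
const PRINTABLE_ASCII = /^[\x20-\x7E]*$/;

// Recursively collects paths of property names that are not plain ASCII.
function findNonAsciiPropertyNames(value, path = "", found = []) {
  if (Array.isArray(value)) {
    value.forEach((item, i) => findNonAsciiPropertyNames(item, `${path}[${i}]`, found));
  } else if (value !== null && typeof value === "object") {
    for (const [key, child] of Object.entries(value)) {
      if (!PRINTABLE_ASCII.test(key)) found.push(`${path}.${key}`);
      findNonAsciiPropertyNames(child, `${path}.${key}`, found);
    }
  }
  return found;
}
```

Non-ASCII characters in values, such as a node's "name", are deliberately not reported, because the restrictions above allow them in app-specific strings.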
