glTF 2.0 syntax changes and JSON encoding restrictions #831
Here are the reasons behind the glTF 2.0 syntax (objects-to-arrays) changes and the new JSON encoding restrictions. While each change on its own has only a medium or small impact on robustness or performance, the WG believes their combination justifies this breaking change.
1. Incorrect usage of JSON (from Vulkan loader)
As opposed to using arrays of elements with a name property, glTF uses objects with arbitrarily named members.
Normally, JSON objects are well-defined with fixed member names. As a result, a JSON parser is needed that can not only look up object members by name, but can also iterate over the object members as if they were array elements. Not all JSON parsers support this.
JSON parsers that do support this behavior are typically not optimized for objects with many members.
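To illustrate the difference, here is a minimal sketch contrasting the two layouts (the fragments and IDs like `accessor_21` are illustrative, not taken from a real asset):

```javascript
// glTF 1.0 style: a dictionary with arbitrarily named members.
const gltf1 = {
  accessors: {
    accessor_21: { componentType: 5126, count: 24 },
    accessor_29: { componentType: 5123, count: 36 }
  }
};

// glTF 2.0 style: a plain dense array; references become numeric indices.
const gltf2 = {
  accessors: [
    { componentType: 5126, count: 24 },
    { componentType: 5123, count: 36 }
  ]
};

// With the 1.0 layout, the loader must enumerate arbitrary member names
// and keep each name around to resolve references later.
const byId = {};
for (const id of Object.keys(gltf1.accessors)) {
  byId[id] = gltf1.accessors[id];
}

// With the 2.0 layout, a plain index lookup suffices.
const first = gltf2.accessors[0];
```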
2. Specifics of web browsers' JSON parsers
Modern JS engines try to build hidden class-based representations for already seen objects.
While that makes perfect sense for actual glTF objects like "accessor" or "node", there's no point in creating an object type for the "accessors" dictionary. Moreover, when a JS runtime sees too many properties, it can fall back to a slower map-based internal representation (this applies to V8).
This process consumes more client memory and cycles than creating a dense JS array of elements of the same type.
When an engine loads different glTF assets one after another, the JS runtime can reuse previously created classes for objects like "accessor", but the top-level dictionaries will always have unique signatures, so they cannot be optimized this way.
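A small sketch of why the top-level signatures differ (the member names below are hypothetical, standing in for different exporters' ID schemes):

```javascript
// glTF 1.0 style: two assets whose top-level dictionaries have different
// member names, so the engine sees a brand-new object shape each time.
const assetA = { accessors: { accessor_0: {}, accessor_1: {} } };
const assetB = { accessors: { acc_main: {}, acc_skin: {} } };

// glTF 2.0 style: "accessors" is always a plain array, and elements with
// the same set of properties can share one hidden class across assets.
const assetA2 = { accessors: [{ componentType: 5126 }, { componentType: 5123 }] };
const assetB2 = { accessors: [{ componentType: 5126 }] };

// The 1.0 dictionaries have asset-specific signatures...
const sameKeys = Object.keys(assetA.accessors).join() ===
                 Object.keys(assetB.accessors).join();
// ...while the 2.0 containers are just arrays in every asset.
const bothArrays = Array.isArray(assetA2.accessors) &&
                   Array.isArray(assetB2.accessors);
```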
3. Asset size reduction
Array-based JSON tends to consume less disk/transfer space and less client RAM, so the JSON parser has fewer bytes to process.
Exporters/converters have to generate unique strings (because of the global ID scope); collada2gltf does this by prefixing the index with the type, e.g., "bufferView_25". Such a step won't be needed with arrays. Objects that have a meaningful string name (nodes, meshes, and/or materials) usually carry it in two places: as an ID and as a "name" property.
Minifying JSON by renaming all IDs to as-short-as-possible unique strings (like "Zq", "Rw", "w$", etc.) surely reduces file size, but eliminates the readability benefits of string-based IDs.
E.g., minifying a big asset (an actual scene with roughly 1300 nodes and meshes, and ~4200 accessors) reduces the JSON to 77% of its original size (collada2gltf output); combined with removing "name" properties, to 70%. Converting the same collada2gltf output to array form and removing names reduces the file size to 66% of the original.
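The objects-to-arrays conversion itself is mechanical; a minimal sketch (not collada2gltf's actual code) that turns an ID-keyed dictionary into an array plus an ID-to-index map for rewriting references:

```javascript
// Convert a 1.0-style ID-keyed dictionary into a 2.0-style array,
// returning the array plus an ID->index map for fixing up references
// elsewhere in the asset.
function toArray(dict) {
  const indexOf = {};
  const arr = Object.keys(dict).map((id, i) => {
    indexOf[id] = i;
    return dict[id];
  });
  return { arr, indexOf };
}

// Illustrative input using collada2gltf-style prefixed IDs.
const bufferViews = {
  bufferView_25: { byteLength: 1024 },
  bufferView_30: { byteLength: 512 }
};
const { arr, indexOf } = toArray(bufferViews);

// An accessor that referenced "bufferView_30" by ID now uses index 1.
const accessor = { bufferView: indexOf["bufferView_30"] };
```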
4. Parsing performance
The web browser's parsing performance difference should be noticeable on big assets (1 MB, at least).
A call to JSON.parse() is synchronous and causes small hangs even with WebWorkers (when data is transferred between threads). So an application that renders and loads assets continuously over time (think mapping applications, or Cesium's 3D Tiles) may benefit even from small performance gains.
Obviously, we can't easily measure the internals of a JSON.parse() call and say exactly how much each processing stage costs, but there's a clear win with arrays. Actual processing times vary by browser vendor and by CPU.
Here are sample durations (performance.now) of the first call to JSON.parse(), and the size of the parsed object in the heap (as reported by the browser's tools) afterwards.
The first number is with objects (names removed, IDs minified, 1011478 bytes), the second with arrays (also no names, 957153 bytes).
JSON encoding restrictions
JSON has some vague string-related specifics that can be avoided by enforcing additional restrictions on glTF encoding.
To reduce the possible impact of incorrect string-handling implementations in custom loaders, the following restrictions are proposed:
With these restrictions in place, we can be sure that a minimal glTF loader doesn't need full Unicode support.
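For example, assuming the restrictions limit spec-defined strings to plain ASCII (the exact restriction list is not quoted above), a minimal loader can verify this with a byte scan instead of real Unicode handling:

```javascript
// Hypothetical check: return true if every byte of the JSON chunk is
// plain ASCII (<= 0x7F), so no multi-byte decoding is ever needed.
function isPlainAscii(bytes) {
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] > 0x7f) return false; // non-ASCII byte found
  }
  return true;
}

const ok = isPlainAscii(new TextEncoder().encode('{"asset":{"version":"2.0"}}'));
const bad = isPlainAscii(new TextEncoder().encode('{"name":"héllo"}'));
```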