glTF 2.0 Texture Formats #835

Closed
lexaknyazev opened this Issue Feb 11, 2017 · 32 comments

10 participants
@lexaknyazev
Member

lexaknyazev commented Feb 11, 2017

glTF 2.0, being a runtime API-agnostic format, must have a robust and extensible texture handling framework.

Right now, glTF has a very simple image object for referencing images in Web-browser-compatible codecs (such as JPG or PNG) and a texture object for specifying the desired GPU representation. glTF 1.0 doesn't provide any standard way to use native GPU formats (either compressed or uncompressed). It also lacks support for more advanced texture usage patterns, such as cubemaps or MIP levels.

There was an attempt (#739) to refactor the image and texture objects in a way that would enable the aforementioned use cases; however, that proposal came well before the glTF 2.0 API-neutrality strategy change.

Remember, we don't want to introduce any breaking changes after glTF 2.0 release, so such core functionality must be as future-proof as possible. While we don't need to enable every possible format and feature now, we must be sure that future changes could be done in a non-destructive way.


State of cross-API formats support

GPU formats

Here's an overview of GPU texture formats that could be supported across different APIs (OpenGL ES 2.0/3.0, OpenGL 4.5, Vulkan, Metal, D3D11/12). Keep in mind that in some cases API support doesn't guarantee hardware support. OpenGL ES 2.0 supports very few of them, so it's included only for reference.

Uncompressed (byte-aligned) formats

The following table contains a matched list of byte-aligned GPU formats. Formats exclusive to a single API are not listed.

  • Green: All modern APIs support.
  • Yellow: Most modern APIs support natively (OpenGL ES 3.0 extensions could be required for some).
  • Red: Support is non-uniform.

[image: table of matched byte-aligned formats]

Packed formats

The following table contains a matched list of packed GPU formats.

  • Green: All modern APIs support (with a D3D exception for formats available in WebGL 1.0).
  • Yellow: Most modern APIs support (no OpenGL ES 3.0).
  • Red: Native support is non-uniform.

[image: table of matched packed formats]

Compressed formats

The following table contains a matched list of compressed GPU formats. Actual support varies by both API and hardware. Note that the ETC1 format isn't listed there; however, ETC2-enabled systems support it.

  • Yellow: BC1 RGB
  • Green: BCn family
  • Gray: PVRTC
  • Blue: ETC2/EAC
  • Red: ASTC

[image: table of matched compressed formats]

Web formats

These formats are universally supported in web browsers, but supporting them in mobile or embedded environments could be inefficient. They require client-side decompression and therefore client RAM and CPU cycles. The GPU will get uncompressed data. The client can recompress to a GPU-friendly format, at the cost of even more processing.
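
To put the memory side of this in numbers, here is a rough sketch of the storage needed for a single mip level. The 4x4 block sizes used (8 bytes for BC1, 16 bytes for BC3/DXT5) are the standard ones; the function names are made up for this illustration.

```python
def uncompressed_size(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Bytes for one mip level of an uncompressed texture (RGBA8 by default)."""
    return width * height * bytes_per_pixel


def block_compressed_size(width: int, height: int, bytes_per_block: int) -> int:
    """Bytes for one mip level stored as 4x4 texel blocks.

    bytes_per_block is 8 for BC1 (DXT1) and 16 for BC3 (DXT5).
    Partial blocks at the edges still occupy a whole block.
    """
    blocks_x = (width + 3) // 4
    blocks_y = (height + 3) // 4
    return blocks_x * blocks_y * bytes_per_block


rgba8 = uncompressed_size(2048, 2048)        # 16 MiB
bc1 = block_compressed_size(2048, 2048, 8)   # 2 MiB
bc3 = block_compressed_size(2048, 2048, 16)  # 4 MiB
```

So a decoded 2048x2048 PNG or JPEG occupies four to eight times the memory of the same texture shipped in a BCn format, before counting the CPU time spent decoding.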

BMP

Many versions of the format exist. It can hold anything from 1 bit-per-pixel black-and-white to 32-bit RGBA. Browser support for rare combinations varies (e.g., look at comments in the Chromium source).
Why was it allowed in glTF 1.0?

GIF

Could have 24-bit RGB colors, alpha is limited to 1-bit. LZW compression.

PNG

Could have L, LA, RGB, or RGBA data with up to 16-bit per channel. Deflate compression.

JPEG

Could have 24-bit RGB or 8-bit Luminance. No alpha.

State of glTF texture support

Issues

  • texture.format, texture.internalformat, and texture.type make little sense with Web formats. Other formats aren't supported, though.

  • Extending texture.target beyond GL_TEXTURE_2D (e.g. cubemaps) requires a way to associate multiple images with one texture or use some sort of image container format.
    We need to support some use cases, but mustn't introduce breaking changes within 2.x lifecycle.

  • glTF 1.0 assumes WebGL unpacking rule (upper left corner first). This is an issue for export workflow.

  • GPU-friendly formats require full data specification (width, height, pixel type). There's no standard way in glTF 1.0 to address that.
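
As an illustration of the unpacking-rule item above: an exporter whose pixel data is stored bottom row first would have to reverse the row order to match the upper-left-first convention (or the loader would have to do the reverse). A minimal sketch, assuming tightly packed, already decoded pixel data; the function name is hypothetical.

```python
def flip_rows(pixels: bytes, width: int, height: int, bytes_per_pixel: int = 4) -> bytes:
    """Reverse the row order of a tightly packed pixel buffer.

    Converts between "bottom row first" and "top row first" layouts.
    Assumes no row padding (stride == width * bytes_per_pixel).
    """
    stride = width * bytes_per_pixel
    if len(pixels) != stride * height:
        raise ValueError("pixel buffer size does not match the given dimensions")
    rows = [pixels[y * stride:(y + 1) * stride] for y in range(height)]
    return b"".join(reversed(rows))
```

This only works on decoded data; for block-compressed formats a vertical flip generally needs format-specific handling, which is one reason an explicit orientation flag is attractive.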

KTX proposal

To address some of the issues above, KTX format support was proposed in #739. The main changes include (keep in mind that they are breaking from 1.0, so such a decision could only be made with a major version upgrade):

  • Allow image/ktx as a valid image MIME type and container format. The KTX container will handle things like cubemaps, MIP levels, etc. Loading KTX in GL (ES) is trivial; other APIs require mapping values from the KTX header to their enums (see tables above).

  • Make image contain an array of its representations in different formats. These could include both Web and GPU formats. Different GPU-compressed formats could be provided for asset compatibility.

  • For web formats to achieve feature parity with KTX, use JSON as the metadata container and refer to the actual data via an array of URIs or bufferViews.

  • Possibly, move Web formats support out of the core or not require them outside of web environment (so mobile clients won't have to implement them).

Such a layout will allow adding more container formats (like Crunch or Basis) in post-2.0 minor updates.

Example (adapted from #739)

This is just an example; comments are welcome. undefined is used for illustrative purposes.

{
    "textures": [
        {
            "image": 0,
            "sampler": 0
        }
    ],
    "images": [
        {
            "formats": [
                {
                    "format": 33779, // GL_COMPRESSED_RGBA_S3TC_DXT5_EXT
                    "mimeType": "image/ktx",
                    "uri": undefined,
                    "bufferView": 0
                },
                {
                    "format": 32856, // GL_RGBA8
                    "mimeType": "image/png",  // valid only when "KHR_image_web" enabled
                    "uri": undefined,
                    "bufferView": undefined,
                    "extensions": {
                        "KHR_image_web": {
                            "flipY": true,
                            "width": 256,
                            "height": 256,  // specify 0 for a 1D texture
                            "depth": 1, // optional; depth of mip level 0 of a 3D texture; must be one for 2D and cube textures
                            "layers": 1, // optional; used for 2D texture arrays and cube map arrays
                            "faces": 1, // optional; 1 or 6 (cube map)
                            "levels": 2, // optional; 0 means run-time should call generateMipmap()
                            "uri": undefined,
                            "bufferViews": [  // list images in the order: depth-layers-faces-levels
                                1,
                                2
                            ]
                        }
                    }
                }
            ]
        }
    ]
}
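
A loader consuming a layout like the one above might choose among the entries of "formats" roughly as follows. This is only a sketch: the proposal does not say how a client should choose, so the "first supported entry wins" rule, the function name, and the capability sets are all assumptions made for illustration.

```python
from typing import Optional

GL_COMPRESSED_RGBA_S3TC_DXT5_EXT = 33779
GL_RGBA8 = 32856


def pick_format(image: dict, supported_formats: set, supported_mime_types: set) -> Optional[dict]:
    """Return the first entry of image["formats"] the client can use, or None.

    Assumption: the order of the array expresses the author's preference.
    """
    for candidate in image.get("formats", []):
        if (candidate.get("format") in supported_formats
                and candidate.get("mimeType") in supported_mime_types):
            return candidate
    return None


image = {
    "formats": [
        {"format": GL_COMPRESSED_RGBA_S3TC_DXT5_EXT, "mimeType": "image/ktx", "bufferView": 0},
        {"format": GL_RGBA8, "mimeType": "image/png", "bufferView": 1},
    ]
}

# A client without S3TC support falls back to the PNG representation.
fallback = pick_format(image, {GL_RGBA8}, {"image/png"})
```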

Final remarks

For mass asset distribution, it's vital to use compressed image data. At some point, we should expect libraries like Basis or Crunch to be universally integrated into exporting workflows.

KTX's use of more than one parameter (glFormat, glInternalFormat, glBaseInternalFormat, glType) to describe the image format is suboptimal. It also locks valid formats to those supported by GL.

The KTX data layout isn't streaming-friendly: it's hard to fetch a low-res image first.

A more modern container (KTX2?) could be proposed later.


CC @pjcozzi @javagl @sbtron @bghgary @AurL @cedricpinson @mlimper @lasalvavida

@lexaknyazev

Member

lexaknyazev commented Feb 12, 2017

Actual integration of GPU formats into the core spec will require diligent checking of all combinations of sampler/filtering modes for each format for each API. This is out of 2.0 scope.

Nevertheless, I think we should evaluate and make the needed syntax/binding changes so that those formats can be added in the next minor update.

@pjcozzi

Member

pjcozzi commented Feb 13, 2017

Generally looks OK, comments:


Being pragmatic, I am hesitant about introducing KHR_image_web; either all implementations will be forced to implement it or it will lead to fragmentation as exporters prefer to export well-known image formats, and then only some glTF renders can load the model. Instead, consider just removing support for BMP and GIF. PNG and JPEG decoders are widely available AFAIK and give glTF some compression options out of the box.

Note that the current glTF situation is much better than most 3D formats; for example, COLLADA doesn't define at all what image files are valid.


Supporting KTX has lots of implicit requirements (like different compression formats) so the spec will have to define that KTX is supported with a precise set of limitations.


I'm not sure about the height (3D texture), depth (3D texture mip level 0), and layers (2D texture arrays) properties as they can't be implemented in WebGL 1.

Is the goal to design the schema now so that it will be compatible when these are added later?

@lexaknyazev

Member

lexaknyazev commented Feb 13, 2017

PNG and JPEG decoders are widely available

Generally, I'm OK with them in core. Maybe this could be a conformance/implementation note for mobile deployment. See these comments from #739 on implications:
#739 (comment)
#739 (comment)

Supporting KTX has lots of implicit requirements

That's why my perspective on 2.0 is to allow only glTF 1.0 targets/formats/types. Maybe additionally allow something for PBR, if needed (like cubemaps and mips).

Is the goal to design the schema now so that it will be compatible when these are added later?

Exactly! The biggest struggle would be the zoo of GPU formats across different APIs (e.g., look at 5551 vs 1555 layouts). Some formats can be converted easily (and losslessly), while others introduce big overhead. There are many examples in the ANGLE source code.
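
As an example of the "easy and lossless" kind of conversion, here is a sketch for 16-bit 5551 vs 1555 texels. It assumes 5551 means R in bits 15..11, G in 10..6, B in 5..1, A in bit 0, and 1555 means A in bit 15, R in 14..10, G in 9..5, B in 4..0; which API exposes which layout under which name should be checked against the respective specs.

```python
def rgba5551_to_argb1555(texel: int) -> int:
    """Move the 1-bit alpha from the lowest bit to the highest bit."""
    if not 0 <= texel <= 0xFFFF:
        raise ValueError("texel must fit in 16 bits")
    return (texel >> 1) | ((texel & 1) << 15)


def argb1555_to_rgba5551(texel: int) -> int:
    """Inverse of rgba5551_to_argb1555."""
    if not 0 <= texel <= 0xFFFF:
        raise ValueError("texel must fit in 16 bits")
    return ((texel << 1) & 0xFFFF) | (texel >> 15)
```

The conversion is a pure bit rotation, so no information is lost, but it still has to touch every texel on the CPU (or in a shader) when the target API lacks the source layout.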

@pjcozzi

Member

pjcozzi commented Feb 13, 2017

PNG and JPEG decoders are widely available

Generally, I'm OK with them in core. Maybe this could be a conformance/implementation note for mobile deployment.

Conformance note is OK with me.

Supporting KTX has lots of implicit requirements

That's why my perspective on 2.0 is to allow only glTF 1.0 targets/formats/types. Maybe additionally allow something for PBR, if needed (like cubemaps and mips).

Sounds perfect. Can the PBR folks chime in? @sbtron @bghgary @cedricpinson @mlimper?

@lexaknyazev

Member

lexaknyazev commented Feb 13, 2017

How to handle URIs/bufferViews for web codecs?
E.g., make both properties arrays and define that with "mimeType": "image/ktx" they must contain no more than one element.

@pjcozzi

Member

pjcozzi commented Feb 13, 2017

I'm not following. Can you provide an example?

@lexaknyazev

Member

lexaknyazev commented Feb 13, 2017

In the example above, image.formats[].uri and image.formats[].bufferView are value-properties (not arrays) for KTX images.
In the PNG/JPEG extension, these fields are arrays to support KTX-features (i.e., treat several 2D images as one "texture": cubemaps, mips, etc).

@lexaknyazev

Member

lexaknyazev commented Feb 13, 2017

Compare uri and uris below:

{
    "images": [
        {
            "formats": [
                {
                    "format": 33779, // GL_COMPRESSED_RGBA_S3TC_DXT5_EXT
                    "mimeType": "image/ktx",
                    "uri": "img.ktx"
                },
                {
                    "format": 32856, // GL_RGBA8
                    "mimeType": "image/png",
                    "levels": 2,
                    "uris": [
                        "mip0.png",
                        "mip1.png"
                    ]
                }
            ]
        }
    ]
}
@pjcozzi

Member

pjcozzi commented Feb 13, 2017

Ah, I see. Yes, I think always using arrays (with length === 1 for KTX) is reasonable.
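
A loader could fold the single-value and array spellings into one shape along these lines. This is a sketch only: the property names "uris" and "bufferViews" are taken from the examples in this thread, while the function name and the exact error handling are hypothetical.

```python
def normalize_sources(fmt: dict) -> dict:
    """Return a copy of a format entry that always uses array properties.

    "uri" becomes "uris" and "bufferView" becomes "bufferViews".
    A KTX entry must end up with exactly one source.
    """
    out = dict(fmt)
    for single, plural in (("uri", "uris"), ("bufferView", "bufferViews")):
        if single in out:
            if plural in out:
                raise ValueError(f"both {single} and {plural} are present")
            out[plural] = [out.pop(single)]
    count = len(out.get("uris", [])) + len(out.get("bufferViews", []))
    if out.get("mimeType") == "image/ktx" and count != 1:
        raise ValueError("a KTX entry must reference exactly one source")
    return out
```

With this in place the rest of a loader only ever sees arrays, and the KTX restriction is checked in one spot.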

@lexaknyazev

Member

lexaknyazev commented Feb 13, 2017

Looks like we should also specify the "premultipliedness" of alpha, or demand that the corresponding WebGL flag always be on for some maps. See this article from #822 (comment).
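
For readers unfamiliar with the distinction, a minimal sketch of what premultiplying means for 8-bit RGBA data. It works on stored values directly and ignores any sRGB/linear considerations, which a real pipeline would have to decide on; the function name is hypothetical.

```python
def premultiply_rgba8(pixels: bytes) -> bytes:
    """Convert straight-alpha RGBA8 texels to premultiplied alpha.

    Each color channel is scaled by alpha / 255 with rounding;
    the alpha channel itself is left unchanged.
    """
    if len(pixels) % 4 != 0:
        raise ValueError("expected a whole number of RGBA8 texels")
    out = bytearray(pixels)
    for i in range(0, len(out), 4):
        alpha = out[i + 3]
        for c in range(3):
            out[i + c] = (out[i + c] * alpha + 127) // 255
    return bytes(out)
```

The step is lossy for low alpha values, so a loader cannot reliably undo it; that is why the asset needs to state which convention its data uses.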

@javagl

Contributor

javagl commented Feb 13, 2017

Many points here are beyond what I can comprehend (I just started reading about KTX and texture compression in general).

But the uri[]/bufferView[] arrays confuse me a bit. Particularly, I wonder about the data model that is implied by such an image.format object.

So each of these objects will have an array of data chunks (e.g. ArrayBuffer objects), right?

For the particular case of MipMaps, wouldn't it be necessary to store the width/height for each of them, or are there some assumptions or standards for how the resolution of the lower levels is related to the highest level?

@lexaknyazev

Member

lexaknyazev commented Feb 13, 2017

But the uri[]/bufferView[] arrays confuse me a bit. Particularly, I wonder about the data model that is implied by such an image.format object.

In some cases, a texture consists of several "images", e.g., MIP levels, cubemaps, array or 3D textures (ES 3.0). Since PNG and JPEG containers allow only one image per file, we need a way to transmit several "images". With KTX, those arrays must contain only one element.

@javagl

Contributor

javagl commented Feb 13, 2017

Thanks, I understood this so far (assuming that each image.format object will have multiple data blocks).

But won't it be necessary to store more information for each one?

This mainly refers to the different resolutions for different mipmap levels.

(Assuming that nobody wants to create one mipmap level from a PNG, and another (of the same MipMap) from a JPG...)

@lexaknyazev

Member

lexaknyazev commented Feb 13, 2017

But won't it be necessary to store more information for each one?

In the case of a KTX file, only the URI/bufferView is required because all other properties are provided in the KTX binary header. No need to duplicate them.

As for a JPEG/PNG set of images, these properties have an exact, well-defined meaning:

  • flipY: texcoord origin adjustment, maybe add flipX or use KTXorientation;
  • faces: 1 or 6 (cube map), not optimal;
  • levels: number of MIP levels;
  • height: used for 1D/2D texture target distinction, should be replaced with something better;
  • depth: depth of mip level 0 of a 3D texture; must be one for 2D and cube textures;
  • layers: number of elements in 2D or Cubemap texture array.

For glTF 2.0, I would consider only faces and levels (if we need such features for PBR), since such functionality is supported with WebGL 1.0.

@javagl

Contributor

javagl commented Feb 13, 2017

So there may be an image.format object like this:

{
    "format": 32856,
    "width": 256,
    "height": 256,
    "levels": 2,
    "bufferViews": [ 1, 2 ]
}

What is the resolution of the image data referred to by bufferView 2? Is it always 128x128?
(I'm not familiar with many concepts here, so apologies if this is a stupid question)

@lexaknyazev

Member

lexaknyazev commented Feb 13, 2017

Is it always 128x128?

Yes, for 256x256 level 0. See p. 3.7.7 of OpenGL ES 2.0 Spec.

As for your example: "mimeType" is required, width / height not needed for 2D textures, since these dimensions are available in binary headers of all containers.
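
The rule behind the "always 128x128" answer can be sketched as follows: each level halves the previous dimensions, rounding down and clamping at 1. The function names are illustrative only.

```python
def mip_level_size(width: int, height: int, level: int) -> tuple:
    """Dimensions of mip level `level` for a level-0 size of width x height.

    Implements max(1, floor(size / 2**level)) per dimension.
    """
    return max(1, width >> level), max(1, height >> level)


def full_mip_chain_length(width: int, height: int) -> int:
    """Number of levels in a complete chain: floor(log2(max(w, h))) + 1."""
    return max(width, height).bit_length()
```

So for a 256x256 level 0, level 1 is 128x128 and a complete chain has 9 levels, ending at 1x1. Because the sizes follow from level 0, per-level dimensions would not need to be stored in the JSON.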

@javagl

Contributor

javagl commented Feb 13, 2017

these dimensions are available in binary headers of all containers.

I thought that in a case like the one above, the bufferView would contain the actual image data, as a sequence of bytes representing the GL_RGBA8 values. But I probably have to read this issue and related documents a few more times. Until then, I'll wait with any attempts of implementing an infrastructure for "reading images". (In glTF 1.0, I had some simple map from imageId to byte[] data. Now, I'm not sure what the final structures will look like)
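
To illustrate the point about dimensions living in the container's binary header rather than in JSON, here is a small sketch that reads width and height from the start of a PNG stream (8-byte signature followed by the IHDR chunk). It does no CRC checking and is not a decoder; the function name is hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> tuple:
    """Return (width, height) from the IHDR chunk of a PNG byte stream."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG stream")
    # Chunk layout: 4-byte big-endian length, 4-byte type, then chunk data.
    _length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR":
        raise ValueError("first chunk is not IHDR")
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

The same bytes can come from a file behind a uri or from a bufferView; in both cases they are encoded PNG data, not raw texels.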

@lexaknyazev

Member

lexaknyazev commented Feb 13, 2017

@javagl

Contributor

javagl commented Feb 15, 2017

I'm still trying to understand the implications of this issue. Particularly regarding the resulting data structures, and how the data is supposed to be read, stored and passed to the graphics API.

Are the following statements true? :

  • uri and bufferView are mutually exclusive
  • when the mimeType is given, then the data has to be decoded according to this MIME type. If it is not given, then the data already is the raw pixel data (e.g. as bytes representing the GL_RGBA8 values)
  • The decision of whether the data is "encoded" data or "raw" data is made solely based on the mimeType. So there are 4 options for storing image data. For example, one could
    1. Fetch JPG data from a uri (that points to a .jpg file)
    2. Fetch raw GL_RGBA8 data from a uri (that basically points to a .bin file) (!?!)
    3. Fetch JPG data from a bufferView (!?!)
    4. Fetch raw GL_RGBA8 data from a bufferView

(Particularly, I'm not sure if case 2 and 3 are supposed to be supported)

  • The width/height are only required for raw data (because in the "encoded" data, like JPG or KTX, they are already stored)
  • The additional properties (depth, layers, levels) may be contained in the container (as in KTX). But there may be cases where they have to be specified. E.g. when the mipmap levels are given as an array of uri to individual PNG files, or as an array of bufferView references containing the raw data of each level
@lexaknyazev

Member

lexaknyazev commented Feb 15, 2017

uri and bufferView are mutually exclusive

Yes.

when the mimeType is given, ...

mimeType is required.

There's no such thing as "raw" data, because it would significantly complicate loading anything beyond one-level-one-face-2d-texture.
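
The two rules confirmed here (mimeType is required; uri and bufferView are mutually exclusive) could be checked with something like the sketch below. The function name is hypothetical, and the third rule, that at least one of the two sources must be present, is an assumption rather than something stated in this thread.

```python
def validate_format_entry(entry: dict) -> list:
    """Return a list of problems found in one image format entry."""
    errors = []
    if "mimeType" not in entry:
        errors.append("mimeType is required")
    has_uri = "uri" in entry
    has_view = "bufferView" in entry
    if has_uri and has_view:
        errors.append("uri and bufferView are mutually exclusive")
    if not has_uri and not has_view:
        # Assumption: an entry without any data source is not useful.
        errors.append("one of uri or bufferView is needed")
    return errors
```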

@javagl

Contributor

javagl commented Feb 15, 2017

There's no such thing as "raw" data, because it would significantly complicate loading anything beyond one-level-one-face-2d-texture.

OK, that wasn't clear to me. Again, I'm not so deeply involved here, but thought that it could be possible to roughly have something like this (pseudocode) :

{
    "format" : GL_SOME_CUBE_MAP_TYPE,
    "faces" : 6,
    "bufferViews" : [
        0,  // the bufferView containing the raw RGBA data for the +x side
        1,  // the bufferView containing the raw RGBA data for the -x side
        ...
        5,  // the bufferView containing the raw RGBA data for the -z side
    ]
}

For me, "encodedData+mimeType" was basically equivalent to "raw data" (even though, of course, it's a trade-off of file size vs. decoding effort). I didn't see why something like this should not be supported. But now it's clear: The bufferView references will always contain encoded data (like JPG data). Sorry, I didn't want to open a can of worms here.

@lexaknyazev

Member

lexaknyazev commented Feb 15, 2017

KTX has a simple fixed-size header at the beginning. The rest is just a fixed-order concatenation of all "bufferViews" from your example.
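
For reference, a sketch of reading that fixed-size header, based on the KTX 1 layout (12-byte identifier followed by thirteen 32-bit fields, 64 bytes in total). Field names follow the KTX spec in snake_case; the class and function names are made up for this illustration, and nothing beyond the header is parsed.

```python
import struct
from typing import NamedTuple

# File identifier: «KTX 11»\r\n\x1A\n
KTX1_IDENTIFIER = bytes([0xAB, 0x4B, 0x54, 0x58, 0x20, 0x31, 0x31, 0xBB,
                         0x0D, 0x0A, 0x1A, 0x0A])


class Ktx1Header(NamedTuple):
    gl_type: int
    gl_type_size: int
    gl_format: int
    gl_internal_format: int
    gl_base_internal_format: int
    pixel_width: int
    pixel_height: int
    pixel_depth: int
    number_of_array_elements: int
    number_of_faces: int
    number_of_mipmap_levels: int
    bytes_of_key_value_data: int


def parse_ktx1_header(data: bytes) -> Ktx1Header:
    """Parse the 64-byte KTX 1 header, honoring the file's endianness field."""
    if len(data) < 64 or data[:12] != KTX1_IDENTIFIER:
        raise ValueError("not a KTX 1 file")
    (endianness,) = struct.unpack_from("<I", data, 12)
    if endianness == 0x04030201:
        order = "<"   # file was written little-endian
    elif endianness == 0x01020304:
        order = ">"   # file was written big-endian
    else:
        raise ValueError("invalid endianness field")
    return Ktx1Header(*struct.unpack_from(order + "12I", data, 16))
```

After the header come bytes_of_key_value_data bytes of metadata and then, for each mip level, a 32-bit image size followed by that level's data, so a loader can walk the levels without any extra information from the glTF JSON.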

@theanohana


theanohana commented Feb 24, 2017

How can I turn a PNG into a glTF mesh with a KTX texture?

@robertlong

Contributor

robertlong commented Apr 5, 2017

Would there be any required image format (KTX)? Or would a GLTF 2.0 compatible renderer have to support KTX, png, and jpeg?

@sbtron

Contributor

sbtron commented Apr 6, 2017

glTF 2.0 will stick to png and jpeg. Using KTX is an exploration to ensure we can add it in a future update in a compatible manner.

@robertlong

Contributor

robertlong commented May 3, 2017

It sounds like KTX will not make its way into GLTF 2.0.

Is there any proposal for adding support as an extension?

KTX would bring huge optimizations to UnityGLTFLoader. Unity currently handles runtime png/jpg loading very poorly and having the option to load a compressed texture when available would help a lot.

@pjcozzi

Member

pjcozzi commented May 6, 2017

AFAIK there's no current work on a KTX extension, but you are welcome to get the ball rolling on one!

@sakrist


sakrist commented Mar 17, 2018

Hi there!

I've put together my ideas for an extension to support compressed textures in glTF.
Here you can check what it looks like.

The main use case is an online extension, i.e. "client request glTF file with specific compression type from hosted application on remote server."

@TimvanScherpenzeel TimvanScherpenzeel referenced this issue Apr 25, 2018

Closed

2.0 Roadmap #330

5 of 6 tasks complete
@TimvanScherpenzeel


TimvanScherpenzeel commented Apr 25, 2018

The KTX extension would make the most sense in my opinion (as opposed to separate extensions for DDS / PVR). In an effort to have a single tool I've created https://github.com/timvanScherpenzeel/texture-compressor which is heavily based on the compressed texture generation tooling in https://github.com/AnalyticalGraphicsInc/gltf-pipeline. (My apologies if this sounds like a promotion for my tool; it is merely meant as a way to show that it is possible.)

ASTC, ETC, PVRTC and S3TC are all wrapped in a KTX container and able to be decoded correctly using KTXLoader in Three.js. Apart from some smaller issues (like missing mipmapping support in https://github.com/ARM-software/astc-encoder) this appears to work fine.

@robertlong

Contributor

robertlong commented May 9, 2018

I'd also like to see cross-platform support for compressed textures. There have been talks about adding support for a Universal Compressed Texture Format; however, it is not clear what licenses are needed to encode/transcode/decode these textures or when we can expect the extension to be made available to the public.

Until we have the universal compressed texture extension it would be nice to be able to use existing compressed texture formats. A KTX extension and the ability to specify multiple image formats would fill this space in the interim.

As @sakrist mentioned we also have this "client request glTF file with specific compression type from hosted application on remote server" use case in Mozilla Hubs. Currently png/jpeg image decoding is causing a lot of hitching in our app. WebGL doesn't have a great way to offload the cost of decoding these images to another thread. @takahirox has been doing amazing work on the ThreeJS GLTFLoader within the limitations browsers have right now. Adding support for cross-platform compressed textures would help reduce that cost even more.

I like @lexaknyazev's original proposal for an image formats array with support for png/jpg/ktx files. Would anyone else be in favor of making this an extension to hold us over until a universal format is agreed upon and implemented? If so I will create a proposal and submit a PR.

@dewilkinson


dewilkinson commented May 21, 2018

Hi all,

I've been following the above comments and proposals with interest - yes, we are currently working on a KHR_texture_transmission extension for the purpose of transporting compressed textures - via KTX or similar style container format - that would enable import and export of block-compressed texture assets within glTF2.0 scene data.

The full transmission extension is expected to also feature support for a universal transcodable format, along with proposed standardized RDO modes, LZ and rANS lossless encode stages for variable rate compression of texture data to approach jpeg-level compression ratios.

@richgel999 and I will be presenting an update during the upcoming 3DFormats call, Wed 23rd May. We would be happy to gather feedback and consensus as to whether we should pursue an interim extension purely for transmitting existing block formats via KTX without the universal format and extended compression and transmission modes.

Let's continue discussion on this topic in this forum, the 3DFormats group will also review and gather consensus on the appropriate direction to go from here regarding an interim compressed texture extension.

Kind Regards,

Dave Wilkinson
Texture Transmission TSG

@lexaknyazev

Member

lexaknyazev commented Nov 23, 2018

Closing this issue, since the path forward has been set.
KTX2 spec (WIP): https://github.com/KhronosGroup/KTX-Specification/
Texture transmission tools: https://github.com/KhronosGroup/glTF-Texture-Transmission-Tools/
