
Mesh Data rework #371

Draft · wants to merge 15 commits into master from the meshdata branch
Conversation


mosra commented Sep 13, 2019

What will this do:

  • Combine MeshData2D and MeshData3D into a single type, blurring the difference between the two (similarly to what was done for AnimationData earlier) and reducing code duplication
    • Need to deprecate the two original types, add implicit conversion for seamless backward compatibility
  • Expose new unified Importer API for those
    • Need to deprecate related Importer APIs (and make use of the implicit conversion for seamless backward compatibility)
  • Create a new AbstractMeshConverter plugin interface for operating on
    the MeshData
    • for meshoptimizer etc
  • Create a new magnum-meshconverter utility that would operate in-place on imported data (needs #240 as well, though)
    • wait, how exactly would this work? the use is a bit limited
  • Better support for generic vertex attributes (reserved IDs in the enum for those)
    • ability to annotate them with string names? could be per-importer (no need to store strings in the mesh data)
    • add tangent and "object ID" attributes at least, postpone weights and bone IDs to later (otherwise I wouldn't have a way to test API usability for these)
  • Meshlet / mesh shader support? The code has something already but I don't remember what exactly that is anymore ... also I hope it doesn't require yet another Array member, that would make the serialization a bit more complex
  • Might need to implement growable Containers::Array to avoid this being too painful for the importers
    • and some "pack this together, but preserve alignment" utility
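The "pack this together, but preserve alignment" utility mentioned above could boil down to computing padded offsets for a set of arrays placed into one allocation. A minimal sketch of that idea -- all names here are hypothetical, not Magnum API:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: given (size, alignment) pairs for several arrays,
// compute the offset of each inside one contiguous allocation so that
// every array keeps its required alignment
struct Block { std::size_t size, alignment; };

std::vector<std::size_t> packOffsets(const std::vector<Block>& blocks,
                                     std::size_t& totalSize) {
    std::vector<std::size_t> offsets;
    std::size_t offset = 0;
    for(const Block& b: blocks) {
        // Round up to the next multiple of the alignment
        offset = (offset + b.alignment - 1)/b.alignment*b.alignment;
        offsets.push_back(offset);
        offset += b.size;
    }
    totalSize = offset;
    return offsets;
}
```

For example, packing a 3-byte array followed by an 8-byte array with 4-byte alignment would place the second one at offset 4, with one padding byte in between.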

Serialization

Basically making MeshData (and later the other *Data types as well) easily serializable, allowing projects to convert arbitrary mesh formats to blobs that could be loaded directly via mmapping:

magnum-meshconverter file.blend --id "chair" --serialize chair.blob
magnum-meshconverter scene.glb --id "tree" --serialize tree.blob
...

cat chair.blob tree.blob car.blob > blobs.blob # because why not
auto blob = Utility::Directory::mapRead("blobs.blob");

// yes really
const Trade::MeshData& chair = *static_cast<const Trade::MeshData*>(blob.data());

Key points:

  • Each *Data would contain the same header with some magic, version info and total size. Magic and version info for sanity checks, total size to be able to simply cat multiple blobs together without having to worry about metadata. Similar to RIFF chunks (can't be 100% compatible because it's limited to 4 GB max, but can look there for inspiration).

    struct DataHeader {
        // some magic and version info (ideally stolen from PNG to detect broken encodings, CRLF conversion, bad endianness etc.)
        DataType type;
        uint64_t size;
        // ... more stuff

        const DataHeader* nextData() const {
            return reinterpret_cast<const DataHeader*>(
                reinterpret_cast<const char*>(this) + size);
        }
    };
    
    struct Trade::MeshData: DataHeader { ... };
    
    ...
    Containers::ArrayView<const char> data;
    for(const DataHeader* i = reinterpret_cast<const DataHeader*>(data.begin());
        reinterpret_cast<const char*>(i) < data.end(); i = i->nextData()) {
        if(i->type == DataType::MeshData) {
            const auto& mesh = static_cast<const Trade::MeshData&>(*i);
            ...
        } else if(i->type == DataType::AnimationData) {
            ...
        } else continue;
    }

    The concatenated blob can then be iterated like above, ignoring unknown chunks.

  • MeshData needs to support both self-contained operation (data bundled internally in an Array) and the above-shown "reinterpreted" operation. Because in the reinterpreted case no destructors are called, this could be done by putting a tag into the Array deleter field --- ideally with an odd value, to minimize the chance that an actual function could have the same address (I assume functions are never at odd addresses). Both 'M' (0x4d) and '!' (0x21) satisfy this condition, so the following would be an odd address on both BE and LE systems:

    if(std::strncmp(reinterpret_cast<const char*>(&arrayDeleter), "MAGNUM!", 8) == 0) {
        // okay, this is memory-mapped data -- array.data() is an offset,
        // not a pointer
    }
  • In order to avoid extra work, the data layout should be ~consistent between 32-bit and 64-bit systems -- otherwise it won't be possible to serialize data on a (64-bit) desktop and use them on (32-bit) Emscripten, for example. Array is 12 bytes on 32-bit and 24 bytes on 64-bit (pointer, size and deleter), so I think it could be enough to add padding around it on 32-bit systems; plus the above "deleter tag" logic needs to be extended to check for both the 32-bit and the 64-bit case.

  • Some assets might have one huge buffer with particular meshes being just views on it. Ideally the buffer should be uploaded as-is in a single piece, with meshes then referring to subranges. In this case the serialized MeshData needs a way to reference an external "data chunk" somehow -- some flag saying the offset is not internal but points into a different chunk, plus a data chunk "ID"? Or have that implicit -- it'll always be the first data chunk after a couple of data-less MeshData chunks?
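The cat-friendly chunk layout from the key points above can be sketched concretely. This is an illustrative stand-in, not the actual Magnum format: the ChunkHeader layout, the "BLOBv1.0" magic and the helper names are all made up for the example.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Minimal stand-in for the DataHeader sketched above
struct ChunkHeader {
    char magic[8];      // sanity check
    std::uint64_t size; // total chunk size including the header
};

// Append one chunk (header + payload) to a flat byte buffer -- the result
// is the same as what `cat a.blob b.blob` would produce from two files
void appendChunk(std::vector<char>& blob, const char* payload,
                 std::size_t payloadSize) {
    ChunkHeader header{};
    std::memcpy(header.magic, "BLOBv1.0", 8);
    header.size = sizeof(ChunkHeader) + payloadSize;
    const char* headerBytes = reinterpret_cast<const char*>(&header);
    blob.insert(blob.end(), headerBytes, headerBytes + sizeof(ChunkHeader));
    blob.insert(blob.end(), payload, payload + payloadSize);
}

// Count chunks by hopping from header to header, RIFF-style, stopping at
// EOF or at anything that doesn't carry the expected magic
std::size_t countChunks(const std::vector<char>& blob) {
    std::size_t count = 0, offset = 0;
    while(offset + sizeof(ChunkHeader) <= blob.size()) {
        ChunkHeader header;
        std::memcpy(&header, blob.data() + offset, sizeof(ChunkHeader));
        if(std::memcmp(header.magic, "BLOBv1.0", 8) != 0) break;
        // A zero-sized or OOB chunk means a broken file, stop there
        if(header.size < sizeof(ChunkHeader) ||
           header.size > blob.size() - offset) break;
        offset += header.size;
        ++count;
    }
    return count;
}
```

Concatenating two blobs produced this way and walking them with the loop above is exactly the `cat chair.blob tree.blob` scenario -- no global metadata needed.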

@mosra mosra added this to the 2019.0c milestone Sep 13, 2019
@mosra mosra added this to TODO in Asset management via automation Sep 13, 2019

Squareys left a comment

I added a couple of comments, just in case they are helpful. Really excited about this!

Review comments on src/Magnum/Trade/MeshData.cpp and src/Magnum/Trade/MeshData.h (resolved)
Squareys commented Sep 13, 2019

How about making DataType a char[4] so that it can be used as "magic"?

E.g. "Mg1m" could be "Magnum, version 1, mesh", or "Mg1a" etc. Would a, make the blobs distinguishable and easily identify magnum files as such to help with detection in Any*Importer.

There could also be a file header in addition to the chunk headers, stating the number of chunks, which would make detection of broken chunks easier etc.
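The four-character-code idea can be sketched like this -- the 'Mg1m'/'Mg1a' codes are just the examples from this comment, not a finalized format:

```cpp
#include <cstdint>

// Pack four characters into a 32-bit value so the DataType field doubles
// as a magic -- a sketch of the FourCC suggestion, not actual Magnum API
constexpr std::uint32_t fourCC(char a, char b, char c, char d) {
    return std::uint32_t(std::uint8_t(a)) |
           std::uint32_t(std::uint8_t(b)) << 8 |
           std::uint32_t(std::uint8_t(c)) << 16 |
           std::uint32_t(std::uint8_t(d)) << 24;
}

enum class DataType: std::uint32_t {
    Mesh = fourCC('M', 'g', '1', 'm'),      // "Magnum, version 1, mesh"
    Animation = fourCC('M', 'g', '1', 'a')  // "Magnum, version 1, animation"
};
```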

mosra commented Sep 13, 2019

How about making DataType a char[4] so that it can be used as "magic"?

Ah, forgot to list the "magic" field in there. Updated. Putting all that into just four bytes seems a bit too crowded, would prevent extensibility with 3rd party chunk types for example.

There could also be a file-header in addition to the blob header which suggests number of chunks

That's what I explicitly do not want, as then concatenation via cat would be impossible and you'd need some tool to put blobs together, extract them, or traverse them. With the way above (and analogously to RIFF), a file is correct if each chunk points to a (non-OOB) position of the next chunk with correct magic, or to EOF, and I think that could be enough :) Error detection could be further improved by having a CRC field for every chunk, for example ... but then why not just zip all that anyway :)
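The validation rule described here -- every chunk hop lands on a valid header and the last hop ends exactly at EOF -- can be sketched as follows. The header layout and the "BLOBv1.0" magic are illustrative, not the actual Magnum format.

```cpp
#include <cstdint>
#include <cstring>

// Illustrative chunk header, mirroring the DataHeader idea from the PR
// description (magic + total size, nothing else needed for traversal)
struct BlobChunkHeader {
    char magic[8];
    std::uint64_t size; // total chunk size including the header
};

// A blob is well-formed if hopping from chunk to chunk always lands on a
// header with the correct magic and the last hop ends exactly at EOF
bool validateBlob(const char* data, std::size_t totalSize) {
    std::size_t offset = 0;
    while(offset != totalSize) {
        // A truncated header means a broken file
        if(totalSize - offset < sizeof(BlobChunkHeader)) return false;
        BlobChunkHeader header;
        std::memcpy(&header, data + offset, sizeof(BlobChunkHeader));
        if(std::memcmp(header.magic, "BLOBv1.0", 8) != 0) return false;
        // Zero-sized or OOB chunks are broken as well
        if(header.size < sizeof(BlobChunkHeader)) return false;
        if(header.size > totalSize - offset) return false;
        offset += header.size;
    }
    return true; // every chunk was valid and the last one ended at EOF
}
```

Note that truncating the file or flipping a single magic byte makes the whole walk fail, which is the level of error detection argued for above -- anything stronger (CRCs) is a separate trade-off.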

@mosra mosra force-pushed the meshdata branch 2 times, most recently from 1dcbd35 to f1caab3 Nov 9, 2019
codecov-io commented Nov 10, 2019

Codecov Report

Merging #371 into master will increase coverage by 0.35%.
The diff coverage is 97.41%.


@@            Coverage Diff            @@
##           master    #371      +/-   ##
=========================================
+ Coverage   72.55%   72.9%   +0.35%     
=========================================
  Files         354     359       +5     
  Lines       18854   19130     +276     
=========================================
+ Hits        13679   13947     +268     
- Misses       5175    5183       +8
Impacted Files                          Coverage Δ
src/Magnum/Magnum.h                     100% <ø> (ø) ⬆️
src/Magnum/Trade/Trade.h                100% <ø> (ø) ⬆️
src/Magnum/Trade/AbstractImporter.h     100% <ø> (ø) ⬆️
src/Magnum/Trade/AbstractImporter.cpp   100% <100%> (ø) ⬆️
src/Magnum/Mesh.cpp                     100% <100%> (ø) ⬆️
src/Magnum/Trade/Data.cpp               100% <100%> (ø)
src/Magnum/Vk/Enums.cpp                 100% <100%> (ø) ⬆️
src/Magnum/Trade/AnimationData.h        100% <100%> (ø) ⬆️
src/Magnum/Trade/MeshData.h             100% <100%> (ø)
src/Magnum/GL/Mesh.cpp                  50.54% <100%> (ø) ⬆️
... and 10 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 98232f3...0281b82.

mosra added 7 commits Nov 4, 2019

  • Not yet sure if I should follow the PixelFormat naming or not. Maybe will change this later.
  • Better for checking accidents, as picking a wrong primitive / index type can lead to *serious* rendering issues. Similarly to a change done to (Compressed)PixelFormat in 2019.10.
  • Otherwise they take up too much space.
  • With API analogous to the (relatively) new AnimationData -- with one buffer containing all index data and one buffer containing all vertex data, both meant to be uploaded as-is to the GPU. This will eventually replace MeshData2D and MeshData3D; backwards compatibility and wiring up to other APIs will be done in follow-up commits.
  • Deprecating of the old ones comes later.
  • Will be used for MeshData that don't own the memory.
  • Copypaste error. This would hide graceful assert messages, which is not good.
@mosra mosra force-pushed the meshdata branch from f1caab3 to ad441a1 Nov 11, 2019
@mosra mosra force-pushed the meshdata branch from ad441a1 to 0281b82 Nov 13, 2019