multi-part document (MPD) support #26

ScanMountGoat · 2023-02-25T22:12:47Z

Ldraw has an MPD extension for embedding multiple files in the same file. The extension is documented here: https://www.ldraw.org/article/47.html.

While files using the MPD extension typically use the .mpd extension, it's also common to see .ldr files containing multiple embedded files. This seems to be the standard way to define submodels in a single ldraw file or embed custom parts.

The current parsing implementation assumes all subfile commands come from the resolver like DiskResolver, which won't be the case for embedded files. This causes resolve errors. The spec doesn't make any requirements on how to handle embedded filenames "masking" the filenames of actual parts on disk. This could theoretically be handled by just parsing and adding all the embedded files to the source map before recursing into the subfiles themselves. This would ensure the embedded submodel file gets loaded before we try and resolve the command 1 ... submodel.

This may also require changes to the return type of the parse commands. Returning a single SourceFile may not adequately capture that the file could have embedded files. This depends on whether a SourceFile should represent the list of commands in a single ldraw file or the file "blocks" as represented in the MPD extension. It still makes sense for parse_raw to return the SourceFile. The recursive version parse could return all commands in a single SourceFile, return a collection of SourceFile, or not return a SourceFile at all.

I'm currently working on reworking the file loading to try and support this extension as described above. Let me know if you have any suggestions or concerns.

The text was updated successfully, but these errors were encountered:

djeedai · 2023-02-26T15:40:41Z

Thanks for the details. I've read the specification and my reactions are as follow.

General remarks

I'm all for adding support for MPD.
The terminology seems (maybe surprisingly) to be that "file" is a single coherent block of commands, and "document" refers to an OS file (generally on disk). I think luckily the terminology is already correct in most places.

Library 📦 `weldr`

parse_raw() never resolves, so no issue.
parse() uses the given FileRefResolver and SourceMap to do resolutions after it has read the entire document (so, multiple files if MPD, once implemented), so there should be no issue.
parse() returns a single SourceFileRef which is coherent with the fact the MPD specification says there's a single "main" model (the first one), and other sub-files embedded inside the MPD probably shouldn't be exposed outside of it. We could return all embedded files here but I feel it's fine to return the main one only as long as the other ones are not lost and can still be found inside e.g. the updated SourceMap after the parse() call.
On that last point, this might be problematic depending on how the specification is interpreted:
1. If we consider the sentence "all other files in the MPD will only be rendered if they are referenced by the main model, directly or indirectly." then we may say that embedded sub-files are only for the main model, and shouldn't leak outside of the MPD. If so, because all sub-file refs are inserted into the SourceMap currently, I guess for MPD we need a "temporary" SourceMap that we discard at the end of the document parsing, to avoid exposing the embedded sub-files to other files.
2. If we consider the sentence "So far, there are no clear scoping or namespace rules on MPD files. If you put a file named stud.dat in your MPD file, don't be surprised to see your stud.dat file appear on the top of every single brick in your scene." then we may say that embedded sub-files are just like non-embedded ones, and so inserting into the SourceMap is correct. In that case the current implementation works I believe.
I feel like maybe we should go with the latter, as the second sentence would never have been written in the specification if the intended interpretation was the former. Or maybe have it configurable? But that might be a pain to support both.

Binary tool 📦 `weldr-bin` / ⚙ `weldr`

The current parsing implementation assumes all subfile commands come from the resolver like DiskResolver.

I believe this is incorrect. The current implementation first checks SourceMap, then only if the file is not found does it asks the resolver (by enqueuing into the ResolveQueue). So if the initial document parsing inserted all embedded files into the SourceMap then once we start to resolve we'll find all embedded files and the resolver is never called.

This is just from my first reading of some 2+ year old code so I might be misunderstanding or there might be a bug, like we don't query the SourceMap before calling the resolver in some places. But overall it looks like the design would allow handling MPD without much change.

ScanMountGoat · 2023-02-27T02:54:44Z

I feel like maybe we should go with the latter, as the second sentence would never have been written in the specification if the intended interpretation was the former. Or maybe have it configurable? But that might be a pain to support both.

After giving this some more thought, I noticed that all the official parts use the extension .dat. Embedded files for applications like bricklink studio typically don't have any file extensions on submodels since these are user generated names like "vent" or "roof". The MPD files for official sets in the LDraw model repository use the .ldr extension for submodels. I've also seen .ldr files containing multiple submodels with the .ldr extension. While name collisions with official parts or other submodels are possible in theory, I'm not sure this would happen often in practice. I'll go with the simple approach for now and just add embedded files to the source map.

I believe this is incorrect. The current implementation first checks SourceMap, then only if the file is not found does it asks the resolver (by enqueuing into the ResolveQueue). So if the initial document parsing inserted all embedded files into the SourceMap then once we start to resolve we'll find all embedded files and the resolver is never called.

Yeah I misinterpreted it at first. I should be able to just handle this using the SourceMap like you mentioned.

djeedai · 2024-05-09T06:25:50Z

Support added in #27, closing this.

djeedai added the enhancement New feature or request label Feb 26, 2023

djeedai mentioned this issue Feb 27, 2023

Roadmap 1.0 #2

Open

22 tasks

djeedai closed this as completed May 9, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

multi-part document (MPD) support #26

multi-part document (MPD) support #26

ScanMountGoat commented Feb 25, 2023

djeedai commented Feb 26, 2023

ScanMountGoat commented Feb 27, 2023 •

edited

djeedai commented May 9, 2024

multi-part document (MPD) support #26

multi-part document (MPD) support #26

Comments

ScanMountGoat commented Feb 25, 2023

djeedai commented Feb 26, 2023

General remarks

Library 📦 weldr

Binary tool 📦 weldr-bin / ⚙ weldr

ScanMountGoat commented Feb 27, 2023 • edited

djeedai commented May 9, 2024

Library 📦 `weldr`

Binary tool 📦 `weldr-bin` / ⚙ `weldr`

ScanMountGoat commented Feb 27, 2023 •

edited