Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multi-part document (MPD) support #26

Closed
ScanMountGoat opened this issue Feb 25, 2023 · 3 comments
Closed

multi-part document (MPD) support #26

ScanMountGoat opened this issue Feb 25, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

@ScanMountGoat
Copy link
Contributor

Ldraw has an MPD extension for embedding multiple files in the same file. The extension is documented here: https://www.ldraw.org/article/47.html.

While files using the MPD extension typically use the .mpd extension, it's also common to see .ldr files containing multiple embedded files. This seems to be the standard way to define submodels in a single ldraw file or embed custom parts.

The current parsing implementation assumes all subfile commands come from the resolver like DiskResolver, which won't be the case for embedded files. This causes resolve errors. The spec doesn't make any requirements on how to handle embedded filenames "masking" the filenames of actual parts on disk. This could theoretically be handled by just parsing and adding all the embedded files to the source map before recursing into the subfiles themselves. This would ensure the embedded submodel file gets loaded before we try and resolve the command 1 ... submodel.

This may also require changes to the return type of the parse commands. Returning a single SourceFile may not adequately capture that the file could have embedded files. This depends on whether a SourceFile should represent the list of commands in a single ldraw file or the file "blocks" as represented in the MPD extension. It still makes sense for parse_raw to return the SourceFile. The recursive version parse could return all commands in a single SourceFile, return a collection of SourceFile, or not return a SourceFile at all.

I'm currently working on reworking the file loading to try and support this extension as described above. Let me know if you have any suggestions or concerns.

@djeedai
Copy link
Owner

djeedai commented Feb 26, 2023

Thanks for the details. I've read the specification and my reactions are as follow.

General remarks

  • I'm all for adding support for MPD.
  • The terminology seems (maybe surprisingly) to be that "file" is a single coherent block of commands, and "document" refers to an OS file (generally on disk). I think luckily the terminology is already correct in most places.

Library 📦 weldr

  • parse_raw() never resolves, so no issue.

  • parse() uses the given FileRefResolver and SourceMap to do resolutions after it has read the entire document (so, multiple files if MPD, once implemented), so there should be no issue.

  • parse() returns a single SourceFileRef which is coherent with the fact the MPD specification says there's a single "main" model (the first one), and other sub-files embedded inside the MPD probably shouldn't be exposed outside of it. We could return all embedded files here but I feel it's fine to return the main one only as long as the other ones are not lost and can still be found inside e.g. the updated SourceMap after the parse() call.

  • On that last point, this might be problematic depending on how the specification is interpreted:

    1. If we consider the sentence "all other files in the MPD will only be rendered if they are referenced by the main model, directly or indirectly." then we may say that embedded sub-files are only for the main model, and shouldn't leak outside of the MPD. If so, because all sub-file refs are inserted into the SourceMap currently, I guess for MPD we need a "temporary" SourceMap that we discard at the end of the document parsing, to avoid exposing the embedded sub-files to other files.
    2. If we consider the sentence "So far, there are no clear scoping or namespace rules on MPD files. If you put a file named stud.dat in your MPD file, don't be surprised to see your stud.dat file appear on the top of every single brick in your scene." then we may say that embedded sub-files are just like non-embedded ones, and so inserting into the SourceMap is correct. In that case the current implementation works I believe.

    I feel like maybe we should go with the latter, as the second sentence would never have been written in the specification if the intended interpretation was the former. Or maybe have it configurable? But that might be a pain to support both.

Binary tool 📦 weldr-bin / ⚙ weldr

The current parsing implementation assumes all subfile commands come from the resolver like DiskResolver.

I believe this is incorrect. The current implementation first checks SourceMap, then only if the file is not found does it asks the resolver (by enqueuing into the ResolveQueue). So if the initial document parsing inserted all embedded files into the SourceMap then once we start to resolve we'll find all embedded files and the resolver is never called.

This is just from my first reading of some 2+ year old code so I might be misunderstanding or there might be a bug, like we don't query the SourceMap before calling the resolver in some places. But overall it looks like the design would allow handling MPD without much change.

@djeedai djeedai added the enhancement New feature or request label Feb 26, 2023
@ScanMountGoat
Copy link
Contributor Author

ScanMountGoat commented Feb 27, 2023

I feel like maybe we should go with the latter, as the second sentence would never have been written in the specification if the intended interpretation was the former. Or maybe have it configurable? But that might be a pain to support both.

After giving this some more thought, I noticed that all the official parts use the extension .dat. Embedded files for applications like bricklink studio typically don't have any file extensions on submodels since these are user generated names like "vent" or "roof". The MPD files for official sets in the LDraw model repository use the .ldr extension for submodels. I've also seen .ldr files containing multiple submodels with the .ldr extension. While name collisions with official parts or other submodels are possible in theory, I'm not sure this would happen often in practice. I'll go with the simple approach for now and just add embedded files to the source map.

I believe this is incorrect. The current implementation first checks SourceMap, then only if the file is not found does it asks the resolver (by enqueuing into the ResolveQueue). So if the initial document parsing inserted all embedded files into the SourceMap then once we start to resolve we'll find all embedded files and the resolver is never called.

Yeah I misinterpreted it at first. I should be able to just handle this using the SourceMap like you mentioned.

@djeedai djeedai mentioned this issue Feb 27, 2023
22 tasks
@djeedai
Copy link
Owner

djeedai commented May 9, 2024

Support added in #27, closing this.

@djeedai djeedai closed this as completed May 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants