feat(iso): concat Blu-ray main feature across clips and discs #599
Merged
Long Blu-ray releases split the main feature two ways: across multiple M2TS clips within a disc (joined by BDMV/PLAYLIST/*.mpls), and across multiple discs in one NZB (e.g. AVATAR_FIRE_AND_ASH_DISC_1 / _DISC_2). The importer previously kept only the single largest M2TS per ISO, which both dropped the rest of the movie within a disc and treated each disc as an unrelated file. Now ExpandISOContents (shared between rar and sevenzip aggregators) parses the main MPLS playlist on each ISO, reads the 9660 PVD volume label, groups ISOs by stripped base name with a DISC|CD|PART suffix regex, and emits a single Content whose NestedSources chain spans every M2TS in disc-then-playlist order. The metadata layer's existing nested multi-reader produces one seamless seekable virtual file. Non-BDMV discs and unparseable playlists fall back to the legacy largest-file behaviour so nothing regresses.
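The disc-grouping step can be sketched as follows. This is a minimal illustration, not the importer's actual code: the pattern `discSuffix`, the helpers `splitDiscLabel` and `groupDiscs`, and the `discRef` type are all hypothetical names, and the exact separator set the real regex accepts is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
	"strings"
)

// discSuffix is a hypothetical pattern: at least one separator, then
// DISC, CD or PART, optional separators, and a trailing disc number.
var discSuffix = regexp.MustCompile(`(?i)[\s._-]+(?:DISC|CD|PART)[\s._-]*(\d+)$`)

// discRef pairs an original label with the disc number parsed from it.
type discRef struct {
	Label  string
	Number int
}

// splitDiscLabel returns the upper-cased base name with the disc suffix
// stripped, plus the disc number (0 when the label carries no suffix).
func splitDiscLabel(label string) (string, int) {
	label = strings.ToUpper(strings.TrimSpace(label))
	m := discSuffix.FindStringSubmatchIndex(label)
	if m == nil {
		return label, 0
	}
	n, _ := strconv.Atoi(label[m[2]:m[3]])
	return label[:m[0]], n
}

// groupDiscs buckets labels by stripped base name and orders each bucket
// by disc number, so a concat can walk the discs in order.
func groupDiscs(labels []string) map[string][]discRef {
	groups := make(map[string][]discRef)
	for _, l := range labels {
		base, n := splitDiscLabel(l)
		groups[base] = append(groups[base], discRef{Label: l, Number: n})
	}
	for _, g := range groups {
		sort.SliceStable(g, func(i, j int) bool { return g[i].Number < g[j].Number })
	}
	return groups
}

func main() {
	groups := groupDiscs([]string{"AVATAR_FIRE_AND_ASH_DISC_2", "AVATAR_FIRE_AND_ASH_DISC_1"})
	for base, discs := range groups {
		fmt.Println(base, discs)
	}
}
```

Requiring at least one separator before the keyword keeps a label that merely ends in "CD1" from being split; whether the real regex makes the same choice is not stated in this PR.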
On a 3D-only Blu-ray release (e.g. AVATAR_FIRE_AND_ASH_3D), the main feature playlist references clips that exist only as SSIF files in BDMV/STREAM/SSIF/ — the M2TS directory holds short extras. The previous resolver indexed only M2TS, so the long 3D playlist failed to resolve any clips and a short extras playlist won by default, producing a ~177 MB virtual file for a movie whose NZB carries ~88 GB of source data. Resolve clip names against M2TS first (preserves the smaller, more compatible 2D version on hybrid 3D releases) and fall back to SSIF when only it can satisfy the playlist. Two new test cases cover the 3D-only-with-SSIF and hybrid-prefers-M2TS paths.
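A rough sketch of the M2TS-first, SSIF-fallback lookup. The helper names (`indexStreams`, `resolveClips`) are hypothetical, and the sketch assumes the fallback is applied per clip; the real resolver may decide per playlist instead.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// indexStreams maps a clip ID such as "00022" to its original path for
// every file that sits directly in dir and has extension ext.
func indexStreams(files []string, dir, ext string) map[string]string {
	idx := make(map[string]string)
	for _, f := range files {
		up := strings.ToUpper(strings.TrimPrefix(f, "/"))
		if path.Dir(up) != dir || path.Ext(up) != ext {
			continue
		}
		idx[strings.TrimSuffix(path.Base(up), ext)] = f
	}
	return idx
}

// resolveClips returns the stream path for each playlist clip in playlist
// order. M2TS wins when present; SSIF is used only when no M2TS exists for
// that clip. ok is false as soon as one clip cannot be resolved.
func resolveClips(clips, files []string) (paths []string, ok bool) {
	m2ts := indexStreams(files, "BDMV/STREAM", ".M2TS")
	ssif := indexStreams(files, "BDMV/STREAM/SSIF", ".SSIF")
	for _, c := range clips {
		if p, found := m2ts[c]; found {
			paths = append(paths, p)
		} else if p, found := ssif[c]; found {
			paths = append(paths, p)
		} else {
			return nil, false
		}
	}
	return paths, true
}

func main() {
	files := []string{
		"BDMV/STREAM/00001.m2ts",
		"BDMV/STREAM/SSIF/00001.ssif",
		"BDMV/STREAM/SSIF/00022.ssif",
	}
	fmt.Println(resolveClips([]string{"00001", "00022"}, files))
}
```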
A repeated 88GB-NZB run is still producing a 177MB virtual file with clips=2 — byte-identical to the pre-SSIF-fix output. Three hypotheses remain: stale binary, 'no actual SSIF in this BDMV' (release uses M2TS only), or SSIF lives at a non-standard path. Add one summary log per ISO (total files, playlist count, M2TS and SSIF clip counts, 12 sample paths) and one log per evaluated MPLS (resolved clip count, unresolved count, duration ticks, summed stream bytes) plus one 'picked' line. All prefixed with [DEBUG-isobd] for cheap cleanup and to confirm the new binary is live (the prefix won't appear in prior builds).
Real-ISO run shows all 38 playlists with items=1, max duration 80s, max stream bytes 141MB, yet the NZB carries ~88GB across 2 ISOs. Either ListISOFiles is dropping huge files (UDF alloc-type 2/3 not handled) or reading wrong sizes for them. Add to the bdmv-scan log:
- sum of every file size (across all entries)
- sum of M2TS-only and SSIF-only sizes
- the 6 largest files with human-readable sizes
One log line will distinguish 'sizes truncated', 'big files missing', and 'release is genuinely tiny'.
Real run shows all_files_sum_bytes=1.13 GiB across 295 files, biggest single file 135 MiB. NZB is 88 GiB across 2 ISOs. Need to know whether src.Size (claimed ISO bytes from the outer RAR archive) matches the sum of what ListISOFiles enumerated, or whether the walker is missing multi-GB files. One [DEBUG-isobd] iso analyse line per ISO now prints filename, iso_size, listed_files, listed_sum, and coverage_pct so the discrepancy is impossible to miss.
Root cause of the 'main feature M2TS files invisible' bug. udfReadDirEntries parsed every File Identifier Descriptor in a directory but only ever read the FIRST 2048-byte sector of each allocation descriptor's extent, even when the extent's ad.length claimed it spanned many sectors. A Blu-ray BDMV/STREAM/ directory with ~2500 FIDs (~30 KiB of FID data) lost every entry past the first sector, including the multi-GB main-feature clips 00016/00017/00022/00023/00028/00029 and the corresponding SSIF files.
Local repro against AVATAR_FIRE_AND_ASH_3D_DISC_1.iso (37 GiB):
- Before: listed_files=298 sum=1.16 GiB coverage=3.1% (no clip >135 MiB)
- After: listed_files=2523 sum=74 GiB (00022.m2ts=17 GiB ✓)
Fix factors out readMetaExtent / readICBExtent helpers that walk every sector of an extent until ad.length is exhausted. Both fail-soft on EOF so a malformed image returns partial data rather than aborting the import.
The pre-existing TestUDFReadDirEntriesShortADClampsExtentLength was pinning the BUGGY behaviour (it asserted the walker would truncate to one sector); renamed to TestUDFReadDirEntriesTruncatedExtent and now asserts the new contract: when an extent claims more sectors than the image contains, the walker returns whatever data it could read without an error.
Adds fs_local_test.go: an ALTMOUNT_LOCAL_ISO=<path> gated integration test that catches this class of bug instantly against a real ISO. Skipped in CI.
Also strips the [DEBUG-isobd] / [DEBUG-walk] instrumentation added during the investigation and tones the resolver / processor logs down to one production-grade INFO line per ISO and per main-feature pick.
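The "walk every sector until the extent length is exhausted, fail-soft on EOF" contract can be illustrated like this. It is a sketch under assumptions: `readExtent` and `memImage` are hypothetical stand-ins, not the real readMetaExtent / readICBExtent helpers, and the image is modelled as a plain io.ReaderAt.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

const sectorSize = 2048

// memImage is an in-memory stand-in for an ISO image.
type memImage []byte

func (m memImage) ReadAt(p []byte, off int64) (int, error) {
	if off < 0 || off >= int64(len(m)) {
		return 0, io.EOF
	}
	n := copy(p, m[off:])
	if n < len(p) {
		return n, io.EOF
	}
	return n, nil
}

// readExtent reads length bytes starting at sector lba, one sector at a
// time. If the image ends before the extent does, it returns the bytes it
// managed to read and a nil error instead of failing the whole walk.
func readExtent(r io.ReaderAt, lba, length uint32) ([]byte, error) {
	var buf []byte
	sector := make([]byte, sectorSize)
	off := int64(lba) * sectorSize
	for remaining := int64(length); remaining > 0; {
		want := int64(sectorSize)
		if remaining < want {
			want = remaining
		}
		n, err := r.ReadAt(sector[:want], off)
		buf = append(buf, sector[:n]...)
		if err != nil {
			if errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF) {
				return buf, nil // truncated image: keep the partial data
			}
			return buf, err
		}
		off += int64(n)
		remaining -= int64(n)
	}
	return buf, nil
}

func main() {
	img := make(memImage, 3*sectorSize+100)
	full, _ := readExtent(img, 1, 2*sectorSize+50)
	short, err := readExtent(img, 2, 5*sectorSize)
	fmt.Println(len(full), len(short), err)
}
```

The buffer grows by appending rather than being pre-allocated from the claimed length, so a malformed descriptor with a huge length cannot force a large allocation up front.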
The directory-listing fix exposed a second latent bug downstream: the walker only stored ONE allocation descriptor's LBA per file even though huge Blu-ray clips are split across hundreds of extents (Avatar's 00022.m2ts: 945, 00023.m2ts: 945, 00028.m2ts: 294, 00016.m2ts: 238). For every multi-extent file, downstream reads of bytes past the first extent's length returned wrong sectors (whatever happened to live next to extent 1 on disc) instead of the file's real data: silent corruption ~50× the size of the visible bug.
Changes:
- isoFileEntry now carries []isoExtent instead of a single lba field.
- collectFileExtents() walks every inline AD and chases Allocation Extent Descriptor (UDF tag 258) chains so files with more ADs than fit in the FE sector are fully enumerated. Caps total extent bytes at info_length so a malformed FE can't yield more data than the file claims.
- ISOFileContent gains a Sources []ISONestedSource slice (one per extent) and drops the single-Segments / single-NestedSource fields.
- buildFileContent emits one ISONestedSource per extent: unencrypted ISOs pre-slice outer segments to cover each extent; encrypted ISOs keep the full outer segments and seek via InnerOffset (AES-CBC IV chain still anchors at byte 0 of the outer ISO).
- archive.isoFileContentToNestedSource → isoFileContentToNestedSources fans the slice out into one archive.NestedSource per extent.
- buildMainFeatureContent and buildLargestFileContent thread the multi-source path so the final concat Content carries every extent of every clip in disc-then-playlist order.
Verified against the real Avatar disc 1 ISO via fs_local_test.go: 00022.m2ts: 945 extents, sum-of-extent-lengths == 17 GiB info_length. TestLocalISO_DiscoverBigFiles asserts >=2 extents and full coverage for the sentinel big-clip set.
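To make the multi-extent point concrete, here is a small sketch of capping an extent list at info_length and mapping a file offset to its on-disc byte offset. The `extent` struct and both helpers are hypothetical; the real isoExtent fields are not shown in this PR.

```go
package main

import "fmt"

const sectorSize = 2048

// extent is an assumed shape: a starting sector plus a byte length.
type extent struct {
	LBA    uint32
	Length uint64
}

// capExtents trims the list so the summed length never exceeds infoLength,
// mirroring the "cap at info_length" rule for malformed file entries.
func capExtents(exts []extent, infoLength uint64) []extent {
	out := make([]extent, 0, len(exts))
	remaining := infoLength
	for _, e := range exts {
		if remaining == 0 {
			break
		}
		if e.Length > remaining {
			e.Length = remaining
		}
		out = append(out, e)
		remaining -= e.Length
	}
	return out
}

// locate maps a byte offset inside the file to a byte offset on disc by
// walking the extents in order. With a single stored LBA, any offset past
// the first extent would land on unrelated sectors.
func locate(exts []extent, fileOff uint64) (discOff int64, ok bool) {
	for _, e := range exts {
		if fileOff < e.Length {
			return int64(e.LBA)*sectorSize + int64(fileOff), true
		}
		fileOff -= e.Length
	}
	return 0, false
}

func main() {
	exts := capExtents([]extent{{LBA: 100, Length: 4096}, {LBA: 500, Length: 4096}}, 6000)
	fmt.Println(exts)
	fmt.Println(locate(exts, 5000))
}
```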
A BD3D SSIF often emits a dozen separate UDF allocation descriptors for
what's a single contiguous run of sectors on disc. After the multi-
extent fix, each AD became its own NestedSource — bloating the proto
metadata, the validation-sample surface, and the per-file open-handle
count for what is logically one extent.
coalesceExtents merges adjacent extents whose physical sectors follow
the previous extent's last sector. Measured against the real Avatar
disc 1 ISO:
- BDMV/STREAM/SSIF/00022.ssif (22 GiB): 23 extents -> 2
- BDMV/STREAM/SSIF/00028.ssif (7 GiB): 7 extents -> 1
- BDMV/STREAM/SSIF/00016.ssif (6 GiB): 6 extents -> 1
M2TS files keep their full extent list because BD authoring genuinely
interleaves the M2TS clips with the SSIF dependent-view data on disc.
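A minimal sketch of the merge rule, assuming an extent is a starting sector plus a byte length and that only sector-aligned predecessors can be physically adjacent. `mergeAdjacent` and `extent` are illustrative names, not the coalesceExtents implementation itself.

```go
package main

import "fmt"

const sectorSize = 2048

// extent is an assumed shape: a starting sector plus a byte length.
type extent struct {
	LBA    uint32
	Length uint64
}

// mergeAdjacent folds an extent into its predecessor when it starts on the
// sector immediately after the predecessor's last sector. Non-adjacent
// extents (for example interleaved M2TS data) are kept as they are.
func mergeAdjacent(exts []extent) []extent {
	out := make([]extent, 0, len(exts))
	for _, e := range exts {
		if e.Length == 0 {
			continue
		}
		if n := len(out); n > 0 {
			prev := &out[n-1]
			aligned := prev.Length%sectorSize == 0
			next := uint64(prev.LBA) + prev.Length/sectorSize
			if aligned && next == uint64(e.LBA) {
				prev.Length += e.Length
				continue
			}
		}
		out = append(out, e)
	}
	return out
}

func main() {
	in := []extent{{LBA: 10, Length: 4096}, {LBA: 12, Length: 2048}, {LBA: 20, Length: 2048}}
	fmt.Println(mergeAdjacent(in))
}
```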
Note: the recent import failure ("not a valid ISO 9660 or UDF image"
on disc 1, segment "44c89668..." unreachable during validation) is a
Usenet-side issue — disc 2 analysed cleanly in 30 seconds with the
same code path; disc 1 timed out reading its first sectors for 9
minutes before giving up. The coalescing change reduces the surface
where transient flakes can bite but cannot eliminate it.
Extract the Content -> FileMetadata mapping body (previously duplicated in rar.CreateFileMetadataFromRarContent and sevenzip.CreateFileMetadataFromSevenZipContent) into a shared package-level function archive.NewFileMetadataFromContent. Both processor methods now delegate to the shared function so the Processor interfaces and all existing callers (aggregator.go, test mocks) keep working unchanged. Behaviour is byte-for-byte preserved: same Status default, same AES handling, same NestedSegmentSource copy loop. This prepares Task 3 (ISO expansion) to persist FileMetadata for files discovered inside bare ISOs without depending on the RAR or 7z packages.
The UDF walker previously had seven sites where it silently dropped a file from its listing (continue/break with no log), making it impossible to diagnose missing files like BDMV/STREAM/00022.m2ts on Avatar disc 1. Thread context.Context through ListISOFiles -> udfWalkAll -> udfReadDirEntries -> collectFileExtents and emit slog.WarnContext at every silent drop site with the file path and a distinct reason. Behavior is unchanged; only diagnostics are added. A new in-memory test (TestUDFWalk_LogsWhenFileICBHasUnknownTag) drives the "unexpected tag" branch and asserts a WARN line is emitted with the file path and bogus tag id.
…ctory enumeration
Today udfWalkAll has no ctx.Err() check between files, so cancellation
only surfaces when the next sector read times out at the NNTP layer.
On a degraded network this can stretch a normal ~16ms/file walk into
minutes per ISO. Same for the AED-chain loop in collectFileExtents.
Add a ctx.Err() check at the top of each loop:
- udfWalkAll: returns the partial result + the cancellation error
immediately. iso_expansion.go already treats any error from the
walk as 'keep ISO as-is', so no caller change needed.
- collectFileExtents: returns []isoExtent (no error), so emit a
WARN in the existing 'AED chain truncated' style and break out
of the chain cleanly with whatever extents we have.
New test TestUDFWalk_StopsWhenContextCanceled builds a 3-FID synthetic
UDF blob, cancels the ctx before calling the walker, and asserts that
udfWalkAll returns context.Canceled within 100ms with an empty result
(i.e. no file ICB was read past the cancel point).
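The loop-top check described above follows the usual Go cancellation pattern. The sketch below uses hypothetical names (`walkEntries`, `walkCancelAfter`) and a string slice in place of real UDF entries; it only shows the shape of "partial result plus the cancellation error".

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// walkEntries visits entries in order and checks ctx.Err() before each one,
// so a cancelled context stops the walk without waiting for the next read
// to time out. It returns what was visited so far plus the context error.
func walkEntries(ctx context.Context, entries []string, visit func(string)) ([]string, error) {
	visited := make([]string, 0, len(entries))
	for _, e := range entries {
		if err := ctx.Err(); err != nil {
			return visited, err
		}
		visit(e)
		visited = append(visited, e)
	}
	return visited, nil
}

// walkCancelAfter runs a three-entry walk and cancels the context after n
// visits (n == 0 cancels before the walk starts). It reports how many
// entries were visited and whether the walk ended with context.Canceled.
func walkCancelAfter(n int) (int, bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if n == 0 {
		cancel()
	}
	seen := 0
	got, err := walkEntries(ctx, []string{"a", "b", "c"}, func(string) {
		seen++
		if seen == n {
			cancel()
		}
	})
	return len(got), errors.Is(err, context.Canceled)
}

func main() {
	fmt.Println(walkCancelAfter(0))
	fmt.Println(walkCancelAfter(1))
	fmt.Println(walkCancelAfter(5))
}
```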
A degraded NNTP provider could stall iso.AnalyzeISO for 9+ minutes per disc, blocking the whole importer. Wrap each AnalyzeISO call in a hard context.WithTimeout (default 120s, knob: Import.IsoAnalyzeTimeoutSeconds) so the existing fallback at iso_expansion.go takes over within a bounded window instead of waiting indefinitely.
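A sketch of the bounded-analysis wrapper, under stated assumptions: the analyse function is reduced to a callback returning a string, and `analyzeWithTimeout`, `expandOrKeep`, `stalledAnalyze`, `fastAnalyze`, and `shortTimeout` are hypothetical names. Only context.WithTimeout and the 120s default come from the description above.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// shortTimeout keeps the demo fast; the PR's default is 120s.
var shortTimeout = 20 * time.Millisecond

// analyzeWithTimeout runs analyze under a derived context that expires
// after timeout. The derived context gets its own name so the parent is
// not shadowed.
func analyzeWithTimeout(parent context.Context, timeout time.Duration,
	analyze func(context.Context) (string, error)) (string, error) {
	c, cancel := context.WithTimeout(parent, timeout)
	defer cancel()
	return analyze(c)
}

// stalledAnalyze simulates a degraded provider: it blocks until the
// context is done and then returns the context's error.
func stalledAnalyze(ctx context.Context) (string, error) {
	select {
	case <-ctx.Done():
		return "", ctx.Err()
	case <-time.After(time.Minute):
		return "expanded", nil
	}
}

// fastAnalyze simulates a healthy run that finishes immediately.
func fastAnalyze(context.Context) (string, error) { return "expanded", nil }

// expandOrKeep applies the fallback: any analysis error, including a
// timeout, means the ISO is kept as-is.
func expandOrKeep(timeout time.Duration, analyze func(context.Context) (string, error)) string {
	res, err := analyzeWithTimeout(context.Background(), timeout, analyze)
	if err != nil {
		return "keep-as-is"
	}
	return res
}

func main() {
	fmt.Println(expandOrKeep(shortTimeout, stalledAnalyze))
	fmt.Println(expandOrKeep(shortTimeout, fastAnalyze))
}
```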
…ix 32GB PROPFIND memory leak
…encrypted-ISO .meta from 8GB to ~6MB
…emux correctly
The streaming remux disabled itself on any unaligned start because it probed for the packet sync at byte 0/4 of the read offset. ffprobe seeks to a non-packet-aligned near-EOF offset to estimate duration, so the tail was served raw and the duration stayed wrong. Derive packet framing from the known clip byte grid (BDAV-192) instead of probing; pass leading mid-packet payload bytes through and rewrite full packets from the next boundary. Adds unaligned-start determinism coverage that reproduced the bug.
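The grid arithmetic can be sketched as below, assuming the clip's first packet starts at byte 0 and every packet is 192 bytes. `splitAtPacketGrid` is a hypothetical helper; the actual remux code is not shown here.

```go
package main

import "fmt"

// bdavPacketSize is the BDAV source packet size: a 4-byte extra header
// followed by a 188-byte transport stream packet.
const bdavPacketSize = 192

// splitAtPacketGrid splits a read of n bytes starting at clipOffset into:
//   lead    - bytes that finish a packet begun before the read (passed through)
//   packets - number of whole packets that follow (eligible for rewriting)
//   tail    - trailing bytes of a packet that the read does not complete
func splitAtPacketGrid(clipOffset int64, n int) (lead, packets, tail int) {
	if n <= 0 {
		return 0, 0, 0
	}
	lead = int((bdavPacketSize - clipOffset%bdavPacketSize) % bdavPacketSize)
	if lead > n {
		lead = n
	}
	rest := n - lead
	return lead, rest / bdavPacketSize, rest % bdavPacketSize
}

func main() {
	fmt.Println(splitAtPacketGrid(100, 1000))
	fmt.Println(splitAtPacketGrid(384, 192))
}
```

Because the framing comes from the offset alone, the result is the same no matter what bytes happen to sit at the read position, which is the property a sync-byte probe cannot offer on an unaligned start.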
ISO analysis (filesystem walk + Blu-ray playlist resolution over NNTP) can take tens of seconds, during which the queue item's progress bar sat frozen and, for RAR/7z-wrapped ISOs, mislabeled as "Analyzing archive". Thread a progress.Tracker end-to-end through the ISO analysis chain so the bar advances with an "Analyzing ISO" stage:
- progress.Tracker gains a nil-safe Slice(idx,count) helper that carves a child tracker covering one Nth of the parent's range.
- ExpandISOContents/AnalyzeISO/ResolveMainFeature accept a tracker; ResolveMainFeature reports per-playlist progress (each .mpls is an NNTP round-trip), ExpandISOContents gives each ISO its slice of the band.
- Bare ISOs (processor.go) get a dedicated 10->30 tracker.
- RAR/7z aggregators derive an "Analyzing ISO" tracker from the archive tracker via Slice(0,1).WithStage without mutating it; archives with no ISO emit no updates, so the common case is unchanged.
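One possible shape for a nil-safe Slice helper, as a sketch only. The `Tracker` type here is a hypothetical stand-in with an integer percent range and a report callback; the real progress.Tracker fields, its Update signature, and WithStage are not reproduced.

```go
package main

import "fmt"

// Tracker is an assumed stand-in: it maps work done onto a percent band
// [lo, hi] and forwards the result to a report callback.
type Tracker struct {
	lo, hi int
	report func(percent int)
}

// NewTracker builds a tracker covering the band lo..hi.
func NewTracker(lo, hi int, report func(int)) *Tracker {
	return &Tracker{lo: lo, hi: hi, report: report}
}

// Slice returns a child covering the idx-th of count equal parts of the
// parent's band. It is safe on a nil receiver and does not mutate the
// parent; invalid arguments return the parent unchanged.
func (t *Tracker) Slice(idx, count int) *Tracker {
	if t == nil {
		return nil
	}
	if count <= 0 || idx < 0 || idx >= count {
		return t
	}
	span := t.hi - t.lo
	return &Tracker{
		lo:     t.lo + span*idx/count,
		hi:     t.lo + span*(idx+1)/count,
		report: t.report,
	}
}

// Update reports done/total mapped into the tracker's band. A nil tracker
// silently ignores updates, so callers never need a nil check.
func (t *Tracker) Update(done, total int) {
	if t == nil || t.report == nil || total <= 0 {
		return
	}
	if done < 0 {
		done = 0
	}
	if done > total {
		done = total
	}
	t.report(t.lo + (t.hi-t.lo)*done/total)
}

func main() {
	root := NewTracker(10, 30, func(p int) { fmt.Println("progress:", p) })
	second := root.Slice(1, 2) // second of two ISOs
	second.Update(1, 2)        // halfway through its playlists

	var none *Tracker
	none.Slice(0, 1).Update(1, 1) // no-op, no panic
}
```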
javi11
added a commit
that referenced
this pull request
May 31, 2026
* feat(iso): concat Blu-ray main feature across clips and discs (#599)
* perf(iso): coalesce Blu-ray playlist reads into sequential runs
  ResolveMainFeature read every .mpls in BDMV/PLAYLIST/ one file at a time via readISOFile (Seek + ReadFull per extent). The backing DecryptingFile tears down its NNTP reader on every Seek and rebuilds a fresh UsenetReader + download manager on the next Read, with no segment cache. Since playlist files never end on sector boundaries, each .mpls paid a full reader teardown + fresh NNTP fetch, re-fetching the same clustered segments once per file (each fetch 1-5s on Usenet).
  Add readPlaylistsCoalesced: flatten all playlist extents, sort by disc offset, group contiguous neighbours into runs (split on gaps > 4 MiB or runs > 64 MiB), read each run with a single Seek + ReadFull, then reconstruct each playlist's bytes from the run buffers, byte-identical to readISOFile's multi-extent concat. A real PLAYLIST directory collapses to one sequential read the UsenetReader can prefetch across.
  Tests: TestReadPlaylistsCoalesced covers single/multi-run, multi-extent order, overlaps, zero-extent, read-error isolation, a differential test proving equivalence to readISOFile, and a seek-count test pinning the one-Seek-per-run property.
* fix(parser): avoid shadowing ctx with WithTimeout in fetchAllFirstSegments
  Renamed the derived context variable from ctx to c to prevent shadowing the parent context, making the timeout scope explicit and avoiding potential misuse of the already-cancelled context in surrounding code.
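The run-planning step of the coalesced read can be sketched as follows. `planRuns`, `span`, and `run` are hypothetical names, and the sketch covers only the grouping (sort by disc offset, split on gap and run-size limits), not the reads or the per-playlist reconstruction.

```go
package main

import (
	"fmt"
	"sort"
)

// span is one playlist extent on disc; run is one planned sequential read.
type span struct{ Off, Len int64 }
type run struct{ Off, Len int64 }

// planRuns sorts spans by disc offset and merges neighbours into runs.
// A new run starts when the gap to the previous run exceeds maxGap or
// when extending the run would make it longer than maxRun.
func planRuns(spans []span, maxGap, maxRun int64) []run {
	sorted := append([]span(nil), spans...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Off < sorted[j].Off })

	var runs []run
	for _, s := range sorted {
		if s.Len <= 0 {
			continue
		}
		end := s.Off + s.Len
		if n := len(runs); n > 0 {
			r := &runs[n-1]
			rEnd := r.Off + r.Len
			newEnd := rEnd
			if end > newEnd {
				newEnd = end
			}
			if s.Off-rEnd <= maxGap && newEnd-r.Off <= maxRun {
				r.Len = newEnd - r.Off // overlaps and small gaps are absorbed
				continue
			}
		}
		runs = append(runs, run{Off: s.Off, Len: s.Len})
	}
	return runs
}

func main() {
	const mib = 1 << 20
	spans := []span{{Off: 2000, Len: 1000}, {Off: 0, Len: 1000}, {Off: 10 * mib, Len: 100}}
	fmt.Println(planRuns(spans, 4*mib, 64*mib))
}
```

Reading across a small gap trades a few wasted bytes for one fewer Seek, which matters when each Seek costs a reader teardown.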
yoshitaka420
pushed a commit
to yoshitaka420/altmount
that referenced
this pull request
Jun 1, 2026
…11#630)
* feat(iso): concat Blu-ray main feature across clips and discs (javi11#599)
* perf(iso): coalesce Blu-ray playlist reads into sequential runs
* fix(parser): avoid shadowing ctx with WithTimeout in fetchAllFirstSegments
Summary
- Strips DISC|CD|PART_<n> suffixes from the ISO 9660 volume label (with a filename fallback), grouping discs that arrive in the same NZB.
- Emits a single Content per group whose NestedSources chain spans every M2TS in disc-then-playlist order, so the player sees a single seekable virtual .m2ts end-to-end via WebDAV / FUSE / Stremio.
- Extracted the shared archive.ExpandISOContents and removed the duplicated expandISOContents from rar and sevenzip.
Why
Long Blu-ray releases (the trigger here was AVATAR_FIRE_AND_ASH_DISC_1 / _DISC_2) split the main feature across both axes. The old picker dropped clips 2..N from each disc and treated each disc as an unrelated movie. The metadata layer's MetadataVirtualFile.createNestedReader already concatenates NestedSource chains with mixed encrypted/unencrypted members; we only needed to teach the importer to produce that ordered list.
Files
New:
- internal/importer/archive/iso/mpls.go + _test.go: minimal BDA-spec MPLS parser (clip names, IN/OUT ticks, multi-angle PlayItems skipped via length prefix).
- internal/importer/archive/iso/volume.go + _test.go: reads the 9660 PVD volume label from sector 16 (hybrid BD ISOs always carry one).
- internal/importer/archive/iso/bluray.go + _test.go: locates BDMV/PLAYLIST/*.mpls, picks the longest playlist, resolves clip names to ordered BDMV/STREAM/*.M2TS entries.
- internal/importer/archive/iso_expansion.go + _test.go: shared ExpandISOContents, disc-group regex, main-feature concat assembly.
Modified:
- internal/importer/archive/iso/types.go: new AnalyzedISO struct (VolumeLabel, Files, MainFeature, DurationTicks).
- internal/importer/archive/iso/processor.go: AnalyzeISO replaces AnalyzeISOContent; encrypted/unencrypted file build paths factored.
- internal/importer/archive/rar/aggregator.go and sevenzip/aggregator.go: call archive.ExpandISOContents; deleted the duplicated local implementations.
Behaviour
Tests
- go test -race ./... passes across the whole repo.
- go tool golangci-lint run ./internal/importer/archive/... clean.
Test plan
- AVATAR_FIRE_AND_ASH two-disc NZB; confirm a single virtual .m2ts appears at the library path with size ≈ sum of all main-feature M2TS across both discs.
Out of scope