Skip to content

Fix multiple issues in the Docker source plugin#96

Open
doraskayo wants to merge 5 commits intoapache:masterfrom
doraskayo:docker-fixes
Open

Fix multiple issues in the Docker source plugin#96
doraskayo wants to merge 5 commits intoapache:masterfrom
doraskayo:docker-fixes

Conversation

@doraskayo
Copy link

@doraskayo doraskayo commented Nov 12, 2025

This PR fixes multiple issues with the Docker source plugin. Fixing those issues is sufficient to get the image mentioned in #95 working, and likely others.

These changes shouldn't affect the resulting artifact of Docker images that were already fetched successfully by the source plugin. Their goal is to fix issues in the plugin that currently trigger Python exceptions with some (many?) publicly available multi-layered images.

See commit messages for more details.

Fixes #95

This prevents a failure to delete whiteout files that have ".wh."
more than once in their name.
The list of TarInfo received from tar.getmembers() is returned from
_get_extract_and_remove_files() to the caller and used to specify the
members to be extracted from the tar file. If the "tarinfo" parameter
is not set to ReadableTarInfo when the member list is created, its
file mode modifications would not apply to those instances of TarInfo.

As a result, the logic in ReadableTarInfo did not actually apply to
the file extraction process and file modes remained the same. This
results in errors such as the following when staged files cannot be
read by the running user:

  [00:10:04] FAILURE [0e4ec5fd] images/redhat-ubi8-x86_64.bst: Failed to capture tree /home/user/.cache/buildstream/tmp/staging-temppou9rkp9: code: 13
  message: "System error in `merklize()` for path \"/home/user/.cache/buildstream/tmp/staging-temppou9rkp9\": std::system_error exception thrown at [buildboxcommon_merklize.cpp:1125] [system:13], errMsg = \"Failed to open path \"\"etc/gshadow\"\"\", errno : Permission denied"

Setting tarinfo=ReadableTarInfo in _get_extract_and_remove_files()
solves this issue.
…ream

This version respects the umask, allowing access by buildbox-casd
running as a different user.

The implementation was copied from BuildStream version 2.6.0.[0]

[0] https://github.com/apache/buildstream/blob/4c68517662e17cd68f579db0b2d5c5e5d4451335/src/buildstream/plugins/sources/tar.py#L75-L92
The previous approach relied on removing files which were already
extracted in a previous step, which was susceptible to permission
and ownership issues.

For example, the following error is seen when the the permissions
of the parent doesn't allow removing a file:

  [00:00:06] FAILURE [0e4ec5fd] images/redhat-ubi8-x86_64.bst: docker source at images/redhat-ubi8-x86_64.bst [line 4 column 2]: Error staging source: [Errno 2] No such file or directory: '/home/coder/.cache/buildstream/tmp/staging-tempi542ewmq/root/buildinfo/'

Processing layer members before extracting them should avoid such
issues by never having to remove any files. It should also result
in significantly reduced disk IO when many whiteout files are
present.

The downside is slightly higher memory usage to store the member
list of all layers before extracting them, but the difference
should not be significant.
Opaque whiteouts hide the contents of their corresponding directories
in lower layers, but not the directories themselves.[0]

The implementation could be likely be optimized by implementing a
data structure that keeps a nested list of files for each directory,
but the focus of this change was correctness. The current
implementation only adds a few seconds of processing overhead to
images that take more than 10 minutes to be extracted, so it should
be good enough for now.

[0] https://specs.opencontainers.org/image-spec/layer/#opaque-whiteout
@doraskayo
Copy link
Author

@gtristan, @juergbi, @abderrahim, is anyone interested in fixing Python exceptions in the Docker source plugin? If so, I would really appreciate a review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Docker source plugin fails to fetch images

1 participant