Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOCKER_COMMAND_TIMEOUT is not long enough for saving large windows images #161

Closed
1 task done
testworksau opened this issue Jan 18, 2022 · 7 comments
Closed
1 task done
Labels
status:confirmed This issue has been reviewed and confirmed type:bug Something isn't working

Comments

@testworksau
Copy link

GitGuardian Shield Version 1.10.7

  • I can reproduce this bug in the latest version

Command executed

ggshield scan docker

Describe the bug

There is a 6 minute hardcoded timeout present for all Docker commands:

# bailout if docker command takes longer than 6 minutes
DOCKER_COMMAND_TIMEOUT = 360

We are finding that some of our Windows containers that we are scanning with the docker scanner are timing out at the image save command.

Expected behavior

No error occurs

Traceback

Traceback (most recent call last):
  File "C:\agent\builds\build-windows-i-00e5eb68f19754db7-1\ourorg\docker-image-builder\ggshield\lib\site-packages\ggshield\docker.py", line 53, in docker_save_to_tmp
    subprocess.run(
  File "C:\Python310\lib\subprocess.py", line 503, in run
    stdout, stderr = process.communicate(input, timeout=timeout)
  File "C:\Python310\lib\subprocess.py", line 1149, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "C:\Python310\lib\subprocess.py", line 1529, in _communicate
    raise TimeoutExpired(self.args, orig_timeout)
subprocess.TimeoutExpired: Command '['docker', 'save', 'artifactory.our.domain.name/our-docker/our-dotnet-framework-build:1.0', '-o', 'C:\\Users\\agent\\AppData\\Local\\Temp\\tmp4a2ffxyuggshield\\artifactory.our.domain.name--our-docker--our-dotnet-framework-build:1.0.tar']' timed out after 360 seconds
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\agent\builds\build-windows-i-00e5eb68f19754db7-1\ourorg\docker-image-builder\ggshield\Scripts\ggshield.exe\__main__.py", line 7, in <module>
  File "C:\agent\builds\build-windows-i-00e5eb68f19754db7-1\ourorg\docker-image-builder\ggshield\lib\site-packages\ggshield\cmd.py", line 229, in cli_wrapper
    return_code: int = cli.main(standalone_mode=False)
  File "C:\agent\builds\build-windows-i-00e5eb68f19754db7-1\ourorg\docker-image-builder\ggshield\lib\site-packages\click\core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "C:\agent\builds\build-windows-i-00e5eb68f19754db7-1\ourorg\docker-image-builder\ggshield\lib\site-packages\click\core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\agent\builds\build-windows-i-00e5eb68f19754db7-1\ourorg\docker-image-builder\ggshield\lib\site-packages\click\core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\agent\builds\build-windows-i-00e5eb68f19754db7-1\ourorg\docker-image-builder\ggshield\lib\site-packages\click\core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\agent\builds\build-windows-i-00e5eb68f19754db7-1\ourorg\docker-image-builder\ggshield\lib\site-packages\click\core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "C:\agent\builds\build-windows-i-00e5eb68f19754db7-1\ourorg\docker-image-builder\ggshield\lib\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "C:\agent\builds\build-windows-i-00e5eb68f19754db7-1\ourorg\docker-image-builder\ggshield\lib\site-packages\ggshield\docker.py", line 132, in docker_name_cmd
    return handle_exception(error, config.verbose)
  File "C:\agent\builds\build-windows-i-00e5eb68f19754db7-1\ourorg\docker-image-builder\ggshield\lib\site-packages\ggshield\utils.py", line 271, in handle_exception
    raise e
  File "C:\agent\builds\build-windows-i-00e5eb68f19754db7-1\ourorg\docker-image-builder\ggshield\lib\site-packages\ggshield\docker.py", line 117, in docker_name_cmd
    archive = str(docker_save_to_tmp(name, temporary_dir))
  File "C:\agent\builds\build-windows-i-00e5eb68f19754db7-1\ourorg\docker-image-builder\ggshield\lib\site-packages\ggshield\docker.py", line 67, in docker_save_to_tmp
    raise click.ClickException('Command "{}" timed out'.format(" ".join(command)))
click.exceptions.ClickException: Command "docker save artifactory.our.domain.name/our-docker/our-dotnet-framework-build:1.0 -o C:\Users\agent\AppData\Local\Temp\tmp4a2ffxyuggshield\artifactory.our.domain.name--our-docker--our-dotnet-framework-build:1.0.tar" timed out
@testworksau testworksau added type:bug Something isn't working status:new This issue needs to be reviewed labels Jan 18, 2022
@agateau-gg
Copy link
Collaborator

The timeout is hard-coded right now, which is not ideal. We are considering adding a --docker-timeout command-line option and/or a docker-timeout configuration option. In the mean time, can you increase it manually on your side?

Also, out of curiosity, how big are the images you are trying to scan?

@testworksau
Copy link
Author

The timeout is hard-coded right now, which is not ideal. We are considering adding a --docker-timeout command-line option and/or a docker-timeout configuration option.

That sounds like a good idea and would work for us.

In the mean time, can you increase it manually on your side?

We use the pip package, and call the ggshield command installed as part of that package, so that would involve us overwriting the source in the packages folder if I'm not mistaken?

Also, out of curiosity, how big are the images you are trying to scan?

This one is around 15GB. We'd prefer smaller images, but that's not possible when creating images for Microsoft build tooling.

@agateau-gg
Copy link
Collaborator

Yes, patching the source would work but it's going to be clumsy. I just pushed a new branch called docker-timeout-option which adds a --docker-timeout option to the docker scan docker command. The timeout is in seconds. It also includes the fix for the previous bug you hit (#160).

Let me know if it works for you.

@testworksau
Copy link
Author

testworksau commented Jan 20, 2022

Thanks @agateau-gg. I've tried out the branch, and it allows the build to almost run to completion (so the timeout works).

The build now fails with the following stack trace:

ggshield scan docker artifactory.our.domain.name/ourorg/ggtest:1.0 --docker-timeout=1200
Saving docker image... Traceback (most recent call last):
  File "C:\agent\builds\build-windows-i-0f4d3b8d0deea6f8c-1\ourorg\docker-image-builder\ggshield\lib\site-packages\ggshield\docker.py", line 56, in docker_save_to_tmp
    subprocess.run(
  File "C:\Python310\lib\subprocess.py", line 524, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['docker', 'save', 'artifactory.our.domain.name/ourorg/ggtest:1.0', '-o', 'C:\\Users\\agent\\AppData\\Local\\Temp\\tmpzbp5krf4ggshield\\artifactory.our.domain.name--ourorg--ggtest:1.0.tar']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\agent\builds\build-windows-i-0f4d3b8d0deea6f8c-1\ourorg\docker-image-builder\ggshield\Scripts\ggshield.exe\__main__.py", line 7, in <module>
  File "C:\agent\builds\build-windows-i-0f4d3b8d0deea6f8c-1\ourorg\docker-image-builder\ggshield\lib\site-packages\ggshield\cmd.py", line 229, in cli_wrapper
    return_code: int = cli.main(standalone_mode=False)
  File "C:\agent\builds\build-windows-i-0f4d3b8d0deea6f8c-1\ourorg\docker-image-builder\ggshield\lib\site-packages\click\core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "C:\agent\builds\build-windows-i-0f4d3b8d0deea6f8c-1\ourorg\docker-image-builder\ggshield\lib\site-packages\click\core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\agent\builds\build-windows-i-0f4d3b8d0deea6f8c-1\ourorg\docker-image-builder\ggshield\lib\site-packages\click\core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\agent\builds\build-windows-i-0f4d3b8d0deea6f8c-1\ourorg\docker-image-builder\ggshield\lib\site-packages\click\core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\agent\builds\build-windows-i-0f4d3b8d0deea6f8c-1\ourorg\docker-image-builder\ggshield\lib\site-packages\click\core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "C:\agent\builds\build-windows-i-0f4d3b8d0deea6f8c-1\ourorg\docker-image-builder\ggshield\lib\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "C:\agent\builds\build-windows-i-0f4d3b8d0deea6f8c-1\ourorg\docker-image-builder\ggshield\lib\site-packages\ggshield\docker.py", line 145, in docker_name_cmd
    return handle_exception(error, config.verbose)
  File "C:\agent\builds\build-windows-i-0f4d3b8d0deea6f8c-1\ourorg\docker-image-builder\ggshield\lib\site-packages\ggshield\utils.py", line 271, in handle_exception
    raise e
  File "C:\agent\builds\build-windows-i-0f4d3b8d0deea6f8c-1\ourorg\docker-image-builder\ggshield\lib\site-packages\ggshield\docker.py", line 130, in docker_name_cmd
    archive = str(docker_save_to_tmp(name, temporary_dir, docker_timeout))
  File "C:\agent\builds\build-windows-i-0f4d3b8d0deea6f8c-1\ourorg\docker-image-builder\ggshield\lib\site-packages\ggshield\docker.py", line 70, in docker_save_to_tmp
    raise click.ClickException("Unable to save docker archive")
click.exceptions.ClickException: Unable to save docker archive

Narrowing in on the save command itself:

docker save artifactory.our.domain.name/ourorg/ggtest:1.0 -o 
  C:\\Users\\agent\\AppData\\Local\\Temp\\tmpzbp5krf4ggshield\\artifactory.our.domain.name--ourorg--ggtest:1.0.tar

and then running the same command manually, outside of GitGuardian, returned a bit more detail:

rename C:\Users\agent\AppData\Local\Temp\.docker_temp_123887387 
C:\Users\agent\AppData\Local\Temp\artifactory.our.domain.name--ourorg--ggtest:1.0.tar: The parameter is incorrect.

A quick manual test shows that the colon (:) in the output filename is causing the problem.

❯ Move-Item .\ggtest.tar C:\Temp\ggtest:1.0.tar
Move-Item: The parameter is incorrect.

You can follow the docker save path through these links:

https://github.com/docker/cli/blob/a32cd16160f1b41c1c4ae7bee4dac929d1484e59/cli/command/image/save.go#L60
https://github.com/docker/cli/blob/a32cd16160f1b41c1c4ae7bee4dac929d1484e59/cli/command/utils.go#L38
https://github.com/docker/cli/blob/a32cd16160f1b41c1c4ae7bee4dac929d1484e59/vendor/golang.org/x/sys/windows/syscall_windows.go#L608
https://github.com/docker/cli/blob/a32cd16160f1b41c1c4ae7bee4dac929d1484e59/vendor/golang.org/x/sys/windows/zsyscall_windows.go#L2352

Ultimately, it appears that the MoveFileEx Win32 API - and likely the Windows filesystem - does not support colons in filenames.

Other notes:

  1. It would be helpful if the exception contained the err_string and / or other relevant information:
    https://github.com/GitGuardian/ggshield/blob/docker-timeout-option/ggshield/docker.py#L63-L70

  2. It would also be of benefit for us if we could pass in a parameter to set the filepath of the output file; we are running build agents in EC2 which have NVMe storage (usually mounted to Z:\ on Windows) which would be much, much faster than using the EBS drives temp folder (C:\). Of course, we'd also see performance gains on Linux builds.

@agateau-gg agateau-gg added status:confirmed This issue has been reviewed and confirmed and removed status:new This issue needs to be reviewed labels Jan 20, 2022
agateau-gg added a commit that referenced this issue Jan 20, 2022
This makes it possible to scan larger images.
agateau-gg added a commit that referenced this issue Jan 20, 2022
Saving fails if the image name contains a ':' character because we then
try to create a tarball name with a ':' in it, which is not an allowed
filename character on Windows.

The tarball is created in a temporary directory, so it does not need to
look like the image name. Let's use a fixed name instead: "archive.tar".

This commit also uses Path instead of str as a type for the path arguments
of functions:
- docker_scan_archive
- docker_archive_cmd
- docker_save_to_tmp
- get_files_from_docker_archive
agateau-gg added a commit that referenced this issue Jan 20, 2022
Saving fails if the image name contains a ':' character because we then
try to create a tarball name with a ':' in it, which is not an allowed
filename character on Windows.

The tarball is created in a temporary directory, so it does not need to
look like the image name. Let's use a fixed name instead: "archive.tar".

This commit also uses Path instead of str as a type for the path arguments
of functions:
- docker_scan_archive
- docker_archive_cmd
- docker_save_to_tmp
- get_files_from_docker_archive
@agateau-gg
Copy link
Collaborator

Thanks for the detailed report! I added a fix for this : issue to the docker-timeout-option branch.

Regarding where the image is saved: the Python function we use looks at the TMP, TEMP and TMPDIR environment variables to decide where to save the image archive. If you can override one of these before calling ggshield you should be able to use your faster disks.

@testworksau
Copy link
Author

No worries @agateau-gg - thanks for responding and fixing the issues so promptly.

I can confirm that the update in the docker-timeout-option branch has resolved the last of our problems; we will play around with the temp directory changes next week.

Thanks again 👏🏼

agateau-gg added a commit that referenced this issue Jan 21, 2022
This makes it possible to scan larger images.
agateau-gg added a commit that referenced this issue Jan 21, 2022
Saving fails if the image name contains a ':' character because we then
try to create a tarball name with a ':' in it, which is not an allowed
filename character on Windows.

The tarball is created in a temporary directory, so it does not need to
look like the image name. Let's use a fixed name instead: "archive.tar".

This commit also uses Path instead of str as a type for the path arguments
of functions:
- docker_scan_archive
- docker_archive_cmd
- docker_save_to_tmp
- get_files_from_docker_archive
@agateau-gg
Copy link
Collaborator

Great to hear the latest changes helped! This work is now in the main branch, so I am going to close this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status:confirmed This issue has been reviewed and confirmed type:bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants