
Disk space not freed after syft command #416

Closed
rmkanda opened this issue May 20, 2021 · 7 comments · Fixed by #448
Labels
bug (Something isn't working), needs-reproduction (missing steps to reproduce or steps have not been confirmed)

Comments


rmkanda commented May 20, 2021

What happened:
Whenever I run the syft command to analyze an image, disk usage goes up, and once syft completes, the used disk space is not freed.

What is the temporary directory? Can the disk space be freed automatically?

Environment:

  • Output of syft version: 0.15.2
  • OS: macOS
@rmkanda added the bug (Something isn't working) label May 20, 2021

luhring commented May 27, 2021

Hi @rmkanda! Thanks for reporting this. Our expectation is that all files we create are cleaned up after the run completes. (The exceptions to this are in cases where the app crashes or when the app is terminated early.) So this is definitely a problem, and we need to look more into it.

What is the temporary directory?

We use your OS's tmp directory (seen by running echo $TMPDIR), and within that directory, we create a few directories as needed for the analysis job. These directory names should include the names of our projects, like syft and stereoscope.
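
To make that concrete, a run in progress might leave entries like these (illustrative names and paths, not captured output; the numeric suffixes are random):

$ echo $TMPDIR
/var/folders/xy/a1b2c3d4/T/
$ ls "$TMPDIR" | grep -iE 'syft|stereoscope'
stereoscope-1234567890
syft-987654321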

I think I'm able to reproduce what you're describing...

How to reproduce

I have a split-screen terminal. On the left, I'm running:

watch -n 0.2 'ls -l $TMPDIR | grep -i stereoscope | wc -l'

On the right, I'm running syft registry:anchore/anchore-engine:latest repeatedly, and then watching the count of stereoscope temp dirs go up and down.

I’m finding that this count returns to the starting point only sometimes.

I'll be looking more into what might be causing this.


DatGameh commented Mar 13, 2024

Hi @luhring, I have a few questions about these temp files:

For context, I intend to use Syft for large-volume scans of about 200-300 images at a time, but due to this bug, large stereoscope directories of ~10GB each are created in the tmp folder and never deleted. This quickly consumes hundreds of GB of storage in a short period.

[screenshot: stereoscope directories accumulating in the tmp folder]

There are two things I would like to ask regarding this:

  1. Can I safely delete all these stereoscope and syft directories in tmp without breaking Syft?
  2. Is it possible to define where Syft saves these temp files?
    • For instance, instead of the OS's 'tmp' directory, save them in a temp directory in my project that gets deleted after the program finishes.

Thank you!


kzantow commented Mar 15, 2024

@DatGameh first of all, I should note that Syft should be cleaning this stuff up, and if it's not, that's a bug we need to fix (I thought it had been fixed when this issue was closed). To answer your specific questions:

  1. After Syft runs, you can definitely delete those directories; they won't be used again (see the cleanup sketch below).
  2. It is possible by using a standard(?) environment variable such as TMPDIR, e.g. TMPDIR=$(pwd)/tmp syft python:latest seems to work just fine.
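
To illustrate point 1, a manual cleanup along these lines should be safe once no scans are running (a sketch; the glob patterns are assumed from the directory names discussed in this thread):

$ rm -rf "${TMPDIR:-/tmp}"/stereoscope-* "${TMPDIR:-/tmp}"/syft-*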

However, in figuring this out, I was able to reproduce some leftovers that should be cleaned up but are not. For example:

$ mkdir tmp
$ TMPDIR=$(pwd)/tmp syft python:latest --from registry
...
$ ls tmp
stereoscope-2691232634

In this case, the directory has some subdirectories, but ultimately they are all empty, so I don't have steps offhand to reproduce large content being left on the filesystem. Regardless, these temp directories should get cleaned up, so I'm reopening this issue for that purpose.

EDIT: after debugging and re-running, I am seeing these temp directories getting properly cleaned up most of the time (really, all of the time now, aside from the first example where it did not properly clean up).

@kzantow reopened this Mar 15, 2024
@kzantow assigned kzantow and unassigned wagoodman Mar 15, 2024

DatGameh commented Mar 15, 2024

@kzantow Thank you for looking into the issue for me!

Here are some extra details that hopefully help:

The OS I am using is either RHEL 8.9 or Ubuntu 22.04

The task I am doing requires me to scan hundreds of images in a short period.
I do this through a Python script that runs multiple Syft scans in parallel.
I run up to 30 scans simultaneously.

I used this as reference to implement limited concurrency: https://death.andgravity.com/limit-concurrency#asyncio-semaphore

Could it be that running multiple scans simultaneously is causing this issue?

Here's an example of my script, using Python 3.11:

import asyncio

CONCURRENT_SCANS = 30
images = [...]  # (A list of about 200-300 image names/links)


async def syft_scanner(image):
    scan_process = await asyncio.create_subprocess_shell(
        f'syft {image} -o spdx-json={image}.spdx.json',
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE)
    # Wait for the scan to finish; without this, returncode would still be None.
    await scan_process.communicate()
    return scan_process.returncode


def scan_task_generator(images):
    for image in images:
        yield asyncio.create_task(
            syft_scanner(image),
            name=f'syft_scan_{image}'
        )


async def main():
    ## START
    scan_gen = scan_task_generator(images)
    scan_gen_end = False
    pending = set()

    successful_scans = 0
    failed_scans = 0

    while pending or not scan_gen_end:
        # Top up the pending set until the concurrency cap is reached.
        while len(pending) < CONCURRENT_SCANS and not scan_gen_end:
            try:
                aw = next(scan_gen)
            except StopIteration:
                scan_gen_end = True
            else:
                pending.add(asyncio.ensure_future(aw))

        if not pending:
            break  # nothing left to scan; fall through to the summary

        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        while done:
            try:
                done_result = (done.pop()).result()
            except Exception:
                failed_scans += 1
                continue
            if done_result == 0:
                successful_scans += 1
            else:
                failed_scans += 1

    print('successful scans: ' + str(successful_scans))
    print('failed scans: ' + str(failed_scans))

    ## END


if __name__ == '__main__':
    asyncio.run(main())
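
For comparison, the semaphore pattern from that article would replace the manual pending-set bookkeeping above; a rough sketch (reusing the syft_scanner coroutine and images list from my script, not the exact code I run):

import asyncio

CONCURRENT_SCANS = 30


async def bounded_scan(semaphore, image):
    # Each scan holds one semaphore slot, capping how many run at once.
    async with semaphore:
        return await syft_scanner(image)


async def main():
    semaphore = asyncio.Semaphore(CONCURRENT_SCANS)
    # return_exceptions=True keeps one failed scan from cancelling the rest.
    results = await asyncio.gather(
        *(bounded_scan(semaphore, image) for image in images),
        return_exceptions=True,
    )
    failed = sum(1 for r in results if isinstance(r, Exception) or r != 0)
    print('successful scans: ' + str(len(results) - failed))
    print('failed scans: ' + str(failed))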

Please let me know if you need any additional info. Thanks!

@wagoodman added the needs-reproduction (missing steps to reproduce or steps have not been confirmed) label Mar 19, 2024

kzantow commented Mar 25, 2024

@DatGameh sorry for the delay getting back to you. I've spent quite a bit of time trying to reproduce this, but I'm just not able to. It's possible that I killed a process and left some files uncleaned when I saw this originally, but I can't come up with a way to reproduce it, even using concurrent scans with a modified version of your script. The only cause I can think of on your end is something killing the process before it terminates normally (including things like running out of memory). Without steps to reproduce the issue, though, it's nearly impossible to debug and fix. Is there anything else you could provide to help with reproduction, maybe a script including a set of specific images that exhibits the problem?

DatGameh commented

@kzantow

Thank you for looking into the problem!
As of now, I've been able to avoid the issue entirely by creating a temporary Syft directory that gets deleted whenever my script exits, whether it ends normally or stops abruptly through a signal, exception, or anything else.

Something like this:
[screenshot: temp-directory cleanup code]
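
A minimal sketch of the idea (my reconstruction, not the exact code in the screenshot): point TMPDIR at a scratch directory and remove it on both normal and signal-driven exits.

import atexit
import os
import shutil
import signal
import sys
import tempfile

# Scratch directory that child syft processes will use via TMPDIR.
scratch = tempfile.mkdtemp(prefix='syft-scratch-')  # hypothetical prefix
os.environ['TMPDIR'] = scratch


def cleanup():
    shutil.rmtree(scratch, ignore_errors=True)


atexit.register(cleanup)  # runs on normal interpreter exit


def handle_signal(signum, frame):
    # Turn SIGINT/SIGTERM into a normal exit so the atexit handler runs.
    sys.exit(1)


signal.signal(signal.SIGINT, handle_signal)
signal.signal(signal.SIGTERM, handle_signal)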

There are times when I need to end the program abruptly, since the scanning process naturally takes a long time to complete. In my attempts to recreate the problem so far, prematurely ending my program (e.g. with SIGINT) reproduces the undeleted temp files rather consistently. I didn't think to mention it initially because I thought temp files would naturally be deleted even if the process terminates prematurely. I apologize for not mentioning it earlier.

I wonder then: Is it expected that ending the process prematurely means the temp files do not get deleted?

So far, I have not been able to reproduce the problem when I allow the program to finish naturally, but I will keep trying to recreate it and will report back if I am able to reproduce it.


kzantow commented Apr 8, 2024

Thanks @DatGameh. Given that no one has found a way to reproduce this (other than killing the process, which is expected not to execute cleanup code), I'm going to close this again. Sorry for the noise here!

If we do find a way to reproduce this, let's open a new issue with the exact steps/scripts/etc. that can be used. Thanks in advance!

@kzantow closed this as not planned Apr 8, 2024
@kzantow removed their assignment Apr 8, 2024