check/warn about filling up _TMPDIR #485

Closed
mitzimorris opened this issue Oct 27, 2021 · 5 comments · Fixed by #489
Labels
documentation question Further information is requested

Comments

@mitzimorris
Member

Summary:

Problems reported via Discourse: https://discourse.mc-stan.org/t/analyzing-the-posterior-prediction-samples/24956/20?u=mitzimorris

Description:

The user fit a model to a large dataset and then ran posterior predictive checks in the generated quantities block, using all default sampler settings and an input dataset of 10M items.

The CmdStan run succeeded, but CmdStanPy was unable to check the outputs when the output files were written to _TMPDIR. It fails with the error:

  File "C:\Users\JORDAN.HOWELL.GITDIR\AppData\Local\Continuum\anaconda3\envs\cmdstan\lib\site-packages\cmdstanpy\utils.py", line 847, in scan_sampling_iters
    raise ValueError(

ValueError: line 51: bad draw, expecting 10799537 items, found 678606

When output_dir is specified, the call to the sampler via CmdStanPy succeeds.

The result of running the sampler with defaults is 4 Stan CSV files of 1001 rows and 10M columns.
Is this enough data to fill up _TMPDIR?
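
A back-of-the-envelope estimate can help frame this question. The sketch below multiplies rows by columns by an assumed average number of bytes per value. The function name and the 8-bytes-per-value figure are assumptions made for illustration, not anything CmdStan guarantees; the column count is the one named in the error message above.

```python
def estimate_csv_bytes(rows: int, columns: int, bytes_per_value: int = 8) -> int:
    """Rough size of one CSV file: every value plus its separator.

    bytes_per_value is an assumed average (digits plus a comma),
    not a figure taken from CmdStan.
    """
    return rows * columns * bytes_per_value


# 1001 rows, and the 10799537 columns named in the error message.
per_file = estimate_csv_bytes(rows=1001, columns=10_799_537)
total = 4 * per_file  # one file per chain with the default settings

print(f"~{per_file / 1e9:.1f} GB per file, ~{total / 1e9:.1f} GB for 4 chains")
```

The actual size depends on how many digits each value is written with, so treat the result as an order-of-magnitude guide only.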

@mitzimorris mitzimorris added bug Something isn't working question Further information is requested labels Oct 27, 2021
@WardBrian
Member

Python doesn't document a specific size as a maximum for temporary directories, as I believe it ends up being system-dependent. No clue what Windows routine it even calls for this (on *nix systems all the tempfile methods are named like their underlying utility, so tempfile.mkdtemp() calls mkdtemp).
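
Since the limit is system-dependent, one way to investigate on a given machine is to ask Python where its temporary directory lives and how much space is free on that volume. This is a minimal sketch using only the standard library; the helper name `temp_dir_free_bytes` is hypothetical.

```python
import shutil
import tempfile
from typing import Tuple


def temp_dir_free_bytes() -> Tuple[str, int]:
    """Return the default temp directory and the free bytes on its volume."""
    tmp = tempfile.gettempdir()       # directory Python uses for temp files
    usage = shutil.disk_usage(tmp)    # named tuple: total, used, free
    return tmp, usage.free


tmp, free = temp_dir_free_bytes()
print(f"temp dir: {tmp}, free: {free / 1e9:.1f} GB")
```

If the free space looks small relative to the expected output, pointing output_dir at a larger volume is the workaround reported in this issue.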

@mitzimorris mitzimorris added documentation and removed bug Something isn't working labels Oct 27, 2021
@mitzimorris
Member Author

Python doesn't document a specific size as a maximum for temporary directories, as I believe it ends up being system-dependent.

that's what I would have guessed. beyond our control.

we need a "troubleshooting" section somewhere for problems like this.

@WardBrian
Member

If the user is on a FAT32 file system, the max file size will be 4 GB. Probably not that, because then it wouldn't work in any directory, not just tmp.

There's also this:

The GetTempFileName method will raise an IOException if it is used to create more than 65535 files without deleting previous temporary files.

But that seems like it would happen when we create the folder, right?

@mitzimorris
Member Author

user reports that output files took up 400MB disk space.
some systems might have small /tmp partitions.

the call to CmdStan didn't throw any errors, although some systems might raise OSError (https://docs.python.org/3.6/library/exceptions.html#OSError) when the disk is full.

@WardBrian
Member

I think that Windows always puts %TEMP% on the C drive. So if it’s full, you’ve got problems even if you have terabytes free on other drives. Sounds like that was this specific user’s issue, and I’m afraid there’s not much we can do.

We could add this to the error message that gets reported when a CSV row has an unexpected number of items?
Add something like “This can be caused by running out of disk space during sampling” to the message?
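
A minimal sketch of what that suggestion could look like is below. It is not the actual CmdStanPy code: the function `check_draw` and its signature are hypothetical, and only the error wording is modeled on the message quoted earlier in this issue.

```python
def check_draw(line_no: int, line: str, expected: int) -> None:
    """Raise ValueError if a CSV row does not hold the expected item count.

    Hypothetical helper for illustration; it appends the disk-space hint
    proposed in this thread to the existing style of error message.
    """
    found = len(line.split(","))
    if found != expected:
        raise ValueError(
            f"line {line_no}: bad draw, expecting {expected} items, "
            f"found {found}\n"
            "This can be caused by running out of disk space during sampling."
        )


try:
    check_draw(51, "0.1,0.2", expected=3)  # a truncated row
except ValueError as err:
    print(err)
```

The hint costs nothing when files are intact and points users toward the most likely cause when a row is cut short.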

WardBrian added a commit that referenced this issue Nov 1, 2021
@WardBrian WardBrian added this to To Do in CmdStanPy release 1.0 via automation Nov 2, 2021
@WardBrian WardBrian moved this from To Do to In Progress in CmdStanPy release 1.0 Nov 2, 2021
@WardBrian WardBrian moved this from In Progress to To Review in CmdStanPy release 1.0 Nov 2, 2021
WardBrian added a commit that referenced this issue Nov 2, 2021
CmdStanPy release 1.0 automation moved this from To Review to Done Nov 2, 2021