Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix tmpdir being set from base command #200

Merged
merged 5 commits into from
May 7, 2024
Merged

fix tmpdir being set from base command #200

merged 5 commits into from
May 7, 2024

Conversation

AroneyS
Copy link
Collaborator

@AroneyS AroneyS commented Apr 18, 2024

Prior to this pull-request, TMPDIR was set using tempfile.gettempdir() at the time and using the current env of the Aviary command being run. However, when jobs are submitted to a HPC cluster, often they have a bespoke TMPDIR set for each job (e.g. to clean up tmp files after job is finished).

So, essentially, the PR is to allow jobs to fall back to their TMPDIR env variable if one was not forced by using --tmpdir.

  • Integration tests

@AroneyS
Copy link
Collaborator Author

AroneyS commented Apr 18, 2024

Integration test test_short_read_recovery works for me without any weird files popping up.
(actually fails at the end due to the odd version of smafa that I have...)

But test_long_read_recovery fails at the spades_assembly rule due to /bin/bash: line 11: TMPDIR: unbound variable
@rhysnewell can we move spades_assembly into a script?

@rhysnewell
Copy link
Owner

Yeah, that would probably be better than the current shell command. However, spades_assembly_coverage also uses a shell command and looks like it is manually setting the TMPDIR variable. I imagine that this might cause some issues as well, so I think all of the shell commands should be considered for porting to python scripts if they interact with TMPDIR in any way

rhysnewell
rhysnewell previously approved these changes Apr 18, 2024
@rhysnewell rhysnewell self-requested a review April 28, 2024 23:03
@rhysnewell rhysnewell dismissed their stale review April 28, 2024 23:04

Needs re-evaluation after other introduced changes

@rhysnewell
Copy link
Owner

So @YibiChen noted some oddities with how tmpdir was being set, just want to verify that this was intended behaviour before merging into dev

@AroneyS
Copy link
Collaborator Author

AroneyS commented Apr 28, 2024

@YibiChen #201 (comment) for which rule?

@YibiChen
Copy link

YibiChen commented Apr 28, 2024

Actually all rules submitted jobs to remote nodes, according to the qsub_logs from pbs. In the test case test_short_read_recovery_queue_submission, I noticed all the jobs running on the remote jobs were still using /tmp, not the bespoke TMPDIR on each node. I wonder if this is something to be fixed here in aviary.

@YibiChen
Copy link

Ok, it seems to be a known issue with snakemake. See snakemake/snakemake#522. One way to work around is to use shadow_prefix

@AroneyS
Copy link
Collaborator Author

AroneyS commented Apr 29, 2024

Or something like this: snakemake/snakemake#522 (comment)?

@AroneyS
Copy link
Collaborator Author

AroneyS commented Apr 29, 2024

Or get it to set resources.tmpdir within the job. Maybe as addition to mqsub-snakemake?

@AroneyS
Copy link
Collaborator Author

AroneyS commented Apr 29, 2024

Similar issue: snakemake/snakemake#2423

@YibiChen
Copy link

Adding if workflow.run_local should work for us. This is only removed after snakemake 8.0 and we are using 7

@rhysnewell
Copy link
Owner

Adding if workflow.run_local should work for us. This is only removed after snakemake 8.0 and we are using 7

I'd prefer not to use something that has been deprecated in future versions of snakemake if possible

@AroneyS
Copy link
Collaborator Author

AroneyS commented Apr 29, 2024

It might be due to default-resources. In another snakemake, the command that is run within the job is a snakemake command with --default-resources 'tmpdir=system_tmpdir', even though I haven't set that anywhere.

@AroneyS
Copy link
Collaborator Author

AroneyS commented Apr 29, 2024

@YibiChen can you try setting tmpdir in default resources to none?
e.g. https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#default-resources

@YibiChen
Copy link

@AroneyS I have tried to set tmpdir in default resources to none, and got an error directly. I think there some builded-in check to make sure tmpdir is given. It makes sense because the tmpdir in default resources is critical and would be used to overwrite the tmdir in the environment once the job is submmited to clusters. That's why our python scripts did not work. The tmpdir value given by PBS is never accessible as it was overwritten by snakemake immediately. It is designed to do so according to https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#standard-resources .

I believe the snakemake team is aware of this problem and fix may come out. For now, we may just exam whether there is a PBS_JOBID env variable and test whether /data1/pbs.$PBS_JOBID.pbs is writable. If true then just use it as tmpdir.

@AroneyS
Copy link
Collaborator Author

AroneyS commented Apr 30, 2024

What do you think "In cluster or cloud setups, its evaluation is delayed until the actual execution of the job." means from your link? What is it evaluating? Can we set --default-resources tmpdir="$TMPDIR" to load the TMPDIR var from the job env?

@AroneyS
Copy link
Collaborator Author

AroneyS commented Apr 30, 2024

@YibiChen actually maybe you can try the same setup with updated snakemake? Though there are heaps of breaking changes...

Edit: snakemake/snakemake#1860 hmm...

@YibiChen
Copy link

This is what I understand. The evaluation is referring to "setting the $TMPDIR variable for shell commands" (i.e. parsing resources.tmpdir to $TMPDIR in the env). It is delayed so it can overwrite the $TMPDIR given by PBS. --default-resources tmpdir="$TMPDIR" is just the same as the default (i.e. 'tmpdir=system_tmpdir'). Because this part is executed in the main snakemake cmd. We would still get "/tmp". I will try the undated snakemake and see how it goes.

@YibiChen
Copy link

YibiChen commented May 1, 2024

Problem "sloved". It turns out the python scripts worked but I had "export TMPDIR=/tmp" in my bashrc... and I totally forgot about it.

I have tested it again after removed that line. It works good now.

@rhysnewell
Copy link
Owner

Problem "sloved". It turns out the python scripts worked but I had "export TMPDIR=/tmp" in my bashrc... and I totally forgot about it.

I have tested it again after removed that line. It works good now.

that's pretty funny, but did you put this in your .bashrc or has the aviary configuration gone awry? It can write export variables to your .bashrc if you don't install it within an environment and run aviary configure, so just want to make sure it hasn't gone crazy

@YibiChen
Copy link

YibiChen commented May 7, 2024

Problem "sloved". It turns out the python scripts worked but I had "export TMPDIR=/tmp" in my bashrc... and I totally forgot about it.
I have tested it again after removed that line. It works good now.

that's pretty funny, but did you put this in your .bashrc or has the aviary configuration gone awry? It can write export variables to your .bashrc if you don't install it within an environment and run aviary configure, so just want to make sure it hasn't gone crazy

No worries. It's nothing to do with aviary configure. It was just me been silly. I got complains from the HPC team about not using the correct tmpdir so I just manually added that that my bashrc and totally forgot about it.

@AroneyS AroneyS merged commit 691941f into dev May 7, 2024
2 checks passed
@AroneyS AroneyS deleted the fix-tmpdir branch May 7, 2024 02:43
@AroneyS
Copy link
Collaborator Author

AroneyS commented May 7, 2024

@rhysnewell new release?

@rhysnewell
Copy link
Owner

yeah, i'll spin it up

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants