CTF estimation creates symlinks with absolute paths #1113

Open
DimitriosBellos opened this issue Apr 22, 2024 · 10 comments

@DimitriosBellos

DimitriosBellos commented Apr 22, 2024

Dear Relion developers,

Hi, my name is Dimitrios Bellos and I am a member of the AI & I team at the Rosalind Franklin Institute. Our team helps support the Franklin's RELION users when they run into issues.

Recently we spotted that there are issues arising from the fact that the CTF estimation step creates symlinks to the Motion corrected data using absolute paths.

Example in CtfFind/job003/:
'Position_99_035[-61_00]_EER_PS.mrc' -> '/<absolute-path>/MotionCorr/job002/<data-directory>/Position_99_035[-61_00]_EER_PS.mrc'

This can cause issues if the whole Relion Project directory is moved. This is common because a whole Relion Project directory may be moved from our compute infrastructure to Baskerville HPC and vice versa. Is it possible to make changes so that relative symlinks are created?
For example:
'Position_99_035[-61_00]_EER_PS.mrc' -> '../../../MotionCorr/job002/<data-directory>/Position_99_035[-61_00]_EER_PS.mrc'
You can even generate the relative path using the realpath command (see https://stackoverflow.com/questions/2564634/convert-absolute-path-into-relative-path-given-a-current-directory-using-bash).
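
As a workaround outside RELION, existing absolute links could be rewritten as relative ones before the project is moved. The following is a minimal sketch, not RELION code: the function name relativize_symlinks is hypothetical, and it assumes GNU coreutils (realpath with --relative-to) and GNU find are available on the system.

```shell
#!/bin/bash
# Hypothetical helper (not part of RELION): rewrite every absolute symlink
# under a job directory as a relative one. Assumes GNU coreutils and GNU find.
relativize_symlinks() {
    local job_dir="$1"
    local link target link_dir rel
    # -print0 / read -d '' keep file names with brackets or spaces intact.
    while IFS= read -r -d '' link; do
        target=$(readlink "$link")
        # Links that are already relative are left untouched.
        [[ "$target" = /* ]] || continue
        link_dir=$(dirname "$link")
        # -e requires the target to exist, so broken links are reported and skipped.
        # Note: realpath resolves symlinks, so a target that is itself a link
        # is replaced by its final destination.
        rel=$(realpath -e --relative-to="$link_dir" "$target") || {
            echo "Skipping broken link: $link" >&2
            continue
        }
        # -f replaces the existing link; -n avoids descending into a linked directory.
        ln -sfn "$rel" "$link"
    done < <(find "$job_dir" -type l -print0)
}

# Example call from the project directory (path taken from the script in this thread):
# relativize_symlinks CtfFind/job003
```

This has to run while the targets still resolve, that is, before the project directory is moved.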

Alternatively, could this be done without using symlinks at all?

Kind regards,
Dimitrios Bellos

@biochem-fan
Member

This can cause issues if the whole Relion Project directory is moved.

I doubt this. These links are created by a CTFFIND job and used only by the job itself. Thus, unless you move the project directory before the job completes, it should be fine. Am I missing other failure modes?

@biochem-fan
Member

[-61_00]

Oh, this is STA, not SPA. I know nothing about the STA workflow. STA related issues need to be dealt with by others.

@DimitriosBellos
Author

To help, this is the script we run on the HPC:

#!/bin/bash

#SBATCH --qos=rfi
#SBATCH --account=<account-name>
#SBATCH --time=0-01:00:00
#SBATCH --ntasks=3
#SBATCH --cpus-per-task=4
#SBATCH --gpus-per-task=1

module purge
module load baskerville
module load RELION

# Import (relion)
mkdir -p Import/job001
time relion_import  --do_movies  --optics_group_name "opticsGroup1" --angpix 1.85 --kV 300 --Cs 2.7 --Q0 0.1 --beamtilt_x 0 --beamtilt_y 0 --i "data/HeLa_argon/Position_*.eer" --odir Import/job001/ --ofile movies.star --pipeline_control Import/job001/

# Motion correction (relion)
time srun `which relion_run_motioncorr_mpi` --i Import/job001/movies.star --o MotionCorr/job002/ --first_frame_sum 1 --last_frame_sum -1 --use_own  --j 4 --float16 --bin_factor 1 --bfactor 150 --dose_per_frame 0.14 --preexposure 0 --patch_x 5 --patch_y 5 --eer_grouping 32 --gain_rot 0 --gain_flip 0 --dose_weighting  --grouping_for_ps 29   --pipeline_control MotionCorr/job002/

# CTF correction (relion)
time srun `which relion_run_ctffind_mpi` --i MotionCorr/job002/corrected_micrographs.star --o CtfFind/job003/ --Box 512 --ResMin 30 --ResMax 5 --dFMin 5000 --dFMax 50000 --FStep 500 --dAst 100 --ctffind_exe ctffind --ctfWin -1 --is_ctffind4  --fast_search  --use_given_ps   --pipeline_control CtfFind/job003/

@DimitriosBellos
Author

DimitriosBellos commented Apr 23, 2024

The problem appears after the CTF estimation process completes. A directory is created under <RELION-project-directory-name>/CtfFind/job003/ that mirrors the structure of the data directory at the RELION project level (<RELION-project-directory-name>/data/HeLa_argon/). It looks like this: <RELION-project-directory-name>/CtfFind/job003/data/HeLa_argon/, and many symlinks are created in it.
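
To see which links in a job directory point to absolute targets, something like the following could be used. It is only a sketch: list_absolute_symlinks is a hypothetical name, and the -lname and -printf tests assume GNU find.

```shell
#!/bin/bash
# Hypothetical helper (not part of RELION): print "link -> target" for every
# symlink under a directory whose stored target starts with "/".
list_absolute_symlinks() {
    local job_dir="$1"
    # -lname matches the link's stored target text against a shell pattern;
    # %p is the link path and %l is the target it points to.
    find "$job_dir" -type l -lname '/*' -printf '%p -> %l\n'
}

# Example call from the project directory (path taken from the script in this thread):
# list_absolute_symlinks CtfFind/job003
```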

If these symlinks are no longer needed after the completion of the CTF estimation process, can you please add a step to delete them after the CTF estimation is completed?

If they are needed even after the CTF estimation is completed, can you change the code so they are created using relative paths rather than absolute paths? This way the symlinks will not break even if the whole RELION project directory is moved elsewhere.

It is a minor issue, but if the symlinks need to remain there after the CTF estimation completes, then having them in a form that cannot break when the entire project directory is moved would be very useful.

@biochem-fan
Member

biochem-fan commented Apr 23, 2024

The symlinks are not used after the job completion as far as SPA is concerned. I think (not confirmed) they get deleted when a user runs "Gentle Clean" on the job from the GUI.

can you please add a step to delete them after the CTF estimation is completed?

This is a valid suggestion but because it is harmless (and nobody complained for at least five years), my priority is low. A pull request is welcomed.

@DimitriosBellos
Author

DimitriosBellos commented Apr 24, 2024

No problem, we can delete the symlinks ourselves, if deleting them is normally supposed to be handled by the GUI.

We are currently writing production scripts so that a Slurm script submitted to an HPC cluster runs a sequence of processes one after the other automatically. For this reason, we are running RELION solely from the command line.

It might be a good idea to add to the documentation, for those who run RELION only from the command line, that it is OK to delete any symlinks created by CTFFind after it finishes.
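
A minimal sketch of such a cleanup step for a command-line pipeline follows. The function name remove_job_symlinks is hypothetical and not part of RELION, the -delete action assumes GNU find, and the sketch relies on the statements in this thread that the links are not used once the job has completed.

```shell
#!/bin/bash
# Hypothetical cleanup helper (not part of RELION): delete the symlinks left
# inside a finished job directory. Regular files and link targets are kept.
remove_job_symlinks() {
    local job_dir="$1"
    local count
    [ -d "$job_dir" ] || { echo "No such directory: $job_dir" >&2; return 1; }
    # find does not follow symlinks by default, so -delete removes only the
    # links themselves, never the files they point to.
    count=$(find "$job_dir" -type l -print -delete | wc -l)
    echo "Removed $count symlink(s) from $job_dir"
}

# Example call after the CTF estimation step of the Slurm script in this thread:
# remove_job_symlinks CtfFind/job003
```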

We are happy for you to close this issue.

@biochem-fan
Member

biochem-fan commented Apr 24, 2024

FYI:

  • Gentle clean can be invoked from the command line: relion_pipeliner --gentle_clean
  • Did you check relion_it.py based on RELION Schemes? By using this (or relion_pipeliner), the job history is created properly. Thus, a user can open the GUI on the output folder of your automatic processing pipeline, inspect what has been performed, and continue data processing.

Unfortunately I cannot help with the latter because I don't use the feature myself.

@scheres
Contributor

scheres commented Apr 30, 2024

Just want to confirm that STA behaves in the exact same way as SPA here. The PS.mrc files get generated during motioncorr and are only temporarily symlinked. Yes, deleting them would be cleaner, but this should not cause any issues.

@xinsheng44

xinsheng44 commented May 4, 2024

Hello, I have the same problem: when I run the CTF step from a shell script, symlinks with full (absolute) paths are created in the CtfFind directory. I do not know how to solve this.

Also, when I run CTF estimation I get the error “ERROR: Failed to make a symlink from A to B”. The symlink already exists under CtfFind, yet the error is still displayed. How can this be solved?
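
One thing that might be worth checking in this situation is whether stale links from an earlier run are still present and now point at files that no longer exist. This is only a diagnostic sketch, not a confirmed cause or fix for the error: list_broken_symlinks is a hypothetical name, and the -xtype test assumes GNU find.

```shell
#!/bin/bash
# Hypothetical diagnostic (not part of RELION): list symlinks under a
# directory whose targets cannot be resolved.
list_broken_symlinks() {
    local job_dir="$1"
    # With find's default behaviour, "-xtype l" is true only for symlinks
    # whose target is missing, so valid links are not printed.
    find "$job_dir" -xtype l
}

# Example call from the project directory (path taken from the script in this thread):
# list_broken_symlinks CtfFind/job003
```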

@xinsheng44

The symlinks are not used after the job completion as far as SPA is concerned. I think (not confirmed) they get deleted when a user runs "Gentle Clean" on the job from the GUI.

can you please add a step to delete them after the CTF estimation is completed?

This is a valid suggestion but because it is harmless (and nobody complained for at least five years), my priority is low. A pull request is welcomed.

I tested with SPA and STA, and both had the same problem.
