run_structure_prediction.py accepts comma separated list of input folds and optionally dedicated output_directories for each fold #357

maurerv · 2024-06-06T12:27:50Z

No description provided.

dingquanyu · 2024-06-06T12:43:46Z

I guess that in the case of padding you may also need to update the --output_directory key in the argument dictionary so that its value is a list, extending it to all the sub-folders that should be created in this if block here? For example, iterate through all_folds and append each path.join(FLAGS.output_path, <name of the protein complex>) to a list.

AlphaPulldown/alphapulldown/scripts/run_multimer_jobs.py

Line 125 in 732baec

command_args["--input"] = ",".join(all_folds)
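A minimal sketch of that suggestion, assuming each fold is identified by a plain string and that the per-fold sub-folder is simply named after it. `build_output_dirs`, the sample fold names, and the output path are hypothetical and not part of AlphaPulldown; only the `--input` and `--output_directory` keys come from the discussion above.

```python
import os


def build_output_dirs(output_path, all_folds):
    """Return one output sub-folder per fold, in the same order as all_folds."""
    return [os.path.join(output_path, fold) for fold in all_folds]


# Hypothetical fold names; the real ones come from the parsed fold specification.
all_folds = ["proteinA_and_proteinB", "proteinA_and_proteinC"]

command_args = {}
command_args["--input"] = ",".join(all_folds)
# Assumption: --output_directory takes a comma-separated list, mirroring --input.
command_args["--output_directory"] = ",".join(
    build_output_dirs("predictions", all_folds)
)
```

Keeping the two lists in the same order is what lets the receiving script pair each fold with its own directory.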

DimaMolod · 2024-06-06T12:42:39Z

alphapulldown/scripts/run_structure_prediction.py

-        object_to_model, flags_dict, postprocess_flags, output_dir = pre_modelling_setup(interactors, FLAGS)
+
+    if len(FLAGS.input) != len(FLAGS.output_directory):
+        FLAGS.output_directory *= len(FLAGS.input)


This covers the case where we have one output directory and many inputs, right?
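For illustration, here is how that list multiplication behaves when written as a standalone function instead of operating on `FLAGS`. `broadcast_output_dirs` is a hypothetical helper, and the explicit error for other length mismatches is an added assumption, not something the diff does.

```python
def broadcast_output_dirs(inputs, output_dirs):
    """Pair each input with an output directory.

    A single output directory is repeated for every input; any other
    length mismatch is rejected instead of being silently multiplied.
    """
    if len(output_dirs) == len(inputs):
        return list(output_dirs)
    if len(output_dirs) == 1:
        # Same effect as `output_dirs *= len(inputs)` in the diff.
        return list(output_dirs) * len(inputs)
    raise ValueError(
        f"Got {len(output_dirs)} output directories for {len(inputs)} inputs"
    )


print(broadcast_output_dirs(["fold1", "fold2", "fold3"], ["out"]))
# ['out', 'out', 'out']
```

With two output directories and three inputs, the plain `*=` from the diff would yield six entries, which is why the sketch raises in that case.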

maurerv · 2024-06-06T13:39:12Z

@dingquanyu, from your PR at KosinskiLab/AlphaPulldownSnakemake#13 it seemed like you wanted run_multimer_jobs.py to use a single output directory and create subdirectories for each fold according to use_ap_style.

We could extend run_multimer_jobs.py to allow multiple output_paths. However, run_multimer_jobs.py uses the file-based fold specification, where the user might not know the number of folds beforehand, so I think a single output directory makes the most sense.

dingquanyu · 2024-06-06T13:48:48Z

I see. Does this mean that in the Snakemake pipeline you will bypass run_multimer_jobs.py and launch run_structure_prediction.py directly with a cluster of jobs?

maurerv · 2024-06-06T13:52:01Z

Exactly. I added a checkpoint that performs the clustering and then extended the current rule using run_structure_prediction.py to run on each cluster separately. This way we don't need additional rules.

I just pushed these changes for reference: bfa71c7ac5d013a0c1aea3b78fc347381a3ca06c
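As a rough sketch of the "one invocation per cluster" idea, the snippet below builds an argument list for each cluster of folds. It is not the Snakemake rule from the commit: `commands_per_cluster`, the cluster assignment, and the output layout are hypothetical, and any other flags the script requires are omitted.

```python
import os


def commands_per_cluster(clusters, output_root):
    """Build one run_structure_prediction.py argument list per cluster."""
    commands = {}
    for cluster_id, folds in clusters.items():
        # One dedicated output directory per fold, in the same order as the folds.
        output_dirs = [os.path.join(output_root, fold) for fold in folds]
        commands[cluster_id] = [
            "run_structure_prediction.py",
            "--input=" + ",".join(folds),
            "--output_directory=" + ",".join(output_dirs),
        ]
    return commands


# Hypothetical clustering result; the checkpoint in the pipeline decides the real one.
clusters = {0: ["A+B", "A+C"], 1: ["D+E"]}
for cluster_id, argv in commands_per_cluster(clusters, "predictions").items():
    print(cluster_id, " ".join(argv))
```

Each cluster becomes a single call with a comma-separated `--input`, so no additional rule per fold is needed.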

dingquanyu · 2024-06-06T14:04:53Z

I see. Thanks for the commit. Now it makes sense to me.

run_structure_prediction.py accepts comma separated list of input folds and optionally dedicated output_directories for each fold

732baec

maurerv requested review from dingquanyu and DimaMolod June 6, 2024 12:32

DimaMolod approved these changes Jun 6, 2024

dingquanyu approved these changes Jun 6, 2024

Update run_structure_prediction.py

db19afb

maurerv merged commit 29682ab into KosinskiLab:main Jun 6, 2024
4 checks passed

maurerv deleted the prediction_cli branch June 6, 2024 14:09
