Very long calculation time on symmetric map #11

Open
ctueting opened this issue Oct 10, 2022 · 9 comments


@ctueting

Hi Model Angelo DevTeam,

I tried to test ModelAngelo on one of our maps at < 3 Å resolution, but the calculation takes a very long time.

The Cα tracing was OK at 3:31:18 h, but GNN model refinement round 1/3 took 105 hours. I have now been in GNN round 2/3 for 30 h.

So far, the intermediate results look very nice.

The map was created with O symmetry, and the FASTA file contains the same protein 24 times. I am not sure whether this is the reason for the long calculation time (or whether it is a limitation of our machine, but it uses only a single CPU and a single GPU [2080 Ti]).

Is there an option to speed up the calculation for highly symmetric protein structures?

Best
Christian

@biochem-fan
Member

Does ModelAngelo support non-cubic maps? If so, one can extract an asymmetric unit (ASU) before running the program, significantly reducing the target volume. Because a chain might go into the neighboring ASU, you might want to add some "buffer regions" around an ASU.
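
A minimal sketch of the "ASU plus buffer" cropping idea, assuming the map is already loaded as a 3D NumPy array and that a boolean mask marking one asymmetric unit is available. Both are assumptions, the names `crop_with_buffer` and `buffer_voxels` are hypothetical, and none of this is part of ModelAngelo. The result is generally a non-cubic box, so whether ModelAngelo accepts it is exactly the question asked here.

```python
import numpy as np


def crop_with_buffer(volume, mask, buffer_voxels=8):
    """Crop `volume` to the bounding box of `mask`, padded by a buffer.

    Hypothetical helper: `mask` is a boolean array marking one ASU, and
    `buffer_voxels` is the padding added on every side so that chains
    crossing into a neighbouring ASU are not cut off at the box edge.
    Returns the cropped sub-volume and the voxel offset of its corner.
    """
    if volume.shape != mask.shape:
        raise ValueError("volume and mask must have the same shape")
    coords = np.argwhere(mask)
    if coords.size == 0:
        raise ValueError("mask is empty")
    # Bounding box of the mask, grown by the buffer and clipped to the map.
    lo = np.maximum(coords.min(axis=0) - buffer_voxels, 0)
    hi = np.minimum(coords.max(axis=0) + 1 + buffer_voxels, volume.shape)
    slices = tuple(slice(int(a), int(b)) for a, b in zip(lo, hi))
    return volume[slices].copy(), tuple(int(a) for a in lo)
```

The returned offset would be needed to shift the map origin when writing the sub-volume back to disk, so that a model built in the cropped box stays in the coordinate frame of the original map.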

@ctueting
Author

I also thought about this.
But when using, e.g., PHENIX to isolate a unique map, the result is not necessarily the ASU; it may be split within the polypeptide density.
In my case, I can do this, as I can validate the unique model. But for an unknown map, it would be better to identify the symmetry during the calculation, e.g., after predicting the Cα backbone.

@jamaliki
Collaborator

Hi,

Thank you for using ModelAngelo.
I'm sorry, but currently ModelAngelo treats all inputs as asymmetric and builds everything separately. It might be best to give parts of the map that you think are single asymmetric units to ModelAngelo, if you wish. Also, the results from the first GNN round will be very similar to the final results. You can use the output_fixed_pruned.cif file as the final pruned output and the output_fixed.cif file as the raw unpruned output. If you are using the no sequence model, you will only have the output.cif file.

Best,
Kiarash

@ctueting
Author

Hi Kiarash,

Thank you for the fast reply. The output_fixed_aa_pruned.cif looks correct (based on the overall expectation), but instead of 24 chains, the model contains 65 chains, and the 24 individual chains are fragmented. This is OK for me, as I can deal with it, but for inexperienced users this might be an issue.

best
Christian

ps. I can give you access to the map and fasta, and intermediate results privately, if this helps you to improve your code :)

@jamaliki
Collaborator

Hi,

Oh, I know why that is the case. Could you share the inputs and outputs so I can take a look?

Basically, this all gets fixed up in the last step, but that takes too long for you in this case :)

Best,
Kiarash.

@shahpnmlab

This is my exact use case as well. Please let me know if a suitable solution is available.
Good job though Kiarash! :)

@jamaliki
Collaborator

Hi @shahpnmlab ,

Could you try to break up the map manually? That is probably best. You can still use the same FASTA file.

Best,
Kiarash.

@shahpnmlab

I will try that approach... In my case the asymmetric unit is composed of trimers. In that case, do I need to provide the sequence only once or three times?

@jamaliki
Collaborator

Sequences should only ever be unique chains. The pipeline tries to remove duplicated chains but it is always best to provide it with clean data :)
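
As a rough illustration of "clean data" here, a small standalone sketch that collapses a FASTA file with repeated copies of the same chain down to its unique sequences. It assumes that exact, case-insensitive sequence identity is the right notion of "duplicate"; the helper names `read_fasta`, `unique_chains`, and `write_fasta` are hypothetical and are not ModelAngelo functions.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def unique_chains(records):
    """Keep the first record for each distinct sequence (case-insensitive)."""
    seen = set()
    unique = []
    for header, seq in records:
        key = seq.upper()
        if key and key not in seen:
            seen.add(key)
            unique.append((header, seq))
    return unique


def write_fasta(records, width=60):
    """Format (header, sequence) tuples as FASTA text, wrapped at `width`."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"
```

For a 24-mer or a trimer of identical subunits, `write_fasta(unique_chains(read_fasta(text)))` would leave a single record.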


4 participants