Secondary Structure Information #33

Closed
emrekuecuek opened this issue Sep 20, 2021 · 7 comments

Comments

@emrekuecuek

emrekuecuek commented Sep 20, 2021

Hello. In my project, I am using secondary structures. Without going into detail, I need to sample CA atoms from different secondary structures for my algorithm. I know this can be done by parsing strings, but I think it would be a nice feature to be able to obtain secondary structure information like any other property of a PDB file.

Expected Behavior

One solution might be to create functions such as helix(struc::ProteinStructure) or sheet(struc::ProteinStructure) that return the records containing the starting and ending residues of each secondary structure element. We could then get the list of atoms/residues with a combination like this:

collectatoms(struc::ProteinStructure, calphaselector)[helix(struc::ProteinStructure)[1]] # to obtain the atoms belonging to the first alpha helix

Current Behavior

I could not find anything related to this suggestion in the documentation. If it already exists, I apologize.

Possible Solution / Implementation

Although I have some experience with the source code in the spatial.jl file, I have no experience with parsing PDB files. My suggestion would be to write a string parser as a function, but I am not sure how to connect it to a ProteinStructure.
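
For illustration, the string-parsing idea could look roughly like the sketch below. It reads the fixed-width HELIX records of a legacy PDB file (chain ID in column 20, start residue number in columns 22-25, end residue number in columns 34-37). The names `HelixRange` and `parse_helix_records` are hypothetical and not part of BioStructures.jl, and insertion codes and helix classes are ignored.

```julia
# Hypothetical helper types and functions; not part of BioStructures.jl.
struct HelixRange
    chain::String
    first_res::Int
    last_res::Int
end

# Collect one HelixRange per HELIX record found in an iterable of lines.
function parse_helix_records(lines)
    helices = HelixRange[]
    for line in lines
        # Skip non-HELIX records and lines too short to hold the end residue field
        if startswith(line, "HELIX") && ncodeunits(line) >= 37
            chain = string(line[20])                      # initial chain ID, column 20
            first_res = parse(Int, strip(line[22:25]))    # initial residue number
            last_res = parse(Int, strip(line[34:37]))     # terminal residue number
            push!(helices, HelixRange(chain, first_res, last_res))
        end
    end
    return helices
end

# Possible usage, assuming a local file named "1AKE.pdb" exists:
# helices = parse_helix_records(eachline("1AKE.pdb"))
```

This only recovers what the file header declares, so it returns an empty list for PDB files without HELIX records.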

Context

My project is related to secondary structures. I think it would be nice to make this information available to those who need it.

@jgreener64
Member

You are right, it would be useful. On the todo list is wrapping DSSP to calculate secondary structure from the structure itself, rather than reading it from the PDB/mmCIF header. This approach fits the philosophy of BioStructures better than reading the header, since it would work on custom PDB files without a header too.

No promises about that being implemented soon though, sorry. In the meantime you could write the mmCIF header parsing functions you need using the mmCIF dictionary, for example for helices:

```julia
using BioStructures
downloadpdb("1AKE", format=MMCIF)
d = MMCIFDict("1AKE.cif")
hs, he = d["_struct_conf.beg_auth_seq_id"], d["_struct_conf.end_auth_seq_id"]
helices = [(parse(Int, s), parse(Int, e)) for (s, e) in zip(hs, he)]
```

Note chain IDs etc. are neglected for simplicity in this example.
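
A possible extension that keeps the chain ID and filters on the conformation type is sketched below. It assumes the dictionary also provides the standard mmCIF items `_struct_conf.conf_type_id` and `_struct_conf.beg_auth_asym_id`, and that each key maps to a vector of strings as in the example above. The function name `helix_ranges` is hypothetical, and a plain `Dict` stands in for the `MMCIFDict` so the sketch runs without downloading a file.

```julia
# Hypothetical helper; works on any dictionary-like object that maps
# mmCIF item names to vectors of strings.
function helix_ranges(d)
    types  = d["_struct_conf.conf_type_id"]
    chains = d["_struct_conf.beg_auth_asym_id"]
    starts = d["_struct_conf.beg_auth_seq_id"]
    stops  = d["_struct_conf.end_auth_seq_id"]
    # Keep only helix entries (types such as "HELX_P") and record the chain ID
    return [(chain = c, first_res = parse(Int, s), last_res = parse(Int, e))
            for (t, c, s, e) in zip(types, chains, starts, stops)
            if startswith(t, "HELX")]
end

# Made-up stand-in data for illustration only, not taken from 1AKE
example = Dict(
    "_struct_conf.conf_type_id"     => ["HELX_P", "TURN_P", "HELX_P"],
    "_struct_conf.beg_auth_asym_id" => ["A", "A", "B"],
    "_struct_conf.beg_auth_seq_id"  => ["13", "40", "61"],
    "_struct_conf.end_auth_seq_id"  => ["24", "43", "72"],
)
ranges = helix_ranges(example)
```

Each resulting range could then be combined with an extra selector function, for example `collectatoms(struc, calphaselector, at -> chainid(at) == r.chain && r.first_res <= resnumber(at) <= r.last_res)` for a range `r`, to collect the CA atoms of one helix.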

@emrekuecuek
Author

emrekuecuek commented Sep 22, 2021

Thank you for your very fast response. I will think about using mmCIF files; I might parse the strings from PDB files instead, but I haven't decided yet. After my graduation, I would like to contribute to this great project in my free time. Perhaps even on this issue? 😅

@shuuul
Contributor

shuuul commented Sep 17, 2023

There is a new package, ProteinSecondaryStructures.jl. Would it be possible to add support for secondary structure information based on it?

@shuuul
Contributor

shuuul commented Sep 17, 2023

ProteinSecondaryStructures.jl is based on PDBTools.jl. Maybe we could use DSSP_jll or STRIDE_jll directly and add an ss property to Residue.

@jgreener64
Member

There is discussion about running ProteinSecondaryStructures.jl on BioStructures.jl types at BioJulia/ProteinSecondaryStructures.jl#4.

It would also be cool to use DSSP_jll or STRIDE_jll from within BioStructures.jl, but I am unlikely to have the time to add this. Feel free to work on a PR.

@shuuul
Contributor

shuuul commented Sep 18, 2023

@jgreener64 I opened PR #43. Please check it and let me know what can be improved. I am new to Julia development.

@jgreener64
Member

Secondary structure support now added, thanks to @shuuul.
