Developing a more modular approach #45

Open
biomadeira opened this issue Sep 21, 2017 · 1 comment
biomadeira commented Sep 21, 2017

Proposed design:

ProteoFAV's main features:
1 - Reading/parsing formatted files to pandas DataFrames (e.g. mmCIF, PDB, SIFTS XML, DSSP files)
2 - Downloading data files on the fly (e.g. mmCIF, PDB, SIFTS XML, DSSP files)
3 - Fetching sequence annotations (features) (e.g. variants from Ensembl and UniProt)
4 - Merging all the previous data onto a main DataFrame

With this in mind, I think it would be great to have a structure like this:

proteofav.mmCIF.read() 		
proteofav.mmCIF.write() 
proteofav.mmCIF.download()
proteofav.mmCIF.select()
proteofav.PDB.read()
proteofav.PDB.write()
proteofav.PDB.download()
proteofav.PDB.select()
proteofav.DSSP.read()
proteofav.DSSP.download()
proteofav.DSSP.select()
proteofav.SIFTS.read()
proteofav.SIFTS.download()
proteofav.SIFTS.select()
proteofav.Validation.read()
proteofav.Validation.download()
proteofav.Validation.select()
proteofav.Annotations.read()
proteofav.Annotations.download()
proteofav.Annotations.select()
proteofav.Variants.fetch()
proteofav.Variants.select()
proteofav.Tables.merge()
proteofav.Tables.generate()
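
To make the proposed layout more concrete, here is a minimal sketch of what one such namespace could look like. `ToyFormat` and its whitespace-delimited file format are hypothetical stand-ins invented for illustration (they are not part of ProteoFAV); only the `read`/`write`/`select` method names come from the proposal above.

```python
import io

import pandas as pd


class ToyFormat:
    """Hypothetical stand-in for a format namespace such as proteofav.PDB."""

    @staticmethod
    def read(handle):
        # Parse a whitespace-delimited table into a DataFrame.
        return pd.read_csv(handle, sep=r"\s+")

    @staticmethod
    def write(table, handle):
        # Write the DataFrame back out in the same toy format.
        table.to_csv(handle, sep=" ", index=False)

    @staticmethod
    def select(table, **filters):
        # Keep only the rows where each named column equals the given value.
        for column, value in filters.items():
            table = table[table[column] == value]
        return table.reset_index(drop=True)


# Toy input; the column names are assumptions made for this example.
raw = io.StringIO("chain resnum resname\nA 1 MET\nA 2 LYS\nB 1 GLY\n")
table = ToyFormat.read(raw)
chain_a = ToyFormat.select(table, chain="A")

out = io.StringIO()
ToyFormat.write(chain_a, out)
```

With every format exposing the same small set of methods, calling code would not need to know which parser it is talking to.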

Classes generally have the following basic methods:

  • read - read/parse data from a file
  • write - write output to a file
  • download - download data to a file (mmCIF, etc.)
  • fetch - download data to a handle rather than a file, optionally cached (JSON, etc.)
  • merge - merge any set of DataFrames; for this to work, each DataFrame should be aware of the type of data it contains
  • generate - automated table generation from an input (i.e. a PDB ID/chain ID or a UniProt ID)
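
One possible way to make each DataFrame "aware" of its data type for `merge` is sketched below. The `Table` wrapper, the `JOIN_KEYS` lookup, the column names, and the toy data are all assumptions made for illustration; the proposal does not specify how the type information would be stored.

```python
from dataclasses import dataclass

import pandas as pd

# Hypothetical lookup: which columns to join on for a given pair of table kinds.
JOIN_KEYS = {
    ("mmcif", "sifts"): ["pdb_chain", "pdb_resnum"],
    ("sifts", "variants"): ["uniprot_resnum"],
}


@dataclass
class Table:
    """A DataFrame tagged with the kind of data it holds (illustrative only)."""

    kind: str
    data: pd.DataFrame


def merge(left, right):
    # Look up the join columns from the two kinds instead of asking the caller.
    keys = JOIN_KEYS.get((left.kind, right.kind))
    if keys is None:
        raise ValueError(f"no merge rule for {left.kind!r} and {right.kind!r}")
    merged = left.data.merge(right.data, on=keys, how="left")
    # Simplification: the result takes the right-hand kind so merges can be chained.
    return Table(kind=right.kind, data=merged)


# Toy data, invented for this example.
structure = Table("mmcif", pd.DataFrame(
    {"pdb_chain": ["A", "A"], "pdb_resnum": [1, 2], "resname": ["MET", "LYS"]}))
mapping = Table("sifts", pd.DataFrame(
    {"pdb_chain": ["A", "A"], "pdb_resnum": [1, 2], "uniprot_resnum": [10, 11]}))
variants = Table("variants", pd.DataFrame(
    {"uniprot_resnum": [11], "variant": ["K11R"]}))

main = merge(merge(structure, mapping), variants)
```

A left join keeps every structure residue in the main table, so residues without a variant simply carry a missing value in the `variant` column.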

biomadeira added this to the Modular approach milestone Sep 21, 2017
biomadeira self-assigned this Sep 21, 2017

biomadeira commented

PR #50 adds:

proteofav.MSA.read()
proteofav.MSA.download()
proteofav.MSA.select()
