PyEED-jupyternotebook

Jupyter notebook datamodel
This data model provides a pipeline for biological sequence analysis. Once the data-gathering steps are complete, all collected data is stored in a database that you create. The database can later be expanded, or the data can be used for further analysis.

The pipeline starts by installing all required packages.

1. Perform a protein-to-protein BLAST search using a relevant protein FASTA file, then record the results in an XML file. All related sequences and accession numbers are saved in this XML file.

2. Use each protein's accession number to look up the organism name and taxonomy ID, which are later inserted into the database.

3. Construct an HMM (Hidden Markov Model) from the protein superfamily that the query sequence belongs to. Run a multiple sequence alignment (MSA) on the five to ten best proteins with known structures, then use the HMM Python package's MSA function to produce a consensus HMM.

4. To create a numbering scheme, compare the relevant sequences to the HMM and choose the best-matching sequence that suits your needs. The numbering scheme is constructed in sequential order and helps users derive their own amino acid numbering, which can then be used, for example, to label specific sequence segments from experimental laboratory work.

5. For each PDB ID, a database of known annotations is downloaded. You can use this database to compare your own annotations with existing ones.

6. To extract the DNA nucleotide sequences from the NCBI database, first perform a protein-to-nucleotide tBLASTn search using a relevant protein FASTA file and store the tBLASTn results in an XML file.

7. NCBI DNA IDs can be derived from the tBLASTn results recorded in the XML file. From these accession numbers, the related DNA sequence for each amino acid sequence and the correlated NCBI DNA database ID for each PDB ID can then be found.
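
The BLAST and tBLASTn steps both rely on reading accession numbers back out of a saved XML file. The following is a minimal sketch of that idea using only the Python standard library, not the notebooks' actual code. It assumes the results follow the usual NCBI BLAST XML layout (`Hit`, `Hit_accession`, `Hsp_evalue` elements). The function name `extract_accessions`, the `max_evalue` cutoff, and the sample data are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written excerpt in the layout of NCBI BLAST XML output.
# The accessions and e-values are placeholders, not real search results.
SAMPLE_BLAST_XML = """
<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_accession>EXAMPLE_0001</Hit_accession>
          <Hit_hsps><Hsp><Hsp_evalue>1e-50</Hsp_evalue></Hsp></Hit_hsps>
        </Hit>
        <Hit>
          <Hit_accession>EXAMPLE_0002</Hit_accession>
          <Hit_hsps><Hsp><Hsp_evalue>2e-12</Hsp_evalue></Hsp></Hit_hsps>
        </Hit>
        <Hit>
          <Hit_accession>EXAMPLE_0003</Hit_accession>
          <Hit_hsps><Hsp><Hsp_evalue>0.5</Hsp_evalue></Hsp></Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
""".strip()


def extract_accessions(xml_text, max_evalue=1e-5):
    """Return accessions of hits whose best HSP e-value is <= max_evalue."""
    root = ET.fromstring(xml_text)
    accessions = []
    for hit in root.iter("Hit"):
        accession = hit.findtext("Hit_accession")
        # A hit can contain several HSPs; keep the hit if its best one passes.
        evalues = [float(node.text) for node in hit.iter("Hsp_evalue")]
        if accession and evalues and min(evalues) <= max_evalue:
            accessions.append(accession)
    return accessions


if __name__ == "__main__":
    for accession in extract_accessions(SAMPLE_BLAST_XML):
        print(accession)
```

To read a results file from disk instead of the inline sample, the file contents can be passed to the same function as a string. The e-value cutoff is an illustrative choice and should be adjusted to the analysis at hand.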
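
The numbering scheme in step 4 can be illustrated with a small sketch. It assumes the target sequence has already been aligned to a reference (for example, the sequence chosen as the best match to the HMM), that both are given as gapped strings of equal length, and that `-` marks a gap. The function name `number_by_reference` and the rule of leaving insertions unnumbered are assumptions made for this example. The notebooks may handle these cases differently.

```python
def number_by_reference(aligned_ref, aligned_target, gap="-"):
    """Assign reference-based position numbers to the residues of a target.

    Both inputs are rows of the same alignment, so they must have equal
    length. Reference positions are counted sequentially from 1. Target
    residues that fall in a reference gap (insertions) get None.
    """
    if len(aligned_ref) != len(aligned_target):
        raise ValueError("aligned sequences must have the same length")

    numbering = []
    ref_position = 0
    for ref_char, target_char in zip(aligned_ref, aligned_target):
        if ref_char != gap:
            ref_position += 1  # advance only on real reference residues
        if target_char == gap:
            continue  # target has no residue in this column
        number = ref_position if ref_char != gap else None
        numbering.append((target_char, number))
    return numbering


if __name__ == "__main__":
    # Toy alignment: the target lacks reference position 2 and has one
    # inserted residue (A) between reference positions 3 and 4.
    for residue, number in number_by_reference("MKT-LV", "M-TALV"):
        print(residue, number)
```

With this mapping, a segment identified in laboratory work on the target sequence can be labeled by its reference position numbers, so the same label applies across all sequences aligned to that reference.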
