akasha147 / PDB-Scrapper Public

Repository files:

PDB_List.txt
README.md
extract.pl

PDB-Scrapper

A web crawler that extracts data from PDB (Protein Data Bank) files.

Software requirements (on Fedora, i.e. Linux):

Perl

Modules required (can be installed using CPAN):
1. DBI (for connecting to the database)
2. LWP::Simple (for web crawling)
3. IO::String (for web crawling)

MySQL Server
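
Before running the crawler, it can help to confirm that the modules listed above are installed. The following is a minimal sketch, not part of extract.pl; the helper name `missing_modules` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the names of modules that cannot be loaded.
sub missing_modules {
    my @missing;
    for my $module (@_) {
        # Turn Some::Module into Some/Module.pm so require can locate it.
        (my $file = "$module.pm") =~ s{::}{/}g;
        push @missing, $module unless eval { require $file; 1 };
    }
    return @missing;
}

# The three modules named in the requirements list.
my @not_installed = missing_modules(qw(DBI LWP::Simple IO::String));

if (@not_installed) {
    print "Missing modules: @not_installed\n";
    print "They can be installed with: cpan @not_installed\n";
}
else {
    print "All required modules are available.\n";
}
```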

Features of the crawler

This script can extract the following information:

Experiment type (e.g. X-ray diffraction, NMR)
Protein type (e.g. lectin)
Resolution of the structure
R-factor
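
As an illustration of how these four fields might be read, the sketch below parses PDB-format text. It assumes the values come from the standard HEADER, EXPDTA, REMARK 2 and REMARK 3 records; extract.pl may obtain them differently. The helper name `parse_pdb_header` and the sample records are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not taken from extract.pl): read four fields from
# PDB-format text using the standard record names.
sub parse_pdb_header {
    my ($text) = @_;
    my %info;
    for my $line (split /\r?\n/, $text) {
        if ($line =~ /^HEADER/ && length($line) > 10) {
            # The classification occupies columns 11-50 of the HEADER record.
            my $class = substr($line, 10, 40);
            $class =~ s/^\s+|\s+$//g;
            $info{protein_type} = $class;
        }
        elsif ($line =~ /^EXPDTA\s+(.+?)\s*$/) {
            $info{experiment_type} = $1;
        }
        elsif ($line =~ /^REMARK\s+2\s+RESOLUTION\.\s+([\d.]+)\s+ANGSTROMS/) {
            # NMR entries report "NOT APPLICABLE" here and are left undefined.
            $info{resolution} = $1;
        }
        elsif ($line =~ /^REMARK\s+3\s+R VALUE\s+\(WORKING SET\)\s*:\s*([\d.]+)/) {
            # Keep the first working-set R value only.
            $info{r_factor} //= $1;
        }
    }
    return \%info;
}

# Made-up records for illustration only; this is not a real PDB entry.
my $sample = join "\n",
    sprintf('HEADER    %-40s%-9s   %-4s', 'LECTIN', '01-JAN-00', 'XXXX'),
    'EXPDTA    X-RAY DIFFRACTION',
    'REMARK   2 RESOLUTION.    2.00 ANGSTROMS.',
    'REMARK   3   R VALUE            (WORKING SET) : 0.190';

my $info = parse_pdb_header($sample);
printf "%-16s %s\n", $_, $info->{$_} // 'n/a' for sort keys %$info;
```

Fields that are absent from the input are simply missing from the returned hash, so a caller can decide how to handle entries without a resolution or R-factor.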

For each individual chain in a structure, the code:

Determines the type of the chain (protein/DNA/RNA)
Extracts the primary sequence (from the FASTA file)
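
The per-chain steps above could be sketched as follows. This is an illustration under stated assumptions, not the logic of extract.pl: the helpers `parse_fasta` and `chain_type` are hypothetical, the FASTA headers are made up, and the chain type is guessed from the sequence alphabet alone. That heuristic is ambiguous for short sequences, since a peptide made only of A, C and G would be labelled as DNA.

```perl
use strict;
use warnings;

# Hypothetical helper: split FASTA text into { header, sequence } records.
sub parse_fasta {
    my ($text) = @_;
    my (@records, $current);
    for my $line (split /\r?\n/, $text) {
        next if $line =~ /^\s*$/;
        if ($line =~ /^>(.*)/) {
            $current = { header => $1, sequence => '' };
            push @records, $current;
        }
        elsif ($current) {
            # Sequence lines may be wrapped; join them into one string.
            $line =~ s/\s+//g;
            $current->{sequence} .= uc $line;
        }
    }
    return @records;
}

# Hypothetical heuristic: classify a chain by the letters it contains.
sub chain_type {
    my ($seq) = @_;
    return 'Unknown' unless length $seq;
    return 'DNA' if $seq =~ /^[ACGT]+$/;
    return 'RNA' if $seq =~ /^[ACGU]+$/;
    return 'Protein';
}

# Made-up sequences for illustration only.
my $fasta = <<'END';
>example_chain_A
MKTAYIAKQR
QISFVKSHFS
>example_chain_B
GATTACA
>example_chain_C
GGCUAAU
END

for my $rec (parse_fasta($fasta)) {
    printf "%s\t%s\t%s\n",
        $rec->{header}, chain_type($rec->{sequence}), $rec->{sequence};
}
```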

The code discards the extracted data under the following conditions:

A chain contains any unknown residue
There are no protein chains in the structure (only DNA and/or RNA)
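
The two discard conditions can be expressed as a small filter. The sketch below is an assumption about how such a check might look, not the code in extract.pl. It assumes that the whole structure is dropped when any chain fails, that an unknown amino acid appears as `X`, and that an unknown nucleotide appears as `N`. The helper name `keep_structure` and the chain records are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical filter: takes a list of { type, sequence } chain records and
# returns 1 if the structure should be kept, 0 if it should be discarded.
sub keep_structure {
    my (@chains) = @_;
    my $has_protein = 0;
    for my $chain (@chains) {
        my ($type, $seq) = @{$chain}{qw(type sequence)};
        # Assumption: X marks an unknown residue in any chain.
        return 0 if $seq =~ /X/i;
        # Assumption: N marks an unknown base in DNA/RNA chains
        # (in a protein chain, N is asparagine and is allowed).
        return 0 if $type ne 'Protein' && $seq =~ /N/i;
        $has_protein = 1 if $type eq 'Protein';
    }
    # Structures with only DNA and/or RNA chains are discarded.
    return $has_protein;
}

# Made-up structures for illustration only.
my %examples = (
    protein_with_dna => [ { type => 'Protein', sequence => 'MKTNAY' },
                          { type => 'DNA',     sequence => 'GATTACA' } ],
    unknown_residue  => [ { type => 'Protein', sequence => 'MKXAY' } ],
    nucleic_only     => [ { type => 'DNA',     sequence => 'GATTACA' },
                          { type => 'RNA',     sequence => 'GGCUAAU' } ],
);

for my $name (sort keys %examples) {
    my $verdict = keep_structure(@{ $examples{$name} }) ? 'keep' : 'discard';
    print "$name: $verdict\n";
}
```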
