wwood/ApiLoc

ApiLoc

The phylum of intracellular parasites Apicomplexa contains many nasty bugs, including those that cause malaria, cryptosporidiosis, toxoplasmosis and Texas fever. While these species are single-celled, they are eukaryotes (i.e. not bacterial or archaeal) and possess many strange and intriguing structures within (and outside) their cells.

ApiLoc is a database of where the proteins encoded in these parasites' genomes are located in and around the parasites. A web interface is provided at http://apiloc.biochem.unimelb.edu.au, but this git repository is the source of the data behind that interface.

Structure of the data

Firstly, a caution. This is early days for the database, so expect things to change without notice.

That being said, here is some metadata:

The raw_data folder

This folder contains the raw curatorial data. That is, after reading journal articles, the curator types directly into these spreadsheets.

There is currently a single spreadsheet for each species, named after that species. For example, the Plasmodium_falciparum.csv file contains a list of all the different P. falciparum proteins that have had their localisation curated. Most species that have undergone curation also happen to have had their genomes sequenced. When this is not the case, e.g. for any Besnoitia species (they cause besnoitiosis), then for largely historical reasons they are recorded at the genus level instead of the species level, and the file is named with only the genus name, e.g. Besnoitia_spp.csv. The species is still known for each protein's curation, however: this information lives inside the file, in the first column.
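As a rough illustration of the layout described above, the following Python sketch lists the spreadsheets in a raw_data folder and counts rows per species in a genus-level file by reading the first column. The function names are hypothetical, and the sketch assumes the files are comma-separated, UTF-8 encoded, and have no header row; none of this is confirmed here, and only the statement that the species lives in the first column comes from the description above.

```python
import csv
from collections import Counter
from pathlib import Path


def raw_data_files(raw_data_dir):
    """Return the per-species (or per-genus) CSV files, sorted by name."""
    return sorted(Path(raw_data_dir).glob("*.csv"))


def species_counts(csv_path):
    """Count rows per value of the first column.

    Assumption: the first column holds the species name, as described
    for genus-level files such as Besnoitia_spp.csv. If a file has a
    header row, it will be counted like any other row.
    """
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle):
            if not row or not row[0].strip():
                continue  # skip blank lines and rows with an empty first cell
            counts[row[0].strip()] += 1
    return counts


if __name__ == "__main__":
    # Assumes the script is run from the root of a checkout of this repository.
    for path in raw_data_files("raw_data"):
        print(path.name, dict(species_counts(path)))
```

Because the format may change without notice, treat the column position as something to check against the current files before relying on it.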

The controlled_vocabulary folder

This folder contains information about the controlled vocabulary used in ApiLoc.

The misc folder

This folder contains things other than the raw, curated data. Please be aware, though, that the location of this data is liable to change at any time.

The localisation_suggestions.csv file contains proteins that haven't been localised directly with methods sufficient to qualify them for inclusion in the regular curated data, but have been suggested by experts to be localised to a particular place.

About

A curated database of protein sub-cellular localisation in apicomplexan parasites
