This repository processes the gene family data from HGNC. In the future, the repository may expand its scope to process other types of HGNC data.
1.download.ipynb downloads HGNC data. Check this notebook to see the last modified dates of downloaded files.
2.families.ipynb constructs the gene family ontology in networkx and annotates gene families with their corresponding Entrez Gene IDs. Gene membership in a family is propagated to its parent families: for example, genes belonging to the "Glutamate metabotropic receptors" family also belong to the "Glutamate receptors" family.
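The propagation step can be illustrated with a minimal sketch. This is not the code from the notebook: it assumes the ontology is a networkx `DiGraph` whose edges point from a parent family to a child family, and the helper `propagate_membership` and the numeric gene IDs are hypothetical placeholders chosen for illustration.

```python
import networkx as nx

# Toy ontology (assumption: edges run parent family -> child family).
ontology = nx.DiGraph()
ontology.add_edge("Glutamate receptors", "Glutamate metabotropic receptors")
ontology.add_edge("Glutamate receptors", "Glutamate ionotropic receptors")

# Direct annotations. The integers are placeholder IDs, not real Entrez Gene IDs.
direct_genes = {
    "Glutamate receptors": set(),
    "Glutamate metabotropic receptors": {1001, 1002},
    "Glutamate ionotropic receptors": {2001},
}


def propagate_membership(graph, direct):
    """Map each family to its own genes plus the genes of all descendant families."""
    propagated = {}
    for family in graph.nodes:
        genes = set(direct.get(family, set()))
        # nx.descendants returns every node reachable from `family`.
        for child in nx.descendants(graph, family):
            genes |= direct.get(child, set())
        propagated[family] = genes
    return propagated


membership = propagate_membership(ontology, direct_genes)
```

Here `membership["Glutamate receptors"]` contains the genes of both child families, while each child family keeps only its own genes.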
download contains unmodified downloads from the EBI FTP site.
data contains generated datasets.
  families.graphml contains a GraphML-formatted network of the HGNC gene family ontology.
  gene-families.tsv contains the mapping between gene families and Entrez genes.
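A GraphML file such as families.graphml can be read back into networkx with `nx.read_graphml`. The sketch below is only a round trip on a toy graph written to a temporary directory, so it runs without the repository data. The node IDs (`F1`, `F2`) and the `name` attribute are assumptions and may not match the attributes stored in the real file.

```python
import os
import tempfile

import networkx as nx

# Toy stand-in for the ontology; node IDs and the "name" attribute are hypothetical.
toy = nx.DiGraph()
toy.add_node("F1", name="Glutamate receptors")
toy.add_node("F2", name="Glutamate metabotropic receptors")
toy.add_edge("F1", "F2")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "families.graphml")
    nx.write_graphml(toy, path)     # serialize the graph to GraphML
    loaded = nx.read_graphml(path)  # parse it back into a networkx graph

# Node attributes and edge direction survive the round trip.
child_names = [loaded.nodes[n]["name"] for n in loaded.successors("F1")]
```

To load the repository's own file, pass its path to `nx.read_graphml` instead of the temporary one.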
Have a question? Submit all feedback or questions via GitHub issues!