Cancer Deep Phenotype Extraction (DeepPhe) Project

This repository is the home of the public releases of the DeepPhe Cancer Phenotype Extraction System.

DeepPhe combines natural language processing (based on Apache cTAKES) with an extensive domain information model to:

  • extract information from plaintext documents
  • summarize information for Cancers and Tumors across multiple documents
  • write results to a Neo4j database
  • visualize results at patient and cohort levels using our DeepPhe Viz tool.
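
To illustrate what summarizing across multiple documents means in practice, here is a minimal Python sketch. It is not DeepPhe code and does not use the DeepPhe, cTAKES, or Neo4j APIs. The `Mention` record, its fields, and the `summarize_by_patient` function are hypothetical names invented for this example. It only shows the general idea of rolling per-document facts up to the patient level while keeping track of which documents support each value.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One fact extracted from a single document (hypothetical shape)."""
    patient_id: str
    document_id: str
    attribute: str  # e.g. "diagnosis" or "stage"
    value: str


def summarize_by_patient(mentions):
    """Group per-document mentions into one summary per patient.

    For each patient and attribute, collect the distinct values and the
    documents that support each value.
    """
    summary = defaultdict(lambda: defaultdict(lambda: defaultdict(set)))
    for m in mentions:
        summary[m.patient_id][m.attribute][m.value].add(m.document_id)
    # Convert the nested defaultdicts to plain dicts with sorted document lists.
    return {
        patient: {
            attr: {value: sorted(docs) for value, docs in values.items()}
            for attr, values in attrs.items()
        }
        for patient, attrs in summary.items()
    }


# Illustrative, made-up input: two notes for the same patient.
mentions = [
    Mention("patient01", "note_1.txt", "diagnosis", "breast cancer"),
    Mention("patient01", "note_2.txt", "diagnosis", "breast cancer"),
    Mention("patient01", "note_2.txt", "stage", "II"),
]
result = summarize_by_patient(mentions)
```

In this sketch, a value mentioned in several notes appears once in the patient summary, with every supporting document listed next to it. The actual system's summarization logic and output schema may differ.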

The system has been tested using documents from three cancer domains:

  • Breast Cancer
  • Ovarian Cancer
  • Malignant Melanoma

There are two versions of DeepPhe:

  • DeepPhe-XN: the full suite of tools, designed to support cohort discovery for cancer clinical research. This is the version that most users will want to start with. Installation instructions are available on the Windows Installation and Mac Installation pages.
  • DeepPhe-CR: a web-service version of DeepPhe designed to support cancer registries. Requires Docker for installation.


  • DeepPhe is provided under an Academic Software Use Agreement. Refer to that agreement for information about requesting the use of the Software for commercial purposes.

  • DeepPhe includes portions of the ontology.

  • DeepPhe includes portions of the NCI Thesaurus (NCIt).

  • DeepPhe uses the Apache cTAKES (click for license) library.

  • DeepPhe writes output to a Neo4j (click for license) database.

Contact / Help

If you'd like to drop us a note, or have technical questions, please post on the issue tracker.

Metrics on downloads and usage could help us with funding future enhancements.