GitHub - ahodroj/i2b2-hadoop: A Hadoop Map/Reduce implementation of processing and executing I2B2 CRC Queries

i2b2-hadoop: a map/reduce query processor engine for i2b2 CRC queries

i2b2 (Informatics for Integrating Biology & the Bedside) is an open-source platform for de-identified cohort discovery, and for managing and delivering clinical data sets for research with appropriate IRB approval. Sponsored by the National Institutes of Health (NIH), i2b2 is a widely accepted tool among CTSA sites and other Academic Medical Centers (AMCs), and has also found increasing use at other organizations for research and clinical performance improvement initiatives.

An i2b2 implementation consists of a data mart of clinical, research, and administrative data, and an interface to construct and manage queries and data sets. This project provides a Hadoop implementation of the data mart query engine, known as the CRC (Clinical Research Chart).

Version: 0.2

Execution mode:

Export the concept_dimension table as a CSV file.
Export the query as an XML file from qt_query_master.
Make sure the contents of the observation_fact table exist in CSV format on HDFS.
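
To make the query export step more concrete, here is a minimal sketch of reading panels and item keys out of an exported query definition. It is not taken from this project's source. It assumes the exported XML uses unprefixed `panel` and `item_key` elements, as i2b2 query definitions usually do; the class name `QueryXmlSketch`, the method `readPanels`, and the sample keys are hypothetical. In i2b2, item keys are normally resolved to concept codes through concept_dimension, which is presumably why that table is exported as well.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of i2b2-hadoop.
class QueryXmlSketch {

    // Made-up query definition with two panels; element names are assumed.
    static final String SAMPLE =
          "<query_definition>"
        + "<query_name>sample</query_name>"
        + "<panel><panel_number>1</panel_number>"
        + "<item><item_key>\\\\demo\\Diagnoses\\A\\</item_key></item>"
        + "<item><item_key>\\\\demo\\Diagnoses\\B\\</item_key></item>"
        + "</panel>"
        + "<panel><panel_number>2</panel_number>"
        + "<item><item_key>\\\\demo\\Labs\\C\\</item_key></item>"
        + "</panel>"
        + "</query_definition>";

    /** Returns one list of item_key strings per panel element, in document order. */
    static List<List<String>> readPanels(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        List<List<String>> panels = new ArrayList<>();
        NodeList panelNodes = doc.getElementsByTagName("panel");
        for (int i = 0; i < panelNodes.getLength(); i++) {
            Element panel = (Element) panelNodes.item(i);
            NodeList keys = panel.getElementsByTagName("item_key");
            List<String> itemKeys = new ArrayList<>();
            for (int j = 0; j < keys.getLength(); j++) {
                itemKeys.add(keys.item(j).getTextContent().trim());
            }
            panels.add(itemKeys);
        }
        return panels;
    }

    public static void main(String[] args) throws Exception {
        // Print each panel's item keys on its own line.
        for (List<String> panel : readPanels(SAMPLE)) {
            System.out.println(panel);
        }
    }
}
```

In standard i2b2 semantics, items inside one panel are combined with OR and separate panels are combined with AND, so keeping the keys grouped per panel preserves what a query engine needs to evaluate the query.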

To run:

hadoop jar i2b2-hadoop.jar net.hodroj.i2b2.I2B2QueryJob query.xml hdfs://<observation_fact> hdfs://<output>

The output is a list of unique i2b2 patient_num values.
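
As an illustration of how a list of unique patient_num values can come out of a map/reduce pass, the sketch below imitates the two phases in plain Java with no Hadoop dependency. It is not the code of `I2B2QueryJob`. The column positions, the naive comma split (no quoted fields), and the names `PatientFilterSketch`, `map`, and `uniquePatients` are all assumptions made for the example, and it only handles a single set of wanted concept codes rather than a full multi-panel query.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not part of i2b2-hadoop.
class PatientFilterSketch {

    // Assumed positions of the columns in each observation_fact CSV row.
    static final int PATIENT_NUM_COL = 1;
    static final int CONCEPT_CD_COL = 2;

    /** "Map" step: emit patient_num when the row's concept_cd is one the query asks for. */
    static Optional<String> map(String csvLine, Set<String> wantedConcepts) {
        String[] cols = csvLine.split(",", -1);
        if (cols.length <= CONCEPT_CD_COL) {
            return Optional.empty(); // skip malformed or short rows
        }
        String concept = cols[CONCEPT_CD_COL].trim();
        return wantedConcepts.contains(concept)
            ? Optional.of(cols[PATIENT_NUM_COL].trim())
            : Optional.empty();
    }

    /** "Reduce" step: collapse repeated patient_num values into a sorted set. */
    static Set<String> uniquePatients(List<String> csvLines, Set<String> wantedConcepts) {
        Set<String> patients = new TreeSet<>();
        for (String line : csvLines) {
            map(line, wantedConcepts).ifPresent(patients::add);
        }
        return patients;
    }

    public static void main(String[] args) {
        // Made-up rows: encounter_num, patient_num, concept_cd, provider_id
        List<String> rows = Arrays.asList(
            "100,1,ICD9:250.00,P1",
            "101,1,ICD9:250.00,P1",
            "102,2,ICD9:401.9,P2",
            "103,3,ICD9:250.00,P3");
        Set<String> wanted = new HashSet<>(Arrays.asList("ICD9:250.00"));
        System.out.println(uniquePatients(rows, wanted));
    }
}
```

In a real Hadoop job the map step would run in a Mapper keyed by patient_num and the shuffle would group duplicates, so the reducer only needs to write each key once.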

Future plans:

Deploy as a CRC-cell in the i2b2 Hive

