bionlplab/radtext

RadText

Purpose

RadText is a high-performance Python Radiology Text Analysis System.

Prerequisites

  • Python >= 3.6, <3.9
  • Linux
  • Java

# Set up environment
$ sudo apt-get install python3-dev build-essential default-jdk
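
The prerequisites above can be checked before installing. The following is a minimal sketch using only the Python standard library; the function names `python_supported` and `missing_prerequisites` are hypothetical helpers for illustration and are not part of RadText.

```python
import platform
import shutil
import sys


def python_supported(version=None):
    """Return True if the version satisfies Python >= 3.6, < 3.9."""
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return (3, 6) <= major_minor < (3, 9)


def missing_prerequisites():
    """Collect human-readable descriptions of unmet prerequisites."""
    problems = []
    if not python_supported():
        problems.append(
            "Python %d.%d is outside the supported range >=3.6,<3.9"
            % sys.version_info[:2]
        )
    if platform.system() != "Linux":
        problems.append("operating system is %s, not Linux" % platform.system())
    # Assumption: a usable Java installation exposes `java` on PATH.
    if shutil.which("java") is None:
        problems.append("no 'java' executable found on PATH")
    return problems


for problem in missing_prerequisites():
    print("Prerequisite not met:", problem)
```

An empty result only means these three checks passed; it does not verify the Java version or the build tools.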

Quickstart

The latest RadText releases are available on PyPI.

RadText releases can be installed with pip as source packages or binary wheels. We recommend installing into a virtual environment to avoid modifying system state:

$ python -m venv venv
$ source venv/bin/activate
$ pip install -U pip setuptools wheel
$ pip install -U radtext
$ python -m spacy download en_core_web_sm
$ radtext-download --all

To see RadText’s pipeline in action, you can launch the Python interactive interpreter, and try the following commands:

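import bioc
import radtext

# Build the default RadText pipeline
nlp = radtext.Pipeline()

# Load a BioC XML collection from disk
with open('/PATH/TO/BIOC_FILE.xml') as fp:
    doc = bioc.load(fp)

# Run the pipeline on the collection and print the result
annotations = nlp(doc)
print(annotations)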
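
The example above expects an input file in BioC XML format. If you only have plain report text, the sketch below shows one way to wrap it in a minimal BioC collection using the standard library. The element layout follows the public BioC format (collection, document, passage, offset, text); the helper names, the `source` value, and the sample report are illustrative assumptions, and whether RadText requires additional fields is not covered here.

```python
import xml.etree.ElementTree as ET


def build_bioc_collection(doc_id, report_text):
    """Wrap one report in a minimal BioC collection (hypothetical helper)."""
    collection = ET.Element("collection")
    # source, date, and key are the collection-level header elements.
    ET.SubElement(collection, "source").text = "example"  # placeholder value
    ET.SubElement(collection, "date").text = ""
    ET.SubElement(collection, "key").text = ""

    document = ET.SubElement(collection, "document")
    ET.SubElement(document, "id").text = doc_id

    # A single passage holding the whole report, starting at offset 0.
    passage = ET.SubElement(document, "passage")
    ET.SubElement(passage, "offset").text = "0"
    ET.SubElement(passage, "text").text = report_text
    return ET.ElementTree(collection)


def write_bioc_file(path, doc_id, report_text):
    """Serialize the collection to `path` as UTF-8 XML."""
    tree = build_bioc_collection(doc_id, report_text)
    tree.write(path, encoding="utf-8", xml_declaration=True)
```

Calling `write_bioc_file('report.xml', '1', 'No acute cardiopulmonary process.')` would produce a file that can be passed to `bioc.load` in place of `/PATH/TO/BIOC_FILE.xml`.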

RadText also provides command-line interfaces for specific NLP tasks (e.g., de-identification, sentence splitting, or named entity recognition).

$ radtext-deid --repl=X -i /path/to/input.xml -o /path/to/output.xml
$ radtext-ssplit -i /path/to/input.xml -o /path/to/output.xml
$ radtext-ner spacy --radlex /path/to/Radlex4.1.xlsx -i /path/to/input.xml -o /path/to/output.xml
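
These commands can also be chained from a script. The sketch below builds the three invocations with only the flags shown above and runs them in order with `subprocess`. It assumes that each tool's output file is valid input for the next one; the helper names and the intermediate file names are hypothetical.

```python
import subprocess


def build_commands(input_xml, deid_xml, ssplit_xml, ner_xml, radlex_xlsx):
    """Return argv lists for de-identification, sentence splitting, and NER."""
    return [
        ["radtext-deid", "--repl=X", "-i", input_xml, "-o", deid_xml],
        # Assumption: the de-identified file can be fed to the sentence splitter.
        ["radtext-ssplit", "-i", deid_xml, "-o", ssplit_xml],
        # Assumption: the sentence-split file can be fed to the NER tool.
        ["radtext-ner", "spacy", "--radlex", radlex_xlsx,
         "-i", ssplit_xml, "-o", ner_xml],
    ]


def run_commands(commands):
    """Run each command in order, stopping at the first non-zero exit code."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

For example, `run_commands(build_commands('input.xml', 'deid.xml', 'ssplit.xml', 'ner.xml', '/path/to/Radlex4.1.xlsx'))` would execute the three steps inside the activated virtual environment.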

Documentation

You will find complete documentation at our Read the Docs site.

Contributing

You can find information about contributing to RadText at our Contribution page.

Acknowledgment

This work is supported by the National Library of Medicine under Award No. 4R00LM013001 and the NIH Intramural Research Program, National Library of Medicine.

Further acknowledgments are listed on our Acknowledgment page.

License

Copyright BioNLP Lab at Weill Cornell Medicine, 2022.

Distributed under the terms of the MIT license, RadText is free and open source software.