clinisift is a multitool for processing clinical medical records.

The main goal is to provide easy, off-the-shelf access to common NLP processes when working with medical records:

Sentence Tokenization and Section Identification from unstructured clinical textual data
Named Entity Recognition of medication-related data and clinical entities from records
Intuitive visualization of extracted information

A few motivating use-cases, each achievable in only a few lines of code:

Extract clinical problems and procedures mentioned in a record’s CLINICAL HISTORY section.
When exploring a new dataset, visualize records with clinical and medication entities parsed and highlighted on-the-fly.
Check whether both a particular medication and a particular surgical procedure are mentioned in a patient’s PAST MEDICAL HISTORY.
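As a rough illustration of the last use-case: once each entity is tagged with the section it came from, the check reduces to a set lookup. The sketch below does not use clinisift's API. The entity dictionaries, their keys (`text`, `label`, `section`), and the label names `MEDICATION` and `PROCEDURE` are all assumptions made for illustration.

```python
from typing import Iterable, Mapping


def mentions_both(
    entities: Iterable[Mapping[str, str]],
    section: str,
    medication: str,
    procedure: str,
) -> bool:
    """Return True if both terms appear as entities inside `section`.

    Each entity is assumed (hypothetically) to look like
    {"text": ..., "label": ..., "section": ...}.
    """
    wanted = section.casefold()
    # Collect (label, lowercased text) pairs found in the requested section.
    found = {
        (ent["label"], ent["text"].casefold())
        for ent in entities
        if ent["section"].casefold() == wanted
    }
    has_medication = ("MEDICATION", medication.casefold()) in found
    has_procedure = ("PROCEDURE", procedure.casefold()) in found
    return has_medication and has_procedure


# Hypothetical entities, shaped as a parser might emit them.
sample_entities = [
    {"text": "Warfarin", "label": "MEDICATION", "section": "PAST MEDICAL HISTORY"},
    {"text": "appendectomy", "label": "PROCEDURE", "section": "PAST MEDICAL HISTORY"},
    {"text": "aspirin", "label": "MEDICATION", "section": "DISCHARGE MEDICATIONS"},
]

print(mentions_both(sample_entities, "Past Medical History", "warfarin", "appendectomy"))
```

Matching is case-insensitive on both the section name and the entity text, since clinical notes are inconsistent about capitalization.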

Quick Features

Parse - Extract clinical and medical entities through Transformers-based Named Entity Recognition, alongside other components such as medical record section identification. Also supports any NER model that can be loaded as a Hugging Face pipeline.
Analyze - Built-in methods to quickly filter through parsed data with as little code overhead as possible.
Visualize - spaCy-based visualizer that integrates with Transformers NER to visualize medical record parses on-the-fly, programmatically or via command line.

Get Started

Installation

Install via pip:

pip install clinisift

Or, from source:

git clone git@github.com:clinisift/clinisift.git
cd clinisift && pip install -e .

Quickstart

For a comprehensive overview of clinisift’s capabilities, see the “Components” page on the wiki.

Components

clinisift is made up of Parser and Doc components. See the “Components” page on the wiki for an explanation of all the parameters.

class Parser(
    models=None,
    include_ents=[],
    exclude_ents=[],
    iob_resolve=True,
    sent_tokenizer="clinitokenizer",
    sent_per_line=False,
    extract_section_headers=False,
    section_header_expr=None,
    device=None,
)
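The `iob_resolve` flag presumably controls whether token-level IOB tags (B- for the beginning of an entity, I- for its continuation, O for outside) are merged into whole entity spans. That reading is an assumption based on the parameter name. A minimal sketch of IOB merging in general, independent of how clinisift implements it:

```python
from typing import Dict, List, Optional


def resolve_iob(tokens: List[str], tags: List[str]) -> List[Dict[str, str]]:
    """Merge IOB-tagged tokens into whole entity spans."""
    entities: List[Dict[str, str]] = []
    current: Optional[Dict[str, str]] = None  # entity currently being built

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        continues = (
            prefix == "I" and current is not None and current["label"] == label
        )
        if continues:
            # Extend the open entity with the next token.
            current["text"] += " " + token
        elif prefix in ("B", "I"):
            # B- starts a new entity; a stray I- is treated the same way.
            if current is not None:
                entities.append(current)
            current = {"text": token, "label": label}
        else:
            # "O" closes whatever entity was open.
            if current is not None:
                entities.append(current)
            current = None

    if current is not None:
        entities.append(current)
    return entities


tokens = ["Patient", "takes", "metoprolol", "succinate", "after", "bypass", "surgery"]
tags = ["O", "O", "B-DRUG", "I-DRUG", "O", "B-PROCEDURE", "I-PROCEDURE"]
print(resolve_iob(tokens, tags))
```

The label names `DRUG` and `PROCEDURE` are illustrative only; the actual label set depends on the NER models in use.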

class Doc(
    filepath_or_str,
    parser,
    is_file=True
)
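The `extract_section_headers` and `section_header_expr` parameters suggest that section identification can be driven by a regular expression. The pattern clinisift uses by default is not shown here, so the sketch below assumes a simple convention (an all-caps line ending in a colon) and uses only the standard library. The names `SECTION_HEADER_EXPR` and `split_sections` are hypothetical.

```python
import re
from typing import List, Tuple

# Assumed convention: a header is an all-caps line ending with a colon.
SECTION_HEADER_EXPR = re.compile(r"^([A-Z][A-Z /]+):[ \t]*$", re.MULTILINE)


def split_sections(
    text: str, header_expr: re.Pattern = SECTION_HEADER_EXPR
) -> List[Tuple[str, str]]:
    """Split a record into (header, body) pairs using a header regex."""
    matches = list(header_expr.finditer(text))
    sections = []
    for i, match in enumerate(matches):
        # A section's body runs until the next header, or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[match.end():end].strip()
        sections.append((match.group(1), body))
    return sections


record = """CLINICAL HISTORY:
Chest pain for two days.

PAST MEDICAL HISTORY:
Hypertension. Prior appendectomy.
"""

for header, body in split_sections(record):
    print(header, "->", body)
```

Real records vary widely in how headers are written, which is presumably why the expression is configurable.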

Examples

Below are some examples for common use-cases.

Extract all clinical entities and medications from a *.txt file

from clinisift.cliniparse import Parser
from clinisift.doc import Doc

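# Path to a plain-text record (placeholder; substitute your own file)
text_file_path = "record.txt"

parser = Parser()  # default config: medication NER and clinical NER
doc = Doc(text_file_path, parser)

res = doc.parse()
# { "sentences": [...],
#   "entities": [...] }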
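Once a parse result is in hand, a common next step is grouping entities by type. The sketch below works on a plain dictionary with the `sentences` and `entities` keys shown above. The shape of each entity (`text` and `label` keys) and the label names are assumptions, and `group_by_label` is a hypothetical helper rather than part of clinisift.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_label(result: Dict[str, Any]) -> Dict[str, List[str]]:
    """Group entity texts by their label."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for ent in result.get("entities", []):
        grouped[ent["label"]].append(ent["text"])
    return dict(grouped)


# Hand-written stand-in for a parse result; not real clinisift output.
example_result = {
    "sentences": ["Patient was started on lisinopril.", "Denies chest pain."],
    "entities": [
        {"text": "lisinopril", "label": "DRUG"},
        {"text": "chest pain", "label": "PROBLEM"},
    ],
}

print(group_by_label(example_result))
```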

Visualize entities extracted on-the-fly from a directory of .txt files

To launch the visualizer with the default Parser() config, run from the command line:

python -m clinisift.visualizer /my/data/dir

This launches a Flask server that serves the visualizer.

The visualizer module can be integrated with any `Parser` to customize the NER pipelines used, the entities visualized, and so forth. More information is available in the wiki.
