Reorganize package dependencies #506

Merged
merged 22 commits into from
Nov 21, 2023
Commits
6d85be4
Update dependencies and imports for MONAI and typing
fcogidi Nov 15, 2023
64c517c
Refactor MedicalImage and MedicalImageFolder
fcogidi Nov 16, 2023
e5ac323
allow init of MedicalImage class; raise error in methods
fcogidi Nov 17, 2023
fa90753
Refactor import_optional_module function to allow importing module at…
fcogidi Nov 17, 2023
0240987
Refactor MedicalImage optional module imports
fcogidi Nov 17, 2023
9085081
Update dependencies in pyproject.toml
fcogidi Nov 17, 2023
113fd6d
Add test for MedicalImage feature without MONAI
fcogidi Nov 20, 2023
3d7ec1d
Merge branch 'main' into make_monai_optional
fcogidi Nov 20, 2023
4a5e772
Prevent use of txrv_transforms method at runtime if MONAI is not inst…
fcogidi Nov 20, 2023
51d80af
Merge branch 'main' into make_monai_optional
fcogidi Nov 20, 2023
2783be7
Move report package deps to core installation
amrit110 Nov 21, 2023
4e1b6d0
Adjust package installation tests
amrit110 Nov 21, 2023
931f541
Formatting fix
amrit110 Nov 21, 2023
06cbc33
Remove report package test action
amrit110 Nov 21, 2023
8acceaa
Formatting fix
amrit110 Nov 21, 2023
d4049e4
remove txrv_transforms, add dictionary wrapper for torchvision transf…
a-kore Nov 21, 2023
a038e0b
fix repr for transform
a-kore Nov 21, 2023
3c4da81
fix Dictd call func
a-kore Nov 21, 2023
08b4659
fix monitor-api notebook
a-kore Nov 21, 2023
cec48e8
Update imports for image transforms
fcogidi Nov 21, 2023
7b179bc
Update metadata for cxr_classification.ipynb
fcogidi Nov 21, 2023
4e2de73
fix transforms in notebooks
a-kore Nov 21, 2023
14 changes: 0 additions & 14 deletions .github/workflows/package.yml
@@ -51,17 +51,3 @@ jobs:
pip install -e ".[models]"
pip install pytest
python3 -m pytest tests/package/extras/models.py
extra-report-package-install-check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Install pip
run: python3 -m pip install --upgrade pip
- uses: actions/setup-python@v4.7.1
with:
python-version: '3.10'
- name: Install package and test import
run: |
pip install -e ".[report]"
pip install pytest
python3 -m pytest tests/package/extras/report.py
33 changes: 7 additions & 26 deletions README.md
@@ -13,20 +13,17 @@

``cyclops`` is a toolkit for facilitating research and deployment of ML models for healthcare. It provides a few high-level APIs namely:

* `data` - Create datasets for training, inference and evaluation. We use the popular 🤗 [datasets](https://github.com/huggingface/datasets) to efficiently load and slice different modalities of data.
* `models` - Use common model implementations using [scikit-learn](https://scikit-learn.org/stable/) and [PyTorch](https://pytorch.org/).
* `tasks` - Use canonical Healthcare ML tasks such as
* Mortality prediction
* Chest X-ray classification
* `data` - Create datasets for training, inference and evaluation. We use the popular 🤗 [datasets](https://github.com/huggingface/datasets) to efficiently load and slice different modalities of data
* `models` - Use common model implementations using [scikit-learn](https://scikit-learn.org/stable/) and [PyTorch](https://pytorch.org/)
* `tasks` - Use common ML task formulations such as binary classification or multi-label classification on tabular, time-series and image data
* `evaluate` - Evaluate models on clinical prediction tasks
* `monitor` - Detect dataset shift relevant for clinical use cases
* `report` - Create [model report cards](https://vectorinstitute.github.io/cyclops/api/tutorials/nihcxr/nihcxr_report_periodic.html) for clinical ML models

``cyclops`` also provides a library of end-to-end use cases on clinical datasets such as
``cyclops`` also provides example end-to-end use case implementations on clinical datasets such as

* [MIMIC-III](https://physionet.org/content/mimiciii/1.4/)
* [NIH chest x-ray](https://www.nih.gov/news-events/news-releases/nih-clinical-center-provides-one-largest-publicly-available-chest-x-ray-datasets-scientific-community)
* [MIMIC-IV](https://physionet.org/content/mimiciv/2.0/)
* [eICU-CRD](https://eicu-crd.mit.edu/about/eicu/)


## 🐣 Getting Started
@@ -37,31 +34,15 @@
python3 -m pip install pycyclops
```

The base package installation supports the use of the `data` and `process` APIs to load
and transform clinical data, for downstream tasks.
The base cyclops installation doesn't include modelling packages.

To install additional functionality from the other APIs, they can be installed as extras.
To install additional dependencies for using models,


To install with `models`, `tasks`, `evaluate` and `monitor` API support,

```bash
python3 -m pip install 'pycyclops[models]'
```

To install with `report` API support,

```bash
python3 -m pip install 'pycyclops[report]'
```

Multiple extras could also be combined, for example to install with both `report` and
`models` support:

```bash
python3 -m pip install 'pycyclops[report,models]'
```
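
After installing, it can be useful to confirm which optional packages are actually importable in the current environment. The following is a minimal sketch, not part of this PR: the helper name `available_modules` is hypothetical, and the module names passed to it (`sklearn`, `torch`, `monai`) are assumed examples of what an extra might pull in.

```python
from importlib.util import find_spec
from typing import Dict, Iterable


def available_modules(modules: Iterable[str]) -> Dict[str, bool]:
    """Report which top-level modules can be imported in this environment.

    Only top-level names are supported here; dotted names would require the
    parent package to be importable first.
    """
    return {name: find_spec(name) is not None for name in modules}


# Assumed example names; adjust to the packages you expect an extra to install.
print(available_modules(["sklearn", "torch", "monai"]))
```

`find_spec` only locates the module and does not import it, so the check stays cheap even for heavy packages.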


## 🧑🏿‍💻 Developing

104 changes: 81 additions & 23 deletions cyclops/data/features/medical_image.py
@@ -1,11 +1,10 @@
"""Medical image feature."""

import logging
import os
import tempfile
from dataclasses import dataclass, field
from io import BytesIO
from typing import Any, ClassVar, Dict, Optional, Tuple, Union
from typing import TYPE_CHECKING, Any, ClassVar, Dict, Optional, Tuple, Union

import numpy as np
import numpy.typing as npt
@@ -15,18 +14,56 @@
from datasets.features import Image, features
from datasets.utils.file_utils import is_local_path
from datasets.utils.py_utils import string_to_dict
from monai.data.image_reader import ImageReader
from monai.data.image_writer import ITKWriter
from monai.transforms.compose import Compose
from monai.transforms.io.array import LoadImage
from monai.transforms.utility.array import ToNumpy

from cyclops.utils.log import setup_logging
from cyclops.utils.optional import import_optional_module


# Logging.
LOGGER = logging.getLogger(__name__)
setup_logging(print_level="INFO", logger=LOGGER)
if TYPE_CHECKING:
from monai.data.image_reader import ImageReader
from monai.data.image_writer import ITKWriter
from monai.transforms.compose import Compose
from monai.transforms.io.array import LoadImage
from monai.transforms.utility.array import ToNumpy

else:
ImageReader = import_optional_module(
"monai.data.image_reader",
attribute="ImageReader",
error="warn",
)
ITKWriter = import_optional_module(
"monai.data.image_writer",
attribute="ITKWriter",
error="warn",
)
Compose = import_optional_module(
"monai.transforms.compose",
attribute="Compose",
error="warn",
)
LoadImage = import_optional_module(
"monai.transforms.io.array",
attribute="LoadImage",
error="warn",
)
ToNumpy = import_optional_module(
"monai.transforms.utility.array",
attribute="ToNumpy",
error="warn",
)
_monai_available = all(
module is not None
for module in (
ImageReader,
ITKWriter,
Compose,
LoadImage,
ToNumpy,
)
)
_monai_unavailable_message = (
"The MONAI library is required to use the `MedicalImage` feature. "
"Please install it with `pip install monai`."
)


@dataclass
@@ -35,24 +72,35 @@

Parameters
----------
decode : bool, optional, default=True
Whether to decode the image. If False, the image will be returned as a
dictionary in the format `{"path": image_path, "bytes": image_bytes}`.
reader : Union[str, ImageReader], optional, default="ITKReader"
The MONAI image reader to use.
suffix : str, optional, default=".jpg"
The suffix to use when decoding bytes to image.
decode : bool, optional, default=True
Whether to decode the image. If False, the image will be returned as a
dictionary in the format `{"path": image_path, "bytes": image_bytes}`.
id : str, optional, default=None
The id of the feature.

"""

reader: Union[str, ImageReader] = "ITKReader"
suffix: str = ".jpg" # used when decoding/encoding bytes to image
_loader = Compose(
[
LoadImage(reader=reader, simple_keys=True, dtype=None, image_only=True),
ToNumpy(),
],
)

_loader = None
if _monai_available:
_loader = Compose(
[
LoadImage(
reader=reader,
simple_keys=True,
dtype=None,
image_only=False,
),
ToNumpy(),
],
)

# Automatically constructed
dtype: ClassVar[str] = "dict"
pa_type: ClassVar[Any] = pa.struct({"bytes": pa.binary(), "path": pa.string()})
@@ -76,12 +124,14 @@

"""
if isinstance(value, list):
value = np.array(value)
value = np.asarray(value)


if isinstance(value, str):
return {"path": value, "bytes": None}

if isinstance(value, np.ndarray):
return _encode_ndarray(value, image_format=self.suffix)

if "array" in value and "metadata" in value:
output_ext_ = self.suffix
metadata_ = value["metadata"]
@@ -132,7 +182,7 @@
if not self.decode:
raise RuntimeError(
"Decoding is disabled for this feature. "
"Please use MedicalImage(decode=True) instead.",
"Please use `MedicalImage(decode=True)` instead.",
)

if token_per_repo_id is None:
@@ -147,6 +197,8 @@
)

if is_local_path(path):
if self._loader is None:
raise RuntimeError(_monai_unavailable_message)

image, metadata = self._loader(path)
else:
source_url = path.split("::")[-1]
@@ -188,6 +240,9 @@
Image as numpy array and metadata as dictionary.

"""
if self._loader is None:
raise RuntimeError(_monai_unavailable_message)

# XXX: Can we avoid writing to disk?
with tempfile.NamedTemporaryFile(mode="wb", suffix=self.suffix) as fp:
fp.write(buffer.getvalue())
@@ -219,6 +274,9 @@
Dictionary containing the image bytes and path.

"""
if not _monai_available:
raise RuntimeError(_monai_unavailable_message)

if not image_format.startswith("."):
image_format = "." + image_format

@@ -240,5 +298,5 @@
return {"path": None, "bytes": temp_file_bytes}


# add the `MedicalImage` feature to the `features` module
# add the `MedicalImage` feature to the `features` module namespace
features.MedicalImage = MedicalImage
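
The changes in this file follow a pattern where the feature can always be constructed, and the missing dependency is only reported when a method that needs it is called. Below is a minimal, self-contained sketch of that pattern under stated assumptions: `hypothetical_imaging_backend`, `_try_import`, and `OptionalFeature` are illustrative names that do not appear in the PR, and the module name is chosen so that the import fails.

```python
import importlib
from typing import Any, Optional


def _try_import(name: str) -> Optional[Any]:
    """Return the imported module, or None when it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Hypothetical optional dependency; this name is assumed not to be installed.
_backend = _try_import("hypothetical_imaging_backend")
_backend_unavailable_message = (
    "The hypothetical_imaging_backend library is required to decode images."
)


class OptionalFeature:
    """Feature that can always be constructed but needs the backend to decode."""

    def __init__(self, suffix: str = ".jpg") -> None:
        self.suffix = suffix
        # Build the loader only when the backend imported successfully.
        self._loader = _backend.load if _backend is not None else None

    def decode(self, path: str) -> Any:
        # Fail at call time, not import time, with an actionable message.
        if self._loader is None:
            raise RuntimeError(_backend_unavailable_message)
        return self._loader(path)


feature = OptionalFeature()  # construction succeeds without the backend
try:
    feature.decode("scan.jpg")
except RuntimeError as err:
    print(err)
```

Deferring the error to the method call keeps imports of the surrounding package working for users who never touch the optional feature.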
Original file line number Diff line number Diff line change
@@ -33,7 +33,7 @@ class MedicalImageFolderConfig(
class MedicalImageFolder(folder_based_builder.FolderBasedBuilder): # type: ignore
"""MedicalImageFolder."""

BASE_FEATURE = MedicalImage()
BASE_FEATURE = MedicalImage
BASE_COLUMN_NAME = "image"
BUILDER_CONFIG_CLASS = MedicalImageFolderConfig
EXTENSIONS: List[str] # definition at the bottom of the script
108 changes: 82 additions & 26 deletions cyclops/data/transforms.py
@@ -1,28 +1,84 @@
"""Transforms for the datasets."""
from typing import Any, Callable, Tuple

from typing import Tuple

from monai.transforms import Lambdad, Resized, ToDeviced # type: ignore
from torchvision.transforms import Compose


def txrv_transforms(
keys: Tuple[str, ...] = ("features",),
device: str = "cpu",
) -> Compose:
"""Set of transforms for the models in the TXRV library."""
return Compose(
[
Resized(
keys=keys,
spatial_size=(1, 224, 224),
allow_missing_keys=True,
),
Lambdad(
keys=keys,
func=lambda x: ((2 * (x / 255.0)) - 1.0) * 1024,
allow_missing_keys=True,
),
ToDeviced(keys=keys, device=device, allow_missing_keys=True),
],
)
from torchvision.transforms import Lambda, Resize


# generic dictionary-based wrapper for any transform
class Dictd:
"""Generic dictionary-based wrapper for any transform."""

def __init__(
self,
transform: Callable[..., Any],
keys: Tuple[str, ...],
allow_missing_keys: bool = False,
):
self.transform = transform
self.keys = keys
self.allow_missing_keys = allow_missing_keys

def __call__(self, data: Any) -> Any:
"""Apply the transform to the data."""
for key in self.keys:
if self.allow_missing_keys and key not in data:
continue
data[key] = self.transform(data[key])
return data

def __repr__(self) -> str:
"""Return a string representation of the transform."""
return (

f"{self.__class__.__name__}(transform={self.transform}, "
f"keys={self.keys}, allow_missing_keys={self.allow_missing_keys})"
)


# dictionary-based wrapper of Lambda transform using Dictd
class Lambdad:
"""Dictionary-based wrapper of Lambda transform using Dictd."""

def __init__(
self,
func: Callable[..., Any],
keys: Tuple[str, ...],
allow_missing_keys: bool = False,
):
self.transform = Dictd(

transform=Lambda(func),
keys=keys,
allow_missing_keys=allow_missing_keys,
)

def __call__(self, data: Any) -> Any:
"""Apply the transform to the data."""
return self.transform(data)

def __repr__(self) -> str:
"""Return a string representation of the transform."""
return f"{self.__class__.__name__}(keys={self.transform.keys}, allow_missing_keys={self.transform.allow_missing_keys})"


# dictionary-based wrapper of Resize transform using Dictd
class Resized:
"""Dictionary-based wrapper of Resize transform using Dictd."""

def __init__(
self,
spatial_size: Tuple[int, int],
keys: Tuple[str, ...],
allow_missing_keys: bool = False,
):
self.transform = Dictd(

transform=Resize(size=spatial_size),
keys=keys,
allow_missing_keys=allow_missing_keys,
)

def __call__(self, data: Any) -> Any:
"""Apply the transform to the data."""
return self.transform(data)

def __repr__(self) -> str:
"""Return a string representation of the transform."""
return f"{self.__class__.__name__}(keys={self.transform.keys}, allow_missing_keys={self.transform.allow_missing_keys})"
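
To illustrate how the dictionary wrapper behaves, here is a small sketch that re-declares a trimmed copy of `Dictd` from the diff above and applies the scaling formula from the removed `txrv_transforms` to a plain float. The function name `txrv_scale` and the sample dictionary are assumptions made for the example. A plain function stands in for a torchvision transform so the snippet runs without extra dependencies.

```python
from typing import Any, Callable, Dict, Tuple


class Dictd:
    """Trimmed copy of the wrapper from the diff, for illustration only."""

    def __init__(
        self,
        transform: Callable[..., Any],
        keys: Tuple[str, ...],
        allow_missing_keys: bool = False,
    ) -> None:
        self.transform = transform
        self.keys = keys
        self.allow_missing_keys = allow_missing_keys

    def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
        for key in self.keys:
            # Skip absent keys only when explicitly allowed.
            if self.allow_missing_keys and key not in data:
                continue
            data[key] = self.transform(data[key])
        return data


def txrv_scale(x: float) -> float:
    """Map [0, 255] to [-1024, 1024], the formula from the removed transform."""
    return ((2 * (x / 255.0)) - 1.0) * 1024


scale = Dictd(txrv_scale, keys=("features", "mask"), allow_missing_keys=True)
sample = {"features": 255.0, "label": 1}
result = scale(sample)  # "mask" is absent and skipped; "label" is untouched
print(result)
```

Note that the wrapper mutates and returns the same dictionary it receives. With `allow_missing_keys=False`, a missing key surfaces as a `KeyError` from the dictionary lookup.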

Original file line number Diff line number Diff line change
@@ -15,7 +15,7 @@
if TYPE_CHECKING:
from mpi4py import MPI
else:
MPI = import_optional_module("mpi4py.MPI", error="ignore")
MPI = import_optional_module("mpi4py.MPI", error="warn")
# mypy: disable-error-code="no-any-return"
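
The body of `import_optional_module` is not shown in this diff, so the following is only a sketch of how a helper with the same call shape (`attribute=` and `error="warn"` or `"ignore"`) could behave. The name `import_optional_sketch`, the `"raise"` default, and the warning text are assumptions, not the project's implementation.

```python
import importlib
import warnings
from typing import Any, Optional


def import_optional_sketch(
    name: str,
    attribute: Optional[str] = None,
    error: str = "raise",
) -> Optional[Any]:
    """Import a module (or one attribute of it), tolerating its absence.

    error="raise" re-raises the failure, "warn" emits a warning and returns
    None, and "ignore" silently returns None.
    """
    if error not in ("raise", "warn", "ignore"):
        raise ValueError(f"Unknown error mode: {error!r}")
    try:
        module = importlib.import_module(name)
        return getattr(module, attribute) if attribute is not None else module
    except (ImportError, AttributeError) as exc:
        if error == "raise":
            raise
        if error == "warn":
            warnings.warn(
                f"Optional import of '{name}' failed: {exc}",
                stacklevel=2,
            )
        return None


# An installed module resolves normally; a missing one yields None.
sqrt = import_optional_sketch("math", attribute="sqrt")
missing = import_optional_sketch("hypothetical_missing_pkg", error="ignore")
print(sqrt(9.0), missing)
```

Switching a call site from `"ignore"` to `"warn"`, as this hunk does, keeps the same return value but makes the missing dependency visible to the user.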

