Add support for WSI-level classification (#542)
* Add base class for WSIs & `openslide` implementation (#365)
* Add WSI dataset classes (#368)
* Add baseline `panda` workflow (#373)
* add panda config
* adjust batch size
* addressed comments
* addressed comments
* addressed comments
* Add support for grid sampling to `WsiDataset` (#377)
* Replaced cached_property in WsiDataset by LRU cache (#388)
* Updated `EmbeddingsWriter` to support multi-embedding file outputs (#384)
* Simple baseline sampler for foreground patches (#394)
* add foreground kaggle
* added random foreground sampler kaggle
* refactored sampler
* moved get_mask
* typo
* refactor sampler
* refactor sampler
* addressed comments
* addressed comments
* Fixed linting in WSI feature branch (#407)
* added openslide-python to all dependencies
* Fix input batch class name (#414)
* Add lower bound for wsi resolution level during mask generation (#412)
* Move sampler logic to `samplers` module and add unit tests (#420)
* Add `WsiClassificationDataset` (#429)
* Retrieve MPP from WSIs (#432)
* add mpp conversion
* formatting
* addressed comments
* addressed comments
* addressed comments
* formatting
* formatting
* formatting
* updated panda config (#437)
* update WSI foreground segmentation algorithm (#456)
* Add `PANDA` dataset class (#430)
* fixed panda (#463)
* Update `EmbeddingsWriter` to store tensors as lists (#465)
* add tiffslide backend (#468)
* added panda cli unit tests (#470)
* move wsi initialization to setup method of `MultiWsiDataset` (#472)
* 459 add camelyon16 slide level task (#476)
* added panda dataset class
* clean up
* remove samples with noisy labels
* clean up table in dataset readme
* added function for stratified splits
* added unit tests
* cleanup
* addressed comments
* fixed issue with resource download
* validation fix
* updated readme
* added to mkdocs
* added image_dir to exception print
* updated root path in yaml config
* added panda to datasets overview table in docs
* added md5 hash for downloaded resources
* update init
* added camelyon16
* added camelyon16
* updated camelyon16 class
* added tests and config
* formatting
* formatting
* formatting
* formatting
* added test files
* formatting
* lint
* added target transforms
* formatting
* fixed dataset
* addressed comments
* addressed comments
* fix test
* fix test
* fixed test
* addressed comments
* updated loss
* fix annotations
* lint

---------

Co-authored-by: Nicolas Kaenzig <nkaenzig@gmail.com>

* 475 define slide level evaluation protocol (#511)
* updated configs
* adjust patience
* addressed comments
* fixed typo
* remove prefetch factor
* update 360-aggregated-feature before PR to main (#527)
* Updated developer guide (#418)
* Update `TotalSegmentator2D` dataset to fetch all the slices (#416)
* Move metrics to CPU when using single device (#446)
* Remove total segmentator classification dataset (#450)
* updated eva logo (#454)
* updated eva logo
* renamed files
* Update actions/checkout digest to a5ac7e5 (#458)
  Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* Add configuration logger (#466)
* Update `README` with paper citation (#474)
* update docs (#482)
* Update img shields of README (#480)
* Fix `torch` and `jsonargparse` versions (#483)
* update depedencies
* update
* bump micro version (#486)
* update config links (#487)
* Update paper citation format (#489)
* Update the vision dataset return types to `tv_tensors` (#478)
* Refactor embeddings writer (#461)
* fixed phikon configs (#493)
* Refactor embeddings datasets (#495)
* Add doc tests and minor fixes (#492)
* support setting download as env-variable (#514)
* updated confis and doc
* typo
* update datasets
* fixed types
* src/eva/core/callbacks/writers/embeddings/base.py
* formatting
* types

---------

Co-authored-by: Nicolas Känzig <36882833+nkaenzig@users.noreply.github.com>
Co-authored-by: ioangatop <johngatop@gmail.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Updated documentation with new datasets and leaderboard (#531)
* updated layout
* updated layout
* addressed comment
* 532 update leaderboard results with slide level tasks (#538)
* updated docs
* update leaderboard
* update docs and links
* updated configs (#539)
* updated leaderboard (#543)

---------

Co-authored-by: Nicolas Känzig <36882833+nkaenzig@users.noreply.github.com>
Co-authored-by: Nicolas Kaenzig <nkaenzig@gmail.com>
Co-authored-by: ioangatop <johngatop@gmail.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
1 parent 5752a24, commit 20da644
Showing 107 changed files with 3,627 additions and 235 deletions.
@@ -0,0 +1,134 @@
---
trainer:
  class_path: eva.Trainer
  init_args:
    n_runs: &N_RUNS ${oc.env:N_RUNS, 5}
    default_root_dir: &OUTPUT_ROOT ${oc.env:OUTPUT_ROOT, logs/${oc.env:DINO_BACKBONE, dino_vits16}/offline/camelyon16}
    max_epochs: &MAX_EPOCHS ${oc.env:MAX_EPOCHS, 100}
    callbacks:
      - class_path: lightning.pytorch.callbacks.LearningRateMonitor
        init_args:
          logging_interval: epoch
      - class_path: lightning.pytorch.callbacks.ModelCheckpoint
        init_args:
          filename: best
          save_last: true
          save_top_k: 1
          monitor: &MONITOR_METRIC ${oc.env:MONITOR_METRIC, val/BinaryAccuracy}
          mode: &MONITOR_METRIC_MODE ${oc.env:MONITOR_METRIC_MODE, max}
      - class_path: lightning.pytorch.callbacks.EarlyStopping
        init_args:
          min_delta: 0
          patience: ${oc.env:PATIENCE, 10}
          monitor: *MONITOR_METRIC
          mode: *MONITOR_METRIC_MODE
      - class_path: eva.callbacks.ClassificationEmbeddingsWriter
        init_args:
          output_dir: &DATASET_EMBEDDINGS_ROOT ${oc.env:EMBEDDINGS_ROOT, ./data/embeddings/${oc.env:DINO_BACKBONE, dino_vits16}/camelyon16}
          save_every_n: 10_000
          dataloader_idx_map:
            0: train
            1: val
            2: test
          metadata_keys: ["wsi_id"]
          backbone:
            class_path: eva.models.ModelFromFunction
            init_args:
              path: torch.hub.load
              arguments:
                repo_or_dir: ${oc.env:REPO_OR_DIR, facebookresearch/dino:main}
                model: ${oc.env:DINO_BACKBONE, dino_vits16}
                pretrained: ${oc.env:PRETRAINED, true}
                force_reload: ${oc.env:FORCE_RELOAD, false}
              checkpoint_path: ${oc.env:CHECKPOINT_PATH, null}
    logger:
      - class_path: lightning.pytorch.loggers.TensorBoardLogger
        init_args:
          save_dir: *OUTPUT_ROOT
          name: ""
model:
  class_path: eva.HeadModule
  init_args:
    head:
      class_path: eva.vision.models.networks.ABMIL
      init_args:
        input_size: ${oc.env:IN_FEATURES, 384}
        output_size: &NUM_CLASSES 1
        projected_input_size: 128
    criterion: torch.nn.BCEWithLogitsLoss
    optimizer:
      class_path: torch.optim.AdamW
      init_args:
        lr: ${oc.env:LR_VALUE, 0.001}
        betas: [0.9, 0.999]
    lr_scheduler:
      class_path: torch.optim.lr_scheduler.CosineAnnealingLR
      init_args:
        T_max: *MAX_EPOCHS
        eta_min: 0.0
    metrics:
      common:
        - class_path: eva.metrics.AverageLoss
        - class_path: eva.metrics.BinaryClassificationMetrics
data:
  class_path: eva.DataModule
  init_args:
    datasets:
      train:
        class_path: eva.datasets.MultiEmbeddingsClassificationDataset
        init_args: &DATASET_ARGS
          root: *DATASET_EMBEDDINGS_ROOT
          manifest_file: manifest.csv
          split: train
          embeddings_transforms:
            class_path: eva.core.data.transforms.Pad2DTensor
            init_args:
              pad_size: 10_000
          target_transforms:
            class_path: eva.core.data.transforms.dtype.ArrayToFloatTensor
      val:
        class_path: eva.datasets.MultiEmbeddingsClassificationDataset
        init_args:
          <<: *DATASET_ARGS
          split: val
      test:
        class_path: eva.datasets.MultiEmbeddingsClassificationDataset
        init_args:
          <<: *DATASET_ARGS
          split: test
      predict:
        - class_path: eva.vision.datasets.Camelyon16
          init_args: &PREDICT_DATASET_ARGS
            root: ${oc.env:DATA_ROOT, ./data/camelyon16}
            sampler:
              class_path: eva.vision.data.wsi.patching.samplers.ForegroundGridSampler
              init_args:
                max_samples: 10_000
            width: 224
            height: 224
            target_mpp: 0.25
            split: train
            image_transforms:
              class_path: eva.vision.data.transforms.common.ResizeAndCrop
              init_args:
                size: ${oc.env:RESIZE_DIM, 224}
                mean: ${oc.env:NORMALIZE_MEAN, [0.485, 0.456, 0.406]}
                std: ${oc.env:NORMALIZE_STD, [0.229, 0.224, 0.225]}
        - class_path: eva.vision.datasets.Camelyon16
          init_args:
            <<: *PREDICT_DATASET_ARGS
            split: val
        - class_path: eva.vision.datasets.Camelyon16
          init_args:
            <<: *PREDICT_DATASET_ARGS
            split: test
    dataloaders:
      train:
        batch_size: &BATCH_SIZE ${oc.env:BATCH_SIZE, 32}
        shuffle: true
      val:
        batch_size: *BATCH_SIZE
      test:
        batch_size: *BATCH_SIZE
      predict:
        batch_size: &PREDICT_BATCH_SIZE ${oc.env:PREDICT_BATCH_SIZE, 64}
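
The config above relies on placeholders of the form `${oc.env:NAME, default}`, including nested ones in `default_root_dir` and `output_dir`. These look like OmegaConf's `oc.env` resolver. The sketch below is a standard-library emulation of that lookup-with-fallback behaviour, not OmegaConf's implementation; the helper name `resolve_env` and its `environ` parameter are hypothetical, and values come back as plain strings with no type conversion.

```python
import os
import re

# Matches an innermost placeholder: ${oc.env:NAME} or ${oc.env:NAME, default}.
# The default may not contain '$', '{' or '}', so nested placeholders are
# resolved from the inside out over repeated passes.
_ENV_PATTERN = re.compile(
    r"\$\{oc\.env:\s*([A-Za-z_][A-Za-z0-9_]*)\s*(?:,\s*([^${}]*?))?\s*\}"
)


def resolve_env(text, environ=None):
    """Replace every ${oc.env:NAME, default} placeholder in `text` (illustrative helper)."""
    environ = os.environ if environ is None else environ

    def _substitute(match):
        name, default = match.group(1), match.group(2)
        if name in environ:
            return environ[name]
        if default is None:
            raise KeyError(f"Environment variable '{name}' is not set and has no default")
        return default

    # Repeat until nothing changes, so outer placeholders see resolved inner ones.
    previous = None
    while previous != text:
        previous = text
        text = _ENV_PATTERN.sub(_substitute, text)
    return text


if __name__ == "__main__":
    template = "${oc.env:OUTPUT_ROOT, logs/${oc.env:DINO_BACKBONE, dino_vits16}/offline/camelyon16}"
    print(resolve_env(template, {}))
    print(resolve_env(template, {"DINO_BACKBONE": "dino_vitb8"}))
    print(resolve_env(template, {"OUTPUT_ROOT": "/tmp/out"}))
```

Under this emulation, setting `DINO_BACKBONE` changes only the inner path segment, while setting `OUTPUT_ROOT` replaces the whole path.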
@@ -0,0 +1,133 @@
---
trainer:
  class_path: eva.Trainer
  init_args:
    n_runs: &N_RUNS ${oc.env:N_RUNS, 5}
    default_root_dir: &OUTPUT_ROOT ${oc.env:OUTPUT_ROOT, logs/${oc.env:DINO_BACKBONE, dino_vits16}/offline/panda}
    max_epochs: &MAX_EPOCHS ${oc.env:MAX_EPOCHS, 49}
    callbacks:
      - class_path: lightning.pytorch.callbacks.LearningRateMonitor
        init_args:
          logging_interval: epoch
      - class_path: lightning.pytorch.callbacks.ModelCheckpoint
        init_args:
          filename: best
          save_last: true
          save_top_k: 1
          monitor: &MONITOR_METRIC ${oc.env:MONITOR_METRIC, val/MulticlassAccuracy}
          mode: &MONITOR_METRIC_MODE ${oc.env:MONITOR_METRIC_MODE, max}
      - class_path: lightning.pytorch.callbacks.EarlyStopping
        init_args:
          min_delta: 0
          patience: ${oc.env:PATIENCE, 8}
          monitor: *MONITOR_METRIC
          mode: *MONITOR_METRIC_MODE
      - class_path: eva.callbacks.ClassificationEmbeddingsWriter
        init_args:
          output_dir: &DATASET_EMBEDDINGS_ROOT ${oc.env:EMBEDDINGS_ROOT, ./data/embeddings/${oc.env:DINO_BACKBONE, dino_vits16}/panda}
          dataloader_idx_map:
            0: train
            1: val
            2: test
          metadata_keys: ["wsi_id"]
          backbone:
            class_path: eva.models.ModelFromFunction
            init_args:
              path: torch.hub.load
              arguments:
                repo_or_dir: ${oc.env:REPO_OR_DIR, facebookresearch/dino:main}
                model: ${oc.env:DINO_BACKBONE, dino_vits16}
                pretrained: ${oc.env:PRETRAINED, true}
                force_reload: ${oc.env:FORCE_RELOAD, false}
              checkpoint_path: ${oc.env:CHECKPOINT_PATH, null}
    logger:
      - class_path: lightning.pytorch.loggers.TensorBoardLogger
        init_args:
          save_dir: *OUTPUT_ROOT
          name: ""
model:
  class_path: eva.HeadModule
  init_args:
    head:
      class_path: eva.vision.models.networks.ABMIL
      init_args:
        input_size: ${oc.env:IN_FEATURES, 384}
        output_size: &NUM_CLASSES 6
        projected_input_size: 128
    criterion: torch.nn.CrossEntropyLoss
    optimizer:
      class_path: torch.optim.AdamW
      init_args:
        lr: ${oc.env:LR_VALUE, 0.001}
        betas: [0.9, 0.999]
    lr_scheduler:
      class_path: torch.optim.lr_scheduler.CosineAnnealingLR
      init_args:
        T_max: *MAX_EPOCHS
        eta_min: 0.0
    metrics:
      common:
        - class_path: eva.metrics.AverageLoss
        - class_path: eva.metrics.MulticlassClassificationMetrics
          init_args:
            num_classes: *NUM_CLASSES
data:
  class_path: eva.DataModule
  init_args:
    datasets:
      train:
        class_path: eva.datasets.MultiEmbeddingsClassificationDataset
        init_args: &DATASET_ARGS
          root: *DATASET_EMBEDDINGS_ROOT
          manifest_file: manifest.csv
          split: train
          embeddings_transforms:
            class_path: eva.core.data.transforms.Pad2DTensor
            init_args:
              pad_size: &N_PATCHES 1000
      val:
        class_path: eva.datasets.MultiEmbeddingsClassificationDataset
        init_args:
          <<: *DATASET_ARGS
          split: val
      test:
        class_path: eva.datasets.MultiEmbeddingsClassificationDataset
        init_args:
          <<: *DATASET_ARGS
          split: test
      predict:
        - class_path: eva.vision.datasets.PANDA
          init_args: &PREDICT_DATASET_ARGS
            root: ${oc.env:DATA_ROOT, ./data/panda/prostate-cancer-grade-assessment}
            sampler:
              class_path: eva.vision.data.wsi.patching.samplers.ForegroundGridSampler
              init_args:
                max_samples: *N_PATCHES
            width: 224
            height: 224
            target_mpp: 0.5
            split: train
            image_transforms:
              class_path: eva.vision.data.transforms.common.ResizeAndCrop
              init_args:
                size: ${oc.env:RESIZE_DIM, 224}
                mean: ${oc.env:NORMALIZE_MEAN, [0.485, 0.456, 0.406]}
                std: ${oc.env:NORMALIZE_STD, [0.229, 0.224, 0.225]}
        - class_path: eva.vision.datasets.PANDA
          init_args:
            <<: *PREDICT_DATASET_ARGS
            split: val
        - class_path: eva.vision.datasets.PANDA
          init_args:
            <<: *PREDICT_DATASET_ARGS
            split: test
    dataloaders:
      train:
        batch_size: &BATCH_SIZE ${oc.env:BATCH_SIZE, 32}
        shuffle: true
      val:
        batch_size: *BATCH_SIZE
      test:
        batch_size: *BATCH_SIZE
      predict:
        batch_size: &PREDICT_BATCH_SIZE ${oc.env:PREDICT_BATCH_SIZE, 64}
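
In the PANDA config above, `pad_size` and `max_samples` share the `N_PATCHES` anchor, so no slide contributes more patches than the padded bag of embeddings can hold. The sketch below illustrates one plausible way a grid sampler could place 224x224 patches at a target resolution and cap their number. It is not eva's `ForegroundGridSampler`: the function `sample_grid`, its parameters `base_mpp`, `is_foreground` and `seed`, and the random subsampling step are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Optional, Tuple


def sample_grid(
    slide_size: Tuple[int, int],
    patch_size: Tuple[int, int] = (224, 224),
    target_mpp: float = 0.5,
    base_mpp: float = 0.25,
    max_samples: int = 1000,
    is_foreground: Optional[Callable[[int, int], bool]] = None,
    seed: int = 0,
) -> List[Tuple[int, int]]:
    """Return up to `max_samples` top-left (x, y) patch corners in level-0 pixels."""
    slide_width, slide_height = slide_size

    # A patch of N pixels at `target_mpp` microns per pixel spans
    # N * target_mpp microns, i.e. N * target_mpp / base_mpp level-0 pixels.
    scale = target_mpp / base_mpp
    step_x = max(1, round(patch_size[0] * scale))
    step_y = max(1, round(patch_size[1] * scale))

    # Non-overlapping grid; patches that would cross the slide border are skipped.
    coords = [
        (x, y)
        for y in range(0, slide_height - step_y + 1, step_y)
        for x in range(0, slide_width - step_x + 1, step_x)
        if is_foreground is None or is_foreground(x, y)
    ]

    # Cap the number of patches so a bag never exceeds the padded size.
    if len(coords) > max_samples:
        coords = random.Random(seed).sample(coords, max_samples)
    return coords


if __name__ == "__main__":
    coords = sample_grid((2000, 1000))
    print(len(coords), coords[:3])
```

With a hypothetical level-0 resolution of 0.25 microns per pixel and `target_mpp: 0.5`, each 224-pixel patch covers 448 level-0 pixels, which is why the grid step in this sketch is twice the patch width.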