#271 Classification Evaluation #290

Merged
2 commits merged into drivendataorg:master on Jan 23, 2018

Conversation

WGierke (Contributor) commented Jan 19, 2018

I'm currently adding the ability to evaluate a classification model on the LIDC dataset, as described in #271. I'm also benchmarking the model that's implemented at the moment.
Even though I haven't finished yet, I wanted to show you what I'm currently working on.

CLA

  • I have signed the CLA; if other committers are in the commit history, they have signed the CLA as well
nodule_list = []
for annotation in scan.annotations:
    centroid_x, centroid_y, centroid_z = annotation.centroid()
    # convert the centroid's world-space z coordinate into a slice index,
    # assuming uniform slice spacing throughout the scan
    z_index = int((centroid_z - min_z) / scan.slice_thickness)

WGierke (Contributor), Jan 19, 2018:

This step should maybe be moved to another location where it can also be tested separately. Computing the slice index from the relative image positions still feels a bit hacky.

WGierke (Contributor), Jan 20, 2018:

@Serhiy-Shekhovtsov @vessemer Do you have experience with deriving the slice index from the ImagePosition information? The DICOM images carry ImagePosition information with z values ranging from e.g. -435 to -63. The nodule has a z value of -252, so we have to calculate the slice index (here: 61) from that. Do you know whether this is the right way to do it, or whether we are already doing this somewhere?
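
One robust way to do this (a hypothetical helper for illustration, not code from this PR) is to pick the slice whose ImagePosition z value lies closest to the nodule's z value; unlike dividing by scan.slice_thickness, this also works when the slice spacing is not perfectly uniform:

    import numpy as np

    def slice_index_for_z(z_world, slice_positions):
        """Index of the slice whose ImagePosition z value is closest to
        z_world; slice_positions holds the z coordinates taken from the
        series' ImagePositionPatient tags, in slice order."""
        return int(np.argmin(np.abs(np.asarray(slice_positions) - z_world)))

    # e.g. 125 slices from -435 to -63 (3 mm apart): the nodule at
    # z = -252 maps to slice 61, matching the value quoted above
    assert slice_index_for_z(-252, np.linspace(-435, -63, 125)) == 61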

Serhiy-Shekhovtsov (Contributor), Jan 20, 2018:

Is this what you are looking for?

Serhiy-Shekhovtsov (Contributor), Jan 20, 2018:

The offset is then used here to convert the coordinates:

    coord = np.rint((coord - np.array(origin)) * np.array(spacing))

vessemer (Contributor), Jan 25, 2018:

@WGierke, @Serhiy-Shekhovtsov, sorry, I missed that; it should be np.rint((coord - np.array(origin)) / np.array(spacing)). Also, I didn't get why you removed this line, @Serhiy-Shekhovtsov; I must be missing something :/
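
For illustration, a minimal sketch of the corrected conversion (the origin and the 3 mm spacing are assumptions inferred from the numbers quoted earlier in this thread):

    import numpy as np

    def world_to_voxel(coord, origin, spacing):
        # millimetres from the scan origin, divided by millimetres per
        # voxel, gives the (rounded) voxel index along each axis
        return np.rint((np.asarray(coord) - np.asarray(origin)) / np.asarray(spacing))

    # a z origin of -435 and 3 mm slices map the nodule at z = -252 to
    # slice (-252 - (-435)) / 3 = 61, as computed above
    print(world_to_voxel([-252.0], [-435.0], [3.0]))  # [ 61.]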

WGierke (Contributor) commented Jan 20, 2018

The current implementation takes 8.5 hours to run and gives an average accuracy/precision of 10% and an average loss of 3.55 :/

    return -np.log(1 - p)


def evaluate_classification(model_path=None):

vessemer (Contributor), Jan 20, 2018:

Would it be better to make the evaluation functionality executable from the console, with appropriate arguments for the metrics and directories? I mean that this is a bit outside of typical user usage.
I'd also suggest putting all the metrics in a separate file.
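
A sketch of what that console entry point could look like (the module path, script name, and argument names are illustrative, not from this PR):

    import argparse

    from evaluation import evaluate_classification  # hypothetical import path

    def main():
        parser = argparse.ArgumentParser(
            description='Evaluate a classification model on the LIDC dataset.')
        parser.add_argument('--model-path', default=None,
                            help='path to the trained classification model')
        args = parser.parse_args()
        evaluate_classification(model_path=args.model_path)

    if __name__ == '__main__':
        main()

Further options for the metrics and data directories could be added as extra arguments in the same way.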

CONFIDENCE_THRESHOLD = 0.5


def get_accuracy(TP, TN, FP, FN):

reubano (Contributor), Jan 22, 2018:

would you be able to use more descriptive variable names?

WGierke (Contributor), Jan 22, 2018:

Sure. Do you maybe see any major flaws in the logic implemented in this file? I'm just asking because an accuracy of 10% seems pretty low to me.
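
For reference, the renaming reubano suggests might look like this (a sketch using the standard confusion-matrix definitions; get_precision is illustrative):

    def get_accuracy(true_positives, true_negatives, false_positives, false_negatives):
        """Fraction of all predictions that were correct."""
        correct = true_positives + true_negatives
        total = correct + false_positives + false_negatives
        return correct / total

    def get_precision(true_positives, false_positives):
        """Fraction of positive predictions that were actually positive."""
        return true_positives / (true_positives + false_positives)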

@WGierke WGierke force-pushed the WGierke:271_pipeline_classification branch from 94b5d81 to 1af3e57 Jan 22, 2018

@WGierke WGierke changed the title from "[WIP] Classification Evaluation" to "#271 Classification Evaluation" on Jan 22, 2018

reubano (Contributor) commented Jan 23, 2018

It looks good, and no, I don't see any major flaws. If someone with more experience in applying these metrics to CT scans has suggestions, he/she can always improve this with future PRs. It also seems like the sys.path.insert... isn't making Travis happy. I'll see what can be done about that.

reubano (Contributor) commented Jan 23, 2018

This is what I got...

src/algorithms/evaluation/evaluation.py

import os

import numpy as np
import pylidc as pl

try:
    from ....config import Config  # imported as part of the package
except ValueError:
    from config import Config  # imported when run from the project root

...

src/tests/test_evaluate_classification.py

from ..algorithms.evaluation.evaluation import evaluate_classification


def test_evaluate_classification(model_path=None):
    assert evaluate_classification(model_path)

run with docker-compose -f local.yml run prediction pytest src/tests/test_evaluate_classification.py

I believe you'll also have to add tqdm to the requirements and fix whatever remaining style errors you may see via flake8 prediction

WGierke (Contributor) commented Jan 23, 2018

@reubano Done. Thanks!

lamby (Contributor) commented Jan 23, 2018

Awesome :)

@lamby lamby merged commit 57452ab into drivendataorg:master Jan 23, 2018

2 checks passed:

concept-to-clinic/cla: @WGierke has signed the CLA.
continuous-integration/travis-ci/pr: The Travis CI build passed.

@vessemer vessemer referenced this pull request Jan 23, 2018: Nodules augmentation #294 (open, 0 of 1 tasks complete)

@WGierke WGierke referenced this pull request Jan 25, 2018: #271 Add segmentation evaluation #299 (merged, 1 of 1 tasks complete)