
#271 Classification Evaluation #290

Merged

merged 2 commits into drivendataorg:master from WGierke:271_pipeline_classification on Jan 23, 2018

Conversation

@WGierke (Contributor) commented Jan 19, 2018

I'm adding the possibility to evaluate a classification model on the LIDC dataset, as described in #271. I'm also benchmarking the currently implemented model. Even though I haven't finished yet, I wanted to show you what I'm working on.

CLA

  • I have signed the CLA; if other committers are in the commit history, they have signed the CLA as well
nodule_list = []
for annotation in scan.annotations:
    centroid_x, centroid_y, centroid_z = annotation.centroid()
    z_index = int((centroid_z - min_z) / scan.slice_thickness)

@WGierke (Author, Contributor) commented Jan 19, 2018
This step should maybe be moved to another location where it can also be tested separately. Computing the slice index from the relative image positions still feels a bit hacky.

@WGierke (Author, Contributor) commented Jan 20, 2018

@Serhiy-Shekhovtsov @vessemer Do you have experience with getting the slice index from the ImagePosition information? The DICOM images have ImagePosition values (z axis) ranging from e.g. -435 to -63. The nodule has a value of -252, so we need to compute the slice index (here: 61) from that. Do you know whether that's the right way, or whether we're already doing this somewhere?
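For illustration, here is the arithmetic from the numbers above as a hypothetical sketch; the slice thickness of 3.0 mm is inferred from the example values, not taken from the PR:

min_z = -435.0         # ImagePosition z of the first slice
centroid_z = -252.0    # ImagePosition z of the nodule centroid
slice_thickness = 3.0  # mm between slices (assumed so the example works out)

# Distance from the first slice, divided by the slice spacing,
# gives the slice index: (-252 - (-435)) / 3.0 = 61.0
z_index = int((centroid_z - min_z) / slice_thickness)
assert z_index == 61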

@Serhiy-Shekhovtsov (Contributor) commented Jan 20, 2018

Is this what you are looking for?

@Serhiy-Shekhovtsov (Contributor) commented Jan 20, 2018

The offset is then used here to convert the coordinates:

    coord = np.rint((coord - np.array(origin)) * np.array(spacing))

@vessemer (Contributor) commented Jan 25, 2018

@WGierke, @Serhiy-Shekhovtsov, sorry, I missed that; it should be np.rint((coord - np.array(origin)) / np.array(spacing)). Also, I didn't get why you removed this line, @Serhiy-Shekhovtsov; I must be missing something :/
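A minimal sketch of the corrected world-to-voxel conversion described above, with illustrative origin and spacing values (real values come from the scan metadata):

import numpy as np

coord = np.array([-252.0, 120.0, 150.0])  # world coordinate in mm (z, y, x)
origin = np.array([-435.0, 0.0, 0.0])     # world position of voxel (0, 0, 0)
spacing = np.array([3.0, 0.7, 0.7])       # mm per voxel along each axis

# Dividing by the spacing (rather than multiplying) maps millimeters
# to voxel indices.
voxel = np.rint((coord - origin) / spacing).astype(int)
assert list(voxel) == [61, 171, 214]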

@WGierke (Author, Contributor) commented Jan 20, 2018

The current implementation takes 8.5 hours to run, gives an average accuracy/precision of 10%, and an average loss of 3.55 :/

return -np.log(1 - p)


def evaluate_classification(model_path=None):
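For context, the return statement in the snippet above matches the negative-label branch of the binary log loss. A minimal sketch of the full per-sample loss, assuming p is the predicted probability of the positive class:

import numpy as np

def log_loss(y_true, p, eps=1e-15):
    """Binary cross-entropy for one prediction; y_true is 0 or 1."""
    p = np.clip(p, eps, 1 - eps)  # keep the logarithm finite
    if y_true == 1:
        return -np.log(p)
    return -np.log(1 - p)  # the branch shown in the diff above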

@vessemer (Contributor) commented Jan 20, 2018

Would it be better to make the evaluation functionality executable from the console, with appropriate arguments for metrics and directories? I mean that this is a bit outside of normal user usage.
I'd also suggest putting all the metrics in a separate file.
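A minimal sketch of such a console entry point, assuming the evaluate_classification signature from this PR (the flag names and import path are illustrative):

import argparse

from evaluation import evaluate_classification  # illustrative import path

def main():
    parser = argparse.ArgumentParser(
        description='Evaluate the classification model on the LIDC dataset.')
    parser.add_argument('--model-path', default=None,
                        help='path to the trained model to evaluate')
    args = parser.parse_args()
    evaluate_classification(model_path=args.model_path)

if __name__ == '__main__':
    main()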

CONFIDENCE_THRESHOLD = 0.5


def get_accuracy(TP, TN, FP, FN):

@reubano (Contributor) commented Jan 22, 2018

Would you be able to use more descriptive variable names?
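For illustration, here is what get_accuracy might look like with descriptive names (a sketch, not the PR's final code):

def get_accuracy(true_positives, true_negatives, false_positives, false_negatives):
    """Fraction of all predictions that were correct."""
    correct = true_positives + true_negatives
    total = correct + false_positives + false_negatives
    return correct / total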

@WGierke (Author, Contributor) commented Jan 22, 2018

Sure. Do you maybe see any major flaws in the logic implemented in the whole file? I'm just asking because an accuracy of 10% seems pretty low to me.

@WGierke force-pushed the WGierke:271_pipeline_classification branch from 94b5d81 to 1af3e57 on Jan 22, 2018
@WGierke changed the title from "[WIP] Classification Evaluation" to "#271 Classification Evaluation" on Jan 22, 2018
@reubano (Contributor) commented Jan 23, 2018

It looks good, and no, I don't see any major flaws. If someone with more experience in applying these metrics to CT scans has suggestions, he/she can always improve this with future PRs. It also seems like the sys.path.insert... isn't making Travis happy. I'll see what can be done about that.

@reubano (Contributor) commented Jan 23, 2018

This is what I got...

src/algorithms/evaluation/evaluation.py

import os

import numpy as np
import pylidc as pl

try:
    from ....config import Config
except ValueError:
    from config import Config

...

src/tests/test_evaluate_classification.py

from ..algorithms.evaluation.evaluation import evaluate_classification


def test_evaluate_classification(model_path=None):
    assert evaluate_classification(model_path)

Run with:

    docker-compose -f local.yml run prediction pytest src/tests/test_evaluate_classification.py

I believe you'll also have to add tqdm to the requirements and fix whatever remaining style errors you may see via flake8 prediction.

@WGierke (Author, Contributor) commented Jan 23, 2018

@reubano Done. Thanks!

@lamby (Contributor) commented Jan 23, 2018

Awesome :)

@lamby merged commit 57452ab into drivendataorg:master on Jan 23, 2018
2 checks passed

concept-to-clinic/cla: @WGierke has signed the CLA.
continuous-integration/travis-ci/pr: The Travis CI build passed.
@vessemer mentioned this pull request Jan 23, 2018
@WGierke mentioned this pull request Jan 25, 2018