Converted Team 'grt123' identification and classification algorithms #122

Merged
merged 1 commit on Sep 29, 2017

Conversation

6 participants
@dchansen
Contributor

dchansen commented Sep 18, 2017

Description

This pull request implements the grt123 identification algorithm.
It is based on PyTorch, which currently conflicts with Tensorflow, so the pull request also disables the Keras import.

While the algorithm is quite fast on GPU, it is painfully slow (but working) on the CPU due to a bug in PyTorch. The bug is fixed in the PyTorch development branch, but the next release is about 2 months off.

Reference to official issue

#4
#1

Motivation and Context

This pull request implements part of the prediction service (https://concept-to-clinic.readthedocs.io/en/latest/design-doc.html#detect-prediction-service).

How Has This Been Tested?

The algorithm has largely been tested by manually confirming the output on about 5 random LUNA2016 datasets; as the loaded model has already been tested extensively elsewhere, this was deemed sufficient.
It has been tested both outside of docker, on a Titan X GPU, and inside docker on an i7 Intel CPU.

Screenshots (if appropriate):

CLA

  • I have signed the CLA; if other committers are in the commit history, they have signed the CLA as well



def test_classify_predict_model_load(dicom_path):

@WGierke

WGierke Sep 18, 2017

Contributor

As @lamby already stated here, it'd be nice to still have some tests that validate the output of the models. I'm currently working on some tags to only execute these long-running tests if one explicitly desires to do so. So feel free to write this test in a bit more detail :) Edit: I added this decorator in c3c626f
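A generic sketch of such an opt-in gate for slow tests, here using the standard library's unittest (the project's actual decorator landed in c3c626f and uses pytest; the RUN_SLOW_TESTS variable name below is an assumption for illustration):

```python
import os
import unittest

# Opt-in gate: long-running model tests only execute when explicitly requested.
SLOW_TESTS_ENABLED = os.environ.get("RUN_SLOW_TESTS") == "1"
slow_test = unittest.skipUnless(
    SLOW_TESTS_ENABLED,
    "set RUN_SLOW_TESTS=1 to run long-running model tests",
)


class TestClassify(unittest.TestCase):
    @slow_test
    def test_classify_predict_model_load(self):
        # ... load the model and assert on its predictions ...
        pass
```

With pytest, the same effect is typically achieved with `pytest.mark.skipif` or a registered custom marker.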

@dchansen

dchansen Sep 18, 2017

Contributor

Sure, I'll have a look at that tomorrow.

@dchansen

dchansen Sep 19, 2017

Contributor

I updated the test cases to be a little more comprehensive.

@isms

Contributor

isms commented Sep 18, 2017

[...] the pull request also disables the Keras import.

@dchansen I'm not sure this is a feasible compromise. Is there a way to reimplement this in Tensorflow?

@dchansen

Contributor

dchansen commented Sep 18, 2017

@isms I agree that it's not a feasible compromise at all. I did it mainly to highlight the issue.

I think we should have a single deep learning framework for the project as a whole, as they don't really play well together. I am not sure if Keras is the best choice, but either way, the project leads should make the decisions. Perhaps we should open an issue?

Converting the model itself into Keras would be relatively easy, but converting the weights is somewhat more involved. Converting to Caffe2 or CNTK would be easy, as they all support the ONNX format.

@dchansen dchansen changed the title Converted Team 'grt123' identification algorithm Converted Team 'grt123' identification and classification algorithms Sep 19, 2017

@reubano
Contributor

reubano left a comment

Nice job with this! Here are some other thoughts...

  • use git-lfs for the dsb2017_detector.ckpt model
  • adopt Google-style docstrings
  • use new-style classes
  • fix spacing issues (have you run flake8?)
  • integrate skip_slow_test


reader = sitk.ImageSeriesReader()
filenames = reader.GetGDCMSeriesFileNames(dicom_path)
if len(filenames) == 0:

@reubano

reubano Sep 19, 2017

Contributor

if not filenames:
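The suggested idiom works because empty sequences are falsy in Python, making the explicit length check redundant; a small self-contained illustration:

```python
def first_or_none(filenames):
    # `if not filenames:` covers empty lists, tuples and other sequences,
    # which is why PEP 8 prefers it over `len(filenames) == 0`.
    if not filenames:
        return None
    return filenames[0]

print(first_or_none([]))                # → None
print(first_or_none(["scan_001.dcm"]))  # → scan_001.dcm
```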

if np.average([min_distance[i] for i in range(label.shape[0]) if slice_area[i] > area_th]) < dist_th:
    valid_label.add(vol.label)

if len(valid_label) == 0:

@reubano

reubano Sep 19, 2017

Contributor

if not valid_label:

bw3 = bw1 & bw2
label = measure.label(bw, connectivity=1)
label3 = measure.label(bw3, connectivity=1)
l_list = list(set(np.unique(label)) - {0})

@reubano

reubano Sep 19, 2017

Contributor

You should be able to iterate fine without needing to cast this to a list

sum = 0
while sum < np.sum(area) * cover:
sum = sum + area[count]
count = count + 1

@reubano

reubano Sep 19, 2017

Contributor

sum += area[count] and count += 1
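Applied to the loop above, and with the accumulator renamed so it no longer shadows the built-in `sum` (the function wrapper is for illustration only; it assumes `area` is sorted in descending order, as in the surrounding code):

```python
def count_covering_components(area, cover):
    # Count how many leading components are needed to reach a `cover`
    # fraction of the total area.
    threshold = sum(area) * cover
    covered = 0.0  # renamed from `sum`, which shadowed the built-in
    count = 0
    while covered < threshold:
        covered += area[count]  # augmented assignment, as suggested
        count += 1
    return count

print(count_covering_components([50, 30, 15, 5], 0.8))  # → 2
```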

@dchansen

Contributor

dchansen commented Sep 19, 2017

I am a little unsure why, but the conflict between Tensorflow and PyTorch seems to be gone. It was related to this issue in Tensorflow: tensorflow/models#523

@dchansen dchansen referenced this pull request Sep 19, 2017

Closed

Feature: Adapt the grt123 model #4

0 of 1 task complete
@lamby

Contributor

lamby commented Sep 20, 2017

This should block on #118 due to the slow test stuff.

@dchansen dchansen force-pushed the dchansen:master branch from a63e97c to e6d4808 Sep 21, 2017

@reubano

Contributor

reubano commented Sep 22, 2017

Looking pretty good! Just a few questions for you. I noticed that in classify/trained_model/predict, you commented out the previous code instead of deleting it. Just curious, why did you decide to keep it around? Also, there are still some spacing errors. Can you run flake8 prediction in your terminal? Finally, I see this travis error:

$ git lfs pull
Git LFS: (540 of 540 files, 2 skipped) 497.08 MB / 497.08 MB, 41.20 MB skipped 
[eb2c037dd55e2f49da95657f7a7851cbcfe2f2b516848ed03f8c5c820f3e16b4] Object does not exist on the server: [404] Object does not exist on the server
[3a46182c1c4883e170d5b576d3a2645309f34817c53c16d4f7f29fbdcf799308] Object does not exist on the server: [404] Object does not exist on the server

I'll try to pinpoint which files exactly are missing. Meanwhile, can you verify that the model you added to git-lfs was properly uploaded? Thanks for your work!

@dchansen

Contributor

dchansen commented Sep 22, 2017

The git lfs issue should be resolved now.
I kept the previous code because I was unsure what the long-term plans were regarding the style of the project. Conceivably, this pull request could be changed to fit the style of the previous one.
While I don't agree with all the choices made, it was in many ways nicer than what I contributed here. The main reason I did not follow it more closely was one of time, as I just wanted a working adaptation of the gtr123 model as soon as possible.

I have fixed all the flake8 issues, except the complexity ones. Hopefully, I will have them fixed by the end of the day.

@reubano

Contributor

reubano commented Sep 22, 2017

The git lfs issue should be resolved now.

Awesome!

I kept the previous code because I was unsure what the long-term plans were regarding the style of the project. Conceivably, this pull request could be changed to fit the style of the previous one. While I don't agree with all the choices made, it was in many ways nicer than what I contributed here. The main reason I did not follow it more closely was one of time, as I just wanted a working adaptation of the gtr123 model as soon as possible.

Ok, I understand. I'm taking a look to see how to integrate both versions.

I have fixed all the flake8 issues, except the complexity ones. Hopefully, I will have them fixed by the end of the day.

Great!

@reubano

Contributor

reubano commented Sep 22, 2017

Ok... here's my attempt. What do you think?

import SimpleITK as sitk
from src.algorithms.classify.src import gtr123_model
from src.preprocess.load_ct import load_ct, MetaData


def predict(dicom_path, centroids, model_path=None,
            preprocess_ct=None, preprocess_model_input=None):
    reader = sitk.ImageSeriesReader()
    filenames = reader.GetGDCMSeriesFileNames(dicom_path)

    if not filenames:
        raise FileNotFoundError()

    reader.SetFileNames(reader.GetGDCMSeriesFileNames(dicom_path))
    image = reader.Execute()

    if preprocess_ct:
        meta = load_ct(dicom_path)[1]
        voxel_data = preprocess_ct(image, MetaData(meta))
    else:
        voxel_data = image

    if preprocess_model_input:
        preprocessed = preprocess_model_input(voxel_data, centroids)
    else:
        preprocessed = voxel_data

    return gtr123_model.predict(preprocessed, centroids)

Since the Keras model is now gone, do you see any reason to keep model_path? If not, I can help you refactor it out.

@dchansen

Contributor

dchansen commented Sep 22, 2017

This would work for now.
I think the long term solution should be to move the gtr123 preprocessing into preprocess_ct. I will probably not have time to do that until next week though.

@reubano

Contributor

reubano commented Sep 22, 2017

I think the long term solution should be to move the gtr123 preprocessing into preprocess_ct.

Agreed. However, I think your PR can work without that at this stage (MVP). I looked around some more and saw that gtr123_model.predict actually takes a model_path param. So the last line could be return gtr123_model.predict(preprocessed, centroids, model_path). I think then, the only change would have to be #L14-L16 here in test_classify_trained_model_predict.py.

@reubano

Contributor

reubano commented Sep 26, 2017

Looking good! What are your thoughts about incorporating something like I mentioned here? Main reason is just so that we don't disable current functionality with this PR. I think with that, we can get this merged.

@dchansen

Contributor

dchansen commented Sep 26, 2017

I'm trying to do a reasonable merge right now, and algorithms.identify.trained_model is giving me a headache.
I think we really need a standardized way of specifying a model, which is independent of frameworks (Keras, Torch, Tensorflow, etc).

I would suggest that each model independently implements a predict method taking only a dicom path (for identification) or a dicom path and a list of candidates. trained_model.predict would then take the same arguments, plus a string identifying the algorithm to be used. How does this sound?

@vessemer

Contributor

vessemer commented Sep 26, 2017

@dchansen I'm working on that right now. I've also tried to convert the grt123 classification model into Keras (TF), but I ran into trouble with the BatchNorm layer: although the weights transfer correctly, it gives me different results (if I throw away every BatchNorm in the graph, the outputs become identical, so I need to dig deeper).
As for standardizing the interface without transferring the model, that's almost done.

@reubano

Contributor

reubano commented Sep 26, 2017

I'm trying to do a reasonable merge right now, and algorithms.identify.trained_model is giving me a headache.

Do you mind elaborating? What specifically is causing issues?

I think we really need a standardized way of specifying a model, which is independent of frameworks (Keras, Torch, Tensorflow, etc).

I would suggest that each model independently implements a predict method taking only a dicom path (for identification) or a dicom path and a list of candidates. trained_model.predict would then take the same argument, plus a string identifying the algorithm to be used . How does this sound?

So if I understand you correctly, algorithms.identify.src.gtr123_model would look like this:

def predict(dicom_path):
    model_path = 'src/algorithms/identify/assets/dsb2017_detector.ckpt'
    ...

and algorithms.identify.trained_model would look like this:

from src.algorithms.identify import src

def predict(dicom_path, algorithm='gtr123_model'):
    model = getattr(src, algorithm)
    return model.predict(dicom_path)

Is this correct?
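A runnable sketch of that string-based dispatch using `getattr`; the stub classes below are hypothetical stand-ins for the real `src` package and `gtr123_model` module:

```python
class _Gtr123Stub:
    """Hypothetical stand-in for src.algorithms.identify.src.gtr123_model."""
    @staticmethod
    def predict(dicom_path):
        return "gtr123 nodules for %s" % dicom_path


class _SrcNamespace:
    """Stand-in for the `src` package the real code would import."""
    gtr123_model = _Gtr123Stub


def predict(dicom_path, algorithm='gtr123_model'):
    model = getattr(_SrcNamespace, algorithm)  # look the model up by name
    return model.predict(dicom_path)


print(predict('/images/LIDC0001'))  # → gtr123 nodules for /images/LIDC0001
```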

@dchansen

Contributor

dchansen commented Sep 26, 2017

@vessemer I am a little unsure about the benefit of moving the model to Keras now that Tensorflow and Torch seem to work without disturbing one another. Different approaches will require different enough post- and pre-processing that they can't be represented as a Keras model anyway.
Long term, I do see the advantage of having all models on a single deep-learning framework, but is Keras the right choice? Using ONNX as the way to save the model would allow for easy use of many frameworks, including Caffe2 and CNTK.
Either way, as soon as there is a consensus, I would be happy to help convert the models to that particular format.

One difference between the Torch and Keras batchnorm is that they use a different epsilon value (1e-5 vs 1e-3), so maybe that's the issue?
Could you make your WIP available on the model standardization?
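The epsilon mismatch is easy to quantify: batch normalization computes gamma * (x - mean) / sqrt(var + eps) + beta, and when the learned per-channel variance is small, the two defaults (1e-5 in PyTorch, 1e-3 in Keras) give visibly different activations. A minimal numeric sketch:

```python
import math

def batchnorm(x, mean, var, gamma=1.0, beta=0.0, eps=1e-5):
    # Scalar batch normalization: gamma * (x - mean) / sqrt(var + eps) + beta
    return gamma * (x - mean) / math.sqrt(var + eps) + beta

x, mean, var = 0.5, 0.0, 1e-3  # a small variance makes eps dominate
print(batchnorm(x, mean, var, eps=1e-5))  # with PyTorch's default epsilon
print(batchnorm(x, mean, var, eps=1e-3))  # with Keras' default epsilon
```

With var = 1e-3 the two outputs differ by roughly 30%, easily enough to shift classification scores downstream; with var near 1.0 the difference is negligible.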

@dchansen

Contributor

dchansen commented Sep 26, 2017

@reubano Yes, that was my thought exactly. The problem right now is that all the models take ownership of trained_model.predict, and there is no good way of switching between particular models. Ideally, we might also like to build a meta-model combining many different ones, and a uniform interface would help greatly.

@vessemer

Contributor

vessemer commented Sep 26, 2017

@dchansen, thank you for your input, but I had already taken the difference in the default values into account; I used nn-transfer for the model standardization.

@dchansen

Contributor

dchansen commented Sep 26, 2017

@vessemer Well, if you can share the code, I can have a look at it.

@reubano

Contributor

reubano commented Sep 26, 2017

The problem right now is that all the models take ownership of trained_model.predict, and there is no good way of switching between particular models. Ideally, we might also like to do meta-model, combining many different ones, and a uniform interface would help greatly.

For now, I think this is the preferred course of action. We want to focus on one main model for each of (classify, identify, and segment). Later on, we may consider making it easier to switch out the models. The current area of flexibility is in the serialized trained data, e.g., assets/dsb2017_detector.ckpt. So while it is still tied to the underlying python ML model, you can tweak the training parameters as needed. See my comment on #2 for more info.

Also, let me know what exactly is giving you a headache so I can try and help you out.

@dchansen

Contributor

dchansen commented Sep 26, 2017

My issue was that I had to disable a lot of code in identify.trained_model. But I think this is what you want?

@reubano

Contributor

reubano commented Sep 26, 2017

I was literally suggesting you just copy/paste my code from here.

    else:
        preprocessed = voxel_data

    return gtr123_model.predict(preprocessed, centroids)

@reubano

reubano Sep 26, 2017

Contributor
...
model_path = model_path or 'src/algorithms/classify/assets/gtr123_model.ckpt'
...
return gtr123_model.predict(image, centroids, model_path)

@dchansen

dchansen Sep 26, 2017

Contributor

Not sure what you're asking here?

@reubano

reubano Sep 26, 2017

Contributor

Just reintroducing the model_path part from here. Also, I just noticed a typo, which I've fixed. Maybe that tripped you up... sorry!

@reubano

Contributor

reubano commented Sep 26, 2017

Just about there! Does this solve your headache with algorithms.identify.trained_model?

@reubano

Contributor

reubano commented Sep 26, 2017

From the travis log

$ flake8 prediction
prediction/src/algorithms/identify/trained_model.py:13:1: F401 'glob' imported but unused
prediction/src/algorithms/identify/trained_model.py:15:1: F401 'dicom' imported but unused
prediction/src/algorithms/identify/trained_model.py:16:1: F401 'src.preprocess.errors.EmptyDicomSeriesException' imported but unused
prediction/src/algorithms/identify/trained_model.py:17:1: F401 'src.preprocess.lung_segmentation.save_lung_segments' imported but unused
prediction/src/algorithms/classify/trained_model.py:68:56: E231 missing whitespace after ','
prediction/src/preprocess/extract_lungs.py:57:19: E127 continuation line over-indented for visual indent
prediction/src/preprocess/extract_lungs.py:66:18: E127 continuation line over-indented for visual indent
prediction/src/preprocess/extract_lungs.py:67:18: E127 continuation line over-indented for visual indent
prediction/src/preprocess/extract_lungs.py:68:18: E127 continuation line over-indented for visual indent
prediction/src/tests/test_endpoints.py:10:1: F811 redefinition of unused 'json' from line 7
The command "flake8 prediction" exited with 1.

$ pycodestyle prediction
prediction/src/algorithms/classify/trained_model.py:68:56: E231 missing whitespace after ','
prediction/src/preprocess/extract_lungs.py:57:19: E127 continuation line over-indented for visual indent
prediction/src/preprocess/extract_lungs.py:66:18: E127 continuation line over-indented for visual indent
prediction/src/preprocess/extract_lungs.py:67:18: E127 continuation line over-indented for visual indent
prediction/src/preprocess/extract_lungs.py:68:18: E127 continuation line over-indented for visual indent
@dchansen

Contributor

dchansen commented Sep 27, 2017

Using a different docker image solved the problem for me.
I cannot for the life of me figure out where the segfault is coming from in the python3:6 image, however.

@reubano

Contributor

reubano commented Sep 27, 2017

Yea that is odd... which image works? And what is the one that doesn't?

@dchansen

Contributor

dchansen commented Sep 27, 2017

pytorch:36 segfaults.
varunagrawal/pytorch does not, but I had to manually install opencv and tkinter in it.

@dchansen

Contributor

dchansen commented Sep 27, 2017

I changed the image to be Ubuntu based instead. All tests should pass now.

@reubano

Contributor

reubano commented Sep 27, 2017

Ok.... looks like the tests passed, but there are some new lint errors now

$ flake8 prediction
prediction/src/tests/__init__.py:1:1: F401 'torch' imported but unused
prediction/src/tests/__init__.py:1:13: E261 at least two spaces before inline comment
prediction/src/tests/__init__.py:1:14: E262 inline comment should start with '# '
The command "flake8 prediction" exited with 1.

$ pycodestyle prediction
prediction/src/tests/__init__.py:1:13: E261 at least two spaces before inline comment
prediction/src/tests/__init__.py:1:14: E262 inline comment should start with '# '
The command "pycodestyle prediction" exited with 1.
@dchansen

Contributor

dchansen commented Sep 27, 2017

The torch import is required simply because torch always needs to be imported before Keras. I am very open to other ways of doing that.

@reubano

Contributor

reubano commented Sep 27, 2017

Try adding this on the offending line: # noqa pylint: disable=unused-import

Edit: if that doesn't work, try this: # noqa: F401
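Concretely, the first line of prediction/src/tests/__init__.py would become something like this (a sketch; the import exists purely for its side effect of loading torch before Keras):

```python
# prediction/src/tests/__init__.py (sketch)
# torch must always be imported before Keras/TensorFlow so their native
# libraries initialize in a compatible order; `# noqa: F401` silences
# flake8's unused-import warning for this deliberate side-effect import.
import torch  # noqa: F401
```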

@reubano

Contributor

reubano commented Sep 28, 2017

Looks like everything is a go now! Do you mind squashing to 1 commit? Or perhaps two, 1 for each issue (if it's possible to separate them).

@dchansen dchansen force-pushed the dchansen:master branch 2 times, most recently from d888503 to 11b8146 Sep 28, 2017

@dchansen

Contributor

dchansen commented Sep 28, 2017

I seem to have messed that up rather thoroughly. This will take me a while to sort out.

@reubano

Contributor

reubano commented Sep 28, 2017

No worries :). Perhaps I can help out. What seems to be the issue(s)?

@dchansen

Contributor

dchansen commented Sep 28, 2017

Just not very experienced in using git rebase. I found myself in the situation of having to redo every merge made on the project for the last ten days.
I have messed up my repository quite badly - I think my best approach is to try again from home, where I have a pristine copy of my repo from before my attempts at rebasing.

@WGierke

Contributor

WGierke commented Sep 28, 2017

Instead of rebasing, I prefer to squash the commits like so. It fulfills the purpose as well, namely having only 1 commit in the end :)
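For reference, the squash-via-soft-reset approach can be sketched in a throwaway repository (branch names, file names, and commit messages below are illustrative; `base` stands in for upstream/master):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email dev@example.com && git config user.name dev
echo base > file && git add file && git commit -qm "base"
base=$(git rev-parse HEAD)            # stand-in for upstream/master
git checkout -qb feature
echo one >> file && git add file && git commit -qm "wip 1"
echo two >> file && git add file && git commit -qm "wip 2"
git reset --soft "$base"              # keep the combined changes staged
git commit -qm "squashed feature branch"
git rev-list --count "$base"..HEAD    # prints: 1
```

The soft reset moves HEAD back to the base commit while leaving the combined diff staged, so a single new commit captures everything.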

@reubano

Contributor

reubano commented Sep 28, 2017

You can also take a look at the advice I posted here.

@reubano

Contributor

reubano commented Sep 28, 2017

Also, git config --global rerere.enabled true will save conflict resolutions so that in the future, you only have to resolve them once.

@dchansen

Contributor

dchansen commented Sep 28, 2017

@WGierke Thank you! Uploading the squashed commit now.

@dchansen dchansen force-pushed the dchansen:master branch 2 times, most recently from 1ac544e to 524f843 Sep 28, 2017

@reubano

Contributor

reubano commented Sep 28, 2017

$ flake8 interface
interface/backend/api/tests.py:44:1: W293 blank line contains whitespace
interface/backend/api/tests.py:48:57: E502 the backslash is redundant between brackets
interface/backend/api/tests.py:49:81: E502 the backslash is redundant between brackets
interface/backend/api/serializers.py:74:1: E302 expected 2 blank lines, found 1
interface/backend/api/serializers.py:120:17: E127 continuation line over-indented for visual indent
interface/backend/api/views.py:24:1: F811 redefinition of unused 'Response' from line 13
interface/backend/api/views.py:47:37: W291 trailing whitespace
The command "flake8 interface" exited with 1.

$ pycodestyle interface
interface/backend/api/serializers.py:74:1: E302 expected 2 blank lines, found 1
interface/backend/api/serializers.py:120:17: E127 continuation line over-indented for visual indent
interface/backend/api/tests.py:44:1: W293 blank line contains whitespace
interface/backend/api/tests.py:48:57: E502 the backslash is redundant between brackets
interface/backend/api/tests.py:49:81: E502 the backslash is redundant between brackets
interface/backend/api/views.py:47:37: W291 trailing whitespace
The command "pycodestyle interface" exited with 1.
@dchansen

Contributor

dchansen commented Sep 28, 2017

But the pull request doesn't include those files??

@isms

Contributor

isms commented Sep 28, 2017

@reubano @dchansen Those issues were introduced by recent commits unrelated to this PR. I pushed fixes which should get the code style back to passing. Will restart the build on this branch to update the status.

(cc @lamby so we can avoid in the future!)

@reubano

Contributor

reubano commented Sep 28, 2017

Ok... master looks good. Do you mind rebasing, @dchansen, to ensure everything is good to go?

@dchansen dchansen force-pushed the dchansen:master branch from 524f843 to f284f1f Sep 29, 2017

@reubano reubano merged commit 45db7be into drivendataorg:master Sep 29, 2017

2 checks passed

concept-to-clinic/cla: @dchansen has signed the CLA.
continuous-integration/travis-ci/pr: The Travis CI build passed.
@reubano

Contributor

reubano commented Sep 29, 2017

closes #4 :shipit:

@reubano reubano referenced this pull request Oct 20, 2017

Closed

Document the Daniel Hammack algorithm #19

0 of 1 task complete