Fixed coordinates scaling for classification prediction #272
@@ -34,7 +34,7 @@ class Params:
 preprocess.preprocess_dicom.Params
 """

-def __init__(self, clip_lower=None, clip_upper=None, spacing=None, order=0, # noqa: C901
+def __init__(self, clip_lower=None, clip_upper=None, spacing=False, order=0, # noqa: C901
According to the [doc string](https://github.com/Serhiy-Shekhovtsov/concept-to-clinic/blob/4887eeb6778013fce10e7e91fd846c84cb7cb248/prediction/src/preprocess/preprocess_ct.py#L19), spacing should be a float or float sequence. Could you update that, please?
Sure. Thanks.
Nice catch!
Any comments on the suggested approach in general? :)
Force-pushed from 0fbb81e to 436552a.
Thanks for these crucial updates, @Serhiy-Shekhovtsov! We had noticed the scaling issue on our end when testing the Have you tested your update on a However, that test uses the hard-coded
This happens because the contents of the list
We'll need to figure this out before merging so that
@caseyfitz, yes I saw this problem. Will try to solve it today. And also will improve the approach a bit and fix the failing tests.
Thanks @Serhiy-Shekhovtsov :)
@caseyfitz turned out, one little
More details [here](drivendataorg#272 (comment)).
Bug fixed. When the length of the nodules list is more than 1, prediction fails as follows:

```
~/concept-to-clinic/prediction/src/algorithms/classify/src/gtr123_model.py in predict(ct_path, nodule_list, model_path)
    275     results = []
    276
--> 277     for nodule, (cropped_image, coords) in zip(nodule_list, patches):
    278         cropped_image = Variable(torch.from_numpy(cropped_image[np.newaxis, np.newaxis]).float())
    279         cropped_image.volatile = True

ValueError: too many values to unpack (expected 2)
```

More details [here](drivendataorg#272 (comment)).

## CLA

- [x] I have signed the CLA; if other committers are in the commit history, they have signed the CLA as well
Force-pushed from 436552a to 7671e9a.
"with this one weird trick discovered by a schoolteacher", huh? Looking forward to tests and resolving the conflicts... :) |
Force-pushed from 741e59a to 7a53619.
@lamby finally got them all green.
Was the issue about converting
@reubano converting to boolean is a fix for an issue of lost
@Serhiy-Shekhovtsov great! And just to be clear, this fixes #268, right? Is there any other test (besides that one) I should run to make sure this PR works as expected?
To be precise, the reported issue was caused by wrong coordinates. The coordinates specified in the test were good for a full-size DICOM, but the test was running on a small image. But I have found another issue with classification. It was confirmed by @caseyfitz here. This PR includes tests for real nodules. The problem was that the coordinates were not scaled together with the image, so the centroid location was misinterpreted and the cropped patch was wrong, and so was the prediction.
Ran this a few times and while @caseyfitz's use cases looks solved, the sudo docker-compose -f local.yml run prediction pytest -vrsk src/tests/test_classification.py
_____________________________________ test_classify_real_nodule_full_dicom _____________________________________
dicom_paths = ['/images_full/LIDC-IDRI-0002/1.3.6.1.4.1.14519.5.2.1.6279.6001.490157381160200744295382098329/1.3.6.1.4.1.14519.5.2.1...14519.5.2.1.6279.6001.298806137288633453246975630178/1.3.6.1.4.1.14519.5.2.1.6279.6001.179049373636438705059720603192']
model_path = '/app/src/algorithms/classify/assets/gtr123_model.ckpt'
def test_classify_real_nodule_full_dicom(dicom_paths, model_path):
predicted = trained_model.predict(dicom_paths[2], [{'x': 367, 'y': 349, 'z': 75}], model_path)
assert predicted
> assert 0.3 <= predicted[0]['p_concerning'] <= 1
E assert 0.3 <= 0.0021293163299560547
src/tests/test_classification.py:23: AssertionError
sudo sh tests/test_docker.sh
+ docker-compose -f local.yml run prediction pytest -rsx
Starting base ...
Starting base ... done
============================================= test session starts ==============================================
platform linux -- Python 3.6.3, pytest-3.1.3, py-1.5.2, pluggy-0.4.0
rootdir: /app, inifile:
collected 59 items
src/tests/test_classification.py ...F.
...
@reubano, coordinates for this test have been hand-crafted to match the real nodule on the third image: LIDC-IDRI-0003. But the
now meshgrid is generated relative to symmetrical padding
Force-pushed from 7a53619 to 245b5dd.
@Serhiy-Shekhovtsov Is this ready to merge from your point of view? :)
@lamby yes, it is.
Yes, tests pass for me now... maybe the Travis error was a fluke? I just reran the build.
@reubano it looks so.
great work!!
Thank you!
…g#272)
* updated gitignore to ignore Jupyter notebook and temporary files
* fixed coordinates scaling for classification
* fixed meshgrid generation for cropped patch; now meshgrid is generated relative to symmetrical padding
* added tests for real nodules classification
* fixed preprocessing tests
We had a problem: the preprocessing zooms the image. In other words, the pixel matrix is resized to real proportions. But the coordinates used for prediction were not scaled accordingly, so the patch the prediction is actually made on differs from the expected patch.
For example, the zoom factor for the full LIDC-IDRI-0003 image is [2.5, 0.820312, 0.820312]. The prediction for the real nodule's coordinates {'x': 367, 'y': 350, 'z': 72} is 0.0066, which is very low. But if these coordinates are scaled by the zoom factor ({'x': 301, 'y': 286, 'z': 180}), we get a much higher probability of concerning: 0.42.
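To make the scaling step concrete, here is a minimal sketch, not the code from this PR. The helper name `scale_coordinates` is hypothetical. The `[z, y, x]` ordering of the zoom factor is an assumption inferred from the example above, where `z` grows by 2.5. Rounding to the nearest voxel is also an assumption, so a result may differ by one voxel from the figures quoted above.

```python
def scale_coordinates(nodule, zoom_factor):
    """Scale a nodule centroid by a per-axis zoom factor.

    Hypothetical helper for illustration only.
    `nodule` is a dict with 'x', 'y', 'z' voxel indices.
    `zoom_factor` is assumed to be ordered [z, y, x].
    """
    zoom_z, zoom_y, zoom_x = zoom_factor
    # Rounding to the nearest voxel is an assumption, not the PR's confirmed behavior.
    return {
        'x': int(round(nodule['x'] * zoom_x)),
        'y': int(round(nodule['y'] * zoom_y)),
        'z': int(round(nodule['z'] * zoom_z)),
    }


zoom_factor = [2.5, 0.820312, 0.820312]
original = {'x': 367, 'y': 350, 'z': 72}
scaled = scale_coordinates(original, zoom_factor)
print(scaled)
```

The scaled centroid, rather than the original one, is what should be used to crop the patch from the zoomed image.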
More details here.
Reference to official issue
#268
TODO: Med prediction bugfix
CLA