Add bltest data #11

fedorov · 2016-06-02T00:20:32Z

contributes 4 image series and matching SEGs generated using Brainlab. Baseline NRRDs to be added. Note that only tests 1 and 4 can be handled by the converter, the other two cases need to be debugged ...

related to #9

These files are managed using git-lfs

fedorov · 2016-06-02T13:47:32Z

@che85 this is not ready for merge, I will add baselines. Tests will be added after #6 is merged.

che85 · 2016-06-02T13:51:00Z

Alright.

fedorov · 2016-06-06T15:28:38Z

@che85 can you please study this: https://github.com/github/git-lfs/wiki/Tutorial

The question is to decide if git lfs is a suitable mechanism to handle test data, or we should use an external system.

che85 · 2016-06-13T23:19:55Z

@fedorov To me it seems as if it doesn't work properly. Checking out your branch and fetching the objects seems to download something, but the size of the DICOM images is still a few bytes...

After checking out master and then switching back to add-bltest-data, the DICOM images has appropriate sizes.

I might be wrong or git lfs is unusable in it's current state. Another point which assures that is the large amount of open issues https://github.com/github/git-lfs/issues

One thought how to manage that alternatively would probably be, to have another repository or branch for the test data only so that the user has better control about "if" the test data will be checked out or not.

I found a nice page http://blog.deveo.com/storing-large-binary-files-in-git-repositories/ which lists git lfs like mechanism. I will take a look at it.

fedorov · 2016-06-20T13:56:03Z

@pieper this is the branch with the testing data, to be considered in the context of using IPFS for test data hosting.

fedorov · 2016-06-30T17:21:42Z

Related discussion that resolved the problem I had: git-lfs/git-lfs#1339

@technoweenie

Kudos to @technoweenie for suggesting this solution! See git-lfs/git-lfs#1339 With this modification, git-lfs tracked files are no longer marked as modified by git status after git lfs checkout.

fedorov · 2016-06-30T17:35:07Z

@che85 @pieper @hmeine @nolden you are most welcome to test this and comment.

install git-lfs https://git-lfs.github.com/
check out this branch - this will only check out pointers to the actual files under data/BLTests
do git lfs checkout to pull the actual files

The problem I described to you seems to be resolved, and was due to my limited knowledge (I did not add .gitattributes to the repo) and what appears to be confusing git-lfs interface (the lfs-tracked directory spec).

Let me know what you think about using this approach to maintain test data. We can always switch to something else later, but from what I've seen so far, the next alternative is Midas. Distributed FSs like IPFS and DAT do not solve the hosting problem. As discussed earlier with the RT team and Marco, it is nice to have data in the same repo as the code, so I like this approach with git-lfs, if it works.

pieper · 2016-06-30T19:55:52Z

Yes, I think git-lfs is at least as good as any of the alternatives we
explored so far.

On Thu, Jun 30, 2016 at 1:35 PM, Andrey Fedorov notifications@github.com
wrote:

@che85 https://github.com/che85 @pieper https://github.com/pieper
@hmeine https://github.com/hmeine @nolden https://github.com/nolden
you are most welcome to test this and comment.

install git-lfs https://git-lfs.github.com/

check out this branch - this will only check out pointers to the
actual files under data/BLTests

do git lfs checkout to pull the actual files

The problem I described to you seems to be resolved, and was due to my
limited knowledge (I did not add .gitattributes to the repo) and what
appears to be confusing git-lfs interface (the lfs-tracked directory spec).

Let me know what you think about using this approach to maintain test
data. We can always switch to something else later, but from what I've seen
so far, the next alternative is Midas. Distributed FSs like IPFS and DAT do
not solve the hosting problem. As discussed earlier with the RT team and
Marco, it is nice to have data in the same repo as the code, so I like this
approach with git-lfs, if it works,.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#11 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AAHsfZG1VwWtQ55XYaYjdOJr_II8BGNWks5qQ_5MgaJpZM4IsGUB
.

hmeine · 2016-07-01T18:41:46Z

As I reported by mail, there are problems with the CT data. I believe that MeVisLab is rather strict, but I think if we publish this as reference datasets, it might be worth looking into. Actually, the problems are even more severe with the BLTests/BLSEGTest2/Image folder than with the Brainlab-deid folder I reported on.

Also, the instructions you gave above were not enough for me – I had to issue git lfs fetch before git lfs checkout (either one was not enough). Maybe that's because I used the github wrapper to checkout the pull request (hub checkout https://github.com/QIICR/dcmqi/pull/11).

fedorov · 2016-07-01T19:11:37Z

@hmeine

As I reported by mail, there are problems with the CT data.

Yes, I did get your report. I haven't had time to look into it, as I've been quite busy with other dcmqi issues (see https://github.com/fedorov?tab=overview&from=2016-06-28&to=2016-06-30).

For test data, I first want to organize BLTest data, and then work on the CT dataset you are concerned about.

the problems are even more severe with the BLTests/BLSEGTest2/Image folder than with the Brainlab-deid folder I reported on

I didn't know about this. Can you please describe the problem?

I've just tried loading the dataset in Slicer, and it looks fine (see screenshot below). However, something is not right, since Slicer says this is a multiframe image, but the fact is it is not a multiframe image .... Also, did you notice ImageType is DERIVED\SECONDARY\OTHER\MEVISLAB ? :)

Also, the instructions you gave above were not enough for me

Interesting. I added an FAQ entry about accessing the data: QIICR/dcmqi-faq#3 (and after doing this, realized this PR is not yet merged!)

pieper · 2016-07-01T20:47:38Z

I may be wrong, but wasn't the idea of the BLTest dataset to illustrate
unusual issues that occur "in the wild"? So I think we have to expect that
some kinds of malformed data needs to be read if possible (I believe David
Clunie's phrase is 'read permissive, write strict' or something like that).

That said, Andrey did you notice that the CT is actually screwed up in the
slicer screen shot you posted? Look in the chin area in the sagittal image
and you can see some data is missing. I would guess that in advanced mode
there's a geometry warning when importing this data.

On Fri, Jul 1, 2016 at 3:11 PM, Andrey Fedorov notifications@github.com
wrote:

@hmeine https://github.com/hmeine

As I reported by mail, there are problems with the CT data.

Yes, I did get your report. I haven't had time to look into it, as I've
been quite busy with other dcmqi issues (see
https://github.com/fedorov?tab=overview&from=2016-06-28&to=2016-06-30).

For test data, I first want to organize BLTest data, and then work on the
CT dataset you are concerned about.

the problems are even more severe with the BLTests/BLSEGTest2/Image folder
than with the Brainlab-deid folder I reported on

I didn't know about this. Can you please describe the problem?

I've just tried loading the dataset in Slicer, and it looks fine (see
screenshot below). However, something is not right, since Slicer says this
is a multiframe image, but the fact is it is not a multiframe image ....
Also, did you notice ImageType is DERIVED\SECONDARY\OTHER\MEVISLAB ? :)

[image: image]
https://cloud.githubusercontent.com/assets/313942/16531959/044a3558-3f9d-11e6-9407-c190a028df8d.png

Also, the instructions you gave above were not enough for me

Interesting. I added an FAQ entry about accessing the data:
QIICR/dcmqi-faq#3 QIICR/dcmqi-faq#3 (and
after doing this, realized this PR is not yet merged!)

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#11 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AAHsfcfNBs7SiZelN0HIwmIBMWInfIO-ks5qRWZpgaJpZM4IsGUB
.

fedorov · 2016-07-01T20:53:56Z

I may be wrong, but wasn't the idea of the BLTest dataset to illustrate unusual issues that occur "in the wild"?

the idea of the dataset is to provide testing for the data collected from the platforms that produce SEG. I agree with Steve - unless the data was corrupted somewhere in transmission, it makes sense to keep it.

Andrey did you notice that the CT is actually screwed up in the slicer screenshot you posted?

I did not, but now I do! I should ask the source of the data! Interesting.

I would guess that in advanced mode there's a geometry warning when importing this data.

There is, but it is somewhat cryptic, and the difference does not show up, unless I resize the table:

pieper · 2016-07-01T21:01:08Z

Maybe slicer should complain more. What does MeVisLab say about this data?
On Jul 1, 2016 4:53 PM, "Andrey Fedorov" notifications@github.com wrote:

I may be wrong, but wasn't the idea of the BLTest dataset to illustrate
unusual issues that occur "in the wild"?

the idea of the dataset is to provide testing for the data collected from
the platforms that produce SEG. I agree with Steve - unless the data was
corrupted somewhere in transmission, it makes sense to keep it.

Andrey did you notice that the CT is actually screwed up in the slicer
screenshot you posted?

I did not, but now I do! I should ask the source of the data! Interesting.

I would guess that in advanced mode there's a geometry warning when
importing this data.

There is, but it is somewhat cryptic, and the difference does not show up,
unless I resize the table:

[image: image]
https://cloud.githubusercontent.com/assets/313942/16534255/64f34458-3fac-11e6-8f11-c4fa84955643.png

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#11 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AAHsfaXs5O_YAVqGn82e5-WF4lErHrQ6ks5qRX5kgaJpZM4IsGUB
.

hmeine · 2016-07-02T09:31:28Z

MeVisLab splits the volume into 14 parts in its default configuration, due to discontinuities between the slices:

It's easy to configure the import to achieve the same result as in Slicer, but it is unclear what's more "correct". To me, the data looks broken.

fedorov · 2016-07-02T16:58:53Z

I decided to exclude BLTest2 for various reasons, at least in its current state. I can explain later. I will explain later. I will make another PR after revising the data.

fedorov added 2 commits May 31, 2016 18:37

ENH: adding test DICOM data

2827d3d

These files are managed using git-lfs

ENH: add SEG objects generated by Brainlab

8d6eec7

fedorov mentioned this pull request Jun 3, 2016

Cloning the repository takes some time AIM-Harvard/pyradiomics#60

Open

pieper mentioned this pull request Jun 23, 2016

experiment with alternative bulk file resources for testing #19

Closed

BUG: added missing .gitattributes

b8ab07d

BUG: fix directory specification for git-lfs

88611c4

Kudos to @technoweenie for suggesting this solution! See git-lfs/git-lfs#1339 With this modification, git-lfs tracked files are no longer marked as modified by git status after git lfs checkout.

fedorov closed this Jul 2, 2016

fedorov deleted the add-bltest-data branch July 2, 2016 16:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add bltest data #11

Add bltest data #11

fedorov commented Jun 2, 2016

fedorov commented Jun 2, 2016

che85 commented Jun 2, 2016

fedorov commented Jun 6, 2016

che85 commented Jun 13, 2016

fedorov commented Jun 20, 2016

fedorov commented Jun 30, 2016

fedorov commented Jun 30, 2016 •

edited

pieper commented Jun 30, 2016

hmeine commented Jul 1, 2016

fedorov commented Jul 1, 2016

pieper commented Jul 1, 2016

fedorov commented Jul 1, 2016

pieper commented Jul 1, 2016

hmeine commented Jul 2, 2016

fedorov commented Jul 2, 2016

Add bltest data #11

Add bltest data #11

Conversation

fedorov commented Jun 2, 2016

fedorov commented Jun 2, 2016

che85 commented Jun 2, 2016

fedorov commented Jun 6, 2016

che85 commented Jun 13, 2016

fedorov commented Jun 20, 2016

fedorov commented Jun 30, 2016

fedorov commented Jun 30, 2016 • edited

pieper commented Jun 30, 2016

hmeine commented Jul 1, 2016

fedorov commented Jul 1, 2016

pieper commented Jul 1, 2016

fedorov commented Jul 1, 2016

pieper commented Jul 1, 2016

hmeine commented Jul 2, 2016

fedorov commented Jul 2, 2016

fedorov commented Jun 30, 2016 •

edited