Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add bltest data #11

Closed
wants to merge 4 commits into from
Closed

Add bltest data #11

wants to merge 4 commits into from

Conversation

fedorov
Copy link
Member

@fedorov fedorov commented Jun 2, 2016

contributes 4 image series and matching SEGs generated using Brainlab. Baseline NRRDs to be added. Note that only tests 1 and 4 can be handled by the converter, the other two cases need to be debugged ...

related to #9

@fedorov
Copy link
Member Author

fedorov commented Jun 2, 2016

@che85 this is not ready for merge, I will add baselines. Tests will be added after #6 is merged.

@che85
Copy link
Member

che85 commented Jun 2, 2016

Alright.

@fedorov
Copy link
Member Author

fedorov commented Jun 6, 2016

@che85 can you please study this: https://github.com/github/git-lfs/wiki/Tutorial

The question is to decide if git lfs is a suitable mechanism to handle test data, or we should use an external system.

@che85
Copy link
Member

che85 commented Jun 13, 2016

@fedorov To me it seems as if it doesn't work properly. Checking out your branch and fetching the objects seems to download something, but the size of the DICOM images is still a few bytes...

After checking out master and then switching back to add-bltest-data, the DICOM images has appropriate sizes.

I might be wrong or git lfs is unusable in it's current state. Another point which assures that is the large amount of open issues https://github.com/github/git-lfs/issues

One thought how to manage that alternatively would probably be, to have another repository or branch for the test data only so that the user has better control about "if" the test data will be checked out or not.

I found a nice page http://blog.deveo.com/storing-large-binary-files-in-git-repositories/ which lists git lfs like mechanism. I will take a look at it.

@fedorov
Copy link
Member Author

fedorov commented Jun 20, 2016

@pieper this is the branch with the testing data, to be considered in the context of using IPFS for test data hosting.

@fedorov
Copy link
Member Author

fedorov commented Jun 30, 2016

Related discussion that resolved the problem I had: git-lfs/git-lfs#1339

Kudos to @technoweenie for suggesting this solution!

See git-lfs/git-lfs#1339

With this modification, git-lfs tracked files are no longer marked as modified
by git status after git lfs checkout.
@fedorov
Copy link
Member Author

fedorov commented Jun 30, 2016

@che85 @pieper @hmeine @nolden you are most welcome to test this and comment.

  • install git-lfs https://git-lfs.github.com/
  • check out this branch - this will only check out pointers to the actual files under data/BLTests
  • do git lfs checkout to pull the actual files

The problem I described to you seems to be resolved, and was due to my limited knowledge (I did not add .gitattributes to the repo) and what appears to be confusing git-lfs interface (the lfs-tracked directory spec).

Let me know what you think about using this approach to maintain test data. We can always switch to something else later, but from what I've seen so far, the next alternative is Midas. Distributed FSs like IPFS and DAT do not solve the hosting problem. As discussed earlier with the RT team and Marco, it is nice to have data in the same repo as the code, so I like this approach with git-lfs, if it works.

@pieper
Copy link
Member

pieper commented Jun 30, 2016

Yes, I think git-lfs is at least as good as any of the alternatives we
explored so far.

On Thu, Jun 30, 2016 at 1:35 PM, Andrey Fedorov notifications@github.com
wrote:

@che85 https://github.com/che85 @pieper https://github.com/pieper
@hmeine https://github.com/hmeine @nolden https://github.com/nolden
you are most welcome to test this and comment.

  • install git-lfs https://git-lfs.github.com/
  • check out this branch - this will only check out pointers to the
    actual files under data/BLTests
  • do git lfs checkout to pull the actual files

The problem I described to you seems to be resolved, and was due to my
limited knowledge (I did not add .gitattributes to the repo) and what
appears to be confusing git-lfs interface (the lfs-tracked directory spec).

Let me know what you think about using this approach to maintain test
data. We can always switch to something else later, but from what I've seen
so far, the next alternative is Midas. Distributed FSs like IPFS and DAT do
not solve the hosting problem. As discussed earlier with the RT team and
Marco, it is nice to have data in the same repo as the code, so I like this
approach with git-lfs, if it works,.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#11 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AAHsfZG1VwWtQ55XYaYjdOJr_II8BGNWks5qQ_5MgaJpZM4IsGUB
.

@hmeine
Copy link

hmeine commented Jul 1, 2016

As I reported by mail, there are problems with the CT data. I believe that MeVisLab is rather strict, but I think if we publish this as reference datasets, it might be worth looking into. Actually, the problems are even more severe with the BLTests/BLSEGTest2/Image folder than with the Brainlab-deid folder I reported on.

Also, the instructions you gave above were not enough for me – I had to issue git lfs fetch before git lfs checkout (either one was not enough). Maybe that's because I used the github wrapper to checkout the pull request (hub checkout https://github.com/QIICR/dcmqi/pull/11).

@fedorov
Copy link
Member Author

fedorov commented Jul 1, 2016

@hmeine

As I reported by mail, there are problems with the CT data.

Yes, I did get your report. I haven't had time to look into it, as I've been quite busy with other dcmqi issues (see https://github.com/fedorov?tab=overview&from=2016-06-28&to=2016-06-30).

For test data, I first want to organize BLTest data, and then work on the CT dataset you are concerned about.

the problems are even more severe with the BLTests/BLSEGTest2/Image folder than with the Brainlab-deid folder I reported on

I didn't know about this. Can you please describe the problem?

I've just tried loading the dataset in Slicer, and it looks fine (see screenshot below). However, something is not right, since Slicer says this is a multiframe image, but the fact is it is not a multiframe image .... Also, did you notice ImageType is DERIVED\SECONDARY\OTHER\MEVISLAB ? :)

image

Also, the instructions you gave above were not enough for me

Interesting. I added an FAQ entry about accessing the data: QIICR/dcmqi-faq#3 (and after doing this, realized this PR is not yet merged!)

@pieper
Copy link
Member

pieper commented Jul 1, 2016

I may be wrong, but wasn't the idea of the BLTest dataset to illustrate
unusual issues that occur "in the wild"? So I think we have to expect that
some kinds of malformed data needs to be read if possible (I believe David
Clunie's phrase is 'read permissive, write strict' or something like that).

That said, Andrey did you notice that the CT is actually screwed up in the
slicer screen shot you posted? Look in the chin area in the sagittal image
and you can see some data is missing. I would guess that in advanced mode
there's a geometry warning when importing this data.

On Fri, Jul 1, 2016 at 3:11 PM, Andrey Fedorov notifications@github.com
wrote:

@hmeine https://github.com/hmeine

As I reported by mail, there are problems with the CT data.

Yes, I did get your report. I haven't had time to look into it, as I've
been quite busy with other dcmqi issues (see
https://github.com/fedorov?tab=overview&from=2016-06-28&to=2016-06-30).

For test data, I first want to organize BLTest data, and then work on the
CT dataset you are concerned about.

the problems are even more severe with the BLTests/BLSEGTest2/Image folder
than with the Brainlab-deid folder I reported on

I didn't know about this. Can you please describe the problem?

I've just tried loading the dataset in Slicer, and it looks fine (see
screenshot below). However, something is not right, since Slicer says this
is a multiframe image, but the fact is it is not a multiframe image ....
Also, did you notice ImageType is DERIVED\SECONDARY\OTHER\MEVISLAB ? :)

[image: image]
https://cloud.githubusercontent.com/assets/313942/16531959/044a3558-3f9d-11e6-9407-c190a028df8d.png

Also, the instructions you gave above were not enough for me

Interesting. I added an FAQ entry about accessing the data:
QIICR/dcmqi-faq#3 QIICR/dcmqi-faq#3 (and
after doing this, realized this PR is not yet merged!)


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#11 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AAHsfcfNBs7SiZelN0HIwmIBMWInfIO-ks5qRWZpgaJpZM4IsGUB
.

@fedorov
Copy link
Member Author

fedorov commented Jul 1, 2016

I may be wrong, but wasn't the idea of the BLTest dataset to illustrate unusual issues that occur "in the wild"?

the idea of the dataset is to provide testing for the data collected from the platforms that produce SEG. I agree with Steve - unless the data was corrupted somewhere in transmission, it makes sense to keep it.

Andrey did you notice that the CT is actually screwed up in the slicer screenshot you posted?

I did not, but now I do! I should ask the source of the data! Interesting.

I would guess that in advanced mode there's a geometry warning when importing this data.

There is, but it is somewhat cryptic, and the difference does not show up, unless I resize the table:

image

@pieper
Copy link
Member

pieper commented Jul 1, 2016

Maybe slicer should complain more. What does MeVisLab say about this data?
On Jul 1, 2016 4:53 PM, "Andrey Fedorov" notifications@github.com wrote:

I may be wrong, but wasn't the idea of the BLTest dataset to illustrate
unusual issues that occur "in the wild"?

the idea of the dataset is to provide testing for the data collected from
the platforms that produce SEG. I agree with Steve - unless the data was
corrupted somewhere in transmission, it makes sense to keep it.

Andrey did you notice that the CT is actually screwed up in the slicer
screenshot you posted?

I did not, but now I do! I should ask the source of the data! Interesting.

I would guess that in advanced mode there's a geometry warning when
importing this data.

There is, but it is somewhat cryptic, and the difference does not show up,
unless I resize the table:

[image: image]
https://cloud.githubusercontent.com/assets/313942/16534255/64f34458-3fac-11e6-8f11-c4fa84955643.png


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#11 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AAHsfaXs5O_YAVqGn82e5-WF4lErHrQ6ks5qRX5kgaJpZM4IsGUB
.

@hmeine
Copy link

hmeine commented Jul 2, 2016

MeVisLab splits the volume into 14 parts in its default configuration, due to discontinuities between the slices:
DICOM slices in 3D space
It's easy to configure the import to achieve the same result as in Slicer, but it is unclear what's more "correct". To me, the data looks broken.

@fedorov fedorov closed this Jul 2, 2016
@fedorov fedorov deleted the add-bltest-data branch July 2, 2016 16:57
@fedorov
Copy link
Member Author

fedorov commented Jul 2, 2016

I decided to exclude BLTest2 for various reasons, at least in its current state. I can explain later. I will explain later. I will make another PR after revising the data.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants