Cleanup datastore #4602

Open
slicerbot opened this issue Mar 13, 2020 · 10 comments

@slicerbot
Collaborator

@slicerbot slicerbot commented Mar 13, 2020

This issue was created automatically from an original Mantis Issue. Further discussion may take place here.

@slicerbot slicerbot added this to the Slicer 4.11.0 milestone Mar 13, 2020
@lassoan

Contributor

@lassoan lassoan commented Mar 26, 2020

While the DataStore module works, it breaks the Extension Manager. To reproduce: open the DataStore module, close the window, then open the Extension Manager. Instead of an "Install" button, a "Download" button is displayed, and it downloads the extension instead of installing it.

Also, it is clear that there will be no Midas communities (there are many alternative, more open and more popular data repositories) and at some point the Midas server will be retired.

@Slicer/slicer-core What do you think about removing the DataStore module, moving the data into a GitHub repository, and adding links to important data sets (atlas collection, registration case library, maybe a few more nice data sets) to the SampleData module? We could use the existing SlicerTestingData or a new SlicerDataStore GitHub repository.

@lassoan lassoan self-assigned this Mar 26, 2020
@pieper

Member

@pieper pieper commented Mar 26, 2020

I like the idea of migrating from midas to github for this.

I'm not sure what you mean by removing the SampleData module - did you mean DataStore? If so, yes.

@lassoan

Contributor

@lassoan lassoan commented Mar 26, 2020

Yes, sorry, I meant removing the DataStore module.

@jcfr

Member

@jcfr jcfr commented Mar 26, 2020

removing DataStore

That makes sense.

Also, if there is a strong push to keep similar functionality, we could modify the DataStore module and leverage the GitHub API to upload assets. That said, I would prefer to avoid the added complexity and associated maintenance cost.

existing SlicerTestingData or a new SlicerDataStore github repository.

That is a tough question.

Before answering, maybe we should define the role of each. Here is an attempt:

  • SlicerTestingData: Collection of datasets used for testing Slicer. These datasets may not have any clinical relevance and may be incomplete.

  • SlicerDataStore: Collection of datasets that are clinically relevant and anatomically correct

Now, for practical reasons, we could organize all data under the same repository, SlicerDataStore, and have different types of releases, testing-data-<hashalgo> and clinical-data-<hashalgo>:

https://github.com/Slicer/SlicerDataStore/releases/download/testing-data-<hashalgo>/<hash>
https://github.com/Slicer/SlicerDataStore/releases/download/clinical-data-<hashalgo>/<hash>
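For illustration, the URL layout proposed above could be generated as in the following minimal Python sketch. The function names `file_digest` and `release_asset_url` are hypothetical and not part of any existing Slicer script, and the sketch assumes `<hashalgo>` is a name that Python's `hashlib` accepts (for example MD5 or SHA256).

```python
import hashlib

# Base URL taken from the layout proposed in this comment.
BASE_URL = "https://github.com/Slicer/SlicerDataStore/releases/download"

# The two release families suggested in this comment.
RELEASE_KINDS = ("testing-data", "clinical-data")


def file_digest(data: bytes, hashalgo: str) -> str:
    """Return the hex digest of `data` using the named algorithm."""
    # hashlib expects lowercase algorithm names such as "md5" or "sha256".
    return hashlib.new(hashalgo.lower(), data).hexdigest()


def release_asset_url(kind: str, hashalgo: str, digest: str) -> str:
    """Build <base>/<kind>-<hashalgo>/<hash> (hypothetical helper)."""
    if kind not in RELEASE_KINDS:
        raise ValueError(f"unknown release kind: {kind!r}")
    return f"{BASE_URL}/{kind}-{hashalgo}/{digest}"


if __name__ == "__main__":
    digest = file_digest(b"example content", "SHA256")
    print(release_asset_url("clinical-data", "SHA256", digest))
```

Because each asset is named after its hash, the URL identifies the content itself, so a client can verify a download without any extra metadata.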
@lassoan

Contributor

@lassoan lassoan commented Mar 27, 2020

I think it would be easier to have two separate repositories: we would not need to migrate SlicerTestingData, we could use the exact same scripts for managing data (upload scripts, various download scripts implemented in Python and CMake, etc.), and the download URLs would remain simpler (https://github.com/Slicer/SlicerTestingData/releases/download/MD5/nnnn).

@jcfr

Member

@jcfr jcfr commented Mar 27, 2020

would be easier to have two separate repositories

Agree with you.

Then, should we name it SlicerDataStore or SlicerClinicalData or SlicerSampleData or ... ?

I like SlicerSampleData

@lassoan

Contributor

@lassoan lassoan commented Mar 27, 2020

SlicerDataStore sounds a bit more appropriate to me, as we move DataStore content there, and some data sets, such as atlases and registration case library images, are more than just some "sample" data sets.

@jcfr

Member

@jcfr jcfr commented Mar 27, 2020

The repository for the DataStore module is https://github.com/Slicer/Slicer-DataStore, I will update its README and archive the repository.

In the meantime, I just created a repository called SlicerDataStore (along with a README file including Maintenance and History sections). See https://github.com/Slicer/SlicerDataStore

@jcfr

Member

@jcfr jcfr commented Mar 27, 2020

I will update its README and archive the repository.

The README has been updated (a History section was added), and the repository has been archived. See https://github.com/Slicer/Slicer-DataStore

@jcfr

Member

@jcfr jcfr commented Mar 28, 2020

All files from the DataStore collection (see here) are being downloaded. I wrote a small script leveraging pydas to download the files locally along with relevant metadata (contributor, description, number of views and downloads, ...).

Soon I will upload the files as release assets along with the metadata (for historical reference) into https://github.com/Slicer/SlicerDataStore

Once this is completed, I will submit a PR to update the download links in Slicer and remove the DataStore module from Slicer.

Later, we could still revisit this and have an updated DataStore module that allows uploading an MRB directly as a release asset ...
