HOWTO 4: Suggested public data sets for machine learning

Intended audience: ML enthusiasts who want to share publicly available, de-identified, curated datasets with the SiiM-MLC.

Table of Contents

  1. Background
  2. Guidelines for listing datasets (a template)
  3. Listings

Chapter 1: Background

In HOWTO-3, Docker submitters were shown how to document the commits of their work on the SiiM-MLC site. Part of those guidelines calls for including URLs to datasets that are relevant to using their Docker. For those who wish to go beyond that, the collection below is recommended.

Chapter 2: Guidelines

  • Name of resource: (who hosts it and what it is called)
  • Submitter: (who you are)
  • Date: (submission date)
  • URL: (root URL of the resource)
  • Known to work with: (list the SiiM-MLC Dockers known to work with it)
  • Comments: (its purpose, requirements for access, and maturity of curation and tagging, i.e. low, medium, or high)
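For submitters who prefer to generate an entry rather than type it by hand, the template above can be sketched as a small Python record. This is only an illustration: the class name `DatasetListing`, its field names, and the `render` method are hypothetical and not part of any SiiM-MLC tooling, and writing an unknown "Known to work with" value as `?` is an assumption based on how the listings record it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetListing:
    """One listing entry; the fields mirror the template bullets (hypothetical class)."""
    name: str
    submitter: str
    date: str
    url: str
    comments: str
    known_to_work_with: List[str] = field(default_factory=list)

    def render(self) -> str:
        # Assumption: an empty compatibility list is written as "?".
        works_with = ", ".join(self.known_to_work_with) or "?"
        rows = [
            ("Name of resource", self.name),
            ("Submitter", self.submitter),
            ("Date", self.date),
            ("URL", self.url),
            ("Known to work with", works_with),
            ("Comments", self.comments),
        ]
        # Emit one bullet per template field, in template order.
        return "\n".join(f"  • {label}: {value}" for label, value in rows)


if __name__ == "__main__":
    entry = DatasetListing(
        name="NCI's TCIA",
        submitter="SG Langer",
        date="8 November 2017",
        url="http://www.cancerimagingarchive.net/",
        comments="Public and semi-private DICOM studies organized by cancer type.",
    )
    print(entry.render())
```

Running the script prints the six template fields for the example entry, which can then be pasted into the listings.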

Chapter 3: Listings

  1. Name of resource: NCI's TCIA

    1. Submitter: SG Langer
    2. Date: 8 November 2017
    3. URL: http://www.cancerimagingarchive.net/
    4. Known to work with: ?
    5. Comments: An NCI-supported collection of public and semi-private DICOM studies organized by cancer type.
  2. Name of resource: mdAI

    1. Submitter: SG Langer
    2. Date: July 2019
    3. URL: https://google.md.ai/hub
    4. Known to work with: SiiM Jupyter notebook
    5. Comments: Crowd-sourced annotated datasets for chest, abdomen, etc.