Public data sets and their properties in the Collective Knowledge (CK) format, with a JSON API and JSON meta information, so that they can be easily plugged into customizable and reproducible CK experimental workflows (such as collaborative program analysis and optimization).

Open data sets with JSON meta for collaborative and reproducible computer systems' research

These are various public data sets for the public benchmarks and kernels used in our research on universal and multi-objective autotuning/crowd-tuning, shared in the open Collective Knowledge format.

They can be easily plugged into various CK research workflows (such as collaborative benchmarking and optimization of computer systems).

They also help with our artifact evaluation initiative at various computer systems conferences and journals.

Status

Stable repository

Dependencies on other repositories

Authors

  • Grigori Fursin, dividiti/cTuning foundation
  • Various authors of shared programs (see individual entries)

Prerequisites

Installation

ck pull repo:ctuning-datasets-min

List of other CK data set repositories shared as zip

CK repositories can be shared as zip archives, which is useful, for example, for sharing artifacts alongside publications as supplementary material in the ACM Digital Library. Such repositories can be installed via:

ck add repo:[repo_name] --zip=[zip archive name or full URL] --quiet

We shared multiple repositories with thousands of data sets for our shared benchmarks via Google Drive:

For example, you can download ckr-ctuning-datasets.zip (or the much larger ckr-usb-ctuning-dataset-* data sets from our PLDI paper).

Register it with CK simply via:

ck add repo:ctuning-datasets --zip=ckr-ctuning-datasets.zip --quiet

Now, when you run shared cTuning benchmarks (programs), you will automatically have an extended choice of data sets.
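The registration step above can be wrapped in a small shell function that checks the archive before calling CK. This is only a minimal sketch: the helper name `ck_add_zip_repo` is hypothetical and not part of CK, and the only CK command it uses is the `ck add repo:... --zip=... --quiet` form shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of CK): register a zipped CK repository
# only if the archive is a URL or an existing local file.
ck_add_zip_repo() {
    repo_name="$1"   # e.g. ctuning-datasets
    archive="$2"     # local zip file name or full URL

    if [ -z "$repo_name" ] || [ -z "$archive" ]; then
        echo "usage: ck_add_zip_repo <repo_name> <zip file or URL>" >&2
        return 2
    fi

    case "$archive" in
        http://*|https://*)
            ;;  # URL: passed to CK unchanged
        *)
            if [ ! -f "$archive" ]; then
                echo "archive not found: $archive" >&2
                return 1
            fi
            ;;
    esac

    # Same command as in this README, with the placeholders filled in.
    ck add "repo:$repo_name" "--zip=$archive" --quiet
}
```

After sourcing the function, `ck_add_zip_repo ctuning-datasets ckr-ctuning-datasets.zip` runs the same command as above, but stops with an error message if the zip file has not been downloaded yet.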

If you want to compile and run our benchmarks on Android-based mobile phones, you need to download the Android NDK and register it with CK as described here:

List of other CK data set repositories shared via BitTorrent

We share some large CK repositories in zip via BitTorrent to optimize sharing (upload and download) of such repositories across multiple users. We use the following file name convention for such repositories: ckr--YYYYMMDD.zip.

Publications

@inproceedings{ck-date16,
    title = {{Collective Knowledge}: towards {R\&D} sustainability},
    author = {Fursin, Grigori and Lokhmotov, Anton and Plowman, Ed},
    booktitle = {Proceedings of the Conference on Design, Automation and Test in Europe (DATE'16)},
    year = {2016},
    month = {March},
    url = {https://www.researchgate.net/publication/304010295_Collective_Knowledge_Towards_RD_Sustainability}
}

@inproceedings{Fur2009,
  author =    {Grigori Fursin},
  title =     {{Collective Tuning Initiative}: automating and accelerating development and optimization of computing systems},
  booktitle = {Proceedings of the GCC Developers' Summit},
  year =      {2009},
  month =     {June},
  location =  {Montreal, Canada},
  keys =      {http://www.gccsummit.org/2009},
  url  =      {https://scholar.google.com/citations?view_op=view_citation&hl=en&user=IwcnpkwAAAAJ&cstart=20&citation_for_view=IwcnpkwAAAAJ:8k81kl-MbHgC}
}

Feedback

If you have problems, questions or suggestions, do not hesitate to get in touch via the following mailing lists: