A wide range of heterogeneous test data, useful during technical application development
Arc Perl R
bio NCBI-Protein RefSeq, nr, No85, format: gffp
databases added markdown readme to datahub DBs::metabolic
json_compressibility Readme for the JSON compressibility shootout
json_testfiles commited current set of files
thesis_misc initial upload of thesis related files
viz rem captions added Data-Hub readme
chemicalsOrganicList.json initial upload; for testing algorithms using organochemical names
chemicalsOrganicList_reversed.json initial upload; for testing algorithms using organochemical names
crime-data_geojson.json inicom, JSON compressibility test
dictionary_mozilla_en-US.dic inicom
dictionary_mozilla_en-US.dic.json inicom
geo_autpol_anon.json inicom: geo, etc test data
json_organisms_chemicals.json complex json data sample
met_aggregator_terms.json met aggregator terms as json
rpc_handler_kegg_organisms.json JSON formatted Data of organisms in the KEGG DB

Test Data Hub for Developers


This folder contains a varied set of files in common formats often found in web-based and cross-platform frameworks. The collection is meant to cover a wide, heterogeneous range of information, from social media to chemistry.

Some files were originally provided with the 'Dojo Framework v1.6'.


The files are intended to serve as a standard test set that can be used as input data during application development.
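As a rough illustration of using one of these files as input data, the following Python sketch loads a JSON file and reports its top-level structure. The helper names `load_test_file` and `describe` are hypothetical and not part of this repository, and the sketch makes no assumption about the internal layout of any particular file.

```python
import json
from pathlib import Path
from typing import Any


def load_test_file(path: str) -> Any:
    """Parse one JSON test file and return the decoded Python object."""
    # Hypothetical helper; assumes the file is UTF-8 encoded JSON.
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def describe(data: Any) -> str:
    """Return a one-line summary of the top-level JSON structure."""
    if isinstance(data, dict):
        return f"object with {len(data)} keys"
    if isinstance(data, list):
        return f"array with {len(data)} items"
    return f"scalar of type {type(data).__name__}"
```

From the repository root, a call such as `describe(load_test_file("chemicalsOrganicList.json"))` would give a quick first look at a file before wiring it into an application.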

Possible use cases include:

  • sentiment mining
  • data visualization
  • data correlation
  • semantic maps
  • data converters
  • data modeling processes
  • database schemas
  • metadata
  • general purpose sets
  • benchmarks
  • etc.
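For the benchmark scenario, one simple measurement is how well a JSON file compresses. The sketch below is only an assumption about what such a test could look like; it is not the method used in the `json_compressibility` folder, and the function names `gzip_ratio` and `minified` are made up for this example.

```python
import gzip
import json


def gzip_ratio(raw: bytes, level: int = 9) -> float:
    """Return compressed size divided by original size (lower is better)."""
    if not raw:
        raise ValueError("cannot compute a ratio for empty input")
    return len(gzip.compress(raw, compresslevel=level)) / len(raw)


def minified(raw: bytes) -> bytes:
    """Re-serialize JSON without insignificant whitespace."""
    # ensure_ascii=False keeps non-ASCII characters as UTF-8 bytes
    # instead of expanding them into \uXXXX escapes.
    text = json.dumps(json.loads(raw), separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")
```

Comparing `gzip_ratio(raw)` with `gzip_ratio(minified(raw))` for the same file shows how much of the compression gain comes from whitespace alone.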


This collection is still in progress and no package has been released yet. Feel free to suggest or commit your own files.
