omr-dataset

Vision

Inspired by the well-known MNIST public database (60000 labelled images of handwritten digits), we see the need for a widely known and representative data set to support the development of applications in the specific domain of Optical Music Recognition (OMR).

Purpose

  • OMR samples for the training and testing of symbol classifiers
  • Ground-truth material for the evaluation or comparison of OMR engines

Organization

Ultimately, once data structuring and content are sufficiently validated, we think this reference should preferably be hosted by the International Music Score Library Project (IMSLP).

Meanwhile, the purpose of this omr-dataset GitHub repository is to gather the material used to build preliminary versions of the target reference.

Usage

This project is built with Gradle and can be driven from an IDE or from the command line.

[NOTE: Noise addition tools are not yet included in this Gradle build]

From the command line, for a full rebuild, use:

    gradle clean build

To just display usage rules, use:

    gradle run

This will display:

   Syntax:
      [OPTIONS] -- [INPUT_FILES]
   
   @file:
    Content to be extended in line
   
   Options:
    -clean             : Cleans up output
    -controls          : Generates control images
    -features          : Generates .csv and .dat files
    -help              : Displays general help then stops
    -mistakes          : Saves mistake images
    -model <.zip file> : Defines path to model
    -names             : Prints all possible symbol names
    -nones             : Generates none symbols
    -output <folder>   : Defines output directory
    -subimages         : Generates subimages
    -training          : Trains classifier on features
   
   Input file extensions:
    .xml: annotations file

To clean up output, use:

    gradle run -PcmdLineArgs="-output,data/output,-clean"

To generate features, with all options, using input from data/input-images, use:

    gradle run -PcmdLineArgs="-output,data/output,-features,-nones,-controls,-subimages,--,data/input-images"

To launch training on the generated features, while saving mistake images and targeting a specific model file, use:

    gradle run -PcmdLineArgs="-output,data/output,-training,-mistakes,-model,data/patch-classifier.zip"
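The commands above pass all program arguments to Gradle as a single comma-separated string. As a minimal sketch of that convention, the helper below joins ordinary shell arguments with commas and prints the resulting command instead of running it. The names `join_args` and `gradle_cmd` are hypothetical and not part of this project, and the sketch assumes no argument contains a comma itself.

```shell
#!/usr/bin/env bash

# Hypothetical helper: join all arguments with commas,
# the separator expected inside -PcmdLineArgs.
join_args() {
  local IFS=,
  printf '%s\n' "$*"
}

# Hypothetical helper: print (do not run) the gradle command
# for the given program arguments.
gradle_cmd() {
  printf 'gradle run -PcmdLineArgs="%s"\n' "$(join_args "$@")"
}

# Same arguments as the clean-up and feature-generation examples above.
gradle_cmd -output data/output -clean
gradle_cmd -output data/output -features -nones -controls -subimages -- data/input-images
```

Printing the command first makes it easy to check the argument string before launching a long task.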

Remark: the training task takes about 15 minutes when run on the toy example in the data/input-images folder. To monitor the neural network while it is being trained, open a browser at http://localhost:9000.
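The three steps described in this section (clean-up, feature generation, training) can be chained in one script. The following is only a sketch under stated assumptions: the order clean, then features, then training is our reading of the text, and the variable names (`OUT`, `INPUT`, `MODEL`, `DRY_RUN`) and the functions `run_step` and `pipeline` are hypothetical. By default the script only prints the commands; set `DRY_RUN=0` to actually invoke gradle.

```shell
#!/usr/bin/env bash
set -eu

# Paths taken from the examples above; override via the environment if needed.
OUT=${OUT:-data/output}
INPUT=${INPUT:-data/input-images}
MODEL=${MODEL:-data/patch-classifier.zip}
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only, 0 = run them

# Run (or just print) one gradle invocation with the given argument string.
run_step() {
  if [ "$DRY_RUN" = 1 ]; then
    printf 'gradle run -PcmdLineArgs="%s"\n' "$1"
  else
    gradle run -PcmdLineArgs="$1"
  fi
}

# Assumed order: clean the output, generate features, then train.
pipeline() {
  run_step "-output,$OUT,-clean"
  run_step "-output,$OUT,-features,-nones,-controls,-subimages,--,$INPUT"
  run_step "-output,$OUT,-training,-mistakes,-model,$MODEL"
}

pipeline
```

Because of `set -e`, a failing step stops the script, so training is never launched on a failed feature generation.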

Development

See the related wiki for more details.
