Commit
Small change to the dataset section.
1 parent
18e92b0
commit 9488767
Showing 1 changed file with 3 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,17 +1,17 @@ | ||
|
||
An attempt was made to find a dataset with handwritten text, but we failed to find one that suited our requirements. | ||
An attempt was made to find a dataset with handwritten text, but no dataset that fulfilled our requirements was found. | ||
The datasets that were found would require a lot of preprocessing. | ||
Figure~\ref{} shows a sample from that kind of dataset. | ||
To get good results from that kind of dataset, it would be necessary to implement baseline slant normalization, skew correction, skeletonization, and so on. | ||
|
||
Therefore, instead of spending a lot of time preprocessing the datasets, we implemented a Graphic User Interface to create our own dataset. | ||
Therefore, instead of spending a lot of time preprocessing the datasets, we implemented a Graphical User Interface to create our own dataset. | ||
The largest advantages of this solution are that it records one-pixel-wide lines and that the characters are already separated. | ||
A large part of the work, image processing, was thus reduced significantly. | ||
Our dataset contains 100 examples for every capital letter in the Latin alphabet | ||
\footnote{The dataset is available together with the source code of the system. See appendix~\ref{app:source_code}.}. | ||
An example image from our character image dataset can be found in Figure~\ref{fig:image_feature_extraction}. | ||
|
||
To get a dataset for training the word classifier a generator was created\footnote{Please see HandReco\/src\/api\/word\_examples\_generator.py in the source code for documentation of the word example generator. See appendix~\ref{app:source_code}.}. | ||
To get a dataset for training the word classifier a generator was created\footnote{Please, see HandReco\/src\/api\/word\_examples\_generator.py in the source code for documentation of the word example generator. See appendix~\ref{app:source_code}.}. | ||
The generator creates random errors in the words given as input. | ||
Generating the dataset is obviously not optimal for practical applications, but it is good enough to test the implementation. | ||
|
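The idea of creating random errors in input words can be illustrated with a minimal sketch. This is not the project's actual `word_examples_generator.py`. The function names, the `error_rate` parameter, and the choice to model errors as single-letter substitutions over capital Latin letters are all assumptions made for illustration.

```python
import random
import string


def add_random_errors(word, error_rate=0.2, alphabet=string.ascii_uppercase, rng=None):
    """Return a copy of `word` with some letters substituted at random.

    Hypothetical helper: each letter is independently replaced, with
    probability `error_rate`, by a different letter from `alphabet`.
    """
    rng = rng or random.Random()
    noisy = []
    for ch in word:
        if rng.random() < error_rate:
            # Pick a replacement that differs from the original letter.
            noisy.append(rng.choice([c for c in alphabet if c != ch]))
        else:
            noisy.append(ch)
    return "".join(noisy)


def generate_word_examples(words, examples_per_word=10, error_rate=0.2, seed=None):
    """Build (noisy_word, correct_word) pairs for training a word classifier.

    Hypothetical helper: a fixed `seed` makes the generated dataset reproducible.
    """
    rng = random.Random(seed)
    return [
        (add_random_errors(word, error_rate, rng=rng), word)
        for word in words
        for _ in range(examples_per_word)
    ]


if __name__ == "__main__":
    for noisy, correct in generate_word_examples(["HELLO", "WORLD"], examples_per_word=3, seed=1):
        print(noisy, "->", correct)
```

Substitution keeps each noisy word the same length as its label, which suits a setting where the characters are already separated. Insertions or deletions would need a different error model.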