From 948876770f033e08bd8f68f88f1f7bb0debc7a02 Mon Sep 17 00:00:00 2001
From: Kjell Winblad
Date: Mon, 1 Aug 2011 14:16:26 +0200
Subject: [PATCH] Small change to the dataset section.

---
 report/dataset.tex | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/report/dataset.tex b/report/dataset.tex
index 3142433..086f52c 100644
--- a/report/dataset.tex
+++ b/report/dataset.tex
@@ -1,17 +1,17 @@
-An attempt was made to find a dataset with handwritten text, but we failed to find one that suited our requirements.
+An attempt was made to find a dataset with handwritten text, but no dataset that fulfilled our requirements was found.
 The datasets that were found would require a lot of preprocessing.
 Figure~\ref{} shows a sample from that kind of dataset.
 To get good result from that kind of dataset it would be necessary to implement baseline slant normalization, skew correction, skeleton and so on.
-Therefore, instead of spending a lot of time preprocessing the datasets, we implemented a Graphic User Interface to create our own dataset.
+Therefore, instead of spending a lot of time preprocessing the datasets, we implemented a Graphical User Interface to create our own dataset.
 The largest advantages of this solution is that our solution records one pixel wide lines and the characters are already separated.
 The large part of the work, image processing, was thus reduced significantly.
 Our dataset contains 100 examples for every capital letter in the Latin alphabet \footnote{The dataset is available together with the source code of the system. See appendix~\ref{app:source_code}.}.
 An example image from our character image dataset can be found in Figure~\ref{fig:image_feature_extraction}.
-To get a dataset for training the word classifier a generator was created\footnote{Please see HandReco\/src\/api\/word\_examples\_generator.py in the source code for documentation of the word example generator. See appendix~\ref{app:source_code}.}.
+To get a dataset for training the word classifier a generator was created\footnote{Please, see HandReco\/src\/api\/word\_examples\_generator.py in the source code for documentation of the word example generator. See appendix~\ref{app:source_code}.}.
 The generator creates random errors in the words given as input.
 To generate the dataset is obviously not optimal for practical applications, but it is good enough to test the implementation.