This project is a helper for our main work on detecting handwritten whiteboard content in lecture videos. It is a fork of TextBoxes.
- Clone this repository. We will call the clone directory $CAFFE_ROOT.
git clone https://github.com/bhargavaurala/accessmath-textboxes.git
cd accessmath-textboxes
- Edit the makefile configuration file Makefile.config according to your system's needs. Refer to the Caffe installation instructions for details about dependencies. Make sure the Python wrapper dependencies are installed, since this project needs them. This code has been tested on Ubuntu 14.04.
cp Makefile.config.example Makefile.config
mkdir build
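The configuration edit can also be scripted. The sketch below assumes Makefile.config still contains commented-out lines of the form `# NAME := 1`, as in stock Caffe's Makefile.config.example. `enable_option` is a hypothetical helper, not part of this repository, and the option names in the usage comment are taken from stock Caffe purely as illustration; which ones you need depends on your system.

```shell
# Hypothetical helper: uncomment a "# NAME := 1" line in a Makefile-style config.
# Usage: enable_option <config-file> <OPTION_NAME>
enable_option() {
    config_file=$1
    option=$2
    # Strip the leading "#" so the option takes effect; other lines are untouched.
    sed "s/^#[[:space:]]*\(${option}[[:space:]]*:=[[:space:]]*1\)/\1/" "$config_file" > "$config_file.tmp" &&
        mv "$config_file.tmp" "$config_file"
}

# Example (assumed option names from stock Caffe):
# enable_option Makefile.config WITH_PYTHON_LAYER
# enable_option Makefile.config CPU_ONLY
```

Writing to a temporary file and moving it into place avoids relying on `sed -i`, whose syntax differs between GNU and BSD sed.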
- Build Caffe and its Python wrapper, then test that the build succeeded.
make -j8
make py
export PYTHONPATH=$PYTHONPATH:$CAFFE_ROOT/python
python -c "import caffe"
- Models trained on ICDAR 2013: Dropbox link BaiduYun link
- Fully convolutional reduced (atrous) VGGNet: Dropbox link BaiduYun link
- Compiled mex file for evaluation (for multi-scale test evaluation: evaluation_nms.m): Dropbox link BaiduYun link
- Frame version of the AccessMath dataset from here. Download the 3-part zip archive, extract it into a folder called AccessMathVOC, and place that folder in the AccessMath-ICFHR18 project root.
export AM_DATA_DIR=/path/to/AccessMathVOC
cd $CAFFE_ROOT/data/AccessMath
./create_data.sh
- This will create train, validation and test LMDBs in $AM_DATA_DIR/AccessMath/lmdb
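Before training, it may be worth confirming that the LMDBs were actually written. The sketch below relies only on the fact that an LMDB environment is a directory containing a `data.mdb` file. `is_lmdb` and `check_lmdbs` are hypothetical helpers, not scripts shipped with this repository.

```shell
# Hypothetical helper: succeed only if <dir> looks like an LMDB environment,
# i.e. it is a directory containing a non-empty data.mdb file.
is_lmdb() {
    [ -d "$1" ] && [ -s "$1/data.mdb" ]
}

# Report, for every sub-directory of the LMDB root, whether it looks populated.
# Returns non-zero if any sub-directory is not a populated LMDB.
check_lmdbs() {
    lmdb_root=$1
    rc=0
    for db in "$lmdb_root"/*/; do
        if is_lmdb "$db"; then
            echo "ok:      $db"
        else
            echo "missing: $db"
            rc=1
        fi
    done
    return $rc
}

# Example: check_lmdbs "$AM_DATA_DIR/AccessMath/lmdb"
```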
- In models/VGGNet/text/longer_conv_300x300/, modify data_param in the first layer (data) in train.prototxt and test.prototxt as shown below
data_param {
source: "/path/to/AccessMathVOC/AccessMath/lmdb/AccessMath_train_lmdb"
batch_size: 32
backend: LMDB
}
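If you prefer not to edit the prototxt files by hand, the `source` path can be rewritten with `sed`. This is a minimal sketch that assumes each file contains a single `source: "..."` entry, the one in `data_param`; it rewrites every matching line, so check the result if your file has more. `set_lmdb_source` is a hypothetical helper, and the path must not contain `|` or `&`, which are special to `sed` here.

```shell
# Hypothetical helper: point the `source:` entry of a prototxt at a new LMDB.
# Usage: set_lmdb_source <prototxt> <lmdb-path>
set_lmdb_source() {
    prototxt=$1
    lmdb_path=$2
    # '|' is the sed delimiter so that slashes in the path need no escaping.
    # Indentation and the "source:" key are kept; only the quoted value changes.
    sed "s|^\([[:space:]]*source:[[:space:]]*\)\".*\"|\1\"${lmdb_path}\"|" "$prototxt" > "$prototxt.tmp" &&
        mv "$prototxt.tmp" "$prototxt"
}

# Example:
# set_lmdb_source models/VGGNet/text/longer_conv_300x300/train.prototxt \
#     "$AM_DATA_DIR/AccessMath/lmdb/AccessMath_train_lmdb"
```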
- Use the following commands to train, starting from the model trained on ICDAR 2013:
cd $CAFFE_ROOT/build/tools
./caffe train_net -iterations 10000 -solver models/VGGNet/text/longer_conv_300x300/solver.prototxt -weights /path/to/model_trained_on_icdar2013
- The final validation performance should be around 77.5%.
- Transfer the model to models/text_detection in the AccessMath root folder
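Caffe writes snapshots named `<prefix>_iter_<N>.caffemodel`, so the most recent model can be picked out by iteration number. The sketch below assumes that naming; the snapshot directory depends on the `snapshot_prefix` in your solver.prototxt and is left as a placeholder. `latest_snapshot` is a hypothetical helper, not part of this repository.

```shell
# Hypothetical helper: print the snapshot with the highest iteration number in <dir>.
# Prints nothing and returns non-zero if no snapshot is found.
latest_snapshot() {
    snapshot_dir=$1
    best_iter=-1
    best_file=
    for f in "$snapshot_dir"/*_iter_*.caffemodel; do
        [ -e "$f" ] || continue          # glob matched nothing
        iter=${f##*_iter_}               # drop everything up to "_iter_"
        iter=${iter%.caffemodel}         # drop the extension
        case $iter in
            ''|*[!0-9]*) continue ;;     # skip names without a numeric iteration
        esac
        if [ "$iter" -gt "$best_iter" ]; then
            best_iter=$iter
            best_file=$f
        fi
    done
    [ -n "$best_file" ] && printf '%s\n' "$best_file"
}

# Example (both paths are placeholders):
# cp "$(latest_snapshot /path/to/snapshots)" /path/to/AccessMath/models/text_detection/
```

Comparing iteration numbers numerically, rather than sorting file names, keeps `iter_10000` ahead of `iter_2000`.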