by Yingxin LIN
- Unlike the common CNN-based approach to MNIST image recognition, I use NumPy and OpenCV to extract features from each MNIST image and then train an XGBoost recognition model on them. After step-by-step parameter tuning, the best model reaches 88% accuracy on the test set.
- In addition, because the code relies heavily on NumPy broadcasting instead of explicit loops, it runs fast.
- I also define a handwritten-digit edge-scanning function built entirely on NumPy, which counts the "on" pixels within the image edges quickly and precisely. Some scanning results are shown below:
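
To give a rough idea of what loop-free edge scanning can look like, here is a minimal sketch. It is not the scanning function from this repository: the name `edge_scan_features`, the threshold of 128, and the choice of features (per-row and per-column counts of "on" pixels, plus the distance from the left and right edges to the first "on" pixel in each row) are all assumptions made for illustration.

```python
import numpy as np


def edge_scan_features(images, threshold=128):
    """Hypothetical helper: vectorized edge scan over a batch of images.

    images: uint8 array of shape (n, height, width).
    Returns an int array of shape (n, 2 * height + 2 * width) when
    height == width, laid out as
    [row counts | column counts | left distances | right distances].
    """
    # The scalar threshold is broadcast against the whole batch at once.
    on = images >= threshold                      # (n, h, w) boolean mask
    width = on.shape[2]

    row_counts = on.sum(axis=2)                   # "on" pixels in each row
    col_counts = on.sum(axis=1)                   # "on" pixels in each column

    # argmax on a boolean row returns the index of the first True value.
    # Rows with no "on" pixel get the full width as their distance.
    has_on = on.any(axis=2)
    left = np.where(has_on, on.argmax(axis=2), width)
    right = np.where(has_on, on[:, :, ::-1].argmax(axis=2), width)

    return np.concatenate([row_counts, col_counts, left, right], axis=1)


# Tiny 4x4 example with three "on" pixels.
demo = np.zeros((1, 4, 4), dtype=np.uint8)
demo[0, 1, 2] = 255
demo[0, 2, 1] = 255
demo[0, 2, 2] = 255
features = edge_scan_features(demo)
```

Every step works on the full `(n, height, width)` batch, so no Python loop over images or pixels is needed. A feature matrix of this shape could then be passed to a classifier such as XGBoost.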
- Train set: train-labels.gz (labels) + train-images-idx3-ubyte.gz (features)
- Test set: test-labels.gz (labels) + t10k-images-idx3-ubyte.gz (features)
- Unzip the files with the '.gz' suffix before running the code.
- You can learn more details from the PDF file Data ming report & Userguide (in Simplified Chinese).pdf.
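
The MNIST files use the IDX binary layout: a 4-byte magic number (two zero bytes, a data-type code, and the number of dimensions), followed by one big-endian 32-bit size per dimension, then the raw data. The sketch below shows one way to read such a file into a NumPy array. `read_idx` is a hypothetical helper, not part of this repository; it assumes unsigned-byte data (type code `0x08`, as in MNIST) and accepts both unzipped and '.gz' files, whereas the code in this repository expects unzipped files.

```python
import gzip
import struct

import numpy as np


def read_idx(path):
    """Hypothetical helper: read an IDX file (optionally gzipped) as uint8."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as f:
        data = f.read()

    # Magic number: two zero bytes, one type-code byte, one ndim byte.
    zero, type_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0 or type_code != 0x08:
        raise ValueError("not an unsigned-byte IDX file")

    # One big-endian unsigned 32-bit integer per dimension.
    header_size = 4 + 4 * ndim
    dims = struct.unpack(">" + "I" * ndim, data[4:header_size])

    # The remaining bytes are the array contents in row-major order.
    array = np.frombuffer(data, dtype=np.uint8, offset=header_size)
    return array.reshape(dims)
```

For example, `read_idx("train-images-idx3-ubyte")` should return an array of shape `(60000, 28, 28)` for the standard MNIST training images, and the label files yield one-dimensional arrays.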
- Author: Yingxin LIN
- Affiliation: School of Finance, Central University of Finance and Economics (CUFE)
- Contact: lyxurthebest@163.com or lyxurthebest@outlook.com
- Copyright belongs to Yingxin LIN, 2021/08/11.