Visual information perception remains a central task for machine learning models. Image recognition has made significant progress over the past decade, yet human perception covers a wider range of visual information, encoded in more abstract forms of presentation. Learning from sketches could help machines improve their capability for abstract image perception. We investigated feature extraction methods on Google’s Quick, Draw! dataset, using Principal Component Analysis, Self-Supervised Learning, and Variational Autoencoder models, and analyzed the performance of each method on the sketch classification task. In our experiments, we found that self-supervised learning, a state-of-the-art technique for real-image classification, achieves the best performance on sketch recognition. Furthermore, under our models, the pixel format of sketches yields better classification accuracy than the corresponding stroke/temporal format. Based on our results, we suggest that future work explore self-supervised learning techniques for sketch recognition and investigate better ways to exploit the temporal stroke data of sketches.
Index Terms: sketch recognition, image recognition, self-supervised learning, pretext task, deep learning