Abstract

In recent years, more and more businesses have started to deploy AI-based technologies to take advantage of the growing amount of information that can be collected or is already available. Our system aims to help businesses such as retail shops better understand the needs and preferences of their customers, based on what actually captures the customers' attention, and thus improve the development of their commercial strategy.
The problem of commercial strategy optimization has become central in recent years. We developed a solution that helps shops understand their customers' trends and preferences so that they can design better commercial strategies. To do this, we propose a Computer Vision based gaze estimation system paired with a demographic estimator, which aims to provide meaningful information about the products that capture customers' attention the most. The system estimates where each customer is looking and builds a heatmap of gazes, divided by customer category, which the shop can use to design its product display strategy.

In the scenario we have in mind, a fixed camera is installed in a shelf or shop window, so that it captures a frontal image of a subject looking at the products. It is important that the camera has a fixed position: this allows the shop manager to correlate the heatmap of gazes created by the system with the position of the items in the shelf or window. It is also worth noting that highly accurate gaze tracking is not a strict requirement in our case. This allows us to reduce the camera frame rate, which simplifies the problem and lightens the computational effort required.
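The per-category heatmap of gazes described above can be illustrated with a minimal sketch. It is not the project's actual implementation: the grid size, the normalized gaze coordinates, the category labels and the function names (`make_heatmaps`, `add_gaze`, `normalize`) are all assumptions made for illustration. The sketch assumes that each gaze estimate has already been projected onto the shelf or window plane as a point (x, y) in [0, 1] x [0, 1], and that the demographic estimator has assigned the customer to a category.

```python
from collections import defaultdict

import numpy as np

# Hypothetical resolution of the grid laid over the shelf/window plane.
GRID_ROWS, GRID_COLS = 4, 6


def make_heatmaps():
    """Return a mapping from customer category to an empty gaze-count grid."""
    return defaultdict(lambda: np.zeros((GRID_ROWS, GRID_COLS), dtype=np.int64))


def add_gaze(heatmaps, category, x, y):
    """Record one gaze point (x, y), normalized to [0, 1], for a category.

    Points outside the shelf/window plane are ignored.
    """
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        return
    # Map normalized coordinates to a grid cell; clamp x == 1.0 or y == 1.0
    # to the last column or row.
    col = min(int(x * GRID_COLS), GRID_COLS - 1)
    row = min(int(y * GRID_ROWS), GRID_ROWS - 1)
    heatmaps[category][row, col] += 1


def normalize(heatmap):
    """Convert raw counts into the fraction of gazes falling in each cell."""
    total = heatmap.sum()
    return heatmap / total if total > 0 else heatmap.astype(float)


# Example with made-up samples: (category, x, y).
heatmaps = make_heatmaps()
samples = [
    (("female", "25-34"), 0.10, 0.20),
    (("female", "25-34"), 0.12, 0.22),
    (("male", "45-54"), 0.80, 0.60),
]
for category, x, y in samples:
    add_gaze(heatmaps, category, x, y)

for category, grid in heatmaps.items():
    print(category)
    print(normalize(grid))
```

Because the camera is fixed, each grid cell always corresponds to the same region of the shelf or window, so the shop manager can compare the cells with the positions of the items on display.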
The system consists of three main components:
- A CNN-based network for gaze estimation.
- A CNN-based network for demographic classification (gender and age estimation).
- A retrieval component, which helps the second network improve its accuracy.
The following scheme shows how the three components interact.

