Auditory image understanding for the visually impaired

mbanf/PictureSensation

About

In the "Critique of Pure Reason", Immanuel Kant stated that our knowledge of the outside world depends on our modes of perception: "What is first given to us is appearance. When combined with consciousness, it is called perception."

Leonardo da Vinci proclaimed: "The eye encompasses the beauty of the whole world."

Making this visual beauty of the world more accessible to visually impaired people has long inspired researchers in computer vision. Perhaps the most ambitious software solution would be an algorithm that produces a semantic description of the image content, which a speech synthesis device then outputs in natural language. Such an automated image analysis system would mimic a partner with normal vision who describes the image to the user. However, automated image understanding will remain a challenge to researchers for many years, and even a working system would still deprive the visually impaired of a direct perceptual experience, an active exploration, and an impression of where things are in the image and what they look like.

The approaches described in this thesis therefore augment the sensory capabilities of visually impaired persons by translating image content into sounds. The task of analyzing and understanding images is still up to the user, which is why we call our approach “auditory image understanding”. Very much like a blind person who explores a Braille text or a bas-relief image haptically with the tip of her finger, our users touch the image (via touch pad or touch screen) and experience the local properties of the image as auditory feedback. Because the sensory mapping from visual to auditory is simple and direct, we can harness the human ability to learn, and we consider the brain of the user a fundamental part of the system.

Visually impaired persons can use the system to analyze images that they find on the internet, but also personal photos that their friends or loved ones want to share with them. It is this application scenario that makes the direct perceptual access most valuable. The user feedback that we received for our system indicates that visually impaired persons appreciate obtaining more than an abstract verbal description, and that images cease to be meaningless entities to them. In the words of one adult participant:

"What amazes me is that I start to develop some sort of a spatial imagination of the scene within my mind which really corresponds with what is shown in the image."

Video

PictureSensation

Software

PictureSensation

PictureSensation is a mobile application for the hapto-acoustic exploration of images. It is designed to give the visually impaired direct perceptual access to images via an acoustic signal. PictureSensation introduces a swipe-gesture based, speech-guided, barrier-free user interface that allows a blind user to operate it autonomously. It implements a recently proposed exploration and audification principle, which harnesses exploration methods that the visually impaired are used to from everyday life. In brief, a user actively explores an image on a touch screen and receives auditory feedback about its content at the current finger position. PictureSensation provides an extensive tutorial and training mode, so that a blind user can become familiar with the application itself, as well as with the principles of transforming image content into sound, without any assistance from a normal-sighted person. We show our application’s potential to help visually impaired individuals explore, interpret and understand entire scenes, even on small smartphone screens. By providing more than just verbal scene descriptions, PictureSensation is a valuable mobile tool that grants the blind access to the visual world through exploration, anywhere.
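The exploration principle described above (touch a point, hear what is under the finger) can be illustrated with a minimal sketch. This is not the application's code: the function names `touch_to_pixel` and `local_mean_colour`, the assumption that the image is stretched over the whole screen, and the window-averaging step are all hypothetical choices made for illustration. The sketch only shows how a finger position could be turned into a local colour sample that a sonification stage would then render as sound.

```python
def touch_to_pixel(touch_x, touch_y, screen_w, screen_h, img_w, img_h):
    """Map a touch position on the screen to the image pixel underneath it.

    Assumption: the image fills the entire screen (hypothetical layout).
    """
    px = min(int(touch_x / screen_w * img_w), img_w - 1)
    py = min(int(touch_y / screen_h * img_h), img_h - 1)
    return max(px, 0), max(py, 0)


def local_mean_colour(image, px, py, radius=1):
    """Average the RGB values in a small square window around (px, py).

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    The window is clipped at the image borders.
    """
    height, width = len(image), len(image[0])
    total = [0, 0, 0]
    count = 0
    for y in range(max(0, py - radius), min(height, py + radius + 1)):
        for x in range(max(0, px - radius), min(width, px + radius + 1)):
            for c in range(3):
                total[c] += image[y][x][c]
            count += 1
    return tuple(t / count for t in total)


# A tiny 4x4 test image: left half red, right half blue.
RED, BLUE = (255, 0, 0), (0, 0, 255)
image = [[RED, RED, BLUE, BLUE] for _ in range(4)]

# A finger at (100, 300) on a 400x400 screen lands in the red half.
px, py = touch_to_pixel(100, 300, screen_w=400, screen_h=400, img_w=4, img_h=4)
print((px, py), local_mean_colour(image, px, py, radius=0))
```

Averaging over a small window rather than reading a single pixel is one plausible way to make the feedback less sensitive to tiny finger movements; whether PictureSensation does this is not stated here.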

Contact the author if you have found a bug.

Design

An illustration of our barrier-free user interface design principle. Two-finger swipe gestures are used to navigate through the different modes of the application.

(a) The PictureSensation application in exploration mode and (b) in colour training mode, in comparison with (c) the original desktop-based research prototype [6].

(a) The audible colour space representation as first proposed by Banf and Blanz [5]. MIDI instruments represent opponent colours. (b) The novel sonification model [6] used in PictureSensation. Opponent colours are represented by complementary sound characteristics.
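To make the idea of opponent colours concrete, the following sketch converts an RGB value into red-green and blue-yellow opponent channels and derives one gain per "voice". It is a simplified illustration only and not the published sonification model: the conversion formula, the function names `opponent_channels` and `voice_gains`, and the idea of one gain per colour voice are assumptions. The point it shows is that two opponent colours (red and green, or blue and yellow) are never active at the same time, which is what makes it natural to assign them opposing instruments or sound characteristics.

```python
def opponent_channels(r, g, b):
    """Convert 8-bit RGB to simple opponent channels (assumed formula)."""
    r, g, b = r / 255.0, g / 255.0, b / 255.0
    red_green = r - g                 # +1 = pure red, -1 = pure green
    blue_yellow = b - (r + g) / 2.0   # +1 = pure blue, -1 = pure yellow
    luminance = (r + g + b) / 3.0     # 0 = black, 1 = white
    return red_green, blue_yellow, luminance


def voice_gains(r, g, b):
    """Derive one gain in [0, 1] per hypothetical colour voice.

    Each opponent pair shares one signed channel, so at most one voice
    of a pair has a non-zero gain for any given colour.
    """
    rg, by, lum = opponent_channels(r, g, b)
    return {
        "red": max(rg, 0.0),
        "green": max(-rg, 0.0),
        "blue": max(by, 0.0),
        "yellow": max(-by, 0.0),
        "luminance": lum,
    }


for name, rgb in [("blue", (0, 0, 255)), ("yellow", (255, 255, 0)),
                  ("grey", (128, 128, 128))]:
    print(name, voice_gains(*rgb))
```

In this sketch a neutral grey produces no colour voice at all and is characterised by luminance alone, while saturated colours drive exactly one voice per opponent pair.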

Experiments

References

Banf, M., Mikalay, M., Watzke, B., and Blanz, V., “PictureSensation – a mobile application to help the blind explore the visual world through touch and sound,” Journal of Rehabilitation and Assistive Technologies Engineering, 2016.

Banf, M., and Blanz, V., “Man Made Structure Detection and Verification of Object Recognition in Images for the Visually Impaired,” Proceedings of Mirage 2013, 6th International Conference on Computer Vision/Computer Graphics Collaboration Techniques and Applications – in Cooperation with Eurographics Association, June 6-7, 2013, Berlin, Germany.

Banf, M., and Blanz, V., “Sonification of Images for the Visually Impaired using a Multi-Level Approach,” Proceedings of Augmented Human 2013 – in Cooperation with ACM SIGCHI, March 7-8, 2013, Stuttgart, Germany, pp. 162-169.

Banf, M., and Blanz, V., “A Modular Computer Vision Sonification Model for the Visually Impaired,” Proceedings of the 18th International Conference on Auditory Display, June 18-21, 2012, Atlanta, Georgia.

Banf, M., “Auditory Image Understanding for the Visually Impaired Based on a Modular Computer Vision Sonification Model,” PhD Thesis, 2013, Siegen.
