# Week 9 - Human Visual System

The human visual system has three main levels or stages:  
1. FEATURES - Parallel processing to extract low-level properties: colour, texture, lines and movement.  
2. PATTERNS - Rapid serial processing divides the visual field into regions of similar colour or texture and achieves 'proto-object' recognition of surfaces, boundaries and relative depth. This is driven both top-down by visual attention and bottom-up by low-level properties.  
3. MEMORY - Visual working memory: object recognition & attention. This is under conscious control.  

## Stage 1: FEATURES - Low-level properties
### The eye
Light enters the human eye through the pupil and then passes through the lens which focuses the (inverted) image onto the retina at the back of the eye. The retina contains two kinds of light-sensitive cells: rods and cones. There are about 100-120 million rods and 6 million cones. Rods are very sensitive to light but only see in monochrome and are not very acute. Cones are less sensitive and do not work well at night, but they see colour and are more acute. Cones are primarily responsible for daytime vision.  

### Marks and channels
Graphics are made up of marks, the basic graphical elements such as glyphs, lines and regions. A mark's visual appearance and spatial attributes such as position, shape and size are given by visual variables. Information graphics map data attributes to these visual variables. Low-level visual processing uses different neural pathways to process different visual variables. These pathways are often called visual channels. Different pathways are used to detect motion, orientation, texture, colour and size. This means that these channels are perceptually distinct.  

Where possible, use different channels to encode different attributes rather than reusing one channel, such as colour, for several attributes. It does not hurt, however, to use a redundant encoding, in which one attribute is encoded by more than one channel.  
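
One way to apply this guideline is to write the mapping from data attributes to visual channels down explicitly and check it. The sketch below is only an illustration: `check_encoding` and the attribute and channel names are hypothetical, not part of any library. It flags channels that carry more than one attribute and lists attributes that are redundantly encoded.

```python
from collections import defaultdict

def check_encoding(encoding):
    """Inspect a list of (attribute, channel) pairs.

    Returns (conflicts, redundant):
      conflicts - channels that encode more than one attribute
      redundant - attributes encoded by more than one channel
    """
    by_channel = defaultdict(set)
    by_attribute = defaultdict(set)
    for attribute, channel in encoding:
        by_channel[channel].add(attribute)
        by_attribute[attribute].add(channel)
    conflicts = {c: sorted(a) for c, a in by_channel.items() if len(a) > 1}
    redundant = {a: sorted(c) for a, c in by_attribute.items() if len(c) > 1}
    return conflicts, redundant

# Hypothetical encoding for a scatter plot.
encoding = [
    ("species", "hue"),
    ("species", "shape"),   # redundant encoding: harmless
    ("body_mass", "size"),
    ("island", "hue"),      # hue now carries two attributes: avoid
]
conflicts, redundant = check_encoding(encoding)
```

Here `conflicts` reports that hue is overloaded, while `redundant` reports that species is encoded by both hue and shape, which is acceptable.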

### Identifying cues  
Categories of cues:
1. Colour  
2. Form  
3. Spatial Position  
4. Motion (consider using motion or blinking)  

Type of cue:  
- Orientation  
- Curved/Straight  
- Shape  
- Size  
- Hue  
- Intensity  
- Light/Dark  
- Enclosure  
- Convex/Concave  
- Addition/Added Marks  
- Velocity  
- Flicker  
- Blur  
- Length  
- Width  

### Colour
Colour is composed of three different channels.   

Cones provide colour vision: they come in three varieties, each with a peak response to a different light frequency within the visible light spectrum. Low-level visual processing encodes these in terms of three opponent colour channels: red to green, blue to yellow, and, the most important channel, black to white which encodes luminance.  

In data visualisation, it is common to think about colour in terms of the HSL colour space:   
- H for hue: the choice of pure colour.  
- S for saturation: how vivid the colour is, from grey (0%) to the pure hue (100%).  
- L for lightness: how light or dark the colour is, from black (0%) through the pure hue (50%) to white (100%).  

HSL is a closer match than RGB to how we perceive colour.  
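
As a minimal sketch of working in HSL, Python's standard `colorsys` module can convert HSL values to RGB. The helper name `hsl_to_hex` and the example hue of 210 degrees are assumptions made for illustration. Note that `colorsys` uses the argument order hue, lightness, saturation.

```python
import colorsys

def hsl_to_hex(hue_deg, saturation, lightness):
    """Convert HSL (hue in degrees, S and L in 0..1) to an '#rrggbb' string."""
    # colorsys expects hue in 0..1 and the argument order H, L, S.
    r, g, b = colorsys.hls_to_rgb((hue_deg % 360) / 360.0, lightness, saturation)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))

# A sequential colour map: hue and saturation are fixed,
# and lightness alone carries the data value.
blues = [hsl_to_hex(210, 0.8, l) for l in (0.85, 0.65, 0.45, 0.25)]
```

Keeping the hue fixed and varying only lightness gives a ramp whose steps differ along the black to white channel described above.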

### Colour-blindness & accessibility
Avoid red-green colour maps and ensure hue is not the only channel encoding information. Choose colour maps that also vary luminance or saturation.  
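
A rough way to check that a colour map varies in luminance, and not only in hue, is to compute the relative luminance of each colour. The sketch below uses the WCAG 2 definition of relative luminance for sRGB colours. The function names and the example palettes are assumptions chosen for illustration.

```python
def relative_luminance(hex_colour):
    """Relative luminance (0 = black, 1 = white) of an sRGB '#rrggbb' colour,
    using the WCAG 2 definition."""
    channels = [int(hex_colour[i:i + 2], 16) / 255.0 for i in (1, 3, 5)]
    # Undo the sRGB gamma curve to get linear-light values.
    linear = [c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
              for c in channels]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def luminance_is_monotonic(palette):
    """True if luminance strictly increases or strictly decreases along the palette."""
    lum = [relative_luminance(c) for c in palette]
    steps = [b - a for a, b in zip(lum, lum[1:])]
    return all(s > 0 for s in steps) or all(s < 0 for s in steps)

# Illustrative palettes: a light-to-dark blue ramp and a red-green-red map.
light_to_dark = ["#f7fbff", "#9ecae1", "#3182bd", "#08306b"]
red_green_red = ["#ff0000", "#00ff00", "#ff0000"]
```

With these inputs, `luminance_is_monotonic(light_to_dark)` is true and `luminance_is_monotonic(red_green_red)` is false, so only the first palette orders its values by luminance as well as by hue.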

### Which visual variable should I use?
Visual variables are not interchangeable: the same data attribute encoded using different visual variables will not be perceived equally effectively. Visual variables differ in terms of:  
- salience: how quickly they are noticed. For instance, movement is more salient than orientation.  
- discriminability: how many distinct values you can encode without confusing the user.  
- accuracy: how easily you can compare different values.  

Scale of effectiveness, most effective first:  
![image.png](attachment:image.png)  
<style type="text/css">
    img {
        width: 600px;
    }
</style>
- Use bar charts rather than pie charts or doughnut charts.   
- If occlusion is not a problem, then a prism map is more effective than a choropleth map for ungrouped data.   
- Never, never use a 3D pie chart!  




## Stage 2: PATTERNS - Proto-object Recognition
### Grouping: Gestalt Laws
Colour, line orientation and frequency, stereoscopic depth and motion are identified in the first stage of visual processing. The next stage identifies contours, regions, foregrounds, and backgrounds. This stage is where pattern perception is used to extract objects from low-level visual features.

The laws of pattern perception are called the Gestalt laws (Gestalt is German for 'pattern').  
They describe the ways in which we automatically organise elements:  
- Proximity: Elements that are close together form groups.  
- Similarity: Elements that are similar in some way, such as colour or shape, form a group.  
- Connectedness: Connection by lines is a powerful way of grouping elements.  
- Continuity: We tend to group regions and lines to form smooth and continuous shapes.  
- Symmetry: We are good at recognising bilateral symmetry, especially around a horizontal or vertical axis and group the symmetric lines together to form an object.  
- Closure and common region: we like to see closed contours and will mentally extend lines to close them. Being 'inside' a closed contour is a very powerful grouping principle.  
- Shared fate | Common fate | Synchrony: Elements that move together are grouped.  
- Figure ground: Elements either stand out prominently in the front (the figure) or recede into the back (the ground).  
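
As a loose computational analogy for the proximity law (not a model of the visual system), marks can be grouped whenever a chain of close neighbours connects them. The sketch below is a simple single-linkage grouping. The function name `group_by_proximity` and the `max_gap` threshold are assumptions for illustration.

```python
import math

def group_by_proximity(points, max_gap):
    """Group 2D points: two points share a group if a chain of neighbours,
    each no further than max_gap apart, connects them (single linkage)."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group's representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= max_gap:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return sorted(groups.values())

dots = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
clusters = group_by_proximity(dots, max_gap=1.5)
```

A designer could use a check like this to estimate which marks in a layout are likely to be read as belonging together.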

### Perceiving 3D
Monocular static cues  
- Occlusion: this is the most important depth cue. Objects in front obscure those behind.  
- Linear perspective: foreshortening, parallel lines converging to a point. We see the sides of the road converge, and people get smaller in the distance.  
- Shape-from-shading: shading suggests whether a dimple is concave or convex. In the adjacent image (Shape from shading, Monash University, 2022), the shading suggests light from above. The preceding Angkor image shows the shape of the elephant’s head.  
- Shape-from-texture distortion: Wireframes make use of this to show the shape.  
- Cast shadows: Cast shadows give a clue about an object's height above the surface on which the shadow is cast.  
- Familiar size: Familiar objects allow us to judge distance because we know how big they actually are.  
- Depth of focus: our eyes change focus to bring the image of the object we are looking at into sharp focus on the fovea. Objects that are closer or further away are blurred, giving an ambiguous clue as to their depth.  
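
The linear-perspective cue in the list above can be made concrete with a pinhole projection, in which the size of an object's image is inversely proportional to its distance. This is a minimal sketch under that simplified camera model. The function name and the example figures are assumptions.

```python
def projected_size(true_size, distance, focal_length=1.0):
    """Size of an object's image under a pinhole (perspective) projection.

    The image shrinks in proportion to 1 / distance, which is why people look
    smaller further away and the sides of a road appear to converge.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return focal_length * true_size / distance

# A 1.7 m tall person seen at 5 m, 10 m and 20 m.
heights = [projected_size(1.7, d) for d in (5, 10, 20)]
```

Each doubling of the distance halves the projected height. The same scaling distorts the apparent size of marks in a 3D chart.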

Monocular dynamic (moving picture)  
- Structure from motion: rotation and movement of an object relative to the observer, allowing them to see it from different points of view, is an extremely important depth cue.  

Binocular  
- Vergence angle: When the eyes look at an object at a certain depth, the visual system can use the difference in angle between the line-of-sight vectors of the two eyes to measure the depth of objects that are close by (roughly within arm’s length).  
- Stereoscopic depth: The visual system can make use of small differences between the images in each eye to see depth. 3D TVs and displays provide stereoscopic vision. While stereoscopic vision (in combination with the other cues) can provide a sense of truly immersive 3D, it is only one of many depth cues and is actually not that important. Something like 20% of people do not have stereoscopic vision, and many never notice its absence.  
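
To see why the vergence angle is mainly useful for nearby objects, the angle can be computed from simple geometry for a point straight ahead. The sketch below assumes an interpupillary distance of 63 mm, which is an illustrative value and not taken from these notes. The function name is also an assumption.

```python
import math

def vergence_angle_deg(distance_m, ipd_m=0.063):
    """Angle between the two eyes' lines of sight when fixating a point
    straight ahead at distance_m. ipd_m is the distance between the pupils."""
    return math.degrees(2.0 * math.atan((ipd_m / 2.0) / distance_m))

for d in (0.25, 0.5, 1.0, 2.0, 10.0):
    print(f"{d:5.2f} m -> {vergence_angle_deg(d):5.2f} degrees")
```

The angle changes by several degrees within arm's length but by only a fraction of a degree between 2 m and 10 m, so it carries little depth information for distant objects.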

Not all of these depth cues are needed to create realistic 3D graphics and may not be needed in some tasks. Occlusion is the most important depth cue. I think structure-from-motion is the next most important and can also mitigate the disadvantage of occlusion hiding information. Ware recommends that if structure-from-motion is used, so should occlusion, linear perspective and texture distortion or else it looks strange.  

Data visualisations also make use of more artificial depth cues to show depth. These include showing gridded ground and side planes to show perspective distortion. An extra cue is to project the 3D data onto these planes. In the case of 3D scatter plots, it is common to drop lines to the ground plane so that the points look like pins.  
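
The artificial cues described above amount to simple geometry. The sketch below computes the drop lines and plane projections for a set of 3D points. The function names and the plane positions are assumptions, and the result would still need to be passed to a plotting library of your choice.

```python
def drop_lines(points, ground_z=0.0):
    """For each 3D point, return the segment from the point down to the
    ground plane, so that the point looks like the head of a pin."""
    return [((x, y, z), (x, y, ground_z)) for x, y, z in points]

def project_to_planes(points, ground_z=0.0, wall_x=0.0, wall_y=0.0):
    """Project the 3D points onto the ground plane and two side planes."""
    return {
        "ground": [(x, y, ground_z) for x, y, z in points],
        "side_x": [(wall_x, y, z) for x, y, z in points],
        "side_y": [(x, wall_y, z) for x, y, z in points],
    }

points = [(1.0, 2.0, 3.0), (4.0, 1.0, 0.5)]
pins = drop_lines(points)
shadows = project_to_planes(points)
```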

An important question is when to 2D or not 2D. The general rule is that you should use as few dimensions as required. Thus if you are simply comparing the magnitude of a single attribute, use only a single dimension and plot the values on a uniform scale. There is no need for 2D in this case.  

In the case of 3D, it should be used when visualising inherently three-dimensional structures such as buildings and other physical objects and flows. This is why immersive 3D is so important in scientific visualisation.  

The use of 3D for abstract data visualisation is less easy to justify, and by default, you should use 2D. The disadvantage of 3D is that occlusion hides information, and the perspective distorts size, making it difficult to compare magnitudes. Interaction is also more difficult. For this reason, 3D bar charts are a very bad idea. However, I believe that 3D will be used more frequently in abstract data visualisation as low-cost 3D visualisation technologies that allow the user to vary their viewpoint naturally, such as the HTC Vive, Oculus Rift or zSpace, become available. By allowing the observer to move relative to the graphic, the problems of occlusion and perspective distortion are mitigated. These technologies will be useful when looking at actual 3D visualisations like 3D scatter plots, prism maps, space-time cubes and congruent 2D surfaces drawn in 3D (sometimes called 2.5D).  



## Stage 3: MEMORY - Object Recognition and Processing
Visual attention and working memory. This is under conscious control.  

In the third and highest level of visual processing, visual objects are held in working memory while the viewer performs some tasks, such as finding the shortest route between two cities. At this level, the processing is conscious and sequential. Only a few objects are held in memory at one time.

We use several types of memory in visual processing: iconic memory (aka visual cache), which is essentially a very short-term snapshot of the image on the retina; visual short-term memory (STM), which holds the visual features of objects of immediate attention; spatial STM, which holds the position/location of the objects; and long-term memory (LTM), which holds memories retained from previous experiences. There are similar kinds of memory for other modalities, such as echoic and verbal working memory for sound (e.g. Baddeley, 2007).

Visual working memory holds visual objects from long-term memory as well as those on the screen. Visual working memory is probably not distinct from long-term memory; it is simply the currently activated long-term memories. We can also think of the visualisation on the screen as a different kind of memory: external visual memory.

One of the most surprising findings of psychologists has been how few objects can be held in our working memory: somewhere between 3 and 5, depending upon the task. Even then, we only remember 3-5 objects if we are concentrating; usually only 1 or 2 objects are remembered.

This limited capacity seems extraordinary to most people, as we feel we have a rich internal representation of the world we see. This is, however, not true. Inattentional change blindness is a powerful demonstration of our lack of memory capacity. Because we remember so little, we do not notice large changes between what we see in one view and the next.


The limited capacity of working memory has strong implications for visualisation. In particular, it is better to encode multiple attributes in one visual object than to use a separate visual object for each attribute: if multiple data are integrated into a single object, more information can be held in visual working memory. For instance, if we are examining wind direction, temperature and wind strength, we are better off encoding these in a single glyph, such as an arrow whose orientation gives the direction, colour the temperature and length or width the strength, rather than using three different glyphs.
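
The wind example can be sketched as a function that turns the three attributes into the properties of one arrow glyph. Everything specific here is an assumption made for illustration: the function name, the temperature and strength ranges, the blue to red colour ramp, and the convention that 0 degrees points along the positive x axis.

```python
import math

def wind_glyph(direction_deg, temperature_c, strength,
               temp_range=(-10.0, 40.0), max_strength=30.0, max_length=40.0):
    """Encode three attributes in one arrow glyph.

    direction_deg -> orientation, temperature_c -> colour (blue to red),
    strength -> arrow length. Ranges and the colour ramp are illustrative.
    """
    lo, hi = temp_range
    t = min(max((temperature_c - lo) / (hi - lo), 0.0), 1.0)  # clamp to 0..1
    colour = "#{:02x}00{:02x}".format(round(255 * t), round(255 * (1 - t)))
    length = max_length * min(max(strength / max_strength, 0.0), 1.0)
    theta = math.radians(direction_deg)
    # Arrow vector in screen units; 0 degrees points along +x here.
    return {"dx": length * math.cos(theta), "dy": length * math.sin(theta),
            "colour": colour, "length": length}

hot_strong = wind_glyph(direction_deg=0, temperature_c=40, strength=30)
cold_mild = wind_glyph(direction_deg=90, temperature_c=-10, strength=15)
```

The viewer then holds one object per location in working memory instead of three.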

Visual attention is the key to understanding how information flows between the different visual processing stages. As the viewer performs the task, their attention turns to different parts of the image. When they move their eye to focus on a new region, subconscious parallel processing of stage 1 extracts low-level properties. Visual attention guides stage 2 processing to extract the surfaces and features of the objects that the viewer is now looking at, and stage 3 processing recognises these objects and places those being attended to in working memory. However, visual attention may also be driven by stage 1: if a light blinks in peripheral vision, this will be noticed subconsciously, and the viewer's attention will be drawn to it.

Controlling attention is a key part of effective visualisations. You need to direct attention to the salient parts of the display.

A recent theory suggests that for tasks involving visual reasoning, the spatial aspects of a display are the most important – too much visual detail reduces performance (visual-impedance hypothesis, Knauff & Johnson-Laird, 2002). This is consistent with calls from some information visualisation designers (e.g. Tufte) to produce clean, minimalist displays from which irrelevant detail (‘chartjunk’) is eliminated.

## Story-telling
- Guide the viewer through your visualisation so that it is seen in a specific way.  
- Give your account of events (aka your story):  
    - show the visualisations in a predefined order, e.g. the natural order from left to right, top to bottom  
    - make the viewer scroll through them (scrollytelling)  
- Gestalt principles: the laws of human perception

- DVP marks are awarded for design, not for data, so avoid spending time on data wrangling where possible.
- If you reuse the same charts from DEP, try to add more to them, e.g. interactivity.
- You are presenting to an audience.