## Recap

In the last video, we talked about analog and digital cameras and explored modern camera systems in phones.

Today we will connect computer vision to the history of research on how vision is formed in the brain.

## Hubel and Wiesel 🐱

Here is the [original paper from 1959](https://physoc.onlinelibrary.wiley.com/doi/epdf/10.1113/jphysiol.1959.sp006308).

Here is a short video called [The Cat Experiment](https://www.youtube.com/watch?v=IOHayh06LJ4).

In 1959, David Hubel and Torsten Wiesel made a major leap in our understanding of the visual system.

Their work was about the complex mechanisms involved in transforming simple information from the eyes into our rich and complete visual perception of the world.

They won the Nobel Prize in Physiology or Medicine in 1981.

![Hubel and Wiesel](../img/hubel_wiesel/Hubel-and-Wiesel.jpg)

[Source](https://www.brains-explained.com/how-hubel-and-wiesel-revolutionized-neuroscience/)

Photoreceptors were already known at the time, and earlier research had revealed that the output neurons of the retina, responsible for transmitting visual information from the eyes to the brain, exhibit relatively simple responses to light.

But simple responses to light were obviously not enough to explain our visual system.

Hubel and Wiesel were trying to understand how vision really happens in the brain, and in particular how the visual cortex works.

![Visual Cortex](../img/hubel_wiesel/visualCortex.png)

[Source](https://hive.blog/hive-196387/@nattybongo/brain-areas-for-sight)

The **visual cortex**, located in the occipital lobe of the brain, is primarily responsible for interpreting and processing visual information received from the eyes. Its areas are specialized for various aspects of visual processing, including object recognition, spatial localization, and motion detection.

Their findings laid **the groundwork for ideas later used in deep learning models.**

## How did it all happen? 🤔

They used electrodes to monitor activity in a cat's brain, trying to understand what makes neurons in the visual system fire.

_"When we started working in the late 50s, we set up our first experiments and they didn't go well._

They tried all kinds of pictures from magazines: fish, mice, and flowers. Nothing seemed to work.

_Because at the beginning we couldn't make the cells fire at all. We'd shine lights all over the screen and nothing seemed to work. Rather by accident, one day we were shining small spots, either white spots or black spots, onto the screen._

_We found that the black dot seemed to be working in a way that at first we couldn't understand, until we found that it was the process of slipping the piece of glass into the projector, which swept a line, a very faint, precise, narrow line, across the retina._

_And every time we did that, we'd get a response."_

![experiment](../img/hubel_wiesel/hubel_wiesel_the_hierarchy.png)

[Source](https://youtu.be/NfnWJUyUJYU?t=1600)

So sliding a piece of glass and creating an edge (without even meaning to) caused the neurons to fire.


![Findings](../img/hubel_wiesel/hubel_wiesel_findings.png)

### Here are the findings: 🔎

- Neurons get excited by edges at a **particular orientation**.

- Nearby cells in the visual cortex process nearby areas in your visual field. **Locality** is preserved in processing.

- The visual cortex has a hierarchical organization: from simple cells to complex cells through layers.

![Featural Hierarchy](../img/hubel_wiesel/hubel_wiesel_hierarchy.png)


Recommended Reads:

[How Hubel and Wiesel Revolutionized Neuroscience and Made Me a Neuroscientist](https://www.brains-explained.com/how-hubel-and-wiesel-revolutionized-neuroscience/)

[Recounting the impact of Hubel and Wiesel](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718241/)

[The Primary Visual Cortex by Matthew Schmolesky](https://webvision.med.utah.edu/book/part-ix-brain-visual-areas/the-primary-visual-cortex/)

### To learn more about Perception 👀

Here is a wonderful link about [human perception.](https://www.cns.nyu.edu/~david/courses/perception/lecturenotes/V1/lgn-V1.html)

_David Hubel and Torsten Wiesel won the Nobel prize for discovering the functional organization and basic physiology of neurons in V1 (V1 = Primary Visual Cortex)._ 

_They discovered three different types of neurons that can be distinguished based on how they respond to visual stimuli that they called: **simple cells**, **complex cells**, and **hypercomplex cells.**_ 

**Orientation selectivity**: Most V1 neurons are orientation selective, meaning that they respond strongly to lines, bars, or edges of a particular orientation (e.g., vertical) but not to the orthogonal orientation (e.g., horizontal).
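As a rough illustration of orientation selectivity, the sketch below treats a 3x3 Sobel-style kernel as a stand-in for a neuron that prefers vertical edges. This is only a toy analogy, not a physiological model. The kernel, the `cell_response` helper, and the 8x8 test images are assumptions made for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Toy "receptive field": a Sobel-style kernel that reacts to a
# left-to-right brightness change, i.e. a vertical edge (assumption).
VERTICAL_EDGE_KERNEL = np.array(
    [[-1.0, 0.0, 1.0],
     [-2.0, 0.0, 2.0],
     [-1.0, 0.0, 1.0]]
)


def cell_response(image, kernel):
    """Return the strongest absolute filter response anywhere in the image."""
    # Every 3x3 patch of the image, shape (H-2, W-2, 3, 3).
    windows = sliding_window_view(image, kernel.shape)
    # Weighted sum of each patch with the kernel (cross-correlation).
    responses = np.einsum("ijkl,kl->ij", windows, kernel)
    return float(np.abs(responses).max())


# Dark left half, bright right half: one vertical edge.
vertical_edge = np.zeros((8, 8))
vertical_edge[:, 4:] = 1.0

# Same picture rotated by 90 degrees: one horizontal edge.
horizontal_edge = vertical_edge.T.copy()

print(cell_response(vertical_edge, VERTICAL_EDGE_KERNEL))    # 4.0
print(cell_response(horizontal_edge, VERTICAL_EDGE_KERNEL))  # 0.0
```

The toy cell responds to the vertical edge and stays silent for the horizontal one. Transposing the kernel would give a cell that prefers horizontal edges instead.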

**Direction selectivity**: Some V1 cells are also direction selective, meaning that they respond strongly to oriented lines/bars/edges moving in a preferred direction (e.g., vertical lines moving to the right) but not at all in the opposite direction (e.g., vertical lines moving to the left).

![Direction Selectivity](../img/hubel_wiesel/hubel_wiesel_direction_selectivity.jpg)

**V1 functional architecture**: Hubel and Wiesel also discovered that the neurons in V1 are arranged in an orderly fashion. 

Neurons with similar response properties (e.g., the same orientation preference) lie near one another.

**Columnar architecture**: As one moves an electrode vertically through the thickness of cortex, one finds that most neurons have the same selectivity (e.g., the same orientation preference and eye dominance). 

**Ocular dominance columns**: As one moves an electrode tangentially through the cortex, one first finds cells that respond to left eye inputs, then binocular (responsive to both/either eye), then right eye, then binocular, then left again, etc. 

**Orientation columns**: As one moves the electrode tangentially in the orthogonal direction, one first finds cells selective for vertical, then diagonal, then horizontal, etc.

A hypercolumn is a chunk of cortex about 1 mm square by 3 mm thick that contains neurons with approximately the same receptive field location but covering all orientation selectivities, direction selectivities, and both (left- and right-) eye dominances.

![Columnar architecture](../img/hubel_wiesel/hubel_wiesel_columnar_architecture.jpg)

You can also learn more about these findings in this [MIT Course - 9.11: The Human Brain](https://www.youtube.com/watch?v=ePP0G7FJGPI).

[Here is the book by Hubel and Wiesel](https://books.google.co.in/books?id=8YrxWojxUA4C&lpg=PA106&pg=PA104#v=onepage&q&f=false)

## MIT Summer Vision Project ☀️

This event, in the summer of 1966, is often considered the birth of computer vision.

_"Vision is so easy!"_

However, computer vision was not solved that summer.

![Summer Vision](../img/hubel_wiesel/MIT_summer_vision.png)

In 1966, Seymour Papert, an MIT professor working at the AI lab, set out on an ambitious venture known as the Summer Vision Project. He aimed to address the challenge of machine vision and achieve a solution within a short span of a few months.

He encouraged the students to develop a significant component of a visual system in one summer. The students tried to create a platform capable of automatically differentiating between the foreground and background in images, as well as extracting distinct objects without any overlapping, all from real-world scenes.

Although the Summer Vision Project did not yield the desired success, it holds immense significance in the history of computer vision. Many regard it as the official birth of computer vision as a scientific field.

## David Marr

David Marr, who joined MIT's AI Lab in 1973 and became a tenured Psychology professor by 1980, was initially focused on the general theory of the brain but shifted to the study of computer vision. 

He was probably the first to advocate for a computational approach to vision. [Source](https://www.turingpost.com/p/cvhistory2)

### Marr's Findings:

**Vision is Hierarchical.**

Marr proposed a representational framework for vision. He concentrated on the vision task of deriving shape information from images. [Source](https://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/GOMES1/marr.html)

![Marr's Framework](../img/hubel_wiesel/Marr_Framework.png)

[Source - Lecture 1](https://cs231n.stanford.edu/slides/2016/)

Recommended Reads:

[The Dawn of Computer Vision: From Concept to Early Models (1950-70s)](https://www.turingpost.com/p/cvhistory2)

[Marr's Theory](https://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/GOMES1/marr.html)

## How do we make computers understand Vision? 🤨

First, why? Why bother?

If we can bridge the semantic gap, we as humans can solve more complex problems.

Images are just the output of digital cameras, so what we have is just a bunch of numbers.

![Images For Computers](../img/hubel_wiesel/images_for_computers.png)

[Source](https://www.researchgate.net/publication/327436958_Automatic_Virus_Identification_using_TEM_-_Image_Segmentation_and_Texture_Analysis)
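To make the "bunch of numbers" point concrete, here is a minimal sketch using NumPy. The 5x5 grayscale image is made up for illustration. Real images are simply much larger arrays, usually with three color channels.

```python
import numpy as np

# A hypothetical 5x5 grayscale image: 0 is black, 255 is white.
# The bright pixels draw a plus sign.
image = np.array(
    [[0,   0,   255, 0,   0],
     [0,   0,   255, 0,   0],
     [255, 255, 255, 255, 255],
     [0,   0,   255, 0,   0],
     [0,   0,   255, 0,   0]],
    dtype=np.uint8,
)

print(image.shape)   # (5, 5): height and width in pixels
print(image[0, 2])   # 255: the pixel in row 0, column 2 is white

# "Editing" the image is just arithmetic on the numbers.
# Widen the type first so that adding 50 cannot overflow uint8.
brighter = np.clip(image.astype(np.int16) + 50, 0, 255).astype(np.uint8)
print(brighter[0, 0])  # 50: a black pixel became dark gray
```

Nothing in this array says "plus sign". That meaning is exactly what the computer is missing.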

We are trying to make computers understand/interpret the world, similar to how we do.

So that's the semantic gap we are trying to bridge.

- All the physical products we use are the result of a production line.

- Every production line can be improved so that people can get goods for cheaper.

- Quality checking is done with the help of Computer Vision.

Here are some of the problems that are solved with the help of Computer Vision:

![DHL Report](../img/hubel_wiesel/dhlReport.png)

[Source](https://www.dhl.com/us-en/microsites/csi/computer-vision/understanding-computer-vision.html)

### Recommended Reads:

[Computer Vision on Manufacturing](https://www.itransition.com/computer-vision/manufacturing)

[Vehicle Inspection Systems](https://www.assemblymag.com/articles/96075-ai-based-vision-technology-aids-vehicle-inspection)


## Understanding and Recognition 🧠

**Recognition**: localization of all the items present in the image.

For a primitive computer to understand visual structure, that structure has to be reduced to **simple structures**.

Here is a wonderful paper about different approaches:

![The Paper](../img/hubel_wiesel/recognition_history_paper.png)

Here is a list of categories in all of the different approaches:

![Approaches](../img/hubel_wiesel/approaches_categorized.png)

We will go over two of them.

- Brooks & Binford, 1979, Generalized Cylinder Model
- Fischler and Elschlager, 1973, Pictorial Structure Model

## Brooks

Rodney Brooks was a director of the MIT AI Lab. He is also a co-founder of iRobot, the company that makes Roombas.

![Roomba](../img/hubel_wiesel/Roomba_805.jpg)

Brooks' biggest idea was: **"The world is composed of simple shapes"**.

The distinguishing characteristic of Brooks’ work is that it was one of the first systems to use parts-based recognition and generalized cylinders to provide reliable results. [Source](https://www.researchgate.net/publication/257484936_50_Years_of_object_recognition_Directions_forward)

A generalized cylinder is a solid object that is formed by moving a 2D shape along a 3D curve. 

It's like a cylinder, but the shape that's being moved can be any 2D shape, not just a circle, and the curve it's moved along can be any 3D curve, not just a straight line. 

This allows for a wide variety of shapes to be described using this concept, from simple cylinders to more complex biological and manufactured objects.
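The definition above can be sketched in a few lines of NumPy. This is a minimal illustration and not Brooks' actual system. The function name `sweep_generalized_cylinder`, the optional per-sample `scale`, and the way the cross-section plane is oriented are all assumptions chosen for simplicity.

```python
import numpy as np


def sweep_generalized_cylinder(axis_points, cross_section, scale=None):
    """Sweep a 2D cross-section along a 3D axis curve.

    axis_points:   (N, 3) samples of the 3D curve.
    cross_section: (M, 2) points of the 2D shape.
    scale:         optional (N,) size of the shape at each axis sample.
    Returns an (N, M, 3) array of surface points.
    """
    axis_points = np.asarray(axis_points, dtype=float)
    cross_section = np.asarray(cross_section, dtype=float)
    n = len(axis_points)
    scale = np.ones(n) if scale is None else np.asarray(scale, dtype=float)

    # Direction of travel along the curve, estimated by finite differences.
    tangents = np.gradient(axis_points, axis=0)
    tangents /= np.linalg.norm(tangents, axis=1, keepdims=True)

    surface = np.empty((n, len(cross_section), 3))
    for i, (point, tangent) in enumerate(zip(axis_points, tangents)):
        # Pick any helper vector that is not parallel to the tangent, then
        # build two unit vectors (u, v) spanning the perpendicular plane.
        helper = np.array([0.0, 0.0, 1.0])
        if abs(tangent[2]) >= 0.9:
            helper = np.array([1.0, 0.0, 0.0])
        u = np.cross(tangent, helper)
        u /= np.linalg.norm(u)
        v = np.cross(tangent, u)
        # Place the (scaled) 2D shape in that plane, centered on the axis.
        surface[i] = point + scale[i] * (
            cross_section[:, :1] * u + cross_section[:, 1:] * v
        )
    return surface


# Example: a circle swept along a straight vertical line while shrinking,
# which yields a cone-like shape.
theta = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
z = np.linspace(0.0, 5.0, 10)
axis = np.column_stack([np.zeros_like(z), np.zeros_like(z), z])
radii = np.linspace(1.0, 0.2, 10)

cone = sweep_generalized_cylinder(axis, circle, scale=radii)
print(cone.shape)  # (10, 16, 3)
```

Swapping the circle for another outline, or the straight line for a bent curve, gives other members of the same family. Note that this simple frame construction can twist the cross-section along strongly curved axes; a more careful implementation would track the frame from one sample to the next.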

![Brooks Generalized Cylinders](../img/hubel_wiesel/brooks_generalized_cyclinders.png)

There is a video about three research projects completed in 1972 at the Stanford Artificial Intelligence Lab.

This video is a great example of how the generalized-cylinder representation is used.

[Video - Motion & Vision (1972)](https://www.youtube.com/watch?v=laWnTCg5I9w)

Recommended Reads:

- [History of Computer Vision](https://letsdatascience.com/learn/history/history-of-computer-vision/)
- [50 Years of object recognition: Directions forward](https://www.researchgate.net/publication/257484936_50_Years_of_object_recognition_Directions_forward)
- [Generalized Cylinder Representation](https://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OWENS/LECT13/node4.html)

## Fischler

Brooks' approach was about modeling the world with simple shapes.

There is also a **parts and structure** approach.

![Parts and Structure](../img/hubel_wiesel/fischler_et_al.png)

In simple terms, it's a method to describe an object as a set of basic components and the relationships between them:

- Parts: The distinctive pieces of the object, such as the eyes, nose, and mouth of a face.
- Springs: The flexible links between pairs of parts that encode where one part should sit relative to another.
- Cost: How poorly each part matches the image at its chosen location, plus how far each spring is stretched.

Think of it like a puppet held together by rubber bands: each part can shift a little to fit the image, but the more the bands stretch, the less likely it is that the object has been found.

![The Paper](../img/hubel_wiesel/fischler.png)

Fischler and Elschlager's Pictorial Structure Model has had a significant impact on our understanding of visual perception and has influenced various fields.

In computer vision, the model has inspired the development of algorithms for image segmentation, object recognition, and scene understanding.
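Fischler and Elschlager describe an object as a set of parts linked by springs, and matching as finding the placement that minimizes part mismatch plus spring stretch. The sketch below illustrates that cost on a toy face. The part names, candidate locations, mismatch values, and spring offsets are all made up, and the brute-force search is a simple stand-in for the optimization procedure in the paper.

```python
from itertools import product


def pictorial_structure_cost(placement, match_cost, springs):
    """Total cost = how badly parts fit the image + how much springs stretch."""
    appearance = sum(match_cost[part][loc] for part, loc in placement.items())
    deformation = 0.0
    for part_a, part_b, (dx, dy), stiffness in springs:
        ax, ay = placement[part_a]
        bx, by = placement[part_b]
        # Squared distance between the actual and the ideal relative offset.
        deformation += stiffness * ((bx - ax - dx) ** 2 + (by - ay - dy) ** 2)
    return appearance + deformation


def best_placement(match_cost, springs):
    """Try every combination of candidate locations and keep the cheapest."""
    parts = list(match_cost)
    candidates = [list(match_cost[part]) for part in parts]
    return min(
        (dict(zip(parts, combo)) for combo in product(*candidates)),
        key=lambda placement: pictorial_structure_cost(
            placement, match_cost, springs
        ),
    )


# Hypothetical mismatch scores per part and candidate (x, y) location;
# lower means the image patch there looks more like the part.
match_cost = {
    "left_eye": {(2, 2): 0.1, (6, 2): 0.2},
    "right_eye": {(6, 2): 0.1, (2, 2): 0.3},
    "mouth": {(4, 6): 0.2, (4, 1): 0.1},
}

# Springs: (part_a, part_b, ideal offset from a to b, stiffness).
springs = [
    ("left_eye", "right_eye", (4, 0), 1.0),
    ("left_eye", "mouth", (2, 4), 1.0),
]

print(best_placement(match_cost, springs))
# {'left_eye': (2, 2), 'right_eye': (6, 2), 'mouth': (4, 6)}
```

The mouth candidate at `(4, 1)` looks slightly better on its own, but it would stretch the eye-to-mouth spring so much that the overall placement loses. That trade-off between local appearance and global structure is the core of the model.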

### Recommended Reads:

- [Fischler Paper](https://web.archive.org/web/20180413104022id_/http://people.csail.mit.edu:80/torralba/courses/6.870/papers/fischler_1973.pdf)
- [MIT 6.870 Lecture 1](https://www.slideshare.net/zukun/mit6870-grounding-object-recognition-and-scene-understanding-lecture-1)
- [MIT 6.870 All Lectures](https://people.csail.mit.edu/torralba/courses/6.870_2011f/6.870.grounding.html)

## Summary 😌

- In this video, we learned about the early research on the visual cortex.

- We gained insight into how vision occurs in the brain.

- We understood the beginnings of Computer Vision and the motivation for trying to bridge the semantic gap.

- We saw that for a computer to recognize an image, and in that sense understand its visual structure, the image needs to be reduced to **simple structures**, and we examined two solutions to this problem.

- In the next video, we will examine landmark computer vision papers that have stood the test of time and develop sample applications.

- See you in the next video!