# Python Refresher + Intro to Computer Vision for Robotics (Day 2, First Session)

Welcome to Day 2 (Session 1) of the Stanford AI4ALL Robotics workshop! Today, we will refresh some Python basics and then dive into an introduction to Computer Vision (CV) and why it’s important for robotics. By the end of this session, you’ll understand core CV concepts like color spaces and image thresholding, which we'll use in our project.

## 1. Python Refresher

Let's start with a quick Python refresher covering variables, data types, lists/dictionaries, loops, and functions. Feel free to run each code cell and examine the output. You can also modify the code to experiment – that's one of the best ways to learn!

### Variables and Data Types

In Python, you don't need to declare variable types. A variable can hold numbers, text, etc., and Python figures out the type. Run the code below to see some variables and their types:


In [None]:
# Example: Using variables of different types
x = 7               # x is an integer
pi = 3.14           # pi is a float (decimal number)
text = "Hello AI4ALL"  # text is a string (sequence of characters)
is_robotics_fun = True  # is_robotics_fun is a boolean (True/False)

print("x =", x, "| Type:", type(x))
print("pi =", pi, "| Type:", type(pi))
print("text =", text, "| Type:", type(text))
print("is_robotics_fun =", is_robotics_fun, "| Type:", type(is_robotics_fun))


x = 7 | Type: <class 'int'>
pi = 3.14 | Type: <class 'float'>
text = Hello AI4ALL | Type: <class 'str'>
is_robotics_fun = True | Type: <class 'bool'>


Variables can change type when you assign new values to them (though that's usually not needed in practice). You can also do basic arithmetic with numbers and concatenate strings.

Example: Let's do some operations:

In [None]:
# Basic arithmetic operations
a = 10
b = 3
print("a + b =", a + b)       # addition
print("a - b =", a - b)       # subtraction
print("a * b =", a * b)       # multiplication
print("a / b =", a / b)       # division (always float in Python3)
print("a // b =", a // b)     # integer (floor) division
print("a % b =", a % b)       # modulo (remainder of division)


a + b = 13
a - b = 7
a * b = 30
a / b = 3.3333333333333335
a // b = 3
a % b = 1


Notice the difference between / and // (one gives a decimal, the other an integer result), and % which gives the remainder of division (e.g., 10 % 3 = 1). You can use these to perform various calculations. Also, note that Python follows order of operations (PEMDAS) for arithmetic.
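To make the order-of-operations point concrete, here is a small sketch (the variable names are only for illustration). It also shows how `//` and `%` fit together: the quotient times the divisor, plus the remainder, gives back the original number.

```python
# Order of operations: multiplication happens before addition
no_parens = 2 + 3 * 4        # 3 * 4 is evaluated first, then 2 is added
with_parens = (2 + 3) * 4    # parentheses force the addition to happen first

print("2 + 3 * 4 =", no_parens)
print("(2 + 3) * 4 =", with_parens)

# Floor division and modulo together rebuild the original number
a, b = 10, 3
rebuilt = (a // b) * b + (a % b)
print("(a // b) * b + (a % b) =", rebuilt)
```

Try removing the parentheses or changing the numbers to predict the result before you run the cell.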

### Quick Jupyter Notebook Tips

In Jupyter notebooks (like Colab), you can intermix Code cells (to run Python) and Markdown cells (for text). To run a code cell, click it and press Shift+Enter (or click the Play button). You can edit and re-run cells multiple times.

### Working with Text (Strings)

Strings in Python are sequences of characters. You can use single or double quotes to define them. You can combine (concatenate) strings with + or repeat them with *. You can also access individual characters by index (starting at 0). Run the example:

In [None]:
name = "Alice"
greeting = "Hello, " + name + "!"
print("Greeting:", greeting)

# String indexing example
print("First letter of name:", name[0])
print("Last letter of name:", name[-1])  # -1 index gives the last character
print("Uppercase name:", name.upper())   # example of a string method


Greeting: Hello, Alice!
First letter of name: A
Last letter of name: e
Uppercase name: ALICE


We used name.upper() to show a string method that returns an uppercase version of the string. There are many useful string methods (like .lower(), .replace(), etc.), but we'll keep things simple for now.
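If you want to peek at the other two methods mentioned above, here is a short sketch. The string `robot_name` is just an example value chosen for this illustration.

```python
robot_name = "Mars Rover"   # example string for this illustration

lower_name = robot_name.lower()                # lowercase copy of the string
renamed = robot_name.replace("Mars", "Moon")   # copy with "Mars" swapped for "Moon"

print("Lowercase:", lower_name)
print("Replaced:", renamed)
print("Original is unchanged:", robot_name)    # string methods return new strings
```

Notice that the original string stays the same: each method hands back a new string instead of modifying the old one.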

### Lists (Arrays) and Dictionaries

Lists are ordered collections of items (like an array). You can store a sequence of values (even of different types) in a list. Let's see how to create and use a list:

In [None]:
# Creating and using a list
fruits = ["apple", "banana", "cherry"]
print("Our fruit list:", fruits)
print("Number of fruits:", len(fruits))      # len() gives the length of the list
print("First fruit:", fruits[0])             # accessing by index
print("Last fruit:", fruits[-1])             # negative index for last item

# Let's add a new fruit to the list
fruits.append("durian")
print("After adding a fruit, list is now:", fruits)
print("Now number of fruits:", len(fruits))


Our fruit list: ['apple', 'banana', 'cherry']
Number of fruits: 3
First fruit: apple
Last fruit: cherry
After adding a fruit, list is now: ['apple', 'banana', 'cherry', 'durian']
Now number of fruits: 4


We used list.append(item) to add a new element to the list. There are other operations like removing elements, slicing, etc., but append is one of the most common for now.
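For the curious, here is a minimal sketch of slicing and removing elements. It uses a separate example list called `colors` so that the `fruits` list stays unchanged for the later cells.

```python
colors = ["red", "green", "blue", "yellow"]   # separate example list

first_two = colors[0:2]      # slicing: items at index 0 and 1 (index 2 is not included)
print("First two colors:", first_two)

colors.remove("green")       # removes the first item equal to "green"
print("After remove:", colors)

last = colors.pop()          # removes and returns the last item
print("Popped:", last, "| Remaining:", colors)
```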

Dictionaries are another useful data structure: they store key-value pairs (like a mini database or a vocabulary). Keys and values can be various types. A dictionary is defined using curly braces {} with key: value pairs. Let's make a small dictionary and access it:

In [None]:
# Dictionary of some robots and their creators
robots = {
    "Mars Rover": "NASA",
    "Sophia": "Hanson Robotics",
    "ASIMO": "Honda"
}
print("Robots dictionary:", robots)
print("ASIMO is created by:", robots["ASIMO"])  # accessing value by key

# Adding a new key-value pair to the dictionary:
robots["Spot"] = "Boston Dynamics"
print("After adding Spot:", robots)


Robots dictionary: {'Mars Rover': 'NASA', 'Sophia': 'Hanson Robotics', 'ASIMO': 'Honda'}
ASIMO is created by: Honda
After adding Spot: {'Mars Rover': 'NASA', 'Sophia': 'Hanson Robotics', 'ASIMO': 'Honda', 'Spot': 'Boston Dynamics'}


We can loop through a dictionary as well (more on loops below). For example, to print all robot names and their creators, we can do:


In [None]:
# Iterating over dictionary items
for name, maker in robots.items():
    print(name, "->", maker)


Mars Rover -> NASA
Sophia -> Hanson Robotics
ASIMO -> Honda
Spot -> Boston Dynamics


When you run the above, notice how we used dict.items() to get pairs of (key, value). Now we have a basic idea of variables, lists, and dictionaries.

### Control Flow: Conditional Statements

`if` statements allow your program to make decisions. We can execute certain code only if a condition is true. Optionally, use else (and elif for "else if") for alternate cases. Example:

In [None]:
temperature = 30  # in degrees Celsius
if temperature > 25:
    print("It's warm today.")
else:
    print("It's not very warm today.")


It's warm today.


Try changing temperature in the code and re-run to see the other branch. You can have multiple conditions using elif (e.g., check ranges of values).
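Here is a small sketch of `elif` checking ranges of values. The cutoffs (25 and 10 degrees) and the labels are arbitrary choices for illustration.

```python
temperature = 18  # in degrees Celsius

if temperature > 25:
    description = "warm"
elif temperature > 10:      # only checked when the first condition is False
    description = "mild"
else:                       # runs when none of the conditions above were True
    description = "cold"

print("It's", description, "today.")
```

Python checks the conditions from top to bottom and runs only the first branch that matches.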

### Control Flow: Loops

Loops let us repeat actions without writing them multiple times. There are two main loop types in Python: for loops and while loops.

`For` Loops: These iterate over a sequence (like every element in a list, or a range of numbers). For example, let's print each fruit in our fruits list from before:

In [None]:
for fruit in fruits:
    print("I like", fruit)


I like apple
I like banana
I like cherry
I like durian


You can also loop a specific number of times using range(n), which generates numbers from 0 up to n-1. For instance:


In [None]:
# Using range() to loop a fixed number of times
for i in range(5):
    print("Loop index i =", i)


Loop index i = 0
Loop index i = 1
Loop index i = 2
Loop index i = 3
Loop index i = 4


The above will print i from 0 to 4 (5 iterations total), because range(5) generates the numbers 0, 1, 2, 3, 4. You can use range(start, end) to begin at a different number (end itself is not included), and you can add a step as a third argument.
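A quick sketch of the start, end, and step arguments. Wrapping each range in `list()` is only there so we can print all of its numbers at once.

```python
basic = list(range(2, 6))          # start at 2, stop before 6
stepped = list(range(0, 10, 2))    # every second number from 0, stopping before 10
countdown = list(range(5, 0, -1))  # a negative step counts down, stopping before 0

print("range(2, 6)     ->", basic)
print("range(0, 10, 2) ->", stepped)
print("range(5, 0, -1) ->", countdown)
```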

Let's use a loop for a practical task: summing numbers. We can sum all elements in a list:


In [None]:
numbers = [3, 7, 2, 8, 4]
total = 0
for num in numbers:
    total += num   # shorthand for total = total + num
print("Numbers:", numbers)
print("Sum of numbers in list:", total)


Numbers: [3, 7, 2, 8, 4]
Sum of numbers in list: 24


We initialized `total` to 0 and then added each number in the list to it. After the loop, total holds the sum. (We could also use Python’s built-in sum(numbers), but using a loop is more instructive here.)

While Loops: A `while` loop repeats as long as a condition remains true. Be careful with while loops – if the condition never becomes false, you get an infinite loop! Here’s a simple while loop that counts down:

In [None]:
count = 5
while count > 0:
    print("Counting down:", count)
    count -= 1  # decrease count by 1 each time
print("Blast off!")


Counting down: 5
Counting down: 4
Counting down: 3
Counting down: 2
Counting down: 1
Blast off!


Make sure you understand why this loop stops (the condition becomes false when count reaches 0). If you accidentally create a while loop that never stops, you can interrupt the kernel (in Colab: Runtime > Interrupt Execution).

### Functions

Functions are reusable blocks of code that perform a specific task. We define a function using def, give it a name and parameters, and then return a result (or just perform actions). Functions help organize code and avoid repetition.

Let's define a simple function and see it in action:

In [None]:
# Function to square a number
def square(x):
    return x * x

# Using the function
num = 5
result = square(num)
print("The square of", num, "is", result)


The square of 5 is 25


Here square takes an input `x` and returns `x*x`. We then called square(5) and printed the result.

Functions can also just perform an action without returning anything (they return `None` by default if no return statement is given). For example, a greet function that prints a message:


In [None]:
def greet(name):
    print("Hello,", name + "!")

greet("Alice")
greet("Bob")


Hello, Alice!
Hello, Bob!


Run the above cell to see the greetings. Notice we concatenated the name with "!" inside the print. We could also use f-strings (formatted strings) like print(f"Hello, {name}!") which is another convenient way to format output in Python.

Functions can encapsulate logic, and we can call them multiple times with different inputs. This will be useful when structuring our robot code (e.g., we might write a function to decide a move based on sensor input).
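As a rough idea of what such a function might look like, here is a sketch. The function name `decide_move`, the distance sensor, and the 20 cm cutoff are all made up for illustration and are not part of our actual robot code. It also uses an f-string for the output.

```python
def decide_move(distance_cm):
    """Return a move command based on a (hypothetical) distance sensor reading."""
    if distance_cm < 20:       # 20 cm is an arbitrary example cutoff
        return "stop"
    return "forward"

# Call the same function with different inputs
for reading in [50, 15]:
    move = decide_move(reading)
    print(f"Distance {reading} cm -> {move}")
```

Because the function returns its answer instead of printing it, the rest of the program can store the result and use it later.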

**Mentor's Note: Ensure students understand the difference between defining a function and calling it. One way is to ask: "What happens if you call the function before its definition cell is run?" (It will error, since the function isn't defined yet). Also clarify return vs print inside functions if needed.**

📝 Practice: Try it Yourself!

Now it’s your turn to practice a bit of Python. Below are some prompts. Try to write code in the provided cells to accomplish each task.

**Exercise 1**: Create a list of three of your friends' names. Then use a for loop to print a greeting to each friend by name. For example, if your list is ["Alice", "Bob"], it might print "Hello, Alice!" and "Hello, Bob!" (each on a new line).

In [4]:
# 🚀 Exercise 1: Create a list of names and greet each person.
friends = ["Henry", "John", "Jane"]  # TODO: fill this list with 3 names (as strings)

# TODO: Use a for loop to print a greeting for each name in the list
for i in friends:
    print("Hello", i)  # alternative: print("Hello " + i)


Hello Henry
Hello John
Hello Jane


**Exercise 2**: Use a loop to calculate the sum of all numbers from 1 to 10 (inclusive). In other words, 1+2+3+...+10. Print the result. (You can verify the answer should be 55.)

In [6]:
# 🚀 Exercise 2: Sum numbers from 1 to 10
total = 0
# TODO: use a for loop with range or while loop to accumulate the sum
# Hint: range(1, 11) will generate numbers 1 through 10
for i in range(1, 11):
    total += i
print("Sum from 1 to 10 =", total)


Sum from 1 to 10 = 55


**Exercise 3**: Write a function greet_person(name) that returns a greeting string (instead of printing it). For example, greet_person("Charlie") could return "Hello, Charlie!". Then call your function a couple of times with different names and print the returned value each time to test it.


In [9]:
# 🚀 Exercise 3: Define and use a greeting function that returns a string
def greet_person(name):
    # TODO: return a greeting message for the given name
    return "Hello " + name # (replace this with the greeting string)

# Testing the function:
print(greet_person("Charlie"))
print(greet_person("Dana"))


Hello Charlie
Hello Dana


Take a moment to ensure you understand the exercises and your results. If something isn’t working, ask a partner or mentor for help, or try printing intermediate values to debug.

**Mentor's Note: Walk around and assist with the exercises. For Exercise 1, some might try to print inside the loop but forget string concatenation or commas; guide them on using print("Hello, " + name). For Exercise 2, watch for off-by-one errors (using range(10) instead of range(1,11)). For Exercise 3, clarify using return vs print inside the function.**

## 2. Introduction to Computer Vision (CV)

Now that our Python skills are warmed up, let's talk about Computer Vision. What is it, and why is it important in robotics?

Computer Vision (CV) is a field of AI that enables computers (or robots) to interpret and understand visual information from the world, such as images or videos, similar to how humans use their eyes and brain. In simpler terms, it's like giving eyes to machines and teaching them to make sense of what they "see". This is crucial for robots that need to navigate, recognize objects, or make decisions based on their environment.

Some core tasks in computer vision include:
- Image Classification: Determining what an image contains (e.g., identifying if an image is of a cat or a dog). The output is a label or category for the image.
- Object Detection: Finding and locating objects in an image (e.g., drawing a bounding box around a person in an image). The output is one or more bounding boxes and labels for each detected object.
- Image Segmentation: Partitioning an image into segments by class, essentially classifying each pixel (e.g., coloring each pixel belonging to a cat vs background differently). This gives a precise shape outline for objects.

There are other tasks too (face recognition, optical flow for motion, etc.), but the above are fundamental. In robotics, these tasks allow a robot to know what is in its environment and where it is, so it can act accordingly.

Why is CV important for robotics? Imagine a self-driving car (a robot car) – it uses CV to detect lanes, traffic signs, pedestrians, and other cars. In our case, we are building a sorting robot. For a robot to sort objects (by color or shape), it must first see the objects and recognize their properties (like what color an object is, or where it is located). CV provides that capability. Without vision, our robot would be "blind" and unable to differentiate objects.

### Relevance to Our Project: Sorting System

Our final project is a robotic sorting system. In this project, the robot needs to identify objects by color and sort them into groups. We'll use computer vision techniques so that a camera can detect the color of each object and tell the robot where those objects are.

For example, if we have an image feed from a camera looking at a conveyor belt or a table with objects, we want the computer to pick out, say, all the blue objects and maybe tell the robot arm to pick them up. Achieving this will involve some of the CV concepts we introduce today and practice in the next session:
- Converting the camera image into a useful color representation.
- Filtering the image to isolate a particular color (like creating a mask of just the blue areas).
- Finding the location/shape of those areas (so the robot knows where to pick).

Now, let's get into a few key concepts that will help us do this: **color spaces**, **image filtering**, and **thresholding**.


## 3. Core Computer Vision Concepts

### Color Spaces: BGR and HSV
When dealing with color images on a computer, each pixel is typically represented by three values (for the three primary color channels). The common color space for digital images is RGB (Red, Green, Blue). OpenCV, the library we'll use, by default uses BGR (Blue, Green, Red) – it's just the channel order that's swapped (blue first instead of red).

Why different color spaces? The RGB/BGR is intuitive for display, but not always the best for processing. Another very useful color space is HSV: Hue, Saturation, Value. This corresponds more closely to how humans describe colors:
- Hue: The actual color type (angle on the color wheel), e.g., red, green, blue, etc.
- Saturation: How intense or pure the color is (high saturation means vivid color, low means more grayish).
- Value: The brightness of the color (0 is black, higher values are brighter).

In HSV, hue is typically given as an angle from 0–360°, while saturation and value are given as percentages from 0–100%. OpenCV's 8-bit images use a different scale: hue is stored as 0–179 (the angle in degrees divided by 2, so that it fits in one byte), and saturation and value are stored as 0–255. The benefit of HSV is that it separates the color component (hue) from illumination (value). This makes it easier to filter by color alone. For example, if we want to detect all green objects, we can specify a range of hue values corresponding to green and not worry as much about how bright or dark the object is.

The page linked below includes a diagram of the HSV color model (often visualized as a cone or cylinder). It shows how Hue, Saturation, and Value relate to color perception:
https://docs.wpilib.org/en/stable/docs/software/vision-processing/wpilibpi/image-thresholding.html


HSV color space represented as a cone: Hue corresponds to the angle (different colors around the circle), Saturation is the radius (distance from center – pale to vivid), and Value is the vertical axis (height – from dark to bright).

In OpenCV, we can convert an image from BGR to HSV using a function cv2.cvtColor. We will do that in the next notebook. The key takeaway for now is: for color-based object detection, HSV is often more effective than RGB/BGR. In HSV, a range of hue values cleanly captures "what color" something is, while in BGR, a color is a combination of three values which can be affected by lighting. For instance, a red object in shadow (darker) still has a red hue, even if its BGR values are all lower due to less light.
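To see the "red object in shadow" idea in numbers without needing OpenCV yet, here is a small sketch using Python's built-in `colorsys` module. The helper `bgr_to_opencv_hsv` and the two pixel values are assumptions made for this illustration; the helper only approximates OpenCV's 8-bit scaling and is not the same code as `cv2.cvtColor`.

```python
import colorsys

def bgr_to_opencv_hsv(b, g, r):
    """Convert one 8-bit BGR pixel to approximate OpenCV-style HSV (H 0-179, S and V 0-255)."""
    # colorsys expects R, G, B in the range 0.0-1.0 and returns h, s, v in 0.0-1.0
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 180) % 180, round(s * 255), round(v * 255)

bright_red = (0, 0, 200)   # (B, G, R) of a red object in good light (example values)
shadow_red = (0, 0, 80)    # the same object in shadow: lower R value, same color

bright_hsv = bgr_to_opencv_hsv(*bright_red)
shadow_hsv = bgr_to_opencv_hsv(*shadow_red)
print("Bright red (H, S, V):", bright_hsv)
print("Shadow red (H, S, V):", shadow_hsv)
```

Both pixels end up with the same hue and saturation. Only the value (brightness) differs, which is why a hue range can pick out "red" under different lighting.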

### Image Filtering
Before we isolate colors, we sometimes need to filter the image. Filtering means modifying an image by recomputing each pixel's value from the pixel itself and, often, its neighbors. Why filter? To enhance certain features or reduce noise.

For example:
- A blurring (smoothing) filter replaces a pixel by the average of its neighbors, which can remove small noise and smooth out the image. This is useful to reduce tiny stray dots or fluctuations.
- A sharpening filter does the opposite – it emphasizes differences (edges) to make features more distinct.
- Edge detection filters (like the Sobel filter or Canny edge detector) find boundaries of objects by looking for rapid intensity changes.

In image processing terms, filtering changes the pixel values to either highlight or suppress certain details. We won't delve deep into filter algorithms today, but remember: we can preprocess images with filters to make our main task (like detecting color blobs) easier. For instance, if an image is a bit noisy, applying a blur filter before thresholding might help produce a cleaner mask.

(If you're curious, blur is an example of a low-pass filter (removes high-frequency detail), and edge detection is high-pass (removes smooth background, keeps high-frequency edges).)
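As a rough illustration of the blurring idea, here is a sketch that smooths a single row of pixel values using plain Python lists. The helper `box_blur_1d` is made up for this example. It averages each pixel with its immediate left and right neighbors, whereas real image blurs work in two dimensions and often use larger windows.

```python
def box_blur_1d(values):
    """Replace each value with the average of itself and its immediate neighbors."""
    blurred = []
    for i in range(len(values)):
        window = values[max(0, i - 1): i + 2]   # up to 3 values; only 2 at the edges
        blurred.append(round(sum(window) / len(window)))
    return blurred

row = [10, 10, 200, 10, 10]     # one bright "noise" pixel in a dark row
blurred_row = box_blur_1d(row)
print("Original:", row)
print("Blurred: ", blurred_row)
```

The single bright spike of 200 is spread out into much smaller values. If we later kept only pixels brighter than 100, the stray dot would no longer show up.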

### Thresholding
Thresholding is a simple but powerful operation in vision. It converts an image to a binary image (just two values, commonly 0 and 255) based on a threshold condition. Essentially: if a pixel value meets a criterion, mark it as "foreground" (white); if not, mark it as "background" (black).

For example, suppose we have a grayscale image (each pixel intensity from 0=black to 255=white). If we choose a threshold of 128, we can make a rule: any pixel brighter than 128 becomes 255 (white), and anything else becomes 0 (black). This will segment the image into two parts. If the image was dark background with bright objects, thresholding would ideally leave us white blobs on black background corresponding to the objects.

Thresholding can be applied to color images too, but usually we apply it to one channel at a time (e.g., on a grayscale image or on the hue channel of an HSV image). In our project, we will threshold on the HSV representation to get only the pixels in a certain color range.

Let's illustrate thresholding with a very simple example using Python lists (to avoid needing an image file here). Imagine we have some brightness values and we want to threshold them:

In [11]:
# Simulated grayscale pixel intensities
pixels = [50, 120, 200, 90, 150]  # five pixel values (0-255 scale)
threshold_value = 100

# Apply threshold: values above threshold -> 255, otherwise -> 0
binary_pixels = []
for pixel in pixels:
    if pixel > threshold_value:
        binary_pixels.append(255)
    else:
        binary_pixels.append(0)
print("Original:", pixels)
print("Thresholded:", binary_pixels)

Original: [50, 120, 200, 90, 150]
Thresholded: [0, 255, 255, 0, 255]


When you run the above cell, you'll see the original list and the thresholded result. Pixel values above 100 became 255, the rest became 0. This is exactly what thresholding does on an image, just applied to each pixel. In an actual image, we'd get a black-and-white image as output.

Try changing threshold_value to something else (say 150) in the code and re-run to see how the binary output changes.

Thresholding is often used as a segmentation technique – to separate regions of interest. However, picking the right threshold is key, and in more complex images, a single global threshold might not work perfectly (there are methods like adaptive thresholding or Otsu’s method for that, but those are beyond our scope today).

**In the context of color:**
- We might threshold on the hue channel (to get a certain color range).
- **In practice, we will use a function that thresholds within a range: cv2.inRange(), which keeps pixels whose value lies between a minimum and a maximum for each channel. This is like a 3D threshold (for H, S, and V ranges).**

Recap: Thresholding takes us from a full-color or grayscale image to a binary mask. This mask can highlight just the parts of the image we care about (e.g., all pixels that are "green" within some tolerance become white).
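The range-based threshold can be sketched with plain Python before we use the real OpenCV function. The helper `in_range` below is a simplified stand-in written for this illustration (it handles one pixel at a time), and the "green" bounds are assumed example values that would need tuning for a real camera.

```python
def in_range(pixel, lower, upper):
    """Return 255 if every channel of the pixel is within [lower, upper], otherwise 0."""
    for value, low, high in zip(pixel, lower, upper):
        if value < low or value > high:
            return 0        # one channel outside its range is enough to reject the pixel
    return 255

# Assumed example bounds for "green" in OpenCV-style HSV (H 0-179, S and V 0-255)
lower_green = (40, 80, 80)
upper_green = (80, 255, 255)

hsv_pixels = [
    (60, 200, 180),   # vivid green
    (60, 200, 40),    # green hue, but too dark (V is below the lower bound)
    (120, 220, 200),  # blue hue
]

mask_values = []
for pixel in hsv_pixels:
    mask_values.append(in_range(pixel, lower_green, upper_green))
print("Mask:", mask_values)
```

Only the first pixel passes all three checks, so it is the only one that becomes white (255) in the mask.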

One thing to note: after thresholding, the mask may contain noise (extra white spots or small holes), depending on the image. Morphological operations (like eroding or dilating the mask) can clean up such noise. We'll see whether that is necessary once we try thresholding on a real image.


## 4. Tying It Together: Vision for the Sorting Robot
Let's connect these concepts to what our robot needs to do:
- We will use a camera image as input.
- Convert the image from BGR to HSV color space.
- Choose a target color (say, we want to sort out all the blue objects first). We will define a range of HSV values that correspond to "blue".
- Apply a threshold (using inRange) to get a binary mask where white = pixels that are in the blue range, black = everything else.
- That mask should ideally look like white blobs on black background, each blob representing a blue object.
- We can then find contours (connected white regions) in that mask. Each contour corresponds to one detected object.
- For each contour, we can compute its area (to filter out tiny specks) and its center. The center (in pixel coordinates) tells us where the object is in the image. This information can later be used to guide the robot to that object.
- We can draw these contours and centers on the image for visualization.
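To get a feel for the last few steps, here is a toy sketch in plain Python that starts from a tiny, already-thresholded mask. The helper `find_blobs`, the example mask, and the `min_area` cutoff are all assumptions made for this illustration. It groups touching white pixels instead of tracing true contours, so treat it as a simplified stand-in for what OpenCV will do for us on real images.

```python
def find_blobs(mask):
    """Group touching white (255) pixels into blobs; returns a list of (row, col) lists."""
    rows, cols = len(mask), len(mask[0])
    seen = set()
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 255 or (r, c) in seen:
                continue
            # Flood fill: collect every white pixel connected to (r, c)
            stack = [(r, c)]
            seen.add((r, c))
            blob = []
            while stack:
                y, x = stack.pop()
                blob.append((y, x))
                for ny, nx in [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]:
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] == 255 and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            blobs.append(blob)
    return blobs

# Tiny example mask: one 2x2 object and one stray white pixel (noise)
mask_grid = [
    [0,   0,   0, 0, 0,   0],
    [0, 255, 255, 0, 0,   0],
    [0, 255, 255, 0, 0, 255],
    [0,   0,   0, 0, 0,   0],
]

min_area = 2          # assumed cutoff: ignore blobs smaller than 2 pixels
detections = []
for blob in find_blobs(mask_grid):
    area = len(blob)
    if area < min_area:
        continue      # filter out tiny specks
    center_x = sum(x for y, x in blob) / area
    center_y = sum(y for y, x in blob) / area
    detections.append((area, center_x, center_y))
    print(f"Object with area {area} centered at x={center_x}, y={center_y}")
```

The 2x2 blob is reported with its center in pixel coordinates, while the single stray pixel is dropped by the area filter.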

Don't worry if some of these steps sound unfamiliar – we'll walk through them in the next session’s notebook with actual code and a sample image.

Important: In a real setting, factors like lighting can affect color detection. The same blue object might look different under a yellow lamp versus sunlight. HSV helps, but sometimes you need to adjust the ranges for the environment. This is a good thing to keep in mind: computer vision often requires tuning or calibrating to conditions.

### Wrap-Up (Session 1)
You've refreshed Python and learned the basics of computer vision relevant to our project:
- Python basics: variables, loops, functions (you'll use these to write the vision logic and possibly control logic for the robot).
- What computer vision is and common tasks.
- Key concepts: color spaces (HSV), filtering, and thresholding for segmentation.

In the next session (Notebook 2), we'll get hands-on with OpenCV to apply these ideas:
we'll load an image with colorful objects and write code to detect and highlight those objects by color. Exciting!

Before moving on, make sure:
- You are comfortable with the Python exercises above (if not, review or ask questions).
- You understand at a high level what our approach for color-based detection will be (if you can, try to explain the pipeline in your own words).


Great work so far! When you're ready, proceed to Notebook 2: Hands-On Image Processing with OpenCV where we'll implement the vision pipeline step by step.