# Python: Data Types and Structures

When discussing movies, we often categorize them, list their cast, or note their details. Today, we'll learn about Python's fundamental **data types** (numbers, text) and how to group related information using **lists** (for sequences like genres or cast) and **dictionaries** (for a movie's complete profile), just like you'd organize a film database!

## 1. Data Types

Every piece of data in Python has a **type**. Understanding these types helps us know what operations we can perform.

* **Strings (`str`):** A sequence of characters, essentially text. 
* **Booleans (`bool`):** Represents one of two values: `True` or `False`, essential for making decisions in your code.
* **Integers (`int`):** A whole number (positive, negative, or zero) without any decimal point.
* **Floats (`float`):** Short for "floating-point number", a float is a number that has a decimal point, used to represent fractional values.


In [None]:
# Example Film Data
release_year = 2010
imdb_rating = 8.8
movie_title = "Inception"
is_sequel = False

We can use the `type()` function to check the type of a variable.

In [None]:
print(type(release_year))
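As a quick sketch, the same check can be run on all four example variables. The values are repeated here only so the cell runs on its own:

```python
# Same example film data as above, repeated so this cell is self-contained
release_year = 2010
imdb_rating = 8.8
movie_title = "Inception"
is_sequel = False

print(type(release_year))  # <class 'int'>
print(type(imdb_rating))   # <class 'float'>
print(type(movie_title))   # <class 'str'>
print(type(is_sequel))     # <class 'bool'>
```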

#### ***A note about functions***:

In programming, a **function** is a block of organized, reusable code that performs a specific, single action. Think of it like a mini-program or a recipe for a particular task. Python has many "built in" functions. The `type` function is one example. However, we can also write our own functions.

Functions have a name, like `type`, and can take zero or more **"arguments"**. We **"call"** a function by writing its name followed by the arguments in parentheses: `function_name(argument_1, argument_2, ...)`

We will dive much deeper into functions later on.
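To make the idea of "calling" a function concrete, here is a small sketch using a few built-in functions, plus one function we write ourselves. The name `describe_movie` and its parameters are made up for this illustration:

```python
# Built-in functions: name, then arguments in parentheses
title_length = len("Inception")     # one argument
rounded_rating = round(8.76, 1)     # two arguments: the number and the digits to keep
later_year = max(2010, 2014)        # two arguments

print(title_length)    # 9
print(rounded_rating)  # 8.8
print(later_year)      # 2014


# A function of our own (hypothetical example, not part of Python)
def describe_movie(title, year):
    """Return a short label such as 'Inception (2010)'."""
    return f"{title} ({year})"


summary = describe_movie("Inception", 2010)
print(summary)  # Inception (2010)
```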

### Strings

Strings are sequences of characters (text). Strings are enclosed in single quotes (`'...'`) or double quotes (`"..."`). We can perform operations on strings, such as *concatenation* (joining them together):

In [None]:
director_first = "Christopher"
director_last = "Nolan"
full_director_name = director_first + " " + director_last # Concatenation
print(f"Director: {full_director_name}")

#### String Methods

In Python (and many other object-oriented programming languages), a method is a function that belongs to an object or a class. Think of objects as "things" that have both data (attributes) and actions they can perform (methods). In the following cell, the object is the string associated with the `movie_title` variable.

You call a method with "dot notation":
`object_name.method_name(arguments)`

In [None]:
movie_title = "Interstellar"
movie_title.upper() # The string method "upper" does not take any arguments
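A few more string methods, sketched on the same title. Note that none of them change the original string; each one returns a new string (or a boolean):

```python
movie_title = "Interstellar"

upper_title = movie_title.upper()                      # no arguments
lower_title = movie_title.lower()                      # no arguments
swapped_title = movie_title.replace("Inter", "Outer")  # two arguments: old text, new text
starts_with_inter = movie_title.startswith("Inter")    # one argument, returns a boolean

print(upper_title)        # INTERSTELLAR
print(lower_title)        # interstellar
print(swapped_title)      # Outerstellar
print(starts_with_inter)  # True
print(movie_title)        # Interstellar (the original is unchanged)
```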

All objects have methods associated with them. To get a list of the available methods, hit `Tab` after the dot:

In [None]:
movie_title.
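If tab completion is not available (for example, outside a notebook), the built-in `dir()` function is one alternative. This sketch filters out the names that start with an underscore, which are used internally by Python; the variable name `public_methods` is just for illustration:

```python
movie_title = "Interstellar"

# dir() returns the names of everything attached to the object
public_methods = [name for name in dir(movie_title) if not name.startswith("_")]
print(public_methods)
```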

### Booleans

A boolean represents one of two values: `True` or `False`. Booleans are essential for making decisions in your code.

In [None]:
is_oscar_winner = True # movie won an Oscar
is_sequel = False # not a sequel

Booleans can be used to label something as `True` or `False`, but more commonly they are produced by **comparison operators** which, as the name implies, compare two values and return a boolean. This allows you to ask the computer a question.


| Question | Operator |
|-------|-------|
| is it equal to? | `==` |
| is it not equal to? | `!=` |
| is it greater than? | `>` |
| is it greater than or equal to? | `>=` |
| is it less than? | `<` |
| is it less than or equal to? | `<=` |

In [None]:
imdb_rating > 7.0
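Here is a short sketch that tries several of the operators from the table. The variables repeat the example film data so the cell runs on its own, and `is_highly_rated` is a made-up name showing that the result of a comparison can be stored like any other value:

```python
imdb_rating = 8.8
release_year = 2010
movie_title = "Inception"

print(imdb_rating > 7.0)             # True
print(release_year == 2010)          # True
print(release_year != 2010)          # False
print(release_year >= 2014)          # False
print(movie_title == "inception")    # False (string comparison is case-sensitive)

# The answer to a "question" is itself a boolean value
is_highly_rated = imdb_rating >= 8.0
print(is_highly_rated)               # True
print(type(is_highly_rated))         # <class 'bool'>
```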

### Integers vs. Floats

Integers and floats can be a source of confusion because humans treat them as the same kind of number mathematically, but computers see them as two different data types. Specifically, the following should be considered:

1. **Precision and Representation:**

- Integers: Represent whole numbers exactly. There's no ambiguity.
- Floats: Represent real numbers (numbers with decimal points) approximately. Because computers store floating-point numbers in binary, many decimal values (such as `0.1`) cannot be stored exactly, which can lead to tiny, almost imperceptible precision errors in calculations and comparisons.

In [None]:
1 + 2 == 3    # == is a comparison operator that asks "is it equal?"

In [None]:
0.1 + 0.2 == 0.3   
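One common way to deal with this, sketched below, is to ask whether two floats are "close enough" instead of exactly equal. The `math.isclose` function from Python's standard library does this; the variable name `total` is just for illustration:

```python
import math

total = 0.1 + 0.2

print(total)                     # 0.30000000000000004
print(total == 0.3)              # False
print(math.isclose(total, 0.3))  # True
```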

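2. **Memory Usage:** In Python, a float always occupies a fixed amount of memory because it has a fixed precision (64 bits). An integer has no fixed size: Python integers can be arbitrarily large, and they take up more memory as the values grow. For the small numbers in this lesson, the difference is negligible.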

3. **Operations and Behavior:**

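- Division: In Python 3 (which we are using), the `/` operator performs "float division" and always returns a float, while the `//` operator performs "floor division" and rounds the result down to a whole number.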
- Type Coercion: When you perform an operation involving both an integer and a float, Python typically "promotes" the integer to a float before performing the operation, resulting in a float.

In [None]:
5/2

In [None]:
5//2
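The sketch below puts the two division operators and type coercion side by side. The variable names are made up for this example:

```python
float_division = 5 / 2     # always a float
even_division = 4 / 2      # still a float, even though the answer is whole
floor_division = 5 // 2    # rounds down to a whole number
negative_floor = -5 // 2   # "down" means toward negative infinity
mixed_sum = 2010 + 0.5     # int + float: the int is promoted to a float

print(float_division)   # 2.5
print(even_division)    # 2.0
print(floor_division)   # 2
print(negative_floor)   # -3
print(mixed_sum)        # 2010.5
print(type(mixed_sum))  # <class 'float'>
```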

In summary, choosing between an `int` and a `float` isn't just about whether there's a decimal point; it's about the nature of the data you're representing, the precision required, and how you expect mathematical operations to behave.

- You use integers when you need exact counts or discrete values that can't be fractional (e.g., number of students, years, counts of objects).
- You use floats when dealing with measurements, averages, calculations that inherently involve fractions, or quantities that can exist between whole numbers (e.g., temperature, speed, ratings, financial amounts).



### Why Data Types Matter

Knowing the data type helps Python know what operations are valid. You can add two integers, but you can't meaningfully "add" a string to a boolean. Understanding these basic types is the foundation for working with more complex data structures like lists and dictionaries!
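As a small illustration, here is what happens when we try to combine a string and an integer with `+`, and one way to fix it by converting the integer with `str()`. The `try`/`except` lines simply catch the error so the cell keeps running; the names `label` and `mixing_failed` are made up for this sketch:

```python
movie_title = "Inception"
release_year = 2010

mixing_failed = False
try:
    label = movie_title + release_year   # str + int is not a valid operation
except TypeError as error:
    mixing_failed = True
    print(f"TypeError: {error}")

# Converting the integer to a string first makes the operation valid
label = movie_title + " (" + str(release_year) + ")"
print(label)             # Inception (2010)

# Adding two integers works as expected
print(release_year + 1)  # 2011
```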

___
## 💪 **Exercise** 💪


1. Look up your favorite movie on [IMDb](www.imdb.com).
2. Create variables for each of the following: `movie_title`, `year`, `runtime_minutes`, `imdb_rating`, `is_sequel`
3. Calculate `runtime_hours` and `remaining_minutes` from the `runtime_minutes` variable.
4. Print a friendly summary of your movie in the following format using f-strings:

```
---- Your Movie Night Pick! ----  
Title: The Great Adventure (2023)  
Runtime: 2 hours and 25 minutes  
Critic Score: 8.7/10  
Is it a sequel? False  
--------------------------------
```
___

## 2. Data Structures

### Lists: Ordered collections

In Python, a list is one of the most versatile and commonly used "**data structures**". It's a way to store a collection of items in a single variable. Think of it like a shopping list, a roster of students, or a sequence of sensor readings.

#### Creating a list

Lists are assigned to a variable, with the items contained in square brackets (`[...]`), separated by commas:

In [None]:
movie_genres = ["Sci-Fi", "Action", "Thriller", "Mystery"]

#### Key Characteristics of Lists:
- **Ordered:** The items in a list have a defined order, and this order will not change unless you change the list yourself. You can refer to items by their **index** (position), which starts at `0`.

In [None]:
print(movie_genres[0])

- **Mutable:** You can change, add, or remove items from a list after it has been created.

In [None]:
movie_genres[1] = "Horror"
print(movie_genres)

- **Allows duplicates:** Lists can contain items with the same value.

- **Heterogeneous:** A single list can contain items of different data types (e.g., numbers, strings, booleans, or even other lists!).

In [None]:
interstellar = [2014, "Christopher Nolan", 8.7,  "PG-13"]
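The characteristics above can be sketched in a few lines. The lists `genres`, `ratings`, and `nested` are made up for this illustration so that the variables used elsewhere in the notebook are left alone:

```python
genres = ["Sci-Fi", "Action", "Thriller", "Mystery"]

# Ordered: indexes start at 0, and negative indexes count from the end
print(genres[0])     # Sci-Fi
print(genres[-1])    # Mystery
print(genres[1:3])   # ['Action', 'Thriller'] (a "slice": index 1 up to, not including, 3)

# Allows duplicates
ratings = [8.8, 8.7, 8.8]
print(ratings)       # [8.8, 8.7, 8.8]

# Heterogeneous: a list can even contain another list
nested = ["Inception", 2010, ["Sci-Fi", "Action"]]
print(nested[2][0])  # Sci-Fi
```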

#### List functions

- `len()`: Returns the number of items (elements) in a list.
- `sum()`: Returns the sum of the items in a list (the items must all be numbers).
- `min()`: Returns the smallest item in a list.
- `max()`: Returns the largest item in a list.

In [None]:
len(interstellar)
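Here is a sketch of all four functions on a list of numbers. The `runtimes` list holds made-up runtimes in minutes, used only for illustration:

```python
runtimes = [148, 169, 106]   # illustrative runtimes in minutes

print(len(runtimes))   # 3
print(sum(runtimes))   # 423
print(min(runtimes))   # 106
print(max(runtimes))   # 169

# Functions can be combined, for example to compute an average
average_runtime = sum(runtimes) / len(runtimes)
print(average_runtime)  # 141.0
```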

#### Homogeneous vs. Heterogeneous Lists: A Key Consideration

The success of some functions (like `sum`, `min`, `max`, `sorted`) heavily depends on whether your list is homogeneous or heterogeneous.

- **Homogeneous List:** A list where all items are of the same data type:
    `[10, 20, 30]`
- **Heterogeneous List:** A list containing items of different data types: `[2014, "Christopher Nolan", 8.7]`

In [None]:
print(min(movie_genres))

In [None]:
print(min(interstellar))
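Calling `min()` on a heterogeneous list like this one raises a `TypeError`, because Python cannot decide whether a number is "smaller" than a string. One possible workaround, sketched below, is to keep only the numeric items first. The names `film_facts` and `numbers_only` are made up for this example, and the list comprehension is just one of several ways to do the filtering:

```python
film_facts = [2014, "Christopher Nolan", 8.7, "PG-13"]

comparison_failed = False
try:
    print(min(film_facts))
except TypeError as error:
    comparison_failed = True
    print(f"TypeError: {error}")

# Keep only the items that are integers or floats
numbers_only = [item for item in film_facts if isinstance(item, (int, float))]
print(numbers_only)       # [2014, 8.7]
print(min(numbers_only))  # 8.7
```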

#### List methods

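Some common list methods include:
- `.append(item)`: Adds an item to the end of the list.
- `.insert(index, item)`: Inserts an item before the given index.
- `.remove(item)`: Removes the first item that matches the given value.
- `.pop()`: Removes and returns the item at the given index (the last item if no index is given).
- `.count(item)`: Returns the number of times a value appears in the list.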

In [None]:
interstellar.append("$758M")
print(interstellar)
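The remaining methods can be sketched on a small, made-up list of genres. The comments show the list after each step:

```python
genres = ["Sci-Fi", "Action"]

genres.append("Thriller")      # add to the end
print(genres)                  # ['Sci-Fi', 'Action', 'Thriller']

genres.insert(1, "Drama")      # insert before index 1
print(genres)                  # ['Sci-Fi', 'Drama', 'Action', 'Thriller']

genres.remove("Action")        # remove the first matching value
print(genres)                  # ['Sci-Fi', 'Drama', 'Thriller']

last_genre = genres.pop()      # remove AND return the last item
print(last_genre)              # Thriller
print(genres)                  # ['Sci-Fi', 'Drama']

drama_count = genres.count("Drama")
print(drama_count)             # 1
```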

### Getting help with functions and methods

The arguments you need to give a function or method are not always obvious. Fortunately, there are several ways to get help:
- **Question mark:** Type the function or method you want to use, followed by a `?`


In [None]:
interstellar.insert?

- **The `help` function**

In [None]:
help(interstellar.remove)

- **`Shift + Tab` for contextual help**

In [None]:
interstellar.pop(1)

___
## 💪 **Exercise** 💪
1.  Look up your *least* favorite movie on [IMDb](www.imdb.com).
2.  Create a list containing the following: movie title, year, IMDb rating, is it a sequel?
3.  Print the list.
4.  Insert the content rating (e.g., PG) after the year.
5.  Print the updated list.
___

### Dictionaries

Beyond lists, Python offers another incredibly powerful and flexible built-in data structure called a **dictionary**. While lists store items by their ordered position (index), dictionaries store data as **key-value pairs**. Think of a dictionary like a real-world dictionary where each "word" is a unique key, and its "definition" is the value. Dictionaries are:
- Insertion-ordered (in Python 3.7+): Dictionaries remember the order in which keys were added, but they are designed for fast lookup by key, not by numerical position.

- Mutable: You can add, remove, or modify key-value pairs after a dictionary has been created.

- Keys Must Be Unique: Each key in a dictionary must be unique. If you try to add a key that already exists, its existing value will be overwritten.

- Keys Must Be Immutable: Dictionary keys must be immutable data types (like strings, numbers, or tuples). Lists cannot be keys because they are mutable.

- Values Can Be Anything: Values can be of any data type, including other dictionaries, lists, numbers, strings, etc.

In [None]:
movie_revenue = {
    "Avatar": 2923706026,
    "Avengers: Endgame": 2797501328,
    "Avatar: The Way of Water": 2320250281,
    "Titanic": 2257844554,
    "Star Wars: The Force Awakens": 2068223624,
    "Spider-Man: No Way Home": 1921855086,
    "Jurassic World": 1671537445
}
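The rules about keys and values listed above can be checked with a short sketch. The dictionaries `ratings` and `profile` are made up for this illustration, and the `try`/`except` lines only catch the error so the cell keeps running:

```python
ratings = {"Inception": 8.8}

# Keys are unique: assigning to an existing key overwrites its value
ratings["Inception"] = 9.0
print(ratings)        # {'Inception': 9.0}
print(len(ratings))   # 1

# Keys must be immutable: a list cannot be used as a key
list_key_failed = False
try:
    ratings[["Inception", 2010]] = 8.8
except TypeError as error:
    list_key_failed = True
    print(f"TypeError: {error}")

# Values can be anything, including lists and other dictionaries
profile = {
    "title": "Inception",
    "genres": ["Sci-Fi", "Action"],
    "director": {"first": "Christopher", "last": "Nolan"},
}
print(profile["genres"][0])         # Sci-Fi
print(profile["director"]["last"])  # Nolan
```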

#### Accessing Values
You access values in a dictionary using their associated keys, not numerical indices.

In [None]:
print(movie_revenue["Avatar"])
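Asking for a key that does not exist raises a `KeyError`. The sketch below shows two ways to avoid that: checking with `in` first, or using the `.get()` method, which returns `None` (or a default you choose) when the key is missing. The smaller `revenue` dictionary is used here so the cell runs on its own:

```python
revenue = {"Avatar": 2923706026, "Titanic": 2257844554}

print("Titanic" in revenue)          # True
print("Inception" in revenue)        # False

print(revenue.get("Titanic"))        # 2257844554
print(revenue.get("Inception"))      # None
print(revenue.get("Inception", 0))   # 0 (our chosen default)

missing_key_failed = False
try:
    print(revenue["Inception"])
except KeyError as error:
    missing_key_failed = True
    print(f"KeyError: {error}")
```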

#### Adding, Modifying, and Removing Key-Value Pairs
Dictionaries are mutable, so you can easily change their contents.

##### Adding New Pairs

In [None]:
movie_revenue["Napoleon Dynamite"] = 1000
print(movie_revenue)

##### Modifying Values

In [None]:
movie_revenue['Jurassic World'] = 10
print(movie_revenue)

##### Removing Pairs
You can use the `del` keyword.

In [None]:
del movie_revenue['Avatar']
print(movie_revenue)
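Another option, sketched here on a smaller made-up dictionary, is the `.pop()` method. Unlike `del`, it hands back the value it removed, and it accepts an optional default so that a missing key does not cause an error:

```python
revenue = {"Avatar": 2923706026, "Titanic": 2257844554}

removed_value = revenue.pop("Avatar")     # removes the pair and returns the value
print(removed_value)                      # 2923706026
print(revenue)                            # {'Titanic': 2257844554}

# With a default, popping a missing key returns the default instead of raising KeyError
missing_value = revenue.pop("Inception", None)
print(missing_value)                      # None
```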

#### Common Built-in Functions and Dictionary Methods
- `len(dictionary)`: Returns the number of key-value pairs in the dictionary.
- `.keys()`: Returns a view object that displays a list of all the keys in the dictionary.
- `.values()`: Returns a view object that displays a list of all the values in the dictionary.
- `.items()`: Returns a view object that displays a list of a dictionary's key-value tuple pairs.

In [None]:
movie_revenue.values()
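These methods are often combined with other tools. The sketch below, on a smaller copy of the data, turns the keys into a list, loops over the key-value pairs, and finds the title with the largest revenue. The loop and the `key=` argument to `max` are previews of topics beyond this section:

```python
revenue = {
    "Avatar": 2923706026,
    "Titanic": 2257844554,
    "Jurassic World": 1671537445,
}

title_list = list(revenue.keys())
print(title_list)       # ['Avatar', 'Titanic', 'Jurassic World']
print(len(revenue))     # 3

# .items() gives (key, value) pairs, which we can unpack in a loop
for title, amount in revenue.items():
    print(f"{title}: ${amount:,}")

# max() with key=revenue.get compares the values but returns the matching key
top_title = max(revenue, key=revenue.get)
print(top_title)        # Avatar
```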

### Tuples (FYI)

In Python, a tuple is a data structure used to store an ordered collection of items. It's similar to a list, but with one critical difference: tuples are immutable. Tuples are:

- Ordered: Items have a defined sequence, accessible by index (starting from 0).

- Immutable: Once created, you cannot change, add, or remove items from a tuple. Its contents are fixed.

- Allows Duplicates: Can contain repeated values.

- Heterogeneous: Can hold items of different data types.
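A brief sketch of these properties, using a made-up coordinate pair. The `try`/`except` lines catch the error that Python raises when we try to change a tuple:

```python
coord = (40.7128, -74.0060)

# Ordered: items are accessed by index, starting from 0
print(coord[0])     # 40.7128
print(len(coord))   # 2

# Immutable: assigning to an index raises a TypeError
change_failed = False
try:
    coord[0] = 0.0
except TypeError as error:
    change_failed = True
    print(f"TypeError: {error}")

# "Unpacking" assigns each item to its own variable
latitude, longitude = coord
print(latitude)     # 40.7128
print(longitude)    # -74.006
```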

A common use of tuples is as a key in a dictionary to represent coordinates:

In [None]:
sensor_readings = {
    (40.7128, -74.0060): {"temperature": 25.5, "humidity": 60},   # (40.7128, -74.0060) is the tuple
    (34.0522, -118.2437): {"temperature": 30.1, "humidity": 30}   
}

print(sensor_readings[40.7128, -74.0060])   # Direct access by referencing the tuple

In [None]:
# A cleaner way is to assign the tuple to a variable first
coord = (34.0522, -118.2437)
print(sensor_readings[coord])
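Because each value in this dictionary is itself a dictionary, a second pair of brackets reaches a single measurement. A minimal sketch, with the data repeated so the cell runs on its own and `reading` as a made-up helper name:

```python
sensor_readings = {
    (40.7128, -74.0060): {"temperature": 25.5, "humidity": 60},
    (34.0522, -118.2437): {"temperature": 30.1, "humidity": 30},
}

coord = (34.0522, -118.2437)

# Step by step: first look up the inner dictionary, then the measurement
reading = sensor_readings[coord]
print(reading["temperature"])              # 30.1

# Or in one line, chaining the two lookups
print(sensor_readings[coord]["humidity"])  # 30
```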

___
## 💪 **Exercise** 💪

Imagine you're recording a single measurement from a sensor.

Create a new code cell below this one.

1. Create a dictionary named `sensor_reading` with the following data:

| Key | Value |
|-----|-------|
| timestamp | 2025-06-23 10:30:00 |
| temperature_c | 28.5 |
| humidity_percent | 65 |
| location | Lab A |

2. Print the entire `sensor_reading` dictionary.

3. Access and print only the `temperature_c` value.

4. Add a new key-value pair: `"pressure_kpa"` with a value of `101.25`.

5. Change the `humidity_percent` to `68`.

6. Print the updated `sensor_reading` dictionary.
___

## 📓 Reflection 📓

When teaching science and integrating coding, what is the value of "non-science" examples?