# Introduction to Python (Fall 2024)
## 1. Overview

Credits to Holly Wiberg, Léonard Boussioux, Asterios Tsiourvas, Periklis Petridis, Kimberly Villalobos and Sean Lo who wrote the content you see before you.


## What is Python?

Python is a general-purpose programming language. From the Python website:

> Python is an easy to learn, powerful programming language. It has efficient high-level data structures and a simple but effective approach to object-oriented programming. Python’s elegant syntax and dynamic typing, together with its interpreted nature, make it an ideal language for scripting and rapid application development in many areas on most platforms.

Python is widely used for writing production code, automating tasks, machine learning, analyzing and visualizing data, and building web applications, among many other things.




A few things to know about Python:
1. Python is a *high-level*, *interpreted* language with high-level data types, such as flexible arrays and dictionaries. This allows Python programs to be compact and readable compared to statically typed, lower-level languages like C, C++, or Java. In fact, Python was the most popular programming language as of Aug 2023 (according to [TIOBE](https://www.tiobe.com/tiobe-index/))!
2. Python is *object-oriented*, which means that software design is organized around objects and the special functions (called "methods") attached to those objects. This is in contrast to functional programming languages, in which software design is organized around functions that are composed together in small units.
3. Being a very popular and mature language, Python has libraries for almost anything. Some of the most widely-used examples in data science are:
    - `numpy` for numerical computation at C-like speeds;
    - `pandas` for exploring and manipulating data;
    - `matplotlib` for plotting graphs;
    - `scikit-learn` for statistics and machine learning;
    - `tensorflow` and `pytorch` for deep learning.

Further reading:
- https://docs.python.org/3/tutorial/appetite.html
- https://en.wikipedia.org/wiki/Object-oriented_programming
- https://en.wikipedia.org/wiki/Functional_programming

## Agenda

We're going to give you a whirlwind tour of Python, covering the following topics:
- How we store *information*: Variables, I/O, Data structures
- How we perform *computations*: Functions, Control flow

---

# Module 1: Variables

Python can manipulate data such as numbers and text by storing them in variables. Here are some examples:

### Numbers

In [None]:
# Python is a calculator!
x = 2 + 5 # Here, we create a variable `x` and bind the value of 2 + 5 to it
x

7

In [None]:
# You can include various mathematical expressions:
(1 + 2) * (3 - x) / 0.5

-24.0

In [None]:
# Exponentiation:
x ** 2

49

In [None]:
# The ^ operator in Python is bitwise XOR (exclusive OR), not exponentiation.
# Here's how it works:
# The number 7 in binary is 0111.
# The number 2 in binary is 0010.
# XOR keeps the bits that differ: 0101, which is 5.
x ^ 2

5
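
To see the XOR step bit by bit, here is a small sketch that prints both operands and the result as 4-bit binary strings using the built-in `format()`. The names `a` and `b` are illustrative stand-ins for the values used above.

```python
a, b = 7, 2

# Show both operands and the XOR result as 4-bit binary strings
print(format(a, "04b"))      # 0111
print(format(b, "04b"))      # 0010
print(format(a ^ b, "04b"))  # 0101, i.e. 5: only the differing bits are set

# Exponentiation uses ** instead of ^
print(a ** b)                # 49
```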

In [None]:
# Division vs. floor (integer) division:
print(x // 3) # floor division
print(x / 3)  # true division
# Computing the remainder:
print(x % 3)

2
2.3333333333333335
1


There are a few different primitive data types for expressing numbers, such as `int` for integers and `float` for floating-point decimals.

In [None]:
# Get the type of a variable
type(x)

int

You can round decimals to the required precision:

In [None]:
# Rounding numbers
round(x / 3, ndigits = 1)

2.3

You can also print numbers in an easy-to-read format (see https://docs.python.org/3/library/stdtypes.html#old-string-formatting):

In [None]:
# Formatting numbers for printing (different from rounding)
format(x / 3, ".2f")

'2.33'
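
The same format specification also works inside a formatted string literal (f-string). The sketch below assumes `x = 7` as above; `ratio` and `message` are illustrative names. It also shows that formatting yields a string while rounding yields a number.

```python
x = 7
ratio = x / 3

# An f-string embeds the value and applies the same ".2f" format spec
message = f"x / 3 is approximately {ratio:.2f}"
print(message)                      # x / 3 is approximately 2.33

# Formatting produces a string; rounding produces a number
print(type(format(ratio, ".2f")))   # <class 'str'>
print(type(round(ratio, 2)))        # <class 'float'>
```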

#### **Advanced material**: complex numbers, custom number types

In [None]:
# In-built support for complex numbers
1 + 2.3j

(1+2.3j)

In [None]:
# `Decimal` for fixed-point decimal arithmetic
from decimal import Decimal
print(Decimal(0.1)) # what's happening here?

# Decimal(0.1) receives the float 0.1, which is already a binary approximation
# of one tenth. Decimal then reproduces that approximation exactly, digit for
# digit. Passing the string "0.1" avoids the float entirely and gives exactly
# one tenth.

print(Decimal("0.1"))

0.1000000000000000055511151231257827021181583404541015625
0.1


In [None]:
# `Fraction` for storing rational numbers exactly

# As with Decimal above, Fraction(0.3) receives a float. The float closest to
# 0.3 is not exactly 0.3 in binary, so the Fraction stores that approximation
# exactly as a ratio of two integers. Passing a string or two integers gives
# the exact value 3/10.

from fractions import Fraction
print(Fraction(0.3)) # the binary approximation of 0.3
print(Fraction("0.3"))
print(Fraction(3, 10))

5404319552844595/18014398509481984
3/10
3/10


(Further reading: https://docs.python.org/3/tutorial/floatingpoint.html#tut-fp-issues)
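
The practical consequence of these approximations is that floats should usually be compared with a tolerance instead of `==`. Here is a minimal sketch using `math.isclose()` from the standard library; `total` and `exact` are illustrative names.

```python
import math
from fractions import Fraction

total = 0.1 + 0.2
print(total)                      # 0.30000000000000004
print(total == 0.3)               # False
print(math.isclose(total, 0.3))   # True

# Exact rational arithmetic does not have this problem
exact = Fraction(1, 10) + Fraction(2, 10)
print(exact == Fraction(3, 10))   # True
```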

### Text

Text is represented in Python via the string (`str`) data type.

In [None]:
my_string = "Hello world!"
print(my_string)

Hello world!


One can represent the same string in different ways:

In [None]:
# Single quotation marks
print('Hello world!')
# Double quotation marks
print("Hello world!")
# Double quotes become useful if you have to quote single quotes:
print('doesn\'t')
print("doesn't")

Hello world!
Hello world!
doesn't
doesn't


There are special characters you can introduce in strings (called "escape characters"). Some examples:

In [None]:
print("Hello\tworld!") # tab
print("Hello\nworld!") # newline

Hello	world!
Hello
world!


Here are a few ways to represent a string over multiple lines:

In [None]:
# Using \n
print("Machine Learning under a\nModern Optimization Lens")
print()
# Using triple quotes
print("""
Machine Learning under a
Modern Optimization Lens
""")

Machine Learning under a
Modern Optimization Lens


Machine Learning under a
Modern Optimization Lens



You can concatenate (glue together) strings using `+` and repeat them using `*`:

In [None]:
2 * "one plus " + "two equals four"

'one plus one plus two equals four'

Under the hood, a string is a sequence of characters! This means you can index into strings to retrieve specific characters:

In [None]:
my_string

'Hello world!'

In [None]:
print(my_string[6]) # character in position 6
print(my_string[-1]) # character in last position

w
!


Python is a zero-indexed language, so the first index of any sequence is 0. You can even look at substrings of a string by using slicing:

In [None]:
print(my_string[0:5]) # characters from position 0 (incl) to position 5 (excl)
print(my_string[-6:]) # characters from position -6 (incl) to the end
print(my_string[0::1]) # every character (step 1), from position 0 (incl) to the end
print(my_string[0::2]) # every other character, from position 0 (incl) to the end

Hello
world!
Hello world!
Hlowrd
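
Slices also accept a negative step, which walks through the string backwards. A short sketch, assuming the same `my_string` as above (the other names are illustrative):

```python
my_string = "Hello world!"

reversed_string = my_string[::-1]   # step -1 visits characters from the end
middle = my_string[6:11]            # positions 6 (incl) to 11 (excl)

print(reversed_string)  # !dlrow olleH
print(middle)           # world
```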


Some built-in functions also work on strings; for example, `len()` returns a string's length:

In [None]:
len(my_string)

12

Strings are *immutable*, which means that once created, they cannot be changed in place. If you want to change certain characters in a string, you should create a new string:

In [None]:
my_string[6:] = "NYU!"

TypeError: 'str' object does not support item assignment

In [None]:
my_new_string = my_string[:6] + "NYU!"
print(my_new_string)

Hello NYU!
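
String methods follow the same rule: they return a new string and leave the original untouched. A small sketch using `str.replace()` and `str.upper()`; the names `replaced` and `shouted` are illustrative.

```python
my_string = "Hello world!"

replaced = my_string.replace("world", "NYU")
shouted = my_string.upper()

print(replaced)   # Hello NYU!
print(shouted)    # HELLO WORLD!
print(my_string)  # Hello world!  (the original is unchanged)
```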


Further reading:
- On string methods (operations on strings): https://docs.python.org/3/library/stdtypes.html#string-methods
- Formatted string literals (when you want to print strings with values in them): https://docs.python.org/3/reference/lexical_analysis.html#f-strings

### Lists

We've already seen strings as a sequence of characters, which you can index into with numbers (single positions) or slices (substrings). Lists are our first example of composite data types, and are flexible containers in square brackets with items separated by commas:

In [None]:
squares = [1, 4, 9, 16, 25]
squares

[1, 4, 9, 16, 25]

Just like other sequence types (like strings), you can index and slice into them:

In [None]:
print(squares[2]) # indexing returns the item
print(squares[0::2]) # slicing returns a new list
print(len(squares))

9
[1, 9, 25]
5


Lists are *mutable* data structures. What this means is that you can change them (or their elements) after they are created. This is very unlike strings! Let's see an example:

In [None]:
not_squares = squares.copy() # returns a copy of the list

In [None]:
not_squares

[1, 4, 9, 16, 25]

In [None]:
not_squares[1] = 3
not_squares

[1, 3, 9, 16, 25]

In [None]:
squares # not modified, since we modified the copy

[1, 4, 9, 16, 25]

Lists have many built-in convenience methods:
- we can reverse them with `reverse()`;
- we can append (using `append()`) items to the end of the list;
- we can extend a list (using `extend()`) with items from another list (in fact, another sequence);
- we can get the last element and remove it from the list using `pop()`.

In [None]:
# Reverses the list in-place
squares.reverse()
squares

[25, 16, 9, 4, 1]

In [None]:
# Adds an element to the end of the list
squares.append(36)
squares

[25, 16, 9, 4, 1, 36]

In [None]:
# Extends with other lists
squares.extend([49, 64])
squares

[25, 16, 9, 4, 1, 36, 49, 64]

In [None]:
# What's the difference? Here's an example:
squares.append([81, 100])
squares

[25, 16, 9, 4, 1, 36, 49, 64, [81, 100]]

In [None]:
# Pop the last element from the list
last = squares.pop()
print("last: ", last)
print(squares)

last:  [81, 100]
[25, 16, 9, 4, 1, 36, 49, 64]


If all elements of the list are numbers, we can `sum` them; also, we can `sort` them if the elements are comparable.

In [None]:
sum(squares)

204

In [None]:
# Version 1: returning a new list
sorted(squares)

[1, 4, 9, 16, 25, 36, 49, 64]

In [None]:
squares # is this going to be sorted?

[25, 16, 9, 4, 1, 36, 49, 64]

In [None]:
# Version 2: sorting the list in place
squares.sort()
squares

[1, 4, 9, 16, 25, 36, 49, 64]

In [None]:
# Sorting in reverse order
squares.sort(reverse = True)
squares

[64, 49, 36, 25, 16, 9, 4, 1]

Further reading:
- List methods (and methods for other sequence types): https://docs.python.org/3/library/stdtypes.html#typesseq-common

#### **Advanced material**: shallow vs deep copies
- https://docs.python.org/3/library/copy.html#shallow-vs-deep-copy
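
The difference only matters for nested structures. Here is a minimal sketch, assuming a hypothetical nested list `grid`: `list.copy()` makes a shallow copy that shares the inner lists, while `copy.deepcopy()` duplicates them too.

```python
import copy

grid = [[1, 2], [3, 4]]
shallow = grid.copy()         # new outer list, same inner lists
deep = copy.deepcopy(grid)    # new outer list and new inner lists

grid[0].append(99)            # mutate an inner list of the original

print(shallow[0])  # [1, 2, 99]  (shared with the original)
print(deep[0])     # [1, 2]      (fully independent)
```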

# Module 2: Reading from files

We've now seen how Python stores data in variables. One often needs to read data stored in files and write data back to files, and we'll see how to do that here.

---

The file of interest is `wordle_words.txt`, a list of 5-letter English words used in the game Wordle.

### Reading from files

There are several ways to open a file. We'll see a few of them:

In [None]:
# Download the word list; the URL is quoted so the shell does not treat `&` specially
!wget -O wordle_words.txt "https://www.dropbox.com/scl/fi/8wbkx07e7p08f354ejzrm/wordle_words.txt?rlkey=1wwol9rbkzcavi3n0oflmkr6z&dl=0"

# First method: using f.read()
filepath = "wordle_words.txt"
with open(filepath, "r") as f:
    file = f.read()


In the above code, we opened our file using the `open()` function. It takes the filepath of the file, together with a second argument denoting the *mode* in which to open the file (here `"r"` indicates we are opening the file in read-only mode).

Here, we used the `with` keyword (more info [here](https://docs.python.org/3/reference/compound_stmts.html#with)). This keyword defines a *context* in which to open the file, and makes sure that the appropriate setup and cleanup actions are taken (e.g. the file is closed after it is read). It's generally good practice to use `with` statements to read from and write to files.

Here, `f` is the open file object, and `file` holds the file's entire contents as a single string (the result of calling `f.read()`). Let's examine the file contents:

In [None]:
file

'aback\nabase\nabate\nabbey\nabbot\nabhor\nabide\nabled\nabode\nabort\nabout\nabove\nabuse\nabyss\nacorn\nacrid\nactor\nacute\nadage\nadapt\nadept\nadmin\nadmit\nadobe\nadopt\nadore\nadorn\nadult\naffix\nafire\nafoot\nafoul\nafter\nagain\nagape\nagate\nagent\nagile\naging\naglow\nagony\nagree\nahead\naider\naisle\nalarm\nalbum\nalert\nalgae\nalibi\nalien\nalign\nalike\nalive\nallay\nalley\nallot\nallow\nalloy\naloft\nalone\nalong\naloof\naloud\nalpha\naltar\nalter\namass\namaze\namber\namble\namend\namiss\namity\namong\nample\namply\namuse\nangel\nanger\nangle\nangry\nangst\nanime\nankle\nannex\nannoy\nannul\nanode\nantic\nanvil\naorta\napart\naphid\naping\napnea\napple\napply\napron\naptly\narbor\nardor\narena\nargue\narise\narmor\naroma\narose\narray\narrow\narson\nartsy\nascot\nashen\naside\naskew\nassay\nasset\natoll\natone\nattic\naudio\naudit\naugur\naunty\navail\navert\navian\navoid\nawait\nawake\naward\naware\nawash\nawful\nawoke\naxial\naxiom\naxion\nazure\nbacon\nbadge\nbadly

This is good, but not very helpful! Ideally we would like a list of words. Thankfully, `file` is now just a string, and we can split it with `splitlines()` (or `split()` with the correct arguments):

In [None]:
lines_1 = file.splitlines()
# or: lines_1 = file.split(sep = "\n")
lines_1[:5]

['aback', 'abase', 'abate', 'abbey', 'abbot']

Reading text from a file and then extracting its lines is a very common pattern. One can accomplish this in one step using `readlines()`:

In [None]:
with open(filepath, "r") as f:
    lines_2 = f.readlines()

In [None]:
lines_2[:5]

['aback\n', 'abase\n', 'abate\n', 'abbey\n', 'abbot\n']

The `\n` you see at the end of these words is the newline character, which starts a new line when printed. Here's the effect of printing it:

In [None]:
print(lines_1[0])
print(lines_2[0])

aback
aback



In [None]:
print(repr(lines_1[0]))
print(repr(lines_2[0]))

'aback'
'aback\n'


To remove these trailing newline characters, you can use the `rstrip()` method (or, to remove both leading and trailing whitespace, use `strip()`):

In [None]:
# The repr() function in Python returns a string that represents the object in
# a way that is meant to be unambiguous. For strings, repr() often adds quotes
# around the string and includes any special characters (like \n for newlines,
# \t for tabs, etc.) in their escaped forms.
print(repr(lines_2[0].rstrip()))

'aback'
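
Putting these pieces together, one common way to get a clean list of lines is to loop over the file object and strip each line as it is read. The sketch below writes a small stand-in file to a temporary directory so that it runs anywhere; the file name `sample_words.txt` and its three words are assumptions for illustration, not the real word list.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create a small stand-in file (hypothetical contents)
    sample_path = os.path.join(tmp_dir, "sample_words.txt")
    with open(sample_path, "w") as f:
        f.write("aback\nabase\nabate\n")

    # Iterating over a file object yields one line at a time
    words = []
    with open(sample_path, "r") as f:
        for line in f:
            words.append(line.rstrip("\n"))

print(words)      # ['aback', 'abase', 'abate']
print(f.closed)   # True: the `with` block closed the file for us
```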


### Writing to files

The next natural thing to want to do is to write to a file. Instead of using the `open()` function in read-only mode, we can use the `"w"` mode to write to a file (or `"a"` to append to a file that already has content).

We write to files using the `write()` method, as demonstrated below:

In [None]:
from google.colab import drive
drive.mount('/content/drive') #mount the drive

Mounted at /content/drive


In [None]:
write_filepath = "/content/drive/MyDrive/sample.txt"
with open(write_filepath, "w") as f:
    f.write("Hello world!\n")


In [None]:
write_filepath = "/content/drive/MyDrive/sample.txt"
with open(write_filepath, "a") as f:
    f.write("Hello NYU!\n")
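
To make the difference between the `"w"` and `"a"` modes concrete, here is a sketch that does not depend on Google Drive. It writes to a temporary directory instead; the path `demo.txt` and the variable names are assumptions for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.txt")

    with open(demo_path, "w") as f:   # creates the file (or empties it)
        f.write("first\n")
    with open(demo_path, "a") as f:   # adds to the end of the file
        f.write("second\n")
    with open(demo_path, "r") as f:
        after_append = f.read()

    with open(demo_path, "w") as f:   # "w" again discards the old content
        f.write("third\n")
    with open(demo_path, "r") as f:
        after_rewrite = f.read()

print(repr(after_append))   # 'first\nsecond\n'
print(repr(after_rewrite))  # 'third\n'
```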

### (Advanced) Reading and writing raw data to files

So far we've seen how we read and write to text files. However, sometimes we want to store Python data structures directly to a file. We'll see later how to store numeric data and dataframes directly to file, but here's a way using the `pickle` module to *serialize* arbitrary Python data to file.

In [None]:
import pickle

In [None]:
pickle_filepath = "/content/drive/MyDrive/sample_data.p"

In [None]:
with open(pickle_filepath, "wb") as f:
    pickle.dump(lines_1, f)

Here, we open the file in `"wb"` mode (write, binary), since we are writing binary data to it. We write to the file using the `pickle.dump()` function.

We can read from a pickled file using `pickle.load()`, as follows:

In [None]:
with open(pickle_filepath, "rb") as f:
    lines_1_p = pickle.load(f)

In [None]:
lines_1_p[:5]

['aback', 'abase', 'abate', 'abbey', 'abbot']

Further reading: https://docs.python.org/3/library/pickle.html#module-pickle
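
The same round trip can be tried without touching the disk, using `pickle.dumps()` and `pickle.loads()`, which work on `bytes` objects. This is a minimal sketch; the `record` dictionary is made up for illustration.

```python
import pickle

record = {"words": ["aback", "abase"], "count": 2, "point": (1.5, 2.5)}

payload = pickle.dumps(record)      # serialize to a bytes object
restored = pickle.loads(payload)    # rebuild an equal object

print(type(payload))        # <class 'bytes'>
print(restored == record)   # True
print(restored is record)   # False: it is a copy, not the same object
```

As the `pickle` documentation warns, only unpickle data that comes from a source you trust.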

# Module 3: Advanced Data Structures

So far, we've seen one example of a composite data structure, the `list` (which, like the string, is a sequence type). Python actually has a variety of these data structures, which allow us to organize our data in a logical way and perform computations efficiently.

---

### Tuples

A tuple is like a list, but it is *immutable*, meaning it cannot be modified after creation. We create tuples using round brackets, as follows:

In [None]:
my_tuple = (1, "a", [1, 2])

In [None]:
my_tuple[1:3]

('a', [1, 2])

In [None]:
# Tuples do not support item assignment
my_tuple[0] = 2

TypeError: 'tuple' object does not support item assignment

In [None]:
# But they can contain mutable objects, and those can be mutated:

In [None]:
my_tuple[2].append(3)

In [None]:
my_tuple

(1, 'a', [1, 2, 3])

Tuples can be *packed* and *unpacked*. Tuple packing refers to combining one or more objects into a tuple:

In [None]:
my_tuple = 1, "a", [1, 2] # notice the lack of parentheses
my_tuple

(1, 'a', [1, 2])

Tuple unpacking, in turn, allows us to create new references to the individual elements of a tuple:

In [None]:
# Tuple unpacking:
x, y, z = my_tuple
print(x)
print(y)
print(z)

1
a
[1, 2]


(Actually, tuple unpacking is a special case of sequence unpacking and works for any sequence type:)

In [None]:
x, y, z = [1, 2, 3]
print(x)
print(y)
print(z)

1
2
3
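
Two handy variations on unpacking are a starred name, which collects the leftover items into a list, and swapping two variables in one line. A short sketch with illustrative names:

```python
# A starred name collects whatever is left over into a list
first, *rest = [1, 2, 3, 4]
print(first)  # 1
print(rest)   # [2, 3, 4]

# Packing and unpacking on one line swaps two variables
a, b = "left", "right"
a, b = b, a
print(a, b)   # right left

# Strings are sequences too, so they unpack character by character
c1, c2, c3 = "abc"
print(c1, c2, c3)  # a b c
```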


### Sets

A Python set -- like the math definition of a set -- is an *unordered* collection with no duplicate elements. It's useful for membership testing and eliminating duplicate entries. Here's an example:

In [None]:
my_quiz_scores = [7,3,4,8,8,6,7,4]

In [None]:
my_quiz_scores_set = set(my_quiz_scores)
my_quiz_scores_set

{3, 4, 6, 7, 8}

Many of the same things you can do with lists, you can do with sets too. For example:

In [None]:
4 in my_quiz_scores_set

True

In [None]:
# A new student's score arrives, and they scored 10
my_quiz_scores_set.add(10) # instead of append() for lists, since sets are unordered
my_quiz_scores_set

{3, 4, 6, 7, 8, 10}

There are also set-specific operations you can do with two sets; here are some examples:

In [None]:
passing_scores = set(range(5, 11))
passing_scores

{5, 6, 7, 8, 9, 10}

In [None]:
# set union
print(passing_scores | my_quiz_scores_set)
print(passing_scores.union(my_quiz_scores_set))

{3, 4, 5, 6, 7, 8, 9, 10}
{3, 4, 5, 6, 7, 8, 9, 10}


In [None]:
# set intersection
print(passing_scores & my_quiz_scores_set)
print(passing_scores.intersection(my_quiz_scores_set))

{8, 10, 6, 7}
{8, 10, 6, 7}


In [None]:
# set difference
print(my_quiz_scores_set - passing_scores)
print(my_quiz_scores_set.difference(passing_scores))

{3, 4}
{3, 4}
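
Two more operations that often come in handy are the symmetric difference (`^`) and subset tests (`<=`). The sketch below rebuilds the two sets from above so that it runs on its own, and uses `sorted()` because the print order of a set is not guaranteed.

```python
my_quiz_scores_set = {3, 4, 6, 7, 8, 10}
passing_scores = set(range(5, 11))

# Symmetric difference: elements in exactly one of the two sets
print(sorted(my_quiz_scores_set ^ passing_scores))   # [3, 4, 5, 9]

# Subset tests
print({6, 7} <= passing_scores)               # True
print(my_quiz_scores_set <= passing_scores)   # False
```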


A list of the various set operations available can be found [here](https://docs.python.org/3/library/stdtypes.html#set). Alternatively, you can use the `help()` function to expose the methods of a Python object:


In [None]:
help(my_quiz_scores_set)

Help on set object:

class set(object)
 |  set() -> new empty set object
 |  set(iterable) -> new set object
 |  
 |  Build an unordered collection of unique elements.
 |  
 |  Methods defined here:
 |  
 |  __and__(self, value, /)
 |      Return self&value.
 |  
 |  __contains__(...)
 |      x.__contains__(y) <==> y in x.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __iand__(self, value, /)
 |      Return self&=value.
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __ior__(self, value, /)
 |      Return self|=value.
 |  
 |  __isub__(self, value, /)
 |      Return self-=value.
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __ixor__(self, value, /)
 |      Return self^=value.


### Dictionaries

So far, we've seen a few different examples of composite data structures: the `list`, the `tuple`, and the `set`. All of these are collections; the former two are also sequences (ordered collections), while the set is an unordered collection.

Here we introduce a different data structure (and perhaps the most Pythonic): the dictionary. It's a *mapping* type that maps keys to values. Keys have to be unique within a dictionary, and they allow us to look up items by key instead of by an integer index, as we do with lists. Here's an example:

In [None]:
# Create an empty dictionary of products and their prices
product_prices = {}

# Adds products (keys) and prices (values) to the dictionary
product_prices["apple"] = 2
product_prices["banana"] = 2
product_prices["carrot"] = 3

print(product_prices)


{'apple': 2, 'banana': 2, 'carrot': 3}


In the above code, we:
- initialized an empty dictionary using `{}` (one can also use `dict()`)
- added key-value pairs using the syntax `dict[key] = value`

You can use dictionaries to test for membership, like for other collection types. You can also access values by their key:

In [None]:
# membership tests
print("apple" in product_prices)
print("apple" in product_prices.keys())
print("donut" in product_prices)


True
True
False


In [None]:
# accessing values by key
print("apple", product_prices["apple"])

apple 2


In [None]:
# use the get() function if you don't know if the collection has the key!
print("donut", product_prices.get("donut"))

donut None
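
For comparison, here is a sketch of what happens without `get()`: indexing with a missing key raises a `KeyError`. It also shows the optional second argument of `get()`, a default value to return instead of `None`. The dictionary is rebuilt here so the example runs on its own.

```python
product_prices = {"apple": 2, "banana": 2, "carrot": 3}

# get() accepts a default to return when the key is missing
print(product_prices.get("donut", 0))   # 0

# Square-bracket access raises KeyError for a missing key
try:
    product_prices["donut"]
except KeyError as err:
    print("missing key:", err)          # missing key: 'donut'
```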


Since a dictionary is also a collection, you can iterate over it in a `for` loop. Here are a few ways to do it:

In [None]:
# iterate over dictionary keys
print(product_prices.keys())
for item in product_prices.keys():
    print(item, product_prices[item])

dict_keys(['apple', 'banana', 'carrot'])
apple 2
banana 2
carrot 3


In [None]:
# iterate over dictionary key-value pairs (very useful!)
print(product_prices.items())
for item, price in product_prices.items():
    print(item, price)

dict_items([('apple', 2), ('banana', 2), ('carrot', 3)])
apple 2
banana 2
carrot 3


You can also:
- remove keys and their values
- update with key-value pairs of other dictionaries

In [None]:
del product_prices["carrot"]

In [None]:
more_prices = {"donut": 4, "eggplant": 3}
product_prices.update(more_prices)
product_prices

{'apple': 2, 'banana': 2, 'donut': 4, 'eggplant': 3}

In [None]:
product_prices[[1,2,3]] = "No"

TypeError: unhashable type: 'list'

In [None]:
product_prices[(1, 2, 3)] = "Yes"
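
The two cells above differ because dictionary keys must be *hashable*. Mutable built-ins such as lists are not, while tuples are, as long as everything inside them is hashable too. A minimal sketch using the built-in `hash()`; the `grid_labels` dictionary is a made-up example.

```python
grid_labels = {}
grid_labels[(0, 0)] = "origin"
grid_labels[(1, 2)] = "target"
print(grid_labels[(1, 2)])   # target

# Lists are mutable, so they cannot be hashed
try:
    hash([1, 2, 3])
except TypeError as err:
    print(err)               # unhashable type: 'list'

# A tuple is only hashable if all of its elements are hashable
try:
    hash((1, [2, 3]))
except TypeError as err:
    print(err)               # unhashable type: 'list'
```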

For everything you can do with dictionaries, see here: https://docs.python.org/3/library/stdtypes.html#mapping-types-dict


### Comprehensions

So far, we have seen ways to create lists and add items to lists via `append()` (or `add()` for sets, and `update()` for dictionaries). One very common pattern is to iteratively add items to a collection using a `for` loop, and there is a very Pythonic way to do this: comprehensions.

Below is an example of a list comprehension:

In [None]:
# Recall our earlier example?
squares = [1, 4, 9, 16, 25]

In [None]:
# Instead of this:
squares = []
for i in range(1, 6):
    squares.append(i ** 2)
squares

[1, 4, 9, 16, 25]

In [None]:
squares = [i ** 2 for i in range(1, 6)]

A list comprehension consists of brackets containing an expression followed by a `for` clause, then zero or more `for` or `if` clauses. This allows for nested `for` clauses and `if` conditions:

In [None]:
triangle = [
    (i, j)
    for i in range(4)
    for j in range(i)
    if j % 2 == 0
]
triangle

[(1, 0), (2, 0), (3, 0), (3, 2)]

In [None]:
list(range(4))

[0, 1, 2, 3]

In [None]:
# This is equivalent to the following nested loop:
triangle = []
for i in range(4): # 0, 1, 2, 3
    for j in range(i):
        # (1,0),
        # (2,0), (2,1)
        # (3,0), (3,1), (3,2)
        if j % 2 == 0:
            triangle.append((i, j))
triangle

[(1, 0), (2, 0), (3, 0), (3, 2)]

List comprehensions are very useful for:
- making new lists where each element is the result of operations on each member of another sequence,
- or creating a new subsequence of those elements that satisfy a condition.
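
Here is a small sketch of both uses, applied to a handful of words (a made-up sample, not the full word list): one comprehension transforms every element, the other keeps only the elements that satisfy a condition.

```python
words = ["aback", "abase", "bacon", "badge", "badly"]

# Transform: apply an operation to every element
upper_words = [w.upper() for w in words]

# Filter: keep only the elements that satisfy a condition
b_words = [w for w in words if w.startswith("b")]

print(upper_words)  # ['ABACK', 'ABASE', 'BACON', 'BADGE', 'BADLY']
print(b_words)      # ['bacon', 'badge', 'badly']
```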


We can use the same syntax to construct sets:

In [None]:
squares_set = {i**2 for i in range(1, 6)}
squares_set

{1, 4, 9, 16, 25}

And dictionaries:

In [None]:
squares_dict = {i: i ** 2 for i in range(1, 6)}
squares_dict

{1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

If we use parentheses instead of square brackets, we get something unexpected:

In [None]:
(i**2 for i in range(1, 6))

<generator object <genexpr> at 0x7a7c327b0e10>

What is this? It is a *generator expression*, which returns an iterator. An iterator resembles a list in that both yield their items in order, but the iterator produces items one at a time instead of creating and storing all of them in memory. This makes it useful for iterating over a large collection under memory constraints. We illustrate this below:

In [None]:
squares_large = [i**2 for i in range(10000000)]
for i in squares_large:
    print(i)
    break

0


In [None]:
squares_large_genexp = (i**2 for i in range(10000000))
for i in squares_large_genexp:
    print(i)
    break

0


We can even compare the size (in bytes) of the list object and the generator object in memory. Note that `sys.getsizeof()` reports only the size of the container itself, not of the integers the list refers to:

In [None]:
import sys

print(sys.getsizeof(squares_large))
print(sys.getsizeof(squares_large_genexp))

89095160
208
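
The memory savings come with a trade-off: a generator can only be consumed once. A short sketch, with `squares_gen` as an illustrative name:

```python
squares_gen = (i ** 2 for i in range(1, 6))

first_total = sum(squares_gen)
second_total = sum(squares_gen)   # nothing left to iterate over

print(first_total)    # 55
print(second_total)   # 0

# A generator expression can be passed straight to a function like sum()
print(sum(i ** 2 for i in range(1, 6)))  # 55
```

If the values are needed more than once, a list comprehension (or recreating the generator) is the safer choice.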
