# Collection

Python provides some built-in collection types that can contain multiple items.

The following collection types are commonly used:

- sequence
  - `str`: an immutable sequence of characters.
  - `list`: a mutable sequence of any objects.
  - `tuple`: an immutable sequence of objects.
- mapping
  - `dict`: a mutable mapping of key and value pairs.
- set
  - `set`: a mutable unordered collection of unique objects.

# String

A string is a sequence of characters.

A string is an immutable object. 

A programmer creates a string literal by surrounding text with single or double quotes, such as `'MARY'`, `"MARY"`, `'41'`, or `"41"`.

An empty string is a string with 0 characters, created with two quotes and nothing between them. Ex: `my_str = ''`.

Python uses a backslash `\` to escape special characters. For example, `"\n"` represents a newline character. A backslash is also used to escape a backslash itself or a quotation mark. For example: `"a backslash \\ and an escaped \" double quotation mark."`
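
The escape rules above can be checked with a short sketch. The variable names (`two_lines`, `path`, `quote`) and their values are made up for illustration.

```python
# Each escape sequence below is stored as a single character.
two_lines = "first\nsecond"   # \n is a newline
path = "C:\\temp"             # \\ is one literal backslash
quote = "She said \"hi\""     # \" is a double quote inside a double-quoted string

print(two_lines)   # prints "first" and "second" on separate lines
print(path)        # C:\temp
print(quote)       # She said "hi"
print(len("\n"))   # 1
```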

## Built-in String Operations

A programmer can access a character at a specific index using `alphabet[index]`. `index` starts from `0` and ends with `len(alphabet) - 1`.

The `len()` built-in function can be used to find the length of a string (and any other sequence type).

Use `+` to concatenate two strings.

Use `in` to determine if a character or a substring exists in a string. It returns `True` or `False`.

Use a `for` loop to iterate over every character.

In [None]:
text = "Hello World"

print(len(text))  # 11

print(text + " to every one.")  # Hello World to every one.

print("H" in text, "hi" in text) # True False

for char in text:
    print(char, end=" ")    
# H e l l o   W o r l d 

## Slicing

Python has a special slicing syntax `sequence[start:stop:step]` to get a subset of a sequence.

- `start`: optional, the index of the first item in the slice. Defaults to `0`.
- `stop`: optional, the index where the slice ends; the item at `stop` is not included. Defaults to `len(sequence)`.
- `step`: optional, the step value with a default of `1`. The defaults listed for `start` and `stop` apply when `step` is positive.

You can slice a string to get another string. 

The same slicing syntax works on other sequence types, such as lists and tuples.

In [None]:
text = "hello world"

print(text[2]) # regular index: "l"
print(text[2:]) # from index 2 to end: "llo world"
print(text[0:3]) # first three: "hel"
print(text[::4]) # every fourth character: "hor"
print(text[3::2]) # from index 3, every other character: "l ol"
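
As a sketch of slicing on other sequence types, the example below applies the same syntax to a list and a tuple. The data is made up for illustration. The negative `step` and negative index are standard Python behavior that this section does not otherwise cover.

```python
numbers = [10, 20, 30, 40, 50]
point = (1, 2, 3, 4)

print(numbers[1:4])   # [20, 30, 40]
print(numbers[::2])   # [10, 30, 50]
print(point[:2])      # (1, 2): slicing a tuple returns a tuple

# A negative step walks backward; a negative index counts from the end.
print(numbers[::-1])  # [50, 40, 30, 20, 10]
print(numbers[-2:])   # [40, 50]
```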

## Common String Methods

Because string is an immutable object, any method that changes the string content will create a new string.

In [18]:
hi = "Hi"
text = "hello world"

# find the index of the first occurrence of a substring, -1 if not found
print(text.find('o')) # 4
print(text.find("alice")) # -1

# Lower case, upper case and title case
print(hi.lower()) # hi
print(text.upper()) # HELLO WORLD
print(text.title()) # Hello World

# split a string into a list by the specified separator string
# default is white space
print(text.split()) # ["hello", "world"]
print(text.split("ll"))  # ["he", "o world"]

# replace a substr with another substr
print(hi.replace('i', 'a')) # Ha
print(text.replace("world", "alice")) # hello alice

# join a list of strings together with the desired separator string
print(", ".join(["alice", "bob", "cindy"])) # alice, bob, cindy

4
-1
hi
HELLO WORLD
Hello World
['hello', 'world']
['he', 'o world']
Ha
hello alice
alice, bob, cindy





## 3.2 List Basics

A **list** is a sequence container (similar to a string) that groups multiple values.

- Use `my_list = ['foo', 'bar' ]` to create a list of two elements.
- Use `my_list[index]` to access a single element at the `index`, starting from `0`.
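- A list is **mutable**:
  - Assign to an index to replace an element: `my_list[1] = 'b'`.
  - Use the `append`, `pop`, and `remove` methods to add or remove elements.
- Functions and operators that work with a list: `len`, `min`, `+`, etc.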
- List methods: `index(val)`, `count(val)`
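
The points above can be combined into one short sketch. The element values are arbitrary examples.

```python
my_list = ['foo', 'bar']
my_list[1] = 'b'              # replace an element: ['foo', 'b']
my_list.append('baz')         # ['foo', 'b', 'baz']
my_list.append('foo')         # ['foo', 'b', 'baz', 'foo']

print(my_list.count('foo'))   # 2
print(my_list.index('baz'))   # 2

last = my_list.pop()          # removes and returns the last element, 'foo'
my_list.remove('b')           # removes the first element equal to 'b'
print(my_list)                # ['foo', 'baz']
print(len(my_list), min(my_list))  # 2 baz
```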

## 3.3 Tuple Basics

- A tuple is an unnamed immutable composite value: `(1, 3, 5)`.
- Import `namedtuple` from the `collections` module to define a simple new data type that consists of named attributes.

Built-in functions such as `len` and methods such as `index` and `count` work with a tuple.
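
A minimal sketch of both kinds of tuple follows. The `Car` type and its fields are hypothetical, chosen only to illustrate `namedtuple`.

```python
from collections import namedtuple

point = (1, 3, 5)
print(point[0], len(point))        # 1 3

# Car is a hypothetical data type with three named attributes.
Car = namedtuple('Car', ['make', 'model', 'year'])
my_car = Car('Toyota', 'Corolla', 2020)
print(my_car.make, my_car.year)    # Toyota 2020
print(my_car[1])                   # Corolla (index access still works)

# Tuples are immutable, so item assignment raises a TypeError.
try:
    point[0] = 99
except TypeError as error:
    print(f'Cannot modify a tuple: {error}')
```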

## 3.4 Set Basics

A set is an unordered collection of unique elements. Sets have the following properties:

- Elements are **unordered**: Elements in the set do not have a position or index.
- Elements are **unique**: No elements in the set share the same value.

There are functions and methods that work with a set. Using the `in` operator to check membership is probably the most common set operation.
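
The sketch below illustrates uniqueness and membership checks. The page names are made-up sample data.

```python
# Duplicate values are dropped when the set is created.
visited = {'home', 'about', 'home', 'contact'}
print(len(visited))           # 3

print('about' in visited)     # True
print('blog' in visited)      # False

visited.add('blog')           # adds a new element
visited.add('home')           # no effect, 'home' is already in the set
print(len(visited))           # 4

# set() removes duplicates from a list; sorted() gives a predictable order.
unique_numbers = set([1, 2, 2, 3, 3, 3])
print(sorted(unique_numbers)) # [1, 2, 3]
```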

## 3.5 Dictionary Basics

A dictionary is a Python container used to describe associative relationships. A dictionary associates (or "maps") **keys** with **values**.

Two important observations:

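- A list can be viewed as a special kind of dictionary whose keys are the integer indexes `0`, `1`, `2`, and so on. Can you see the link?
- A dictionary can serve as the basis of object-oriented programming.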

Functions and methods support the **CRUD** operations (create, read, update, delete) on a dictionary.
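
A sketch of the create, read, update, and delete operations on a dictionary follows. The `prices` data is hypothetical.

```python
prices = {'apple': 1.5, 'banana': 0.5}   # create

print(prices['apple'])                   # read: 1.5
print(prices.get('cherry', 0))           # read with a default for a missing key: 0

prices['banana'] = 0.75                  # update an existing key
prices['cherry'] = 3.0                   # add a new key
del prices['apple']                      # delete a key

print(prices)                            # {'banana': 0.75, 'cherry': 3.0}

# Iterate over key and value pairs.
for fruit, price in prices.items():
    print(fruit, price)
```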

# List

This section introduces the list data type and basic list operations.

## 1 Introduction to Lists

A `list` is an object that contains multiple data items, one after another. Therefore a list is also a sequence. Each item in a list is called an `element`.

There are two common ways to create a list. First, you can create a list literally by listing elements in brackets and separating by commas. For example:

In [None]:
# a list of numbers
some_numbers = [3, 5, 7]

# a list of strings
names = ['Alice', 'Bob', 'Cindy']

# elements can be of different types
some_data = [3, 'Alice', 12.5] 

Second, Python has a built-in `list()` function that can convert certain types of objects to lists. For example, you can convert a range object or a string to a list as follows:

In [None]:
generated_numbers = list(range(3, 8, 2))
letters = list('abc')

# print can display a list directly
print(generated_numbers)  # [3, 5, 7]
print(letters)  # ['a', 'b', 'c']

## 2 Accessing List Elements

A list has a sequence of elements. Common tasks are accessing one element, all elements, or some elements.

### 2.1 Indexing

To access a single element of a list, use an `index`. Each element in a list has an index associated with it, starting from `0`. The first element has an index of `0`, the second element has an index of `1`, and so on. The last element has an index of the list length minus `1`.

The syntax is to put an index in a pair of brackets, right after the list variable name.

In [None]:
numbers = [3, 5, 7]

print(numbers[0], numbers[1], numbers[2])  # 3 5 7

Python supports negative indexes. `-1` identifies the last element in a list, `-2` identifies the next-to-last element, and so on. The following statement prints the elements in reverse order.

In [None]:
numbers = [3, 5, 7]

print(numbers[-1], numbers[-2], numbers[-3])  # 7 5 3

A common error in Python is the `IndexError`, which occurs when an index is outside the bounds of a list. The valid indexes of `[3, 5, 7]` range from `-3` to `2`. Outside this range, the code stops with an error: `IndexError: list index out of range`.

In [None]:
numbers = [3, 5, 7]

print(numbers[5])  # raises IndexError: list index out of range
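
One way to avoid the crash is to check the index before using it, or to catch the exception. The sketch below assumes a hypothetical `index` variable that may hold an out-of-range value. `try`/`except` is standard Python, although this section does not cover it.

```python
numbers = [3, 5, 7]
index = 5  # hypothetical index that may be out of range

# Option 1: check that the index is within the valid range first.
if -len(numbers) <= index < len(numbers):
    print(numbers[index])
else:
    print(f'Index {index} is out of range')

# Option 2: attempt the access and handle the IndexError.
try:
    value = numbers[index]
except IndexError as error:
    value = None
    print(f'Caught: {error}')  # Caught: list index out of range
```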

### 2.2 Accessing All Elements

You can use an index to access all elements in a loop, one at a time. Python has a built-in function `len` to get the length of a list. Therefore, you can iterate over a list as follows:

In [None]:
numbers = [3, 5, 7]
length = len(numbers)
for index in range(length):
    print(f'Index: {index}, Value: {numbers[index]}')

The method above is useful when you need the index, for example to show the position of each element in a sorted list. If you don't need the index, Python provides an easier way to iterate over a list using a `for` loop:

In [None]:
numbers = list(range(3, 8, 2))
for number in numbers:
    doubled = number * 2
    print(f'Double of element {number} is {doubled}')

The `for` loop is preferred when you don't need the index because it is simpler than looping with index. It is also less error-prone because you don't need to check the index boundaries.

### 2.3 Slicing a List

A `slice` is a span of items taken from a list. It is used to select some elements from a list. To slice a list, use `list_name[start : end]` to specify the start index and the end index. Like the `range` syntax, the slice doesn't include the item at the `end` index. The following are some examples:

In [None]:
days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']

# index from 0 to 5, excluding 5
weekday = days[0:5]
print(f'Weekdays are: {weekday}')

# default start is 0
weekday2 = days[:5]
print(f'Weekdays version 2 are: {weekday2}')

weekends = days[5:7]
print(f'Weekends are {weekends}')

# default end is the length
weekends2 = days[5:]
print(f'Weekends version 2 are {weekends2}')

## 3 List Operations

There are three types of operations to manipulate a list.

- operators: Python language provides operators such as `in`, `not in`, `+`, `*`, and `del` that work with a list.
- built-in functions: Python provides some built-in functions that take list(s) as argument(s). For example: `len`, `min`, `max` etc.
- list methods: you use `list_name.method_name()` to perform operations. 

### 3.1 Python Operators

- `in`: check if an item is a list element.
- `not in`: check if an item is not a list element.
- `+`: combine two lists.
- `*`: repeat a list a number of times.
- `del`: delete an element from a list.


In [None]:
days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
weekday = days[:5] 

today = 'Thu'
is_weekday = today in weekday
print(f'Today is {today}. Is weekday: {is_weekday}')

is_weekend = today not in weekday
print(f'Today is {today}. Is weekend: {is_weekend}')



To combine two or more lists, use `+` to create a concatenated list. To repeat a list a number of times and create a new list, use `*`.

In [None]:
tens = [10, 20, 30]
hundreds = [500, 600, 700]
tens_and_hundreds = tens + hundreds
print(tens_and_hundreds)  # [10, 20, 30, 500, 600, 700]

repeated_tens = tens * 3
print(repeated_tens)  # [10, 20, 30, 10, 20, 30, 10, 20, 30]

The `del` operator deletes the list element at the specified index.


In [None]:
numbers = [3, 5, 7]
del numbers[1]
print(f'After deleting the element at index 1, numbers are: {numbers}')  # [3, 7]

### 3.2 Built-in Functions

Some commonly used built-in functions are:

- `len`: get the length of a list
- `min`: get the minimum element of a list
- `max`: get the maximum element of a list
- `sum`: get the sum of the numeric elements of a list

In [None]:
numbers = [3, 5, 7]
length = len(numbers)
smallest = min(numbers)
biggest = max(numbers)
total = sum(numbers)
print(f'Length: {length}, Min: {smallest}, Max: {biggest}, Sum: {total}')

You can find more functions, such as `mean` and `median`, in the Python [`statistics` module](https://docs.python.org/3/library/statistics.html).
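
As a brief sketch, `mean` and `median` can be imported from the `statistics` module and called on a list of numbers. The sample values are arbitrary.

```python
from statistics import mean, median

numbers = [3, 5, 7, 42]
print(mean(numbers))    # 14.25
print(median(numbers))  # 6.0, the average of the two middle values 5 and 7
```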

### 3.3 List Methods

Lists have numerous methods that you can use to manipulate a list. You use `list_name.method_name()` to call a method that works on a list. Some commonly used methods are:

- `append(element)`: add an element to the end of the list.
- `index(element)`: find the first index of an element, raising a `ValueError` if the element is not found. To avoid the exception, use `element in list_name` to check for existence first.
- `insert(index, element)`: insert an element at the specified index.
- `sort()`: sort the elements of the list in ascending order.

You can find more list methods in the [Python List Document](https://docs.python.org/3/tutorial/datastructures.html). The following are some code samples of the methods above. Most methods make an **in-place modification** of a list. It is important to differentiate an in-place modification from an operation that returns a new list. Both `+` and slicing return a new list.

In [None]:
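numbers = [3, 5, 7]
n2 = numbers      # n2 refers to the same list object as numbers
n3 = numbers[:2]  # slicing creates a new list: [3, 5]
numbers.append(42)
print(numbers) # [3, 5, 7, 42]

if 5 in numbers:
    print(numbers.index(5)) # 1

numbers.insert(1, 50)
print(numbers) # [3, 50, 5, 7, 42]

numbers.sort()
print(numbers) # [3, 5, 7, 42, 50]
print(n2) # [3, 5, 7, 42, 50], in-place changes are visible through n2
print(n3) # [3, 5], the slice is a separate list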
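
To make the difference between in-place modification and returning a new list more concrete, the sketch below contrasts the `sort()` method with the built-in `sorted()` function. `sorted()` is a standard built-in that this section does not list. The variable names are illustrative.

```python
original = [7, 3, 5]
alias = original        # another name for the same list object
snapshot = original[:]  # slicing creates a new, independent list

sorted_copy = sorted(original)  # returns a new list, original is unchanged
print(original)      # [7, 3, 5]
print(sorted_copy)   # [3, 5, 7]

result = original.sort()  # sorts in place and returns None
print(result)        # None
print(original)      # [3, 5, 7]
print(alias)         # [3, 5, 7], the alias sees the in-place change
print(snapshot)      # [7, 3, 5], the copy is unaffected
print(alias is original, snapshot is original)  # True False
```

A common mistake is writing `numbers = numbers.sort()`, which replaces the list with `None`.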

## 4 Loop Through a List and List Comprehension

You can loop through the items of a list by using a `for` loop, such as `for x in [1, 2, 3]`.

You can also loop through the index numbers: `for x in range(len(my_list))`.

If you need both the index and the item in a loop, use `for index, item in enumerate(my_list)`.

A list comprehension is an expression that generates a new list from an input list or another collection. A list comprehension can have three parts:

- an iteration expression: this is a standard `for` clause like `for item in input_list`.
- an expression for the return data. For example, `item * 2` returns the doubled value.
- an optional `if` clause that filters items based on a boolean expression. For example: `if item < 10`.

Together, the syntax is `[result_expression iteration_expression if_clause]`. For example, the following code loops over a list in the three ways above, then generates a list that has the doubled values of all even numbers.


In [None]:
input_list = [1, 2, 3, 4, 5, 6, 7]
for n in input_list:
    print(n)

for index in range(len(input_list)):
    print(index)


for index, item in enumerate(input_list):
    print(index, item)

new_list = [item * 2 for item in input_list if item % 2 == 0]
print(new_list)  # [4, 8, 12]


Exercise: use a `while` loop to iterate over the list above.

The iteration expression can also iterate over other data, such as a range or a tuple.

In [None]:
new_list2 = [item * 2 for item in range(10, 20) if item % 2 == 0]
print(new_list2)  # [20, 24, 28, 32, 36]
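
As a sketch of a comprehension over a tuple, the example below filters a tuple of hypothetical scores and compares the result with the equivalent explicit loop.

```python
scores = (55, 82, 91, 47)  # hypothetical tuple data

# List comprehension: keep the scores that are at least 60.
passed = [score for score in scores if score >= 60]
print(passed)  # [82, 91]

# The equivalent for loop builds the same list step by step.
passed_loop = []
for score in scores:
    if score >= 60:
        passed_loop.append(score)
print(passed_loop == passed)  # True
```

The result is always a list, even when the input is a tuple or a range.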

## 5 Nested List

A list can have other lists as its elements. Nested lists need no special syntax: you use one index per nesting level to access an element. For example:

In [None]:
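numbers = [1, [2, 3], [4, 5, 6]]
numbers[1].append(42)  # the inner list at index 1 becomes [2, 3, 42]
del numbers[2][2]      # remove 6 from the inner list at index 2
print(numbers)  # [1, [2, 3, 42], [4, 5]]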
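
A nested list is often used as a table of rows. The sketch below uses a hypothetical `matrix` to show chained indexing, nested loops, and flattening with a list comprehension.

```python
matrix = [[1, 2, 3], [4, 5, 6]]  # hypothetical 2 x 3 table

print(matrix[1][0])  # 4: row index 1, then column index 0
matrix[0][2] = 30    # replace an element in the first row

# The outer loop visits each row, the inner loop visits each value.
for row in matrix:
    for value in row:
        print(value, end=' ')
    print()

# Flatten the nested list into a single list.
flat = [value for row in matrix for value in row]
print(flat)  # [1, 2, 30, 4, 5, 6]
```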

## 6 Using List as a Stack

A stack is a data structure that stores elements in a last-in, first-out (LIFO) manner. For example, the Python runtime uses a stack to manage function calls, known as the **call stack**. A stack supports two basic operations:

- `push` (`append` for a list): adds an element to the top of a stack.
- `pop`: removes and returns the element at the top of a stack.

When using a list to implement a stack, the **top** is the end of the list. You can `append` (also called push in a stack) an element and `pop` an element. Both operate at the end of the list.

In [None]:
numbers = [1, 2, 3]
numbers.append(37)   # push 37
numbers.append(42)   # push 42: [1, 2, 3, 37, 42]
top = numbers.pop()  # pop removes and returns the last element
print(top)  # 42
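
To show the LIFO order more clearly, the sketch below pushes several items and then pops until the stack is empty. The page names are made-up data. Using `stack[-1]` to peek at the top is an addition not covered above.

```python
stack = []
for page in ['home', 'search', 'detail']:
    stack.append(page)   # push each page onto the stack

print(stack[-1])         # peek at the top without removing it: detail

popped = []
while stack:             # an empty list is falsy, so the loop stops when the stack is empty
    popped.append(stack.pop())

print(popped)            # ['detail', 'search', 'home']
```

Calling `pop()` on an empty list raises an `IndexError`, so checking that the list is not empty first avoids the error.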