# Built-In Data Structures, Functions, and Files

This notebook is based on [Chapter 3](https://wesmckinney.com/book/python-builtin) of *Python for Data Analysis* (3rd ed.) by Wes McKinney.

## Data Structures and Sequences

### List

Lists are **variable length** and **mutable**: their contents can be modified in place. You can define them using square brackets **`[]`** or the built-in **`list`** type:

**Creating a list**

In [1]:
# Create a list
a_list = [2, 3, 7, None]

In [2]:
a_list

[2, 3, 7, None]

In [3]:
# Create a list using list
tup = ("foo", "bar", "baz")    # This is a tuple
b_list = list(tup)

In [4]:
b_list

['foo', 'bar', 'baz']

In [5]:
# Assign a new value to the element at index 1
b_list[1] = "peekaboo"

In [6]:
b_list

['foo', 'peekaboo', 'baz']

*Lists and tuples are semantically similar (though tuples cannot be modified) and can be used interchangeably in many functions.*
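As a small sketch of the difference noted above, the following tries the same item assignment on a list and on a tuple. The names `colors_list` and `colors_tuple` are illustrative and do not come from the original notebook.

```python
# Illustrative names; not part of the original notebook
colors_list = ["foo", "bar", "baz"]
colors_tuple = ("foo", "bar", "baz")

# Item assignment works on a list
colors_list[1] = "peekaboo"

# The same assignment on a tuple raises TypeError
try:
    colors_tuple[1] = "peekaboo"
    tuple_modified = True
except TypeError:
    tuple_modified = False

# Both types can be passed to functions that accept any sequence, such as len
same_length = len(colors_list) == len(colors_tuple)
```

Here `tuple_modified` ends up `False`, while the list holds the new value.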

*The built-in **`list`** function is frequently used in data processing to **materialize an iterator or generator expression**:*

In [11]:
# Create a range
gen = range(10)

In [12]:
gen

range(0, 10)

In [13]:
# Materialize the range as a list
list(gen)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
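The cells above use a `range`. As a minimal sketch of the generator-expression case mentioned in the text, the example below materializes a generator with `list`. The names `squares`, `squares_list`, and `leftover` are illustrative.

```python
# A generator expression produces values lazily, one at a time
squares = (n * n for n in range(5))

# list() consumes the generator and stores every value
squares_list = list(squares)

# The generator is now exhausted, so a second pass yields nothing
leftover = list(squares)
```

Unlike a generator, a `range` object can be materialized repeatedly, which is why `list(gen)` in the cell above can be rerun with the same result.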

**Adding and removing elements**

*Elements can be appended to the end of the list with the `append` method.*

In [14]:
# Append an element to the end of the list
b_list.append("dwarf")

In [15]:
b_list

['foo', 'peekaboo', 'baz', 'dwarf']

*Using `insert` you can insert an element at a specific location in the list.*

In [16]:
# Insert an element at a specific location in the list
b_list.insert(1, "red")

In [17]:
b_list

['foo', 'red', 'peekaboo', 'baz', 'dwarf']

*The inverse operation to `insert` is `pop`, which removes and returns an element at a particular index.*

In [18]:
# Remove and return the element at index 2
b_list.pop(2)

'peekaboo'

In [19]:
b_list

['foo', 'red', 'baz', 'dwarf']
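To illustrate why `pop` can be read as the inverse of `insert`, here is a short sketch on a fresh list (the name `letters` is illustrative). It also shows `pop` called with no argument, which removes the last element.

```python
letters = ["foo", "baz", "dwarf"]

# insert places the value before the element currently at index 1
letters.insert(1, "red")

# pop at the same index removes that value and returns it
removed = letters.pop(1)

# With no argument, pop removes and returns the last element
last = letters.pop()
```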

*Elements can be removed by value with `remove`, which locates the first such value and removes it from the list.*

In [20]:
# Setting up: Append 'foo' to the list
b_list.append("foo")

In [21]:
b_list

['foo', 'red', 'baz', 'dwarf', 'foo']

In [22]:
# Remove the first 'foo' from the list with the remove method
b_list.remove("foo")

In [23]:
b_list

['red', 'baz', 'dwarf', 'foo']
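One detail the cells above do not show is that `remove` raises `ValueError` when the value is absent. The sketch below, using an illustrative list named `words`, demonstrates that behavior and one possible way to guard against it.

```python
words = ["red", "baz", "dwarf", "foo"]

# remove raises ValueError when the value is not in the list
try:
    words.remove("peekaboo")
    removed_missing = True
except ValueError:
    removed_missing = False

# Checking membership first avoids the exception
if "baz" in words:
    words.remove("baz")
```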

**Keywords `in` and `not`**

*Check if a list contains a value using the `in` keyword.*

In [24]:
# Check if a list contains a value using the in keyword
"dwarf" in b_list

True

*The keyword `not` can be used to negate `in`.*

In [25]:
# The keyword not can be used to negate in
"dwarf" not in b_list

False
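As a brief sketch, `x not in y` gives the same result as `not (x in y)`. The example also converts the list to a set, which may be worth considering when many membership checks are needed, since a list is scanned element by element while a set uses hashing. The names `names` and `name_set` are illustrative.

```python
names = ["red", "baz", "dwarf", "foo"]

# The two spellings of negation give the same answer
negated_a = "dwarf" not in names
negated_b = not ("dwarf" in names)

# A set supports the same membership syntax with hash-based lookups
name_set = set(names)
in_set = "dwarf" in name_set
missing = "peekaboo" not in name_set
```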

**Concatenating and combining lists**

*Adding two lists together with `+` concatenates them.*

In [26]:
# Adding two lists together with +
[4, None, "foo"] + [7, 8, (2, 3)]

[4, None, 'foo', 7, 8, (2, 3)]

*If you have a list already defined, you can append multiple elements to it using the `extend` method.*

In [40]:
# Append multiple elements using extend method
x = [4, None, "foo"]
x.extend([7, 8, (2, 3)])
x

[4, None, 'foo', 7, 8, (2, 3)]
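The two approaches produce equal lists but differ in what they modify. The following sketch uses a second name, `alias` (illustrative), bound to the same list to make the difference visible: `+` builds a new list, while `extend` changes the existing one in place.

```python
x = [4, None, "foo"]
alias = x  # second name for the same list object

# + builds a new list and leaves x unchanged
combined = x + [7, 8, (2, 3)]
x_unchanged = x == [4, None, "foo"]

# extend modifies the existing list in place, so alias sees the change too
x.extend([7, 8, (2, 3)])
```

Because `extend` avoids creating and copying a new list, it is usually the cheaper choice when building up a large list piece by piece.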

#### Sorting

You can sort a list in place (without creating a new object) by calling its **`sort`** method. The optional `key` argument takes a function that produces the value to sort by:

In [41]:
# Sorting
a = [7, 2, 5, 1, 3]
a.sort()
a

[1, 2, 3, 5, 7]

In [45]:
# Sort a collection of strings by their lengths
b = ["saw", "small", "He", "foxes", "six"]
b.sort(key=len)
b

['He', 'saw', 'six', 'small', 'foxes']
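For comparison, the built-in `sorted` function returns a new sorted list instead of modifying the original. The sketch below contrasts the two, using an illustrative list named `words`. Python's sort is stable, so items with equal keys keep their original relative order, which is why `'small'` stays ahead of `'foxes'`.

```python
words = ["saw", "small", "He", "foxes", "six"]

# sorted returns a new list and leaves the original untouched
by_length = sorted(words, key=len)
original_intact = words == ["saw", "small", "He", "foxes", "six"]

# list.sort works in place and returns None
result = words.sort(key=len, reverse=True)
```

After this runs, `by_length` holds the ascending order shown in the cell above, `words` holds the longest-first order, and `result` is `None`.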
