
# Built-In Data Structures, Functions, and Files

This notebook is based on [Chapter 3](https://wesmckinney.com/book/python-builtin) of *Python for Data Analysis* (3rd ed.) by Wes McKinney.

## Data Structures and Sequences

### Dictionary

The dictionary or __`dict`__ may be the most important built-in Python data structure. A dictionary stores a collection of __*key-value*__ pairs, where __*key*__ and __*value*__ are Python objects. Each key is associated with a value so that a value can be conveniently retrieved, inserted, modified, or deleted given a particular key. One approach for creating a dictionary is to use curly braces __`{}`__ and colons to separate keys and values.

**Creating a dictionary**

In [2]:
# Create an empty dictionary
empty_dict = {}

In [3]:
empty_dict

{}

In [4]:
# Create a dictionary
d1 = {"a": "some value", "b": [1, 2, 3, 4]}

In [5]:
d1

{'a': 'some value', 'b': [1, 2, 3, 4]}
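
*The introduction to this section mentions retrieving, inserting, modifying, and deleting values by key. The following is a minimal sketch of those four operations; the dictionary __`d`__ and the other variable names are illustrative and mirror __`d1`__ only in shape.*

```python
# Illustrative dictionary with the same shape as d1
d = {"a": "some value", "b": [1, 2, 3, 4]}

# Retrieve a value by its key
value_a = d["a"]

# Insert a new key-value pair (keys do not have to be strings)
d[7] = "an integer"

# Modify the value stored under an existing key
d["a"] = "another value"

# Delete a key with del; pop deletes the key and returns its value
del d[7]
removed = d.pop("b")

# Check whether a key is present with the in keyword
has_b = "b" in d
```

*After these steps __`d`__ holds only the key __`"a"`__, and __`removed`__ holds the list that was stored under __`"b"`__.*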

### List

In [3]:
# Create a list from a tuple using the built-in list function
tup = ("foo", "bar", "baz")    # This is a tuple
b_list = list(tup)

In [4]:
b_list

['foo', 'bar', 'baz']

In [5]:
# Assign a new value to a list
b_list[1] = "peekaboo"

In [6]:
b_list

['foo', 'peekaboo', 'baz']

*Lists and tuples are semantically similar (though tuples cannot be modified) and can be used interchangeably in many functions.*
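
*As a small illustration of this point, the sketch below passes a list and a tuple to the same built-in functions, then shows that item assignment fails on the tuple. The variable names are made up for this example.*

```python
values_list = [3, 1, 2]
values_tuple = (3, 1, 2)

# Built-ins such as sum and sorted accept either sequence type
total_list = sum(values_list)
total_tuple = sum(values_tuple)
sorted_from_tuple = sorted(values_tuple)  # sorted always returns a new list

# Item assignment works on a list but raises TypeError on a tuple
values_list[0] = 10
try:
    values_tuple[0] = 10
    tuple_modified = True
except TypeError:
    tuple_modified = False
```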

*The __`list`__ built-in function is frequently used in data processing as a way to __materialize an iterator or generator__ expression:*

In [11]:
# Create a range
gen = range(10)

In [12]:
gen

range(0, 10)

In [13]:
# Materialize the range as a list
list(gen)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
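
*The cells above use a __`range`__, which can be iterated more than once. A generator expression behaves differently: it can be consumed only once. The sketch below, with illustrative names, materializes one with __`list`__ and then shows that a second pass yields nothing.*

```python
# A generator expression produces values lazily, one at a time
squares = (n ** 2 for n in range(5))

# list() pulls every value out of the generator
squares_list = list(squares)

# The generator is now exhausted, so a second list() call gives an empty list
leftover = list(squares)

# A range, by contrast, can be materialized repeatedly
r = range(3)
first_pass = list(r)
second_pass = list(r)
```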

__Adding and removing elements__

*Elements can be appended to the end of the list with the __`append`__ method.*

In [14]:
# Append an element to the end of the list
b_list.append("dwarf")

In [15]:
b_list

['foo', 'peekaboo', 'baz', 'dwarf']

*Using __`insert`__ you can insert an element at a specific location in the list.*

In [16]:
# Insert an element at a specific location in the list
b_list.insert(1, "red")

In [17]:
b_list

['foo', 'red', 'peekaboo', 'baz', 'dwarf']

*The inverse operation to __`insert`__ is __`pop`__, which removes and returns an element at a particular index.*

In [18]:
# Remove and return the element at index 2
b_list.pop(2)

'peekaboo'

In [19]:
b_list

['foo', 'red', 'baz', 'dwarf']
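
*Two further details about __`pop`__ and __`insert`__ are sketched below with an illustrative list: calling __`pop`__ with no argument removes the last element, and a __`pop`__ at the same index undoes an __`insert`__. Inserting near the front of a long list costs more than __`append`__, because the later elements have to shift over.*

```python
letters = ["a", "b", "c", "d"]

# With no index, pop removes and returns the last element
last = letters.pop()

# insert followed by pop at the same index leaves the list as it was
letters.insert(1, "x")
undone = letters.pop(1)
```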

*Elements can be removed by value with __`remove`__, which locates the first such value and removes it from the list.*

In [20]:
# Setting up: Append 'foo' to the list
b_list.append("foo")

In [21]:
b_list

['foo', 'red', 'baz', 'dwarf', 'foo']

In [22]:
# Remove the first 'foo' from the list with the remove method
b_list.remove("foo")

In [23]:
b_list

['red', 'baz', 'dwarf', 'foo']
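
*One thing worth knowing about __`remove`__ is that it raises __`ValueError`__ when the value is not in the list. The sketch below uses an illustrative list and catches that error.*

```python
names = ["foo", "red", "baz", "foo"]

# Only the first matching value is removed
names.remove("foo")

# Removing a value that is not present raises ValueError
try:
    names.remove("missing")
    missing_raised = False
except ValueError:
    missing_raised = True
```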

**Keywords `in` and `not`**

*Check if a list contains a value using the __`in`__ keyword.*

In [24]:
# Check if a list contains a value using the in keyword
"dwarf" in b_list

True

*The keyword __`not`__ can be used to negate __`in`__.*

In [25]:
# The keyword not can be used to negate in
"dwarf" not in b_list

False
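
*A membership test on a list scans the elements one by one, so it slows down as the list grows. Sets and dictionaries use hashing instead. When the same collection is checked many times, converting it to a set first is a common pattern; the names below are illustrative.*

```python
words = ["red", "baz", "dwarf", "foo"]

# Linear scan through the list
in_list = "dwarf" in words

# Build a set once, then test membership by hash lookup
word_set = set(words)
in_set = "dwarf" in word_set
absent = "peekaboo" not in word_set
```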

**Concatenating and combining lists**

*Adding two lists together with __`+`__ concatenates them.*

In [26]:
# Adding two lists together with +
[4, None, "foo"] + [7, 8, (2, 3)]

[4, None, 'foo', 7, 8, (2, 3)]

*If you have a list already defined, you can append multiple elements to it using the **`extend`** method.*

In [1]:
# Append multiple elements using extend method
x = [4, None, "foo"]
x.extend([7, 8, (2, 3)])

In [2]:
x

[4, None, 'foo', 7, 8, (2, 3)]
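
*The two approaches differ in one respect: __`+`__ builds a new list, while __`extend`__ modifies the existing list in place. The sketch below makes the difference visible through a second name bound to the same list. All names here are illustrative.*

```python
# extend changes the existing object, so every name bound to it sees the change
x = [4, None, "foo"]
x_alias = x
x.extend([7, 8])
same_object_after_extend = x_alias is x

# + creates a new list; the original object is left untouched
y = [4, None, "foo"]
y_alias = y
y = y + [7, 8]
same_object_after_plus = y_alias is y
```

*Because __`+`__ copies both operands into a new list, __`extend`__ is usually the cheaper choice when building up a large list step by step.*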

**Sorting**

*You can sort a list in place (without creating a new object) by calling its __`sort`__ method:*

In [3]:
# Sorting
a = [7, 2, 5, 1, 3]
a.sort()

In [4]:
a

[1, 2, 3, 5, 7]
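
*Because __`sort`__ works in place, it returns __`None`__. The built-in __`sorted`__ function is the counterpart that returns a new sorted list and leaves the input unchanged. A short sketch with illustrative names:*

```python
a = [7, 2, 5, 1, 3]

# sorted returns a new list; a itself is not modified
a_sorted = sorted(a)
a_unchanged = list(a)

# sort modifies a in place and returns None
result = a.sort()
```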

*__`sort`__ has a few options that will occasionally come in handy. For example, we could sort a collection of strings by their lengths.*

In [5]:
# Sort a collection of strings by their lengths
b = ["saw", "small", "He", "foxes", "six"]
b.sort(key=len)

In [6]:
b

['He', 'saw', 'six', 'small', 'foxes']
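
*Python's sort is stable, which is why __`'saw'`__ stays ahead of __`'six'`__ above: both have length 3 and keep their original relative order. The sketch below shows two more options, __`reverse=True`__ and a case-insensitive __`key`__; the variable names are illustrative.*

```python
b = ["saw", "small", "He", "foxes", "six"]

# Longest first; strings of equal length keep their original order
by_length_desc = sorted(b, key=len, reverse=True)

# The default sort compares characters, so uppercase letters come first
default_order = sorted(b)

# Case-insensitive sort using str.lower as the key
case_insensitive = sorted(b, key=str.lower)
```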

__Slicing__

*You can select sections of most sequence types by using slice notation, which in its basic form consists of __`start:stop`__ passed to the indexing operator __`[]`__.*

In [2]:
# Set up
seq = [7, 2, 3, 7, 5, 6, 0, 1]

In [3]:
# Slicing
seq[1:5]

[2, 3, 7, 5]

*Slices can also be assigned with a sequence.*

In [4]:
# Slice assigned with a sequence
seq[3:5] = [6, 3]

In [5]:
seq

[7, 2, 3, 6, 3, 6, 0, 1]
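
*The assigned sequence does not need to have the same length as the slice it replaces, so slice assignment can also grow or shrink a list. A minimal sketch on a fresh copy of the list, with an illustrative name:*

```python
nums = [7, 2, 3, 7, 5, 6, 0, 1]

# Replace two elements with four; the list grows by two
nums[3:5] = [6, 3, 9, 9]
grown = list(nums)

# Assign an empty list to delete the four elements at positions 3 to 6
nums[3:7] = []
```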

*While the element at the __`start`__ index is included, the __`stop`__ index is not included, so that the number of elements in the result is __`stop - start`__.*
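
*The __`stop - start`__ count can be checked directly. It holds when both indices fall within the sequence; an index past the end is clipped rather than raising an error. The names below are illustrative.*

```python
items = [7, 2, 3, 6, 3, 6, 0, 1]

start, stop = 1, 5
piece = items[start:stop]

# The slice holds exactly stop - start elements
count_matches = len(piece) == stop - start

# A stop index beyond the end is clipped to the length of the list
clipped = items[5:100]
```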

*Either the __`start`__ or __`stop`__ can be omitted, in which case they default to the start of the sequence and the end of the sequence, respectively.*

In [6]:
# Start of sequence
seq[:5]         # seq[0:5]

[7, 2, 3, 6, 3]

In [8]:
# End of sequence
seq[3:]         # seq[3:8]  (The no. of elements in the list is 8)

[6, 3, 6, 0, 1]

*__Negative indices__ slice the sequence relative to the end.*

In [21]:
# The last element
seq[-1:]

[1]

In [17]:
# The second-to-last element
seq[-2:-1]

[0]

In [18]:
# From the third-to-last element to the end
seq[-3:]

[6, 0, 1]

In [19]:
# The third-to-last and second-to-last elements
seq[-3:-1]

[6, 0]
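
*One way to read a negative index __`-k`__ is as __`len(seq) - k`__. The sketch below checks that equivalence on a fresh copy of the list and contrasts a negative slice, which returns a list, with a negative index, which returns a single element. The names are illustrative.*

```python
items = [7, 2, 3, 6, 3, 6, 0, 1]
n = len(items)

# A negative index -k points at position n - k
tail_equal = items[-3:] == items[n - 3:]
middle_equal = items[-3:-1] == items[n - 3:n - 1]

# Slicing returns a list; plain indexing returns the element itself
last_as_list = items[-1:]
last_element = items[-1]
```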

*A __`step`__ can also be used after a second colon to, say, take every other element.*

In [23]:
# Step
seq[::2]

[7, 3, 3, 0]

*Use __`step`__ __`-1`__ to reverse a list or tuple.*

In [24]:
# Reverse a list
seq[::-1]

[1, 0, 6, 3, 6, 3, 2, 7]
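
*A few more step variations are sketched below with illustrative names: a step combined with a start index, a negative step other than __`-1`__, and reversing a tuple. Note that slicing with __`[::-1]`__ returns a reversed copy, whereas the list method __`reverse`__ reverses the list in place.*

```python
items = [7, 2, 3, 6, 3, 6, 0, 1]

# Every other element, starting from index 1
odd_positions = items[1::2]

# Every other element, walking backwards from the end
backwards_by_two = items[::-2]

# Slicing a tuple with step -1 gives a reversed tuple
reversed_tuple = ("foo", "bar", "baz")[::-1]

# [::-1] makes a copy; list.reverse() changes the list itself
reversed_copy = items[::-1]
original_after_slice = list(items)
items.reverse()
```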