# Built-In Data Structures, Functions, and Files

This notebook is based on [Chapter 3](https://wesmckinney.com/book/python-builtin) of *Python for Data Analysis (3rd ed.)* by *Wes McKinney*.

## Built-In Data Structures

### Built-In Sequence Functions

__`enumerate()` function__

*The __`enumerate()`__ function takes an iterable (e.g., a tuple) and returns an enumerate object. This object is an iterator that yields `(index, value)` pairs, with the counter starting at `0` by default.*

In [11]:
# Use the enumerate() function
x = ('apple', 'banana', 'cherry')
y = enumerate(x)

In [12]:
y

<enumerate at 0x1bc9e62e9d0>

In [13]:
print(list(y))

[(0, 'apple'), (1, 'banana'), (2, 'cherry')]


*It's common when iterating over a sequence to want to keep track of the index of the current item. A do-it-yourself approach would look like:*

```Python
# do-it-yourself approach
index = 0
for value in collection:
    # do something with value
    index += 1
```

*Python has a built-in function, __`enumerate`__, which returns a sequence of __`(i, value)`__ tuples and can simplify the above code.*

```Python
# Use enumerate() function
for index, value in enumerate(collection):
    # do something with value
    pass  # placeholder so the loop body is valid Python
```

In [16]:
# Example
for family in enumerate(['Lok Lok', 'Ka Ka', 'Bailey', 'Mui Mui', 'Moji']):
    print(family)

(0, 'Lok Lok')
(1, 'Ka Ka')
(2, 'Bailey')
(3, 'Mui Mui')
(4, 'Moji')
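
*As a small sketch that goes slightly beyond the example above: `enumerate()` also accepts an optional `start` argument, and a common pattern is to build a dictionary that maps each value to its position. The names `fruits`, `numbered`, and `mapping` are illustrative only.*

```python
fruits = ["apple", "banana", "cherry"]

# Count from 1 instead of the default 0
numbered = list(enumerate(fruits, start=1))
print(numbered)

# Map each value to its index in the sequence
mapping = {}
for index, value in enumerate(fruits):
    mapping[value] = index
print(mapping)
```

*Unpacking the pair directly in the `for` statement (`for index, value in ...`) avoids indexing into each tuple by hand.*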


__`sorted()` function__

*The __`sorted()`__ function returns a new sorted list from the elements in any sequence.*

In [17]:
# sorted() function
sorted([7, 1, 2, 6, 0, 3, 2])

[0, 1, 2, 2, 3, 6, 7]

In [18]:
# sorted() function
sorted("horse race")

[' ', 'a', 'c', 'e', 'e', 'h', 'o', 'r', 'r', 's']
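
*The following sketch shows two optional arguments that `sorted()` accepts, `key` and `reverse`, and checks that the input is left unchanged. The list `words` is made up for illustration.*

```python
words = ["saw", "small", "He", "foxes", "six"]

# Sort by length; items of equal length keep their original relative order
by_length = sorted(words, key=len)
print(by_length)

# Sort in descending order
descending = sorted([7, 1, 2, 6, 0, 3, 2], reverse=True)
print(descending)

# sorted() returns a new list, so the original is not modified
print(words)
```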

__`zip()` function__

*__`zip()`__ "pairs" up the elements of a number of lists, tuples, or other sequences. It returns an iterator of tuples, which can be collected into a list with `list()`.*

In [3]:
# Using zip() to pair up list
seq1 = ["foo", "bar", "baz"]
seq2 = ["one", "two", "three"]
zipped = zip(seq1, seq2)

In [5]:
list(zipped)

[('foo', 'one'), ('bar', 'two'), ('baz', 'three')]

*__`zip()`__ can take an arbitrary number of sequences, and the number of elements it produces is determined by the shortest sequence.*

In [6]:
# Pair up more than two sequences
seq3 = [False, True]
list(zip(seq1, seq2, seq3))

[('foo', 'one', False), ('bar', 'two', True)]

*A common use of __`zip()`__ is simultaneously iterating over multiple sequences, possibly also combined with __`enumerate`__.*

In [7]:
# Use zip() and enumerate() together
for index, (a, b) in enumerate(zip(seq1, seq2)):
    print(f"{index}: {a}, {b}")

0: foo, one
1: bar, two
2: baz, three
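
*Going the other way, `zip()` combined with the `*` unpacking operator can "unzip" a list of pairs into separate tuples. This is a minimal sketch, and `pairs`, `firsts`, and `seconds` are illustrative names.*

```python
pairs = [("foo", "one"), ("bar", "two"), ("baz", "three")]

# zip(*pairs) is the same as zip(pairs[0], pairs[1], pairs[2])
firsts, seconds = zip(*pairs)
print(firsts)
print(seconds)
```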


__`reversed()` function__

*__`reversed()`__ iterates over the elements of a sequence in reverse order.*

In [8]:
# Reverse a sequence
list(reversed(range(10)))

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

*Keep in mind that __`reversed()`__ returns a lazy iterator, so it does not create the reversed sequence until materialized (e.g., with `list` or a `for` loop).*
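
*One consequence of this laziness, sketched below with illustrative names, is that the object returned by `reversed()` can be consumed only once. A second pass over the same object yields nothing.*

```python
rev = reversed([1, 2, 3])

first_pass = list(rev)   # materializes the reversed elements
second_pass = list(rev)  # the iterator is already exhausted
print(first_pass)
print(second_pass)
```

*If the reversed values are needed more than once, store the result of `list(reversed(...))` instead of the iterator itself.*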

### List, Set, and Dictionary Comprehensions

*__List comprehensions__ are a convenient and widely used Python language feature. They allow you to concisely form a new list by filtering the elements of a collection and transforming the elements that pass the filter, all in one concise expression. They take the basic form:*

```Python
# List comprehensions basic form
[expr for value in collection if condition]
```

*This is equivalent to the following `for` loop:*

```Python
# The same logic written as a for loop
result = []
for value in collection:
    if condition:
        result.append(expr)
```

*For example, given a list of strings, we could filter out strings with length `2` or less and convert the remaining strings to uppercase.*

In [9]:
# List comprehensions example
strings = ["a", "as", "bat", "car", "dove", "python"]
[x.upper() for x in strings if len(x) > 2]

['BAT', 'CAR', 'DOVE', 'PYTHON']
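
*To check the equivalence described above, this sketch builds the same result with an explicit `for` loop and compares it with the comprehension. The variable names `result` and `comprehension` are illustrative.*

```python
strings = ["a", "as", "bat", "car", "dove", "python"]

# Explicit loop: filter first, then transform
result = []
for x in strings:
    if len(x) > 2:
        result.append(x.upper())

# The same logic as a list comprehension
comprehension = [x.upper() for x in strings if len(x) > 2]

print(result == comprehension)
```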