# Data Manipulation in Python

In [None]:
# No imports today



# Objectives

- Write functions to transform data
- Construct list and dictionary comprehensions
- Extract data from nested data structures

# Functions

This aspect of Python is _incredibly_ useful! Writing your own functions can save you a TON of work - by _automating_ it.

## Creating Functions

The first line will read:

```python
def function_name():
```

Any arguments to the function will go in the parentheses.

Let's try building a function that will automate the task of finding how many times a given number can be evenly divided by 2.

In [None]:
# Let's code it!
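
Here is one possible way to write that function. The name `count_twos` and the decision to return 0 for an input of 0 are assumptions made for this sketch, not the only reasonable design.

```python
def count_twos(number):
    """Return how many times `number` can be evenly divided by 2."""
    # Assumption: treat 0 as a special case, since 0 % 2 == 0 would loop forever
    if number == 0:
        return 0
    count = 0
    while number % 2 == 0:
        number = number // 2
        count += 1
    return count


print(count_twos(40))  # 40 -> 20 -> 10 -> 5, so 3
```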



## Calling Functions

To _call_ a function, simply type its name, along with any necessary arguments in parentheses.

In [None]:
# Let's call it!



## Default Argument Values

Sometimes we'll want the argument(s) of our function to have default values.

In [None]:
def cheers(person='aaron', job='data scientist', age=30):
    return f'Hooray for {person}. You\'re a {job} and you\'re {age}!'

In [None]:
cheers('greg', 'scientist', 130)

In [None]:
cheers(job='scientist', age=130, person='greg')

In [None]:
cheers('cristian', 'git enthusiast')

In [None]:
cheers()
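
Defaults can also be mixed with required arguments. The sketch below uses a made-up `greet` function (not part of the lesson's code) to show that arguments without defaults must be supplied, while a keyword lets us override a later default and keep an earlier one.

```python
def greet(name, greeting='Hello', punctuation='!'):
    # `name` has no default, so every call must supply it
    return f'{greeting}, {name}{punctuation}'


print(greet('Ada'))                   # Hello, Ada!
print(greet('Ada', 'Welcome'))        # Welcome, Ada!
print(greet('Ada', punctuation='?'))  # Hello, Ada?
```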

# Lists

## List Methods

Make sure you're comfortable with the following list methods:

- `.append()`: adds the input element to the end of a list
- `.pop()`: removes and returns the element at the input index (the last element if no index is given)
- `.extend()`: adds the elements in the input iterable to the end of a list
- `.index()`: returns the index of the first occurrence of the argument in a list
- `.remove()`: removes the first occurrence of the input value from a list
- `.count()`: returns the number of occurrences of the input element in a list
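
A quick sketch of these methods in action, using a throwaway `letters` list invented for illustration:

```python
letters = ['a', 'b', 'c']

letters.append('d')           # ['a', 'b', 'c', 'd']
letters.extend(['b', 'e'])    # ['a', 'b', 'c', 'd', 'b', 'e']

first_b = letters.index('b')  # 1, the position of the first 'b'
b_count = letters.count('b')  # 2

last = letters.pop()          # removes and returns 'e'
second = letters.pop(1)       # removes and returns the 'b' at index 1

letters.remove('d')           # removes the first 'd' by value
print(letters)                # ['a', 'c', 'b']
```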

Question: What's the difference between `.remove()` and `del`?

<details>
    <summary>
        Answer here
    </summary>
    .remove() removes the first element that matches a given value;<br/>
    del removes an element (or a slice) by position
    </details>

## List Comprehension

List comprehension is a handy way of generating a new list from existing iterables.

Suppose I start with a simple list.

In [None]:
primes = [2, 3, 5, 7, 11, 13, 17, 19]

What I want to do now is build a new list that comprises the doubles of these primes. I can do this with list comprehension!

The syntax is: `[f(x) for x in <iterable> if <condition>]`, where the `if <condition>` part is optional.
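
To see every piece of that pattern at once, here is a small sketch that includes the `if` clause. The name `big_prime_squares` is just for illustration.

```python
primes = [2, 3, 5, 7, 11, 13, 17, 19]

# f(x) is x**2, the iterable is primes, and the condition keeps only x > 10
big_prime_squares = [x**2 for x in primes if x > 10]
print(big_prime_squares)  # [121, 169, 289, 361]
```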

In [None]:
prime_doubles = [x*2 for x in primes]
prime_triples = [x*3 for x in primes]

In [None]:
prime_doubles

##### Aside: List Comprehensions Vs. `for`-Loops

Yes, I could do the same work with `for`-loops:

In [None]:
prime_doubles2 = []
for prime in primes:
    prime_doubles2.append(prime*2)
prime_doubles2

In [None]:
prime_doubles == prime_doubles2

But list comprehensions have advantages: the syntax is more compact, and they usually run somewhat faster than the equivalent `for`-loop. Also, you'll see them in other people's code, so you'll have to know how to work with them!
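
If you'd like to check the speed difference yourself, the standard library's `timeit` module is one way to do it. This is only a rough sketch: the helper names `with_loop` and `with_comprehension` are made up here, and the timings will vary with your machine and Python version, so no expected numbers are shown.

```python
import timeit

primes = [2, 3, 5, 7, 11, 13, 17, 19]


def with_loop():
    doubles = []
    for prime in primes:
        doubles.append(prime * 2)
    return doubles


def with_comprehension():
    return [prime * 2 for prime in primes]


# Both approaches build the same list
print(with_loop() == with_comprehension())

# Run each function 100,000 times and report the total seconds taken
loop_time = timeit.timeit(with_loop, number=100_000)
comp_time = timeit.timeit(with_comprehension, number=100_000)
print(f'for-loop: {loop_time:.4f}s, comprehension: {comp_time:.4f}s')
```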

### Another List Comprehension Example

I can use list comprehension to build a list from objects other than lists:

In [None]:
names = ('Alan Turing', 'Charles Babbage', 'Ada Lovelace',
        'Anita Borg', 'Steve Wozniak', 'Andrew Ng')

splits = [name.split() for name in names]
splits

In [None]:
[name1[0]+'. '+name2[0]+'.' for (name1, name2) in splits]

### Exercises

1. Use a list comprehension to extract the odd numbers from this set:

In [None]:
nums = set(range(1000))

<details>
    <summary>Answer
    </summary>
    <code>[num for num in nums if num % 2 == 1]</code>
    </details>

2. Use a list comprehension to take the first character of each string from the following list of words.

In [None]:
words = ['carbon', 'osmium', 'mercury', 'potassium', 'rhenium', 'einsteinium',
        'hydrogen', 'erbium', 'nitrogen', 'sulfur', 'iodine', 'oxygen', 'niobium']

<details>
    <summary>Answer
    </summary>
    <code>[word[0] for word in words]</code>
    </details>

3. Use a list comprehension to build a list of all the names that start with 'R' from the following list. Add a '?' to the end of each name.

In [None]:
names = ['Randy', 'Robert', 'Alex', 'Ranjit', 'Charlie', 'Richard', 'Ravdeep',
        'Vimal', 'Wu', 'Nelson']

<details>
<summary>Answer
    </summary>
    <code>[name+'?' for name in names if name[0] == 'R']</code>
    </details>

# Dictionaries

## Dictionary Methods

Make sure you're comfortable with the following dictionary methods:

- `.keys()`: returns a view of the dictionary's keys
- `.values()`: returns a view of the dictionary's values
- `.items()`: returns a view of the dictionary's key-value pairs (as tuples)
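
In Python 3 these methods return view objects rather than lists, which means they stay in sync with the dictionary. A short sketch with a made-up `inventory` dictionary:

```python
inventory = {'apples': 3, 'pears': 5}

keys = inventory.keys()
print(list(keys))                # ['apples', 'pears']

# A view reflects later changes to the dictionary
inventory['plums'] = 7
print(list(keys))                # ['apples', 'pears', 'plums']
print(list(inventory.values()))  # [3, 5, 7]
print(list(inventory.items()))   # [('apples', 3), ('pears', 5), ('plums', 7)]
```

Wrapping a view in `list()` gives an ordinary list when you need indexing.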

## Dictionary Comprehension

Much like list comprehension, I can use dictionary comprehension to build dictionaries from existing iterables.

In [None]:
my_dict = {'who': 'flatiron school', 'what': 'data science',
           'when': 'now', 'where': 'here', 'why': '$',
           'how': 'python'}

Remember that the `.items()` method will return a collection of key-value pairs (2-tuples):

In [None]:
my_dict.items()

So I can use a pair of variables to range over it:

In [None]:
{k: v + '!' for k, v in my_dict.items() if k.startswith('w')}

The same thing works for any collection of pairs:

In [None]:
{k**2: v**2 for k, v in [(0, 1), (2, 3), (4, 5)]}

### `zip`

Remember that `zip` is a handy way of pairing up two or more iterables:

In [None]:
dict(zip(range(5), ['apple', 'orange', 'banana', 'lime', 'blueberry']))

In [None]:
# Zipping multiple iterables together
tuple(zip(range(1, 5), 'a'*4, 'b'*4, 'c'*4, 'd'*4, 'e'*4))
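
One thing to watch for: `zip` stops as soon as its shortest input runs out. The sketch below uses two made-up lists of unequal length, and shows `itertools.zip_longest` as one option if you'd rather pad the shorter input.

```python
from itertools import zip_longest

numbers = [1, 2, 3, 4]
letters = ['a', 'b']

# zip silently drops the unmatched 3 and 4
pairs = list(zip(numbers, letters))
print(pairs)   # [(1, 'a'), (2, 'b')]

# zip_longest pads the shorter input with a fill value instead
padded = list(zip_longest(numbers, letters, fillvalue=None))
print(padded)  # [(1, 'a'), (2, 'b'), (3, None), (4, None)]
```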

#### Dictionary Comprehension Using `zip`

In [None]:
{k: v for k, v in zip(range(5), range(0, 10, 2))}

In [None]:
scores = [.858, .873, .868]
{'model' + str(j+1): scores[j] for j in range(3)}
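
Indexing with `range` works, but `enumerate` is a common alternative that hands us the counter and the value together. A sketch of the same dictionary built that way (the name `model_scores` is just for illustration):

```python
scores = [.858, .873, .868]

# start=1 makes the counter begin at 1 instead of 0
model_scores = {f'model{j}': score for j, score in enumerate(scores, start=1)}
print(model_scores)  # {'model1': 0.858, 'model2': 0.873, 'model3': 0.868}
```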

### Exercises

1. Use a dictionary comprehension to pair up the countries in the first list with their corresponding capitals in the second list:

In [None]:
list1 = ['USA', 'France', 'Canada', 'Thailand']
list2 = ['Washington', 'Paris', 'Ottawa', 'Bangkok']

<details>
<summary>Answer
    </summary>
    <code>{country: capital for (country, capital) in zip(list1, list2)}</code> <br/> OR <br/>
    <code>dict(zip(list1, list2))</code>
    </details>

2. Use a dictionary comprehension to make each of the characters in the following list a key with the value 'fictional character'.

In [None]:
chars = ['Pinocchio', 'Gilgamesh', 'Kumar Patel', 'Toby Flenderson']

<details>
    <summary>Answer</summary>
    <code>{char: 'fictional character' for char in chars}</code>
    </details>

# Nesting

Just as we can put lists and dictionaries inside of other lists and dictionaries, we can also put comprehensions inside of other comprehensions.

In [None]:
lists = [['morning', 'afternoon', 'night'], ['read', 'code', 'sleep']]

In [None]:
[[item[0] for item in small_list] for small_list in lists]
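
A nested comprehension keeps the list-of-lists shape. If we want a single flat list instead, we can put two `for` clauses in one comprehension. The sketch below contrasts the two; `initials` and `flat` are names chosen for illustration.

```python
lists = [['morning', 'afternoon', 'night'], ['read', 'code', 'sleep']]

# Nested comprehension: one inner list per original inner list
initials = [[item[0] for item in small_list] for small_list in lists]
print(initials)  # [['m', 'a', 'n'], ['r', 'c', 's']]

# Two for-clauses: read left to right, like nested for-loops
flat = [item for small_list in lists for item in small_list]
print(flat)  # ['morning', 'afternoon', 'night', 'read', 'code', 'sleep']
```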

## Nested Structures

It will be well worth your while to practice accessing data in complex structures. Consider the following:

In [None]:
customers = {
    'bill': {'purchases': {'movies': ['Terminator', 'Elf'],
                           'books': []},
             'id': 1},
    'dolph': {'purchases': {'movies': ['It Happened One Night'],
                            'books': ['The Far Side Gallery']},
              'id': 2},
    'pat': {'purchases': {'movies': [],
                          'books': ['Seinfeld and Philosophy', 'I Am a Bunny']},
            'id': 3}
}

**Q**: How would we access 'I Am a Bunny'?
<br/>
**A**: The outermost "layer" has a name: 'customers', and that object is a dictionary:
<br/>
`customers`
<br/>
The key we are interested in is 'pat', since that's where 'I Am a Bunny' is located:
<br/>
`customers['pat']`
<br/>
The value corresponding to the key 'pat' is also a dictionary, and in this "lower-down" dictionary, the key we are interested in is 'purchases':
<br/>
`customers['pat']['purchases']`
<br/>
The value corresponding to the key 'purchases' is yet another dictionary, and here the key of interest is 'books':
<br/>
`customers['pat']['purchases']['books']`
<br/>
The value corresponding to the key 'books' is a list, and 'I Am a Bunny' is the second element in that list:
<br/>
`customers['pat']['purchases']['books'][1]`

In [None]:
customers['pat']['purchases']['books'][1]
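
Chained square brackets raise a `KeyError` if any key along the way is missing. When that's a possibility, chaining `.get()` with default values is one way to fail gently. The helper `get_books` and the trimmed-down `customers` dictionary below are hypothetical, written just for this sketch.

```python
customers = {
    'bill': {'purchases': {'movies': ['Terminator', 'Elf'], 'books': []},
             'id': 1},
    'pat': {'purchases': {'movies': [],
                          'books': ['Seinfeld and Philosophy', 'I Am a Bunny']},
            'id': 3},
}


def get_books(name):
    """Return a customer's list of books, or an empty list if a key is missing."""
    # Each .get() falls back to an empty container, so the chain never raises
    return customers.get(name, {}).get('purchases', {}).get('books', [])


print(get_books('pat')[1])  # I Am a Bunny
print(get_books('nobody'))  # []
```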

## Exercises

1. From the list below, make a list of dictionaries where the key is the person's name and the value is the person's home phone number.

In [None]:
phone_nos = [{'name': 'greg', 'nums': {'home': 1234567, 'work': 7654321}},
             {'name': 'max', 'nums': {'home': 9876543, 'work': 1010001}},
             {'name': 'erin', 'nums': {'home': 3333333, 'work': 4444444}},
             {'name': 'joél', 'nums': {'home': 2222222, 'work': 5555555}},
             {'name': 'sean', 'nums': {'home': 9999999, 'work': 8888888}}]

<details>
    <summary>Answer</summary>
    <code>[{item['name']: item['nums']['home']} for item in phone_nos]</code>
    </details>

2. From the customers dictionary above, build a dictionary where the customers' names are the keys and the movies they've bought are the values.

<details>
    <summary>Answer</summary>
    <code>{customer: customers[customer]['purchases']['movies'] for customer in customers.keys()}</code> <br/>
    OR <br/>
    <code>{k: v['purchases']['movies'] for k, v in customers.items()}</code>
    </details>

# More Exercises

1. Build a function that will return $2^n$ for an input $n$.

<details>
    <summary>Answer</summary>
    <code>
def expo(n):
    return 2**n</code>
    </details>

2. Build a function that will take in a list of phone numbers as strings and return the same as integers, removing any parentheses ('(' and ')'), hyphens ('-'), and spaces.

<details>
    <summary>Answer</summary>
    <code>
def int_phone(string_list):
    return [int(string.replace('(', '').replace(')', '').replace('-', '').replace(' ', ''))\
    for string in string_list]</code>
    </details>

3. Build a function that returns the mode of a list of numbers.

<details>
    <summary>Answer</summary>
        <code>
def mode(lst):
    counts = {num: lst.count(num) for num in lst}
    highest = max(counts.values())
    return [num for num in counts if counts[num] == highest]</code>
    </details>
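
For the mode exercise, the standard library's `collections.Counter` does the counting in a single pass. The sketch below is an alternative, not the expected answer. The name `mode_counter` and the choice to return an empty list for empty input are assumptions made here.

```python
from collections import Counter


def mode_counter(lst):
    """Return a list of the most frequent values in lst (all ties included)."""
    # Assumption: an empty input has no mode, so return an empty list
    if not lst:
        return []
    counts = Counter(lst)
    highest = max(counts.values())
    return [num for num, count in counts.items() if count == highest]


print(mode_counter([1, 2, 2, 3, 3]))  # [2, 3]
```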