# Python data structures

## Tuple

A fixed-length, immutable sequence of Python objects. If an object inside a tuple is mutable, you can still modify that object in-place.

To convert a sequence to a tuple use `tuple()`.

In [1]:
tup = 3,4,5
tup

(3, 4, 5)

In [2]:
tmp_var = 'string'
tup = tuple(tmp_var)
tup

('s', 't', 'r', 'i', 'n', 'g')
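The note above about mutable objects inside a tuple can be illustrated with a minimal sketch. The values in `tup` are made up for illustration; the point is that the list stored in the tuple can grow, while the tuple's own slots cannot be reassigned.

```python
# A tuple holding a mutable list (hypothetical example values).
tup = ('foo', [1, 2], True)

# The list inside the tuple can still be changed in-place.
tup[1].append(3)

# Rebinding a slot of the tuple itself raises a TypeError.
try:
    tup[2] = False
    reassigned = True
except TypeError:
    reassigned = False

print(tup)         # ('foo', [1, 2, 3], True)
print(reassigned)  # False
```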

## Unpacking

A quick way to assign objects of a sequence to variables.

In [3]:
tup = 1,2,3
a, *_ = tup
print(a)
print(*_)

1
2 3
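Unpacking also works with nested sequences and with a starred name in the middle of the pattern. The following is a small sketch with made-up values, not an exhaustive list of the forms Python accepts.

```python
# Nested tuples unpack with a matching nested pattern.
tup = 4, 5, (6, 7)
a, b, (c, d) = tup

# A starred name collects the leftover items into a list.
first, *middle, last = [1, 2, 3, 4, 5]

print(a, b, c, d)           # 4 5 6 7
print(first, middle, last)  # 1 [2, 3, 4] 5
```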


## Variable swap

In [4]:
a = 1
b = 10

b, a = a, b
print(a)
print(b)

10
1


## Lists

A variable-length sequence of Python objects. You can modify its contents in-place. To create a list use `[]`.

In [5]:
a_list = [2,3,7, None]
a_list

[2, 3, 7, None]

### List methods

- `.append()` to insert a new value at the end.
- `.insert()` to insert a new value at a specific position.
- `.pop()` to remove an element from a list and return it. Pass an index to remove a specific element. By default it removes the last value.
- `.remove()` to remove an object by value. Only removes the first occurrence.
- `.extend()` to add multiple elements to the end of a list.
- `.sort()` to sort a list in-place. Accepts a `key` argument (a function that produces the sort key for each element).
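The methods listed above can be tried together in one short sketch. The `fruits` list is hypothetical sample data; the comments show the list after each step.

```python
fruits = ['pear', 'apple']          # hypothetical example data

fruits.append('kiwi')               # ['pear', 'apple', 'kiwi']
fruits.insert(1, 'fig')             # ['pear', 'fig', 'apple', 'kiwi']
last = fruits.pop()                 # removes and returns 'kiwi'
second = fruits.pop(1)              # removes and returns 'fig'
fruits.extend(['apple', 'banana'])  # ['pear', 'apple', 'apple', 'banana']
fruits.remove('apple')              # drops only the first 'apple'
fruits.sort(key=len)                # sort in-place by string length

print(fruits)  # ['pear', 'apple', 'banana']
```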

### Checking whether a list contains a value

Use:
- `in`
- `not in`
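A short sketch of both operators, reusing the `a_list` values from the Lists section. The variable names `has_three` and `missing_ten` are only for illustration.

```python
a_list = [2, 3, 7, None]

has_three = 3 in a_list
missing_ten = 10 not in a_list

print(has_three)    # True
print(missing_ten)  # True

# The same operators work on other containers, for example strings.
print('tr' in 'string')  # True
```

Membership tests on a list scan the elements one by one, so for large collections a set or dict is usually faster.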

### Built-in bisect module for sorted lists

`bisect.bisect(a_list, 2)` finds the position in a sorted list at which to insert a value so that the list stays sorted. `bisect.bisect()` only finds the position; it does not insert the value. To insert a value use `bisect.insort()`.
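A minimal sketch of both functions on a made-up sorted list. Note that the `bisect` module does not check that the list is actually sorted, so using it on an unsorted list gives meaningless positions without raising an error.

```python
import bisect

a_list = [1, 2, 2, 2, 3, 4, 7]   # must already be sorted

# bisect.bisect returns the insertion point to the right of equal values.
pos = bisect.bisect(a_list, 2)

# bisect.insort inserts the value at the position that keeps the order.
bisect.insort(a_list, 6)

print(pos)     # 4
print(a_list)  # [1, 2, 2, 2, 3, 4, 6, 7]
```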

### Slicing
Slicing selects a section of a list. The syntax is `a_list[start:stop:step]`. The element at the stop position is not included in the returned selection.

To reverse a list you can use `a_list[::-1]`.
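The slice forms described above, sketched on a made-up list. Omitting `start` or `stop` defaults to the beginning or the end of the list, and negative indices count from the end.

```python
a_list = [7, 2, 3, 7, 5, 6, 0, 1]

print(a_list[1:5])   # [2, 3, 7, 5]  (index 5 is excluded)
print(a_list[:3])    # [7, 2, 3]
print(a_list[-2:])   # [0, 1]
print(a_list[::2])   # [7, 3, 5, 0]  (every second element)
print(a_list[::-1])  # [1, 0, 6, 5, 7, 3, 2, 7]
```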

## Built-in sequence functions

### enumerate()
The `enumerate` function yields `(index, value)` pairs when iterating over a sequence, so you do not have to track the index yourself.

`for i, value in enumerate(collection):
    # do something with value`
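As a concrete sketch of the pattern above, `enumerate` can build a mapping from each value to its position. The names `collection` and `positions` are illustrative only.

```python
collection = ['foo', 'bar', 'baz']  # hypothetical sample data

positions = {}
for i, value in enumerate(collection):
    # i is the index, value is the element at that index
    positions[value] = i

print(positions)  # {'foo': 0, 'bar': 1, 'baz': 2}
```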

### sorted()

The sorted function returns a new sorted list built from the elements of any sequence. The original sequence is left unchanged.

In [6]:
a = 'a string'
a = list(a)
print(sorted(a),'\n')
print('notice that a is not modified.\n')
# notice that a is not modified.
print(a)


[' ', 'a', 'g', 'i', 'n', 'r', 's', 't'] 

notice that a is not modified.

['a', ' ', 's', 't', 'r', 'i', 'n', 'g']


### zip()

The zip function "pairs up" elements of a group of sequences. This function returns an iterator.

In [7]:
seq1 = ['a', 'b', 'c']
seq2 = [1, 2, 3]
zipped = zip(seq1, seq2)
list(zipped)

[('a', 1), ('b', 2), ('c', 3)]
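Two further behaviors of `zip` are worth a quick sketch: combined with `*` it can split a list of pairs back into separate sequences, and it stops at the shortest input. The sample values are made up.

```python
pairs = [('a', 1), ('b', 2), ('c', 3)]

# "Unzip": pass the pairs as separate arguments to zip.
letters, numbers = zip(*pairs)
print(letters)  # ('a', 'b', 'c')
print(numbers)  # (1, 2, 3)

# zip stops as soon as the shortest sequence is exhausted.
short = list(zip([1, 2, 3], ['x', 'y']))
print(short)    # [(1, 'x'), (2, 'y')]
```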

### reversed()

Iterates over the elements of a sequence in reversed order. `reversed` returns an iterator, so wrap it in `list()` to materialize the result.

In [8]:
a_list = ['one', 'three', 'five', 'four', 'two', 'zero']
print(list(reversed(a_list)),'\n')
print('reversed does not modify the original list.\n')
print(a_list)

['zero', 'two', 'four', 'five', 'three', 'one'] 

reversed does not modify the original list.

['one', 'three', 'five', 'four', 'two', 'zero']


## Dict

A dict is a collection of key-value pairs. Each key and value are Python objects. To create a dict use `{}`.

In [9]:
a_dict = {
    'a' :'a string',
    'b' : [1,2,3,4],
    'c' : 100,
    'd' : {
        'a' : 200
    }
}

To access a value use the key:

In [10]:
a_dict['c']

100

To add another key:value pair:

In [11]:
a_dict['e'] = 'I am a new key:value pair'
a_dict

{'a': 'a string',
 'b': [1, 2, 3, 4],
 'c': 100,
 'd': {'a': 200},
 'e': 'I am a new key:value pair'}

To remove a key:value pair use `del` or `pop`. Both delete the key; `pop` also returns the removed value, while `del` is a statement and returns nothing.

In [12]:
del a_dict['c']
a_dict

{'a': 'a string',
 'b': [1, 2, 3, 4],
 'd': {'a': 200},
 'e': 'I am a new key:value pair'}

In [13]:
deleted_value = a_dict.pop('e')
print(deleted_value)
a_dict

I am a new key:value pair


{'a': 'a string', 'b': [1, 2, 3, 4], 'd': {'a': 200}}

In [14]:
# using items to keep track of key and value
for key, value in a_dict.items():
    print(key)
    print(value)

a
a string
b
[1, 2, 3, 4]
d
{'a': 200}


### Dict methods

- `.keys()` to retrieve the keys.
- `.values()` to retrieve the values.
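- `.update()` to merge another dict into this one in-place. Values of existing keys are overwritten.
- `.get()` to retrieve a value by key, or a default value (`None` unless specified) if the key is missing.
- `.setdefault()` to return the value for a key, first inserting the key with a default value if it is missing.
- `defaultdict` (imported from the `collections` module) is a dict subclass that builds a default value for any missing key.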

In [15]:
a_dict.keys()

dict_keys(['a', 'b', 'd'])

In [16]:
a_dict.values()

dict_values(['a string', [1, 2, 3, 4], {'a': 200}])

In [17]:
a_dict.update({'a' : "I was a string, now I'm a longer string", 'z' : 1000})
a_dict

{'a': "I was a string, now I'm a longer string",
 'b': [1, 2, 3, 4],
 'd': {'a': 200},
 'z': 1000}
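The cells above cover `.keys()`, `.values()` and `.update()`. The remaining entries from the list, `.get()`, `.setdefault()` and `defaultdict`, can be sketched by grouping words by their first letter. The `words` list and the variable names are hypothetical sample data.

```python
from collections import defaultdict

words = ['apple', 'bat', 'bar', 'atom', 'book']  # hypothetical sample data

# setdefault: insert an empty list the first time a letter is seen,
# then append to whatever list is stored under that letter.
by_letter = {}
for word in words:
    by_letter.setdefault(word[0], []).append(word)

# defaultdict: the factory (list) builds the missing value automatically.
by_letter_dd = defaultdict(list)
for word in words:
    by_letter_dd[word[0]].append(word)

print(by_letter)                       # {'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}
print(dict(by_letter_dd) == by_letter) # True
print(by_letter.get('z', 'no words'))  # no words
```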

### Key in dict?

In [18]:
'a' in a_dict

True

In [19]:
'a' not in a_dict

False

### Creating a dict from sequences using `zip()`

In [20]:
states = ['California', 'New York', "Georgia"]
capitals = ['Sacramento', 'Albany', 'Atlanta']

mapping = {}
for key, value in zip(states, capitals): # think of a dict as a collection of 2-tuples.
    mapping[key] = value
    
mapping

{'California': 'Sacramento', 'New York': 'Albany', 'Georgia': 'Atlanta'}

### Creating a dict using `dict()` and `zip()`

In [21]:
countries = ['Canada', 'US', 'Mexico']
capitals = ['Ottawa', 'Washington D.C.', 'Mexico City']

dict(zip(countries, capitals))

{'Canada': 'Ottawa', 'US': 'Washington D.C.', 'Mexico': 'Mexico City'}

### Default values

In [22]:
countries = ['Canada', 'US', 'Mexico']
capitals = ['Ottawa', 'Washington D.C.', 'Mexico City']

countries = dict(zip(countries, capitals))

search_key = "Brasil"
if search_key in countries:
    value = countries[search_key]
else:
    value = "Country not present"
    
print(value)

Country not present


In [23]:
search_key = "Mexico"
value = countries.get(search_key, "No record")
print(value)

Mexico City


### Valid dict key types

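A key must be a hashable object. Immutable built-in types such as ints, floats, strings, and tuples (when all their elements are hashable) qualify; lists, dicts, and sets do not.

Use `hash()` to check whether an object is hashable: it raises a `TypeError` for unhashable objects.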
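A minimal sketch of that check. The helper `is_hashable` is a hypothetical name introduced here for illustration; it is not part of the standard library.

```python
def is_hashable(obj):
    """Return True if obj can be used as a dict key (hypothetical helper)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable('string'))     # True
print(is_hashable((1, 2, 3)))    # True
print(is_hashable((1, [2, 3])))  # False: the tuple contains a list
print(is_hashable([1, 2, 3]))    # False
```

If a list needs to serve as a key, converting it to a tuple first is a common workaround.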

## Set

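A set is an unordered collection of unique elements. To create a set, use `set()` or a set literal with curly braces such as `{1, 2, 3}`. Note that an empty `{}` creates a dict, not a set.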

In [24]:
set([1,4,5,3,6,2,2,4,5])

{1, 2, 3, 4, 5, 6}

### Set operations

- union
- intersection
- difference
- symmetric difference
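Each of the operations listed above has both a method and an operator form. The sketch below uses two made-up sets and prints sorted lists so the output order is predictable.

```python
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

union = a | b             # same as a.union(b)
intersection = a & b      # same as a.intersection(b)
difference = a - b        # same as a.difference(b)
symmetric = a ^ b         # same as a.symmetric_difference(b)

print(sorted(union))         # [1, 2, 3, 4, 5, 6, 7, 8]
print(sorted(intersection))  # [3, 4, 5]
print(sorted(difference))    # [1, 2]
print(sorted(symmetric))     # [1, 2, 6, 7, 8]
```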

## List, Set, and Dict Comprehensions

A comprehension combines a loop and an optional `if` filter in a single expression. Depending on the brackets used, it returns a new list, set, or dict.

`[expression for i in collection if condition]`

In [25]:
# list
numbers = [1,2,3,4,5,6,7,8,9]
[num*num for num in numbers if num % 2 == 0]

[4, 16, 36, 64]

In [26]:
# no if statement
[num*num for num in numbers]

[1, 4, 9, 16, 25, 36, 49, 64, 81]

In [27]:
# set
words = ['apple', 'pear', 'two', 'nine', 'all', 'tea', 'almost']
set_letters = {word[0] for word in words if len(word) > 3}
set_letters

{'a', 'n', 'p'}

In [28]:
# no if statement
{word[0] for word in words}

{'a', 'n', 'p', 't'}

In [29]:
unique_lengths = {len(word) for word in words}
unique_lengths

{3, 4, 5, 6}

In [30]:
location_mapping = {val : index for index, val in enumerate(words) if len(val) > 3}
location_mapping

{'apple': 0, 'pear': 1, 'nine': 3, 'almost': 6}

## Nested list comprehensions

In [31]:
all_data = [['John', 'Emily', 'Michael', 'Mary', 'Steven'],
            ['Merie', 'Juan', 'Javier', 'Natalia', 'Pilar']]


In [32]:
result = [name.lower() for names in all_data for name in names if name.count('e') >= 2]
result

['steven', 'merie']

In [33]:
result = [name for names in all_data for name in names if name[0] == 'J']
result

['John', 'Juan', 'Javier']

In [34]:
some_tuples = [(1,2,3), (4,5,6), (7,8,9)]
flattened = [x for tup in some_tuples for x in tup]
flattened

[1, 2, 3, 4, 5, 6, 7, 8, 9]

## Functions

`None` is returned automatically if no return statement is present.

Each function can have positional arguments and keyword arguments. Keyword arguments are most commonly used to specify default values or optional arguments.

Namespace: describes the variable scope.

Scopes: local and global.

When a function is executed, a local namespace is created. Once the execution is complete, the local namespace is destroyed.
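The points above can be sketched in a few lines. The functions `scale` and `no_return` and the name `local_total` are hypothetical and exist only for this illustration.

```python
def scale(x, factor=2):
    # x is positional; factor is a keyword argument with a default value.
    # local_total lives only in the local namespace of this call.
    local_total = x * factor
    return local_total

def no_return():
    pass  # no return statement, so the call evaluates to None

print(scale(5))            # 10
print(scale(5, factor=3))  # 15
print(no_return())         # None

# After the call finishes, the local name is gone.
try:
    local_total
    leaked = True
except NameError:
    leaked = False

print(leaked)  # False
```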

In [35]:
states = ['   Alabama  ', 'Georgia', 'Georgia!', 'georgia', 'FlOrida', 'south  carolina##', 'West virginia?']

In [36]:
import re

In [37]:
def remove_punctuation(value):
    return re.sub(pattern='[!#?]', repl='', string=value)

def remove_doublespaces(value):
    return re.sub(pattern=r'\s+', repl=' ', string=value)

In [38]:
clean_ops = [str.strip, remove_punctuation, remove_doublespaces, str.title]

In [39]:
def clean_strings(string_list, func_list):
    result = []
    for string in string_list:
        for function in func_list:
            string = function(string)
        result.append(string)
    return result

In [40]:
result = clean_strings(states, clean_ops)

In [41]:
result

['Alabama',
 'Georgia',
 'Georgia',
 'Georgia',
 'Florida',
 'South Carolina',
 'West Virginia']

In [42]:
list(map(remove_punctuation, states))

['   Alabama  ',
 'Georgia',
 'Georgia',
 'georgia',
 'FlOrida',
 'south  carolina',
 'West virginia']

In [43]:
list(map(remove_doublespaces,map(remove_punctuation, states)))

[' Alabama ',
 'Georgia',
 'Georgia',
 'georgia',
 'FlOrida',
 'south carolina',
 'West virginia']

## Lambda functions

In [48]:
strings = ['abadcd','foo', 'card', 'bar', 'aaaa', 'abab']

In [49]:
strings.sort(key=len)

In [50]:
strings

['foo', 'bar', 'card', 'aaaa', 'abab', 'abadcd']

In [52]:
strings.sort(key=lambda x: len(set(x)))  # sort by number of distinct letters

In [53]:
strings

['aaaa', 'foo', 'abab', 'bar', 'card', 'abadcd']

In [57]:
len(set(strings))  # number of distinct strings in the list

6

In [62]:
def apply_to_list(numbers, f):
    return [f(x) for x in numbers]

In [68]:
def square(x):
    return x ** 2

In [69]:
num_list = [2, 6, 3, 7, 9, 10]

In [70]:
apply_to_list(num_list, square)

[4, 36, 9, 49, 81, 100]

In [72]:
# using lambda function
import math
apply_to_list(num_list, lambda x: math.sqrt(x))

[1.4142135623730951,
 2.449489742783178,
 1.7320508075688772,
 2.6457513110645907,
 3.0,
 3.1622776601683795]

In [73]:
some_tuples = [('John', 20), ('Mike', 10), ('Sara',15), ('Jane',1)]

In [75]:
sorted(some_tuples, key=lambda pair: pair[1])  # sort by the second item of each tuple

[('Jane', 1), ('Mike', 10), ('Sara', 15), ('John', 20)]

## Currying

Deriving new functions from existing ones by partial argument application.

In [85]:
def add_numbers(x, y):
    return x + y

In [86]:
add_five = lambda y : add_numbers(5, y)
add_five(10)

15

Using `functools.partial`, which fixes the leading positional arguments (and optionally keyword arguments) of a function:

In [88]:
from functools import partial
def add_three_num(x, y, z):
    return x + y + z
one_number = partial(add_three_num, 10, 5)
one_number(1)

16
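`partial` can also fix arguments by keyword, which helps when the argument to fix is not the first one. The sketch below uses a hypothetical `power` function to derive two new functions.

```python
from functools import partial

def power(base, exponent):
    return base ** exponent

# Fix the second argument by name; the first stays open.
square = partial(power, exponent=2)
cube = partial(power, exponent=3)

print(square(4))  # 16
print(cube(2))    # 8
```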