# Data Structures

## Tuples

A *tuple* is an ordered sequence of objects. It can hold any number of objects. In Python, tuples are written with round brackets **()**.

In [None]:
latitude = 37.7739
longitude = -121.5687
coordinates = (latitude, longitude)
print(coordinates)

You can access each item by its position, i.e. its *index*. In programming, counting starts from 0, so the first item has an index of 0, the second item an index of 1, and so on. The index is written inside square brackets **[]**.

In [None]:
y = coordinates[0]
x = coordinates[1]
print(x, y)
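
Since a tuple holds a fixed number of items, its values can also be unpacked into separate variables in one statement. The sketch below uses the same two numbers as the cell above; the names `point`, `lat` and `lon` are chosen here only for illustration.

```python
# Same two values as the coordinates tuple above
point = (37.7739, -121.5687)

# Unpack both items in one statement instead of indexing twice
lat, lon = point
print(lat)
print(lon)

# Negative indexes count from the end of the tuple
print(point[-1])
```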

## Lists

A **list** is similar to a tuple, with one key difference. Once created, a tuple cannot be changed, i.e. it is *immutable*. Lists are *mutable*: you can add, delete or change elements within a list. In Python, lists are written with square brackets **[]**.

In [None]:
cities = ['San Francisco', 'Los Angeles', 'New York', 'Atlanta']
print(cities)
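
The difference between mutable and immutable objects is easiest to see by trying to replace an item. The following is a small sketch with made-up variable names (`point`, `places`): the assignment works on the list but raises a `TypeError` on the tuple.

```python
point = (37.7739, -121.5687)                # tuple: immutable
places = ['San Francisco', 'Los Angeles']   # list: mutable

# Replacing an item in a list works
places[0] = 'Oakland'
print(places)

# Replacing an item in a tuple raises a TypeError
try:
    point[0] = 40.0
except TypeError as error:
    print('Cannot change a tuple:', error)
```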

You can access the elements of a list by index, the same way as with tuples.

In [None]:
print(cities[0])

You can call the `len()` function on any Python object that has a length, such as a list, tuple, string, set or dictionary, and it returns the number of items in the object.

In [None]:
print(len(cities))
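
As a quick illustration, `len()` can be tried on a few different kinds of objects. The variable names below are made up for this sketch. Objects that have no length, such as a plain number, raise a `TypeError`.

```python
list_size = len(['San Francisco', 'Boston'])   # items in a list
tuple_size = len((37.7739, -121.5687))         # items in a tuple
text_size = len('Atlanta')                     # characters in a string
print(list_size, tuple_size, text_size)

# A plain number has no length, so len() raises a TypeError
try:
    len(42)
except TypeError as error:
    print('No length:', error)
```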

We can add an item to the end of the list using the `append()` method.

In [None]:
cities.append('Boston')
print(cities)

As lists are *mutable*, you will see that the size of the list has now changed.

In [None]:
print(len(cities))

Another useful method for lists is `sort()`, which sorts the elements of the list in place.

In [None]:
cities.sort()
print(cities)

The default sorting is in *ascending* order. If we want to sort the list in *descending* order, we can call the method with `reverse=True`.

In [None]:
cities.sort(reverse=True)
print(cities)
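
Note that `sort()` changes the list itself and returns `None`. If you want to keep the original order, Python's built-in `sorted()` function returns a new sorted list instead. The sketch below uses a separate list named `towns` (a made-up name) so that the `cities` list from the cells above is left as it is.

```python
towns = ['San Francisco', 'Los Angeles', 'New York', 'Atlanta']

# sorted() returns a new list and leaves the original untouched
ordered = sorted(towns)
print(ordered)
print(towns)

# sort() changes the list in place and returns None
result = towns.sort()
print(result)
print(towns)
```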

## Sets

Sets are like lists, but with some interesting properties. Mainly, they are unordered and contain only unique values. They also allow *set operations* such as *intersection*, *union* and *difference*. In practice, sets are typically created from lists.

In [None]:
capitals = ['Sacramento', 'Boston', 'Austin', 'Atlanta']
capitals_set = set(capitals)
cities_set = set(cities)

capital_cities = capitals_set.intersection(cities_set)
print(capital_cities)
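
The other two set operations work the same way as `intersection()`. Here is a minimal sketch using `union()` and `difference()`. The set names are made up for this example, and the values are sorted before printing only because sets have no fixed order.

```python
state_capitals = set(['Sacramento', 'Boston', 'Austin', 'Atlanta'])
large_cities = set(['San Francisco', 'Los Angeles', 'New York', 'Atlanta', 'Boston'])

# Union: items that are in either set
either = state_capitals.union(large_cities)

# Difference: items in the first set that are not in the second
only_capitals = state_capitals.difference(large_cities)

# Sets are unordered, so sort before printing to get a stable order
print(sorted(either))
print(sorted(only_capitals))
```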

Sets are also useful for finding the unique elements in a list. Let's merge the two lists using the `extend()` method, which adds all items of one list to the end of another. The resulting list will have duplicate elements. Creating a set from the list removes the duplicate elements.

In [None]:
cities.extend(capitals)
print(cities)
print(set(cities))
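
A set does not keep the original order of the items. If the order matters, one common approach is `dict.fromkeys()`, which keeps the first occurrence of each item in order. This is a sketch of an alternative, not something the cells above rely on, and the list `names` is made up for the example.

```python
names = ['Boston', 'Atlanta', 'Austin', 'Boston', 'Atlanta']

# A set removes duplicates but does not keep the original order
print(set(names))

# dict.fromkeys() keeps the first occurrence of each item, in order
unique_names = list(dict.fromkeys(names))
print(unique_names)
```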

## Dictionaries

In Python, dictionaries are written with curly brackets **{}**. Dictionaries have *keys* and *values*. With lists, we access each element by its index, but a dictionary makes it easy to access an element by name. Each key is separated from its value by a colon **:**.

In [None]:
data = {'city': 'San Francisco', 'population': 881549, 'coordinates': (-122.4194, 37.7749) }
print(data)

You can access an item of a dictionary by referring to its key name, inside square brackets.

In [None]:
print(data['city'])
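
Dictionaries are mutable, so the same square bracket syntax can also add a new key. The sketch below uses a small separate dictionary named `record` (a made-up name) and shows the `get()` method, which returns a default value instead of raising a `KeyError` when a key is missing.

```python
record = {'city': 'San Francisco', 'population': 881549}

# Add a new key with the same [] syntax used for reading
record['state'] = 'California'
print(record)

# get() returns a default instead of raising KeyError for a missing key
print(record.get('country', 'unknown'))

# Loop over all key-value pairs
for key, value in record.items():
    print(key, value)
```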

## Exercise

From the dictionary below, how do you access the latitude and longitude values? Print the latitude and longitude of New York City by extracting them from the dictionary.

The expected output should look like below.

```
40.661
-73.944
```

In [None]:
nyc_data = {'city': 'New York', 'population': 8175133, 'coordinates': (40.661, -73.944) }

### What is NumPy?
![](images/python_foundation/numpy.png)


NumPy is the fundamental package for scientific computing with Python:
- A powerful N-dimensional array object
- Tools for integrating C/C++ and Fortran code
- Useful linear algebra, Fourier transform, and random number capabilities
- <a href='http://cs231n.github.io/python-numpy-tutorial/'>Detailed tutorials</a>

By convention, `numpy` is commonly imported as `np`.

In [1]:
import numpy as np

## Arrays

The array object in NumPy is called `ndarray`. It comes with many supporting functions that make working with arrays fast and easy. Arrays may look like Python lists, but `ndarray` can be up to 50x faster in mathematical operations. You can create an array using the `array()` function. As you can see, the resulting object is of type `numpy.ndarray`.

In [2]:
a = np.array([1, 2, 3, 4])
print(type(a))

<class 'numpy.ndarray'>


Arrays can have any number of *dimensions*. We can create a 2D array like below. `ndarray` objects have the attribute `ndim`, which stores the number of array dimensions. You can also check the size of the array along each dimension using the `shape` attribute.

In [3]:
b = np.array([[1, 2, 4], [3, 4, 5]])
print(b)
print(b.ndim)
print(b.shape)

[[1 2 4]
 [3 4 5]]
2
(2, 3)


You can access elements of arrays like Python lists using `[]` notation.

In [4]:
print(b[0])

[1 2 4]


In [5]:
print(b[0][2])

4
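
NumPy also accepts the row and column index together inside one pair of brackets, and a colon selects everything along an axis. The sketch below uses a copy of the same values under a different name (`grid`, made up here) so the array `b` is not touched.

```python
import numpy as np

grid = np.array([[1, 2, 4], [3, 4, 5]])

# A row and a column index can be given together, separated by a comma
print(grid[0, 2])      # same element as grid[0][2]

# A colon selects everything along that axis
print(grid[:, 1])      # second column
print(grid[1, :])      # second row
```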


## Array Operations

Mathematical operations on NumPy arrays are easy and fast. NumPy has many built-in functions for common operations.

In [6]:
print(np.sum(b))

19
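
To get a feel for the speed difference, you can time the same calculation on a list and on an array. This is only a rough sketch using the standard `timeit` module. The variable names and the array size are arbitrary choices, and the measured times depend on your machine, the size of the data and the operation.

```python
import timeit
import numpy as np

values = [float(i) for i in range(100_000)]
array = np.array(values)

# Time squaring every element: a Python loop versus one NumPy operation
list_time = timeit.timeit(lambda: [v * v for v in values], number=20)
array_time = timeit.timeit(lambda: array * array, number=20)

print('list :', list_time)
print('array:', array_time)
```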


You can also use NumPy functions for element-wise operations between two arrays.

In [7]:
c = np.array([[2, 2, 2], [2, 2, 2]])
print(np.divide(b, c))

[[0.5 1.  2. ]
 [1.5 2.  2.5]]


If the objects are NumPy arrays, you can use the Python operators as well.

In [8]:
print(b/c)

[[0.5 1.  2. ]
 [1.5 2.  2.5]]


You can also combine array and scalar objects. The scalar operation is applied to each item in the array.

In [9]:
print(b)
print(b*2)
print(b/2)

[[1 2 4]
 [3 4 5]]
[[ 2  4  8]
 [ 6  8 10]]
[[0.5 1.  2. ]
 [1.5 2.  2.5]]
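
This behavior is different from that of Python lists, which is a common source of confusion. A small sketch of the contrast, with variable names made up for the example:

```python
import numpy as np

numbers = [1, 2, 4]

# With a list, * repeats the list
repeated = numbers * 2
print(repeated)

# With an array, * multiplies every element
doubled = np.array(numbers) * 2
print(doubled)
```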


An important concept in NumPy is the array *axis*. As in the `pandas` library, in a 2D array axis 0 runs along the rows (downwards) and axis 1 runs along the columns (across). The diagram below shows the directions.

![](images/python_foundation/pandas_axis.png)

Let's see how to apply a function along a specific axis. When we apply the `sum` function along axis 0 of a 2D array, the rows are added together element by element, giving a 1D array with one total per column.

In [10]:
print(b)
row_sum = b.sum(axis=0)  # adds the rows together: one total per column
print(row_sum)

[[1 2 4]
 [3 4 5]]
[4 6 9]
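
The `axis` argument works the same way for other functions. A useful way to think about it is that the chosen axis is the one that disappears from the result. The sketch below uses `max` on a made-up array named `readings`, whose values are hypothetical.

```python
import numpy as np

# Hypothetical data: 2 rows and 3 columns
readings = np.array([[10, 20, 30], [40, 50, 60]])

# axis=0 collapses the rows: one result per column
print(readings.max(axis=0))

# axis=1 collapses the columns: one result per row
print(readings.max(axis=1))

# With no axis, the function uses every element
print(readings.max())
```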


## Exercise

Sum the array `b` along Axis-1. What do you think will be the result?

In [13]:
import numpy as np

b = np.array([[1, 2, 4], [3, 4, 5]])
print(b.sum(axis=1))

[ 7 12]


----