# Creating and Reshaping Arrays

The exercises in this notebook will teach you to use a variety of common functions for creating and reshaping arrays.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

## `np.array`

The most common way to create an array is to construct one from a Python list.

**Example:** Passing a list of scalars produces a 1-dimensional array.

In [None]:
np.array([1, 2, 3])

**Example:** Passing a list of lists produces a 2-dimensional array.

In [None]:
np.array([[1, 2], 
          [2, 3]])

**Exercise:** Create a 1-dimensional array using `np.array` containing `[1, 2, 3, 4]`.

In [None]:
np.array?

**Exercise:** Create a 2-dimensional array using `np.array` with two rows containing `[1, 2, 3]` and `[4, 5, 6]`.

## `np.arange`

**Exercise:** Construct an array containing values from 0 to 9, inclusive, in ascending order.

In [None]:
np.arange?
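**Example:** A minimal sketch of how `np.arange` treats its arguments, using values chosen only for illustration. The point to notice is that the stop value is never included in the result.

```python
import numpy as np

# With one argument, arange counts up from 0 and stops *before* the argument.
first_three = np.arange(3)          # array([0, 1, 2])

# With two arguments, the first is the start and the second is the excluded stop.
two_to_four = np.arange(2, 5)       # array([2, 3, 4])

# A third argument sets the step; a negative step counts down.
by_fives = np.arange(10, 25, 5)     # array([10, 15, 20])
countdown = np.arange(3, 0, -1)     # array([3, 2, 1])
```

Because the stop is excluded, an "inclusive" range needs a stop one step past the last value you want.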

**Exercise:** Construct an array containing all the integers between 1 and 3 inclusive, in ascending order.

**Exercise:** Construct an array containing all the integers between 5 and 10 inclusive, in ascending order.

**Exercise:** Construct an array containing all the integers between 1 and 10 inclusive, in descending order.

**Exercise:** Construct an array containing all the **even** integers between 2 and 10 inclusive, in ascending order.

## `linspace`

**Exercise:** Construct an array containing 50 evenly-spaced values between -1 and 1.

In [None]:
np.linspace?
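**Example:** A small sketch of `np.linspace` with illustrative values. Unlike `np.arange`, the third argument is the number of points rather than the step, and the endpoint is included by default.

```python
import numpy as np

# Five evenly spaced values from 0 to 1; both endpoints are included by default.
quarters = np.linspace(0, 1, 5)     # array([0.  , 0.25, 0.5 , 0.75, 1.  ])

# The spacing follows from the endpoints and the number of points.
spacing = quarters[1] - quarters[0]  # 0.25
```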

## Exercise: `zeros`, `ones`, and `full`

**Exercise:** Construct arrays with the following shapes and values:
- 1-dimensional array with 10 entries, all containing the value 0.
- 2-dimensional array with 3 rows and 5 columns, all containing the value 1.
- 2-dimensional array with 5 rows and 3 columns, all containing the value 2.
- 3-dimensional array of shape (2, 5, 10), all containing the value 0.
- 5-dimensional array of shape (1, 2, 3, 4, 5), all containing the value 42.

In [None]:
np.full?
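**Example:** A brief sketch of how the shape argument works for these three functions. The shapes and fill value here are arbitrary and differ from the ones in the exercise.

```python
import numpy as np

# A single integer gives a 1-dimensional array of that length.
zeros_1d = np.zeros(3)              # array([0., 0., 0.])

# A tuple gives one entry per dimension: here 2 rows and 2 columns.
ones_2d = np.ones((2, 2))

# np.full takes the shape first and the fill value second.
sevens = np.full((2, 3), 7)         # 2 rows, 3 columns, every entry is 7
```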

## `identity`

**Exercise:** Construct a 5 x 5 array with 1s along the diagonal and zeros everywhere else.

In [None]:
np.identity?

## Exercise: `random`

In [None]:
rng = np.random.RandomState(seed=42)

Construct an array containing 10 values drawn uniformly at random from the interval `[-1, 1]`.

In [None]:
rng.uniform?

Construct a 3 x 3 array with values drawn from a normal distribution centered at 0 with a standard deviation of 2.5.

In [None]:
rng.normal?
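**Example:** A hedged sketch of the keyword arguments these two methods accept, using a separate generator (the name `demo_rng` and all parameter values are illustrative only).

```python
import numpy as np

# A separate generator with its own seed, so the notebook's `rng` is untouched.
demo_rng = np.random.RandomState(seed=0)

# `low` and `high` set the interval; `size` sets the shape of the result.
uniform_draws = demo_rng.uniform(low=0, high=10, size=4)

# `loc` is the mean and `scale` is the standard deviation.
normal_draws = demo_rng.normal(loc=5, scale=0.1, size=(2, 2))
```

Seeding the generator makes the draws repeatable: building another `RandomState` with the same seed and making the same calls yields the same values.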

## `pandas.read_csv`

Many people use the `pandas` module to read numerical data from external sources. The `.csv` (comma-separated values) format is often used for small and medium-sized datasets.

In [None]:
import pandas as pd

We can read a CSV into a DataFrame using `pandas.read_csv`.

In [None]:
prices = pd.read_csv('prices.csv', index_col='dt', parse_dates=['dt'])
prices.head()
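To see what `index_col` and `parse_dates` do without needing a file on disk, here is a minimal sketch that reads CSV text from memory. The column names `AAA` and `BBB` and the values are made up for illustration and are not taken from `prices.csv`.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for a file on disk.
csv_text = """dt,AAA,BBB
2020-01-02,1.0,10.0
2020-01-03,1.5,11.0
"""

# index_col='dt' uses the dt column as row labels instead of 0, 1, 2, ...
# parse_dates=['dt'] converts those labels from strings to timestamps.
example = pd.read_csv(io.StringIO(csv_text), index_col='dt', parse_dates=['dt'])

example.index    # a DatetimeIndex named 'dt'
example.shape    # (2, 2): the dt column became the index, not a data column
```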

DataFrames are composed of three parts:

- `index`, an array of row-labels
- `columns`, an array of column-labels
- `values`, an array of table values.

We can get a numpy array for each of these attributes by using the `.values` attribute:

In [None]:
prices.index.values

In [None]:
prices.columns.values

In [None]:
prices.values
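The relationship between the three parts can be sketched on a small DataFrame built in memory. The labels and numbers below are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical table built in memory.
frame = pd.DataFrame(
    [[1.0, 10.0], [1.5, 11.0]],
    index=pd.to_datetime(['2020-01-02', '2020-01-03']),
    columns=['AAA', 'BBB'],
)

row_labels = frame.index.values       # one entry per row
column_labels = frame.columns.values  # one entry per column
table = frame.values                  # 2-dimensional, shape (rows, columns)

# The table's shape matches the lengths of the two label arrays.
shapes_agree = table.shape == (len(row_labels), len(column_labels))
```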

**Exercise:** Use `pd.read_csv` to load the file "volumes.csv".

**Exercise:** Get a numpy array of datetimes representing the row-labels of the DataFrame.

**Exercise:** Get a numpy array of strings representing the column-labels of the DataFrame.

**Exercise:** Get a numpy array of floats representing the table values of the DataFrame.

## Reshaping Arrays

Once we've created or loaded an array, a common next step is to reshape the array.

The most general way to reshape an array is to use the `.reshape` method of `ndarray`. `.reshape` accepts a tuple of new dimensions and returns an array of that shape containing the same elements.
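**Example:** A minimal sketch of `.reshape` on a small array (the name `flat` and the shapes are illustrative). It also shows that the new shape must hold exactly as many elements as the original.

```python
import numpy as np

flat = np.arange(6)                  # array([0, 1, 2, 3, 4, 5])

# Reshape to 2 rows and 3 columns; elements fill each row left to right.
grid = flat.reshape((2, 3))          # array([[0, 1, 2], [3, 4, 5]])

# The dimensions can also be passed as separate arguments.
same_grid = flat.reshape(2, 3)

# The new shape must hold exactly the same number of elements.
try:
    flat.reshape((4, 2))
except ValueError as error:
    message = str(error)             # 6 elements cannot fill a 4 x 2 array
```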

In [None]:
data = np.arange(12)

**Exercise:** Reshape `data` into an array with three rows and four columns.

In [None]:
data.reshape?

**Exercise:** Reshape `data` into an array with four rows and three columns.

**Exercise:** Reshape `data` into an array of shape `(2, 2, 3)`.

## Transpose

A common pattern, especially when doing linear algebra with 2D arrays, is to need to swap an array's rows and columns, which flips it across its main diagonal. This operation is commonly known as "transposing" the array.

**Exercise:** Use the `.transpose()` method to convert `data` (defined in the next cell) from a `2 x 4` array into a `4 x 2` array.

In [None]:
data = np.arange(8).reshape(2, 4)
data

In [None]:
data.transpose?

Transposing arrays is so common in linear algebra that numpy provides a shorthand for it. The `.T` property provides a transposed view of an array.
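What "view" means here can be sketched with a small array (the name `matrix` is illustrative): the transpose shares memory with the original, so a write through one is visible through the other.

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)  # hypothetical 2 x 3 array

# .T and .transpose() produce the same result.
same = np.array_equal(matrix.T, matrix.transpose())

# Element [i, j] of the original is element [j, i] of the transpose.
swapped = matrix[0, 2] == matrix.T[2, 0]

# The transpose is a view: it shares memory with the original,
# so writing through one is visible through the other.
flipped = matrix.T
flipped[0, 0] = 100
```

After the last line, `matrix[0, 0]` is also 100. Call `.copy()` on the transpose if you need an independent array.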

**Exercise:** Transpose `data` using the `.T` property.

## Measuring Performance of Numpy vs. Pure Python

We've seen that numpy allows us to run simple numerical computations much faster than pure Python. To show that, we used a few different functions and tools:

- We used a dot product function implemented in pure Python:

```python
def python_dot_product(xs, ys):
    return sum(x * y for x, y in zip(xs, ys))
```
- We used two numpy implementations of `dot_product`:

```python
def manual_numpy_dot(xs, ys):
    return (xs * ys).sum()

def native_numpy_dot(xs, ys):
    return xs.dot(ys)
```

- We used IPython's `%%timeit` magic as a simple way to measure how long a cell takes to run on average.

Unfortunately, nothing in programming comes for free. Numpy allows us to speed up computations on large arrays by performing one complex dispatch **per array** instead of a cheap dispatch **per array element**. **This only gives us a speedup if we have many array elements.**
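Outside of IPython, the same kind of measurement can be sketched with the standard library's `timeit` module. The input size below is arbitrary and deliberately tiny; the sketch shows the method only and makes no claim about which implementation wins at any given size.

```python
import timeit
import numpy as np

def python_dot_product(xs, ys):
    return sum(x * y for x, y in zip(xs, ys))

# A deliberately tiny input (3 elements), chosen only to illustrate the method.
small_list = [1, 2, 3]
small_array = np.array(small_list)

# Both implementations agree on the answer: 1*1 + 2*2 + 3*3 = 14.
python_result = python_dot_product(small_list, small_list)
numpy_result = small_array.dot(small_array)

# timeit.timeit runs the callable `number` times and returns total seconds.
# Inputs are built above, so construction time is not included.
python_seconds = timeit.timeit(
    lambda: python_dot_product(small_list, small_list), number=1000)
numpy_seconds = timeit.timeit(
    lambda: small_array.dot(small_array), number=1000)
```

Comparing `python_seconds` and `numpy_seconds` across several input sizes shows where the per-array overhead stops dominating.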

### Exercise:
Using the `%%timeit` magic, figure out how many data points you need to have for a numpy dot product to be faster than a pure-Python implementation.

You can use the `make_list` function below to create Python lists of a given size. Use any of the functions from the exercises above to make numpy arrays. Be sure not to include the list/array creation in your timings (that probably means you want to use separate cells for constructing arrays and testing timings).

In [None]:
def make_list(size):
    return list(range(size))

def python_dot_product(xs, ys):
    return sum(x * y for x, y in zip(xs, ys))

def manual_numpy_dot(xs, ys):
    return (xs * ys).sum()

def native_numpy_dot(xs, ys):
    return xs.dot(ys)