# NumPy basics: arrays and vectorized computation

This chapter focuses on NumPy (Numerical Python) and its n-dimensional array, the ndarray. As in R, many NumPy functions are vectorized, so there is often no need to write an explicit for loop. Also as in base R and more recent R packages (e.g., purrr), most of the heavy lifting is done by compiled C code behind a thin wrapper. NumPy arrays are stored in contiguous blocks of memory, and, much like the apply and map functions in R, the wrapper performs the type checking before handing the data off to the C code.

In [3]:
# Setup
import numpy as np

## The NumPy ndarray

In [30]:
# Generate a small array of random data
# Note: np.random.randn returns samples from the standard normal distribution
# In the 2D case the arguments are rows, then columns: (2, 3) gives 2 rows and 3 columns
data = np.random.randn(2, 3)
data

array([[ 0.39479564,  1.43750826,  0.16596883],
       [ 0.7328657 ,  0.55998296, -0.92943363]])

In [31]:
# Performing different operations without loops
data * 10

array([[ 3.94795636, 14.3750826 ,  1.65968828],
       [ 7.32865703,  5.59982956, -9.29433631]])
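To make the "without loops" point concrete, here is a minimal sketch comparing an explicit Python loop with the vectorized expression. The array `values` and the helper `scale_with_loop` are hypothetical names introduced only for this illustration; they are not part of NumPy or of the chapter's own data.

```python
import numpy as np

# Hypothetical example data (not from the chapter): a small 2 x 3 array
values = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

def scale_with_loop(a, factor):
    """Multiply every element of a 2D array by `factor` using explicit loops."""
    out = np.empty_like(a)
    for i in range(a.shape[0]):        # iterate over rows
        for j in range(a.shape[1]):    # iterate over columns
            out[i, j] = a[i, j] * factor
    return out

looped = scale_with_loop(values, 10)
vectorized = values * 10  # same result, no explicit loop

print(np.array_equal(looped, vectorized))
```

Both approaches should produce the same array; the vectorized form is shorter and pushes the looping down into compiled code.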

In [34]:
# NumPy arrays are very similar to R matrices in that they are containers for homogeneous data
# Look at the different attributes for ndarrays
print(f'Dimensions of the array:\n{data.shape}\n')
print(f'Data type in the array:\n{data.dtype}\n')

Dimensions of the array:
(2, 3)

Data type in the array:
float64



### Creating ndarrays

In [35]:
# Create a simple array from a list of numbers
nums = [6, 7.5, 8, 0, 1]
arr1 = np.array(nums)
arr1

array([6. , 7.5, 8. , 0. , 1. ])

In [40]:
# Look at the different shape and dimension attributes
print(f'Array shape:\n{arr1.shape}\n')
print(f'# of array dimensions:\n{arr1.ndim}\n')

Array shape:
(5,)

# of array dimensions:
1



In [46]:
# Create an array of 0's
zero_arr = np.zeros((10, 10))
zero_arr

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

### Data types for ndarrays

Array-creation functions such as np.zeros default to float64, and np.array infers a data type from its input, but there are many other data types for fine-grained control over how data are stored and interpreted.
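As a quick check of how a data type gets chosen when none is given, the sketch below inspects a few arrays. The variable names (`ints`, `mixed`, `zeros`) are illustrative assumptions; the exact integer width reported for `ints` can differ by platform and NumPy version, so only its kind is checked here.

```python
import numpy as np

# np.array infers the dtype from the values it is given
ints = np.array([1, 2, 3])       # all integers -> an integer dtype
mixed = np.array([1, 2.5, 3])    # a single float upcasts every element to float

# Creation functions such as np.zeros default to float64
zeros = np.zeros(3)

print(ints.dtype.kind)   # 'i' means signed integer; the width is platform dependent
print(mixed.dtype)       # float64
print(zeros.dtype)       # float64
```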

In [51]:
# Create a new array of integers
arr = np.array([1, 2, 3, 4], dtype = np.int32)

In [52]:
arr.dtype

dtype('int32')

In [54]:
# Create with float 64
arr = np.array([1, 2, 3, 4], dtype = np.float64)
arr.dtype

dtype('float64')

In [55]:
# Cast back to integer 32
arr.astype(np.int32)

array([1, 2, 3, 4], dtype=int32)
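Casting is lossless in the cell above because every value is a whole number. The sketch below, using a hypothetical array `floats`, shows two details worth knowing: `astype` returns a new array rather than modifying the original, and casting floats to integers drops the fractional part instead of rounding.

```python
import numpy as np

# Hypothetical values chosen to make truncation visible
floats = np.array([3.7, -1.2, 0.5, 10.9])

# astype returns a new array; the fractional part is truncated toward zero
as_ints = floats.astype(np.int32)

print(as_ints.tolist())   # [3, -1, 0, 10]
print(floats.dtype)       # float64 (the original array is unchanged)
```

If rounding is what you want, apply `np.round` before casting.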

### Arithmetic with NumPy arrays

Like R, NumPy operations are vectorized: multiplying an array by a constant (e.g., 2) multiplies each element of the array by 2, and the same holds for the other arithmetic operators.
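A minimal sketch of element-wise arithmetic, using a hypothetical array `arr2d` that is not part of the chapter's data:

```python
import numpy as np

# Hypothetical 2 x 3 array for illustration
arr2d = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

doubled = arr2d * 2           # the scalar is applied to every element
squared = arr2d * arr2d       # element-wise product, not matrix multiplication
difference = arr2d - arr2d    # element-wise subtraction gives all zeros

print(doubled.tolist())       # [[2.0, 4.0, 6.0], [8.0, 10.0, 12.0]]
print(squared.tolist())       # [[1.0, 4.0, 9.0], [16.0, 25.0, 36.0]]
print(difference.tolist())    # [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
```

Note that `*` between two arrays of the same shape multiplies matching elements, much as it does for two matrices in R.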

### Basic indexing and slicing