# Python for Data: NumPy Arrays

Python's built-in data structures are great for general-purpose programming, but they lack some specialized features we'd like for data analysis. For example, adding rows or columns of data in an element-wise fashion and performing math operations on two-dimensional tables are common tasks that aren't readily available with Python's base data types.

# NumPy and Array Basics
NumPy implements a data structure called the N-dimensional array, or ndarray. ndarrays are similar to lists in that they contain a collection of items that can be accessed via indexes. Unlike lists, however, ndarrays are homogeneous, meaning they can only contain objects of the same type, and they can be multi-dimensional, making it easy to store 2-dimensional tables or matrices.

To work with ndarrays, we need to import the NumPy library. It is standard practice to import numpy with the alias "np" like so:

In [1]:
import numpy as np

In [2]:
#Create an ndarray by passing a list to the np.array() function:
my_list = [1,2,3,4]            #Define a list
my_array = np.array(my_list)   #Pass the list to np.array()
type(my_array)                 #Check the object's type


numpy.ndarray
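
As a quick illustration of the element-wise behavior mentioned in the introduction, the sketch below applies the same operators to a plain list and to an ndarray. The variable names are reused from the cell above; the result names (list_sum, array_sum, and so on) are made up for this example.

```python
import numpy as np

my_list = [1, 2, 3, 4]
my_array = np.array(my_list)

# With lists, + concatenates and * repeats
list_sum = my_list + my_list          # [1, 2, 3, 4, 1, 2, 3, 4]
list_times_two = my_list * 2          # [1, 2, 3, 4, 1, 2, 3, 4]

# With ndarrays, + and * operate element by element
array_sum = my_array + my_array       # [2 4 6 8]
array_times_two = my_array * 2        # [2 4 6 8]

print(list_sum)
print(array_sum)
```

The list results have 8 items each, while the array results keep the original length of 4.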

In [3]:
#To create an array with more than one dimension, pass a nested list to np.array():
second_list = [5,6,7,8]
two_d_array = np.array([my_list, second_list])
print(two_d_array)

[[1 2 3 4]
 [5 6 7 8]]


An ndarray is defined by the number of dimensions it has, the size of each dimension and the type of data it holds. Check the number and size of dimensions of an ndarray with the shape attribute.

In [4]:
two_d_array.shape

(2, 4)

The output above shows that this ndarray is 2-dimensional, since there are two values listed, and the dimensions have length 2 and 4.

Check the total number of items in an ndarray with the size attribute:

In [5]:
two_d_array.size

8
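
To see how shape and size relate, the sketch below builds arrays with one, two and three dimensions and prints both attributes for each. It also uses the ndim attribute, which is not covered in the text above and simply reports the number of dimensions. The array names are chosen for this example only.

```python
import numpy as np

one_d = np.array([1, 2, 3, 4])
two_d = np.array([[1, 2, 3, 4],
                  [5, 6, 7, 8]])
three_d = np.array([[[1, 2], [3, 4]],
                    [[5, 6], [7, 8]]])

for arr in (one_d, two_d, three_d):
    # ndim equals len(shape); size equals the product of the shape values
    print(arr.ndim, arr.shape, arr.size)
```

A 1-dimensional array has a shape with a single value, such as (4,), and size is always the product of the lengths listed in shape.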

Check the type of the data in an ndarray with the dtype attribute:

In [6]:
two_d_array.dtype

dtype('int64')
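
Because ndarrays are homogeneous, NumPy picks one type that can hold every item passed in. The sketch below shows this with a list that mixes integers and a float, and also shows the dtype argument of np.array() for requesting a type explicitly. The array names are illustrative. The default integer type can differ by platform and NumPy version, so the example only checks that it is some integer type.

```python
import numpy as np

ints = np.array([1, 2, 3])                         # All integers
mixed = np.array([1, 2.5, 3])                      # Integers are upcast to float
explicit = np.array([1, 2, 3], dtype=np.float32)   # Type requested explicitly

print(ints.dtype.kind)    # 'i' marks a signed integer type
print(mixed.dtype)        # float64
print(explicit.dtype)     # float32
```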

NumPy has a variety of special array creation functions. Some handy ones include:

In [7]:
#np.identity() to create a square 2d array with 1's across the diagonal
np.identity(n=10)    #Size of the array

array([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]])
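
A smaller identity array is easier to inspect. The sketch below builds a 3x3 one and also uses the optional dtype argument of np.identity() to get integers instead of the default floats. The variable names are made up for this example.

```python
import numpy as np

small_identity = np.identity(n=3)             # Floats by default
int_identity = np.identity(n=3, dtype=int)    # Request integer entries

print(small_identity)
print(int_identity)
```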