Robert Corbett  -  
rwc27@pitt.edu  -  
9/6/2017  -  
Todo2 numpy notes

The numpy library must be imported before it can be used.
The 'as' keyword can give the library a shorter alias, conventionally np.

In [1]:
import numpy as np

When a numpy array is created, all elements must be of the same data type.
If the elements are not the same data type, numpy will 'upcast' all elements so no information is lost.
The following example shows that if one element is a float while the rest are ints, all the elements are upcast to float.

In [2]:
upcast_ex = np.array([1.22,4,2,5,3])

In [3]:
print(upcast_ex)

[ 1.22  4.    2.    5.    3.  ]


The dtype argument can be passed to set the data type the array will contain.
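
A minimal sketch of checking and setting the data type. The variable names are made up for illustration; astype() is a standard array method that these notes do not otherwise cover.

```python
import numpy as np

# One float among ints: NumPy picks a dtype that can hold every element.
mixed = np.array([1.22, 4, 2, 5, 3])
print(mixed.dtype)            # float64

# dtype= forces the element type at creation time.
ints_as_float = np.array([1, 2, 3], dtype=float)
print(ints_as_float.dtype)    # float64

# astype() returns a converted copy; float -> int drops the fractional part.
truncated = mixed.astype(int)
print(truncated)              # [1 4 2 5 3]
```

Note that converting down to int loses information (1.22 becomes 1), which is why numpy upcasts by default instead.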

Numpy has functions for creating arrays with all elements initialized to zero (np.zeros) or one (np.ones).

np.full(x,y) will create an array of shape x filled with value y.

In [4]:
all_zeros = np.zeros(10, dtype=int)

In [5]:
print(all_zeros)

[0 0 0 0 0 0 0 0 0 0]


In [6]:
all_ones = np.ones(5, dtype=float)

In [7]:
print(all_ones)

[ 1.  1.  1.  1.  1.]


In [8]:
pi_array = np.full((4,4), 3.14)

In [9]:
print(pi_array)

[[ 3.14  3.14  3.14  3.14]
 [ 3.14  3.14  3.14  3.14]
 [ 3.14  3.14  3.14  3.14]
 [ 3.14  3.14  3.14  3.14]]


The function np.arange(x,y,z) creates an array from x (inclusive) to y (exclusive) stepping by z.

In [10]:
arange_ex = np.arange(0, 12, 4)

In [11]:
print(arange_ex)

[0 4 8]


The function np.linspace(x,y,z) creates an array of z evenly spaced values from x (inclusive) to y (inclusive).

In [12]:
linspace_ex = np.linspace(3, 9, 13)

In [13]:
print(linspace_ex)

[ 3.   3.5  4.   4.5  5.   5.5  6.   6.5  7.   7.5  8.   8.5  9. ]


Numpy has functions for filling an array with random values.
np.random.random fills the array with random floats in the half-open interval [0, 1), so 1 itself is never returned.

In [14]:
rand_ex = np.random.random(3)

In [15]:
print(rand_ex)

[ 0.55058913  0.34920854  0.50784876]
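
The values above change on every run. A sketch of getting repeatable random values, assuming NumPy 1.17 or newer (which provides np.random.default_rng); the seed 42 and the variable names are arbitrary choices for illustration.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

sample_a = rng_a.random(3)
sample_b = rng_b.random(3)

print(np.array_equal(sample_a, sample_b))            # True

# Every value lies in the half-open interval [0, 1).
in_range = bool(((sample_a >= 0) & (sample_a < 1)).all())
print(in_range)                                      # True
```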


Use np.empty() to create an uninitialized array, whose elements hold whatever values already happen to be in that memory location.

In [16]:
empty_ex = np.empty((3,3))

In [17]:
print(empty_ex)

[[  0.00000000e+000   0.00000000e+000   0.00000000e+000]
 [  0.00000000e+000   0.00000000e+000   6.06712613e-321]
 [  3.94691217e-312   0.00000000e+000   2.56021474e+029]]


NumPy array attributes.

In [18]:
sample_array = np.array(([1,2,3],[4,5,6],[7,8,9]))

ndim: number of dimensions

In [19]:
print("ndim: ", sample_array.ndim)

ndim:  2


shape: the size of each dimension

In [20]:
print("shape: ", sample_array.shape)

shape:  (3, 3)


size: the total size of the array

In [21]:
print("size: ", sample_array.size)

size:  9


dtype: the data type of the array's elements

In [22]:
print("dtype: ", sample_array.dtype)

dtype:  int32


itemsize: the size of each element in bytes

In [23]:
print("itemsize: ", sample_array.itemsize)

itemsize:  4


nbytes: the size of the entire array in bytes

In [24]:
print("nbytes: ", sample_array.nbytes)

nbytes:  36
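
The attributes are related to each other. A small sketch of those relations, with the dtype fixed to int32 as an assumption so the numbers do not depend on the platform's default integer type (the name grid is made up):

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.int32)

# ndim is the length of the shape tuple.
print(grid.ndim == len(grid.shape))                  # True: 2 == 2

# size is the product of the sizes of each dimension.
print(grid.size == grid.shape[0] * grid.shape[1])    # True: 9 == 3 * 3

# nbytes is the element count times the bytes per element.
print(grid.nbytes == grid.size * grid.itemsize)      # True: 36 == 9 * 4
```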


Array Indexing

NumPy arrays are indexed like regular Python lists.  Indexing starts at 0.  Arrays can also be indexed from the end, starting with -1.  Multi-dimensional arrays take one comma-separated index per dimension.

In [25]:
one_d = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

In [26]:
print(one_d[0])

0


In [27]:
print(one_d[9])

9


In [28]:
print(one_d[-1])

10


In [29]:
print(one_d[-10])

1


In [30]:
two_d = np.array(([11, 12, 13], [21, 22, 23]))

In [31]:
print(two_d[0, -1])

13


NumPy arrays can also be sliced like regular Python lists.

x[start:stop:step], where start is inclusive and stop is exclusive.  Any of the three may be omitted.

In [32]:
print(one_d[0:9:2])

[0 2 4 6 8]


In [33]:
print(one_d[-10:-1])

[1 2 3 4 5 6 7 8 9]


In [34]:
print(one_d[:6])

[0 1 2 3 4 5]


In [35]:
print(one_d[5:])

[ 5  6  7  8  9 10]


In [36]:
print(one_d[::2])

[ 0  2  4  6  8 10]
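
The same start:stop:step notation works per dimension on a 2-D array, and a negative step walks backwards. A brief sketch that redefines the arrays from above so it runs on its own:

```python
import numpy as np

one_d = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
two_d = np.array([[11, 12, 13], [21, 22, 23]])

first_row = two_d[0, :]       # all columns of row 0
first_col = two_d[:, 0]       # all rows of column 0
last_two_cols = two_d[:, 1:]  # every row, columns 1 and 2
reversed_1d = one_d[::-1]     # step of -1 reverses the array

print(first_row)              # [11 12 13]
print(first_col)              # [11 21]
print(last_two_cols.shape)    # (2, 2)
print(reversed_1d[0])         # 10
```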


One important thing to know about numpy arrays: the array slices that are returned are 'views' of the data, not copies.  Changing a slice changes the original array.
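
A short sketch of the view behaviour. The names base, window and independent are made up; np.shares_memory is a NumPy helper used here only to confirm that the slice and the original use the same memory.

```python
import numpy as np

base = np.arange(6)            # [0 1 2 3 4 5]
window = base[1:4]             # a view onto elements 1..3, not a copy

window[0] = 99                 # writes through to base
print(base)                    # [ 0 99  2  3  4  5]
print(np.shares_memory(base, window))   # True

# A slice followed by copy() is independent of the original.
independent = base[1:4].copy()
independent[0] = -1
print(base[1])                 # 99, unchanged by the write to the copy
```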

To copy an array, use the copy() method.  x_array = y_array does not copy the array; it just creates another reference to the original array.

In [37]:
y_array = np.array([1, 2, 3])

In [38]:
x_array = y_array

In [39]:
print(y_array)

[1 2 3]


In [40]:
print(x_array)

[1 2 3]


In [41]:
x_array[0] = 6

In [42]:
print(y_array)

[6 2 3]


In [43]:
print(x_array)

[6 2 3]


In [44]:
x_array = y_array.copy()

In [45]:
y_array[0] = 1

In [46]:
print(y_array)

[1 2 3]


In [47]:
print(x_array)

[6 2 3]


Concatenate arrays using np.concatenate([x,y])

In [48]:
print(np.concatenate([x_array, y_array]))

[6 2 3 1 2 3]


Computation on NumPy Arrays: Universal Functions

Some default Python operations are very slow, because Python is a dynamic, interpreted language.  A common case is when many small operations are repeated, such as in a loop over the elements of an array.

UFuncs (universal functions) - NumPy's interface for fast element-wise operations on statically typed arrays.  The native arithmetic operators (+, -, *, /, %) are used as before; on arrays, NumPy applies the corresponding UFuncs.

Other UFuncs include np.abs, trig functions (np.sin(), np.cos(), ...), and exponents and logarithms (np.exp(), np.log()).

NumPy has many other special functions that speed up code by acting on whole arrays of a single data type.
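
A minimal sketch comparing an explicit Python loop with the equivalent UFunc expression. The array values and variable names are made up for illustration.

```python
import numpy as np

values = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

# Pure-Python loop: one interpreted operation per element.
looped = np.empty_like(values)
for i in range(len(values)):
    looped[i] = abs(values[i]) * 2 + 1

# Same computation with UFuncs; the loop runs in compiled code.
vectorized = np.abs(values) * 2 + 1
print(np.array_equal(looped, vectorized))               # True

# The operators map onto UFuncs: + is np.add, * is np.multiply.
print(np.array_equal(values + 1, np.add(values, 1)))    # True

# np.exp and np.log are inverses, up to floating-point rounding.
positive = np.array([1.0, 2.0, 4.0])
round_trip = np.log(np.exp(positive))
print(np.allclose(round_trip, positive))                # True
```

On five elements the speed difference is negligible; the vectorized form pays off as the array grows.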

Aggregations: Min, Max and Everything In Between

NumPy has built-in aggregation functions for working on arrays.

In [49]:
rand_array = np.random.random(10)

In [50]:
print(rand_array)

[ 0.85746377  0.57999008  0.56109503  0.08595028  0.99823183  0.34590028
  0.23357182  0.98740457  0.30599482  0.67223864]


Summing the elements of an array: Python's built-in sum() compared to np.sum()

In [51]:
print(sum(rand_array))

5.62784112411


In [52]:
%timeit(sum(rand_array))

2.38 µs ± 205 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [53]:
%timeit(np.sum(rand_array))

2.72 µs ± 42 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


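In this run np.sum was about 0.34 µs slower per loop than the built-in sum (2.72 µs vs 2.38 µs).  With only 10 elements, the fixed call overhead of np.sum dominates; its advantage typically shows up on much larger arrays.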
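
A sketch of the other aggregations, plus the same timing comparison on a larger array. It uses the standard-library timeit module in place of the %timeit magic so it runs outside a notebook. The array contents and the size of 100,000 are arbitrary assumptions, and no timings are quoted because they vary by machine.

```python
import timeit
import numpy as np

scores = np.array([3.0, 1.5, 4.0, 1.0, 5.5])

print(np.min(scores), np.max(scores))   # 1.0 5.5
print(np.sum(scores))                   # 15.0
print(np.mean(scores))                  # 3.0

# Both sums give the same value; only the speed differs.
big = np.ones(100_000)
same_result = sum(big) == np.sum(big)
print(same_result)                      # True

# Time 5 calls of each; compare the two numbers on your own machine.
t_builtin = timeit.timeit(lambda: sum(big), number=5)
t_numpy = timeit.timeit(lambda: np.sum(big), number=5)
print("built-in sum:", t_builtin, "s   np.sum:", t_numpy, "s")
```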

For arrays of the same size, binary operations are performed on an element-by-element basis.

In [54]:
a = np.array([1, 2, 3, 4])

In [55]:
b = np.array([5, 6, 7, 8])

In [56]:
print(a + b)

[ 6  8 10 12]


Broadcasting allows these operations to be performed on arrays of different sizes.

In [57]:
print(a + 5)

[6 7 8 9]


Similar broadcasting can be done with multi-dimensional arrays.

In [58]:
M = np.ones((3, 4))

In [59]:
print(M)

[[ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]]


In [60]:
print(M + a)

[[ 2.  3.  4.  5.]
 [ 2.  3.  4.  5.]
 [ 2.  3.  4.  5.]]


Rules of Broadcasting

Rule 1: if the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading side.

In [61]:
M = np.zeros((2,3))

In [62]:
print(M)

[[ 0.  0.  0.]
 [ 0.  0.  0.]]


In [63]:
a = np.arange(3)

In [64]:
print(a)

[0 1 2]


In [65]:
print(M.shape)

(2, 3)


In [66]:
print(a.shape)

(3,)


So when M and a are used in a binary operation, the shape of a is padded with a 1 on the leading side, i.e. a.shape is treated as (1, 3).

In [67]:
print(M + a)

[[ 0.  1.  2.]
 [ 0.  1.  2.]]


Rule 2: if the shapes of the two arrays do not match in some dimension, the array whose size is 1 in that dimension is stretched to match the other shape.

In [68]:
a = np.ones((3, 1))

In [69]:
print(a)

[[ 1.]
 [ 1.]
 [ 1.]]


In [70]:
b = np.arange(3)

In [71]:
print(b)

[0 1 2]


In [72]:
print(a.shape)

(3, 1)


In [73]:
print(b.shape)

(3,)


So, again by rule 1, b is treated as shape (1, 3).

Rule 2 then stretches each of these ones to match the corresponding size of the other array, i.e. both shapes are treated as (3, 3).

In [74]:
print(a + b)

[[ 1.  2.  3.]
 [ 1.  2.  3.]
 [ 1.  2.  3.]]
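
The stretching can be made explicit with np.broadcast_to, which returns a read-only view of an array broadcast to a given shape. A sketch under the assumption that the two arrays are the same as in the example above; the *_stretched names are made up.

```python
import numpy as np

a = np.ones((3, 1))
b = np.arange(3)

# Stretch both operands to the common shape by hand.
a_stretched = np.broadcast_to(a, (3, 3))
b_stretched = np.broadcast_to(b, (3, 3))
print(a_stretched.shape, b_stretched.shape)   # (3, 3) (3, 3)

# Adding the stretched arrays matches what a + b does implicitly.
same = np.array_equal(a + b, a_stretched + b_stretched)
print(same)                                   # True
```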


Rule 3: if in any dimension the sizes disagree and neither is equal to 1, an error is raised.

In [75]:
M = np.ones((3, 2))

In [76]:
print(M)

[[ 1.  1.]
 [ 1.  1.]
 [ 1.  1.]]


In [77]:
a = np.arange(3)

In [78]:
print(a)

[0 1 2]


In [79]:
print(M.shape)

(3, 2)


In [80]:
print(a.shape)

(3,)


By rule 1 the shape of a is padded to (1, 3), and by rule 2 it is stretched to (3, 3), but the shape of M remains (3, 2).

The final shapes (3, 2) and (3, 3) do not match, so by rule 3 the operation M + a raises an error (a ValueError).
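
A sketch of what the failure looks like, and one possible way around it. Reshaping a with np.newaxis is an assumption about what the intended result is (adding a down the rows of M); it is not the only fix.

```python
import numpy as np

M = np.ones((3, 2))
a = np.arange(3)

# (3, 2) and (3,) are incompatible, so the addition raises ValueError.
failed = False
try:
    M + a
except ValueError as err:
    failed = True
    print("broadcast failed:", err)

# Adding a trailing axis gives a the shape (3, 1), which rule 2 can
# stretch to (3, 2).
column = a[:, np.newaxis]
result = M + column
print(column.shape)    # (3, 1)
print(result.shape)    # (3, 2)
```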