## 1) Getting started with NumPy

1) NumPy arrays can be created in a number of ways. One of the simplest is the np.array() function.

2) We pass a list of lists to np.array(); the inner lists should all have the same length.

3) Each inner list becomes a row of the array, and its elements populate the columns of that row.

4) Because the nesting depth in the cell below is two, the resulting array is two-dimensional, so a single element can be indexed with a pair of integers.

In [1]:
import numpy as np
x = np.array(
    [
        [1,2,3],
        [2,3,4]
    ]
)
print(x)

[[1 2 3]
 [2 3 4]]
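
As a minimal sketch of the point about indexing with two integers, the cell below rebuilds the same array and reads one element and one row from it. The variable names `first_row_second_col` and `second_row` are illustrative only and do not appear elsewhere in this notebook.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [2, 3, 4]])

# Two integers select a single element: row index first, then column index.
first_row_second_col = x[0, 1]
print(first_row_second_col)  # 2

# A single integer selects a whole row, which is itself a 1-D array.
second_row = x[1]
print(second_row)  # [2 3 4]
```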


In [2]:

# Get the number of dimensions of the array
def get_dimensions(arr):
    return arr.ndim

print(get_dimensions(x))

2


In [3]:

# Get the shape of the array
def get_shape(arr):
    return arr.shape

print(get_shape(x))

(2, 3)



## 2) Array indexing and slicing
1) Indexing NumPy arrays is very similar to indexing lists or tuples.

Start by creating an array of shape 100 x 100:

In [4]:
x = np.random.random((100, 100))
# Get the value at row index 42 and column index 87 (indices start at 0)
print(x[42, 87])

# Print the row at index 5 of x, i.e. the sixth row
print(x[5, :])

0.8322376255169754
[0.28827779 0.18333284 0.51602473 0.80540336 0.16620221 0.51675642
 0.22875876 0.71806177 0.07373745 0.58014316 0.58102138 0.36944622
 0.25830443 0.50036783 0.54033875 0.6320219  0.03377041 0.93208106
 0.81645902 0.33362042 0.31832955 0.91093155 0.19793072 0.40271133
 0.24947846 0.85744852 0.32409706 0.66470956 0.89553756 0.22849306
 0.09312623 0.66495211 0.32982682 0.27266983 0.01650151 0.42332553
 0.09928792 0.75442452 0.02506674 0.23124316 0.72534917 0.14728873
 0.2351095  0.08517124 0.31662342 0.6574294  0.25648767 0.98764207
 0.83544678 0.74728383 0.51616173 0.89539985 0.03904205 0.75025119
 0.76082245 0.59570489 0.72814062 0.46289996 0.31242527 0.98984584
 0.13295097 0.8399446  0.66001393 0.46608922 0.13469832 0.65816456
 0.03834099 0.10098593 0.17988976 0.68043123 0.19539625 0.25339605
 0.53455816 0.68084495 0.27495861 0.46547353 0.74071541 0.28188631
 0.31367541 0.31474896 0.9670067  0.15858175 0.22502195 0.57030446
 0.61630769 0.87946337 0.14255022 0.1962025
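
The cell above shows integer indexing and a full-row slice. As a small sketch of slicing more generally, the example below uses a 3 x 4 array with known values so each result can be checked by eye. The array `a` and the other names are introduced here purely for illustration.

```python
import numpy as np

# A small array with known values:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
a = np.arange(12).reshape(3, 4)

column_1 = a[:, 1]       # every row, column index 1 -> values 1, 5, 9
top_left = a[:2, :2]     # first two rows and columns -> rows [0 1] and [4 5]
every_other = a[0, ::2]  # row index 0, every second column -> values 0, 2

print(column_1)
print(top_left)
print(every_other)
```

As with lists, a slice `start:stop:step` excludes the `stop` index, and an omitted bound defaults to the beginning or end of that axis.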

## 3) Memory layout of ndarray
The memory layout of a NumPy array can be inspected through the flags attribute of the object, as the following cell shows:

In [5]:
print(x.flags)

  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
  UPDATEIFCOPY : False



The C_CONTIGUOUS flag in the output indicates whether the array is stored as a single contiguous C-style block of memory. In such an array the elements of each row sit next to each other in memory, so the last index varies fastest. For 2D arrays this is also called row-major order.

Similarly, the F_CONTIGUOUS flag indicates whether the array is stored as a contiguous Fortran-style block. In such an array the elements of each column sit next to each other in memory, which is called column-major order.
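
One way to see the difference between the two layouts is to compare the `strides` attribute, which gives the number of bytes to step in memory to reach the next element along each axis. The sketch below assumes a small 3 x 4 array of `float64` values (8 bytes each); `c_arr` and `f_arr` are illustrative names.

```python
import numpy as np

c_arr = np.zeros((3, 4), dtype=np.float64)  # C order is the default
f_arr = np.asfortranarray(c_arr)            # same values, column-major layout

print(c_arr.flags['C_CONTIGUOUS'], c_arr.flags['F_CONTIGUOUS'])  # True False
print(f_arr.flags['C_CONTIGUOUS'], f_arr.flags['F_CONTIGUOUS'])  # False True

# C order: moving along a row (last axis) steps 8 bytes, the next row is 4 * 8 bytes away.
print(c_arr.strides)  # (32, 8)

# Fortran order: moving down a column (first axis) steps 8 bytes, the next column is 3 * 8 bytes away.
print(f_arr.strides)  # (8, 24)
```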

In [6]:
# Creating a C Array
c_array = np.random.rand(10000, 10000)

# Creating a Fortran Array
f_array = np.asfortranarray(c_array)

# Define functions that return the sum of the elements in the first row and the first column, respectively

def sum_row(x):
    '''
    Given an array `x`, return the sum of its zeroth row
    '''
    return np.sum(x[0, :])

def sum_col(x):
    '''
    Given an array `x`, return the sum of its zeroth column
    '''
    return np.sum(x[:, 0])

In [7]:
# Test the performance of sum functions using both C and Fortran arrays
%timeit sum_row(c_array)

20.7 µs ± 1.49 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [8]:
%timeit sum_row(f_array)

276 µs ± 37.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [9]:
%timeit sum_col(c_array)

299 µs ± 8.6 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [10]:
%timeit sum_col(f_array)

21.4 µs ± 827 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
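
The timings above are consistent with the memory layout: summing a row is faster on the C-ordered array, where a row is contiguous, and summing a column is faster on the Fortran-ordered array, where a column is contiguous. `%timeit` is only available in IPython and Jupyter, so the sketch below repeats the comparison with the standard-library `timeit` module. It assumes a smaller 1000 x 1000 array to keep memory use low, the names `c_small` and `f_small` are illustrative, and the size of the gap will depend on array size and hardware.

```python
import timeit
import numpy as np

c_small = np.random.rand(1000, 1000)   # row-major (C order)
f_small = np.asfortranarray(c_small)   # column-major (Fortran order)

def sum_row(x):
    """Return the sum of the zeroth row of `x`."""
    return np.sum(x[0, :])

def sum_col(x):
    """Return the sum of the zeroth column of `x`."""
    return np.sum(x[:, 0])

# Time 1000 calls of each function on each layout.
for name, arr in [("C", c_small), ("F", f_small)]:
    t_row = timeit.timeit(lambda: sum_row(arr), number=1000)
    t_col = timeit.timeit(lambda: sum_col(arr), number=1000)
    print(f"{name}-ordered: row sum {t_row:.4f} s, column sum {t_col:.4f} s per 1000 calls")
```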


## 4) Views and Copies
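Slicing or indexing an array returns the data in one of two forms: a copy or a view.

A view is a reference to the original array's data, so modifying a view modifies the original array. This is not true for copies.

The `may_share_memory` function in NumPy can be used to check whether two arrays might share memory, which is a quick test of whether one is a view of the other. By default it only compares memory bounds, so True means the arrays may overlap, while False means they definitely do not.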

In [11]:

# Create a random array
x = np.random.rand(100, 10)

# Extract first five rows of the array and assign them to variable y
y = x[:5, :]

# Check if x and y share memory
np.may_share_memory(x, y)

True

In [12]:
# Modify the array y and see how it affects x. Set all elements of y to zero
y[:] = 0
print(x[:5, :])

[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]


In [13]:

# Now copy the first five rows of x into a separately allocated array y
x = np.random.rand(100, 10)
y = np.empty([5, 10])
y[:] = x[:5, :]
np.may_share_memory(x, y)

False

In [14]:
# Alter y and check x
y[:] = 0
print(x[:5, :])

[[0.92705145 0.0795733  0.10390091 0.86068226 0.5243067  0.6358185
  0.30324646 0.71094334 0.96894352 0.67766897]
 [0.52613503 0.35204925 0.41229782 0.48706471 0.01065627 0.58868265
  0.39864159 0.40269144 0.79038759 0.27420396]
 [0.40929584 0.40030423 0.97979546 0.26077878 0.51330602 0.82109734
  0.33991895 0.35061904 0.25181099 0.28814262]
 [0.28380411 0.97879298 0.98537801 0.56715169 0.81851254 0.98989342
  0.86653276 0.74442839 0.06866418 0.75511973]
 [0.03568511 0.07841995 0.74112896 0.08324871 0.73378934 0.37526977
  0.00815115 0.81623386 0.14949728 0.57838414]]
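
The copy above was built by allocating an empty array and assigning into it. A shorter route, sketched below, is the `copy()` method. The example also shows that indexing with a list of integers returns a copy rather than a view, and it uses `np.shares_memory`, which performs an exact check instead of the bounds comparison that `may_share_memory` does by default. The names `view`, `copied` and `fancy` are illustrative.

```python
import numpy as np

x = np.random.rand(100, 10)

view = x[:5, :]             # basic slicing returns a view
copied = x[:5, :].copy()    # copy() allocates new memory
fancy = x[[0, 1, 2, 3, 4]]  # indexing with a list of integers returns a copy

print(np.shares_memory(x, view))    # True
print(np.shares_memory(x, copied))  # False
print(np.shares_memory(x, fancy))   # False

# Writing to the copy leaves x untouched.
before = x[:5, :].copy()
copied[:] = 0
print(np.array_equal(x[:5, :], before))  # True
```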


## 5) Creating arrays from lists
The simplest way to create an array is to use the array function. To create a valid array object, we can pass in a list as the argument.

In [15]:
# Create arrays by passing a list to np.array()
x = np.array([1, 2, 3])
y = np.array(['hello', 'world'])

In [16]:
# A handy way to generate a sequence of integers is the range() function, whose result can be passed to np.array()
x = range(5)
y = np.array(x)
print(y)

[0 1 2 3 4]


In [17]:
# NumPy has a convenient function called arange that combines the functionality of the range and array functions
x = np.arange(5)
print(x)

[0 1 2 3 4]
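
Like `range`, `np.arange` also accepts start, stop and step arguments, and unlike `range` it allows a non-integer step. A brief sketch, with illustrative variable names:

```python
import numpy as np

# start=2, stop=10 (excluded), step=2
evens = np.arange(2, 10, 2)
print(evens)  # [2 4 6 8]

# A non-integer step produces a floating-point array: 0.0, 0.5, 1.0, 1.5
halves = np.arange(0, 2, 0.5)
print(halves)
```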


In [18]:
# For multidimensional arrays, the input lists simply have to be nested as follows:
x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
print(x.ndim)
print(x.shape)
print(x)

2
(4, 3)
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


## 6) Array data types
The data type of a NumPy array can be found by simply checking the dtype attribute of the array.

In [19]:

x = np.random.random((10, 10))
print(x.dtype)

float64


In [20]:
x = np.arange(10)
print(x.dtype)

int32


In [21]:

x = np.array(['hello', 'world'])
x.dtype

dtype('<U5')
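
The `int32` shown above is the default integer type on the machine where this notebook was run; on many other platforms the same cell prints `int64`. When the exact type matters, it can be requested explicitly. The sketch below shows the `dtype` argument and the `astype` method, and it inspects the string dtype `<U5`, which denotes Unicode strings of up to 5 characters. Variable names are illustrative.

```python
import numpy as np

# Request a dtype when the array is created.
as_float = np.array([1, 2, 3], dtype=np.float64)
print(as_float.dtype)  # float64

# Avoid relying on the platform's default integer type.
as_int64 = np.arange(10, dtype=np.int64)
print(as_int64.dtype)  # int64

# astype returns a converted copy; the original array is unchanged.
converted = as_float.astype(np.int32)
print(converted.dtype)  # int32

# 'U' is the Unicode string kind; each of the 5 characters takes 4 bytes.
words = np.array(['hello', 'world'])
print(words.dtype.kind, words.dtype.itemsize)  # U 20
```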