# NumPy
NumPy is an open source Python library for numerical computing. It offers comprehensive mathematical functions, random number generators, linear algebra routines, Fourier transforms, and more. It is the fundamental package for scientific computing in Python: it provides efficient storage of multi-dimensional arrays and an easy, flexible interface for optimized computation on them, which makes it central to data science work in Python. This notebook outlines techniques for effectively loading, storing, and manipulating in-memory data in Python.

In [3]:
# Importing NumPy and checking its version
import numpy
numpy.__version__

'1.19.2'

In [4]:
# Importing NumPy with an alias; using the alias 'np' to shorten the name is common practice
import numpy as np
np.__version__

'1.19.2'

In [5]:
# We can display the built-in documentation to read more about NumPy
np?

In [7]:
# Press <TAB> after 'np.' to list the contents of the numpy namespace (IPython tab completion)
np.<TAB>

## Understanding Data Types

A NumPy array and a Python list look similar in structure, but a list can hold elements of different data types. That flexibility costs efficiency: each item in a list is a complete Python object that carries its own type information, reference count, and other metadata. A NumPy array has a single fixed element type, which makes storing and manipulating data much more efficient, especially as the data grows.
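
The storage overhead described above can be made concrete with a rough sketch. It assumes CPython and uses `sys.getsizeof`, which reports the size of the list container and of each integer object separately. The byte count for the list varies by platform and Python version, so treat it as an approximation; the names `py_list` and `np_arr` are illustrative only.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# A list stores references to separate int objects, so its footprint is
# the container plus every element object (approximate, CPython-specific).
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)

# A NumPy array stores raw fixed-width values in one buffer:
# n elements * 8 bytes each for int64.
array_bytes = np_arr.nbytes

print("list :", list_bytes, "bytes (approximate)")
print("array:", array_bytes, "bytes of element data")
```

The array figure counts element data only; the array object itself adds a small fixed header on top of that.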

In [8]:
# Printing a range of values; range() is a built-in function
x = range(6)
for n in x:
  print(n)

0
1
2
3
4
5


In [9]:
# Summing numbers less than 100
x = 0
for i in range(100):
    x += i

In [12]:
# Python infers the type of a variable from its value; it is not declared
# The values produced by range() and their sum are both of type 'int'
np.dtype?  # display the documentation for NumPy data type objects
type(x)

int

In [10]:
# List: each element carries its own data type
L = list(range(10))
L

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [11]:
L2 = [True, "2", 3.0, 4]
[type(item) for item in L2]

[bool, str, float, int]

#### Array
* Arrays are sequence types and behave very much like lists, except that the type of objects stored in them is constrained.
* An array stores homogeneous elements at contiguous memory locations.
* One memory block is allocated for the entire array to hold the elements of the array. The array elements can be accessed in constant time by using the index of the particular element as the subscript.
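
As a small sketch of the type constraint in the first bullet, the snippet below uses the standard-library `array` module with the `'i'` type code (signed C int) and tries to append a float. The variable names are illustrative; the point is only that the append is rejected with a `TypeError`.

```python
import array

ints = array.array('i', [1, 2, 3])  # 'i' = signed C int
ints.append(4)                      # fine: an integer fits the type code

try:
    ints.append(2.5)                # a float does not fit the 'i' type code
    rejected = False
except TypeError:
    rejected = True

print(ints, "float rejected:", rejected)
```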

In [12]:
# Converting a 'list' of values to 'array'
import array
L = list(range(10))
arr = array.array('i', L)  # the 'i' is a type code indicating the contents are integers
arr

array('i', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [13]:
# Detail info on array.array
array.array??

In [14]:
# Creating Arrays from Python Lists
np.array([1, 4, 2, 5, 3])

array([1, 4, 2, 5, 3])

In [15]:
# Unlike a Python list, a NumPy array holds elements of a single type,
# so the integers are upcast to float if the list contains a single float
np.array([3.14, 4, 2, 3])

array([3.14, 4.  , 2.  , 3.  ])
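
The upcasting rule can be checked by inspecting `dtype` for a few mixed inputs. This is a minimal sketch with made-up variable names. The exact integer width depends on the platform, and the string case is included only to show that NumPy falls back to a common type that can represent every element.

```python
import numpy as np

ints_only = np.array([1, 2, 3])
with_float = np.array([1, 2.5, 3])
with_text = np.array([1, 2.5, "3"])

print(ints_only.dtype)   # an integer dtype; the width depends on the platform
print(with_float.dtype)  # float64
print(with_text.dtype)   # a Unicode string dtype
```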

In [16]:
# Setting the element type explicitly with the 'dtype' argument
arr = np.array([1, 2, 3, 4], dtype='float32')  # other options include 'float64' and 'int'
arr

array([1., 2., 3., 4.], dtype=float32)

In [17]:
# Creating a length-10 integer array filled with zeros
arr0 = np.zeros(10, dtype=int)
arr0

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [18]:
# Creating a length-10 integer array filled with ones
arr1 = np.ones(10, dtype=int)
arr1

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

## NumPy Array Attributes

In [13]:
np.random.seed(0)  # seed for reproducibility

arr1 = np.random.randint(10, size=6)  # One-dimensional array
arr2 = np.random.randint(10, size=(3, 4))  # Two-dimensional array
arr3 = np.random.randint(10, size=(3, 4, 5))  # Three-dimensional array

In [14]:
# Dimension, shape, and size of the array
print("arr3 ndim: ", arr3.ndim)
print("arr3 shape:", arr3.shape)
print("arr3 size: ", arr3.size)

arr3 ndim:  3
arr3 shape: (3, 4, 5)
arr3 size:  60
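
Besides `ndim`, `shape`, and `size`, arrays also expose `dtype`, `itemsize`, and `nbytes`. The sketch below uses a zero-filled array with an explicit `int64` dtype (an assumption made here so the byte counts are the same on every platform) to show how the attributes relate.

```python
import numpy as np

arr = np.zeros((3, 4, 5), dtype=np.int64)

print("dtype:   ", arr.dtype)     # int64
print("itemsize:", arr.itemsize)  # 8 bytes per element
print("nbytes:  ", arr.nbytes)    # 480 = size * itemsize = 60 * 8
```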


## The NumPy ndarray: A Multidimensional Array Object
The N-dimensional array object, or ndarray, is one of the key features of NumPy: a fast, flexible container for large datasets in Python.

In [21]:
# Create a 3x6 floating-point array filled with zeros
X0 = np.zeros((3, 6))
X0

array([[0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.]])

In [22]:
# Create a 3x5 floating-point array filled with ones 
X1 = np.ones((3, 5), dtype=float)
X1

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [23]:
# Create a 3x5 array filled with 3.14
Xpi = np.full((3, 5), 3.14)
Xpi

array([[3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14]])

In [24]:
# Create an array filled with a linear sequence
# Starting at 0, stopping before 20, stepping by 2
# (this is similar to the built-in range() function)
a2 = np.arange(0, 20, 2)
a2

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [25]:
# Generate a small array of random data drawn from the standard normal distribution
data = np.random.randn(2, 3)
data

array([[ 1.25441407,  1.41910204, -0.74385608],
       [-2.5174371 , -1.50709602,  1.14907613]])

In [26]:
# Creating a multi-dimensional array from a nested list
data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]
arr2 = np.array(data2)
arr2

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [27]:
# Type of the whole -- ndarray
type(arr2)

numpy.ndarray

In [28]:
# Checking the type of the elements in the ndarray (the default integer width is platform dependent)
arr2.dtype

dtype('int32')

In [29]:
arr3 = np.array([1, 2, 3], dtype=np.float64)
arr3

array([1., 2., 3.])

## Arithmetic with NumPy Arrays

In [30]:
arr = np.array([[1., 2., 3.], [4., 5., 6.]])
arr

array([[1., 2., 3.],
       [4., 5., 6.]])

In [31]:
arr*arr

array([[ 1.,  4.,  9.],
       [16., 25., 36.]])

In [32]:
arr-arr

array([[0., 0., 0.],
       [0., 0., 0.]])

In [33]:
1/arr

array([[1.        , 0.5       , 0.33333333],
       [0.25      , 0.2       , 0.16666667]])

In [34]:
arr**0.5

array([[1.        , 1.41421356, 1.73205081],
       [2.        , 2.23606798, 2.44948974]])
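
The cells above rely on the fact that arithmetic operators act element by element on arrays. As a brief sketch of how this differs from plain Python lists (the names `values` and `vec` are illustrative), compare the same operator applied to both:

```python
import numpy as np

values = [1.0, 2.0, 3.0]
vec = np.array(values)

print(values * 2)  # list repetition: [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
print(vec * 2)     # element-wise multiplication: [2. 4. 6.]
print(vec > 1.5)   # comparisons are element-wise too: [False  True  True]
```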

## Basic Indexing and Slicing

In [35]:
arr = np.arange(10)
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [36]:
arr[5]

5

In [37]:
arr[5:8]

array([5, 6, 7])

In [38]:
# Assigning a scalar to a slice sets every element in the slice
arr[5:8] = 12
arr

array([ 0,  1,  2,  3,  4, 12, 12, 12,  8,  9])

In [39]:
# Higher-dimensional array: a 3 x 3 array
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
arr2d

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [40]:
arr2d[2]  # an array

array([7, 8, 9])

In [41]:
arr2d[0][2]

3

In [42]:
# Multi-dimensional array ---  2 x 2 x 3 array
arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
arr3d

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [43]:
arr3d[0]  # a 2 x 3 array: indexing the first axis of a 3-D array returns a 2-D array

array([[1, 2, 3],
       [4, 5, 6]])

In [44]:
# Copy the old values before modifying the array: assignment changes the array in place
old_values = arr3d[0].copy()

In [45]:
arr3d[0] = 42  # the scalar is assigned to every element of arr3d[0]
arr3d

array([[[42, 42, 42],
        [42, 42, 42]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [46]:
arr3d[0] = old_values
arr3d

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])
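
The reason `.copy()` is needed is that basic slicing returns a view onto the same memory rather than a new array. The following sketch, using illustrative names, shows a view writing through to its base array while a copy stays independent; `np.shares_memory` reports whether two arrays overlap in memory.

```python
import numpy as np

base = np.arange(10)
view = base[5:8]              # basic slicing returns a view, not a copy
snapshot = base[5:8].copy()   # an independent copy of the same elements

view[:] = 12                  # writes through to base

print(base)       # [ 0  1  2  3  4 12 12 12  8  9]
print(snapshot)   # [5 6 7]
print(np.shares_memory(base, view), np.shares_memory(base, snapshot))  # True False
```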

In [47]:
# Indexing with slices
arr  # using the above array

array([ 0,  1,  2,  3,  4, 12, 12, 12,  8,  9])

In [48]:
arr[1:6]

array([ 1,  2,  3,  4, 12])

In [49]:
arr2d

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [50]:
arr2d[:2]

array([[1, 2, 3],
       [4, 5, 6]])

In [51]:
arr2d[:2, 1:]

array([[2, 3],
       [5, 6]])
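
When integer indices and slices are mixed, an integer index drops its axis while a slice keeps it. This small sketch (the array `grid` is a stand-in with the same values as `arr2d`) compares the shapes of the two results:

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

row = grid[1, :2]    # integer index on axis 0: that axis is dropped
sub = grid[1:2, :2]  # slice on axis 0: the axis is kept with length 1

print(row, row.shape)  # [4 5] (2,)
print(sub, sub.shape)  # [[4 5]] (1, 2)
```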

In [52]:
arr2d[-1]

array([7, 8, 9])

In [53]:
arr2d[2, -1]

9

In [54]:
arr2d

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

## Computation of NumPy arrays

Computation on NumPy arrays can be very fast or very slow. The key to making it fast is to use vectorized operations, which are generally implemented through NumPy universal functions (ufuncs).
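
As a minimal sketch of the difference, the snippet below computes reciprocals twice: once with an explicit Python loop and once with a single vectorized expression. The helper `reciprocals_loop` is a hypothetical function written for this illustration. Both approaches give the same values, but the loop pays Python-level overhead on every element, which becomes noticeable on large arrays.

```python
import numpy as np

def reciprocals_loop(values):
    """Compute 1/x element by element with an explicit Python loop."""
    out = np.empty(len(values))
    for i in range(len(values)):
        out[i] = 1.0 / values[i]
    return out

values = np.arange(1, 6)

looped = reciprocals_loop(values)
vectorized = 1.0 / values  # the division operator dispatches to a ufunc

print(looped)
print(vectorized)
print(np.allclose(looped, vectorized))  # True
```

In a notebook, prefixing each call with `%timeit` on a much larger input is a simple way to compare the two approaches.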