In [1]:
import numpy as np

One of the key features of NumPy is its N-dimensional array object, or ndarray, which is a fast, flexible container for large data sets in Python.
An ndarray is a generic multidimensional container for homogeneous data; that is, all of the elements must be the same type.
Every array has a **shape**, a tuple indicating the size of each dimension, and a **dtype**, an object describing the data type of the array.
## Creating ndarrays

In [3]:
data1 = [6, 7.5, 8, 0, 1]
arr1 = np.array(data1)
arr1

array([6. , 7.5, 8. , 0. , 1. ])

In [5]:
data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]
arr2 = np.array(data2)
arr2

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [6]:
arr2.ndim

2

In [7]:
arr2.shape

(2, 4)

In [8]:
arr1.dtype

dtype('float64')

In [9]:
arr2.dtype

dtype('int32')

## Array creation functions

![image info](./images/1.jpg)
![image info](./images/2.jpg)

In [10]:
arr1 = np.array([1, 2, 3], dtype=np.float64)
arr2 = np.array([1, 2, 3], dtype=np.int32)
arr1.dtype

dtype('float64')

In [11]:
arr2.dtype

dtype('int32')

![image info](./images/3.jpg)
![image info](./images/4.jpg)

In [12]:
arr = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])
arr.astype(np.int32)

array([ 3, -1, -2,  0, 12, 10])

In [4]:
# np.bytes_ replaces the np.string_ alias, which was removed in NumPy 2.0
numeric_strings = np.array(['1.25', '-9.6', '42'], dtype=np.bytes_)
numeric_strings.astype(float)

array([ 1.25, -9.6 , 42.  ])

In the cell above, the Python type `float` is passed where a dtype is expected; NumPy aliases Python types to the equivalent dtypes.
By default, calling `astype` creates a new array (a copy of the data), even if the new dtype is the same as the old dtype.
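
A minimal sketch of both points, using small illustrative arrays (the names `original`, `converted`, and `same` are not from the text above). It relies on the `copy` parameter of `ndarray.astype`, whose default is `True`:

```python
import numpy as np

# Python's built-in float maps to NumPy's float64 dtype.
float_is_float64 = np.dtype(float) == np.float64

original = np.array([1.0, 2.0, 3.0])

# Default behaviour: a new array is allocated even though the dtype is unchanged.
converted = original.astype(original.dtype)
converted[0] = 99.0  # does not touch `original`

# With copy=False, NumPy can return the input when no conversion is needed.
same = original.astype(original.dtype, copy=False)

print(float_is_float64)                        # True
print(original[0])                             # 1.0
print(np.shares_memory(original, converted))   # False
print(np.shares_memory(original, same))        # True
```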

## Operations between Arrays and Scalars
Arrays are important because they enable you to express batch operations on data without writing any `for` loops. This is usually called vectorization.

In [14]:
arr = np.array([[1., 2., 3.], [4., 5., 6.]])
arr * arr

array([[ 1.,  4.,  9.],
       [16., 25., 36.]])

In [15]:
arr - arr

array([[0., 0., 0.],
       [0., 0., 0.]])

In [16]:
1 / arr

array([[1.        , 0.5       , 0.33333333],
       [0.25      , 0.2       , 0.16666667]])

In [17]:
arr ** 0.5

array([[1.        , 1.41421356, 1.73205081],
       [2.        , 2.23606798, 2.44948974]])
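
To make the contrast with explicit loops concrete, here is a small sketch that computes the same result twice: once element by element and once with a vectorized expression. The expression `arr * 2 + 1` and the names `looped` and `vectorized` are illustrative choices, not part of the cells above.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])

# Explicit nested loops over every element.
looped = np.empty_like(arr)
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        looped[i, j] = arr[i, j] * 2 + 1

# The same computation as a single vectorized expression.
vectorized = arr * 2 + 1

print(np.array_equal(looped, vectorized))  # True
print(vectorized.tolist())                 # [[3.0, 5.0, 7.0], [9.0, 11.0, 13.0]]
```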

## Basic Indexing and Slicing

In [19]:
arr = np.arange(10)
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [20]:
arr[5:8]

array([5, 6, 7])

In [21]:
arr[5:8] = 12
arr

array([ 0,  1,  2,  3,  4, 12, 12, 12,  8,  9])

As you can see, if you assign a scalar value to a slice, as in `arr[5:8] = 12`, the value is propagated (or **broadcast**, from here on) to the entire selection.

An important first distinction from lists is that array slices are views on the original array. This means that the data is not copied, and any modifications to the view will be reflected in the source array.

In [22]:
arr_slice = arr[5:8]
arr_slice[1] = 12345
arr

array([    0,     1,     2,     3,     4,    12, 12345,    12,     8,
           9])

As NumPy has been designed with large data use cases in mind, you could imagine performance and memory problems if NumPy insisted on copying data left and right.

If you want a copy of a slice of an ndarray instead of a view, you will need to explicitly copy the array; for example, `arr[5:8].copy()`.
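
A short sketch of the difference, starting from a fresh `np.arange(10)`; the names `view` and `snapshot` are illustrative:

```python
import numpy as np

arr = np.arange(10)

view = arr[5:8]             # shares memory with arr
snapshot = arr[5:8].copy()  # independent data

view[0] = -1       # visible through arr
snapshot[1] = -2   # not visible through arr

print(arr.tolist())       # [0, 1, 2, 3, 4, -1, 6, 7, 8, 9]
print(snapshot.tolist())  # [5, -2, 7]
```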

In [23]:
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
arr2d[0, 2]

3

In [24]:
arr2d[0][2]

3

In [25]:
arr2d[:2]

array([[1, 2, 3],
       [4, 5, 6]])

As you can see, it has sliced along axis 0, the first axis. A slice, therefore, selects a range of elements along an axis. You can pass multiple slices just like you can pass multiple
indexes:

In [26]:
arr2d[:2, 1:]

array([[2, 3],
       [5, 6]])

When slicing like this, you always obtain array views of the same number of dimensions.

By mixing integer indexes and slices, you get lower-dimensional slices:

In [27]:
arr2d[1, :2]

array([4, 5])

In [28]:
arr2d[2, :1]

array([7])

In [29]:
arr2d[:, :1]

array([[1],
       [4],
       [7]])
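
Checking `shape` is a quick way to see which axes survive each kind of index. The sketch below reuses the same `arr2d`; the last line (`arr2d[:, 0]`) is an extra case added for comparison.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(arr2d[:2, 1:].shape)  # (2, 2): two slices keep both axes
print(arr2d[1, :2].shape)   # (2,): the integer index drops axis 0
print(arr2d[:, :1].shape)   # (3, 1): a length-1 slice still keeps axis 1
print(arr2d[:, 0].shape)    # (3,): an integer index on axis 1 drops it
```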

## Boolean Indexing

In [32]:
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
bool_array = np.array([True, False, True, False, True, False, False])
names[bool_array]

array(['Bob', 'Will', 'Will'], dtype='<U4')

In [36]:
data = np.random.randn(7, 4)

In [39]:
mask = (names == 'Bob') | (names == 'Will')
data[mask]

array([[-0.4318451 , -0.16711368,  0.21634219, -1.16854007],
       [-0.02202475,  0.46491864,  1.03573644,  0.73656771],
       [-0.06005468,  0.27540106, -1.61610564,  1.042174  ],
       [ 0.31334142, -0.25420721,  1.2007739 ,  1.18932517]])

Selecting data from an array by boolean indexing always creates a copy of the data,
even if the returned array is unchanged.
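
The copy semantics can be checked with a small sketch. The array `values` is illustrative data, not from the cells above. Note the contrast with assigning through a mask, which writes to the original array:

```python
import numpy as np

values = np.array([1, -2, 3, -4])   # illustrative data
mask = values < 0

selected = values[mask]   # a copy of the matching elements
selected[:] = 0           # modifies only the copy
after_copy_edit = values.tolist()

values[mask] = 0          # assignment through the mask writes to `values` itself
after_assignment = values.tolist()

print(after_copy_edit)    # [1, -2, 3, -4]
print(after_assignment)   # [1, 0, 3, 0]
```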

The Python keywords `and` and `or` do not work with boolean arrays; use `&` (and) and `|` (or) instead.
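
A minimal sketch of the failure and the element-wise alternative, using a shortened, illustrative `names` array. The `~` operator for negation is included for completeness:

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob'])

# `or` asks for the truth value of a whole array, which NumPy refuses to guess.
try:
    mask = (names == 'Bob') or (names == 'Will')
    keyword_failed = False
except ValueError:
    keyword_failed = True

# The element-wise operators work as intended.
either = (names == 'Bob') | (names == 'Will')
neither = ~either

print(keyword_failed)     # True
print(either.tolist())    # [True, False, True, True]
print(neither.tolist())   # [False, True, False, False]
```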

In [40]:
data[data < 0] = 0
data

array([[0.        , 0.        , 0.21634219, 0.        ],
       [0.        , 0.        , 0.66295433, 0.        ],
       [0.        , 0.46491864, 1.03573644, 0.73656771],
       [0.        , 0.27540106, 0.        , 1.042174  ],
       [0.31334142, 0.        , 1.2007739 , 1.18932517],
       [0.        , 0.        , 0.        , 0.82659946],
       [0.        , 0.22541353, 2.80702975, 0.        ]])

## Fancy Indexing

In [41]:
arr = np.empty((8, 4))
for i in range(8):
    arr[i] = i
arr

array([[0., 0., 0., 0.],
       [1., 1., 1., 1.],
       [2., 2., 2., 2.],
       [3., 3., 3., 3.],
       [4., 4., 4., 4.],
       [5., 5., 5., 5.],
       [6., 6., 6., 6.],
       [7., 7., 7., 7.]])

In [42]:
arr[[4, 3, 0, 6]]

array([[4., 4., 4., 4.],
       [3., 3., 3., 3.],
       [0., 0., 0., 0.],
       [6., 6., 6., 6.]])

In [43]:
arr[[-3, -5, -7]]

array([[5., 5., 5., 5.],
       [3., 3., 3., 3.],
       [1., 1., 1., 1.]])

Passing multiple index arrays does something slightly different; it selects a 1D array of
elements corresponding to each tuple of indices:

In [44]:
arr = np.arange(32).reshape((8, 4))
arr

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])

In [45]:
arr[[1, 5, 7, 2], [0, 3, 1, 2]]

array([ 4, 23, 29, 10])

In [46]:
arr[[1, 5, 7, 2]][:, [0, 3, 1, 2]]

array([[ 4,  7,  5,  6],
       [20, 23, 21, 22],
       [28, 31, 29, 30],
       [ 8, 11,  9, 10]])

In [47]:
arr[np.ix_([1, 5, 7, 2], [0, 3, 1, 2])]

array([[ 4,  7,  5,  6],
       [20, 23, 21, 22],
       [28, 31, 29, 30],
       [ 8, 11,  9, 10]])

Keep in mind that fancy indexing, unlike slicing, always copies the data into a new array.
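
One way to confirm this is `np.shares_memory`, sketched here on the same 8 × 4 array; the names `rows_by_slice` and `rows_by_list` are illustrative:

```python
import numpy as np

arr = np.arange(32).reshape((8, 4))

rows_by_slice = arr[1:3]       # basic slicing: a view
rows_by_list = arr[[1, 2]]     # fancy indexing: a copy

print(np.shares_memory(arr, rows_by_slice))  # True
print(np.shares_memory(arr, rows_by_list))   # False

rows_by_list[0, 0] = -1   # leaves arr untouched
print(arr[1, 0])          # 4
```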

## Transposing Arrays and Swapping Axes
Transposing is a special form of reshaping which similarly returns a view on the underlying data without copying anything. Arrays have the `transpose` method and also
the special `T` attribute:


In [49]:
arr = np.arange(15).reshape((3, 5))
arr

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [50]:
arr.T

array([[ 0,  5, 10],
       [ 1,  6, 11],
       [ 2,  7, 12],
       [ 3,  8, 13],
       [ 4,  9, 14]])
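
The sketch below checks the view behaviour and also shows `swapaxes`, which takes a pair of axis numbers. For a 2D array, swapping axes 0 and 1 gives the same result as the transpose. The variable names are illustrative.

```python
import numpy as np

arr = np.arange(15).reshape((3, 5))

transposed = arr.T
swapped = arr.swapaxes(0, 1)   # for a 2D array this matches the transpose

print(transposed.shape)                      # (5, 3)
print(np.array_equal(transposed, swapped))   # True
print(np.shares_memory(arr, transposed))     # True

transposed[0, 1] = -5    # writes through to arr[1, 0]
print(arr[1, 0])         # -5
```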

## Universal Functions: Fast Element-wise Array Functions
A universal function, or ufunc, is a function that performs element-wise operations on
data in ndarrays. You can think of them as fast vectorized wrappers for simple functions
that take one or more scalar values and produce one or more scalar results.

In [55]:
arr = np.arange(10)
np.sqrt(arr)

array([0.        , 1.        , 1.41421356, 1.73205081, 2.        ,
       2.23606798, 2.44948974, 2.64575131, 2.82842712, 3.        ])

In [56]:
np.exp(arr)

array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,
       5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03,
       2.98095799e+03, 8.10308393e+03])

These are referred to as unary ufuncs. Others, such as `add` or `maximum`, take two arrays
(thus, binary ufuncs) and return a single array as the result:

In [58]:
x = np.random.randn(8)
y = np.random.randn(8)
x

array([ 0.71218362, -0.93490402, -0.11651451, -1.46998147, -0.29780386,
        0.59806224,  0.54313595,  0.39208181])

In [59]:
y

array([ 0.57481134, -0.8453657 , -0.78534631, -0.76517757,  2.66641939,
       -1.01039003,  0.58732215,  0.57911018])

In [60]:
np.maximum(x, y)

array([ 0.71218362, -0.8453657 , -0.11651451, -0.76517757,  2.66641939,
        0.59806224,  0.58732215,  0.57911018])
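
Because the random inputs above differ on every run, a version with fixed, illustrative inputs makes the behaviour of binary ufuncs easier to verify by hand:

```python
import numpy as np

a = np.array([1.0, 5.0, -3.0])
b = np.array([4.0, 2.0, -1.0])

print(np.add(a, b).tolist())       # [5.0, 7.0, -4.0]
print(np.maximum(a, b).tolist())   # [4.0, 5.0, -1.0]

# np.add is the ufunc behind the + operator for arrays.
print(np.array_equal(np.add(a, b), a + b))  # True
```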

![image info](./images/5.jpg)
![image info](./images/6.jpg)
![image info](./images/7.jpg)

## Expressing Conditional Logic as Array Operations

In [61]:
xarr = np.array([1.1, 1.2, 1.3, 1.4, 1.5])
yarr = np.array([2.1, 2.2, 2.3, 2.4, 2.5])
cond = np.array([True, False, True, True, False])
result = np.where(cond, xarr, yarr)
result

array([1.1, 2.2, 1.3, 1.4, 2.5])
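
For comparison, the same selection can be written as a plain Python list comprehension. This sketch (with the illustrative names `loop_result` and `vectorized`) checks that both approaches agree; the `np.where` form avoids the Python-level loop.

```python
import numpy as np

xarr = np.array([1.1, 1.2, 1.3, 1.4, 1.5])
yarr = np.array([2.1, 2.2, 2.3, 2.4, 2.5])
cond = np.array([True, False, True, True, False])

# Pure-Python equivalent: take x where cond is True, otherwise y.
loop_result = [x if c else y for x, y, c in zip(xarr, yarr, cond)]

vectorized = np.where(cond, xarr, yarr)

print(np.array_equal(vectorized, np.array(loop_result)))  # True
```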

In [63]:
arr = np.random.randn(4, 4)
arr

array([[-0.66890388, -0.22991968, -1.27782622,  0.78146283],
       [-0.16364032, -0.13659705,  0.59885547,  1.78517257],
       [ 0.8593769 ,  0.14816475, -0.63139059, -0.62357915],
       [ 1.93235685, -0.77730984,  0.37139016, -0.4274369 ]])

In [64]:
np.where(arr > 0, 2, arr)

array([[-0.66890388, -0.22991968, -1.27782622,  2.        ],
       [-0.16364032, -0.13659705,  2.        ,  2.        ],
       [ 2.        ,  2.        , -0.63139059, -0.62357915],
       [ 2.        , -0.77730984,  2.        , -0.4274369 ]])

## Mathematical and Statistical Methods

In [65]:
arr = np.random.randn(5, 4)
arr.mean()

0.1368000544162397

In [66]:
arr.sum()

2.7360010883247945

Functions like `mean` and `sum` take an optional `axis` argument which computes the statistic
over the given axis, resulting in an array with one fewer dimension:

In [67]:
arr.mean(axis=1)

array([-0.67082367,  0.2183705 ,  0.53465402, -0.13814258,  0.739942  ])

In [68]:
arr.sum(0)

array([ 1.17504866,  0.81580412,  1.51367598, -0.76852767])
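
With random data it is hard to check which axis was collapsed, so here is the same idea on a small fixed array (an illustrative 2 × 3 example):

```python
import numpy as np

arr = np.arange(6).reshape((2, 3))   # [[0, 1, 2], [3, 4, 5]]

row_means = arr.mean(axis=1)   # collapse the columns: one value per row
col_sums = arr.sum(axis=0)     # collapse the rows: one value per column

print(row_means.tolist())   # [1.0, 4.0]
print(col_sums.tolist())    # [3, 5, 7]
print(col_sums.shape)       # (3,)
```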

Other methods like `cumsum` and `cumprod` do not aggregate, instead producing an array
of the intermediate results:

In [70]:
arr = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
arr.cumsum()

array([ 0,  1,  3,  6, 10, 15, 21, 28, 36], dtype=int32)

In [71]:
arr.cumsum(0)

array([[ 0,  1,  2],
       [ 3,  5,  7],
       [ 9, 12, 15]], dtype=int32)
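
The cells above accumulate over the flattened array and down axis 0. As a complementary sketch, the same array accumulated along axis 1 (across each row), for both `cumsum` and `cumprod`:

```python
import numpy as np

arr = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])

row_cumsum = arr.cumsum(axis=1)    # running sum within each row
row_cumprod = arr.cumprod(axis=1)  # running product within each row

print(row_cumsum.tolist())   # [[0, 1, 3], [3, 7, 12], [6, 13, 21]]
print(row_cumprod.tolist())  # [[0, 0, 0], [3, 12, 60], [6, 42, 336]]
```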

![image info](./images/8.jpg)

## Sorting

In [74]:
arr = np.random.randn(8)
arr

array([-1.02700457,  1.46392001, -0.57632016,  0.10280574,  0.37983063,
        0.33237144, -1.71195306,  0.36855892])

In [76]:
arr.sort()
arr

array([-1.71195306, -1.02700457, -0.57632016,  0.10280574,  0.33237144,
        0.36855892,  0.37983063,  1.46392001])

Multidimensional arrays can have each 1D section of values sorted in-place along an
axis by passing the axis number to sort:

In [77]:
arr = np.random.randn(5, 3)
arr

array([[ 0.42619181, -2.31030083,  0.286963  ],
       [-0.25968895, -0.63507228,  0.06799456],
       [ 0.2916739 , -0.08932828, -0.09280791],
       [ 2.61235381,  0.7772777 , -0.93076322],
       [ 0.68139778,  1.39522571, -2.19182859]])

In [78]:
arr.sort(1)
arr

array([[-2.31030083,  0.286963  ,  0.42619181],
       [-0.63507228, -0.25968895,  0.06799456],
       [-0.09280791, -0.08932828,  0.2916739 ],
       [-0.93076322,  0.7772777 ,  2.61235381],
       [-2.19182859,  0.68139778,  1.39522571]])

The top-level function `np.sort` returns a sorted copy of an array instead of modifying
the array in place.
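
A small sketch of the two behaviours side by side, on illustrative data:

```python
import numpy as np

arr = np.array([3, 1, 2])

sorted_copy = np.sort(arr)       # new array; arr is left alone
before_inplace = arr.tolist()

arr.sort()                       # sorts arr itself and returns None
after_inplace = arr.tolist()

print(sorted_copy.tolist())  # [1, 2, 3]
print(before_inplace)        # [3, 1, 2]
print(after_inplace)         # [1, 2, 3]
```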

## Linear Algebra

In [86]:
x = np.array([[1., 2., 3.], [4., 5., 6.]])
y = np.array([[6., 23.], [-1, 7], [8, 9]])
x

array([[1., 2., 3.],
       [4., 5., 6.]])

In [87]:
y

array([[ 6., 23.],
       [-1.,  7.],
       [ 8.,  9.]])

In [88]:
x.dot(y)

array([[ 28.,  64.],
       [ 67., 181.]])

In [89]:
np.dot(x,y)

array([[ 28.,  64.],
       [ 67., 181.]])

`numpy.linalg` provides a standard set of matrix decompositions, along with operations such as
matrix inverse and determinant.
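
As a brief sketch, the following uses `inv`, `det`, and `qr` from `numpy.linalg` on an illustrative 2 × 2 matrix `m` (chosen so the results are easy to verify by hand: its determinant is 4·6 − 7·2 = 10). Results are compared with `np.allclose` because floating-point output is rarely exact.

```python
import numpy as np
from numpy.linalg import det, inv, qr

m = np.array([[4.0, 7.0], [2.0, 6.0]])

m_inv = inv(m)
print(np.allclose(m_inv, [[0.6, -0.7], [-0.2, 0.4]]))   # True
print(np.allclose(m.dot(m_inv), np.eye(2)))             # True
print(np.isclose(det(m), 10.0))                         # True

# QR decomposition: m is recovered as the product q.dot(r).
q, r = qr(m)
print(np.allclose(q.dot(r), m))   # True
```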

![image info](./images/10.jpg)