### <mark> Why use NumPy? </mark>

- It provides efficient storage.
- It provides a better way of handling data for processing.
- It's fast.
- It's easy to learn.
- It uses relatively less memory to store data.
- NumPy offers comprehensive mathematical functions, random number generators, linear algebra routines, Fourier transforms, and more.
- Fast and versatile, the NumPy vectorization, indexing, and broadcasting concepts are the de facto standards of array computing today.

NumPy is a Python library used for **WORKING WITH ARRAYS**. It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices.

In Python we have lists that serve the purpose of arrays, but they are slow to process.

NumPy aims to provide an array object that is **UP TO 50X FASTER THAN TRADITIONAL PYTHON LISTS**. The array object in NumPy is called `ndarray`; it provides many supporting functions that make working with it easy. Arrays are used very frequently in data science, where speed and resources are very important.
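
As a rough illustration of the speed claim, the sketch below times the same sum of squares on a plain list and on an `ndarray`. The size `n` and the repeat count are arbitrary choices made for this example, and the measured ratio depends on the machine, the NumPy version, and the operation, so it should not be read as confirming any particular speed-up figure.

```python
import timeit
import numpy as np

n = 100_000                          # arbitrary size chosen for the illustration
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

def list_sum_of_squares():
    # Pure-Python loop over the list elements
    return sum(x * x for x in py_list)

def array_sum_of_squares():
    # Vectorized multiply and reduction, executed in compiled code
    return int((np_arr * np_arr).sum())

t_list = timeit.timeit(list_sum_of_squares, number=20)
t_arr = timeit.timeit(array_sum_of_squares, number=20)

print("same result:", list_sum_of_squares() == array_sum_of_squares())
print(f"list: {t_list:.4f} s, ndarray: {t_arr:.4f} s")
```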

NumPy arrays are **STORED AT ONE CONTIGUOUS PLACE IN MEMORY, UNLIKE LISTS**, so processes can access and manipulate them very efficiently. This behavior is called **locality of reference** in computer science.
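
The memory layout can be inspected directly. This is a minimal sketch, assuming a small `int64` array named `m` created just for the example; it shows the contiguity flag, the per-element size, and the byte strides between neighbouring elements.

```python
import numpy as np

m = np.arange(6, dtype=np.int64).reshape(2, 3)

print(m.flags['C_CONTIGUOUS'])  # True: rows are laid out one after another
print(m.itemsize)               # 8 bytes per int64 element
print(m.nbytes)                 # 48 bytes = 6 elements * 8 bytes
print(m.strides)                # (24, 8): bytes to the next row / next column
```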

NumPy is a Python library and is written partially in Python, but most of the parts that require fast computation are written in C or C++.

#### <font color='blue'> List vs. Array </font>

- Python does not have a **BUILT-IN** array type, but Python lists can be used instead.

- We can use LISTS as ARRAYS; however, to work with true arrays in Python you have to import a library, such as NumPy.

| List (L) | Array (A) |
|---|---|
| Can hold elements of different DATATYPES | Holds elements of the same data type only |
| No need to import a module | Must be explicitly imported |
| Cannot directly handle element-wise arithmetic operations | Can directly handle arithmetic operations |
| Can be NESTED with varying lengths | Nested elements must all be the same size |
| Consumes LARGER MEMORY | Comparatively more compact in memory |
| Preferred for shorter sequences of data items | Preferred for longer sequences of data items |
| Greater flexibility allows **easy modification** (addition, deletion) of data | Less flexible: the size is fixed at creation, so adding or deleting elements produces a new array |
| The entire list can be printed **without any explicit LOOPING** | Can also be printed directly with `print()`; an explicit loop is only needed to process elements one at a time |
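
A few of the differences listed above can be seen in a short sketch. The variable names are made up for the example; the behaviour shown (list repetition versus element-wise arithmetic, and upcasting to a single data type) is standard Python and NumPy behaviour.

```python
import numpy as np

values = [1, 2, 3]
arr_values = np.array(values)

print(values * 2)               # [1, 2, 3, 1, 2, 3]  list repetition
print(arr_values * 2)           # [2 4 6]             element-wise multiplication
print(values + values)          # [1, 2, 3, 1, 2, 3]  concatenation
print(arr_values + arr_values)  # [2 4 6]             element-wise addition

# A list keeps each element's own type; an array converts to one common type
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)              # float64: the integers were upcast to float
```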
    
##### An array is thus better than a list in SIZE, PERFORMANCE (speed), and FUNCTIONALITY (optimized functions).

##### There are several important differences between NumPy arrays and the standard Python sequences:

   NumPy arrays have a **FIXED SIZE AT CREATION**, unlike Python lists (which can grow dynamically). Changing the size of an ndarray will create a new array and delete the original.

   The elements in a NumPy array are all required to be **OF THE SAME DATATYPE**, and thus will be the same size in memory. The **exception**: one can have arrays of (Python, including NumPy) objects, thereby allowing for arrays of different sized elements.

   NumPy arrays facilitate advanced **MATHEMATICAL OPERATIONS** and other types of operations on large numbers of data. Typically, such operations are executed more efficiently and with less code than is possible using Python’s built-in sequences.

   A growing plethora of scientific and mathematical Python-based packages are using NumPy arrays; though these typically support Python-sequence input, they convert such input to NumPy arrays prior to processing, and they often output NumPy arrays. In other words, in order to efficiently use much (perhaps even most) of today’s scientific/mathematical Python-based software, just knowing how to use Python’s built-in sequence types is insufficient - one also needs to know how to use NumPy arrays.
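
The first two differences above can be checked with a small sketch. The names `base`, `grown`, and `objs` are illustrative only. `np.append` returns a new array rather than resizing in place, and `dtype=object` is the exception that lets one array hold elements of different Python types.

```python
import numpy as np

base = np.array([1, 2, 3])
grown = np.append(base, 4)    # returns a NEW array; base is not modified

print(base)                   # [1 2 3]
print(grown)                  # [1 2 3 4]
print(grown is base)          # False

# The object-dtype exception: elements keep their own Python types
objs = np.array([1, "two", 3.0], dtype=object)
print(objs.dtype)             # object
print([type(x).__name__ for x in objs])   # ['int', 'str', 'float']
```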

## <mark> Array dimensions/slicing/shaping </mark>

In [3]:
import numpy as np

In [4]:
a1=np.array([1,2,3,4,5,6])
print(a1)
print(a1.ndim)

[1 2 3 4 5 6]
1


In [5]:
a2=np.array([[1,2,3,4,5,6],[7,8,9,10,11,12]])
print(a2)
print(a2.ndim)

[[ 1  2  3  4  5  6]
 [ 7  8  9 10 11 12]]
2


In [6]:
l=[1,2,3,4,5,6]
a3=np.array([[[1,2,3,4,5],[6,7,8,9,10]],[[11,12,13,14,15],[16,17,18,19,20]]])

print(a3)
print(a3.ndim)

[[[ 1  2  3  4  5]
  [ 6  7  8  9 10]]

 [[11 12 13 14 15]
  [16 17 18 19 20]]]
3


In [7]:
#Accessing the elements

In [8]:
a1[0]

1

In [9]:
a2[0,4]

5

In [10]:
a2[1,-1]

12

In [11]:
a3

array([[[ 1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10]],

       [[11, 12, 13, 14, 15],
        [16, 17, 18, 19, 20]]])

In [12]:
a3[0]

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10]])

In [13]:
a3[1]

array([[11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20]])

In [14]:
a3[1,0,1]

12

In [15]:
a33=np.array([[[99,88,77]]])
a33

array([[[99, 88, 77]]])

In [16]:
print(a33[0])
#This is a 3-D array that contains ONE 1x3 matrix, which contains ONE array, which contains THREE scalars
print(a33.shape)

[[99 88 77]]
(1, 1, 3)


In [17]:
#a33[1] raises IndexError, because axis 0 has size 1 and only index 0 (or -1) is valid
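
The out-of-bounds case can be caught rather than left commented out. This is a small sketch that rebuilds `a33` so it runs on its own; it also shows that negative indices count from the end of each axis.

```python
import numpy as np

a33 = np.array([[[99, 88, 77]]])    # shape (1, 1, 3)

try:
    a33[1]                          # axis 0 has only one entry (index 0)
except IndexError as exc:
    print("IndexError:", exc)

print(a33[0, 0, -1])                # 77: last element of the innermost axis
print(a33[-1, -1, 0])               # 99: -1 is the only (and last) entry on axes 0 and 1
```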

### Slicing

In [18]:
a1[::]

array([1, 2, 3, 4, 5, 6])

In [19]:
a1[1:4]

array([2, 3, 4])

In [20]:
a2

array([[ 1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12]])

In [21]:
a2[0,1:2]

array([2])

In [22]:
a2[0:2,3:6]

array([[ 4,  5,  6],
       [10, 11, 12]])

In [23]:
a3

array([[[ 1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10]],

       [[11, 12, 13, 14, 15],
        [16, 17, 18, 19, 20]]])

In [24]:
a3[1,0,3:6]

array([14, 15])

In [25]:
a3[:,:,3:6]

array([[[ 4,  5],
        [ 9, 10]],

       [[14, 15],
        [19, 20]]])

In [26]:
a3[0:2,0:4,3:6]

array([[[ 4,  5],
        [ 9, 10]],

       [[14, 15],
        [19, 20]]])
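
Several slices above use a stop value of 6 on an axis of length 5. Unlike an integer index, a slice bound that runs past the end is clipped silently instead of raising an error. The sketch below rebuilds the same `a3` values with `np.arange` (a shortcut chosen for this example) and adds step and reverse slices.

```python
import numpy as np

a3 = np.arange(1, 21).reshape(2, 2, 5)   # same values as a3 above

print(a3[1, 0, 3:6])         # [14 15]: stop=6 is clipped to the axis length 5
print(a3[1, 0, 3:6].shape)   # (2,)
print(a3[0, 1, ::2])         # [ 6  8 10]: every second element
print(a3[0, 1, ::-1])        # [10  9  8  7  6]: reversed
```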

In [27]:
import numpy as np

ar = np.array([[1, 2, 3],
       [4, 5, 6],
       [7, 1, 0]])
ar.T
#Transpose the matrix, i.e. turn rows into columns and columns into rows

array([[1, 4, 7],
       [2, 5, 1],
       [3, 6, 0]])

In [28]:
ar.ravel()
#Return a flattened array.

array([1, 2, 3, 4, 5, 6, 7, 1, 0])

In [29]:
ar.reshape(9,1)

array([[1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [1],
       [0]])

In [30]:
ar.reshape(3,3)

array([[1, 2, 3],
       [4, 5, 6],
       [7, 1, 0]])

In [31]:
import numpy as np
krr = np.array([[1, 2, 3], [4, 5, 6]])
newarr = krr.reshape(-1)
print("Using reshape:",newarr)

print("Using ravel:",krr.ravel())

Using reshape: [1 2 3 4 5 6]
Using ravel: [1 2 3 4 5 6]
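
In `reshape`, `-1` asks NumPy to infer that one dimension from the total number of elements. The sketch below reuses the `krr` values. It shows the inferred shapes and the `ValueError` raised when the element count cannot be divided evenly.

```python
import numpy as np

krr = np.array([[1, 2, 3], [4, 5, 6]])   # 6 elements in total

print(krr.reshape(3, -1).shape)   # (3, 2): 6 / 3 = 2 columns inferred
print(krr.reshape(-1, 1).shape)   # (6, 1): a single column

try:
    krr.reshape(4, -1)            # 6 is not divisible by 4
except ValueError as exc:
    print("ValueError:", exc)
```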


In [32]:
#Counts the number of non-zero values in the array.
np.count_nonzero(ar)

8

In [33]:
#Return the indices of the elements that are non-zero.
np.nonzero(ar)

(array([0, 0, 0, 1, 1, 1, 2, 2], dtype=int64),
 array([0, 1, 2, 0, 1, 2, 0, 1], dtype=int64))

In [34]:
x = ar.flat
x

<numpy.flatiter at 0x26102185360>

In [35]:
# a numpy.flatiter object can be looped over (or indexed, e.g. ar.flat[3]) to access its elements
for item in ar.flat:
    print(item)

1
2
3
4
5
6
7
1
0


In [36]:
arr_list = ar.tolist()
#To create python list from array

arr_list

[[1, 2, 3], [4, 5, 6], [7, 1, 0]]

# <mark> Creating Arrays </mark>
    
## Array creation methods in NumPy

    Conversion from other python structure(list,tuple,etc)
        np.array(list)
        arr.shape, arr.reshape, arr.ndim, arr.size
    
    Intrinsic NumPy array creation functions (arange, ones, zeros, etc.)
        np.zeros, np.zeros_like
        np.ones, np.ones_like
        np.arange
        np.linspace
        np.logspace
        np.identity
        np.full
        np.full_like
        np.eye
    
    Use of special library functions(eg.random)
        np.random.rand
        np.random.randint
    
    Reading arrays from disk, either from standard or custom formats
    
    Creating arrays from raw bytes through the use of strings or buffers
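
The last two creation methods in the list are not demonstrated elsewhere in these notes, so here is a minimal sketch. It assumes an in-memory `io.BytesIO` buffer standing in for a file on disk (a real path would work the same way with `np.save` and `np.load`) and a hand-made byte string for `np.frombuffer`.

```python
import io
import numpy as np

original = np.array([[1, 2], [3, 4]], dtype=np.int32)

# Round-trip through NumPy's .npy format; BytesIO stands in for a file on disk
buffer = io.BytesIO()
np.save(buffer, original)
buffer.seek(0)
loaded = np.load(buffer)
print(np.array_equal(original, loaded))   # True

# Interpret raw bytes as unsigned 8-bit integers
raw = bytes([1, 2, 3, 4])
from_bytes = np.frombuffer(raw, dtype=np.uint8)
print(from_bytes)                         # [1 2 3 4]
```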

### from list and to list

In [37]:
import numpy as np
arr = np.array([[1, 2], [3, 4]])
arr

array([[1, 2],
       [3, 4]])

In [38]:
aa = np.array([[1, 2, 3, 4, 5],[6,7,8,9,10]])

print("Array to list without ravel = ", aa.tolist())
print("Array to list with ravel= ", aa.ravel().tolist())

Array to list without ravel =  [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
Array to list with ravel=  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


### Random values

In [39]:
"""Create an array of the given shape and populate it with
random samples from a uniform distribution
over ``[0, 1)``.
"""
np.random.rand(3,3)

array([[0.17460261, 0.11132211, 0.30989916],
       [0.46142318, 0.07233133, 0.89925829],
       [0.70556045, 0.84733018, 0.2048719 ]])

In [40]:
"""
randint(low, high=None, size=None, dtype=int)
Return random integers from `low` (inclusive) to `high` (exclusive)."""
np.random.randint(3,100,10)

array([83,  8, 53, 82, 34, 70, 95, 48,  4, 25])

In [41]:
np.random.randint(10)
#randint(low, high=None, size=None, dtype=int)

8

In [42]:
r=np.random.randint(3,100,10)
r

array([88, 84, 62, 71, 77, 86, 13, 34, 22, 18])

In [43]:
r=np.random.randint(3,100,(3,3,10))
r

array([[[86, 45, 11, 67, 58, 97, 78, 37, 52, 32],
        [98, 97, 92, 85, 13, 71, 15, 72,  5, 31],
        [62, 36, 40, 60, 99, 96, 98, 12, 87, 37]],

       [[11, 33, 87, 12, 98, 65, 34, 77, 94, 33],
        [54, 55,  4, 86, 14, 70, 84, 12, 80,  9],
        [78, 26,  7, 91, 56, 74, 58, 83, 71,  9]],

       [[80, 41, 63, 93, 49, 73, 61, 63, 67, 40],
        [80, 21, 91, 89, 50,  7, 24, 79, 26, 34],
        [64, 39, 73, 67, 10, 86, 41, 55, 47, 30]]])
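
The random outputs above change on every run. When repeatable numbers are needed, one option is a seeded generator from `np.random.default_rng`, which is NumPy's newer `Generator` interface and not the legacy `np.random.rand`/`randint` functions used above. The seed value below is an arbitrary choice for the example.

```python
import numpy as np

seed = 42                                   # arbitrary seed for the illustration
rng_a = np.random.default_rng(seed)
rng_b = np.random.default_rng(seed)

ints_a = rng_a.integers(3, 100, size=10)    # low inclusive, high exclusive
ints_b = rng_b.integers(3, 100, size=10)
print(np.array_equal(ints_a, ints_b))       # True: same seed, same sequence

floats = rng_a.random((3, 3))               # uniform samples from [0, 1)
print(floats.shape)                         # (3, 3)
```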

In [44]:
print(a3.shape)
print(a3)
#Output: (2, 2, 5) = (number of 2-D arrays, rows in each, columns/elements per row)

(2, 2, 5)
[[[ 1  2  3  4  5]
  [ 6  7  8  9 10]]

 [[11 12 13 14 15]
  [16 17 18 19 20]]]


In [45]:
a3.shape

(2, 2, 5)

In [46]:
a3.reshape(1,5,4)

array([[[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12],
        [13, 14, 15, 16],
        [17, 18, 19, 20]]])

## zeros

In [47]:
import numpy as np
np.zeros(5)
#Return a new array of given shape and type, filled with zeros.


array([0., 0., 0., 0., 0.])

In [48]:
np.zeros((4,5),dtype=int)

array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])

In [49]:
np.zeros((2,4,5),dtype=int)

array([[[0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0]],

       [[0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0]]])

In [50]:
#Return evenly spaced values within a given interval.
np.arange(0,5)

array([0, 1, 2, 3, 4])

In [51]:
c1=np.arange(0,10).reshape(2,5)
c1

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [52]:
np.zeros_like(c1)
#Return an array of zeros with the same shape and type as a given array.

array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])

## Ones

In [53]:
o=np.ones((3,2))
o

array([[1., 1.],
       [1., 1.],
       [1., 1.]])

In [54]:
np.ones_like(o,dtype=int)

array([[1, 1],
       [1, 1],
       [1, 1]])

## Full

In [55]:
np.full((5,5),50)
#Return a new array of given shape and type, filled with `fill_value`.

array([[50, 50, 50, 50, 50],
       [50, 50, 50, 50, 50],
       [50, 50, 50, 50, 50],
       [50, 50, 50, 50, 50],
       [50, 50, 50, 50, 50]])

In [56]:
np.full_like(c1,50)

array([[50, 50, 50, 50, 50],
       [50, 50, 50, 50, 50]])

## eye

In [57]:
np.eye(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [58]:
np.eye(5,5)
#Return a 2-D array with ones on the diagonal and zeros elsewhere.

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [59]:
np.eye(3,5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.]])

In [60]:
np.eye(10,k=4)

array([[0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

In [61]:
np.eye(5,10,k=4)

array([[0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 1., 0.]])
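
Two related cases are not shown above: `np.identity`, which appears in the creation list, and a negative `k`, which moves the diagonal below the main one. A short sketch:

```python
import numpy as np

# np.identity(n) gives the same square matrix as np.eye(n)
print(np.array_equal(np.identity(3), np.eye(3)))   # True

# A negative k shifts the diagonal of ones downward
lower = np.eye(3, k=-1)
print(lower)
# [[0. 0. 0.]
#  [1. 0. 0.]
#  [0. 1. 0.]]
```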

## Array with numerical range

In [62]:
list(range(0,10,2))
#range() accepts and yields ONLY int objects

[0, 2, 4, 6, 8]

In [63]:
np.arange(0,10)

#arange returns an ndarray held in memory, so later element-wise operations on it are vectorized; range is a lazy sequence of Python ints

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [64]:
np.arange(1,10,0.5)
#here floats can also be obtained

array([1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. , 6.5, 7. ,
       7.5, 8. , 8.5, 9. , 9.5])

In [65]:
type(range(0,10))

range

In [66]:
type(np.arange(0,10))

numpy.ndarray
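
`np.arange` is specified by a step and excludes the stop value, while `np.linspace` is specified by a number of points and includes the stop value by default. The sketch below, using the same numbers as the float example above, shows that the two can describe the same grid.

```python
import numpy as np

stepped = np.arange(1, 10, 0.5)      # step of 0.5, stop (10) excluded
counted = np.linspace(1, 9.5, 18)    # 18 points, stop (9.5) included

print(len(stepped))                  # 18
print(np.allclose(stepped, counted)) # True

# With integer arguments, arange produces the same values as range
print(np.arange(0, 10, 2).tolist() == list(range(0, 10, 2)))   # True
```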

## linspace

In [67]:
np.linspace(1,100)

array([  1.        ,   3.02040816,   5.04081633,   7.06122449,
         9.08163265,  11.10204082,  13.12244898,  15.14285714,
        17.16326531,  19.18367347,  21.20408163,  23.2244898 ,
        25.24489796,  27.26530612,  29.28571429,  31.30612245,
        33.32653061,  35.34693878,  37.36734694,  39.3877551 ,
        41.40816327,  43.42857143,  45.44897959,  47.46938776,
        49.48979592,  51.51020408,  53.53061224,  55.55102041,
        57.57142857,  59.59183673,  61.6122449 ,  63.63265306,
        65.65306122,  67.67346939,  69.69387755,  71.71428571,
        73.73469388,  75.75510204,  77.7755102 ,  79.79591837,
        81.81632653,  83.83673469,  85.85714286,  87.87755102,
        89.89795918,  91.91836735,  93.93877551,  95.95918367,
        97.97959184, 100.        ])

In [68]:
np.linspace(1,100,25)
#Return evenly spaced numbers over a specified interval.

array([  1.   ,   5.125,   9.25 ,  13.375,  17.5  ,  21.625,  25.75 ,
        29.875,  34.   ,  38.125,  42.25 ,  46.375,  50.5  ,  54.625,
        58.75 ,  62.875,  67.   ,  71.125,  75.25 ,  79.375,  83.5  ,
        87.625,  91.75 ,  95.875, 100.   ])

In [69]:
np.linspace(1,100,25,endpoint=False)

array([ 1.  ,  4.96,  8.92, 12.88, 16.84, 20.8 , 24.76, 28.72, 32.68,
       36.64, 40.6 , 44.56, 48.52, 52.48, 56.44, 60.4 , 64.36, 68.32,
       72.28, 76.24, 80.2 , 84.16, 88.12, 92.08, 96.04])

In [70]:
np.linspace(1,100,25,retstep=True)
#retstep=True returns a tuple (samples, step); the step size here is 4.125

(array([  1.   ,   5.125,   9.25 ,  13.375,  17.5  ,  21.625,  25.75 ,
         29.875,  34.   ,  38.125,  42.25 ,  46.375,  50.5  ,  54.625,
         58.75 ,  62.875,  67.   ,  71.125,  75.25 ,  79.375,  83.5  ,
         87.625,  91.75 ,  95.875, 100.   ]),
 4.125)
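
The step sizes seen above follow from the interval length and the number of points: `(stop - start) / (num - 1)` when the endpoint is included, and `(stop - start) / num` when it is not. A quick check, using variable names chosen for this sketch:

```python
import numpy as np

start, stop, num = 1, 100, 25

_, step_closed = np.linspace(start, stop, num, retstep=True)
_, step_open = np.linspace(start, stop, num, endpoint=False, retstep=True)

print(step_closed, (stop - start) / (num - 1))   # 4.125 4.125
print(step_open, (stop - start) / num)           # 3.96 3.96
```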

## logspace

In [71]:
np.logspace(1,100,25,base=2,endpoint=False)

array([2.00000000e+00, 3.11249583e+01, 4.84381515e+02, 7.53817723e+03,
       1.17312726e+05, 1.82567685e+06, 2.84120580e+07, 4.42162061e+08,
       6.88113785e+09, 1.07087564e+11, 1.66654799e+12, 2.59356184e+13,
       4.03622520e+14, 6.28136706e+15, 9.77536439e+16, 1.52128905e+18,
       2.36750291e+19, 3.68442146e+20, 5.73387323e+21, 8.92332826e+22,
       1.38869110e+24, 2.16114763e+25, 3.36328150e+26, 5.23409982e+27,
       8.14555693e+28])
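
`np.logspace` spaces the exponents evenly and then raises `base` to them, so the call above should match `2 ** np.linspace(...)` with the same arguments. A minimal sketch of that equivalence:

```python
import numpy as np

exponents = np.linspace(1, 100, 25, endpoint=False)   # evenly spaced exponents
by_hand = 2.0 ** exponents
built_in = np.logspace(1, 100, 25, base=2, endpoint=False)

print(np.allclose(by_hand, built_in))   # True
print(built_in[0])                      # 2.0, i.e. 2 ** 1
```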

## <mark> NumPy aggregations </mark>

In [72]:
import numpy as np
one=np.array([1,3,4,934,2])

In [73]:
one.argmax()
#Gives the index of the maximum value

3

In [74]:
one.argsort()
#Gives the sorted indices(not values) of the array

array([0, 4, 1, 2, 3], dtype=int64)

In [75]:
ar=np.array([[1, 2, 3],
       [4, 5, 6],
       [7, 1, 0]])
ar

array([[1, 2, 3],
       [4, 5, 6],
       [7, 1, 0]])

In [76]:
ar.argmin()
#With no axis given, returns the index of the minimum value in the flattened array

8

In [77]:
ar.argmax()
#With no axis given, returns the index of the maximum value in the flattened array

6

In [78]:
ar.argmax(axis=0) #Columnwise

array([2, 1, 1], dtype=int64)

In [79]:
ar.argmax(axis=1)

array([2, 2, 0], dtype=int64)

In [80]:
ar.argsort()
#Returns the indices that would sort this array.

array([[0, 1, 2],
       [0, 1, 2],
       [2, 1, 0]], dtype=int64)

In [81]:
np.sqrt(ar)

array([[1.        , 1.41421356, 1.73205081],
       [2.        , 2.23606798, 2.44948974],
       [2.64575131, 1.        , 0.        ]])

In [82]:
ar.sum()

29

In [83]:
ar.min()

0

In [84]:
ar.max()

7
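
`sum`, `min`, and `max` accept the same `axis` argument as `argmax`: `axis=0` aggregates down each column and `axis=1` across each row. The sketch below also uses `np.unravel_index` to turn the flat index returned by `argmax()` back into a (row, column) position.

```python
import numpy as np

ar = np.array([[1, 2, 3],
               [4, 5, 6],
               [7, 1, 0]])

print(ar.sum(axis=0))   # [12  8  9]  column sums
print(ar.sum(axis=1))   # [ 6 15  8]  row sums
print(ar.max(axis=0))   # [7 5 6]     column maxima
print(ar.min(axis=1))   # [1 4 0]     row minima

# Flat index 6 corresponds to row 2, column 0 (the value 7)
flat_index = ar.argmax()
row, col = np.unravel_index(flat_index, ar.shape)
print(int(row), int(col))   # 2 0
```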

## <mark> Datatypes in array </mark>
    
The best way to change the data type of an existing array is to make a copy of the array with the astype() method.

The astype() method creates a copy of the array and lets you specify the data type as a parameter.

The data type can be specified using a string, like 'f' for float or 'i' for integer, or you can use the data type directly, like float for float and int for integer.
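
A short sketch of `astype()` with both a string code and a Python type. The arrays are made up for the example; note that the conversion returns a copy and leaves the original untouched.

```python
import numpy as np

floats = np.array([1.1, 2.9, -3.5])

as_int = floats.astype('i')   # fractional part is truncated toward zero
print(as_int)                 # [ 1  2 -3]
print(floats)                 # [ 1.1  2.9 -3.5]  original is unchanged

flags = np.array([1, 0, 3]).astype(bool)
print(flags)                  # [ True False  True]  non-zero values become True

print(np.array([1, 2, 3]).astype(float).dtype)   # float64
```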

In [85]:
arr = np.array(['apple', 'banana', 'cherry'])
print(arr.dtype)

<U6


In [86]:
arr = np.array([1, 2, 3, 4])
print(arr.dtype)

int32


In [87]:
arr = np.array([1, 2, 3, 4], dtype='S')
print(arr)
print(arr.dtype)

[b'1' b'2' b'3' b'4']
|S1
