In this stage I brushed up on my Python🐍 skills🛠️ and learned some new things to get started with machine learning.

In [2]:
string = "dfjndfjn"
string[::-1]

'njfdnjfd'

In [3]:
len(string)

8
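The `[::-1]` trick above is one case of Python's general slice syntax, `sequence[start:stop:step]`. Here is a small sketch of a few other slices on the same sample string (the variable names are only illustrative):

```python
# Slice syntax is sequence[start:stop:step]; omitted parts take their defaults.
text = "dfjndfjn"  # same sample string as in the cell above

reversed_text = text[::-1]   # a step of -1 walks the string backwards
every_other = text[::2]      # characters at indices 0, 2, 4, 6
middle = text[2:5]           # indices 2, 3, 4 (the stop index is excluded)

print(reversed_text, every_other, middle)
```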

## Working with the NumPy library

### NumPy is often significantly faster than standard Python code for numerical computations, for several reasons:

1. **Vectorization**
   - **Avoiding loops:** NumPy lets you perform operations on entire arrays without writing explicit Python loops. This approach, called vectorization, relies on efficient low-level implementations in C and avoids the overhead of Python loops.
   - **Batch operations:** An operation on an entire array is executed as a single batch, which reduces the overhead of repeatedly interpreting Python code.
2. **Optimized C and Fortran Libraries**
   - **Core implementation:** NumPy's core is implemented in C, a compiled language that runs numerical loops far more efficiently than interpreted Python.
   - **BLAS and LAPACK:** NumPy can link to highly optimized libraries such as BLAS (Basic Linear Algebra Subprograms) and LAPACK (Linear Algebra Package), which are often written in Fortran, for many of its linear algebra operations. These libraries are tuned for performance on various hardware architectures.
3. **Efficient Memory Management**
   - **Contiguous memory layout:** NumPy arrays are stored in contiguous blocks of memory, which improves cache performance and memory access patterns. Python lists, by contrast, store pointers to objects scattered in memory.
   - **Data types:** NumPy lets you specify data types explicitly, which allows more efficient use of memory and operations tailored to a specific type (e.g., 32-bit or 64-bit floats).
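To make these points concrete, here is a minimal sketch (the array and variable names are just for illustration) that contrasts a vectorized expression with an explicit loop and shows how the chosen data type affects memory use:

```python
import numpy as np

values = np.arange(5, dtype=np.float64)   # [0., 1., 2., 3., 4.]

# Vectorized: one expression applies to every element, no Python-level loop.
vectorized = values * 2.0 + 1.0

# Equivalent explicit loop, shown only for comparison.
looped = np.array([v * 2.0 + 1.0 for v in values])

# Explicit dtypes control memory use: float32 takes half the bytes of float64.
as_float32 = values.astype(np.float32)
print(values.itemsize, as_float32.itemsize)   # bytes per element
print(values.nbytes, as_float32.nbytes)       # total bytes in each buffer

# A freshly created 1-D array occupies one contiguous block of memory.
print(values.flags["C_CONTIGUOUS"])
```

Both versions produce the same numbers; the difference is where the loop runs (compiled code versus the Python interpreter).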

In [4]:
import numpy as np


In [5]:
# Compare how long Python's built-in sum() and NumPy's np.sum() take on the same data
import time

# Create a large array
size = 10**6
array = np.random.rand(size)

# Sum using Python's built-in sum(), which visits the elements one at a time
start_time = time.perf_counter()
total = sum(array)
python_time = time.perf_counter() - start_time
print(f"Python loop: {python_time:.10f} seconds")

# Sum using NumPy's vectorized np.sum()
start_time = time.perf_counter()
total = np.sum(array)
numpy_time = time.perf_counter() - start_time
print(f"NumPy: {numpy_time:.10f} seconds")


Python loop: 0.1173245907 seconds
NumPy: 0.0059792995 seconds
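A single timing like the one above can be noisy. As a rough sketch of a steadier comparison, the standard-library `timeit` module can repeat each call several times. The helper `loop_sum` and the smaller array size are my own choices here, and the exact numbers will vary from machine to machine:

```python
import timeit
import numpy as np

data = np.random.rand(10**5)  # smaller than above to keep the run short

def loop_sum(arr):
    """Add the elements one at a time in a Python-level loop."""
    total = 0.0
    for x in arr:
        total += x
    return total

# timeit runs each call several times, which smooths out one-off noise.
loop_time = timeit.timeit(lambda: loop_sum(data), number=5) / 5
numpy_time = timeit.timeit(lambda: np.sum(data), number=5) / 5

print(f"Explicit loop: {loop_time:.6f} s per call")
print(f"np.sum:        {numpy_time:.6f} s per call")
```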


In [6]:
np1 = np.array([1, 2])
print(type(np1))
np1

<class 'numpy.ndarray'>


array([1, 2])

In [7]:
np1.shape

(2,)

In [8]:
# 2-D numpy array
Mat1 = np.array([[1, 2, 3], [10, 11, 12]])
Mat1

array([[ 1,  2,  3],
       [10, 11, 12]])

In [9]:
# Update the element at row index 1, column index 2 (indices are zero-based)
Mat1[1, 2] = 100

In [10]:
Mat1

array([[  1,   2,   3],
       [ 10,  11, 100]])

In [11]:
Mat1.shape

(2, 3)

In [12]:
Mat1.ndim

2

In [13]:
# Get the data type of the elements
Mat1.dtype

dtype('int32')
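The `int32` shown above is the default integer type on the machine that ran this notebook; other platforms (for example, typical 64-bit Linux and macOS installs) report `int64` instead. A small sketch, assuming you want the same type everywhere, is to request the dtype explicitly:

```python
import numpy as np

# Requesting the dtype explicitly gives the same result on every platform.
mat = np.array([[1, 2, 3], [10, 11, 12]], dtype=np.int64)
print(mat.dtype)
print(mat.itemsize)     # bytes per element

# astype returns a converted copy; the original array is left unchanged.
as_float = mat.astype(np.float32)
print(as_float.dtype, mat.dtype)
```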

In [14]:
np2 = np.array([1, 2, 3.0])

In [15]:
np2.dtype

dtype('float64')

In [16]:
np3 = np.array([1, 2.34, "dfjbvdj"])

In [17]:
np3

array(['1', '2.34', 'dfjbvdj'], dtype='<U32')

In [18]:
np3[0] = "jgjrnfjg"

In [19]:
np3

array(['jgjrnfjg', '2.34', 'dfjbvdj'], dtype='<U32')
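The `<U32` dtype above means every element was converted to a fixed-width Unicode string. One consequence worth knowing is that such arrays silently cut off values longer than their width. The sketch below illustrates this with a deliberately short array (`short` and `as_objects` are illustrative names), along with `dtype=object` as one possible alternative:

```python
import numpy as np

# Mixed inputs are coerced to one common dtype; here everything becomes text.
mixed = np.array([1, 2.34, "dfjbvdj"])
print(mixed.dtype.kind)     # 'U' means fixed-width Unicode strings

# Fixed-width string arrays silently truncate values that are too long.
short = np.array(["ab", "cd"])   # each element holds at most 2 characters
short[0] = "hello"
print(short[0])                  # only the first 2 characters are kept

# dtype=object keeps the original Python objects instead, at the cost of speed.
as_objects = np.array([1, 2.34, "dfjbvdj"], dtype=object)
print([type(item).__name__ for item in as_objects])
```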

In [20]:
# arange in NumPy: values from 1 up to (but not including) 11, in steps of 3
Mat = np.arange(1, 11, 3)
Mat

array([ 1,  4,  7, 10])

In [21]:
# linspace in NumPy: 10 equidistant values from 1 to 11, both endpoints included
Mat = np.linspace(1, 11, 10)
Mat

array([ 1.        ,  2.11111111,  3.22222222,  4.33333333,  5.44444444,
        6.55555556,  7.66666667,  8.77777778,  9.88888889, 11.        ])
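The difference between the two calls above is that `arange` takes a step size while `linspace` takes a number of points. As a quick sketch, the `retstep=True` option of `linspace` returns the spacing it used, which should equal `(stop - start) / (num - 1)`:

```python
import numpy as np

# arange: fixed step, stop value excluded.
stepped = np.arange(1, 11, 3)

# linspace: fixed number of points, stop value included by default.
# retstep=True also returns the spacing between consecutive points.
points, spacing = np.linspace(1, 11, 10, retstep=True)
print(spacing)              # 10 / 9

# With 11 points over the same range the spacing becomes 1.
whole_numbers = np.linspace(1, 11, 11)
print(whole_numbers)
```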

In [22]:
# 5 x 5 matrix of samples drawn from the uniform distribution on [0, 1)
Mat4 = np.random.rand(5, 5)

In [23]:
Mat4

array([[0.29314591, 0.90784825, 0.51717775, 0.35383567, 0.19082127],
       [0.38663777, 0.50702301, 0.46771796, 0.38294166, 0.49073666],
       [0.4626259 , 0.48518575, 0.76507096, 0.1096412 , 0.70794794],
       [0.30004367, 0.97183243, 0.44112222, 0.10931886, 0.95246676],
       [0.20708138, 0.02016581, 0.28167137, 0.90834675, 0.23938912]])
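Because these values are random, they change on every run. If reproducible output is wanted, one option is NumPy's newer `Generator` interface created with `np.random.default_rng`, sketched below for both uniform and normally distributed samples. The seed `42` is an arbitrary choice for illustration:

```python
import numpy as np

# default_rng is NumPy's newer random interface; 42 is an arbitrary seed.
rng = np.random.default_rng(42)

uniform = rng.random((5, 5))             # uniform on [0, 1), like np.random.rand
normal = rng.standard_normal((10, 10))   # mean 0, std 1, like np.random.randn

# Re-creating the generator with the same seed reproduces the same numbers.
repeat = np.random.default_rng(42).random((5, 5))
print(np.array_equal(uniform, repeat))
```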

In [24]:
# Random numbers drawn from the standard normal distribution (mean 0, standard deviation 1)
Mat5 = np.random.randn(10, 10)
Mat5

array([[-0.67355798, -0.02235885,  0.67567077, -1.21010778,  0.50796134,
        -0.70611037, -0.27221395, -1.64035244, -1.12631559, -0.58081472],
       [ 1.68179259,  0.21798361,  0.59429681, -1.79182626,  0.11886   ,
        -1.26404909, -0.73624553,  0.65013451, -0.72456501,  0.84924981],
       [-1.27827008,  0.55389367,  0.68328092,  0.03645082, -0.7625546 ,
         1.00855569,  0.03018501,  0.90439197,  0.72336735, -0.03946908],
       [ 0.46475266,  0.43164095,  0.5864705 , -1.82281769,  1.760719  ,
         0.91322562, -1.26623298, -1.66835818,  0.85770291, -0.28245957],
       [ 0.80435109, -0.48259848, -0.99091503,  0.1025411 ,  0.41056233,
        -0.17973955, -0.18371402,  0.16425297, -1.51356756,  0.20940054],
       [ 0.83639053,  0.31263172, -1.07750761, -0.10195477,  0.12109395,
        -0.76770903, -0.71847545, -1.46830526, -0.23255862,  1.37160853],
       [ 0.16400113, -1.32548826, -0.04625231, -0.01749812,  0.36072261,
         0.49343805, -1.00870885,  0.04111693

In [25]:
np.diag([10, 10, 11 , 12])

array([[10,  0,  0,  0],
       [ 0, 10,  0,  0],
       [ 0,  0, 11,  0],
       [ 0,  0,  0, 12]])

In [26]:
np.zeros([2, 4])

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [27]:
np.ones([2, 2, 2])

array([[[1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.]]])
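A few details of these constructors are easy to miss, so here is a short sketch (variable names are illustrative): `np.diag` works in both directions, and `zeros` and `ones` produce floats unless a `dtype` is given.

```python
import numpy as np

# Given a 1-D sequence, np.diag builds a square matrix with it on the diagonal.
diagonal_matrix = np.diag([10, 10, 11, 12])

# Given a 2-D array, the same function extracts the diagonal instead.
recovered = np.diag(diagonal_matrix)
print(recovered)

# zeros and ones default to float64; pass dtype to get integers.
int_zeros = np.zeros((2, 4), dtype=int)
print(int_zeros.dtype.kind)    # 'i' for a signed integer type

# The shape argument sets the number of dimensions: (2, 2, 2) gives a 3-D array.
cube = np.ones((2, 2, 2))
print(cube.ndim, cube.size)
```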

In [28]:
# Slicing
Mat5 = np.random.randn(5, 6)
Mat5

array([[ 1.66805149, -0.14105884, -1.73216101, -0.69621141,  0.92012134,
         0.55316262],
       [ 1.68099655,  0.01329926, -0.18646611, -0.21416044,  1.26906115,
        -0.4252117 ],
       [ 0.43328274,  0.20481825,  1.42794289,  0.02299173, -0.8506473 ,
         1.56696482],
       [ 0.62556233,  0.87983377,  0.6774441 ,  1.5131373 , -1.5698599 ,
        -0.37205481],
       [ 0.01832217,  0.00868064,  0.24105411,  0.29185493, -1.31914451,
        -0.26706116]])

In [29]:
Mat5[0, 0]

1.6680514942721407

In [30]:
# Rows with index 0, 1 and 2 and all columns
Mat5[0:3, :]

array([[ 1.66805149, -0.14105884, -1.73216101, -0.69621141,  0.92012134,
         0.55316262],
       [ 1.68099655,  0.01329926, -0.18646611, -0.21416044,  1.26906115,
        -0.4252117 ],
       [ 0.43328274,  0.20481825,  1.42794289,  0.02299173, -0.8506473 ,
         1.56696482]])
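The same `start:stop` notation works along each axis. The sketch below uses a small integer array (`grid`, chosen so the values are easy to follow) to show a few more slices, and one behavior worth keeping in mind: a basic slice is a view onto the original data, not a copy.

```python
import numpy as np

grid = np.arange(30).reshape(5, 6)   # 5 rows, 6 columns, values 0..29

first_rows = grid[0:3, :]     # rows 0, 1, 2 and every column
last_column = grid[:, -1]     # every row, last column only
corner = grid[1:3, 2:4]       # rows 1-2, columns 2-3

# Basic slices are views: writing through one changes the original array.
first_rows[0, 0] = -1
print(grid[0, 0])

# Call .copy() when an independent array is needed.
independent = grid[0:3, :].copy()
independent[0, 0] = 999
print(grid[0, 0])
```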

In [31]:
type(Mat5.shape)

tuple