# **NumPy: An Introduction**

In [97]:
# installation
# pip install numpy

In [98]:
# importing
import numpy as np

## What is an array?
An array is a collection of elements of the same data type stored in contiguous memory locations. In Python, the closest built-in data structure is the `list`, but unlike a traditional array, a `list` can store elements of different data types.

If you need a true array in Python, you can create one with the `numpy.array()` function from the NumPy library. It returns an `ndarray`, which stores its elements as a single data type and is more efficient for numerical operations.
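
To make the difference concrete, here is a minimal sketch comparing a list with an array built from it. The variable names `mixed_list` and `arr` are only illustrative.

```python
import numpy as np

mixed_list = [1, 2.5, 3]        # a list keeps each element's own type
arr = np.array(mixed_list)      # NumPy converts every element to one common dtype

print([type(x).__name__ for x in mixed_list])   # ['int', 'float', 'int']
print(arr.dtype)                                # float64
```

Because one element is a float, NumPy stores the whole array as `float64`.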

In [99]:
# making a simple numpy array
np.arange(5)

array([0, 1, 2, 3, 4])

In [100]:
# another way
a = np.arange(6)
print(a)

[0 1 2 3 4 5]


In [101]:
# converting 'a' into a 2D array (adds a new leading axis)
a2 = a[np.newaxis, :]
print(a2)

[[0 1 2 3 4 5]]


In [102]:
# checking rows vs columns for 1D
a.shape

(6,)

In [103]:
# checking row vs column for 2D
a2.shape

(1, 6)

In [104]:
# column version of 'a2': the new axis goes after the existing one
a2_rev = a[:, np.newaxis]
print(a2_rev)

# checking row vs column for 2D
a2_rev.shape

[[0]
 [1]
 [2]
 [3]
 [4]
 [5]]


(6, 1)

In [105]:
# similarly, a 3D array (another leading axis added to 'a2')
a3 = a2[np.newaxis, :]
print(a3)

# checking row vs column for 3D
a3.shape

[[[0 1 2 3 4 5]]]


(1, 1, 6)
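
As a side note, the same shapes can usually be produced without `np.newaxis`. The sketch below uses `reshape` and `np.expand_dims`; the names `row`, `col`, and `cube` are illustrative and not part of the notebook above.

```python
import numpy as np

a = np.arange(6)

row = a.reshape(1, 6)                 # same result as a[np.newaxis, :]
col = a.reshape(-1, 1)                # same result as a[:, np.newaxis]; -1 lets NumPy infer the length
cube = np.expand_dims(row, axis=0)    # same result as row[np.newaxis, :]

print(row.shape, col.shape, cube.shape)   # (1, 6) (6, 1) (1, 1, 6)
```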

## Why 1D, 2D, 3D etc. arrays in Data Science?
The specific use of 1D, 2D, and 3D arrays in data science depends on the nature of the problem and the type of data being worked with. For example:

- 1D arrays are commonly used for univariate time series analysis, feature engineering, and as input to many machine learning models.
- 2D arrays are the backbone of many machine learning algorithms, such as linear regression, logistic regression, and neural networks, which operate on tabular data.
- 3D arrays represent color images (height, width, channels), grayscale video, medical imaging volumes, and other volumetric data, and are often used in deep learning models for tasks like video analysis and 3D object recognition.

The choice of array dimensionality depends on the structure and characteristics of the data and on the specific requirements of the data science task at hand.
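
As a rough illustration, the sketch below builds placeholder arrays with shapes typical of each case. The sizes (365 days, 100 samples, a 32x32 image) are made-up assumptions, not data from this notebook.

```python
import numpy as np

series = np.zeros(365)            # 1D: one value per day (hypothetical time series)
table = np.zeros((100, 4))        # 2D: 100 samples x 4 features (hypothetical table)
image = np.zeros((32, 32, 3))     # 3D: height x width x color channels (hypothetical image)

print(series.ndim, table.ndim, image.ndim)   # 1 2 3
```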

### 1. Creating arrays in `numpy`

In [106]:
a = np.array([1,2,3,4])
b = np.array([(1,2,3,4), (5,6,7,8)])

Let's see the arrays and their dimensions!

In [107]:
print(a)
a.shape

[1 2 3 4]


(4,)

Hence, `a` is a 1D array.

In [108]:
print(b)
b.shape

[[1 2 3 4]
 [5 6 7 8]]


(2, 4)

And `b` is a 2D array.

In [109]:
# type of array
# type is n-dimensional array
print(type(a))
print(type(b))

<class 'numpy.ndarray'>
<class 'numpy.ndarray'>


### 2. Initializing Arrays

In [110]:
zeros = np.zeros((3,5)) # (rows, columns)
zeros

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [111]:
# to initialize array with `1`
ones = np.ones((3,5))
ones

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [112]:
ones.dtype

dtype('float64')

In [113]:
# creating full array with desired value
full = np.full((2,5), 7)    # 7 is desired value here
print(full)
print(full.dtype)

[[7 7 7 7 7]
 [7 7 7 7 7]]
int64


All the arrays created or initialized so far hold either integers (`int64` here; the default integer type can differ by platform and NumPy version) or floats (`float64`).
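
If the default is not what you want, `np.zeros`, `np.ones`, and `np.full` all accept a `dtype` argument. A minimal sketch, with illustrative variable names:

```python
import numpy as np

int_zeros = np.zeros((3, 5), dtype=int)          # integers instead of the float64 default
float_full = np.full((2, 5), 7, dtype=float)     # floats even though the fill value 7 is an int
small_ones = np.ones((3, 5), dtype=np.float32)   # 32-bit floats use half the memory of float64

print(int_zeros.dtype.kind, float_full.dtype, small_ones.dtype)
```

The first value printed is the dtype's kind code (`i` for signed integer) rather than its exact name, since the integer width depends on the platform.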

In [114]:
# full also works with strings (the result has a Unicode string dtype, not `object`)
full = np.full((2,5), 'a')
full

array([['a', 'a', 'a', 'a', 'a'],
       ['a', 'a', 'a', 'a', 'a']], dtype='<U1')

In [115]:
# data type of elements
# '<U1' means a Unicode string of length 1; it is not a float or int type
full.dtype

dtype('<U1')

In [116]:
# creating an identity matrix (all diagonal entries are 1 and all others are 0)
identity = np.eye(5)
identity

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

### 3. Attributes of Array

In [117]:
# checking dimensionality
a3.ndim

3

In [118]:
# number of elements in array
a3.size

6

In [119]:
# length of array (size of the first axis, not the total number of elements)
len(a3)

1

In [120]:
# shape of array: one size per axis; a printed 3D array reads as (blocks, rows, columns)
a3.shape

(1, 1, 6)
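
These attributes are related to each other, which the short check below illustrates. It rebuilds an array with the same shape as `a3` so that it runs on its own.

```python
import numpy as np

a3 = np.arange(6)[np.newaxis, np.newaxis, :]   # shape (1, 1, 6), same as a3 above

print(a3.ndim == len(a3.shape))       # True: ndim is the number of axes
print(a3.size == np.prod(a3.shape))   # True: size is the product of the axis lengths
print(len(a3) == a3.shape[0])         # True: len() only looks at the first axis
```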

### 4. Basic Operations on Arrays

In [121]:
# using arrays already created
print(f"First array: {a}")
print("Second array:")
print(b)

First array: [1 2 3 4]
Second array:
[[1 2 3 4]
 [5 6 7 8]]


In [124]:
# subtraction on 'a' and 'b'
c = a - b
c

array([[ 0,  0,  0,  0],
       [-4, -4, -4, -4]])

In [126]:
# addition on 'a' and 'b'
d = a + b
d

array([[ 2,  4,  6,  8],
       [ 6,  8, 10, 12]])

In [127]:
# another way to add
e = np.add(a, b)
e

array([[ 2,  4,  6,  8],
       [ 6,  8, 10, 12]])

In [128]:
# multiplication
f = a * b
f

array([[ 1,  4,  9, 16],
       [ 5, 12, 21, 32]])

In [129]:
# division
g = a / b
g

array([[1.        , 1.        , 1.        , 1.        ],
       [0.2       , 0.33333333, 0.42857143, 0.5       ]])

In [130]:
# square of each element
h = a ** 2
h

array([ 1,  4,  9, 16])

#### **Key rules and considerations for performing arithmetic operations on arrays:**
1. Conformability:
    - Arrays must have the same shape (dimensions) to perform binary operations like addition, subtraction, multiplication.
    - Mismatched shapes will result in an error, unless broadcasting can be applied.
2. Broadcasting:
    - Allows you to perform element-wise operations on arrays with different shapes.
    - Smaller arrays are "stretched" or "replicated" to match the shape of larger arrays.
    - Follows well-defined rules: shapes are compared from the last axis backward, and two dimensions are compatible when they are equal or one of them is 1.
3. Element-wise vs. Matrix Operations:
    - Element-wise operations apply the operation to each corresponding element.
    - Matrix operations (e.g., matrix multiplication) have specific requirements: the number of columns in the first array must match the number of rows in the second.
4. Axis Conventions:
    - Careful alignment of axes (rows, columns, depth) is crucial when working with multidimensional arrays.
    - Misalignment can lead to incorrect results or dimension mismatch errors.

Following these rules helps ensure the correctness and efficiency of your array-based computations in data science tasks.
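
The rules above can be sketched with the same `a` and `b` used earlier. This is a minimal illustration; the names `elementwise` and `matrix` are introduced here only for the example.

```python
import numpy as np

a = np.array([1, 2, 3, 4])                    # shape (4,)
b = np.array([(1, 2, 3, 4), (5, 6, 7, 8)])    # shape (2, 4)

# Broadcasting: (4,) is treated as (1, 4) and stretched along axis 0.
print((a + b).shape)                          # (2, 4)

# Incompatible trailing dimensions (3 vs 4) raise a ValueError.
try:
    np.array([1, 2, 3]) + b
except ValueError as err:
    print("shape mismatch:", err)

# Element-wise product vs matrix product.
elementwise = b * b                           # shape (2, 4), each element squared
matrix = b @ b.T                              # (2, 4) @ (4, 2) -> (2, 2)
print(matrix)

# Axis conventions: axis=0 sums down the rows, axis=1 sums across the columns.
print(b.sum(axis=0))                          # [ 6  8 10 12]
print(b.sum(axis=1))                          # [10 26]
```

Here `b @ b.T` works because the 4 columns of `b` match the 4 rows of `b.T`, while `b @ b` would fail for the same reason the shape rule describes.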