# Introduction to NumPy

NumPy (Numerical Python) is a powerful, open-source library used for numerical and scientific computing in Python. It provides support for large multidimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays. NumPy is the foundational package for many scientific computing libraries in Python, including pandas, SciPy, and scikit-learn.

---

# Installing NumPy

```python
!pip install numpy --upgrade

```
---

# How to import NumPy

```python
import numpy as np
```
---

# Python list vs. NumPy array

While Python lists are flexible and can hold different types of data, NumPy arrays are more efficient for numerical operations. Here are some key differences:

1. Performance: NumPy arrays are faster and more memory efficient than Python lists because they are implemented in C and use contiguous memory blocks.
2. Functionality: NumPy provides a wide range of mathematical and statistical functions that operate on arrays, which are not available for Python lists.
3. Type Consistency: NumPy arrays are homogeneous, meaning all elements have the same type, while Python lists can contain elements of different types.
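
The differences above can be seen in a short sketch. The variable names are illustrative only; the example contrasts how the same operator behaves on a list and on an array, and how mixed input is converted to one common type.

```python
import numpy as np

py_list = [1, 2, 3]
np_arr = np.array([1, 2, 3])

# `*` repeats a list but multiplies an array element-wise
list_doubled = py_list * 2   # [1, 2, 3, 1, 2, 3]
arr_doubled = np_arr * 2     # array([2, 4, 6])

# Mixed input types are converted to a single common dtype
mixed = np.array([1, 2.5, 3])

print(list_doubled)
print(arr_doubled)
print(mixed.dtype)           # float64
```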

---

# Creating NumPy Arrays

There are several ways to create NumPy arrays. Here are a few common methods:

## 1. Converting Python Sequence to NumPy Arrays

In [None]:
a1D = np.array([1, 2, 3, 4]) # np.array() is a function not a method.
a2D = np.array([[1, 2], [3, 5]])
print(a2D)

In [None]:
a2 = np.array([1, 2, 3], dtype= np.int8)
a2

### Choosing a dtype may not seem important, but a dtype that is too small for your data can give unwanted results.

In [None]:
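# Casting to int8 wraps 128 around to -128.
# (In NumPy 2.x, np.array([127, 128, 120], dtype=np.int8) raises OverflowError instead.)
a3 = np.array([127, 128, 120]).astype(np.int8)
a3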

### An 8-bit signed integer represents integers from -128 to 127. Values outside this range overflow an int8 array: depending on the NumPy version and how the conversion happens, they either wrap around (128 becomes -128) or raise an error.
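
One way to avoid surprises is to look up the range of a dtype before using it. The sketch below uses `np.iinfo`; the `values` list and the `fits` name are illustrative and not part of the original notebook.

```python
import numpy as np

# Look up the representable range of an integer dtype
info = np.iinfo(np.int8)
print(info.min, info.max)  # -128 127

# Check candidate values against that range before storing them
values = [127, 128, 120]
fits = [info.min <= v <= info.max for v in values]
print(fits)  # [True, False, True]
```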

In [None]:
a4 = np.array([2, 3, 4], dtype=np.uint32)

b4 = np.array([5, 6, 7], dtype=np.uint32)

c_unsigned32 = a4 - b4

print('unsigned c:', c_unsigned32, c_unsigned32.dtype)
# unsigned c: [4294967293 4294967293 4294967293] uint32

# 2 - 5 = -3 mod 2^32 = 4294967293

c_signed32 = a4 - b4.astype(np.int32)

print('signed c:', c_signed32, c_signed32.dtype)
# signed c: [-3 -3 -3] int64

---
## 2. Intrinsic NumPy array creation functions

NumPy has over 40 built-in functions for creating arrays, and these can be divided into roughly three categories based on the dimension of the array they create:

### 1D arrays

#### 1. numpy.arange()
 - `numpy.arange` creates arrays with regularly incrementing values.
 - It takes up to three positional inputs: `start`, `stop`, and `step`. The `stop` value is not included in the result.

In [None]:
a5 = np.arange(10)
b5 = np.arange(2, 10, dtype=float)
c5 = np.arange(2, 3, 0.1)
c5

#### 2. numpy.linspace()

- It will create arrays with a specified number of elements, and spaced equally between the specified beginning and end values.
- The advantage of this creation function is that you guarantee the number of elements and the starting and end point.

In [None]:
np.linspace(1., 4., 6)
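
As a small sketch of the guarantees described above, the example below uses the `endpoint` and `retstep` keyword arguments of `np.linspace`. The variable names are illustrative.

```python
import numpy as np

# Five points from 0 to 1, with both ends included
pts = np.linspace(0.0, 1.0, 5)
print(pts)        # values: 0, 0.25, 0.5, 0.75, 1

# endpoint=False leaves out the stop value
half_open = np.linspace(0.0, 1.0, 5, endpoint=False)
print(half_open)  # values: 0, 0.2, 0.4, 0.6, 0.8

# retstep=True also returns the spacing between samples
samples, step = np.linspace(0.0, 1.0, 5, retstep=True)
print(step)       # 0.25
```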

### 2D arrays

#### 1. numpy.eye()

- It creates a 2D array in which the elements where i=j (row index equals column index) are 1 and the rest are 0.

In [None]:
a6 = np.eye(3) # 3 rows and 3 columns
b6 = np.eye(3, 5) # number of rows first, then number of columns
b6

#### 2. numpy.diag()

- Given a 1D sequence, it builds a square 2D array with those values along the diagonal; given a 2D array, it returns a 1D array containing only the diagonal elements.
- The second input in the function, called k, specifies which diagonal to work with.
  - k=0 (default): Main diagonal
  - k>0: Diagonal above the main diagonal
  - k<0: Diagonal below the main diagonal

In [None]:
a7 = np.diag([1, 2, 3])
b7 = np.diag([[1, 2], [3, 4]])
c7 = np.diag([1, 2, 3], 1)
c7
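
To make the role of `k` concrete, here is a brief sketch that extracts different diagonals from a 3x3 array and builds an array with values below the main diagonal. The array `m` and the other names are illustrative.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

main_diag = np.diag(m)     # array([1, 5, 9])
upper = np.diag(m, k=1)    # array([2, 6])
lower = np.diag(m, k=-1)   # array([4, 8])

# Build a 3x3 array with 1 and 2 on the first diagonal below the main one
built = np.diag([1, 2], k=-1)
print(built)
```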

### ndarrays

#### 1. numpy.zeros

- It creates an array filled with 0 values with the specified shape. The default dtype is `float64`.

In [None]:
a8 = np.zeros((2, 3))
b8 = np.zeros([2, 2])
b8

#### 2. numpy.ones

- It creates an array filled with 1 values. It is identical to `zeros` in all other respects.

In [None]:
a9 = np.ones((2, 3))
b9 = np.ones([2, 2])
b9
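
Both functions accept a `dtype` argument when the `float64` default is not wanted. A minimal sketch, with illustrative names:

```python
import numpy as np

default_zeros = np.zeros(2)              # float64 unless told otherwise
int_zeros = np.zeros((2, 3), dtype=int)  # integer zeros
bool_ones = np.ones(4, dtype=bool)       # every element is True

print(default_zeros.dtype)  # float64
print(int_zeros)
print(bool_ones)
```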

---
## 3. Replicating, joining, or mutating existing arrays

Once you have created arrays, you can replicate, join, or mutate those existing arrays to create new arrays.

### Replicating Arrays

It involves creating copies or repeating elements of an array.

#### 1. numpy.copy()

- This creates a new array that is a copy of the original array.

In [None]:
a10 = np.array([1, 2, 3])
b10 = np.copy(a10)
b10

#### 2. numpy.tile()

- This repeats the whole array the given number of times.

In [None]:
a11 = np.array([8, 7, 9])
b11 = np.tile(a11, 2)
b11

#### 3. numpy.repeat()

- Like `tile`, it repeats values, but it repeats each element in place before moving to the next, so the order of elements differs.

In [None]:
a12 = np.array([8, 7, 9])
b12 = np.repeat(a12, 2)
b12
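
The difference in ordering is easiest to see side by side. A short sketch using the same values as above (the names `arr`, `tiled`, and `repeated` are illustrative):

```python
import numpy as np

arr = np.array([8, 7, 9])

tiled = np.tile(arr, 2)       # whole array repeated: [8, 7, 9, 8, 7, 9]
repeated = np.repeat(arr, 2)  # each element repeated: [8, 8, 7, 7, 9, 9]

print(tiled)
print(repeated)
```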

### Joining Arrays
You can also create arrays by joining multiple arrays.

#### 1. numpy.concatenate()

- It is more flexible and can concatenate along any specified axis.
- For a 2D array, `axis=0` joins along rows (stacking vertically) and `axis=1` joins along columns (stacking side by side).

In [None]:
a13 = np.array([8, 7, 9])
b13 = np.array([1, 2, 3])
c13 = np.concatenate((b13, a13)) # 1D arrays only have axis 0; joining along axis 1 needs 2D arrays
c13
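
For 2D input, the `axis` argument decides the direction of the join. The sketch below uses small illustrative arrays; the shapes must match along every axis except the one being joined.

```python
import numpy as np

top = np.array([[1, 2],
                [3, 4]])      # shape (2, 2)
bottom = np.array([[5, 6]])   # shape (1, 2)
side = np.array([[7],
                 [8]])        # shape (2, 1)

rows = np.concatenate((top, bottom), axis=0)  # shape (3, 2)
cols = np.concatenate((top, side), axis=1)    # shape (2, 3)

print(rows)
print(cols)
```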

#### 2. numpy.hstack()

- It is used to stack arrays horizontally (column-wise).
- It is specifically for horizontal stacking (along the second axis for 2D arrays).

In [None]:
a14 = np.array([1, 2, 3])
b14 = np.array([4, 5, 6])
c14 = np.hstack((a14, b14))
c14

#### 3. numpy.vstack()

- It is used to stack arrays vertically (row-wise).

In [None]:
a15 = np.array([1, 2, 3])
b15 = np.array([4, 5, 6])
c15 = np.vstack((a15, b15))
c15

### Mutating Arrays
It involves modifying the content of existing arrays.

In [None]:
a16 = np.array([1, 2, 3])
a16[0] = 10 # Changes the first element of a16 to 10, resulting in [10, 2, 3].
a16

---
# Indexing

In NumPy, ndarray objects can be indexed using `x[obj]` syntax, where `x` is the array and `obj` is the selection.

## 1. Basic Indexing

- Single Element Indexing:

   - Works like standard Python sequences with 0-based indexing.
   - Supports negative indices for indexing from the end.
   - In multidimensional arrays, indices can be separated by commas.
- Subdimensional Arrays:
  - Indexing with fewer indices than dimensions returns a subdimensional array.

In [None]:
a18 = np.arange(10)
a18[2]  # Output: 2
a18[-2]  # Output: 8

In [None]:
a19 = np.array([[1, 2, 3],[4, 5, 6]])
a19[0, 1]
a19[0]

## 2. Slicing and Striding

- Slicing is used to access a subset of elements from an array.
- Slices are defined by the `start:stop:step` notation.
- Basic slicing returns views, not copies.

In [None]:
a20 = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
a20[1:7:2]  # Output: array([1, 3, 5])
a20[-2:10]  # Output: array([8, 9])

a21 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
a21[::2, :2:]
a20[::-1] # select all in reverse direction
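
The same `start:stop:step` notation applies to each axis of a multidimensional array. A minimal sketch on a 3x3 array, with illustrative names:

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

corners = grid[::2, ::2]    # every other row and column: [[1, 3], [7, 9]]
last_col = grid[:, -1]      # all rows, last column: [3, 6, 9]
reversed_rows = grid[::-1]  # rows in reverse order

print(corners)
print(last_col)
print(reversed_rows)
```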

## 3. Advanced Indexing

- You can use arrays of integers or boolean values to select specific elements.
- You can use multiple arrays to index into a multidimensional array.
- Returns a copy of the data.


In [None]:
a22 = np.array([10, 20, 30, 40])
idx = np.array([0, 2])
a22[idx] # Output: [10 30]

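manual_mask = np.array([True, False, True, False]) # mask written out by hand
a22[manual_mask] # Output: [10 30]

bool_idx = a22 > 20 # mask built from a condition
a22[bool_idx] # Output: [30 40]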

a23 = np.array([[1, 2], [3, 4], [5, 6]])
row_idx = np.array([0, 2])
col_idx = np.array([1, 0])
a23[row_idx, col_idx]
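
A boolean mask can also be used on the left-hand side of an assignment. The sketch below, with illustrative names, also shows that the selection taken earlier is a copy and is not affected by the later assignment.

```python
import numpy as np

scores = np.array([10, 20, 30, 40])
mask = scores > 20

selected = scores[mask]  # copy of the matching elements: [30, 40]
scores[mask] = 0         # assignment through the mask changes `scores` in place

print(scores)            # [10 20  0  0]
print(selected)          # [30 40]
```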

---
# Copies and Views

A NumPy array consists of two main parts:
- Data buffer: This contains the actual data elements stored in contiguous memory.
- Metadata: This includes information about the data such as the data type, shape, strides, etc., which helps in manipulating the array.

## View
- Changes made to the data in the view will be reflected in the original array and vice versa.
- Views are created to enhance performance by avoiding unnecessary data duplication.


In [None]:
# slicing

a24 = np.arange(10)
b24 = a24[2:5] # b24 is a view of a24
b24[0] = 10
a24

## Copy
- Changes made to the data in the copy do not affect the original array and vice versa.
- Copies are useful when you need to modify the data without altering the original array.
- Creating a copy involves additional memory usage and computational overhead.


In [None]:
# Advanced Indexing

a24 = np.arange(9).reshape((3, 3))
b24 = a24[[0, 2]] # b24 is a copy of the selected rows of a24
b24[0, 0] = 10
print(a24,"\n\n", b24, "\n")

# Using np.copy

c24 = np.arange(10)
d24 = np.copy(c24)
d24[0] = 99
print(c24, "\n\n", d24)

## How to Check if an Array is a View or a Copy?
You can use the `.base` attribute of a NumPy array to determine if it is a view or a copy:
- If the `.base` attribute is `None`, the array is a copy.
- If the `.base` attribute points to another array, the array is a view.

In [None]:
a25 = np.arange(6)
b25 = a25.reshape((2, 3))

print("Is b25 a view?", b25.base is a25)  # True because b25 is a view of a25

c25 = np.copy(a25)
print("Is c25 a view?", c25.base is a25)  # False because c25 is a copy of a25
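
As a complementary check, `np.shares_memory` reports whether two arrays overlap in memory. This is a sketch using a function the notebook does not otherwise cover, and the names are illustrative.

```python
import numpy as np

base = np.arange(6)
sliced = base[1:4]        # basic slicing
picked = base[[1, 2, 3]]  # advanced (integer-array) indexing

print(np.shares_memory(base, sliced))  # True
print(np.shares_memory(base, picked))  # False
```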


---
# Broadcasting

- It allows you to perform arithmetic operations on arrays of different shapes.
- Broadcasting means expanding the smaller array's dimensions so that it matches the shape of the larger array without actually copying the data.

#### Multiplying an Array by a Scalar

 - When you combine a scalar with a NumPy array (here, by multiplication), the scalar is broadcast to match the shape of the array:

In [None]:
l1 = [1, 2, 3]
n1 = 3
a26 = np.array(l1)

c26 = a26 * n1

l3 = [x * n1 for x in l1]  # equivalent element-by-element loop in pure Python
c26

#### Adding Arrays of Different Shapes
- If you have a 2D array and a 1D array, NumPy will try to broadcast the smaller array (1D) to match the shape of the larger array (2D).

In [None]:
a = np.array([[1, 2, 3], [4, 5, 6]]) # (2, 3)
b = np.array([1, 2, 3]) # (3,)

result = a + b  # Broadcasting 'b' to match the shape of 'a'
result   # Output: [[2, 4, 6], [5, 7, 9]]
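
Broadcasting can also stretch an array along the other axis. A minimal sketch, with illustrative names, in which a column of shape (2, 1) is added to each column of a (2, 3) array:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])             # shape (2, 3)
column = np.array([10, 20]).reshape(2, 1)  # shape (2, 1)

# Each row of `matrix` gets its own offset from `column`
shifted = matrix + column
print(shifted)  # [[11 12 13], [24 25 26]]
```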

## General Broadcasting Rules

 1. Align shapes from the right; missing leading dimensions are treated as 1.
 2. In each aligned position, the dimensions must be equal, or one of them must be 1.

In [None]:
shape1 = (2, 3, 4)
shape2 = (24, 90, 1, 3, 4)

# Create arrays with the specified shapes
array1 = np.ones(shape1)
array2 = np.ones(shape2)

# Perform a broadcasting operation (e.g., addition)
result = array1 + array2

print("Shape of the result:", result.shape)
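
Shapes can be checked against these rules without allocating any arrays. The sketch below assumes NumPy 1.20 or later, where `np.broadcast_shapes` is available; the `compatible` flag is an illustrative name.

```python
import numpy as np

# Compatible: aligned from the right, every pair is equal or contains a 1
out_shape = np.broadcast_shapes((2, 3, 4), (24, 90, 1, 3, 4))
print(out_shape)  # (24, 90, 2, 3, 4)

# Incompatible: the trailing dimensions 3 and 4 differ and neither is 1
try:
    np.broadcast_shapes((2, 3), (4,))
    compatible = True
except ValueError:
    compatible = False
print(compatible)  # False
```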

---
# Structured Arrays

- Structured arrays in NumPy allow you to create arrays where each element can have a custom data type consisting of multiple fields.
- You can create a structured array using `numpy.array()` with a `dtype` argument that specifies the fields and their data types.

In [None]:
# Define the data types for each field
dtype = np.dtype([
    ('name', 'U20'),   # Unicode string with max length 20 for name
    ('age', np.int32), # 32-bit integer for age
    ('weight', np.float64) # 64-bit float for weight
])

# Create a structured array with predefined data
data = np.array([
    ('Alice', 25, 65.2),
    ('Bob', 30, 72.9),
    ('Charlie', 35, 68.5)
], dtype=dtype)

data

### Accessing Structured Array Elements
 - You can access elements and fields of a structured array using indexing and field names.

In [None]:
# Accessing elements and fields
print(data[0])          # Output: ('Alice', 25, 65.2)

# Accessing a specific field of a specific element
print(data[1]['name'])  # Output: Bob
print(data[2]['weight']) # Output: 68.5

### Manipulating Structured Arrays

- You can manipulate structured arrays similar to regular arrays, including slicing, iterating, and performing operations across fields.

In [None]:
# Slicing and iterating over structured arrays
print(data[:2])  # Output: [('Alice', 25, 65.2) ('Bob', 30, 72.9)]

for person in data:
    print(person['name'], person['age'])  # Iterates over each person's name and age

# Performing operations
average_age = np.mean(data['age'])
print(f"Average age: {average_age:.2f}")

# Sorting based on a field
sorted_data = np.sort(data, order='age')
print(sorted_data)
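
Fields can also drive boolean filtering and in-place updates. The sketch below rebuilds the same records under the illustrative name `people` so that it runs on its own.

```python
import numpy as np

people = np.array(
    [('Alice', 25, 65.2), ('Bob', 30, 72.9), ('Charlie', 35, 68.5)],
    dtype=[('name', 'U20'), ('age', np.int32), ('weight', np.float64)],
)

# Keep only the records whose age field is above 28
over_28 = people[people['age'] > 28]
names = over_28['name']
print(names)          # ['Bob' 'Charlie']

# Accessing a field gives a view, so this updates `people` in place
people['age'] += 1
print(people['age'])  # [26 31 36]
```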


---
# Use cases

### Basic Arithmetic Operations

- You can perform element-wise arithmetic operations on NumPy arrays, just as you would with scalar values.

In [None]:
a29 = np.array([1, 2, 3])
b29 = np.array([4, 5, 6])

In [None]:
a29 + b29 # Element-wise; arrays of different shapes are broadcast when compatible

In [None]:
a29 - b29

In [None]:
a29 * b29

In [None]:
a29 / b29

### Universal Functions (ufuncs)

- NumPy provides a set of functions that operate element-wise on arrays, called universal functions or ufuncs.

In [None]:
# Element-wise square root
np.sqrt(a29)

In [None]:
# Element-wise exponential
np.exp(a29)

In [None]:
# Element-wise natural logarithm
np.log(a29)

### Aggregate Functions
- NumPy also provides functions to perform various aggregate operations on arrays.

In [None]:
# Sum of all elements
np.sum(a29)

In [None]:
# Mean of all elements
np.mean(a29)

In [None]:
# Standard deviation of all elements
np.std(a29)
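
On multidimensional arrays, the `axis` argument controls which dimension is collapsed. A brief sketch with an illustrative 2x3 array:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

total = m.sum()            # all elements: 21
col_sums = m.sum(axis=0)   # collapse rows, one sum per column: [5, 7, 9]
row_means = m.mean(axis=1) # collapse columns, one mean per row: [2., 5.]

print(total, col_sums, row_means)
```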

### Linear Algebra

In [None]:
m1 = np.arange(9).reshape(3, 3)
m2 = np.arange(10, 19).reshape(3, 3)
m1

In [None]:
# Matrix multiplication
np.matmul(m1, m2)

In [None]:
# Transpose of a matrix
np.transpose(m1)

In [None]:
# Dot Product
np.dot(a29, b29)
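
The `@` operator is shorthand for `np.matmul` and is distinct from the element-wise `*`. A short sketch, using illustrative names `A` and `B` for the same values as the matrices above:

```python
import numpy as np

A = np.arange(9).reshape(3, 3)
B = np.arange(10, 19).reshape(3, 3)

product = A @ B      # matrix product, same as np.matmul(A, B)
elementwise = A * B  # element-wise product, a different operation

print(product[0])      # first row: [45 48 51]
print(elementwise[0])  # first row: [ 0 11 24]

# The transpose of a product equals the product of transposes in reverse order
same = np.array_equal(product.T, B.T @ A.T)
print(same)  # True
```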

### Random Number Generation
Random number generation is fundamental for simulations, probabilistic algorithms, and Monte Carlo methods.

#### np.random.randint()
 - Generates random integers from `low` (inclusive) to `high` (exclusive).

In [None]:
a30 = np.random.randint(1, 11, 10)
a30

#### np.random.permutation()
- Returns a randomly permuted copy of the given array.

In [None]:
np.random.permutation(a30)
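
For reproducible results, a seeded generator can be used in place of the module-level functions. The sketch below uses `np.random.default_rng`, which the notebook does not otherwise cover; the seed value 42 and the variable names are arbitrary choices for illustration.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence
rng = np.random.default_rng(seed=42)
draws = rng.integers(1, 11, size=10)  # integers from 1 to 10
shuffled = rng.permutation(draws)     # permuted copy; `draws` is unchanged

rng_again = np.random.default_rng(seed=42)
draws_again = rng_again.integers(1, 11, size=10)

print(np.array_equal(draws, draws_again))  # True
```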