# NumPy Arrays (`ndarray`)

While Python's built-in `list` is versatile, it is inefficient for heavy numerical computation. **NumPy** (Numerical Python) addresses this with its core data structure, the `ndarray` (N-dimensional array).

NumPy's core is implemented in C, and an array stores its elements in a single block of memory (contiguous for a freshly created array). Because all elements of an array share one data type (the array is homogeneous), NumPy avoids the per-element type checking and pointer chasing that slow down standard Python lists.
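
A small sketch of what homogeneity means in practice, assuming a standard NumPy install (the variable names are illustrative). Mixing an integer with a float makes NumPy settle on one common dtype for the whole array, and `itemsize` / `nbytes` show the fixed storage cost per element.

```python
import numpy as np

# A list may mix types freely; an array settles on one common dtype.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)       # float64: the integers were upcast to float

# Every element occupies the same number of bytes in one data buffer.
ints = np.arange(10, dtype=np.int64)
print(ints.itemsize)     # 8 bytes per element
print(ints.nbytes)       # 80 bytes for the whole buffer
```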

# NumPy Arrays: Creation and Essential Methods

NumPy arrays are the core data structure in the NumPy library. They can have any number of dimensions, but two cases come up most often:

1. **1D Arrays (Vectors)**
2. **2D Arrays (Matrices)**

---

### 1. Creating Arrays from Python Lists

The most direct way to create a NumPy array is by converting an existing Python list (or list of lists) using `np.array()`.

* **1D Array (Vector):**
```python
import numpy as np

my_list = [1, 2, 3]
arr = np.array(my_list)
# Output: array([1, 2, 3])

```


*Visual Indicator:* Single set of brackets `[ ]`.
* **2D Array (Matrix):**
```python
my_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
matrix_arr = np.array(my_matrix)
# Output:
# array([[1, 2, 3],
#        [4, 5, 6],
#        [7, 8, 9]])

```


*Visual Indicator:* Double sets of brackets `[[ ]]` at the opening and closing.

---

### 2. Built-in NumPy Generation Methods

NumPy provides highly optimized, built-in functions to generate arrays quickly without needing to construct Python lists first.

#### A. `np.arange()`

Similar to Python's built-in `range()` function, but returns a NumPy array.

* **Syntax:** `np.arange(start, stop, step)` (stop is *exclusive*).
```python
np.arange(0, 10)     # array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
np.arange(0, 11, 2)  # array([ 0,  2,  4,  6,  8, 10]) - Even numbers

```



#### B. `np.zeros()` and `np.ones()`

Generates arrays filled entirely with 0s or 1s.

* **1D Array:** Pass a single integer.
```python
np.zeros(3)  # array([0., 0., 0.])

```


* **2D Array:** Pass a **tuple** representing dimensions `(rows, columns)`.
```python
np.ones((2, 3))
# array([[1., 1., 1.],
#        [1., 1., 1.]])

```



#### C. `np.linspace()`

Returns evenly spaced numbers over a specified interval.

* **Syntax:** `np.linspace(start, stop, num_points)`
* *Crucial Difference from `arange`:* The third argument is the **total number of points** you want, not the step size. The `stop` value is *inclusive* by default.


```python
np.linspace(0, 5, 10)  # 10 evenly spaced points between 0 and 5

```
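
To make the contrast with `arange` concrete, here is a minimal sketch using two optional `linspace` keywords, `retstep` and `endpoint`. The numbers are arbitrary illustrations.

```python
import numpy as np

# linspace: you choose the number of points; stop is included by default.
points, step = np.linspace(0, 5, 11, retstep=True)
print(points)   # 11 values running from 0.0 to 5.0
print(step)     # spacing between neighbours: 0.5

# endpoint=False drops the stop value, mirroring arange's exclusive stop.
half_open = np.linspace(0, 5, 10, endpoint=False)
print(half_open[-1])  # 4.5
```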



#### D. `np.eye()`

Creates an Identity Matrix (a square 2D matrix with 1s on the main diagonal and 0s elsewhere).

* **Syntax:** `np.eye(N)` (takes a single integer `N`, since the identity matrix is square: `N x N`).
```python
np.eye(3)
# array([[1., 0., 0.],
#        [0., 1., 0.],
#        [0., 0., 1.]])

```



---



In [3]:
import numpy as np # importing numpy library

In [3]:
my_list = [1,2,3,4,5,6,7,8,9]
np.array(my_list)

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [5]:
np.array([[1,2,3],[4,5,6],[7,8,9]])

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [7]:
# irregular (ragged) 2D list: the rows have different lengths, so it is not a valid matrix
np.array([[1,2,3],[4,5,6],[7,8]])

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (3,) + inhomogeneous part.
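
The error above arises because rows of unequal length cannot form a rectangular grid. The following is a hedged sketch of two possible workarounds, assuming that either an object array or padding suits your data. The fill value `0` and the variable names are arbitrary choices, not part of the original notes.

```python
import numpy as np

ragged = [[1, 2, 3], [4, 5, 6], [7, 8]]

# Option 1: ask for an object array explicitly. The result is a 1D array
# holding three Python lists, so fast numeric operations no longer apply.
as_objects = np.array(ragged, dtype=object)
print(as_objects.shape)   # (3,)

# Option 2: pad short rows to a common width to get a true 2D array.
width = max(len(row) for row in ragged)
padded = np.array([row + [0] * (width - len(row)) for row in ragged])
print(padded.shape)       # (3, 3)
```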

In [9]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [11]:
np.arange(0, 10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [13]:
np.arange(0,10,2)

array([0, 2, 4, 6, 8])

In [15]:
# array of even numbers from 0 up to (but not including) 100
np.arange(0,100,2)

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32,
       34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66,
       68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98])

In [16]:
# array of odd numbers from 1 to 99
np.arange(1,100,2)

array([ 1,  3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33,
       35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67,
       69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99])

In [18]:
#np.zeros
np.zeros(3) # 1d array  of 0's

array([0., 0., 0.])

In [20]:
np.zeros((2,3)) # 2d array  of 0's, tuple should be passed

array([[0., 0., 0.],
       [0., 0., 0.]])

In [22]:
#np.ones
np.ones(3) # 1d array  of 1's

array([1., 1., 1.])

In [24]:
#np.ones
np.ones((3,2)) # 2d array of 1's (3 rows, 2 columns)

array([[1., 1.],
       [1., 1.],
       [1., 1.]])

In [26]:
np.linspace(1,20,10)

array([ 1.        ,  3.11111111,  5.22222222,  7.33333333,  9.44444444,
       11.55555556, 13.66666667, 15.77777778, 17.88888889, 20.        ])

In [27]:
np.eye(4) #identity matrix

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

### Generating Random Arrays

NumPy has a dedicated module for generating random data: `np.random`. Note that `rand()` and `randn()` take the dimensions of a 2D array as separate arguments, *not* as a tuple; `randint()` instead takes its shape through the `size` argument.

#### A. `np.random.rand()`

Creates an array populated with random samples from a **Uniform Distribution** over `[0, 1)`.

```python
np.random.rand(5)     # 1D array of 5 random numbers
np.random.rand(5, 5)  # 2D matrix (5x5) of random numbers

```

#### B. `np.random.randn()`

Returns samples from the **Standard Normal Distribution** (Gaussian distribution centered around 0).

```python
np.random.randn(2)     # 1D array of 2 numbers
np.random.randn(4, 4)  # 2D matrix (4x4)

```

#### C. `np.random.randint()`

Returns random integers from a low (inclusive) to a high (exclusive) number.

* **Syntax:** `np.random.randint(low, high, size)`
```python
np.random.randint(1, 100)      # Single random integer between 1 and 99
np.random.randint(1, 100, 10)  # 1D array of 10 random integers

```
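
The functions above draw from NumPy's global random state, so every run gives different numbers. If reproducible results are needed, one option is the `Generator` API obtained from `np.random.default_rng`. This is a sketch only; the seed `42` is an arbitrary choice and the variable names are illustrative.

```python
import numpy as np

# Two generators created with the same seed produce the same stream.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

uniform_a = rng_a.random(5)            # uniform samples in [0, 1)
uniform_b = rng_b.random(5)
print(np.array_equal(uniform_a, uniform_b))  # True

# Generator methods take the shape as a tuple.
ints = rng_a.integers(1, 100, size=(2, 3))   # low inclusive, high exclusive
normal = rng_a.standard_normal((4, 4))       # standard normal samples
print(ints.shape, normal.shape)
```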



In [28]:
# Generating Random Arrays
# NumPy has a robust module for generating random data: np.random.
np.random.rand(5)

array([0.57514425, 0.06621174, 0.43515457, 0.29569734, 0.58707605])

In [29]:
np.random.rand(5, 5)

array([[0.60220288, 0.22053272, 0.22379613, 0.01758028, 0.09477804],
       [0.50616129, 0.38108286, 0.15505825, 0.17648087, 0.16755954],
       [0.95173982, 0.3524542 , 0.8869167 , 0.68775585, 0.22916507],
       [0.04353458, 0.71014113, 0.80062652, 0.94554671, 0.16464396],
       [0.17153112, 0.42952378, 0.56095623, 0.133312  , 0.14656438]])

In [30]:
np.random.randn(2)

array([ 0.34516729, -1.1261108 ])

In [32]:
np.random.randn(4, 4)

array([[ 0.01483091, -0.52163818, -0.71608882, -0.60107403],
       [ 0.95851568,  2.07344594, -0.67238744, -0.33660049],
       [ 0.17691877, -0.53673035, -0.78820552, -0.42564453],
       [-1.27735122,  1.14085424, -0.48919697, -0.69181305]])

In [33]:
np.random.randint(1,100,10)

array([53, 13, 91, 92, 99, 99,  9, 67, 90, 38], dtype=int32)

In [35]:
np.random.randint(1,100,7)

array([ 1, 96, 80, 36, 87, 57, 24], dtype=int32)

### Essential Array Attributes and Methods

Once an array is created, several methods and attributes help analyze and manipulate it.

#### Array Manipulation

* **`.reshape()`:** Returns an array containing the same data with a new shape.
* *Requirement:* The total number of elements must remain unchanged (e.g., an array of 25 elements can be reshaped to `(5, 5)`, but not to a shape whose dimensions multiply to any other total).


```python
arr = np.arange(25)
arr.reshape(5, 5) # Converts 1D array into 5x5 matrix

```
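
A short sketch of three `reshape` behaviours that follow from the requirement above: letting NumPy infer one dimension with `-1`, the original array keeping its shape, and the error raised for an incompatible shape. The shape `(5, 4)` is just an illustrative mismatch.

```python
import numpy as np

arr = np.arange(25)

# -1 asks NumPy to infer that dimension from the element count.
grid = arr.reshape(5, -1)
print(grid.shape)   # (5, 5)

# reshape returns a new array object; arr itself keeps its 1D shape.
print(arr.shape)    # (25,)

# A shape whose dimensions do not multiply to 25 is rejected.
try:
    arr.reshape(5, 4)
except ValueError as err:
    message = str(err)
    print("reshape failed:", message)
```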



#### Finding Max/Min Values and Locations

* **`.max()` / `.min()`:** Returns the actual maximum or minimum value in the array.
* **`.argmax()` / `.argmin()`:** Returns the **index location** of the maximum or minimum value.
```python
ranarr = np.random.randint(0, 50, 10)
ranarr.max()    # e.g., 49
ranarr.argmax() # e.g., 3 (Meaning 49 is at index 3)

```
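
On a 2D array, `argmax()` without arguments returns a position in the flattened array, which can be surprising. The sketch below, using a small made-up `scores` matrix, shows how to turn that into row and column coordinates and how the `axis` argument changes the result.

```python
import numpy as np

scores = np.array([[3, 9, 1],
                   [7, 2, 8]])

flat_index = scores.argmax()          # position in the flattened array
print(flat_index)                     # 1

# Convert the flat position back to (row, column) coordinates.
row, col = np.unravel_index(flat_index, scores.shape)
print(row, col)                       # 0 1

# axis=0 searches down each column, axis=1 across each row.
print(scores.argmax(axis=0))          # [1 0 1]
print(scores.argmax(axis=1))          # [1 2]
```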



#### Key Attributes (No Parentheses Needed)

* **`.shape`:** Returns a tuple indicating the dimensions of the array.
```python
arr.shape # e.g., (25,) for a 1D vector of length 25
matrix.shape # e.g., (5, 5) for a 2D matrix

```


* **`.dtype`:** Returns the data type of the objects in the array.
```python
arr.dtype # e.g., dtype('int32') or dtype('float64')

```

In [36]:
arr = np.arange(25)
print(arr)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24]


In [37]:
arr.reshape(5,5)

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

In [38]:
print(arr)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24]


In [39]:
arr = np.random.randint(1,100,20)
print(arr)

[55 57  5 79 47 60 75 99 75  4 22 47 14  2 48 79 73 19 68 30]


In [41]:
print(arr.max())

99


In [42]:
print(arr.min())

2


In [43]:
print(arr.argmax()) # index of maximum element

7


In [44]:
print(arr.argmin()) # index of min element

13


In [47]:
arr = arr.reshape(5,4)
print(arr.shape)

(5, 4)


In [48]:
print(arr.dtype)

int32


In [50]:
arr = np.arange(7)
for i in arr:
    print(i,end=" ")

0 1 2 3 4 5 6 

# NumPy Arrays: Indexing and Selection

Extracting subsets of data is a fundamental operation in data science and engineering. NumPy provides powerful, optimized mechanisms for selecting elements from arrays. While basic indexing mirrors standard Python lists, NumPy introduces advanced features like **Broadcasting**, **Matrix Slicing**, and **Boolean (Conditional) Selection**.

---

## 1D Array Indexing & Slicing

Basic indexing and slicing of 1D NumPy arrays use the same syntax as Python lists.

```python
import numpy as np

arr = np.arange(0, 11) 
# arr is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# 1. Single Element (Index 8)
print(arr[8]) # Output: 8

# 2. Slice (Start:Stop) - Stop is exclusive
print(arr[1:5]) # Output: [1, 2, 3, 4]

# 3. Slice from Start / To End
print(arr[:6]) # Output: [0, 1, 2, 3, 4, 5]
print(arr[5:]) # Output: [5, 6, 7, 8, 9, 10]

```

---

## Broadcasting and the "View" Trap

This is a critical engineering distinction between Python lists and NumPy arrays.

### Broadcasting

NumPy allows you to assign a single value to a slice of an array, which "broadcasts" that value across the entire slice.

```python
arr[0:5] = 100
# arr is now [100, 100, 100, 100, 100, 5, 6, 7, 8, 9, 10]

```
### The "View" Trap (Memory Management)

When you slice a Python list, you get a new list (a shallow copy).
When you slice a NumPy array, you get a **View**: a new array object that shares the original array's memory. This avoids copying data, which matters when working with very large datasets.

**Bug Scenario:**

```python
arr = np.arange(0, 11)
slice_of_arr = arr[0:6]

# Broadcast to the slice
slice_of_arr[:] = 99

# The original array is ALSO changed!
print(arr) 
# Output: [99, 99, 99, 99, 99, 99, 6, 7, 8, 9, 10]

```

**The Fix (`.copy()`):**
If you need an independent copy, you must explicitly request it.

```python
arr_copy = arr.copy()

```
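
When it is unclear whether two arrays share data, NumPy can report it directly. The sketch below uses `np.shares_memory` and the `.base` attribute to tell a view from a copy; the variable names are illustrative.

```python
import numpy as np

arr = np.arange(0, 11)

view = arr[0:6]                 # basic slice: shares arr's buffer
independent = arr[0:6].copy()   # explicit copy: owns its own buffer

print(np.shares_memory(arr, view))         # True
print(np.shares_memory(arr, independent))  # False
print(view.base is arr)                    # True: the view points back to arr

independent[:] = 99             # arr is left untouched
print(arr[:6])                  # [0 1 2 3 4 5]
```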

In [4]:
# basic indexing
arr = np.arange(1,10)

In [5]:
arr

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [7]:
print(arr[0])

1


In [9]:
print(arr[-1])

9


In [11]:
print(arr[:])

[1 2 3 4 5 6 7 8 9]


In [12]:
print(arr[2:])

[3 4 5 6 7 8 9]


In [14]:
print(arr[:5])

[1 2 3 4 5]


In [17]:
print("arr : ", arr)
arr[0:5] = 25
print(arr)

arr :  [25 25 25 25 25  6  7  8  9]
[25 25 25 25 25  6  7  8  9]


In [21]:
# this example does not modify arr: `arr_slice = 50` only rebinds the name arr_slice to the integer 50; to write through the view, assign with the slice operator (arr_slice[:] = 50)
arr = np.arange(1,10)
print('arr :',arr)
arr_slice = arr[0:5]
print("arr_slice", arr_slice)
arr_slice = 50
print(arr_slice)
print("arr :", arr)

arr : [1 2 3 4 5 6 7 8 9]
arr_slice [1 2 3 4 5]
50
arr : [1 2 3 4 5 6 7 8 9]


In [22]:
arr = np.arange(1,10)
print('arr :',arr)
arr_slice = arr[0:5]
print("arr_slice", arr_slice)
arr_slice[:] = 50
print(arr_slice)
print("arr :", arr)

arr : [1 2 3 4 5 6 7 8 9]
arr_slice [1 2 3 4 5]
[50 50 50 50 50]
arr : [50 50 50 50 50  6  7  8  9]


In [23]:
# use .copy() to create an independent copy of an array (or of a slice)
arr_cp = arr.copy()
arr_cp[1:5] = 100
print("arr_cp", arr_cp)
print("original arr ", arr)

arr_cp [ 50 100 100 100 100   6   7   8   9]
original arr  [50 50 50 50 50  6  7  8  9]


## 2D Array (Matrix) Indexing

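There are two formats for grabbing data from a 2D array: chained brackets (`matrix[row][col]`) and a single bracket with comma-separated indices (`matrix[row, col]`). The **single-bracket, comma-separated** notation is recommended for clarity and performance, since chained brackets build an intermediate row array first.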

```python
matrix = np.array([
    [5,  10, 15],
    [20, 25, 30],
    [35, 40, 45]
])

# Format: matrix[row, column]

# 1. Single Element (Row 1, Col 2 -> 30)
print(matrix[1, 2]) 

# 2. Entire Row (Row 0)
print(matrix[0]) # Output: [5, 10, 15]

```

### 2D Slicing (Sub-Matrices)

You can apply slice notation to both rows and columns simultaneously to extract sub-matrices.

```python
# Grab top-right corner: 
# Rows 0 and 1 (:2) 
# Columns 1 and 2 (1:)
sub_matrix = matrix[:2, 1:]

print(sub_matrix)
# Output:
# [[10, 15],
#  [25, 30]]

```
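
Two related cases are worth sketching: selecting a whole column with a bare colon, and what happens when slices are chained with separate brackets instead of a comma. The `matrix` values are repeated from the example above so the block runs on its own.

```python
import numpy as np

matrix = np.array([
    [5,  10, 15],
    [20, 25, 30],
    [35, 40, 45]
])

# A bare colon keeps every row, so this picks out column 1.
middle_column = matrix[:, 1]
print(middle_column)          # [10 25 40]

# Comma notation slices rows and columns in one step.
print(matrix[:2, 1:])         # rows [10 15] and [25 30]

# Chained brackets slice rows twice: matrix[:2] is 2x3, and the second
# [1:] then drops its first ROW instead of its first column.
print(matrix[:2][1:])         # single row [20 25 30]
```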


In [24]:
matrix = np.array([
    [5,  10, 15],
    [20, 25, 30],
    [35, 40, 45]
])

In [26]:
# single element
print(matrix[0][0])
print(matrix[1,1])

5
25


In [28]:
#single row
print(matrix[1])

[20 25 30]


## Conditional (Boolean) Selection

This is arguably the most frequently used selection method in data analysis (especially when moving to Pandas). It allows you to filter an array based on a logical condition without writing `for` loops.

### Step 1: Generating a Boolean Array

Applying a comparison operator to an array returns a new array of the same shape, filled with `True` or `False`.

```python
arr = np.arange(1, 11)
# arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

bool_arr = arr > 5
# Output: [False, False, False, False, False, True, True, True, True, True]

```

### Step 2: Applying the Boolean Mask

You can pass this boolean array into the index brackets. NumPy will return only the elements at positions where the mask is `True`.

```python
# Passing the boolean array
result = arr[bool_arr]
# Output: [6, 7, 8, 9, 10]

```

### The Idiomatic One-Liner

In professional code, Steps 1 and 2 are combined into a single, highly readable line.

```python
# "Give me all elements in 'arr' where 'arr' is less than 3"
filtered = arr[arr < 3]

print(filtered)
# Output: [1, 2]

```
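
Conditions can also be combined. The sketch below shows the element-wise operators `&` and `|` (each comparison needs its own parentheses) and a mask used for assignment. The thresholds are arbitrary examples.

```python
import numpy as np

arr = np.arange(1, 11)

# Use & (and), | (or), ~ (not) with each comparison in parentheses.
# Python's `and` / `or` keywords do not work element-wise on arrays.
middle = arr[(arr > 3) & (arr < 8)]
print(middle)        # [4 5 6 7]

extremes = arr[(arr < 3) | (arr > 8)]
print(extremes)      # [ 1  2  9 10]

# A mask can also drive assignment, changing matching elements in place.
clipped = arr.copy()
clipped[clipped > 5] = 5
print(clipped)       # [1 2 3 4 5 5 5 5 5 5]
```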

---

## Summary Table: Selection Patterns

| Operation | Syntax | Result |
| --- | --- | --- |
| **1D Slice** | `arr[start:stop:step]` | Subset of array (View) |
| **Broadcast** | `arr[start:stop] = X` | Modifies original array |
| **2D Element** | `mat[row, col]` | Single integer/float |
| **2D Slice** | `mat[row_start:row_stop, col_start:col_stop]` | Sub-matrix |
| **Condition** | `arr[arr > X]` | Array of elements satisfying condition |

In [31]:
arr = np.arange(1, 11)
print(arr)

[ 1  2  3  4  5  6  7  8  9 10]


In [33]:
bool_arr = arr > 5
print(bool_arr)

[False False False False False  True  True  True  True  True]


In [35]:
print(arr[bool_arr])

[ 6  7  8  9 10]


In [37]:
# in single line
print(arr[arr>5])

[ 6  7  8  9 10]


In [38]:
print(arr)

[ 1  2  3  4  5  6  7  8  9 10]


# NumPy Arrays: Basic Operations and Universal Functions

NumPy makes mathematical operations on large datasets fast and intuitive. Instead of writing manual loops to calculate values one by one, you write a single expression and NumPy applies it across the entire array in compiled code.

## 1. Array with Array Operations

When you perform basic arithmetic between two NumPy arrays of the same size, the operation happens on an **element-by-element** basis.

### Engineering Example

```python
import numpy as np

arr = np.arange(0, 11) 
# arr is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Addition
print(arr + arr) 
# Output: [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

# Multiplication
print(arr * arr) 
# Output: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

```

---

## 2. Array with Scalar Operations (Broadcasting)

A "scalar" is simply a single number. When you combine an array with a scalar, NumPy automatically **broadcasts** that single number to every single element in the array.

### Engineering Example

```python
# Add 100 to every element
print(arr + 100)
# Output: [100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110]

# Exponents (Square every element)
print(arr ** 2)
# Output: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

```
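
The same stretching idea extends beyond scalars. As a brief sketch (the names `grid` and `row_offsets` are made up for illustration), a 1D array whose length matches the last dimension of a 2D array is applied to every row.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)      # rows [0 1 2] and [3 4 5]
row_offsets = np.array([10, 20, 30])   # shape (3,)

# The scalar case: 100 is stretched to every element.
print(grid + 100)

# The same idea with a 1D array: it is stretched across both rows
# because its length matches the last dimension of grid.
shifted = grid + row_offsets
print(shifted)    # rows [10 21 32] and [13 24 35]
```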

---

## 3. The Zero Division Quirk (Warnings vs. Errors)

In standard Python, dividing by zero crashes your program with a `ZeroDivisionError`. NumPy handles this differently: it allows the code to continue running, issues a **RuntimeWarning**, and fills the problematic spot with a specific placeholder value.

* **0 divided by 0:** Yields `nan` (Not a Number).
* **1 divided by 0:** Yields `inf` (Infinity).

```python
# Assume arr starts with 0

# 0 / 0 scenario
print(arr / arr)
# Warning: invalid value encountered in true_divide
# Output: [nan, 1., 1., 1., ...]

# 1 / 0 scenario
print(1 / arr)
# Warning: divide by zero encountered in true_divide
# Output: [inf, 1., 0.5, 0.333..., ...]

```
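
If the warnings are expected, they can be silenced locally and the placeholder values handled afterwards. This is one possible approach, sketched with `np.errstate`, `np.isnan`, `np.isinf`, and `np.nan_to_num`; replacing `nan` with `0.0` is an arbitrary choice that may not suit every dataset.

```python
import numpy as np

arr = np.arange(0, 5)

# Silence the two warnings only inside this block.
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = arr / arr        # 0/0 -> nan
    reciprocal = 1 / arr     # 1/0 -> inf

print(np.isnan(ratio))       # True only at index 0
print(np.isinf(reciprocal))  # True only at index 0

# Replace the placeholder before further computation.
cleaned = np.nan_to_num(ratio, nan=0.0)
print(cleaned)               # [0. 1. 1. 1. 1.]
```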



In [3]:
import numpy as np
arr = np.arange(0,11)
print(arr)

[ 0  1  2  3  4  5  6  7  8  9 10]


In [4]:
arr = arr + arr
print(arr)

[ 0  2  4  6  8 10 12 14 16 18 20]


In [5]:
arr = arr/arr
print(arr)

[nan  1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]




In [10]:
arr = np.arange(0,11)
arr = arr*arr
print(arr)

[  0   1   4   9  16  25  36  49  64  81 100]


In [11]:
arr = arr -20
print(arr)

[-20 -19 -16 -11  -4   5  16  29  44  61  80]


In [12]:
arr = arr**2
print(arr)

[ 400  361  256  121   16   25  256  841 1936 3721 6400]


In [13]:
arr = arr //arr
print(arr)

[1 1 1 1 1 1 1 1 1 1 1]



## 4. Universal Array Functions (ufuncs)

Universal functions are highly optimized mathematical operations built directly into NumPy. They are designed to broadcast across an entire array effortlessly. If you need to perform a common mathematical operation (like trigonometry, exponents, or logarithms), always check if NumPy has a `ufunc` for it first; it will be much faster than writing your own.

### Common ufuncs

```python
# Square Root
np.sqrt(arr)

# Exponential (e^x)
np.exp(arr)

# Trigonometry
np.sin(arr)
np.cos(arr)

# Logarithmic (Note: log(0) yields -inf and a warning)
np.log(arr)

```
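
To illustrate what a ufunc replaces, the sketch below computes the same square roots with `np.sqrt` and with an explicit Python loop over `math.sqrt`, then checks that the results agree. The input values are arbitrary.

```python
import math
import numpy as np

values = np.arange(1, 6)

# One ufunc call processes the whole array in compiled code.
vectorized = np.sqrt(values)

# The equivalent pure-Python loop visits elements one at a time.
looped = np.array([math.sqrt(v) for v in values])

print(np.allclose(vectorized, looped))   # True

# ufuncs compose like ordinary expressions.
round_trip = np.exp(np.log(values))
print(round_trip)                        # approximately [1. 2. 3. 4. 5.]
```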

---

## Summary Table

| Operation Type | Syntax Example | Behavior |
| --- | --- | --- |
| **Array-Array** | `arr + arr` | Element-by-element calculation. |
| **Array-Scalar** | `arr * 100` | Broadcasts scalar to all elements. |
| **Zero Division** | `arr / 0` | Issues a warning, returns `nan` or `inf`. |
| **Universal Function** | `np.sqrt(arr)` | Applies optimized math to all elements. |


In [14]:
arr = np.arange(1,11)
print(arr)

[ 1  2  3  4  5  6  7  8  9 10]


In [15]:
arr = arr*arr
print(arr)

[  1   4   9  16  25  36  49  64  81 100]


In [17]:
arr = np.sqrt(arr)
print(arr)

[ 1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]
