In [None]:
#  Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it enhance Python's capabilities for numerical operations?

# Purpose and Advantages of NumPy in Scientific Computing and Data Analysis

**NumPy** (Numerical Python) is a fundamental package for scientific computing in Python. It provides:

- **Efficient multi-dimensional array objects (`ndarray`)** for storing large datasets.
- **Fast and vectorized operations** on arrays without the need for explicit loops.
- A rich collection of **mathematical functions** to perform complex numerical computations.
- Tools for **linear algebra, Fourier transforms, random number generation**, and more.
- Integration capabilities with other scientific libraries like **Pandas, SciPy, Matplotlib**, and machine learning frameworks.

---

## How NumPy Enhances Python’s Numerical Capabilities

1. **Performance:**
   NumPy's core array operations run in compiled C loops, so on large arrays they are typically much faster than equivalent pure-Python loops or list comprehensions.

2. **Vectorization:**
   Allows batch operations on data without writing explicit loops, leading to concise and readable code.

3. **Memory Efficiency:**
   NumPy arrays consume less memory compared to Python lists because they store elements of the same type in contiguous blocks.

4. **Broadcasting:**
   Enables operations between arrays of different but compatible shapes without first building expanded copies of the smaller array.

5. **Comprehensive Functionality:**
   Includes built-in functions for statistics, algebra, and random sampling which are optimized and reliable.
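
The points above can be illustrated with a small sketch. The array size and variable names are illustrative choices, not part of the original text, and the exact byte counts depend on the platform and Python version, so no expected numbers are shown.

```python
import sys
import numpy as np

n = 1_000  # illustrative size
py_list = list(range(n))
np_arr = np.arange(n)

# Vectorization: one array expression replaces an explicit Python loop
squares_loop = [x * x for x in py_list]
squares_vec = np_arr * np_arr

# Memory: the array stores raw values in one buffer,
# while the list stores pointers to separate int objects
array_bytes = np_arr.nbytes
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

print("Same result:", squares_vec.tolist() == squares_loop)
print("Array buffer bytes:", array_bytes)
print("List bytes (container + int objects):", list_bytes)
```

The list figure is only a rough estimate, since small integers are shared objects in CPython, but it shows the direction of the difference.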

In [5]:
# Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the other?

# Comparison of `np.mean()` and `np.average()` in NumPy

| Feature               | `np.mean()`                                      | `np.average()`                                   |
|-----------------------|-------------------------------------------------|-------------------------------------------------|
| **Purpose**           | Computes the arithmetic mean (simple average)   | Computes the simple average by default, or a weighted average when `weights` is provided |
| **Parameters**        | Takes an array and optional axis                 | Takes an array, optional axis, and weights      |
| **Weights support**   | **No** — calculates unweighted mean             | **Yes** — can apply weights to elements         |
| **Return value**       | Scalar or array of means along specified axis   | Scalar or weighted mean along specified axis    |
| **Use case**           | When all data points are equally important       | When some data points have more importance (weights) |

---

## When to use which?

- Use **`np.mean()`** when you want a simple average without weighting.

- Use **`np.average()`** when you want to compute an average that accounts for varying importance or frequency of elements, by specifying weights.

import numpy as np

data = np.array([1, 2, 3, 4])
weights = np.array([0.1, 0.2, 0.3, 0.4])

print("Mean:", np.mean(data))                       # Output: 2.5
print("Weighted Average:", np.average(data, weights=weights))  # Output: 3.0
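
As a further sketch, `np.average` also accepts an `axis` argument together with `weights`, and `returned=True` makes it return the sum of the weights as well. The scores and weights here are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical scores: rows are students, columns are assignments
scores = np.array([[80, 90, 70],
                   [60, 100, 90]])
weights = np.array([0.2, 0.3, 0.5])  # assumed assignment weights

# Weighted average per student (along axis 1);
# returned=True also gives the sum of the weights
per_student, weight_sum = np.average(scores, axis=1, weights=weights, returned=True)

# With no weights, np.average matches np.mean
unweighted = np.average(scores, axis=1)

print("Weighted per-student average:", per_student)
print("Sum of weights:", weight_sum)
print("Unweighted equals np.mean:", np.allclose(unweighted, np.mean(scores, axis=1)))
```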



In [2]:
#  Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D arrays.

# Methods for Reversing a NumPy Array Along Different Axes

NumPy provides multiple ways to reverse arrays along specific axes:

---

## 1D Array Reversal

- Use slicing with `[::-1]` to reverse the entire 1D array.

```python
import numpy as np

arr_1d = np.array([1, 2, 3, 4, 5])
reversed_1d = arr_1d[::-1]

print("Original 1D array:", arr_1d)
print("Reversed 1D array:", reversed_1d)
```

In [1]:
import numpy as np

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# Reverse rows (vertical flip)
reversed_rows = arr_2d[::-1, :]

# Reverse columns (horizontal flip)
reversed_cols = arr_2d[:, ::-1]

print("Original 2D array:\n", arr_2d)
print("Rows reversed:\n", reversed_rows)
print("Columns reversed:\n", reversed_cols)
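
As a complement to slicing, `np.flip` reverses an array along a named axis, or along every axis when none is given. This is a minimal sketch on the same 3x3 example values.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

flip_rows = np.flip(arr_2d, axis=0)   # same as arr_2d[::-1, :]
flip_cols = np.flip(arr_2d, axis=1)   # same as arr_2d[:, ::-1]
flip_both = np.flip(arr_2d)           # no axis given: reverse every axis

print("Rows reversed:\n", flip_rows)
print("Columns reversed:\n", flip_cols)
print("Both axes reversed:\n", flip_both)
```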

In [None]:
#  How can you determine the data type of elements in a NumPy array? Discuss the importance of data types in memory management and performance.


- The **data type** (or `dtype`) specifies the kind of elements stored in a NumPy array (e.g., integers, floats, booleans).
- NumPy arrays are **homogeneous**, meaning all elements must have the same data type.
- The `dtype` determines:
  - How much memory each element occupies.
  - How operations on the array are performed.

---

## How to Determine Data Type of a NumPy Array

You can check the data type of an array using the `.dtype` attribute:

import numpy as np

arr_int = np.array([1, 2, 3])
arr_float = np.array([1.5, 2.5, 3.5])

print("Integer array dtype:", arr_int.dtype)
print("Float array dtype:", arr_float.dtype)


In [None]:
arr = np.array([1, 2, 3], dtype=np.int32)
print("Original dtype:", arr.dtype)

arr_float = arr.astype(np.float64)
print("Converted dtype:", arr_float.dtype)
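
To make the memory point concrete, the sketch below compares the same 1000 values stored as 64-bit and 16-bit integers. The array contents are an illustrative assumption; narrowing a dtype is only safe when every value fits in the smaller type.

```python
import numpy as np

values = np.arange(1000, dtype=np.int64)
small = values.astype(np.int16)  # safe here because every value fits in 16 bits

print("int64:", values.itemsize, "bytes per element,", values.nbytes, "bytes total")
print("int16:", small.itemsize, "bytes per element,", small.nbytes, "bytes total")
```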


In [None]:
# Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?


# ndarrays in NumPy: Key Features and Differences from Python Lists


- `ndarray` stands for **N-dimensional array**.
- It is the core data structure of NumPy designed for storing **homogeneous** numerical data.
- Represents a fixed-size, multidimensional container where all elements share the same data type.



## Key Features of `ndarray`

1. **Homogeneous data type:** All elements have the same type (e.g., all integers, all floats).
2. **Multidimensional:** Can represent 1D, 2D, 3D, or higher-dimensional arrays.
3. **Fixed size:** Once created, the size cannot be changed without creating a new array.
4. **Memory efficiency:** Uses contiguous blocks of memory for fast access and better cache utilization.
5. **Vectorized operations:** Supports element-wise arithmetic and other operations without explicit loops.
6. **Broadcasting:** Allows arithmetic operations on arrays of different shapes.
7. **Rich functionality:** Includes many built-in methods for mathematical, logical, statistical, and linear algebra operations.

---

## Differences Between `ndarray` and Python Lists

| Feature               | NumPy `ndarray`                    | Python List                      |
|-----------------------|----------------------------------|---------------------------------|
| Data type             | Homogeneous (single dtype)       | Heterogeneous (mixed types)     |
| Memory efficiency     | Contiguous memory block          | Stores references to objects    |
| Performance           | Optimized for numerical ops      | Slower for large numerical data |
| Functionality         | Vectorized operations supported  | Requires explicit loops         |
| Dimensionality        | Supports multiple dimensions     | Nested lists for multidimensions|
| Size mutability       | Fixed size after creation        | Dynamic size (can grow/shrink)  |

---

## Example: Creating and Printing ndarrays

import numpy as np

# Create a 1D ndarray
arr_1d = np.array([1, 2, 3])
print("1D ndarray:", arr_1d)

# Create a 2D ndarray
arr_2d = np.array([[1, 2], [3, 4]])
print("2D ndarray:\n", arr_2d)
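
A few of the differences from the table can be checked directly. This is a small sketch with made-up values, not an exhaustive comparison.

```python
import numpy as np

# Mixed int/float input is upcast to a single dtype
mixed = np.array([1, 2.5, 3])
print("dtype after upcasting:", mixed.dtype)  # float64

# A list grows in place; np.append returns a new array
# and leaves the original unchanged
py_list = [1, 2, 3]
py_list.append(4)

arr = np.array([1, 2, 3])
grown = np.append(arr, 4)
print("Original array size:", arr.size, "| new array size:", grown.size)

# The * operator means different things for the two types
print("List * 2:", [1, 2, 3] * 2)   # repetition
print("Array * 2:", arr * 2)        # element-wise multiplication
```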

In [None]:
#   Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations.

# Performance Benefits of NumPy Arrays over Python Lists for Large-Scale Numerical Operations


1. **Homogeneous Data Types**
   - NumPy arrays (`ndarray`) store elements of the same data type in a contiguous block of memory.
   - Python lists store references to objects which can be of different types, causing overhead in memory access.

2. **Contiguous Memory Storage**
   - NumPy uses contiguous memory allocation, which improves cache locality and speeds up access.
   - Python lists are arrays of pointers to objects scattered in memory, causing slower access.

3. **Vectorized Operations**
   - NumPy performs element-wise operations implemented in optimized C code, avoiding explicit Python loops.
   - Python lists require looping through elements explicitly, which is slower due to Python's interpreted nature.

4. **Low-Level Optimizations**
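   - NumPy's linear algebra routines (for example `np.dot` and `np.linalg`) delegate to optimized BLAS and LAPACK libraries, while element-wise operations run in NumPy's own compiled loops.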
   - Python lists have no such optimizations and rely on generic Python code execution.

5. **Reduced Overhead**
   - NumPy avoids dynamic type checks and boxing/unboxing of data during operations.
   - Python lists incur overhead due to dynamic typing and handling of diverse object types.

---

## Example: Timing NumPy Arrays vs Python Lists for Large-Scale Addition

import numpy as np
import time

# Large size
size = 10**7

# Create Python lists
list1 = list(range(size))
list2 = list(range(size))

# Create NumPy arrays
arr1 = np.arange(size)
arr2 = np.arange(size)

# Timing addition with Python lists (using list comprehension)
# time.perf_counter() is a high-resolution clock intended for measuring intervals
start = time.perf_counter()
result_list = [x + y for x, y in zip(list1, list2)]
end = time.perf_counter()
print(f"Python list addition took {end - start:.4f} seconds")

# Timing addition with NumPy arrays (vectorized)
start = time.perf_counter()
result_array = arr1 + arr2
end = time.perf_counter()
print(f"NumPy array addition took {end - start:.4f} seconds")

In [None]:
#  Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and output.


## Overview

- `np.vstack()` and `np.hstack()` are functions used to stack arrays vertically or horizontally, respectively.

| Function    | Description                                      |
|-------------|------------------------------------------------|
| `vstack()`  | Stacks arrays **vertically** (row-wise)         |
| `hstack()`  | Stacks arrays **horizontally** (column-wise)    |

---

## `np.vstack()` — Vertical Stack

- Stacks arrays along **rows** (adds rows).
- Resulting array has more rows, same number of columns.
- Arrays must have the **same number of columns** (1D inputs must have the same length, and each becomes one row).

import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

result_v = np.vstack((a, b))
print("Vertical stack (vstack):\n", result_v)

In [None]:
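# np.hstack(): stack arrays side by side (adds columns)
# Two (3, 1) column arrays become one (3, 2) array; the row counts must match
a = np.array([[1], [2], [3]])
b = np.array([[4], [5], [6]])

result_h = np.hstack((a, b))
print("Horizontal stack (hstack):\n", result_h)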
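
The sketch below contrasts the two functions on the same inputs and relates them to `np.concatenate`. The 2D arrays `m` and `n` are illustrative names introduced here.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# 1D inputs: vstack makes each input a row, hstack joins them end to end
v = np.vstack((a, b))
h = np.hstack((a, b))
print("vstack shape:", v.shape)  # (2, 3)
print("hstack shape:", h.shape)  # (6,)

# 2D inputs: the two functions match np.concatenate along axis 0 and axis 1
m = np.array([[1, 2], [3, 4]])
n = np.array([[5, 6], [7, 8]])
same_v = np.array_equal(np.vstack((m, n)), np.concatenate((m, n), axis=0))
same_h = np.array_equal(np.hstack((m, n)), np.concatenate((m, n), axis=1))
print("vstack == concatenate(axis=0):", same_v)
print("hstack == concatenate(axis=1):", same_h)
```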

In [None]:
#  Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various array dimensions.


# Differences Between `fliplr()` and `flipud()` Methods in NumPy

- Both `fliplr()` and `flipud()` are NumPy functions used to flip arrays, but they operate along different axes:

| Function   | Operation                  | Axis affected                 |
|------------|----------------------------|-------------------------------|
| `fliplr()` | Flip array **left to right** | Flips columns (horizontal flip) |
| `flipud()` | Flip array **upside down**   | Flips rows (vertical flip)      |

---

## Behavior on Different Array Dimensions

- `flipud()` reverses axis 0 and works on arrays with **one or more dimensions**; `fliplr()` reverses axis 1 and requires **at least two dimensions**.
- For 1D arrays:
  - `fliplr()` raises a `ValueError` because a 1D array has no second axis (no columns) to reverse.
  - `flipud()` reverses the order of the elements.

---

## Example Usage

import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

print("Original array:\n", arr)

# Flip left to right (columns reversed)
flipped_lr = np.fliplr(arr)
print("\nAfter np.fliplr():\n", flipped_lr)

# Flip upside down (rows reversed)
flipped_ud = np.flipud(arr)
print("\nAfter np.flipud():\n", flipped_ud)
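
The 1D behavior can be checked with a short sketch. The variable names are illustrative.

```python
import numpy as np

arr_1d = np.array([1, 2, 3, 4])

# flipud reverses axis 0, which is the only axis of a 1D array
flipped = np.flipud(arr_1d)
print("np.flipud on 1D:", flipped)

# fliplr needs a second axis, so a 1D input is rejected
try:
    np.fliplr(arr_1d)
    fliplr_failed = False
except ValueError as exc:
    fliplr_failed = True
    print("np.fliplr on 1D raised ValueError:", exc)
```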

In [None]:
#  Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array operations?


## Vectorization

- Vectorization means performing operations on entire arrays **without explicit Python loops**.
- It leverages NumPy’s optimized, low-level implementations (usually in C) to apply operations element-wise efficiently.
- This results in faster execution and concise code.

### Example of Vectorization:

import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

# Element-wise addition without loops (vectorized)
c = a + b
print("Vectorized addition:", c)

In [None]:
# 2D array
A = np.array([[1, 2, 3],
              [4, 5, 6]])

# 1D array (shape (3,))
b = np.array([10, 20, 30])

# b is broadcast across each row of A to match A's shape (2, 3)
result = A + b
print("Broadcasted addition:\n", result)
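
Broadcasting compares shapes from the trailing axis backward: two axes are compatible when their sizes are equal or one of them is 1. The sketch below, with illustrative arrays, shows one compatible pair, one incompatible pair, and a common fix.

```python
import numpy as np

col = np.arange(3).reshape(3, 1)   # shape (3, 1)
row = np.arange(4).reshape(1, 4)   # shape (1, 4)

# Size-1 axes are stretched, so (3, 1) + (1, 4) gives (3, 4)
grid = col + row
print("Result shape:", grid.shape)

# Shapes (2, 3) and (2,) do not line up on the trailing axis (3 vs 2)
A = np.ones((2, 3))
bad = np.array([10, 20])
try:
    A + bad
    broadcast_failed = False
except ValueError as exc:
    broadcast_failed = True
    print("Incompatible shapes:", exc)

# Adding an axis makes the shapes compatible: (2, 3) + (2, 1)
fixed = A + bad[:, np.newaxis]
print("After reshaping to (2, 1):\n", fixed)
```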

In [None]:
##### Practical Questions ######

In [6]:
#  Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns


import numpy as np

# Create 3x3 array with random integers between 1 and 100
np.random.seed(0)  # for reproducibility
arr = np.random.randint(1, 101, size=(3, 3))
print("Original array:\n", arr)

# Interchange rows and columns by transposing the array
arr_transposed = arr.T
print("\nTransposed array (rows and columns interchanged):\n", arr_transposed)


Original array:
 [[45 48 65]
 [68 68 10]
 [84 22 37]]

Transposed array (rows and columns interchanged):
 [[45 68 84]
 [48 68 22]
 [65 10 37]]
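
One detail worth knowing is that `.T` returns a view rather than a copy. The following sketch uses a small fixed array (an assumption made for readability, in place of the random one) to show the equivalent spellings and the shared memory.

```python
import numpy as np

arr = np.arange(1, 10).reshape(3, 3)

t_attr = arr.T                    # attribute form
t_func = np.transpose(arr)        # function form
t_swap = np.swapaxes(arr, 0, 1)   # explicit axis swap

all_agree = np.array_equal(t_attr, t_func) and np.array_equal(t_attr, t_swap)
print("All three agree:", all_agree)

# The transpose is a view: it shares memory with the original array
print("Shares memory with original:", np.shares_memory(arr, t_attr))
t_attr[0, 1] = -1                 # writes through to arr[1, 0]
print("arr[1, 0] after writing to the view:", arr[1, 0])
```

Call `arr.T.copy()` when an independent transposed array is needed.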


In [7]:
# Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array

import numpy as np

# Generate 1D array with 10 elements
arr_1d = np.arange(1, 11)  # Elements from 1 to 10
print("Original 1D array:")
print(arr_1d)

# Reshape into 2x5 array
arr_2x5 = arr_1d.reshape(2, 5)
print("\nReshaped to 2x5 array:")
print(arr_2x5)

# Reshape into 5x2 array
arr_5x2 = arr_1d.reshape(5, 2)
print("\nReshaped to 5x2 array:")
print(arr_5x2)

Original 1D array:
[ 1  2  3  4  5  6  7  8  9 10]

Reshaped to 2x5 array:
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]

Reshaped to 5x2 array:
[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]]
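
Two related behaviors of `reshape` are sketched here: passing `-1` lets NumPy infer one dimension, and a shape whose element count does not match raises a `ValueError`.

```python
import numpy as np

arr_1d = np.arange(1, 11)

# -1 asks NumPy to infer that dimension from the total element count
inferred = arr_1d.reshape(2, -1)
print("Shape with inferred columns:", inferred.shape)  # (2, 5)

# A shape whose product is not 10 is rejected
try:
    arr_1d.reshape(3, 4)
    reshape_failed = False
except ValueError as exc:
    reshape_failed = True
    print("Invalid reshape:", exc)
```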


In [8]:
# Create a 4x4 NumPy array with random float values. Add a border of zeros around it, resulting in a 6x6 array

import numpy as np

# Create a 4x4 array with random float values
np.random.seed(0)  # For reproducibility
arr = np.random.rand(4, 4)
print("Original 4x4 array:\n", arr)

# Add a border of zeros around the array to make it 6x6
arr_padded = np.pad(arr, pad_width=1, mode='constant', constant_values=0)
print("\nArray after adding zero border (6x6):\n", arr_padded)

Original 4x4 array:
 [[0.5488135  0.71518937 0.60276338 0.54488318]
 [0.4236548  0.64589411 0.43758721 0.891773  ]
 [0.96366276 0.38344152 0.79172504 0.52889492]
 [0.56804456 0.92559664 0.07103606 0.0871293 ]]

Array after adding zero border (6x6):
 [[0.         0.         0.         0.         0.         0.        ]
 [0.         0.5488135  0.71518937 0.60276338 0.54488318 0.        ]
 [0.         0.4236548  0.64589411 0.43758721 0.891773   0.        ]
 [0.         0.96366276 0.38344152 0.79172504 0.52889492 0.        ]
 [0.         0.56804456 0.92559664 0.07103606 0.0871293  0.        ]
 [0.         0.         0.         0.         0.         0.        ]]
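
An alternative sketch that avoids `np.pad` is to allocate the 6x6 zero array first and copy the data into its interior. The name `framed` is introduced here for illustration.

```python
import numpy as np

np.random.seed(0)
arr = np.random.rand(4, 4)

# Allocate the 6x6 zero frame, then copy the data into the interior
framed = np.zeros((6, 6))
framed[1:-1, 1:-1] = arr

same = np.array_equal(framed, np.pad(arr, pad_width=1, mode='constant', constant_values=0))
print("Matches np.pad result:", same)
```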


In [9]:
# Using NumPy, create an array of integers from 10 to 60 with a step of 5.

import numpy as np

# Create array of integers from 10 to 60 with step 5
arr = np.arange(10, 61, 5)
print("Array from 10 to 60 with step 5:")
print(arr)

Array from 10 to 60 with step 5:
[10 15 20 25 30 35 40 45 50 55 60]
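
For comparison, `np.linspace` describes the same sequence by the number of points instead of the step. Note that it includes the stop value and returns floats by default.

```python
import numpy as np

by_step = np.arange(10, 61, 5)        # stop (61) is excluded
by_count = np.linspace(10, 60, 11)    # stop (60) is included; 11 points

print("arange dtype:", by_step.dtype, "| linspace dtype:", by_count.dtype)
print("Same values:", np.array_equal(by_step, by_count))
```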


In [10]:
# Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations (uppercase, lowercase, title case, etc.) to each element.

import numpy as np

# Create NumPy array of strings
arr = np.array(['python', 'numpy', 'pandas'])
print("Original array:")
print(arr)

# Uppercase transformation
upper_arr = np.char.upper(arr)
print("\nUppercase:")
print(upper_arr)

# Lowercase transformation
lower_arr = np.char.lower(arr)
print("\nLowercase:")
print(lower_arr)

# Title case transformation
title_arr = np.char.title(arr)
print("\nTitle case:")
print(title_arr)

# Capitalize first character
capitalize_arr = np.char.capitalize(arr)
print("\nCapitalize first character:")
print(capitalize_arr)

Original array:
['python' 'numpy' 'pandas']

Uppercase:
['PYTHON' 'NUMPY' 'PANDAS']

Lowercase:
['python' 'numpy' 'pandas']

Title case:
['Python' 'Numpy' 'Pandas']

Capitalize first character:
['Python' 'Numpy' 'Pandas']


In [11]:
# Generate a NumPy array of words. Insert a space between each character of every word in the array.

import numpy as np

# Create NumPy array of words
words = np.array(['hello', 'world', 'numpy'])

# Function to insert spaces between characters
def insert_spaces(word):
    return ' '.join(list(word))

# Vectorize the function to apply it element-wise on the array
vectorized_func = np.vectorize(insert_spaces)

# Apply to the array
spaced_words = vectorized_func(words)

print("Original words:", words)
print("Words with spaces between characters:", spaced_words)

Original words: ['hello' 'world' 'numpy']
Words with spaces between characters: ['h e l l o' 'w o r l d' 'n u m p y']
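
The same result can likely be obtained without a custom function, using `np.char.join`, which applies `sep.join(...)` to each string element. This is a sketch of an alternative, not a replacement for the approach above.

```python
import numpy as np

words = np.array(['hello', 'world', 'numpy'])

# np.char.join(sep, seq) joins the characters of each element with sep
spaced = np.char.join(' ', words)
print(spaced)
```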


In [12]:
# Create two 2D NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.

import numpy as np

# Create two 2D arrays
arr1 = np.array([[1, 2, 3],
                 [4, 5, 6]])

arr2 = np.array([[6, 5, 4],
                 [3, 2, 1]])

print("Array 1:\n", arr1)
print("\nArray 2:\n", arr2)

# Element-wise addition
add = arr1 + arr2
print("\nElement-wise Addition:\n", add)

# Element-wise subtraction
sub = arr1 - arr2
print("\nElement-wise Subtraction:\n", sub)

# Element-wise multiplication
mul = arr1 * arr2
print("\nElement-wise Multiplication:\n", mul)

# Element-wise division
div = arr1 / arr2
print("\nElement-wise Division:\n", div)


Array 1:
 [[1 2 3]
 [4 5 6]]

Array 2:
 [[6 5 4]
 [3 2 1]]

Element-wise Addition:
 [[7 7 7]
 [7 7 7]]

Element-wise Subtraction:
 [[-5 -3 -1]
 [ 1  3  5]]

Element-wise Multiplication:
 [[ 6 10 12]
 [12 10  6]]

Element-wise Division:
 [[0.16666667 0.4        0.75      ]
 [1.33333333 2.5        6.        ]]
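
Element-wise division needs care when the divisor contains zeros. One possible approach, sketched here with made-up arrays, is `np.divide` with the `where` and `out` arguments; floor division and remainder are shown alongside.

```python
import numpy as np

num = np.array([[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0]])
den = np.array([[2.0, 0.0, 4.0],
                [0.0, 5.0, 3.0]])

# Divide only where the denominator is nonzero;
# other positions keep the 0.0 already stored in `out`
safe = np.divide(num, den, out=np.zeros_like(num), where=den != 0)
print("Safe division:\n", safe)

# Integer floor division and remainder are also element-wise
a = np.array([7, 8, 9])
b = np.array([2, 3, 4])
print("Floor division:", a // b)
print("Remainder:", a % b)
```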


In [13]:
#  Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements


import numpy as np

# Create a 5x5 identity matrix
identity_matrix = np.eye(5)
print("5x5 Identity Matrix:\n", identity_matrix)

# Extract the diagonal elements
diagonal_elements = np.diag(identity_matrix)
print("\nDiagonal elements of the identity matrix:")
print(diagonal_elements)


5x5 Identity Matrix:
 [[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]

Diagonal elements of the identity matrix:
[1. 1. 1. 1. 1.]
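
On an identity matrix every diagonal entry is 1, so the extraction is easier to see on a matrix with distinct values. The matrix `m` below is an illustrative assumption.

```python
import numpy as np

m = np.arange(1, 26).reshape(5, 5)

main_diag = np.diag(m)          # main diagonal
upper_diag = np.diag(m, k=1)    # first diagonal above the main one
trace = np.trace(m)             # sum of the main diagonal

print("Main diagonal:", main_diag)
print("Diagonal above main:", upper_diag)
print("Trace:", trace)

# np.identity(5) builds the same matrix as np.eye(5)
print("identity == eye:", np.array_equal(np.identity(5), np.eye(5)))
```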


In [14]:
# Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in this array.

import numpy as np

# Function to check if a number is prime
def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(np.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True

# Generate 100 random integers between 0 and 1000
np.random.seed(0)  # For reproducibility
arr = np.random.randint(0, 1001, size=100)
print("Random integers array:\n", arr)

# Vectorize the prime checking function
vectorized_is_prime = np.vectorize(is_prime)

# Filter primes in the array
primes = arr[vectorized_is_prime(arr)]
print("\nPrime numbers in the array:")
print(primes)


Random integers array:
 [684 559 629 192 835 763 707 359   9 723 277 754 804 599  70 472 600 396
 314 705 486 551  87 174 600 849 677 537 845  72 777 916 115 976 755 709
 847 431 448 850  99 984 177 755 797 659 147 910 423 288 961 265 697 639
 544 543 714 244 151 675 510 459 882 183  28 802 128 128 932  53 901 550
 488 756 273 335 388 617  42 442 543 888 257 321 999 937  57 291 870 119
 779 430  82  91 896 398 611 565 908 633]

Prime numbers in the array:
[359 277 599 677 709 431 797 659 151  53 617 257 937]
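
`np.vectorize` is a convenience wrapper that still calls the Python function once per element. A different technique, sketched below, is a sieve of Eratosthenes built with array slicing, followed by a lookup for each value. The helper name `prime_mask` is hypothetical.

```python
import numpy as np

def prime_mask(limit):
    """Return a boolean array where mask[n] is True if n is prime (0 <= n <= limit)."""
    mask = np.ones(limit + 1, dtype=bool)
    mask[:2] = False                       # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if mask[p]:
            mask[p * p::p] = False         # cross out multiples of p
    return mask

np.random.seed(0)
arr = np.random.randint(0, 1001, size=100)

mask = prime_mask(1000)
primes = arr[mask[arr]]                    # look up each value in the sieve
print("Prime numbers in the array:")
print(primes)
```

The sieve is computed once for the whole value range, so the per-element work is reduced to an index lookup.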


In [15]:
#  Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly averages.

import numpy as np

# Create an array of daily temperatures for 30 days (random example data)
np.random.seed(0)  # For reproducibility
daily_temps = np.random.uniform(low=15, high=35, size=30)  # temps between 15 and 35 degrees
print("Daily temperatures for 30 days:\n", daily_temps)

# Reshape into weeks (assuming 7 days per week)
# Since 30 is not divisible by 7, handle extra days separately
weeks = daily_temps[:28].reshape(4, 7)  # First 28 days -> 4 full weeks

# Calculate weekly averages
weekly_averages = weeks.mean(axis=1)
print("\nWeekly averages for first 4 weeks:")
for i, avg in enumerate(weekly_averages, 1):
    print(f"Week {i} average temperature: {avg:.2f}°C")

# Handle remaining 2 days
remaining_days = daily_temps[28:]
remaining_avg = remaining_days.mean()
print(f"\nAverage temperature for remaining 2 days: {remaining_avg:.2f}°C")


Daily temperatures for 30 days:
 [25.97627008 29.30378733 27.05526752 25.89766366 23.47309599 27.91788226
 23.75174423 32.83546002 34.27325521 22.66883038 30.83450076 25.5778984
 26.36089122 33.51193277 16.42072116 16.74258599 15.40436795 31.65239691
 30.56313502 32.40024296 34.57236684 30.98317128 24.22958725 30.61058353
 17.36548852 27.79842043 17.86706575 33.89337834 25.43696644 23.2932388 ]

Weekly averages for first 4 weeks:
Week 1 average temperature: 26.20°C
Week 2 average temperature: 29.44°C
Week 3 average temperature: 25.39°C
Week 4 average temperature: 26.11°C

Average temperature for remaining 2 days: 24.37°C
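
An alternative sketch handles the full weeks and the leftover days in one pass by splitting at explicit day indices with `np.split`. The names `chunks` and `chunk_means` are introduced here for illustration.

```python
import numpy as np

np.random.seed(0)
daily_temps = np.random.uniform(low=15, high=35, size=30)

# Split at day indices 7, 14, 21, 28: four full weeks plus a 2-day remainder
chunks = np.split(daily_temps, [7, 14, 21, 28])
chunk_means = [chunk.mean() for chunk in chunks]

for i, (chunk, avg) in enumerate(zip(chunks, chunk_means), 1):
    print(f"Chunk {i} ({chunk.size} days): {avg:.2f}°C")
```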
