# **`Data Science Learners Hub`**

**Module : Python**

**email** : [datasciencelearnershub@gmail.com](mailto:datasciencelearnershub@gmail.com)

### **`#2: Operations on NumPy Arrays`**

1. **Element-wise Operations:**
   - Performing basic arithmetic operations on arrays.
   - Universal functions (ufuncs) in NumPy.

2. **Aggregation and Statistics:**
   - Sum, mean, median, variance, and standard deviation.
   - Min, max, and other aggregation functions.

3. **Array Comparison and Boolean Operations:**
   - Comparing arrays element-wise.
   - Boolean indexing and masking.

4. **Array Manipulation:**
   - Reshaping arrays.
   - Concatenation and splitting arrays.

#### **`1. Element-wise Operations on NumPy Arrays:`**

**Concept:**
An element-wise operation applies an operator to corresponding elements of two arrays: element `i` of the result is computed from element `i` of each input. The inputs must have the same shape, or shapes that NumPy can broadcast together (for example, an array and a scalar).

**Examples:**
Let's demonstrate basic arithmetic element-wise operations:

1. **Addition (`+`):**
   ```python
   import numpy as np

   arr1 = np.array([1, 2, 3])
   arr2 = np.array([4, 5, 6])
   result = arr1 + arr2
   ```
   Result:
   ```
   [5 7 9]
   ```

2. **Subtraction (`-`):**
   ```python
   import numpy as np

   arr1 = np.array([4, 5, 6])
   arr2 = np.array([1, 2, 3])
   result = arr1 - arr2
   ```
   Result:
   ```
   [3 3 3]
   ```

3. **Multiplication (`*`):**
   ```python
   import numpy as np

   arr1 = np.array([2, 3, 4])
   arr2 = np.array([5, 2, 1])
   result = arr1 * arr2
   ```
   Result:
   ```
   [10  6  4]
   ```

4. **Division (`/`):**
   ```python
   import numpy as np

   arr1 = np.array([10, 15, 20])
   arr2 = np.array([2, 3, 4])
   result = arr1 / arr2
   ```
   Result:
   ```
   [5. 5. 5.]
   ```
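
The arithmetic operators above are shorthand for NumPy's universal functions (ufuncs), which the outline mentions. The following is a minimal sketch of a few ufuncs called directly; the array values are made up for illustration.

```python
import numpy as np

arr = np.array([1, 4, 9, 16])  # illustrative values

# Operators map onto ufuncs: arr + 10 is equivalent to np.add(arr, 10)
shifted = np.add(arr, 10)    # the scalar 10 is broadcast to every element
roots = np.sqrt(arr)         # square root of every element (float result)
squares = np.power(arr, 2)   # equivalent to arr ** 2

print(shifted)   # [11 14 19 26]
print(roots)     # [1. 2. 3. 4.]
print(squares)   # [  1  16  81 256]
```

Because ufuncs loop over the data in compiled code, they are usually much faster than an explicit Python `for` loop over the same elements.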

**Real-world Scenarios:**

1. **Financial Analysis:**
   - **Operation:** Calculating the return on investment for a series of investments.
   - **Application:** Element-wise multiplication is used to calculate the returns for each investment, facilitating analysis.

2. **Image Processing:**
   - **Operation:** Adjusting the brightness of an image.
   - **Application:** Element-wise addition or subtraction can be employed to change the brightness of each pixel in an image.

3. **Scientific Experiments:**
   - **Operation:** Analyzing experimental data collected at different time points.
   - **Application:** Element-wise operations help compare and analyze data collected at different time intervals.

4. **Weather Data Analysis:**
   - **Operation:** Comparing temperatures recorded by different weather stations.
   - **Application:** Element-wise subtraction allows for comparing temperature variations recorded by different stations.

5. **Healthcare Data Analysis:**
   - **Operation:** Analyzing patient vital signs over time.
   - **Application:** Element-wise operations help analyze trends and changes in vital signs recorded at different time points.
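
Two of the scenarios above can be sketched in a few lines. All figures and pixel values here are hypothetical, and the brightness step assumes an 8-bit grayscale image, so the result is clipped to the 0-255 range.

```python
import numpy as np

# Financial analysis: return on investment per position (hypothetical figures)
invested = np.array([1000.0, 2500.0, 4000.0])
final_value = np.array([1100.0, 2400.0, 5000.0])
roi = (final_value - invested) / invested  # element-wise subtract, then divide
print(roi)  # 10% gain, 4% loss, 25% gain

# Image processing: brighten a tiny 2x2 grayscale "image" (hypothetical pixels)
pixels = np.array([[10, 120], [200, 250]], dtype=np.uint8)
# Widen to int16 first so that 250 + 40 does not wrap around in uint8,
# then clip back into the valid 0-255 range.
brighter = np.clip(pixels.astype(np.int16) + 40, 0, 255).astype(np.uint8)
print(brighter)  # values: 50, 160, 240, 255
```

Widening the dtype before adding is a design choice for this sketch: adding directly to a `uint8` array can silently overflow.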

**Key Takeaway:**
Element-wise operations in NumPy are fundamental for efficiently processing and analyzing numerical data, especially in scenarios where operations need to be performed on corresponding elements across arrays. They find wide application in fields such as finance, image processing, scientific research, meteorology, and healthcare.

#### **`2. Aggregation and Statistics in NumPy:`**

**Aggregation Functions:**

NumPy provides various aggregation functions that summarize data by computing statistical measures. Here are some key aggregation functions:


1. **Sum (`np.sum()`):**
   - Computes the sum of array elements along a specified axis or the entire array.
   - Example:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
total_sum = np.sum(arr)
print(total_sum)
```
Output:
```
15
```
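
The "along a specified axis" part is easiest to see on a 2-D array. This is a small sketch using a made-up table; the name `sales` and its values are hypothetical.

```python
import numpy as np

# Hypothetical table: 2 rows x 3 columns
sales = np.array([[1, 2, 3],
                  [4, 5, 6]])

print(np.sum(sales))          # 21       (all elements)
print(np.sum(sales, axis=0))  # [5 7 9]  (collapse the rows: one sum per column)
print(np.sum(sales, axis=1))  # [ 6 15]  (collapse the columns: one sum per row)
```

The same `axis` argument is accepted by `np.mean()`, `np.median()`, `np.var()`, and `np.std()`.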


2. **Mean (`np.mean()`):**
   - Calculates the arithmetic mean (average) of array elements.
   - Example:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])
average = np.mean(arr)
print(average)
```
Output:
```
30.0
```


3. **Median (`np.median()`):**
   - Computes the median value of array elements.
   - Example:

```python
import numpy as np

arr = np.array([3, 1, 4, 1, 5, 9, 2, 6, 5])
median_value = np.median(arr)
print(median_value)
```
Output:
```
4.0
```


4. **Variance (`np.var()`):**
   - Calculates the variance, a measure of the spread of array values. By default this is the population variance (`ddof=0`, dividing by `N`).
   - Example:

```python
import numpy as np

arr = np.array([2, 4, 4, 4, 5, 5, 7, 9])
variance = np.var(arr)
print(variance)
```
Output:
```
4.0
```
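
If the data is a sample rather than a whole population, the sample variance is often wanted instead. The sketch below reuses the array from the example above and passes `ddof=1`, which makes NumPy divide by `N - 1`.

```python
import numpy as np

arr = np.array([2, 4, 4, 4, 5, 5, 7, 9])

population_var = np.var(arr)      # divides by N = 8      -> 32 / 8 = 4.0
sample_var = np.var(arr, ddof=1)  # divides by N - 1 = 7  -> 32 / 7

print(population_var)
print(sample_var)
```

`np.std()` takes the same `ddof` argument.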


5. **Standard Deviation (`np.std()`):**
   - Computes the standard deviation, the square root of the variance, which measures dispersion in the same units as the data.
   - Example:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
std_deviation = np.std(arr)
print(std_deviation)
```
Output:
```
1.4142135623730951
```
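
The outline also lists min, max, and other aggregations. A brief sketch follows; the `readings` values are hypothetical.

```python
import numpy as np

readings = np.array([7, 2, 9, 4, 9])  # hypothetical sample values

print(np.min(readings))     # 2
print(np.max(readings))     # 9
print(np.argmin(readings))  # 1  (index of the smallest value)
print(np.argmax(readings))  # 2  (index of the first occurrence of the largest value)
print(np.max(readings) - np.min(readings))  # 7  (range of the data)
```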



**Real-world Examples:**

1. **Healthcare Data Analysis:**
   - **Metric:** Mean heart rate over a population.
   - **Application:** Calculate the mean heart rate using `np.mean()` to assess overall health trends.

2. **Environmental Monitoring:**
   - **Metric:** Variance in temperature readings.
   - **Application:** Evaluate the variance in temperature data using `np.var()` to understand temperature fluctuations.

3. **Manufacturing Quality Control:**
   - **Metric:** Standard deviation of product dimensions.
   - **Application:** Assess the consistency in product dimensions using `np.std()` to ensure quality control.

4. **Social Sciences Research:**
   - **Metric:** Median income in a demographic study.
   - **Application:** Determine the median income using `np.median()` to understand income distribution.
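
One reason income studies tend to report the median is that a single extreme value can pull the mean far from a typical value. A small sketch with made-up incomes (in thousands) illustrates this:

```python
import numpy as np

# Hypothetical annual incomes in thousands; the last value is an outlier
incomes = np.array([30, 35, 40, 45, 50, 400])

print(np.mean(incomes))    # 100.0 (pulled up by the outlier)
print(np.median(incomes))  # 42.5  (midpoint of 40 and 45)
```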

**Key Takeaway:** Aggregation functions in NumPy play a crucial role in summarizing and analyzing datasets by providing key statistical measures. They are essential for gaining insights into data distribution, central tendency, and variability, contributing significantly to various fields, including finance, healthcare, environmental science, manufacturing, and social sciences.

#### **`3. Array Comparison and Boolean Operations in NumPy:`**

**Importance:**

Array comparison and boolean operations in NumPy are fundamental for creating conditions, filtering data, and making logical decisions. These operations involve comparing elements in arrays and generating boolean arrays as a result.
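
As a minimal sketch of comparing two arrays element-wise, the example below builds boolean arrays and then reduces them to a single answer. The arrays `a` and `b` are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([1, 5, 3, 0])

equal_mask = a == b    # [ True False  True False]
greater_mask = a > b   # [False False False  True]

print(np.array_equal(a, b))  # False (the arrays are not identical)
print(np.any(equal_mask))    # True  (at least one position matches)
print(np.all(equal_mask))    # False (not every position matches)
print(np.sum(equal_mask))    # 2     (True counts as 1, so this counts matches)
```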

**Scenarios:**

1. **Filtering Data:**
   - **Scenario:** Identifying temperatures above a certain threshold in weather data.
   - **Application:** Use boolean operations to filter temperatures exceeding a specified threshold for further analysis.

```python
import numpy as np

temperatures = np.array([22, 25, 30, 35, 28, 24, 32])
above_threshold = temperatures > 25
filtered_temperatures = temperatures[above_threshold]
print(filtered_temperatures)
```
Output:
```
[30 35 28 32]
```


2. **Scientific Experiments:**
   - **Scenario:** Analyzing experimental results based on specific criteria.
   - **Application:** Use boolean operations to filter and analyze experimental data meeting specific conditions.

```python
import numpy as np

experiment_results = np.array([12, 18, 22, 15, 30, 25, 20])
successful_experiments = experiment_results >= 20
analysis_data = experiment_results[successful_experiments]
print(analysis_data)
```
Output:
```
[22 30 25 20]
```


3. **Financial Analysis:**
   - **Scenario:** Identifying profitable stock trades based on certain indicators.
   - **Application:** Use boolean operations to filter stock trades with favorable conditions.

```python
import numpy as np

stock_prices = np.array([150, 160, 140, 170, 180, 130])
profitable_trades = stock_prices > 150
selected_trades = stock_prices[profitable_trades]
print(selected_trades)
```
Output:
```
[160 170 180]
```
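
Each scenario above uses a single condition. Conditions can also be combined with `&` (and), `|` (or), and `~` (not). The sketch below reuses the temperature data from the first scenario; the thresholds and the `"hot"`/`"ok"` labels are arbitrary choices for illustration.

```python
import numpy as np

temperatures = np.array([22, 25, 30, 35, 28, 24, 32])

# Parentheses are required: & and | bind more tightly than comparisons
mild = (temperatures >= 25) & (temperatures <= 30)
print(temperatures[mild])     # [25 30 28]

extreme = (temperatures < 24) | (temperatures > 32)
print(temperatures[extreme])  # [22 35]

print(temperatures[~mild])    # [22 35 24 32]

# np.where picks a value per element depending on the condition
labels = np.where(temperatures > 25, "hot", "ok")
print(labels)
```

Python's `and`, `or`, and `not` keywords do not work element-wise on arrays, which is why the bitwise operators are used here.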


**Real-world Examples:**

1. **Medical Research:**
   - **Scenario:** Analyzing patient data for specific conditions.
   - **Application:** Identify patients with a certain health parameter above a threshold using boolean operations for targeted medical analysis.

2. **Environmental Monitoring:**
   - **Scenario:** Identifying pollution levels above safety standards.
   - **Application:** Use boolean operations to filter data from environmental sensors, focusing on instances when pollution levels exceed acceptable standards.

3. **Educational Assessment:**
   - **Scenario:** Identifying students who scored above a passing grade.
   - **Application:** Use boolean operations to filter and analyze student scores, focusing on those who met or exceeded passing criteria.
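
Boolean masks can also be aggregated directly, which is handy for questions such as "how many students passed?". The scores and the passing mark below are hypothetical.

```python
import numpy as np

scores = np.array([45, 72, 88, 39, 60, 91])  # hypothetical scores
passing_mark = 60                             # hypothetical threshold

passed = scores >= passing_mark

num_passed = np.sum(passed)           # True counts as 1 -> 4 students
pass_rate = np.mean(passed)           # fraction of True values -> 4 / 6
mean_passing = np.mean(scores[passed])  # average of 72, 88, 60, 91 -> 77.75

print(num_passed, pass_rate, mean_passing)
```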

**Key Takeaway:**

Array comparison and boolean operations in NumPy provide a powerful mechanism for filtering and analyzing data based on specific conditions. These operations are crucial in various real-world applications, including weather analysis, scientific experiments, financial decision-making, medical research, environmental monitoring, and educational assessments. They enable users to focus on relevant data subsets, make informed decisions, and extract meaningful insights from datasets.

#### **`4. Array Manipulation in NumPy:`**

**1. Reshaping Arrays:**

- **Scenario:** Image Processing
- **Application:** Reshaping arrays is crucial in image processing, where images are represented as multi-dimensional arrays. Reshaping can be used to change the dimensions of an image array.

```python
import numpy as np

image = np.array([[1, 2, 3], [4, 5, 6]])
reshaped_image = np.reshape(image, (3, 2))
print(reshaped_image)
```
Output:
```
[[1 2]
 [3 4]
 [5 6]]
```
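
A few related points are worth sketching: one dimension can be given as `-1` so NumPy infers it, reshaping is not the same as transposing, and the total number of elements must stay the same. The example reuses the `image` array from above.

```python
import numpy as np

image = np.array([[1, 2, 3], [4, 5, 6]])

flat = image.reshape(-1)         # 1-D view of all 6 elements: [1 2 3 4 5 6]
inferred = image.reshape(3, -1)  # -1 is inferred as 2, giving shape (3, 2)
transposed = image.T             # swaps axes: [[1 4] [2 5] [3 6]]

print(inferred.shape)    # (3, 2)
print(transposed.shape)  # (3, 2), but with a different element order

# 6 elements cannot fill a 4x2 array, so this raises ValueError
try:
    image.reshape(4, 2)
except ValueError as err:
    print("reshape failed:", err)
```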


**2. Concatenation:**

- **Scenario:** Combining Datasets
- **Application:** Concatenation is useful when combining datasets, especially when datasets are collected separately but need to be analyzed together.

```python
import numpy as np

data1 = np.array([1, 2, 3])
data2 = np.array([3, 5, 6])
concatenated_data = np.concatenate((data1, data2))
print(concatenated_data)
```
Output:
```
[1 2 3 3 5 6]
```


```python
import numpy as np

# For 2-D arrays, axis=0 (the first axis) runs down the rows and
# axis=1 (the second axis) runs across the columns.

# What is the output of the following?

a = [1, 2, 4]
b = [2, 4, 4]

print("Concatenate")
print(np.concatenate((a, b), axis=0))
# Concatenating along axis=1 fails for 1-D inputs:
# AxisError: axis 1 is out of bounds for array of dimension 1
# print(np.concatenate((a, b), axis=1))
print("hstack")
# Why the double parentheses? hstack takes one argument: a tuple (sequence) of arrays.
print(np.hstack((a, b)))
print("vstack")
print(np.vstack((a, b)))
print("column_stack")
print(np.column_stack((a, b)))
```
Output:
```
Concatenate
[1 2 4 2 4 4]
hstack
[1 2 4 2 4 4]
vstack
[[1 2 4]
 [2 4 4]]
column_stack
[[1 2]
 [2 4]
 [4 4]]
```
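
With 2-D inputs, `np.concatenate` does accept `axis=1`. The following sketch uses two small illustrative matrices to show both directions.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

rows = np.concatenate((a, b), axis=0)  # stack b below a  -> shape (4, 2)
cols = np.concatenate((a, b), axis=1)  # place b beside a -> shape (2, 4)

print(rows)
print(cols)
```

All dimensions other than the one being joined must match, otherwise NumPy raises a `ValueError`.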


**3. Splitting:**

- **Scenario:** Separating Datasets
- **Application:** Splitting is handy when you have a single dataset that needs to be divided into multiple parts for separate analysis or processing.

```python
import numpy as np

original_data = np.array([1, 2, 3, 4, 5, 6])
split_data = np.split(original_data, 2)
print(split_data)
```
Output:
```
[array([1, 2, 3]), array([4, 5, 6])]
```


```python
# help(np.split)

# See Also
# --------
# array_split : Split an array into multiple sub-arrays of equal or
#               near-equal size.  Does not raise an exception if
#               an equal division cannot be made.
# hsplit : Split array into multiple sub-arrays horizontally (column-wise).
# vsplit : Split array into multiple sub-arrays vertically (row wise).
# dsplit : Split array into multiple sub-arrays along the 3rd axis (depth).
# concatenate : Join a sequence of arrays along an existing axis.
# stack : Join a sequence of arrays along a new axis.
# hstack : Stack arrays in sequence horizontally (column wise).
# vstack : Stack arrays in sequence vertically (row wise).
# dstack : Stack arrays in sequence depth wise (along third dimension).
```
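
To illustrate the difference the help text describes, the sketch below splits a 7-element array, which cannot be divided into 3 equal parts. The data is made up for illustration.

```python
import numpy as np

data = np.arange(1, 8)  # [1 2 3 4 5 6 7]

# np.split requires an equal division and raises ValueError otherwise
try:
    np.split(data, 3)
except ValueError as err:
    print("np.split failed:", err)

# np.array_split allows near-equal parts: sizes 3, 2, 2
parts = np.array_split(data, 3)
print(parts)

# np.split also accepts a list of indices at which to cut
at_indices = np.split(data, [2, 5])  # pieces: data[:2], data[2:5], data[5:]
print(at_indices)
```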

**Real-world Examples:**

1. **Image Processing: Reshaping Arrays**

   - **Scenario:** Color channels in an image represented as separate arrays.
   - **Application:** Stack the color channels into a single image array with `np.stack()`, then reshape it for further image processing.

```python
import numpy as np

red_channel = np.array([[255, 0], [0, 255]])
green_channel = np.array([[0, 255], [255, 0]])
blue_channel = np.array([[0, 0], [255, 255]])

# Combine color channels into a single image array
image = np.stack((red_channel, green_channel, blue_channel), axis=-1)
reshaped_image = np.reshape(image, (2, 6))
print(reshaped_image)
```
Output:
```
[[255   0   0   0 255   0]
 [  0 255 255 255   0 255]]
```


2. **Combining Financial Datasets: Concatenation**

   - **Scenario:** Quarterly financial reports stored in separate arrays.
   - **Application:** Concatenate quarterly reports into a single dataset for annual financial analysis.

```python
import numpy as np

q1_data = np.array([100, 150, 120])
q2_data = np.array([130, 140, 110])

# Combine quarterly data into a single dataset
# (only Q1 and Q2 are shown here; Q3 and Q4 would be appended the same way)
annual_data = np.concatenate((q1_data, q2_data))
print(annual_data)
```
Output:
```
[100 150 120 130 140 110]
```


3. **Splitting Time Series Data: Splitting**

   - **Scenario:** Time series data collected over a year.
   - **Application:** Split the time series into monthly data for individual analysis.

```python
import numpy as np

yearly_data = np.array([10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65])

# Split yearly data into monthly data
monthly_data = np.split(yearly_data, 12)
print(monthly_data)
```
Output:
```
[array([10]), array([15]), array([20]), array([25]), array([30]), array([35]), array([40]), array([45]), array([50]), array([55]), array([60]), array([65])]
```
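
Splitting becomes more useful when each piece is then summarized. As a sketch, the same twelve values (assumed here to be one value per month) can be split into four quarters and averaged, or reshaped and aggregated along an axis for the same result.

```python
import numpy as np

yearly_data = np.array([10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65])

# Split into 4 quarters of 3 months each, then average each quarter
quarters = np.split(yearly_data, 4)
quarter_means = [float(np.mean(q)) for q in quarters]
print(quarter_means)  # [15.0, 30.0, 45.0, 60.0]

# Equivalent: reshape to (4, 3) and take the mean of each row
same_means = yearly_data.reshape(4, 3).mean(axis=1)
print(same_means)
```

The reshape version avoids building a Python list of arrays, which tends to matter for larger datasets.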


**Key Takeaway:**

Array manipulation operations in NumPy, such as reshaping, concatenation, and splitting, are essential tools for organizing and preparing data for various applications. These operations find practical use in scenarios like image processing, financial analysis, and time series data management, allowing users to efficiently handle complex data structures.