In [1]:
import numpy as np

# Set Operations
### Finding Common Elements with intersect1d()

In [2]:
n1 = np.array([10, 20, 30, 40, 50, 60])
n2 = np.array([50, 60, 70, 80, 90])

In [3]:
# Find common elements (intersection)
np.intersect1d(n1,n2)

array([50, 60])

### Explanation:

* np.intersect1d(n1, n2) finds common elements that exist in both arrays
* "intersect1d" = Intersection of 1-dimensional arrays
* Returns a sorted array containing only values present in both n1 and n2
* Think of it as: "Which values appear in BOTH arrays?"
* The result contains no duplicates (each value appears only once)

### Key Points:

* Returns elements that are in BOTH arrays
* Result is always sorted in ascending order
* Result contains unique values (no duplicates)
* Works with arrays of any length; the two inputs do not need to be the same size
* If no common elements exist, returns an empty array
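
The points above can be checked with a minimal sketch. The arrays `left` and `right` are made-up inputs, not part of the notebook, chosen to be unsorted and to contain repeated values:

```python
import numpy as np

# Illustrative inputs: unsorted, with repeated values
left = np.array([30, 10, 30, 20, 10])
right = np.array([20, 30, 30, 40])

common = np.intersect1d(left, right)
print(common)  # [20 30]  (sorted, each value listed once)

# Arrays with nothing in common give an empty array
disjoint = np.intersect1d(np.array([1, 2]), np.array([3, 4]))
print(disjoint.size)  # 0
```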

### When to use intersect1d():

* Finding common items between two datasets
* Matching IDs or identifiers
* Comparing lists to find overlaps
* Data validation - checking which items exist in reference data
* Set theory operations in data analysis

### Example Use Cases:

In [4]:
# Find students in both classes
class_a = np.array([101, 102, 103, 104, 105])
class_b = np.array([103, 104, 105, 106, 107])
both_classes = np.intersect1d(class_a, class_b)
# Output: array([103, 104, 105])  (students enrolled in both)

# Find common products between two stores
store1_inventory = np.array([10, 20, 30, 40])
store2_inventory = np.array([20, 30, 50, 60])
common_products = np.intersect1d(store1_inventory, store2_inventory)
# Output: array([20, 30])  (products available in both stores)
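
When the matched IDs are keys into other data, the positions of the matches are often as useful as the values. A sketch using the `return_indices=True` option of `np.intersect1d`; the score arrays here are hypothetical data invented for illustration:

```python
import numpy as np

# Hypothetical data: student IDs with one score per student in each class
ids_a = np.array([101, 102, 103, 104, 105])
scores_a = np.array([88, 92, 75, 60, 81])
ids_b = np.array([103, 104, 105, 106, 107])
scores_b = np.array([70, 65, 90, 55, 77])

# idx_a and idx_b are the positions of the common IDs in each input
common_ids, idx_a, idx_b = np.intersect1d(ids_a, ids_b, return_indices=True)
print(common_ids)        # [103 104 105]
print(scores_a[idx_a])   # [75 60 81]
print(scores_b[idx_b])   # [70 65 90]
```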

### Related Set Operations:

### Finding Difference with setdiff1d()

In [5]:
np.setdiff1d(n1,n2)

array([10, 20, 30, 40])

### Explanation:

* np.setdiff1d(n1, n2) finds elements that are in n1 but NOT in n2
* "setdiff1d" = Set Difference for 1-dimensional arrays
* Returns a sorted array of values that exist in n1 but don't exist in n2
* Think of it as: "What's in the first array that's NOT in the second array?"
* Order matters: setdiff1d(n1, n2) ≠ setdiff1d(n2, n1)

### Key Points:

* Returns the elements of the first array that do not appear in the second
* Result is always sorted in ascending order
* Result contains unique values (no duplicates)
* Direction matters: switching arrays gives different results
* If all elements of n1 are in n2, returns an empty array
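
To see that direction matters, here is a short sketch that runs the difference both ways on the same `n1` and `n2`, followed by a case where nothing is left over (the variable names for the results are illustrative):

```python
import numpy as np

n1 = np.array([10, 20, 30, 40, 50, 60])
n2 = np.array([50, 60, 70, 80, 90])

only_in_n1 = np.setdiff1d(n1, n2)  # values of n1 missing from n2
only_in_n2 = np.setdiff1d(n2, n1)  # values of n2 missing from n1
print(only_in_n1)  # [10 20 30 40]
print(only_in_n2)  # [70 80 90]

# Every element of the first array is also in the second: empty result
nothing_left = np.setdiff1d(np.array([50, 60]), n2)
print(nothing_left.size)  # 0
```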

In [None]:
# Union: Elements in EITHER array (or both)
a = np.array([1, 2, 3])
b = np.array([3, 4, 5])
np.union1d(a, b)
# Output: array([1, 2, 3, 4, 5])  (all unique elements)

# Set difference: Elements in first array but NOT in second
a = np.array([1, 2, 3, 4])
b = np.array([3, 4, 5])
np.setdiff1d(a, b)
# Output: array([1, 2])  (in 'a' but not in 'b')

# Symmetric difference: Elements in EITHER array but NOT in both
a = np.array([1, 2, 3])
b = np.array([3, 4, 5])
np.setxor1d(a, b)
# Output: array([1, 2, 4, 5])  (excludes 3, which is in both)
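
These operations are related to one another. As a rough check rather than a proof, the sketch below rebuilds the symmetric difference and the union from the other functions and compares the results:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([3, 4, 5])

# Symmetric difference = (a minus b) combined with (b minus a)
xor_direct = np.setxor1d(a, b)
xor_built = np.union1d(np.setdiff1d(a, b), np.setdiff1d(b, a))
print(np.array_equal(xor_direct, xor_built))  # True

# Union = symmetric difference combined with the intersection
union_built = np.union1d(xor_direct, np.intersect1d(a, b))
print(np.array_equal(np.union1d(a, b), union_built))  # True
```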

### Aggregation Functions

In [7]:
n3 = np.array([10,20])
n4 = np.array([30,40])
np.sum([n3,n4])

np.int64(100)

**Explanation**:
- `np.sum([n3, n4])` calculates the **total sum** of all elements in both arrays
- Takes a **list of arrays** as input: `[n3, n4]`. NumPy first converts the list into a single 2D array, so the arrays should have the same length
- Adds up **every single element** from all arrays
- Returns a **single number** (scalar value)

**Key Points:**
- Sums **all elements** of the input when no `axis` is given
- Returns a **single scalar** value in that case
- Accepts a single array or a list of equal-length arrays
- Works with any dimension (1D, 2D, 3D, etc.)
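
A small sketch of what happens to the list `[n3, n4]` before it is summed. The arrays `short` and `longer` are made-up inputs, and using `np.concatenate` to handle arrays of different lengths is one possible approach, not something the notebook itself does:

```python
import numpy as np

n3 = np.array([10, 20])
n4 = np.array([30, 40])

# The list [n3, n4] behaves like one 2D array with a row per input
stacked = np.array([n3, n4])
print(stacked.shape)     # (2, 2)
print(np.sum(stacked))   # 100

# For arrays of different lengths, join them into one 1D array first
short = np.array([1, 2])
longer = np.array([3, 4, 5])
total = np.sum(np.concatenate([short, longer]))
print(total)             # 15
```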

**Different Ways to Use `np.sum()`:**
```python
# Example 1: Sum a single array
arr = np.array([1, 2, 3, 4, 5])
np.sum(arr)
# Output: 15  (1+2+3+4+5)
```
```python
# Example 2: Sum multiple arrays
a = np.array([10, 20, 30])
b = np.array([5, 15, 25])
np.sum([a, b])
# Output: 105  (10+20+30+5+15+25)
```
```python
# Example 3: Sum a 2D array (all elements)
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
np.sum(arr2d)
# Output: 21  (1+2+3+4+5+6)
```

**Using `axis` Parameter:**

The `axis` parameter lets you control **which direction** to sum:
```python
# 2D array example
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# Sum ALL elements (default, no axis)
np.sum(arr)
# Output: 21

# axis=0: Sum DOWN columns (collapse rows)
np.sum(arr, axis=0)
# Output: array([5, 7, 9])
# Explanation: [1+4, 2+5, 3+6] = [5, 7, 9]

# axis=1: Sum ACROSS rows (collapse columns)
np.sum(arr, axis=1)
# Output: array([6, 15])
# Explanation: [1+2+3, 4+5+6] = [6, 15]
```
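
One way to remember `axis` is that the axis you name is the one that disappears from the result's shape. The sketch below checks the shapes and also shows the `keepdims` option of `np.sum`, which keeps the summed axis with length 1:

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])      # shape (2, 3)

col_sums = np.sum(arr, axis=0)   # axis 0 (length 2) is removed
row_sums = np.sum(arr, axis=1)   # axis 1 (length 3) is removed
print(col_sums.shape)  # (3,)
print(row_sums.shape)  # (2,)

# keepdims=True keeps the summed axis as a length-1 dimension
row_sums_2d = np.sum(arr, axis=1, keepdims=True)
print(row_sums_2d.shape)  # (2, 1)
```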

### Basic Addition

In [8]:
n5 = np.array([10,20,30])
n5 = n5+1
n5

array([11, 21, 31])

### Basic Subtraction 

In [10]:
n6 = np.array([10,20,30])
n6 = n6-1
n6

array([ 9, 19, 29])

### Basic Multiplication

In [11]:
n7 = np.array([10,20,30])
n7 = n7*2
n7

array([20, 40, 60])

### Basic Division

In [12]:
n8 = np.array([10,20,30])
n8 = n8/2
n8

array([ 5., 10., 15.])
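
In the four cells above, the scalar is applied to every element of the array. The same operators also work element by element between two arrays of the same shape. A short sketch with illustrative arrays `x` and `y`, which also shows why the division result above contains floats:

```python
import numpy as np

x = np.array([10, 20, 30])
y = np.array([1, 2, 3])

# Array-with-array arithmetic pairs up elements by position
print(x + y)   # [11 22 33]
print(x * y)   # [10 40 90]

# True division (/) produces floats; floor division (//) keeps integers
print((x / 2).dtype)   # float64
print(x // 2)          # [ 5 10 15]
```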

### Mean (average)
**Explanation**: Sum of all elements divided by the count. For the array below: (10+20+30)/3 = 20.0.

In [13]:
n9 = np.array([10,20,30])
np.mean(n9)

np.float64(20.0)
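
As a sanity check, the mean can be reproduced by hand from the sum and the element count. The 2D array `grid` is an illustrative input showing that `np.mean` accepts the same `axis` argument as `np.sum`:

```python
import numpy as np

n9 = np.array([10, 20, 30])
manual_mean = n9.sum() / n9.size
print(manual_mean)   # 20.0
print(np.mean(n9))   # 20.0

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
print(np.mean(grid, axis=0))  # [2.5 3.5 4.5]  (mean of each column)
print(np.mean(grid, axis=1))  # [2. 5.]        (mean of each row)
```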

### Standard Deviation (spread of data)
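**Explanation**: Measures how spread out the values are from the mean; a higher value means more spread. By default, `np.std` computes the population standard deviation (it divides by the number of elements, n).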

In [14]:
n10 = np.array([10,20,30])
np.std(n10)

np.float64(8.16496580927726)
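
The value above can be rebuilt step by step: subtract the mean, square the deviations, average them, and take the square root. The sketch also shows the `ddof` parameter of `np.std`; with `ddof=1` the divisor becomes n - 1, which gives the sample standard deviation:

```python
import numpy as np

n10 = np.array([10, 20, 30])

deviations = n10 - n10.mean()                   # [-10.   0.  10.]
manual_std = np.sqrt(np.mean(deviations ** 2))  # sqrt(200 / 3)
print(np.isclose(manual_std, np.std(n10)))      # True

# ddof=1 divides by (n - 1) instead of n
print(np.std(n10, ddof=1))  # 10.0
```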

### Median (middle value)
**Explanation**: The middle value once the data is sorted. For an odd count, it is the center element; for an even count, it is the average of the two middle values.

In [15]:
n11 = np.array([10,20,30])
np.median(n11)

np.float64(20.0)
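
A few more cases, using made-up arrays: the input does not need to be sorted, an even count averages the two middle values, and a single extreme value moves the mean far more than the median:

```python
import numpy as np

unsorted_odd = np.array([30, 10, 20])
print(np.median(unsorted_odd))   # 20.0

even_count = np.array([10, 20, 30, 40])
print(np.median(even_count))     # 25.0  (average of 20 and 30)

# One extreme value shifts the mean much more than the median
with_outlier = np.array([10, 20, 30, 1000])
print(np.mean(with_outlier))     # 265.0
print(np.median(with_outlier))   # 25.0
```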