# Complete Summary: Internship Day 2 - NumPy and Matplotlib

This notebook summarizes the topics covered during Internship Day 2.

## Table of Contents
1. **NumPy Fundamentals**
2. **NumPy Operations**
3. **Matplotlib Module**
4. **Complete Integration Example**
5. **Summary of Key Concepts**
6. **Practical Applications & Next Steps**

---
# Part 1: NumPy Fundamentals

## What is NumPy?
**NumPy** is the fundamental package for scientific computing with Python.
- Features a powerful **N-dimensional Array Object** called `ndarray`
- More **memory efficient** than Python lists for large numeric data
- Provides **routines to manipulate arrays**
- Foundation for most scientific Python packages

## Key Advantages:
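- **Speed**: Core operations run in compiled C code rather than in the Python interpreter
- **Memory**: Elements are stored as fixed-size values in one contiguous buffer
- **Broadcasting**: Operations on arrays of different but compatible shapes
- **Vectorization**: Apply an operation to an entire array without an explicit Python loop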
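
The last two advantages are easiest to see side by side. The following is a minimal sketch; the array names and sample values are made up for illustration and do not come from the session material.

```python
import numpy as np

# Vectorization: one expression replaces an explicit Python loop.
prices = np.array([10.0, 20.0, 30.0])   # hypothetical sample values
scaled = prices * 1.5                    # applied to every element at once

# Broadcasting: a (2, 3) array combined with a (3,) array.
# The 1-D array is applied to each row without being copied.
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
offsets = np.array([10, 20, 30])
shifted = grid + offsets                 # result has shape (2, 3)

print(scaled)
print(shifted)
```

Broadcasting works here because the trailing dimension of both arrays is 3. Shapes that cannot be aligned this way raise a `ValueError`.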

In [None]:
# Import necessary libraries
import numpy as np
import sys

print("NumPy Fundamentals - Array Creation and Properties")
print("=" * 50)

# Create arrays with different data types
x_int16 = np.array([1, 2, 3], np.int16)
x_float = np.array([1.1, 2.2, 3.3], float)
x_empty = np.array([], np.int16)

print("\n1. Array Creation:")
print(f"Int16 array: {x_int16}")
print(f"Float array: {x_float}")
print(f"Empty array: {x_empty}")

# Array properties
print("\n2. Array Properties (Int16 array):")
print(f"Size (number of elements): {x_int16.size}")
print(f"Item size (bytes per element): {x_int16.itemsize}")
print(f"Total bytes consumed: {x_int16.nbytes}")
print(f"System size (total object): {sys.getsizeof(x_int16)}")

print("\n3. Array Properties (Float array):")
print(f"Size: {x_float.size}")
print(f"Item size: {x_float.itemsize} bytes")
print(f"Total bytes: {x_float.nbytes}")
print(f"System size: {sys.getsizeof(x_float)}")

# Indexing examples
print("\n4. Array Indexing:")
print(f"First element x_int16[0]: {x_int16[0]}")
print(f"Last element x_int16[-1]: {x_int16[-1]}")

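print("\n5. Memory Comparison:")
python_list = [1, 2, 3]
# sys.getsizeof(list) counts only the list's pointer table, so add the int objects it refers to
list_bytes = sys.getsizeof(python_list) + sum(sys.getsizeof(item) for item in python_list)
print(f"Python list size (with elements): {list_bytes} bytes")
print(f"NumPy array size (with header): {sys.getsizeof(x_int16)} bytes")
print(f"NumPy data buffer only: {x_int16.nbytes} bytes")
# For three elements the array's fixed header dominates; the savings grow with array size
print(f"Difference (list - array): {list_bytes - sys.getsizeof(x_int16)} bytes")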

---
# Part 2: NumPy Operations

## Advanced Array Operations

### Key Operations Covered:
1. **Finding the Indices of Maximum/Minimum Elements** - `argmax()`, `argmin()`
2. **Sorting Arrays** - `sort()`, `argsort()`
3. **Statistical Operations** - `mean()`, `sum()`, `std()`
4. **Matrix Operations** - `dot()`, matrix multiplication
5. **Array Manipulation** - `vstack()`, `column_stack()`, `flipud()`

### Axis Parameter:
- `axis=None`: Operates on the entire array (flattened)
- `axis=0`: Operates down the rows (vertically), giving one result per column
- `axis=1`: Operates across the columns (horizontally), giving one result per row
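
A small sketch of how the `axis` argument changes the result shape, using a hypothetical 2x3 array and `sum()` as the reduction:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])      # shape (2, 3): 2 rows, 3 columns

total = m.sum()                # axis=None: every element, scalar result
per_column = m.sum(axis=0)     # collapses the 2 rows, shape (3,)
per_row = m.sum(axis=1)        # collapses the 3 columns, shape (2,)

print(total)       # 21
print(per_column)  # [5 7 9]
print(per_row)     # [ 6 15]
```

A useful way to remember this: the axis you name is the one that disappears from the result's shape.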

In [None]:
print("NumPy Operations - Maximum and Minimum Elements")
print("=" * 50)

# Create a 5x4 array for demonstration
array_2d = np.arange(20).reshape(5, 4)
print("\n2D Array (5x4):")
print(array_2d)

print("\n1. argmax() and argmin() - Finding indices:")
print(f"Max element index (entire array): {np.argmax(array_2d)}")
print(f"Min element index (entire array): {np.argmin(array_2d)}")

print(f"\nMax indices by row (axis=1): {np.argmax(array_2d, axis=1)}")
print(f"Min indices by row (axis=1): {np.argmin(array_2d, axis=1)}")

print(f"\nMax indices by column (axis=0): {np.argmax(array_2d, axis=0)}")
print(f"Min indices by column (axis=0): {np.argmin(array_2d, axis=0)}")

# Example with custom array
print("\n\n2. Custom Array Example:")
custom_array = np.array([
    [3, 7, 1],
    [10, 3, 2],
    [5, 6, 7]
])
print("Custom array:")
print(custom_array)
print(f"Max indices by row: {np.argmax(custom_array, axis=1)}")
print(f"Min indices by row: {np.argmin(custom_array, axis=1)}")

In [None]:
print("NumPy Operations - Sorting")
print("=" * 30)

# Sorting operations
array_3x3 = np.array([
    [3, 7, 1],
    [10, 3, 2],
    [5, 6, 7]
])

print("\nOriginal 3x3 array:")
print(array_3x3)

print("\n1. np.sort() - Returns sorted copy:")
print(f"Sorted (entire array): {np.sort(array_3x3, axis=None)}")
print("\nSorted along rows (axis=1):")
print(np.sort(array_3x3, axis=1))
print("\nSorted along columns (axis=0):")
print(np.sort(array_3x3, axis=0))

# argsort example
print("\n\n2. np.argsort() - Returns indices for sorting:")
array_1d = np.array([28, 13, 45, 12, 4, 8, 0])
print(f"Original 1D array: {array_1d}")
print(f"Indices for sorting: {np.argsort(array_1d)}")
print(f"Sorted using indices: {array_1d[np.argsort(array_1d)]}")

In [None]:
print("NumPy Operations - Statistical Functions")
print("=" * 40)

# Statistical operations
data = np.array([3, 2, 8, 9])
print(f"\nData array: {data}")
print(f"Mean: {np.mean(data)}")
print(f"Sum: {np.sum(data)}")
print(f"Standard deviation: {np.std(data):.2f}")
print(f"Maximum value: {np.max(data)}")
print(f"Minimum value: {np.min(data)}")

# 2D statistical operations
data_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"\n\n2D Data array:")
print(data_2d)
print(f"Overall mean: {np.mean(data_2d)}")
print(f"Mean by rows: {np.mean(data_2d, axis=1)}")
print(f"Mean by columns: {np.mean(data_2d, axis=0)}")

In [None]:
print("NumPy Operations - Matrix Operations")
print("=" * 35)

# Matrix multiplication using np.dot()
matrix1 = np.array([
    [3, 4, 2],
    [5, 1, 8],
    [3, 1, 9]
])

matrix2 = np.array([
    [3, 7, 5],
    [2, 9, 8],
    [1, 5, 8]
])

print("\nMatrix 1:")
print(matrix1)
print("\nMatrix 2:")
print(matrix2)

result = np.dot(matrix1, matrix2)
print("\nMatrix multiplication result (np.dot):")
print(result)

# Another example
print("\n" + "="*40)
matrix_a = np.array([[3, 4, 2], [10, 3, 2], [5, 6, 7]])
matrix_b = np.array([[3, 7, 1], [2, 3, 2], [1, 6, 7]])

print("\nMatrix A:")
print(matrix_a)
print("\nMatrix B:")
print(matrix_b)

result2 = np.dot(matrix_a, matrix_b)
print("\nA × B result:")
print(result2)

In [None]:
print("NumPy Operations - Array Manipulation")
print("=" * 38)

# Array manipulation operations
base_array = np.array([
    [3, 2, 8],
    [4, 12, 34],
    [23, 12, 67]
])

print("\nOriginal array:")
print(base_array)

# Adding rows using vstack
new_row = np.array([2, 1, 8])
array_with_row = np.vstack((base_array, new_row))
print("\n1. Adding row using np.vstack():")
print(array_with_row)

# Adding columns using column_stack
new_column = np.array([2, 1, 8])
array_with_col = np.column_stack((base_array, new_column))
print("\n2. Adding column using np.column_stack():")
print(array_with_col)

# Reversing arrays
print("\n3. Reversing Arrays:")
array_1d = np.array([3, 6, 7, 2, 5, 1, 8])
print(f"Original 1D array: {array_1d}")
print(f"Reversed (flipud): {np.flipud(array_1d)}")

# Reversing 2D array
array_2d_demo = np.array([[1, 2, 3, 8], [4, 5, 6, 10], [7, 8, 9, 15], [34, 65, 8, 90]])
print("\nOriginal 2D array:")
print(array_2d_demo)
print("\nReversed 2D array (flipud):")
print(np.flipud(array_2d_demo))

---
# Part 3: Matplotlib Module

## What is Matplotlib?
**Matplotlib** is a Python 2D plotting library that generates:
- **Line plots** and **bar charts**
- **Histograms**
- **Power spectra**
- **Error charts**
- **Scatter plots**

## Key Features:
- **Few lines of code** for complex visualizations
- **Customizable** colors, markers, and styles
- **Publication-quality** figures
- **Interactive** capabilities

## Magic Function:
`%matplotlib inline` renders figures directly below the code cell in a Jupyter notebook, instead of showing only the text representation of the figure object.

In [None]:
# Matplotlib setup and basic plotting
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

print("Matplotlib Basics - Setup and Data Creation")
print("=" * 45)

# Create data
x = np.arange(10)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = x              # Same as x

print(f"\nX values: {x}")
print(f"Y values: {y}")

print("\nBasic Plot Function:")
print("matplotlib.pyplot.plot(*args, scalex=True, scaley=True, data=None, **kwargs)")
print("\nExamples:")
print("plot(x, y)         # default line style and color")
print("plot(x, y, 'bo')   # blue circle markers")
print("plot(y)            # y using x as index array 0..N-1")
print("plot(y, 'r+')      # red plusses")

In [None]:
# Basic line plot
plt.figure(figsize=(10, 6))
plt.plot(x, y, label='y = x')
plt.xlabel('X values')
plt.ylabel('Y values')
plt.title('Basic Line Plot')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

print("Basic line plot created above!")

In [None]:
print("Matplotlib - Colors and Styles")
print("=" * 32)

print("\nColor Codes:")
print("m = magenta, b = blue, g = green, y = yellow, r = red")
print("\nMarker Styles:")
print("'o' = circles, '*' = stars, '+' = plus signs, 's' = squares")
print("\nLine Styles:")
print("'-' = solid, '--' = dashed, ':' = dotted, '-.' = dash-dot")

# Advanced plotting with multiple styles
plt.figure(figsize=(12, 8))

# Multiple plots with different styles (from the original notebooks)
plt.plot(x, y, '--oy', label='Dashed line with yellow circles', linewidth=2)
plt.plot(x, -y, '*b-', label='Solid line with blue stars', linewidth=2)

# Additional style examples
plt.plot(x, y + 2, 'r:', label='Red dotted line', linewidth=2)
plt.plot(x, y - 2, 'g-.s', label='Green dash-dot with squares', linewidth=2)

# Styling
plt.xlabel('X values', fontsize=12)
plt.ylabel('Y values', fontsize=12)
plt.title('Multiple Plot Styles - Matplotlib Demo', fontsize=14, fontweight='bold')
plt.legend(fontsize=10)
plt.grid(True, alpha=0.3)
plt.show()

print("\nMultiple style demonstration completed!")

In [None]:
# More advanced matplotlib examples
print("Advanced Matplotlib Examples")
print("=" * 30)

# Create subplots
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(12, 10))

# Plot 1: Basic line plot
ax1.plot(x, y, 'b-', linewidth=2)
ax1.set_title('Linear Function')
ax1.set_xlabel('X')
ax1.set_ylabel('Y')
ax1.grid(True)

# Plot 2: Quadratic function
y_quad = x**2
ax2.plot(x, y_quad, 'r-', linewidth=2)
ax2.set_title('Quadratic Function')
ax2.set_xlabel('X')
ax2.set_ylabel('Y = X²')
ax2.grid(True)

# Plot 3: Scatter plot
ax3.scatter(x, y + np.random.randn(10), c='green', alpha=0.7)
ax3.set_title('Scatter Plot with Noise')
ax3.set_xlabel('X')
ax3.set_ylabel('Y + noise')
ax3.grid(True)

# Plot 4: Multiple functions
ax4.plot(x, np.sin(x), color='purple', label='sin(x)', linewidth=2)
ax4.plot(x, np.cos(x), color='orange', label='cos(x)', linewidth=2)
ax4.set_title('Trigonometric Functions')
ax4.set_xlabel('X')
ax4.set_ylabel('Y')
ax4.legend()
ax4.grid(True)

plt.tight_layout()
plt.show()

print("\nAdvanced subplot demonstration completed!")

---
# Part 4: Complete Integration Example

## Combining NumPy and Matplotlib

This section demonstrates how NumPy and Matplotlib work together for data analysis and visualization.

In [None]:
print("Complete Integration: NumPy + Matplotlib")
print("=" * 42)

# Generate sample data using NumPy
np.random.seed(42)  # For reproducible results
data_size = 100

# Create different types of data
x_data = np.linspace(0, 4*np.pi, data_size)
y_sine = np.sin(x_data) + 0.1 * np.random.randn(data_size)
y_cosine = np.cos(x_data) + 0.1 * np.random.randn(data_size)
y_trend = 0.1 * x_data + np.random.randn(data_size) * 0.2

print(f"Generated {data_size} data points")
print(f"X range: {x_data.min():.2f} to {x_data.max():.2f}")
print(f"Sine wave mean: {np.mean(y_sine):.3f}, std: {np.std(y_sine):.3f}")
print(f"Cosine wave mean: {np.mean(y_cosine):.3f}, std: {np.std(y_cosine):.3f}")
print(f"Trend line mean: {np.mean(y_trend):.3f}, std: {np.std(y_trend):.3f}")

# Create comprehensive visualization
plt.figure(figsize=(15, 10))

# Main plot
plt.subplot(2, 2, 1)
plt.plot(x_data, y_sine, 'b-', alpha=0.7, label='Noisy Sine')
plt.plot(x_data, y_cosine, 'r-', alpha=0.7, label='Noisy Cosine')
plt.plot(x_data, y_trend, 'g-', alpha=0.7, label='Linear Trend')
plt.title('Combined Data Visualization')
plt.xlabel('X values')
plt.ylabel('Y values')
plt.legend()
plt.grid(True, alpha=0.3)

# Histogram
plt.subplot(2, 2, 2)
plt.hist(y_sine, bins=20, alpha=0.7, color='blue', label='Sine')
plt.hist(y_cosine, bins=20, alpha=0.7, color='red', label='Cosine')
plt.title('Distribution of Y values')
plt.xlabel('Y values')
plt.ylabel('Frequency')
plt.legend()

# Scatter plot
plt.subplot(2, 2, 3)
plt.scatter(y_sine, y_cosine, alpha=0.6, c=x_data, cmap='viridis')
plt.colorbar(label='X values')
plt.title('Sine vs Cosine Scatter')
plt.xlabel('Sine values')
plt.ylabel('Cosine values')

# Statistical summary
plt.subplot(2, 2, 4)
categories = ['Sine', 'Cosine', 'Trend']
means = [np.mean(y_sine), np.mean(y_cosine), np.mean(y_trend)]
stds = [np.std(y_sine), np.std(y_cosine), np.std(y_trend)]

x_pos = np.arange(len(categories))
plt.bar(x_pos, means, yerr=stds, capsize=5, alpha=0.7, color=['blue', 'red', 'green'])
plt.title('Mean Values with Standard Deviation')
plt.xlabel('Data Series')
plt.ylabel('Mean Value')
plt.xticks(x_pos, categories)

plt.tight_layout()
plt.show()

print("\nIntegration example completed - NumPy data analysis + Matplotlib visualization!")

---
# Part 5: Summary of Key Concepts

## NumPy Benefits & Features

### 1. Memory Efficiency
- **NumPy arrays** are more memory-efficient than Python lists
- **Fixed data types** reduce memory overhead
- **Contiguous memory** layout improves performance

### 2. Performance
- **C implementation** for core operations
- **Vectorized operations** avoid Python loops
- **Broadcasting** enables operations on different shaped arrays

### 3. Functionality
- **Rich mathematical functions**
- **Linear algebra operations**
- **Statistical functions**
- **Array manipulation tools**

In [None]:
print("Memory and Performance Comparison")
print("=" * 35)

import sys
import time

# Memory comparison
size = 1000
python_list = list(range(size))
numpy_array = np.arange(size)

# sys.getsizeof(list) excludes the int objects the list points to, so add them explicitly
list_bytes = sys.getsizeof(python_list) + sum(sys.getsizeof(item) for item in python_list)
array_bytes = sys.getsizeof(numpy_array)  # array header plus data buffer

print(f"\nMemory Usage Comparison (size={size}):")
print(f"Python list (with elements): {list_bytes:,} bytes")
print(f"NumPy array: {array_bytes:,} bytes")
print(f"Memory saved: {list_bytes - array_bytes:,} bytes")
print(f"Efficiency gain: {list_bytes / array_bytes:.2f}x")

# Performance comparison (single-run timings; results vary between runs and machines)
print("\nPerformance Comparison:")
large_size = 100000

# Python list operation
python_list_large = list(range(large_size))
start_time = time.perf_counter()  # perf_counter has higher resolution than time.time
python_result = [x * 2 for x in python_list_large]
python_time = time.perf_counter() - start_time

# NumPy array operation
numpy_array_large = np.arange(large_size)
start_time = time.perf_counter()
numpy_result = numpy_array_large * 2
numpy_time = time.perf_counter() - start_time

print(f"Python list multiplication time: {python_time:.6f} seconds")
print(f"NumPy array multiplication time: {numpy_time:.6f} seconds")
print(f"NumPy speedup: {python_time / numpy_time:.2f}x faster")

## Essential Function Reference

### NumPy Core Functions

| Category | Function | Description |
|----------|----------|-------------|
| **Array Creation** | `np.array()` | Create arrays from lists |
| | `np.arange()` | Create range arrays |
| | `np.linspace()` | Create evenly spaced arrays |
| | `np.zeros()`, `np.ones()` | Create arrays filled with 0s or 1s |
| **Finding Elements** | `np.argmax()` | Indices of maximum elements |
| | `np.argmin()` | Indices of minimum elements |
| | `np.where()` | Find elements meeting condition |
| **Sorting** | `np.sort()` | Return sorted copy |
| | `np.argsort()` | Indices that would sort array |
| **Statistics** | `np.mean()` | Calculate mean |
| | `np.std()` | Standard deviation |
| | `np.sum()` | Sum of elements |
| | `np.min()`, `np.max()` | Minimum/maximum values |
| **Linear Algebra** | `np.dot()` | Dot product (matrix multiplication for 2-D arrays) |
| | `np.transpose()` | Transpose arrays |
| **Array Manipulation** | `np.vstack()` | Stack arrays vertically (as rows) |
| | `np.column_stack()` | Stack 1-D arrays as columns of a 2-D array |
| | `np.flipud()` | Reverse the order of rows (flip along axis 0) |
| | `np.reshape()` | Change array shape |
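
Several functions in the table were not exercised in the earlier cells. The following is a brief sketch of `np.linspace()`, `np.zeros()`/`np.ones()`, `np.where()`, and `np.transpose()`, using made-up sample values:

```python
import numpy as np

# Evenly spaced values: 5 points from 0 to 1, endpoints included.
points = np.linspace(0, 1, 5)

# Pre-filled arrays of a given shape.
zeros = np.zeros((2, 3))
ones = np.ones(4)

# np.where with only a condition returns a tuple of index arrays.
values = np.array([3, 7, 1, 10, 5])
big_idx = np.where(values > 4)[0]           # indices 1, 3, 4

# With three arguments it chooses element-wise between two alternatives.
kept = np.where(values > 4, values, 0)      # small values replaced by 0

# Transpose swaps rows and columns: shape (2, 3) becomes (3, 2).
table = np.arange(6).reshape(2, 3)
swapped = np.transpose(table)               # same result as table.T

print(points, big_idx, kept, swapped.shape)
```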

### Matplotlib Essential Functions

| Category | Function | Description |
|----------|----------|-------------|
| **Setup** | `%matplotlib inline` | Enable inline plotting |
| | `import matplotlib.pyplot as plt` | Import plotting module |
| **Basic Plotting** | `plt.plot()` | Create line plots |
| | `plt.scatter()` | Create scatter plots |
| | `plt.bar()` | Create bar charts |
| | `plt.hist()` | Create histograms |
| **Customization** | `plt.xlabel()`, `plt.ylabel()` | Add axis labels |
| | `plt.title()` | Add plot title |
| | `plt.legend()` | Add legend |
| | `plt.grid()` | Add grid lines |
| **Display** | `plt.show()` | Display plots |
| | `plt.figure()` | Create new figure |
| | `plt.subplot()` | Create subplots |

## Data Types and Memory Usage

### NumPy Data Types
- **`np.int16`**: 2 bytes per element (16-bit integers)
- **`np.int32`**: 4 bytes per element (32-bit integers)
- **`np.int64`**: 8 bytes per element (64-bit integers)
- **`float`**: 8 bytes per element (64-bit floating point)
- **`np.float32`**: 4 bytes per element (32-bit floating point)
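
To illustrate how the choice of data type affects memory, here is a small sketch that converts the same 1000 values to narrower types. The variable names are illustrative, and the default integer type of `np.arange()` depends on the platform.

```python
import numpy as np

readings = np.arange(1000)                 # default integer dtype is platform-dependent
small = readings.astype(np.int16)          # values 0..999 fit in 16 bits
as_float32 = readings.astype(np.float32)

print(small.itemsize, small.nbytes)            # 2 bytes each, 2000 in total
print(as_float32.itemsize, as_float32.nbytes)  # 4 bytes each, 4000 in total

# np.iinfo reports the representable range, which helps when checking a downcast.
limits = np.iinfo(np.int16)
print(limits.min, limits.max)                  # -32768 32767
```

Converting to a narrower type is only safe when every value fits in its range; values outside the range do not convert correctly.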

### Array Properties
- **`.size`**: Number of elements
- **`.itemsize`**: Bytes per element
- **`.nbytes`**: Total bytes consumed by elements
- **`.shape`**: Dimensions of array
- **`.dtype`**: Data type of elements
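
As a quick check of how these properties relate, the sketch below inspects a hypothetical 3x4 `float32` array. The key relationship is `nbytes == size * itemsize`.

```python
import numpy as np

a = np.zeros((3, 4), dtype=np.float32)

print(a.shape)     # (3, 4)
print(a.dtype)     # float32
print(a.size)      # 12
print(a.itemsize)  # 4
print(a.nbytes)    # 48, equal to size * itemsize
```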

### Axis Parameter Guide
- **`axis=None`**: Operation on flattened array
- **`axis=0`**: Operation runs down the rows, giving one result per column
- **`axis=1`**: Operation runs across the columns, giving one result per row

### Matplotlib Style Codes

#### Colors
- **`'b'`**: blue
- **`'g'`**: green  
- **`'r'`**: red
- **`'c'`**: cyan
- **`'m'`**: magenta
- **`'y'`**: yellow
- **`'k'`**: black

#### Markers
- **`'o'`**: circles
- **`'*'`**: stars
- **`'+'`**: plus signs
- **`'s'`**: squares
- **`'^'`**: triangles

#### Line Styles
- **`'-'`**: solid line
- **`'--'`**: dashed line
- **`':'`**: dotted line
- **`'-.'`**: dash-dot line

---
# Practical Applications & Next Steps

## Real-World Applications

### NumPy Applications
1. **Scientific Computing**: Mathematical modeling, simulations
2. **Data Analysis**: Statistical analysis, data preprocessing
3. **Image Processing**: Image manipulation, computer vision
4. **Machine Learning**: Feature engineering, data preparation
5. **Signal Processing**: Audio/video processing, filtering
6. **Financial Analysis**: Risk modeling, portfolio optimization

### Matplotlib Applications
1. **Data Visualization**: Exploratory data analysis
2. **Scientific Plotting**: Research publications, reports
3. **Business Analytics**: Dashboard creation, KPI visualization
4. **Educational Content**: Teaching materials, presentations
5. **Web Development**: Interactive plots for web applications

## Learning Path & Next Steps

### Immediate Next Steps
1. **Practice with Real Data**: Download datasets from Kaggle, UCI ML Repository
2. **Explore Advanced NumPy**: Broadcasting, advanced indexing, structured arrays
3. **Learn Pandas**: Data manipulation and analysis library built on NumPy
4. **Advanced Matplotlib**: 3D plotting, animations, interactive plots

### Intermediate Goals
1. **SciPy**: Scientific computing (optimization, integration, interpolation)
2. **Seaborn**: Statistical data visualization
3. **Plotly**: Interactive web-based visualizations
4. **Machine Learning**: scikit-learn, TensorFlow, PyTorch

### Advanced Applications
1. **Computer Vision**: OpenCV, image processing
2. **Deep Learning**: Neural networks, AI applications
3. **Big Data**: Dask, distributed computing
4. **Web Development**: Flask/Django with data visualization

## Best Practices

### NumPy Best Practices
- **Use vectorized operations** instead of loops
- **Choose appropriate data types** to save memory
- **Leverage broadcasting** for efficient computations
- **Use views vs copies** wisely for memory efficiency
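
The views-versus-copies point is a common source of surprises, so here is a minimal sketch. It relies on basic slicing returning a view and `.copy()` returning independent data. The array contents are arbitrary.

```python
import numpy as np

original = np.arange(6)          # [0 1 2 3 4 5]

view = original[1:4]             # basic slicing returns a view
view[0] = 99                     # writes through to original[1]

copy = original[1:4].copy()      # explicit, independent copy
copy[1] = -1                     # original is unaffected

print(original)                          # [ 0 99  2  3  4  5]
print(np.shares_memory(original, view))  # True
print(np.shares_memory(original, copy))  # False
```

Views avoid duplicating data, which saves memory, but any in-place change to a view also changes the array it came from.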

### Matplotlib Best Practices
- **Always add labels and titles** for clarity
- **Use appropriate plot types** for your data
- **Consider colorblind-friendly palettes**
- **Keep plots simple and readable**
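
One possible way to apply these practices together is sketched below. It assumes the installed Matplotlib version ships the built-in `tableau-colorblind10` style sheet, and the data and labels are illustrative only.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)

# Apply the colorblind-friendly style only inside this block.
with plt.style.context("tableau-colorblind10"):
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot(x, np.sin(x), "-", label="sin(x)")
    ax.plot(x, np.cos(x), "--", label="cos(x)")  # line style also separates the series
    ax.set_xlabel("x (radians)")
    ax.set_ylabel("value")
    ax.set_title("Labeled plot with a colorblind-friendly palette")
    ax.legend()
    fig.tight_layout()

# In a script, call plt.show() here; with %matplotlib inline the figure appears automatically.
```

Using different line styles as well as different colors keeps the series distinguishable in grayscale printouts.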

## Conclusion

This comprehensive notebook covered:
- ✅ **NumPy fundamentals** - array creation, properties, memory efficiency
- ✅ **NumPy operations** - mathematical, statistical, and manipulation functions
- ✅ **Matplotlib basics** - plotting, styling, and visualization techniques
- ✅ **Integration examples** - combining NumPy and Matplotlib for data analysis
- ✅ **Performance comparisons** - demonstrating NumPy's advantages
- ✅ **Practical applications** - real-world use cases and next steps

**Happy coding and data analyzing!** 🚀📊