# NumPy Examples with Visualizations

## Example 1: Random Data and Statistics

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# Example 1: Normal distribution
data = np.random.normal(loc=50, scale=10, size=1000)

mean = np.mean(data)
std = np.std(data)

print("Mean:", mean)
print("Standard Deviation:", std)

# Plotting the histogram
plt.hist(data, bins=30, edgecolor='black')
plt.title("Example 1: Histogram of Normal Distribution")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()


**Explanation**
- Imports NumPy (`np`) for numerical operations and Matplotlib (`plt`) for plotting.
- Generates 1,000 random numbers from a normal distribution with mean 50 and standard deviation 10.
- Calculates and prints the mean and standard deviation of the generated data.
- Plots a histogram to visualize the distribution of the random data.
- The histogram shows how the data is distributed around the mean, with most values clustering near 50.
- Useful for understanding statistical properties and visualizing random data.

The call `np.random.normal(loc=50, scale=10, size=1000)` returns an array of 1,000 random numbers drawn from a normal (Gaussian) distribution. A normal distribution is a continuous probability distribution that is symmetric around its mean, so values near the mean occur more often than values far from it. Its density has the familiar bell shape, often called the "bell curve" or "Gaussian curve."

Key properties:

- The mean, median, and mode are all equal and located at the center of the distribution.
- The curve is symmetric about the mean.
- The spread of the distribution is determined by the standard deviation: a larger standard deviation means a wider curve.
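
These properties can be checked empirically. The following is a minimal sketch, not part of the original notebook: it assumes NumPy's `default_rng` generator with an arbitrarily chosen seed and a larger sample (100,000 values) so that the estimates are stable.

```python
import numpy as np

# Seed chosen arbitrarily so the sketch is repeatable (an assumption, not from the notebook).
rng = np.random.default_rng(seed=0)
sample = rng.normal(loc=50, scale=10, size=100_000)

mean = sample.mean()
median = np.median(sample)

# Fraction of samples within one standard deviation (10) of the mean (50).
within_one_std = np.mean(np.abs(sample - 50) <= 10)

print(f"mean   = {mean:.2f}")
print(f"median = {median:.2f}")   # close to the mean because the distribution is symmetric
print(f"share within one std = {within_one_std:.3f}")  # roughly 0.68 for a normal distribution
```

For a symmetric distribution the sample mean and median should nearly coincide, and for a normal distribution roughly 68% of values fall within one standard deviation of the mean.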

`np.random.normal` takes three main arguments:

- `loc=50`: This sets the mean (center) of the distribution to 50.
- `scale=10`: This sets the standard deviation (spread) of the distribution to 10.
- `size=1000`: This specifies that 1,000 random samples should be generated.

The result is a NumPy array whose values cluster around 50, with the spread set by the standard deviation. Because the samples are random and no seed is set, the printed mean and standard deviation will be close to 50 and 10 but will differ slightly on every run. This function is commonly used for simulating data, statistical modeling, or visualizing distributions. A common 'gotcha' is forgetting that `scale` is the standard deviation, not the variance: larger values of `scale` produce a wider spread around the mean.
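
To see the effect of `scale` directly, the sketch below draws two samples with the same mean but different standard deviations. It is an illustration under assumptions: the seeded `default_rng` generator and the names `narrow` and `wide` are not from the notebook.

```python
import numpy as np

# Seeded generator so repeated runs give the same numbers (seed value is arbitrary).
rng = np.random.default_rng(seed=42)

narrow = rng.normal(loc=50, scale=2, size=10_000)   # tightly clustered around 50
wide = rng.normal(loc=50, scale=20, size=10_000)    # same center, much wider spread

for name, arr in [("narrow", narrow), ("wide", wide)]:
    print(f"{name:>6}: mean={arr.mean():6.2f}  std={arr.std():6.2f}  "
          f"min={arr.min():7.2f}  max={arr.max():7.2f}")

# np.std defaults to the population formula (ddof=0);
# pass ddof=1 for the sample standard deviation, which is slightly larger.
sample_std = wide.std(ddof=1)
print("wide, sample std (ddof=1):", round(sample_std, 2))
```

Both samples center on 50, but the range of `wide` is roughly ten times that of `narrow`.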

## Example 2: Simulating Dice Rolls

In [None]:
# Example 2: Dice roll simulation
rolls = np.random.randint(1, 7, size=10000)
counts = np.bincount(rolls)[1:]  # Skip index 0
faces = np.arange(1, 7)

print("Dice face counts:", counts)

# Plotting bar chart
plt.bar(faces, counts)
plt.title("Example 2: Dice Roll Frequencies")
plt.xlabel("Dice Face")
plt.ylabel("Count")
plt.show()


**Explanation**
- Simulates 10,000 dice rolls using NumPy's `randint` to generate random integers from 1 to 6, inclusive.
- Uses `np.bincount` to count occurrences of each dice face (skipping index 0).
- Creates an array `faces` representing dice faces 1 through 6.
- Prints the count of each dice face to show the distribution.
- Plots a bar chart to visualize the frequency of each dice face.
- Demonstrates how random sampling can be used to simulate real-world experiments and visualize results.
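
The counting step can be followed on a small, hand-checkable input. This is a sketch with made-up rolls; it also shows `minlength`, an optional `np.bincount` argument the notebook does not use, which keeps the result at six entries even if the highest face never appears.

```python
import numpy as np

rolls = np.array([1, 3, 3, 6, 2, 3])

raw_counts = np.bincount(rolls)
# index:  0  1  2  3  4  5  6
# count:  0  1  1  3  0  0  1
counts = raw_counts[1:]          # drop index 0, since a die has no face 0
print(counts)                    # [1 1 3 0 0 1]

# If face 6 is never rolled, bincount stops at the largest value seen.
short_rolls = np.array([1, 2, 2, 5])
unsafe = np.bincount(short_rolls)[1:]             # only 5 entries
safe = np.bincount(short_rolls, minlength=7)[1:]  # always 6 entries
print(len(unsafe), len(safe))    # 5 6
```

With 10,000 rolls a missing face is practically impossible, but with small samples `minlength=7` avoids a length mismatch between `faces` and `counts` when plotting.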

The code `np.random.randint(1, 7, size=10000)` generates an array of 10,000 random integers using NumPy's randint function.

The first argument, 1, is the inclusive lower bound, and the second argument, 7, is the exclusive upper bound. This means each integer in the output will be between 1 and 6, inclusive.

The `size=10000` parameter specifies that the function should return an array containing 10,000 such random numbers.

This function is commonly used to simulate random events with discrete outcomes, such as rolling a six-sided die. Each element of the returned array is drawn independently, so the array represents 10,000 independent rolls of a die. In the bar chart, you should see roughly equal counts for each face from 1 to 6 (about 10,000 / 6 ≈ 1,667 each), reflecting the discrete uniform distribution of outcomes.

A potential 'gotcha' is that the `high` parameter is exclusive, so 7 never appears in the output. This is a common source of off-by-one errors for those new to NumPy, especially because the standard library's `random.randint(a, b)` includes both endpoints.
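
The bounds can be verified by inspecting the minimum and maximum of the output. The sketch below also shows `Generator.integers` from NumPy's newer random API, which is not used in the notebook; its `endpoint=True` option makes the upper bound inclusive. The seed and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed for repeatability

legacy = np.random.randint(1, 7, size=10_000)            # high=7 is excluded
half_open = rng.integers(1, 7, size=10_000)              # same half-open convention
closed = rng.integers(1, 6, size=10_000, endpoint=True)  # high=6 is included
too_small = rng.integers(1, 6, size=10_000)              # off by one: face 6 never appears

for name, arr in [("legacy", legacy), ("half_open", half_open),
                  ("closed", closed), ("too_small", too_small)]:
    print(f"{name:>9}: min={arr.min()}  max={arr.max()}")
```

The first three arrays span 1 to 6, while `too_small` tops out at 5, which is the off-by-one error described above.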

## Example 3: 2D Grid of sin(x) + cos(y)

In [None]:
# Example 3: 2D function plot
x = np.linspace(0, 2 * np.pi, 100)
y = np.linspace(0, 2 * np.pi, 100)
X, Y = np.meshgrid(x, y)
Z = np.sin(X) + np.cos(Y)

# Plotting contour
contour = plt.contourf(X, Y, Z, levels=50, cmap='viridis')
plt.colorbar(contour)
plt.title("Example 3: Filled Contour of sin(X) + cos(Y)")
plt.xlabel("X")
plt.ylabel("Y")
plt.show()


The line `plt.contourf(X, Y, Z, levels=50, cmap='viridis')` creates a filled contour plot using Matplotlib.

Here, X and Y are 2D arrays of grid coordinates produced by `np.meshgrid`, while Z holds the value of the function at each grid point.

The function visualizes how Z varies across the grid by filling regions between contour lines with colors.

The `levels=50` argument asks for about 50 contour levels. When `levels` is an integer, Matplotlib chooses "nice" level boundaries automatically, so the exact count can differ slightly; the result is a smoother gradient and a more detailed view of the data than the default.

The `cmap='viridis'` parameter sets the color map to 'viridis', a perceptually uniform color map that runs from dark purple through blue and green to yellow, so equal steps in Z look like roughly equal steps in color.

This function is useful for visualizing 2D scalar fields, such as temperature or elevation data, and helps reveal patterns or gradients in the dataset.

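A common 'gotcha' is ensuring that X, Y, and Z have compatible shapes: X and Y must either both be 2D arrays with the same shape as Z, or both be 1D with `len(X)` equal to the number of columns of Z and `len(Y)` equal to the number of rows. Otherwise, the function will raise an error.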

Additionally, choosing an appropriate number of levels and a suitable color map can greatly enhance the readability of the plot.
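
The shape requirement is easier to see on a deliberately small, non-square grid. The following sketch uses 4 points in x and 3 in y (sizes chosen only for illustration) and also checks the result against plain broadcasting, an alternative to `np.meshgrid` that the notebook does not use.

```python
import numpy as np

x = np.linspace(0, 2 * np.pi, 4)   # 4 points along x
y = np.linspace(0, 2 * np.pi, 3)   # 3 points along y

X, Y = np.meshgrid(x, y)           # default indexing='xy'
Z = np.sin(X) + np.cos(Y)

# Rows follow y and columns follow x, so every array has shape (len(y), len(x)).
print(X.shape, Y.shape, Z.shape)   # (3, 4) (3, 4) (3, 4)

# Each row of X repeats x; each column of Y repeats y.
print(np.array_equal(X[0], x), np.array_equal(Y[:, 0], y))   # True True

# The same Z via broadcasting, without building X and Y explicitly.
Z_broadcast = np.sin(x)[np.newaxis, :] + np.cos(y)[:, np.newaxis]
print(np.allclose(Z, Z_broadcast))  # True
```

Because `Z[i, j]` corresponds to `y[i]` and `x[j]`, swapping the roles of x and y is a frequent cause of shape errors on non-square grids.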

## Example 4: Boolean Masking

In [None]:
# Example 4: Filtering with boolean mask
data = np.random.randint(1, 101, size=50)
filtered = data[data > 70]

print("Original:", data)
print("Filtered (values > 70):", filtered)

# Plotting histogram of filtered values
plt.hist(filtered, bins=10, edgecolor='black', color='orange')
plt.title("Example 4: Values > 70")
plt.xlabel("Value")
plt.ylabel("Count")
plt.show()


The line `filtered = data[data > 70]` uses NumPy's Boolean indexing to select only the elements of `data` that are greater than 70. Here, `data` is the NumPy array of 50 random integers between 1 and 100 created on the previous line.

The expression `data > 70` creates a Boolean array of the same shape as data, where each position is True if the corresponding element in data is greater than 70, and False otherwise. When you use this Boolean array to index data (i.e., `data[data > 70]`), NumPy returns a new array containing only the elements where the condition is True.

This technique is called Boolean masking or filtering, and it’s a concise way to extract subsets of data that meet specific criteria. The result, stored in filtered, will be an array of all values from data that are strictly greater than 70. This approach is much more efficient and readable than looping through the array and manually checking each value.
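
The mask and the filtered result can be inspected on a short array with fixed values. This is a minimal sketch with made-up data; the combined condition and the copy check go slightly beyond the notebook's example.

```python
import numpy as np

data = np.array([12, 85, 70, 91, 43, 71, 100, 5])

mask = data > 70
print(mask)        # [False  True False  True False  True  True False]

filtered = data[mask]
print(filtered)    # [ 85  91  71 100]  (70 itself is excluded: the test is strict)

# Combine conditions with & (and) or | (or); each condition needs its own parentheses.
between = data[(data > 40) & (data <= 85)]
print(between)     # [85 70 43 71]

# Counting matches without building the filtered array.
n_above = np.count_nonzero(mask)
print(n_above)     # 4

# Boolean indexing returns a copy, so editing it leaves the original untouched.
filtered[0] = -1
print(data[1])     # 85
```

Using Python's `and` / `or` instead of `&` / `|` on arrays raises a `ValueError`, which is a common stumbling block when combining masks.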

## Example 5: Matrix Multiplication

In [None]:
# Example 5: Matrix multiplication
A = np.random.rand(3, 2)
B = np.random.rand(2, 4)
C = np.dot(A, B)

print("Matrix A:\n", A)
print("Matrix B:\n", B)
print("Product C:\n", C)

# Plotting heatmap
plt.imshow(C, cmap='plasma', aspect='auto')
plt.colorbar()
plt.title("Example 5: Matrix Multiplication Heatmap")
plt.xlabel("Columns in B")
plt.ylabel("Rows in A")
plt.show()


The line `plt.imshow(C, cmap='plasma', aspect='auto')` uses Matplotlib's imshow function to display the contents of the variable C as an image. This is commonly used for visualizing 2D arrays, such as matrices or images, where each value in the array is mapped to a color.

The `cmap='plasma'` argument specifies the colormap to use. The 'plasma' colormap is a perceptually uniform color map that runs from dark blue-purple through magenta and orange to bright yellow, making it useful for highlighting differences in data values.

The `aspect='auto'` argument tells Matplotlib to keep the axes box fixed and stretch or compress the image to fill it. The cells of the 3×4 matrix are therefore generally not square. The default, `aspect='equal'`, draws square cells instead and preserves the data's aspect ratio.

Overall, this line is a concise way to visualize matrix-like data, with color encoding the values and automatic scaling to fit the plot window.
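
The heatmap has one row per row of `A` and one column per column of `B` because of how the product is defined. The sketch below checks this with the same shapes as the example; the seeded generator and the `@` operator are assumptions for illustration, not what the notebook uses.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # arbitrary seed for repeatability
A = rng.random((3, 2))
B = rng.random((2, 4))

C_dot = np.dot(A, B)   # as in the example
C_at = A @ B           # equivalent operator form for 2D arrays

# Inner dimensions (2 and 2) must match; the result takes the outer ones.
print(C_at.shape)                  # (3, 4)
print(np.allclose(C_dot, C_at))    # True

# Entry (i, j) is the dot product of row i of A with column j of B.
i, j = 1, 2
manual = sum(A[i, k] * B[k, j] for k in range(A.shape[1]))
print(np.isclose(manual, C_at[i, j]))   # True
```

Reversing the order (`B @ A`) would fail here, since a (2, 4) array cannot be multiplied by a (3, 2) array.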

A common 'gotcha' is that the default origin is the top-left corner, so row 0 of `C` is drawn at the top and the y-axis increases downward. Pass `origin='lower'` to put row 0 at the bottom instead.
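
The effect of `origin` shows up in the direction of the y-axis. The following sketch compares the two settings side by side. It assumes the non-interactive `Agg` backend so that it runs without a display; in a notebook, drop the `matplotlib.use` line and call `plt.show()` instead of closing the figure.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption; remove in a notebook)
import matplotlib.pyplot as plt
import numpy as np

C = np.arange(12).reshape(3, 4)   # small matrix with easily recognizable values

fig, (ax_upper, ax_lower) = plt.subplots(1, 2, figsize=(8, 3))

ax_upper.imshow(C, cmap="plasma", aspect="auto")   # default origin: row 0 at the top
ax_upper.set_title("origin='upper' (default)")

ax_lower.imshow(C, cmap="plasma", aspect="auto", origin="lower")   # row 0 at the bottom
ax_lower.set_title("origin='lower'")

# get_ylim() returns (bottom, top); with the default origin the axis is inverted.
ylim_upper = ax_upper.get_ylim()
ylim_lower = ax_lower.get_ylim()
print("upper:", ylim_upper)
print("lower:", ylim_lower)

plt.close(fig)
```

With the default setting the bottom limit is larger than the top limit, which is why matrices appear in the same orientation as when printed.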