# 📊 Section 9 - Real-World Application: Analyzing Stock Market Data with NumPy

---

In this section, we’ll apply NumPy to a **real-world-style data problem**: analyzing simulated stock market data.

You’ll learn how NumPy powers workflows in **finance, data analytics, and quantitative research**, enabling fast numerical computations without pandas.

We‚Äôll cover:
- Generating and cleaning synthetic stock data
- Calculating daily returns and moving averages
- Identifying trends and volatility using vectorized logic
- Performing efficient matrix-based portfolio analysis

---

## 🧱 Step 1: Simulating Stock Data

Let’s start by creating synthetic closing prices for multiple stocks over 1 year (252 trading days). NumPy’s random module will help us simulate daily percentage changes (returns).

In [None]:
import numpy as np

np.random.seed(42)
days = 252  # typical trading days per year
stocks = ['AAPL', 'MSFT', 'GOOG', 'AMZN']
n_stocks = len(stocks)

# Simulate daily percentage changes ~ Normal(0.001, 0.02)
daily_returns = np.random.normal(0.001, 0.02, size=(days, n_stocks))

# Starting prices
start_prices = np.array([150, 300, 2800, 3400])

# Compute price series cumulatively
price_series = start_prices * np.cumprod(1 + daily_returns, axis=0)

price_series[:5]  # first 5 days

### 🔍 Observation
Each column represents one stock’s daily closing price across 252 days. Vectorized cumulative multiplication makes this efficient, with no loops needed.
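
To make the "no loops needed" point concrete, here is a minimal sketch that compares the cumulative-product approach with an explicit day-by-day loop. The seed, the tiny array sizes, and the `demo_` names are illustrative assumptions, not values from the cell above.

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed, unrelated to the one above
demo_changes = rng.normal(0.001, 0.02, size=(5, 2))  # 5 days, 2 hypothetical stocks
demo_start = np.array([100.0, 200.0])

# Vectorized: running product of daily growth factors, scaled by the start price
demo_vectorized = demo_start * np.cumprod(1 + demo_changes, axis=0)

# Loop version: carry the previous day's price forward one day at a time
demo_looped = np.empty_like(demo_changes)
price = demo_start.copy()
for day in range(demo_changes.shape[0]):
    price = price * (1 + demo_changes[day])
    demo_looped[day] = price

print(np.allclose(demo_vectorized, demo_looped))
```

Both versions produce the same prices up to floating-point rounding; the vectorized one simply moves the loop into compiled code.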

Next, let’s compute **daily returns** from these prices.

In [None]:
# Compute daily percentage returns from price series
returns = (price_series[1:] - price_series[:-1]) / price_series[:-1]
print("Shape:", returns.shape)
returns[:3]
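
As a sanity check, returns computed from prices should reproduce the simulated daily changes, except for the first day, which has no previous price to compare against. The sketch below uses small hypothetical data (the seed, sizes, and `demo_` names are assumptions) and the algebraically equivalent form `p[t] / p[t-1] - 1`.

```python
import numpy as np

rng = np.random.default_rng(1)  # illustrative seed
demo_simulated = rng.normal(0.001, 0.02, size=(6, 3))  # 6 days, 3 hypothetical stocks
demo_prices = np.array([50.0, 75.0, 120.0]) * np.cumprod(1 + demo_simulated, axis=0)

# Same quantity as (p[t] - p[t-1]) / p[t-1], written as a ratio minus one
demo_recovered = demo_prices[1:] / demo_prices[:-1] - 1

print(demo_recovered.shape)  # one row fewer than the price series
print(np.allclose(demo_recovered, demo_simulated[1:]))
```

This is also why `returns` in the cell above has 251 rows rather than 252.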

## 📈 Step 2: Moving Averages and Trend Detection

A **moving average** smooths short-term fluctuations to reveal long-term trends. We can compute this efficiently with NumPy’s convolution function.

In [None]:
def moving_average(data, window):
    """Compute simple moving average using convolution."""
    weights = np.ones(window) / window
    return np.convolve(data, weights, mode='valid')

# Example: 10-day moving average for AAPL
aapl_prices = price_series[:, 0]
ma_10 = moving_average(aapl_prices, 10)

print("Original series length:", len(aapl_prices))
print("MA length:", len(ma_10))

We can also detect **upward trends** when the current price exceeds its moving average.

This is a typical pattern in technical trading strategies, and it is fully vectorized.

In [None]:
trend_mask = aapl_prices[9:] > ma_10  # Compare price to its 10-day moving average
trend_days = np.count_nonzero(trend_mask)
print(f"Days with upward trend: {trend_days}/{len(ma_10)}")
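
The slice `aapl_prices[9:]` works because, with `mode='valid'`, entry `i` of the moving average covers `data[i : i + window]` and therefore lines up with the price at index `i + window - 1`. A minimal sketch on a tiny hypothetical series (the values and the window of 2 are chosen only so the arithmetic is easy to follow):

```python
import numpy as np

demo_prices = np.array([10.0, 12.0, 11.0, 13.0, 9.0, 14.0])  # made-up prices
demo_window = 2

demo_ma = np.convolve(demo_prices, np.ones(demo_window) / demo_window, mode="valid")

# demo_ma[i] averages demo_prices[i : i + demo_window], so compare it with the
# last price inside that window: demo_prices[i + demo_window - 1]
demo_aligned = demo_prices[demo_window - 1:]
demo_mask = demo_aligned > demo_ma

print(demo_ma)       # averages of consecutive pairs
print(demo_aligned)  # prices from index window - 1 onward
print(demo_mask)     # True where the price sits above its own moving average
```

For a window of 10 the offset is `10 - 1 = 9`, which matches the slice used above.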

## ⚡ Step 3: Portfolio Analysis

Let’s simulate a simple **portfolio**, with equal weights across all 4 stocks, and compute its total daily return and risk (volatility).

In [None]:
weights = np.array([0.25, 0.25, 0.25, 0.25])

# Portfolio daily returns (matrix multiplication)
portfolio_returns = returns @ weights

# Annualized statistics
mean_daily = np.mean(portfolio_returns)
std_daily = np.std(portfolio_returns)
annual_return = mean_daily * 252
annual_volatility = std_daily * np.sqrt(252)

print(f"Annualized Return: {annual_return:.2%}")
print(f"Annualized Volatility: {annual_volatility:.2%}")

### 💡 Discussion
- Vectorized `@` matrix multiplication combines all stock returns at once.
- `mean` and `std` use efficient C-level loops.
- This pattern mirrors how professional quant libraries calculate risk metrics at scale.
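
The following sketch shows what `returns @ weights` computes: for each day, the weighted sum of that day's stock returns. It also compares the default population standard deviation (`ddof=0`) with the sample version (`ddof=1`), since `np.std` and `np.cov` use different defaults. The data, seed, and `demo_` names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)  # illustrative seed
demo_returns = rng.normal(0.001, 0.02, size=(250, 4))  # days x hypothetical stocks
demo_weights = np.full(4, 0.25)

# Matrix-vector product vs. an explicit weighted sum across columns
demo_matmul = demo_returns @ demo_weights
demo_weighted_sum = (demo_returns * demo_weights).sum(axis=1)
print(np.allclose(demo_matmul, demo_weighted_sum))

# np.std defaults to ddof=0 (population); ddof=1 gives the sample estimate
demo_vol_pop = np.std(demo_matmul) * np.sqrt(252)
demo_vol_sample = np.std(demo_matmul, ddof=1) * np.sqrt(252)
print(f"ddof=0: {demo_vol_pop:.4%}  ddof=1: {demo_vol_sample:.4%}")
```

With a few hundred observations the two volatility figures differ only slightly, but it helps to know which one you are reporting.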

## 🔍 Step 4: Correlation and Covariance

Analyzing relationships between stocks helps in portfolio diversification. NumPy offers optimized covariance and correlation functions.

In [None]:
# Covariance and correlation matrices
cov_matrix = np.cov(returns.T)
corr_matrix = np.corrcoef(returns.T)

print("Covariance Matrix:\n", cov_matrix)
print("\nCorrelation Matrix:\n", corr_matrix)

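### 🧠 Under the Hood
- Covariance is computed as `(X - mean)^T (X - mean) / (n - 1)`, where `X` has one row per observation (day) and `n` is the number of observations.
- `np.cov` and `np.corrcoef` reduce to a matrix product, which NumPy hands to an optimized BLAS library for floating-point arrays, giving native-level performance.
- The arrays here use **row-major memory layout (C order)**, so each day's returns across stocks sit next to each other in memory; `returns.T` is a strided view of that same buffer, not a copy.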
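
The covariance formula above can be checked directly against `np.cov`. This is a minimal sketch on hypothetical data (the seed, sizes, and `demo_` names are assumptions); it uses `rowvar=False`, which tells NumPy that columns are the variables and is equivalent to passing the transpose.

```python
import numpy as np

rng = np.random.default_rng(3)  # illustrative seed
demo_X = rng.normal(0.001, 0.02, size=(250, 4))  # rows = days, columns = stocks

# (X - mean)^T (X - mean) / (n - 1), with the mean taken per column
demo_centered = demo_X - demo_X.mean(axis=0)
demo_cov = demo_centered.T @ demo_centered / (demo_X.shape[0] - 1)
print(np.allclose(demo_cov, np.cov(demo_X, rowvar=False)))

# Correlation is covariance rescaled by the product of standard deviations
demo_std = np.sqrt(np.diag(demo_cov))
demo_corr = demo_cov / np.outer(demo_std, demo_std)
print(np.allclose(demo_corr, np.corrcoef(demo_X, rowvar=False)))
```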

## ⚙️ Step 5: Volatility Clustering

Periods of high volatility often cluster together, a real phenomenon in finance.

We can detect these periods using a **rolling standard deviation** computed in a vectorized way.

In [None]:
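def rolling_std(data, window):
    """Compute rolling standard deviation of a 1-D array using stride tricks."""
    step = data.strides[0]  # bytes between consecutive elements
    shape = (data.size - window + 1, window)
    # Each row starts one element after the previous one; the view is made
    # read-only so the overlapping windows cannot be written through by accident.
    windows = np.lib.stride_tricks.as_strided(
        data, shape=shape, strides=(step, step), writeable=False
    )
    return np.std(windows, axis=1)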

volatility_10 = rolling_std(portfolio_returns, 10)
volatility_10[:5]

This approach avoids explicit loops and creates a **view** over the data, not a copy, using NumPy stride tricks for high efficiency.

## 🧠 Under the Hood

- A NumPy `ndarray` typically stores its data in one contiguous block of memory. Stride tricks reinterpret that same memory to create sliding windows.
- `as_strided` is a powerful but advanced function: it does no bounds checking, so a wrong shape or stride can read memory outside the array. Used carefully, it enables operations like rolling metrics without copying large arrays.
- Libraries like `pandas` and `ta-lib` provide rolling-window functions through their own compiled implementations; stride tricks are a way to get a similar effect with NumPy alone.
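
If your NumPy version is 1.20 or newer, `sliding_window_view` offers the same windowed view with shape checking built in. The sketch below is one possible alternative, not a replacement the original code requires; the seed, series length, and `demo_` names are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # available in NumPy >= 1.20

rng = np.random.default_rng(4)  # illustrative seed
demo_series = rng.normal(0.0, 0.01, size=50)
demo_window = 10

# Read-only view with one row per window; no data is copied
demo_windows = sliding_window_view(demo_series, demo_window)
print(demo_windows.shape)
print(np.shares_memory(demo_windows, demo_series))

# Rolling standard deviation, checked against a plain Python loop
demo_rolling = demo_windows.std(axis=1)
demo_looped = np.array([
    demo_series[i:i + demo_window].std()
    for i in range(demo_series.size - demo_window + 1)
])
print(np.allclose(demo_rolling, demo_looped))
```

The windows themselves cost no extra memory, although reductions such as `std` may still allocate temporaries while they compute.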

## ⚠️ Best Practices / Pitfalls

- Always **set a random seed** when generating synthetic data for reproducibility.
- Avoid creating unnecessary copies: vectorized math and views are faster and more memory-efficient.
- Use `np.matmul` or `@` instead of manual loops for portfolio or matrix operations.
- Check array alignment (shapes, dtypes) before complex operations.
- `np.ascontiguousarray()` can ensure proper layout for large computations.
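
A short sketch of the shape and layout checks mentioned above, using hypothetical data (the seed and `demo_` names are assumptions). Note that `np.ascontiguousarray` copies when the input is not already C-contiguous, so it trades memory for layout and is worth using only when a downstream routine benefits from it.

```python
import numpy as np

rng = np.random.default_rng(5)  # illustrative seed
demo_returns = rng.normal(0.001, 0.02, size=(250, 4))
demo_weights = np.full(4, 0.25)

# Shape check before the matrix product: columns must match the weight count
assert demo_returns.shape[1] == demo_weights.shape[0]
assert demo_returns.dtype == demo_weights.dtype

# A transpose is a view with swapped strides, so it is not C-contiguous
demo_transposed = demo_returns.T
print(demo_transposed.flags["C_CONTIGUOUS"])

# ascontiguousarray makes a C-ordered copy in that case
demo_contiguous = np.ascontiguousarray(demo_transposed)
print(demo_contiguous.flags["C_CONTIGUOUS"])
print(np.shares_memory(demo_contiguous, demo_returns))
```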

Following these practices yields code that’s **both readable and high-performance**, just like in production analytics systems.

## 💪 Challenge Exercise

**Challenge 9.1**:

1. Create a 5-stock portfolio with different random weights that sum to 1.
2. Compute the daily and annualized return and volatility.
3. Identify which stock contributes most to the overall volatility (hint: use the covariance matrix).

*Try this before viewing the next section or solutions.*

---
✅ **Next Up:** We’ll wrap up with a **comprehensive review and cheat sheet**, consolidating all key NumPy patterns you’ve learned so far.

`# --- End of Section 9 - Continue to Final Review ---`