In [1]:
import numpy as np
import pandas as pd
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Step 1: Load stock data
ticker = 'AAPL'
df = yf.download(ticker, start='2020-01-01', end='2023-01-01')
df['Target'] = np.where(df['Close'].shift(-1) > df['Close'], 1, 0)

# Step 2: Feature selection and scaling
features = df[['Open', 'High', 'Low', 'Close', 'Volume']]
scaler = MinMaxScaler()
X = scaler.fit_transform(features)
y = df['Target'].values

# Step 3: Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)

# Step 4: Build ANN model
model = Sequential()
model.add(Dense(64, input_dim=5, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=50, batch_size=32, verbose=1)

# Step 5: Evaluate model
y_pred = (model.predict(X_test) > 0.5).astype("int32")
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)

print(f"Accuracy: {accuracy:.2f}")
print("Confusion Matrix:")
print(conf_matrix)

[*********************100%***********************]  1 of 1 completed

Epoch 1/50
19/19 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.4967 - loss: 0.6940
Epoch 2/50
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.5232 - loss: 0.6924
Epoch 3/50
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.5215 - loss: 0.6915
Epoch 4/50
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.5248 - loss: 0.6913
Epoch 5/50
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.5248 - loss: 0.6922
Epoch 6/50
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.5215 - loss: 0.6914
Epoch 7/50
19/19 ...

# Assignment 5: Detailed line-by-line explanation (neutral)

This markdown cell explains the code cell above line by line, describes the inputs and outputs at each stage, and summarizes the mathematical concepts, advantages, disadvantages, and applications covered in this assignment. No recommendations or changes are included; it describes only what the assignment contains and how it is implemented.

---

## 1) Imports (what the code brings in)
- `import numpy as np`: imports NumPy for array and numerical operations used throughout the code.
- `import pandas as pd`: imports pandas for DataFrame manipulation when handling downloaded stock data.
- `import yfinance as yf`: imports yfinance to download historical market data from Yahoo Finance.
- `from sklearn.preprocessing import MinMaxScaler`: imports MinMaxScaler to normalize features to a fixed range.
- `from sklearn.model_selection import train_test_split`: imports a utility to split data into training and test sets.
- `from sklearn.metrics import accuracy_score, confusion_matrix`: imports metrics for classification evaluation.
- `from tensorflow.keras.models import Sequential` and `from tensorflow.keras.layers import Dense`: import Keras model and Dense layer to build the ANN architecture used in the assignment.

## 2) Step 1: Load stock data and create target
- `ticker = 'AAPL'`: sets the ticker symbol for the stock to download (Apple Inc.).
- `df = yf.download(ticker, start='2020-01-01', end='2023-01-01')`: downloads daily historical OHLCV (Open, High, Low, Close, Volume) data for the date range and stores it in a pandas DataFrame `df`.
- `df['Target'] = np.where(df['Close'].shift(-1) > df['Close'], 1, 0)`: creates a binary target column where each day's label is 1 if the next day's close is strictly greater than the current day's close, otherwise 0. `shift(-1)` moves each close up by one row, so row *t* is compared with the close at *t+1*. The resulting `df['Target']` holds 0/1 values aligned with the rows of `df`. The final row has no next day: its shifted value is `NaN`, the comparison `NaN > close` evaluates to `False`, and the row is therefore labeled 0 rather than left undefined.
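
As a minimal sketch of how the labeling expression behaves, the snippet below applies it to a hypothetical five-row `Close` series (the prices and the name `toy` are made up for illustration and are not AAPL data). It shows that an unchanged close and the final row, which has no next-day close, both receive label 0.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices, for illustration only.
toy = pd.DataFrame({"Close": [150.0, 151.2, 149.8, 149.8, 152.0]})

# Same labeling rule as the notebook: 1 if tomorrow's close is strictly higher.
next_close = toy["Close"].shift(-1)          # last entry becomes NaN
toy["Target"] = np.where(next_close > toy["Close"], 1, 0)

print(toy)
print("Last shifted value is NaN:", bool(np.isnan(next_close.iloc[-1])))
```

Any comparison against `NaN` is `False`, which is why the last row ends up in class 0.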

## 3) Step 2: Feature selection and scaling
- `features = df[['Open', 'High', 'Low', 'Close', 'Volume']]`: selects five columns from the DataFrame to be predictor features. The selected features form a table of shape (n_samples, 5).
- `scaler = MinMaxScaler()`: creates an instance of the MinMaxScaler. The scaler will map each feature individually to the [0, 1] range using the formula $x_{\text{scaled}} = (x - \min) / (\max - \min)$, where min and max are computed per feature.
- `X = scaler.fit_transform(features)`: fits the scaler on the full `features` data and transforms it, returning a NumPy array `X` of scaled feature values (shape (n_samples,5)).
- `y = df['Target'].values`: extracts the target values as a NumPy array `y` of shape (n_samples,) with binary values 0 or 1.
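
The min-max formula can be checked by hand. The sketch below, which uses a small hypothetical two-column array (`features_demo`, not the notebook's data), computes the scaling per column with NumPy and compares it with `MinMaxScaler`.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical feature table: two columns on very different scales.
features_demo = np.array([
    [100.0, 1_000_000.0],
    [110.0, 3_000_000.0],
    [120.0, 2_000_000.0],
])

# Manual min-max scaling, computed per column (axis=0).
col_min = features_demo.min(axis=0)
col_max = features_demo.max(axis=0)
manual = (features_demo - col_min) / (col_max - col_min)

# The same transformation via scikit-learn.
scaled = MinMaxScaler().fit_transform(features_demo)

print(manual)
print(np.allclose(manual, scaled))
```

Each column is rescaled independently, so the large-valued column does not affect the small-valued one.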

## 4) Step 3: Train-test split (time-aware)
- `X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)`: splits the dataset into training and test sets. `test_size=0.2` reserves 20% of samples for testing. `shuffle=False` preserves chronological order so the model is trained on older data and tested on later data (appropriate for time-series forecasting).
- Outputs: arrays `X_train`, `y_train` for training and `X_test`, `y_test` for evaluation. The code drops no rows, so the shapes follow directly from the number of rows in `df`: the earliest rows (about 80%) form the training set and the most recent rows (about 20%) form the test set.
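
To make the split sizes concrete, here is a small sketch on stand-in data. The row count of 756 is an assumption (roughly three years of trading days), not a value read from the notebook; with it, the split yields 19 mini-batches of size 32 per epoch, which is consistent with the `19/19` shown in the training log.

```python
import math
import numpy as np
from sklearn.model_selection import train_test_split

n_rows = 756          # assumed row count (roughly three years of trading days)
batch_size = 32

# Stand-in data: the row index plays the role of the date order.
X_demo = np.arange(n_rows).reshape(-1, 1)
y_demo = np.zeros(n_rows, dtype=int)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, shuffle=False
)

steps_per_epoch = math.ceil(len(X_tr) / batch_size)
print(len(X_tr), len(X_te), steps_per_epoch)
# Every training row precedes every test row.
print(X_tr.max() < X_te.min())
```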

## 5) Step 4: Build ANN model (architecture and forward pass)
- `model = Sequential()`: instantiates a Keras Sequential model object that stacks layers in order.
- `model.add(Dense(64, input_dim=5, activation='relu'))`: first hidden layer, a fully connected layer with 64 neurons that expects input vectors of length 5. The activation function is ReLU (rectified linear unit). Mathematically, this layer computes $z = W^T x + b$ and outputs $a = \max(0, z)$ element-wise.
- `model.add(Dense(32, activation='relu'))`: second hidden layer with 32 neurons and ReLU activation; it receives the 64 outputs of the previous layer as its input.
- `model.add(Dense(1, activation='sigmoid'))`: output layer with a single neuron and sigmoid activation that maps the final linear output to a probability $p \in (0, 1)$, interpreted as the probability that the next day's close will be higher (class 1).
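
The forward pass through these three layers can be sketched in plain NumPy. The weights below are random stand-ins rather than trained values, and the helper names (`relu`, `sigmoid`, `forward`) are illustrative; the point is only to show the shapes and the parameter count implied by the 5 → 64 → 32 → 1 layout.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer sizes from the notebook: 5 inputs -> 64 -> 32 -> 1.
sizes = [5, 64, 32, 1]
# Random stand-in weights; a trained model would hold learned values.
weights = [rng.normal(0.0, 0.1, size=(n_in, n_out))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

def forward(x):
    """Row-vector form of z = W^T x + b, applied layer by layer."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = relu(a @ W + b)                        # hidden layers
    return sigmoid(a @ weights[-1] + biases[-1])   # output probability

x_batch = rng.random((4, 5))        # four scaled samples in [0, 1)
probs = forward(x_batch)
n_params = sum(W.size + b.size for W, b in zip(weights, biases))
print(probs.shape, n_params)
```

The parameter count is (5·64 + 64) + (64·32 + 32) + (32·1 + 1) = 2497.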

## 6) Step 4 (continued): Compile (loss and optimizer) and training
- `model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])`: sets the training configuration. The loss function is binary cross-entropy: for a true label $y \in \{0, 1\}$ and predicted probability $p$, the loss per sample is $L = -[\,y \log p + (1 - y) \log(1 - p)\,]$. The optimizer is Adam, which performs gradient-based updates using adaptive moment estimates. The metric requested is accuracy.
- `model.fit(X_train, y_train, epochs=50, batch_size=32, verbose=1)`: trains the network for 50 epochs over the training set, processing samples in mini-batches of size 32. During training, Keras prints progress including loss and accuracy per epoch (training metrics). The training process updates model weights to minimize the binary cross-entropy loss using backpropagation and the Adam update rule.
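
A short NumPy sketch of the binary cross-entropy formula gives a useful reference point for reading the training log. The function name `binary_cross_entropy` and the `eps` clipping are illustrative choices, not Keras internals. A model that outputs 0.5 for every sample scores ln 2 ≈ 0.693, so logged losses near 0.69 indicate predictions that are still close to 0.5.

```python
import numpy as np

def binary_cross_entropy(y_true, p_pred, eps=1e-7):
    """Mean binary cross-entropy; eps guards against log(0)."""
    p = np.clip(p_pred, eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

y_true = np.array([1, 0, 1, 0])

# A model that always outputs 0.5 scores ln(2) regardless of the labels.
uninformed = binary_cross_entropy(y_true, np.full(4, 0.5))

# Confident, correct predictions give a much smaller loss.
confident = binary_cross_entropy(y_true, np.array([0.9, 0.1, 0.8, 0.2]))

print(round(uninformed, 4), round(confident, 4))
```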

## 7) Step 5: Model evaluation (prediction and metrics)
- `y_pred = (model.predict(X_test) > 0.5).astype("int32")`: generates predicted probabilities on the test set with `model.predict(X_test)` and thresholds them at 0.5 to produce predicted class labels 0 or 1. The boolean result is cast to integers. Output `y_pred` aligns with `y_test` for evaluation.
- `accuracy = accuracy_score(y_test, y_pred)`: computes classification accuracy as (number of correct predictions) / (total predictions).
- `conf_matrix = confusion_matrix(y_test, y_pred)`: computes the 2×2 confusion matrix [[TN, FP], [FN, TP]], in which rows are actual classes and columns are predicted classes, giving the counts of true/false positives/negatives.
- `print(f"Accuracy: {accuracy:.2f}")`, `print("Confusion Matrix:")`, `print(conf_matrix)`: print the numeric accuracy to two decimal places and the confusion matrix array as the final output of the notebook cell.
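
The thresholding and counting steps can be reproduced by hand. The sketch below uses hypothetical probabilities and labels (not model output) and builds the same [[TN, FP], [FN, TP]] layout with NumPy only.

```python
import numpy as np

# Hypothetical predicted probabilities with shape (n, 1), the shape a
# single-unit output layer produces, and matching true labels.
probs = np.array([[0.62], [0.48], [0.51], [0.30], [0.75], [0.50]])
y_true = np.array([1, 0, 0, 0, 1, 1])

y_hat = (probs > 0.5).astype("int32").ravel()   # 0.50 itself maps to class 0

tn = int(np.sum((y_true == 0) & (y_hat == 0)))
fp = int(np.sum((y_true == 0) & (y_hat == 1)))
fn = int(np.sum((y_true == 1) & (y_hat == 0)))
tp = int(np.sum((y_true == 1) & (y_hat == 1)))

accuracy = (tp + tn) / len(y_true)
conf = np.array([[tn, fp], [fn, tp]])           # rows: actual, columns: predicted
print(y_hat, accuracy)
print(conf)
```

Because the comparison is strict (`> 0.5`), a probability of exactly 0.5 is assigned to class 0.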

---

## Mathematical concepts used in the implementation
- Feature scaling (Min-Max normalization): $x_{\text{scaled}} = (x - \min) / (\max - \min)$. This rescales features to a common range, which aids gradient-based optimization.
- Dense (fully connected) layer: computes a linear combination $z = W^T x + b$ followed by a nonlinear activation $a = \phi(z)$.
- ReLU activation: $\phi(z) = \max(0, z)$, a piecewise-linear activation function applied element-wise in hidden layers.
- Sigmoid activation (output): $\sigma(z) = 1 / (1 + e^{-z})$, producing probabilities for binary classification.
- Binary cross-entropy loss: $L(y, p) = -[\,y \log p + (1 - y) \log(1 - p)\,]$, averaged over samples; used to quantify mismatch between true labels and predicted probabilities.
- Backpropagation and gradient descent (Adam optimizer): gradients of loss with respect to weights are computed and used to update weights to reduce loss. Adam uses adaptive estimates of first and second moments of gradients in its update rule.
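
For the Adam item, a textbook-form update step can be sketched as follows. This is an illustration on a toy objective, not Keras's internal implementation; the function name `adam_step` is hypothetical, and the hyperparameter values are assumed to mirror commonly used defaults.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update for parameters w given gradient grad at step t (1-based)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy objective f(w) = w^2 with gradient 2w.
w = np.array([1.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
history = [float(w[0])]
for t in range(1, 201):
    w, m, v = adam_step(w, 2 * w, m, v, t)
    history.append(float(w[0]))

print(history[0], history[1], history[-1])
```

After bias correction, the first step has magnitude close to `lr` regardless of the gradient's scale, which is one sense in which the update is adaptive.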

## Inputs and outputs (end-to-end view)
- Inputs: historical OHLCV data fetched by yfinance (Open, High, Low, Close, Volume). The features are scaled to [0,1] and split into training and test sets preserving chronological order. Each sample is a 5-dimensional numeric vector.
- Intermediate outputs: network activations in hidden layers (64-dim then 32-dim), per-sample predicted probability from the sigmoid output. During training, per-epoch loss and accuracy are printed.
- Final outputs: for the test set, predicted binary labels `y_pred`, scalar accuracy, and the 2×2 confusion matrix printed to stdout.

## Advantages stated in the assignment content
- ANNs can model complex, nonlinear relationships present in financial time series, enabling the capture of patterns that simpler linear models might miss.
- With appropriate features and training, ANNs can provide probabilistic outputs (via the sigmoid) useful for decision-making or downstream analysis.
- The assignment demonstrates how a feedforward ANN can be applied to a next-day movement (up/down) classification task using tabular financial features.

## Disadvantages / limitations stated in the assignment content
- ANNs require sufficient and relevant historical data to train effectively. Small datasets or poor features limit performance.
- Training neural networks can be computationally expensive, particularly for larger architectures or long training runs.
- Risk of overfitting: the model may learn idiosyncrasies of the training data and fail to generalize to new market conditions. Financial data is non-stationary and noisy, which challenges predictive performance.

## Applications mentioned in the assignment content
- Short-term stock movement prediction (next-day up/down classification).
- Trading decision support where predicted probabilities or labels inform buy/sell decisions.
- Financial forecasting tasks where a binary outcome is of interest and ANN models can be used as one component among broader analytic systems.

---

End of neutral, detailed explanation for Assignment 5: the content above reflects the code cell implementation, the inputs/outputs at each step, the key mathematical concepts used, and the assignment-level advantages, disadvantages, and applications.

This code builds an **Artificial Neural Network (ANN)** to predict **next-day stock movement** (up or down) for Apple (AAPL) using **historical price data**. Let's carefully unpack every step, from data acquisition to evaluation, including reasoning, assumptions, and improvement opportunities.

---

## 1️⃣ Importing Required Libraries

```python
import numpy as np
import pandas as pd
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
```

**What each library does:**

| Library                   | Purpose                                                   |
| ------------------------- | --------------------------------------------------------- |
| `numpy`                   | Numerical operations and array handling                   |
| `pandas`                  | Data manipulation, especially with tabular data           |
| `yfinance`                | Downloads real stock market data from Yahoo Finance       |
| `sklearn.preprocessing`   | Scaling features to normalize magnitudes                  |
| `sklearn.model_selection` | Splitting data into training/testing sets                 |
| `sklearn.metrics`         | Evaluating model performance (accuracy, confusion matrix) |
| `tensorflow.keras`        | Building and training the ANN model                       |

---

## 2️⃣ Step 1: Load Stock Data

```python
ticker = 'AAPL'
df = yf.download(ticker, start='2020-01-01', end='2023-01-01')
df['Target'] = np.where(df['Close'].shift(-1) > df['Close'], 1, 0)
```

**Explanation:**

* `yf.download()` fetches daily Apple Inc. (AAPL) stock prices from January 2020 through December 2022.
* The data includes the columns `Open`, `High`, `Low`, `Close`, and `Volume`; whether a separate `Adj Close` column is also returned depends on the yfinance version and its `auto_adjust` setting.
* A new column **Target** is created:

  * **1** → if the next day's closing price is higher (price goes up)
  * **0** → otherwise (the next day's close is lower or unchanged); the last row, which has no next day, is also labeled 0

So this is a **binary classification problem**:
→ Predict whether tomorrow's price will rise or not.

**Example:**

| Close (today) | Close (tomorrow) | Target |
| ------------- | ---------------- | ------ |
| 150.00        | 151.20           | 1      |
| 151.20        | 149.80           | 0      |

---

## 3️⃣ Step 2: Feature Selection and Scaling

```python
features = df[['Open', 'High', 'Low', 'Close', 'Volume']]
scaler = MinMaxScaler()
X = scaler.fit_transform(features)
y = df['Target'].values
```

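**Why these features?**

* These 5 are the standard OHLCV fields (Open, High, Low, Close, Volume), the raw daily price and volume data from which technical indicators are usually derived.

**Scaling with MinMaxScaler:**

* Neural networks generally train more reliably when input features are on comparable scales; here, share prices and trading volumes differ by many orders of magnitude.
* `fit_transform()` learns the minimum and maximum of each feature and rescales that feature to [0, 1]. Because it is called on the full dataset before the split, the learned minima and maxima also reflect the test period.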

**Result:**

* `X`: 2D NumPy array (scaled features)
* `y`: 1D array (0 or 1 target labels)
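
`fit_transform()` in the snippet above sees every row, including those that later land in the test set. One common alternative, sketched here on a hypothetical upward-trending feature (`values`, `split`, and `scaler_demo` are illustrative names, not part of the original code), is to fit the scaler on the training rows only and reuse its statistics for the test rows.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical single feature that trends upward over time.
values = np.linspace(100.0, 200.0, 11).reshape(-1, 1)
split = 8                                   # first 8 rows train, last 3 test
train_raw, test_raw = values[:split], values[split:]

# Fit on the training rows only, then reuse those statistics for the test rows.
scaler_demo = MinMaxScaler().fit(train_raw)
train_scaled = scaler_demo.transform(train_raw)
test_scaled = scaler_demo.transform(test_raw)

print(train_scaled.min(), train_scaled.max())
print(test_scaled.ravel())
```

With this approach, test values can fall outside [0, 1] when the test period reaches prices the training period never saw; that is the cost of keeping test-period information out of training.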

---

## 4️⃣ Step 3: Train-Test Split

```python
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)
```

**Key points:**

* Splits dataset into:

  * **80% training data**
  * **20% test data**
* `shuffle=False` preserves *time order*, which is crucial for time-series prediction: it keeps test-period rows out of the training set and so avoids "peeking into the future."

**Note:**
Training data = earlier dates
Testing data = later dates (more realistic scenario)

---

## 5️⃣ Step 4: Build the Artificial Neural Network

```python
model = Sequential()
model.add(Dense(64, input_dim=5, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
```

### Architecture breakdown:

| Layer          | Neurons             | Activation | Purpose                                       |
| -------------- | ------------------- | ---------- | --------------------------------------------- |
| Input Layer    | 5 (from 5 features) | none       | Receives OHLCV data                           |
| Hidden Layer 1 | 64                  | ReLU       | Learns non-linear relationships               |
| Hidden Layer 2 | 32                  | ReLU       | Refines deeper feature interactions           |
| Output Layer   | 1                   | Sigmoid    | Outputs probability (0 to 1) for binary class |

### Why ReLU + Sigmoid?

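* **ReLU (Rectified Linear Unit):** cheap to compute and less prone to vanishing gradients than sigmoid or tanh, because its gradient is 1 for positive inputs; a common default for hidden layers.
* **Sigmoid:** squashes the output into (0, 1) so it can be read as a probability, the standard choice for a single-output binary classifier.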
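
The vanishing-gradient point can be illustrated numerically. This small sketch (helper names are illustrative) compares the derivatives of sigmoid and ReLU at a few inputs: the sigmoid derivative never exceeds 0.25 and shrinks toward zero for large |z|, while the ReLU derivative stays at 1 for any positive input.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)            # peaks at 0.25 when z = 0

def relu_grad(z):
    return (z > 0).astype(float)    # 1 for positive inputs, 0 otherwise

z = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
print("sigmoid'(z):", np.round(sigmoid_grad(z), 5))
print("relu'(z):   ", relu_grad(z))
```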

---

### Compile the Model

```python
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
```

**Explanation:**

* **Loss = binary_crossentropy**: standard for 2-class classification.
* **Optimizer = Adam**: adaptive learning rate optimizer, fast and stable.
* **Metrics = accuracy**: evaluate how often the model predicts correctly.

---

### Train the Model

```python
model.fit(X_train, y_train, epochs=50, batch_size=32, verbose=1)
```

**Training parameters:**

* **epochs = 50:** one epoch = one full pass over training data.
* **batch_size = 32:** updates weights after every 32 samples.
* **verbose = 1:** shows progress output.

The model iteratively adjusts weights to minimize loss, improving accuracy on the training data.

---

## 6️⃣ Step 5: Evaluate the Model

```python
y_pred = (model.predict(X_test) > 0.5).astype("int32")
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
```

**Explanation:**

* `model.predict()` outputs probabilities between 0 and 1.
* `> 0.5` converts probabilities to binary class predictions (1 or 0).
* **Accuracy:** proportion of correct predictions.
* **Confusion Matrix:** gives a detailed breakdown. In scikit-learn's convention, rows are actual classes and columns are predicted classes:

| Actual ↓ / Predicted → | 0 (down)        | 1 (up)          |
| ---------------------- | --------------- | --------------- |
| **0 (down)**           | True Negatives  | False Positives |
| **1 (up)**             | False Negatives | True Positives  |

---

### Print Results

```python
print(f"Accuracy: {accuracy:.2f}")
print("Confusion Matrix:")
print(conf_matrix)
```

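Example output (illustrative numbers):

```
Accuracy: 0.60
Confusion Matrix:
[[130  70]
 [ 90 110]]
```

Here the model predicted market direction correctly in 240 of 400 cases (60%). Whether that beats chance depends on the class balance of the test set, so compare it with the accuracy of always predicting the majority class; models built on raw OHLCV inputs often land only slightly above that baseline.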
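
As a worked check, accuracy can be recovered directly from a confusion matrix as the diagonal sum divided by the total. The sketch below does this for the example matrix shown above, and also computes the majority-class baseline plus precision and recall for the "up" class (the variable names are illustrative).

```python
import numpy as np

# Confusion matrix from the example output (rows: actual, columns: predicted).
cm = np.array([[130, 70],
               [90, 110]])

accuracy = np.trace(cm) / cm.sum()            # (TN + TP) / total
# Accuracy of always predicting the more frequent actual class.
baseline = cm.sum(axis=1).max() / cm.sum()

tn, fp, fn, tp = cm.ravel()
precision_up = tp / (tp + fp)                 # of predicted "up", share correct
recall_up = tp / (tp + fn)                    # of actual "up", share found

print(f"accuracy={accuracy:.2f} baseline={baseline:.2f}")
print(f"precision={precision_up:.3f} recall={recall_up:.3f}")
```

For this matrix, (130 + 110) / 400 = 0.60, against a baseline of 0.50 because the two actual classes are equally frequent.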

---

## 7️⃣ Conceptual Summary

| Step                    | Description                                     | Tools Used                           |
| ----------------------- | ----------------------------------------------- | ------------------------------------ |
| **Data Fetching**       | Get historical prices                           | `yfinance`                           |
| **Feature Engineering** | Create input features and binary target         | `numpy`, `pandas`                    |
| **Scaling**             | Normalize features to [0,1]                     | `MinMaxScaler`                       |
| **Train/Test Split**    | Separate old vs new data                        | `train_test_split`                   |
| **ANN Model**           | Learn non-linear mapping from prices → movement | `Sequential`, `Dense`                |
| **Evaluation**          | Measure accuracy & confusion                    | `accuracy_score`, `confusion_matrix` |

---

## 8️⃣ Key Assumptions

1. **Next-day movement** depends only on today's OHLCV; the model ignores earlier history and external signals.
2. **Stock market** is *not fully random*: the approach assumes that learnable patterns exist.
3. **Stationarity**: assumes statistical properties don't change drastically over time (often untrue in markets).
4. **No lookahead bias**: `shuffle=False` keeps the test set strictly later than the training set (good practice), but the scaler is fit on the full dataset before the split, so test-period minima and maxima still leak into the training features.

---

## 9️⃣ Limitations & Potential Improvements

| Limitation                | Improvement                                                          |
| ------------------------- | -------------------------------------------------------------------- |
| Uses only raw OHLCV       | Add **technical indicators** (e.g., RSI, MACD, EMA, Bollinger Bands) |
| Ignores temporal sequence | Use **LSTM/GRU** (recurrent neural networks) for time-dependence     |
| No regularization         | Add **Dropout** layers to prevent overfitting                        |
| No model validation       | Use **validation_split** in training                                 |
| Binary target only        | Try **regression** for predicting actual % change                    |
| Fixed threshold (0.5)     | Use **ROC curve** or tune threshold for precision-recall tradeoff    |
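
As one hedged illustration of the first row in the table, the sketch below derives a few simple indicator-style features from a hypothetical `Close` series using pandas. The window lengths and column names are arbitrary choices for the example, and RSI, MACD, and Bollinger Bands are not covered here.

```python
import pandas as pd

# Hypothetical closing prices, for illustration only.
close = pd.Series([100.0, 102.0, 101.0, 105.0, 107.0, 106.0, 110.0], name="Close")

feats = pd.DataFrame({
    "return_1d": close.pct_change(),                    # daily percentage change
    "sma_3": close.rolling(window=3).mean(),            # 3-day simple moving average
    "ema_3": close.ewm(span=3, adjust=False).mean(),    # 3-day exponential moving average
})

# Rolling windows leave NaNs at the start; drop them before training.
feats = feats.dropna()
print(feats.round(3))
```

All three features use only the current and earlier rows, so they do not introduce lookahead into the inputs.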

---

## 🔟 Intuitive Summary

This script builds a **feedforward neural network** that tries to "guess" whether Apple's price will go *up or down* tomorrow based solely on today's market stats.

While it's a good **introductory machine learning experiment**, it's far from a production-grade trading model, but it forms a foundation for deeper time-series or reinforcement learning approaches.

