## Learning Objectives
<font color="#12A80D"> <b>• Predict the next-day NVDA close using a Ridge meta-model fed by the latest lookback predictions, plus a news-sentiment confidence signal.</br>• Spin up a clean Colab runtime, mount Drive, pin/install packages (Transformers, joblib, newsapi-python), and define reproducible paths for inputs, cache, and logs.</br>• Load the NewsAPI key from Drive, initialize the client, and implement a 5-day JSON cache to avoid quota issues and ensure deterministic runs.</br>• Fetch Nvidia news from the last 5 days, normalize fields (date/title/description), deduplicate, and persist to cache for reuse.</br>• Load FinBERT (ProsusAI/finbert tokenizer + model), batch texts, compute softmax class probabilities, and aggregate avg positive/neutral/negative scores across articles.</br>• Map aggregated sentiment into a qualitative confidence label (STRONG / NEUTRAL / WEAK) based on thresholded averages.</br>• Discover the latest <code>\*\_predictions.csv</code> from each lookback folder (1D–365D), extract the final row, and assemble an <code>X_input</code> row of <code>Pred\_\*</code> features.</br>• Load the saved feature column order and Ridge meta-model (joblib), align columns, and produce the ensemble predicted close for the next session.</br>• Compute and print a concise summary: last actual close, predicted close, percent change (Δ%), and the sentiment-based confidence context.</br>• Log one row to an <code>ensemble_prediction_log.csv</code> with timestamp, prediction, Δ%, confidence, avg sentiment scores, and each base model’s latest <code>Pred\_\*</code> value, keeping a stable column order for append-safe runs.</br>• Practice engineering hygiene: explicit error messages for missing keys/files, consistent timezone/date handling, idempotent caching, and clear checkpoint prints for traceability.

</b> </font>

## Load Dependencies into the Colab Runtime Environment
<font color="#12A80D"> <b>• Installs the required Python packages in the Colab environment, pinning versions where possible.</br>• pip may report dependency conflicts for preinstalled Colab packages that this notebook never imports; those messages can be ignored because they do not affect the execution of <code>Nvidia_Next_Day_Closing_Meta_Model_Prediction_w_API_Call_Week6.ipynb</code>.</b> </font>

In [None]:
# Widgets & core data stack
!pip install --quiet ipywidgets
!pip install --quiet pandas==2.2.2 tqdm
# numpy==2.0.2 is already present

# PyTorch stack, pinned explicitly to the CUDA 12.4 wheels
!pip install -U "torch==2.6.0+cu124" "torchvision==0.21.0+cu124" "torchaudio==2.6.0+cu124" --index-url https://download.pytorch.org/whl/cu124

# News API + utilities
!pip install --quiet newsapi-python==0.2.7
!pip install --quiet joblib==1.4.2
!pip install --quiet requests==2.32.3


## Import necessary Libraries and Modules
<font color="#12A80D"> <b>• Imports essential Python libraries and modules for the project</br>• Includes file and path handling (os, glob), data manipulation (pandas), model persistence (joblib), and date/time utilities (datetime, timedelta)</br>• Loads API clients (NewsApiClient), HTTP request handling (requests), and NLP tools (transformers for model/tokenizer loading)</br>• Enables PyTorch tensor operations (torch) and JSON parsing (json) for configuration and data exchange</b> </font>

In [None]:
# File and path handling
import os        # File system operations
import glob      # File pattern matching

# Data manipulation
import pandas as pd   # Data analysis and DataFrame handling

# Model saving/loading
import joblib         # Serialize and deserialize models and Python objects

# Date and time utilities
from datetime import datetime, timedelta  # Date/time handling and calculations

# News API client
from newsapi import NewsApiClient # Access to the News API service

# HTTP requests
import requests # Make HTTP requests to external APIs

# NLP model loading and tokenization
from transformers import AutoTokenizer, AutoModelForSequenceClassification  # Hugging Face Transformers for text classification

# PyTorch deep learning framework
import torch # Tensor operations and model inference

# JSON utilities
import json # Work with JSON data (parse, dump, load)


## Mount Google Drive in the Colab notebook to access its contents
<font color="#12A80D"> <b>• Requires granting access to Google Drive</br>• Forces remounting even if already mounted</b> </font>

In [None]:
# Mount Google Drive in Google Colab
from google.colab import drive
drive.mount('/content/drive', force_remount=True)  # force_remount=True ensures a fresh mount

Mounted at /content/drive


## Read and configure News API settings
<font color="#12A80D"> <b>• Reads the NewsAPI.org API key from a local text file stored in Google Drive</br>• Initializes a NewsApiClient instance using the loaded API key to enable authenticated requests</br>• Defines a directory path for caching retrieved news data and specifies a JSON file path (news_cache.json) to store the cached results</br>• Facilitates efficient reuse of news data by avoiding repeated API calls for previously fetched content</b> </font>

In [None]:
# Read NewsAPI.org API Key from local file
with open("/content/drive/My Drive/Nvidia_Stock_Market_History/API_keys/newsapi_key.txt", "r") as f:
    NEWSAPI_API_KEY = f.read().strip()

# Initialize NewsAPI client
newsapi = NewsApiClient(api_key=NEWSAPI_API_KEY)

# Directory where cached news data will be stored
NEWS_CACHE_DIR = '/content/drive/My Drive/Nvidia_Stock_Market_History/Training/news_cache'

# Path to JSON file containing cached news results
NEWS_CACHE_PATH = os.path.join(NEWS_CACHE_DIR, "news_cache.json")
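
If the key file is missing, the cell above stops with a bare `FileNotFoundError`, and an empty file is accepted silently and only fails later at the first API call. A minimal sketch of a stricter loader follows; the helper name `read_api_key` and the demo file are hypothetical and not part of the notebook.

```python
import os
import tempfile


def read_api_key(path):
    """Return the API key stored in a text file, or raise a descriptive error."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"API key file not found: {path}")
    with open(path, "r", encoding="utf-8") as f:
        key = f.read().strip()
    if not key:
        raise ValueError(f"API key file is empty: {path}")
    return key


# Demo with a throwaway file; "abc123" is a placeholder, not a real key
demo_path = os.path.join(tempfile.mkdtemp(), "newsapi_key.txt")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("  abc123\n")

key = read_api_key(demo_path)
print(f"Loaded a key of length {len(key)}")
```

In the notebook, the same function could be called with the Drive path used above before constructing `NewsApiClient`.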


## Load FinBERT sentiment analysis model and tokenizer
<font color="#12A80D"> <b>• Loads the pretrained FinBERT model (ProsusAI/finbert) and its associated tokenizer from Hugging Face Transformers</br>• The tokenizer converts raw text into model-ready input IDs and attention masks</br>• The model is a sequence classification transformer fine-tuned for financial sentiment analysis (positive, negative, neutral)</br>• Enables direct inference on financial news or market-related text to assess sentiment trends for downstream stock prediction workflows</b> </font>

In [None]:
# Load FinBERT financial sentiment analysis model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("ProsusAI/finbert")  # Tokenizer to convert text into model input IDs
model = AutoModelForSequenceClassification.from_pretrained("ProsusAI/finbert")  # Pretrained FinBERT model for sentiment classification

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


## Meta-model prediction configuration
<font color="#12A80D"> <b>• Defines key file paths and directories required for generating and logging ensemble predictions</br>• Specifies the trained Ridge meta-model file location (META_MODEL_PATH) used for inference</br>• Sets the log directory (LOG_DIR_PRED) for storing prediction history and outputs</br>• Creates a full log file path (LOG_PATH) for saving ensemble prediction results as a CSV</br>• Ensures consistent file organization for reproducibility and future evaluation</b> </font>

In [None]:
# ---------------------------
# Configuration
# ---------------------------

# Path to the trained Ridge meta-model file
META_MODEL_PATH = "/content/drive/My Drive/Nvidia_Stock_Market_History/Training/Meta_Model_Trained/meta_model_ridge.joblib"

# Directory for storing prediction logs
LOG_DIR_PRED = "/content/drive/My Drive/Nvidia_Stock_Market_History/Training/Meta_Model_Trained"

# Full path for the ensemble prediction log CSV
LOG_PATH = os.path.join(LOG_DIR_PRED, "ensemble_prediction_log.csv")

## Load latest lookback predictions
<font color="#12A80D"> <b>• Iterates through all defined lookback periods (LOOKBACK_PERIODS) to find the most recent prediction outputs for each timeframe</br>• Uses folder naming conventions containing date and time stamps to sort and identify the latest run within each lookback folder</br>• Searches for *_predictions.csv files in the most recent subfolder and records their paths for downstream processing</br>• Ensures the meta-model always uses the freshest predictions from all base models without manual file selection</b> </font>

In [None]:
# ---------------------------
# Load latest predictions
# ---------------------------

# Lookback folders to process
LOOKBACK_PERIODS = ["365D", "270D", "180D", "90D", "60D", "30D", "14D", "1D"]


# Base directory containing all lookback folders
ENSEMBLE_INPUTS_BASE = "/content/drive/My Drive/Nvidia_Stock_Market_History/Training/ensemble_inputs"

# Will store paths to the latest *_predictions.csv from each lookback folder
csv_files = []

# Parse the run timestamp from a subfolder name (defined once, outside the loop)
def extract_dt(folder_name):
    # Expects the name to end with date_time, e.g. ..._2025-07-08_14-43-29
    date_part, time_part = folder_name.split("_")[-2:]
    return datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H-%M-%S")

for lookback in LOOKBACK_PERIODS:
    lookback_path = os.path.join(ENSEMBLE_INPUTS_BASE, lookback)
    # Get all subfolders inside the current lookback folder
    subfolders = [f for f in glob.glob(os.path.join(lookback_path, "*")) if os.path.isdir(f)]
    if not subfolders:
        raise FileNotFoundError(f"No subfolders found in {lookback_path}")

    # Pick the subfolder with the most recent timestamp in its name
    latest_subfolder = max(subfolders, key=lambda f: extract_dt(os.path.basename(f)))

    # Find the *_predictions.csv file inside the latest subfolder
    pred_files = sorted(glob.glob(os.path.join(latest_subfolder, "*_predictions.csv")))
    if not pred_files:
        raise FileNotFoundError(f"No *_predictions.csv found in {latest_subfolder}")

    # Only one predictions file is expected per folder; sorting keeps the
    # choice deterministic if there happen to be several
    csv_files.append(pred_files[0])

# Print the discovered prediction files
print("Found prediction files:")
for f in csv_files:
    print(f)

Found prediction files:
/content/drive/My Drive/Nvidia_Stock_Market_History/Training/ensemble_inputs/365D/Nvidia_Stock_Training_365D_SA_2025-08-13_04-56-15/Nvidia_C1D64_BiG550_BiG350_BAtt_D1_Lookback365_predictions.csv
/content/drive/My Drive/Nvidia_Stock_Market_History/Training/ensemble_inputs/270D/Nvidia_Stock_Training_270D_SA_2025-08-13_02-24-48/Nvidia_C1D64_BiG250_BiG250_BiG250_BAtt_D1_Lookback270_predictions.csv
/content/drive/My Drive/Nvidia_Stock_Market_History/Training/ensemble_inputs/180D/Nvidia_Stock_Training_180D_SA_2025-08-13_02-55-32/Nvidia_C1D64_BiG250_BiG250_BiG250_BAtt_D1_Lookback180_predictions.csv
/content/drive/My Drive/Nvidia_Stock_Market_History/Training/ensemble_inputs/90D/Nvidia_Stock_Training_90D_SA_2025-08-13_03-13-06/Nvidia_C1D64_BiG250_BiG250_BAtt_D1_Lookback90_predictions.csv
/content/drive/My Drive/Nvidia_Stock_Market_History/Training/ensemble_inputs/60D/Nvidia_Stock_Training_60D_SA_2025-08-12_05-01-55/Nvidia_C1D64_BiG250_BiG250_BAtt_D1_Lookback60_predictio
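
The folder lookup above relies on every subfolder name ending in a `YYYY-MM-DD_HH-MM-SS` stamp; a single stray folder in a lookback directory would make `strptime` raise. The sketch below shows one way to tolerate that by matching the stamp with a regular expression and ignoring names that lack it. The helper names and the sample folder names are illustrative assumptions (only the first name appears in the output above).

```python
import re
from datetime import datetime

# Matches a trailing timestamp such as "2025-08-13_04-56-15"
_STAMP = re.compile(r"(\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})$")


def run_timestamp(folder_name):
    """Parse the trailing YYYY-MM-DD_HH-MM-SS stamp, or return None if absent."""
    match = _STAMP.search(folder_name)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d_%H-%M-%S")


def latest_run(folder_names):
    """Return the folder name with the newest stamp, ignoring unstamped names."""
    stamped = [(run_timestamp(name), name) for name in folder_names]
    stamped = [(ts, name) for ts, name in stamped if ts is not None]
    if not stamped:
        raise FileNotFoundError("No timestamped run folders found")
    return max(stamped)[1]


names = [
    "Nvidia_Stock_Training_60D_SA_2025-08-12_05-01-55",
    "Nvidia_Stock_Training_60D_SA_2025-08-13_03-13-06",  # made-up newer run
    "scratch_notes",                                      # stray folder, skipped
]
newest = latest_run(names)
print(newest)
```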

## Prepare latest base-model predictions for meta-model
<font color="#12A80D"> <b>• Iterates through the most recent prediction CSVs for each lookback period, extracts the final row (latest prediction), and stores the predicted close price under keys like Pred_30</br>• Combines all extracted predictions into a single DataFrame (X_input) representing the current input vector for the meta-model</br>• Loads the saved feature_cols.joblib to ensure the DataFrame columns are ordered exactly as the meta-model was trained, filling any missing values with NaN</br>• Prints the resulting input DataFrame so that the user can verify the values before running the ensemble prediction</b> </font>

In [None]:
# Dictionary to store latest predictions keyed by lookback period
latest_preds = {}

for path in csv_files:
    # Extract lookback period from filename (e.g., "Lookback30" → "30")
    lookback = [s for s in os.path.basename(path).split("_") if "Lookback" in s][0].replace("Lookback", "")
    # Read the CSV for this lookback model
    df = pd.read_csv(path)
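    # Get the last row, assumed to be the most recent prediction
    last_row = df.iloc[-1]
    # Extract predicted and actual close prices.
    # Note: actual_close is overwritten on every iteration, so the value used
    # later for the summary and the log comes from the last file in csv_files.
    pred_close = last_row["Predicted_Close"]
    actual_close = last_row["Actual_Close"]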
    # Store the prediction under a column name format like "Pred_30"
    latest_preds[f"Pred_{lookback}"] = pred_close

# Create DataFrame for meta-model input using the collected predictions
X_input = pd.DataFrame([latest_preds])

# ---------------------------
# Load feature columns and reindex
# ---------------------------

# Path where meta-model and related files are stored
META_MODEL_SAVE_FOLDER = os.path.dirname(META_MODEL_PATH)
# Load the saved feature column order
feature_cols = joblib.load(os.path.join(META_MODEL_SAVE_FOLDER, "feature_cols.joblib"))

# Reindex X_input to ensure correct column order and fill any missing columns with NaN
X_input = X_input.reindex(columns=feature_cols)

# Display latest predictions from the base models
print("\nLatest predictions from base models:")
print(X_input)


Latest predictions from base models:
    Pred_365   Pred_270    Pred_180   Pred_90     Pred_60   Pred_30  \
0  182.88945  130.47496  113.420494  141.9292  127.298706  150.2968   

     Pred_14     Pred_1  
0  144.50777  146.35782  
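
Because `reindex` fills any column that has no matching prediction with NaN, a missing base-model file surfaces only later, when the meta-model is asked to predict (scikit-learn linear models, for instance, reject NaN input). A small check before prediction gives a clearer message. This is a sketch under that assumption; `missing_features` is a hypothetical helper, and the demo uses a three-column subset with two values copied from the output above.

```python
def missing_features(feature_cols, latest_preds):
    """List expected features that are absent or NaN in latest_preds."""
    missing = []
    for col in feature_cols:
        value = latest_preds.get(col)
        # NaN is the only value that is not equal to itself
        if value is None or value != value:
            missing.append(col)
    return missing


feature_cols = ["Pred_365", "Pred_30", "Pred_1"]                # illustrative subset
latest_preds = {"Pred_365": 182.88945, "Pred_1": 146.35782}     # Pred_30 left out on purpose

gaps = missing_features(feature_cols, latest_preds)
if gaps:
    print(f"Missing base-model predictions: {gaps}")
```

In the notebook, raising a `ValueError` when `gaps` is non-empty would stop the run before the ensemble prediction is attempted.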


## Fetch latest Nvidia news articles from past 5 days
<font color="#12A80D"> <b>• Uses the NewsAPI client to query for English-language news articles mentioning "Nvidia" within the last 5 days</br>• Formats the from_param and to parameters as YYYY-MM-DD for the API call, restricting the result set to a maximum of 30 most recent articles</br>• Parses the returned JSON, extracting only the date, title, and description for each article into a simplified list of dictionaries</br>• Returns this list so it can be processed, cached, or used for sentiment analysis by downstream functions</b> </font>

In [None]:
def fetch_today_articles():
    # Import datetime utilities locally so the function is self-contained
    from datetime import datetime, timedelta, timezone

    # Get current UTC date and time (datetime.utcnow() is deprecated as of Python 3.12)
    today = datetime.now(timezone.utc)
    # Format today's date as YYYY-MM-DD for API query
    today_str = today.strftime("%Y-%m-%d")
    # Start of the search window: the date 5 days before today
    days_str = (today - timedelta(days=5)).strftime("%Y-%m-%d")

    # Fetch articles from NewsAPI:
    # - Query for Nvidia
    # - Restrict to English-language articles
    # - Sort by publication date (newest first)
    # - Search only within the last 5 days
    # - Limit results to 30 articles
    all_articles = newsapi.get_everything(
        q="Nvidia",
        language="en",
        sort_by="publishedAt",
        from_param=days_str,
        to=today_str,
        page_size=30
    )

    articles = all_articles.get("articles", [])

    # Convert articles to simpler dicts
    simplified = []
    for art in articles:
        simplified.append({
            # Use the article's own publication date (ISO 8601 publishedAt),
            # falling back to the fetch date if the field is missing
            "date": (art.get("publishedAt") or today_str)[:10],
            # NewsAPI may return null for these fields, so coerce to ""
            "title": art.get("title") or "",
            "description": art.get("description") or ""
        })

    return simplified

## Cache and load recent Nvidia news articles
<font color="#12A80D"> <b>• Ensures the news cache directory exists and attempts to load previously cached news data from a JSON file</br>• If today's date is missing from the cache, calls <code>fetch_today_articles()</code> to retrieve the latest Nvidia news from NewsAPI and adds it to the cache</br>• Maintains only the most recent 5 days of news data to limit file size and improve lookup speed</br>• Writes the updated cache back to disk, then flattens all stored days into a single consolidated article list for downstream sentiment analysis</b> </font>

In [None]:
# Ensure the cache directory exists
os.makedirs(NEWS_CACHE_DIR, exist_ok=True)

# Load an existing cache file if it exists, otherwise start fresh
if os.path.exists(NEWS_CACHE_PATH):
    with open(NEWS_CACHE_PATH, "r") as f:
        news_cache = json.load(f)
else:
    news_cache = {}

# Get a sorted list of all cached dates (oldest → newest)
existing_dates = sorted(news_cache.keys())

# Today's date string (local time) for indexing in the cache
today_str = datetime.now().strftime("%Y-%m-%d")

# If today's date is not in the cache, fetch new articles
if today_str not in existing_dates:
    print("Fetching today's news articles...")
    todays_articles = fetch_today_articles() # Pull from NewsAPI
    news_cache[today_str] = todays_articles  # Add to cache

    # Keep cache size small: only store the latest 5 days of articles
    all_dates_sorted = sorted(news_cache.keys(), reverse=True) # newest → oldest
    trimmed_dates = all_dates_sorted[:5]
    news_cache = {date: news_cache[date] for date in trimmed_dates}

    # Save updated cache back to disk
    with open(NEWS_CACHE_PATH, "w") as f:
        json.dump(news_cache, f, indent=2)

else:
    print("Today's articles already in cache.")


# Flatten the cache: combine all articles across stored days into a single list
all_articles = []
for date in sorted(news_cache.keys()):
    all_articles.extend(news_cache[date])

Fetching today's news articles...
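
Each daily fetch covers a 5-day window and the cache keeps up to five fetches, so the same article can be stored under several date keys and would then be counted more than once in the sentiment average. The sketch below removes repeats from the flattened list by comparing normalized title and description; `dedupe_articles` is a hypothetical helper and the sample articles are made up.

```python
def dedupe_articles(articles):
    """Drop repeated articles, keeping the first occurrence of each title/description pair."""
    seen = set()
    unique = []
    for art in articles:
        # Normalize case and surrounding whitespace so trivial variants match
        key = (
            (art.get("title") or "").strip().lower(),
            (art.get("description") or "").strip().lower(),
        )
        if key in seen:
            continue
        seen.add(key)
        unique.append(art)
    return unique


sample = [
    {"date": "2025-08-12", "title": "Nvidia ships new GPU", "description": "Details inside."},
    {"date": "2025-08-13", "title": "Nvidia ships new GPU ", "description": "Details inside."},
    {"date": "2025-08-13", "title": "Chip demand rises", "description": None},
]
unique = dedupe_articles(sample)
print(len(sample), "->", len(unique))
```

Applied to the notebook, this would be `all_articles = dedupe_articles(all_articles)` right after the flattening loop.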


## Compute FinBERT sentiment over last 5 days of news
<font color="#12A80D"> <b>• Combines each article’s title and description, skipping empty entries</br>• Tokenizes text with the FinBERT tokenizer and runs inference with gradients disabled for efficiency</br>• Applies softmax to model logits to obtain Negative, Neutral, and Positive class probabilities</br>• Aggregates probabilities across all articles and computes average sentiment scores</br>• Defaults to a neutral profile (Positive=0.0, Neutral=1.0, Negative=0.0) if no articles are available</br>• Maps averaged scores to a qualitative confidence label: STRONG if avg_positive ≥ 0.5, WEAK if avg_negative ≥ 0.5, otherwise NEUTRAL</b> </font>

In [None]:
print("\nComputing sentiment over last 5 days of news...")

# Lists to store probability scores for each sentiment class
all_pos = []
all_neu = []
all_neg = []

# Look up class indices from the model config rather than hardcoding them:
# the ProsusAI/finbert config orders its labels positive, negative, neutral
label_idx = {label.lower(): idx for idx, label in model.config.id2label.items()}

# Loop over every cached news article
for article in all_articles:
    # Combine the non-empty title and description into one text string
    parts = [(article.get("title") or "").strip(), (article.get("description") or "").strip()]
    text = ". ".join(p for p in parts if p)
    if not text: # Skip empty entries
        continue

    # Tokenize text for FinBERT (returns PyTorch tensors, truncates if too long)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)

    # Run model inference without gradient tracking
    with torch.no_grad():
        outputs = model(**inputs)
        # Convert raw logits to probabilities via softmax
        probs = torch.nn.functional.softmax(outputs.logits, dim=-1)[0]

    # Append each class probability to the correct list
    all_neg.append(probs[label_idx["negative"]].item()) # Negative sentiment score
    all_neu.append(probs[label_idx["neutral"]].item()) # Neutral sentiment score
    all_pos.append(probs[label_idx["positive"]].item()) # Positive sentiment score

# If we computed at least one sentiment, calculate averages
if all_pos:
    avg_positive = sum(all_pos) / len(all_pos)
    avg_neutral = sum(all_neu) / len(all_neu)
    avg_negative = sum(all_neg) / len(all_neg)
else:
    # No articles → default to neutral sentiment
    avg_positive = 0.0
    avg_neutral = 1.0
    avg_negative = 0.0

# Map average sentiment scores to a qualitative confidence label
if avg_positive >= 0.5:
    confidence = "STRONG" # Market sentiment strongly positive
elif avg_negative >= 0.5:
    confidence = "WEAK" # Market sentiment strongly negative
else:
    confidence = "NEUTRAL" # No strong directional sentiment


Computing sentiment over last 5 days of news...
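
The averaging and thresholding steps can be isolated into one small function, which makes the rule easy to test without loading the model. The sketch below mirrors the logic of the cell above; the function name `summarize_sentiment` and the per-article scores are made up for illustration.

```python
def summarize_sentiment(scores, threshold=0.5):
    """Average per-article (positive, neutral, negative) scores and map them to a label."""
    if not scores:
        # No articles: fall back to a fully neutral profile
        avg_pos, avg_neu, avg_neg = 0.0, 1.0, 0.0
    else:
        n = len(scores)
        avg_pos = sum(s[0] for s in scores) / n
        avg_neu = sum(s[1] for s in scores) / n
        avg_neg = sum(s[2] for s in scores) / n

    if avg_pos >= threshold:
        label = "STRONG"
    elif avg_neg >= threshold:
        label = "WEAK"
    else:
        label = "NEUTRAL"
    return avg_pos, avg_neu, avg_neg, label


# Two made-up articles, each as (positive, neutral, negative)
scores = [(0.75, 0.25, 0.0), (0.5, 0.25, 0.25)]
summary = summarize_sentiment(scores)
print(summary)
```

With three classes sharing the probability mass, an average has to be quite one-sided to reach 0.5, so a mixed batch of headlines will usually be labeled NEUTRAL.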


## Load Ridge meta-model and generate ensemble prediction
<font color="#12A80D"> <b>• Loads the pre-trained Ridge regression meta-model from disk using joblib</br>• Uses the latest base-model predictions in <code>X_input</code> (ensuring correct feature ordering) as input</br>• Generates a single ensemble prediction representing the forecasted next-day stock closing price</br>• This meta-model combines multiple lookback models to improve predictive accuracy over individual models</b> </font>

In [None]:
# ---------------------------
# Load Ridge meta-model
# ---------------------------

# Load the trained Ridge regression model from disk
# This meta-model was trained to combine predictions from all base models
meta_model = joblib.load(META_MODEL_PATH)

# Predict ensemble output
# X_input contains the latest predictions from each base model (properly ordered)
# The meta-model predicts the final next-day stock closing price
ensemble_pred = meta_model.predict(X_input.values)[0]
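
Assuming the saved estimator is a standard linear Ridge regressor (for example scikit-learn's `Ridge`; the notebook itself only shows that it was saved with joblib), the prediction is a weighted sum of the base-model outputs plus an intercept:

```latex
\hat{y}_{\text{next}} = b + \sum_{k \in K} w_k \,\mathrm{Pred}_k,
\qquad K = \{365,\ 270,\ 180,\ 90,\ 60,\ 30,\ 14,\ 1\}
```

Here $\mathrm{Pred}_k$ is the latest prediction from the model with a $k$-day lookback, $w_k$ is its learned weight, and $b$ is the intercept. Ridge fits these by least squares with an L2 penalty on the weights, which keeps any single base model from dominating when their predictions are highly correlated. Under the scikit-learn assumption, the fitted values would be available as `meta_model.coef_` and `meta_model.intercept_`.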

## Display ensemble prediction summary
<font color="#12A80D"> <b>Calculates the percentage change between the ensemble-predicted next-day closing price and today’s actual closing price</br>Prints a clear prediction report showing:</br>• Forecasted next-day close</br>• Today’s actual close</br>• Percentage change</br>• Sentiment-based confidence label</br>• Average positive, neutral, and negative sentiment scores from the past five days of news</br>Provides an interpretable snapshot of the prediction and its sentiment context for decision-making</b> </font>

In [None]:
# ---------------------------
# Display result
# ---------------------------

# Calculate the percentage difference between the predicted next-day close
# and today's actual close price
delta_pct = (ensemble_pred - actual_close) / actual_close * 100

# Print prediction summary
print("\n--------------------------------------------")
print(f"Predicted Close for Next Day: ${ensemble_pred:.2f}")
print(f"Today's Close: ${actual_close:.2f}")
print(f"Predicted Change: {delta_pct:.2f}%")
print(f"Confidence: {confidence}")
print(f"Sentiment: Pos={avg_positive:.2f}, Neu={avg_neutral:.2f}, Neg={avg_negative:.2f}")
print("--------------------------------------------")


--------------------------------------------
Predicted Close for Next Day: $176.99
Today's Close: $182.70
Predicted Change: -3.12%
Confidence: NEUTRAL
Sentiment: Pos=0.35, Neu=0.35, Neg=0.30
--------------------------------------------


## Log ensemble prediction with metadata to CSV
<font color="#12A80D"> <b>• Captures current date and timestamp for audit fields (<code>Date_Predicted</code>, <code>Timestamp</code>)</br>• Assembles a one-row log entry including ensemble prediction, last close, percent change, confidence label, and average FinBERT sentiment scores</br>• Adds latest base-model predictions (<code>Pred_365</code>, <code>Pred_270</code>, …) to the same entry</br>• Checks if the log file exists and reorders columns to match existing headers for consistency</br>• Appends the entry to <code>ensemble_prediction_log.csv</code> if present, otherwise creates a new file with headers</br>• Prints a status message indicating whether the log was appended or created</b> </font>

In [None]:
# Current date and time
now = datetime.now()
today = now.strftime("%Y-%m-%d") # Just the date for "Date_Predicted" column
timestamp = now.strftime("%Y-%m-%d %H:%M:%S") # Full timestamp for detailed logging

# Prepare log data dictionary
log_data = {
    "Date_Predicted": [today], # Prediction date
    "Timestamp": [timestamp], # Exact time of prediction
    "Ensemble_Predicted_Close": [ensemble_pred], # Meta-model predicted close price
    "Last_Close": [actual_close],  # Actual latest close price
    "Predicted_Change_Percent": [delta_pct],   # % change from last close to predicted close
    "Confidence": [confidence], # Sentiment-based confidence label
    "Avg_Positive": [avg_positive], # Avg FinBERT positive sentiment score
    "Avg_Neutral": [avg_neutral],  # Avg FinBERT neutral sentiment score
    "Avg_Negative": [avg_negative]  # Avg FinBERT negative sentiment score
}

# Add base model latest predictions to the log (Pred_365, Pred_270, etc.)
log_data.update({k: [v] for k, v in latest_preds.items()})

# Convert log_data to a DataFrame (one row for this run)
log_entry = pd.DataFrame(log_data)

# Determine if log file already exists
file_exists = os.path.exists(LOG_PATH)

# If appending, ensure consistent column order
if file_exists:
    # Read just headers
    headers = pd.read_csv(LOG_PATH, nrows=0).columns.tolist()
    # Reorder columns to match
    log_entry = log_entry[headers]

# Append to log file if it exists, else create a new file with headers
log_entry.to_csv(LOG_PATH, mode="a", header=not file_exists, index=False)


# Status message
if file_exists:
    print(f"\nPrediction appended to existing log: {LOG_PATH}")
else:
    print(f"\nNew log file created: {LOG_PATH}")


New log file created: /content/drive/My Drive/Nvidia_Stock_Market_History/Training/Meta_Model_Trained/ensemble_prediction_log.csv
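
The logging cell reorders the new row to the existing header, but it does not report when the set of columns has changed between runs (for example, after adding or removing a lookback model). The sketch below illustrates an append that checks this explicitly. It uses the standard-library `csv` module instead of pandas so it runs anywhere; `append_log_row` and the demo values are hypothetical.

```python
import csv
import os
import tempfile


def append_log_row(log_path, row):
    """Append one dict as a CSV row, reusing the existing header order if the file exists."""
    file_exists = os.path.exists(log_path) and os.path.getsize(log_path) > 0
    if file_exists:
        with open(log_path, newline="") as f:
            headers = next(csv.reader(f))
        if set(headers) != set(row):
            diff = sorted(set(headers) ^ set(row))
            raise ValueError(f"Log columns do not match this run: {diff}")
    else:
        headers = list(row)

    with open(log_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=headers)
        if not file_exists:
            writer.writeheader()
        writer.writerow(row)


# Demo against a temporary file with made-up values
demo_log = os.path.join(tempfile.mkdtemp(), "demo_log.csv")
append_log_row(demo_log, {"Date_Predicted": "2025-08-13", "Confidence": "NEUTRAL"})
append_log_row(demo_log, {"Confidence": "STRONG", "Date_Predicted": "2025-08-14"})

with open(demo_log, newline="") as f:
    rows = list(csv.reader(f))
print(rows)
```

The second call passes its keys in a different order, yet the row is written in the header's order. A row with a different set of keys raises a `ValueError` instead of silently misaligning the file.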
