# Stage 16 Homework Starter

This notebook is a starting point for polishing your final repo and lifecycle mapping.

## Checklist Template
- Add checklist elements, as in the examples below, so that you cover everything you want to accomplish.
- Update this checklist as you finalize your repo.

In [1]:
import os
import time
import requests
import pandas as pd
from dotenv import load_dotenv
from datetime import datetime
import yfinance as yf

def ingest_stock_data(symbol="CVNA", start_date="2023-01-01", resolution="D", output_dir="data/raw"):
    load_dotenv()
    api_key = os.getenv("FINNHUB_API_KEY")
    
    from_ts = int(time.mktime(time.strptime(start_date, "%Y-%m-%d")))
    to_ts = int(time.time())
    
    url = "https://finnhub.io/api/v1/stock/candle"
    params = {
        "symbol": symbol,
        "resolution": resolution,
        "from": from_ts,
        "to": to_ts,
        "token": api_key
    }
    
    print(f"🔄 Attempting Finnhub API for {symbol}...")
    try:
        # A timeout keeps a stalled request from blocking the yfinance fallback.
        response = requests.get(url, params=params, timeout=10)
        data = response.json()
        if data.get("s") == "ok":
            df = pd.DataFrame({
                "timestamp": pd.to_datetime(data["t"], unit="s"),
                "open": data["o"],
                "high": data["h"],
                "low": data["l"],
                "close": data["c"],
                "volume": data["v"]
            })
            source = "finnhub"
        else:
            print(f"⚠️ Finnhub failed: {data}")
            df = None
    except Exception as e:
        print(f"❌ Finnhub exception: {e}")
        df = None
    
    if df is None:
        print(f"🔁 Falling back to yfinance for {symbol}...")
        try:
            ticker = yf.Ticker(symbol)
            df = ticker.history(start=start_date).reset_index()
            df = df.rename(columns={
                "Date": "timestamp",
                "Open": "open",
                "High": "high",
                "Low": "low",
                "Close": "close",
                "Volume": "volume"
            })
            source = "yfinance"
        except Exception as e:
            print(f"❌ yfinance exception: {e}")
            return None
    
    if df.empty or df.isna().sum().sum() > 0:
        print("⚠️ DataFrame is empty or contains NaNs.")
        return None
    
    os.makedirs(output_dir, exist_ok=True)
    timestamp = datetime.now().strftime("%Y%m%d-%H%M")
    filename = f"{output_dir}/api_{source}_{symbol.lower()}_{timestamp}.csv"
    df.to_csv(filename, index=False)
    print(f"✅ Data saved to {filename}")
    return df


In [2]:
checklist = {
    "repo_clean": False,
    "repo_complete": False,
    "readme_complete": False,
    "lifecycle_map": False,
    "summary_doc": False,
    "framework_guide_table": False
}
checklist
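A small helper can report which checklist items are still open. This is a minimal sketch: `pending_items` is a hypothetical name that is not part of the starter, and the dictionary is repeated here only so the block runs on its own.

```python
# Same keys as the checklist cell; repeated so this block is self-contained.
checklist = {
    "repo_clean": False,
    "repo_complete": False,
    "readme_complete": False,
    "lifecycle_map": False,
    "summary_doc": False,
    "framework_guide_table": False,
}


def pending_items(items):
    """Return the names of checklist entries not yet marked True."""
    return [name for name, done in items.items() if not done]


# Flip an entry to True as you finish it, then re-check what remains.
checklist["repo_clean"] = True
print(pending_items(checklist))
```

Re-running the last two lines after each update gives a quick view of what is left before submission.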

# Reflection Prompts

## What stage of the lifecycle was hardest for you, and why?

The hardest stage of the lifecycle was **Data Acquisition / Ingestion**. Finnhub requests failed on invalid API keys, and web data sources changed (for example, Yahoo Finance scraping), so the pipeline needed multiple fallback solutions and repeated troubleshooting. Keeping the data complete and consistent through these disruptions took significant effort.
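The fallback idea described here can be sketched independently of any particular API. The example below is an illustration under stated assumptions, not the repo's implementation: `fetch_with_fallback`, `broken_primary`, and `working_backup` are hypothetical names, and the two fetchers are offline stand-ins for real data sources.

```python
def fetch_with_fallback(fetchers, symbol):
    """Try each (name, fetch) pair in order.

    Returns (name, data) from the first fetcher that neither raises
    nor returns None, or (None, None) if every fetcher fails.
    """
    for name, fetch in fetchers:
        try:
            data = fetch(symbol)
        except Exception as exc:
            print(f"{name} failed: {exc}")
            continue
        if data is not None:
            return name, data
    return None, None


# Hypothetical stand-in fetchers so the sketch runs without network access.
def broken_primary(symbol):
    raise RuntimeError("invalid API key")


def working_backup(symbol):
    return [{"symbol": symbol, "close": 1.0}]


source, rows = fetch_with_fallback(
    [("primary", broken_primary), ("backup", working_backup)], "CVNA"
)
print(source, rows)
```

Keeping the source list as data makes it straightforward to add a third provider without touching the retry logic.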

## Which part of your repo is most reusable in a future project?

The most reusable parts of the repo are the **data ingestion and preprocessing pipelines**, especially the Python functions that fetch stock data from multiple sources and generate synthetic social media datasets aligned with stock market trading days. Because these scripts are modular and parameterized, they can be adapted to other ticker symbols or social media topics.
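One way to make such a pipeline easier to reuse is to pull small pieces into their own parameterized functions. The sketch below isolates the file-naming scheme used by `ingest_stock_data`; `raw_output_path` and its `now` parameter are hypothetical additions for illustration, not functions in the repo.

```python
from datetime import datetime
from pathlib import Path


def raw_output_path(source, symbol, output_dir="data/raw", now=None):
    """Build the CSV path for a raw download.

    Follows the api_<source>_<symbol>_<YYYYMMDD-HHMM>.csv pattern.
    Passing `now` explicitly makes the result reproducible in tests.
    """
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d-%H%M")
    return Path(output_dir) / f"api_{source}_{symbol.lower()}_{stamp}.csv"


path = raw_output_path("yfinance", "CVNA", now=datetime(2024, 1, 2, 9, 30))
print(path.as_posix())  # data/raw/api_yfinance_cvna_20240102-0930.csv
```

Injecting the timestamp rather than calling `datetime.now()` inside the format string is a design choice that keeps the helper deterministic when tested.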

## If a teammate had to pick up your repo tomorrow, what would help them most?

The most helpful thing for a teammate would be **comprehensive documentation**: a detailed README with the lifecycle mapping, plus clear docstrings on every function across all scripts. A clean, consistent folder structure and example notebooks that demonstrate key workflows would also make the project easier to understand and extend.
