# 📊 **Stock Market Data Collection**
### *Automated Retrieval of Historical Stock Data for Market Analysis*
---
## **📌 Project Overview**
This notebook **automatically retrieves and processes historical stock market data** for four selected technology companies. Subsequent analyses use this data for **exploratory data analysis (EDA), risk assessment, correlation studies, and predictive modeling** with deep learning.

### **🔹 Selected Stocks:**
The dataset consists of daily trading data for the following **technology giants**:
- 📈 **Apple Inc. (`AAPL`)**
- 🌍 **Alphabet Inc. (`GOOG`)**
- 💻 **Microsoft Corporation (`MSFT`)**
- 🛍️ **Amazon.com Inc. (`AMZN`)**

### **🔍 Key Features of the Dataset:**
- **Time Range:** Last **12 months** of stock price data
- **Granularity:** Daily trading data
- **Data Fields:**
  - **📌 Open, High, Low, Close prices**
  - **📌 Trading Volume**
  - **📌 Date and Ticker for each row**
  
The retrieved dataset will be **cleaned, structured, and saved** for further analysis.
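
As a rough illustration of the "structured" step, the sketch below reshapes a wide table whose columns are (ticker, field) pairs into one row per date and ticker. The two-ticker frame is synthetic with made-up values, the tickers `AAA` and `BBB` are placeholders, and `wide_to_long` is a hypothetical helper rather than part of yfinance or pandas.

```python
import pandas as pd

# Synthetic wide table with made-up values: columns are (Ticker, Price field) pairs
dates = pd.to_datetime(["2024-01-02", "2024-01-03"])
columns = pd.MultiIndex.from_product(
    [["AAA", "BBB"], ["Open", "High", "Low", "Close", "Volume"]],
    names=["Ticker", "Price"],
)
wide = pd.DataFrame(
    [
        [10.0, 11.0, 9.5, 10.5, 1000, 20.0, 21.0, 19.0, 20.5, 2000],
        [10.5, 12.0, 10.0, 11.5, 1500, 20.5, 22.0, 20.0, 21.5, 2500],
    ],
    index=dates,
    columns=columns,
)


def wide_to_long(wide: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: one row per (Date, Ticker), columns selected by name."""
    frames = []
    for ticker in wide.columns.get_level_values(0).unique():
        part = wide[ticker].copy()  # columns are now the price fields
        part.insert(0, "Ticker", ticker)
        frames.append(part)
    long = pd.concat(frames).rename_axis("Date").reset_index()
    # Selecting by name keeps the labels correct whatever the field order is
    long = long[["Date", "Ticker", "Open", "High", "Low", "Close", "Volume"]]
    long = long.astype({"Volume": "int64"})
    return long.sort_values(["Date", "Ticker"]).reset_index(drop=True)


long_df = wide_to_long(wide)
print(long_df)
```

Selecting the fields by name, instead of assigning labels by position, is a deliberate choice here: it avoids mislabeling prices if the field order changes.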

---
## **💾 Output**
The cleaned dataset is **saved locally as a CSV file** (`stock_data.csv`) for later analysis, visualization, and model training.
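
One possible way to keep the saved file portable across machines is to build the path with `pathlib` relative to a base directory. This is a minimal sketch, not the notebook's own method: `save_dataset` is a hypothetical helper, the temporary directory stands in for a real project folder, and the two sample rows reuse closing prices and volumes from the output shown later in this notebook.

```python
from pathlib import Path
import tempfile

import pandas as pd


def save_dataset(df: pd.DataFrame, base_dir: Path, filename: str = "stock_data.csv") -> Path:
    """Hypothetical helper: write df to <base_dir>/Data/<filename> and return the path."""
    data_dir = Path(base_dir) / "Data"
    data_dir.mkdir(parents=True, exist_ok=True)  # create "Data" if it is missing
    out_path = data_dir / filename
    df.to_csv(out_path, index=False)
    return out_path


# Two sample rows taken from the notebook's displayed output
sample = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2024-03-05", "2024-03-05"]),
        "Ticker": ["AAPL", "MSFT"],
        "Close": [169.320511, 399.599182],
        "Volume": [95132400, 26919200],
    }
)

with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_dataset(sample, Path(tmp))
    # parse_dates restores the Date column as datetimes instead of strings
    reloaded = pd.read_csv(saved_path, parse_dates=["Date"])

print(reloaded)
```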


In [None]:
# Import required libraries
import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import os
from datetime import datetime
from IPython.display import display


In [5]:
# Define visualization styles
sns.set_style('whitegrid')
plt.style.use("fivethirtyeight")

# Define stock tickers
tech_list = ['AAPL', 'GOOG', 'MSFT', 'AMZN']

# Set time range for fetching data (last 12 months)
end = datetime.now()
# DateOffset handles Feb 29 safely; datetime(end.year - 1, ...) raises ValueError on a leap day
start = end - pd.DateOffset(years=1)

# Fetch stock data using yfinance
df = yf.download(tech_list, start=start, end=end, group_by='Ticker')

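# Convert multi-indexed columns to a long-format DataFrame (one row per date and ticker)
df = df.stack(level=0).rename_axis(['Date', 'Ticker']).reset_index()

# Select columns by name rather than relabeling by position: the order (and number)
# of stacked price columns can differ across yfinance/pandas versions
df = df[['Date', 'Ticker', 'Open', 'High', 'Low', 'Close', 'Volume']]

# Cast volume to integers so it displays without scientific notation
df["Volume"] = df["Volume"].astype(int)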

# Sort by date and ticker for clarity
df = df.sort_values(by=['Date', 'Ticker']).reset_index(drop=True)

# Display cleaned DataFrame
display(df)

# Define the absolute path for saving the file
save_path = "/Users/adityaiyer/Desktop/APPLE_SMA/Data/stock_data.csv"

# Ensure the "Data" directory exists before saving
os.makedirs(os.path.dirname(save_path), exist_ok=True)

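# Save the dataset to the "Data" directory
df.to_csv(save_path, index=False)

# Confirm where the file was written
print(f"✅ Stock data successfully collected and saved at: {save_path}")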

[*********************100%***********************]  4 of 4 completed


| | Date | Ticker | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|---|---|
| 0 | 2024-03-05 | AAPL | 169.957503 | 171.231486 | 168.822861 | 169.320511 | 95132400 |
| 1 | 2024-03-05 | AMZN | 176.929993 | 176.929993 | 173.300003 | 174.119995 | 37228300 |
| 2 | 2024-03-05 | GOOG | 132.264771 | 133.540187 | 131.079029 | 133.301041 | 28447600 |
| 3 | 2024-03-05 | MSFT | 410.823486 | 411.111297 | 397.604432 | 399.599182 | 26919200 |
| 4 | 2024-03-06 | AAPL | 170.256065 | 170.435227 | 167.887245 | 168.325180 | 68587700 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 995 | 2025-03-03 | MSFT | 398.820007 | 398.820007 | 386.160004 | 388.489990 | 23007700 |
| 996 | 2025-03-04 | AAPL | 237.710007 | 240.070007 | 234.679993 | 235.929993 | 53724800 |
| 997 | 2025-03-04 | AMZN | 200.110001 | 206.800003 | 197.429993 | 203.800003 | 60714200 |
| 998 | 2025-03-04 | GOOG | 167.940002 | 175.164993 | 167.539993 | 172.610001 | 30667900 |


✅ Stock data successfully collected and saved at: /Users/adityaiyer/Desktop/APPLE_SMA/Data/stock_data.csv
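
Before the saved file feeds later analysis, a quick sanity check can catch mislabeled or duplicated rows. The sketch below is one way to do that, under the assumption that each row's High should be the largest and Low the smallest of its four prices. `check_ohlcv` is a hypothetical helper, and the inline CSV holds the first four rows of the output above so that the example runs without network access or the saved file.

```python
import io

import pandas as pd

# First four rows of the displayed output, copied verbatim
sample_csv = """Date,Ticker,Open,High,Low,Close,Volume
2024-03-05,AAPL,169.957503,171.231486,168.822861,169.320511,95132400
2024-03-05,AMZN,176.929993,176.929993,173.300003,174.119995,37228300
2024-03-05,GOOG,132.264771,133.540187,131.079029,133.301041,28447600
2024-03-05,MSFT,410.823486,411.111297,397.604432,399.599182,26919200
"""


def check_ohlcv(frame: pd.DataFrame) -> list:
    """Hypothetical helper: return a list of problems found (empty if none)."""
    required = ["Date", "Ticker", "Open", "High", "Low", "Close", "Volume"]
    missing = [col for col in required if col not in frame.columns]
    if missing:
        return [f"missing columns: {missing}"]

    problems = []
    if frame.duplicated(subset=["Date", "Ticker"]).any():
        problems.append("duplicate (Date, Ticker) rows")
    # High should be at least as large as every other price in the row
    if (frame["High"] < frame[["Open", "Low", "Close"]].max(axis=1)).any():
        problems.append("High is below another price in some row")
    # Low should be no larger than any other price in the row
    if (frame["Low"] > frame[["Open", "High", "Close"]].min(axis=1)).any():
        problems.append("Low is above another price in some row")
    if (frame["Volume"] < 0).any():
        problems.append("negative Volume")
    return problems


df_check = pd.read_csv(io.StringIO(sample_csv), parse_dates=["Date"])
problems = check_ohlcv(df_check)
print(problems if problems else "No problems found")
```

To check the real file, the same function could be applied to `pd.read_csv(save_path, parse_dates=["Date"])`.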
