# Stock Price Analysis: A Data-Driven Approach to Understanding Market Behavior

## Project Summary
This project demonstrates how data analysis can be used to evaluate stock market trends, compare performance across companies, and surface insights that can inform investment decisions. The analysis was conducted in Python on historical daily price data for four publicly traded technology companies (AAPL, AMZN, GOOG, and MSFT).



## Business Objective
To help stakeholders make better-informed investment or portfolio decisions by analyzing the historical performance, volatility, and risk-return profiles of selected stocks.

## Key Business Questions Addressed
* Which stocks delivered the best returns over time?
* How volatile are the selected companies, and what are the associated risks?
* What are the comparative trends and patterns in stock price behavior?
* Can we identify consistent performers or risky outliers?


## Approach

* Data Collection: Retrieved historical stock price data using the yfinance API.
* Data Cleaning & Preparation: Handled missing values, aligned time series, and calculated financial metrics.
* Comparative Analysis:
  * Analyzed daily and cumulative returns across stocks.
  * Identified volatility patterns using standard deviation and rolling averages.
  * Visualized trends to reveal stock behavior over time.
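
The return and volatility metrics named in the approach can be sketched as follows. This is a minimal illustration on made-up prices, not the notebook's own computation: the tickers `AAA` and `BBB`, their values, and the 3-day rolling window are all assumptions chosen to keep the arithmetic easy to follow.

```python
import pandas as pd

# Hypothetical closing prices for two made-up tickers (illustrative values only).
prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 99.0, 108.9], "BBB": [50.0, 50.0, 55.0, 49.5]},
    index=pd.date_range("2020-01-01", periods=4, freq="D"),
)

# Daily return: fractional change from the previous day's close.
daily_returns = prices.pct_change()

# Cumulative return: compound the daily returns from the first day onward.
cumulative_returns = (1 + daily_returns.fillna(0)).cumprod() - 1

# Volatility: standard deviation of daily returns over the whole period,
# plus a rolling version over a short (assumed) 3-day window.
volatility = daily_returns.std()
rolling_volatility = daily_returns.rolling(window=3).std()

print(cumulative_returns.iloc[-1])
print(volatility)
```

The last row of `cumulative_returns` equals the final price divided by the first price, minus one, which is a quick sanity check on the compounding step.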


# 1. Data Collection

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import glob

In [None]:
read_list = glob.glob(r'/kaggle/input/stock-5yrs-analysis/*.csv')

In [None]:
read_list[:5]

In [None]:
company_list = [
    '/kaggle/input/stock-5yrs-analysis/AAPL_data.csv',
    '/kaggle/input/stock-5yrs-analysis/AMZN_data.csv',
    '/kaggle/input/stock-5yrs-analysis/GOOG_data.csv',
    '/kaggle/input/stock-5yrs-analysis/MSFT_data.csv'
]

In [None]:
import warnings

# Silence library warnings to keep the notebook output readable.
warnings.filterwarnings('ignore')

In [None]:
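# Read each company's CSV and stack the results into one long DataFrame.
# ignore_index=True gives the combined frame a fresh 0..n-1 index instead
# of repeating each file's own row numbers.
all_data = pd.concat(
    [pd.read_csv(file) for file in company_list],
    ignore_index=True
)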

In [None]:
all_data.shape

In [None]:
all_data.head(6)

In [None]:
all_data['Name'].unique()

# 2. Analysing the change in stock price over time

In [None]:
all_data.isnull().sum()

In [None]:
all_data.dtypes

In [None]:
all_data['date'] = pd.to_datetime(all_data['date'])

In [None]:
all_data['date']

In [None]:
tech_list = all_data['Name'].unique()

In [None]:
tech_list

In [None]:
plt.figure(figsize=(20,10))

for index, company in enumerate(tech_list, 1): 
    plt.subplot(2,2,index)
    filter1 = all_data['Name']==company
    df = all_data[filter1]
    plt.plot(df['date'], df['close'])
    plt.title(company)

# 3. Moving Average of Stocks

In [None]:
all_data.head(5)

In [None]:
all_data['close'].rolling(window=10).mean().head(14)

In [None]:
new_data = all_data.copy()

In [None]:
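ma_day = [10, 20, 50]

# Compute each moving average per company so that a window never spans
# the boundary between two companies in the stacked frame.
for ma in ma_day:
    new_data['close'+str(ma)] = (
        new_data.groupby('Name')['close']
                .transform(lambda s: s.rolling(ma).mean())
    )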
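
Because the combined frame stacks several companies on top of each other, a rolling window taken over the whole `close` column mixes the last rows of one company with the first rows of the next unless it is computed per `Name`. The sketch below shows the difference on made-up data; the tickers, prices, and the window of 2 are assumptions chosen only to make the effect visible.

```python
import pandas as pd

# Made-up stacked data in the same long layout (one row per company per day).
toy = pd.DataFrame({
    "Name": ["AAA"] * 4 + ["BBB"] * 4,
    "close": [10.0, 12.0, 14.0, 16.0, 100.0, 102.0, 104.0, 106.0],
})

window = 2  # short window so the effect is visible on eight rows

# Naive rolling mean: the first BBB row averages AAA's last close with BBB's first.
toy["ma_naive"] = toy["close"].rolling(window).mean()

# Per-company rolling mean: each group starts its own window, so the first
# `window - 1` rows of every company are NaN.
toy["ma_grouped"] = toy.groupby("Name")["close"].transform(
    lambda s: s.rolling(window).mean()
)

print(toy)
```

In the naive column, the first `BBB` row reports the average of 16 and 100, a value that belongs to neither company.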

In [None]:
new_data.tail(7)

In [None]:
new_data.set_index('date', inplace=True)

In [None]:
new_data.head(5)

In [None]:
new_data.columns

In [None]:
plt.figure(figsize=(20,15))

for index, company in enumerate(tech_list, 1): 
    plt.subplot(2,2,index)
    filter1 = new_data['Name']==company
    df = new_data[filter1]
    df[['close10','close20', 'close50']].plot(ax=plt.gca())
    plt.title(company)

# 4. Analyzing the daily change in Amazon's closing price

In [None]:
company_list

In [None]:
amzn = pd.read_csv(r'/kaggle/input/stock-5yrs-analysis/AMZN_data.csv')

In [None]:
amzn.head(4)

In [None]:
amzn['close']

In [None]:
amzn.head(4)

In [None]:
amzn['Daily return(in %)'] = amzn['close'].pct_change() * 100
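
For reference, `pct_change()` computes `(close_t - close_{t-1}) / close_{t-1}` for each row, so the first row has no previous value and comes out as `NaN`. A small check on made-up prices (the three values below are illustrative, not Amazon data):

```python
import pandas as pd

# Made-up closing prices: up 5% on day two, down 5% on day three.
close = pd.Series([200.0, 210.0, 199.5])

# Built-in percentage change, scaled to percent.
daily_return_pct = close.pct_change() * 100

# The same quantity written out with shift(1), which moves each value down one row.
manual_pct = (close - close.shift(1)) / close.shift(1) * 100

print(daily_return_pct)
print(manual_pct)
```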

In [None]:
amzn.head(4)

In [None]:
import plotly.express as px

In [None]:
px.line(amzn , x="date" , y="Daily return(in %)")

# 5. Performing resampling analysis of closing price

In [None]:
amzn.dtypes

In [None]:
amzn['date'] =pd.to_datetime(amzn['date'])

In [None]:
amzn.dtypes

In [None]:
amzn.head(4)

In [None]:
amzn.set_index('date' , inplace=True)

In [None]:
amzn.head(4)

In [None]:
amzn['close'].resample('M').mean()

In [None]:
amzn['close'].resample('M').mean().plot()

In [None]:
amzn['close'].resample('Y').mean()

In [None]:
amzn['close'].resample('Y').mean().plot() 

In [None]:
amzn['close'].resample('Q').mean()

In [None]:
amzn['close'].resample('Q').mean().plot()
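
`resample` groups the rows of a date-indexed series into calendar periods and labels each group with its period end. Newer pandas releases (2.2 and later) deprecate the aliases `'M'`, `'Q'`, and `'Y'` in favour of `'ME'`, `'QE'`, and `'YE'`. The helper below is a hypothetical sketch (the name `period_mean` is not part of pandas) that tries the newer alias first and falls back to the older one; the series is made-up data.

```python
import pandas as pd

def period_mean(series, new_alias, old_alias):
    """Mean per period, trying the newer alias first (hypothetical helper)."""
    try:
        return series.resample(new_alias).mean()
    except ValueError:
        # Older pandas versions do not recognise 'ME' / 'QE' / 'YE'.
        return series.resample(old_alias).mean()

# Made-up daily series: 1.0 on every day of January 2021, 3.0 on every day of February.
idx = pd.date_range("2021-01-01", "2021-02-28", freq="D")
close = pd.Series([1.0] * 31 + [3.0] * 28, index=idx)

monthly = period_mean(close, "ME", "M")
print(monthly)
```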

# 6. Are the closing prices of the tech companies correlated?

In [None]:
company_list

In [None]:
company_list[0]

In [None]:
app = pd.read_csv(company_list[0])
amzn = pd.read_csv(company_list[1])
google = pd.read_csv(company_list[2])
msft = pd.read_csv(company_list[3])

In [None]:
closing_price = pd.DataFrame()

In [None]:
closing_price['apple_close'] = app['close']
closing_price['amzn_close'] = amzn['close']
closing_price['goog_close'] = google['close']
closing_price['msft_close'] = msft['close']

In [None]:
closing_price

In [None]:
sns.pairplot(closing_price)

In [None]:
closing_price.corr()

In [None]:
sns.heatmap(closing_price.corr() , annot=True)
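
Assigning each `close` column by row position implicitly assumes that row *n* refers to the same trading day in every file. If the files do not cover identical dates, aligning on `date` is the safer choice. The sketch below illustrates that on made-up data; the frames `aaa` and `bbb` and the helper `close_by_date` are hypothetical.

```python
import pandas as pd

# Two made-up price histories; BBB starts one trading day later than AAA.
aaa = pd.DataFrame({"date": ["2020-01-02", "2020-01-03", "2020-01-06"],
                    "close": [10.0, 11.0, 12.0]})
bbb = pd.DataFrame({"date": ["2020-01-03", "2020-01-06"],
                    "close": [20.0, 21.0]})

def close_by_date(df, label):
    """Return the close column indexed by parsed date (hypothetical helper)."""
    out = df.assign(date=pd.to_datetime(df["date"])).set_index("date")["close"]
    return out.rename(label)

# concat with axis=1 aligns on the date index; join="inner" keeps only
# the dates present for every company.
aligned = pd.concat(
    [close_by_date(aaa, "aaa_close"), close_by_date(bbb, "bbb_close")],
    axis=1, join="inner",
)

print(aligned)
```

With positional assignment, the first `BBB` close (3 January) would have been paired with the `AAA` close from 2 January; the inner join drops the unmatched day instead.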

# 7. Correlation between daily returns (daily percentage change in closing price)

In [None]:
closing_price

In [None]:
closing_price['apple_close']

In [None]:
closing_price['apple_close'].shift(1)

In [None]:
(closing_price['apple_close'] - closing_price['apple_close'].shift(1))/closing_price['apple_close'].shift(1) * 100

In [None]:
for col in closing_price.columns:
    closing_price[col + '_pct_change'] = (closing_price[col] - closing_price[col].shift(1))/closing_price[col].shift(1) * 100

In [None]:
closing_price

In [None]:
closing_price.columns

In [None]:
clsing_p = closing_price[['apple_close_pct_change', 'amzn_close_pct_change',
       'goog_close_pct_change', 'msft_close_pct_change']]

In [None]:
clsing_p

In [None]:
g = sns.PairGrid(data= clsing_p)
g.map_diag(sns.histplot)
g.map_lower(sns.scatterplot)
g.map_upper(sns.kdeplot)

In [None]:
clsing_p.corr()

In [None]:
sns.heatmap(clsing_p.corr() , annot=True)
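
One reason to compare return correlations with price correlations is that price levels sharing an upward trend tend to correlate strongly even when their day-to-day moves are only loosely related. The sketch below illustrates this on two constructed series, not on the notebook's data: both follow the same assumed linear trend with different small deviations added.

```python
import pandas as pd

# Shared upward trend plus two different, small deviation patterns (all made up).
trend = [100 + 10 * t for t in range(8)]
dev_a = [1, -1, 1, -1, 1, -1, 1, -1]
dev_b = [1, 1, -1, -1, 1, 1, -1, -1]

prices = pd.DataFrame({
    "a_close": [float(p + d) for p, d in zip(trend, dev_a)],
    "b_close": [float(p + d) for p, d in zip(trend, dev_b)],
})

# Correlation of price levels is dominated by the common trend.
price_corr = prices["a_close"].corr(prices["b_close"])

# Correlation of daily percentage changes reflects the day-to-day moves instead.
returns = prices.pct_change().dropna()
returns_corr = returns["a_close"].corr(returns["b_close"])

print(price_corr, returns_corr)
```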