Project Description
Using the AAPL (Apple Inc.) stock dataset, conduct the following analyses:



Initial Data Exploration
Load the dataset using Pandas. Check for null values and understand data types.
Examine the time series properties of the data (e.g., frequency, trends).
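
A minimal sketch of these exploration steps, using only the first five rows shown in the notebook output further down as a stand-in for the full CSV. The names `sample`, `gaps`, and `most_common_gap` are illustrative and not part of the original notebook.

```python
import pandas as pd

# First five rows as displayed in the notebook output; the real analysis
# would load the whole file with pd.read_csv instead.
sample = pd.DataFrame({
    "volume": [1.245445e9, 8.554834e8, 8.352580e8, 7.974138e8, 3.352007e9],
    "close": [2.9929, 3.0593, 3.0375, 3.0525, 3.3061],
    "time": [1167800400, 1167886800, 1167973200, 1168232400, 1168318800],
})

# Null counts and data types per column
null_counts = sample.isnull().sum()
dtypes = sample.dtypes

# Index by timestamp, then look at the spacing between consecutive rows
# to get a feel for the sampling frequency (daily, with weekend gaps)
sample.index = pd.to_datetime(sample["time"], unit="s")
gaps = sample.index.to_series().diff().dropna()
most_common_gap = gaps.mode().iloc[0]

print(null_counts)
print(dtypes)
print(most_common_gap)
```

On these five rows the most common gap is one day, and the single three-day gap corresponds to a weekend.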


Data Visualization
Utilize Matplotlib to plot closing prices and traded volume over time.
Create a candlestick chart to depict high and low prices.
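
One possible layout for the price and volume plot is two stacked panels sharing the date axis. This is a sketch on the first five rows from the notebook output; the panel arrangement, figure size, and variable names are assumptions rather than requirements of the brief.

```python
import pandas as pd
import matplotlib.pyplot as plt

# First five rows as displayed in the notebook output (stand-in for the full CSV)
sample = pd.DataFrame({
    "volume": [1.245445e9, 8.554834e8, 8.352580e8, 7.974138e8, 3.352007e9],
    "close": [2.9929, 3.0593, 3.0375, 3.0525, 3.3061],
    "time": [1167800400, 1167886800, 1167973200, 1168232400, 1168318800],
})
dates = pd.to_datetime(sample["time"], unit="s")

# Closing price on top, traded volume below, sharing one date axis
fig, (ax_price, ax_vol) = plt.subplots(2, 1, sharex=True, figsize=(10, 6))
ax_price.plot(dates, sample["close"])
ax_price.set_ylabel("Close")
ax_price.set_title("AAPL closing price and volume")

ax_vol.bar(dates, sample["volume"], width=0.8)  # width is in days on a date axis
ax_vol.set_ylabel("Volume")
ax_vol.set_xlabel("Date")

fig.autofmt_xdate()  # rotate date labels so they do not overlap
```

In a notebook the figure renders when the cell finishes; in a plain script, call `plt.show()` afterwards.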


Statistical Analysis
Compute summary statistics (mean, median, standard deviation) for key columns.
Analyze closing prices with a moving average.
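
The summary statistics and a moving average can be sketched with pandas as follows. The closing prices are the first five values from the notebook output, and the 3-day window is an arbitrary illustrative choice; on the full dataset a longer window (for example 20 or 50 days) would be more typical.

```python
import pandas as pd

# First five closing prices as displayed in the notebook output
close = pd.Series([2.9929, 3.0593, 3.0375, 3.0525, 3.3061], name="close")

# Mean, median, and sample standard deviation in one call
summary = close.agg(["mean", "median", "std"])

# Simple moving average; the first (window - 1) values are NaN
# because the window is not yet full
window = 3  # illustrative choice, not taken from the brief
moving_avg = close.rolling(window=window).mean()

print(summary)
print(moving_avg)
```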


Hypothesis Testing
Execute a t-test to compare average closing prices across different years.
Examine daily returns’ distribution and test for normality using SciPy.
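
A hedged sketch of both tests with SciPy. Because a year-over-year comparison needs far more rows than the excerpt shown in this notebook, the prices here are a synthetic random walk generated with a fixed seed; they are not AAPL data, and the years 2021 and 2022 are arbitrary. Welch's t-test (`equal_var=False`) is used as one reasonable choice since the two years need not have equal variance.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for the 'close' column: NOT real AAPL prices
rng = np.random.default_rng(0)
dates = pd.bdate_range("2021-01-01", "2022-12-31")
log_returns = rng.normal(loc=0.0005, scale=0.02, size=len(dates))
close = pd.Series(100 * np.exp(np.cumsum(log_returns)), index=dates, name="close")

# Welch's t-test on the average closing price of two calendar years
close_2021 = close[close.index.year == 2021]
close_2022 = close[close.index.year == 2022]
t_stat, t_p = stats.ttest_ind(close_2021, close_2022, equal_var=False)

# Daily simple returns and two normality tests
returns = close.pct_change().dropna()
k2_stat, k2_p = stats.normaltest(returns)  # D'Agostino-Pearson
w_stat, w_p = stats.shapiro(returns)       # Shapiro-Wilk

print(f"t-test: t={t_stat:.3f}, p={t_p:.4f}")
print(f"normaltest: K2={k2_stat:.3f}, p={k2_p:.4f}")
print(f"shapiro: W={w_stat:.3f}, p={w_p:.4f}")
```

Daily prices are strongly autocorrelated, which violates the independence assumption of the t-test, so the resulting p-value is best read as an exercise result rather than firm evidence.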


Advanced Statistical Techniques (Bonus)
Statistical Functions in NumPy: Employ NumPy’s statistical functions for in-depth stock data analysis.
For example, use np.convolve to compute moving averages, or np.corrcoef to explore correlations between financial metrics.
Analyze correlations between moving averages of closing prices and trading volume across time periods.
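
The NumPy route can be sketched like this: a uniform kernel passed to `np.convolve` gives a simple moving average, and `np.corrcoef` gives the Pearson correlation between two smoothed series. The inputs are the first five rows from the notebook output and the window of 3 is illustrative; with so few points the coefficient itself is not meaningful, only the mechanics are.

```python
import numpy as np

# First five rows as displayed in the notebook output
close = np.array([2.9929, 3.0593, 3.0375, 3.0525, 3.3061])
volume = np.array([1.245445e9, 8.554834e8, 8.352580e8, 7.974138e8, 3.352007e9])

window = 3  # illustrative choice
kernel = np.ones(window) / window

# mode="valid" keeps only positions where the window fully overlaps the data,
# so the result has len(x) - window + 1 elements
ma_close = np.convolve(close, kernel, mode="valid")
ma_volume = np.convolve(volume, kernel, mode="valid")

# 2x2 correlation matrix; the off-diagonal entry is the Pearson coefficient
corr = np.corrcoef(ma_close, ma_volume)
print(ma_close)
print(corr[0, 1])
```

To compare across time periods, the same calculation could be repeated on slices of the arrays, one slice per period.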


Resources
Dataset: AAPL Stock Data (2007-2023)

Includes daily data like volume, VWAP, open, close, high, low prices, and number of transactions.

In [3]:
import pandas as pd 
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats 

# read_csv already returns a DataFrame, so no pd.DataFrame() wrapper is needed
df = pd.read_csv('AAPL_1D_01012007-12072023.csv')
df

Unnamed: 0,volume,vwap,open,close,high,low,time,transactions
0,1.245445e+09,3.0302,3.0821,2.9929,3.0921,2.9250,1167800400,189737
1,8.554834e+08,3.0403,3.0018,3.0593,3.0696,2.9936,1167886800,136333
2,8.352580e+08,3.0426,3.0632,3.0375,3.0786,3.0143,1167973200,141050
3,7.974138e+08,3.0683,3.0700,3.0525,3.0904,3.0457,1168232400,130547
4,3.352007e+09,3.1946,3.0875,3.3061,3.3207,3.0411,1168318800,569578
...,...,...,...,...,...,...,...,...
4154,4.515552e+07,190.8214,189.8400,191.8100,192.0200,189.2000,1688616000,562755
4155,4.675750e+07,191.4218,191.4100,190.6800,192.6700,190.2400,1688702400,538826
4156,5.991216e+07,188.3628,189.2600,188.6100,189.9900,187.0350,1688961600,736912
4157,4.663812e+07,187.8219,189.1600,188.0800,189.3000,186.6000,1689048000,577717


In [4]:
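# Convert Unix epoch seconds to timezone-naive UTC datetimes in one vectorized call
# (datetime.utcfromtimestamp is deprecated as of Python 3.12)
df['readable_time'] = pd.to_datetime(df['time'], unit='s')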
df

Unnamed: 0,volume,vwap,open,close,high,low,time,transactions,readable_time
0,1.245445e+09,3.0302,3.0821,2.9929,3.0921,2.9250,1167800400,189737,2007-01-03 05:00:00
1,8.554834e+08,3.0403,3.0018,3.0593,3.0696,2.9936,1167886800,136333,2007-01-04 05:00:00
2,8.352580e+08,3.0426,3.0632,3.0375,3.0786,3.0143,1167973200,141050,2007-01-05 05:00:00
3,7.974138e+08,3.0683,3.0700,3.0525,3.0904,3.0457,1168232400,130547,2007-01-08 05:00:00
4,3.352007e+09,3.1946,3.0875,3.3061,3.3207,3.0411,1168318800,569578,2007-01-09 05:00:00
...,...,...,...,...,...,...,...,...,...
4154,4.515552e+07,190.8214,189.8400,191.8100,192.0200,189.2000,1688616000,562755,2023-07-06 04:00:00
4155,4.675750e+07,191.4218,191.4100,190.6800,192.6700,190.2400,1688702400,538826,2023-07-07 04:00:00
4156,5.991216e+07,188.3628,189.2600,188.6100,189.9900,187.0350,1688961600,736912,2023-07-10 04:00:00
4157,4.663812e+07,187.8219,189.1600,188.0800,189.3000,186.6000,1689048000,577717,2023-07-11 04:00:00


In [5]:
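import plotly.graph_objs as go  # requires the plotly package (e.g. pip install plotly)

# Column names in this dataset are lowercase, and the date column is the
# 'readable_time' field derived from the Unix timestamps in the previous cell
trace = go.Candlestick(
    x=df['readable_time'],
    open=df['open'],
    high=df['high'],
    low=df['low'],
    close=df['close'],
)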

# Create figure and plot
layout = go.Layout(
    title='Candlestick Chart',
    xaxis=dict(title='Date'),
    yaxis=dict(title='Price'),
)
fig = go.Figure(data=[trace], layout=layout)
fig.show()

ModuleNotFoundError: No module named 'plotly'
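
Since plotly is not available in this environment, one fallback is to draw the candlesticks with Matplotlib alone: a thin vertical line per day for the low-high range and a bar for the open-close body. This is a sketch on the first five rows from the output above; the colors, bar width, and variable names are assumptions, and it is not a substitute for the interactive plotly chart.

```python
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# First five rows as displayed in the notebook output (stand-in for df)
sample = pd.DataFrame({
    "open":  [3.0821, 3.0018, 3.0632, 3.0700, 3.0875],
    "high":  [3.0921, 3.0696, 3.0786, 3.0904, 3.3207],
    "low":   [2.9250, 2.9936, 3.0143, 3.0457, 3.0411],
    "close": [2.9929, 3.0593, 3.0375, 3.0525, 3.3061],
    "time":  [1167800400, 1167886800, 1167973200, 1168232400, 1168318800],
})

# Convert timestamps to Matplotlib date numbers so all x values are plain floats
x = mdates.date2num(pd.to_datetime(sample["time"], unit="s"))

# Green when the close is at or above the open, red otherwise
up = sample["close"] >= sample["open"]
colors = ["green" if is_up else "red" for is_up in up]

fig, ax = plt.subplots(figsize=(10, 5))

# Wicks: low-to-high range for each day
ax.vlines(x, sample["low"], sample["high"], colors=colors, linewidth=1)

# Bodies: span between open and close
body_bottom = sample[["open", "close"]].min(axis=1)
body_height = (sample["close"] - sample["open"]).abs()
ax.bar(x, body_height, bottom=body_bottom, width=0.6, color=colors)

ax.xaxis_date()  # interpret x values as dates when formatting ticks
ax.set_title("Candlestick Chart")
ax.set_xlabel("Date")
ax.set_ylabel("Price")
fig.autofmt_xdate()
```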