# Exploratory Data Analysis (EDA)

## Table of Contents
1. [Dataset Overview](#dataset-overview)
2. [Handling Missing Values](#handling-missing-values)
3. [Feature Distributions](#feature-distributions)
4. [Possible Biases](#possible-biases)
5. [Correlations](#correlations)


## Dataset Overview

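```python
import pandas as pd
import ta
import yfinance as yf

# Download stock data
ticker = 'AAPL'
start_date = '2021-01-01'
end_date = '2024-01-01'
data = yf.download(ticker, start=start_date, end=end_date)

# Recent yfinance releases may return MultiIndex columns (field, ticker),
# even for a single ticker; flatten them so data['Close'] is a Series
if isinstance(data.columns, pd.MultiIndex):
    data.columns = data.columns.get_level_values(0)

# Add technical indicators
data['SMA'] = ta.trend.sma_indicator(data['Close'], window=15)
data['EMA'] = ta.trend.ema_indicator(data['Close'], window=15)
data['RSI'] = ta.momentum.rsi(data['Close'], window=14)
# macd_diff is the MACD histogram (MACD line minus signal line)
data['MACD'] = ta.trend.macd_diff(data['Close'])
data['Bollinger_High'] = ta.volatility.bollinger_hband(data['Close'])
data['Bollinger_Low'] = ta.volatility.bollinger_lband(data['Close'])

# Drop the warm-up rows where the indicators are still NaN
data.dropna(inplace=True)

# Print dataset overview
print("Dataset Overview:")
data.info()  # info() prints directly and returns None, so it is not wrapped in print()
print(data.head())
```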
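
To make the indicator columns less of a black box, here is a minimal sketch of what a simple and an exponential moving average compute, using plain pandas on a made-up price series. The prices and the 3-day window are assumptions chosen for illustration, and the formulas follow the common textbook definitions; they are not guaranteed to match the `ta` library's output exactly.

```python
import pandas as pd

# Hypothetical closing prices, for illustration only
close = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0], name="Close")

window = 3  # assumed window length, shorter than the 15 used above

# Simple moving average: unweighted mean of the last `window` values
sma = close.rolling(window=window).mean()

# Exponential moving average: recursive weighting with alpha = 2 / (window + 1)
ema = close.ewm(span=window, adjust=False).mean()

print(pd.DataFrame({"Close": close, "SMA": sma, "EMA": ema}))
```

The first `window - 1` SMA values are NaN because the window is not yet full, which is why the `dropna` call above removes the earliest rows of the dataset.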


## Handling Missing Values

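```python
# Check for missing values
print("Handling Missing Values:")
print("Missing Values:\n", data.isnull().sum())

# Handle missing values
# Example: fill missing values with the column mean (use data.median() for the median).
# Rows with NaN were already dropped in the overview step, so this is a no-op here.
# For time series, a forward fill (data.ffill()) is often preferable, because
# the full-sample mean also uses observations from the future.
data.fillna(data.mean(numeric_only=True), inplace=True)
print("Missing Values Handled:\n", data.isnull().sum())
```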
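
As a rough illustration of how the choice of fill strategy matters for time series, the sketch below compares a mean fill with a forward fill on a small made-up price series. The dates and prices are hypothetical and not taken from the AAPL data.

```python
import numpy as np
import pandas as pd

# Hypothetical price series with one missing business day
idx = pd.date_range("2023-01-02", periods=5, freq="B")
prices = pd.Series([100.0, 102.0, np.nan, 110.0, 108.0], index=idx)

# Mean fill: the gap is filled using every observation, including later ones
mean_filled = prices.fillna(prices.mean())

# Forward fill: the gap is filled with the last value known at that time
forward_filled = prices.ffill()

print("Mean fill:   ", mean_filled.iloc[2])
print("Forward fill:", forward_filled.iloc[2])
```

The mean fill pulls in information from days after the gap, which can leak future data into any model trained on the result. The forward fill only relies on values that were already known at that date.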


## Feature Distributions

```python
import matplotlib.pyplot as plt

# Histograms of every column
print("Feature Distributions:")
data.hist(figsize=(20, 10))
plt.tight_layout()
plt.show()

# Columns to inspect in more detail
features = ['Close', 'Volume', 'SMA', 'EMA', 'RSI', 'MACD']

# Box plots (one subplot per feature, since the scales differ widely)
print("Box Plots:")
data[features].plot(kind='box', subplots=True, layout=(2, 3), figsize=(15, 10))
plt.tight_layout()
plt.show()

# Density plots (pandas relies on SciPy for the kernel density estimate)
print("Density Plots:")
data[features].plot(kind='density', subplots=True, layout=(2, 3), figsize=(15, 10))
plt.tight_layout()
plt.show()
```
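
The plots above can be complemented with a numeric skewness check. The sketch below uses synthetic, log-normally distributed values as a stand-in for trading volume, which is often right-skewed; the distribution parameters are assumptions, and the real `Volume` column may behave differently.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for trading volume (assumed log-normal, not real data)
rng = np.random.default_rng(0)
volume = pd.Series(rng.lognormal(mean=18, sigma=0.5, size=1000), name="Volume")

# Skewness well above 0 indicates a long right tail
raw_skew = volume.skew()

# A log transform often makes such a feature closer to symmetric
log_skew = np.log(volume).skew()

print(f"Skewness (raw): {raw_skew:.2f}")
print(f"Skewness (log): {log_skew:.2f}")
```

If the real feature shows a similar pattern, plotting `np.log(data['Volume'])` may give a more readable histogram and density plot.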


## Possible Biases

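```python
# Assess for possible biases
# Example: check for time-based bias by comparing yearly averages;
# large shifts between years suggest the data is not stationary
print("Possible Biases:")
# Grouping by calendar year avoids the 'Y' resample alias,
# which newer pandas releases deprecate in favor of 'YE'
print(data.groupby(data.index.year).mean())
```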


## Correlations

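```python
import matplotlib.pyplot as plt
import seaborn as sns

# Correlation matrix
print("Correlation Matrix:")
correlation_matrix = data.corr()
plt.figure(figsize=(10, 8))  # leave room for the cell annotations
sns.heatmap(correlation_matrix, annot=True, fmt='.2f', cmap='coolwarm')
plt.title('Correlation Matrix')
plt.tight_layout()
plt.show()
```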
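
Correlations between price-level columns should be read with care: SMA, EMA, and the Bollinger bands are computed from `Close`, so values near 1 are expected by construction, and trending series can correlate strongly even when unrelated. The sketch below illustrates the second point with two independent synthetic random walks (hypothetical series `A` and `B`, not real tickers), comparing the correlation of price levels with the correlation of daily returns.

```python
import numpy as np
import pandas as pd

# Two independent synthetic price paths (geometric random walks)
rng = np.random.default_rng(42)
log_returns = rng.normal(0.0, 0.01, size=(1000, 2))
prices = pd.DataFrame(100 * np.exp(log_returns.cumsum(axis=0)), columns=["A", "B"])

# Correlation of price levels can be large even for unrelated series
level_corr = prices["A"].corr(prices["B"])

# Correlation of daily returns reflects the actual day-to-day relationship
returns = prices.pct_change().dropna()
return_corr = returns["A"].corr(returns["B"])

print(f"Level correlation:  {level_corr:.2f}")
print(f"Return correlation: {return_corr:.2f}")
```

Under these assumptions the return correlation stays close to zero, while the level correlation depends on how the two paths happen to drift. Applying `data['Close'].pct_change()` before computing correlations is one way to run the same check on the real data.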
