# Project Requirements: Moving Average Crossover Strategy

## ✅ Data Source

- Use the `yfinance` API to fetch historical price data.
- Crypto markets trade **24/7**, so when applying this to **crypto**, confirm the data covers every calendar day, weekends included.
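
One quick way to check the 24/7 point is to look for weekend timestamps in the index. The sketch below is only an illustration: the helper name `has_weekend_rows` and the two synthetic indexes are assumptions, not part of the notebook, and it uses plain `pandas` rather than a live `yfinance` download.

```python
import pandas as pd

def has_weekend_rows(index: pd.DatetimeIndex) -> bool:
    """Return True if any timestamp falls on a Saturday (5) or Sunday (6)."""
    return bool((index.dayofweek >= 5).any())

# Synthetic examples (assumed data, not downloaded prices)
crypto_like = pd.date_range("2025-04-07", periods=14, freq="D")  # every calendar day
equity_like = pd.date_range("2025-04-07", periods=10, freq="B")  # business days only

print(has_weekend_rows(crypto_like))  # True
print(has_weekend_rows(equity_like))  # False
```

In practice the same function could be applied to the `Date` index returned by `history()`.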

## ⏱️ Time Frame

- A minimum of **50 days** of data is required to compute the short-term moving average.
- **Recommended:** 200+ days for long-term trend analysis and reliable crossover signals.
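
To see why the window lengths drive the amount of data needed, here is a minimal sketch of a crossover calculation. It assumes simple moving averages with 50- and 200-day windows; the function name `crossover_signals` and the synthetic price series are hypothetical and stand in for real `Close` prices.

```python
import numpy as np
import pandas as pd

def crossover_signals(close: pd.Series, short_window: int = 50,
                      long_window: int = 200) -> pd.DataFrame:
    """Return short/long simple moving averages and crossover events."""
    out = pd.DataFrame({"Close": close})
    # rolling() yields NaN until a full window of observations is available
    out["SMA_short"] = close.rolling(short_window).mean()
    out["SMA_long"] = close.rolling(long_window).mean()
    # 1 while the short SMA is above the long SMA, else 0
    # (comparisons involving NaN evaluate to False, so early rows are 0)
    out["position"] = (out["SMA_short"] > out["SMA_long"]).astype(int)
    # +1 = short SMA crosses above, -1 = crosses below, 0 = no change
    out["crossover"] = out["position"].diff().fillna(0).astype(int)
    return out

# Synthetic prices (assumed): fall for 150 days, then rise for 150 days
prices = pd.Series(
    np.concatenate([np.linspace(200, 100, 150), np.linspace(100, 250, 150)]),
    index=pd.date_range("2024-01-01", periods=300, freq="D"),
)
signals = crossover_signals(prices)
print(signals["SMA_short"].notna().sum())  # 251 rows have a 50-day average
print(signals["SMA_long"].notna().sum())   # 101 rows have a 200-day average
print(signals.loc[signals["crossover"] != 0, ["SMA_short", "SMA_long", "crossover"]])
```

With exactly 50 rows, only the last row would have a short-term average and no long-term average could be computed at all, which is why 200+ days is the practical floor for crossover signals.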

## 📊 Required Data Columns

- Essential columns:
  - `Open`
  - `High`
  - `Low`
  - `Close`
  - `Volume`
- For crypto:
  - Check whether `Adj Close` is available; if it is not, use the regular `Close`.
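
The column check can be sketched as a small validation step. The names `REQUIRED_COLUMNS` and `validate_columns` are hypothetical, and the one-row frame is made-up sample data; the fallback rule (prefer `Adj Close`, otherwise `Close`) is one reasonable reading of the requirement, not the only one.

```python
import pandas as pd

REQUIRED_COLUMNS = ["Open", "High", "Low", "Close", "Volume"]

def validate_columns(df: pd.DataFrame) -> str:
    """Raise if an essential column is missing; return the price column to use."""
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")
    # Prefer `Adj Close` when the source provides it, otherwise fall back to `Close`
    return "Adj Close" if "Adj Close" in df.columns else "Close"

# Made-up sample row with the five essential columns and no `Adj Close`
sample = pd.DataFrame(
    {"Open": [1.0], "High": [2.0], "Low": [0.5], "Close": [1.5], "Volume": [100]}
)
price_column = validate_columns(sample)
print(price_column)  # Close
```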

## 🧹 Missing Data Handling

- Identify gaps in timestamps.
- Handle missing values using:
  - **Forward fill**
  - **Linear interpolation**
  - **Dropping incomplete days** (if necessary)
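
The three options above can be sketched as follows, assuming daily data indexed by date. `find_missing_days` and `fill_gaps` are hypothetical helper names, and the three-row frame is made-up data with one missing day.

```python
import pandas as pd

def find_missing_days(df: pd.DataFrame) -> pd.DatetimeIndex:
    """Return calendar days absent between the first and last index entry."""
    expected = pd.date_range(df.index.min(), df.index.max(), freq="D")
    return expected.difference(df.index)

def fill_gaps(df: pd.DataFrame, method: str = "ffill") -> pd.DataFrame:
    """Reindex to a full daily calendar, then fill or drop the gaps."""
    full_index = pd.date_range(df.index.min(), df.index.max(), freq="D")
    out = df.reindex(full_index)  # missing days become rows of NaN
    if method == "ffill":
        return out.ffill()                       # carry the last known value forward
    if method == "interpolate":
        return out.interpolate(method="linear")  # straight line between neighbours
    if method == "drop":
        return out.dropna()                      # discard incomplete days
    raise ValueError(f"Unknown method: {method}")

# Made-up data: 2025-01-03 is missing
idx = pd.to_datetime(["2025-01-01", "2025-01-02", "2025-01-04"])
demo = pd.DataFrame({"Close": [100.0, 110.0, 130.0]}, index=idx)

missing = find_missing_days(demo)
print(list(missing.strftime("%Y-%m-%d")))   # ['2025-01-03']
print(fill_gaps(demo, "ffill"))             # 2025-01-03 -> 110.0
print(fill_gaps(demo, "interpolate"))       # 2025-01-03 -> 120.0
```

Forward fill never uses future prices, which matters for backtesting; interpolation does use the next observation, so it is better suited to plotting than to signal generation.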


In [102]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [103]:
%pip install matplotlib




In [104]:
%pip install yfinance

Note: you may need to restart the kernel to use updated packages.


In [2]:
import yfinance as yf

In [110]:
btc_usd = yf.Ticker("BTC-USD")  # Ticker object for Bitcoin priced in USD
hist = btc_usd.history(period="max")  # download the full available price history
print(hist.head())

                                 Open        High         Low       Close  \
Date                                                                        
2014-09-17 00:00:00+00:00  465.864014  468.174011  452.421997  457.334015   
2014-09-18 00:00:00+00:00  456.859985  456.859985  413.104004  424.440002   
2014-09-19 00:00:00+00:00  424.102997  427.834991  384.532013  394.795990   
2014-09-20 00:00:00+00:00  394.673004  423.295990  389.882996  408.903992   
2014-09-21 00:00:00+00:00  408.084991  412.425995  393.181000  398.821014   

                             Volume  Dividends  Stock Splits  
Date                                                          
2014-09-17 00:00:00+00:00  21056800        0.0           0.0  
2014-09-18 00:00:00+00:00  34483200        0.0           0.0  
2014-09-19 00:00:00+00:00  37919700        0.0           0.0  
2014-09-20 00:00:00+00:00  36863600        0.0           0.0  
2014-09-21 00:00:00+00:00  26580100        0.0           0.0  


In [111]:
# Fetch the last 60 days and move the date from the index into a column
df = btc_usd.history(period="60d")  # last 60 days of daily data
df = df.reset_index()  # turn the `Date` index into a regular column
df.head(10)  # show the first 10 rows

Unnamed: 0,Date,Open,High,Low,Close,Volume,Dividends,Stock Splits
0,2025-04-08 00:00:00+00:00,79218.476562,80823.890625,76198.023438,76271.953125,48314590749,0.0,0.0
1,2025-04-09 00:00:00+00:00,76273.5625,83541.0,74589.671875,82573.953125,84213627038,0.0,0.0
2,2025-04-10 00:00:00+00:00,82565.976562,82700.929688,78456.132812,79626.140625,44718000633,0.0,0.0
3,2025-04-11 00:00:00+00:00,79625.046875,84247.476562,78936.320312,83404.835938,41656778779,0.0,0.0
4,2025-04-12 00:00:00+00:00,83404.515625,85856.1875,82769.375,85287.109375,24258059104,0.0,0.0
5,2025-04-13 00:00:00+00:00,85279.46875,86015.1875,83027.007812,83684.976562,28796984817,0.0,0.0
6,2025-04-14 00:00:00+00:00,83694.523438,85785.0,83690.640625,84542.390625,34090769777,0.0,0.0
7,2025-04-15 00:00:00+00:00,84539.695312,86429.351562,83598.820312,83668.992188,28040322885,0.0,0.0
8,2025-04-16 00:00:00+00:00,83674.507812,85428.28125,83100.617188,84033.867188,29617804112,0.0,0.0
9,2025-04-17 00:00:00+00:00,84030.671875,85449.070312,83749.75,84895.75,21276866029,0.0,0.0


In [124]:
# Fetch the last 50 days and keep only the date and OHLCV columns
df = btc_usd.history(period="50d")
df = df.reset_index()  # turn the `Date` index into a regular column
df = df.drop(columns=["Dividends", "Stock Splits"])  # drop dividends and stock splits
print(df.columns)


Index(['Date', 'Open', 'High', 'Low', 'Close', 'Volume'], dtype='object')


In [125]:
df.head(10)

Unnamed: 0,Date,Open,High,Low,Close,Volume
0,2025-04-18 00:00:00+00:00,84900.1875,85095.046875,84298.882812,84450.804688,12728372364
1,2025-04-19 00:00:00+00:00,84450.867188,85597.703125,84353.460938,85063.414062,15259300427
2,2025-04-20 00:00:00+00:00,85066.070312,85306.382812,83976.84375,85174.304688,14664050812
3,2025-04-21 00:00:00+00:00,85171.539062,88460.09375,85143.835938,87518.90625,41396190190
4,2025-04-22 00:00:00+00:00,87521.875,93817.382812,87084.53125,93441.890625,55899038456
5,2025-04-23 00:00:00+00:00,93427.585938,94535.734375,91962.960938,93699.109375,41719568821
6,2025-04-24 00:00:00+00:00,93692.398438,94016.195312,91696.710938,93943.796875,31483175315
7,2025-04-25 00:00:00+00:00,93954.25,95768.390625,92898.59375,94720.5,40915232364
8,2025-04-26 00:00:00+00:00,94714.648438,95251.359375,93927.25,94646.929688,17612825123
9,2025-04-27 00:00:00+00:00,94660.90625,95301.203125,93665.398438,93754.84375,18090367764


In [None]:
from datetime import date, timedelta
date_column = df["Date"]
date_column.head(10)
duration = timedelta(days=50)  # 50-day lookback window
today = date.today()           # today's date


In [167]:
past_dates = today - duration  # the date 50 days before today


In [168]:
print(past_dates)  # the date 50 days ago
print(f"Today's date: {today}")  # end of the analysis window
print(f"Date 50 days ago: {past_dates}")

2025-04-17
Today's date: 2025-06-06
Date 50 days ago: 2025-04-17


In [1]:
def data(): 
    

NameError: name 'past_dates' is not defined
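
The `NameError` is consistent with this cell running in a fresh kernel (the prompt reads `In [1]`) before `today` and `past_dates` were defined again. One way to avoid depending on earlier cells is to compute the window inside a function. This is only a sketch: `lookback_window` is a hypothetical helper, not the intended body of `data()`, which the notebook leaves unfinished.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

def lookback_window(days: int = 50, end: Optional[date] = None) -> Tuple[date, date]:
    """Return (start, end), where start is `days` days before `end` (default: today)."""
    end = end or date.today()
    return end - timedelta(days=days), end

# Reproduce the dates printed earlier in the notebook
start, end = lookback_window(50, end=date(2025, 6, 6))
print(start, end)  # 2025-04-17 2025-06-06
```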