# Should your fund invest in Bitcoin?

## 📖 Background
You work as an analyst at an investment fund in New York. Your CFO wants to explore if it is a good idea to invest some of the fund's assets in Bitcoin. You have to prepare a report on this asset and how it compares to the stock market in general.

## 💾 The data
You have access to three files:

#### Bitcoin daily data in US dollars
- "date" - date from September 17, 2014 to November 17, 2021
- "open" - the price at the beginning of the trading day
- "high" - the highest price reached that day
- "low" - the lowest price reached that day
- "close" - the price at the closing of the trading day
- "volume" - how many Bitcoin were traded that day

#### S&P 500 daily data
- "date" - date from September 17, 2014 to November 17, 2021
- "open" - the index level at the beginning of the trading day
- "high" - the highest level reached that day
- "low" - the lowest level reached that day
- "close" - the level at the closing of the trading day
- "volume" - how many shares in the companies that make up the index were traded that day

#### Inflation and gold as monthly data
- "date" - date from September, 2014 to November, 2021
- "gold_usd" - price of gold in USD for that month
- "cpi_us" - the inflation index for the US for that month (cpi = consumer price index)

_CPI data from the [U.S. Bureau of Labor Statistics](https://www.bls.gov/cpi/). Publicly available information_.

In [1]:
import pandas as pd
bitcoin = pd.read_csv('./data/bitcoin-usd.csv', parse_dates=['date'])
bitcoin.head()

Unnamed: 0,date,open,high,low,close,volume
0,2014-09-17,465.864014,468.174011,452.421997,457.334015,21056800.0
1,2014-09-18,456.859985,456.859985,413.104004,424.440002,34483200.0
2,2014-09-19,424.102997,427.834991,384.532013,394.79599,37919700.0
3,2014-09-20,394.673004,423.29599,389.882996,408.903992,36863600.0
4,2014-09-21,408.084991,412.425995,393.181,398.821014,26580100.0


In [2]:
sp500 = pd.read_csv('./data/sp500.csv', parse_dates=['date'])
sp500.head()

Unnamed: 0,date,open,high,low,close,volume
0,2014-09-17,1999.300049,2010.73999,1993.290039,2001.569946,3209420000
1,2014-09-18,2003.069946,2012.339966,2003.069946,2011.359985,3235340000
2,2014-09-19,2012.73999,2019.26001,2006.589966,2010.400024,4880220000
3,2014-09-22,2009.079956,2009.079956,1991.01001,1994.290039,3349670000
4,2014-09-23,1992.780029,1995.410034,1982.77002,1982.77002,3279350000


In [3]:
monthly_data = pd.read_csv('./data/monthly_data.csv', parse_dates=['date'])
monthly_data.head()

Unnamed: 0,date,gold_usd,cpi_us
0,2014-09-01,1241.33,237.852
1,2014-10-01,1223.565,238.031
2,2014-11-01,1176.413,237.433
3,2014-12-01,1200.44,236.151
4,2015-01-01,1249.333,234.812


## 💪 Competition challenge
Create a report that covers the following:

1. How does the performance of Bitcoin compare to the S&P 500 and the price of gold?
2. Analyze Bitcoin's returns and volatility profile. Do you believe it could help improve the performance of a portfolio? Do you believe Bitcoin could be used as a hedge versus inflation?
3. The CFO is looking to lower volatility in the fund. Explore building a portfolio using some or all of these assets. Make a recommendation that minimizes overall risk.
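
The returns and volatility mentioned in the questions above can be sketched as follows. This is a minimal illustration, not the analysis itself: it uses only the first five Bitcoin closing prices shown in the `bitcoin.head()` output, and the helper names `daily_returns` and `annualized_volatility` as well as the choice of 365 periods per year are assumptions made for this example.

```python
import numpy as np
import pandas as pd

# First five Bitcoin closing prices, copied from the bitcoin.head() output above.
close = pd.Series(
    [457.334015, 424.440002, 394.79599, 408.903992, 398.821014],
    index=pd.to_datetime(
        ["2014-09-17", "2014-09-18", "2014-09-19", "2014-09-20", "2014-09-21"]
    ),
    name="close",
)


def daily_returns(prices: pd.Series) -> pd.Series:
    """Simple day-over-day returns; the first (undefined) value is dropped."""
    return prices.pct_change().dropna()


def annualized_volatility(returns: pd.Series, periods_per_year: int) -> float:
    """Sample standard deviation of returns scaled by sqrt(periods per year)."""
    return float(returns.std() * np.sqrt(periods_per_year))


btc_returns = daily_returns(close)

# Assumption: 365 periods per year for Bitcoin, because the data has a row for
# every calendar day. 252 trading days is a common choice for equity indices.
btc_vol = annualized_volatility(btc_returns, periods_per_year=365)

print(btc_returns.round(4))
print(round(btc_vol, 4))
```

Five rows are far too few for a meaningful volatility estimate; the same functions would be applied to the full `close` column of each asset.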

## 🧑‍⚖️ Judging criteria

| CATEGORY | WEIGHTING | DETAILS                                                              |
|:---------|:----------|:---------------------------------------------------------------------|
| **Recommendations** | 35%       | <ul><li>Clarity of recommendations - how clear and well presented the recommendation is.</li><li>Quality of recommendations - are appropriate analytical techniques used & are the conclusions valid?</li><li>Number of relevant insights found for the target audience.</li></ul>       |
| **Storytelling**  | 30%       | <ul><li>How well the data and insights are connected to the recommendation.</li><li>How the narrative and whole report connects together.</li><li>Balancing making the report in depth enough but also concise.</li></ul> |
| **Visualizations** | 25% | <ul><li>Appropriateness of visualization used.</li><li>Clarity of insight from visualization.</li></ul> |
| **Votes** | 10% | <ul><li>Up voting - most upvoted entries get the most points.</li></ul> |

## ✅ Checklist before publishing into the competition
- Rename your workspace to make it descriptive of your work. N.B. you should leave the notebook name as notebook.ipynb.
- Remove redundant cells like the judging criteria so the workbook is focused on your story.
- Make sure the workbook reads well and explains how you found your insights.
- Check that all the cells run without error.

## ⌛️ Time is ticking. Good luck!


# Introduction

As an investment company, we traditionally invest in stock options, whose value depends entirely on market behaviour and is independent of anyone's predictions.

The traditional advice is to diversify investments across a varied portfolio, so that the fund stays financially safe when something unexpected happens. It also goes without saying that the returns should be high enough to beat the market.

This study compares the following investment options:
- Traditional stocks: S&P 500 performance
- Bitcoin as a cryptocurrency
- Gold

Additionally, we are going to look at inflation data.

# Importing the required libraries

In [3]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Reading the Data

In [4]:
df_stock = pd.read_csv(r'data/sp500.csv')
df_bitcoin = pd.read_csv(r'data/bitcoin-usd.csv')
df_gold_inf = pd.read_csv(r'data/monthly_data.csv')

In [5]:
display(df_stock.head())
display(df_bitcoin.head())
display(df_gold_inf.head())

Unnamed: 0,date,open,high,low,close,volume
0,2014-09-17,1999.300049,2010.73999,1993.290039,2001.569946,3209420000
1,2014-09-18,2003.069946,2012.339966,2003.069946,2011.359985,3235340000
2,2014-09-19,2012.73999,2019.26001,2006.589966,2010.400024,4880220000
3,2014-09-22,2009.079956,2009.079956,1991.01001,1994.290039,3349670000
4,2014-09-23,1992.780029,1995.410034,1982.77002,1982.77002,3279350000


Unnamed: 0,date,open,high,low,close,volume
0,2014-09-17,465.864014,468.174011,452.421997,457.334015,21056800.0
1,2014-09-18,456.859985,456.859985,413.104004,424.440002,34483200.0
2,2014-09-19,424.102997,427.834991,384.532013,394.79599,37919700.0
3,2014-09-20,394.673004,423.29599,389.882996,408.903992,36863600.0
4,2014-09-21,408.084991,412.425995,393.181,398.821014,26580100.0


Unnamed: 0,date,gold_usd,cpi_us
0,2014-09-01,1241.33,237.852
1,2014-10-01,1223.565,238.031
2,2014-11-01,1176.413,237.433
3,2014-12-01,1200.44,236.151
4,2015-01-01,1249.333,234.812


# Understanding the data

## stocks data -----> df_stock

In [6]:
display(df_stock.info())
display(df_stock.isna().sum())
display(df_stock.describe())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1805 entries, 0 to 1804
Data columns (total 6 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   date    1805 non-null   object 
 1   open    1805 non-null   float64
 2   high    1805 non-null   float64
 3   low     1805 non-null   float64
 4   close   1805 non-null   float64
 5   volume  1805 non-null   int64  
dtypes: float64(4), int64(1), object(1)
memory usage: 84.7+ KB


None

date      0
open      0
high      0
low       0
close     0
volume    0
dtype: int64

Unnamed: 0,open,high,low,close,volume
count,1805.0,1805.0,1805.0,1805.0,1805.0
mean,2755.938758,2769.524277,2741.245103,2756.455533,3844502000.0
std,698.212835,701.268104,695.674679,698.850564,978146000.0
min,1833.400024,1847.0,1810.099976,1829.079956,1296540000.0
25%,2123.159912,2129.870117,2114.719971,2124.290039,3254950000.0
50%,2664.439941,2682.860107,2648.870117,2663.98999,3623320000.0
75%,3045.75,3068.669922,3012.590088,3039.419922,4154240000.0
max,4707.25,4718.5,4694.390137,4701.700195,9878040000.0


## bitcoin data -----> df_bitcoin

In [7]:
display(df_bitcoin.info())
display(df_bitcoin.isna().sum())
display(df_bitcoin.describe())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2619 entries, 0 to 2618
Data columns (total 6 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   date    2619 non-null   object 
 1   open    2615 non-null   float64
 2   high    2615 non-null   float64
 3   low     2615 non-null   float64
 4   close   2615 non-null   float64
 5   volume  2615 non-null   float64
dtypes: float64(5), object(1)
memory usage: 122.9+ KB


None

date      0
open      4
high      4
low       4
close     4
volume    4
dtype: int64

Unnamed: 0,open,high,low,close,volume
count,2615.0,2615.0,2615.0,2615.0,2615.0
mean,10051.643066,10334.482966,9750.736512,10073.814423,14001550000.0
std,14892.430109,15326.320248,14422.269302,14923.069664,19931580000.0
min,176.897003,211.731003,171.509995,178.102997,5914570.0
25%,582.071015,588.960998,575.311981,582.555999,74891100.0
50%,5745.599121,5865.881836,5544.089844,5750.799805,4679500000.0
75%,9866.986328,10136.996094,9642.615235,9870.199219,22876060000.0
max,67549.734375,68789.625,66382.0625,67566.828125,350967900000.0


In [26]:
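# Alternative: rows with a missing value in any column
# df_bitcoin[df_bitcoin.isnull().any(axis=1)]

# Show the rows where 'open' is missing and collect their dates
display(df_bitcoin[df_bitcoin['open'].isna()])
dt_bitcoin_na = list(df_bitcoin[df_bitcoin['open'].isna()]['date'])
display(dt_bitcoin_na)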

Unnamed: 0,date,open,high,low,close,volume
2039,2020-04-17,,,,,
2214,2020-10-09,,,,,
2217,2020-10-12,,,,,
2218,2020-10-13,,,,,


['2020-04-17', '2020-10-09', '2020-10-12', '2020-10-13']

On the four dates with missing data (['2020-04-17', '2020-10-09', '2020-10-12', '2020-10-13']), nothing drastic appears to have happened to Bitcoin that would explain the gaps, so we can safely use an imputation technique to fill the data.
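
Two common ways to fill such gaps are a forward fill and a time-based linear interpolation. The sketch below is only an illustration of both: the dates cover the October 2020 gap found above, but the prices are made-up placeholders and the name `toy` is hypothetical. Which technique suits the real `df_bitcoin` data is a judgment call this example does not settle.

```python
import numpy as np
import pandas as pd

# Toy frame: the dates span the October 2020 gap, the prices are placeholders
# chosen for readability and are NOT real Bitcoin prices.
toy = pd.DataFrame(
    {"close": [100.0, np.nan, 104.0, 106.0, np.nan, np.nan, 112.0]},
    index=pd.to_datetime(
        [
            "2020-10-08", "2020-10-09", "2020-10-10", "2020-10-11",
            "2020-10-12", "2020-10-13", "2020-10-14",
        ]
    ),
)

# Option 1: forward fill carries the last known value into the gap.
filled_ffill = toy.ffill()

# Option 2: linear interpolation weighted by the time between observations.
# This needs a DatetimeIndex, so a string 'date' column would first have to be
# converted with pd.to_datetime and set as the index.
filled_interp = toy.interpolate(method="time")

# e.g. 2020-10-09 becomes 102.0, halfway between its two neighbours.
print(filled_ffill)
print(filled_interp)
```

Forward fill produces zero-return days inside the gap, while interpolation spreads the price change evenly across it; either choice affects only four of 2619 rows here.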

## gold and inf data -----> df_gold_inf

In [27]:
display(df_gold_inf.info())
display(df_gold_inf.isna().sum())
display(df_gold_inf.describe())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 87 entries, 0 to 86
Data columns (total 3 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   date      87 non-null     object 
 1   gold_usd  87 non-null     float64
 2   cpi_us    87 non-null     float64
dtypes: float64(2), object(1)
memory usage: 2.2+ KB


None

date        0
gold_usd    0
cpi_us      0
dtype: int64

Unnamed: 0,gold_usd,cpi_us
count,87.0,87.0
mean,1403.186678,249.790759
std,257.985374,10.733951
min,1068.317,233.707
25%,1231.0815,240.4285
50%,1283.189,249.554
75%,1577.216,257.091
max,2041.7,276.589


# Plotting here

In [None]:
df_stock
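
Before plotting the assets together, note that the S&P 500 file has 1805 rows while the Bitcoin file has 2619, because the index has no rows for weekends. One possible preparation step, sketched here under assumptions, is to keep only the dates present in both series and rebase each to 100 on the first common date. The values are the closing prices from the two `head()` outputs above; the names `sp500_close`, `btc_close`, `combined` and `rebased` are hypothetical.

```python
import pandas as pd

# Closing prices copied from the head() outputs shown earlier.
sp500_close = pd.Series(
    [2001.569946, 2011.359985, 2010.400024, 1994.290039, 1982.77002],
    index=pd.to_datetime(
        ["2014-09-17", "2014-09-18", "2014-09-19", "2014-09-22", "2014-09-23"]
    ),
    name="sp500",
)
btc_close = pd.Series(
    [457.334015, 424.440002, 394.79599, 408.903992, 398.821014],
    index=pd.to_datetime(
        ["2014-09-17", "2014-09-18", "2014-09-19", "2014-09-20", "2014-09-21"]
    ),
    name="bitcoin",
)

# The inner join keeps only the dates that appear in both series.
combined = pd.concat([sp500_close, btc_close], axis=1, join="inner")

# Rebase every column to 100 on the first common date so the very different
# price levels can share one axis.
rebased = combined / combined.iloc[0] * 100

print(rebased.round(2))
```

With matplotlib installed, `rebased.plot()` would then draw both lines on a single chart.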