<a href="https://colab.research.google.com/github/mzaoualim/cryptocurrency_portfolio_optimization_app/blob/main/crypto_portfolio_optimization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
!pip install yfinance --quiet
!pip install yahooquery --quiet
!pip install requests_html --quiet
!pip install ydata_profiling --quiet
!pip install PyPortfolioOpt --quiet

In [3]:
# Modules imports
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import yfinance as yf
from yahooquery import Screener
import requests
from requests_html import HTMLSession
from datetime import datetime
from ydata_profiling import ProfileReport
from pypfopt.expected_returns import mean_historical_return
from pypfopt.risk_models import CovarianceShrinkage

# The goal
Inspired by the G-Research competition on [Kaggle](https://www.kaggle.com/competitions/g-research-crypto-forecasting/overview) and [this portfolio optimizer tool](https://www.portfoliovisualizer.com/optimize-portfolio),

our goal is to create a Streamlit app that takes as input:
  - A portfolio of cryptocurrencies.
  - A budget.
  - An investment withdrawal horizon.

and returns the optimized allocation ratio of the chosen currencies, along with the predicted profits.

# The Data
We'll start with a small selection of the most popular cryptocurrencies by market capitalization (the cells below use the top 3).

## Getting Data

To collect historical trading data for the cryptocurrencies, we rely on the Yahoo! Finance API.
Fortunately, there is a Python [project](https://pypi.org/project/yfinance/) that offers an easy, Pythonic way to get the data.

In [4]:
# Scraping the list of the 3 most popular crypto tickers (cc)

session = HTMLSession()
num_currencies=3
resp = session.get(f"https://finance.yahoo.com/crypto?offset=0&count={num_currencies}")
tables = pd.read_html(resp.html.raw_html)               
df = tables[0].copy()
cc = df.Symbol.tolist()
cc

['BTC-USD', 'ETH-USD', 'USDT-USD']

In [7]:
# For the given cryptocurrencies, we grab the maximum available historical closing price data:

tickers = yf.Tickers(cc)
end_date = datetime.now().strftime('%Y-%m-%d')
data = tickers.history(period='max',end=end_date,interval='1d')['Close']
data

[*********************100%***********************]  3 of 3 completed


| Date | BTC-USD | ETH-USD | USDT-USD |
|---|---|---|---|
| 2014-09-17 | 457.334015 | NaN | NaN |
| 2014-09-18 | 424.440002 | NaN | NaN |
| 2014-09-19 | 394.795990 | NaN | NaN |
| 2014-09-20 | 408.903992 | NaN | NaN |
| 2014-09-21 | 398.821014 | NaN | NaN |
| ... | ... | ... | ... |
| 2023-03-01 | 23646.550781 | 1663.433716 | 1.000144 |
| 2023-03-02 | 23475.466797 | 1647.319336 | 1.000091 |
| 2023-03-03 | 22362.679688 | 1569.167603 | 1.000100 |
| 2023-03-04 | 22353.349609 | 1566.923950 | 1.000114 |


## Preprocessing Data

In [8]:
# missing data
data.isna().sum()

BTC-USD        0
ETH-USD     1149
USDT-USD    1149
dtype: int64

In [9]:
# data length before removing missing data
len(data)

3092

In [10]:
# after removing rows with missing observations
data = data.dropna(axis=0)
len(data)

1943

These missing values are caused by differences in the dates on which each currency started trading.
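
A minimal sketch of this effect on a toy price table (the dates, values and the column names `A` and `B` are made up for illustration): `dropna` keeps only the dates on which every asset has a price, so the cleaned sample starts at the latest listing date. Note that it would also drop any later row with a gap.

```python
import numpy as np
import pandas as pd

# Toy prices: asset "B" starts trading three days after asset "A".
dates = pd.date_range("2020-01-01", periods=6, freq="D")
toy = pd.DataFrame(
    {
        "A": [10.0, 10.5, 10.2, 10.8, 11.0, 11.3],
        "B": [np.nan, np.nan, np.nan, 2.0, 2.1, 2.2],
    },
    index=dates,
)

# Rows with at least one missing price are dropped.
aligned = toy.dropna(axis=0)

# The aligned sample starts at the latest first valid date across columns.
common_start = max(toy[col].first_valid_index() for col in toy.columns)
print(len(toy), len(aligned), common_start.date())
```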

In [11]:
data.isna().sum()

BTC-USD     0
ETH-USD     0
USDT-USD    0
dtype: int64

In [12]:
data.dtypes

BTC-USD     float64
ETH-USD     float64
USDT-USD    float64
dtype: object

In [None]:
# Save data to csv for future manipulations
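
One possible way to do this, sketched on a toy frame rather than on `data`. The file name `crypto_close.csv` and the temporary directory are placeholders, not paths used by this notebook.

```python
import os
import tempfile

import pandas as pd

# Toy stand-in for the cleaned closing-price frame.
prices = pd.DataFrame(
    {"BTC-USD": [100.0, 101.5], "ETH-USD": [10.0, 10.2]},
    index=pd.to_datetime(["2023-03-01", "2023-03-02"]),
)
prices.index.name = "Date"

# Write to a temporary directory so the sketch leaves no files in the project.
path = os.path.join(tempfile.mkdtemp(), "crypto_close.csv")
prices.to_csv(path)

# Reload with the date column parsed back into a DatetimeIndex.
reloaded = pd.read_csv(path, index_col="Date", parse_dates=True)
print(reloaded)
```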

# EDA

## Pandas Profiling

In [16]:
# Trying  ydata-profiling 
profile = ProfileReport(data, tsmode=True)

In [21]:
# Display the report in an ipynb-friendly manner
!jupyter nbextension enable --py widgetsnbextension

Enabling notebook extension jupyter-js-widgets/extension...
Paths used for configuration of notebook: 
    	/root/.jupyter/nbconfig/notebook.json
Paths used for configuration of notebook: 
    	
      - Validating: OK
Paths used for configuration of notebook: 
    	/root/.jupyter/nbconfig/notebook.json


In [None]:
# display report in ipynb friendly manner
# widget?


In [22]:
profile



In [23]:
profile.to_widgets()




## Correlation Analysis
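
As a possible starting point, here is a sketch of a return correlation matrix. It uses synthetic prices (a made-up seed and hypothetical column names `X`, `Y`, `Z`) in place of `data`. Correlations are computed on daily returns rather than on price levels, since trending price series tend to look correlated regardless of any real link.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic daily returns: "X" and "Y" share a common factor, "Z" is independent.
common = rng.normal(0, 0.03, 500)
rets = pd.DataFrame(
    {
        "X": common + rng.normal(0, 0.01, 500),
        "Y": common + rng.normal(0, 0.01, 500),
        "Z": rng.normal(0, 0.03, 500),
    }
)

# Turn the returns into price paths, shaped like the notebook's `data` frame.
toy_prices = 100 * (1 + rets).cumprod()

# Pearson correlation of daily percentage returns.
corr = toy_prices.pct_change().dropna().corr()
print(corr.round(2))
```

On the real data, the same last two lines applied to `data` would give the matrix for the selected currencies.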

## Volatility Analysis

In [None]:
# Volatility analysis
# https://www.learnpythonwithrune.org/calculate-the-volatility-of-historic-stock-prices-with-pandas-and-python/
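
A minimal sketch of such an analysis, on a synthetic price series rather than the downloaded data: daily log returns, their standard deviation, and an annualized figure. The annualization factor of 365 is an assumption (crypto markets trade every day, while equity analyses usually use 252), as are the 30-day rolling window and the name `TOY-USD`.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Synthetic daily closing prices for one asset (illustration only).
close = pd.Series(
    100 * np.exp(np.cumsum(rng.normal(0, 0.04, 400))),
    index=pd.date_range("2022-01-01", periods=400, freq="D"),
    name="TOY-USD",
)

# Daily log returns.
log_ret = np.log(close / close.shift(1)).dropna()

# Daily volatility, scaled to a yearly figure by sqrt(periods per year).
TRADING_DAYS = 365  # assumption: crypto trades every calendar day
daily_vol = log_ret.std()
annual_vol = daily_vol * np.sqrt(TRADING_DAYS)

# Rolling 30-day annualized volatility, to see how risk changes over time.
rolling_vol = log_ret.rolling(window=30).std() * np.sqrt(TRADING_DAYS)
print(f"annualized volatility: {annual_vol:.2%}")
```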

# Modeling

## Mean-variance optimization

In [None]:
mu = mean_historical_return(data)
S = CovarianceShrinkage(data).ledoit_wolf()
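
For intuition, the sketch below reproduces with plain pandas what `mean_historical_return` computes under its default settings, to the best of our understanding: a compounded (geometric) annual return with `frequency=252`. The helper name `annualized_geometric_return` and the toy prices are hypothetical. Since crypto trades every day, `frequency=365` might be a better fit than the equity-market default, but that is a modeling choice this notebook does not make.

```python
import pandas as pd

# Toy daily prices for two hypothetical assets.
toy = pd.DataFrame(
    {
        "UP": [100.0, 101.0, 102.01, 103.0301],  # +1% per day
        "FLAT": [50.0, 50.0, 50.0, 50.0],        # no change
    }
)


def annualized_geometric_return(prices: pd.DataFrame, frequency: int = 252) -> pd.Series:
    """Compounded annual return from a table of prices (one column per asset)."""
    returns = prices.pct_change().dropna(how="all")
    # Total growth over the sample, rescaled to `frequency` periods per year.
    return (1 + returns).prod() ** (frequency / returns.count()) - 1


mu_toy = annualized_geometric_return(toy)
print(mu_toy)
```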

In [None]:
mu

ADA-USD      0.392415
BNB-USD      0.938108
BTC-USD      0.386834
BUSD-USD    -0.000260
DOGE-USD     0.715782
ETH-USD      0.241572
MATIC-USD    1.859780
USDC-USD    -0.000365
USDT-USD    -0.001039
XRP-USD      0.082332
dtype: float64

In [None]:
S

| | ADA-USD | BNB-USD | BTC-USD | BUSD-USD | DOGE-USD | ETH-USD | MATIC-USD | USDC-USD | USDT-USD | XRP-USD |
|---|---|---|---|---|---|---|---|---|---|---|
| ADA-USD | 0.809856 | 0.270042 | 0.208195 | -0.002803 | 0.331362 | 0.288446 | 0.252119 | -0.001545 | -0.001844 | 0.370304 |
| BNB-USD | 0.270042 | 0.567404 | 0.194072 | -0.002507 | 0.226596 | 0.242442 | 0.244439 | -0.001449 | -0.001012 | 0.233421 |
| BTC-USD | 0.208195 | 0.194072 | 0.402053 | -0.002262 | 0.205923 | 0.199494 | 0.158058 | -0.001108 | 0.00056 | 0.175071 |
| BUSD-USD | -0.002803 | -0.002507 | -0.002262 | 0.101079 | -0.002022 | -0.002948 | -0.003358 | 0.000748 | 0.000871 | -0.002295 |
| DOGE-USD | 0.331362 | 0.226596 | 0.205923 | -0.002022 | 1.674588 | 0.241352 | 0.168044 | -0.000744 | -0.000403 | 0.257123 |
| ETH-USD | 0.288446 | 0.242442 | 0.199494 | -0.002948 | 0.241352 | 0.427266 | 0.227332 | -0.001628 | -0.000231 | 0.262785 |
| MATIC-USD | 0.252119 | 0.244439 | 0.158058 | -0.003358 | 0.168044 | 0.227332 | 0.799462 | -0.001514 | -0.001329 | 0.215289 |
| USDC-USD | -0.001545 | -0.001449 | -0.001108 | 0.000748 | -0.000744 | -0.001628 | -0.001514 | 0.101511 | 0.000788 | -0.001065 |
| USDT-USD | -0.001844 | -0.001012 | 0.00056 | 0.000871 | -0.000403 | -0.000231 | -0.001329 | 0.000788 | 0.10263 | -0.001035 |
| XRP-USD | 0.370304 | 0.233421 | 0.175071 | -0.002295 | 0.257123 | 0.262785 | 0.215289 | -0.001065 | -0.001035 | 0.6671 |


In [None]:
from pypfopt.efficient_frontier import EfficientFrontier

ef = EfficientFrontier(mu, S)
weights = ef.max_sharpe()
cleaned_weights = ef.clean_weights()
print(cleaned_weights)

OrderedDict([('ADA-USD', 0.0), ('BNB-USD', 0.23753), ('BTC-USD', 0.0), ('BUSD-USD', 0.0), ('DOGE-USD', 0.04041), ('ETH-USD', 0.0), ('MATIC-USD', 0.72206), ('USDC-USD', 0.0), ('USDT-USD', 0.0), ('XRP-USD', 0.0)])


In [None]:
ef.portfolio_performance(verbose=True)

Expected annual return: 159.5%
Annual volatility: 74.1%
Sharpe Ratio: 2.12


(1.5946207229943448, 0.7413261070118093, 2.1240594498168144)
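
The reported figures can be checked by hand from the printed weights, expected returns and covariance entries. The sketch below does this for the three assets with non-zero weight. It assumes a risk-free rate of 2%, which is the value that reproduces the printed Sharpe ratio. Small differences from the output above come from the rounding of the printed inputs.

```python
import math

# Non-zero weights, expected returns and covariances copied from the outputs above.
assets = ["BNB-USD", "DOGE-USD", "MATIC-USD"]
w = [0.23753, 0.04041, 0.72206]
mu_sel = [0.938108, 0.715782, 1.859780]
cov_sel = [
    [0.567404, 0.226596, 0.244439],
    [0.226596, 1.674588, 0.168044],
    [0.244439, 0.168044, 0.799462],
]

# Expected return: weighted sum of the individual expected returns.
exp_return = sum(wi * mi for wi, mi in zip(w, mu_sel))

# Volatility: square root of the quadratic form w' S w.
n = len(assets)
variance = sum(w[i] * cov_sel[i][j] * w[j] for i in range(n) for j in range(n))
volatility = math.sqrt(variance)

# Sharpe ratio with an assumed risk-free rate of 2%.
RISK_FREE_RATE = 0.02
sharpe = (exp_return - RISK_FREE_RATE) / volatility

print(f"return={exp_return:.4f} volatility={volatility:.4f} sharpe={sharpe:.3f}")
```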

In [None]:
from pypfopt.discrete_allocation import DiscreteAllocation, get_latest_prices

latest_prices = get_latest_prices(data)
da = DiscreteAllocation(weights, latest_prices, total_portfolio_value=1000)
allocation, leftover = da.lp_portfolio()
print(allocation)

{'BNB-USD': 1, 'DOGE-USD': 318, 'MATIC-USD': 444}


## Hierarchical Risk Parity (HRP)

## Mean Conditional Value at Risk (mCVAR)
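
Before optimizing against it, it helps to see what the risk measure is. The sketch below estimates historical Value at Risk and Conditional Value at Risk at the 95% level with NumPy. The returns are synthetic and the seed, mean and standard deviation are arbitrary; this illustrates the risk measure only, not the optimizer.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic daily portfolio returns (illustration only).
port_ret = rng.normal(0.001, 0.04, 1000)

# Value at Risk at 95%: the loss exceeded on the worst 5% of days.
BETA = 0.95
var_95 = -np.quantile(port_ret, 1 - BETA)

# Conditional VaR: the average loss over those worst days.
tail = port_ret[port_ret <= -var_95]
cvar_95 = -tail.mean()

print(f"VaR 95%: {var_95:.2%}, CVaR 95%: {cvar_95:.2%}")
```

CVaR is never smaller than VaR at the same level, because it averages only the losses beyond the VaR threshold.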

## Machine Learning solution

## Deep Learning Solution

In [None]:
# DeepDow

# Streamlit App

In [None]:
# Design
#  3 pages app
## page 1: choose (portfolio, budget, horizon)
## page 2: basic EDA (history, volatility...)
## page 3: portfolio ratio/profits/probabilities

# Conclusion

Beyond the App
- Portfolio builder App

# References
[Getting Crypto Symbols](https://stackoverflow.com/a/74656748)

[Portfolio Optimization Using Python](https://github.com/areed1192/portfolio-optimization/blob/master/samples/portfolio_optimization.ipynb)

[Portfolio Builder](https://github.com/yeungadrian/PortfolioBuilder)

[Portfolio Selection with Graph Algorithms and Deep Learning](https://www.linkedin.com/pulse/portfolio-selection-graph-algorithms-deep-learning-maya-benowitz)

[G-Research Crypto Forecasting](https://www.kaggle.com/competitions/g-research-crypto-forecasting/overview)

[Tutorial to the G-Research Crypto Competition](https://www.kaggle.com/code/cstein06/tutorial-to-the-g-research-crypto-competition/notebook#Preprocessing)

[yfinance guide](https://www.qmr.ai/yfinance-library-the-definitive-guide/#Fetch_Historical_Prices_using_yfinance)

[Portfolio Optimization with PyPortfolioOpt](https://github.com/paulsg3/PortfolioOptimization/blob/main/Portfolio_Optimization.ipynb)

[PyPortfolioOpt Documentation](https://pyportfolioopt.readthedocs.io/en/latest/)

[Portfolio Optimization using Reinforcement Learning](https://github.com/kvsnoufal/portfolio-optimization)

[Multi-level Columns](https://stackoverflow.com/a/56080234)

[On the non-stationarity of financial time series: Impact on optimal portfolio selection](https://www.researchgate.net/publication/224905259_On_the_non-stationarity_of_financial_time_series_Impact_on_optimal_portfolio_selection)

[Stationary TS](https://analyticsindiamag.com/how-to-make-a-time-series-stationary/)

[Volatility Analysis](https://www.learnpythonwithrune.org/calculate-the-volatility-of-historic-stock-prices-with-pandas-and-python/)

[Volatility Analysis](https://blog.quantinsti.com/volatility-and-measures-of-risk-adjusted-return-based-on-volatility/)