<div>
    <h1><center style="background-color:#6A5ACD; color:white;"> Dogecoin Historical Data Analysis</center></h1>
</div>


<div>
<img src="https://i.pinimg.com/originals/d6/7e/a9/d67ea9ed1d1a8bb3fd69dad57e11e23c.gif" width='400'>
</div>

<a id="top"></a>

<div class="list-group" id="list-tab" role="tablist">
<h3 class="list-group-item list-group-item-action active" data-toggle="list" style='background-color:#6A5ACD; border:0' role="tab" aria-controls="home" color=black><center>Quick navigation</center></h3>

* [1. Required Libraries](#1)
* [2. Dataset Quick Overview](#2)
* [3. Distribution of Features](#3)
* [4. Correlation Analysis ](#4)   
* [5. Candle stick chart on Dogecoin historical data](#5)
* [6. Time series analysis and prediction using prophet](#6)
* [7. References](#7)
* [8. Related works](#8)
* [9. Some interesting factors, which led to recent rise in dogecoin](#9)



<div class="alert alert-info">
<p><center><b>Dogecoin is a cryptocurrency created by software engineers Billy Markus and Jackson Palmer, who set out to build a payment system that is instant, fun, and free from traditional banking fees.</b></center></p>

- Circulating supply: 127 billion (113 billion coins have already been mined)
- Original author(s): Billy Markus, Jackson Palmer
- Initial release: December 6, 2013; 7 years ago

<p>Here is a list of a few interesting facts about Dogecoin:</p>

<p>1. Dogecoin started as a joke created by Jackson Palmer and Billy Markus in November 2013. Markus recently said that he sold all of his DOGE in 2015.</p>
<p>2. There are 128,264,356,384 DOGE coins in circulation at this moment, compared to 18.5 million bitcoins.</p>
<p>3. Dogecoin hosts one of the largest communities in the crypto space.</p>
<p>4. In 2014, the Dogecoin community raised $55,000 to sponsor NASCAR driver Josh Wise and covered his car entirely in Dogecoin and Reddit alien images.</p>
</div>

<a id="1"></a>
<h2 style='background-color:#6A5ACD; border:0; color:black'><center>Required Libraries</center></h2>

In [None]:

#Data Pre-Processing packages:
import numpy as np 
import pandas as pd 
from datetime import datetime


#Data Visualization Packages:
#Seaborn
import seaborn as sns
sns.set(rc={'figure.figsize':(10,6)})
custom_colors = ["#4e89ae", "#c56183","#ed6663","#ffa372"]

#Matplotlib
import matplotlib.pyplot as plt
%matplotlib inline
import matplotlib.image as mpimg

#Colorama
from colorama import Fore, Back, Style # For text colors
y_= Fore.CYAN
m_= Fore.WHITE

#NetworkX
import networkx as nx
import plotly.graph_objects as go #To construct network graphs

#To avoid printing unnecessary deprecation and future warnings
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=FutureWarning)

#Time series analysis packages:

from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.stattools import kpss
from statsmodels.tsa.stattools import adfuller
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

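#Prophet packages (the library was renamed from fbprophet to prophet in version 1.0):
try:
    from prophet import Prophet
    from prophet.diagnostics import cross_validation, performance_metrics
    from prophet.plot import add_changepoints_to_plot, plot_cross_validation_metric
except ImportError:
    from fbprophet import Prophet
    from fbprophet.diagnostics import cross_validation, performance_metrics
    from fbprophet.plot import add_changepoints_to_plot, plot_cross_validation_metric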


#Importing of Data 
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))
        
data=pd.read_csv('../input/dogecoin-historical-data/DOGE-USD.csv')



<a id="2"></a>
<h2 style='background-color:#6A5ACD; border:0; color:black'><center>Dataset Overview</center></h2>

In [None]:
print(f"{m_}Total records:{y_}{data.shape}\n")
print(f"{m_}Data types of data columns: \n{y_}{data.dtypes}")

In [None]:
# Converting the Date column to datetime (assumes ISO dates such as 2021-04-16) and sorting the dataframe by date
data['Date'] = pd.to_datetime(data['Date'], format='%Y-%m-%d')
data.sort_values(by='Date',inplace=True)
data.head()

<div class="alert alert-info">
    <h3><b><center>Missing values</center></b></h3>
</div>


In [None]:
missed = pd.DataFrame()
missed['column'] = data.columns

missed['percent'] = [round(100* data[col].isnull().sum() / len(data), 2) for col in data.columns]
missed = missed.sort_values('percent',ascending=False)
missed = missed[missed['percent']>0]

fig = sns.barplot(
    x=missed['percent'], 
    y=missed["column"], 
    orient='h',palette="winter"
).set_title('Percentage of missing values per column')

<div class="alert alert-info">
    <h3><b>Let's drop the records with NA values so that they don't cloud our analysis</b></h3>
</div>


In [None]:
prev_len=data.shape[0]
data=data.dropna()
print(f"{m_}Total records after the removal of NA values: {y_}{data.shape}\n")
print(f"{m_}Removed records:{y_}{prev_len-data.shape[0]}\n")
print(f"{m_}Removed records percentage:{y_}{round(((prev_len-data.shape[0])/prev_len)*100,2)}")

<a id="3"></a>
<h2 style='background-color:#6A5ACD; border:0; color:black'><center>Distribution of Features</center></h2>

<div class="alert alert-info">
    <h3><b>Let's visualize the distribution of the key variables for Dogecoin: opening price, highest price, lowest price and closing price</b></h3>
</div>


In [None]:
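def triple_plot(x, title,c):
    # Histogram + KDE, boxplot and violin plot of one variable on a shared x-axis
    fig, ax = plt.subplots(3,1,figsize=(20,10),sharex=True)
    # distplot is deprecated in seaborn; histplot with kde=True replaces it
    sns.histplot(x=x, kde=True, stat='density', ax=ax[0],color=c)
    ax[0].set(xlabel=None)
    ax[0].set_title('Histogram + KDE')
    # pass the series by keyword so that it is drawn along the x-axis
    sns.boxplot(x=x, ax=ax[1],color=c)
    ax[1].set(xlabel=None)
    ax[1].set_title('Boxplot')
    sns.violinplot(x=x, ax=ax[2],color=c)
    ax[2].set(xlabel=None)
    ax[2].set_title('Violin plot')
    fig.suptitle(title, fontsize=30)
    plt.tight_layout(pad=3.0)
    plt.show()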

In [None]:
triple_plot(data['Open'],'Distribution of Opening Price(in dollars)',custom_colors[0])

In [None]:
triple_plot(data['Close'],'Distribution of Closing Price(in dollars)',custom_colors[1])

In [None]:
triple_plot(data['High'],'Distribution of Highest Price(in dollars)',custom_colors[2])

In [None]:
triple_plot(data['Low'],'Distribution of Lowest Price(in dollars)',custom_colors[3])

In [None]:
triple_plot(data['Volume'],'Distribution of Volume',custom_colors[0])

<a id="4"></a>
<h2 style='background-color:#6A5ACD; border:0; color:black'><center>Correlation Analysis</center></h2>

In [None]:
plt.figure(figsize=(10,10))
corr=data[data.columns[1:]].corr()  # skip the first column (Date)
mask = np.triu(np.ones_like(corr, dtype=bool))  # hide the redundant upper triangle
sns.heatmap(corr, mask=mask, cmap='coolwarm', vmax=.3, center=0,
            square=True, linewidths=.5,annot=True)
plt.show()

<div class="alert alert-info">
    <h3><b><center>Correlation Network</center></b></h3>
</div>


In [None]:
indices = corr.index.values
# from_numpy_matrix was removed in NetworkX 3.0; from_numpy_array takes a plain ndarray
G = nx.from_numpy_array(corr.to_numpy())
G = nx.relabel_nodes(G,lambda x: indices[x])
#G.edges(data=True)

In [None]:
def corr_network(G, corr_direction, min_correlation):
    H = G.copy()

    for s1, s2, weight in G.edges(data=True):       
        if corr_direction == "positive":
            if weight["weight"] < 0 or weight["weight"] < min_correlation:
                H.remove_edge(s1, s2)
        else:
            if weight["weight"] >= 0 or weight["weight"] > min_correlation:
                H.remove_edge(s1, s2)
                
    edges,weights = zip(*nx.get_edge_attributes(H,'weight').items())
    weights = tuple([(1+abs(x))**2 for x in weights])
   
    d = dict(nx.degree(H))
    nodelist=d.keys()
    node_sizes=d.values()
    
    positions=nx.circular_layout(H)
    
    plt.figure(figsize=(10,10))

    nx.draw_networkx_nodes(H,positions,node_color='#d100d1',nodelist=nodelist,
                       node_size=tuple([x**4 for x in node_sizes]),alpha=0.8)

    nx.draw_networkx_labels(H, positions, font_size=13)

    if corr_direction == "positive":
        edge_colour = plt.cm.summer 
    else:
        edge_colour = plt.cm.autumn
        
    nx.draw_networkx_edges(H, positions, edgelist=edges,style='solid',
                          width=weights, edge_color = weights, edge_cmap = edge_colour,
                          edge_vmin = min(weights), edge_vmax=max(weights))
    plt.axis('off')
    plt.show() 

In [None]:
corr_network(G, corr_direction="positive",min_correlation = 0.5)


<a id="5"></a>
<h2 style='background-color:#6A5ACD; border:0; color:black'><center>Candlestick chart of Dogecoin price movements</center></h2>


![](https://www.tradingwithrayner.com/wp-content/uploads/2018/05/1-OHLC-COMBINE.png)

In [None]:
fig = go.Figure(data=[go.Candlestick(x=data['Date'],
                open=data['Open'], high=data['High'],
                low=data['Low'], close=data['Close'])
                      ])
fig.show()

In [None]:
data['month']=data['Date'].dt.month
fig = go.Figure(data=[go.Candlestick(x=data['month'],
                open=data['Open'], high=data['High'],
                low=data['Low'], close=data['Close'])
                      ])
fig.show()

<a id="6"></a>
<h2 style='background-color:#6A5ACD; border:0; color:black'><center>Time Series Analysis and Prediction using Prophet</center></h2>



<img src="https://insightimi.files.wordpress.com/2020/07/on-de793_201909_g_20190830121038.gif" width='500' height='500'>

<div class="alert alert-warning">
    <h1><b><center>What is Prophet?</center></b></h1>
    <h3>Prophet is Facebook's open-source library for time series forecasting. It decomposes a time series into trend, seasonality and holiday components, and its hyperparameters are intuitive and easy to tune.</h3>
</div>




<div class='alert alert-warning'>
    <h2><b><center>Advantages of using Prophet</center></b></h2>
        <h3> 1. Accommodates seasonality with multiple periods</h3>
        <h3> 2. Resilient to missing values</h3>
        <h3> 3. Outliers can simply be removed, since the model copes with the resulting gaps</h3>
        <h3> 4. Model fitting is fast</h3>
        <h3> 5. Intuitive hyperparameters that are easy to tune</h3>
</div>

In [None]:
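series = data.Close
# period=1 would leave the seasonal component empty; 7 assumes a weekly cycle in daily data
result = seasonal_decompose(series, model='additive',period=7)
fig = result.plot()
fig.set_size_inches(15,12)  # result.plot() creates its own figure, so resize that one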

<div class='alert alert-warning'>
    <h3><center>The input to Prophet is a dataframe with at least two columns: ds and y</center></h3>
</div>

In [None]:
# Renaming the columns according to Prophet's requirements
# (rename on a new frame instead of inplace to avoid a SettingWithCopyWarning)
prophet_df=data[['Date','Close']].rename(columns={'Date':'ds','Close':'y'})


<div class='alert alert-warning'>
    <h2><center>Creating and fitting the Prophet model with default values</center></h2>
    <h3>We will first explore the default Prophet model: create the Prophet instance with all default values and fit it to the dataset.</h3>
</div>

In [None]:
prophet_basic = Prophet()
prophet_basic.fit(prophet_df[['ds','y']])

<div class='alert alert-warning'>
    <h2><center>Predicting the values for the future</center></h2>
<h4>To predict values with Prophet, we need a dataframe with a ds (datestamp) column containing the dates for which we want predictions.<br><br>
We use make_future_dataframe(), to which we pass the number of days to extend into the future. By default, it also includes the dates from the history.</h4>
</div>
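
As a rough, pandas-only illustration of what such a future frame contains (it does not call Prophet; the toy `history` dates and the helper name `sketch_future_frame` are made up for this example), the historical dates come first, followed by `periods` additional daily dates:

```python
import pandas as pd

def sketch_future_frame(history_ds, periods, include_history=True):
    """Mimic the shape of a Prophet future frame for daily data (illustration only)."""
    history_ds = pd.Series(pd.to_datetime(history_ds)).sort_values().reset_index(drop=True)
    last_date = history_ds.iloc[-1]
    # `periods` new daily timestamps, starting the day after the last observation
    new_dates = pd.Series(
        pd.date_range(start=last_date + pd.Timedelta(days=1), periods=periods, freq="D")
    )
    ds = pd.concat([history_ds, new_dates]) if include_history else new_dates
    return pd.DataFrame({"ds": ds.reset_index(drop=True)})

history = pd.date_range("2021-01-01", periods=5, freq="D")  # toy history, not the real data
future_sketch = sketch_future_frame(history, periods=3)
print(future_sketch.tail(4))
```

With five historical days and `periods=3`, the frame has eight rows: the five known dates plus three new ones.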

In [None]:
future= prophet_basic.make_future_dataframe(periods=300)
future.tail(2)

In [None]:
forecast=prophet_basic.predict(future)

<div class='alert alert-warning'>
    <h3><center>Plotting the predicted data</center></h3>
</div>

In [None]:
fig1 =prophet_basic.plot(forecast)

<div class='alert alert-warning'>
    <h3><center>Plotting the forecast components (trend and seasonality)</center></h3>
</div>

In [None]:
fig1 = prophet_basic.plot_components(forecast)

<div class='alert alert-warning'>
    <h3><center>Adding ChangePoints to Prophet</center></h3>
<h4>Changepoints are the points in time where the time series changes trajectory abruptly.<br>
By default, Prophet adds 25 changepoints to the initial 80% of the dataset.<br>
    <br><center>Let’s plot vertical lines where the potential changepoints occurred</center></h4>
    </div>
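
A small sketch of how that default placement could be reproduced by hand, assuming the candidate changepoints are spaced evenly over the first 80% of the observations. The function name `default_changepoint_indexes` is hypothetical, and the spacing rule reflects my reading of Prophet's behaviour rather than its exact code:

```python
def default_changepoint_indexes(n_obs, n_changepoints=25, changepoint_range=0.8):
    """Return row positions of evenly spaced candidate changepoints (illustrative only)."""
    hist_size = int(n_obs * changepoint_range)            # rows eligible for changepoints
    n_changepoints = min(n_changepoints, hist_size - 1)   # cannot exceed the available rows
    if n_changepoints <= 0:
        return []
    step = (hist_size - 1) / n_changepoints
    # position 0 is skipped: the first observation itself is not a changepoint
    return [round(i * step) for i in range(1, n_changepoints + 1)]

idx = default_changepoint_indexes(n_obs=1000)
print(len(idx), idx[:3], idx[-1])
```

For a series of 1,000 rows this yields 25 positions, all of them within the first 800 rows, which is why no vertical line appears in the most recent 20% of the history.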

In [None]:
fig = prophet_basic.plot(forecast)
a = add_changepoints_to_plot(fig.gca(), prophet_basic, forecast)

<div class='alert alert-warning'>
        <h3> The following are the changepoints where the time series changed trajectory abruptly.</h3>
</div>

In [None]:
print(f'{m_}Change points:\n {y_}{prophet_basic.changepoints}\n')

<div class='alert alert-warning'>
    <h2><center>Adding Multiple Regressors</center></h2>
<h3>Additional regressors can be added to the Prophet model with add_regressor. The column for each additional regressor must be present in both the fitting and the prediction dataframes.</h3>
    <h3><center>Creating fitting and predicting dataset with additional regressors</center></h3>
    </div>
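
Before fitting or predicting, it can help to check that every regressor column really is present and complete in both frames. The helper below, `missing_regressor_columns`, is a hypothetical convenience function and the frames are toy data, so treat it as a sketch rather than part of Prophet:

```python
import pandas as pd

def missing_regressor_columns(frame, regressors):
    """List regressor names that are absent from `frame` or contain NaN values."""
    problems = []
    for name in regressors:
        if name not in frame.columns:
            problems.append(f"{name}: column missing")
        elif frame[name].isna().any():
            problems.append(f"{name}: contains NaN")
    return problems

regressors = ["Open", "High", "Low", "Vol"]
fit_frame = pd.DataFrame({
    "ds": pd.date_range("2021-01-01", periods=3),
    "y": [1.0, 1.1, 1.2],
    "Open": [1.0, 1.0, 1.1], "High": [1.2, 1.2, 1.3],
    "Low": [0.9, 1.0, 1.0], "Vol": [10, 12, 11],
})
predict_frame = fit_frame.drop(columns=["y", "Vol"])  # toy frame that forgot 'Vol'

print(missing_regressor_columns(fit_frame, regressors))
print(missing_regressor_columns(predict_frame, regressors))
```

Keep in mind that for dates beyond the dataset, the regressor values (open, high, low, volume) are not known in advance, so they would themselves have to be forecast or assumed.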

In [None]:
prophet_df['Open'] = data['Open']
prophet_df['High'] = data['High']
prophet_df['Low'] = data['Low']
prophet_df['Vol'] = data['Volume']

prophet_df=prophet_df.dropna()
train_X= prophet_df[:1500]
test_X= prophet_df[1500:]

In [None]:
pro_regressor= Prophet()
pro_regressor.add_regressor('Open')
pro_regressor.add_regressor('High')
pro_regressor.add_regressor('Low')
pro_regressor.add_regressor('Vol')



In [None]:
#Fitting the data
pro_regressor.fit(train_X)
future_data = pro_regressor.make_future_dataframe(periods=249)

In [None]:
#Forecast on the test data
forecast_data = pro_regressor.predict(test_X)
pro_regressor.plot(forecast_data);

In [None]:
forecast_data.tail()

<div class='alert alert-warning'>
<h3><center>Six metrics are reported for each forecast horizon, each smoothed with a moving average over a window of about 37 days in this case (the window can be changed with the ‘rolling_window’ option).</center></h3>
</div>
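
To make the smoothing concrete, here is a plain-Python sketch of a moving-average MAPE over forecasts sorted by horizon. It assumes that the window is a fraction of the cross-validation rows, which is how I understand `rolling_window` to work; the function name `rolling_mape` and the toy numbers are made up:

```python
def rolling_mape(horizons, y_true, y_pred, rolling_window=0.1):
    """Moving-average MAPE over forecasts sorted by horizon (illustrative sketch)."""
    rows = sorted(zip(horizons, y_true, y_pred))                  # order forecasts by horizon
    ape = [abs((t - p) / t) for _, t, p in rows]                  # absolute percentage errors
    w = min(max(int(rolling_window * len(rows)), 1), len(rows))   # window size in rows
    out = []
    for end in range(w, len(rows) + 1):
        window = ape[end - w:end]
        out.append((rows[end - 1][0], sum(window) / w))           # (horizon, mean APE)
    return out

horizons = list(range(1, 11))             # toy horizons of 1..10 days
y_true = [100.0] * 10                     # made-up actual values
y_pred = [100.0 + h for h in horizons]    # error grows with the horizon
smoothed = rolling_mape(horizons, y_true, y_pred, rolling_window=0.2)
print(smoothed[0], smoothed[-1])
```

A larger `rolling_window` averages over more forecasts, which gives a smoother curve but fewer points at the shortest horizons.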

In [None]:
df_cv = cross_validation(pro_regressor, initial='100 days', period='180 days', horizon = '365 days')
pm = performance_metrics(df_cv, rolling_window=0.1)
display(pm.head(),pm.tail())
fig = plot_cross_validation_metric(df_cv, metric='mape', rolling_window=0.1)
plt.show()

<div class='alert alert-info'>
    <h3><center>MAPE</center></h3>
    <p>The MAPE (Mean Absolute Percentage Error) measures the size of the error in percentage terms. It is calculated as the average of the unsigned percentage errors.</p>
    <p>Many organizations focus primarily on the MAPE when assessing forecast accuracy. Most people are comfortable thinking in percentage terms, which makes the MAPE easy to interpret. It can also convey information when you don’t know the item’s demand volume. For example, telling your manager "we were off by less than 4%" is more meaningful than saying "we were off by 3,000 cases" if your manager doesn’t know the item’s typical demand volume.</p>
    <img src="https://www.forecastpro.com/Trends/images/MAPE1.jpg" width='500'>
</div>
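
As a minimal sketch of that calculation (the prices and forecasts below are invented, and the helper `mape` is not part of Prophet), the MAPE is the mean of |actual − forecast| / |actual|, expressed in percent:

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Assumes no actual value is zero."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

actual = [0.40, 0.50, 0.25]      # made-up prices
forecast = [0.36, 0.55, 0.25]    # made-up forecasts
print(round(mape(actual, forecast), 2))
```

Because each error is divided by the actual value, the MAPE becomes very large when the actual values are close to zero, which is worth remembering for the early, sub-cent Dogecoin prices.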

<div class='alert alert-warning'>
    <h3><center>What Prophet doesn't do</center></h3>
    <h4><b>1. Prophet does not allow a non-Gaussian noise distribution</b></h4> 
<p>In Prophet, the noise distribution is always Gaussian, so transforming the y values beforehand is the only way to handle data that follow a skewed distribution.</p>
        <h4><b>2. Prophet does not take autocorrelation of the residuals into account</b></h4>
<p>Because the epsilon noise term in the formula is assumed to be i.i.d. normal, the residuals are assumed to have no autocorrelation, unlike in an ARIMA model.</p>
        <h4><b>3. Prophet does not assume a stochastic trend</b></h4>
<p>Prophet’s trend component is always deterministic, with possible changepoints; unlike ARIMA, it does not assume a stochastic trend.</p>
</div>
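
One common way to apply the pre-transformation mentioned in point 1 is to model the logarithm of the price and convert the predictions back afterwards. This is only a sketch under the assumption that all prices are strictly positive; the helper names and the sample prices are made up, and this notebook does not apply the transform:

```python
import math

def to_log_scale(values):
    """Natural log of each value; requires strictly positive inputs."""
    return [math.log(v) for v in values]

def from_log_scale(values):
    """Inverse of to_log_scale; would apply to yhat, yhat_lower and yhat_upper alike."""
    return [math.exp(v) for v in values]

prices = [0.002, 0.05, 0.43, 0.66]    # made-up, strongly skewed prices
y_log = to_log_scale(prices)           # what would be passed to the model as y
restored = from_log_scale(y_log)       # back to dollars after predicting
print([round(v, 4) for v in restored])
```

On the log scale, a jump from 0.002 to 0.05 counts for more than a jump from 0.43 to 0.66, which suits a series whose early values are tiny compared with its recent ones.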

<div class='alert alert-info'>
    <h3> <center>Interesting factor</center> </h3>
</div>

<div class='alert alert-warning'>
    <h3><center><b>Tweet by Mark Cuban on 14 April 2021</b>, American billionaire entrepreneur and owner of the National Basketball Association's (NBA) Dallas Mavericks</center></h3>
<br><h4>FYI, the Mavs sales in @dogecoin have increased 550pct over the past month. We have now sold more than 122k Doge in merchandise ! We will never sell 1 single Doge ever. So keep buying</h4> 
</div>

![](https://images.news18.com/ibnlive/uploads/2021/04/1618473964_untitled-design-3.jpg?impolicy=website&width=534&height=356)    

<div class='alert alert-info'>
    <h3>Yes, the name is Elon Musk. His single tweet about Dogecoin on Apr 15, 2021, which read <b>Doge Barking at the Moon</b>, received around 20.8K comments, 52.3K retweets, and 314.1K likes</h3>
</div>

In [None]:
data[(data['Date'].dt.year==2021)&(data['month']==4)&(data['Date'].dt.day>=13)&(data['Date'].dt.day<17)]

In [None]:
###Just look at the numbers!
    
# On 13 April, 1 Dogecoin was trading at $0.09
# On 14 April, it went up to a high of $0.14 (Mark Cuban tweets about Doge)
# On 15 April, it went up to a high of $0.18 (Elon Musk tweets about Doge)
# On 16 April, it hit an all-time high of $0.43. Yes, that's true 😱😱😱


#As of 11:10 a.m. on Friday (16 April), the value of Dogecoin had jumped 203%
#in just the past 24 hours to $0.404, according to Coinbase,
#giving the cryptocurrency a market cap of $52.2 billion.

#Over the past week, Dogecoin’s value has more than quintupled.

#Dogecoin was the seventh-largest cryptocurrency by market cap



In [None]:
#If you thought the numbers above were surprising, there was more to come.
#Things changed pretty quickly once Elon Musk posted this tweet
#and multiple companies started accepting Doge as a payment method

![](https://www.thesun.co.uk/wp-content/uploads/2021/04/doge.png)

In [None]:
data[(data['Date'].dt.year==2021)&(data['month']==4)&(data['Date'].dt.day>=26)&(data['Date'].dt.day<30)]

<div class='alert alert-info'>
<h4>1. Elon Musk's tweet titled THE DOGEFATHER, announcing his appearance on Saturday Night Live (May 8, on NBC) with Miley Cyrus, took the Dogecoin price up to $0.344.</h4>

<h4>2. Speculation about a possible cryptocurrency sketch pushed Doge past the $0.50 mark for the first time. It peaked at $0.66 on May 5.</h4>

<h4>3. Dogecoin rose 140%, from $0.2747 on April 27 to $0.6618 on May 5.</h4>
</div>
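
The 140% figure in point 3 can be checked with a one-line percentage-change calculation. The helper name `percent_change` is just for illustration; the two prices are the ones quoted above:

```python
def percent_change(start, end):
    """Relative change from start to end, in percent."""
    return (end - start) / start * 100

rise = percent_change(0.2747, 0.6618)   # April 27 price -> May 5 price
print(f"{rise:.1f}%")
```

The same helper works for drops: a negative result means the price fell over the period.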

In [None]:
data[(data['Date'].dt.year==2021)&(data['month']==5)&(data['Date'].dt.day>=1)&(data['Date'].dt.day<7)]

<div class='alert alert-info'>
<h4>Over the past week, Dogecoin's closing price rose significantly, from around 0.39 USD on May 1 to 0.58 USD on May 6.</h4>

<h4> 1. People are buying heavily to make sure that they hold Dogecoin before SNL on May 8</h4>
<h4> 2. Huge volatility has been observed, with a number of whales apparently playing a bear game</h4>
</div>

<div class='alert alert-info'>
<h3>But to be honest, it is not always PROFIT PROFIT PROFIT with any financial asset, especially crypto</h3>

<h3>People were expecting a huge rise in the Dogecoin price after the SNL episode hosted by Elon Musk, with Miley Cyrus as musical guest</h3>

<h3> But the result was the complete opposite</h3>
<p>Dogecoin fell as much as 29.5%, dropping to 49 cents, during Elon Musk’s SNL debut.</p>
<p> Possible reason: heavy whale activity playing a bear game, where a whale dumps (sells) a large portion of their Doge, creating a large supply against weaker demand and therefore pushing the Dogecoin price down significantly</p>
    <p> There might be other possible reasons; please share them in the comments section</p>
</div>

<div class ='alert alert-warning'>
    <h2><center> DO YOU THINK DOGE WILL REACH THE MOON?📈🐕🌚🚀🚀🚀🚀🚀</center></h2>
    <h4>Let me know in the comments section and stay tuned for more updates</h4>
 </div>

<a id="7"></a>
## References 

1. [Time series prediction using Prophet in Python by Renu Khandelwal](https://towardsdatascience.com/time-series-prediction-using-prophet-in-python-35d65f626236)
2. [Facebook Prophet by Moto DEI](https://medium.com/swlh/facebook-prophet-426421f7e331)
3. [Housing prices EDA and Prediction by Ruchi Bhatia](https://www.kaggle.com/ruchi798/housing-prices-eda-and-prediction)
4. [88.9 r2_score with pycaret by Kerem Yucedag](https://www.kaggle.com/keremyceda/88-9-r2-score-with-pycaret)

## [Credits to Dhruvil Dave for the dataset](https://www.kaggle.com/dhruvildave/dogecoin-historical-data)

<a id="8"></a>
## Related works

Below are some of my other cryptocurrency related datasets and notebooks. Do let me know your thoughts. Thank you!

### Notebooks:
1. [₿ Bitcoin Prices : EDA and Prediction (R2~0.99)](https://www.kaggle.com/kaushiksuresh147/bitcoin-prices-eda-and-prediction-r2-0-99)
2. [Ethereum EDA and Prediction using Prophet](https://www.kaggle.com/kaushiksuresh147/ethereum-eda-and-prediction-using-prophet)
3. [People's reaction on India's proposed crypto ban](https://www.kaggle.com/kaushiksuresh147/people-s-reaction-on-india-s-proposed-crypto-ban)


### Datasets:
1. [Ethereum Cryptocurrency Historical Dataset](https://www.kaggle.com/kaushiksuresh147/ethereum-cryptocurrency-historical-dataset)
2. [Matic(Polygon) Cryptocurrency Historical Dataset](https://www.kaggle.com/kaushiksuresh147/maticpolygon-crytocurrency-historical-dataset)
3. [Bitcoin Tweets](https://www.kaggle.com/kaushiksuresh147/bitcoin-tweets)
4. [#IndiaWantsCrypto tweets](https://www.kaggle.com/kaushiksuresh147/india-wants-crypto-tweets)
