**This is the final project in Data Analysis.**

Goals:
1. Analyze the cryptocurrency market in the selected time period
2. Try to predict exchange rates (without taking external events into account)
3. Analyze current profitability and risk
4. Prepare forecasted financial statements
5. Analyze the share of cryptocurrencies in the economy

Description of variables

* slug - unique name of cryptocurrency (text)
* symbol - unique short name (text)
* name - name of cryptocurrency (text)
* date - date of the observation (stored as text)
* ranknow - current rank of the cryptocurrency (ordinal)
* open - opening price for the day (numerical)
* high - highest price for the day (numerical)
* low - lowest price for the day (numerical)
* close - closing price for the day (numerical)
* volume - trading volume for the day (numerical)
* market - market capitalization (numerical)
* close_ratio - position of the closing price within the day's range, (close - low) / (high - low) (numerical)
* spread - difference between the highest and the lowest price for the day (numerical)
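
The derived columns can be sanity-checked against the price columns. The sketch below assumes that `spread` equals `high - low` and that `close_ratio` equals `(close - low) / (high - low)`; both formulas are assumptions to be tested rather than confirmed definitions, and `check_derived_columns` and the toy rows are hypothetical.

```python
import numpy as np
import pandas as pd


def check_derived_columns(frame: pd.DataFrame, tol: float = 0.01) -> pd.Series:
    """Return the fraction of rows where each assumed formula matches the data."""
    price_range = frame["high"] - frame["low"]
    # Assumed formula: spread is the daily high minus the daily low.
    spread_ok = np.isclose(frame["spread"], price_range, atol=tol)
    # Assumed formula: close_ratio is the close min-max scaled into [low, high].
    # Rows with a zero range are skipped to avoid dividing by zero.
    nonzero = price_range > 0
    ratio = (frame.loc[nonzero, "close"] - frame.loc[nonzero, "low"]) / price_range[nonzero]
    ratio_ok = np.isclose(frame.loc[nonzero, "close_ratio"], ratio, atol=tol)
    return pd.Series({"spread": spread_ok.mean(), "close_ratio": ratio_ok.mean()})


# Hypothetical toy rows, not taken from the dataset.
toy = pd.DataFrame({
    "high": [10.0, 5.0],
    "low": [8.0, 4.0],
    "close": [9.0, 5.0],
    "spread": [2.0, 1.0],
    "close_ratio": [0.5, 1.0],
})
match = check_derived_columns(toy)
print(match)
```

Running `check_derived_columns(df)` on the full data would show how often each assumed formula holds; a value well below 1 would suggest the column is defined differently.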

In [34]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in 

import datetime as dt
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

import plotly.express as px
from plotly import tools
import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go

import xgboost as xgb

from sklearn.linear_model import LinearRegression as LinReg
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# Any results you write to the current directory are saved as output.

/kaggle/input/all-crypto-currencies/crypto-markets.csv


Read the data; the `symbol` and `slug` columns are dropped after the initial checks

In [35]:
df = pd.read_csv("/kaggle/input/all-crypto-currencies/crypto-markets.csv")

In [36]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 942297 entries, 0 to 942296
Data columns (total 13 columns):
slug           942297 non-null object
symbol         942297 non-null object
name           942297 non-null object
date           942297 non-null object
ranknow        942297 non-null int64
open           942297 non-null float64
high           942297 non-null float64
low            942297 non-null float64
close          942297 non-null float64
volume         942297 non-null float64
market         942297 non-null float64
close_ratio    942297 non-null float64
spread         942297 non-null float64
dtypes: float64(8), int64(1), object(4)
memory usage: 93.5+ MB


In [37]:
df.describe()

statistic,ranknow,open,high,low,close,volume,market,close_ratio,spread
count,942297.0,942297.0,942297.0,942297.0,942297.0,942297.0,942297.0,942297.0,942297.0
mean,1000.170608,348.3522,408.593,296.2526,346.1018,8720383.0,172506000.0,0.459499,112.34
std,587.575283,13184.36,16163.86,10929.31,13098.22,183980200.0,3575590000.0,0.32616,6783.713
min,1.0,2.5e-09,3.2e-09,2.5e-10,2e-10,0.0,0.0,-1.0,0.0
25%,465.0,0.002321,0.002628,0.002044,0.002314,175.0,29581.0,0.1629,0.0
50%,1072.0,0.023983,0.026802,0.021437,0.023892,4278.0,522796.0,0.4324,0.0
75%,1484.0,0.22686,0.250894,0.204391,0.225934,119090.0,6874647.0,0.7458,0.03
max,2072.0,2298390.0,2926100.0,2030590.0,2300740.0,23840900000.0,326502500000.0,1.0,1770563.0


In [None]:
df.mean(numeric_only=True)  # Mean of each numeric column (text columns are skipped)

In [None]:
df.quantile([0.25, 0.5, 0.75], numeric_only=True)  # Quartiles of each numeric column

In [None]:
df.std(numeric_only=True)  # Standard deviation of each numeric column

In [None]:
df.std(numeric_only=True) ** 2  # Variance is the square of the standard deviation
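
As a small illustration of the relation between variance and standard deviation, the sketch below uses hypothetical toy prices. It also shows that pandas computes the sample variance (`ddof=1`) by default, while NumPy's `np.var` defaults to the population variance (`ddof=0`), so the two can differ on the same data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy prices, not taken from the dataset.
prices = pd.Series([1.0, 2.0, 3.0, 4.0])

sample_var = prices.var()                   # pandas default: ddof=1 (sample variance)
std_squared = prices.std() ** 2             # same quantity via the standard deviation
population_var = np.var(prices.to_numpy())  # NumPy default: ddof=0 (population variance)

print(sample_var, std_squared, population_var)
```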

In [None]:
df.isnull().sum()  # Checking NULLs

In [None]:
df = df.drop(columns=['symbol', 'slug'])  # Drop identifier columns that are not used below

In [None]:
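# groupByName is not defined anywhere in the notebook as shown.
# Assumption: it holds the per-currency mean of every numeric column.
groupByName = df.groupby('name').mean(numeric_only=True)
groupByName.head(10)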

Traders commonly analyze averaged prices such as HLC (as well as OHLC and HL); see the [definition](https://www.mypivots.com/dictionary/definition/92/hlc-3)

In [None]:
df['hl_average'] = (df['high'] + df['low']) / 2
df['hlc_average'] = (df['high'] + df['low'] + df['close']) / 3
df['ohlc_average'] = (df['open'] + df['high'] + df['low'] + df['close']) / 4

Select the top 10 currencies by current rank

In [None]:
top10 = df[(df['ranknow'] >= 1) & (df['ranknow'] <= 10)]
top10.name.unique()

*Volume* - all trades (buys and sells) made during a given period, for example 24 hours, which is the default on CoinMarketCap.

*Circulating supply* - the number of coins that have been mined and currently exist.

*Market cap* = circulating supply multiplied by the price of a coin.
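
The market cap relation can be inverted to estimate the circulating supply from the columns that are available. This is a minimal sketch: it assumes the `market` column was computed from a price close to `close`, which the data does not confirm, and `implied_supply` and the toy rows are hypothetical.

```python
import pandas as pd


def implied_supply(frame: pd.DataFrame) -> pd.Series:
    """Estimate circulating supply as market cap divided by the closing price."""
    # A zero close would divide by zero, so those rows become NaN instead.
    close = frame["close"].where(frame["close"] > 0)
    return frame["market"] / close


# Hypothetical toy rows, not taken from the dataset.
toy = pd.DataFrame({"market": [1000.0, 500.0, 0.0], "close": [10.0, 0.0, 2.0]})
supply = implied_supply(toy)
print(supply)
```

On the real data this could be applied as `df['implied_supply'] = implied_supply(df)`; the result is only an estimate and should be read with that assumption in mind.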

In [None]:
fig = px.pie(top10, values='volume', names='name', title='Cryptocurrencies Top-10 by Transaction Volume')
fig.show()

In [None]:
fig = px.pie(top10, values='market', names='name', title='Cryptocurrencies Top-10 by Market capitalization')
fig.show()
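
Note that `top10` holds one row per currency per day, and a pie chart sums the values of repeated names, so the slices above reflect totals accumulated over the whole history rather than a single day. The sketch below makes that aggregation explicit and compares it with shares on the latest date only; `share_by_name` and the toy rows are hypothetical.

```python
import pandas as pd


def share_by_name(frame: pd.DataFrame, value_col: str) -> pd.Series:
    """Sum value_col per currency and express each total as a fraction of the whole."""
    totals = frame.groupby("name")[value_col].sum()
    return (totals / totals.sum()).sort_values(ascending=False)


# Hypothetical toy rows, not taken from the dataset.
toy = pd.DataFrame({
    "name": ["A", "A", "B"],
    "date": ["2018-01-01", "2018-01-02", "2018-01-02"],
    "market": [30.0, 30.0, 40.0],
})
cumulative = share_by_name(toy, "market")  # Shares over all dates
latest = share_by_name(toy[toy["date"] == toy["date"].max()], "market")  # Latest date only
print(cumulative)
print(latest)
```

For market capitalization in particular, the single-date share is probably the easier figure to interpret, since summing a daily snapshot value across days favors currencies with a longer history.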

In [None]:
from plotly.subplots import make_subplots  # plotly.tools.make_subplots is deprecated

fig = make_subplots(subplot_titles=('Time',))  # Trailing comma makes this a one-element tuple
for name in top10.name.unique():
    currency = top10[top10['name'] == name]
    trace = go.Scatter(x=currency['date'], y=currency['ohlc_average'], name=name)
    fig.add_trace(trace, row=1, col=1)

fig.update_layout(title='Top-10 Cryptocurrencies Comparison')
fig.update_yaxes(title_text='USD')
fig.show()

Add minor cryptocurrencies, which have little effect on the market

In [None]:
top10minorCurrencies = df[(df['ranknow'] >= 11) & (df['ranknow'] <= 20)]  # Ranks 11-20: ten currencies

top10minorCurrencies.name.unique()

In [None]:
fig = px.pie(top10minorCurrencies, values='volume', names='name', title='Minor Cryptocurrencies by Transaction Volume')
fig.show()

In [None]:
fig = px.pie(top10minorCurrencies, values='market', names='name', title='Minor Cryptocurrencies by Market capitalization')
fig.show()

In [None]:
top10loserCoins = df[df['ranknow'] > df['ranknow'].max() - 10]  # The ten lowest ranks

top10loserCoins.name.unique()

In [None]:
fig = px.pie(top10loserCoins, values='volume', names='name', title='Loser Cryptocurrencies by Transaction Volume')
fig.show()

In [None]:
fig = px.pie(top10loserCoins, values='market', names='name', title='Loser Cryptocurrencies by Market capitalization')
fig.show()