# Cryptocurrency Historical Prices - Data Analytics Notebook - only modyfing data

<h4><b>Dataset name / short description:</b></h4>
Name is "Cryptocurrency Historical Prices". The dataset has one csv file for each currency. Price history is available on a daily basis from April 28, 2013. This dataset has the historical price information of some of the top crypto currencies by market capitalization.

<h4><b>Columns:</b></h4>

- Date: date of observation

- Open: Opening price on the given day

- High: Highest price on the given day

- Low: Lowest price on the given day

- Close: Closing price on the given day

- Volume: Volume of transactions on the given day

- Market Cap: Market capitalization in USD


<h4><b>Data source (url):</b></h4>
https://www.kaggle.com/datasets/sudalairajkumar/cryptocurrencypricehistory?resource=download

<h4><b>Data format:</b></h4>
csv file



In [6]:
# import modules
# numpy and pandas for downloading and analyzing
# seaborn and matplotlib for plots and chars
# dtale for open df in the browser for analyzing
# os and glob for combining csv-files into one
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# import modules to match the pattern (‘csv’) 
import glob
import os

In [7]:
# I tried many codes without repetition code lines
# to combine 22 csv-files into one
# all of them had errors except this one
# but every time when run the notebook, 
# the code continues to combine csv-files
# it creates duplicates
# I have to drop duplicates

# use glod to match the pattern (‘csv’) 
# and save the list of file names in the ‘all_filenames’ variable
extension = 'csv'
all_filenames = [i for i in glob.glob('*.{}'.format(extension))]

#combine all files in the list
combined_csv = pd.concat([pd.read_csv(f) for f in all_filenames ])

#export to csv and create a dataframe from csv-file
combined_csv.to_csv( "combined_csv.csv", index=False, encoding='utf-8-sig')
df = pd.read_csv('combined_csv.csv')

# drop duplicates
df = df.drop_duplicates()

# show the dataframe
df

Unnamed: 0,SNo,Name,Symbol,Date,High,Low,Open,Close,Volume,Marketcap
0,1,NEM,XEM,2015-04-02 23:59:59,0.000323,0.000227,0.000242,0.000314,2.854940e+04,2.823534e+06
1,2,NEM,XEM,2015-04-03 23:59:59,0.000330,0.000291,0.000309,0.000310,2.067790e+04,2.792457e+06
2,3,NEM,XEM,2015-04-04 23:59:59,0.000318,0.000251,0.000310,0.000277,2.355020e+04,2.488770e+06
3,4,NEM,XEM,2015-04-05 23:59:59,0.000283,0.000218,0.000272,0.000232,2.680020e+04,2.087388e+06
4,5,NEM,XEM,2015-04-06 23:59:59,0.000299,0.000229,0.000232,0.000289,2.251150e+04,2.598354e+06
...,...,...,...,...,...,...,...,...,...,...
518961,702,Wrapped Bitcoin,WBTC,2021-01-01 23:59:59,29594.913005,28788.139851,28963.726548,29349.442124,7.339634e+07,3.396052e+09
518970,711,Wrapped Bitcoin,WBTC,2021-01-10 23:59:59,41299.374329,35872.388720,40235.962172,38332.339179,2.199866e+08,4.435471e+09
518972,713,Wrapped Bitcoin,WBTC,2021-01-12 23:59:59,36462.202657,32806.331547,35333.425652,33853.324860,1.660921e+08,3.917200e+09
518980,721,Wrapped Bitcoin,WBTC,2021-01-20 23:59:59,36373.757289,33613.889006,36129.084356,35489.386571,1.852120e+08,3.966729e+09


In [8]:
# cleaning data
# we can see do have any missing values
df.isna().sum()

# we do not have NaN values

SNo          0
Name         0
Symbol       0
Date         0
High         0
Low          0
Open         0
Close        0
Volume       0
Marketcap    0
dtype: int64

In [9]:
# we can see that Date column
# has a default time format, we can delete hours, minutes and seconds
# we can make dormat as to year-month-day (int)
df['Date'] = df['Date'].str.slice(0, 10)

In [10]:
# actually, we can split Date column to smaller pieces
# probably, we could see correlations
# between days, months and years
# we make rows in colums as int
df["Year"] = df['Date'].str.slice(0, 4).astype(int)
df["Month"] = df['Date'].str.slice(5, 7).astype(int)
df["Day"] = df['Date'].str.slice(8, 10).astype(int)
df = df.drop('Date', axis=1)

In [11]:
df

Unnamed: 0,SNo,Name,Symbol,High,Low,Open,Close,Volume,Marketcap,Year,Month,Day
0,1,NEM,XEM,0.000323,0.000227,0.000242,0.000314,2.854940e+04,2.823534e+06,2015,4,2
1,2,NEM,XEM,0.000330,0.000291,0.000309,0.000310,2.067790e+04,2.792457e+06,2015,4,3
2,3,NEM,XEM,0.000318,0.000251,0.000310,0.000277,2.355020e+04,2.488770e+06,2015,4,4
3,4,NEM,XEM,0.000283,0.000218,0.000272,0.000232,2.680020e+04,2.087388e+06,2015,4,5
4,5,NEM,XEM,0.000299,0.000229,0.000232,0.000289,2.251150e+04,2.598354e+06,2015,4,6
...,...,...,...,...,...,...,...,...,...,...,...,...
518961,702,Wrapped Bitcoin,WBTC,29594.913005,28788.139851,28963.726548,29349.442124,7.339634e+07,3.396052e+09,2021,1,1
518970,711,Wrapped Bitcoin,WBTC,41299.374329,35872.388720,40235.962172,38332.339179,2.199866e+08,4.435471e+09,2021,1,10
518972,713,Wrapped Bitcoin,WBTC,36462.202657,32806.331547,35333.425652,33853.324860,1.660921e+08,3.917200e+09,2021,1,12
518980,721,Wrapped Bitcoin,WBTC,36373.757289,33613.889006,36129.084356,35489.386571,1.852120e+08,3.966729e+09,2021,1,20
