<a href="https://colab.research.google.com/github/Papadopoulos18/Cryptocurrency-predicting-RNN-BTC-LTC-BCH-ETH-with-Tensorflow/blob/main/Cryptocurrency_predicting_RNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
import pandas as pd 
import os



Upload the files manually to Google Colab (the data can be downloaded from https://pythonprogramming.net/static/downloads/machine-learning-data/crypto_data.zip).

We name the columns of the .csv file when reading it:

In [4]:
df = pd.read_csv("/content/LTC-USD.csv", names=["time", "low", "high", "open", "close", "volume"])
print(df.head())

         time        low       high       open      close      volume
0  1528968660  96.580002  96.589996  96.589996  96.580002    9.647200
1  1528968720  96.449997  96.669998  96.589996  96.660004  314.387024
2  1528968780  96.470001  96.570000  96.570000  96.570000   77.129799
3  1528968840  96.449997  96.570000  96.570000  96.500000    7.216067
4  1528968900  96.279999  96.540001  96.500000  96.389999  524.539978


We want the close and volume columns from each of the 4 .csv files. The only column the 4 files have in common is "time", so we use it as the shared index when joining them.

In [32]:
main_df = pd.DataFrame() # begin empty

ratios = ["BTC-USD", "LTC-USD", "BCH-USD", "ETH-USD"]  # the 4 ratios we want to consider
for ratio in ratios:
  print(ratio)
  dataset = f"/content/{ratio}.csv"
  # print(dataset)

  df = pd.read_csv(dataset, names=["time", "low", "high", "open", "close", "volume"])
  # print(df.head()) we want to work with close and volume
  df.rename(columns={"close": f"{ratio}_close", "volume": f"{ratio}_volume"}, inplace=True)

  df.set_index("time", inplace=True)
  df = df[[f'{ratio}_close',f"{ratio}_volume"]] # ignore the other columns besides price and volume
  # print(df.head())

  # now we want to merge those 4
  if len(main_df) == 0:           #i.e. is empty
    main_df = df
  else:
    main_df = main_df.join(df)

main_df.ffill(inplace=True)  # if there are gaps in data, use previously known values (fillna(method="ffill") is deprecated in recent pandas)
main_df.dropna(inplace=True)
print(main_df.head())

BTC-USD
LTC-USD
BCH-USD
ETH-USD
            BTC-USD_close  BTC-USD_volume  ...  ETH-USD_close  ETH-USD_volume
time                                       ...                               
1528968720    6487.379883        7.706374  ...      486.01001       26.019083
1528968780    6479.410156        3.088252  ...      486.00000        8.449400
1528968840    6479.410156        1.404100  ...      485.75000       26.994646
1528968900    6479.979980        0.753000  ...      486.00000       77.355759
1528968960    6480.000000        1.490900  ...      486.00000        7.503300

[5 rows x 8 columns]
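
To see what the join, forward-fill and dropna steps do, here is a minimal sketch on made-up data. The two small frames and their values are hypothetical and only stand in for the real .csv files:

```python
import pandas as pd

# Hypothetical toy data: two coins sampled on slightly different minutes.
btc = pd.DataFrame(
    {"BTC-USD_close": [100.0, 101.0, 102.0, 103.0]},
    index=pd.Index([60, 120, 180, 240], name="time"),
)
ltc = pd.DataFrame(
    {"LTC-USD_close": [10.0, 11.0]},
    index=pd.Index([120, 240], name="time"),
)

# join() is a left join on the index: every BTC timestamp is kept,
# and LTC values are NaN where LTC has no row for that minute.
merged = btc.join(ltc)

# Forward-fill carries the last known LTC value into the gap at time=180.
# The row at time=60 has no earlier LTC value to carry, so it stays NaN
# and is removed by dropna().
cleaned = merged.ffill().dropna()

print(merged)
print(cleaned)
```

This is probably why the merged output starts one minute later than the single-file output: a leading row with no earlier value to fill from is dropped.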


Next, we need to create a target. To do this, we need to decide which price we are trying to predict and how far into the future. We'll go with Litecoin for now. How far out we can predict probably also depends on how long our sequences are: with a sequence length of 3 (so 3 minutes), we probably can't easily predict 10 minutes out, while with a sequence length of 300, 10 minutes might not be as hard. I'd like to go with a sequence length of 60 and predict 3 periods into the future.

We could also frame the prediction as a regression problem, using a linear activation on the output layer, but instead I am going to go with binary classification.

If the price goes up in 3 minutes, it's a buy. If it goes down (or stays the same) in 3 minutes, it's not a buy, or a sell. With all of that in mind, I am going to make the following constants:

In [14]:
SEQ_LEN = 60
FUTURE_PERIOD_PREDICT = 3
RATIO_TO_PREDICT = "LTC-USD"

def classify(current, future):
  if float(future)>float(current):
    return 1                        #BUY
  else:
    return 0                        #DONT BUY
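
To make the "future price" idea concrete, here is a small sketch of how a close price 3 periods ahead can be lined up next to the current one with pandas `shift`. The `toy` frame and its prices are made up for illustration:

```python
import pandas as pd

FUTURE_PERIOD_PREDICT = 3  # same constant as in the notebook

# Hypothetical close prices, one per minute.
toy = pd.DataFrame({"close": [10.0, 11.0, 9.0, 12.0, 8.0, 13.0]})

# shift(-3) moves the column up by 3 rows, so row i holds the close of row i+3.
# The last 3 rows have no future value yet and become NaN.
toy["future"] = toy["close"].shift(-FUTURE_PERIOD_PREDICT)

print(toy)
```

Each row then carries both values that `classify(current, future)` needs.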

## Putting these together, the code so far looks like this:

In [33]:
import pandas as pd 
import os

SEQ_LEN = 60
FUTURE_PERIOD_PREDICT = 3
RATIO_TO_PREDICT = "LTC-USD"


def classify(current, future):
  if float(future)>float(current):
    return 1                        #BUY
  else:
    return 0                        #DONT BUY



main_df = pd.DataFrame() # begin empty

ratios = ["BTC-USD", "LTC-USD", "BCH-USD", "ETH-USD"]  # the 4 ratios we want to consider
for ratio in ratios:
  # print(ratio)
  dataset = f"/content/{ratio}.csv"
  # print(dataset)

  df = pd.read_csv(dataset, names=["time", "low", "high", "open", "close", "volume"])
  # print(df.head()) we want to work with close and volume
  df.rename(columns={"close": f"{ratio}_close", "volume": f"{ratio}_volume"}, inplace=True)

  df.set_index("time", inplace=True)
  df = df[[f'{ratio}_close',f"{ratio}_volume"]]
  # print(df.head())

  # now we want to merge those 4
  if len(main_df) == 0:           #i.e. is empty
    main_df = df
  else:
    main_df = main_df.join(df)

# run once after the loop, as in the first cell
main_df.ffill(inplace=True)  # if there are gaps in data, use previously known values
main_df.dropna(inplace=True)

# Now add the future price of Litecoin (RATIO_TO_PREDICT) as a new column next to all the coins

main_df['future'] = main_df[f'{RATIO_TO_PREDICT}_close'].shift(-FUTURE_PERIOD_PREDICT) 
print(main_df.head())


#future price of litecoin(LTC-USD) 
# the 1st column is the "current" and the 2nd is the "future" after 3 periods 
print(main_df[[f'{RATIO_TO_PREDICT}_close', "future"]].head()) 


# The map part is what allows us to do this row-by-row for these columns, but also do it quite fast. 
# The list part converts the end result to a list, which we can just set as a column.
main_df['target'] = list(map(classify, main_df[f'{RATIO_TO_PREDICT}_close'], main_df['future']))
print(main_df[[f'{RATIO_TO_PREDICT}_close', "future","target" ]].head(12)) 



            BTC-USD_close  BTC-USD_volume  ...  ETH-USD_volume     future
time                                       ...                           
1528968720    6487.379883        7.706374  ...       26.019083  96.389999
1528968780    6479.410156        3.088252  ...        8.449400  96.519997
1528968840    6479.410156        1.404100  ...       26.994646  96.440002
1528968900    6479.979980        0.753000  ...       77.355759  96.470001
1528968960    6480.000000        1.490900  ...        7.503300  96.400002

[5 rows x 9 columns]
            LTC-USD_close     future
time                                
1528968720      96.660004  96.389999
1528968780      96.570000  96.519997
1528968840      96.500000  96.440002
1528968900      96.389999  96.470001
1528968960      96.519997  96.400002
            LTC-USD_close     future  target
time                                        
1528968720      96.660004  96.389999       0
1528968780      96.570000  96.519997       0
1528968840      96.50
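
One detail worth checking is what happens to the last `FUTURE_PERIOD_PREDICT` rows, where `future` is NaN. The sketch below uses made-up prices. It also compares the row-by-row `map` with a vectorized comparison, which is a possible alternative and not what the notebook itself uses:

```python
import pandas as pd

def classify(current, future):
    # Same rule as the notebook: 1 (buy) if the future price is higher.
    return 1 if float(future) > float(current) else 0

# Hypothetical close prices, one per minute.
toy = pd.DataFrame({"close": [10.0, 11.0, 9.0, 12.0, 8.0, 13.0]})
toy["future"] = toy["close"].shift(-3)

# Row-by-row version, as in the notebook.
toy["target"] = list(map(classify, toy["close"], toy["future"]))

# Vectorized alternative: compare the columns, then cast booleans to 0/1.
toy["target_vec"] = (toy["future"] > toy["close"]).astype(int)

# The last 3 rows have future == NaN; any comparison with NaN is False,
# so both versions label them 0 even though the outcome is unknown.
# One option (an assumption, not part of the notebook) is to drop those rows.
labelled = toy.dropna(subset=["future"])

print(toy)
print(labelled)
```

Both columns agree on this toy data. Whether to drop the trailing rows depends on how the data is used later.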
