<a name="100"></a>
**HOME**

**Main Idea:**

Binary classification in trading predicts whether the market will **move up** or **move down** within a specific timeframe, using only OHLCV (open, high, low, close, volume) data. Reducing each candle to a single up-or-down call gives traders a simple, repeatable signal to act on in volatile markets.


**References:**

* [Evaluating Machine Learning Classification for Financial Trading: An Empirical Approach](https://jfin-swufe.springeropen.com/articles/10.1186/s40854-020-00217-x)
* [Trading via Selective Classification](https://arxiv.org/pdf/2110.14914v1)
* [Forecasting and trading cryptocurrencies with machine learning under changing market conditions](https://jfin-swufe.springeropen.com/articles/10.1186/s40854-020-00217-x)

**Content:**

* [**Import Dataset**](#1)
* [**Data Preparation**](#2)
* [**Modeling and Evaluation**](#3)
* [**Modeling All Data**](#4)
* [**Today's Prediction**](#5)

> **Prev Red Candle: Open2Close**

____

<a name="id"></a>
[**Back to HOME**](#100)

<a name="1"></a>

**Import Dataset**

In [1]:
import os
import time

import pandas as pd
from binance.client import Client

# Initialize the Binance client.
# Credentials are read from the environment rather than hard-coded in the notebook
# (the variable names below are illustrative); public kline requests also work without keys.
api_key = os.environ.get("BINANCE_API_KEY")
api_secret = os.environ.get("BINANCE_API_SECRET")
client = Client(api_key, api_secret)

def fetch_ohlcv_batch(client, symbol, interval, start_time, limit=1000):
    """
    Fetch a batch of OHLCV data from Binance.
    """
    try:
        candles = client.get_klines(
            symbol=symbol,
            interval=interval,
            startTime=start_time,
            limit=limit
        )
        # Transform data into desired format
        ohlcv = [
            [int(c[0]), float(c[1]), float(c[2]), float(c[3]), float(c[4]), float(c[5])]
            for c in candles
        ]
        return ohlcv
    except Exception as e:
        print(f"Error fetching data: {e}")
        return None

def fetch_historical_ohlcv(client, symbol, interval, start_time, limit=1000):
    """
    Fetch historical OHLCV data in batches from Binance.
    """
    all_data = []
    while True:
        data = fetch_ohlcv_batch(client, symbol, interval, start_time, limit)
        if data:
            # Append data to all_data
            all_data.extend(data)
            # Update `start_time` to the timestamp of the last fetched data point + 1 millisecond
            start_time = data[-1][0] + 1
            print(f"Fetched {len(data)} data points. Total so far: {len(all_data)}")
        else:
            print("No more data to fetch or an error occurred.")
            break

        # If the batch size is less than the limit, it means we reached the end of available data
        if len(data) < limit:
            print("Reached the end of available data.")
            break

        # To avoid rate limit issues, wait for a short while
        time.sleep(1)

    # Convert data to DataFrame
    df = pd.DataFrame(all_data, columns=['timestamp', 'open', 'high', 'low', 'close', 'volume'])
    df['timestamp'] = pd.to_datetime(df['timestamp'], unit='ms')
    return df

# Usage example
if __name__ == "__main__":
    # Define parameters
    symbol = 'BTCUSDT'        # Symbol to fetch (without '/')
    interval = Client.KLINE_INTERVAL_1DAY  # Timeframe ('1m', '5m', '1h', '1d', etc.)
    start_time = int(pd.Timestamp("2007-01-01").timestamp() * 1000)  # Start date in milliseconds
    limit = 1000              # Max data points per batch

    # Fetch historical data
    df = fetch_historical_ohlcv(client, symbol, interval, start_time, limit)
    print(f"Total fetched data points: {len(df)}")
    print(df.head())

Fetched 1000 data points. Total so far: 1000
Fetched 1000 data points. Total so far: 2000
Fetched 653 data points. Total so far: 2653
Reached the end of available data.
Total fetched data points: 2653
   timestamp     open     high      low    close       volume
0 2017-08-17  4261.48  4485.39  4200.74  4285.08   795.150377
1 2017-08-18  4285.08  4371.52  3938.77  4108.37  1199.888264
2 2017-08-19  4108.37  4184.69  3850.00  4139.98   381.309763
3 2017-08-20  4120.98  4211.08  4032.62  4086.29   467.083022
4 2017-08-21  4069.13  4119.62  3911.79  4016.00   691.743060
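
The batching loop above advances `start_time` to one millisecond past the last candle's open time and stops when a batch comes back shorter than `limit`. The sketch below isolates that logic so it can be exercised without network access. `paginate`, `fake_fetch`, and `FAKE_CANDLES` are hypothetical stand-ins and not part of the Binance client.

```python
DAY_MS = 24 * 60 * 60 * 1000

# Hypothetical stand-in for the exchange: ten daily candles as [open_time_ms, close].
FAKE_CANDLES = [[i * DAY_MS, 100.0 + i] for i in range(10)]


def fake_fetch(start_time, limit):
    """Return up to `limit` candles whose open time is >= start_time."""
    matching = [c for c in FAKE_CANDLES if c[0] >= start_time]
    return matching[:limit]


def paginate(fetch_batch, start_time, limit):
    """Collect candles by calling fetch_batch(start_time, limit) until data runs out."""
    rows = []
    while True:
        batch = fetch_batch(start_time, limit)
        if not batch:
            break
        rows.extend(batch)
        # Resume one millisecond after the last open time so no candle is fetched twice.
        start_time = batch[-1][0] + 1
        # A short batch means the source has nothing newer to return.
        if len(batch) < limit:
            break
    return rows


rows = paginate(fake_fetch, start_time=0, limit=4)
print(len(rows))
```

With `limit=4` the ten fake candles arrive in batches of 4, 4, and 2, mirroring the 1000/1000/653 pattern in the output above.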


In [2]:
df.head()

Unnamed: 0,timestamp,open,high,low,close,volume
0,2017-08-17,4261.48,4485.39,4200.74,4285.08,795.150377
1,2017-08-18,4285.08,4371.52,3938.77,4108.37,1199.888264
2,2017-08-19,4108.37,4184.69,3850.0,4139.98,381.309763
3,2017-08-20,4120.98,4211.08,4032.62,4086.29,467.083022
4,2017-08-21,4069.13,4119.62,3911.79,4016.0,691.74306


In [3]:
df.tail()

Unnamed: 0,timestamp,open,high,low,close,volume
2648,2024-11-16,91032.08,91779.66,90056.17,90586.92,22717.87689
2649,2024-11-17,90587.98,91449.99,88722.0,89855.99,23867.55609
2650,2024-11-18,89855.98,92594.0,89376.9,90464.08,46545.03448
2651,2024-11-19,90464.07,93905.51,90357.0,92310.79,43660.04682
2652,2024-11-20,92310.8,92418.68,91500.0,92167.99,4538.66496


<a name="id"></a>
[**Back to HOME**](#100)

<a name="2"></a>

**Data Preparation**

In [4]:
# Drop the last row: the most recent daily candle is typically still forming, so its values are not final
df = df.iloc[:-1]
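
Dropping the last row unconditionally assumes the notebook is run while the current daily candle is still open. A minimal sketch of a time-aware variant follows, assuming the `timestamp` column holds candle open times; `drop_unclosed_candle` is a hypothetical helper and not part of the original workflow.

```python
import pandas as pd


def drop_unclosed_candle(df, now, interval=pd.Timedelta(days=1)):
    """Drop the last row only if its candle has not yet closed at time `now`."""
    if df.empty:
        return df.copy()
    last_open = df["timestamp"].iloc[-1]
    # A candle that opened at `last_open` closes at `last_open + interval`.
    if last_open + interval > now:
        return df.iloc[:-1].copy()
    return df.copy()


candles = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-11-18", "2024-11-19", "2024-11-20"]),
    "close": [90464.08, 92310.79, 92167.99],
})

# Run at 03:00 on 2024-11-20, the 2024-11-20 candle is still open and gets dropped.
during_day = drop_unclosed_candle(candles, now=pd.Timestamp("2024-11-20 03:00"))
# Run at midnight on 2024-11-21, that candle has closed and is kept.
after_close = drop_unclosed_candle(candles, now=pd.Timestamp("2024-11-21 00:00"))
print(len(during_day), len(after_close))
```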

In [5]:
df.tail()

Unnamed: 0,timestamp,open,high,low,close,volume
2647,2024-11-15,87325.59,91850.0,87073.38,91032.07,47927.95068
2648,2024-11-16,91032.08,91779.66,90056.17,90586.92,22717.87689
2649,2024-11-17,90587.98,91449.99,88722.0,89855.99,23867.55609
2650,2024-11-18,89855.98,92594.0,89376.9,90464.08,46545.03448
2651,2024-11-19,90464.07,93905.51,90357.0,92310.79,43660.04682


In [6]:
df.columns

Index(['timestamp', 'open', 'high', 'low', 'close', 'volume'], dtype='object')

In [7]:
df_open2close=df.copy()

In [8]:
df_open2close['prev_open'] = df['open'].shift(1)

In [9]:
df_open2close

Unnamed: 0,timestamp,open,high,low,close,volume,prev_open
0,2017-08-17,4261.48,4485.39,4200.74,4285.08,795.150377,
1,2017-08-18,4285.08,4371.52,3938.77,4108.37,1199.888264,4261.48
2,2017-08-19,4108.37,4184.69,3850.00,4139.98,381.309763,4285.08
3,2017-08-20,4120.98,4211.08,4032.62,4086.29,467.083022,4108.37
4,2017-08-21,4069.13,4119.62,3911.79,4016.00,691.743060,4120.98
...,...,...,...,...,...,...,...
2647,2024-11-15,87325.59,91850.00,87073.38,91032.07,47927.950680,90375.21
2648,2024-11-16,91032.08,91779.66,90056.17,90586.92,22717.876890,87325.59
2649,2024-11-17,90587.98,91449.99,88722.00,89855.99,23867.556090,91032.08
2650,2024-11-18,89855.98,92594.00,89376.90,90464.08,46545.034480,90587.98


In [10]:
# Drop rows with any NaN values
df_open2close.dropna(inplace=True)

In [11]:
df_open2close

Unnamed: 0,timestamp,open,high,low,close,volume,prev_open
1,2017-08-18,4285.08,4371.52,3938.77,4108.37,1199.888264,4261.48
2,2017-08-19,4108.37,4184.69,3850.00,4139.98,381.309763,4285.08
3,2017-08-20,4120.98,4211.08,4032.62,4086.29,467.083022,4108.37
4,2017-08-21,4069.13,4119.62,3911.79,4016.00,691.743060,4120.98
5,2017-08-22,4016.00,4104.82,3400.00,4040.00,966.684858,4069.13
...,...,...,...,...,...,...,...
2647,2024-11-15,87325.59,91850.00,87073.38,91032.07,47927.950680,90375.21
2648,2024-11-16,91032.08,91779.66,90056.17,90586.92,22717.876890,87325.59
2649,2024-11-17,90587.98,91449.99,88722.00,89855.99,23867.556090,91032.08
2650,2024-11-18,89855.98,92594.00,89376.90,90464.08,46545.034480,90587.98


In [12]:
# Create the 'down_open2close' label: 1 if today's close is below the previous day's open, else 0
df_open2close['down_open2close'] = (df_open2close['close'] < df_open2close['prev_open']).astype(int)
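
The shift-and-compare steps in cells 8 to 12 can be gathered into one function. The sketch below uses a hypothetical helper name, `add_open2close_label`, and replays it on the last five candles shown in this notebook.

```python
import pandas as pd


def add_open2close_label(df):
    """Add `prev_open` and `down_open2close`, dropping the first row (no previous day)."""
    out = df.copy()
    out["prev_open"] = out["open"].shift(1)
    # 1 when today's close is below the previous day's open, else 0.
    out["down_open2close"] = (out["close"] < out["prev_open"]).astype(int)
    return out.dropna(subset=["prev_open"])


sample = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2024-11-15", "2024-11-16", "2024-11-17", "2024-11-18", "2024-11-19"]
    ),
    "open": [87325.59, 91032.08, 90587.98, 89855.98, 90464.07],
    "close": [91032.07, 90586.92, 89855.99, 90464.08, 92310.79],
})

labelled = add_open2close_label(sample)
print(labelled[["timestamp", "close", "prev_open", "down_open2close"]])
```

The four labelled rows (2024-11-16 to 2024-11-19) come out as 0, 1, 1, 0, which agrees with the `down_open2close` column in the notebook's own `tail()` output.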

In [13]:
df_open2close.columns

Index(['timestamp', 'open', 'high', 'low', 'close', 'volume', 'prev_open',
       'down_open2close'],
      dtype='object')

In [14]:
df_open2close.tail()

Unnamed: 0,timestamp,open,high,low,close,volume,prev_open,down_open2close
2647,2024-11-15,87325.59,91850.0,87073.38,91032.07,47927.95068,90375.21,0
2648,2024-11-16,91032.08,91779.66,90056.17,90586.92,22717.87689,87325.59,0
2649,2024-11-17,90587.98,91449.99,88722.0,89855.99,23867.55609,91032.08,1
2650,2024-11-18,89855.98,92594.0,89376.9,90464.08,46545.03448,90587.98,1
2651,2024-11-19,90464.07,93905.51,90357.0,92310.79,43660.04682,89855.98,0


In [15]:
# Drop the timestamp column, which is not used as a feature
df_open2close_select = df_open2close.drop(['timestamp'], axis=1)

In [16]:
df_open2close_select.tail()

Unnamed: 0,open,high,low,close,volume,prev_open,down_open2close
2647,87325.59,91850.0,87073.38,91032.07,47927.95068,90375.21,0
2648,91032.08,91779.66,90056.17,90586.92,22717.87689,87325.59,0
2649,90587.98,91449.99,88722.0,89855.99,23867.55609,91032.08,1
2650,89855.98,92594.0,89376.9,90464.08,46545.03448,90587.98,1
2651,90464.07,93905.51,90357.0,92310.79,43660.04682,89855.98,0


In [17]:
# Separate features and target
X = df_open2close_select.drop('down_open2close', axis=1)  # All remaining columns are features
y = df_open2close_select['down_open2close']
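
Because `X` keeps both `close` and `prev_open`, and the target was built as `close < prev_open`, it may be worth checking whether the label can be read directly off the feature columns. The sketch below uses a hypothetical helper, `label_is_recoverable`, on a few rows taken from this notebook.

```python
import pandas as pd


def label_is_recoverable(X, y):
    """Return True if y equals the fixed rule `close < prev_open` applied to X."""
    rule = (X["close"] < X["prev_open"]).astype(int)
    return bool((rule.to_numpy() == y.to_numpy()).all())


X_example = pd.DataFrame({
    "close": [90586.92, 89855.99, 90464.08, 92310.79],
    "prev_open": [87325.59, 91032.08, 90587.98, 89855.98],
})
y_example = pd.Series([0, 1, 1, 0])

print(label_is_recoverable(X_example, y_example))
```

If the same check returns `True` on the full `X` and `y`, the scores reported later mostly measure how well the model approximates that comparison on a completed candle, not its ability to anticipate a candle that has not closed yet.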

In [18]:
# Split the data into training (70%), validation (15%), and test (15%) sets (rows are shuffled)
from sklearn.model_selection import train_test_split
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.3, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)
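
`train_test_split` shuffles rows by default, so candles from every period end up in all three sets. For time-ordered data, a common alternative is to split by position so that validation and test rows are strictly later than training rows. The sketch below shows that alternative with a hypothetical helper, `chronological_split`; it is not what the cell above does.

```python
import pandas as pd


def chronological_split(X, y, train_frac=0.70, val_frac=0.15):
    """Split X and y by row position: earliest rows train, latest rows test."""
    n = len(X)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return (
        X.iloc[:train_end], X.iloc[train_end:val_end], X.iloc[val_end:],
        y.iloc[:train_end], y.iloc[train_end:val_end], y.iloc[val_end:],
    )


# Toy data: 100 rows already sorted by time.
X_toy = pd.DataFrame({"close": [float(i) for i in range(100)]})
y_toy = pd.Series([i % 2 for i in range(100)])

X_tr, X_va, X_te, y_tr, y_va, y_te = chronological_split(X_toy, y_toy)
print(len(X_tr), len(X_va), len(X_te))
```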

In [19]:
# Handle class imbalance using SMOTE
from imblearn.over_sampling import SMOTE
smote = SMOTE(random_state=42)
X_train_res, y_train_res = smote.fit_resample(X_train, y_train)
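
SMOTE only changes the picture when one class is clearly rarer than the other, so it can help to look at the class shares first. The sketch below uses a hypothetical helper, `class_balance`, on made-up labels; the real shares in `y_train` are not shown in this notebook.

```python
import pandas as pd


def class_balance(y):
    """Return the fraction of rows in each class, keyed by class label."""
    return y.value_counts(normalize=True).sort_index().to_dict()


# Made-up labels for illustration only.
y_example = pd.Series([0, 0, 1, 0, 1, 0, 0, 1])
balance = class_balance(y_example)
print(balance)
```

Calling `class_balance(y_train)` before resampling and `class_balance(y_train_res)` afterwards would show how much the oversampling actually shifted the shares.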

<a name="id"></a>
[**Back to HOME**](#100)

<a name="3"></a>

**Modeling and Evaluation**

In [20]:
# Define the model
from sklearn.ensemble import RandomForestClassifier
model_rf = RandomForestClassifier(random_state=42)

In [21]:
# Train the model
model_rf.fit(X_train_res, y_train_res)

In [22]:
# Evaluate the model on the validation set
y_val_pred = model_rf.predict(X_val)
y_val_pred_proba = model_rf.predict_proba(X_val)[:, 1]

In [23]:
# Compute metrics on the validation set
from sklearn.metrics import accuracy_score, fbeta_score, precision_score, recall_score, f1_score, roc_auc_score
val_accuracy = accuracy_score(y_val, y_val_pred)
val_precision = precision_score(y_val, y_val_pred)
val_recall = recall_score(y_val, y_val_pred)
val_f1 = f1_score(y_val, y_val_pred)
val_f2 = fbeta_score(y_val, y_val_pred, beta=2)
val_roc_auc = roc_auc_score(y_val, y_val_pred_proba)

In [24]:
# Evaluate the model on the test set
y_test_pred = model_rf.predict(X_test)
y_test_pred_proba = model_rf.predict_proba(X_test)[:, 1]

In [25]:
# Compute metrics on the test set
from sklearn.metrics import accuracy_score, fbeta_score, precision_score, recall_score, f1_score, roc_auc_score
test_accuracy = accuracy_score(y_test, y_test_pred)
test_precision = precision_score(y_test, y_test_pred)
test_recall = recall_score(y_test, y_test_pred)
test_f1 = f1_score(y_test, y_test_pred)
test_f2 = fbeta_score(y_test, y_test_pred, beta=2)
test_roc_auc = roc_auc_score(y_test, y_test_pred_proba)

In [26]:
print("Validation set Accuracy:", val_accuracy)
print("Validation set Precision:", val_precision)
print("Validation set Recall:", val_recall)
print("Validation set F1 Score:", val_f1)
print("Validation set F2 Score:", val_f2)
print("Validation set ROC AUC:", val_roc_auc)

Validation set Accuracy: 0.8517587939698492
Validation set Precision: 0.883495145631068
Validation set Recall: 0.8387096774193549
Validation set F1 Score: 0.8605200945626478
Validation set F2 Score: 0.8472998137802608
Validation set ROC AUC: 0.9283804771240167


In [27]:
print("Test set Accuracy:", test_accuracy)
print("Test set Precision:", test_precision)
print("Test set Recall:", test_recall)
print("Test set F1 Score:", test_f1)
print("Test set F2 Score:", test_f2)
print("Test set ROC AUC:", test_roc_auc)

Test set Accuracy: 0.8743718592964824
Test set Precision: 0.8611111111111112
Test set Recall: 0.8611111111111112
Test set F1 Score: 0.8611111111111112
Test set F2 Score: 0.8611111111111112
Test set ROC AUC: 0.9379587155963303
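
The F1 and F2 scores above follow from precision and recall through the F-beta formula, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R). A small sketch with a hypothetical helper, `fbeta_from_pr`, recomputes them from the precision and recall printed above as a consistency check.

```python
def fbeta_from_pr(precision, recall, beta=1.0):
    """F-beta score from precision and recall; beta > 1 weights recall more heavily."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as printed for the validation set.
val_p, val_r = 0.883495145631068, 0.8387096774193549
val_f1_check = fbeta_from_pr(val_p, val_r, beta=1)
val_f2_check = fbeta_from_pr(val_p, val_r, beta=2)

# On the test set precision equals recall, so every F-beta equals that same value.
test_f2_check = fbeta_from_pr(0.8611111111111112, 0.8611111111111112, beta=2)
print(val_f1_check, val_f2_check, test_f2_check)
```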


<a name="id"></a>
[**Back to HOME**](#100)

<a name="4"></a>

**Modeling All Data**

In [28]:
# Reuse `client` and `fetch_historical_ohlcv` defined in the Import Dataset cell
symbol = 'BTCUSDT'        # Symbol to fetch (without '/')
interval = Client.KLINE_INTERVAL_1DAY  # Timeframe ('1m', '5m', '1h', '1d', etc.)
start_time = int(pd.Timestamp("2010-07-17").timestamp() * 1000)  # Start date in milliseconds
limit = 1000              # Max data points per batch

# Fetch historical data
df_all = fetch_historical_ohlcv(client, symbol, interval, start_time, limit)
print(f"Total fetched data points: {len(df_all)}")
print(df_all.head())

Fetched 1000 data points. Total so far: 1000
Fetched 1000 data points. Total so far: 2000
Fetched 653 data points. Total so far: 2653
Reached the end of available data.
Total fetched data points: 2653
   timestamp     open     high      low    close       volume
0 2017-08-17  4261.48  4485.39  4200.74  4285.08   795.150377
1 2017-08-18  4285.08  4371.52  3938.77  4108.37  1199.888264
2 2017-08-19  4108.37  4184.69  3850.00  4139.98   381.309763
3 2017-08-20  4120.98  4211.08  4032.62  4086.29   467.083022
4 2017-08-21  4069.13  4119.62  3911.79  4016.00   691.743060


In [29]:
# Drop the last row (the still-forming daily candle); copy so later column assignments act on an independent frame
df_all = df_all.iloc[:-1].copy()

In [30]:
df_all

Unnamed: 0,timestamp,open,high,low,close,volume
0,2017-08-17,4261.48,4485.39,4200.74,4285.08,795.150377
1,2017-08-18,4285.08,4371.52,3938.77,4108.37,1199.888264
2,2017-08-19,4108.37,4184.69,3850.00,4139.98,381.309763
3,2017-08-20,4120.98,4211.08,4032.62,4086.29,467.083022
4,2017-08-21,4069.13,4119.62,3911.79,4016.00,691.743060
...,...,...,...,...,...,...
2647,2024-11-15,87325.59,91850.00,87073.38,91032.07,47927.950680
2648,2024-11-16,91032.08,91779.66,90056.17,90586.92,22717.876890
2649,2024-11-17,90587.98,91449.99,88722.00,89855.99,23867.556090
2650,2024-11-18,89855.98,92594.00,89376.90,90464.08,46545.034480


In [31]:
# Add the previous day's open as a feature
df_all['prev_open'] = df_all['open'].shift(1)

In [32]:
df_all

Unnamed: 0,timestamp,open,high,low,close,volume,prev_open
0,2017-08-17,4261.48,4485.39,4200.74,4285.08,795.150377,
1,2017-08-18,4285.08,4371.52,3938.77,4108.37,1199.888264,4261.48
2,2017-08-19,4108.37,4184.69,3850.00,4139.98,381.309763,4285.08
3,2017-08-20,4120.98,4211.08,4032.62,4086.29,467.083022,4108.37
4,2017-08-21,4069.13,4119.62,3911.79,4016.00,691.743060,4120.98
...,...,...,...,...,...,...,...
2647,2024-11-15,87325.59,91850.00,87073.38,91032.07,47927.950680,90375.21
2648,2024-11-16,91032.08,91779.66,90056.17,90586.92,22717.876890,87325.59
2649,2024-11-17,90587.98,91449.99,88722.00,89855.99,23867.556090,91032.08
2650,2024-11-18,89855.98,92594.00,89376.90,90464.08,46545.034480,90587.98


In [33]:
# Drop rows with any NaN values
df_all.dropna(inplace=True)

In [34]:
df_all

Unnamed: 0,timestamp,open,high,low,close,volume,prev_open
1,2017-08-18,4285.08,4371.52,3938.77,4108.37,1199.888264,4261.48
2,2017-08-19,4108.37,4184.69,3850.00,4139.98,381.309763,4285.08
3,2017-08-20,4120.98,4211.08,4032.62,4086.29,467.083022,4108.37
4,2017-08-21,4069.13,4119.62,3911.79,4016.00,691.743060,4120.98
5,2017-08-22,4016.00,4104.82,3400.00,4040.00,966.684858,4069.13
...,...,...,...,...,...,...,...
2647,2024-11-15,87325.59,91850.00,87073.38,91032.07,47927.950680,90375.21
2648,2024-11-16,91032.08,91779.66,90056.17,90586.92,22717.876890,87325.59
2649,2024-11-17,90587.98,91449.99,88722.00,89855.99,23867.556090,91032.08
2650,2024-11-18,89855.98,92594.00,89376.90,90464.08,46545.034480,90587.98


In [35]:
# Create the 'down_open2close' label: 1 if today's close is below the previous day's open, else 0
df_all['down_open2close'] = (df_all['close'] < df_all['prev_open']).astype(int)

In [36]:
# Drop the timestamp column, which is not used as a feature
df_all_select = df_all.drop(['timestamp'], axis=1)

In [37]:
# Separate features and target
X_all = df_all_select.drop('down_open2close', axis=1)  # All remaining columns are features
y_all = df_all_select['down_open2close']

In [38]:
# Handle class imbalance using SMOTE
from imblearn.over_sampling import SMOTE
smote = SMOTE(random_state=42)
X_train_res_all, y_train_res_all = smote.fit_resample(X_all, y_all)

In [39]:
# Define the model ALL
from sklearn.ensemble import RandomForestClassifier
model_rf_all = RandomForestClassifier(random_state=42)

In [40]:
# Train the model
model_rf_all.fit(X_train_res_all, y_train_res_all)

<a name="id"></a>
[**Back to HOME**](#100)

<a name="5"></a>

**Today's Prediction**

In [41]:
# Reuse `client` and `fetch_historical_ohlcv` defined in the Import Dataset cell
symbol = 'BTCUSDT'        # Symbol to fetch (without '/')
interval = Client.KLINE_INTERVAL_1DAY  # Timeframe ('1m', '5m', '1h', '1d', etc.)
start_time = int(pd.Timestamp("2010-07-17").timestamp() * 1000)  # Start date in milliseconds
limit = 1000              # Max data points per batch

# Fetch historical data
df_today = fetch_historical_ohlcv(client, symbol, interval, start_time, limit)
print(f"Total fetched data points: {len(df_today)}")
print(df_today.head())

Fetched 1000 data points. Total so far: 1000
Fetched 1000 data points. Total so far: 2000
Fetched 653 data points. Total so far: 2653
Reached the end of available data.
Total fetched data points: 2653
   timestamp     open     high      low    close       volume
0 2017-08-17  4261.48  4485.39  4200.74  4285.08   795.150377
1 2017-08-18  4285.08  4371.52  3938.77  4108.37  1199.888264
2 2017-08-19  4108.37  4184.69  3850.00  4139.98   381.309763
3 2017-08-20  4120.98  4211.08  4032.62  4086.29   467.083022
4 2017-08-21  4069.13  4119.62  3911.79  4016.00   691.743060


In [42]:
# Drop the last row (the still-forming daily candle); copy so later column assignments act on an independent frame
df_today = df_today.iloc[:-1].copy()

In [43]:
df_today['prev_open'] = df_today['open'].shift(1)

In [44]:
df_today

Unnamed: 0,timestamp,open,high,low,close,volume,prev_open
0,2017-08-17,4261.48,4485.39,4200.74,4285.08,795.150377,
1,2017-08-18,4285.08,4371.52,3938.77,4108.37,1199.888264,4261.48
2,2017-08-19,4108.37,4184.69,3850.00,4139.98,381.309763,4285.08
3,2017-08-20,4120.98,4211.08,4032.62,4086.29,467.083022,4108.37
4,2017-08-21,4069.13,4119.62,3911.79,4016.00,691.743060,4120.98
...,...,...,...,...,...,...,...
2647,2024-11-15,87325.59,91850.00,87073.38,91032.07,47927.950680,90375.21
2648,2024-11-16,91032.08,91779.66,90056.17,90586.92,22717.876890,87325.59
2649,2024-11-17,90587.98,91449.99,88722.00,89855.99,23867.556090,91032.08
2650,2024-11-18,89855.98,92594.00,89376.90,90464.08,46545.034480,90587.98


In [45]:
df_today_test= df_today.tail(1)

In [46]:
df_today_test

Unnamed: 0,timestamp,open,high,low,close,volume,prev_open
2651,2024-11-19,90464.07,93905.51,90357.0,92310.79,43660.04682,89855.98


In [47]:
# Delete column
df_today_test_ready = df_today_test.drop(columns=['timestamp'])

In [48]:
df_today_test_ready

Unnamed: 0,open,high,low,close,volume,prev_open
2651,90464.07,93905.51,90357.0,92310.79,43660.04682,89855.98


In [49]:
# Predict the latest completed candle with the model trained on the training split
today_pred = model_rf.predict(df_today_test_ready)
today_pred_proba = model_rf.predict_proba(df_today_test_ready)[:, 1]

In [50]:
today_pred

array([0])

In [51]:
today_pred_proba

array([0.15])

In [52]:
# Predict the latest completed candle with the model trained on all data
today_pred_all = model_rf_all.predict(df_today_test_ready)
today_pred_proba_all = model_rf_all.predict_proba(df_today_test_ready)[:, 1]

In [53]:
today_pred_all

array([0])

In [54]:
today_pred_proba_all

array([0.19])
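
Both models return class 0 because the probability they assign to class 1 (close below the previous day's open) is under one half. For a two-class random forest, `predict` picks the class with the larger averaged probability, which amounts to a 0.5 cut-off on `predict_proba(...)[:, 1]`. The sketch below makes that mapping explicit; `describe_prediction` and its wording are hypothetical, and the threshold is exposed only for illustration.

```python
def describe_prediction(proba_down, threshold=0.5):
    """Map the class-1 probability to a label and a short description."""
    label = 1 if proba_down > threshold else 0
    if label == 1:
        text = "close below the previous day's open"
    else:
        text = "close at or above the previous day's open"
    return label, text


# Class-1 probabilities reported by the two models above.
for name, proba in [("model_rf", 0.15), ("model_rf_all", 0.19)]:
    label, text = describe_prediction(proba)
    print(f"{name}: label={label} ({text}), P(class 1)={proba:.2f}")
```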

<a name="id"></a>
[**Back to HOME**](#100)