Import the required libraries and read the API base URL from a .env file.

In [37]:
import pandas as pd 
import numpy as np
import joblib
import requests
import os
import time
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from dotenv import load_dotenv
from sklearn.metrics import accuracy_score

load_dotenv()

API = os.getenv("API")

This code loads a dataset from a CSV file into a pandas DataFrame, then separates it into 
features (X) and the target variable (y). The data is split into training and testing sets 
using an 80/20 split. The target variable in both sets is then binned into two classes with 
np.digitize and a single bin edge of 1.2: values below 1.2 become 0, and values of 1.2 or 
higher become 1. The results are stored in y_train_binned and y_test_binned. Because 
train_test_split is called without random_state, the split differs on every run.


In [38]:
data = pd.read_csv('csfail_database.csv')
X = data.drop(columns=['crashedAt'])
y = data['crashedAt']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
y_train_binned = np.digitize(y_train, bins=[1.2])
y_test_binned = np.digitize(y_test, bins=[1.2])
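
To make the binning rule concrete, here is a minimal sketch of how np.digitize behaves with a 
single bin edge. The sample values are made up for illustration and do not come from the dataset.

```python
import numpy as np

# Hypothetical crash multipliers, chosen only to illustrate the binning
sample = np.array([1.0, 1.19, 1.2, 1.5, 3.7])

# With one bin edge, values below 1.2 map to 0 and values >= 1.2 map to 1
binned = np.digitize(sample, bins=[1.2])

print(binned)  # [0 0 1 1 1]
```

Note that a value exactly equal to 1.2 falls into class 1, because np.digitize treats the edge 
as the inclusive lower bound of the upper bin by default.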


This code initializes a DecisionTreeClassifier and trains it on the training features and the 
binned training target (X_train and y_train_binned). The trained model then predicts a class 
for each row of the test features (X_test). accuracy_score compares these predictions with 
the actual binned test values (y_test_binned), and the resulting accuracy is displayed as 
the cell output.


In [None]:
model = DecisionTreeClassifier()
model.fit(X_train, y_train_binned)


prediction = model.predict(X_test)
score = accuracy_score(y_test_binned, prediction)
score
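
An accuracy value is easier to interpret next to a trivial baseline. The sketch below uses 
made-up labels (not the notebook's data) to show what accuracy measures and how it compares 
with always predicting the most frequent class. The names y_true, y_pred, and baseline are 
illustrative only.

```python
import numpy as np

# Hypothetical binned labels and predictions, for illustration only
y_true = np.array([1, 1, 0, 1, 0, 1, 1, 1])
y_pred = np.array([1, 0, 0, 1, 1, 1, 1, 1])

# Accuracy is the fraction of predictions that match the true labels
accuracy = np.mean(y_true == y_pred)

# Baseline: always predict the most frequent class in y_true
majority_class = np.bincount(y_true).argmax()
baseline = np.mean(y_true == majority_class)

print(accuracy, baseline)  # 0.75 0.75
```

In this toy case the predictions are no better than the baseline, which is the kind of check 
worth running when one class dominates the binned target.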

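The model is retrained on X_train.values, a plain NumPy array without column names, which 
matches the plain lists of values passed to predict later on. It is then saved to a file 
named 'crash_geusser.joblib' using joblib.dump, so the trained model can be reloaded later 
without retraining.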


In [None]:
model = DecisionTreeClassifier()
model.fit(X_train.values, y_train_binned)

joblib.dump(model, 'crash_geusser.joblib')
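
As a quick check that saving and reloading preserves the model, the following sketch 
round-trips a small classifier through joblib. The toy data and the file name 
example_model.joblib are assumptions made for this example; a temporary directory is used 
so nothing is left on disk.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Tiny made-up dataset: three feature columns and a binary target
X = np.array([[10.0, 2, 1], [250.0, 30, 12], [15.0, 3, 2], [300.0, 40, 20]])
y = np.array([0, 1, 0, 1])

model = DecisionTreeClassifier(random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example_model.joblib")  # hypothetical file name
    joblib.dump(model, path)
    restored = joblib.load(path)

# The restored model should reproduce the original predictions
same = np.array_equal(model.predict(X), restored.predict(X))
print(same)
```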

The `modeling` function makes predictions on crash game data from an API for a range of game 
IDs, from `start_game` up to but not including `end_game`. For each game, it sends a GET 
request to retrieve the game data, extracts the features (total bank in USD, number of users, 
and item count), and makes a prediction using the loaded model. It then compares the predicted 
bin with the bin of the actual crash value (`crashedAt`) and prints `True` if they match and 
`False` if they don't. The function waits 1 second before each request.


In [None]:
def modeling(start_game, end_game, model):
    for i in range(start_game, end_game):
        time.sleep(1)  # throttle to one request per second
        response = requests.get(f'{API}{i}', timeout=10)
        game = response.json()["data"]["game"]

        # Bin the actual crash value the same way as the training target
        crash = np.digitize(game["crashedAt"], bins=[1.2])
        total = game["totalBankUsd"]
        users = game["usersCount"]
        items = game["itemsCount"]

        # Feature order must match the column order used for training
        prediction = model.predict([[total, users, items]])

        # True if the predicted bin matches the actual bin, else False
        print(bool(prediction[0] == crash))

start_game = 5000000
end_game = 5661524

model = joblib.load('crash_geusser.joblib')

modeling(start_game, end_game, model)
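
The parsing step in the loop above can be tried offline without calling the API. The sketch 
below is one possible way to do that: the helper extract_game and the sample payload are 
hypothetical, and only the field names (data, game, crashedAt, totalBankUsd, usersCount, 
itemsCount) are taken from the code above.

```python
import numpy as np

def extract_game(payload, threshold=1.2):
    """Return (features, label) from a response shaped like the one used above."""
    game = payload["data"]["game"]
    # Same feature order as in the call to model.predict above
    features = [game["totalBankUsd"], game["usersCount"], game["itemsCount"]]
    # Same binning as the training target: 0 below the threshold, 1 otherwise
    label = int(np.digitize(game["crashedAt"], bins=[threshold]))
    return features, label

# Made-up payload, for illustration only
sample_payload = {
    "data": {
        "game": {
            "crashedAt": 1.05,
            "totalBankUsd": 812.4,
            "usersCount": 57,
            "itemsCount": 143,
        }
    }
}

features, label = extract_game(sample_payload)
print(features, label)  # [812.4, 57, 143] 0
```

Separating the parsing from the request makes it possible to test the feature order and the 
binning rule on saved responses before running the full loop.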
