### We are going to build a machine learning system in Python that detects whether a credit card transaction is legitimate or fraudulent

#### Brief explanation of what is found in this Python code

I developed a complete machine learning pipeline in Python that detects credit card fraud with logistic regression. I started by importing pandas, NumPy, and modules from scikit-learn, then loaded the credit card transactions dataset into a DataFrame. Exploring the dataset confirmed that there were no missing values and that the data was highly imbalanced: 284,315 transactions were legitimate and only 492 were fraudulent.

To address this imbalance, I under-sampled the majority class by randomly selecting 492 legitimate transactions to match the 492 fraudulent ones, which gave a balanced dataset of 984 records. I then separated the features (X) from the target (Y) and standardized the features with StandardScaler so that they share a similar scale, which I checked by confirming that the standard deviation of the scaled data was 1.0. Afterward, I split the data into training and testing sets using an 80/20 split with stratification to preserve the class distribution.

I trained a logistic regression model on the training set and evaluated it with accuracy scores, reaching about 94.9% on the training set and 94.4% on the test set. Finally, I built a predictive system that takes new transaction data, scales it with the fitted scaler, and uses the trained model to classify it as legitimate or fraudulent. This project demonstrates my understanding of data preprocessing, handling class imbalance, model training, evaluation, and prediction on new transactions.

In [3]:
# Importing the libraries we need
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

#### Loading the dataset into a Pandas DataFrame

In [5]:
creditcard_df = pd.read_csv(r"C:\Users\hp\Downloads\creditcard.csv\creditcard.csv")
creditcard_df

Unnamed: 0,Time,V1,V2,V3,V4,V5,V6,V7,V8,V9,...,V21,V22,V23,V24,V25,V26,V27,V28,Amount,Class
0,0.0,-1.359807,-0.072781,2.536347,1.378155,-0.338321,0.462388,0.239599,0.098698,0.363787,...,-0.018307,0.277838,-0.110474,0.066928,0.128539,-0.189115,0.133558,-0.021053,149.62,0
1,0.0,1.191857,0.266151,0.166480,0.448154,0.060018,-0.082361,-0.078803,0.085102,-0.255425,...,-0.225775,-0.638672,0.101288,-0.339846,0.167170,0.125895,-0.008983,0.014724,2.69,0
2,1.0,-1.358354,-1.340163,1.773209,0.379780,-0.503198,1.800499,0.791461,0.247676,-1.514654,...,0.247998,0.771679,0.909412,-0.689281,-0.327642,-0.139097,-0.055353,-0.059752,378.66,0
3,1.0,-0.966272,-0.185226,1.792993,-0.863291,-0.010309,1.247203,0.237609,0.377436,-1.387024,...,-0.108300,0.005274,-0.190321,-1.175575,0.647376,-0.221929,0.062723,0.061458,123.50,0
4,2.0,-1.158233,0.877737,1.548718,0.403034,-0.407193,0.095921,0.592941,-0.270533,0.817739,...,-0.009431,0.798278,-0.137458,0.141267,-0.206010,0.502292,0.219422,0.215153,69.99,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
284802,172786.0,-11.881118,10.071785,-9.834783,-2.066656,-5.364473,-2.606837,-4.918215,7.305334,1.914428,...,0.213454,0.111864,1.014480,-0.509348,1.436807,0.250034,0.943651,0.823731,0.77,0
284803,172787.0,-0.732789,-0.055080,2.035030,-0.738589,0.868229,1.058415,0.024330,0.294869,0.584800,...,0.214205,0.924384,0.012463,-1.016226,-0.606624,-0.395255,0.068472,-0.053527,24.79,0
284804,172788.0,1.919565,-0.301254,-3.249640,-0.557828,2.630515,3.031260,-0.296827,0.708417,0.432454,...,0.232045,0.578229,-0.037501,0.640134,0.265745,-0.087371,0.004455,-0.026561,67.88,0
284805,172788.0,-0.240440,0.530483,0.702510,0.689799,-0.377961,0.623708,-0.686180,0.679145,0.392087,...,0.265245,0.800049,-0.163298,0.123205,-0.569159,0.546668,0.108821,0.104533,10.00,0


In [6]:
creditcard_df.head()

Unnamed: 0,Time,V1,V2,V3,V4,V5,V6,V7,V8,V9,...,V21,V22,V23,V24,V25,V26,V27,V28,Amount,Class
0,0.0,-1.359807,-0.072781,2.536347,1.378155,-0.338321,0.462388,0.239599,0.098698,0.363787,...,-0.018307,0.277838,-0.110474,0.066928,0.128539,-0.189115,0.133558,-0.021053,149.62,0
1,0.0,1.191857,0.266151,0.16648,0.448154,0.060018,-0.082361,-0.078803,0.085102,-0.255425,...,-0.225775,-0.638672,0.101288,-0.339846,0.16717,0.125895,-0.008983,0.014724,2.69,0
2,1.0,-1.358354,-1.340163,1.773209,0.37978,-0.503198,1.800499,0.791461,0.247676,-1.514654,...,0.247998,0.771679,0.909412,-0.689281,-0.327642,-0.139097,-0.055353,-0.059752,378.66,0
3,1.0,-0.966272,-0.185226,1.792993,-0.863291,-0.010309,1.247203,0.237609,0.377436,-1.387024,...,-0.1083,0.005274,-0.190321,-1.175575,0.647376,-0.221929,0.062723,0.061458,123.5,0
4,2.0,-1.158233,0.877737,1.548718,0.403034,-0.407193,0.095921,0.592941,-0.270533,0.817739,...,-0.009431,0.798278,-0.137458,0.141267,-0.20601,0.502292,0.219422,0.215153,69.99,0


In [7]:
creditcard_df.tail()

Unnamed: 0,Time,V1,V2,V3,V4,V5,V6,V7,V8,V9,...,V21,V22,V23,V24,V25,V26,V27,V28,Amount,Class
284802,172786.0,-11.881118,10.071785,-9.834783,-2.066656,-5.364473,-2.606837,-4.918215,7.305334,1.914428,...,0.213454,0.111864,1.01448,-0.509348,1.436807,0.250034,0.943651,0.823731,0.77,0
284803,172787.0,-0.732789,-0.05508,2.03503,-0.738589,0.868229,1.058415,0.02433,0.294869,0.5848,...,0.214205,0.924384,0.012463,-1.016226,-0.606624,-0.395255,0.068472,-0.053527,24.79,0
284804,172788.0,1.919565,-0.301254,-3.24964,-0.557828,2.630515,3.03126,-0.296827,0.708417,0.432454,...,0.232045,0.578229,-0.037501,0.640134,0.265745,-0.087371,0.004455,-0.026561,67.88,0
284805,172788.0,-0.24044,0.530483,0.70251,0.689799,-0.377961,0.623708,-0.68618,0.679145,0.392087,...,0.265245,0.800049,-0.163298,0.123205,-0.569159,0.546668,0.108821,0.104533,10.0,0
284806,172792.0,-0.533413,-0.189733,0.703337,-0.506271,-0.012546,-0.649617,1.577006,-0.41465,0.48618,...,0.261057,0.643078,0.376777,0.008797,-0.473649,-0.818267,-0.002415,0.013649,217.0,0


In [8]:
#dataset information
creditcard_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 284807 entries, 0 to 284806
Data columns (total 31 columns):
 #   Column  Non-Null Count   Dtype  
---  ------  --------------   -----  
 0   Time    284807 non-null  float64
 1   V1      284807 non-null  float64
 2   V2      284807 non-null  float64
 3   V3      284807 non-null  float64
 4   V4      284807 non-null  float64
 5   V5      284807 non-null  float64
 6   V6      284807 non-null  float64
 7   V7      284807 non-null  float64
 8   V8      284807 non-null  float64
 9   V9      284807 non-null  float64
 10  V10     284807 non-null  float64
 11  V11     284807 non-null  float64
 12  V12     284807 non-null  float64
 13  V13     284807 non-null  float64
 14  V14     284807 non-null  float64
 15  V15     284807 non-null  float64
 16  V16     284807 non-null  float64
 17  V17     284807 non-null  float64
 18  V18     284807 non-null  float64
 19  V19     284807 non-null  float64
 20  V20     284807 non-null  float64
 21  V21     28

In [9]:
# checking for missing values in each column
creditcard_df.isnull().sum()

Time      0
V1        0
V2        0
V3        0
V4        0
V5        0
V6        0
V7        0
V8        0
V9        0
V10       0
V11       0
V12       0
V13       0
V14       0
V15       0
V16       0
V17       0
V18       0
V19       0
V20       0
V21       0
V22       0
V23       0
V24       0
V25       0
V26       0
V27       0
V28       0
Amount    0
Class     0
dtype: int64

In [10]:
# the distribution of legit transactions and fraudulent transactions
creditcard_df['Class'].value_counts()

Class
0    284315
1       492
Name: count, dtype: int64

#### This dataset is highly imbalanced (0 = legitimate transaction, 1 = fraudulent transaction)
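
To make the imbalance concrete, the short sketch below turns the two class counts reported by `value_counts()` into a fraud rate. It also computes the accuracy of a hypothetical baseline that labels every transaction as legitimate, which is an illustration and not something the notebook trains. The variable names are made up for this example.

```python
# Class counts taken from the value_counts() output above.
legit_count = 284315
fraud_count = 492
total = legit_count + fraud_count

# Share of all transactions that are fraudulent.
fraud_rate = fraud_count / total

# A hypothetical "model" that labels every transaction as legitimate
# is right on every legitimate row and wrong on every fraudulent row.
always_legit_accuracy = legit_count / total

print(f"Fraud rate: {fraud_rate:.4%}")
print(f"Accuracy of always predicting legit: {always_legit_accuracy:.4%}")
```

Fraud makes up roughly 0.17% of the rows, so a model can score above 99.8% accuracy on the full dataset without detecting a single fraudulent transaction. That is the motivation for balancing the classes before training.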

In [12]:
# separating the data for analysis
legit = creditcard_df[creditcard_df.Class == 0]
fraud = creditcard_df[creditcard_df.Class == 1]

In [13]:
print(legit.shape)
print(fraud.shape)

(284315, 31)
(492, 31)


In [14]:
# statistical measures of the data for legit transactions
legit.Amount.describe()

count    284315.000000
mean         88.291022
std         250.105092
min           0.000000
25%           5.650000
50%          22.000000
75%          77.050000
max       25691.160000
Name: Amount, dtype: float64

In [15]:
fraud.Amount.describe()

count     492.000000
mean      122.211321
std       256.683288
min         0.000000
25%         1.000000
50%         9.250000
75%       105.890000
max      2125.870000
Name: Amount, dtype: float64

In [16]:
# compare the mean of every feature for legitimate and fraudulent transactions
creditcard_df.groupby('Class').mean()

Class,Time,V1,V2,V3,V4,V5,V6,V7,V8,V9,...,V20,V21,V22,V23,V24,V25,V26,V27,V28,Amount
0,94838.202258,0.008258,-0.006271,0.012171,-0.00786,0.005453,0.002419,0.009637,-0.000987,0.004467,...,-0.000644,-0.001235,-2.4e-05,7e-05,0.000182,-7.2e-05,-8.9e-05,-0.000295,-0.000131,88.291022
1,80746.806911,-4.771948,3.623778,-7.033281,4.542029,-3.151225,-1.397737,-5.568731,0.570636,-2.581123,...,0.372319,0.713588,0.014049,-0.040308,-0.10513,0.041449,0.051648,0.170575,0.075667,122.211321


#### Under-sampling: build a sample dataset with an equal number of legitimate and fraudulent transactions

The number of fraudulent transactions is 492.

In [18]:
legit_sample = legit.sample(n=492)
legit_sample.shape

(492, 31)
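
Because `legit.sample(n=492)` draws rows at random, each run of the notebook selects a different set of legitimate transactions, and the downstream numbers shift slightly. One way to make the draw repeatable is the `random_state` parameter of `DataFrame.sample`. The sketch below shows this on a small toy frame; `toy_df` and the other names are illustrative and not part of the notebook.

```python
import pandas as pd

# Toy stand-in for the real data: 20 legitimate rows and 4 fraudulent rows.
toy_df = pd.DataFrame({
    "Amount": [float(i) for i in range(24)],
    "Class": [0] * 20 + [1] * 4,
})

toy_legit = toy_df[toy_df.Class == 0]
toy_fraud = toy_df[toy_df.Class == 1]

# random_state fixes the random draw, so the same rows come back on every run.
sample_a = toy_legit.sample(n=len(toy_fraud), random_state=42)
sample_b = toy_legit.sample(n=len(toy_fraud), random_state=42)
print("Same rows selected twice:", sample_a.index.equals(sample_b.index))

# Stack the sampled legitimate rows on top of the fraudulent rows.
balanced = pd.concat([sample_a, toy_fraud], axis=0)
print(balanced["Class"].value_counts())
```

Sizing the sample with `len(toy_fraud)` instead of a hard-coded number keeps the two classes equal even if the number of fraudulent rows changes.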

#### Concatenate the two DataFrames

In [20]:
new_dataset = pd.concat([legit_sample, fraud], axis=0)
new_dataset.shape

(984, 31)

In [21]:
new_dataset.head()

Unnamed: 0,Time,V1,V2,V3,V4,V5,V6,V7,V8,V9,...,V21,V22,V23,V24,V25,V26,V27,V28,Amount,Class
194710,130727.0,-0.406705,-0.808118,1.166326,-2.09666,-0.432964,-0.450999,0.230346,-0.102149,-0.648974,...,0.412207,1.140903,0.087362,0.036039,0.191967,-0.211565,0.006342,-0.206794,129.0,0
259887,159317.0,0.082781,0.885979,0.011616,-0.762846,0.83841,-0.621566,1.035002,-0.177296,-0.105053,...,-0.291774,-0.619071,-0.015414,-0.702053,-0.386521,0.184421,0.253564,0.091373,4.49,0
94724,65004.0,1.027133,0.043529,0.137253,0.922418,0.448705,1.056445,-0.156301,0.412074,0.095766,...,0.14622,0.541825,0.024624,-0.978604,0.338261,-0.198237,0.076489,0.006404,20.0,0
56143,47288.0,-0.763004,1.633189,1.883,2.831664,-0.42538,0.036711,-0.038138,0.470369,-1.551014,...,0.258973,0.80912,-0.153167,0.450532,-0.131773,0.322829,0.280816,0.13373,10.65,0
1920,1475.0,1.184914,0.076415,0.529932,0.850806,-0.485773,-0.677905,0.081468,-0.201042,0.304756,...,-0.205162,-0.41396,-0.020973,0.45968,0.479929,0.262636,-0.020293,0.019509,31.62,0


In [22]:
new_dataset['Class'].value_counts()

Class
0    492
1    492
Name: count, dtype: int64

In [23]:
new_dataset.groupby('Class').mean()

Class,Time,V1,V2,V3,V4,V5,V6,V7,V8,V9,...,V20,V21,V22,V23,V24,V25,V26,V27,V28,Amount
0,92853.443089,0.112449,-0.036742,-0.018328,0.02775,-0.060555,-0.103908,-0.054912,0.059465,0.115694,...,0.031815,0.036422,0.020782,-0.021206,0.003651,0.020399,-0.006778,0.013899,0.010862,83.463374
1,80746.806911,-4.771948,3.623778,-7.033281,4.542029,-3.151225,-1.397737,-5.568731,0.570636,-2.581123,...,0.372319,0.713588,0.014049,-0.040308,-0.10513,0.041449,0.051648,0.170575,0.075667,122.211321


#### Splitting the dataset into features and targets

In [25]:
X = new_dataset.drop(columns='Class')
Y = new_dataset['Class']

In [26]:
print(X)

            Time        V1        V2        V3        V4        V5        V6  \
194710  130727.0 -0.406705 -0.808118  1.166326 -2.096660 -0.432964 -0.450999   
259887  159317.0  0.082781  0.885979  0.011616 -0.762846  0.838410 -0.621566   
94724    65004.0  1.027133  0.043529  0.137253  0.922418  0.448705  1.056445   
56143    47288.0 -0.763004  1.633189  1.883000  2.831664 -0.425380  0.036711   
1920      1475.0  1.184914  0.076415  0.529932  0.850806 -0.485773 -0.677905   
...          ...       ...       ...       ...       ...       ...       ...   
279863  169142.0 -1.927883  1.125653 -4.518331  1.749293 -1.566487 -2.010494   
280143  169347.0  1.378559  1.289381 -5.004247  1.411850  0.442581 -1.326536   
280149  169351.0 -0.676143  1.126366 -2.213700  0.468308 -1.120541 -0.003346   
281144  169966.0 -3.113832  0.585864 -5.399730  1.817092 -0.840618 -2.943548   
281674  170348.0  1.991976  0.158476 -2.583441  0.408670  1.151147 -0.096695   

              V7        V8        V9  .

In [27]:
print(Y)

194710    0
259887    0
94724     0
56143     0
1920      0
         ..
279863    1
280143    1
280149    1
281144    1
281674    1
Name: Class, Length: 984, dtype: int64


In [28]:
X.std()

Time      48223.847889
V1            5.561973
V2            3.762229
V3            6.205283
V4            3.181827
V5            4.194305
V6            1.732688
V7            5.829967
V8            4.840393
V9            2.366654
V10           4.579416
V11           2.775093
V12           4.582198
V13           1.040216
V14           4.662209
V15           0.994784
V16           3.483296
V17           5.976760
V18           2.401273
V19           1.256768
V20           1.078028
V21           2.775511
V22           1.165585
V23           1.170761
V24           0.567558
V25           0.673562
V26           0.464715
V27           1.008708
V28           0.431967
Amount      221.370666
dtype: float64

In [29]:
Y.std()

0.5002542588519275

#### Data Standardization

In [31]:
from sklearn.preprocessing import StandardScaler

In [32]:
scaler = StandardScaler()

In [33]:
scaler.fit(X)

In [34]:
standardized_data = scaler.transform(X)

In [35]:
X = standardized_data
Y = new_dataset['Class']

In [36]:
print(X)
print(Y)

[[ 0.91135848  0.34592437 -0.69186609 ... -0.08519599 -0.57917825
   0.11824492]
 [ 1.50452016  0.43397508 -0.24134642 ...  0.16001644  0.11142729
  -0.44449141]
 [-0.45220798  0.60384855 -0.46538337 ... -0.01561932 -0.08537601
  -0.3743923 ]
 ...
 [ 1.71269729  0.29745698 -0.17741908 ...  0.29049015  0.34996526
  -0.11275222]
 [ 1.7254568  -0.14104362 -0.32115748 ...  0.78619578 -0.68782143
   0.64251939]
 [ 1.73338222  0.77740815 -0.43481501 ... -0.08852359 -0.1356665
  -0.27256554]]
194710    0
259887    0
94724     0
56143     0
1920      0
         ..
279863    1
280143    1
280149    1
281144    1
281674    1
Name: Class, Length: 984, dtype: int64


In [37]:
# X is now a NumPy array, so .std() returns one value computed over all elements
X.std()

1.0
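
The single value above summarizes the whole array. To check each feature separately, the standard deviation has to be taken along `axis=0`. The sketch below reproduces the standardization formula with plain NumPy on a small toy matrix (the values and names are hypothetical) and prints both the per-column and the overall result.

```python
import numpy as np

# Toy feature matrix with very different column scales (hypothetical values).
X_toy = np.array([
    [100.0, 0.1],
    [200.0, 0.4],
    [300.0, 0.2],
    [400.0, 0.7],
])

# Standardization: subtract each column's mean, then divide by its standard deviation.
# StandardScaler uses the population standard deviation (ddof=0), which is also
# the default of np.std.
X_scaled = (X_toy - X_toy.mean(axis=0)) / X_toy.std(axis=0)

print("Per-column mean:", X_scaled.mean(axis=0))
print("Per-column std: ", X_scaled.std(axis=0))  # one value per feature
print("Overall std:    ", X_scaled.std())        # one value for the whole array
```

In the notebook, the equivalent per-feature check would be `X.std(axis=0)` on the scaled array.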

#### Split the data into Training and Testing Data

In [39]:
X_train, X_test, Y_train, Y_test = train_test_split(X,Y,test_size=0.2,stratify=Y, random_state=42)

In [40]:
print(X.shape, X_train.shape, X_test.shape)

(984, 30) (787, 30) (197, 30)
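
In the cells above, the scaler is fitted on all 984 rows before the split, so the test rows contribute to the scaling statistics. A common alternative is to fit the scaler on the training rows only. The sketch below shows that idea with scikit-learn's `make_pipeline` on synthetic stand-in data; `X_demo`, `y_demo`, and `pipe` are illustrative names and not part of the notebook.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (not the credit card dataset): two shifted Gaussian blobs.
rng = np.random.default_rng(0)
X_demo = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 5)),
    rng.normal(1.5, 1.0, size=(100, 5)),
])
y_demo = np.array([0] * 100 + [1] * 100)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=42
)

# The pipeline fits the scaler on the training rows only, then reuses
# those training statistics when it transforms the test rows.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipe.fit(X_tr, y_tr)

print("Rows seen by the scaler:", pipe.named_steps["standardscaler"].n_samples_seen_)
print("Test accuracy on synthetic data:", pipe.score(X_te, y_te))
```

With this layout, calling `pipe.predict` on raw feature values applies the scaling automatically, so a separate `scaler.transform` step is not needed at prediction time.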


#### Model Training

In [42]:
#Logistic Regression
lr_model = LogisticRegression(max_iter=1000)

In [43]:
#training the Logistic Regression Model with Training Data
lr_model.fit(X_train, Y_train)

#### Model Evaluation: Accuracy Score

In [45]:
# accuracy on training data (accuracy_score expects the true labels first)
X_train_prediction = lr_model.predict(X_train)
training_data_accuracy = accuracy_score(Y_train, X_train_prediction)

In [46]:
print('Accuracy on Training Data:', training_data_accuracy)

Accuracy on Training Data: 0.9491740787801779


In [47]:
# accuracy on test data
X_test_prediction = lr_model.predict(X_test)
testing_data_accuracy = accuracy_score(Y_test, X_test_prediction)

In [48]:
print('Accuracy on Testing Data:', testing_data_accuracy)

Accuracy on Testing Data: 0.9441624365482234
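
Accuracy counts both kinds of mistakes equally, but in fraud detection a missed fraud and a false alarm have different costs. Precision and recall separate the two. The sketch below computes them by hand from a small set of hypothetical labels (not the notebook's predictions) to show what each number measures. scikit-learn provides the same calculations as `precision_score`, `recall_score`, and `confusion_matrix` in `sklearn.metrics`.

```python
# Hypothetical true labels and predictions for ten transactions (1 = fraud).
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud caught
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud missed
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarm
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate, left alone

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp)  # of the flagged transactions, how many were fraud
recall = tp / (tp + fn)     # of the actual fraud, how much was flagged

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```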


#### Making a Predictive System

In [50]:
new_dataset[new_dataset.Class == 0].head()

Unnamed: 0,Time,V1,V2,V3,V4,V5,V6,V7,V8,V9,...,V21,V22,V23,V24,V25,V26,V27,V28,Amount,Class
194710,130727.0,-0.406705,-0.808118,1.166326,-2.09666,-0.432964,-0.450999,0.230346,-0.102149,-0.648974,...,0.412207,1.140903,0.087362,0.036039,0.191967,-0.211565,0.006342,-0.206794,129.0,0
259887,159317.0,0.082781,0.885979,0.011616,-0.762846,0.83841,-0.621566,1.035002,-0.177296,-0.105053,...,-0.291774,-0.619071,-0.015414,-0.702053,-0.386521,0.184421,0.253564,0.091373,4.49,0
94724,65004.0,1.027133,0.043529,0.137253,0.922418,0.448705,1.056445,-0.156301,0.412074,0.095766,...,0.14622,0.541825,0.024624,-0.978604,0.338261,-0.198237,0.076489,0.006404,20.0,0
56143,47288.0,-0.763004,1.633189,1.883,2.831664,-0.42538,0.036711,-0.038138,0.470369,-1.551014,...,0.258973,0.80912,-0.153167,0.450532,-0.131773,0.322829,0.280816,0.13373,10.65,0
1920,1475.0,1.184914,0.076415,0.529932,0.850806,-0.485773,-0.677905,0.081468,-0.201042,0.304756,...,-0.205162,-0.41396,-0.020973,0.45968,0.479929,0.262636,-0.020293,0.019509,31.62,0


In [51]:
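# raw features of the first row of creditcard_df (a legitimate transaction),
# in column order: Time, V1 ... V28, Amount
input_data = (0,-1.359807134,-0.072781173,2.53634673796914,1.37815522427443,-0.33832077,0.462387777762292,0.239598554061257,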
              0.0986979012610507,0.363786969611213,0.0907941719789316,-0.551599533,-0.617800856,-0.991389847,-0.311169354,
              1.46817697209427,-0.470400525,0.207971241929242,0.0257905801985591,0.403992960255733,0.251412098239705,-0.018306778,
              0.277837575558899,-0.11047391,0.0669280749146731,0.128539358273528,-0.189114844,0.133558376740387,-0.021053053,149.62	
)

input_data_as_array = np.asarray(input_data)
input_data_reshaped = input_data_as_array.reshape(1, -1)
# Standardize the new data using the same scaler
std_input_data = scaler.transform(input_data_reshaped)
print("Standardized input:", std_input_data)
prediction = lr_model.predict(std_input_data)
print("Prediction result:", prediction)

if prediction[0] == 0:
    print("This is a Legit transaction")
else:
    print("This is a Fraudulent transaction")


Standardized input: [[-1.80085721  0.17447686 -0.49631436  0.97743065 -0.28511776  0.30236566
   0.70054577  0.52366882 -0.04472009  0.67492467  0.63140101 -0.88151966
   0.54973294 -0.86351251  0.67936189  1.50079392  0.45507631  0.59065099
   0.48219704  0.05767837  0.04579652 -0.14177997  0.22353985 -0.06812447
   0.20742849  0.1449976  -0.45545577  0.04098584 -0.14897113  0.21143923]]
Prediction result: [0]
This is a Legit transaction




In [52]:
new_dataset[new_dataset.Class == 1].head()

Unnamed: 0,Time,V1,V2,V3,V4,V5,V6,V7,V8,V9,...,V21,V22,V23,V24,V25,V26,V27,V28,Amount,Class
541,406.0,-2.312227,1.951992,-1.609851,3.997906,-0.522188,-1.426545,-2.537387,1.391657,-2.770089,...,0.517232,-0.035049,-0.465211,0.320198,0.044519,0.17784,0.261145,-0.143276,0.0,1
623,472.0,-3.043541,-3.157307,1.088463,2.288644,1.359805,-1.064823,0.325574,-0.067794,-0.270953,...,0.661696,0.435477,1.375966,-0.293803,0.279798,-0.145362,-0.252773,0.035764,529.0,1
4920,4462.0,-2.30335,1.759247,-0.359745,2.330243,-0.821628,-0.075788,0.56232,-0.399147,-0.238253,...,-0.294166,-0.932391,0.172726,-0.08733,-0.156114,-0.542628,0.039566,-0.153029,239.93,1
6108,6986.0,-4.397974,1.358367,-2.592844,2.679787,-1.128131,-1.706536,-3.496197,-0.248778,-0.247768,...,0.573574,0.176968,-0.436207,-0.053502,0.252405,-0.657488,-0.827136,0.849573,59.0,1
6329,7519.0,1.234235,3.01974,-4.304597,4.732795,3.624201,-1.357746,1.713445,-0.496358,-1.282858,...,-0.379068,-0.704181,-0.656805,-1.632653,1.488901,0.566797,-0.010016,0.146793,1.0,1


In [53]:
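# raw features of the fraudulent transaction at index 623,
# in column order: Time, V1 ... V28, Amount
input_data = (472,-3.043540624,-3.157307121,1.08846277997285,2.2886436183814,1.35980512966107,-1.064822523,0.325574266158614,-0.067793653,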
              -0.270952836,-0.838586565,-0.414575448,-0.50314086,0.676501544635863,-1.692028933,2.00063483909015,0.666779695901966,0.599717413841732,
              1.72532100745514,0.283344830149495,2.10233879259444,0.661695924845707,0.435477208966341,1.37596574254306,-0.293803153,0.279798031841214,
              -0.145361715,-0.252773123,0.0357642251788156,529
)
input_data_as_array = np.asarray(input_data)
input_data_reshaped = input_data_as_array.reshape(1, -1)
# Standardize the new data using the same scaler
std_input_data = scaler.transform(input_data_reshaped)
print("Standardized input:", std_input_data)
prediction = lr_model.predict(std_input_data)
print("Prediction result:", prediction)

if prediction[0] == 0:
    print("This is a Legit transaction")
else:
    print("This is a Fraudulent transaction")

Standardized input: [[-1.79106455e+00 -1.28399411e-01 -1.31659794e+00  7.43981151e-01
   1.18044822e-03  7.07436235e-01 -1.81313487e-01  5.38423522e-01
  -7.91338652e-02  4.06586970e-01  4.28350359e-01 -8.32118156e-01
   5.74768593e-01  7.40711696e-01  3.83029864e-01  2.03631598e+00
   7.81709071e-01  6.56229225e-01  1.19031925e+00 -3.83691476e-02
   1.76362552e+00  1.03345531e-01  3.58853755e-01  1.20215626e+00
  -4.28479793e-01  3.69677076e-01 -3.61257489e-01 -3.42205296e-01
  -1.73727280e-02  1.92608792e+00]]
Prediction result: [1]
This is a Fraudulent transaction
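
For intuition about what `lr_model.predict` does with the standardized input, the sketch below walks through the logistic regression decision rule in plain Python: a weighted sum of the features plus a bias, passed through the sigmoid function, then compared with 0.5. The weights, bias, and feature values are hypothetical and are not taken from the trained model.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(features, weights, bias, threshold=0.5):
    """Return (label, probability of fraud) for one standardized feature vector."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    probability = sigmoid(score)
    label = 1 if probability > threshold else 0
    return label, probability


# Hypothetical weights and bias for three standardized features.
weights = [1.2, -0.8, 0.5]
bias = -0.3

label, prob = classify([2.0, -1.0, 0.5], weights, bias)
print("Label:", label, "Probability of fraud:", round(prob, 4))

if label == 0:
    print("This is a Legit transaction")
else:
    print("This is a Fraudulent transaction")
```

On the fitted scikit-learn model, `lr_model.predict_proba(std_input_data)` returns the corresponding class probabilities, which can be more informative than the 0/1 label alone.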


