##### Files and Field Descriptions

- train_series.parquet - Series to be used as training data. Each series is a continuous recording of accelerometer data for a single subject spanning many days.
  - series_id - Unique identifier for each accelerometer series.
  - step - An integer timestep for each observation within a series.
  - timestamp - A corresponding datetime in ISO 8601 format %Y-%m-%dT%H:%M:%S%z.
  - anglez - As calculated and described by the GGIR package, z-angle is a metric derived from individual accelerometer components that is commonly used in sleep detection, and refers to the angle of the arm relative to the vertical axis of the body.
  - enmo - As calculated and described by the GGIR package, ENMO is the Euclidean Norm Minus One of all accelerometer signals, with negative values rounded to zero. While no standard measure of acceleration exists in this space, this is one of the several commonly computed features.
- test_series.parquet - Series to be used as the test data, containing the same fields as above. You will predict event occurrences for series in this file.
- train_events.csv - Sleep logs for series in the training set recording onset and wake events.
  - series_id - Unique identifier for each series of accelerometer data in train_series.parquet.
  - night - An enumeration of potential onset / wakeup event pairs. At most one pair of events can occur for each night.
  - event - The type of event, whether onset or wakeup.
  - step and timestamp - The recorded time of occurrence of the event in the accelerometer series.
- sample_submission.csv - A sample submission file in the correct format. See the Evaluation page for more details.
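
To make the two signal features concrete, the sketch below computes them from raw triaxial acceleration in units of g. It follows the definitions quoted above: ENMO is the vector norm minus one, clipped at zero, and z-angle is the angle between the z axis and the horizontal plane as GGIR documents it. The function names and the sample readings are illustrative assumptions; the competition files already contain the computed values, and GGIR applies further processing (such as calibration and averaging over short epochs) that this sketch omits.

```python
import numpy as np

def enmo(x, y, z):
    """Euclidean Norm Minus One, with negative values set to zero."""
    norm = np.sqrt(x**2 + y**2 + z**2)
    return np.maximum(norm - 1.0, 0.0)

def anglez(x, y, z):
    """Angle of the z axis relative to the horizontal plane, in degrees."""
    return np.degrees(np.arctan2(z, np.sqrt(x**2 + y**2)))

# Hypothetical readings in g: at rest pointing up, at rest tilted, and moving
acc = np.array([
    [0.0, 0.0, 1.0],
    [0.6, 0.0, 0.8],
    [1.2, 0.0, 0.9],
])
x, y, z = acc.T
print(enmo(x, y, z))
print(anglez(x, y, z))
```

At rest the norm is about 1 g, so ENMO is close to zero regardless of orientation, while anglez changes with posture.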

# 1. DATA COLLECTION

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory
plt.rcParams["figure.figsize"] = (20,3)

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
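def translate_time(data: pd.DataFrame, col: str) -> pd.DataFrame:
    """Add a timezone-naive 'datetime' column parsed from the ISO 8601 strings in `col`."""
    data = data.copy()  # avoid writing into a slice of another DataFrame
    try:
        # Drop the trailing UTC offset (e.g. "-0400") to keep the local wall-clock time
        data['datetime'] = data[col].astype(str).str.replace('T', ' ').str[:-5]
        data['datetime'] = pd.to_datetime(data['datetime'])
    except Exception as e:
        print(e)
    return data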
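
As a quick check of what `translate_time` produces, the sketch below applies the same idea to two made-up timestamps in the competition's format: drop the trailing UTC offset (the last five characters, e.g. `-0400`) and parse the remainder, so the result is the local wall-clock time. The helper name `to_local_naive` and the sample values are assumptions for illustration.

```python
import pandas as pd

def to_local_naive(timestamps: pd.Series) -> pd.Series:
    """Strip the '+HHMM' / '-HHMM' suffix and parse; missing values become NaT."""
    return pd.to_datetime(timestamps.str[:-5])

# Hypothetical timestamps, including one missing value
sample = pd.Series(["2018-08-14T22:26:00-0400", "2018-11-05T06:30:00-0500", None])
parsed = to_local_naive(sample)
print(parsed)
```

Because the offset is discarded, the parsed values are only comparable within one series; they should not be treated as UTC.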

# 2. DATA UNDERSTANDING
## Understanding Training Events and Series Data

- Assessing data distributions
- Identifying missing values
- Detecting outliers
- Understanding relationships between variables

The training events data has 14510 rows and 5 columns (features).

In [None]:
training_data_events = pd.read_csv('/kaggle/input/child-mind-institute-detect-sleep-states/train_events.csv')
training_data_events.shape

In [None]:
training_data_events.info()

## Describing the numerical data types

- ##### night :

  - This is an integer (interval-scale) value that numbers the night an event belongs to. Pairing the onset and wakeup of the same night gives the duration of sleep and of wakefulness.
  - Night takes integer values in the range [1, 84]. The number of nights differs from one series_id to another.
- step has no significance on its own in this file, but it acts as a key into the train_series parquet file.

In [None]:
training_data_events.describe()

## Describing the categorical variables 

##### series_id :

This ID is a nominal value that identifies one accelerometer recording, i.e. the device worn by one child. There are 0 NULL values.

##### event :

Event is a nominal value that takes 2 values: onset and wakeup. The two categories have no inherent order.

In [None]:
training_data_events.describe(include=['O'])

In [None]:
training_data_events = translate_time(training_data_events, "timestamp")
training_data_events.sample(5)

### Observation : Nights for each series ID available in the events data

In [None]:
seriesid_group_by_nights = training_data_events.groupby('series_id')['night'].nunique()
seriesid_group_by_nights.describe()

In [None]:
seriesid_group_by_nights.sort_values(ascending=False).head(10).plot(kind='barh')
plt.title('Nights per Series ID')
plt.xlabel('Number of nights')
plt.ylabel('Top 10 Series IDs')

### Check NaN values in the dataset

4923 values each in step and timestamp are NaN. These rows need to be handled before modelling.

In [None]:
training_data_events.isna().sum()

In [None]:
# Check values which are NaNs or NaTs
series_id_having_all_nas = training_data_events[training_data_events.isna().any(axis=1)]

print(series_id_having_all_nas["series_id"].nunique())

In [None]:
## Get the unique events in event data
training_data_events["event"].unique()

In [None]:
## Get the unique series ids in event data
training_data_events["series_id"].nunique()

Of the 277 unique series IDs, 269 remain after dropping rows with NaN values, so 8 series IDs have no labelled events at all.
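
One way to list the series that carry no usable labels is to check, per `series_id`, whether any `step` value is present. The sketch below uses a small made-up frame in place of `train_events.csv`; the IDs and values are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy stand-in for train_events.csv: series "b" has no recorded steps at all
events = pd.DataFrame({
    "series_id": ["a", "a", "b", "b", "c", "c"],
    "night": [1, 1, 1, 1, 1, 1],
    "event": ["onset", "wakeup"] * 3,
    "step": [100.0, 900.0, np.nan, np.nan, 250.0, np.nan],
})

# True for a series if at least one of its events has a step
has_label = events["step"].notna().groupby(events["series_id"]).any()
unlabelled = has_label[~has_label].index.tolist()
print(unlabelled)
```

Series "c" is kept because one of its two events is labelled, which matches the behaviour of counting unique IDs after `dropna()`.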

In [None]:
training_data_events.dropna().nunique()

### Observation: Wake Up and Onset data points

In [None]:
training_events_wake_up = training_data_events[training_data_events["event"] == "wakeup"]
training_events_onset = training_data_events[training_data_events["event"] == "onset"]

In [None]:
training_events_onset['datetime'].dt.hour.value_counts().plot(kind='barh')
plt.title('Onset Times')
plt.xlabel('Number of cases')
plt.ylabel('Onset time')

In [None]:
training_events_wake_up['datetime'].dt.hour.value_counts().plot(kind='barh')
plt.title('WakeUp Times')
plt.xlabel('Number of cases')
plt.ylabel('WakeUp time')

In [None]:
train_events = pd.read_csv('/kaggle/input/child-mind-institute-detect-sleep-states/train_events.csv')
train_events.shape

In [None]:
train_series = pd.read_parquet('/kaggle/input/child-mind-institute-detect-sleep-states/train_series.parquet')

In [None]:
train_series['timestamp'].min()

In [None]:
train_series['timestamp'].max()

In [None]:
top_five_series = train_events.groupby('series_id')['event'].count().sort_values(ascending=False).head(5)
top_five_series

### Now get the top 5 series id data from train_series and work on the those data points 

The series data has over 127 million rows. Restricting it to the top 5 series reduces it to about 5 million rows.

In [None]:
train_series_subset = train_series[train_series['series_id'].isin(top_five_series.index)]
train_series_subset = translate_time(train_series_subset, 'timestamp')
train_series_subset.shape

The training events subset has about 592 rows for the top 5 IDs.

In [None]:
train_events_subset = train_events[train_events['series_id'].isin(top_five_series.index)]
train_events_subset = translate_time(train_events_subset, 'timestamp')
train_events_subset.shape

In [None]:
import matplotlib.pyplot as plt

In this step, we work on the NaN and NaT values for the event and series data points.

In [None]:
train_events_subset.sample(5)

# 3. DATA PREPARATION
### Remove all NULL values

No correlation can be established from the training events alone, so we remove all rows with NULL values from the training events subset.
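
A minimal sketch of this filtering step on a made-up frame: `DataFrame.dropna` with `subset` drops every row where `step` or `timestamp` is missing, which is the same as keeping the rows where both are present. The values below are hypothetical.

```python
import numpy as np
import pandas as pd

events = pd.DataFrame({
    "series_id": ["a", "a", "a", "a"],
    "event": ["onset", "wakeup", "onset", "wakeup"],
    "step": [100.0, 900.0, np.nan, np.nan],
    "timestamp": ["2018-08-14T22:26:00-0400", "2018-08-15T06:41:00-0400", None, None],
})

# Drop rows with a missing step or a missing timestamp
labelled = events.dropna(subset=["step", "timestamp"])

# Equivalent boolean mask: both columns must be present
mask = events["step"].notna() & events["timestamp"].notna()
print(labelled.shape, labelled.equals(events[mask]))
```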

In [None]:
# Keep only rows where both step and timestamp are present
mask_non_NULL = (~train_events_subset['step'].isnull()) & (~train_events_subset['timestamp'].isnull())
train_events_subset_nonNULL = train_events_subset[mask_non_NULL]

train_events_subset_nonNULL.shape

In [None]:
train_events_subset_nonNULL['series_id'].unique()

In [None]:
train_events_subset_nonNULL[train_events_subset_nonNULL['series_id'] == '78569a801a38']

### Looking at the sleep duration

- We perform some exploratory analysis on the sleep events.
- The duration of sleep gives us an idea of how long people sleep.
- We can use this information to work out states beyond the onset and wakeup values.
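
The duration calculation can be sketched on toy data as follows. It pivots the onset and wakeup times of each night into one row and subtracts them. The frame below is made up, and `total_seconds()` is used so that a gap longer than 24 hours would not be truncated.

```python
import pandas as pd

# Hypothetical events for two nights of one series
events = pd.DataFrame({
    "series_id": ["a", "a", "a", "a"],
    "night": [1, 1, 2, 2],
    "event": ["onset", "wakeup", "onset", "wakeup"],
    "datetime": pd.to_datetime([
        "2018-08-14 22:30:00", "2018-08-15 07:00:00",
        "2018-08-15 23:00:00", "2018-08-16 06:30:00",
    ]),
})

# One row per (series, night) with an onset and a wakeup column
nights = events.pivot(index=["series_id", "night"], columns="event", values="datetime")
nights["duration_h"] = (nights["wakeup"] - nights["onset"]).dt.total_seconds() / 3600
print(nights)
```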

In [None]:
sleep_duration = train_events_subset_nonNULL.groupby([
    train_events_subset_nonNULL['series_id'], 
    train_events_subset_nonNULL['night']])['datetime'].agg(['min', 'max']).reset_index()

In [None]:
sleep_duration = sleep_duration.rename(columns={'min': 'onset', 'max': 'wakeup'})
sleep_duration.head(6)

In [None]:
# total_seconds() keeps the full gap; .dt.seconds would discard whole days
sleep_duration['duration'] = ((sleep_duration['wakeup'] - sleep_duration['onset']).dt.total_seconds() / 3600).round(0)
sleep_duration.head(6)

In [None]:
sleep_duration['duration'].value_counts().plot(kind='barh')

plt.title('Sleep duration')
plt.xlabel('Frequency counts')
plt.ylabel('Sleep duration in hours')

### Observation: most recorded sleeps last 9 - 10 hours

## Merge both the datasets based on the series_id

- We left join the series with the events on series_id, timestamp and datetime, so every series row is kept
- Merge the values so that we can perform EDA on anglez and enmo
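
The effect of the merge can be seen on a small example. A left join keeps every row of the series and fills the event columns with NaN wherever no event was logged at that timestamp. The frames and values below are made up for illustration.

```python
import pandas as pd

# Toy series: four consecutive steps of one recording
series = pd.DataFrame({
    "series_id": ["a"] * 4,
    "step": [0, 1, 2, 3],
    "timestamp": ["t0", "t1", "t2", "t3"],
    "anglez": [10.0, 12.0, -40.0, -41.0],
})
# Toy events: a single onset logged at timestamp t2
events = pd.DataFrame({
    "series_id": ["a"],
    "night": [1],
    "event": ["onset"],
    "step": [2.0],
    "timestamp": ["t2"],
})

merged = pd.merge(series, events, on=["series_id", "timestamp"], how="left")
# Both frames have a step column, so pandas suffixes them with _x and _y
merged = merged.drop(columns="step_y").rename(columns={"step_x": "step"})
print(merged)
```

An inner join would instead keep only the single row at `t2`, which would discard the accelerometer signal between events.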

In [None]:
train_series_subset.head()

In [None]:
train_events_subset_nonNULL.head()

In [None]:
train_data_events = pd.merge(train_series_subset, train_events_subset_nonNULL, on=['series_id', 'timestamp', 'datetime'], how='left')
train_data_events.head(5)

In [None]:
train_data_events = train_data_events.drop(['step_y'], axis=1)
train_data_events = train_data_events.rename(columns = {'step_x' : 'step'})

In [None]:
train_data_events.info()

In [None]:
train_data_events[5496:13104]

In [None]:
onset_wakeup_events = train_data_events[~train_data_events['event'].isnull()]
onset_wakeup_events = onset_wakeup_events.reset_index()

In [None]:
onset_wakeup_events.info()

### USEFUL METHODS FOR DATA PROCESSING

- In this processing step we fill some of the NA values for night and event
- We create 2 new events, sleep and awake: sleep labels the steps between an onset and the following wakeup, and awake labels the steps outside those intervals
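
The labelling idea can be sketched on a toy series before applying it to the real data: every step starts as `awake`, and the steps after each onset up to and including the matching wakeup are marked `sleep`. The function name `label_states` and the step values are hypothetical, and the sketch assumes the row label equals the step number, which holds when steps start at 0 and are contiguous.

```python
import pandas as pd

def label_states(n_steps: int, pairs: list) -> pd.Series:
    """Return an 'awake' / 'sleep' label per step from (onset_step, wakeup_step) pairs."""
    states = pd.Series(["awake"] * n_steps)
    for onset_step, wakeup_step in pairs:
        # .loc slicing on labels is inclusive at both ends
        states.loc[onset_step + 1 : wakeup_step] = "sleep"
    return states

# Two hypothetical sleep periods in a ten-step series
states = label_states(10, [(2, 5), (7, 8)])
print(states.tolist())
```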

In [None]:
def attach_nightID_and_sleep_state(
    user_id: str,
    train_series: pd.DataFrame, 
    wakeup_onset: pd.DataFrame,
    init_last_index: int):
    """Label every step of one series with a night number and an 'awake' / 'sleep' state.

    Rows 0..init_last_index are initialised as awake on night 1.
    """
    # Keep only this series; reset_index makes the row labels 0..n-1,
    # which the .loc slices below treat as step numbers
    train_series = train_series[train_series['series_id'] == user_id]
    train_series = train_series.reset_index()
    wakeup_onset = wakeup_onset[wakeup_onset['series_id'] == user_id]
    wakeup_onset = wakeup_onset.reset_index()
    
    train_series.loc[0: init_last_index, 'night'] = 1.0
    train_series.loc[0: init_last_index, 'event'] = 'awake'
    
    # Manually exclude rows 1 and 3 of this series' event table
    if user_id == 'f564985ab692':
        wakeup_onset = wakeup_onset.drop([1, 3])
    
    # Even positions are onsets: mark onset+1 .. wakeup as sleep
    for idx in np.arange(0, len(wakeup_onset), 2):
        step_sleep = wakeup_onset.iloc[idx]['step']
        step_wakeup = int(wakeup_onset.iloc[idx + 1]['step'])
        night = wakeup_onset.iloc[idx]['night']

        train_series.loc[step_sleep+1: step_wakeup, 'night'] = night
        train_series.loc[step_sleep+1: step_wakeup, 'event'] = 'sleep'

    # Odd positions are wakeups: mark wakeup+1 .. next onset as awake,
    # attributed to the night of that next onset
    for idx in np.arange(1, len(wakeup_onset) -1, 2):
        step_sleep = wakeup_onset.iloc[idx]['step']
        step_wakeup = int(wakeup_onset.iloc[idx + 1]['step'])
        night = wakeup_onset.iloc[idx + 1]['night']

        train_series.loc[step_sleep+1: step_wakeup, 'night'] = night
        train_series.loc[step_sleep+1: step_wakeup, 'event'] = 'awake'
    return train_series, wakeup_onset

### Fix NULLs in the following IDs 

- A : 78569a801a38
- B : f564985ab692
- C : fb223ed2278c
- D : f56824b503a0
- E : cfeb11428dd7

## SOME EDA

### ID : 78569a801a38

In [None]:
# Label series A (78569a801a38); rows 0 to 5496 are initialised as awake on night 1

train_data_events_A, wakeup_onset_A = attach_nightID_and_sleep_state(
    train_series=train_data_events.copy(),
    wakeup_onset=onset_wakeup_events.copy(),
    user_id="78569a801a38",
    init_last_index=5496
)

train_data_events_A.sample(10)

In [None]:
def get_unique_nights(data: pd.DataFrame):
    return pd.Series(data['night'].unique()).dropna()

In [None]:
# Plot anglez per night for series A, with that night's onset / wakeup events joined in red
plt.rcParams["figure.figsize"] = (20,60)
nights = get_unique_nights(train_data_events_A)
for idx, night in enumerate(nights):
    anglez_for_series_A = train_data_events_A[train_data_events_A['night'] == night]
    plt.subplot(len(nights), 1, idx + 1)
    plt.title(f'Night : {night}')
    plt.plot(anglez_for_series_A['step'], anglez_for_series_A['anglez'])

    anglez_for_onset_info_A = wakeup_onset_A[wakeup_onset_A['night'] == night]
    # print(anglez_for_onset_info_A.shape)
    plt.plot(anglez_for_onset_info_A['step'], anglez_for_onset_info_A['anglez'], color='red')

In [None]:
# Plot enmo per night for series A, with that night's onset / wakeup events joined in red
nights = get_unique_nights(train_data_events_A)
for idx, night in enumerate(nights):
    enmo_for_series_A = train_data_events_A[train_data_events_A['night'] == night]
    plt.subplot(len(nights), 1, idx + 1)
    plt.title(f'Night : {night}')
    plt.plot(enmo_for_series_A['step'], enmo_for_series_A['enmo'])

    enmo_for_onset_series_A = wakeup_onset_A[wakeup_onset_A['night'] == night]
    # print(anglez_for_onset_info_A.shape)
    plt.plot(enmo_for_onset_series_A['step'], enmo_for_onset_series_A['enmo'], color='red')

### ID : f564985ab692

In [None]:
# Label series B (f564985ab692); rows 0 to 5640 are initialised as awake on night 1

train_data_events_B, wakeup_onset_B = attach_nightID_and_sleep_state(
    train_series=train_data_events.copy(),
    wakeup_onset=onset_wakeup_events.copy(),
    user_id="f564985ab692",
    init_last_index=5640
)

wakeup_onset_B.head(10)

In [None]:
plt.rcParams["figure.figsize"] = (20,20)
# Plot anglez per night for series B, with that night's onset / wakeup events joined in red
nights = get_unique_nights(train_data_events_B)
for idx, night in enumerate(nights):
    anglez_for_series_B = train_data_events_B[train_data_events_B['night'] == night]
    plt.subplot(len(nights), 1, idx + 1)
    plt.title(f'Night : {night}')
    plt.plot(anglez_for_series_B['step'], anglez_for_series_B['anglez'])

    anglez_for_onset_series_B = wakeup_onset_B[wakeup_onset_B['night'] == night]
    # print(anglez_for_onset_info_A.shape)
    plt.plot(anglez_for_onset_series_B['step'], anglez_for_onset_series_B['anglez'], color='red')

In [None]:
# Plot enmo per night for series B, with that night's onset / wakeup events joined in red
plt.rcParams["figure.figsize"] = (20,20)
nights = get_unique_nights(train_data_events_B)
for idx, night in enumerate(nights):
    enmo_for_series_B = train_data_events_B[train_data_events_B['night'] == night]
    plt.subplot(len(nights), 1, idx + 1)
    plt.title(f'Night : {night}')
    plt.plot(enmo_for_series_B['step'], enmo_for_series_B['enmo'])

    enmo_for_onset_series_B = wakeup_onset_B[wakeup_onset_B['night'] == night]
    # print(anglez_for_onset_info_A.shape)
    plt.plot(enmo_for_onset_series_B['step'], enmo_for_onset_series_B['enmo'], color='red')

### ID : fb223ed2278c

In [None]:
# Label series C (fb223ed2278c); rows 0 to 7536 are initialised as awake on night 1

train_data_events_C, wakeup_onset_C = attach_nightID_and_sleep_state(
    train_series=train_data_events.copy(),
    wakeup_onset=onset_wakeup_events.copy(),
    user_id="fb223ed2278c",
    init_last_index=7536
)

wakeup_onset_C.head()

In [None]:
# Plot enmo per night for series C, with that night's onset / wakeup events joined in red
plt.rcParams["figure.figsize"] = (20,20)
nights = get_unique_nights(train_data_events_C)
for idx, night in enumerate(nights):
    enmo_for_series_C = train_data_events_C[train_data_events_C['night'] == night]
    plt.subplot(len(nights), 1, idx + 1)
    plt.title(f'Night : {night}')
    plt.plot(enmo_for_series_C['step'], enmo_for_series_C['enmo'])

    enmo_for_onset_series_C = wakeup_onset_C[wakeup_onset_C['night'] == night]
    # print(anglez_for_onset_info_A.shape)
    plt.plot(enmo_for_onset_series_C['step'], enmo_for_onset_series_C['enmo'], color='red')

In [None]:
# Plot anglez per night for series C, with that night's onset / wakeup events joined in red
plt.rcParams["figure.figsize"] = (20,20)
nights = get_unique_nights(train_data_events_C)
for idx, night in enumerate(nights):
    anglez_for_series_C = train_data_events_C[train_data_events_C['night'] == night]
    plt.subplot(len(nights), 1, idx + 1)
    plt.title(f'Night : {night}')
    plt.plot(anglez_for_series_C['step'], anglez_for_series_C['anglez'])

    anglez_for_onset_series_C = wakeup_onset_C[wakeup_onset_C['night'] == night]
    # print(anglez_for_onset_info_A.shape)
    plt.plot(anglez_for_onset_series_C['step'], anglez_for_onset_series_C['anglez'], color='red')

### ID : f56824b503a0

In [None]:
# Label series D (f56824b503a0); rows 0 to 24228 are initialised as awake on night 1

train_data_events_D, wakeup_onset_D = attach_nightID_and_sleep_state(
    train_series=train_data_events.copy(),
    wakeup_onset=onset_wakeup_events.copy(),
    user_id="f56824b503a0",
    init_last_index=24228
)

wakeup_onset_D.head(10)

In [None]:
# Plot anglez per night for series D, with that night's onset / wakeup events joined in red
plt.rcParams["figure.figsize"] = (20,40)
nights = get_unique_nights(train_data_events_D)
for idx, night in enumerate(nights):
    anglez_for_series_D = train_data_events_D[train_data_events_D['night'] == night]
    plt.subplot(len(nights), 1, idx + 1)
    plt.title(f'Night : {night}')
    plt.plot(anglez_for_series_D['step'], anglez_for_series_D['anglez'])

    anglez_for_onset_series_D = wakeup_onset_D[wakeup_onset_D['night'] == night]
    # print(anglez_for_onset_info_A.shape)
    plt.plot(anglez_for_onset_series_D['step'], anglez_for_onset_series_D['anglez'], color='red')

In [None]:
# Plot enmo per night for series D, with that night's onset / wakeup events joined in red
plt.rcParams["figure.figsize"] = (20,45)
nights = get_unique_nights(train_data_events_D)
for idx, night in enumerate(nights):
    enmo_for_series_D = train_data_events_D[train_data_events_D['night'] == night]
    plt.subplot(len(nights), 1, idx + 1)
    plt.title(f'Night : {night}')
    plt.plot(enmo_for_series_D['step'], enmo_for_series_D['enmo'])

    enmo_for_onset_series_D = wakeup_onset_D[wakeup_onset_D['night'] == night]
    # print(enmo_for_onset_series_D.shape)
    plt.plot(enmo_for_onset_series_D['step'], enmo_for_onset_series_D['enmo'], color='red')

### ID : cfeb11428dd7

In [None]:
# Label series E (cfeb11428dd7); rows 0 to 7200 are initialised as awake on night 1

train_data_events_E, wakeup_onset_E = attach_nightID_and_sleep_state(
    train_series=train_data_events.copy(),
    wakeup_onset=onset_wakeup_events.copy(),
    user_id="cfeb11428dd7",
    init_last_index=7200
)

wakeup_onset_E

In [None]:
# Plot anglez per night for series E, with that night's onset / wakeup events joined in red
plt.rcParams["figure.figsize"] = (20,45)
nights = get_unique_nights(train_data_events_E)
for idx, night in enumerate(nights):
    anglez_for_series_E = train_data_events_E[train_data_events_E['night'] == night]
    plt.subplot(len(nights), 1, idx + 1)
    plt.title(f'Night : {night}')
    plt.plot(anglez_for_series_E['step'], anglez_for_series_E['anglez'])

    anglez_for_onset_series_E = wakeup_onset_E[wakeup_onset_E['night'] == night]
    # print(enmo_for_onset_series_D.shape)
    plt.plot(anglez_for_onset_series_E['step'], anglez_for_onset_series_E['anglez'], color='red')

In [None]:
# Plot enmo per night for series E, with that night's onset / wakeup events joined in red
plt.rcParams["figure.figsize"] = (20,45)
nights = get_unique_nights(train_data_events_E)
for idx, night in enumerate(nights):
    enmo_for_series_E = train_data_events_E[train_data_events_E['night'] == night]
    plt.subplot(len(nights), 1, idx + 1)
    plt.title(f'Night : {night}')
    plt.plot(enmo_for_series_E['step'], enmo_for_series_E['enmo'])

    enmo_for_onset_series_E = wakeup_onset_E[wakeup_onset_E['night'] == night]
    # print(enmo_for_onset_series_D.shape)
    plt.plot(enmo_for_onset_series_E['step'], enmo_for_onset_series_E['enmo'], color='red')

In [None]:
train_data_events.event.value_counts()

In [None]:
train_data_events.shape

In [None]:
train_data_events_E.event.value_counts()

In [None]:
wakeup_onset_E.event.value_counts()

In [None]:
train_data_events_E

In [None]:
train_data_events[train_data_events['series_id'] == 'f564985ab692']['event'].value_counts()

In [None]:
train_data_events['series_id'].value_counts()

In [None]:
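# Assign the result back; in-place fillna on a selected column may not update the frame
train_data_events['event'] = train_data_events['event'].fillna("NoChange")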

In [None]:
def event_cat_mapper(x):
    if x == "NoChange":
        return 0
    elif x == "wakeup":
        return 1
    return 2
train_data_events["event_cat"] = train_data_events['event'].apply(event_cat_mapper)

In [None]:
# Create the figure first so that figsize applies to this plot
plt.figure(figsize=(10, 1))
plt.plot(train_data_events[train_data_events['series_id'] == 'cfeb11428dd7']['event_cat'])
# plt.show()

In [None]:
train_data_events["y"] = "NULL"

In [None]:
# Carry the last event forward within each series: 1 after an onset, 0 after a wakeup.
# Rows before the first event of a series default to 0. This vectorised form gives the
# same labels as looping over rows with iterrows(), without the per-row cost.
event_state = train_data_events['event'].map({'onset': 1, 'wakeup': 0})
train_data_events['y'] = (
    event_state.groupby(train_data_events['series_id']).ffill().fillna(0).astype(int)
)

In [None]:
print(train_data_events['series_id'].unique())

In [None]:
train_data_events['y'].isna().sum()

In [None]:
# Create the figure first so that figsize applies to this plot
plt.figure(figsize=(10, 1))
plt.plot(train_data_events[train_data_events['series_id'] == 'f564985ab692']['event_cat'])
plt.plot(train_data_events[train_data_events['series_id'] == 'f564985ab692']['y'])

In [None]:
train_data_events

In [None]:
train_data_events['y'].value_counts()

In [None]:
X = train_data_events.drop(columns=['y', 'series_id', 'night', 'event', 'event_cat', 'timestamp', 'datetime'])
y = train_data_events['y'].astype('int')

In [None]:
from sklearn.model_selection import train_test_split
# Fix the seed so the split, and the scores below, are reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train

In [None]:
y_train.value_counts()

In [None]:
y_test.value_counts()

In [None]:
from sklearn.linear_model import LogisticRegression
log_reg = LogisticRegression()
log_reg.fit(X_train, y_train)

In [None]:
log_reg_predictions = log_reg.predict(X_test)
from sklearn.metrics import f1_score, accuracy_score
# f1_score(y_test, log_reg_predictions)
accuracy_score(y_test, log_reg_predictions)

In [None]:
sum(log_reg_predictions)

In [None]:
np.linalg.norm(train_data_events[(train_data_events['y'] == 1) & (train_data_events['series_id'] == 'f564985ab692')]['anglez'].values)

In [None]:
np.linalg.norm(train_data_events[(train_data_events['y'] == 0) & (train_data_events['series_id'] == 'f564985ab692')]['anglez'].values)

In [None]:
np.linalg.norm(train_data_events[(train_data_events['y'] == 1) & (train_data_events['series_id'] == '78569a801a38')]['anglez'].values)

In [None]:
np.linalg.norm(train_data_events[(train_data_events['y'] == 0) & (train_data_events['series_id'] == '78569a801a38')]['anglez'].values)

In [None]:
from sklearn.ensemble import RandomForestClassifier

In [None]:
rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)  # You can adjust the number of estimators as needed
rf_classifier.fit(X_train, y_train)

In [None]:
y_pred = rf_classifier.predict(X_test)
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report


In [None]:
accuracy = accuracy_score(y_test, y_pred) * 100
conf_matrix = confusion_matrix(y_test, y_pred)
classification_rep = classification_report(y_test, y_pred)

print("Accuracy:", accuracy)
print("Confusion Matrix:\n", conf_matrix)
print("Classification Report:\n", classification_rep)


In [None]:
from sklearn.tree import DecisionTreeClassifier

dt_classifier = DecisionTreeClassifier(random_state=42)
dt_classifier.fit(X_train, y_train)
y_pred = dt_classifier.predict(X_test)


In [None]:
accuracy = accuracy_score(y_test, y_pred) * 100
conf_matrix = confusion_matrix(y_test, y_pred)
classification_rep = classification_report(y_test, y_pred)

print("Accuracy:", accuracy)
print("Confusion Matrix:\n", conf_matrix)
print("Classification Report:\n", classification_rep)


In [None]:
from sklearn.svm import SVC
svm_classifier = SVC(kernel='linear', C=1.0, random_state=42)  # You can adjust the kernel and hyperparameters as needed
svm_classifier.fit(X_train, y_train)
y_pred = svm_classifier.predict(X_test)


In [None]:
accuracy = accuracy_score(y_test, y_pred) * 100
conf_matrix = confusion_matrix(y_test, y_pred)
classification_rep = classification_report(y_test, y_pred)

print("Accuracy:", accuracy)
print("Confusion Matrix:\n", conf_matrix)
print("Classification Report:\n", classification_rep)
