# SaaS User Behavior Analysis: Data Overview

## Objective
This notebook provides an initial overview of the SaaS user behavior dataset.
The goal is to:
- Understand the structure of the data
- Validate feature ranges and types
- Detect obvious inconsistencies or anomalies

No modeling is performed and no assumptions about the data are made at this stage.

In [1]:
import pandas as pd

# Load dataset
df = pd.read_csv("../data/raw/realistic_user_behavior_dataset_1000.csv")

# Basic structural checks
df.shape

(1000, 9)

In [2]:
df.head()

   age  income  score  height  weight  visits  clicks  time_spent  target
0   39   59796  0.435   154.8    45.0       8       4         191       1
1   33   48031  0.709   164.3    63.9       5       4         366       0
2   41   45971  0.316   168.3    68.5       9       3         209       1
3   50   48756  0.508   189.0    94.7       9       5         314       0
4   32   35634  0.371   177.0    60.6      11       5         384       1


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 9 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   age         1000 non-null   int64  
 1   income      1000 non-null   int64  
 2   score       1000 non-null   float64
 3   height      1000 non-null   float64
 4   weight      1000 non-null   float64
 5   visits      1000 non-null   int64  
 6   clicks      1000 non-null   int64  
 7   time_spent  1000 non-null   int64  
 8   target      1000 non-null   int64  
dtypes: float64(3), int64(6)
memory usage: 70.4 KB


In [4]:
df.describe()

            age        income     score    height    weight    visits    clicks  time_spent    target
count    1000.0        1000.0    1000.0    1000.0    1000.0    1000.0    1000.0      1000.0    1000.0
mean     34.812     42433.235  0.472451   171.846   71.5213     9.256     5.561     371.404      0.29
std    9.462991  16343.081256  0.148494  9.200383  14.64087  3.595747  2.755335  145.022123  0.453989
min        18.0        8000.0       0.0     150.0      45.0       1.0       0.0        30.0       0.0
25%        28.0      31268.25     0.362   165.375      61.5       7.0       4.0      271.75       0.0
50%        35.0       40328.5    0.4715     172.0      71.4       9.0       5.0       354.0       0.0
75%        41.0      51475.75   0.57625     178.0      81.7      11.0       7.0      463.25       1.0
max        65.0      112113.0     0.862     200.0     128.7      22.0      16.0       954.0       1.0
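One common way to turn these quartiles into a first anomaly screen is the Tukey 1.5 x IQR rule. The sketch below is illustrative only: the helper name `iqr_outlier_counts`, the multiplier `k=1.5`, and the toy frame are assumptions, not part of the original notebook. For reference, the income quartiles above (31268.25 and 51475.75) give an upper fence of 81787.0, which the maximum of 112113.0 exceeds.

```python
import pandas as pd


def iqr_outlier_counts(df: pd.DataFrame, k: float = 1.5) -> pd.Series:
    """Count values per numeric column that fall outside the Tukey fences.

    Hypothetical helper: k=1.5 is the conventional multiplier, not a value
    taken from the notebook.
    """
    numeric = df.select_dtypes(include="number")
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    # Comparing a DataFrame with a Series aligns the Series on column labels.
    outside = (numeric < lower) | (numeric > upper)
    return outside.sum()


# Toy data for illustration only (not rows from the real dataset).
toy = pd.DataFrame(
    {
        "income": [40000, 42000, 41000, 43000, 112113],
        "visits": [8, 9, 9, 10, 11],
    }
)
counts = iqr_outlier_counts(toy)
print(counts)
```

On the real data this would be called as `iqr_outlier_counts(df)`. A flagged value is only a candidate for review, since skewed features such as income routinely exceed the fences without being errors.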


### Initial Data Inspection

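- The dataset contains 1000 user sessions (rows) and 9 columns: 6 integer and 3 float.
- No missing values are observed: every column has 1000 non-null entries.
- Feature ranges appear realistic for a SaaS product: age spans 18 to 65, score 0.0 to 0.862, visits 1 to 22, clicks 0 to 16, and time_spent 30 to 954.
- `target` is binary (0 or 1) with a mean of 0.29, so 29% of rows are in the positive class.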
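The checks summarized above can also be written as a small reusable function, so they can be re-run whenever the file changes. This is a minimal sketch under stated assumptions: `validate_frame` and `BOUNDS` are hypothetical names, and the bounds are illustrative choices loosely based on the `describe()` output (the upper limit of 1.0 for `score` in particular is an assumption, since the observed maximum is 0.862).

```python
import pandas as pd

# Hypothetical expected ranges (inclusive); adjust to the real data dictionary.
BOUNDS = {
    "age": (18, 65),
    "score": (0.0, 1.0),  # assumption: score is a 0-1 quantity
    "target": (0, 1),
}


def validate_frame(df: pd.DataFrame, bounds: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    # 1. Missing values per column.
    null_counts = df.isna().sum()
    for col, n in null_counts[null_counts > 0].items():
        problems.append(f"{col}: {n} missing value(s)")

    # 2. Values outside the expected inclusive range.
    for col, (low, high) in bounds.items():
        if col not in df.columns:
            problems.append(f"{col}: column not found")
            continue
        values = df[col].dropna()  # missing values are reported separately
        n_bad = int((~values.between(low, high)).sum())
        if n_bad:
            problems.append(f"{col}: {n_bad} value(s) outside [{low}, {high}]")

    return problems


# Toy frame for illustration only: one under-age row and one missing score.
sample = pd.DataFrame(
    {
        "age": [39, 33, 17],
        "score": [0.435, None, 0.316],
        "target": [1, 0, 1],
    }
)
problems = validate_frame(sample, BOUNDS)
print(problems)
```

On the real data the call would be `validate_frame(df, BOUNDS)`. Returning a list rather than raising on the first failure keeps all problems visible in a single pass.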