**Gender Behaviour Analysis**

# User Behaviour Analysis: Gender, Swipe Right Ratio, and Likes

This notebook summarizes an exploratory data analysis of dating-app user behaviour
by gender. The analysis focuses on:

1. Differences in **Swipe Right Ratio** across **Gender**.
2. The number of **Likes Received** by **Gender**.
3. The relationship between **App Usage Time (minutes)** and **Likes Received**.
4. The distribution of **Sexual Orientation** by **Gender**, with a chi-square test to check whether the two variables are associated.
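
The chi-square step in item 4 can be sketched as follows. This is a minimal illustration on made-up counts, not the `DatingSQL` data; the category labels are assumptions. Note that a chi-square test of independence checks for association between two categorical variables rather than correlation in the numeric sense.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Synthetic example rows (NOT taken from the DatingSQL dataset).
genders = ["Male"] * 100 + ["Female"] * 100
orientations = (
    ["Straight"] * 60 + ["Gay"] * 20 + ["Bisexual"] * 20      # Male rows
    + ["Straight"] * 50 + ["Gay"] * 25 + ["Bisexual"] * 25    # Female rows
)
toy = pd.DataFrame({"gender": genders, "sexual_orientation": orientations})

# Contingency table: rows = gender, columns = sexual orientation.
table = pd.crosstab(toy["gender"], toy["sexual_orientation"])

# Returns the test statistic, p-value, degrees of freedom, and expected counts.
chi2, p_value, dof, expected = chi2_contingency(table)

print(table)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p-value = {p_value:.5f}")
```

A small p-value would suggest that the orientation distribution differs between genders; a large one means the data give no evidence of an association.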

The dataset comes from the `DatingSQL` database and contains about 50,000 rows.


**Import and Load Data**


This section imports the required libraries, loads the data, and prints initial summary statistics.

In [8]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import f_oneway

# Import helper functions from utils and the EDA scripts
import sys, os
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..')))

from utils.db_connection import load_columns
from utils.cleaning import clean_numeric

# Load only the columns needed for this analysis
df_swipe = load_columns(['gender', 'swipe_right_ratio', 'likes_received', 'app_usage_time_min','sexual_orientation'])
df_swipe.describe()



| | swipe_right_ratio | likes_received | app_usage_time_min |
|---|---|---|---|
| count | 50000.0 | 50000.0 | 50000.0 |
| mean | 0.500655 | 99.52604 | 149.9124 |
| std | 0.197468 | 57.996799 | 86.990521 |
| min | 0.0 | 0.0 | 0.0 |
| 25% | 0.37 | 49.0 | 74.0 |
| 50% | 0.5 | 100.0 | 150.0 |
| 75% | 0.64 | 150.0 | 225.0 |
| max | 1.0 | 200.0 | 300.0 |
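
The notebook does not show how `load_columns` is implemented. As a rough sketch only, a helper of that kind could select the requested columns with a SQL query and return a DataFrame. Everything below is an assumption for illustration: the function name `load_columns_sketch`, the table name `users`, and the in-memory SQLite database stand in for the real `DatingSQL` connection.

```python
import sqlite3
import pandas as pd

def load_columns_sketch(conn, table, columns):
    """Hypothetical stand-in for utils.db_connection.load_columns."""
    # Quote identifiers so they are treated as names, not expressions.
    cols = ", ".join(f'"{c}"' for c in columns)
    query = f'SELECT {cols} FROM "{table}"'
    return pd.read_sql_query(query, conn)

# In-memory SQLite database with a few made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (gender TEXT, swipe_right_ratio REAL, likes_received INTEGER)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [("Male", 0.4, 10), ("Female", 0.6, 20), ("Non-binary", 0.5, 15)],
)
conn.commit()

df_demo = load_columns_sketch(conn, "users", ["gender", "swipe_right_ratio"])
print(df_demo.describe())
conn.close()
```

Loading only the columns an analysis needs keeps the DataFrame small, which matters more as the row count grows.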


## 1. Swipe Right Ratio Comparison across Genders

Question: Does the mean swipe right ratio differ between genders?


In [9]:
# Clean the data: start from df_swipe, the DataFrame loaded above
# (the original referenced an undefined name `df`, which raised a NameError)
df_swipe = clean_numeric(df_swipe.dropna(subset=['gender']), 'swipe_right_ratio')

# Compute the mean swipe right ratio per gender
mean_swipe = df_swipe.groupby('gender')['swipe_right_ratio'].mean().sort_values(ascending=False)
print("Mean Swipe Right Ratio by Gender:")
print(mean_swipe)

# One-way ANOVA across the gender groups
groups = [g['swipe_right_ratio'] for _, g in df_swipe.groupby('gender')]
stat, p = f_oneway(*groups)
print(f"\nANOVA Result → F: {stat:.3f}, p-value: {p:.5f}")

# Visualization
plt.figure(figsize=(8, 5))
sns.boxplot(data=df_swipe, x='gender', y='swipe_right_ratio', hue='gender', palette='Set2', legend=False)
plt.title('Distribution of Swipe Right Ratio by Gender')
plt.xticks(rotation=15)
plt.tight_layout()
plt.show()
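
The `clean_numeric` helper from `utils.cleaning` is not shown in the notebook. One plausible reading, offered purely as an assumption, is that it coerces a column to numeric and drops rows that cannot be parsed. The name `clean_numeric_sketch` and the sample rows below are hypothetical.

```python
import pandas as pd

def clean_numeric_sketch(df, column):
    """Hypothetical stand-in for utils.cleaning.clean_numeric."""
    out = df.copy()
    # Coerce unparseable entries (e.g. "n/a") to NaN ...
    out[column] = pd.to_numeric(out[column], errors="coerce")
    # ... then drop the rows where the column is missing.
    return out.dropna(subset=[column]).reset_index(drop=True)

# Made-up rows: one unparseable ratio and one missing gender.
raw = pd.DataFrame({
    "gender": ["Male", "Female", "Male", None],
    "swipe_right_ratio": ["0.4", "n/a", 0.7, 0.5],
})

cleaned = clean_numeric_sketch(raw.dropna(subset=["gender"]), "swipe_right_ratio")
print(cleaned)
```

Working on a copy leaves the input DataFrame untouched, so the cleaning step can be rerun safely in a notebook.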

**Insight**  
- The mean swipe right ratio is nearly the same for every gender.
- The ANOVA shows no statistically significant difference between genders.

➡ This suggests that swipe-right behaviour does not vary meaningfully by gender.
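
With tens of thousands of rows, a p-value alone can be hard to interpret, so it may help to report an effect size next to the ANOVA result. The sketch below uses synthetic data in which every group is drawn from the same distribution. The group labels, sample sizes, and the choice of eta squared as the effect size are assumptions for illustration, not part of the original analysis.

```python
import numpy as np
import pandas as pd
from scipy.stats import f_oneway

# Synthetic data: three groups drawn from the SAME distribution.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "gender": np.repeat(["Female", "Male", "Non-binary"], 500),
    "swipe_right_ratio": rng.normal(0.5, 0.2, 1500).clip(0, 1),
})

groups = [g["swipe_right_ratio"].to_numpy() for _, g in toy.groupby("gender")]
f_stat, p_value = f_oneway(*groups)

# Eta squared = between-group sum of squares / total sum of squares.
grand_mean = toy["swipe_right_ratio"].mean()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_total = ((toy["swipe_right_ratio"] - grand_mean) ** 2).sum()
eta_squared = ss_between / ss_total

print(f"F = {f_stat:.3f}, p-value = {p_value:.5f}, eta squared = {eta_squared:.5f}")
```

An eta squared close to zero means that gender explains almost none of the variance in swipe right ratio, which is the quantitative counterpart of the "no meaningful difference" reading.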
