In [1]:
import pandas as pd
import numpy as np

In [2]:
house_data = pd.read_csv("House_Rent_Dataset.csv")

In [3]:
house_data.head()

Unnamed: 0,Posted On,BHK,Rent,Size,Floor,Area Type,Area Locality,City,Furnishing Status,Tenant Preferred,Bathroom,Point of Contact
0,2022-05-18,2,10000,1100,Ground out of 2,Super Area,Bandel,Kolkata,Unfurnished,Bachelors/Family,2,Contact Owner
1,2022-05-13,2,20000,800,1 out of 3,Super Area,"Phool Bagan, Kankurgachi",Kolkata,Semi-Furnished,Bachelors/Family,1,Contact Owner
2,2022-05-16,2,17000,1000,1 out of 3,Super Area,Salt Lake City Sector 2,Kolkata,Semi-Furnished,Bachelors/Family,1,Contact Owner
3,2022-07-04,2,10000,800,1 out of 2,Super Area,Dumdum Park,Kolkata,Unfurnished,Bachelors/Family,1,Contact Owner
4,2022-05-09,2,7500,850,1 out of 2,Carpet Area,South Dum Dum,Kolkata,Unfurnished,Bachelors,1,Contact Owner


# Area Type

In [4]:
house_data["Area Type"].unique()

array(['Super Area', 'Carpet Area', 'Built Area'], dtype=object)

The `groupby()` function in **pandas** **groups rows** by one or more columns so that you can apply an **aggregation** (such as `sum()`, `mean()`, or `count()`) to each group.

---

### 🔧 **Basic Usage**

```python
df.groupby("kolom")
```

This groups the rows of `df` by the unique values in that column. On its own it returns a `GroupBy` object; nothing is computed until you apply an aggregation or another operation to it.

---

### 📊 **Illustration:**

Suppose you have a DataFrame like this, where `"Tipe Area"` holds the area type and `"Sewa"` holds the rent:

```python
import pandas as pd

data = pd.DataFrame({
    "Tipe Area": ["Urban", "Urban", "Rural", "Suburban", "Rural"],
    "Sewa": [1500, 1600, 800, 1200, 900]
})
```

| Tipe Area | Sewa |
| --------- | ---- |
| Urban     | 1500 |
| Urban     | 1600 |
| Rural     | 800  |
| Suburban  | 1200 |
| Rural     | 900  |

---

### ✅ **Example Uses of `groupby()`**

#### Average rent per area type:

```python
data.groupby("Tipe Area")["Sewa"].mean()
```

**Result:**

```
Tipe Area
Rural        850.0
Suburban    1200.0
Urban       1550.0
Name: Sewa, dtype: float64
```

#### Total rent per area type:

```python
data.groupby("Tipe Area")["Sewa"].sum()
```

#### Number of rows per area type:

```python
data.groupby("Tipe Area").size()
```
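
The three aggregations above can also be computed in a single pass. The following is a minimal sketch using named aggregation on the same toy data; the output column names `mean_rent`, `total_rent`, and `n_rows` are made up for this illustration.

```python
import pandas as pd

# Same toy data as above: "Tipe Area" = area type, "Sewa" = rent.
data = pd.DataFrame({
    "Tipe Area": ["Urban", "Urban", "Rural", "Suburban", "Rural"],
    "Sewa": [1500, 1600, 800, 1200, 900],
})

# Named aggregation: each keyword becomes an output column,
# defined by an (input column, aggregation function) pair.
summary = data.groupby("Tipe Area").agg(
    mean_rent=("Sewa", "mean"),    # average rent per group
    total_rent=("Sewa", "sum"),    # total rent per group
    n_rows=("Sewa", "count"),      # number of non-missing rents per group
)
print(summary)
```

Note that `"count"` skips missing values, whereas `.size()` counts every row in the group; on this toy data the two agree because nothing is missing.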

---

### 📌 Core Idea of `groupby`:

1. **Split** → Divide the data into groups based on a given column.
2. **Apply** → Run an operation on each group, typically an aggregation (sum, mean, count, etc.).
3. **Combine** → Merge the per-group results into a single structure.
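
To make the three steps concrete, here is a rough sketch that performs them by hand on the toy data and compares the result with the built-in call. It is only meant to illustrate the idea, not to describe how pandas implements `groupby` internally; the variable names `pieces`, `means`, and `manual` are arbitrary.

```python
import pandas as pd

data = pd.DataFrame({
    "Tipe Area": ["Urban", "Urban", "Rural", "Suburban", "Rural"],
    "Sewa": [1500, 1600, 800, 1200, 900],
})

# Split: iterate over the groups to get one sub-DataFrame per area type.
pieces = {key: part for key, part in data.groupby("Tipe Area")}

# Apply: reduce each piece to a single number (the mean rent).
means = {key: part["Sewa"].mean() for key, part in pieces.items()}

# Combine: collect the per-group results into one Series.
manual = pd.Series(means, name="Sewa")

# The built-in one-liner does all three steps at once.
builtin = data.groupby("Tipe Area")["Sewa"].mean()

print(manual)
print(manual.to_dict() == builtin.to_dict())
```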

---


In [5]:
rent_by_area = house_data.groupby("Area Type")

In [6]:
rent_by_area.head()

Unnamed: 0,Posted On,BHK,Rent,Size,Floor,Area Type,Area Locality,City,Furnishing Status,Tenant Preferred,Bathroom,Point of Contact
0,2022-05-18,2,10000,1100,Ground out of 2,Super Area,Bandel,Kolkata,Unfurnished,Bachelors/Family,2,Contact Owner
1,2022-05-13,2,20000,800,1 out of 3,Super Area,"Phool Bagan, Kankurgachi",Kolkata,Semi-Furnished,Bachelors/Family,1,Contact Owner
2,2022-05-16,2,17000,1000,1 out of 3,Super Area,Salt Lake City Sector 2,Kolkata,Semi-Furnished,Bachelors/Family,1,Contact Owner
3,2022-07-04,2,10000,800,1 out of 2,Super Area,Dumdum Park,Kolkata,Unfurnished,Bachelors/Family,1,Contact Owner
4,2022-05-09,2,7500,850,1 out of 2,Carpet Area,South Dum Dum,Kolkata,Unfurnished,Bachelors,1,Contact Owner
5,2022-04-29,2,7000,600,Ground out of 1,Super Area,Thakurpukur,Kolkata,Unfurnished,Bachelors/Family,2,Contact Owner
8,2022-06-07,2,26000,800,1 out of 2,Carpet Area,"Palm Avenue Kolkata, Ballygunge",Kolkata,Unfurnished,Bachelors,2,Contact Agent
9,2022-06-20,2,10000,1000,1 out of 3,Carpet Area,Natunhat,Kolkata,Semi-Furnished,Bachelors/Family,2,Contact Owner
10,2022-05-23,3,25000,1200,1 out of 4,Carpet Area,"Action Area 1, Rajarhat Newtown",Kolkata,Semi-Furnished,Bachelors/Family,2,Contact Agent
11,2022-06-07,1,5000,400,1 out of 1,Carpet Area,Keshtopur,Kolkata,Unfurnished,Bachelors/Family,1,Contact Agent
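
Note that `head()` on a grouped object behaves differently from `head()` on a plain DataFrame: it returns the first `n` rows (5 by default) of *each* group, kept in their original row order. That is why the output above has more than five rows and mixes `Super Area` and `Carpet Area` listings. A small sketch on toy data (the `toy` frame is made up for illustration):

```python
import pandas as pd

toy = pd.DataFrame({
    "Tipe Area": ["Urban", "Urban", "Rural", "Suburban", "Rural"],
    "Sewa": [1500, 1600, 800, 1200, 900],
})

# First row of every group; rows keep their original index and order.
first_rows = toy.groupby("Tipe Area").head(1)
print(first_rows)
```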


The `.get_group()` method in **pandas** **retrieves the rows of one specific group** after you have called `groupby()`.

---

### 📌 General Form:

```python
group = df.groupby("Kolom")
group.get_group("nilai_grup")
```

---

### 📊 Example:

Suppose you have a DataFrame like this:

```python
import pandas as pd

df = pd.DataFrame({
    "Tipe Area": ["Urban", "Urban", "Rural", "Suburban", "Rural"],
    "Sewa": [1500, 1600, 800, 1200, 900]
})
```

You want to see every row where `"Tipe Area"` is `"Rural"`:

```python
grouped = df.groupby("Tipe Area")
rural_data = grouped.get_group("Rural")
print(rural_data)
```

**Output:**

```
  Tipe Area  Sewa
2     Rural   800
4     Rural   900
```

---

### ✅ What `get_group()` Is Good For:

* Retrieves **all of the original rows** belonging to a given group.
* Useful when you want to **inspect the raw data** of one specific group rather than just an aggregated result.
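
For a single group, `get_group()` selects the same rows as an ordinary boolean filter on the grouping column. The sketch below compares the two on the toy data; the variable names are arbitrary.

```python
import pandas as pd

df = pd.DataFrame({
    "Tipe Area": ["Urban", "Urban", "Rural", "Suburban", "Rural"],
    "Sewa": [1500, 1600, 800, 1200, 900],
})

# Rows of the "Rural" group, taken from the grouped object.
rural_via_group = df.groupby("Tipe Area").get_group("Rural")

# The same rows, selected with a boolean mask instead.
rural_via_mask = df[df["Tipe Area"] == "Rural"]

print(rural_via_group)
print(rural_via_mask)
```

The mask is often simpler for a one-off lookup, while `get_group()` is convenient when you already have a grouped object and want to inspect several groups in turn.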

---

### ⚠️ Note:

* Make sure the group value you ask for **actually exists** in the grouping column; otherwise pandas raises a `KeyError`.
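
One way to guard against a missing group is to catch the `KeyError`. The helper below, `get_group_or_none`, is a hypothetical name introduced for this sketch and is not part of pandas; `"Coastal"` is simply a value that does not occur in the toy data.

```python
import pandas as pd

df = pd.DataFrame({
    "Tipe Area": ["Urban", "Urban", "Rural", "Suburban", "Rural"],
    "Sewa": [1500, 1600, 800, 1200, 900],
})
grouped = df.groupby("Tipe Area")


def get_group_or_none(grouped, key):
    """Hypothetical helper: return the group's rows, or None if the key is absent."""
    try:
        return grouped.get_group(key)
    except KeyError:
        return None


rural = get_group_or_none(grouped, "Rural")      # an existing group
coastal = get_group_or_none(grouped, "Coastal")  # no such group in the data

print(rural)
print(coastal)
```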

---


In [7]:
super_area = rent_by_area.get_group("Super Area")
super_area.head()

Unnamed: 0,Posted On,BHK,Rent,Size,Floor,Area Type,Area Locality,City,Furnishing Status,Tenant Preferred,Bathroom,Point of Contact
0,2022-05-18,2,10000,1100,Ground out of 2,Super Area,Bandel,Kolkata,Unfurnished,Bachelors/Family,2,Contact Owner
1,2022-05-13,2,20000,800,1 out of 3,Super Area,"Phool Bagan, Kankurgachi",Kolkata,Semi-Furnished,Bachelors/Family,1,Contact Owner
2,2022-05-16,2,17000,1000,1 out of 3,Super Area,Salt Lake City Sector 2,Kolkata,Semi-Furnished,Bachelors/Family,1,Contact Owner
3,2022-07-04,2,10000,800,1 out of 2,Super Area,Dumdum Park,Kolkata,Unfurnished,Bachelors/Family,1,Contact Owner
5,2022-04-29,2,7000,600,Ground out of 1,Super Area,Thakurpukur,Kolkata,Unfurnished,Bachelors/Family,2,Contact Owner


In [8]:
rent_by_area.get_group("Carpet Area")

Unnamed: 0,Posted On,BHK,Rent,Size,Floor,Area Type,Area Locality,City,Furnishing Status,Tenant Preferred,Bathroom,Point of Contact
4,2022-05-09,2,7500,850,1 out of 2,Carpet Area,South Dum Dum,Kolkata,Unfurnished,Bachelors,1,Contact Owner
8,2022-06-07,2,26000,800,1 out of 2,Carpet Area,"Palm Avenue Kolkata, Ballygunge",Kolkata,Unfurnished,Bachelors,2,Contact Agent
9,2022-06-20,2,10000,1000,1 out of 3,Carpet Area,Natunhat,Kolkata,Semi-Furnished,Bachelors/Family,2,Contact Owner
10,2022-05-23,3,25000,1200,1 out of 4,Carpet Area,"Action Area 1, Rajarhat Newtown",Kolkata,Semi-Furnished,Bachelors/Family,2,Contact Agent
11,2022-06-07,1,5000,400,1 out of 1,Carpet Area,Keshtopur,Kolkata,Unfurnished,Bachelors/Family,1,Contact Agent
...,...,...,...,...,...,...,...,...,...,...,...,...
4739,2022-07-06,2,25000,1040,2 out of 4,Carpet Area,Gachibowli,Hyderabad,Unfurnished,Bachelors,2,Contact Owner
4741,2022-05-18,2,15000,1000,3 out of 5,Carpet Area,Bandam Kommu,Hyderabad,Semi-Furnished,Bachelors/Family,2,Contact Owner
4743,2022-07-10,3,35000,1750,3 out of 5,Carpet Area,"Himayath Nagar, NH 7",Hyderabad,Semi-Furnished,Bachelors/Family,3,Contact Agent
4744,2022-07-06,3,45000,1500,23 out of 34,Carpet Area,Gachibowli,Hyderabad,Semi-Furnished,Family,2,Contact Agent


In [None]:
area_local = house_data["Area Locality"].unique()
len(area_local)  # number of distinct localities

2235
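
The number of distinct values can also be obtained with `nunique()`. The two approaches differ only when the column has missing values: `unique()` keeps the missing value as an entry of its own, while `nunique()` ignores it by default. A small sketch on a made-up Series (whether `"Area Locality"` in the real dataset contains missing values is not checked here):

```python
import pandas as pd

# Toy locality column with one duplicate and one missing value.
localities = pd.Series(["Bandel", "Gachibowli", "Bandel", None, "Keshtopur"])

n_with_missing = len(localities.unique())  # missing value counted as an entry
n_without_missing = localities.nunique()   # missing values dropped by default

print(n_with_missing, n_without_missing)
```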