
# EDA Geospatial Data – **LEAFMAP ONLY v2a** 🇮🇩

This notebook uses **only** `leafmap` for all geospatial visualization (no 3D).  
Map colors follow a **red → green gradient** (red = highest intensity).  
Goal: *Run All* directly produces interactive PU/DO (pickup/dropoff) maps: heatmaps, clusters, and an optional choropleth when area boundaries are available.

**What this notebook provides:**
1. Setup and import of the required packages.
2. Loading the *trip* dataset (PU/DO), POIs (optional), and area boundaries (optional).
3. Light cleaning and conversion to a `GeoDataFrame`.
4. Interactive `leafmap` visualizations:
   - PU and DO heatmaps (red→green gradient).
   - PU and DO point clusters (toggleable layers).
   - (Optional) Choropleth of PU/DO counts per area, if polygon boundaries are available.
5. Export of the maps to HTML (optional).
6. A short descriptive Markdown summary of insights.

> **Note:** Longitude/latitude column names are detected automatically by exact, case-insensitive match against common variants: the prefixes `pickup_`/`pu_`/`start_` and `dropoff_`/`do_`/`end_` combined with `longitude`/`lon`/`lng` and `latitude`/`lat`.


In [12]:

# (Optional) Install packages if needed.
# Run once if the environment does not have the packages below yet.
# If they are already installed, this cell can be skipped.
try:
    import leafmap
except ImportError:
    %pip install -q leafmap geopandas shapely pyproj rtree

# Standard packages are imported in the next cell.


In [13]:

import os
import math
import json
import pandas as pd
import numpy as np
import geopandas as gpd
from shapely import wkt
from shapely.geometry import Point, Polygon
import leafmap

pd.set_option('display.max_columns', 50)


In [14]:

# ====== INPUT FILE CONFIGURATION ======
# Adjust these paths to match where your files live.
CANDIDATE_DIRS = [
    ".",                # current directory
    "./data",
    "/mnt/data"         # if you run in an environment that has this directory
]

# Default file names based on the project context
TRIPS_FILE_NAMES = [
    "jakarta_ride_trips_5000.csv",
    "ride_trips.csv",
    "trips.csv"
]
POIS_FILE_NAMES = [
    "jakarta_pois.csv",
    "pois.csv"
]
AREAS_FILE_NAMES = [
    "jakarta_areas.csv",
    "areas.csv",
    "areas.geojson",
    "areas.geo.csv"
]

def find_first_existing(candidates, filenames):
    for d in candidates:
        for f in filenames:
            path = os.path.join(d, f)
            if os.path.exists(path):
                return path
    return None

TRIPS_PATH = find_first_existing(CANDIDATE_DIRS, TRIPS_FILE_NAMES)
POIS_PATH = find_first_existing(CANDIDATE_DIRS, POIS_FILE_NAMES)
AREAS_PATH = find_first_existing(CANDIDATE_DIRS, AREAS_FILE_NAMES)

print("TRIPS_PATH :", TRIPS_PATH)
print("POIS_PATH  :", POIS_PATH)
print("AREAS_PATH :", AREAS_PATH)


TRIPS_PATH : None
POIS_PATH  : None
AREAS_PATH : None
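
The output above shows that none of the candidate files were found, so every later cell takes its skip branch. To exercise the notebook end to end, one option is to generate a small synthetic trips file first. The sketch below is only an illustration: the function name `write_synthetic_trips`, the 0.05-degree Gaussian spread, and the default of 500 rows are assumptions, the column names are chosen to match the `pickup_*`/`dropoff_*` patterns this notebook detects, and the points are random, not real trips.

```python
import csv
import os
import random

def write_synthetic_trips(path, n=500, center=(-6.2, 106.8), spread=0.05, seed=42):
    """Write n random pickup/dropoff pairs scattered around center (lat, lon).

    Hypothetical helper for smoke-testing the notebook; the data is synthetic.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same file
    lat0, lon0 = center
    folder = os.path.dirname(path)
    if folder:
        os.makedirs(folder, exist_ok=True)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["pickup_longitude", "pickup_latitude",
                         "dropoff_longitude", "dropoff_latitude"])
        for _ in range(n):
            writer.writerow([
                round(rng.gauss(lon0, spread), 6),
                round(rng.gauss(lat0, spread), 6),
                round(rng.gauss(lon0, spread), 6),
                round(rng.gauss(lat0, spread), 6),
            ])
    return path
```

Calling `write_synthetic_trips("./data/trips.csv")` and then re-running the configuration cell should make `TRIPS_PATH` resolve, since `./data` and `trips.csv` both appear in the candidate lists.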


In [15]:

# ====== LOAD DATA ======
df_trips = None
df_pois = None
gdf_areas = None

if TRIPS_PATH and os.path.exists(TRIPS_PATH):
    df_trips = pd.read_csv(TRIPS_PATH)
    print("Loaded trips:", df_trips.shape)
else:
    print("Warning: trips file not found. Please set TRIPS_PATH manually.")

if POIS_PATH and os.path.exists(POIS_PATH):
    try:
        df_pois = pd.read_csv(POIS_PATH)
        print("Loaded POIs:", df_pois.shape)
    except Exception as e:
        print("Could not read POIs as CSV:", e)
else:
    print("Info: optional POIs file not found. Continuing without POIs.")

def parse_wkt(value):
    """Parse a WKT string into a geometry; return None for anything else."""
    if isinstance(value, str) and value.strip().startswith(("POLY", "MULTI", "LINE", "POINT")):
        return wkt.loads(value)
    return None

# Area boundaries (either a CSV with a WKT geometry column or a GeoJSON file)
if AREAS_PATH and os.path.exists(AREAS_PATH):
    if AREAS_PATH.lower().endswith(".geojson"):
        gdf_areas = gpd.read_file(AREAS_PATH)
    else:
        try:
            df_areas = pd.read_csv(AREAS_PATH)
            # Detect common names for a WKT geometry column
            geom_cols = [c for c in df_areas.columns if c.lower() in ["geometry", "wkt", "geom"]]
            if geom_cols:
                geom_col = geom_cols[0]
                gdf_areas = gpd.GeoDataFrame(
                    df_areas.copy(),
                    geometry=df_areas[geom_col].apply(parse_wkt),
                    crs="EPSG:4326"
                )
            else:
                print("No WKT column found in the areas CSV. Boundaries will be skipped.")
                gdf_areas = None
        except Exception as e:
            print("Failed to read areas. Boundaries will be skipped:", e)
else:
    print("Info: optional area boundaries file not found. Continuing without boundaries.")


Warning: trips file not found. Please set TRIPS_PATH manually.
Info: optional POIs file not found. Continuing without POIs.
Info: optional area boundaries file not found. Continuing without boundaries.


In [16]:

# ====== QUICK EDA ======
if df_trips is not None:
    display(df_trips.head(3))
    print("\nAvailable columns:", list(df_trips.columns))
    print("\nRow count:", len(df_trips))
    # Candidate lon/lat columns (pickup & dropoff).
    # "lon" also matches "longitude" and "lat" also matches "latitude".
    candidate_lon = [c for c in df_trips.columns if "lon" in c.lower() or "lng" in c.lower()]
    candidate_lat = [c for c in df_trips.columns if "lat" in c.lower()]
    print("\nDetected longitude candidates:", candidate_lon)
    print("Detected latitude candidates :", candidate_lat)
else:
    print("Skipping EDA because trips data is not available yet.")


Skipping EDA because trips data is not available yet.


In [17]:

# ====== HELPERS: COLUMN NAME DETECTION ======
def first_match(cols, keywords):
    """Return the first column whose lowercased name equals one of the keywords."""
    for kw in keywords:
        for c in cols:
            if kw == c.lower():
                return c
    return None

def guess_lon_lat(df, prefix_candidates):
    # Look for columns such as pickup_longitude / pickup_latitude or dropoff_longitude / dropoff_latitude
    for pre in prefix_candidates:
        lon = first_match(df.columns, [f"{pre}_longitude", f"{pre}_lon", f"{pre}_lng"])
        lat = first_match(df.columns, [f"{pre}_latitude", f"{pre}_lat"])
        if lon and lat:
            return lon, lat
    return None, None

def to_gdf_points(df, lon_col, lat_col, crs="EPSG:4326"):
    """Build a point GeoDataFrame, dropping rows with missing or non-numeric coordinates."""
    good = df.copy()
    # errors="coerce" turns empty strings and other non-numeric values into NaN
    good[lon_col] = pd.to_numeric(good[lon_col], errors="coerce")
    good[lat_col] = pd.to_numeric(good[lat_col], errors="coerce")
    good = good.dropna(subset=[lon_col, lat_col])
    gdf = gpd.GeoDataFrame(good, geometry=gpd.points_from_xy(good[lon_col], good[lat_col]), crs=crs)
    return gdf

gdf_pu = None
gdf_do = None

if df_trips is not None:
    pu_lon, pu_lat = guess_lon_lat(df_trips, ["pickup", "pu", "start"])
    do_lon, do_lat = guess_lon_lat(df_trips, ["dropoff", "do", "end"])

    if pu_lon and pu_lat:
        gdf_pu = to_gdf_points(df_trips, pu_lon, pu_lat)
        print(f"GeoDataFrame PU: {gdf_pu.shape}, columns: {pu_lon}, {pu_lat}")
    else:
        print("No matching PU lon/lat columns found.")

    if do_lon and do_lat:
        gdf_do = to_gdf_points(df_trips, do_lon, do_lat)
        print(f"GeoDataFrame DO: {gdf_do.shape}, columns: {do_lon}, {do_lat}")
    else:
        print("No matching DO lon/lat columns found.")
else:
    print("Skipping GeoDataFrame conversion because trips data is not available yet.")


Skipping GeoDataFrame conversion because trips data is not available yet.
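
The cleaning above only drops missing or non-numeric coordinates. A common extra check, which this notebook does not perform, is to reject values outside the valid longitude/latitude ranges and to flag columns that look swapped. A minimal sketch in plain Python follows; the helper names `is_valid_lonlat` and `looks_swapped` are hypothetical, and the swap heuristic only fires when the true longitude has a magnitude above 90, which happens to hold for Jakarta.

```python
def is_valid_lonlat(lon, lat):
    """True when lon is within [-180, 180] and lat is within [-90, 90]."""
    return -180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0

def looks_swapped(pairs):
    """Heuristic: True when most (lon, lat) pairs are invalid as given
    but become valid once the two values are swapped."""
    pairs = list(pairs)
    if not pairs:
        return False
    fixed_by_swap = sum(
        1 for lon, lat in pairs
        if not is_valid_lonlat(lon, lat) and is_valid_lonlat(lat, lon)
    )
    return fixed_by_swap > len(pairs) / 2

jakarta = (106.8, -6.2)   # (lon, lat) in the expected order
swapped = (-6.2, 106.8)   # a "latitude" of 106.8 is out of range
```

With these helpers, `looks_swapped([swapped] * 3)` signals that the two columns should probably be exchanged before building the `GeoDataFrame`.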


In [18]:

# ====== GUESS MAP CENTER (Jakarta fallback) ======
def guess_center(gdf_list):
    for gdf in gdf_list:
        if gdf is not None and len(gdf) > 0:
            y = gdf.geometry.y.median()
            x = gdf.geometry.x.median()
            if not (pd.isna(x) or pd.isna(y)):
                return [y, x]
    return [-6.2, 106.8]  # Jakarta fallback

map_center = guess_center([gdf_pu, gdf_do])
map_center


[-6.2, 106.8]

In [19]:

# ====== MAP 1: PU & DO HEATMAPS (RED = HIGH) ======
m_heat = leafmap.Map(center=map_center, zoom=11)
m_heat.add_basemap("CartoDB.Positron")

# Heatmap gradient: green = low, red = high
gradient_red_high = {
    0.0: "green",
    0.5: "yellow",
    1.0: "red",
}

def heat_points(gdf):
    """Return [[lat, lon, weight], ...] with a weight of 1.0 per point."""
    return [[pt.y, pt.x, 1.0] for pt in gdf.geometry]

# gdf.geometry.y is a coordinate Series; its .name is not a lat/lon column name,
# so the coordinates are passed as a list of [lat, lon, weight] rows instead.
if gdf_pu is not None and len(gdf_pu) > 0:
    m_heat.add_heatmap(
        data=heat_points(gdf_pu),
        name="PU Heatmap",
        radius=20,
        gradient=gradient_red_high,
        blur=15,
        min_opacity=0.2,
        max_zoom=18,
    )

if gdf_do is not None and len(gdf_do) > 0:
    m_heat.add_heatmap(
        data=heat_points(gdf_do),
        name="DO Heatmap",
        radius=20,
        gradient=gradient_red_high,
        blur=15,
        min_opacity=0.2,
        max_zoom=18,
    )

m_heat.add_layer_control()
m_heat


Map(center=[-6.2, 106.8], controls=(ZoomControl(options=['position', 'zoom_in_text', 'zoom_in_title', 'zoom_ou…

In [20]:

# ====== MAP 2: PU & DO POINT CLUSTERS ======
m_cluster = leafmap.Map(center=map_center, zoom=11)
m_cluster.add_basemap("CartoDB.Positron")

# add_points_from_xy expects x/y column names and a layer_name, and it clusters
# markers itself (check the signature in your installed leafmap version).
# gdf.geometry.x.name is not a column name, so use the detected lon/lat columns.
if gdf_pu is not None and len(gdf_pu) > 0:
    m_cluster.add_points_from_xy(
        pd.DataFrame(gdf_pu.drop(columns="geometry")),
        x=pu_lon,
        y=pu_lat,
        layer_name="PU (Cluster)",
        icon_colors=["red"],  # red for PU
    )

if gdf_do is not None and len(gdf_do) > 0:
    m_cluster.add_points_from_xy(
        pd.DataFrame(gdf_do.drop(columns="geometry")),
        x=do_lon,
        y=do_lat,
        layer_name="DO (Cluster)",
        icon_colors=["green"],  # green for DO
    )

m_cluster.add_layer_control()
m_cluster


Map(center=[-6.2, 106.8], controls=(ZoomControl(options=['position', 'zoom_in_text', 'zoom_in_title', 'zoom_ou…

In [21]:

# ====== (OPTIONAL) MAP 3: CHOROPLETH OF COUNTS PER AREA ======
# Requires polygon boundaries in gdf_areas and at least one of gdf_pu/gdf_do.
AREA_KEY_NAMES = ["name", "nama", "kecamatan", "kelurahan", "kabupaten", "kota", "provinsi", "id", "kode"]

def choropleth_by_area(gdf_points, gdf_polygons, value_name):
    """Count points per polygon; return the polygons with a count column and a label column (or None)."""
    polys = gdf_polygons[gdf_polygons.geometry.notna()].to_crs(4326).reset_index(drop=True)
    pts = gdf_points.to_crs(4326)
    # Join on geometry only, so shared attribute names cannot clash or get suffixed.
    joined = gpd.sjoin(
        pts[[pts.geometry.name]], polys[[polys.geometry.name]],
        how="inner", predicate="within",
    )
    # "index_right" holds the row position of the polygon that contains each point.
    counts = joined.groupby("index_right").size()
    polys[value_name] = counts.reindex(polys.index, fill_value=0).astype(int)
    # Pick one "meaningful" area identity column for labeling, if there is one
    key_col = next((c for c in polys.columns if c.lower() in AREA_KEY_NAMES), None)
    return polys, key_col

m_choro = None
if gdf_areas is not None and gdf_areas.geometry.notna().any():
    m_choro = leafmap.Map(center=map_center, zoom=10)
    m_choro.add_basemap("CartoDB.Positron")

    # add_data is leafmap's classified-choropleth helper and relies on mapclassify;
    # add_gdf does not classify by column. Verify against your installed version.
    if gdf_pu is not None and len(gdf_pu) > 0:
        gdf_pu_area, key_pu = choropleth_by_area(gdf_pu, gdf_areas, "PU_Count")
        m_choro.add_data(
            gdf_pu_area,
            column="PU_Count",
            cmap="RdYlGn_r",  # red = high → green = low
            scheme="Quantiles",
            k=5,
            legend_title="PU count",
            layer_name="Choropleth PU",
        )

    if gdf_do is not None and len(gdf_do) > 0:
        gdf_do_area, key_do = choropleth_by_area(gdf_do, gdf_areas, "DO_Count")
        m_choro.add_data(
            gdf_do_area,
            column="DO_Count",
            cmap="RdYlGn_r",
            scheme="Quantiles",
            k=5,
            legend_title="DO count",
            layer_name="Choropleth DO",
        )

    m_choro.add_layer_control()

m_choro if m_choro is not None else print("Choropleth skipped because boundaries are not available.")


Choropleth skipped because boundaries are not available.
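
The choropleth cell asks for `scheme="Quantiles"` with `k=5`. Roughly, this splits the areas into five classes that each hold a similar number of areas, instead of five equal-width count ranges, so a few very busy areas do not push everything else into the lowest class. The sketch below illustrates the idea with the standard library only. It is not the implementation used by the mapping stack, and `quantile_classes` and the sample `counts` are hypothetical.

```python
import bisect
import statistics

def quantile_classes(values, k=5):
    """Assign each value a class index 0..k-1 using k-quantile cut points."""
    # statistics.quantiles sorts internally and returns k-1 cut points
    cuts = statistics.quantiles(values, n=k, method="inclusive")
    return [bisect.bisect_left(cuts, v) for v in values]

# Made-up per-area counts with a long right tail
counts = [0, 2, 3, 5, 8, 13, 21, 34, 55, 89]
classes = quantile_classes(counts, k=5)
```

Each class index would then map to one color of the reversed `RdYlGn` ramp, with the highest class drawn in red.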


In [22]:

# ====== EXPORT TO HTML (OPTIONAL) ======
EXPORT_DIR = "outputs"
os.makedirs(EXPORT_DIR, exist_ok=True)

heat_html = os.path.join(EXPORT_DIR, "leafmap_heatmap_pu_do.html")
cluster_html = os.path.join(EXPORT_DIR, "leafmap_cluster_pu_do.html")
choro_html = os.path.join(EXPORT_DIR, "leafmap_choropleth.html")

try:
    m_heat.to_html(heat_html)
    print("Saved:", heat_html)
except Exception as e:
    print("Failed to save heatmap:", e)

try:
    m_cluster.to_html(cluster_html)
    print("Saved:", cluster_html)
except Exception as e:
    print("Failed to save cluster map:", e)

if 'm_choro' in globals() and m_choro is not None:
    try:
        m_choro.to_html(choro_html)
        print("Saved:", choro_html)
    except Exception as e:
        print("Failed to save choropleth:", e)


Saved: outputs\leafmap_heatmap_pu_do.html
Saved: outputs\leafmap_cluster_pu_do.html



## Summary & Insight Notes (Example)

- The **PU heatmap** shows pickup hot spots. **Red** zones are where point density is highest.
- The **DO heatmap** highlights dense dropoff areas. Compare the PU and DO layers to spot supply-demand **imbalance**.
- **PU/DO clusters** make it easy to explore individual points and how they are spread.
- The **choropleth** (if boundaries are available) aggregates counts per area (e.g. kecamatan/kelurahan, i.e. district/subdistrict) to help **prioritize interventions** (pickup zones that need signage, candidate *virtual pickup points*, etc.).

> **Tip:** Use the **Layer Control** button in the top-right corner of the map to **toggle** the PU/DO layers or specific map types on and off.
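
To make the PU vs DO comparison from the summary concrete, one simple approach is to bin both point sets into a regular grid and subtract the counts per cell. This is a sketch under assumptions, not something the notebook computes: the 0.01-degree cell size, the helper names `grid_cell` and `net_flow`, and the sample coordinates are illustrative, and inputs are plain `(lon, lat)` tuples.

```python
import math
from collections import Counter

def grid_cell(lon, lat, size=0.01):
    """Map a coordinate to integer grid indices (column, row)."""
    return (math.floor(lon / size), math.floor(lat / size))

def net_flow(pickups, dropoffs, size=0.01):
    """Return {cell: pickups - dropoffs}; positive cells have more pickups than dropoffs."""
    pu = Counter(grid_cell(lon, lat, size) for lon, lat in pickups)
    do = Counter(grid_cell(lon, lat, size) for lon, lat in dropoffs)
    # Counter returns 0 for cells that appear in only one of the two sets
    return {cell: pu[cell] - do[cell] for cell in set(pu) | set(do)}

# Made-up sample points near Jakarta, as (lon, lat)
pickups = [(106.801, -6.201), (106.802, -6.202), (106.853, -6.153)]
dropoffs = [(106.803, -6.203), (106.901, -6.251)]
flow = net_flow(pickups, dropoffs)
```

Cells with a large positive value are net origins and cells with a large negative value are net destinations. Either could be a candidate for the interventions listed above.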



---

**Notebook generated:** 2025-08-17 05:09:29 UTC  
Author: *Converted to Leafmap-only by ChatGPT*

