# Real Estate Transaction Map Visualization

This notebook demonstrates how to visualize real estate transaction data from Japan's MLIT API using the `jiken` library.

We'll use:
- **jiken** — fetch real estate transaction data
- **pandas** — process data into a DataFrame
- **geopy** — geocode addresses to latitude/longitude
- **plotly** — render an interactive map

![Example map output](https://github.com/user-attachments/assets/4dbddf4f-b665-4396-877d-2365d598906d)

## Prerequisites

You need an API key from the [MLIT Real Estate Information Library](https://www.reinfolib.mlit.go.jp/).

Install the required packages:

In [None]:
# Install dependencies (uncomment to run; px.scatter_map needs plotly 5.24 or later)
# !uv pip install jiken pandas "plotly>=5.24" geopy

## Configuration

Edit the variables below to match your query.

In [None]:
from typing import Optional

# API key obtained from MLIT Real Estate Information Library
API_KEY: str = "YOUR_API_KEY_HERE"

# Query parameters
YEAR: int = 2023
AREA: Optional[str] = "13"      # Prefecture code (e.g. "13" = Tokyo)
CITY: Optional[str] = "13103"   # Municipality code (e.g. "13103" = Minato-ku)
QUARTER: Optional[int] = None   # 1–4, or None for all quarters

# Number of records to geocode (0 = all; large values take time)
SAMPLE_SIZE: int = 50

# Output file
OUTPUT_FILE: str = "real_estate_map.html"
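
Hard-coding the key is fine for a quick run, but it is easy to commit by accident. A minimal sketch, assuming you export the key in an environment variable; the variable name `MLIT_API_KEY` and the helper `load_api_key` are made up for this example and are not part of `jiken`:

```python
import os


def load_api_key(env_var: str = "MLIT_API_KEY", default: str = "YOUR_API_KEY_HERE") -> str:
    """Return the API key from the environment, or the placeholder if unset.

    The variable name MLIT_API_KEY is an assumption for this sketch.
    """
    value = os.environ.get(env_var, "").strip()
    return value or default


API_KEY = load_api_key()
```

If the variable is unset or blank, `API_KEY` falls back to the same placeholder used in the configuration cell.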

## Step 1 — Fetch Transaction Data

`JikenClient.search_transactions` returns a list of `Transaction` objects.
We convert them to a pandas DataFrame for easier processing.

In [None]:
import pandas as pd
from jiken import JikenClient, SearchCondition


def fetch_transactions(
    api_key: str,
    year: int,
    area: Optional[str] = None,
    city: Optional[str] = None,
    quarter: Optional[int] = None,
    language: str = "en",
) -> pd.DataFrame:
    """Fetch real estate transactions and return as a DataFrame.

    Args:
        api_key: MLIT API key.
        year: Target year.
        area: Prefecture code (optional).
        city: Municipality code (optional).
        quarter: Quarter 1–4 (optional).
        language: "en" or "ja".

    Returns:
        DataFrame of transaction records.
    """
    client = JikenClient(api_key=api_key)
    condition = SearchCondition(
        year=year, area=area, city=city, quarter=quarter, language=language
    )
    transactions = client.search_transactions(condition)

    if not transactions:
        print("No data found.")
        return pd.DataFrame()

    records = [
        {
            "TradePrice": tx.transaction_price,
            "Area": tx.area,
            "UnitPrice": tx.unit_price,
            "Prefecture": tx.prefecture,
            "Municipality": tx.city,
            "DistrictName": tx.district,
            "BuildingYear": tx.building_year,
            "Type": tx.property_type,
            "Structure": tx.structure,
            "FloorAreaRatio": tx.floor_area_ratio,
            "CoverageRatio": tx.building_coverage,
            "Frontage": tx.frontage_road_width,
            "Period": tx.transaction_period,
        }
        for tx in transactions
    ]

    df = pd.DataFrame(records)
    print(f"Fetched {len(df)} records.")
    return df


df_en = fetch_transactions(
    api_key=API_KEY, year=YEAR, area=AREA, city=CITY, quarter=QUARTER, language="en"
)
df_en.head()
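
Before geocoding, it can help to sanity-check what came back. The sketch below uses a small hand-made DataFrame (the rows are invented for illustration, not real MLIT data) with the same `Type` and `TradePrice` column names that `fetch_transactions` produces, and summarizes the number of priced records and the median price per property type. The helper name `summarize_by_type` is hypothetical.

```python
import pandas as pd


def summarize_by_type(df: pd.DataFrame) -> pd.DataFrame:
    """Count priced records and compute the median trade price per type."""
    out = df.copy()
    # Prices may arrive as strings; coerce anything non-numeric to NaN.
    out["TradePrice"] = pd.to_numeric(out["TradePrice"], errors="coerce")
    return (
        out.groupby("Type")["TradePrice"]
        .agg(count="count", median_price="median")  # count ignores NaN prices
        .sort_values("median_price", ascending=False)
    )


# Invented sample rows, for illustration only.
toy = pd.DataFrame(
    {
        "Type": ["Condo", "Condo", "Land", "Land", "Land"],
        "TradePrice": ["50000000", "70000000", 120000000, None, "n/a"],
    }
)
summary = summarize_by_type(toy)
print(summary)
```

On the real data, `summarize_by_type(df_en)` should work the same way, assuming the `Type` and `TradePrice` columns are populated.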

## Step 2 — Geocode Addresses

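We use [Nominatim](https://nominatim.org/) (OpenStreetMap) to convert addresses to latitude/longitude. The public Nominatim server allows at most one request per second, so the code below throttles requests and geocoding N records takes roughly N seconds.

Nominatim works best with Japanese address strings, so we fetch the same data again in Japanese (`language="ja"`), attach the Japanese address columns to the English DataFrame row by row, and geocode those. This relies on both requests returning the same records in the same order.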

In [None]:
from geopy.extra.rate_limiter import RateLimiter
from geopy.geocoders import Nominatim


def geocode_dataframe(
    df: pd.DataFrame,
    api_key: str,
    year: int,
    area: Optional[str],
    city: Optional[str],
    quarter: Optional[int],
    sample_size: int = 0,
) -> pd.DataFrame:
    """Add latitude/longitude columns to the DataFrame via geocoding.

    Fetches Japanese address strings from the API for accurate geocoding,
    then merges coordinates into the original (English) DataFrame.

    Args:
        df: English transaction DataFrame.
        api_key: MLIT API key.
        year: Target year.
        area: Prefecture code.
        city: Municipality code.
        quarter: Quarter 1–4.
        sample_size: Records to geocode (0 = all).

    Returns:
        DataFrame with "latitude" and "longitude" columns added.
    """
    if sample_size > 0:
        df = df.sample(n=min(sample_size, len(df)), random_state=42).copy()
        print(f"Sampled {len(df)} records for geocoding.")

    # Fetch Japanese addresses for Nominatim compatibility
    df_ja = fetch_transactions(
        api_key=api_key, year=year, area=area, city=city, quarter=quarter, language="ja"
    )
    df_ja = df_ja[["Prefecture", "Municipality", "DistrictName"]].rename(
        columns={
            "Prefecture": "Prefecture_ja",
            "Municipality": "Municipality_ja",
            "DistrictName": "DistrictName_ja",
        }
    )

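    # Align rows and merge. A sampled df keeps its original index labels, so
    # pick the same rows from the Japanese data instead of its first N rows.
    # Assumes both fetches returned the same records in the same order.
    df_ja = df_ja.loc[df.index].reset_index(drop=True)
    df = df.reset_index(drop=True)
    df = pd.concat([df, df_ja], axis=1)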

    geolocator = Nominatim(user_agent="jiken_map_tutorial")
    geocode = RateLimiter(geolocator.geocode, min_delay_seconds=1)

    print(f"Geocoding {len(df)} records (~{len(df)} seconds) …")

    latitudes, longitudes = [], []
    for i, row in df.iterrows():
        if i % 10 == 0:
            print(f"  {i}/{len(df)}")
        address = "".join(
            str(row.get(col, "") or "")
            for col in ["Prefecture_ja", "Municipality_ja", "DistrictName_ja"]
        )
        try:
            location = geocode(address)
        except Exception as exc:  # e.g. network errors or timeouts
            print(f"  Geocoding failed for {address!r}: {exc}")
            location = None
        latitudes.append(location.latitude if location else None)
        longitudes.append(location.longitude if location else None)

    df["latitude"] = latitudes
    df["longitude"] = longitudes

    success = df["latitude"].notna().sum()
    print(f"Geocoding complete: {success}/{len(df)} succeeded.")
    return df


df_geo = geocode_dataframe(
    df=df_en,
    api_key=API_KEY,
    year=YEAR,
    area=AREA,
    city=CITY,
    quarter=QUARTER,
    sample_size=SAMPLE_SIZE,
)
df_geo[["Prefecture", "Municipality", "DistrictName", "latitude", "longitude"]].head()
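
Because each address is built only from prefecture, municipality, and district, many transactions in one ward share the same string, and each repeat costs another rate-limited request. One possible refinement, not used in the cell above, is to geocode each distinct address once and reuse the result. The sketch below runs offline with a stand-in lookup; `geocode_unique`, `fake_lookup`, and the placeholder coordinates are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Coordinates = Optional[Tuple[float, float]]


def geocode_unique(
    addresses: List[str],
    lookup: Callable[[str], Coordinates],
) -> List[Coordinates]:
    """Geocode each distinct address once and reuse the result for repeats."""
    cache: Dict[str, Coordinates] = {}
    results: List[Coordinates] = []
    for address in addresses:
        if address not in cache:
            cache[address] = lookup(address)  # one call per new address
        results.append(cache[address])
    return results


# Stand-in lookup so the sketch runs offline; coordinates are placeholders.
calls: List[str] = []


def fake_lookup(address: str) -> Coordinates:
    calls.append(address)
    return (35.0, 139.0) if address else None


coords = geocode_unique(["A", "B", "A", "", "B"], fake_lookup)
```

With the real geocoder, `lookup` could wrap the rate-limited `geocode` callable and return `(location.latitude, location.longitude)`, or `None` when nothing is found.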

## Step 3 — Visualize on a Map

We use Plotly Express's `scatter_map` (added in Plotly 5.24) to render an interactive map.
Each marker represents one transaction; marker size encodes the trade price and color encodes the property type.

In [None]:
import plotly.express as px


def create_map(df: pd.DataFrame, output_file: str = "real_estate_map.html") -> None:
    """Render an interactive real estate map and save it as HTML.

    Args:
        df: DataFrame with latitude, longitude, and transaction columns.
        output_file: Path for the HTML output file.
    """
    df = df.dropna(subset=["latitude", "longitude"]).copy()
    if df.empty:
        print("No geocoded records to plot.")
        return

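    df["TradePrice"] = pd.to_numeric(df["TradePrice"], errors="coerce")
    df["Area"] = pd.to_numeric(df["Area"], errors="coerce")
    # Marker size cannot be NaN, so drop rows without a usable price
    df = df.dropna(subset=["TradePrice"])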

    fig = px.scatter_map(
        df,
        lat="latitude",
        lon="longitude",
        color="Type",
        size="TradePrice",
        hover_name="DistrictName",
        hover_data={
            "Prefecture": True,
            "Municipality": True,
            "TradePrice": ":,.0f",  # d3-format spec only; units go in labels
            "Area": ":.1f",
            "BuildingYear": True,
            "Period": True,
            "latitude": False,
            "longitude": False,
        },
        labels={"TradePrice": "TradePrice (JPY)", "Area": "Area (m²)"},
        title="Real Estate Transactions (jiken)",
        zoom=12,
        height=700,
        size_max=30,
    )
    fig.update_layout(margin={"r": 0, "t": 50, "l": 0, "b": 0})
    fig.write_html(output_file)
    print(f"Map saved to: {output_file}")
    fig.show()


create_map(df_geo, OUTPUT_FILE)
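
Since marker size follows `TradePrice`, one very expensive sale can shrink every other marker. One option, not used in the cell above, is to cap the sizing column at a high percentile while keeping the true price for the hover text. A sketch with invented numbers; `add_capped_size` and the `SizeValue` column are hypothetical names:

```python
import pandas as pd


def add_capped_size(
    df: pd.DataFrame, column: str = "TradePrice", q: float = 0.95
) -> pd.DataFrame:
    """Add a SizeValue column equal to `column` clipped at its q-th quantile."""
    out = df.copy()
    cap = out[column].quantile(q)
    out["SizeValue"] = out[column].clip(upper=cap)
    return out


# Invented prices: nine ordinary sales and one extreme outlier.
toy = pd.DataFrame(
    {"TradePrice": [30e6, 35e6, 40e6, 45e6, 50e6, 55e6, 60e6, 65e6, 70e6, 5e9]}
)
sized = add_capped_size(toy)
```

Passing `size="SizeValue"` to `px.scatter_map` would then size markers by the capped value, while `TradePrice` in `hover_data` still shows the real price.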

## Reference: Common Prefecture and City Codes

### Prefecture codes (area)

| Code | Prefecture |
|------|------------|
| `01` | Hokkaido   |
| `13` | Tokyo      |
| `14` | Kanagawa   |
| `23` | Aichi      |
| `27` | Osaka      |
| `40` | Fukuoka    |

### City codes (city) — Tokyo wards

| Code    | Ward         |
|---------|--------------|
| `13101` | Chiyoda-ku   |
| `13102` | Chuo-ku      |
| `13103` | Minato-ku    |
| `13104` | Shinjuku-ku  |
| `13113` | Shibuya-ku   |

For the full list, see the [MLIT API documentation](https://www.reinfolib.mlit.go.jp/help/apiManual/).
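
In the tables above, each city code begins with its prefecture code (`13103` starts with `13`). A small check along those lines can catch a mismatched `AREA`/`CITY` pair before any API call is made. This is only a sketch, assuming codes are always 2 and 5 digits as in the tables; `check_codes` is a hypothetical helper, not part of `jiken`:

```python
from typing import Optional


def check_codes(area: Optional[str], city: Optional[str]) -> None:
    """Raise ValueError if the codes are malformed or inconsistent."""
    if area is not None and not (len(area) == 2 and area.isdigit()):
        raise ValueError(f"Prefecture code must be 2 digits, got {area!r}")
    if city is not None and not (len(city) == 5 and city.isdigit()):
        raise ValueError(f"City code must be 5 digits, got {city!r}")
    if area is not None and city is not None and not city.startswith(area):
        raise ValueError(f"City {city!r} is not in prefecture {area!r}")


check_codes("13", "13103")  # consistent pair from the tables above
```

Calling `check_codes(AREA, CITY)` right after the configuration cell would fail fast on, say, `AREA = "14"` combined with `CITY = "13103"`.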