## Geocode Wards of Quảng Ngãi (deprecated)

Note: This process is no longer part of the active pipeline and is retained only for traceability.

Purpose: Assign geographic coordinates (latitude, longitude) to each ward/commune in Quảng Ngãi using the Nominatim (OpenStreetMap) API:

- Read the ward list from quang_ngai_wards.csv.
- Use the geocode_location function to query the Nominatim API and retrieve coordinates for each ward.
- Handle network errors and cases where no results are returned.
- Save the results, including ward names and coordinates, to quang_ngai_wards_geocoded.csv.

### 1. Setup


In [1]:
import pandas as pd
import requests
import time
from pathlib import Path
from tqdm import tqdm

### 2. Paths


In [2]:
# The notebook runs from a subdirectory, so the project root is one level above the working directory.
project_root = Path.cwd().parent
input_csv_path = project_root / "resources" / "quang_ngai_wards.csv"
output_csv_path = project_root / "resources" / "quang_ngai_wards_geocoded.csv"

print(f"Input file path: {input_csv_path.resolve()}")
print(f"Output file path: {output_csv_path.resolve()}")

Input file path: /home/tan/geo-weather-lake/resources/quang_ngai_wards.csv
Output file path: /home/tan/geo-weather-lake/resources/quang_ngai_wards_geocoded.csv


### 3. Nominatim API URL


In [3]:
NOMINATIM_API_URL = "https://nominatim.openstreetmap.org/search"
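
To see what a search request against this endpoint looks like, the sketch below builds the encoded URL without sending anything. It uses only the standard library; `build_search_url` is a hypothetical helper, not part of the notebook, and the parameters mirror the ones passed to `requests.get` in the geocoding function.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

NOMINATIM_API_URL = "https://nominatim.openstreetmap.org/search"


def build_search_url(query_string: str) -> str:
    """Return the encoded search URL for a free-form query (hypothetical helper)."""
    params = {"q": query_string, "format": "json", "addressdetails": 1, "limit": 1}
    # urlencode percent-encodes the UTF-8 bytes of non-ASCII characters.
    return f"{NOMINATIM_API_URL}?{urlencode(params)}"


url = build_search_url("Lý Sơn, Quảng Ngãi, Việt Nam")
print(url)

# Decoding the query string recovers the original Vietnamese text.
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"][0])
```

Printing the URL this way can help when a ward returns no match, since the same URL can be opened in a browser to inspect the raw JSON.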

### 4. Load wards CSV


In [4]:
df_wards = pd.read_csv(input_csv_path)
df_wards

Unnamed: 0,tenhc
0,Đăk Long
1,Đăk Môn
2,Dục Nông
3,Đăk Sao
4,Đăk Tờ Kan
...,...
91,Trương Quang Trọng
92,Đức Phổ
93,Măng Ri
94,Tu Mơ Rông
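
The `Unnamed: 0` column typically appears when a CSV was written with its index and then read back without declaring it. A minimal sketch of the difference, assuming the file has the shape shown above; `sample_csv` is a hypothetical two-row excerpt, not the real file:

```python
import io

import pandas as pd

# Hypothetical excerpt in the same shape as quang_ngai_wards.csv:
# the first header cell is empty because it held the saved index.
sample_csv = ",tenhc\n0,Đăk Long\n1,Đăk Môn\n"

# Default read: the unnamed first column becomes a data column.
df_default = pd.read_csv(io.StringIO(sample_csv))

# Declaring the first column as the index keeps only the ward names as data.
df_indexed = pd.read_csv(io.StringIO(sample_csv), index_col=0)

print(df_default.columns.tolist())  # ['Unnamed: 0', 'tenhc']
print(df_indexed.columns.tolist())  # ['tenhc']
```

The notebook keeps the default read, so `Unnamed: 0` is carried through to the geocoded output.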


### 5. Geocoding function


In [5]:
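def geocode_location(query_string: str) -> tuple:
    """Return (lat, lon) for the top Nominatim match, or (None, None) on failure."""
    params = {
        'q': query_string,
        'format': 'json',
        'addressdetails': 1,
        'limit': 1  # only the top-ranked match is used
    }

    # Nominatim's usage policy requires a User-Agent that identifies the application.
    headers = {
        'User-Agent': 'weather-de'
    }

    try:
        # A timeout keeps a stalled connection from blocking the batch loop indefinitely.
        response = requests.get(NOMINATIM_API_URL, params=params, headers=headers, timeout=10)

        if response.status_code == 200:
            results = response.json()
            if results:
                top_result = results[0]
                lat = float(top_result.get('lat'))
                lon = float(top_result.get('lon'))
                return lat, lon
            else:
                print(f"No results found for query: {query_string}")
                return None, None
        else:
            print(f"API request failed for '{query_string}' with status code {response.status_code}")
            return None, None
    except requests.exceptions.RequestException as e:
        print(f"A network error occurred: {e}")
        return None, None
    except (TypeError, ValueError) as e:
        # Raised when the body is not valid JSON or 'lat'/'lon' is missing or non-numeric.
        print(f"Unexpected response for '{query_string}': {e}")
        return None, None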

In [6]:
test_lat, test_lon = geocode_location("Lý Sơn, Quảng Ngãi, Việt Nam")
print(f"Test geocoding for 'Lý Sơn': Latitude={test_lat}, Longitude={test_lon}")

Test geocoding for 'Lý Sơn': Latitude=15.3809098, Longitude=109.1174595
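
The parsing step can also be isolated as a pure function so that it can be checked without network access. This is a sketch, not the notebook's code: `parse_first_result` is a hypothetical name, and the sample payload reuses the Lý Sơn coordinates printed above. Nominatim returns `lat` and `lon` as strings in its JSON output, hence the `float` conversion.

```python
from typing import Optional, Tuple


def parse_first_result(results: list) -> Tuple[Optional[float], Optional[float]]:
    """Extract (lat, lon) from a decoded Nominatim JSON list, or (None, None)."""
    if not results:
        return None, None
    top = results[0]
    try:
        return float(top["lat"]), float(top["lon"])
    except (KeyError, TypeError, ValueError):
        # Missing keys or non-numeric values are treated as "no result".
        return None, None


print(parse_first_result([{"lat": "15.3809098", "lon": "109.1174595"}]))
# (15.3809098, 109.1174595)
print(parse_first_result([]))
# (None, None)
```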


### 6. Batch geocoding


In [7]:
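geocoded_res = []
for _, row in tqdm(df_wards.iterrows(), total=len(df_wards)):
    ward_name = row['tenhc']
    # Qualify the ward name with province and country to reduce ambiguous matches.
    query = f"{ward_name}, Quảng Ngãi, Việt Nam"
    lat, lon = geocode_location(query)
    geocoded_res.append({'latitude': lat, 'longitude': lon})
    # Nominatim's public usage policy allows at most one request per second.
    time.sleep(1.1)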

100%|██████████| 96/96 [03:52<00:00,  2.43s/it]
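
The fixed `time.sleep(1.1)` is added on top of each request's own latency, which is consistent with the roughly 2.4 s per item shown in the progress bar. One possible alternative, sketched below under assumptions, waits only for the time remaining in the interval and skips repeated queries. `geocode_batch` and `fake_geocode` are hypothetical names; the geocoder is passed in as an argument so the loop can be exercised without calling the API.

```python
import time
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Coords = Tuple[Optional[float], Optional[float]]


def geocode_batch(
    queries: Iterable[str],
    geocode: Callable[[str], Coords],
    min_interval: float = 1.1,
) -> List[Coords]:
    """Geocode queries in order, spacing real calls at least min_interval seconds apart."""
    cache: Dict[str, Coords] = {}
    results: List[Coords] = []
    last_call: Optional[float] = None
    for query in queries:
        if query not in cache:
            if last_call is not None:
                # Sleep only for whatever part of the interval has not yet elapsed.
                remaining = min_interval - (time.monotonic() - last_call)
                if remaining > 0:
                    time.sleep(remaining)
            cache[query] = geocode(query)  # failures (None, None) are cached too
            last_call = time.monotonic()
        results.append(cache[query])
    return results


# Usage with a stand-in geocoder (no network access).
calls = []


def fake_geocode(query: str) -> Coords:
    calls.append(query)
    return (1.0, 2.0)


out = geocode_batch(["a", "b", "a"], fake_geocode, min_interval=0.0)
print(len(out), calls)  # 3 ['a', 'b']
```

Measuring the interval from the end of the previous request keeps the pause between consecutive requests at or above `min_interval`.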


In [8]:
df_coords = pd.DataFrame(geocoded_res)
geocoded_res


[{'latitude': 14.6300997, 'longitude': 107.9257688},
 {'latitude': 14.8891081, 'longitude': 107.7088038},
 {'latitude': 14.852787, 'longitude': 107.6759855},
 {'latitude': 14.9271916, 'longitude': 107.8230721},
 {'latitude': 14.7951793, 'longitude': 107.8659328},
 {'latitude': 14.763761, 'longitude': 107.7626116},
 {'latitude': 14.6995493, 'longitude': 107.8368625},
 {'latitude': 14.6703991, 'longitude': 107.9523156},
 {'latitude': 14.5635754, 'longitude': 108.0019313},
 {'latitude': 14.7050728, 'longitude': 107.5629434},
 {'latitude': 14.6177311, 'longitude': 107.6375198},
 {'latitude': 14.6612477, 'longitude': 107.840036},
 {'latitude': 14.4095371, 'longitude': 107.7951502},
 {'latitude': 14.5200244, 'longitude': 107.7392226},
 {'latitude': 14.1443394, 'longitude': 107.424164},
 {'latitude': 14.3775557, 'longitude': 107.5458843},
 {'latitude': 14.0933495, 'longitude': 107.5451485},
 {'latitude': 14.3799656, 'longitude': 107.8565103},
 {'latitude': 14.3159359, 'longitude': 107.8340144},
 ...]

### 7. Combine results and check failed geocodes


In [9]:
df_geocoded = pd.concat([df_wards.reset_index(drop=True), df_coords], axis=1)
df_geocoded

Unnamed: 0,tenhc,latitude,longitude
0,Đăk Long,14.630100,107.925769
1,Đăk Môn,14.889108,107.708804
2,Dục Nông,14.852787,107.675985
3,Đăk Sao,14.927192,107.823072
4,Đăk Tờ Kan,14.795179,107.865933
...,...,...,...
91,Trương Quang Trọng,15.158716,108.793287
92,Đức Phổ,14.808719,108.957163
93,Măng Ri,14.956587,107.927374
94,Tu Mơ Rông,14.888138,107.955301


In [10]:
failed_count = df_geocoded['latitude'].isnull().sum()
failed_count

0
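
A count of zero only shows that every query returned some result; it does not confirm that the top hit is the intended ward. When the count is non-zero, listing the affected wards is more useful than the number alone. A minimal sketch on a hypothetical frame: the first row is taken from the output above, while "Example Ward" is a made-up failure used purely for illustration.

```python
import pandas as pd

# Hypothetical data: one real row from the notebook output, one invented failure.
df_example = pd.DataFrame(
    {
        "tenhc": ["Đăk Long", "Example Ward"],
        "latitude": [14.630100, None],
        "longitude": [107.925769, None],
    }
)

# A row counts as failed if either coordinate is missing.
failed_mask = df_example[["latitude", "longitude"]].isna().any(axis=1)
failed_wards = df_example.loc[failed_mask, "tenhc"].tolist()

print(failed_wards)  # ['Example Ward']
```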

### 8. Save geocoded CSV


In [11]:
df_geocoded.to_csv(output_csv_path, index=False)