---
title: Evaluating CHIRPS with Local Rainfall Data
description: This notebook evaluates the performance of CHIRPS rainfall data against local weather station observations in the Citarum Basin, Indonesia.
author:
  - name:
      given: Taruma Sakti
      family: Megariansyah
      # literal: Taruma Sakti Megariansyah
    orcid: 0000-0002-1551-7673
    email: hi@taruma.info
    url: https://dev.taruma.info
abstract: > 
  {{< lipsum 1 >}}
keywords:
  - CHIRPS
  - Citarum Watershed
  - Rainfall
  - Precipitation
  - Hydrology
  - Data Comparison
  - Data Analysis
  - Indonesia
license: "CC BY-NC"
copyright: 
  holder: Taruma Sakti Megariansyah
  year: 2024
date: 2024-12-14
date-modified: last-modified
date-format: full
format:
    html:
        code-fold: true
        number-sections: true
        # toc-title: Daftar Isi
        other-links:
        - text: My Github
          icon: github
          href: https://github.com/taruma
        - text: My Other Projects
          icon: journals
          href: https://dev.taruma.info/projects
        - text: Sponsor Me
          icon: heart
          href: https://github.com/sponsors/taruma
        - text: Buy Me a Drink
          icon: cup-straw
          href: https://trakteer.id/taruma/tip
        code-links:
        - text: Repository
          icon: github
          href: https://github.com/taruma/rf-comp-id
        - text: Source Code
          icon: code
          href: https://github.com/taruma/rf-comp-id/blob/main/notebook_en.ipynb
        theme: journal
        toc: true
        toc-location: left
        toc-expand: 2
        toc-depth: 4
        embed-resources: true
        css: assets/quarto_styles.css
include-in-header: # from: https://github.com/quarto-dev/quarto-cli/discussions/4618
  - text: |
      <link rel = "shortcut icon" href = "favicon-ti.png" />
execute:
  cache: true
  enabled: true
citation: 
  url: https://dev.taruma.info/rf-comp-id/notebook_en.html
lightbox: auto
lang: en
title-block-banner: true
# title-block-banner: banner.png
# title-block-banner-color: black
---


Imagine two weather reporters, one in a satellite 🛰️ high above the Earth and one on the ground at a local weather station 📡. They're both reporting on the same thing: **rainfall** 🌧️. The satellite reporter represents global, gridded rainfall datasets like _CHIRPS_ (Climate Hazards Group InfraRed Precipitation with Station data), which provide a broad, top-down view of rainfall patterns across vast regions. The ground reporter represents the network of local weather stations, collecting precise rainfall measurements at specific points on the Earth's surface. Are these two reporters telling the same story about the rain? 🤔 How consistent are their reports? That's what we're going to explore in this notebook!

Essentially, we'll be playing the role of **fact-checkers**, scrutinizing the rainfall data from both our "reporters." We'll use a variety of tools and techniques to _analyze_ the data, create _visualizations_, and assess the _reliability_ of each source. This will involve looking for trends, calculating statistics, and even comparing how they describe specific events, like heavy downpours. Understanding the **strengths** and **weaknesses** of both satellite-derived and ground-based rainfall data is crucial. It can help us improve hydrological models, inform water management strategies, and enhance our ability to predict and respond to extreme weather events, no matter where we are in the world. By the end of this notebook, we'll have a clearer picture of how to interpret and utilize these different sources of rainfall information for a more comprehensive understanding of our planet's precipitation patterns. Let's get started!

::: {.callout-important}
## About this notebook

This notebook is an educational demonstration of how to analyze and compare rainfall data from different sources. It focuses on the process and is not a definitive research paper. It's open-source, so you're welcome to use and adapt it. If you find any errors or have suggestions, please help improve this resource by creating [an issue on GitHub](https://github.com/taruma/rf-comp-id). Your input is greatly valued!
:::

In [1]:
#| echo: false

import geopandas as gpd
import plotly.express as px
import plotly.graph_objects as go
import json
import myfunc
import numpy as np
import pytemplate # noqa

## Introduction

This chapter provides the foundational context for this notebook, outlining the critical role of accurate rainfall data in effective water resource management. It introduces the two primary data sources, CHIRPS and BBWS Citarum, and describes the Citarum River Basin as the study area. 

### Project Background and Objectives

Rainfall is a crucial element in managing water resources, especially in a region like the Citarum River Basin. Understanding how much rain falls, where it falls, and when it falls is essential for preventing floods, managing droughts, and ensuring a reliable water supply for communities and agriculture. This notebook compares two sources of rainfall data: one from a global satellite-based system called CHIRPS and another from local rain gauges operated by BBWS Citarum, which we treat as the ground truth. By examining how well these two datasets agree, we can gain valuable insights into the accuracy of the satellite data and its potential for improving water management practices in the region.

The main goal of this notebook is to see how well the CHIRPS rainfall data matches up with the measurements taken from rain gauges on the ground (BBWS Citarum). We want to find out if the satellite data is consistent with the ground truth, where they might differ, and what those differences might mean for understanding rainfall patterns in the Citarum River Basin. Ultimately, this comparison will help us determine if CHIRPS data can be a reliable tool for supporting water resource management decisions, especially in areas where ground-based measurements are limited.
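
To make "how well the two datasets agree" concrete, here is a minimal sketch of three common agreement measures: mean bias, root-mean-square error (RMSE), and Pearson correlation. This section does not say which metrics the notebook uses, so the choice of measures, the function name `agreement_metrics`, and the toy numbers are all assumptions for illustration only.

```python
import math


def agreement_metrics(satellite, gauge):
    """Return mean bias, RMSE, and Pearson correlation for paired values."""
    if len(satellite) != len(gauge) or len(satellite) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    n = len(satellite)
    diffs = [s - g for s, g in zip(satellite, gauge)]
    bias = sum(diffs) / n                            # > 0 means satellite overestimates
    rmse = math.sqrt(sum(d * d for d in diffs) / n)  # typical size of the error
    mean_s = sum(satellite) / n
    mean_g = sum(gauge) / n
    cov = sum((s - mean_s) * (g - mean_g) for s, g in zip(satellite, gauge))
    var_s = sum((s - mean_s) ** 2 for s in satellite)
    var_g = sum((g - mean_g) ** 2 for g in gauge)
    # Correlation is undefined when either series is constant.
    denom = math.sqrt(var_s * var_g)
    corr = cov / denom if denom > 0 else float("nan")
    return {"bias": bias, "rmse": rmse, "corr": corr}


# Made-up daily totals in mm, for illustration only (not real observations).
toy_gauge = [0.0, 5.0, 12.0, 0.0, 30.0]
toy_satellite = [1.0, 4.0, 10.0, 0.0, 25.0]
metrics = agreement_metrics(toy_satellite, toy_gauge)
print(metrics)
```

A negative bias here would mean the satellite series reports less rain than the gauge on average, while the correlation only tells us whether the two series rise and fall together.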

### Data Sources

This section will briefly introduce the two datasets we're using in this project: CHIRPS and BBWS Citarum. 

- **CHIRPS (Climate Hazards Group InfraRed Precipitation with Station data)**: A 35+ year quasi-global rainfall dataset. Spanning 50°S-50°N (and all longitudes) and ranging from 1981 to near-present, CHIRPS combines an in-house climatology (CHPclim), 0.05° resolution satellite imagery, and in-situ station data to create gridded rainfall time series for trend analysis and seasonal drought monitoring ^[https://www.chc.ucsb.edu/data/chirps]. It's especially helpful in areas where there aren't many weather stations on the ground.

- **BBWS Citarum (Balai Besar Wilayah Sungai Citarum)**: This organization is responsible for managing water resources within the Citarum River Basin. It collects rainfall data using a network of rain gauges located throughout the basin. These rain gauges provide direct measurements of rainfall at specific points, which we consider our "ground truth" data. However, the gauges cover only specific areas within the Citarum River Basin. In this notebook, we use rainfall data from the automatic rain gauges operated by BBWS Citarum.

We will go into more detail about how we access and process the data from each source in the next chapter ([Chapter 2: Data Acquisition and Preprocessing](#data-acquisition-and-preprocessing)).
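
To get a feel for what a 0.05° grid means on the ground, the sketch below converts the CHIRPS grid spacing into approximate kilometers at the equator and at 50°, the edge of the CHIRPS coverage. It assumes a spherical Earth with a mean radius of 6371 km, and `cell_size_km` is a helper name made up for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)
CHIRPS_RES_DEG = 0.05     # grid spacing quoted in the CHIRPS description


def cell_size_km(lat_deg, res_deg=CHIRPS_RES_DEG):
    """Approximate (north-south, east-west) size of one grid cell in km."""
    km_per_deg = 2 * math.pi * EARTH_RADIUS_KM / 360
    north_south = res_deg * km_per_deg
    # Meridians converge toward the poles, so cells get narrower east-west.
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


for lat in (0, 50):
    ns, ew = cell_size_km(lat)
    print(f"lat {lat:>2}°: {ns:.2f} km north-south, {ew:.2f} km east-west")
```

Under these assumptions a cell is roughly 5.6 km on a side near the equator, so a single CHIRPS value describes an area of tens of square kilometers, whereas a rain gauge measures a single point.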

### Study Area

In this notebook, we'll be exploring rainfall data from a specific area in Indonesia called the Upper Citarum River Watershed (or "DAS Citarum Hulu" in Indonesian). Think of it as our area of interest for this project! This watershed is important for managing water in West Java. It's a fairly large area, covering about 1,738 square kilometers, which is a bit larger than Greater London.

In [2]:
#| echo: false
#| label: fig-upper-citarum-watershed
#| fig-cap: Upper Citarum Watershed

# Read the shapefile
gdf = gpd.read_file("data/gis/watershed_citarum_hulu.shp")

# Convert to GeoJSON format
geojson = json.loads(gdf.to_json())

longs_utm, lats_utm = myfunc.extract_coordinates(geojson)
lats, longs = myfunc.convert_utm_to_latlong(lats_utm, longs_utm)


fig = go.Figure()

watershed = go.Scattermap(
    lon=longs,
    lat=lats,
    mode="lines",
    line=dict(width=3),
    hoverinfo="skip",
)

fig.add_trace(watershed)

fig.update_layout(
    margin={"l": 0, "t": 0, "b": 0, "r": 0},
    map={
        "style": "open-street-map",
        "center": {"lon": np.mean(longs), "lat": np.mean(lats)},
        "zoom": 9,
    },
    height=450,
    width=450,
    showlegend=False,
)

fig.show()

Geographically, the Upper Citarum River Watershed sits between 6°45' and 7°15' South latitude and 107°21' and 107°57' East longitude. Parts of several cities and regencies fall within this watershed, including Bandung City, Cimahi City, Bandung Regency, and Sumedang Regency. You can see the location of the watershed in @fig-upper-citarum-watershed.
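
Most tools expect coordinates in signed decimal degrees, not degrees and minutes. As a small sketch, the snippet below converts the bounds quoted above and estimates how many 0.05° CHIRPS cells fit inside that bounding box. The helper `dm_to_decimal` is a name made up for this example, and the cell count assumes the bounds line up with cell edges; the real number of cells inside the watershed outline will be smaller because the watershed does not fill the whole rectangle.

```python
def dm_to_decimal(degrees, minutes, hemisphere):
    """Convert degrees and minutes to signed decimal degrees."""
    value = degrees + minutes / 60
    # South and West are negative by convention.
    return -value if hemisphere in ("S", "W") else value


# Bounds quoted in the text above.
north = dm_to_decimal(6, 45, "S")
south = dm_to_decimal(7, 15, "S")
west = dm_to_decimal(107, 21, "E")
east = dm_to_decimal(107, 57, "E")

RES_DEG = 0.05  # CHIRPS grid spacing

# Number of grid rows and columns spanned by the bounding box.
n_rows = round((north - south) / RES_DEG)
n_cols = round((east - west) / RES_DEG)

print(f"latitude:  {south:.2f} to {north:.2f}")
print(f"longitude: {west:.2f} to {east:.2f}")
print(f"{n_rows} rows x {n_cols} columns = {n_rows * n_cols} cells in the box")
```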

---

## Data Acquisition and Preprocessing

This chapter details the process of acquiring and preparing the rainfall data from our two sources: the satellite-based CHIRPS dataset and the ground-based measurements from BBWS Citarum rain gauges. We will outline the steps taken to download, clean, and align these datasets, ensuring they are compatible for a robust comparison within the Upper Citarum River Watershed for a defined time period. This meticulous preparation is crucial to ensure the accuracy and reliability of our subsequent analysis.


### CHIRPS Data

Building on the introduction, we now turn to the first of our data sources: CHIRPS. This section describes how we obtained CHIRPS data for our study area, the Upper Citarum River Watershed, from ClimateSERV, a platform for visualizing and downloading historical and forecast climate data. CHIRPS data is available from several sources. We chose ClimateSERV for its user-friendly interface and because it delivers the data in NetCDF format.

ClimateSERV is available at [https://climateserv.servirglobal.net/map](https://climateserv.servirglobal.net/map). Follow these steps to get the data:

![CHIRPS Data Download from ClimateSERV](assets/img/fig-climateserv-chirps.png){#fig-climateserv-chirps width=450}

1. **Navigate to ClimateSERV:** Open your web browser and go to [https://climateserv.servirglobal.net/map](https://climateserv.servirglobal.net/map).

2. **Set Area of Interest (AOI):**

    - On the left panel, you'll see "Statistical Query" and "Set Area of Interest."
    - Click on the "**Upload**" tab under "Set Area of Interest."
    - You can either drag and drop a file representing the Upper Citarum River Watershed boundary (in .zip, .json, or .geojson format) or click to select the file from your computer. The map on the right will automatically zoom to the uploaded boundary, as shown in @fig-climateserv-chirps. If you don't have a boundary file, you can use the "Draw" option to manually draw a rectangle around the area, but this method is less precise. In this case, we use the boundary of the Upper Citarum basin, which is highlighted with a blue line.

3. **Select Data Parameters:**

    - Under "Select Data," choose "**Download Raw Data**" for "Type of Request."
    - Select "**Observation**" for "Dataset Type."
    - Choose "**UCSB CHIRPS Rainfall**" as the "Data Source."
    - Select "**NetCDF**" as the "Download Format."

4. **Specify Date Range:**

    - Set the "Date Range" according to your analysis period. For this example, let's use **2007-01-01** as the start date and **2019-12-31** as the end date.

5. **Submit Query:**

    - Click the "**Submit Query**" button. ClimateSERV will process your request. The download will contain a NetCDF file (.nc) with CHIRPS rainfall data clipped to the Upper Citarum River Watershed for the specified period.

By following these steps, we efficiently obtain CHIRPS data that is both spatially and temporally aligned with our study area and period, ready for further processing and analysis.
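
Once the file is downloaded, a quick sanity check is to compare the number of time steps in it with the number of days in the requested period. The sketch below only computes the expected count. It assumes the CHIRPS product delivered by ClimateSERV has one value per day, which the steps above do not state explicitly.

```python
from datetime import date

# Date range used in step 4.
start = date(2007, 1, 1)
end = date(2019, 12, 31)

# Both the start and the end date are included, hence the + 1.
expected_days = (end - start).days + 1
print(f"Expected daily time steps: {expected_days}")
```

If the time dimension of the downloaded file is shorter than this, some days are missing and should be handled before comparing with the gauge data.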


### BBWS Citarum Data


In [3]:
print("hello")

hello


### Data Alignment and Cleaning