# Other change point detection methods

The study of water governance stages relies heavily on the integrated index IWGI at the basin scale, which offers significant advantages for decision-making support. Nonetheless, in large basins the index may obscure crucial details and spatial heterogeneity. For instance, averaging water stress over areas with and without water stress could lead to a misleading "no water stress" result. To improve the discussion of the water governance transition, I recommend that the authors link the three periods to well-known phenomena or policies, such as the occurrence of zero flow in the Yellow River.
Additionally, the periods are currently separated using the Pettitt method for change point detection, which is sensitive to changes in the time series; the separation could be improved by, or compared against, other methods.
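
The averaging concern in the comment above can be illustrated with a small sketch. All numbers, the two sub-basin names, and the 0.4 cut-off are hypothetical values chosen for illustration; none of them come from the study.

```python
# Hypothetical sub-basin values, chosen only to illustrate the averaging effect.
# Stress is expressed as withdrawal / availability; the 0.4 cut-off is an
# illustrative assumption, not a threshold taken from the study.
STRESS_THRESHOLD = 0.4

sub_basins = {
    "upper": {"withdrawal": 5.0, "availability": 50.0},
    "lower": {"withdrawal": 24.0, "availability": 30.0},
}


def stress(withdrawal, availability):
    """Water stress as the ratio of withdrawal to availability."""
    return withdrawal / availability


# Stress computed separately for each sub-basin
local = {name: stress(**values) for name, values in sub_basins.items()}

# Stress computed once for the basin as a whole
total_withdrawal = sum(v["withdrawal"] for v in sub_basins.values())
total_availability = sum(v["availability"] for v in sub_basins.values())
basin = stress(total_withdrawal, total_availability)

stressed_locally = [name for name, s in local.items() if s > STRESS_THRESHOLD]
basin_stressed = basin > STRESS_THRESHOLD

print(local, basin, stressed_locally, basin_stressed)
```

With these made-up numbers the lower sub-basin is clearly stressed, while the basin-wide ratio stays below the cut-off, which is the kind of masking the comment warns about.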

In [1]:
%load_ext autoreload
%autoreload 2
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
from IPython.core.interactiveshell import InteractiveShell

InteractiveShell.ast_node_interactivity = "all"

import pandas as pd
import numpy as np

In [2]:
from hydra import compose, initialize
import os

# Load the project-level configuration
with initialize(version_base=None, config_path="../config"):
    cfg = compose(config_name="config")
os.chdir(cfg.root)

## Load data

In [3]:
from regimes_yrb.tools.statistic import (
    ratio_contribution,
    plot_pettitt_change_points,
    plot_ratio_contribution,
    pettitt_changes,
)
import matplotlib
from matplotlib import pyplot as plt
from matplotlib.gridspec import GridSpec


plt.rcParams["xtick.direction"] = "in"
plt.rcParams["ytick.direction"] = "out"

COLORS = cfg.style.colors
period_colors = COLORS.period
region_colors = COLORS.region
index_colors = COLORS.index

index_colormap = matplotlib.colors.ListedColormap(index_colors, "indexed")
total_water_use_color = COLORS.total_WU

In [4]:
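# Load the data for a threshold of 0.05, i.e. all cities whose area intersecting
# the Yellow River Basin exceeds 5% of the city's total area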
city_yr = pd.read_csv(cfg.db.perfectures)

In [5]:
city_yr.head()

| | City_ID | Year | IRR | Irrigated area: Total | Irrigated area: Rice | Irrigated area: Wheat | Irrigated area: Maize | Irrigated area: Vegetables and fruits | Irrigated area: Others | Irrigation water-use intensity (WUI): Total | ... | Rural domestic WUI | Rural livestock WU | Livestock population | Livestock WUI | Total water use | Province_n | Area_calcu | Region | Intersect_area | Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | C27 | 1965 | 0.300518 | 46.631997 | 0.391448 | 16.089679 | 1.152312 | 0.571298 | 28.427261 | 644.445209 | ... | 31.895556 | 0.003203 | 141.750766 | 0.022595 | 0.328586 | Gansu | 20091.467281 | UR | 19188.439369 | 0.955054 |
| 1 | C27 | 1966 | 0.323595 | 49.468303 | 0.383836 | 16.485679 | 1.434736 | 0.636613 | 30.52744 | 654.146772 | ... | 28.371723 | 0.003336 | 147.646616 | 0.022592 | 0.351996 | Gansu | 20091.467281 | UR | 19188.439369 | 0.955054 |
| 2 | C27 | 1967 | 0.340063 | 52.309331 | 0.416675 | 17.803304 | 1.442818 | 0.697033 | 31.949501 | 650.100439 | ... | 21.033715 | 0.003413 | 151.033245 | 0.0226 | 0.372432 | Gansu | 20091.467281 | UR | 19188.439369 | 0.955054 |
| 3 | C27 | 1968 | 0.35269 | 53.870788 | 0.437429 | 18.863369 | 1.514685 | 0.770592 | 32.284713 | 654.69528 | ... | 22.233352 | 0.003487 | 154.2665 | 0.022604 | 0.391458 | Gansu | 20091.467281 | UR | 19188.439369 | 0.955054 |
| 4 | C27 | 1969 | 0.36574 | 55.12073 | 0.447621 | 19.700679 | 1.54992 | 0.75268 | 32.66983 | 663.524461 | ... | 32.825618 | 0.003575 | 158.251492 | 0.022594 | 0.406136 | Gansu | 20091.467281 | UR | 19188.439369 | 0.955054 |


In [6]:
from regimes_yrb.index import integrated_water_governance_index

sfv = pd.read_csv(cfg.db.results.S, index_col=0).iloc[:, 0]
priority = pd.read_csv(cfg.db.results.P, index_col=0).iloc[:, 0]
allocation = pd.read_csv(cfg.db.results.A, index_col=0).iloc[:, 0]

iwgi = integrated_water_governance_index(
    priority=priority,
    scarcity=sfv,
    allocation=allocation,
)

iwgi
# Extract the IWGI series
iwgi = iwgi["IWGI"]

impressions = iwgi.values

| Year | S | P | A | IWGI | stage |
|---|---|---|---|---|---|
| 1965 | 0.804414 | 0.894241 | 0.051926 | 0.583527 | P1 |
| 1966 | 0.779488 | 0.938673 | 0.738032 | 0.818731 | P1 |
| 1967 | 0.662708 | 0.972201 | 0.768474 | 0.801128 | P1 |
| 1968 | 0.256621 | 0.993857 | 0.784101 | 0.678193 | P1 |
| 1969 | 0.327834 | 0.975086 | 0.869404 | 0.724108 | P1 |
| 1970 | 0.20811 | 0.834666 | 1.0 | 0.680925 | P1 |
| 1971 | 0.229082 | 0.960036 | 0.763665 | 0.650928 | P1 |
| 1972 | 0.227475 | 0.956681 | 0.739309 | 0.641155 | P1 |
| 1973 | 0.158712 | 0.97745 | 0.682352 | 0.606171 | P1 |
| 1974 | 0.188652 | 1.0 | 0.654079 | 0.614244 | P1 |
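
As a baseline for comparison, the Pettitt test named in the review comment can be sketched in plain Python. This is a minimal single-change-point sketch with the usual approximate p-value; it is not the project's `pettitt_changes` helper, whose implementation is not shown here. The function name `pettitt` and the step-shaped test series are assumptions made for illustration.

```python
import math


def pettitt(series):
    """Single change-point Pettitt test (illustrative sketch).

    Returns (index, K, approx_p), where index is the position of the first
    observation after the most likely change point.
    """
    n = len(series)
    best_t, best_abs_u = 0, -1
    u = 0
    for t in range(n - 1):
        # Recurrence: U_t = U_{t-1} + sum_j sign(x_t - x_j)
        u += sum((series[t] > x) - (series[t] < x) for x in series)
        if abs(u) > best_abs_u:
            best_t, best_abs_u = t, abs(u)
    k = best_abs_u
    # Approximate two-sided significance of K
    p = min(1.0, 2.0 * math.exp(-6.0 * k**2 / (n**3 + n**2)))
    return best_t + 1, k, p


# Made-up step series: ten low values followed by ten high values
demo = [1.0] * 10 + [5.0] * 10
index, k_stat, p_value = pettitt(demo)
print(index, k_stat, p_value)
```

Because the statistic is built from signs of pairwise differences, it responds to a single dominant shift in the series, which is one reason to cross-check the periods with methods that can return several breakpoints.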


In [7]:
import ruptures as rpt

# Detectors that accept a fixed number of breakpoints via predict(n_bkps=...)
algorithms = (
    "Dynp",
    # 'KernelCPD',
    # 'Pelt',
    "Binseg",
    "BottomUp",
    "Window",  # uses its default window width, which may exceed a short annual series
)

for alg in algorithms:
    algorithm = getattr(rpt, alg)
    algo = algorithm(model="l2", min_size=5)
    algo.fit(impressions)
    result = algo.predict(n_bkps=2)
    # The last entry of result is the series length, so drop it before
    # mapping positions back to years
    breakpoints = [iwgi.index[i] for i in result[:-1]]
    print(alg, breakpoints)

<ruptures.detection.dynp.Dynp at 0x1370c6e90>

Dynp [1975, 2000]


<ruptures.detection.binseg.Binseg at 0x2a4f19c90>

Binseg [1975, 2000]


<ruptures.detection.bottomup.BottomUp at 0x1370b28d0>

BottomUp [1975, 2000]


<ruptures.detection.window.Window at 0x1370ac410>

Window []


In [8]:
algo = rpt.Dynp(model="l2", min_size=5)
algo.fit(impressions)

result = algo.predict(n_bkps=2)

<ruptures.detection.dynp.Dynp at 0x2a4fde690>

In [9]:
breakpoints = [iwgi.index[i] for i in result[:-1]]
breakpoints

[1975, 2000]
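
To make explicit what an exact search with a least-squares (`l2`) cost does, here is a plain-Python sketch of dynamic programming over segment boundaries. It is an illustration of the idea only, not the `ruptures` implementation; the function names `l2_cost` and `dp_breakpoints` and the test series are hypothetical. Unlike `predict`, this sketch does not append the series length to its result.

```python
def l2_cost(prefix, prefix_sq, start, end):
    """Sum of squared deviations from the mean over series[start:end]."""
    n = end - start
    s = prefix[end] - prefix[start]
    sq = prefix_sq[end] - prefix_sq[start]
    return sq - s * s / n


def dp_breakpoints(series, n_bkps, min_size=1):
    """Positions of n_bkps breakpoints minimising the total l2 cost."""
    n = len(series)
    # Prefix sums let each segment cost be evaluated in constant time
    prefix, prefix_sq = [0.0], [0.0]
    for x in series:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    inf = float("inf")
    # best[k][e]: minimal cost of splitting series[:e] into k + 1 segments
    best = [[inf] * (n + 1) for _ in range(n_bkps + 1)]
    back = [[0] * (n + 1) for _ in range(n_bkps + 1)]
    for e in range(min_size, n + 1):
        best[0][e] = l2_cost(prefix, prefix_sq, 0, e)
    for k in range(1, n_bkps + 1):
        for e in range((k + 1) * min_size, n + 1):
            for s in range(k * min_size, e - min_size + 1):
                c = best[k - 1][s] + l2_cost(prefix, prefix_sq, s, e)
                if c < best[k][e]:
                    best[k][e], back[k][e] = c, s

    # Walk back from the full series to recover the segment boundaries
    bkps, e = [], n
    for k in range(n_bkps, 0, -1):
        e = back[k][e]
        bkps.append(e)
    return sorted(bkps)


# Made-up piecewise-constant series with changes at positions 6 and 12
demo_signal = [0.0] * 6 + [3.0] * 6 + [1.0] * 6
demo_bkps = dp_breakpoints(demo_signal, n_bkps=2, min_size=2)
print(demo_bkps)
```

The number of breakpoints is fixed in advance here, as with `n_bkps=2` in the cell above, so the method cannot by itself say whether two breakpoints is the right number.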

In [10]:
from signal_processing_algorithms.energy_statistics import energy_statistics

change_points = energy_statistics.e_divisive(impressions, pvalue=0.01, permutations=100)

breakpoints = [iwgi.index[i] for i in change_points]
breakpoints

[1975, 2001, 1978]
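
The result above is not in chronological order; `sorted(breakpoints)` would give `[1975, 1978, 2001]`. For intuition about the statistic involved, the sketch below scores every candidate split of a series by the sample energy distance between the two sides and keeps the best one. It is a simplified single-split illustration: the library call additionally applies permutation-based significance testing and searches for further change points, and its internals may differ from this sketch. The function names and the test series are hypothetical.

```python
def mean_abs_diff(a, b):
    """Mean absolute difference over all pairs drawn from a and b."""
    return sum(abs(x - y) for x in a for y in b) / (len(a) * len(b))


def energy_distance(a, b):
    """Sample energy distance between two one-dimensional samples."""
    return 2 * mean_abs_diff(a, b) - mean_abs_diff(a, a) - mean_abs_diff(b, b)


def best_energy_split(series, min_size=2):
    """Split position that maximises the size-weighted energy distance."""
    n = len(series)
    best_idx, best_q = None, float("-inf")
    for t in range(min_size, n - min_size + 1):
        left, right = series[:t], series[t:]
        q = len(left) * len(right) / n * energy_distance(left, right)
        if q > best_q:
            best_idx, best_q = t, q
    return best_idx, best_q


# Made-up series with a single shift at position 8
demo_series = [0.0] * 8 + [2.0] * 8
split_idx, split_q = best_energy_split(demo_series)
print(split_idx, split_q)
```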

In [14]:
algo = rpt.Pelt(model="l2", min_size=2)
algo.fit(impressions)
result = algo.predict(pen=0.05)

breakpoints = [iwgi.index[i] for i in result[:-1]]
breakpoints

<ruptures.detection.pelt.Pelt at 0x2a6503b10>

[1975, 2000]
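
One way to summarise the comparison is to group the years reported by the different methods and count how many methods support each group. The dictionary below copies the outputs printed in this notebook (`Window` is left out because it returned no breakpoints); the helper `cluster_years` and the one-year tolerance are assumptions made for this sketch.

```python
# Breakpoint years as printed by the cells above
detected = {
    "Dynp": [1975, 2000],
    "Binseg": [1975, 2000],
    "BottomUp": [1975, 2000],
    "e_divisive": [1975, 2001, 1978],
    "Pelt": [1975, 2000],
}


def cluster_years(results, tolerance=1):
    """Group years that lie within `tolerance` of the previous year in sorted order."""
    pairs = sorted((year, method) for method, years in results.items() for year in years)
    clusters = []
    for year, method in pairs:
        if clusters and year - clusters[-1]["years"][-1] <= tolerance:
            clusters[-1]["years"].append(year)
            clusters[-1]["methods"].add(method)
        else:
            clusters.append({"years": [year], "methods": {method}})
    return clusters


clusters = cluster_years(detected)
# (first year, last year, number of supporting methods) for each cluster
summary = [(c["years"][0], c["years"][-1], len(c["methods"])) for c in clusters]
print(summary)
```

Under this tolerance, the breaks around 1975 and 2000 to 2001 are supported by all five methods, while 1978 is reported by `e_divisive` only.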