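# How Machine Learning Can Support Climate Change Projection: A Case Study in Sea Surface Temperature Bias Correction and Prediction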


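Since the Industrial Revolution, growing human activity has driven continued global warming. Global mean sea surface temperature rose by about 0.8 ℃ between 1911 and 2011, posing a considerable threat to marine ecosystems and the climate. Accurate and well-founded projections of **global mean sea surface temperature over the next century** therefore help us understand future climate change and support the design of climate policy.

**Simulating, predicting and projecting climate change** is a major research focus worldwide in ocean and atmospheric science, and it draws close attention from the public and from governments. **Sea surface temperature** is a product of ocean-atmosphere interaction: it reflects how the whole climate system varies and in turn influences atmospheric activity and the climate system, so it can be used to simulate and predict future climate. **Traditional climate models** are one of the main tools for studying climate change, yet after half a century of development they **still carry biases**, which adds uncertainty to projections of future sea surface temperature. Reducing the simulated sea surface temperature bias, and thereby improving the accuracy of model predictions and projections, is therefore important.

With the recent growth of artificial intelligence, **machine learning algorithms** have attracted increasing attention and are now widely applied across Earth system science. Climate model bias is not simply linear; it has nonlinear features. A machine-learning-based correction model can capture the nonlinear variation of the bias between numerical model output and observations, and so yield a more accurate corrected result.

In this tutorial, working on the 和鲸 ModelWhale platform, we walk step by step through **correcting the global mean sea surface temperature projected by a climate model** and post-processing the model output. We combine **data decomposition with machine learning** to correct the bias in the simulated and projected global mean sea surface temperature. The material is organized as follows: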


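[1] We start with the research background, beginning with **climate models**, an important tool for studying the ocean and climate change. Two questions matter here:

&emsp;&emsp; 1) What is a climate model, and what can we do with one?
&emsp;&emsp; 2) What are the current shortcomings of climate models?

We then introduce our research target, model-simulated and projected sea surface temperature, and ask how the shortcomings of the simulated sea surface temperature can be compensated for. This leads to machine learning, which has shown great potential in recent years, as a way to correct model bias. The point to focus on is **why machine learning is used to correct the sea surface temperature bias of climate models.**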



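[2] With climate models introduced, we take the global sea surface temperature simulated and projected by the [FIO-ESM v2.0 model](https://www.hanspub.org/journal/PaperInformation.aspx?paperID=34034), developed by the First Institute of Oceanography, Ministry of Natural Resources, as our research target and practice some **data preprocessing** steps:

&emsp;&emsp; 1) What format is the data in, and how do we read it?
&emsp;&emsp; 2) The datasets come at different resolutions. How do we bring them onto a common grid, that is, how do we do **spatial interpolation**?
&emsp;&emsp; 3) From the interpolated data we want the global mean sea surface temperature. How do we compute it? This is where the **area mean** comes in.
&emsp;&emsp; 4) Once the calculation is done, how do we visualize the result, that is, how do we **plot** it, and how do we interpret the plots?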

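[3] After basic preprocessing, note that the model bias grows in a nonlinear, non-stationary way, so correcting it directly may not work well. We therefore first apply a data decomposition method, [**EEMD**](https://blog.csdn.net/liu_xiao_cheng/article/details/83897034), to split the original time series into components of different frequencies, then recombine the components by time scale, and finally correct each combined series. Points to pay attention to:

&emsp;&emsp; 1) How does **EEMD work**? How does it split one time series into several series of different frequencies?
&emsp;&emsp; 2) On what time scales do we **recombine the decomposed series so that they satisfy physical constraints**?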

[4] With the combined EEMD components in hand, we can build a model to correct them. We choose a neural network model, [**BPNN**](https://blog.csdn.net/cufewxy1/article/details/80445023?ops_request_misc=&request_id=&biz_id=102&utm_term=BPNN%E6%A8%A1%E5%9E%8B&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduweb~default-1-80445023.nonecase&spm=1018.2226.3001.4187). Points to learn here:

&emsp;&emsp; 1) Why do we choose BPNN to correct sea surface temperature?
&emsp;&emsp; 2) How must the data be prepared for the BPNN correction model? How does correcting the historical bias relate to the future projection?
&emsp;&emsp; 3) How do we evaluate the corrected result?


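[5] That covers the content of this tutorial. Afterwards there are two exercises for practice, one basic and one extended:

&emsp;&emsp; 1) Which **parameters** of the BPNN model used in this project noticeably affect the correction?
&emsp;&emsp; 2) Can the EEMD-BPNN model also be used to **correct data from other models**?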





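That completes the preface to this project. We hope this tutorial opens the door to climate models and machine learning for you and that you take something away from it.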


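# Background
Before starting the project, we need to understand the research background, which covers:
&emsp;&emsp;  1) What is a climate model, and what can we do with one?
&emsp;&emsp;  2) What are the current shortcomings of climate models?

Based on the bias in simulated sea surface temperature, we then propose a machine learning correction and explain why a neural network is used to correct the sea surface temperature bias of climate models.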


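## 1.1 What is a climate model?


Put simply, a climate model is a large set of computer programs. It is built on the mathematical and physical equations that describe the phenomena of the Earth's climate system (Figure 1) and how they change, and it solves those equations on a computer by numerical integration. An Earth system model extends a climate model with complex biogeochemical cycles. Strictly speaking the two differ in this way, but here we refer to both as "climate models".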



![Image Name](https://cdn.kesci.com/upload/image/rfmdqe1aq5.png?imageView2/0/w/960/h/960)
&emsp;&emsp;  &emsp;&emsp; &emsp;&emsp; &emsp;&emsp;  &emsp;&emsp; &emsp;&emsp; Figure 1. The Earth's climate system is complex and comprises the atmosphere, hydrosphere, cryosphere, lithosphere and biosphere


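Climate models are one of the three main approaches to climate system research[^1] and an important tool in ocean and climate change science. They deepen our scientific understanding of the Earth's spheres and their interactions and improve our ability to understand and predict the Earth system and climate change. Against the background of a warming climate in particular, building, developing and using climate models is extremely important and far-reaching.


[^1]: The three main approaches to climate system research are observation, theory and numerical modelling.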


![Image Name](https://cdn.kesci.com/upload/image/rfmem2nyh5.jpg?imageView2/0/w/960/h/960)
&emsp;&emsp;  &emsp;&emsp; &emsp;&emsp; &emsp;&emsp;  &emsp;&emsp; &emsp;&emsp;  &emsp;&emsp;  &emsp;&emsp; &emsp;&emsp; &emsp;&emsp; &emsp;&emsp; &emsp;&emsp; &emsp;&emsp;Figure 2. The relationship between observation, theory and numerical modelling

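## 1.2 Current state and shortcomings of climate models


Climate models date back to 1969, when Manabe and Bryan built the world's first coupled ocean-atmosphere model. Over the following fifty years they have advanced greatly, from coupled ocean-atmosphere general circulation models to climate models and on to Earth system models that include biogeochemical processes. Resolution keeps increasing, the represented processes keep getting more complex, and simulation accuracy is gradually improving, making these models a core tool of climate change science.

However, model resolution is not infinitely fine, so a model cannot perfectly represent every process in the climate system, and many processes in the climate system are still not well understood. As a result, climate models still have simulation and prediction biases. For sea surface temperature, for example, the annual mean shows a warm bias in the eastern Pacific and a cold bias in the central Pacific. These biases affect the accuracy of future predictions, which makes bias correction of model output especially necessary.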




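Sea surface temperature (SST) is a product of ocean-atmosphere interaction. It reflects how the whole climate system varies and in turn influences atmospheric activity and the climate system, so we can use SST to simulate and predict future climate.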

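## 1.3 Why use machine learning to correct climate model bias?

With the research target fixed, the next step is to choose a method. Traditionally, model data were often used for analysis as they came, or **model output was corrected with classical machine learning methods such as decision trees and support vector machines.**
In recent years, machine learning represented by **neural networks**, and deep learning in particular, has been widely applied in meteorology and oceanography and has shown strong capability and potential, for example in identifying ocean eddies and predicting precipitation. The time series of model-simulated SST bias does not simply vary linearly; it has nonlinear features. A neural network has strong nonlinear representation capacity, so it can learn the regularities of the bias between model and observations and thereby correct it. Taking the nonlinear character of the model SST bias into account, we build a three-layer neural network for this project.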
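
To make "three-layer neural network" more concrete, the following is a minimal NumPy sketch of a network with an input layer, one hidden layer and an output layer, trained by back-propagation. It is not the model built later in this project: the hidden-layer size, `tanh` activation, learning rate, iteration count and the sine-shaped synthetic target are all assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic nonlinear signal standing in for a bias series (illustrative only)
x = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
y = np.sin(np.pi * x)

n_hidden = 8  # assumed hidden-layer size
W1 = rng.normal(0.0, 0.5, size=(1, n_hidden))
b1 = np.zeros((1, n_hidden))
W2 = rng.normal(0.0, 0.5, size=(n_hidden, 1))
b2 = np.zeros((1, 1))


def forward(x, W1, b1, W2, b2):
    """Input -> tanh hidden layer -> linear output."""
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2


def mse(pred, target):
    return float(np.mean((pred - target) ** 2))


loss_before = mse(forward(x, W1, b1, W2, b2)[1], y)

lr = 0.05  # assumed learning rate
for _ in range(3000):
    h, pred = forward(x, W1, b1, W2, b2)
    err = (pred - y) / len(x)            # gradient of 0.5 * MSE w.r.t. pred
    grad_W2 = h.T @ err                  # back-propagate through the output layer
    grad_b2 = err.sum(axis=0, keepdims=True)
    grad_z = (err @ W2.T) * (1.0 - h ** 2)   # through tanh
    grad_W1 = x.T @ grad_z
    grad_b1 = grad_z.sum(axis=0, keepdims=True)
    W1 -= lr * grad_W1
    b1 -= lr * grad_b1
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2

loss_after = mse(forward(x, W1, b1, W2, b2)[1], y)
print(loss_before, loss_after)
```

The hidden `tanh` layer is what lets the network represent a nonlinear mapping; with it removed, the model collapses to a linear fit.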

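## Summary
This chapter laid out the background of the project: what a climate model is and where climate models fall short. It also fixed our research target, SST, and the method chosen to correct the model-simulated SST.

Next, we take the first step of the project and start with SST data preprocessing.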

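# SST data preprocessing
With a rough picture of the research target and method, we can start the project. The first step is **SST data preprocessing**.

As noted in the preface, the questions for this chapter are:

   1) What format is the data in, and how do we read it?
   2) The datasets come at different resolutions. How do we bring them onto a common grid, that is, how do we do spatial interpolation?
   3) From the interpolated data we want the global mean SST. How do we compute it? This is where the area-mean calculation for sea surface temperature comes in.
   4) Once the calculation is done, how do we visualize the result, that is, how do we plot it, and how do we interpret the plots?

Notes:
&emsp;&emsp;   1) For data processing, a 2-core, 8 GB CPU instance is sufficient. Use the image 【octave 测试镜像-song-v1】 with a Python3 kernel.
&emsp;&emsp;   2) While the loops in this tutorial run, a notice such as "memory usage has exceeded 80%" may pop up. It does not affect the exercise and can be ignored.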
 

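## 2.1 About the data

Before preprocessing the SST data, we need a rough picture of what it contains.
In this chapter we use the global SST simulated and projected by FIO-ESM v2.0, the second-generation climate model developed by the First Institute of Oceanography, Ministry of Natural Resources, together with observations. Both model and observational data are in **nc (NetCDF) format**.

The datasets are:

1) Observations: the Extended Reconstructed SST version 5, [ERSST v5](https://climatedataguide.ucar.edu/climate-data/sst-data-noaa-extended-reconstruction-ssts-version-5-ersstv5), covering the global ocean at 2°×2° resolution, from January 1854 to December 2019 at monthly intervals, 166 years in total.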

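2) [Model data](https://esgf-node.llnl.gov/search/cmip6/): **historical simulation data and future scenario projection data**. The FIO-ESM v2.0 historical data cover the global ocean, with a resolution of about 1.1° at high latitudes refined to 0.3° to 0.5° near the equator, from January 1850 to December 2014 at monthly intervals, 165 years in total. The **projections under 3 future greenhouse gas emission levels**[^2] cover the global ocean at the same resolution as the historical data, from January 2015 to December 2100 at monthly intervals, 86 years in total.



[^2]: To show the climate impacts and socio-economic risks of different policy choices, scientists have defined a set of projection scenarios. In short, these are experiments with different greenhouse gas emissions (written 'SSPa-b', where SSP stands for Shared Socioeconomic Pathways, a denotes the socio-economic development pathway, and b denotes the radiative forcing of the climate system in 2100) that show how the climate system may change in the future. The three projection datasets used here represent low emissions (SSP1-2.6), intermediate emissions (SSP2-4.5) and high emissions (SSP5-8.5). For details, see the [overview article on the CMIP6 Scenario Model Intercomparison Project](http://www.climatechange.cn/CN/10.12006/j.issn.1673-1719.2019.082).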

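## 2.2 Data processing
Now that we know which data we will use, we can start processing it.

 The processing consists of the following parts:
 &emsp;&emsp;  1) reading the data
 &emsp;&emsp;  2) spatial interpolation
 &emsp;&emsp;  3) computing the global mean SST
 &emsp;&emsp;  4) plotting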
 
 




### 2.2.1 Reading the data
We first import the required packages. Pay particular attention to the **netCDF4 package and its Dataset class**, which we use to read the data.

In [1]:
import numpy as np
import netCDF4 as nc
from netCDF4 import Dataset  # used to read nc files
from scipy.interpolate import griddata  # used for spatial interpolation of SST
import time
import matplotlib.pyplot as plt
import scipy.io as io
import os

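With the imports done, we can read the observational and model data. The data are mounted in the public community dataset **《CMIP6气候模式模拟和预估海表面温度》** (CMIP6 climate model simulated and projected sea surface temperature).

#### 2.2.1.1 Reading the observational data

Inspect the file information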

In [2]:
Path_Data = './dataset/CMIP6/'  # path to the data
NC_ERSST = Dataset(Path_Data + 'sst.mnmean.nc') # open the file
print(NC_ERSST)  # print the file information

<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4_CLASSIC data model, file format HDF5):
    climatology: Climatology is based on 1971-2000 SST, Xue, Y., T. M. Smith, and R. W. Reynolds, 2003: Interdecadal changes of 30-yr SST normals during 1871.2000. Journal of Climate, 16, 1601-1612.
    description: In situ data: ICOADS2.5 before 2007 and NCEP in situ data from 2008 to present. Ice data: HadISST ice before 2010 and NCEP ice after 2010.
    keywords_vocabulary: NASA Global Change Master Directory (GCMD) Science Keywords
    keywords: Earth Science > Oceans > Ocean Temperature > Sea Surface Temperature >
    instrument: Conventional thermometers
    source_comment: SSTs were observed by conventional thermometers in Buckets (insulated or un-insulated canvas and wooded buckets) or Engine Room Intaker
    geospatial_lon_min: -1.0
    geospatial_lon_max: 359.0
    geospatial_laty_max: 89.0
    geospatial_laty_min: -89.0
    geospatial_lat_max: 89.0
    geospatial_lat_min: -89.0
    

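The file information printed above is detailed, and we do not need to go through all of it. The parts we care about are:

1) how the dataset was produced, given in the **description** field: it combines in situ observations with sea-ice data;
2) the latitude and longitude range, given as minimum and maximum values;
3) the download URL and version information, which you can follow for more detail on the provider's website;
4) the dimensions: there are 4 in total, of which we only need latitude `lat`, longitude `lon` and time `time`.

When printing the model file information later, the same 4 items are all you need to check.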

In [3]:
print(NC_ERSST.variables.keys()) # list the variables in the file

odict_keys(['lat', 'lon', 'time_bnds', 'time', 'sst'])


In [4]:
print(NC_ERSST.variables['sst'].missing_value ) # check the missing value of the variable

-9.96921e+36
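
The missing value matters because the cells below take `.data` from what is read, which exposes the raw numbers, fill values included. The following sketch uses a toy array rather than the real file to show how an unmasked fill value ruins a plain mean, and how a NumPy masked array avoids that. The threshold `1e30` and the sample temperatures are assumptions for illustration.

```python
import numpy as np

fill = -9.96921e36  # the missing value printed above
raw = np.array([[18.2, fill],
                [fill, 17.5]])  # toy 2x2 field: two ocean points, two missing points

# A plain mean is dominated by the fill values
mean_raw = raw.mean()

# Mask everything that is implausibly large in magnitude (assumed threshold)
masked = np.ma.masked_where(np.abs(raw) > 1e30, raw)
mean_masked = masked.mean()        # averages the two valid points only

# Converting masked points to NaN is another common convention
filled_nan = masked.filled(np.nan)

print(mean_raw, mean_masked)
```

Later in this tutorial the same problem is handled differently, by keeping only points with `abs(X) < 50` inside the area-mean function.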


In [5]:
print(NC_ERSST.variables['time']) # check the time variable

<class 'netCDF4._netCDF4.Variable'>
float64 time(time)
    units: days since 1800-1-1 00:00:00
    long_name: Time
    delta_t: 0000-01-00 00:00:00
    avg_period: 0000-01-00 00:00:00
    prev_avg_period: 0000-00-07 00:00:00
    standard_name: time
    axis: T
    actual_range: [19723. 80322.]
unlimited dimensions: time
current shape = (1992,)
filling on, default _FillValue of 9.969209968386869e+36 used


In [6]:
Lat_ERSST = NC_ERSST.variables['lat'][:].data  # latitude, 1-D, from 88 to -88, shape (89,)
Lon_ERSST = NC_ERSST.variables['lon'][:].data  # longitude, 1-D, from 0 to 358, shape (180,)
# SST, 3-D, ordered (time, lat, lon), shape (1992, 89, 180)
SST_ERSST = NC_ERSST.variables['sst'][:].data
Time_ERSST0 = NC_ERSST.variables['time'][:]  # time, 1-D, shape (1992,)
Time_ERSST0 = nc.num2date(
    Time_ERSST0,
    'days since 1800-1-1 00:00:00').data  # 185401-201912, 1992 elements
print(
    Lat_ERSST.shape,
    Lon_ERSST.shape,
    SST_ERSST.shape,
    len(Time_ERSST0))  # print the size of each variable

(89,) (180,) (1992, 89, 180) 1992


In [7]:
# Build our own list of dates as YYYYMM integers
Time_ERSST = [int(Time_ERSST0[i].year*100) + int(Time_ERSST0[i].month)
              for i in range(Time_ERSST0.shape[0])]  # 185401-201912, 1992 elements
print(len(Time_ERSST))  # check the length of Time_ERSST
print(Time_ERSST[0:5], Time_ERSST[-5:])  # print the first and last 5 entries

1992
[185401, 185402, 185403, 185404, 185405] [201908, 201909, 201910, 201911, 201912]
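
As a cross-check on what `num2date` does with `days since 1800-1-1`, the conversion can be reproduced with the standard library alone. This is a sketch that assumes the standard (Gregorian) calendar, which is what Python's `datetime` implements; the helper name `days_to_yyyymm` is hypothetical. The two inputs are the `actual_range` values printed for the time variable above.

```python
from datetime import datetime, timedelta


def days_to_yyyymm(days, origin=datetime(1800, 1, 1)):
    """Convert 'days since origin' to an integer YYYYMM (Gregorian calendar)."""
    d = origin + timedelta(days=float(days))
    return d.year * 100 + d.month


first = days_to_yyyymm(19723.0)  # first value of actual_range
last = days_to_yyyymm(80322.0)   # last value of actual_range
print(first, last)
```

Both ends agree with the first and last entries of `Time_ERSST`.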


#### 2.2.1.2 Reading the model data: historical data
The steps are similar to those for the observational data.

Inspect the file information

In [8]:
NC_Historical = Dataset(Path_Data + 'tos_Omon_FIO-ESM-2-0_historical_r1i1p1f1_gn_185001-201412.nc')
print(NC_Historical) # print the file information

<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4_CLASSIC data model, file format HDF5):
    Conventions: CF-1.7 CMIP-6.2
    activity_id: CMIP
    branch_method: standard
    branch_time_in_child: 59400.0
    branch_time_in_parent: 59400.0
    contact: songroy@fio.org.cn
    creation_date: 2019-11-22T07:16:56Z
    data_specs_version: 01.00.31
    experiment: all-forcing simulation of the recent past
    experiment_id: historical
    external_variables: areacello
    forcing_index: 1
    frequency: mon
    further_info_url: https://furtherinfo.es-doc.org/CMIP6.FIO-QLNM.FIO-ESM-2-0.historical.none.r1i1p1f1
    grid: native atmosphere regular grid (192x288 latxlon)
    grid_label: gn
    history: 2019-11-22T07:16:56Z ;rewrote data to be consistent with CMIP for variable tos found in table Omon.
    initialization_index: 1
    institution: FIO (First Institute of Oceanography, State Oceanic Administration, Qingdao 266061, China), QNLM (Qingdao National Laboratory for Marine Science a

In [9]:
print(NC_Historical.variables.keys())  # list the variables in the file

odict_keys(['time', 'time_bnds', 'j', 'i', 'latitude', 'longitude', 'vertices_latitude', 'vertices_longitude', 'tos'])


In [10]:
print(NC_Historical.variables['tos'].missing_value ) # check the missing value of the variable

1e+20


In [11]:
print(NC_Historical.variables['time'])   # check the time variable

<class 'netCDF4._netCDF4.Variable'>
float64 time(time)
    bounds: time_bnds
    units: days since 0001-01-01
    calendar: 365_day
    axis: T
    long_name: time
    standard_name: time
unlimited dimensions: time
current shape = (1980,)
filling on, default _FillValue of 9.969209968386869e+36 used


In [12]:
# Latitude: a 2-D grid ordered (lat, lon), shape (384, 320)
Lat_Historical = NC_Historical.variables['latitude'][:].data
# Longitude: a 2-D grid ordered (lat, lon), shape (384, 320)
Lon_Historical = NC_Historical.variables['longitude'][:].data
# SST: 3-D, ordered (time, lat, lon), shape (1980, 384, 320)
SST_Historical = NC_Historical.variables['tos'][:].data
# Build our own list of dates as YYYYMM integers
Time_Historical = [int(i*100) + int(j) for i in range(1850, 2015)
                   for j in range(1, 13)]  # 185001-201412, 1980 elements
print(
    Lat_Historical.shape,
    Lon_Historical.shape,
    SST_Historical.shape,
    len(Time_Historical))  # print the size of each variable

(384, 320) (384, 320) (1980, 384, 320) 1980
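
The model's time variable is stored as `days since 0001-01-01` on a `365_day` calendar, in which every year has 365 days and there are no leap days. The tutorial sidesteps this by building the date list by hand. For reference, the sketch below shows one way such a value could be mapped to YYYYMM. The helper name `noleap_days_to_yyyymm` is hypothetical, and the sample inputs are constructed for illustration because the file's actual time values are not shown here.

```python
import numpy as np

DAYS_IN_MONTH = np.array([31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])
# Day-of-year (0-based) on which each month starts in a 365-day year
MONTH_STARTS = np.concatenate(([0], np.cumsum(DAYS_IN_MONTH)[:-1]))


def noleap_days_to_yyyymm(days, origin_year=1, month_starts=MONTH_STARTS):
    """Map 'days since <origin_year>-01-01' on a 365_day calendar to YYYYMM."""
    days = int(np.floor(days))
    year = origin_year + days // 365
    day_of_year = days % 365
    month = int(np.searchsorted(month_starts, day_of_year, side="right"))
    return year * 100 + month


# Constructed examples: mid-January 1850 and 1 December 1850
jan_1850 = noleap_days_to_yyyymm(1849 * 365 + 15.5)
dec_1850 = noleap_days_to_yyyymm(1849 * 365 + 334)
print(jan_1850, dec_1850)
```

Because the data are monthly and their start and end months are known from the file name, the hand-built list used above is the simpler option.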


#### 2.2.1.3 Reading the model data: future projection data
In the same way, we read the 3 future scenario projections: SSP1-2.6, SSP2-4.5 and SSP5-8.5.

In [13]:
## Future projection, low-emission scenario: SSP1-2.6
NC_SSP126 = Dataset(Path_Data + 'tos_Omon_FIO-ESM-2-0_ssp126_r1i1p1f1_gn_201501-210012.nc')
print(NC_SSP126) 

<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4_CLASSIC data model, file format HDF5):
    Conventions: CF-1.7 CMIP-6.2
    activity_id: ScenarioMIP
    branch_method: standard
    branch_time_in_child: 59400.0
    branch_time_in_parent: 59400.0
    contact: songroy@fio.org.cn
    creation_date: 2019-12-27T13:38:05Z
    data_specs_version: 01.00.31
    experiment: update of RCP2.6 based on SSP1
    experiment_id: ssp126
    external_variables: areacello
    forcing_index: 1
    frequency: mon
    further_info_url: https://furtherinfo.es-doc.org/CMIP6.FIO-QLNM.FIO-ESM-2-0.ssp126.none.r1i1p1f1
    grid: native atmosphere regular grid (384x320 latxlon)
    grid_label: gn
    history: 2019-12-27T13:38:05Z ;rewrote data to be consistent with ScenarioMIP for variable tos found in table Omon.
    initialization_index: 1
    institution: FIO (First Institute of Oceanography, State Oceanic Administration, Qingdao 266061, China), QNLM (Qingdao National Laboratory for Marine Science and Te

In [14]:
print(NC_SSP126.variables.keys()) # list the variables in the file

odict_keys(['time', 'time_bnds', 'j', 'i', 'latitude', 'longitude', 'vertices_latitude', 'vertices_longitude', 'tos'])


In [15]:
print(NC_SSP126.variables['tos'].missing_value )  # check the missing value of the variable

1e+20


In [16]:
print(NC_SSP126.variables['time'])   # check the time variable

<class 'netCDF4._netCDF4.Variable'>
float64 time(time)
    bounds: time_bnds
    units: days since 0001-01-01
    calendar: 365_day
    axis: T
    long_name: time
    standard_name: time
unlimited dimensions: time
current shape = (1032,)
filling on, default _FillValue of 9.969209968386869e+36 used


In [17]:
# Latitude: same as the historical data, a 2-D grid ordered (lat, lon), shape (384, 320)
Lat_SSP = NC_SSP126.variables['latitude'][:].data
# Longitude: same as the historical data, a 2-D grid ordered (lat, lon), shape (384, 320)
Lon_SSP = NC_SSP126.variables['longitude'][:].data
# SST: 3-D, ordered (time, lat, lon), shape (1032, 384, 320)
SST_SSP126 = NC_SSP126.variables['tos'][:].data

In [18]:
# Build our own list of dates as YYYYMM integers
Time_SSP = [int(i*100) + int(j) for i in range(2015, 2101)
            for j in range(1, 13)]  # 201501-210012,(1032,)
print(Lat_SSP.shape, Lon_SSP.shape, SST_SSP126.shape, len(Time_SSP))

(384, 320) (384, 320) (1032, 384, 320) 1032


In [19]:
# Future projection, intermediate-emission scenario: SSP245. Only SST differs, so we read SST only
NC_SSP245 = Dataset(
    Path_Data +
    'tos_Omon_FIO-ESM-2-0_ssp245_r1i1p1f1_gn_201501-210012.nc')
SST_SSP245 = NC_SSP245.variables['tos'][:].data   # (1032, 384, 320)
# Future projection, high-emission scenario: SSP585. Only SST differs, so we read SST only
NC_SSP585 = Dataset(
    Path_Data +
    'tos_Omon_FIO-ESM-2-0_ssp585_r1i1p1f1_gn_201501-210012.nc')
SST_SSP585 = NC_SSP585.variables['tos'][:].data   # (1032, 384, 320)

### 2.2.2 Spatial interpolation
As the reading steps above show, the observational and model data have different resolutions. The grids differ, and the model grid is also non-uniform, which makes it awkward to compute a global mean later. To put everything on the same footing, we first interpolate both observational and model data onto a common grid.

We use the griddata function for the interpolation. For an introduction to the method, see [griddata interpolation in Python](https://blog.csdn.net/weixin_44052055/article/details/118752091?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522165873149216782350830959%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=165873149216782350830959&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~baidu_landing_v2~default-2-118752091-null-null.142^v33^pc_rank_34,185^v2^control&utm_term=python%20griddata&spm=1018.2226.3001.4187).
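
Before applying griddata to the real fields, it may help to see the call on a tiny grid. The sketch below interpolates a toy field from a 2° grid to a 1° grid with `method='nearest'`, using the same argument layout as the cells that follow: an (N, 2) array of source coordinates, the N source values, and a tuple of 2-D target coordinates. The grid extents and the toy field (each value equals its latitude) are assumptions for illustration.

```python
import numpy as np
from scipy.interpolate import griddata

# Coarse 2-degree source grid, ordered (lat, lon)
lat_src, lon_src = np.meshgrid(np.arange(-4.0, 5.0, 2.0),
                               np.arange(0.0, 9.0, 2.0), indexing="ij")
points = np.column_stack((lat_src.ravel(), lon_src.ravel()))  # shape (25, 2)
values = lat_src.ravel()                                      # toy field: value = latitude

# Finer 1-degree target grid
lat_dst, lon_dst = np.meshgrid(np.arange(-4.0, 5.0, 1.0),
                               np.arange(0.0, 9.0, 1.0), indexing="ij")

# Each target point takes the value of its closest source point
field_nearest = griddata(points, values, (lat_dst, lon_dst), method="nearest")
print(field_nearest.shape)
```

With `'nearest'`, every output value is copied from a source point, so no new values are created, and fill values on land are carried over unchanged rather than blended into neighbouring ocean points.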

#### 2.2.2.1 Interpolating the observational data
We first build coordinate grids for the latitudes and longitudes before and after interpolation, mainly with the [meshgrid function](https://lixiaoqian.blog.csdn.net/article/details/81532855?spm=1001.2101.3001.6661.1&utm_medium=distribute.pc_relevant_t0.none-task-blog-2%7Edefault%7ECTRLIST%7Edefault-1-81532855-blog-84628976.pc_relevant_aa&depth_1-utm_source=distribute.pc_relevant_t0.none-task-blog-2%7Edefault%7ECTRLIST%7Edefault-1-81532855-blog-84628976.pc_relevant_aa&utm_relevant_index=1).

In [20]:
# Minimum and maximum latitude and longitude
Min_Lat_ERSST = -88
Max_Lat_ERSST = 88
Min_Lon_ERSST = 0
Max_Lon_ERSST = 359
# Grid spacing before and after interpolation
Step_LatLon_Before = 2
Step_LatLon_After = 1
# Build the coordinate grids
# meshgrid turns the 1-D latitude and longitude into 2-D coordinate matrices, as commonly needed for interpolation and plotting.
Lat_ERSST_mgrid, Lon_ERSST_mgrid = np.meshgrid(Lat_ERSST, Lon_ERSST)
print(Lat_ERSST_mgrid.shape, Lon_ERSST_mgrid.shape)
# Transpose so that the grids are ordered (lat, lon)
Lat_ERSST_mgrid = Lat_ERSST_mgrid.T
Lon_ERSST_mgrid = Lon_ERSST_mgrid.T

(180, 89) (180, 89)
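
The printed shape is (180, 89) rather than (89, 180) because `np.meshgrid` defaults to `indexing='xy'`, which puts the second argument along the first axis; that is why the cell transposes the result. The short sketch below, on toy coordinates, shows that passing `indexing='ij'` gives the (lat, lon) order directly. It is an alternative for illustration, not what the tutorial code does.

```python
import numpy as np

lat = np.array([88, 86, 84])      # 3 toy latitudes
lon = np.array([0, 2, 4, 6])      # 4 toy longitudes

# Default indexing='xy': shape is (len(lon), len(lat))
lat_xy, lon_xy = np.meshgrid(lat, lon)

# indexing='ij': shape is (len(lat), len(lon)), no transpose needed
lat_ij, lon_ij = np.meshgrid(lat, lon, indexing="ij")

same = np.array_equal(lat_xy.T, lat_ij) and np.array_equal(lon_xy.T, lon_ij)
print(lat_xy.shape, lat_ij.shape, same)
```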


In [21]:
# Source coordinates before interpolation, shape (89*180, 2)
LatLon_ERSST_Before = np.hstack(
    (Lat_ERSST_mgrid.reshape(-1, 1), Lon_ERSST_mgrid.reshape(-1, 1)))  # stack horizontally into two columns
# Source SST before interpolation, shape (1992, 89*180, 1)
SST_ERSST_Before = SST_ERSST.reshape((SST_ERSST.shape[0], -1, 1))
# With the source coordinates and SST ready, build the target coordinates
Lat_ERSST_After_1D = np.arange(
    Max_Lat_ERSST, Min_Lat_ERSST-1, -Step_LatLon_After)  # target latitudes
Lon_ERSST_After_1D = np.arange(
    Min_Lon_ERSST,
    Max_Lon_ERSST+1,
    Step_LatLon_After)  # target longitudes
Lat_ERSST_After_2D, Lon_ERSST_After_2D = np.meshgrid(
    Lat_ERSST_After_1D, Lon_ERSST_After_1D)  # build the grid
print(Lat_ERSST_After_2D.shape, Lon_ERSST_After_2D.shape)
Lat_ERSST_After_2D = Lat_ERSST_After_2D.T  # transpose to match the variable's (lat, lon) order
Lon_ERSST_After_2D = Lon_ERSST_After_2D.T

(360, 177) (360, 177)


In [22]:
# Interpolate SST with griddata
SST_ERSST_After = np.zeros([SST_ERSST.shape[0],
                            Lat_ERSST_After_2D.shape[0],
                            Lat_ERSST_After_2D.shape[1]])
# Interpolate the global observations month by month, and time the run
start_time = time.time()
for t in range(SST_ERSST.shape[0]):
    # griddata takes the source coordinates, the source values, the target
    # coordinates and the method; here the method is 'nearest'.
    SST_ERSST_After_Now = griddata(LatLon_ERSST_Before,
                                   SST_ERSST_Before[t, :, :],
                                   (Lat_ERSST_After_2D,
                                    Lon_ERSST_After_2D),
                                   method='nearest').squeeze()
    SST_ERSST_After[t, :, :] = SST_ERSST_After_Now
    if t % 100 == 0:  # print progress every 100 steps
        print(t)
end_time = time.time()
print(str(end_time - start_time) + "s")  # print the elapsed time of the loop

# Free variables that are no longer needed
del LatLon_ERSST_Before
del SST_ERSST_Before

0
100
200
300
400
500
600
700
800
900
1000
1100
1200
1300
1400
1500
1600
1700
1800
1900
84.58050012588501s


#### 2.2.2.2 Interpolating the model historical data
The interpolation is the same as for the observational data above.


In [23]:
## Minimum and maximum latitude and longitude
Min_Lat_Historical = -88
Max_Lat_Historical = 88
Min_Lon_Historical = 0
Max_Lon_Historical = 359
## Grid spacing after interpolation
Step_LatLon_After = 1
## The model coordinates are already 2-D grids
Lat_Historical_mgrid = Lat_Historical  
Lon_Historical_mgrid = Lon_Historical
## Source coordinates before interpolation, shape (384*320, 2)
LatLon_Historical_Before = np.hstack( ( Lat_Historical_mgrid.reshape(-1,1), Lon_Historical_mgrid.reshape(-1,1) ) ) 
SST_Historical_Before = SST_Historical.reshape( (SST_Historical.shape[0],-1,1) ) 
Lat_Historical_After_1D = np.arange(Max_Lat_Historical,Min_Lat_Historical-1,-Step_LatLon_After)
Lon_Historical_After_1D = np.arange(Min_Lon_Historical,Max_Lon_Historical+1,Step_LatLon_After)
Lat_Historical_After_2D, Lon_Historical_After_2D = np.meshgrid(Lat_Historical_After_1D, Lon_Historical_After_1D)
print(Lat_Historical_After_2D.shape, Lon_Historical_After_2D.shape)
Lat_Historical_After_2D = Lat_Historical_After_2D.T 
Lon_Historical_After_2D = Lon_Historical_After_2D.T

(360, 177) (360, 177)


In [24]:
# Interpolate SST with griddata
SST_Historical_After = np.zeros([SST_Historical.shape[0],
                                 Lat_Historical_After_2D.shape[0],
                                 Lat_Historical_After_2D.shape[1]])
# Loop over the months and time the run
start_time = time.time()
for t in range(SST_Historical.shape[0]):
    # griddata takes the source coordinates, the source values, the target
    # coordinates and the method; here the method is 'nearest'.
    SST_Historical_After_Now = griddata(LatLon_Historical_Before,
                                        SST_Historical_Before[t, :, :],
                                        (Lat_Historical_After_2D,
                                         Lon_Historical_After_2D),
                                        method='nearest').squeeze()
    # Set values with |SST| >= 50 (fill values) to NaN.
    # This must be an assignment (=); the original used ==, which only compares.
    SST_Historical_After_Now[np.abs(SST_Historical_After_Now) >= 50] = np.nan
    SST_Historical_After[t, :, :] = SST_Historical_After_Now
    if t % 100 == 0:  # print progress every 100 steps
        print(t)
end_time = time.time()
print(str(end_time - start_time) + "s")  # print the elapsed time

# Free variables that are no longer needed
del LatLon_Historical_Before
del SST_Historical_Before

0
100
200
300
400
500
600
700
800
900
1000
1100
1200
1300
1400
1500
1600
1700
1800
1900
174.06292486190796s
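
When flagging fill values in a field, it is easy to write a comparison (`==`) where an assignment (`=`) is intended; the comparison runs without error but leaves the array unchanged. The sketch below shows the difference on a toy array. The sample values are assumptions, with `1e20` standing in for the model's missing value printed earlier.

```python
import numpy as np

field = np.array([[17.5, 1e20],
                  [1e20, 18.0]])  # toy field with two fill values

compared = field.copy()
compared[np.abs(compared) >= 50] == np.nan   # comparison: result is discarded

assigned = field.copy()
assigned[np.abs(assigned) >= 50] = np.nan    # assignment: fill values become NaN

n_nan_compared = int(np.isnan(compared).sum())
n_nan_assigned = int(np.isnan(assigned).sum())
ocean_mean = float(np.nanmean(assigned))     # mean over the valid points only
print(n_nan_compared, n_nan_assigned, ocean_mean)
```

In this tutorial the global mean is unaffected either way, because the area-mean function keeps only points with `abs(X) < 50`, a test that both fill values and NaN fail.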


#### 2.2.2.3 Interpolating the model future projection data
We apply the same interpolation to the future projection data.


In [25]:
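# Minimum and maximum latitude and longitude
Min_Lat_SSP = -88
Max_Lat_SSP = 88
Min_Lon_SSP = 0
Max_Lon_SSP = 359
# Grid spacing after interpolation
Step_LatLon_After = 1
# The model coordinates are already 2-D grids
Lat_SSP_mgrid = Lat_SSP
Lon_SSP_mgrid = Lon_SSP
# Source coordinates before interpolation, shape (384*320, 2)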
LatLon_SSP_Before = np.hstack(
    (Lat_SSP_mgrid.reshape(-1, 1), Lon_SSP_mgrid.reshape(-1, 1)))
SST_SSP126_Before = SST_SSP126.reshape((SST_SSP126.shape[0], -1, 1))
SST_SSP245_Before = SST_SSP245.reshape((SST_SSP245.shape[0], -1, 1))
SST_SSP585_Before = SST_SSP585.reshape((SST_SSP585.shape[0], -1, 1))
Lat_SSP_After_1D = np.arange(Max_Lat_SSP, Min_Lat_SSP-1, -Step_LatLon_After)
Lon_SSP_After_1D = np.arange(Min_Lon_SSP, Max_Lon_SSP+1, Step_LatLon_After)
Lat_SSP_After_2D, Lon_SSP_After_2D = np.meshgrid(
    Lat_SSP_After_1D, Lon_SSP_After_1D)
print(Lat_SSP_After_2D.shape, Lon_SSP_After_2D.shape)
Lat_SSP_After_2D = Lat_SSP_After_2D.T
Lon_SSP_After_2D = Lon_SSP_After_2D.T


(360, 177) (360, 177)


In [26]:
# Interpolate SST with griddata
SST_SSP126_After = np.zeros(
    [SST_SSP126.shape[0], Lat_SSP_After_2D.shape[0],  Lat_SSP_After_2D.shape[1]])
SST_SSP245_After = np.zeros(
    [SST_SSP245.shape[0], Lat_SSP_After_2D.shape[0],  Lat_SSP_After_2D.shape[1]])
SST_SSP585_After = np.zeros(
    [SST_SSP585.shape[0], Lat_SSP_After_2D.shape[0],  Lat_SSP_After_2D.shape[1]])
# Loop over the months and time the run
start_time = time.time()
for t in range(SST_SSP126.shape[0]):
    # griddata takes the source coordinates, the source values, the target coordinates and the method; here the method is 'nearest'.
    # SSP1-2.6
    SST_SSP126_After_Now = griddata(LatLon_SSP_Before,
                                    SST_SSP126_Before[t,
                                                      :,
                                                      :],
                                    (Lat_SSP_After_2D,
                                     Lon_SSP_After_2D),
                                    method='nearest')
    SST_SSP126_After[t, :, :] = SST_SSP126_After_Now.squeeze()
    # SSP2-4.5
    SST_SSP245_After_Now = griddata(LatLon_SSP_Before,
                                    SST_SSP245_Before[t,
                                                      :,
                                                      :],
                                    (Lat_SSP_After_2D,
                                     Lon_SSP_After_2D),
                                    method='nearest')
    SST_SSP245_After[t, :, :] = SST_SSP245_After_Now.squeeze()
    # SSP5-8.5
    SST_SSP585_After_Now = griddata(LatLon_SSP_Before,
                                    SST_SSP585_Before[t,
                                                      :,
                                                      :],
                                    (Lat_SSP_After_2D,
                                     Lon_SSP_After_2D),
                                    method='nearest')
    SST_SSP585_After[t, :, :] = SST_SSP585_After_Now.squeeze()
    if t % 100 == 0:
        print(t)
end_time = time.time()
print(str(end_time - start_time) + "s")

# Free variables that are no longer needed
del LatLon_SSP_Before
del SST_SSP126_Before
del SST_SSP245_Before
del SST_SSP585_Before

0
100
200
300
400
500
600
700
800
900
1000
258.7564148902893s


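### 2.2.3 Computing the global mean SST
With the data on a uniform grid, computing the global mean SST gives the final result. Note that grid cells at different latitudes have different areas, so the mean must be area-weighted. A plain arithmetic mean would introduce a sizeable error.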
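
For a regular latitude-longitude grid, the area of a cell is proportional to the cosine of its latitude, so the weighted mean can be written as follows. The notation is introduced here for illustration: $X_{ij}$ is the value at latitude row $i$ and longitude column $j$, $\varphi_i$ is the latitude of row $i$, and $m_{ij}$ is 1 at valid ocean points and 0 elsewhere.

$$
\overline{X} = \frac{\sum_{i,j} m_{ij}\, X_{ij} \cos\varphi_i}{\sum_{i,j} m_{ij} \cos\varphi_i}
$$

The function defined in the next subsection evaluates this expression, taking a point as valid when $|X_{ij}| < 50$.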


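#### 2.2.3.1 Defining the area-mean function

We first define a function that computes the area mean of a variable so that it can be called directly later. For the derivation of the area-mean formula, see [Wei Meng (魏萌)'s blog post](https://blog.sciencenet.cn/blog-907194-687322.html).
To call this area-mean function, areamean2D, we only need to pass three arguments:
&emsp;&emsp;  X: the variable to average, here SST;
&emsp;&emsp;  lat: latitude, 1-D;
&emsp;&emsp;  lon: longitude, 1-D.

It returns the area mean of the input variable X.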

In [27]:
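# Define the area-mean function
def areamean2D(X, lat, lon):
    """
    Compute the area-weighted mean of a variable on a latitude-longitude grid.

    Input:
        X: 2-D variable to average, e.g. SST, ordered (lat, lon);
        lat: 1-D latitude in degrees;
        lon: 1-D longitude in degrees.
    Output:
        X_AreaMean: area-weighted mean of X over the valid (ocean) points.

    """
    import numpy as np

    Len_lat = lat.shape[0]
    Len_lon = lon.shape[0]
    lat_2D = lat.repeat(Len_lon).reshape((Len_lat, Len_lon))
    Weight = np.cos(np.pi/180*lat_2D)  # the key step: a weight for each latitude
    # 50 is a generous upper bound for a plausible SST; fill values and NaN fail this test
    Ocean = np.abs(X) < 50
    Weight_Ocean = Weight[Ocean]
    X_Ocean = X[Ocean]  # keep the ocean points only
    # Area-mean formula; np.sum is much faster than the built-in sum on arrays
    X_AreaMean = np.sum(X_Ocean * Weight_Ocean) / np.sum(Weight_Ocean)

    return X_AreaMean  # return the result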
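
To see how much the latitude weighting matters, the following sketch compares an arithmetic mean with a cosine-weighted mean on a toy 3×3 field. The latitudes and temperatures are made-up values chosen so that the result can be checked by hand; `np.average` with `weights` is used here as a compact equivalent of the weighted sum.

```python
import numpy as np

lat = np.array([60.0, 0.0, -60.0])   # toy latitudes in degrees
# Toy field: 10 °C along both 60° rows, 28 °C along the equator
X = np.array([[10.0, 10.0, 10.0],
              [28.0, 28.0, 28.0],
              [10.0, 10.0, 10.0]])

# cos(60°) = 0.5, cos(0°) = 1: equatorial cells count twice as much
weights = np.cos(np.deg2rad(lat))[:, None] * np.ones_like(X)

arithmetic_mean = X.mean()                       # (10 + 28 + 10) / 3 = 16
weighted_mean = np.average(X, weights=weights)   # (5 + 28 + 5) / 2 = 19
print(arithmetic_mean, weighted_mean)
```

The 3 °C gap in this toy case comes from the arithmetic mean over-counting the small high-latitude cells.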

#### 2.2.3.2 Global mean SST of the observations
With the area-mean function defined, we can call it to compute the global mean SST.


In [28]:
SST_ERSST_AreaMean = np.zeros(
    SST_ERSST_After.shape[0])  # preallocate an array for the observed global mean SST
# Compute the global mean of the observed SST month by month
start_time = time.time()
for t in range(SST_ERSST_After.shape[0]):
    # call areamean2D
    SST_ERSST_AreaMean[t] = areamean2D(
        SST_ERSST_After[t, :, :], Lat_ERSST_After_1D, Lon_ERSST_After_1D)
    if t % 100 == 0:  # print progress every 100 steps
        print(t)
end_time = time.time()
print(str(end_time - start_time) + "s")
del SST_ERSST_After
del Lat_ERSST_After_1D
del Lon_ERSST_After_1D

0
100
200
300
400
500
600
700
800
900
1000
1100
1200
1300
1400
1500
1600
1700
1800
1900
28.827313899993896s
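
The loop above calls the area-mean function once per month. As an alternative, the same weighted mean can be computed for all time steps at once with NumPy broadcasting. The function below is a sketch, not part of the original workflow: the name `areamean_all_times` and the `valid_max` parameter are hypothetical, and the demo data are made up so that the result can be verified by hand.

```python
import numpy as np


def areamean_all_times(X, lat, valid_max=50.0):
    """Area-weighted mean over (lat, lon) for every time step.

    X: array of shape (time, lat, lon); lat: 1-D latitudes in degrees.
    Points that are not finite or have |X| >= valid_max are ignored.
    """
    w = np.cos(np.deg2rad(lat))[None, :, None]         # broadcast over time and lon
    ocean = np.isfinite(X) & (np.abs(X) < valid_max)   # valid-point mask
    Xz = np.where(ocean, X, 0.0)                       # zero out invalid points
    num = (Xz * w).sum(axis=(1, 2))
    den = (ocean * w).sum(axis=(1, 2))
    return num / den


# Toy data: 2 time steps on a 3 x 2 grid
lat = np.array([60.0, 0.0, -60.0])
X = np.full((2, 3, 2), 20.0)
X[1, 1, :] = 26.0     # second step: warmer equator
X[0, 0, 0] = 1e20     # one fill value in the first step
means = areamean_all_times(X, lat)
print(means)
```

In the second step the weighted mean is (20·1 + 26·2 + 20·1) / 4 = 23, while the fill value in the first step is simply skipped.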


#### 2.2.3.3 Global mean SST of the model historical data
As with the observations, we simply call the function.

In [29]:
SST_Historical_AreaMean = np.zeros(SST_Historical_After.shape[0])
# Compute the global mean of the model historical SST month by month
start_time = time.time()
for t in range(SST_Historical_After.shape[0]):
    SST_Historical_AreaMean[t] = areamean2D(
        SST_Historical_After[t, :, :], Lat_Historical_After_1D, Lon_Historical_After_1D)
    if t % 100 == 0:  # print progress every 100 steps
        print(t)
end_time = time.time()
print(str(end_time - start_time) + "s")
del SST_Historical_After
del Lat_Historical_After_1D
del Lon_Historical_After_1D

0
100
200
300
400
500
600
700
800
900
1000
1100
1200
1300
1400
1500
1600
1700
1800
1900
27.72053289413452s


#### 2.2.3.4 Global mean SST of the model future projections
The steps are the same as for the observations and the model historical data.

In [30]:
SST_SSP126_AreaMean = np.zeros(SST_SSP126_After.shape[0])
SST_SSP245_AreaMean = np.zeros(SST_SSP245_After.shape[0])
SST_SSP585_AreaMean = np.zeros(SST_SSP585_After.shape[0])
start_time = time.time()
for t in range(SST_SSP126_After.shape[0]):
    SST_SSP126_AreaMean[t] = areamean2D(
        SST_SSP126_After[t, :, :], Lat_SSP_After_1D, Lon_SSP_After_1D)
    SST_SSP245_AreaMean[t] = areamean2D(
        SST_SSP245_After[t, :, :], Lat_SSP_After_1D, Lon_SSP_After_1D)
    SST_SSP585_AreaMean[t] = areamean2D(
        SST_SSP585_After[t, :, :], Lat_SSP_After_1D, Lon_SSP_After_1D)
    if t % 100 == 0:
        print(t)
end_time = time.time()
print(str(end_time - start_time) + "s")

0
100
200
300
400
500
600
700
800
900
1000
42.439735412597656s


#### 2.2.3.5 Saving the data
With the final results computed, remember to save the observed and modelled global mean SST.


In [31]:
# Set the save path and check whether the folder exists
Path_Pre = './project/'
Path_SaveData = Path_Pre + 'SaveData/Original/'
isExists = os.path.exists(Path_SaveData)
if not isExists:
    os.makedirs(Path_Pre + 'SaveData/Original')   # create the folder
    # folder created: print ' successfully build!'
    print(Path_SaveData + ' successfully build!')
else:
    # folder already exists: print ' has been existed!'
    print(Path_SaveData + ' has been existed!')

# Save as npy files, keeping the common period 185401-201412
Num_Year = 2014-1854+1
Num_Sample_SST = Num_Year*12
SST_ERSST_AreaMean = SST_ERSST_AreaMean[0:Num_Sample_SST]
SST_Historical_AreaMean = SST_Historical_AreaMean[-Num_Sample_SST::]
Time_ERSST = Time_ERSST[0:Num_Sample_SST]
np.save(Path_SaveData + 'SST_ERSST_AreaMean.npy', SST_ERSST_AreaMean)
np.save(Path_SaveData + 'SST_Historical_AreaMean.npy', SST_Historical_AreaMean)
np.save(Path_SaveData + 'SST_SSP126_AreaMean.npy', SST_SSP126_AreaMean)
np.save(Path_SaveData + 'SST_SSP245_AreaMean.npy', SST_SSP245_AreaMean)
np.save(Path_SaveData + 'SST_SSP585_AreaMean.npy', SST_SSP585_AreaMean)
np.save(Path_SaveData + 'Time_ERSST.npy', Time_ERSST)
np.save(Path_SaveData + 'Time_SSP.npy', Time_SSP)

# Save as mat files for the EEMD decomposition in the Octave environment later
io.savemat(Path_SaveData + 'SST_ERSST_AreaMean.mat',
           {'SST_ERSST_AreaMean': SST_ERSST_AreaMean})
io.savemat(Path_SaveData + 'SST_Historical_AreaMean.mat',
           {'SST_Historical_AreaMean': SST_Historical_AreaMean})
io.savemat(Path_SaveData + 'SST_SSP126_AreaMean.mat',
           {'SST_SSP126_AreaMean': SST_SSP126_AreaMean})
io.savemat(Path_SaveData + 'SST_SSP245_AreaMean.mat',
           {'SST_SSP245_AreaMean': SST_SSP245_AreaMean})
io.savemat(Path_SaveData + 'SST_SSP585_AreaMean.mat',
           {'SST_SSP585_AreaMean': SST_SSP585_AreaMean})

./project/SaveData/Original/ successfully build!
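
A quick way to confirm that a saved series can be read back unchanged is a save-and-load round trip. The sketch below does this in a temporary directory so that it does not touch the project folders; the file name `demo_series.npy` and the three sample values are made up for illustration. It also shows `os.makedirs(..., exist_ok=True)`, which creates a folder without failing if it already exists.

```python
import os
import tempfile

import numpy as np

series = np.array([17.9, 18.0, 18.1])   # stand-in for a global mean SST series

with tempfile.TemporaryDirectory() as tmp:
    save_dir = os.path.join(tmp, "SaveData", "Original")
    os.makedirs(save_dir, exist_ok=True)   # creates nested folders
    os.makedirs(save_dir, exist_ok=True)   # a second call is harmless
    path = os.path.join(save_dir, "demo_series.npy")
    np.save(path, series)
    restored = np.load(path)

roundtrip_ok = bool(np.array_equal(series, restored))
print(roundtrip_ok)
```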


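### 2.2.4 Plotting
Plotting lets us inspect how the observed and modelled global mean SST change over time, both for the historical period and for the model's future projections.


As before, we first set the save path.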

In [32]:
Path_Pre = './project/'
Path_SaveFigures = Path_Pre + 'Figures/Original/'
isExists = os.path.exists(Path_SaveFigures)
if not isExists:
    os.makedirs(Path_Pre + 'Figures/Original')
    print(Path_SaveFigures + ' successfully build!')
else:
    print(Path_SaveFigures + ' has been existed!')

./project/Figures/Original/ successfully build!


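#### 2.2.4.1 Time series of historical global mean SST: observations, model, and model bias (model minus observations)
We first look at the time series for the historical period.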

In [42]:
# Plot settings
markersizenum = 4  # marker size
Num_Year = 2014-1854+1
Num_Sample_SST = Num_Year*12
X_Time = np.arange(0, Num_Sample_SST)  # time index array
Time_SST = np.array(Time_Historical[-Num_Sample_SST::])
xtick_SST_interval = 20*12  # spacing of the x-axis ticks
# SST
SST_Historical_AreaMean = SST_Historical_AreaMean[-Num_Sample_SST::]
SST_ERSST_AreaMean = SST_ERSST_AreaMean[0:Num_Sample_SST]
xticklabel_SST = list(range(0, Num_Sample_SST, xtick_SST_interval))
print(xticklabel_SST)
xticklabels_SST = Time_SST[xticklabel_SST[0::]].astype('str').tolist()
ylim_sst_max = 19
ylim_sst_min = 16
delta_ylim_sst = ylim_sst_max - ylim_sst_min
yticklabel_SST = np.linspace(ylim_sst_min, ylim_sst_max, num=7).tolist()
yticklabel_SST = [round(i, 1) for i in yticklabel_SST]
print(yticklabel_SST)
yticklabels_SST = [str(i) for i in yticklabel_SST]
print(yticklabels_SST)
# SST bias
SST_Bias_AreaMean = SST_Historical_AreaMean - SST_ERSST_AreaMean
xticklabel_Bias = xticklabel_SST
xticklabels_Bias = xticklabels_SST
ylim_bias_max = 1
ylim_bias_min = -1
delta_ylim_bias = ylim_bias_max - ylim_bias_min
yticklabel_Bias = np.linspace(-1, 1, num=5).tolist()
yticklabel_Bias = [round(i, 1) for i in yticklabel_Bias]
print(yticklabel_Bias)
yticklabels_Bias = [str(i) for i in yticklabel_Bias]
print(yticklabels_Bias)
str_panels = ['a)ERSST ', 'b)Historical', 'c)SST Bias ']
# Plot
fig, ax = plt.subplots(3, 1, figsize=(16, 9), dpi=600)
# Observations: ERSST
ax = plt.subplot(3, 1, 1)
ax.plot(
    X_Time,
    SST_ERSST_AreaMean,
    'k-',
    linewidth=2,
    marker='.',
    markersize=markersizenum,
    mfc='black')
ax.set_xlim(0, Num_Sample_SST)
ax.set_ylim(ylim_sst_min, ylim_sst_max)
ax.set_xticks(xticklabel_SST)
ax.set_yticks(yticklabel_SST)
ax.set_xticklabels(xticklabels_SST)
ax.set_yticklabels(yticklabels_SST)
ax.text(1, ylim_sst_min + 0.9*delta_ylim_sst, str_panels[0])
ax.set_ylabel('SST(℃)')
ax.grid(linestyle='-.')
# Model: historical period
ax = plt.subplot(3, 1, 2)
ax.plot(
    X_Time,
    SST_Historical_AreaMean,
    'b-',
    linewidth=2,
    marker='.',
    markersize=markersizenum,
    mfc='blue')
ax.set_xlim(0, Num_Sample_SST)
ax.set_ylim(ylim_sst_min, ylim_sst_max)
ax.set_xticks(xticklabel_SST)
ax.set_yticks(yticklabel_SST)
ax.set_xticklabels(xticklabels_SST)
ax.set_yticklabels(yticklabels_SST)
ax.text(1, ylim_sst_min + 0.9*delta_ylim_sst, str_panels[1])
ax.set_ylabel('SST(℃)')
ax.grid(linestyle='-.')
# Model bias: model minus observations
ax = plt.subplot(3, 1, 3)
ax.plot(
    X_Time,
    SST_Bias_AreaMean,
    'b-',
    linewidth=2,
    marker='.',
    markersize=markersizenum,
    mfc='blue')
ax.set_xlim(0, Num_Sample_SST)
ax.set_ylim(ylim_bias_min, ylim_bias_max)
ax.set_xticks(xticklabel_Bias)
ax.set_yticks(yticklabel_Bias)
ax.set_xticklabels(xticklabels_Bias)
ax.set_yticklabels(yticklabels_Bias)
ax.text(1, ylim_bias_min + 0.9*delta_ylim_bias, str_panels[2])
ax.set_xlabel('Time')
ax.set_ylabel('SST Bias(℃)')
ax.grid(linestyle='-.')
# Save the figure
fns = os.path.join(Path_SaveFigures, 'SST_ERSST_Historical_Bias.png')
plt.savefig(fns, dpi=600, bbox_inches='tight')
plt.close()

[0, 240, 480, 720, 960, 1200, 1440, 1680, 1920]
[16.0, 16.5, 17.0, 17.5, 18.0, 18.5, 19.0]
['16.0', '16.5', '17.0', '17.5', '18.0', '18.5', '19.0']
[-1.0, -0.5, 0.0, 0.5, 1.0]
['-1.0', '-0.5', '0.0', '0.5', '1.0']


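The figure above shows, from top to bottom, the global-mean time series for 1854-01 to 2014-12 of a) the observed SST, b) the model-simulated SST, and c) the model SST bias (model minus observation). Both the observed and the simulated global-mean SST increase over time, but the model deviates from the observations by up to about ±1 °C, and the bias time series is nonlinear and non-stationary.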

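#### 2.2.4.2 Time series of projected global-mean SST under three future scenario experiments
Having plotted the historical period, we now look at the model's projected global-mean SST time series for the future.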


In [43]:
# Plot settings, as above
markersizenum = 4
Num_Year = 2100-2015+1
Num_Sample_SSP = Num_Year*12
X_Time = np.arange(0, Num_Sample_SSP)
Time_SSP = np.array(Time_SSP)
xtick_SSP_interval = 12*12
# SSP126
xticklabel_SSP = list(range(0, Num_Sample_SSP, xtick_SSP_interval))
print(xticklabel_SSP)
xticklabels_SSP = Time_SSP[xticklabel_SSP[0::]].astype('str').tolist()
ylim_ssp126_max = 22
ylim_ssp126_min = 18
delta_ylim_ssp126 = ylim_ssp126_max - ylim_ssp126_min
yticklabel_SSP126 = np.linspace(
    ylim_ssp126_min,
    ylim_ssp126_max,
    num=5).tolist()
yticklabel_SSP126 = [round(i, 1) for i in yticklabel_SSP126]
print(yticklabel_SSP126)
yticklabels_SSP126 = [str(i) for i in yticklabel_SSP126]
print(yticklabels_SSP126)
# SSP245
ylim_ssp245_max = 22
ylim_ssp245_min = 18
delta_ylim_ssp245 = ylim_ssp245_max - ylim_ssp245_min
yticklabel_SSP245 = np.linspace(
    ylim_ssp245_min,
    ylim_ssp245_max,
    num=5).tolist()
yticklabel_SSP245 = [round(i, 1) for i in yticklabel_SSP245]
print(yticklabel_SSP245)
yticklabels_SSP245 = [str(i) for i in yticklabel_SSP245]
print(yticklabels_SSP245)
# SSP585
ylim_ssp585_max = 22
ylim_ssp585_min = 18
delta_ylim_ssp585 = ylim_ssp585_max - ylim_ssp585_min
yticklabel_SSP585 = np.linspace(
    ylim_ssp585_min,
    ylim_ssp585_max,
    num=5).tolist()
yticklabel_SSP585 = [round(i, 1) for i in yticklabel_SSP585]
print(yticklabel_SSP585)
yticklabels_SSP585 = [str(i) for i in yticklabel_SSP585]
print(yticklabels_SSP585)
str_panels = ['a)SSP1-2.6 ', 'b)SSP2-4.5', 'c)SSP5-8.5']
# Plot
fig, ax = plt.subplots(3, 1, figsize=(4, 3), dpi=600)
# SSP126
ax = plt.subplot(3, 1, 1)
ax.plot(
    X_Time,
    SST_SSP126_AreaMean,
    'b-',
    linewidth=2,
    marker='.',
    markersize=markersizenum,
    mfc='blue')
ax.set_xlim(0, Num_Sample_SSP)
ax.set_ylim(ylim_ssp126_min, ylim_ssp126_max)
ax.set_xticks(xticklabel_SSP)
ax.set_yticks(yticklabel_SSP126)
ax.set_xticklabels(xticklabels_SSP)
ax.set_yticklabels(yticklabels_SSP126)
ax.text(1, ylim_ssp126_min + 0.9*delta_ylim_ssp126, str_panels[0])
ax.set_ylabel('SST(℃)')
ax.grid(linestyle='-.')
# SSP245
ax = plt.subplot(3, 1, 2)
ax.plot(
    X_Time,
    SST_SSP245_AreaMean,
    'b-',
    linewidth=2,
    marker='.',
    markersize=markersizenum,
    mfc='blue')
ax.set_xlim(0, Num_Sample_SSP)
ax.set_ylim(ylim_ssp245_min, ylim_ssp245_max)
ax.set_xticks(xticklabel_SSP)
ax.set_yticks(yticklabel_SSP245)
ax.set_xticklabels(xticklabels_SSP)
ax.set_yticklabels(yticklabels_SSP245)
ax.text(1, ylim_ssp245_min + 0.9*delta_ylim_ssp245, str_panels[1])
ax.set_ylabel('SST(℃)')
ax.grid(linestyle='-.')
# SSP585
ax = plt.subplot(3, 1, 3)
ax.plot(
    X_Time,
    SST_SSP585_AreaMean,
    'b-',
    linewidth=2,
    marker='.',
    markersize=markersizenum,
    mfc='blue')
ax.set_xlim(0, Num_Sample_SSP)
ax.set_ylim(ylim_ssp585_min, ylim_ssp585_max)
ax.set_xticks(xticklabel_SSP)
ax.set_yticks(yticklabel_SSP585)
ax.set_xticklabels(xticklabels_SSP)
ax.set_yticklabels(yticklabels_SSP585)
ax.text(1, ylim_ssp585_min + 0.9*delta_ylim_ssp585, str_panels[2])
ax.set_xlabel('Time')
ax.set_ylabel('SST(℃)')
ax.grid(linestyle='-.')
# Save the figure
fns = os.path.join(Path_SaveFigures, 'SST_SSP126_245_585.png')
plt.savefig(fns, dpi=600, bbox_inches='tight')
plt.close()

[0, 144, 288, 432, 576, 720, 864, 1008]
[18.0, 19.0, 20.0, 21.0, 22.0]
['18.0', '19.0', '20.0', '21.0', '22.0']
[18.0, 19.0, 20.0, 21.0, 22.0]
['18.0', '19.0', '20.0', '21.0', '22.0']
[18.0, 19.0, 20.0, 21.0, 22.0]
['18.0', '19.0', '20.0', '21.0', '22.0']


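The figure above shows, from top to bottom, the global-mean time series for 2015-01 to 2100-12 under a) the low-emission projection, b) the medium-emission projection, and c) the high-emission projection. All three projections show global-mean SST rising over time; the higher the emissions, the faster the rise and the higher the temperature at the end of this century.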
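
To put a number on "the faster the rise", one option is to fit a straight line to each scenario series and report the slope per decade. The snippet below is a minimal sketch under that assumption; `trend_per_decade` and the synthetic `demo_series` are hypothetical names for illustration and are not part of the notebook.

```python
import numpy as np

def trend_per_decade(monthly_series):
    """Least-squares linear trend of a monthly series, in units per decade."""
    months = np.arange(len(monthly_series))
    slope_per_month, _ = np.polyfit(months, monthly_series, 1)
    return slope_per_month * 120  # 120 months in a decade

# Synthetic stand-in for a scenario series: 86 years warming by 0.03 degC per year
n_months = 86 * 12
demo_series = 18.0 + 0.03 / 12 * np.arange(n_months)
demo_trend = trend_per_decade(demo_series)
print(round(demo_trend, 3))
```

In the notebook the same function could be applied to `SST_SSP126_AreaMean`, `SST_SSP245_AreaMean`, and `SST_SSP585_AreaMean` to compare the three scenarios, keeping in mind that a straight line is only a rough summary of a series that is not linear.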

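## Summary
In this chapter we introduced the SST data used in this case study and carried out the basic steps of reading, interpolating, computing the global mean, and plotting. With the global-mean SST in hand, we can move on to the next step: EEMD decomposition of the SST data.

### For the EEMD decomposition, see the EEMD.ipynb file (this part is implemented in Octave)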

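# EEMD post-processing

After obtaining the EEMD-decomposed data in the previous chapter, we need to determine the period of each decomposed component and then combine the components according to specific time scales, which amounts to imposing a physical constraint on the SST.
In this chapter we plot the decomposed SST from the previous chapter, compute the mean period of each component, and finally combine the components for use in the correction step.

**Note that our main environment is still Python3. Only the EEMD decomposition of the SST data in the previous chapter requires Octave; all other steps run in Python3.**
**For computing resources we suggest 2 cores / 8 GB CPU, with the image octave 测试镜像-song-v1 and a Python3 kernel.**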


## 4.1 Importing packages and data

In [35]:
import numpy as np
import scipy.io as sio
import matplotlib.pyplot as plt
import os

In [36]:
# Original data
Path_Data_Original = './project/SaveData/Original/'
Time_ERSST = np.load(Path_Data_Original + 'Time_ERSST.npy')
Time_SSP = np.load(Path_Data_Original + 'Time_SSP.npy')
Num_Year = 86  # 1929-2014
Num_Sample_SST = Num_Year*12
Time_SST = np.arange(0,Num_Sample_SST)
Time_ERSST = Time_ERSST[-Num_Sample_SST::]
# EEMD-decomposed data
Path_Data_EEMD = './project/SaveData/EEMD/'
IMFs_ERSST = sio.loadmat(Path_Data_EEMD + 'IMFs_ERSST.mat'); IMFs_ERSST = IMFs_ERSST['IMFs_ERSST']
IMFs_Historical = sio.loadmat(Path_Data_EEMD + 'IMFs_Historical.mat'); IMFs_Historical = IMFs_Historical['IMFs_Historical']
IMFs_SSP126 = sio.loadmat(Path_Data_EEMD + 'IMFs_SSP126.mat'); IMFs_SSP126 = IMFs_SSP126['IMFs_SSP126']
IMFs_SSP245 = sio.loadmat(Path_Data_EEMD + 'IMFs_SSP245.mat'); IMFs_SSP245 = IMFs_SSP245['IMFs_SSP245']
IMFs_SSP585 = sio.loadmat(Path_Data_EEMD + 'IMFs_SSP585.mat'); IMFs_SSP585 = IMFs_SSP585['IMFs_SSP585']
# Save as npy files
np.save(Path_Data_EEMD + 'IMFs_ERSST.npy', IMFs_ERSST)
np.save(Path_Data_EEMD + 'IMFs_Historical.npy', IMFs_Historical)
np.save(Path_Data_EEMD + 'IMFs_SSP126.npy', IMFs_SSP126)
np.save(Path_Data_EEMD + 'IMFs_SSP245.npy', IMFs_SSP245)
np.save(Path_Data_EEMD + 'IMFs_SSP585.npy', IMFs_SSP585)

## 4.2 Plotting the IMFs of the observed, historical, and projected SST

**Set the output path**

In [37]:
Path_Pre = './project/Figures/'
Path_SaveFigures =  Path_Pre + 'EEMD/'
isExists = os.path.exists(Path_SaveFigures)
if not isExists:
    os.makedirs(Path_Pre + 'EEMD/')
    print( Path_SaveFigures +  ' successfully build!')
else:
    print( Path_SaveFigures + ' has been existed!')

./project/Figures/EEMD/ successfully build!


### 4.2.1 Plotting the IMFs of the observed SST

In [44]:
NIMFs_ERSST = IMFs_ERSST.shape[0]
Max_IMFs_ERSST = np.max(IMFs_ERSST, 1)
Min_IMFs_ERSST = np.min(IMFs_ERSST, 1)
# Presets
fontsizenum = 16
Num_Year = 86
Num_Sample_SST = Num_Year*12
X_Time = np.arange(0, Num_Sample_SST)
Time_SST = np.array(Time_SST[-Num_Sample_SST::])
xtick_SST_interval = 7*12
xticklabel_SST = list(range(0, Num_Sample_SST, xtick_SST_interval))
xticklabels_SST = Time_ERSST[xticklabel_SST[0::]].astype('str').tolist()
# Plot
fig, ax = plt.subplots(NIMFs_ERSST, 1, figsize=(15, 18), dpi=600)
for n in range(NIMFs_ERSST):
    ax = plt.subplot(NIMFs_ERSST, 1, n + 1)
    ax.plot(X_Time, IMFs_ERSST[n, :], 'k-', linewidth=2)
    ax.set_xlim(0, Num_Sample_SST)
    ax.set_xticks(xticklabel_SST)
    ax.set_xticklabels([])
    yticks_all = np.round(ax.get_yticks(), 1)
    if yticks_all[0] == yticks_all[-1]:
        yticks_all[0] = -0.1
        yticks_all[-1] = 0.1
    yticks_end = np.array((yticks_all[0], np.round(
        (yticks_all[0] + yticks_all[-1]) / 2, 1), yticks_all[-1]))
    yticklabels_end = [str(i) for i in yticks_end]
    print(yticklabels_end)
    ax.set_yticks(yticks_end)
    ax.set_yticklabels(yticklabels_end, fontsize=fontsizenum)
    ax.set_ylabel('IMF' + str(n+1), fontsize=fontsizenum)
ax.set_xticks(xticklabel_SST)
ax.set_xticklabels(xticklabels_SST, fontsize=fontsizenum)
ax.set_xlabel('Time', fontsize=fontsizenum)
# Save the figure, then close it
fns = os.path.join(Path_SaveFigures, 'IMFs_ERSST.png')
plt.savefig(fns, dpi=600, bbox_inches='tight')
plt.close()

['-0.3', '-0.0', '0.2']
['-0.2', '0.0', '0.2']
['-0.1', '0.0', '0.2']
['-0.2', '0.0', '0.2']
['-0.1', '0.0', '0.1']
['-0.1', '0.0', '0.2']
['-0.1', '0.0', '0.1']
['-0.1', '-0.0', '0.0']
['-0.1', '0.0', '0.1']
['17.6', '18.0', '18.4']


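The figure above shows the 10 IMF time series of the observed SST for 1929-01 to 2014-12, IMF1 to IMF10 from top to bottom. The frequency decreases from IMF1 to IMF9, and IMF10 is the trend term, which shows that the observed global-mean SST has warmed over this 86-year record.

### 4.2.2 Plotting the IMFs of the model's historical SST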


In [45]:
NIMFs_Historical = IMFs_Historical.shape[0]
Max_IMFs_Historical = np.max(IMFs_Historical, 1)
Min_IMFs_Historical = np.min(IMFs_Historical, 1)
# Presets
fontsizenum = 16  
Num_Year = 86
Num_Sample_SST = Num_Year*12
X_Time = np.arange(0, Num_Sample_SST )
Time_SST = np.array(Time_SST[-Num_Sample_SST::])
xtick_SST_interval = 7*12
xticklabel_SST = list(range(0, Num_Sample_SST, xtick_SST_interval))
xticklabels_SST = Time_ERSST[xticklabel_SST[0::]].astype('str').tolist()
# Plot
fig, ax = plt.subplots(NIMFs_Historical, 1, figsize = (15,18), dpi = 600)
for n in range(NIMFs_Historical):
    ax = plt.subplot(NIMFs_Historical, 1, n + 1)
    ax.plot(X_Time, IMFs_Historical[n,:], 'b-', linewidth = 2)
    ax.set_xlim(0, Num_Sample_SST)
    ax.set_xticks(xticklabel_SST)
    ax.set_xticklabels([])
    yticks_all = np.round(ax.get_yticks(), 1)
    if yticks_all[0] == yticks_all[-1]:
        yticks_all[0] = -0.1
        yticks_all[-1] = 0.1
    yticks_end = np.array((yticks_all[0], np.round((yticks_all[0] + yticks_all[-1]) / 2, 1), yticks_all[-1]))
    yticklabels_end = [str(i) for i in yticks_end]
    print(yticklabels_end)
    ax.set_yticks(yticks_end)
    ax.set_yticklabels(yticklabels_end, fontsize = fontsizenum)
    ax.set_ylabel('IMF' + str(n+1), fontsize = fontsizenum)
ax.set_xticks(xticklabel_SST)
ax.set_xticklabels(xticklabels_SST, fontsize = fontsizenum)
ax.set_xlabel('Time', fontsize = fontsizenum)
# Save the figure, then close it
fns = os.path.join(Path_SaveFigures, 'IMFs_Historical.png')
plt.savefig(fns, dpi = 600, bbox_inches = 'tight') 
plt.close()


['-0.2', '0.0', '0.2']
['-0.4', '0.0', '0.4']
['-0.1', '0.0', '0.1']
['-0.1', '0.0', '0.1']
['-0.1', '0.0', '0.1']
['-0.1', '0.0', '0.1']
['-0.2', '0.0', '0.2']
['-0.1', '0.0', '0.1']
['-0.1', '0.0', '0.1']
['17.5', '18.0', '18.5']


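The figure above shows the 10 IMF time series of the model SST for 1929-01 to 2014-12, IMF1 to IMF10 from top to bottom. The frequency decreases from IMF1 to IMF9, and IMF10 is the trend term, which shows that the model also reproduces the warming of global-mean SST over this period.

### 4.2.3 Plotting the IMFs of the model's projected SST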


**SSP126**

In [46]:
NIMFs_SSP = IMFs_SSP126.shape[0]
Max_IMFs_SSP126 = np.max(IMFs_SSP126, 1)
Min_IMFs_SSP126 = np.min(IMFs_SSP126, 1)
# Presets
fontsizenum = 16  
Num_Year = 86
Num_Sample_SST = Num_Year*12
X_Time = np.arange(0, Num_Sample_SST )
Time_SST = np.array(Time_SST[-Num_Sample_SST::])
xtick_SST_interval = 7*12
xticklabel_SST = list(range(0, Num_Sample_SST, xtick_SST_interval))
xticklabels_SST = Time_SSP[xticklabel_SST[0::]].astype('str').tolist()
# Plot
fig, ax = plt.subplots(NIMFs_SSP, 1, figsize = (15,18), dpi = 600)
for n in range(NIMFs_SSP):
    ax = plt.subplot(NIMFs_SSP, 1, n + 1)
    ax.plot(X_Time, IMFs_SSP126[n,:], 'b-', linewidth = 2)
    ax.set_xlim(0, Num_Sample_SST)
    ax.set_xticks(xticklabel_SST)
    ax.set_xticklabels([])
    yticks_all = np.round(ax.get_yticks(), 1)
    if yticks_all[0] == yticks_all[-1]:
        yticks_all[0] = -0.1
        yticks_all[-1] = 0.1
    yticks_end = np.array( (yticks_all[0], np.round((yticks_all[0] + yticks_all[-1]) / 2, 1), yticks_all[-1]) )
    yticklabels_end = [str(i) for i in yticks_end]
    print(yticklabels_end)
    ax.set_yticks(yticks_end)
    ax.set_yticklabels(yticklabels_end, fontsize = fontsizenum)
    ax.set_ylabel('IMF' + str(n+1), fontsize = fontsizenum)
ax.set_xticks(xticklabel_SST)
ax.set_xticklabels(xticklabels_SST, fontsize = fontsizenum)
ax.set_xlabel('Time', fontsize = fontsizenum)
# Save the figure, then close it
fns = os.path.join(Path_SaveFigures, 'IMFs_SSP126.png')
plt.savefig(fns, dpi = 600, bbox_inches = 'tight') 
plt.close()


['-0.2', '0.0', '0.2']
['-0.4', '0.0', '0.4']
['-0.1', '0.0', '0.1']
['-0.1', '0.0', '0.1']
['-0.1', '0.0', '0.1']
['-0.1', '0.0', '0.1']
['-0.1', '0.0', '0.1']
['-0.1', '0.0', '0.1']
['-0.1', '0.0', '0.1']
['18.4', '18.8', '19.2']


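The figure above shows the 10 IMF time series of the projected SST under the low-emission scenario for 2015-01 to 2100-12, IMF1 to IMF10 from top to bottom. The frequency decreases from IMF1 to IMF9, and IMF10 is the trend term. The projection shows global-mean SST continuing to warm through the end of the century.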

**SSP245**

In [47]:
Max_IMFs_SSP245 = np.max(IMFs_SSP245, 1)
Min_IMFs_SSP245 = np.min(IMFs_SSP245, 1)
# Plot
fig, ax = plt.subplots(NIMFs_SSP, 1, figsize = (15,18), dpi = 600)
for n in range(NIMFs_SSP):
    ax = plt.subplot(NIMFs_SSP, 1, n + 1)
    ax.plot(X_Time, IMFs_SSP245[n,:], 'b-', linewidth = 2)
    ax.set_xlim(0, Num_Sample_SST)
    ax.set_xticks(xticklabel_SST)
    ax.set_xticklabels([])
    yticks_all = np.round(ax.get_yticks(), 1)
    if yticks_all[0] == yticks_all[-1]:
        yticks_all[0] = -0.1
        yticks_all[-1] = 0.1
    yticks_end = np.array( (yticks_all[0], np.round((yticks_all[0] + yticks_all[-1]) / 2, 1), yticks_all[-1]) )
    yticklabels_end = [str(i) for i in yticks_end]
    print(yticklabels_end)
    ax.set_yticks(yticks_end)
    ax.set_yticklabels(yticklabels_end, fontsize = fontsizenum)
    ax.set_ylabel('IMF' + str(n+1), fontsize = fontsizenum)
ax.set_xticks(xticklabel_SST)
ax.set_xticklabels(xticklabels_SST, fontsize = fontsizenum)
ax.set_xlabel('Time', fontsize = fontsizenum)
# Save the figure, then close it
fns = os.path.join(Path_SaveFigures, 'IMFs_SSP245.png')
plt.savefig(fns, dpi = 600, bbox_inches = 'tight') 
plt.close()


['-0.2', '-0.0', '0.1']
['-0.4', '0.0', '0.4']
['-0.2', '0.0', '0.2']
['-0.1', '0.0', '0.2']
['-0.1', '0.0', '0.1']
['-0.1', '0.0', '0.1']
['-0.1', '-0.0', '0.0']
['-0.1', '0.0', '0.1']
['-0.1', '0.0', '0.1']
['18.0', '19.0', '20.0']


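The figure above shows the 10 IMF time series of the projected SST under the medium-emission scenario for 2015-01 to 2100-12, IMF1 to IMF10 from top to bottom. The frequency decreases from IMF1 to IMF9, and IMF10 is the trend term. The projection shows global-mean SST continuing to warm through the end of the century, by more than under the low-emission scenario.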

**SSP585**

In [48]:
Max_IMFs_SSP585 = np.max(IMFs_SSP585, 1)
Min_IMFs_SSP585 = np.min(IMFs_SSP585, 1)
# Plot
fig, ax = plt.subplots(NIMFs_SSP, 1, figsize = (15,18), dpi = 600)
for n in range(NIMFs_SSP):
    ax = plt.subplot(NIMFs_SSP, 1, n + 1)
    ax.plot(X_Time, IMFs_SSP585[n,:], 'b-', linewidth = 2)
    ax.set_xlim(0, Num_Sample_SST)
    ax.set_xticks(xticklabel_SST)
    ax.set_xticklabels([])
    yticks_all = np.round(ax.get_yticks(), 1)
    if yticks_all[0] == yticks_all[-1]:
        yticks_all[0] = -0.1
        yticks_all[-1] = 0.1
    yticks_end = np.array( (yticks_all[0], np.round((yticks_all[0] + yticks_all[-1]) / 2, 1), yticks_all[-1]) )
    yticklabels_end = [str(i) for i in yticks_end]
    print(yticklabels_end)
    ax.set_yticks(yticks_end)
    ax.set_yticklabels(yticklabels_end, fontsize = fontsizenum)
    ax.set_ylabel('IMF' + str(n+1), fontsize = fontsizenum)
ax.set_xticks(xticklabel_SST)
ax.set_xticklabels(xticklabels_SST, fontsize = fontsizenum)
ax.set_xlabel('Time', fontsize = fontsizenum)
# Save the figure, then close it
fns = os.path.join(Path_SaveFigures, 'IMFs_SSP585.png')
plt.savefig(fns, dpi = 600, bbox_inches = 'tight') 
plt.close()



['-0.2', '0.0', '0.2']
['-0.2', '0.0', '0.2']
['-0.2', '0.0', '0.2']
['-0.1', '0.0', '0.1']
['-0.1', '0.0', '0.1']
['-0.1', '0.0', '0.1']
['-0.1', '0.0', '0.1']
['-0.1', '0.0', '0.1']
['-0.2', '0.0', '0.2']
['18.0', '20.0', '22.0']


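The figure above shows the 10 IMF time series of the projected SST under the high-emission scenario for 2015-01 to 2100-12, IMF1 to IMF10 from top to bottom. The frequency decreases from IMF1 to IMF9, and IMF10 is the trend term. The projection shows global-mean SST continuing to warm, with the largest increase of the three scenarios: nearly 4 °C above the 2015 level by the end of the century.

## 4.3 Computing the mean period of the EEMD components
**We compute the mean period of each IMF from its number of zero crossings: with N monthly samples and k zero crossings, the mean period is 2N/k months, because one full cycle contains two zero crossings. Pay particular attention to the last two IMFs, for which this formula can introduce large errors.**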


In [49]:
# ERSST
Num_CrossZero_ERSST = np.zeros(NIMFs_ERSST - 1)
MeanPeriod_IMFs_ERSST = np.zeros(NIMFs_ERSST - 1)
MeanYearPeriod_IMFs_ERSST = np.zeros(NIMFs_ERSST - 1)
for n in range(NIMFs_ERSST - 1):
    k = 0
    for i in range(len(Time_SST) - 1):
        if IMFs_ERSST[n,i] * IMFs_ERSST[n,i + 1] < 0:  # count the zero crossings
            k = k + 1
    Num_CrossZero_ERSST[n] = k
    MeanPeriod_IMFs_ERSST[n] = len(Time_SST) * 2 / k  # mean period from the number of zero crossings, in months
    MeanYearPeriod_IMFs_ERSST[n] = len(Time_SST) * 2 / k / 12  # converted to years
# Print the mean period of each IMF of the observed SST
print(MeanYearPeriod_IMFs_ERSST)

# Historical
NIMFs_Historical = IMFs_Historical.shape[0]
Num_CrossZero_Historical = np.zeros(NIMFs_Historical - 1)
MeanPeriod_IMFs_Historical = np.zeros(NIMFs_Historical - 1)
MeanYearPeriod_IMFs_Historical = np.zeros(NIMFs_Historical - 1)
for n in range(NIMFs_Historical - 1):
    k = 0
    for i in range(len(Time_SST)-1):
        if IMFs_Historical[n,i] * IMFs_Historical[n,i + 1] < 0:
            k = k + 1
    Num_CrossZero_Historical[n] = k
    MeanPeriod_IMFs_Historical[n] = len(Time_SST) * 2 / k
    MeanYearPeriod_IMFs_Historical[n] = len(Time_SST) * 2 / k / 12
# Print the mean period of each IMF of the model's historical SST
print(MeanYearPeriod_IMFs_Historical)

# SSP126/SSP245/SSP585
NIMFs_SSP = IMFs_SSP126.shape[0]
Num_CrossZero_SSP126 = np.zeros(NIMFs_SSP - 1)
Num_CrossZero_SSP245 = np.zeros(NIMFs_SSP - 1)
Num_CrossZero_SSP585 = np.zeros(NIMFs_SSP - 1)
MeanPeriod_IMFs_SSP126 = np.zeros(NIMFs_SSP - 1)
MeanPeriod_IMFs_SSP245 = np.zeros(NIMFs_SSP - 1)
MeanPeriod_IMFs_SSP585 = np.zeros(NIMFs_SSP - 1)
MeanYearPeriod_IMFs_SSP126 = np.zeros(NIMFs_SSP - 1)
MeanYearPeriod_IMFs_SSP245 = np.zeros(NIMFs_SSP - 1)
MeanYearPeriod_IMFs_SSP585 = np.zeros(NIMFs_SSP - 1)
for n in range(NIMFs_SSP - 1):
    # SSP126
    k = 0
    for i in range(len(Time_SST)-1):
        if IMFs_SSP126[n,i] * IMFs_SSP126[n,i + 1] < 0:
            k = k + 1
    Num_CrossZero_SSP126[n] = k
    MeanPeriod_IMFs_SSP126[n] = len(Time_SST) * 2 / k
    MeanYearPeriod_IMFs_SSP126[n] = len(Time_SST) * 2 / k / 12

    # SSP245
    k = 0
    for i in range(len(Time_SST) - 1):
        if IMFs_SSP245[n, i] * IMFs_SSP245[n, i + 1] < 0:
            k = k + 1
    Num_CrossZero_SSP245[n] = k
    MeanPeriod_IMFs_SSP245[n] = len(Time_SST) * 2 / k
    MeanYearPeriod_IMFs_SSP245[n] = len(Time_SST) * 2 / k / 12

    # SSP585
    k = 0
    for i in range(len(Time_SST) - 1):
        if IMFs_SSP585[n, i] * IMFs_SSP585[n, i + 1] < 0:
            k = k + 1
    Num_CrossZero_SSP585[n] = k
    MeanPeriod_IMFs_SSP585[n] = len(Time_SST) * 2 / k
    MeanYearPeriod_IMFs_SSP585[n] = len(Time_SST) * 2 / k / 12
# Print the mean period of each IMF of the model's projected SST
print(MeanYearPeriod_IMFs_SSP126)
print(MeanYearPeriod_IMFs_SSP245)
print(MeanYearPeriod_IMFs_SSP585)

[ 0.49002849  0.88659794  1.14666667  3.51020408  6.61538462 13.23076923
 34.4        43.         86.        ]
[ 0.47645429  1.00584795  1.4214876   4.0952381   7.47826087 19.11111111
 43.         43.         86.        ]
[ 0.48725212  1.00584795  1.4214876   4.77777778  6.88       15.63636364
 43.         86.         86.        ]
[ 0.46112601  0.89119171  1.00584795  3.51020408  4.91428571 10.11764706
 24.57142857 43.         86.        ]
[ 0.37310195  0.57718121  1.00584795  2.6875      4.91428571 10.11764706
 43.         86.         86.        ]
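
The zero-crossing count above can also be written without explicit loops. The following is a minimal sketch, assuming the same 2N/k formula; `mean_period_years` and the synthetic `demo_imf` are hypothetical names, and returning infinity when no crossing is found is a design choice of this sketch, not something the notebook does.

```python
import numpy as np

def mean_period_years(imf, months_per_year=12):
    """Mean period of one IMF estimated from its zero crossings."""
    imf = np.asarray(imf, dtype=float)
    # A sign change between neighbouring samples counts as one zero crossing
    n_cross = int(np.sum(imf[:-1] * imf[1:] < 0))
    if n_cross == 0:
        return np.inf  # no oscillation detected within the record
    # One full cycle contains two zero crossings
    return len(imf) * 2 / n_cross / months_per_year

t = np.arange(86 * 12)
demo_imf = np.sin(2 * np.pi * (t + 0.5) / 48)  # synthetic cycle with a 4-year period
demo_period = mean_period_years(demo_imf)
print(demo_period)
```

The printed periods of 86 and 43 years in the output above correspond to only 2 and 4 zero crossings in a 1032-month record, so for the lowest-frequency IMFs the estimate is set largely by the record length rather than by the signal. This is one way to see why the last IMFs deserve caution.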


## 4.4 Combining the IMFs by mean period

We group the IMFs according to the following 6 time scales:
&emsp;&emsp; 1) Seasonal: 3 to 12 months
&emsp;&emsp; 2) Annual: about 12 months
&emsp;&emsp; 3) Interannual: 1 to 10 years
&emsp;&emsp; 4) Decadal: about 10 years
&emsp;&emsp; 5) Interdecadal: longer than 10 years
&emsp;&emsp; 6) Long-term (including the trend term): longer than 80 years

In [50]:
# ERSST
IMFs_ERSST_Compose = np.zeros( (6, Num_Sample_SST) )
IMFs_ERSST_Compose[0,:] = IMFs_ERSST[0,:]  # IMF1
IMFs_ERSST_Compose[1,:] = np.sum( IMFs_ERSST[1:3,:], 0 )  # IMF2-3
IMFs_ERSST_Compose[2,:] = np.sum( IMFs_ERSST[3:5,:], 0 )  # IMF4-5
IMFs_ERSST_Compose[3,:] = IMFs_ERSST[5,:]  # IMF6
IMFs_ERSST_Compose[4,:] = np.sum( IMFs_ERSST[6:8,:], 0 )  # IMF7-8
IMFs_ERSST_Compose[5,:] = np.sum( IMFs_ERSST[8:10,:], 0 ) # IMF9 plus the final trend term
# Historical
IMFs_Historical_Compose = np.zeros( (6, Num_Sample_SST) )
IMFs_Historical_Compose[0,:] = IMFs_Historical[0,:] 
IMFs_Historical_Compose[1,:] = np.sum( IMFs_Historical[1:3,:], 0 )  
IMFs_Historical_Compose[2,:] = np.sum( IMFs_Historical[3:5,:], 0 )  
IMFs_Historical_Compose[3,:] = IMFs_Historical[5,:]  
IMFs_Historical_Compose[4,:] = np.sum( IMFs_Historical[6:8,:], 0 )
IMFs_Historical_Compose[5,:] = np.sum( IMFs_Historical[8:10,:], 0 )  

## SSP126/SSP245/SSP585
# SSP126
IMFs_SSP126_Compose = np.zeros( (6, Num_Sample_SST) )
IMFs_SSP126_Compose[0,:] = IMFs_SSP126[0,:]  
IMFs_SSP126_Compose[1,:] = np.sum( IMFs_SSP126[1:3,:], 0 ) 
IMFs_SSP126_Compose[2,:] = np.sum( IMFs_SSP126[3:5,:], 0 )  
IMFs_SSP126_Compose[3,:] = IMFs_SSP126[5,:]  
IMFs_SSP126_Compose[4,:] = np.sum( IMFs_SSP126[6:8,:], 0 )  
IMFs_SSP126_Compose[5,:] = np.sum( IMFs_SSP126[8:10,:], 0 )  
# SSP245
IMFs_SSP245_Compose = np.zeros( (6, Num_Sample_SST) )
IMFs_SSP245_Compose[0,:] = IMFs_SSP245[0,:] 
IMFs_SSP245_Compose[1,:] = np.sum( IMFs_SSP245[1:3,:], 0 )  
IMFs_SSP245_Compose[2,:] = np.sum( IMFs_SSP245[3:5,:], 0 )  
IMFs_SSP245_Compose[3,:] = IMFs_SSP245[5,:]  
IMFs_SSP245_Compose[4,:] = np.sum( IMFs_SSP245[6:8,:], 0 )  
IMFs_SSP245_Compose[5,:] = np.sum( IMFs_SSP245[8:10,:], 0 )  
# SSP585
IMFs_SSP585_Compose = np.zeros( (6, Num_Sample_SST) )
IMFs_SSP585_Compose[0,:] = IMFs_SSP585[0,:]  
IMFs_SSP585_Compose[1,:] = np.sum( IMFs_SSP585[1:3,:], 0 )  
IMFs_SSP585_Compose[2,:] = np.sum( IMFs_SSP585[3:5,:], 0 )  
IMFs_SSP585_Compose[3,:] = IMFs_SSP585[5,:]  
IMFs_SSP585_Compose[4,:] = np.sum( IMFs_SSP585[6:8,:], 0 )  
IMFs_SSP585_Compose[5,:] = np.sum( IMFs_SSP585[8:10,:], 0 )  
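
The same grouping is applied to all five datasets, so it can be expressed once as a list of row ranges. The snippet below is a sketch of that idea; `GROUPS`, `compose_imfs`, and the synthetic `demo_imfs` are hypothetical names, and the row ranges are copied from the manual grouping above.

```python
import numpy as np

# (start, stop) row ranges of the IMFs summed into each of the 6 components,
# copied from the manual grouping used in this section
GROUPS = [(0, 1), (1, 3), (3, 5), (5, 6), (6, 8), (8, 10)]

def compose_imfs(imfs, groups=GROUPS):
    """Sum consecutive IMF rows into combined components."""
    return np.stack([imfs[start:stop].sum(axis=0) for start, stop in groups])

rng = np.random.default_rng(0)
demo_imfs = rng.normal(size=(10, 1032))  # synthetic stand-in for an IMF array
demo_compose = compose_imfs(demo_imfs)
print(demo_compose.shape)
# The components partition the IMFs, so they add up to the same total signal
print(np.allclose(demo_compose.sum(axis=0), demo_imfs.sum(axis=0)))
```

Because every IMF row belongs to exactly one group, summing the 6 components recovers the sum of all IMFs, which is a quick check that no IMF was dropped or counted twice.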


**Save the data**

In [51]:
Path_SaveData = './project/SaveData/EEMD/'
np.save(Path_SaveData + 'IMFs_ERSST_Compose.npy', IMFs_ERSST_Compose)
np.save(Path_SaveData + 'IMFs_Historical_Compose.npy', IMFs_Historical_Compose)
np.save(Path_SaveData + 'IMFs_SSP126_Compose.npy', IMFs_SSP126_Compose)
np.save(Path_SaveData + 'IMFs_SSP245_Compose.npy', IMFs_SSP245_Compose)
np.save(Path_SaveData + 'IMFs_SSP585_Compose.npy', IMFs_SSP585_Compose)

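## Summary
In this chapter we plotted the IMFs of the observations and the model, computed the mean period of each IMF, and combined the IMFs by time scale into the final 6 combined time series. Next, we use a machine learning model to correct the SST.

# SST correction and projection
With the EEMD output post-processed, we can now carry out the correction.
This chapter addresses the following questions:
&emsp;&emsp; 1) Why do we choose the BPNN model to correct sea surface temperature?
&emsp;&emsp; 2) How must the data be prepared for the BPNN correction model? How does the correction of the historical model bias relate to the future projection?
&emsp;&emsp; 3) How do we evaluate the correction?

**Here 2 cores / 8 GB CPU is sufficient, with the image octave 测试镜像-song-v1 and a Python3 kernel.**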


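## 5.1 Introduction to the BPNN machine learning model

A [BPNN (Back Propagation Neural Network)](https://blog.csdn.net/cufewxy1/article/details/80445023?ops_request_misc=%257B%2522request%255Fid%2522%253A%2522165906166016781685332684%2522%252C%2522scm%2522%253A%252220140713.130102334..%2522%257D&request_id=165906166016781685332684&biz_id=0&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduend~default-2-80445023-null-null.142^v35^new_blog_fixed_pos&utm_term=bpnn%E6%A8%A1%E5%9E%8B&spm=1018.2226.3001.4187) generally consists of an input layer, hidden layers, and an output layer. Typically there is one input layer and one output layer, and at least one hidden layer. Each layer contains a number of neurons, and multilayer networks built from different numbers of neurons have strong nonlinear expressive power. The structure of a three-layer BPNN is shown in the figure below. As noted in the background and as the SST bias plots in the previous chapters show, the model SST bias does not vary linearly but has nonlinear characteristics. The nonlinear expressive power of a BPNN lets it learn the patterns in the bias between model and observation and thereby correct it, which is why we choose this model for the correction that follows.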

![Image Name](https://cdn.kesci.com/upload/image/rfj5o9e7hp.png?imageView2/0/w/960/h/960)


Schematic of a three-layer neural network
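
To make the three-layer structure concrete, here is a minimal NumPy sketch of the forward pass only (no training). It assumes 12 inputs and 12 outputs, as used later in this chapter, a ReLU hidden layer, and a linear output layer; the hidden size of 8, the random weights, and all names are hypothetical choices for illustration.

```python
import numpy as np

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: input layer -> ReLU hidden layer -> linear output layer."""
    hidden = np.maximum(0.0, x @ w_hidden + b_hidden)  # ReLU supplies the nonlinearity
    return hidden @ w_out + b_out

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 12, 8, 12  # 12 months in, 12 months out; 8 is arbitrary
w_hidden = rng.normal(scale=0.1, size=(n_in, n_hidden))
b_hidden = np.zeros(n_hidden)
w_out = rng.normal(scale=0.1, size=(n_hidden, n_out))
b_out = np.zeros(n_out)

demo_x = rng.normal(size=(70, n_in))  # 70 synthetic samples, one row per year
demo_y = forward(demo_x, w_hidden, b_hidden, w_out, b_out)
print(demo_y.shape)
```

Back propagation then adjusts `w_hidden`, `b_hidden`, `w_out`, and `b_out` to reduce the error between `demo_y` and the targets; in this notebook that part is handled by the library rather than written by hand.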

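## 5.2 Correcting the historical global-mean SST with the BPNN model
With the BPNN model introduced, we now use it to correct the SST. The work is divided into the following parts:
&emsp;&emsp; 1) Import packages: we use the Sklearn library, including the normalization function and the model class.
&emsp;&emsp; 2) Import data: load the combined model and observation data.
&emsp;&emsp; 3) Split the dataset: following common machine learning practice, we randomly split the data into training, validation, and test sets so that the three sets are similarly distributed.
&emsp;&emsp; 4) Preallocate the normalization arrays: allocate the arrays that will hold the normalized data.
&emsp;&emsp; 5) Build and train the model: the model is built with MLPRegressor from Sklearn, and the hidden-layer size is tuned according to the validation error.
&emsp;&emsp; 6) Process and save the model output.

### 5.2.1 Importing packages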

In [53]:
import numpy as np
import random
from sklearn.preprocessing import MinMaxScaler  # min-max normalization
from sklearn.neural_network import MLPRegressor # multilayer perceptron regressor, used here as the BPNN
import time
import pickle
import os
import scipy.stats
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression # linear regression, used at the end to estimate the SST increase

# Root-mean-square error (RMSE)
def rmse(predictions, targets):
    return np.sqrt( ( (predictions - targets)**2 ).mean() )

### 5.2.2 Importing data

In [54]:
# Combined components obtained from the EEMD decomposition
Path_EEMDData = './project/SaveData/EEMD/'
IMFs_ERSST_Compose = np.load(Path_EEMDData + 'IMFs_ERSST_Compose.npy')
IMFs_Historical_Compose = np.load(Path_EEMDData + 'IMFs_Historical_Compose.npy')
IMFs_SSP126_Compose = np.load(Path_EEMDData + 'IMFs_SSP126_Compose.npy')
IMFs_SSP245_Compose = np.load(Path_EEMDData + 'IMFs_SSP245_Compose.npy')
IMFs_SSP585_Compose = np.load(Path_EEMDData + 'IMFs_SSP585_Compose.npy')

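### 5.2.3 Splitting the dataset
We split the data by year **randomly into training, validation, and test sets**, with 70, 8, and 8 years respectively. The arrays are reshaped to (number of components, number of years, 12) so that they are ready to be fed to the model.

The number of components is the number of IMF combinations obtained by grouping the EEMD output by time scale. Later we build and train a separate correction model for each combination, and finally sum all corrected combinations to obtain the final correction.

**Model input and output: the input layer has 12 neurons, one per calendar month, each receiving the model values of that month across the years; the output layer also has 12 neurons, each producing the observed values of that month across the years. The number of hidden-layer neurons is a hyperparameter: we try several values, compare the validation error, and keep the number of neurons that gives the smallest error as the final network structure.**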

In [55]:
## Random split of the dataset
# Number of years assigned to each of the 3 sets
Num_Year= 86
Num_Year_Train = 70
Num_Year_Validate = 8
Num_Year_Test = 8
# Fix the random seed so that every run gives the same split
random.seed(1)
Index_Year = list( range(Num_Year) )
Index_Year_Random = list( range(Num_Year) )
random.shuffle(Index_Year_Random) # shuffle the year indices; the split below is based on them
Index_Year_Train = Index_Year_Random[0:Num_Year_Train] # training-set year indices
Index_Year_Validate = Index_Year_Random[Num_Year_Train:Num_Year_Train + Num_Year_Validate] # validation-set year indices
Index_Year_Test = Index_Year_Random[-Num_Year_Test::]  # test-set year indices
## Datasets: reshape to 3D arrays of shape (components, years, 12)
IMFs_ERSST_Compose_3D = IMFs_ERSST_Compose.reshape((6, Num_Year, 12)) 
IMFs_Historical_Compose_3D = IMFs_Historical_Compose.reshape((6, Num_Year, 12))
IMFs_SSP126_Compose_3D = IMFs_SSP126_Compose.reshape((6, Num_Year, 12))
IMFs_SSP245_Compose_3D = IMFs_SSP245_Compose.reshape((6, Num_Year, 12))
IMFs_SSP585_Compose_3D = IMFs_SSP585_Compose.reshape((6, Num_Year, 12))
# Split the observations (ERSST) into the three sets
IMFs_ERSST_Compose_3D_Train = IMFs_ERSST_Compose_3D[:,Index_Year_Train, :]
IMFs_ERSST_Compose_3D_Validate = IMFs_ERSST_Compose_3D[:,Index_Year_Validate, :]
IMFs_ERSST_Compose_3D_Test = IMFs_ERSST_Compose_3D[:,Index_Year_Test, :]
IMFs_ERSST_Compose_2D_Train = IMFs_ERSST_Compose_3D_Train.reshape((6,-1)) # training set flattened to 2D to compute the error of each component
IMFs_ERSST_Compose_2D_Validate = IMFs_ERSST_Compose_3D_Validate.reshape((6,-1)) # validation set flattened to 2D to compute the error of each component
IMFs_ERSST_Compose_2D_Test = IMFs_ERSST_Compose_3D_Test.reshape((6,-1)) # test set flattened to 2D to compute the error of each component
# Model historical data, split in the same way as the observations
IMFs_Historical_Compose_3D_Train = IMFs_Historical_Compose_3D[:,Index_Year_Train, :]
IMFs_Historical_Compose_3D_Validate = IMFs_Historical_Compose_3D[:,Index_Year_Validate, :]
IMFs_Historical_Compose_3D_Test = IMFs_Historical_Compose_3D[:,Index_Year_Test, :]
IMFs_Historical_Compose_2D_Train = IMFs_Historical_Compose_3D_Train.reshape((6,-1))
IMFs_Historical_Compose_2D_Validate = IMFs_Historical_Compose_3D_Validate.reshape((6,-1))
IMFs_Historical_Compose_2D_Test = IMFs_Historical_Compose_3D_Test.reshape((6,-1))
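
The reshape and the year-based split can be checked on a small stand-in array. The sketch below assumes the same 70/8/8 split; it uses NumPy's `default_rng` instead of the `random` module used above, so the actual year indices will differ from the notebook's, and all names are hypothetical.

```python
import numpy as np

n_years, n_train, n_val = 86, 70, 8
rng = np.random.default_rng(1)  # fixed seed for a reproducible split
year_order = rng.permutation(n_years)
idx_train = year_order[:n_train]
idx_val = year_order[n_train:n_train + n_val]
idx_test = year_order[n_train + n_val:]

# Synthetic stand-in for one combined component with 86*12 monthly values
demo_series = np.arange(n_years * 12, dtype=float)
by_year = demo_series.reshape(n_years, 12)  # row = year, column = calendar month
x_train, x_val, x_test = by_year[idx_train], by_year[idx_val], by_year[idx_test]
print(x_train.shape, x_val.shape, x_test.shape)
```

Splitting by whole years keeps the 12 months of a year together, so each sample fed to the network is one complete annual cycle.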

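### 5.2.4 Preallocating the normalization arrays
Here we preallocate the arrays needed for normalization; the normalization itself is done inside the model-building loop in 5.2.5.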

In [56]:
# Normalization: preallocate the arrays
Input_Train_Scaler = np.zeros(IMFs_Historical_Compose_3D_Train.shape)
Output_Train_Scaler = np.zeros(IMFs_ERSST_Compose_3D_Train.shape)
Input_Validate_Scaler = np.zeros(IMFs_Historical_Compose_3D_Validate.shape)
Input_Test_Scaler = np.zeros(IMFs_Historical_Compose_3D_Test.shape)
Input_Forecast126_Scaler = np.zeros(IMFs_SSP126_Compose_3D.shape)
Input_Forecast245_Scaler = np.zeros(IMFs_SSP245_Compose_3D.shape)
Input_Forecast585_Scaler = np.zeros(IMFs_SSP585_Compose_3D.shape)
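# Preallocate statistics and variables
RMSE_Predict_Train_All = np.zeros((6,9))  # (number of IMF components, number of hidden-layer sizes tried)
RMSE_Predict_Validate_All =  np.zeros((6,9))  # (number of IMF components, number of hidden-layer sizes tried)
RMSE_Predict_Test_All =  np.zeros((6,9))  # (number of IMF components, number of hidden-layer sizes tried)
Neuron_Predict_All = np.zeros(6)  # hidden-layer size giving the smallest corrected validation error for each component
def rmse(predictions, targets):
    return np.sqrt( ( (predictions - targets)**2 ).mean() )
BP_Output_Train = np.zeros(IMFs_ERSST_Compose_3D_Train.shape)  # values after inverse normalization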
BP_Output_Validate = np.zeros(IMFs_ERSST_Compose_3D_Validate.shape)
BP_Output_Test = np.zeros(IMFs_ERSST_Compose_3D_Test.shape)
BP_Output_SSP126 = np.zeros(IMFs_SSP126_Compose_3D.shape)
BP_Output_SSP245 = np.zeros(IMFs_SSP245_Compose_3D.shape)
BP_Output_SSP585 = np.zeros(IMFs_SSP585_Compose_3D.shape)
 
# Path for saving the models
Path_Pre = './project/'
Path_SaveModel =  Path_Pre + 'Model/'
isExists = os.path.exists(Path_SaveModel)
if not isExists:
    os.makedirs(Path_Pre + 'Model/')
    print( Path_SaveModel +  ' successfully build!')
else:
    print( Path_SaveModel + ' has been existed!')

./project/Model/ successfully build!
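
The normalization applied in the next section fits the minimum and maximum on the training set only and reuses them for every other dataset. The NumPy sketch below mirrors that column-wise scaling to [-1, 1] by hand, as an illustration of the formula rather than a replacement for `MinMaxScaler`; all function and variable names are hypothetical.

```python
import numpy as np

def fit_minmax(train, low=-1.0, high=1.0):
    """Column-wise minimum and maximum of the training data, plus the target range."""
    return train.min(axis=0), train.max(axis=0), low, high

def transform(x, params):
    col_min, col_max, low, high = params
    return (x - col_min) / (col_max - col_min) * (high - low) + low

def inverse_transform(x_scaled, params):
    col_min, col_max, low, high = params
    return (x_scaled - low) / (high - low) * (col_max - col_min) + col_min

rng = np.random.default_rng(0)
demo_train = rng.normal(size=(70, 12))  # synthetic training inputs
demo_val = rng.normal(size=(8, 12))     # synthetic validation inputs
params = fit_minmax(demo_train)
train_scaled = transform(demo_train, params)
val_scaled = transform(demo_val, params)  # reuses the training min/max
print(train_scaled.min(), train_scaled.max())
```

Data that fall outside the training range are mapped outside [-1, 1]. This may matter for the SSP inputs, whose low-frequency components can be warmer than anything in the historical training years, so the network is then asked to extrapolate.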


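### 5.2.5 Building and training the model
We build the model with the [MLPRegressor class of the sklearn.neural_network module](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPRegressor.html?highlight=mlp#sklearn.neural_network.MLPRegressor) (a Chinese version of the documentation is available [here](https://scikit-learn.org.cn/view/714.html)), as a 3-layer neural network with 1 hidden layer.
While correcting the historical global SST data, we also correct the future projections in the same loop. The final historical and projected corrections are those produced with the best number of hidden-layer neurons.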
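
Choosing the best hidden-layer size amounts to taking, for each component, the candidate with the smallest validation RMSE. The sketch below shows that selection on a made-up RMSE table; `demo_rmse_val` contains random numbers purely for illustration and says nothing about the actual results of this notebook.

```python
import numpy as np

candidate_sizes = np.arange(6, 15)  # 6, 7, ..., 14: 9 candidate hidden-layer sizes

# Hypothetical validation RMSE table of shape (6 components, 9 candidates)
rng = np.random.default_rng(0)
demo_rmse_val = rng.uniform(0.02, 0.10, size=(6, candidate_sizes.size))

best_col = demo_rmse_val.argmin(axis=1)   # column of the smallest RMSE per component
best_sizes = candidate_sizes[best_col]    # corresponding hidden-layer sizes
print(best_sizes)
```

The selection uses the validation set only; the test set is kept aside so that it can give an independent estimate of the error of the chosen network.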

In [57]:
starttime = time.time()
for n in range(6):
    ## Normalization
    # Training set
    Scaler_Input = MinMaxScaler(feature_range = (-1, 1) ) # MinMaxScaler for the inputs
    Input_Train_Scaler[n,:,:] = Scaler_Input.fit_transform(IMFs_Historical_Compose_3D_Train[n,:,:]) # normalized training inputs
    Scaler_Output = MinMaxScaler(feature_range = (-1, 1) ) # MinMaxScaler for the outputs
    Output_Train_Scaler[n,:,:] = Scaler_Output.fit_transform(IMFs_ERSST_Compose_3D_Train[n,:,:]) # normalized training outputs
    Input_Validate_Scaler[n,:,:] = Scaler_Input.transform(IMFs_Historical_Compose_3D_Validate[n,:,:])  # validation inputs, scaled with the training min/max
    Input_Test_Scaler[n,:,:] = Scaler_Input.transform(IMFs_Historical_Compose_3D_Test[n,:,:])  # test inputs, scaled with the training min/max
    Input_Forecast126_Scaler[n,:,:] =  Scaler_Input.transform(IMFs_SSP126_Compose_3D[n,:,:]) # low-emission projection inputs, scaled with the training min/max
    Input_Forecast245_Scaler[n,:,:] =  Scaler_Input.transform(IMFs_SSP245_Compose_3D[n,:,:]) # medium-emission projection inputs, scaled with the training min/max
    Input_Forecast585_Scaler[n,:,:] =  Scaler_Input.transform(IMFs_SSP585_Compose_3D[n,:,:]) # high-emission projection inputs, scaled with the training min/max

    ## Train the model, trying different numbers of hidden-layer neurons
    for number in range(6,15):  # try 6 to 14 hidden-layer neurons (range excludes 15)
        # Build the model
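        Model_MLPR = MLPRegressor(solver = 'sgd',  # weight optimizer: stochastic gradient descent
                                  activation = 'relu',  # hidden-layer activation function: ReLU
                                  alpha = 1e-4, # L2 penalty (regularization) strength
                                  learning_rate_init = 0.01, # initial learning rate
                                  max_iter = 200, # maximum number of iterations
                                  hidden_layer_sizes = (number,), # one hidden layer with `number` neurons
                                  random_state = 1  ) # seed for the weight and bias initialization
        # Train the network
        Model_MLPR.fit( Input_Train_Scaler[n,:,:], Output_Train_Scaler[n,:,:] )  # inputs: (samples, input neurons); outputs: (samples, output neurons)
        Predict_Train_Scaler = Model_MLPR.predict(Input_Train_Scaler[n, :, :])  # (training years, output neurons)
        # Inverse normalization
        Predict_Train_Real = Scaler_Output.inverse_transform( Predict_Train_Scaler )   # (training years, output neurons)
        Predict_Train_Real_1D = Predict_Train_Real.reshape(-1)   # (training months, )
        # Training RMSE: before correction (RMSE_Train) and after correction (RMSE_Predict_Train_All)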
        RMSE_Train = rmse(IMFs_Historical_Compose_2D_Train[n, :], IMFs_ERSST_Compose_2D_Train[n,:] )
        RMSE_Predict_Train_All[n, number-6] = rmse(Predict_Train_Real_1D, IMFs_ERSST_Compose_2D_Train[n,:] )
        print('IMF number is: ', n + 1, ', neuron number is: ',number,  ', RMSE_Train: ', RMSE_Train, ',RMSE_Predict_Train: ', RMSE_Predict_Train_All[n, number-6])
       
        ## Validation set
        Predict_Validate_Scaler = Model_MLPR.predict(Input_Validate_Scaler[n, :, :])  # (validation years, output neurons)
        # Inverse normalization
        Predict_Validate_Real = Scaler_Output.inverse_transform( Predict_Validate_Scaler )   # (validation years, output neurons)
        Predict_Validate_Real_1D = Predict_Validate_Real.reshape(-1)  # (validation months, )
        # Validation RMSE
        RMSE_Validate = rmse(IMFs_Historical_Compose_2D_Validate[n, :], IMFs_ERSST_Compose_2D_Validate[n,:] )
        RMSE_Predict_Validate_All[n, number - 6] = rmse(Predict_Validate_Real_1D, IMFs_ERSST_Compose_2D_Validate[n, :])
        print('IMF number is: ', n + 1, ', neuron number is: ',number, ', RMSE_Validate: ', RMSE_Validate, ',RMSE_Predict_Validate: ', RMSE_Predict_Validate_All[n, number-6])
        
        ## Test set
        Predict_Test_Scaler = Model_MLPR.predict(Input_Test_Scaler[n, :, :])  # (test years, output neurons)
        # Inverse normalization
        Predict_Test_Real = Scaler_Output.inverse_transform(Predict_Test_Scaler)  # (test years, output neurons)
        Predict_Test_Real_1D = Predict_Test_Real.reshape(-1)  # (test months, )
        # Test RMSE
        RMSE_Test = rmse(IMFs_Historical_Compose_2D_Test[n, :], IMFs_ERSST_Compose_2D_Test[n, :])
        RMSE_Predict_Test_All[n, number - 6] = rmse(Predict_Test_Real_1D, IMFs_ERSST_Compose_2D_Test[n, :])
        print('IMF number is: ', n + 1, ', neuron number is: ',number,  ', RMSE_Test: ', RMSE_Test, ',RMSE_Predict_Test: ', RMSE_Predict_Test_All[n, number - 6])
        
        ## SSP: correction of the future projections
        Predict_SSP126_Scaler = Model_MLPR.predict(Input_Forecast126_Scaler[n, :, :])  # (projection years, output-layer neurons)
        Predict_SSP245_Scaler = Model_MLPR.predict(Input_Forecast245_Scaler[n, :, :])  # (projection years, output-layer neurons)
        Predict_SSP585_Scaler = Model_MLPR.predict(Input_Forecast585_Scaler[n, :, :])  # (projection years, output-layer neurons)
        # Undo the normalization
        Predict_SSP126_Real = Scaler_Output.inverse_transform(Predict_SSP126_Scaler)  # (projection years, output-layer neurons)
        Predict_SSP245_Real = Scaler_Output.inverse_transform(Predict_SSP245_Scaler)  # (projection years, output-layer neurons)
        Predict_SSP585_Real = Scaler_Output.inverse_transform(Predict_SSP585_Scaler)  # (projection years, output-layer neurons)

        ## Keep the hidden-layer size with the smallest validation RMSE and save the corrected output of every data set for it
        RMSE_Predict_Validate_Current = rmse(Predict_Validate_Real_1D, IMFs_ERSST_Compose_2D_Validate[n, :] )
        # number == 6 is the first candidate, so it is stored as the best so far;
        # a larger hidden layer replaces it only if its validation RMSE is smaller
        if number == 6 or RMSE_Predict_Validate_Current < RMSE_Predict_Validate_Min:
            RMSE_Predict_Validate_Min = RMSE_Predict_Validate_Current
            print('--------------------------------------------------------------------------------------------------------------------------------')
            print('IMF number is: ', n + 1, ', neuron number is: ', number, ', RMSE_Predict_Validate_Min: ', RMSE_Predict_Validate_Min)
            print('--------------------------------------------------------------------------------------------------------------------------------')
            Neuron_Predict_All[n] = number  # best hidden-layer neuron count so far
            BP_Output_Train[n, :, :] = Predict_Train_Real  # the model prediction is the corrected result
            BP_Output_Validate[n, :, :] = Predict_Validate_Real  # validation set
            BP_Output_Test[n, :, :] = Predict_Test_Real  # test set
            BP_Output_SSP126[n, :, :] = Predict_SSP126_Real  # future projection, low-emission scenario SSP1-2.6
            BP_Output_SSP245[n, :, :] = Predict_SSP245_Real  # future projection, medium-emission scenario SSP2-4.5
            BP_Output_SSP585[n, :, :] = Predict_SSP585_Real  # future projection, high-emission scenario SSP5-8.5
            # Save the best model so far for this IMF combination; a later, better model overwrites the file
            with open(Path_SaveModel + 'Model_MLPR_IMF_%s' % (n + 1), 'wb') as f:
                pickle.dump(Model_MLPR, f)


endtime = time.time()
print('spend time is ', str(endtime - starttime) + "s")
print('Neuron_Predict_All :', Neuron_Predict_All)  # print the best hidden-layer neuron count for each IMF combination

IMF number is:  1 , neuron number is:  6 , RMSE_Train:  0.044241979267417554 ,RMSE_Predict_Train:  0.025645057571702753
IMF number is:  1 , neuron number is:  6 , RMSE_Validate:  0.05155530181593175 ,RMSE_Predict_Validate:  0.03468478565846921
IMF number is:  1 , neuron number is:  6 , RMSE_Test:  0.04100331926996361 ,RMSE_Predict_Test:  0.022841334361694537
--------------------------------------------------------------------------------------------------------------------------------
IMF number is:  1 , neuron number is:  6 , RMSE_Predict_Validate_Min:  0.03468478565846921
--------------------------------------------------------------------------------------------------------------------------------
IMF number is:  1 , neuron number is:  7 , RMSE_Train:  0.044241979267417554 ,RMSE_Predict_Train:  0.02555725822336281
IMF number is:  1 , neuron number is:  7 , RMSE_Validate:  0.05155530181593175 ,RMSE_Predict_Validate:  0.03667097236547388
IMF number is:  1 , neuron number is:  7 , RMSE
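
The selection rule used in the loop above (keep the hidden-layer size whose validation RMSE is smallest) can be sketched on its own. This is a minimal illustration, not the tutorial's code: `select_by_validation`, `toy_fit_and_score`, and the candidate range 6 to 12 are hypothetical names and values chosen for the example, and the toy scorer stands in for training an `MLPRegressor` and scoring it on the validation set.

```python
def select_by_validation(candidates, fit_and_score):
    """Return the candidate with the smallest validation RMSE.

    fit_and_score(size) must return (model, validation_rmse).
    Ties keep the earliest candidate, because the comparison is strict.
    """
    best = None
    for size in candidates:
        model, val_rmse = fit_and_score(size)
        if best is None or val_rmse < best["rmse"]:
            best = {"size": size, "rmse": val_rmse, "model": model}
    return best


def toy_fit_and_score(size):
    # Hypothetical stand-in for "train a network with `size` hidden neurons
    # and compute its RMSE on the validation set".
    return "model-%d" % size, (size - 9) ** 2 / 100 + 0.03


best = select_by_validation(range(6, 13), toy_fit_and_score)
print(best["size"], best["rmse"])
```

Only the validation score drives the choice; the test set is left untouched until the final evaluation, which is what keeps the test RMSE an honest estimate.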

### 5.2.6 Post-process and save the model output

In [58]:
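# Put the corrected data back into chronological order (the years were shuffled before the train/validation/test split)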
BP_Output_All = np.zeros(IMFs_ERSST_Compose_3D.shape)
BP_Output_All[:, 0:Num_Year_Train, :] = BP_Output_Train
BP_Output_All[:, Num_Year_Train:Num_Year_Train +
              Num_Year_Validate, :] = BP_Output_Validate
BP_Output_All[:, -Num_Year_Test::, :] = BP_Output_Test

BP_Output_All_Chronological = np.zeros(IMFs_ERSST_Compose_3D.shape)
for n in range(Num_Year):
    Index_Year_Real = Index_Year_Random[n]
    BP_Output_All_Chronological[:, Index_Year_Real, :] = BP_Output_All[:, n, :]

# Save the data
Path_Pre = './project/SaveData/'
Path_SaveModelData = Path_Pre + 'Model/'
isExists = os.path.exists(Path_SaveModelData)
if not isExists:
    os.makedirs(Path_Pre + 'Model/')
    print(Path_SaveModelData + ' successfully build!')
else:
    print(Path_SaveModelData + ' has been existed!')

np.save(
    Path_SaveModelData +
    'BP_Output_All_Chronological.npy',
    BP_Output_All_Chronological)
np.save(Path_SaveModelData + 'BP_Output_SSP126.npy', BP_Output_SSP126)
np.save(Path_SaveModelData + 'BP_Output_SSP245.npy', BP_Output_SSP245)
np.save(Path_SaveModelData + 'BP_Output_SSP585.npy', BP_Output_SSP585)

# Load the saved models. Optional: the models are not used again below.
# for n in range(6):
# exec("Model_MLPR_IMF_%s = pickle.load( open(Path_SaveModel + 'Model_MLPR_IMF_%s', 'rb') )" % (n + 1, n + 1) )

./project/SaveData/Model/ successfully build!
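
The chronological re-ordering in the cell above undoes the random shuffle of the years. As a hedged sketch on toy data (the array sizes and the lower-case variable names are made up for illustration), the same result can be obtained without a Python loop, either by assigning through the shuffled index list or by indexing with its `argsort`:

```python
import random

import numpy as np

random.seed(1)
num_year = 6                                   # toy size, not the tutorial's 86
index_year_random = list(range(num_year))
random.shuffle(index_year_random)              # position n holds year index_year_random[n]

# Toy array shaped (IMF combination, year, month); every value encodes its year
chronological = np.arange(num_year, dtype=float).reshape(1, num_year, 1) * np.ones((2, 1, 12))
shuffled = chronological[:, index_year_random, :]

# Vectorised form of the for-loop: write shuffled position n back to its real year
restored = np.zeros_like(shuffled)
restored[:, index_year_random, :] = shuffled

# Equivalent alternative: argsort gives the inverse permutation
restored_argsort = shuffled[:, np.argsort(index_year_random), :]

print(np.array_equal(restored, chronological))
print(np.array_equal(restored_argsort, chronological))
```

Both forms rely on `index_year_random` being a permutation (each year appears exactly once), which `random.shuffle` guarantees.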



## 5.3 Analysis of the BPNN correction results


### 5.3.1 Load the data


**Original data**

In [59]:
Path_SaveData_Original = './project/SaveData/Original/'
SST_ERSST_AreaMean = np.load(Path_SaveData_Original + 'SST_ERSST_AreaMean.npy')
SST_Historical_AreaMean = np.load(
    Path_SaveData_Original +
    'SST_Historical_AreaMean.npy')
SST_SSP126 = np.load(Path_SaveData_Original + 'SST_SSP126_AreaMean.npy')
SST_SSP245 = np.load(Path_SaveData_Original + 'SST_SSP245_AreaMean.npy')
SST_SSP585 = np.load(Path_SaveData_Original + 'SST_SSP585_AreaMean.npy')
Time_ERSST = np.load(Path_SaveData_Original + 'Time_ERSST.npy')
Time_SSP = np.load(Path_SaveData_Original + 'Time_SSP.npy')
# Data-set split (in years)
Num_Year = 86
Num_Year_Train = 70
Num_Year_Validate = 8
Num_Year_Test = 8
Num_Sample_SST = Num_Year*12
Num_Sample_Train = Num_Year_Train*12
Num_Sample_Validate = Num_Year_Validate*12
Num_Sample_Test = Num_Year_Test*12
Time_SST = np.arange(0, Num_Sample_SST)
Time_ERSST = Time_ERSST[-Num_Sample_SST::]
SST_ERSST = SST_ERSST_AreaMean[-Num_Sample_SST::]
SST_Historical = SST_Historical_AreaMean[-Num_Sample_SST::]
SST_ERSST_2D = SST_ERSST.reshape((Num_Year, 12))
SST_Historical_2D = SST_Historical.reshape((Num_Year, 12))
# Shuffle the years with a fixed seed to rebuild the train/validation/test split
random.seed(1)
Index_Year = list(range(Num_Year))
Index_Year_Random = list(range(Num_Year))
random.shuffle(Index_Year_Random)
Index_Year_Train = Index_Year_Random[0:Num_Year_Train]
Index_Year_Validate = Index_Year_Random[Num_Year_Train:Num_Year_Train +
                                        Num_Year_Validate]
Index_Year_Test = Index_Year_Random[-Num_Year_Test::]
# Original data: 2-D (years, 12)
SST_ERSST_Train_2D = SST_ERSST_2D[Index_Year_Train, :]
SST_ERSST_Validate_2D = SST_ERSST_2D[Index_Year_Validate, :]
SST_ERSST_Test_2D = SST_ERSST_2D[Index_Year_Test, :]
SST_Historical_Train_2D = SST_Historical_2D[Index_Year_Train, :]
SST_Historical_Validate_2D = SST_Historical_2D[Index_Year_Validate, :]
SST_Historical_Test_2D = SST_Historical_2D[Index_Year_Test, :]
# Original data: 1-D (months,)
SST_ERSST_All_1D = SST_ERSST_2D.reshape(-1)
SST_ERSST_Train_1D = SST_ERSST_Train_2D.reshape(-1)
SST_ERSST_Validate_1D = SST_ERSST_Validate_2D.reshape(-1)
SST_ERSST_Test_1D = SST_ERSST_Test_2D.reshape(-1)
SST_Historical_All_1D = SST_Historical_2D.reshape(-1)
SST_Historical_Train_1D = SST_Historical_Train_2D.reshape(-1)
SST_Historical_Validate_1D = SST_Historical_Validate_2D.reshape(-1)
SST_Historical_Test_1D = SST_Historical_Test_2D.reshape(-1)

**EEMD-decomposed data**

In [60]:
Path_SaveData_EEMD = './project/SaveData/EEMD/'
IMFs_ERSST_Compose = np.load(Path_SaveData_EEMD + 'IMFs_ERSST_Compose.npy')
IMFs_Historical_Compose = np.load(
    Path_SaveData_EEMD +
    'IMFs_Historical_Compose.npy')
IMFs_SSP126_Compose = np.load(Path_SaveData_EEMD + 'IMFs_SSP126_Compose.npy')
IMFs_SSP245_Compose = np.load(Path_SaveData_EEMD + 'IMFs_SSP245_Compose.npy')
IMFs_SSP585_Compose = np.load(Path_SaveData_EEMD + 'IMFs_SSP585_Compose.npy')
# IMF combinations: 3-D (6, years, 12) and 2-D (6, months)
IMFs_ERSST_Compose_3D = IMFs_ERSST_Compose.reshape((6, Num_Year, 12))
IMFs_ERSST_Compose_All_2D = IMFs_ERSST_Compose_3D.reshape((6, -1))
IMFs_ERSST_Compose_Train_2D = IMFs_ERSST_Compose_3D[:, Index_Year_Train, :].reshape(
    (6, -1))
IMFs_ERSST_Compose_Validate_2D = IMFs_ERSST_Compose_3D[:, Index_Year_Validate, :].reshape(
    (6, -1))
IMFs_ERSST_Compose_Test_2D = IMFs_ERSST_Compose_3D[:, Index_Year_Test, :].reshape(
    (6, -1))
IMFs_Historical_Compose_3D = IMFs_Historical_Compose.reshape((6, Num_Year, 12))
IMFs_Historical_Compose_All_2D = IMFs_Historical_Compose_3D.reshape((6, -1))
IMFs_Historical_Compose_Train_2D = IMFs_Historical_Compose_3D[:, Index_Year_Train, :].reshape(
    (6, -1))
IMFs_Historical_Compose_Validate_2D = IMFs_Historical_Compose_3D[:, Index_Year_Validate, :].reshape(
    (6, -1))
IMFs_Historical_Compose_Test_2D = IMFs_Historical_Compose_3D[:, Index_Year_Test, :].reshape(
    (6, -1))

**BPNN output data**

In [61]:
Path_SaveData_Model = './project/SaveData/Model/'
IMFs_Historical_Compose_Correct_3D = np.load(
    Path_SaveData_Model + 'BP_Output_All_Chronological.npy')
IMFs_SSP126_Compose_Correct_3D = np.load(
    Path_SaveData_Model + 'BP_Output_SSP126.npy')
IMFs_SSP245_Compose_Correct_3D = np.load(
    Path_SaveData_Model + 'BP_Output_SSP245.npy')
IMFs_SSP585_Compose_Correct_3D = np.load(
    Path_SaveData_Model + 'BP_Output_SSP585.npy')
# Corrected IMF combinations: 2-D (6, months)
IMFs_Historical_Compose_Correct_All_2D = IMFs_Historical_Compose_Correct_3D.reshape(
    (6, -1))
IMFs_Historical_Compose_Correct_Train_2D = IMFs_Historical_Compose_Correct_3D[:, Index_Year_Train, :].reshape(
    (6, -1))
IMFs_Historical_Compose_Correct_Validate_2D = IMFs_Historical_Compose_Correct_3D[:, Index_Year_Validate, :].reshape(
    (6, -1))
IMFs_Historical_Compose_Correct_Test_2D = IMFs_Historical_Compose_Correct_3D[:, Index_Year_Test, :].reshape(
    (6, -1))
# Corrected SST reconstructed by summing the 6 IMF combinations: 1-D (months,)
SST_Historical_Correct_All_1D = np.sum(
    IMFs_Historical_Compose_Correct_All_2D, 0)
SST_Historical_Correct_Train_1D = np.sum(
    IMFs_Historical_Compose_Correct_Train_2D, 0)
SST_Historical_Correct_Validate_1D = np.sum(
    IMFs_Historical_Compose_Correct_Validate_2D, 0)
SST_Historical_Correct_Test_1D = np.sum(
    IMFs_Historical_Compose_Correct_Test_2D, 0)
SST_SSP126_Correct_1D = np.sum(
    IMFs_SSP126_Compose_Correct_3D.reshape(
        (6, -1)), 0)
SST_SSP245_Correct_1D = np.sum(
    IMFs_SSP245_Compose_Correct_3D.reshape(
        (6, -1)), 0)
SST_SSP585_Correct_1D = np.sum(
    IMFs_SSP585_Compose_Correct_3D.reshape(
        (6, -1)), 0)
# Reconstructed corrected SST: 2-D (years, 12)
SST_Historical_Correct_All_2D = SST_Historical_Correct_All_1D.reshape((-1, 12))
SST_Historical_Correct_Train_2D = SST_Historical_Correct_Train_1D.reshape(
    (-1, 12))
SST_Historical_Correct_Validate_2D = SST_Historical_Correct_Validate_1D.reshape(
    (-1, 12))
SST_Historical_Correct_Test_2D = SST_Historical_Correct_Test_1D.reshape(
    (-1, 12))

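### 5.3.2 Analysis of the results
For the corrected results, the first thing we want to know is how much of the bias has been removed. We evaluate this with two basic metrics: the correlation coefficient (R) and the root-mean-square error (RMSE). We first look at the SST and bias time series over the historical period, then at bar charts of the error for each calendar month, and finally plot the corrected future projections. That gives the following three parts:
####  &emsp;&emsp; 1) Time series over the historical period
####  &emsp;&emsp; 2) Bar charts of the error by month
####  &emsp;&emsp; 3) Time series of the future projections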
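
The two metrics can be written in a few lines of NumPy. This is a minimal sketch under stated assumptions: `rmse_sketch` and `pearson_r_sketch` are hypothetical names, and the tutorial's own `rmse` helper (defined earlier in the notebook) and `scipy.stats.pearsonr` may differ in details such as input checks or the returned p-value.

```python
import numpy as np


def rmse_sketch(model, obs):
    """Root-mean-square error between two equally long 1-D series."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((model - obs) ** 2)))


def pearson_r_sketch(model, obs):
    """Pearson correlation coefficient (no p-value)."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.corrcoef(model, obs)[0, 1])


# Toy series: the "model" is the observation plus a constant warm bias of 0.2
obs = np.array([18.0, 18.2, 18.1, 18.4])
model = obs + 0.2

print(rmse_sketch(model, obs))       # close to 0.2: RMSE sees the constant bias
print(pearson_r_sketch(model, obs))  # close to 1.0: R is blind to a constant offset
```

The toy case shows why both metrics are reported: R measures whether the two series vary together, while RMSE also penalises a systematic offset.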


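#### 5.3.2.1. Time series over the historical period

**IMF combinations of SST: before/after correction, training / validation / test sets and the full period**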

In [62]:
### Compute the RMSE and correlation coefficient (R) of the SST IMF combinations
## RMSE
RMSE_IMFs_Historical_Compose_Train = np.zeros(6)
RMSE_IMFs_Historical_Compose_Validate = np.zeros(6)
RMSE_IMFs_Historical_Compose_Test = np.zeros(6)
RMSE_IMFs_Historical_Compose_All = np.zeros(6)
RMSE_IMFs_Historical_Compose_Correct_Train = np.zeros(6)
RMSE_IMFs_Historical_Compose_Correct_Validate = np.zeros(6)
RMSE_IMFs_Historical_Compose_Correct_Test = np.zeros(6)
RMSE_IMFs_Historical_Compose_Correct_All = np.zeros(6)
for n in range(6):
    # Before correction (raw model)
    RMSE_IMFs_Historical_Compose_Train[n] = rmse( IMFs_Historical_Compose_Train_2D[n,:], IMFs_ERSST_Compose_Train_2D[n,:] )
    RMSE_IMFs_Historical_Compose_Validate[n] = rmse( IMFs_Historical_Compose_Validate_2D[n,:], IMFs_ERSST_Compose_Validate_2D[n,:] )
    RMSE_IMFs_Historical_Compose_Test[n] = rmse( IMFs_Historical_Compose_Test_2D[n,:], IMFs_ERSST_Compose_Test_2D[n,:] )
    RMSE_IMFs_Historical_Compose_All[n] = rmse( IMFs_Historical_Compose_All_2D[n,:], IMFs_ERSST_Compose_All_2D[n,:] )
    # After correction
    RMSE_IMFs_Historical_Compose_Correct_Train[n] = rmse( IMFs_Historical_Compose_Correct_Train_2D[n,:], IMFs_ERSST_Compose_Train_2D[n,:] )
    RMSE_IMFs_Historical_Compose_Correct_Validate[n] = rmse( IMFs_Historical_Compose_Correct_Validate_2D[n,:], IMFs_ERSST_Compose_Validate_2D[n,:] )
    RMSE_IMFs_Historical_Compose_Correct_Test[n] = rmse( IMFs_Historical_Compose_Correct_Test_2D[n,:], IMFs_ERSST_Compose_Test_2D[n,:] )
    RMSE_IMFs_Historical_Compose_Correct_All[n] = rmse( IMFs_Historical_Compose_Correct_All_2D[n,:], IMFs_ERSST_Compose_All_2D[n,:] )

## Correlation coefficient (column 0: R, column 1: p-value)
R_IMFs_Historical_Compose_Train = np.zeros( (6,2) )
R_IMFs_Historical_Compose_Validate = np.zeros( (6,2) )
R_IMFs_Historical_Compose_Test = np.zeros( (6,2) )
R_IMFs_Historical_Compose_All = np.zeros( (6,2) )
R_IMFs_Historical_Compose_Correct_Train = np.zeros( (6,2) )
R_IMFs_Historical_Compose_Correct_Validate = np.zeros( (6,2) )
R_IMFs_Historical_Compose_Correct_Test = np.zeros( (6,2) )
R_IMFs_Historical_Compose_Correct_All = np.zeros( (6,2) )
for n in range(6):
    # Before correction (raw model)
    R_IMFs_Historical_Compose_Train[n,:] = scipy.stats.pearsonr( IMFs_Historical_Compose_Train_2D[n,:], IMFs_ERSST_Compose_Train_2D[n,:] )[:]
    R_IMFs_Historical_Compose_Validate[n,:] = scipy.stats.pearsonr( IMFs_Historical_Compose_Validate_2D[n,:], IMFs_ERSST_Compose_Validate_2D[n,:] )[:]
    R_IMFs_Historical_Compose_Test[n,:] = scipy.stats.pearsonr( IMFs_Historical_Compose_Test_2D[n,:], IMFs_ERSST_Compose_Test_2D[n,:] )[:]
    R_IMFs_Historical_Compose_All[n,:] = scipy.stats.pearsonr( IMFs_Historical_Compose_All_2D[n,:], IMFs_ERSST_Compose_All_2D[n,:] )[:]
    # After correction
    R_IMFs_Historical_Compose_Correct_Train[n,:] = scipy.stats.pearsonr( IMFs_Historical_Compose_Correct_Train_2D[n,:], IMFs_ERSST_Compose_Train_2D[n,:] )[:]
    R_IMFs_Historical_Compose_Correct_Validate[n,:] = scipy.stats.pearsonr( IMFs_Historical_Compose_Correct_Validate_2D[n,:], IMFs_ERSST_Compose_Validate_2D[n,:] )[:]
    R_IMFs_Historical_Compose_Correct_Test[n,:] = scipy.stats.pearsonr( IMFs_Historical_Compose_Correct_Test_2D[n,:], IMFs_ERSST_Compose_Test_2D[n,:] )[:]
    R_IMFs_Historical_Compose_Correct_All[n,:] = scipy.stats.pearsonr( IMFs_Historical_Compose_Correct_All_2D[n,:], IMFs_ERSST_Compose_All_2D[n,:] )[:]
## There are many values here; print the arrays yourself to inspect them

**Reconstructed SST: before/after correction, training / validation / test sets and the full period**

In [63]:
### Compute the RMSE and correlation coefficient of the reconstructed SST
## RMSE
RMSE_SST_Historical_Train = rmse( SST_Historical_Train_1D, SST_ERSST_Train_1D )
RMSE_SST_Historical_Validate = rmse( SST_Historical_Validate_1D, SST_ERSST_Validate_1D )
RMSE_SST_Historical_Test = rmse( SST_Historical_Test_1D, SST_ERSST_Test_1D )
RMSE_SST_Historical_All = rmse( SST_Historical_All_1D, SST_ERSST_All_1D )
RMSE_SST_Historical_Correct_Train = rmse( SST_Historical_Correct_Train_1D, SST_ERSST_Train_1D )
RMSE_SST_Historical_Correct_Validate = rmse( SST_Historical_Correct_Validate_1D, SST_ERSST_Validate_1D )
RMSE_SST_Historical_Correct_Test = rmse( SST_Historical_Correct_Test_1D, SST_ERSST_Test_1D )
RMSE_SST_Historical_Correct_All = rmse( SST_Historical_Correct_All_1D, SST_ERSST_All_1D )
# Print the RMSE before (first line) and after (second line) correction: train, validation, test, full period
print(RMSE_SST_Historical_Train, RMSE_SST_Historical_Validate, RMSE_SST_Historical_Test, RMSE_SST_Historical_All)
print(RMSE_SST_Historical_Correct_Train, RMSE_SST_Historical_Correct_Validate, RMSE_SST_Historical_Correct_Test, RMSE_SST_Historical_Correct_All)

## Correlation coefficient (each result is a pair: R, p-value)
R_SST_Historical_Train = scipy.stats.pearsonr( SST_Historical_Train_1D, SST_ERSST_Train_1D )[:]
R_SST_Historical_Validate = scipy.stats.pearsonr( SST_Historical_Validate_1D, SST_ERSST_Validate_1D )[:]
R_SST_Historical_Test = scipy.stats.pearsonr( SST_Historical_Test_1D, SST_ERSST_Test_1D )[:]
R_SST_Historical_All = scipy.stats.pearsonr( SST_Historical_All_1D, SST_ERSST_All_1D )[:]
R_SST_Historical_Correct_Train = scipy.stats.pearsonr( SST_Historical_Correct_Train_1D, SST_ERSST_Train_1D )[:]
R_SST_Historical_Correct_Validate = scipy.stats.pearsonr( SST_Historical_Correct_Validate_1D, SST_ERSST_Validate_1D )[:]
R_SST_Historical_Correct_Test = scipy.stats.pearsonr( SST_Historical_Correct_Test_1D, SST_ERSST_Test_1D )[:]
R_SST_Historical_Correct_All = scipy.stats.pearsonr( SST_Historical_Correct_All_1D, SST_ERSST_All_1D )[:]
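# Print the correlation coefficient before (first line) and after (second line) correction: train, validation, test, full period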
print(R_SST_Historical_Train, R_SST_Historical_Validate, R_SST_Historical_Test, R_SST_Historical_All)
print(R_SST_Historical_Correct_Train, R_SST_Historical_Correct_Validate, R_SST_Historical_Correct_Test, R_SST_Historical_Correct_All)


0.20784581370243535 0.22257860044431999 0.2133404373631007 0.20977381961027866
0.10053041885793738 0.1443042559570034 0.11073906087398842 0.10632007518038103
(0.7279122407416198, 1.7470650216429933e-139) (0.6571257905729102, 3.56328932044392e-13) (0.6957722332701931, 3.611407039770055e-15) (0.7216456050215255, 1.0369404819037778e-166)
(0.9155161903209507, 0.0) (0.8066538387212107, 3.435826527987728e-23) (0.8556195169101555, 1.2478972512739245e-28) (0.9025459185953143, 0.0)


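The correction lowers the RMSE in all three sets: by about **52%** in the training set (0.208 to 0.101), **35%** in the validation set (0.223 to 0.144), and **48%** in the test set (0.213 to 0.111). For the full chronological series the RMSE drops by about **49%** (0.210 to 0.106). The correlation coefficients of the three sets rise from 0.66-0.73 to **above 0.80** (0.92, 0.81, and 0.86), and that of the full chronological series rises from 0.72 to **0.90**.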
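
The relative RMSE reduction can be checked directly from the numbers printed above. The sketch below copies the eight RMSE values from the cell output; the dictionary layout and the name `reduction_percent` are only illustrative.

```python
# RMSE before and after correction, copied from the printed output above
rmse_before = {
    "train": 0.20784581370243535,
    "validate": 0.22257860044431999,
    "test": 0.2133404373631007,
    "all": 0.20977381961027866,
}
rmse_after = {
    "train": 0.10053041885793738,
    "validate": 0.1443042559570034,
    "test": 0.11073906087398842,
    "all": 0.10632007518038103,
}

# Relative reduction in percent: 100 * (before - after) / before
reduction_percent = {
    key: 100.0 * (rmse_before[key] - rmse_after[key]) / rmse_before[key]
    for key in rmse_before
}

for key, value in reduction_percent.items():
    print(f"{key:>8s}: {value:.1f}% lower RMSE after correction")
```

The validation set shows the smallest gain and the training set the largest, which is the usual pattern when a model is fitted on the training data.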





**Plot the reconstructed time series for the full period**

In [68]:
### Plot the reconstructed SST
## Settings
fontsizenum = 16  
markersizenum = 6
Num_Year = 2014-1929+1
Num_Sample_SST = Num_Year*12
X_Time = np.arange(0, Num_Sample_SST )
Time_SST = np.array(Time_ERSST[-Num_Sample_SST::])
xtick_SST_interval = 12*12
xticklabel_SST = list(range(0, Num_Sample_SST, xtick_SST_interval)); print(xticklabel_SST)
xticklabels_SST = Time_SST[xticklabel_SST[0::]].astype('str').tolist()
ylim_sst_max = 19
ylim_sst_min = 17
delta_ylim_sst = ylim_sst_max - ylim_sst_min
yticklabel_SST = np.linspace(ylim_sst_min, ylim_sst_max, num = 5).tolist()
yticklabel_SST = [round(i,1) for i in yticklabel_SST]; print(yticklabel_SST)
yticklabels_SST = [str(i) for i in yticklabel_SST]; print(yticklabels_SST)
## Plot
fig, ax = plt.subplots(1, 1, figsize = (16,9), dpi = 600)
ax.plot(X_Time, SST_Historical_All_1D, 'b-', linewidth = 2, marker = '.', markersize = markersizenum, mfc = 'blue', label = 'FIO-ESM v2')
ax.plot(X_Time, SST_Historical_Correct_All_1D, 'r-', linewidth = 2, marker = '.', markersize = markersizenum, mfc = 'red', label = 'EEMD-BPNN')
ax.plot(X_Time, SST_ERSST_All_1D, 'k-', linewidth = 2, marker = '.', markersize = markersizenum, mfc = 'black', label = 'ERSST')
ax.set_xlim(0, Num_Sample_SST)
ax.set_ylim(ylim_sst_min, ylim_sst_max)  
ax.set_xticks(xticklabel_SST)
ax.set_yticks(yticklabel_SST)  
ax.set_xticklabels(xticklabels_SST, fontsize = fontsizenum)
ax.set_yticklabels(yticklabels_SST, fontsize = fontsizenum)
ax.set_xlabel('Time', fontsize=fontsizenum)
ax.set_ylabel('SST(℃)', fontsize = fontsizenum)
legend_font = { 'weight': 'normal', 'size': fontsizenum}
ax.legend(loc = 'upper left', frameon = False, prop = legend_font, ncol = 1)
ax.grid(linestyle='-.')
## Save the figure
Path_Pre = './project/Figures/'
Path_SaveFigures =  Path_Pre + 'Model/'
isExists = os.path.exists(Path_SaveFigures)
if not isExists:
    os.makedirs(Path_Pre + 'Model/')
    print( Path_SaveFigures +  ' successfully build!')
else:
    print( Path_SaveFigures + ' has been existed!')

fns = os.path.join(Path_SaveFigures, 'SST_Historical_Correct_ERSST.png')
plt.savefig(fns, dpi = 600, bbox_inches = 'tight')
plt.close()


[0, 144, 288, 432, 576, 720, 864, 1008]
[17.0, 17.5, 18.0, 18.5, 19.0]
['17.0', '17.5', '18.0', '18.5', '19.0']
./project/Figures/Model/ has been existed!


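The figure above shows the time series of the raw model (blue), the corrected result (red), and the observations (black). Compared with the raw FIO-ESM v2 output, the EEMD-BPNN corrected series lies closer to the observed ERSST series.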

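**Bias of the reconstructed SST: before/after correction, training / validation / test sets and the full period**
Next we look at the bias (model minus observation) before and after correction, which shows the improvement more directly.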
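
As a compact alternative for summarising a bias series, the four statistics used here (minimum, maximum, mean, and population standard deviation) can be gathered by one small helper. This is a sketch on made-up numbers; `bias_summary` is a hypothetical helper name, not part of the tutorial.

```python
import numpy as np


def bias_summary(model, obs):
    """Min / max / mean / std (ddof=0) of the bias, defined as model minus observation."""
    bias = np.asarray(model, dtype=float) - np.asarray(obs, dtype=float)
    return {
        "min": float(bias.min()),
        "max": float(bias.max()),
        "mean": float(bias.mean()),
        "std": float(bias.std(ddof=0)),
    }


# Toy series: a raw model with a warm bias and a corrected model centred on zero
obs = np.array([18.0, 18.1, 18.2, 18.3])
raw = obs + np.array([0.3, 0.1, 0.2, 0.2])
corrected = obs + np.array([0.05, -0.05, 0.0, 0.0])

summaries = {"raw": bias_summary(raw, obs), "corrected": bias_summary(corrected, obs)}
for name, stats in summaries.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

Calling one function per series keeps the model and observation arrays paired explicitly, which makes it harder to mix up the raw and corrected inputs.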

In [67]:
Bias_Historical_Train_1D = SST_Historical_Train_1D - SST_ERSST_Train_1D
Bias_Historical_Validate_1D = SST_Historical_Validate_1D - SST_ERSST_Validate_1D
Bias_Historical_Test_1D = SST_Historical_Test_1D - SST_ERSST_Test_1D
Bias_Historical_All_1D = SST_Historical_All_1D - SST_ERSST_All_1D
Bias_Historical_Correct_Train_1D = SST_Historical_Correct_Train_1D - SST_ERSST_Train_1D
Bias_Historical_Correct_Validate_1D = SST_Historical_Correct_Validate_1D - SST_ERSST_Validate_1D
Bias_Historical_Correct_Test_1D = SST_Historical_Correct_Test_1D - SST_ERSST_Test_1D
Bias_Historical_Correct_All_1D = SST_Historical_Correct_All_1D - SST_ERSST_All_1D
### Statistics of the reconstructed SST bias: minimum (Min) / maximum (Max) / mean bias / standard deviation (Std); print any of them to inspect
## Minimum
Min_Bias_Historical_Train_1D = np.min(Bias_Historical_Train_1D)
Min_Bias_Historical_Validate_1D = np.min(Bias_Historical_Validate_1D)
Min_Bias_Historical_Test_1D = np.min(Bias_Historical_Test_1D)
Min_Bias_Historical_All_1D = np.min(Bias_Historical_All_1D)
Min_Bias_Historical_Correct_Train_1D = np.min(Bias_Historical_Correct_Train_1D)
Min_Bias_Historical_Correct_Validate_1D = np.min(Bias_Historical_Correct_Validate_1D)
Min_Bias_Historical_Correct_Test_1D = np.min(Bias_Historical_Correct_Test_1D)
Min_Bias_Historical_Correct_All_1D = np.min(Bias_Historical_Correct_All_1D)
## Maximum
Max_Bias_Historical_Train_1D = np.max(Bias_Historical_Train_1D)
Max_Bias_Historical_Validate_1D = np.max(Bias_Historical_Validate_1D)
Max_Bias_Historical_Test_1D = np.max(Bias_Historical_Test_1D)
Max_Bias_Historical_All_1D = np.max(Bias_Historical_All_1D)
Max_Bias_Historical_Correct_Train_1D = np.max(Bias_Historical_Correct_Train_1D)
Max_Bias_Historical_Correct_Validate_1D = np.max(Bias_Historical_Correct_Validate_1D)
Max_Bias_Historical_Correct_Test_1D = np.max(Bias_Historical_Correct_Test_1D)
Max_Bias_Historical_Correct_All_1D = np.max(Bias_Historical_Correct_All_1D)
# Mean bias
Mean_Bias_Historical_Train_1D = np.mean(Bias_Historical_Train_1D)
Mean_Bias_Historical_Validate_1D = np.mean(Bias_Historical_Validate_1D)
Mean_Bias_Historical_Test_1D = np.mean(Bias_Historical_Test_1D)
Mean_Bias_Historical_All_1D = np.mean(Bias_Historical_All_1D)
Mean_Bias_Historical_Correct_Train_1D = np.mean(Bias_Historical_Correct_Train_1D)
Mean_Bias_Historical_Correct_Validate_1D = np.mean(Bias_Historical_Correct_Validate_1D)
Mean_Bias_Historical_Correct_Test_1D = np.mean(Bias_Historical_Correct_Test_1D)
Mean_Bias_Historical_Correct_All_1D = np.mean(Bias_Historical_Correct_All_1D)
# Standard deviation
Std_Bias_Historical_Train_1D = np.std(Bias_Historical_Train_1D, ddof = 0)
Std_Bias_Historical_Validate_1D = np.std(Bias_Historical_Validate_1D, ddof = 0)
Std_Bias_Historical_Test_1D = np.std(Bias_Historical_Test_1D, ddof = 0)
Std_Bias_Historical_All_1D = np.std(Bias_Historical_All_1D, ddof = 0)
Std_Bias_Historical_Correct_Train_1D = np.std(Bias_Historical_Correct_Train_1D, ddof = 0)
Std_Bias_Historical_Correct_Validate_1D = np.std(Bias_Historical_Correct_Validate_1D, ddof = 0)
Std_Bias_Historical_Correct_Test_1D = np.std(Bias_Historical_Correct_Test_1D, ddof = 0)
Std_Bias_Historical_Correct_All_1D = np.std(Bias_Historical_Correct_All_1D, ddof = 0)

###  Plot the bias of the reconstructed SST
## Settings
fontsizenum = 16 
markersizenum = 6
Num_Year = 2014-1929+1
Num_Sample_Bias = Num_Year*12
X_Time = np.arange(0, Num_Sample_Bias )
Time_Bias = np.array(Time_ERSST[-Num_Sample_Bias::])
xtick_Bias_interval = 12*12
xticklabel_Bias = list(range(0, Num_Sample_Bias, xtick_Bias_interval)); print(xticklabel_Bias)
xticklabels_Bias = Time_Bias[xticklabel_Bias[0::]].astype('str').tolist()
ylim_bias_max = 0.6
ylim_bias_min = -0.6
delta_ylim_bias = ylim_bias_max - ylim_bias_min
yticklabel_Bias = np.linspace(ylim_bias_min, ylim_bias_max, num = 7).tolist()
yticklabel_Bias = [round(i,1) for i in yticklabel_Bias]; print(yticklabel_Bias)
yticklabels_Bias = [str(i) for i in yticklabel_Bias]; print(yticklabels_Bias)
### Plot
fig, ax = plt.subplots(1, 1, figsize = (16,9), dpi = 600)
ax.plot(X_Time, Bias_Historical_All_1D, 'b-', linewidth = 2, marker = '.', markersize = markersizenum, mfc = 'blue', label = 'FIO-ESM v2')
ax.plot(X_Time, Bias_Historical_Correct_All_1D, 'r-', linewidth = 2, marker = '.', markersize = markersizenum, mfc = 'red', label = 'EEMD-BPNN')
plt.axhline(y = 0 , c = 'k', ls = '--', lw = 3)
ax.set_xlim(0, Num_Sample_Bias)
ax.set_ylim(ylim_bias_min, ylim_bias_max)  
ax.set_xticks(xticklabel_Bias)
ax.set_yticks(yticklabel_Bias)  
ax.set_xticklabels(xticklabels_Bias, fontsize = fontsizenum)
ax.set_yticklabels(yticklabels_Bias, fontsize = fontsizenum)
ax.set_xlabel('Time', fontsize=fontsizenum)
ax.set_ylabel('SST Bias(℃)', fontsize = fontsizenum)
legend_font = { 'weight': 'normal', 'size': fontsizenum}
ax.legend(loc = 'upper left', frameon = False, prop = legend_font, ncol = 1)
ax.grid(linestyle='-.')
## Save the figure
fns = os.path.join(Path_SaveFigures, 'SSTBias_Historical_Correct.png')
plt.savefig(fns, dpi = 600, bbox_inches = 'tight')
plt.close()


[0, 144, 288, 432, 576, 720, 864, 1008]
[-0.6, -0.4, -0.2, 0.0, 0.2, 0.4, 0.6]
['-0.6', '-0.4', '-0.2', '0.0', '0.2', '0.4', '0.6']


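The figure above shows the bias of the simulated global-mean SST before (blue) and after (red) correction. The corrected red series clearly lies closer to the zero-bias line.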

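#### 5.3.2.2. Bar charts of the error by month
Having looked at the time series as a whole, we can also examine how the bias behaves in each calendar month before and after correction.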
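
Because the monthly series is stored as a `(years, 12)` array, a per-month metric is just a reduction over the year axis. The sketch below uses synthetic data (the random observations and the month-dependent bias are invented for illustration) to show that a vectorised `axis=0` reduction gives the same result as looping over the 12 months.

```python
import numpy as np

rng = np.random.default_rng(0)
num_year = 86

# Synthetic observations and a model whose bias depends on the calendar month
obs_2d = 18.0 + rng.normal(0.0, 0.1, size=(num_year, 12))
monthly_bias = np.linspace(0.1, 0.3, 12)
model_2d = obs_2d + monthly_bias          # broadcast over the year axis

# Vectorised: average the squared error over years (axis 0), one value per month
rmse_by_month = np.sqrt(np.mean((model_2d - obs_2d) ** 2, axis=0))

# Loop form, one calendar month at a time
rmse_loop = np.array([
    np.sqrt(np.mean((model_2d[:, m] - obs_2d[:, m]) ** 2)) for m in range(12)
])

print(rmse_by_month.shape)
print(np.allclose(rmse_by_month, rmse_loop))
```

With a bias that is constant within each month, the monthly RMSE simply recovers that bias, which makes the toy result easy to verify by eye.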

**Reconstructed SST by calendar month: before/after correction, full period**

In [69]:
### RMSE for each calendar month
RMSE_SST_Historical_12Month_All = np.zeros(12)
RMSE_SST_Historical_12Month_Correct_All = np.zeros(12)
for month in range(12):
    RMSE_SST_Historical_12Month_All[month] = rmse(SST_Historical_2D[:,month], SST_ERSST_2D[:,month])
    RMSE_SST_Historical_12Month_Correct_All[month] = rmse(SST_Historical_Correct_All_2D[:,month], SST_ERSST_2D[:,month])
print(RMSE_SST_Historical_12Month_All)
print(RMSE_SST_Historical_12Month_Correct_All)
### Correlation coefficient for each calendar month (column 0: R, column 1: p-value)
R_SST_Historical_12Month_All = np.zeros( (12,2) )
R_SST_Historical_12Month_Correct_All = np.zeros( (12,2) )
for month in range(12):
    R_SST_Historical_12Month_All[month,:] = scipy.stats.pearsonr(SST_Historical_2D[:,month], SST_ERSST_2D[:,month])[:]
    R_SST_Historical_12Month_Correct_All[month,:] = scipy.stats.pearsonr(SST_Historical_Correct_All_2D[:,month], SST_ERSST_2D[:,month])[:]
print(R_SST_Historical_12Month_All)
print(R_SST_Historical_12Month_Correct_All)

### Plot: monthly performance of the reconstructed SST (RMSE / correlation coefficient)
## Settings
str_panels = ['a)RMSE', 'b)R']
fontsizenum = 16  
barwidthnum = 0.3
X_Month = np.arange(1,13)
xticklabel_Month = list(range(0,14))
xticklabels_Month = ['','Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec','']
## RMSE axis
ylim_rmse_max = 0.4
ylim_rmse_min = 0
delta_rmse_bias = ylim_rmse_max - ylim_rmse_min
yticklabel_RMSE = np.linspace(ylim_rmse_min, ylim_rmse_max, num = 9).tolist()
yticklabel_RMSE = [round(i,2) for i in yticklabel_RMSE]; print(yticklabel_RMSE)
yticklabels_RMSE = [str(i) for i in yticklabel_RMSE]; print(yticklabels_RMSE)
## Correlation-coefficient axis
ylim_R_max = 1.0
ylim_R_min = 0.7
delta_R_bias = ylim_R_max - ylim_R_min
yticklabel_R = np.linspace(ylim_R_min, ylim_R_max, num = 7).tolist()
yticklabel_R = [round(i,2) for i in yticklabel_R]; print(yticklabel_R)
yticklabels_R = [str(i) for i in yticklabel_R]; print(yticklabels_R)

## Plot
fig, ax = plt.subplots(2, 1, figsize = (16,9), dpi = 600)
# Panel a): RMSE
ax = plt.subplot(2, 1, 1)
bar1 = ax.bar(X_Month - barwidthnum, RMSE_SST_Historical_12Month_All, color = 'blue',  alpha = 1,  width = barwidthnum, edgecolor = 'black', linewidth = 1.5, bottom = 0, align = 'edge') # label = 'FIO-ESM v2'
bar2 = ax.bar(X_Month, RMSE_SST_Historical_12Month_Correct_All, color = 'red',  alpha = 1,  width = barwidthnum, edgecolor = 'black', linewidth = 1.5, bottom = 0, align = 'edge') # label = 'EEMD-BPNN'
for month in range(12):
    Y_Month = np.round( RMSE_SST_Historical_12Month_All[month], 3)
    Y_Month_Correct = np.round( RMSE_SST_Historical_12Month_Correct_All[month], 3)
    Str_Y_Month = str(Y_Month)
    Str_Y_Month_Correct = str(Y_Month_Correct)
    ax.text(X_Month[month] - barwidthnum/2, Y_Month + 0.01, Str_Y_Month, horizontalalignment ='center', fontsize = 10 )
    ax.text(X_Month[month] + barwidthnum/2, Y_Month_Correct + 0.01, Str_Y_Month_Correct, horizontalalignment ='center', fontsize = 10 )
plot1 = plt.axhline(y = RMSE_SST_Historical_All , c = 'blue', ls = '-', lw = 2)
plot2 = plt.axhline(y = RMSE_SST_Historical_Correct_All , c = 'red', ls = '-', lw = 2)
Y_Month = np.round(RMSE_SST_Historical_All, 3); Str_Y_Month = str(Y_Month)
Y_Month_Correct = np.round(RMSE_SST_Historical_Correct_All, 3); Str_Y_Month_Correct = str(Y_Month_Correct)
ax.text(0.3, Y_Month + 0.01, Str_Y_Month, horizontalalignment='center', fontsize = 14)
ax.text(0.3, Y_Month_Correct + 0.01, Str_Y_Month_Correct, horizontalalignment='center', fontsize = 14)
ax.text(0, ylim_rmse_min + delta_rmse_bias*0.92, str_panels[0], fontsize=fontsizenum)   # color = 'black'
ax.set(xlim = (0, 13), ylim = (ylim_rmse_min, ylim_rmse_max))
ax.tick_params(axis = 'both', which = 'both', direction = 'in')
ax.set_xticks(xticklabel_Month)
ax.set_yticks(yticklabel_RMSE)
ax.set_xticklabels(xticklabels_Month, fontsize = fontsizenum)
ax.set_yticklabels(yticklabels_RMSE, fontsize = fontsizenum)
ax.set_xlabel('Month', fontsize=fontsizenum)
ax.set_ylabel('RMSE(℃)', fontsize = fontsizenum)
ax.grid(axis = 'y', linestyle = '--', lw = 1)  # color='grey', alpha=0.5
legend_font = {'weight': 'normal', 'size': fontsizenum}
plt.legend([bar1, bar2, plot1, plot2], ["FIO-ESM v2", "EEMD-BPNN", "FIO-ESM v2(All Time)","EEMD-BPNN(All Time)"],
           bbox_to_anchor = (0.3, 1.0), loc = 'upper center', ncol = 2, frameon = False, prop = legend_font)

# Panel b): correlation coefficient
ax = plt.subplot(2, 1, 2)
bar1 = ax.bar(X_Month - barwidthnum, R_SST_Historical_12Month_All[:,0], color = 'blue',  alpha = 1,  width = barwidthnum, edgecolor = 'black', linewidth = 1.5, bottom = 0, align = 'edge') # label = 'FIO-ESM v2'
bar2 = ax.bar(X_Month, R_SST_Historical_12Month_Correct_All[:,0], color = 'red',  alpha = 1,  width = barwidthnum, edgecolor = 'black', linewidth = 1.5, bottom = 0, align = 'edge') # label = 'EEMD-BPNN'
for month in range(12):
    Y_Month = np.round( R_SST_Historical_12Month_All[month,0], 3)
    Y_Month_Correct = np.round( R_SST_Historical_12Month_Correct_All[month,0], 3)
    Str_Y_Month = str(Y_Month)
    Str_Y_Month_Correct = str(Y_Month_Correct)
    ax.text(X_Month[month] - barwidthnum/2, Y_Month + 0.01, Str_Y_Month, horizontalalignment ='center', fontsize = 10 )
    ax.text(X_Month[month] + barwidthnum/2, Y_Month_Correct + 0.01, Str_Y_Month_Correct, horizontalalignment ='center', fontsize = 10 )
plt.axhline(y = R_SST_Historical_All[0] , c = 'blue', ls = '-', lw = 2)
plt.axhline(y = R_SST_Historical_Correct_All[0] , c = 'red', ls = '-', lw = 2)
Y_Month = np.round(R_SST_Historical_All[0], 3); Str_Y_Month = str(Y_Month)
Y_Month_Correct = np.round(R_SST_Historical_Correct_All[0], 3); Str_Y_Month_Correct = str(Y_Month_Correct)
ax.text(0.3, Y_Month + 0.01, Str_Y_Month, horizontalalignment='center', fontsize = 14)
ax.text(0.3, Y_Month_Correct + 0.01, Str_Y_Month_Correct, horizontalalignment='center', fontsize = 14)
ax.text(0, ylim_R_min + delta_R_bias*0.92, str_panels[1], fontsize=fontsizenum)   # color = 'black'
ax.set(xlim = (0, 13), ylim = (ylim_R_min, ylim_R_max))
ax.tick_params(axis='both', which='both', direction='in')
ax.set_xticks(xticklabel_Month)
ax.set_yticks(yticklabel_R)
ax.set_xticklabels(xticklabels_Month, fontsize = fontsizenum)
ax.set_yticklabels(yticklabels_R, fontsize = fontsizenum)
ax.set_xlabel('Month', fontsize=fontsizenum)
ax.set_ylabel('R',fontsize = fontsizenum)
ax.grid(axis = 'y', linestyle = '--', lw = 1)  # color='grey', alpha=0.5
## Save the figure
fns = os.path.join(Path_SaveFigures, 'Bar_Historical-Correct_12Month.png')
plt.savefig(fns, dpi = 600, bbox_inches = 'tight')
plt.close()


[0.19498177 0.1950517  0.16868618 0.1516922  0.12982823 0.13920275
 0.2280399  0.31332147 0.31419389 0.23615461 0.17752383 0.16770578]
[0.11809566 0.11227667 0.10865488 0.10622012 0.09938984 0.09789643
 0.09355061 0.09419723 0.09857831 0.10428939 0.11392125 0.1238845 ]
[[7.26889347e-01 2.31192034e-15]
 [7.53750539e-01 5.51990233e-17]
 [7.93474324e-01 8.32068454e-20]
 [8.08324620e-01 5.03126134e-21]
 [8.30400201e-01 4.78380599e-23]
 [8.53433260e-01 1.71393500e-25]
 [8.83863112e-01 1.86870077e-29]
 [8.97805952e-01 1.16565372e-31]
 [8.71509756e-01 1.00291798e-27]
 [8.26196218e-01 1.22063667e-22]
 [7.60342006e-01 2.05033005e-17]
 [7.24486452e-01 3.16033702e-15]]
[[8.33370394e-01 2.42982805e-23]
 [8.53099329e-01 1.87233284e-25]
 [8.64419872e-01 8.22547299e-27]
 [8.75900037e-01 2.55711587e-28]
 [8.91370267e-01 1.32316686e-30]
 [9.09909855e-01 7.54141100e-34]
 [9.21925503e-01 2.37271806e-36]
 [9.17183146e-01 2.55835975e-35]
 [8.99127783e-01 6.93712206e-32]
 [8.70705542e-01 1.28122132e-27]
 [8

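The figure above compares the model's global-mean SST with observations for each calendar month, before and after correction: a) root-mean-square error, b) correlation coefficient (the second panel in the code is the one labelled `R`). Bars give the monthly values (blue: before correction, red: after correction); horizontal lines give the values over the whole period (blue: before correction, red: after correction).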
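As a rough illustration of how per-month scores like these can be obtained, the sketch below computes the RMSE and Pearson correlation for each calendar month using NumPy only. The function name `monthly_rmse_and_r` and the synthetic series are assumptions made for this example, not the notebook's own code, and the p-values that appear in the two-column output above are not reproduced here.

```python
import numpy as np

def monthly_rmse_and_r(model, obs):
    """Return per-calendar-month RMSE and Pearson R for two monthly series.

    Both inputs are 1-D arrays that start in January and cover whole years.
    """
    model = np.asarray(model, dtype=float).reshape(-1, 12)  # rows: years, columns: months
    obs = np.asarray(obs, dtype=float).reshape(-1, 12)
    rmse = np.sqrt(np.mean((model - obs) ** 2, axis=0))
    r = np.array([np.corrcoef(model[:, m], obs[:, m])[0, 1] for m in range(12)])
    return rmse, r

# Synthetic data (hypothetical, for illustration only): 30 years of monthly values
rng = np.random.default_rng(0)
n_years = 30
seasonal = np.tile(np.sin(2 * np.pi * np.arange(12) / 12), n_years)
trend = np.linspace(0.0, 0.5, n_years * 12)
obs_demo = 18.0 + 0.3 * seasonal + trend + rng.normal(0, 0.05, n_years * 12)
model_demo = obs_demo + 0.2 + rng.normal(0, 0.05, n_years * 12)  # warm bias plus noise

rmse_demo, r_demo = monthly_rmse_and_r(model_demo, obs_demo)
print(np.round(rmse_demo, 3))
print(np.round(r_demo, 3))
```

Reshaping to (years, 12) keeps each calendar month in its own column, so the seasonal cycle does not inflate the correlation the way it would if all months were pooled.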

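#### 5.4.2.3. Time series of future projections
Building on the historical correction, we now apply the model to correct the future projections.

**Reconstructed SST: before/after correction, SSP126/245/585**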

In [70]:
### Settings
fontsizenum = 16  
markersizenum = 4
Num_Year = 2100-2015+1
Num_Sample_SSP = Num_Year*12
X_Time = np.arange(0, Num_Sample_SSP )
Time_SSP = np.array(Time_SSP)
xtick_SSP_interval = 12*12
## SSP126
model_1 = LinearRegression(); model_2 = LinearRegression()
X_Time_2D = X_Time.reshape((-1, 1))
SST_SSP126_2D = SST_SSP126.reshape((-1, 1)); SST_SSP126_Correct_2D = SST_SSP126_Correct_1D.reshape((-1, 1))
model_1.fit(X_Time_2D, SST_SSP126_2D); model_2.fit(X_Time_2D, SST_SSP126_Correct_2D)
Regress_SST_SSP126 = model_1.predict(X_Time_2D); Regress_SST_SSP126_Correct = model_2.predict(X_Time_2D)
xticklabel_SSP = list(range(0, Num_Sample_SSP, xtick_SSP_interval))
print(xticklabel_SSP)
xticklabels_SSP = Time_SSP[xticklabel_SSP[0::]].astype('str').tolist()
ylim_ssp126_max = 22
ylim_ssp126_min = 18
delta_ylim_ssp126 = ylim_ssp126_max - ylim_ssp126_min
yticklabel_SSP126 = np.linspace(ylim_ssp126_min, ylim_ssp126_max, num = 5).tolist()
yticklabel_SSP126 = [round(i,1) for i in yticklabel_SSP126]
print(yticklabel_SSP126)
yticklabels_SSP126 = [str(i) for i in yticklabel_SSP126]
print(yticklabels_SSP126)
## SSP245
model_1 = LinearRegression(); model_2 = LinearRegression()
X_Time_2D = X_Time.reshape((-1, 1))
SST_SSP245_2D = SST_SSP245.reshape((-1, 1)); SST_SSP245_Correct_2D = SST_SSP245_Correct_1D.reshape((-1, 1))
model_1.fit(X_Time_2D, SST_SSP245_2D); model_2.fit(X_Time_2D, SST_SSP245_Correct_2D)
Regress_SST_SSP245 = model_1.predict(X_Time_2D); Regress_SST_SSP245_Correct = model_2.predict(X_Time_2D)
ylim_ssp245_max = 22
ylim_ssp245_min = 18
delta_ylim_ssp245 = ylim_ssp245_max - ylim_ssp245_min
yticklabel_SSP245 = np.linspace(ylim_ssp245_min, ylim_ssp245_max, num = 5).tolist()
yticklabel_SSP245 = [round(i,1) for i in yticklabel_SSP245]
print(yticklabel_SSP245)
yticklabels_SSP245 = [str(i) for i in yticklabel_SSP245]
print(yticklabels_SSP245)
## SSP585
model_1 = LinearRegression(); model_2 = LinearRegression()
X_Time_2D = X_Time.reshape((-1, 1))
SST_SSP585_2D = SST_SSP585.reshape((-1, 1)); SST_SSP585_Correct_2D = SST_SSP585_Correct_1D.reshape((-1, 1))
model_1.fit(X_Time_2D, SST_SSP585_2D); model_2.fit(X_Time_2D, SST_SSP585_Correct_2D)
Regress_SST_SSP585 = model_1.predict(X_Time_2D); Regress_SST_SSP585_Correct = model_2.predict(X_Time_2D)
ylim_ssp585_max = 22
ylim_ssp585_min = 18
delta_ylim_ssp585 = ylim_ssp585_max - ylim_ssp585_min
yticklabel_SSP585 = np.linspace(ylim_ssp585_min, ylim_ssp585_max, num = 5).tolist()
yticklabel_SSP585 = [round(i,1) for i in yticklabel_SSP585]
print(yticklabel_SSP585)
yticklabels_SSP585 = [str(i) for i in yticklabel_SSP585]
print(yticklabels_SSP585)
str_panels = ['a)SSP1-2.6 ', 'b)SSP2-4.5', 'c)SSP5-8.5']

### Plotting
fig, ax = plt.subplots(3, 1, figsize = (16,9), dpi = 600)
## SSP126
ax = plt.subplot(3, 1, 1)
plot1, = ax.plot(X_Time, SST_SSP126, 'b-', linewidth = 2, marker = '.', markersize = markersizenum, mfc = 'blue' )
plot2, = ax.plot(X_Time, SST_SSP126_Correct_1D, 'r-', linewidth = 2, marker = '.', markersize = markersizenum, mfc = 'red' )
# ax.plot(X_Time, Regress_SST_SSP126.squeeze(), 'b--', linewidth = 2)
# ax.plot(X_Time, Regress_SST_SSP126_Correct.squeeze(), 'r--', linewidth = 2)
ax.text(1, ylim_ssp126_min + delta_ylim_ssp126*0.9, str_panels[0], fontsize = fontsizenum)
ax.set_xlim(0, Num_Sample_SSP)
ax.set_ylim(ylim_ssp126_min, ylim_ssp126_max)  
ax.set_xticks(xticklabel_SSP)
ax.set_yticks(yticklabel_SSP126) 
ax.set_xticklabels([])
ax.set_yticklabels(yticklabels_SSP126 , fontsize = fontsizenum)
# ax.set_xlabel('Time', fontfamily='Times New Roman', fontsize=fontsizenum)
ax.set_ylabel('SST(℃)', fontsize = fontsizenum)
ax.grid(linestyle='-.')
legend_font = { 'weight': 'normal', 'size': fontsizenum}
plt.legend([plot1, plot2], ["FIO-ESM v2", "EEMD-BPNN"], loc = 'upper right', ncol = 1, frameon = False, prop = legend_font) # bbox_to_anchor = (0.3, 1.0),

## SSP245
ax = plt.subplot(3, 1, 2)
ax.plot(X_Time, SST_SSP245, 'b-', linewidth = 2, marker = '.', markersize = markersizenum, mfc = 'blue' )
ax.plot(X_Time, SST_SSP245_Correct_1D, 'r-', linewidth = 2, marker = '.', markersize = markersizenum, mfc = 'red' )
ax.text(1, ylim_ssp245_min + delta_ylim_ssp245*0.9, str_panels[1], fontsize = fontsizenum)
ax.set_xlim(0, Num_Sample_SSP)
ax.set_ylim(ylim_ssp245_min, ylim_ssp245_max) 
ax.set_xticks(xticklabel_SSP)
ax.set_yticks(yticklabel_SSP245)  
ax.set_xticklabels([], fontsize = fontsizenum)
ax.set_yticklabels(yticklabels_SSP245, fontsize = fontsizenum)
ax.set_ylabel('SST(℃)', fontsize = fontsizenum)
ax.grid(linestyle='-.')

## SSP585
ax = plt.subplot(3, 1, 3)
ax.plot(X_Time, SST_SSP585, 'b-', linewidth = 2, marker = '.', markersize = markersizenum, mfc = 'blue' )
ax.plot(X_Time, SST_SSP585_Correct_1D, 'r-', linewidth = 2, marker = '.', markersize = markersizenum, mfc = 'red' )
ax.text(1, ylim_ssp585_min + delta_ylim_ssp585*0.9, str_panels[2], fontsize = fontsizenum)
ax.set_xlim(0, Num_Sample_SSP)
ax.set_ylim(ylim_ssp585_min, ylim_ssp585_max)  
ax.set_xticks(xticklabel_SSP)
ax.set_yticks(yticklabel_SSP585)  
ax.set_xticklabels(xticklabels_SSP, fontsize = fontsizenum)
ax.set_yticklabels(yticklabels_SSP585,  fontsize = fontsizenum)
ax.set_xlabel('Time', fontsize=fontsizenum)
ax.set_ylabel('SST(℃)', fontsize = fontsizenum)
ax.grid(linestyle='-.')
## Save the figure
fns = os.path.join(Path_SaveFigures, 'SST_SSP_Correct.png')
plt.savefig(fns, dpi = 600, bbox_inches = 'tight')
plt.close()


[0, 144, 288, 432, 576, 720, 864, 1008]
[18.0, 19.0, 20.0, 21.0, 22.0]
['18.0', '19.0', '20.0', '21.0', '22.0']
[18.0, 19.0, 20.0, 21.0, 22.0]
['18.0', '19.0', '20.0', '21.0', '22.0']
[18.0, 19.0, 20.0, 21.0, 22.0]
['18.0', '19.0', '20.0', '21.0', '22.0']


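The figure above shows the global-mean SST time series under three future scenarios, before and after correction: a) the low-emission scenario SSP1-2.6, b) the intermediate-emission scenario SSP2-4.5, c) the high-emission scenario SSP5-8.5. Blue is before correction, red is after correction.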
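The cell above also fits a `LinearRegression` to each series, although the fitted lines are left commented out in the plot. As a rough illustration of what such a fit captures, the sketch below estimates a linear trend in ℃ per decade from a monthly series. It uses `np.polyfit` rather than scikit-learn to stay dependency-free; the function name `trend_per_decade` and the synthetic series are assumptions for this example.

```python
import numpy as np

def trend_per_decade(monthly_series):
    """Least-squares linear trend of a monthly series, in units per decade."""
    y = np.asarray(monthly_series, dtype=float)
    t = np.arange(y.size)                    # time index in months
    slope, _intercept = np.polyfit(t, y, 1)  # slope is in units per month
    return slope * 120.0                     # 120 months per decade

# Synthetic series (hypothetical): monthly values for 2015-2100 with a fixed linear warming
n_months = (2100 - 2015 + 1) * 12
demo = 19.0 + (0.2 / 120.0) * np.arange(n_months)
print(round(trend_per_decade(demo), 3))
```

Applied to the raw and corrected series of one scenario, the difference between the two slopes would give a single-number view of how much the correction changes the projected warming rate.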

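**SST increase: before/after correction, SSP126/245/585**
Using the model's projected global-mean SST before and after correction, we compute the warming over the last 20 years of this century (2081-2100) relative to the most recent 20 years of the historical period (1995-2014).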

In [71]:
# Note: the Regress_* arrays are the linear-regression fits from the previous cell,
# so the means below are taken over the fitted trend line, not the raw monthly series.
# ssp126
SST_ERSST_End20 = SST_ERSST[-20*12::]  # 1995-2014
SST_SSP126_End20 = Regress_SST_SSP126[-20*12::]  # 2081-2100
SST_SSP126_Correct_End20 = Regress_SST_SSP126_Correct[-20*12::]  # 2081-2100
Delta_SST_SSP126 = np.mean(SST_SSP126_End20) - np.mean(SST_ERSST_End20)
Delta_SST_SSP126_Correct = np.mean(SST_SSP126_Correct_End20) - np.mean(SST_ERSST_End20)
print('SSP1-2.6: ', 'before correction:', Delta_SST_SSP126, 'after correction:', Delta_SST_SSP126_Correct)
# ssp245
SST_SSP245_End20 = Regress_SST_SSP245[-20*12::]  # 2081-2100
SST_SSP245_Correct_End20 = Regress_SST_SSP245_Correct[-20*12::]  # 2081-2100
Delta_SST_SSP245 = np.mean(SST_SSP245_End20) - np.mean(SST_ERSST_End20)
Delta_SST_SSP245_Correct = np.mean(SST_SSP245_Correct_End20) - np.mean(SST_ERSST_End20)
print('SSP2-4.5: ', 'before correction:', Delta_SST_SSP245, 'after correction:', Delta_SST_SSP245_Correct)

# ssp585
SST_SSP585_End20 = Regress_SST_SSP585[-20*12::]  # 2081-2100
SST_SSP585_Correct_End20 = Regress_SST_SSP585_Correct[-20*12::]  # 2081-2100
Delta_SST_SSP585 = np.mean(SST_SSP585_End20) - np.mean(SST_ERSST_End20)
Delta_SST_SSP585_Correct = np.mean(SST_SSP585_Correct_End20) - np.mean(SST_ERSST_End20)
print('SSP5-8.5: ', 'before correction:', Delta_SST_SSP585, 'after correction:', Delta_SST_SSP585_Correct)


SSP1-2.6:  before correction: 0.8341953582201391 after correction: 0.7326348564934939
SSP2-4.5:  before correction: 1.5437369682519702 after correction: 1.3381051084507085
SSP5-8.5:  before correction: 2.90624493579967 after correction: 2.5231582284102494


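Under all three future scenarios, the corrected global-mean SST warms less than the uncorrected one. This suggests that the correction model has learned from the historical bias that the model runs warm relative to observations; in other words, our model overestimates warming, so the correction model pulls each future projection down slightly.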
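To put a number on how far the projections are pulled down, the short sketch below works out the absolute and relative reduction in warming from the six values printed above. The dictionary and variable names are introduced here for illustration only. With these inputs the reduction comes out at roughly 12 to 13 percent in each scenario.

```python
# Warming by 2081-2100 relative to 1995-2014 (℃), copied from the printed output above
delta_before = {
    'SSP1-2.6': 0.8341953582201391,
    'SSP2-4.5': 1.5437369682519702,
    'SSP5-8.5': 2.90624493579967,
}
delta_after = {
    'SSP1-2.6': 0.7326348564934939,
    'SSP2-4.5': 1.3381051084507085,
    'SSP5-8.5': 2.5231582284102494,
}

reduction_pct = {}
for name in delta_before:
    diff = delta_before[name] - delta_after[name]            # absolute reduction in ℃
    reduction_pct[name] = 100.0 * diff / delta_before[name]  # relative reduction in percent
    print(f'{name}: {diff:.3f} ℃ less warming ({reduction_pct[name]:.1f}% lower)')
```

The relative reduction is similar across scenarios while the absolute reduction grows with the emission level, which is what one would expect if the correction acts roughly like a scaling of the model's warming signal. That reading is an interpretation of these six numbers, not a result established elsewhere in the tutorial.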

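## Summary
In this chapter we introduced the BPNN machine-learning model and the basic modeling steps, then built a correction model for the historical simulation of global-mean SST, and finally applied it to correct the future projections.
This concludes the teaching content of the project. We hope you will study it carefully and get something out of it. To help you consolidate what you have learned, the next chapter sets two small exercises; we expect you are eager to try them, so let's go and work through them together.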