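# Data structure

Essentially, any meteorological dataset can be represented as an array of shape (N, M), where N is the number of samples and M is the number of features per sample.

For example, China's summer-mean precipitation for each year of 1979-2018 is a one-dimensional series of length 40. It can be written as (40, 1): 40 samples with 1 feature each.

The same quantity at 500 stations in China is a two-dimensional array of shape (40, 500).

On a grid, it is a three-dimensional array of shape (40, nlat, nlon), which can be reshaped to (40, nlat·nlon): 40 samples with nlat·nlon features each.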
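The reshaping described above can be sketched with NumPy. The grid size and the random values here are placeholders chosen for illustration, not real precipitation data.

```python
import numpy as np

nyear, nlat, nlon = 40, 4, 6             # small illustrative grid (assumed sizes)
rng = np.random.default_rng(0)
pre = rng.random((nyear, nlat, nlon))    # stand-in for a (time, lat, lon) field

# Flatten the two spatial dimensions into one feature dimension: (N, M)
pre_2d = pre.reshape(nyear, nlat * nlon)
print(pre_2d.shape)                      # (40, 24)

# Operate along the sample axis, then restore the map shape
mean_map = pre_2d.mean(axis=0).reshape(nlat, nlon)
print(mean_map.shape)                    # (4, 6)
```

Because the reshape only regroups the spatial axes, averaging over the sample axis of the (N, M) array gives the same map as averaging over time in the original three-dimensional array.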

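# Composite analysis

Put simply, samples that share a given characteristic are grouped together, and the groups are then compared for similarities and differences.

N = a1+a2+a3+...+b1+b2+b3+...+...

Here a1, a2, a3, ... are the samples in one class, b1, b2, b3, ... those in another, and so on.

Common uses:

1. In interannual variability analysis, compositing high-value years and low-value years

2. In interdecadal variability analysis, compositing different decadal periods

3. Compositing events of the same type

...

Whatever its form, a composite analysis always operates on the N dimension of the (N, M) array: the N samples are divided into classes, and each class's subset of samples is averaged.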

In [1]:
import numpy as np
import matplotlib.pyplot as plt
# 40 values, one per year for 1979-2018
a = np.array([15.4,14.6,15.8,14.8,15.0,15.1,15.1,15.0,15.2,15.4,14.8,15.0,15.1,14.7,16.0,15.7,15.4,14.5,15.1,
15.3,15.5,15.1,15.6,15.1,15.1,14.9,15.5,15.3,15.3,15.4,15.7,15.2,15.5,15.5,15.6,15.1,15.1,16.0,16.0,16.8])
# Standardize: subtract the mean, divide by the standard deviation
a_std = (a - a.mean())/a.std()
year = np.arange(1979,2019,1)

fig = plt.figure(figsize=(12,8))
ax1 = fig.add_axes([0.1, 0.1, 0.8, 0.4])
ax1.plot(year,a_std)
# Thresholds at +1 and -1 standard deviation
ax1.axhline(1,c='k')
ax1.axhline(-1,c='k')
plt.show()

In [4]:
# Years more than one standard deviation above / below the mean
year_high = year[a_std>1]
year_low = year[a_std<-1]
print('high:',year_high,'\n','low:',year_low)

high: [1981 1993 2016 2017 2018] 
 low: [1980 1982 1989 1992 1996]
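Once the high and low years are known, the composite itself is an average over the selected rows of the (N, M) array. The sketch below reuses the index values from the cell above; the array named field is hypothetical random data standing in for whatever variable is being composited.

```python
import numpy as np

# Index values copied from the cell above (one per year, 1979-2018)
a = np.array([15.4,14.6,15.8,14.8,15.0,15.1,15.1,15.0,15.2,15.4,14.8,15.0,15.1,14.7,16.0,15.7,15.4,14.5,15.1,
15.3,15.5,15.1,15.6,15.1,15.1,14.9,15.5,15.3,15.3,15.4,15.7,15.2,15.5,15.5,15.6,15.1,15.1,16.0,16.0,16.8])
a_std = (a - a.mean()) / a.std()

# Hypothetical (N, M) data: 40 samples, 6 features (random, for illustration only)
rng = np.random.default_rng(0)
field = rng.standard_normal((a.size, 6))

high = a_std > 1                        # boolean mask over the sample axis N
low = a_std < -1

comp_high = field[high].mean(axis=0)    # composite of high-value years, shape (M,)
comp_low = field[low].mean(axis=0)      # composite of low-value years, shape (M,)
comp_diff = comp_high - comp_low        # high-minus-low composite difference
print(high.sum(), low.sum(), comp_diff.shape)
```

The masks act only on the sample axis, so the same two lines work unchanged whether M is 1, a set of stations, or a flattened grid.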


## Compositing summer precipitation anomalies for El Niño and La Niña years

In [8]:
import xarray as xr
import numpy as np

# El Niño and La Niña years used for the composites
nino_year = np.array([1951,1957,1963,1965,1968,1972,1976,1977,1979,1982,1986,1991,1994,1997,2002,2004,2006,2009,2014])
nina_year = np.array([1950,1954,1964,1970,1973,1975,1984,1988,1995,1998,2000,2007,2010,2011])
f = xr.open_dataset('/home/mw/input/moyu1828/precip.mon.mean.nc')
# Summer (JJA) months: all years for the climatology, event years for the composites.
# Listed years outside the file's time range simply select nothing.
pre_clim = f.precip.loc[f.time.dt.month.isin([6,7,8])]
pre_nino = f.precip.loc[(f.time.dt.month.isin([6,7,8])) & (f.time.dt.year.isin(nino_year))]
pre_nina = f.precip.loc[(f.time.dt.month.isin([6,7,8])) & (f.time.dt.year.isin(nina_year))]
lat = f.lat
lon = f.lon

# Composite anomaly = composite mean minus climatological mean.
# 92 is the number of days in June-August; multiplying by it turns a mean daily
# rate into a seasonal total, assuming precip is stored as a per-day rate.
pre_nino_ano = pre_nino.mean('time')*92 - pre_clim.mean('time')*92
pre_nina_ano = pre_nina.mean('time')*92 - pre_clim.mean('time')*92
print(pre_nino_ano)


<xarray.DataArray 'precip' (lat: 180, lon: 360)>
array([[       nan,        nan,        nan, ...,        nan,        nan,
               nan],
       [       nan,        nan,        nan, ...,        nan,        nan,
               nan],
       [       nan,        nan,        nan, ...,        nan,        nan,
               nan],
       ...,
       [-1.2791977, -1.2750931, -1.2371998, ..., -1.3008423, -1.3139915,
        -1.2999039],
       [-1.4980145, -1.4776669, -1.4202824, ..., -1.5531521, -1.5455265,
        -1.514164 ],
       [-1.5938625, -1.5592575, -1.5576057, ..., -1.6613979, -1.650177 ,
        -1.6006699]], dtype=float32)
Coordinates:
  * lat      (lat) float32 89.5 88.5 87.5 86.5 85.5 ... -86.5 -87.5 -88.5 -89.5
  * lon      (lon) float32 0.5 1.5 2.5 3.5 4.5 ... 355.5 356.5 357.5 358.5 359.5


In [10]:
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import cartopy.mpl.ticker as cticker

fig = plt.figure(figsize=(12,8))
ax1 = fig.add_axes([0.1, 0.1, 0.8, 0.4],projection = ccrs.PlateCarree(central_longitude=115))
ax1.set_extent([60,150,0,60], crs=ccrs.PlateCarree())
ax1.add_feature(cfeature.COASTLINE.with_scale('50m')) 
ax1.add_feature(cfeature.LAKES, alpha=0.5)
ax1.set_xticks(np.arange(60,150+30,30), crs=ccrs.PlateCarree())
ax1.set_yticks(np.arange(0,60+30,30), crs=ccrs.PlateCarree())
lon_formatter = cticker.LongitudeFormatter()
lat_formatter = cticker.LatitudeFormatter()
ax1.xaxis.set_major_formatter(lon_formatter)
ax1.yaxis.set_major_formatter(lat_formatter)
ax1.set_title('(a) Pre. anomaly in Nino years',loc='left',fontsize=18)
c1 = ax1.contourf(lon,lat, pre_nino_ano,levels =np.arange(-80,90,10) , 
                     extend = 'both', transform=ccrs.PlateCarree(), cmap=plt.cm.BrBG)

ax2 = fig.add_axes([0.6, 0.1, 0.8, 0.4],projection = ccrs.PlateCarree(central_longitude=115))
ax2.set_extent([60,150,0,60], crs=ccrs.PlateCarree())
ax2.add_feature(cfeature.COASTLINE.with_scale('50m')) 
ax2.add_feature(cfeature.LAKES, alpha=0.5)
ax2.set_xticks(np.arange(60,150+30,30), crs=ccrs.PlateCarree())
ax2.set_yticks(np.arange(0,60+30,30), crs=ccrs.PlateCarree())
ax2.xaxis.set_major_formatter(lon_formatter)
ax2.yaxis.set_major_formatter(lat_formatter)
ax2.set_title('(b) Pre. anomaly in Nina years',loc='left',fontsize=18)
c2 = ax2.contourf(lon,lat, pre_nina_ano, zorder=0,levels =np.arange(-80,90,10) , 
                     extend = 'both', transform=ccrs.PlateCarree(), cmap=plt.cm.BrBG)
position=fig.add_axes([0.58, 0.02,  0.35, 0.025])
fig.colorbar(c1,cax=position,orientation='horizontal',format='%d',)                  

<matplotlib.colorbar.Colorbar at 0x7f17084f91d0>

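# Significance testing

t-test

The most commonly used t-tests are:

1. The one-sample location test, which tests whether the population mean equals a value specified in the null hypothesis.

2. The two-sample location test, whose null hypothesis is that the means of two populations are equal. These tests are usually all called Student's t-tests, but strictly speaking that name applies only when the two populations are assumed to have equal variances; the form used when this assumption is dropped is sometimes called Welch's t-test.

3. The paired test, whose null hypothesis is that the difference between two responses measured on the same statistical unit has zero mean.

4. The test of whether the slope of a regression line differs significantly from 0 (regression analysis).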
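Each of the four tests listed above has a counterpart in scipy.stats. The following is a minimal sketch on synthetic samples; the sample sizes, means and spreads are arbitrary choices made for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(loc=0.5, scale=1.0, size=30)          # synthetic sample
y = rng.normal(loc=0.0, scale=2.0, size=40)          # synthetic sample with a larger spread
x2 = x + rng.normal(loc=0.2, scale=0.5, size=30)     # second measurement on the same 30 units

# 1. One-sample test: does the mean of x equal 0?
t1, p1 = stats.ttest_1samp(x, popmean=0.0)

# 2. Two-sample tests: Student's (equal variances assumed) and Welch's (not assumed)
t_student, p_student = stats.ttest_ind(x, y, equal_var=True)
t_welch, p_welch = stats.ttest_ind(x, y, equal_var=False)

# 3. Paired test: does the mean of the differences x - x2 equal 0?
t_paired, p_paired = stats.ttest_rel(x, x2)

# 4. Regression slope: the p-value of linregress tests slope == 0
reg = stats.linregress(np.arange(x.size), x)

print(p1, p_student, p_welch, p_paired, reg.pvalue)
```

The paired test is equivalent to a one-sample test applied to the differences, which is why it needs both samples to have the same length.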

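1. t value

The t-test uses the t distribution to infer how probable an observed difference is, and so to judge whether the difference between two means is significant.

2. p-value

The p-value is the quantity used to decide the outcome of a hypothesis test.

The p-value is the probability, when the null hypothesis is true, of obtaining the observed sample result or a more extreme one. A small p-value means that such a result would be unlikely if the null hypothesis held; since it was observed nonetheless, the small-probability principle gives grounds for rejecting the null hypothesis. The smaller the p-value, the stronger the grounds for rejection.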
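To make the t value and the p-value concrete, the sketch below computes Welch's t statistic and its two-sided p-value by hand and compares them with scipy.stats.ttest_ind. The two samples are synthetic, and the degrees of freedom use the Welch-Satterthwaite approximation.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
a = rng.normal(1.0, 1.0, size=15)     # synthetic sample A
b = rng.normal(0.0, 2.0, size=40)     # synthetic sample B

# Squared standard error of each sample mean (unbiased variance / sample size)
va = a.var(ddof=1) / a.size
vb = b.var(ddof=1) / b.size

# t value: difference of the means divided by its standard error
t = (a.mean() - b.mean()) / np.sqrt(va + vb)

# Welch-Satterthwaite approximation to the degrees of freedom
df = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))

# Two-sided p-value: probability under the null of a |t| at least this large
p = 2 * stats.t.sf(abs(t), df)

t_ref, p_ref = stats.ttest_ind(a, b, equal_var=False)
print(np.isclose(t, t_ref), np.isclose(p, p_ref))
```

The survival function stats.t.sf gives the upper-tail probability, so doubling it counts results that are at least as extreme in either direction.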

In [11]:
from scipy.stats.mstats import ttest_ind
help(ttest_ind)  # t-test for the means of two independent samples

Help on function ttest_ind in module scipy.stats.mstats_basic:

ttest_ind(a, b, axis=0, equal_var=True)
    Calculates the T-test for the means of TWO INDEPENDENT samples of scores.
    
    Parameters
    ----------
    a, b : array_like
        The arrays must have the same shape, except in the dimension
        corresponding to `axis` (the first, by default).
    axis : int or None, optional
        Axis along which to compute test. If None, compute over the whole
        arrays, `a`, and `b`.
    equal_var : bool, optional
        If True, perform a standard independent 2 sample test that assumes equal
        population variances.
        If False, perform Welch's t-test, which does not assume equal population
        variance.
    
        .. versionadded:: 0.17.0
    
    Returns
    -------
    statistic : float or array
        The calculated t-statistic.
    pvalue : float or array
        The two-tailed p-value.
    
    Notes
    -----
    For more details on `ttest_ind`, s

nan_policy='propagate'
          
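nan_policy sets how input containing NaN is handled: 'propagate' returns nan, 'raise' raises an error, and 'omit' ignores the NaN values. It is a parameter of scipy.stats.ttest_ind; the scipy.stats.mstats signature printed above does not include it.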
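The three nan_policy settings can be compared on a small made-up example. This sketch uses scipy.stats.ttest_ind, not the mstats function imported above.

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, np.nan])   # made-up sample with one missing value
y = np.array([2.0, 3.0, 4.0, 5.0, 6.0])      # made-up sample without missing values

# 'propagate' (the default): a NaN in the input gives a NaN result
p_prop = stats.ttest_ind(x, y, nan_policy='propagate').pvalue

# 'omit': NaN values are ignored, as if they had been removed beforehand
p_omit = stats.ttest_ind(x, y, nan_policy='omit').pvalue
p_clean = stats.ttest_ind(x[~np.isnan(x)], y).pvalue

# 'raise': a NaN in the input raises a ValueError
try:
    stats.ttest_ind(x, y, nan_policy='raise')
    raised = False
except ValueError:
    raised = True

print(p_prop, p_omit, p_clean, raised)
```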

In [19]:
# Welch's t-test (equal_var=False) along the time axis at every grid point;
# only the p-values are kept, each with shape (lat, lon).
_,p_nino = ttest_ind(pre_nino,pre_clim,equal_var=False)
_,p_nina = ttest_ind(pre_nina,pre_clim,equal_var=False)

fig = plt.figure(figsize=(12,8))
ax1 = fig.add_axes([0.1, 0.1, 0.8, 0.4],projection = ccrs.PlateCarree(central_longitude=115))
ax1.set_extent([60,150,0,60], crs=ccrs.PlateCarree())
ax1.add_feature(cfeature.COASTLINE.with_scale('50m')) 
ax1.add_feature(cfeature.LAKES, alpha=0.5)
ax1.set_xticks(np.arange(60,150+30,30), crs=ccrs.PlateCarree())
ax1.set_yticks(np.arange(0,60+30,30), crs=ccrs.PlateCarree())
lon_formatter = cticker.LongitudeFormatter()
lat_formatter = cticker.LatitudeFormatter()
ax1.xaxis.set_major_formatter(lon_formatter)
ax1.yaxis.set_major_formatter(lat_formatter)
ax1.set_title('(a) Pre. anomaly in Nino years',loc='left',fontsize=18)
c1 = ax1.contourf(lon,lat, pre_nino_ano,levels =np.arange(-80,90,10), zorder=0,
                     extend = 'both', transform=ccrs.PlateCarree(), cmap=plt.cm.BrBG)
# Stipple grid points where p < 0.05; leave the rest unhatched
c1p = ax1.contourf(lon,lat, p_nino, levels =[0,0.05,1],hatches=['...', None],zorder=1,colors="none", transform=ccrs.PlateCarree())
 
ax2 = fig.add_axes([0.6, 0.1, 0.8, 0.4],projection = ccrs.PlateCarree(central_longitude=115))
ax2.set_extent([60,150,0,60], crs=ccrs.PlateCarree())
ax2.add_feature(cfeature.COASTLINE.with_scale('50m')) 
ax2.add_feature(cfeature.LAKES, alpha=0.5)
ax2.set_xticks(np.arange(60,150+30,30), crs=ccrs.PlateCarree())
ax2.set_yticks(np.arange(0,60+30,30), crs=ccrs.PlateCarree())
ax2.xaxis.set_major_formatter(lon_formatter)
ax2.yaxis.set_major_formatter(lat_formatter)
ax2.set_title('(b) Pre. anomaly in Nina years',loc='left',fontsize=18)
c2 = ax2.contourf(lon,lat, pre_nina_ano, levels =np.arange(-80,90,10),zorder=0,
                     extend = 'both', transform=ccrs.PlateCarree(), cmap=plt.cm.BrBG)
# Stipple grid points where p < 0.05; leave the rest unhatched
c2p = ax2.contourf(lon,lat, p_nina, levels =[0,0.05,1],hatches=['...', None], zorder=1,colors="none", transform=ccrs.PlateCarree())
 
position=fig.add_axes([0.58, 0.02,  0.35, 0.025])
fig.colorbar(c1,cax=position,orientation='horizontal',format='%d',)      

<matplotlib.colorbar.Colorbar at 0x7f1708845e90>
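The grid-point test in the cell above works because ttest_ind runs along axis 0, the sample axis, and returns one p-value per remaining position. The sketch below shows this on synthetic arrays, using scipy.stats.ttest_ind in place of the mstats version; the array sizes and the signal added at one grid point are assumptions made for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
nlat, nlon = 4, 5
clim = rng.normal(0.0, 1.0, size=(120, nlat, nlon))   # synthetic reference samples
event = rng.normal(0.0, 1.0, size=(30, nlat, nlon))   # synthetic event samples
event[:, 0, 0] += 5.0                                  # impose a strong signal at one grid point

# Welch's t-test along the sample axis; t and p both have shape (nlat, nlon)
t, p = stats.ttest_ind(event, clim, axis=0, equal_var=False)

sig = p < 0.05          # boolean map of grid points significant at the 5% level
print(p.shape, sig[0, 0])
```

The two inputs may differ in length along axis 0 but must agree in every other dimension, which is the shape requirement stated in the help text.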

http://cmdp.ncc-cma.net/pred/cn_enso.php?product=cn_enso_lanina