# CMIP5: Processing Precip Amounts, 1981-2010

This notebook performs preliminary processing of precipitation amounts, primarily to break up the very large wet days archive. The archive is split by month, and two different DataFrames are saved for each month:

1. Wet Day Counts
2. Precipitation depths

Then we can proceed to distribution fitting.

Go through one month at a time, create the monthly DataFrame for analysis, and output it for distribution fitting in R.

After outputting, analyze the various DataFrames graphically and compute some summary statistics, to save doing this in R.

From Wilks & Wilby (1999):
*"Most stochastic weather generators make the assumption that precipitation amounts
on wet days are independent, and follow the same distribution. Allowing different
probability distributions for precipitation amounts depending on that day’s position in
a wet spell (e.g., the mean rainfall on a wet day following a wet day might be greater
than on a wet day following a dry day) has been considered by Katz (1977), Buishand
(1977; 1978), Chin and Miller (1980) and Wilks (1999a), but allowing this extra
complexity often makes little difference to the result. Similarly, the autocorrelation
between successive nonzero precipitation amounts in daily series is sometimes (statistically) significantly different from zero, but is typically quite small and usually of little
practical importance (Katz, 1977; Buishand, 1977; 1978; Foufoula-Georgiou and
Lettenmaier, 1987). In contrast, accounting for serial correlation of nonzero precipitation
amounts is essential if the precipitation model has an hourly (or smaller) rather than a
daily time step (Katz and Parlange, 1995)."*

In [1]:
from IPython.display import display, HTML
import os
import numpy as np
import pandas as pd
import datetime as dt
import geopandas as gpd
from copy import deepcopy
import re

In [2]:
OUT_DIR = r'\\augustine.space.swri.edu\jdrive\Groundwater\R8937_Stochastic_CC_Recharge\Da' \
          r'ta\JNotes\Processed\CMIP5\CMIP5_1981_WetDays'

A DataFrame of all of the wet days is saved as a pickle file. Use this as the base for the calculations in this notebook.

In [3]:
IN_PICKLE = r'\\augustine.space.swri.edu\jdrive\Groundwater\R8937_Stochastic_CC_Recharge\Da' \
            r'ta\JNotes\Processed\CMIP5\WetDays_1981-2010.pickle'

In [4]:
WetDF = pd.read_pickle( IN_PICKLE )

In [22]:
display( HTML( WetDF.head().to_html() ) )

Unnamed: 0,MGrid_Id,Year,Month,Day,Wet_Count,Total_Depth,Day_1,Day_2,Day_3,Day_4,Day_5,Day_6,Day_7,Day_8,Day_9,Day_10,Day_11,Day_12,Day_13,Day_14,Day_15,Day_16,Day_17,Day_18,Day_19,Day_20,Day_21,Day_22,Day_23,Day_24,Day_25,Day_26,Day_27,Day_28,Day_29,Day_30,Day_31,Day_32,Day_33,Day_34,Day_35,Day_36,Day_37,Day_38,Day_39,Day_40,Day_41,Day_42,Day_43,Day_44,Day_45,Day_46,Day_47,Day_48,Day_49,Day_50,Day_51,Day_52,Day_53,Day_54,Day_55,Day_56,Day_57,Day_58,Day_59,Day_60,Day_61,Day_62,Day_63,Day_64,Day_65,Day_66,Day_67,Day_68,Day_69
0,M100_169,1981,1,2,2,11.38708,10.222975,1.164105,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,M100_169,1981,1,7,6,15.535727,0.964378,0.696315,0.358811,10.081599,1.161914,2.272709,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,M100_169,1981,1,14,4,6.075534,0.607677,4.903019,0.276916,0.287922,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,M100_169,1981,1,24,2,3.113439,0.301083,2.812356,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,M100_169,1981,1,27,1,0.894761,0.894761,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [12]:
ExModNum = lambda GId: int( (re.match( r'M(.*[0-9])_(.*[0-9])', GId )).group(1) )
ExGridNum = lambda GId: int( (re.match( r'M(.*[0-9])_(.*[0-9])', GId )).group(2) )
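
The two lambdas above parse one ID string at a time. As a possible alternative, the sketch below does the same split for a whole column at once with `Series.str.extract`. It assumes every `MGrid_Id` has the form `M<model>_<grid>` with integer parts; an ID that does not match would produce a missing value and make the integer cast fail. The helper name `split_mgrid_id` is hypothetical and is not used elsewhere in this notebook.

```python
import pandas as pd

def split_mgrid_id(ids: pd.Series) -> pd.DataFrame:
    """Split IDs such as 'M100_169' into integer Model_Id and Grid_Id columns."""
    # Named groups become the column names of the returned DataFrame
    parts = ids.str.extract(r'^M(?P<Model_Id>\d+)_(?P<Grid_Id>\d+)$')
    # Assumes every ID matched; a non-matching ID gives NaN and the cast raises
    return parts.astype('int64')

demo_ids = pd.Series(['M100_169', 'M1_210'])
print(split_mgrid_id(demo_ids))
```

The result could be joined back onto a monthly DataFrame in place of the row-wise `apply` calls used in the "Counts" cells.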

## Monthly Data

Now split this DataFrame by month, keeping the grid cell ID and only the other columns that are needed.

In [13]:
AllCols = list( WetDF.columns )
AllCols[:6]

['MGrid_Id', 'Year', 'Month', 'Day', 'Wet_Count', 'Total_Depth']

In [14]:
RootCols = deepcopy( AllCols[:6] )
DayCols = deepcopy( AllCols[6:] )

In [15]:
MaxDays = WetDF['Wet_Count'].max()
MaxDays

69

In [16]:
StartInd = 3

Go through each month and split out the wet days by month and output these for further distribution fitting in R.
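
Each month writes files that follow the same naming pattern, so the output paths could be generated rather than typed out. The sketch below is one way to do that; the helper `month_outputs` and the list `MONTH_ABBR` are hypothetical names, and the only thing taken from this notebook is the `<Mon>_<WetCnt|WetDep>_CMIP5_1981-2010` file-name pattern.

```python
import os

# Month abbreviations as used in the output file names of this notebook
MONTH_ABBR = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
              'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

def month_outputs(out_dir, month, kind):
    """Return (pickle_path, feather_path) for a month number (1-12).

    kind is 'WetCnt' for wet day counts or 'WetDep' for precipitation depths.
    """
    stem = "%s_%s_CMIP5_1981-2010" % (MONTH_ABBR[month - 1], kind)
    pickle_path = os.path.normpath(os.path.join(out_dir, stem + ".pickle"))
    feather_path = os.path.normpath(os.path.join(out_dir, stem + ".feather"))
    return pickle_path, feather_path

# 'some_output_dir' is a placeholder, not the project directory
print(month_outputs('some_output_dir', 2, 'WetDep'))
```

With a helper like this, the per-month cells could in principle be driven by a single `for month in range(1, 13)` loop.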

### Jan

Counts

In [23]:
JanDF = WetDF[RootCols].loc[WetDF['Month'] == 1].copy()

In [24]:
display( HTML( JanDF.head().to_html() ) )

Unnamed: 0,MGrid_Id,Year,Month,Day,Wet_Count,Total_Depth
0,M100_169,1981,1,2,2,11.38708
1,M100_169,1981,1,7,6,15.535727
2,M100_169,1981,1,14,4,6.075534
3,M100_169,1981,1,24,2,3.113439
4,M100_169,1981,1,27,1,0.894761


In [25]:
JanDF['Grid_Id'] = JanDF.apply( lambda row: ExGridNum(row['MGrid_Id']), axis=1 )
JanDF['Model_Id'] = JanDF.apply( lambda row: ExModNum(row['MGrid_Id']), axis=1 )

In [26]:
display( HTML( JanDF.head().to_html() ) )

Unnamed: 0,MGrid_Id,Year,Month,Day,Wet_Count,Total_Depth,Grid_Id,Model_Id
0,M100_169,1981,1,2,2,11.38708,169,100
1,M100_169,1981,1,7,6,15.535727,169,100
2,M100_169,1981,1,14,4,6.075534,169,100
3,M100_169,1981,1,24,2,3.113439,169,100
4,M100_169,1981,1,27,1,0.894761,169,100


In [28]:
JanPCKF = os.path.normpath( os.path.join( OUT_DIR, "Jan_WetCnt_CMIP5_1981-2010.pickle" ) )
JanDF.to_pickle( JanPCKF )

In [30]:
JanDF = JanDF.reset_index()
JanFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Jan_WetCnt_CMIP5_1981-2010.feather" ) )
JanDF.to_feather( JanFeatherF )

Depths

In [31]:
JanDF = WetDF[WetDF['Month'] == 1].copy()

In [32]:
display( HTML( JanDF.head().to_html() ) )

Unnamed: 0,MGrid_Id,Year,Month,Day,Wet_Count,Total_Depth,Day_1,Day_2,Day_3,Day_4,Day_5,Day_6,Day_7,Day_8,Day_9,Day_10,Day_11,Day_12,Day_13,Day_14,Day_15,Day_16,Day_17,Day_18,Day_19,Day_20,Day_21,Day_22,Day_23,Day_24,Day_25,Day_26,Day_27,Day_28,Day_29,Day_30,Day_31,Day_32,Day_33,Day_34,Day_35,Day_36,Day_37,Day_38,Day_39,Day_40,Day_41,Day_42,Day_43,Day_44,Day_45,Day_46,Day_47,Day_48,Day_49,Day_50,Day_51,Day_52,Day_53,Day_54,Day_55,Day_56,Day_57,Day_58,Day_59,Day_60,Day_61,Day_62,Day_63,Day_64,Day_65,Day_66,Day_67,Day_68,Day_69
0,M100_169,1981,1,2,2,11.38708,10.222975,1.164105,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,M100_169,1981,1,7,6,15.535727,0.964378,0.696315,0.358811,10.081599,1.161914,2.272709,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,M100_169,1981,1,14,4,6.075534,0.607677,4.903019,0.276916,0.287922,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,M100_169,1981,1,24,2,3.113439,0.301083,2.812356,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,M100_169,1981,1,27,1,0.894761,0.894761,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [33]:
JanMaxCnt = JanDF['Wet_Count'].max()
JanMaxCnt

16

In [34]:
CurCols = [RootCols[0], RootCols[4], RootCols[5] ]
CurCols

['MGrid_Id', 'Wet_Count', 'Total_Depth']

In [35]:
for jJ in range(JanMaxCnt):
    CurCols.append( DayCols[jJ] )
# end for

In [36]:
JanDF = JanDF[CurCols].copy()

In [37]:
JanDF.reset_index(drop=True, inplace=True)

In [38]:
display( HTML( JanDF.iloc[:10].to_html() ) )

Unnamed: 0,MGrid_Id,Wet_Count,Total_Depth,Day_1,Day_2,Day_3,Day_4,Day_5,Day_6,Day_7,Day_8,Day_9,Day_10,Day_11,Day_12,Day_13,Day_14,Day_15,Day_16
0,M100_169,2,11.38708,10.222975,1.164105,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,M100_169,6,15.535727,0.964378,0.696315,0.358811,10.081599,1.161914,2.272709,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,M100_169,4,6.075534,0.607677,4.903019,0.276916,0.287922,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,M100_169,2,3.113439,0.301083,2.812356,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,M100_169,1,0.894761,0.894761,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,M100_169,3,0.899845,0.315332,0.231725,0.352788,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,M100_169,2,1.333776,0.584615,0.749161,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,M100_169,3,17.444168,11.235129,5.968732,0.240306,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,M100_169,3,6.320639,0.850599,5.2001,0.26994,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,M100_169,1,0.405912,0.405912,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [39]:
JanTotDays = JanDF['Wet_Count'].sum()
JanTotDays

3291541

Set up the arrays that will be used to store all of the wet day precipitation depths.

In [43]:
GridIDs = np.zeros( JanTotDays, dtype=np.int32 )
ModelIDs = np.zeros( JanTotDays, dtype=np.int32 )
PDDepth = np.zeros( JanTotDays, dtype=np.float32 )
MGridIDs = [ "%d" % x for x in GridIDs.tolist() ]

In [44]:
NumJanWet = len( JanDF )
NumJanWet

1718160

In [45]:
iCnt = 0
for iI in range(NumJanWet):
    cRow = JanDF.loc[iI]
    cMGridId = cRow.at["MGrid_Id"]
    cNumWet = int( cRow.at["Wet_Count"] )
    cGridId = ExGridNum( cMGridId )
    cModelId = ExModNum( cMGridId )
    for jJ in range(cNumWet):
        cPDep = float( cRow.at[CurCols[StartInd+jJ]] )
        MGridIDs[iCnt] = cMGridId
        GridIDs[iCnt] = cGridId
        ModelIDs[iCnt] = cModelId
        PDDepth[iCnt] = cPDep
        iCnt += 1
    # end of inner for
# end of outer for
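
The nested loops above visit every wet day one at a time, which is slow for millions of rows. A vectorized version of the same flattening is sketched below, under the assumption that each row's depths occupy the first `Wet_Count` day columns, as they do in the tables shown earlier. The function name `flatten_wet_days` is hypothetical. A boolean mask selects the first `Wet_Count` entries of each row, and NumPy returns the selected values in row-major order, which matches the order the loops produce.

```python
import numpy as np
import pandas as pd

def flatten_wet_days(df: pd.DataFrame, day_cols: list) -> pd.DataFrame:
    """Return one row per wet day, in the same order as the nested loops."""
    depths = df[day_cols].to_numpy(dtype=np.float32)
    counts = df['Wet_Count'].to_numpy(dtype=np.int64)
    # True where the column position falls inside the row's wet spell
    mask = np.arange(depths.shape[1]) < counts[:, None]
    return pd.DataFrame({
        # Repeat each ID once per wet day in its spell
        'MGrid_Id': np.repeat(df['MGrid_Id'].to_numpy(), counts),
        'Precip_mm': depths[mask],
    })

# Small made-up frame with the same column layout as the monthly DataFrames
demo = pd.DataFrame({
    'MGrid_Id': ['M100_169', 'M100_170'],
    'Wet_Count': [2, 1],
    'Day_1': [10.0, 0.5],
    'Day_2': [1.0, 0.0],
})
print(flatten_wet_days(demo, ['Day_1', 'Day_2']))
```

The `Grid_Id` and `Model_Id` columns could then be derived from `MGrid_Id` afterwards rather than inside the loop.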

In [46]:
GridIDs.min(), GridIDs.max()

(1, 210)

In [47]:
PDDepth.min(), PDDepth.max()

(0.2000007, 68.76165)

In [48]:
DataDict = { "MGrid_Id" : MGridIDs,
             "Grid_Id" : GridIDs,
             "Model_Id" : ModelIDs,
             "Precip_mm" : PDDepth, }
JanDayDF = pd.DataFrame( data=DataDict )

In [49]:
display( HTML( JanDayDF.describe().to_html() ))

Unnamed: 0,Grid_Id,Model_Id,Precip_mm
count,3291541.0,3291541.0,3291541.0
mean,130.3423,74.34555,2.648782
std,63.68265,56.27557,4.029254
min,1.0,1.0,0.2000007
25%,77.0,28.0,0.4502591
50%,149.0,56.0,1.090413
75%,186.0,119.0,3.025742
max,210.0,196.0,68.76165


Now save the January DataFrame as both a pickle and a feather file so that it can be reused from both Python and R.

In [50]:
JanPCKF = os.path.normpath( os.path.join( OUT_DIR, "Jan_WetDep_CMIP5_1981-2010.pickle" ) )
JanDayDF.to_pickle( JanPCKF )
JanFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Jan_WetDep_CMIP5_1981-2010.feather" ) )
JanDayDF.to_feather( JanFeatherF )

### Feb

Counts

In [51]:
FebDF = WetDF[RootCols].loc[WetDF['Month'] == 2].copy()

In [52]:
FebDF['Grid_Id'] = FebDF.apply( lambda row: ExGridNum(row['MGrid_Id']), axis=1 )
FebDF['Model_Id'] = FebDF.apply( lambda row: ExModNum(row['MGrid_Id']), axis=1 )

In [53]:
FebPCKF = os.path.normpath( os.path.join( OUT_DIR, "Feb_WetCnt_CMIP5_1981-2010.pickle" ) )
FebDF.to_pickle( FebPCKF )

In [54]:
FebDF = FebDF.reset_index()
FebFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Feb_WetCnt_CMIP5_1981-2010.feather" ) )
FebDF.to_feather( FebFeatherF )

Depths

In [55]:
FebDF = WetDF[WetDF['Month'] == 2].copy()

In [56]:
FebMaxCnt = FebDF['Wet_Count'].max()
FebMaxCnt

20

In [57]:
CurCols = [RootCols[0], RootCols[4], RootCols[5] ]

In [58]:
for jJ in range(FebMaxCnt):
    CurCols.append( DayCols[jJ] )
# end for

In [59]:
FebDF = FebDF[CurCols].copy()

In [60]:
FebDF.reset_index(drop=True, inplace=True)

In [61]:
FebTotDays = FebDF['Wet_Count'].sum()
FebTotDays

3435683

Set up the arrays that will be used to store all of the wet day precipitation depths.

In [62]:
GridIDs = np.zeros( FebTotDays, dtype=np.int32 )
ModelIDs = np.zeros( FebTotDays, dtype=np.int32 )
PDDepth = np.zeros( FebTotDays, dtype=np.float32 )
MGridIDs = [ "%d" % x for x in GridIDs.tolist() ]

In [63]:
NumFebWet = len( FebDF )
NumFebWet

1712378

In [64]:
iCnt = 0
for iI in range(NumFebWet):
    cRow = FebDF.loc[iI]
    cMGridId = cRow.at["MGrid_Id"]
    cNumWet = int( cRow.at["Wet_Count"] )
    cGridId = ExGridNum( cMGridId )
    cModelId = ExModNum( cMGridId )
    for jJ in range(cNumWet):
        cPDep = float( cRow.at[CurCols[StartInd+jJ]] )
        MGridIDs[iCnt] = cMGridId
        GridIDs[iCnt] = cGridId
        ModelIDs[iCnt] = cModelId
        PDDepth[iCnt] = cPDep
        iCnt += 1
    # end of inner for
# end of outer for

In [65]:
DataDict = { "MGrid_Id" : MGridIDs,
             "Grid_Id" : GridIDs,
             "Model_Id" : ModelIDs,
             "Precip_mm" : PDDepth, }
FebDayDF = pd.DataFrame( data=DataDict )

Now save the February DataFrame as both a pickle and a feather file so that it can be reused from both Python and R.

In [66]:
FebPCKF = os.path.normpath( os.path.join( OUT_DIR, "Feb_WetDep_CMIP5_1981-2010.pickle" ) )
FebDayDF.to_pickle( FebPCKF )
FebFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Feb_WetDep_CMIP5_1981-2010.feather" ) )
FebDayDF.to_feather( FebFeatherF )

### Mar

Counts

In [67]:
MarDF = WetDF[RootCols].loc[WetDF['Month'] == 3].copy()

In [68]:
MarDF['Grid_Id'] = MarDF.apply( lambda row: ExGridNum(row['MGrid_Id']), axis=1 )
MarDF['Model_Id'] = MarDF.apply( lambda row: ExModNum(row['MGrid_Id']), axis=1 )

In [69]:
MarPCKF = os.path.normpath( os.path.join( OUT_DIR, "Mar_WetCnt_CMIP5_1981-2010.pickle" ) )
MarDF.to_pickle( MarPCKF )

In [70]:
MarDF = MarDF.reset_index()
MarFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Mar_WetCnt_CMIP5_1981-2010.feather" ) )
MarDF.to_feather( MarFeatherF )

Depths

In [71]:
MarDF = WetDF[WetDF['Month'] == 3].copy()

In [72]:
MarMaxCnt = MarDF['Wet_Count'].max()
MarMaxCnt

20

In [73]:
CurCols = [RootCols[0], RootCols[4], RootCols[5] ]

In [74]:
for jJ in range(MarMaxCnt):
    CurCols.append( DayCols[jJ] )
# end for

In [75]:
MarDF = MarDF[CurCols].copy()

In [76]:
MarDF.reset_index(drop=True, inplace=True)

In [77]:
MarTotDays = MarDF['Wet_Count'].sum()
MarTotDays

3659496

Set up the arrays that will be used to store all of the wet day precipitation depths.

In [78]:
GridIDs = np.zeros( MarTotDays, dtype=np.int32 )
ModelIDs = np.zeros( MarTotDays, dtype=np.int32 )
PDDepth = np.zeros( MarTotDays, dtype=np.float32 )
MGridIDs = [ "%d" % x for x in GridIDs.tolist() ]

In [79]:
NumMarWet = len( MarDF )
NumMarWet

1877248

In [80]:
iCnt = 0
for iI in range(NumMarWet):
    cRow = MarDF.loc[iI]
    cMGridId = cRow.at["MGrid_Id"]
    cNumWet = int( cRow.at["Wet_Count"] )
    cGridId = ExGridNum( cMGridId )
    cModelId = ExModNum( cMGridId )
    for jJ in range(cNumWet):
        cPDep = float( cRow.at[CurCols[StartInd+jJ]] )
        MGridIDs[iCnt] = cMGridId
        GridIDs[iCnt] = cGridId
        ModelIDs[iCnt] = cModelId
        PDDepth[iCnt] = cPDep
        iCnt += 1
    # end of inner for
# end of outer for

In [81]:
DataDict = { "MGrid_Id" : MGridIDs,
             "Grid_Id" : GridIDs,
             "Model_Id" : ModelIDs,
             "Precip_mm" : PDDepth, }
MarDayDF = pd.DataFrame( data=DataDict )

Now save the March DataFrame as both a pickle and a feather file so that it can be reused from both Python and R.

In [82]:
MarPCKF = os.path.normpath( os.path.join( OUT_DIR, "Mar_WetDep_CMIP5_1981-2010.pickle" ) )
MarDayDF.to_pickle( MarPCKF )
MarFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Mar_WetDep_CMIP5_1981-2010.feather" ) )
MarDayDF.to_feather( MarFeatherF )

### Apr

Counts

In [83]:
AprDF = WetDF[RootCols].loc[WetDF['Month'] == 4].copy()

In [84]:
AprDF['Grid_Id'] = AprDF.apply( lambda row: ExGridNum(row['MGrid_Id']), axis=1 )
AprDF['Model_Id'] = AprDF.apply( lambda row: ExModNum(row['MGrid_Id']), axis=1 )

In [85]:
AprPCKF = os.path.normpath( os.path.join( OUT_DIR, "Apr_WetCnt_CMIP5_1981-2010.pickle" ) )
AprDF.to_pickle( AprPCKF )

In [86]:
AprDF = AprDF.reset_index()
AprFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Apr_WetCnt_CMIP5_1981-2010.feather" ) )
AprDF.to_feather( AprFeatherF )

Depths

In [87]:
AprDF = WetDF[WetDF['Month'] == 4].copy()

In [88]:
AprMaxCnt = AprDF['Wet_Count'].max()
AprMaxCnt

37

In [89]:
CurCols = [RootCols[0], RootCols[4], RootCols[5] ]

In [90]:
for jJ in range(AprMaxCnt):
    CurCols.append( DayCols[jJ] )
# end for

In [91]:
AprDF = AprDF[CurCols].copy()

In [92]:
AprDF.reset_index(drop=True, inplace=True)

In [93]:
AprTotDays = AprDF['Wet_Count'].sum()
AprTotDays

4924715

Set up the arrays that will be used to store all of the wet day precipitation depths.

In [94]:
GridIDs = np.zeros( AprTotDays, dtype=np.int32 )
ModelIDs = np.zeros( AprTotDays, dtype=np.int32 )
PDDepth = np.zeros( AprTotDays, dtype=np.float32 )
MGridIDs = [ "%d" % x for x in GridIDs.tolist() ]

In [95]:
NumAprWet = len( AprDF )
NumAprWet

2081296

In [96]:
iCnt = 0
for iI in range(NumAprWet):
    cRow = AprDF.loc[iI]
    cMGridId = cRow.at["MGrid_Id"]
    cNumWet = int( cRow.at["Wet_Count"] )
    cGridId = ExGridNum( cMGridId )
    cModelId = ExModNum( cMGridId )
    for jJ in range(cNumWet):
        cPDep = float( cRow.at[CurCols[StartInd+jJ]] )
        MGridIDs[iCnt] = cMGridId
        GridIDs[iCnt] = cGridId
        ModelIDs[iCnt] = cModelId
        PDDepth[iCnt] = cPDep
        iCnt += 1
    # end of inner for
# end of outer for

In [97]:
DataDict = { "MGrid_Id" : MGridIDs,
             "Grid_Id" : GridIDs,
             "Model_Id" : ModelIDs,
             "Precip_mm" : PDDepth, }
AprDayDF = pd.DataFrame( data=DataDict )

Now save the April DataFrame as both a pickle and a feather file so that it can be reused from both Python and R.

In [98]:
AprPCKF = os.path.normpath( os.path.join( OUT_DIR, "Apr_WetDep_CMIP5_1981-2010.pickle" ) )
AprDayDF.to_pickle( AprPCKF )
AprFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Apr_WetDep_CMIP5_1981-2010.feather" ) )
AprDayDF.to_feather( AprFeatherF )

### May

Counts

In [99]:
MayDF = WetDF[RootCols].loc[WetDF['Month'] == 5].copy()

In [100]:
MayDF['Grid_Id'] = MayDF.apply( lambda row: ExGridNum(row['MGrid_Id']), axis=1 )
MayDF['Model_Id'] = MayDF.apply( lambda row: ExModNum(row['MGrid_Id']), axis=1 )

In [101]:
MayPCKF = os.path.normpath( os.path.join( OUT_DIR, "May_WetCnt_CMIP5_1981-2010.pickle" ) )
MayDF.to_pickle( MayPCKF )

In [102]:
MayDF = MayDF.reset_index()
MayFeatherF = os.path.normpath( os.path.join( OUT_DIR, "May_WetCnt_CMIP5_1981-2010.feather" ) )
MayDF.to_feather( MayFeatherF )

Depths

In [103]:
MayDF = WetDF[WetDF['Month'] == 5].copy()

In [104]:
MayMaxCnt = MayDF['Wet_Count'].max()
MayMaxCnt

66

In [105]:
CurCols = [RootCols[0], RootCols[4], RootCols[5] ]

In [106]:
for jJ in range(MayMaxCnt):
    CurCols.append( DayCols[jJ] )
# end for

In [107]:
MayDF = MayDF[CurCols].copy()

In [108]:
MayDF.reset_index(drop=True, inplace=True)

In [109]:
MayTotDays = MayDF['Wet_Count'].sum()
MayTotDays

6745353

Set up the arrays that will be used to store all of the wet day precipitation depths.

In [110]:
GridIDs = np.zeros( MayTotDays, dtype=np.int32 )
ModelIDs = np.zeros( MayTotDays, dtype=np.int32 )
PDDepth = np.zeros( MayTotDays, dtype=np.float32 )
MGridIDs = [ "%d" % x for x in GridIDs.tolist() ]

In [111]:
NumMayWet = len( MayDF )
NumMayWet

2343397

In [112]:
iCnt = 0
for iI in range(NumMayWet):
    cRow = MayDF.loc[iI]
    cMGridId = cRow.at["MGrid_Id"]
    cNumWet = int( cRow.at["Wet_Count"] )
    cGridId = ExGridNum( cMGridId )
    cModelId = ExModNum( cMGridId )
    for jJ in range(cNumWet):
        cPDep = float( cRow.at[CurCols[StartInd+jJ]] )
        MGridIDs[iCnt] = cMGridId
        GridIDs[iCnt] = cGridId
        ModelIDs[iCnt] = cModelId
        PDDepth[iCnt] = cPDep
        iCnt += 1
    # end of inner for
# end of outer for

In [113]:
DataDict = { "MGrid_Id" : MGridIDs,
             "Grid_Id" : GridIDs,
             "Model_Id" : ModelIDs,
             "Precip_mm" : PDDepth, }
MayDayDF = pd.DataFrame( data=DataDict )

Now save the May DataFrame as both a pickle and a feather file so that it can be reused from both Python and R.

In [114]:
MayPCKF = os.path.normpath( os.path.join( OUT_DIR, "May_WetDep_CMIP5_1981-2010.pickle" ) )
MayDayDF.to_pickle( MayPCKF )
MayFeatherF = os.path.normpath( os.path.join( OUT_DIR, "May_WetDep_CMIP5_1981-2010.feather" ) )
MayDayDF.to_feather( MayFeatherF )

### Jun

Counts

In [115]:
JunDF = WetDF[RootCols].loc[WetDF['Month'] == 6].copy()

In [116]:
JunDF['Grid_Id'] = JunDF.apply( lambda row: ExGridNum(row['MGrid_Id']), axis=1 )
JunDF['Model_Id'] = JunDF.apply( lambda row: ExModNum(row['MGrid_Id']), axis=1 )

In [117]:
JunPCKF = os.path.normpath( os.path.join( OUT_DIR, "Jun_WetCnt_CMIP5_1981-2010.pickle" ) )
JunDF.to_pickle( JunPCKF )

In [118]:
JunDF = JunDF.reset_index()
JunFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Jun_WetCnt_CMIP5_1981-2010.feather" ) )
JunDF.to_feather( JunFeatherF )

Depths

In [119]:
JunDF = WetDF[WetDF['Month'] == 6].copy()

In [120]:
JunMaxCnt = JunDF['Wet_Count'].max()
JunMaxCnt

69

In [121]:
CurCols = [RootCols[0], RootCols[4], RootCols[5] ]

In [122]:
for jJ in range(JunMaxCnt):
    CurCols.append( DayCols[jJ] )
# end for

In [123]:
JunDF = JunDF[CurCols].copy()

In [124]:
JunDF.reset_index(drop=True, inplace=True)

In [125]:
JunTotDays = JunDF['Wet_Count'].sum()
JunTotDays

5975417

Set up the arrays that will be used to store all of the wet day precipitation depths.

In [126]:
GridIDs = np.zeros( JunTotDays, dtype=np.int32 )
ModelIDs = np.zeros( JunTotDays, dtype=np.int32 )
PDDepth = np.zeros( JunTotDays, dtype=np.float32 )
MGridIDs = [ "%d" % x for x in GridIDs.tolist() ]

In [127]:
NumJunWet = len( JunDF )
NumJunWet

2061361

In [128]:
iCnt = 0
for iI in range(NumJunWet):
    cRow = JunDF.loc[iI]
    cMGridId = cRow.at["MGrid_Id"]
    cNumWet = int( cRow.at["Wet_Count"] )
    cGridId = ExGridNum( cMGridId )
    cModelId = ExModNum( cMGridId )
    for jJ in range(cNumWet):
        cPDep = float( cRow.at[CurCols[StartInd+jJ]] )
        MGridIDs[iCnt] = cMGridId
        GridIDs[iCnt] = cGridId
        ModelIDs[iCnt] = cModelId
        PDDepth[iCnt] = cPDep
        iCnt += 1
    # end of inner for
# end of outer for

In [129]:
DataDict = { "MGrid_Id" : MGridIDs,
             "Grid_Id" : GridIDs,
             "Model_Id" : ModelIDs,
             "Precip_mm" : PDDepth, }
JunDayDF = pd.DataFrame( data=DataDict )

Now save the June DataFrame as both a pickle and a feather file so that it can be reused from both Python and R.

In [130]:
JunPCKF = os.path.normpath( os.path.join( OUT_DIR, "Jun_WetDep_CMIP5_1981-2010.pickle" ) )
JunDayDF.to_pickle( JunPCKF )
JunFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Jun_WetDep_CMIP5_1981-2010.feather" ) )
JunDayDF.to_feather( JunFeatherF )

### Jul

Counts

In [131]:
JulDF = WetDF[RootCols].loc[WetDF['Month'] == 7].copy()

In [132]:
JulDF['Grid_Id'] = JulDF.apply( lambda row: ExGridNum(row['MGrid_Id']), axis=1 )
JulDF['Model_Id'] = JulDF.apply( lambda row: ExModNum(row['MGrid_Id']), axis=1 )

In [133]:
JulPCKF = os.path.normpath( os.path.join( OUT_DIR, "Jul_WetCnt_CMIP5_1981-2010.pickle" ) )
JulDF.to_pickle( JulPCKF )

In [134]:
JulDF = JulDF.reset_index()
JulFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Jul_WetCnt_CMIP5_1981-2010.feather" ) )
JulDF.to_feather( JulFeatherF )

Depths

In [135]:
JulDF = WetDF[WetDF['Month'] == 7].copy()

In [136]:
JulMaxCnt = JulDF['Wet_Count'].max()
JulMaxCnt

54

In [137]:
CurCols = [RootCols[0], RootCols[4], RootCols[5] ]

In [138]:
for jJ in range(JulMaxCnt):
    CurCols.append( DayCols[jJ] )
# end for

In [139]:
JulDF = JulDF[CurCols].copy()

In [140]:
JulDF.reset_index(drop=True, inplace=True)

In [141]:
JulTotDays = JulDF['Wet_Count'].sum()
JulTotDays

5042037

Set up the arrays that will be used to store all of the wet day precipitation depths.

In [142]:
GridIDs = np.zeros( JulTotDays, dtype=np.int32 )
ModelIDs = np.zeros( JulTotDays, dtype=np.int32 )
PDDepth = np.zeros( JulTotDays, dtype=np.float32 )
MGridIDs = [ "%d" % x for x in GridIDs.tolist() ]

In [143]:
NumJulWet = len( JulDF )
NumJulWet

1905592

In [144]:
iCnt = 0
for iI in range(NumJulWet):
    cRow = JulDF.loc[iI]
    cMGridId = cRow.at["MGrid_Id"]
    cNumWet = int( cRow.at["Wet_Count"] )
    cGridId = ExGridNum( cMGridId )
    cModelId = ExModNum( cMGridId )
    for jJ in range(cNumWet):
        cPDep = float( cRow.at[CurCols[StartInd+jJ]] )
        MGridIDs[iCnt] = cMGridId
        GridIDs[iCnt] = cGridId
        ModelIDs[iCnt] = cModelId
        PDDepth[iCnt] = cPDep
        iCnt += 1
    # end of inner for
# end of outer for

In [145]:
DataDict = { "MGrid_Id" : MGridIDs,
             "Grid_Id" : GridIDs,
             "Model_Id" : ModelIDs,
             "Precip_mm" : PDDepth, }
JulDayDF = pd.DataFrame( data=DataDict )

Now save the July DataFrame as both a pickle and a feather file so that it can be reused from both Python and R.

In [146]:
JulPCKF = os.path.normpath( os.path.join( OUT_DIR, "Jul_WetDep_CMIP5_1981-2010.pickle" ) )
JulDayDF.to_pickle( JulPCKF )
JulFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Jul_WetDep_CMIP5_1981-2010.feather" ) )
JulDayDF.to_feather( JulFeatherF )

### Aug

Counts

In [147]:
AugDF = WetDF[RootCols].loc[WetDF['Month'] == 8].copy()

In [148]:
AugDF['Grid_Id'] = AugDF.apply( lambda row: ExGridNum(row['MGrid_Id']), axis=1 )
AugDF['Model_Id'] = AugDF.apply( lambda row: ExModNum(row['MGrid_Id']), axis=1 )

In [149]:
AugPCKF = os.path.normpath( os.path.join( OUT_DIR, "Aug_WetCnt_CMIP5_1981-2010.pickle" ) )
AugDF.to_pickle( AugPCKF )

In [150]:
AugDF = AugDF.reset_index()
AugFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Aug_WetCnt_CMIP5_1981-2010.feather" ) )
AugDF.to_feather( AugFeatherF )

Depths

In [151]:
AugDF = WetDF[WetDF['Month'] == 8].copy()

In [152]:
AugMaxCnt = AugDF['Wet_Count'].max()
AugMaxCnt

49

In [153]:
CurCols = [RootCols[0], RootCols[4], RootCols[5] ]

In [154]:
for jJ in range(AugMaxCnt):
    CurCols.append( DayCols[jJ] )
# end for

In [155]:
AugDF = AugDF[CurCols].copy()

In [156]:
AugDF.reset_index(drop=True, inplace=True)

In [157]:
AugTotDays = AugDF['Wet_Count'].sum()
AugTotDays

5399138

Set up the arrays that will be used to store all of the wet day precipitation depths.

In [158]:
GridIDs = np.zeros( AugTotDays, dtype=np.int32 )
ModelIDs = np.zeros( AugTotDays, dtype=np.int32 )
PDDepth = np.zeros( AugTotDays, dtype=np.float32 )
MGridIDs = [ "%d" % x for x in GridIDs.tolist() ]

In [159]:
NumAugWet = len( AugDF )
NumAugWet

1972429

In [160]:
iCnt = 0
for iI in range(NumAugWet):
    cRow = AugDF.loc[iI]
    cMGridId = cRow.at["MGrid_Id"]
    cNumWet = int( cRow.at["Wet_Count"] )
    cGridId = ExGridNum( cMGridId )
    cModelId = ExModNum( cMGridId )
    for jJ in range(cNumWet):
        cPDep = float( cRow.at[CurCols[StartInd+jJ]] )
        MGridIDs[iCnt] = cMGridId
        GridIDs[iCnt] = cGridId
        ModelIDs[iCnt] = cModelId
        PDDepth[iCnt] = cPDep
        iCnt += 1
    # end of inner for
# end of outer for

In [161]:
DataDict = { "MGrid_Id" : MGridIDs,
             "Grid_Id" : GridIDs,
             "Model_Id" : ModelIDs,
             "Precip_mm" : PDDepth, }
AugDayDF = pd.DataFrame( data=DataDict )

Now save the August DataFrame as both a pickle and a feather file so that it can be reused from both Python and R.

In [162]:
AugPCKF = os.path.normpath( os.path.join( OUT_DIR, "Aug_WetDep_CMIP5_1981-2010.pickle" ) )
AugDayDF.to_pickle( AugPCKF )
AugFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Aug_WetDep_CMIP5_1981-2010.feather" ) )
AugDayDF.to_feather( AugFeatherF )

### Sep

Counts

In [163]:
SepDF = WetDF[RootCols].loc[WetDF['Month'] == 9].copy()

In [164]:
SepDF['Grid_Id'] = SepDF.apply( lambda row: ExGridNum(row['MGrid_Id']), axis=1 )
SepDF['Model_Id'] = SepDF.apply( lambda row: ExModNum(row['MGrid_Id']), axis=1 )

In [165]:
SepPCKF = os.path.normpath( os.path.join( OUT_DIR, "Sep_WetCnt_CMIP5_1981-2010.pickle" ) )
SepDF.to_pickle( SepPCKF )

In [166]:
SepDF = SepDF.reset_index()
SepFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Sep_WetCnt_CMIP5_1981-2010.feather" ) )
SepDF.to_feather( SepFeatherF )

Depths

In [167]:
SepDF = WetDF[WetDF['Month'] == 9].copy()

In [168]:
SepMaxCnt = SepDF['Wet_Count'].max()
SepMaxCnt

37

In [169]:
CurCols = [RootCols[0], RootCols[4], RootCols[5] ]

In [170]:
for jJ in range(SepMaxCnt):
    CurCols.append( DayCols[jJ] )
# end for

In [171]:
SepDF = SepDF[CurCols].copy()

In [172]:
SepDF.reset_index(drop=True, inplace=True)

In [173]:
SepTotDays = SepDF['Wet_Count'].sum()
SepTotDays

6045548

Set up the arrays that will be used to store all of the wet day precipitation depths.

In [174]:
GridIDs = np.zeros( SepTotDays, dtype=np.int32 )
ModelIDs = np.zeros( SepTotDays, dtype=np.int32 )
PDDepth = np.zeros( SepTotDays, dtype=np.float32 )
MGridIDs = [ "%d" % x for x in GridIDs.tolist() ]

In [175]:
NumSepWet = len( SepDF )
NumSepWet

2029163

In [176]:
iCnt = 0
for iI in range(NumSepWet):
    cRow = SepDF.loc[iI]
    cMGridId = cRow.at["MGrid_Id"]
    cNumWet = int( cRow.at["Wet_Count"] )
    cGridId = ExGridNum( cMGridId )
    cModelId = ExModNum( cMGridId )
    for jJ in range(cNumWet):
        cPDep = float( cRow.at[CurCols[StartInd+jJ]] )
        MGridIDs[iCnt] = cMGridId
        GridIDs[iCnt] = cGridId
        ModelIDs[iCnt] = cModelId
        PDDepth[iCnt] = cPDep
        iCnt += 1
    # end of inner for
# end of outer for

In [177]:
DataDict = { "MGrid_Id" : MGridIDs,
             "Grid_Id" : GridIDs,
             "Model_Id" : ModelIDs,
             "Precip_mm" : PDDepth, }
SepDayDF = pd.DataFrame( data=DataDict )

Now save the September DataFrame as both a pickle and a feather file so that it can be reused from both Python and R.

In [178]:
SepPCKF = os.path.normpath( os.path.join( OUT_DIR, "Sep_WetDep_CMIP5_1981-2010.pickle" ) )
SepDayDF.to_pickle( SepPCKF )
SepFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Sep_WetDep_CMIP5_1981-2010.feather" ) )
SepDayDF.to_feather( SepFeatherF )

### Oct

Counts

In [179]:
OctDF = WetDF[RootCols].loc[WetDF['Month'] == 10].copy()

In [180]:
OctDF['Grid_Id'] = OctDF.apply( lambda row: ExGridNum(row['MGrid_Id']), axis=1 )
OctDF['Model_Id'] = OctDF.apply( lambda row: ExModNum(row['MGrid_Id']), axis=1 )

In [181]:
OctPCKF = os.path.normpath( os.path.join( OUT_DIR, "Oct_WetCnt_CMIP5_1981-2010.pickle" ) )
OctDF.to_pickle( OctPCKF )

In [182]:
OctDF = OctDF.reset_index()
OctFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Oct_WetCnt_CMIP5_1981-2010.feather" ) )
OctDF.to_feather( OctFeatherF )

Depths

In [183]:
OctDF = WetDF[WetDF['Month'] == 10].copy()

In [184]:
OctMaxCnt = OctDF['Wet_Count'].max()
OctMaxCnt

27

In [185]:
CurCols = [RootCols[0], RootCols[4], RootCols[5] ]

In [186]:
for jJ in range(OctMaxCnt):
    CurCols.append( DayCols[jJ] )
# end for

In [187]:
OctDF = OctDF[CurCols].copy()

In [188]:
OctDF.reset_index(drop=True, inplace=True)

In [189]:
OctTotDays = OctDF['Wet_Count'].sum()
OctTotDays

4682225

Set up the arrays that will store all of the wet-day precipitation depths.

In [190]:
GridIDs = np.zeros( OctTotDays, dtype=np.int32 )
ModelIDs = np.zeros( OctTotDays, dtype=np.int32 )
PDDepth = np.zeros( OctTotDays, dtype=np.float32 )
# placeholder list of strings, one per wet day; every entry is overwritten in the loop below
MGridIDs = [ "%d" % x for x in GridIDs.tolist() ]

In [191]:
NumOctWet = len( OctDF )
NumOctWet

1877449

In [192]:
iCnt = 0    # running position in the flat output arrays
for iI in range(NumOctWet):
    cRow = OctDF.loc[iI]
    cMGridId = cRow.at["MGrid_Id"]
    cNumWet = int( cRow.at["Wet_Count"] )
    cGridId = ExGridNum( cMGridId )
    cModelId = ExModNum( cMGridId )
    # copy the first cNumWet day columns of this row, one output entry per wet day
    for jJ in range(cNumWet):
        cPDep = float( cRow.at[CurCols[StartInd+jJ]] )
        MGridIDs[iCnt] = cMGridId
        GridIDs[iCnt] = cGridId
        ModelIDs[iCnt] = cModelId
        PDDepth[iCnt] = cPDep
        iCnt += 1
    # end of inner for
# end of outer for

In [193]:
DataDict = { "MGrid_Id" : MGridIDs,
             "Grid_Id" : GridIDs,
             "Model_Id" : ModelIDs,
             "Precip_mm" : PDDepth, }
OctDayDF = pd.DataFrame( data=DataDict )

Save the October depths DataFrame as both a pickle and a feather file so that it can be reused from both Python and R.

In [194]:
OctPCKF = os.path.normpath( os.path.join( OUT_DIR, "Oct_WetDep_CMIP5_1981-2010.pickle" ) )
OctDayDF.to_pickle( OctPCKF )
OctFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Oct_WetDep_CMIP5_1981-2010.feather" ) )
OctDayDF.to_feather( OctFeatherF )

### Nov

Counts

In [195]:
NovDF = WetDF[RootCols].loc[WetDF['Month'] == 11].copy()

In [196]:
NovDF['Grid_Id'] = NovDF.apply( lambda row: ExGridNum(row['MGrid_Id']), axis=1 )
NovDF['Model_Id'] = NovDF.apply( lambda row: ExModNum(row['MGrid_Id']), axis=1 )

In [197]:
NovPCKF = os.path.normpath( os.path.join( OUT_DIR, "Nov_WetCnt_CMIP5_1981-2010.pickle" ) )
NovDF.to_pickle( NovPCKF )

In [198]:
NovDF = NovDF.reset_index()
NovFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Nov_WetCnt_CMIP5_1981-2010.feather" ) )
NovDF.to_feather( NovFeatherF )
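
Note that the counts frames are reset with `reset_index()` before the feather export, while the depths frames further down use `reset_index(drop=True)`. The two calls differ in what happens to the original row labels, which the toy example below illustrates (the frame and its values are invented): without `drop=True` the old labels are kept as an extra `index` column, so that column also ends up in the exported file.

```python
import pandas as pd

# toy stand-in for WetDF; values are invented
toy = pd.DataFrame({"Month": [9, 10, 11, 10], "Wet_Count": [3, 5, 2, 4]})
subset = toy.loc[toy["Month"] == 10].copy()    # row labels are 1 and 3

kept = subset.reset_index()                # old labels survive as an "index" column
dropped = subset.reset_index(drop=True)    # old labels are discarded

print(kept.columns.tolist())
print(dropped.columns.tolist())
```

If the original row labels are not needed in R, `drop=True` keeps the counts file free of the extra column.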

Depths

In [199]:
NovDF = WetDF[WetDF['Month'] == 11].copy()

In [200]:
NovMaxCnt = NovDF['Wet_Count'].max()
NovMaxCnt

18

In [201]:
CurCols = [RootCols[0], RootCols[4], RootCols[5] ]

In [202]:
for jJ in range(NovMaxCnt):
    CurCols.append( DayCols[jJ] )
# end for

In [203]:
NovDF = NovDF[CurCols].copy()

In [204]:
NovDF.reset_index(drop=True, inplace=True)

In [205]:
NovTotDays = NovDF['Wet_Count'].sum()
NovTotDays

3789844

Set up the arrays that will store all of the wet-day precipitation depths.

In [206]:
GridIDs = np.zeros( NovTotDays, dtype=np.int32 )
ModelIDs = np.zeros( NovTotDays, dtype=np.int32 )
PDDepth = np.zeros( NovTotDays, dtype=np.float32 )
# placeholder list of strings, one per wet day; every entry is overwritten in the loop below
MGridIDs = [ "%d" % x for x in GridIDs.tolist() ]

In [207]:
NumNovWet = len( NovDF )
NumNovWet

1815846

In [208]:
iCnt = 0    # running position in the flat output arrays
for iI in range(NumNovWet):
    cRow = NovDF.loc[iI]
    cMGridId = cRow.at["MGrid_Id"]
    cNumWet = int( cRow.at["Wet_Count"] )
    cGridId = ExGridNum( cMGridId )
    cModelId = ExModNum( cMGridId )
    # copy the first cNumWet day columns of this row, one output entry per wet day
    for jJ in range(cNumWet):
        cPDep = float( cRow.at[CurCols[StartInd+jJ]] )
        MGridIDs[iCnt] = cMGridId
        GridIDs[iCnt] = cGridId
        ModelIDs[iCnt] = cModelId
        PDDepth[iCnt] = cPDep
        iCnt += 1
    # end of inner for
# end of outer for

In [209]:
DataDict = { "MGrid_Id" : MGridIDs,
             "Grid_Id" : GridIDs,
             "Model_Id" : ModelIDs,
             "Precip_mm" : PDDepth, }
NovDayDF = pd.DataFrame( data=DataDict )

Save the November depths DataFrame as both a pickle and a feather file so that it can be reused from both Python and R.

In [210]:
NovPCKF = os.path.normpath( os.path.join( OUT_DIR, "Nov_WetDep_CMIP5_1981-2010.pickle" ) )
NovDayDF.to_pickle( NovPCKF )
NovFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Nov_WetDep_CMIP5_1981-2010.feather" ) )
NovDayDF.to_feather( NovFeatherF )

### Dec

Counts

In [211]:
DecDF = WetDF[RootCols].loc[WetDF['Month'] == 12].copy()

In [212]:
DecDF['Grid_Id'] = DecDF.apply( lambda row: ExGridNum(row['MGrid_Id']), axis=1 )
DecDF['Model_Id'] = DecDF.apply( lambda row: ExModNum(row['MGrid_Id']), axis=1 )

In [213]:
DecPCKF = os.path.normpath( os.path.join( OUT_DIR, "Dec_WetCnt_CMIP5_1981-2010.pickle" ) )
DecDF.to_pickle( DecPCKF )

In [214]:
DecDF = DecDF.reset_index()
DecFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Dec_WetCnt_CMIP5_1981-2010.feather" ) )
DecDF.to_feather( DecFeatherF )

Depths

In [215]:
DecDF = WetDF[WetDF['Month'] == 12].copy()

In [216]:
DecMaxCnt = DecDF['Wet_Count'].max()
DecMaxCnt

18

In [217]:
CurCols = [RootCols[0], RootCols[4], RootCols[5] ]

In [218]:
for jJ in range(DecMaxCnt):
    CurCols.append( DayCols[jJ] )
# end for
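
The column list built in the two cells above can also be produced in one step with a list slice. The sketch below compares both forms; the column names are hypothetical placeholders, since `RootCols` and `DayCols` are defined earlier in this notebook.

```python
# Hypothetical column names, used only for this illustration
root_cols = ["MGrid_Id", "Year", "Month", "Day", "Wet_Count", "Total_Depth"]
day_cols = ["Day_%d" % d for d in range(1, 32)]
max_cnt = 18

# loop form, as in the cells above
loop_cols = [root_cols[0], root_cols[4], root_cols[5]]
for j in range(max_cnt):
    loop_cols.append(day_cols[j])

# slice form: the first max_cnt day columns appended in one step
slice_cols = [root_cols[0], root_cols[4], root_cols[5]] + day_cols[:max_cnt]

print(loop_cols == slice_cols)
```

Slicing also accepts the NumPy integer returned by `Series.max()`, so no conversion of the maximum count should be needed.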

In [219]:
DecDF = DecDF[CurCols].copy()

In [220]:
DecDF.reset_index(drop=True, inplace=True)

In [221]:
DecTotDays = DecDF['Wet_Count'].sum()
DecTotDays

3392164

Set up the arrays that will store all of the wet-day precipitation depths.

In [222]:
GridIDs = np.zeros( DecTotDays, dtype=np.int32 )
ModelIDs = np.zeros( DecTotDays, dtype=np.int32 )
PDDepth = np.zeros( DecTotDays, dtype=np.float32 )
# placeholder list of strings, one per wet day; every entry is overwritten in the loop below
MGridIDs = [ "%d" % x for x in GridIDs.tolist() ]

In [223]:
NumDecWet = len( DecDF )
NumDecWet

1766793

In [224]:
iCnt = 0    # running position in the flat output arrays
for iI in range(NumDecWet):
    cRow = DecDF.loc[iI]
    cMGridId = cRow.at["MGrid_Id"]
    cNumWet = int( cRow.at["Wet_Count"] )
    cGridId = ExGridNum( cMGridId )
    cModelId = ExModNum( cMGridId )
    # copy the first cNumWet day columns of this row, one output entry per wet day
    for jJ in range(cNumWet):
        cPDep = float( cRow.at[CurCols[StartInd+jJ]] )
        MGridIDs[iCnt] = cMGridId
        GridIDs[iCnt] = cGridId
        ModelIDs[iCnt] = cModelId
        PDDepth[iCnt] = cPDep
        iCnt += 1
    # end of inner for
# end of outer for

In [225]:
DataDict = { "MGrid_Id" : MGridIDs,
             "Grid_Id" : GridIDs,
             "Model_Id" : ModelIDs,
             "Precip_mm" : PDDepth, }
DecDayDF = pd.DataFrame( data=DataDict )

Save the December depths DataFrame as both a pickle and a feather file so that it can be reused from both Python and R.

In [226]:
DecPCKF = os.path.normpath( os.path.join( OUT_DIR, "Dec_WetDep_CMIP5_1981-2010.pickle" ) )
DecDayDF.to_pickle( DecPCKF )
DecFeatherF = os.path.normpath( os.path.join( OUT_DIR, "Dec_WetDep_CMIP5_1981-2010.feather" ) )
DecDayDF.to_feather( DecFeatherF )
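
The output file names in every monthly section differ only in the month abbreviation and the `WetCnt` or `WetDep` tag. If this notebook is rerun for another period, a small helper along the following lines could build the paths instead of retyping them. `month_paths` and `MONTH_ABBR` are hypothetical names, and the `"out"` directory stands in for `OUT_DIR`.

```python
import os

MONTH_ABBR = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def month_paths(out_dir, month, kind):
    """Return (pickle_path, feather_path) for a month number 1-12.

    kind is the tag used in this notebook's file names: "WetCnt" or "WetDep".
    """
    stem = "%s_%s_CMIP5_1981-2010" % (MONTH_ABBR[month - 1], kind)
    pickle_path = os.path.normpath(os.path.join(out_dir, stem + ".pickle"))
    feather_path = os.path.normpath(os.path.join(out_dir, stem + ".feather"))
    return pickle_path, feather_path

# placeholder output directory for the example
pck, fth = month_paths("out", 12, "WetDep")
print(pck)
print(fth)
```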