# Tutorial 2: Analyzing Statistics and Variogram of the Ice Stream Data

### First load data as before

In [1]:
import numpy as np
import pandas as pd
# load compiled bed elevation measurements
df = pd.read_csv('')

In [None]:
# create a grid of x and y coordinates
x_uniq = np.unique(df.X)
y_uniq = np.unique(df.Y)

xmin = np.min(x_uniq)
xmax = np.max(x_uniq)
ymin = np.min(y_uniq)
ymax = np.max(y_uniq)

cols = len(x_uniq)
rows = len(y_uniq)

resolution = 1000

xx, yy = np.meshgrid(x_uniq, y_uniq)

In [None]:
# load other data
velx, vely, velxerr, velyerr, fig = Topography.load_vel_measures('../../Data/antarctica_ice_velocity_450m_v2.nc', xx, yy)
dhdt, fig = Topography.load_dhdt('../Data/ANT_G1920_GroundedIceHeight_v01.nc', xx, yy, interp_method='linear', begin_year=2013, end_year=2015, month=7)
smb, fig = Topography.load_smb_racmo('../Data/SMB_RACMO2.3p2_yearly_ANT27_1979_2016.nc', xx, yy, interp_method='spline', time=2014)
bm_mask, bm_source, bm_bed, bm_surface, bm_errbed, fig = Topography.load_bedmachine('../../Data/BedMachineAntarctica-v3.nc', xx, yy)

### Now it is time to analyze the compiled radar data

#### Fit variogram

MCMC.fit_variogram is a wrapper around functions in the scikit-learn and scikit-gstat Python packages. It normalizes the provided bed elevation data, calculates the experimental semivariogram, and fits it with four different models (Gaussian, Exponential, Spherical, Matern). The output figure can be used to decide visually which variogram model fits the data best.

In [None]:
# find variograms
df_bed = df[df["bed"].notnull()].copy()  # keep only rows that have a bed measurement
data = df_bed['bed'].values.reshape(-1, 1)
coords = df_bed[['X', 'Y']].values
# BedMachine mask value 2 marks grounded ice; see the user guide: https://nsidc.org/data/nsidc-0756/versions/3
roughness_region_mask = df_bed['bedmachine_mask'].values == 2

nst_trans, Nbed_radar, varios, fig = MCMC.fit_variogram(data, coords, roughness_region_mask, maxlag=70000, n_lags=70)
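
To make the roles of *maxlag* and *n_lags* concrete, here is a minimal NumPy sketch of an isotropic experimental semivariogram. It is not the implementation behind MCMC.fit_variogram; the function name `experimental_variogram` and the toy data are hypothetical and only illustrate how point pairs are binned by separation distance.

```python
import numpy as np

def experimental_variogram(coords, values, maxlag, n_lags):
    """Isotropic experimental semivariogram with n_lags equal-width bins up to maxlag."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float).ravel()

    # every unique pair of points (i < j)
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diff = (values[i] - values[j]) ** 2

    edges = np.linspace(0.0, maxlag, n_lags + 1)
    # bin index per pair; pairs farther apart than maxlag get index n_lags and are ignored
    bin_idx = np.digitize(dist, edges[1:], right=True)

    gamma = np.full(n_lags, np.nan)
    counts = np.zeros(n_lags, dtype=int)
    for b in range(n_lags):
        in_bin = bin_idx == b
        counts[b] = in_bin.sum()
        if counts[b] > 0:
            gamma[b] = 0.5 * sq_diff[in_bin].mean()

    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, gamma, counts

# Toy data (hypothetical): 200 random points in a 10 km square
rng = np.random.default_rng(0)
toy_coords = rng.uniform(0, 10_000, size=(200, 2))
toy_values = np.sin(toy_coords[:, 0] / 2_000) + 0.1 * rng.standard_normal(200)

centers, gamma, counts = experimental_variogram(toy_coords, toy_values, maxlag=5_000, n_lags=10)
for h, g, n in zip(centers, gamma, counts):
    print(f"lag {h:7.0f} m   gamma {g:6.3f}   pairs {n}")
```

Printing the pair counts per bin alongside the semivariance is a quick way to see how many point pairs support each lag when experimenting with different *maxlag* and *n_lags* values.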

__Q1: How do maxlag and n_lags affect the calculated variogram? How would you choose their values so that the variogram is accurately represented?__

__Q2: Which variogram model would you choose? Why?__

#### Find high velocity region

In addition, since we have velocity data, it is useful to know where the high-velocity region is, because the mass conservation technique works best in regions of fast-flowing ice.

The returned *highvel_mask* smoothly encloses the high-velocity region while excluding locations where the ice is not grounded. This region can be used later to constrain the sampling locations of the MCMC.

The function first finds the high-velocity region and then smooths its boundary. Because this smoothing generally shrinks the region, the boundary is afterwards expanded outward by *distance_max* meters.

The smoothness of the boundary can be adjusted with the optional argument *smooth_mode*, which defaults to 10. A higher *smooth_mode* gives a smoother boundary.

The *ocean_mask* is 1 where the location is ocean (ice-free open water, sea ice, or ice shelf) and 0 otherwise.

The *grounded_ice_mask* is 1 where ice is present and grounded, and 0 otherwise.

For example, a location where *ocean_mask == 0 and grounded_ice_mask == 0* is ice-free terrestrial land.

In [None]:
# calculate high velocity region
ocean_mask = (bm_mask == 0) | (bm_mask == 3) # utilize the mask in BedMachine dataset to characterize ice regions
grounded_ice_mask = (bm_mask == 2)
distance_max = 3000
velocity_threshold = 50
highvel_mask = Topography.get_highvel_boundary(velx, vely, velocity_threshold, grounded_ice_mask, ocean_mask, distance_max, xx, yy)
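
The mask logic described above can be sketched on a tiny grid. This is only an illustration under stated assumptions, not the code inside Topography.get_highvel_boundary: the `dilate` helper, the toy arrays, and the choice to skip the smoothing step are all hypothetical, and the velocity threshold is assumed to be in the same units as the velocity data.

```python
import numpy as np

# Toy BedMachine-style mask: 0 ocean, 1 ice-free land, 2 grounded ice, 3 floating ice
toy_mask = np.array([
    [0, 0, 3, 3, 2, 2],
    [0, 3, 3, 2, 2, 2],
    [1, 1, 2, 2, 2, 2],
    [1, 2, 2, 2, 2, 2],
])
toy_velx = np.array([
    [0, 0, 80, 80, 80, 10],
    [0, 0, 80, 80, 60, 10],
    [0, 0, 10, 10, 10, 10],
    [0, 5, 10, 10, 10, 10],
], dtype=float)
toy_vely = np.zeros_like(toy_velx)

toy_ocean = (toy_mask == 0) | (toy_mask == 3)
toy_grounded = toy_mask == 2
toy_ice_free_land = ~toy_ocean & ~toy_grounded  # neither ocean nor grounded ice

def dilate(mask, n_cells):
    """Grow a boolean mask outward by n_cells (4-connected neighbourhood)."""
    out = mask.copy()
    for _ in range(n_cells):
        p = np.pad(out, 1, mode="constant", constant_values=False)
        out = p[1:-1, 1:-1] | p[:-2, 1:-1] | p[2:, 1:-1] | p[1:-1, :-2] | p[1:-1, 2:]
    return out

toy_resolution = 1000     # grid spacing in meters
toy_distance_max = 1000   # expansion distance in meters
toy_threshold = 50        # speed threshold, same units as the velocity arrays

speed = np.hypot(toy_velx, toy_vely)
fast = (speed >= toy_threshold) & toy_grounded
n_cells = int(round(toy_distance_max / toy_resolution))
# expand the fast region, but never outside grounded ice
toy_highvel = dilate(fast, n_cells) & toy_grounded

print(toy_highvel.astype(int))
```

The final `& toy_grounded` reflects the requirement that the high-velocity region excludes locations where the ice is not grounded, even after the boundary is expanded.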

#### Generate initial bed

Let's generate a bed for the entire region with sequential Gaussian simulation (SGS):

https://gatorglaciology.github.io/gstatsimbook/4_Sequential_Gaussian_Simulation.html 

In [None]:
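k = 48        # number of neighboring conditioning points used for each simulated node
rad = 50000   # search radius, in the same units as X and Y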
vario = varios[2]

# stack the flattened coordinates into an (N, 2) array; np.concatenate with axis=1 fails on 1-D inputs
Pred_grid_xy = np.column_stack((xx.flatten(), yy.flatten()))

df_bed['Nbed_radar'] = Nbed_radar.flatten()

sim = gs.Interpolation.okrige_sgs(Pred_grid_xy, df_bed, 'X', 'Y', 'Nbed_radar', k, vario, rad)

xy_grid = np.column_stack((xx.flatten(), yy.flatten(), sim.flatten()))
psimdf = pd.DataFrame(data = xy_grid,columns=['X','Y','Z'],index=df.index)

sgs_bed = nst_trans.inverse_transform(np.array(psimdf['Z']).reshape(-1,1)).reshape(rows,cols)
np.savetxt('sgs_bed.txt',sgs_bed)

In [None]:
sgs_bed = np.loadtxt('sgs_bed.txt')
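
For intuition about what the SGS step above does, here is a toy one-dimensional sketch. It deliberately swaps in simple kriging with a zero mean and uses every previously known point, whereas the cell above calls an ordinary-kriging SGS routine with a neighbor limit and a search radius. The function names, the exponential covariance, and all numbers are hypothetical.

```python
import numpy as np

def exp_cov(h, sill, corr_range):
    """Exponential covariance with practical range corr_range."""
    return sill * np.exp(-3.0 * np.abs(h) / corr_range)

def toy_sgs_1d(grid_x, data_x, data_z, sill, corr_range, rng):
    """Simulate values on grid_x conditioned on (data_x, data_z), which must be distinct points."""
    known_x = list(data_x)
    known_z = list(data_z)
    sim = np.full(len(grid_x), np.nan)
    for idx in rng.permutation(len(grid_x)):  # visit grid nodes in random order
        x0 = grid_x[idx]
        kx = np.array(known_x)
        kz = np.array(known_z)
        hit = np.isclose(kx, x0)
        if hit.any():
            sim[idx] = kz[hit][0]  # honour the measurement exactly
            continue
        # simple kriging: solve C w = c0 for the weights
        C = exp_cov(kx[:, None] - kx[None, :], sill, corr_range)
        c0 = exp_cov(kx - x0, sill, corr_range)
        w = np.linalg.solve(C, c0)
        mean = w @ kz
        var = max(sill - w @ c0, 0.0)
        # draw from the conditional distribution, then treat the draw as known
        sim[idx] = rng.normal(mean, np.sqrt(var))
        known_x.append(x0)
        known_z.append(sim[idx])
    return sim

rng = np.random.default_rng(42)
grid_x = np.arange(50.0)
data_x = np.array([5.0, 20.0, 40.0])
data_z = np.array([-1.0, 0.5, 1.2])  # normal-score values at the data locations
sim = toy_sgs_1d(grid_x, data_x, data_z, sill=1.0, corr_range=15.0, rng=rng)
print(np.round(sim[:10], 2))
```

Each simulated node is added to the conditioning set before the next node is visited, which is what makes the simulation sequential. A different random seed gives a different realization that still passes through the data values.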

SGS only generates a realization of the bed topography that honors the known radar measurements, so it can ignore other constraints. The ice thickness can be calculated as ice surface elevation minus bed elevation. From BedMachine, we know which areas are ice-free and which are covered by ice. However, SGS may generate a bed that protrudes above the ice surface in regions that are supposed to be grounded ice. Let's fix that quickly.

In [None]:
thickness = bm_surface - sgs_bed
sgs_bed = np.where((thickness<=0)&(bm_mask==2), bm_surface-1, sgs_bed)
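
A small worked example, using made-up numbers, shows what this correction does: wherever the simulated bed reaches or exceeds the surface inside grounded ice, the bed is reset to one unit below the surface, and every other cell is left unchanged.

```python
import numpy as np

# Hypothetical 2x2 grids (elevations in meters)
toy_surface = np.array([[100.0, 200.0],
                        [300.0, 400.0]])
toy_bed = np.array([[50.0, 250.0],
                    [350.0, 100.0]])
toy_mask = np.array([[2, 2],
                     [1, 2]])  # 2 = grounded ice, 1 = ice-free land

toy_thickness = toy_surface - toy_bed
# only cells that are grounded ice AND have non-positive thickness are changed
toy_fixed = np.where((toy_thickness <= 0) & (toy_mask == 2), toy_surface - 1, toy_bed)

print(toy_fixed)
# [[ 50. 199.]
#  [350. 100.]]
```

The cell at row 1, column 0 also has a negative thickness, but it is ice-free land in the mask, so its bed value is kept.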

There are a few other things we would like to record before beginning the MCMC chains.

In [None]:
cond_bed = df['bed'].values.reshape(xx.shape)
data_mask = ~np.isnan(cond_bed)
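
Reshaping a flat column directly onto the grid only works if the table has exactly one row per grid cell, stored in the same row-major order as the meshgrid. That is an assumption inferred from the reshape call, not something stated above. The sketch below uses hypothetical toy arrays to check the ordering and to restore it with `np.lexsort` when it does not hold.

```python
import numpy as np

x_uniq = np.array([0.0, 1000.0, 2000.0])
y_uniq = np.array([0.0, 1000.0])
xx, yy = np.meshgrid(x_uniq, y_uniq)

# Flat columns in meshgrid order: Y varies slowest, X fastest
flat_x, flat_y = xx.ravel(), yy.ravel()
flat_bed = np.array([np.nan, -500.0, np.nan, -450.0, np.nan, np.nan])

# Simulate a table whose rows were shuffled
perm = np.array([3, 0, 5, 1, 4, 2])
shuf_x, shuf_y, shuf_bed = flat_x[perm], flat_y[perm], flat_bed[perm]

def matches_grid(x, y, xx, yy):
    """True if the flat coordinate columns reshape exactly onto the meshgrid."""
    return np.array_equal(x.reshape(xx.shape), xx) and np.array_equal(y.reshape(yy.shape), yy)

is_ordered = matches_grid(shuf_x, shuf_y, xx, yy)

# Sort by Y first, then X (the last key passed to lexsort is the primary key)
order = np.lexsort((shuf_x, shuf_y))
cond_bed = shuf_bed[order].reshape(xx.shape)
data_mask = ~np.isnan(cond_bed)

print(is_ordered)
print(data_mask)
```

Running a check like `matches_grid` on the X and Y columns before reshaping guards against a conditioning grid whose measurements silently land in the wrong cells.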