# DEA Coastlines vector statistics <img align="right" src="https://github.com/GeoscienceAustralia/dea-notebooks/raw/develop/Supplementary_data/dea_logo.jpg">

This code conducts vector subpixel coastline extraction for DEA Coastlines:

* Apply morphological extraction algorithms to mask annual median composite rasters to a valid coastal region
* Extract waterline vectors using subpixel waterline extraction ([Bishop-Taylor et al. 2019b](https://doi.org/10.3390/rs11242984))
* Compute rates of coastal change every 30 m along Australia's non-rocky coastlines using linear regression

This is an interactive version of the code intended for prototyping; to run this analysis at scale, use the [`deacoastlines_statistics.py`](deacoastlines_statistics.py) Python script.
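
As a rough illustration of the rate-of-change step listed above, the sketch below fits an ordinary least-squares line to the annual shoreline distances of a single point. The distances are made-up values, and `np.polyfit` is only a stand-in for whatever `deacoastlines_statistics` does internally; the slope is read as metres per year, with negative values indicating retreat.

```python
import numpy as np

# Hypothetical distances (m) from the baseline shoreline for one point
years = np.array([2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020])
distances = np.array([14.0, 12.0, 10.0, 8.0, 6.0, 4.0, 2.0, 0.0])

# Degree-1 least-squares fit; years are offset from the first year
# to keep the fit well conditioned
slope, intercept = np.polyfit(years - years[0], distances, deg=1)

print(f'Rate of change: {slope:.2f} m / year')
```

A negative slope here corresponds to the `retreat` flag used later in the notebook (`rate_time < 0`).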

**Compatibility:**
```
module use /g/data/v10/public/modules/modulefiles
module load dea/20200713
pip install --user ruptures
pip install --user git+https://github.com/mattijn/topojson/
```
---

### Load packages

First we import all required Python packages, and then start the vector coastline extraction process.

In [None]:
pip install topojson

In [None]:
pip install ruptures

In [20]:
%matplotlib inline
%load_ext line_profiler
%load_ext autoreload
%autoreload 2

import deacoastlines_statistics as deacl_stats

import os
import sys
import geopandas as gpd
from shapely.geometry import box
from rasterio.transform import array_bounds
import pandas as pd
import shutil
import matplotlib.pyplot as plt


The line_profiler extension is already loaded. To reload it, use:
  %reload_ext line_profiler
The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


## Load in data

In [21]:
# Read in contours
study_area = 3
# study_area = 6931
# study_area = 8529
# study_area = 8531
# study_area = 4208
raster_version = 'v0.0.1'
vector_version = 'v0.0.1'
water_index = 'mndwi'
index_threshold = -0.05
baseline_year = '2019'


## Load DEA Coastlines rasters

In [22]:
yearly_ds, gapfill_ds = deacl_stats.load_rasters(raster_version, 
                                                 study_area, 
                                                 water_index)
print(yearly_ds)

# Create output vector folder
output_dir = f'output_data/{study_area}_{raster_version}/vectors'
os.makedirs(f'{output_dir}/shapefiles', exist_ok=True)

<xarray.Dataset>
Dimensions:  (x: 1811, y: 1150, year: 8)
Coordinates:
  * year     (year) int64 2013 2014 2015 2016 2017 2018 2019 2020
  * y        (y) float64 1.638e+06 1.638e+06 1.638e+06 ... 1.603e+06 1.603e+06
  * x        (x) float64 2.26e+05 2.260e+05 2.261e+05 ... 2.803e+05 2.803e+05
Data variables:
    mndwi    (year, y, x) float32 0.0047475346 -0.03017528 ... -0.4887228
    tide_m   (year, y, x) float32 0.02191715 0.021915795 ... -0.16330433
    count    (year, y, x) int16 7 8 9 7 6 6 7 8 9 ... 18 19 19 19 19 19 19 19 19
    stdev    (year, y, x) float32 0.6094196 0.5997575 ... 0.0533742 0.0732927
Attributes:
    transform:      | 30.00, 0.00, 226005.00|\n| 0.00,-30.00, 1637925.00|\n| ...
    crs:            +init=epsg:32628
    res:            (30.0, 30.0)
    is_tiled:       1
    nodatavals:     (nan,)
    scales:         (1.0,)
    offsets:        (0.0,)
    AREA_OR_POINT:  Area


## Load vector data

In [23]:
# Get bounding box to load data for
bbox = gpd.GeoSeries(box(*array_bounds(height=yearly_ds.sizes['y'],
                                       width=yearly_ds.sizes['x'],
                                       transform=yearly_ds.transform)),
                     crs=yearly_ds.crs)

# # Rocky shore mask
# smartline_gdf = (gpd.read_file('input_data/Smartline.gdb', 
#                                bbox=bbox).to_crs(yearly_ds.crs))

# Tide points
tide_points_gdf = (gpd.read_file('input_data/tide_points_coastal_wgs84.geojson', 
                            bbox=bbox).to_crs(yearly_ds.crs))

# Study area polygon
comp_gdf = (gpd.read_file('input_data/coastal_grid_wgs84.geojson', 
                          bbox=bbox)
            .set_index('id')
            .to_crs(str(yearly_ds.crs)))

# Mask to study area
study_area_poly = comp_gdf.loc[study_area]

# # Load climate indices
# climate_df = pd.read_csv('input_data/soi.long.data', 
#                          header=None, 
#                          delimiter='  ', 
#                          skiprows=1, 
#                          index_col=0, 
#                          skipfooter=9,
#                          engine='python').mean(axis=1).to_frame('soi')
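
To make the bounding box step more concrete, the sketch below reproduces by hand what `array_bounds` returns for a north-up grid: the `(west, south, east, north)` extent derived from the raster shape and affine transform. The helper name `north_up_bounds` is hypothetical; the numbers are taken from the dataset printed earlier (1811 x 1150 pixels at 30 m, origin at 226005, 1637925).

```python
def north_up_bounds(height, width, pixel_size, x_origin, y_origin):
    """Return (west, south, east, north) for a north-up raster grid."""
    west = x_origin
    north = y_origin
    east = x_origin + width * pixel_size
    south = y_origin - height * pixel_size
    return west, south, east, north


# Shape, resolution and origin as printed for `yearly_ds` above
bounds = north_up_bounds(height=1150, width=1811, pixel_size=30.0,
                         x_origin=226005.0, y_origin=1637925.0)
print(bounds)
```

The resulting extent is consistent with the `x` and `y` coordinate ranges shown in the dataset summary.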




## Extract shoreline contours

### Extract ocean-masked contours

In [24]:
# # Generate waterbody mask
# waterbody_mask = deacl_stats.waterbody_masking(
#     input_data='input_data/SurfaceHydrologyPolygonsRegional.gdb',
#     modification_data='input_data/estuary_mask_modifications.geojson',
#     bbox=bbox,
#     yearly_ds=yearly_ds)

In [25]:
# Mask dataset to focus on coastal zone only
masked_ds = deacl_stats.contours_preprocess(
    yearly_ds,
    gapfill_ds,
    water_index,
    index_threshold,
#     waterbody_mask,
    tide_points_gdf,
    output_path=f'output_data/{study_area}_{raster_version}') 

In [26]:
from datacube.utils.cog import write_cog

write_cog(masked_ds.sel(year=2019),
          fname='2019_masked.tif',
          overwrite=True)

PosixPath('2019_masked.tif')

In [27]:
# Extract contours
contours_gdf = (deacl_stats.subpixel_contours(da=masked_ds,
                                              z_values=index_threshold,
                                              min_vertices=10,
                                              dim='year',
                                              output_path=f'temp2.geojson')
                .set_index('year'))

Operating in single z-value, multiple arrays mode
Writing contours to temp2.geojson
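
The idea behind subpixel extraction can be illustrated in one dimension: rather than snapping the waterline to a pixel edge, the position where the water index crosses the threshold is interpolated between neighbouring pixel centres. The following is a toy sketch under that assumption, not the implementation inside `deacl_stats.subpixel_contours`; the function name and the profile values are hypothetical.

```python
import numpy as np


def subpixel_crossing(x, values, threshold):
    """Linearly interpolate where `values` first crosses `threshold`."""
    above = values > threshold
    # Indices where neighbouring pixels fall on opposite sides
    crossings = np.nonzero(above[:-1] != above[1:])[0]
    if crossings.size == 0:
        return None
    i = crossings[0]
    # Fraction of the way from pixel i to pixel i + 1
    frac = (threshold - values[i]) / (values[i + 1] - values[i])
    return x[i] + frac * (x[i + 1] - x[i])


# Hypothetical MNDWI profile sampled at 30 m pixel centres
x = np.array([0.0, 30.0, 60.0, 90.0])
mndwi = np.array([-0.45, -0.25, 0.15, 0.40])
waterline_x = subpixel_crossing(x, mndwi, threshold=-0.05)
print(waterline_x)
```

The crossing lands between the second and third pixel centres rather than on either of them, which is what gives the extracted contours their subpixel precision.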


## Compute statistics
### Create stats points on baseline contour

In [28]:
# Extract statistics modelling points along baseline contour
points_gdf = deacl_stats.points_on_line(contours_gdf, 
                                        baseline_year, 
                                        distance=30)

# # Clip to remove rocky shoreline points
# points_gdf = deacl_stats.rocky_shores_clip(points_gdf, 
#                                            smartline_gdf, 
#                                            buffer=50)
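
For intuition, generating points at a fixed spacing along a line amounts to walking each segment and interpolating wherever the cumulative distance reaches a multiple of the spacing. The sketch below does this for a plain list of coordinates; it is an assumption about the general technique rather than a copy of `deacl_stats.points_on_line`, and whether the real function keeps the start point is not something the notebook shows.

```python
import math


def points_along_line(coords, spacing):
    """Return points every `spacing` units along a polyline."""
    points = [coords[0]]
    next_dist = spacing  # distance along the line of the next point
    travelled = 0.0      # length of all segments walked so far
    for (x0, y0), (x1, y1) in zip(coords[:-1], coords[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        while seg_len > 0 and next_dist <= travelled + seg_len:
            frac = (next_dist - travelled) / seg_len
            points.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
            next_dist += spacing
        travelled += seg_len
    return points


# Hypothetical L-shaped baseline contour, 105 m long
line = [(0.0, 0.0), (60.0, 0.0), (60.0, 45.0)]
pts = points_along_line(line, spacing=30)
print(pts)
```

Spacing is measured along the line, so points continue around corners instead of restarting on each segment.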


### Measure annual coastline movements

In [29]:
if points_gdf is not None:  

    # Calculate annual movements and residual tide heights for every 
    # contour compared to the baseline year
    points_gdf = deacl_stats.annual_movements(points_gdf,
                                              contours_gdf,
                                              yearly_ds,                                     
                                              baseline_year,
                                              water_index)

Dropping 60


### Calculate regressions

In [30]:
if points_gdf is not None:

    # Apply regression function to each row in dataset
    points_gdf = deacl_stats.calculate_regressions(points_gdf,
                                                   contours_gdf,
#                                                    climate_df,
                                                  )
    
    # Add in retreat/growth helper columns (used for web services)
    points_gdf['retreat'] = points_gdf.rate_time < 0 
    points_gdf['growth'] = points_gdf.rate_time > 0

# Add Shoreline Change Envelope (SCE), Net Shoreline Movement 
# (NSM) and Max/Min years
# stats_list = ['sce', 'nsm', 'max_year', 'min_year', 'breaks']
# points_gdf = points_gdf.apply(
#     lambda x: deacl_stats.all_time_stats(x), axis=1)

Comparing annual movements with time


In [None]:
points_gdf

In [None]:
# points_gdf.to_crs('EPSG:4326').to_file(f'test_stats.geojson', driver='GeoJSON')

## Export files

### Export stats files

In [31]:
if points_gdf is not None:
    
    # Set up schema to optimise file size
    schema_dict = {key: 'float:8.2' for key in points_gdf.columns
                   if key != 'geometry'}
    schema_dict.update({'sig_time': 'float:8.3',
                        'outl_time': 'str:80',
#                         'sig_soi': 'float:8.3',
#                         'outl_soi': 'str:80',
                        'retreat': 'bool', 
                        'growth': 'bool',
#                         'max_year': 'int:4',
#                         'min_year': 'int:4',
#                         'breaks': 'str:80',
                       })
    col_schema = schema_dict.items()

    # Clip stats to study area extent
    stats_path = f'{output_dir}/stats_{study_area}_{vector_version}_{water_index}_{index_threshold:.2f}'
    points_gdf = points_gdf[points_gdf.intersects(study_area_poly['geometry'])]

    # Export to GeoJSON
    points_gdf.to_crs('EPSG:4326').to_file(f'{stats_path}.geojson', 
                                           driver='GeoJSON')

    # Export as ESRI shapefiles
    stats_path = stats_path.replace('vectors', 'vectors/shapefiles')
    points_gdf.to_file(f'{stats_path}.shp',
                       schema={'properties': col_schema,
                               'geometry': 'Point'})

### Export contours

In [None]:
# Assign certainty to contours based on underlying masks
# contours_gdf = deacl_stats.contour_certainty(
#     contours_gdf, 
#     output_path=f'output_data/{study_area}_{raster_version}')

# Clip annual shoreline contours to study area extent
contour_path = f'{output_dir}/contours_{study_area}_{vector_version}_' \
               f'{water_index}_{index_threshold:.2f}'
contours_gdf['geometry'] = contours_gdf.intersection(study_area_poly['geometry'])
contours_gdf.reset_index().to_crs('EPSG:4326').to_file(f'{contour_path}.geojson', 
                                                       driver='GeoJSON')

# Export contours as ESRI shapefiles
contour_path = contour_path.replace('vectors', 'vectors/shapefiles')
contours_gdf.reset_index().to_file(f'{contour_path}.shp')

## Generate continental summary layer

In [None]:
from deacoastlines_summary import main

In [None]:
main(['out', 'v1.1.1', 'v1.1.1', '0.00', True, True, False])  # 2500 default summary

### Identify missed tiles

In [None]:
# all_tiles = '2083 660 2423 3436 3437 3641 3539 3538 2374 2169 3089 2320 3640 2373 2372 2271 2270 2784 1969 2377 2379 2830 1813 676 780 678 677 2891 2890 2077 2789 2380 2278 2790 2889 1192 880 670 669 2282 984 567 559 773 774 3193 1098 1090 988 1088 1091 563 461 462 1982 1981 1880 1879 1779 1778 882 1092 1097 460 3297 3298 1089 987 565 568 778 881 985 776 675 3502 3400 661 1979 3500 1472 1575 2385 3501 3294 560 459 458 3399 3296 1301 1200 1199 2381 1405 1303 862 761 760 777 569 463 2082 1980 674 468 570 571 3091 3397 2183 2080 2582 3192 2179 2079 2078 3233 2788 4150 2626 2482 3131 3029 3132 2990 2785 467 566 464 2280 466 465 2688 1404 1302 2013 2218 2217 2115 2116 1711 2787 2687 2827 2886 2786 3031 3030 2929 2928 2319 2829 2828 2726 2523 2015 1913 1912 2421 1609 1608 1811 2625 1507 1506 4819 2588 2587 2383 2384 1298 1196 564 2181 1299 1197 1710 1297 1198 1096 764 662 865 2487 664 663 2382 562 561 1677 1676 1474 1371 1295 1296 1093 986 964 966 965 1066 1271 1270 1169 1168 5237 4107 3798 3800 4209 4208 3805 4307 4104 3799 4413 4311 4514 4003 4002 3901 3900 4207 5337 5432 3807 3705 3603 3602 5233 4308 4616 3905 4309 5134 5235 3804 3702 3701 3803 3802 5543 4511 4409 4512 5441 5339 5023 4921 5440 5437 4614 5335 5433 5436 5435 5334 5333 5331 5332 5127 4923 5330 4717 5328 5226 4310 4615 3806 5131 5029 5329 5024 5031 5133 5132 5230 5128 5232 4820 4718 4006 3904 3902 4922 4005 5231 5129 8627 8626 8625 8533 8531 8530 8529 8527 8434 8433 8432 8431 8430 8429 8428 8427 8426 8423 8422 8421 8419 8418 8335 8334 8333 8332 8331 8330 8329 8234 8233 8232 8231 8228 8227 8216 8215 8214 8211 8210 8133 8132 8131 8113 8111 8108 8107 8032 8030 8008 8007 8006 7929 7928 7906 7905 7904 7900 7832 7831 7828 7800 7799 7797 7731 7729 7727 7696 7632 7631 7630 7629 7628 7593 7592 7531 7530 7529 7490 7489 7385 7384 7285 7284 7283 7282 7276 7275 7178 7177 7176 7175 7174 7172 7061 6969 6968 6965 6958 6957 6867 6866 6863 6862 6861 6860 6857 6856 6761 6760 6759 6758 6757 6756 6754 6753 6657 6656 6557 
6554 6553 6551 6550 6461 6460 6459 6458 6457 6361 6360 6357 6356 6355 6354 6260 6258 6160 6057 5956 5849 5749 5647 5646 5545 6728 6414 6923 4769 4768 4458 4560 5077 5076 5481 5792 5690 5793 5691 5478 5479 5480 4048 3946 3945 5689 5587 5586 5584 5482 5585 5483 6724 7029 6927 6932 5485 5384 5180 5179 5379 5282 5375 5381 5278 4047 4150 4355 3844 3843 4970 5693 4251 4149 3742 3741 4871 4457 4456 4354 4252 4763 5795 5376 7030 5378 5275 7135 6929 7132 5796 4971 4869 5174 5073 4766 4664 4767 4665 6828 4663 4870 7124 5794 5692 5176 4867 4765 4764 5074 4973 4972 4660 4558 7123 6309 7023 7022 7225 6717 7429 6714 7033 6925 6411 7327 6103 6612 7134 7133 7031 6818 6512 6410 6920 6921 6614 6819 6001 6000 5899 5898 6308 6307 6206 6205 6725 6620 6519 6514 6412 7229 7230 6823 6722 6518 6621 6830 6515 7026 7028 7027 6931 6721 6622 7231 6625 6413 6727 7233 7131 6623 6726 6624 6829 6517 6516 6618 6619 1473 763 2988 2481 2081 3398 8528 8532 8031 7281 4661 5279 5280 6613 6716 5277 6453 6555 6549 6755 6864 7798 8213 7930 7931 7830 7730 7024 7025 7127 7130 6930 7932 6824 6926 6715 5281 5383 5484 5380 5177 5377 5276 5175 4868 3335 2727 2728 2422 2524 2014 1812 1193 673 1372 1574 2182 2180 2486 2888 4004 4412 4513 5130 6820 7032 8112 5852 2584 5030 5228 4106 3903 3704 2989 5338 4411 4105 1067 2484 1195 2485 2586 2685 2686 2281 762 5126 5125 5227 1194 7901 8109 7129 671 672 864 8212 7903 5851 5544 5236 6651 6358 7902 7071 3090 2683 863 7232 779 2483 2583 2585 2684 2887 3295 3703 4206 4410 4559 4662 5075 5178 5229 5382 5850 5853 5952 5953 5954 5955 6158 6159 6259 6261 6353 6359 6454 6455 6456 6511 6552 6556 6652 6653 6654 6858 6865 6922 7070 7128 7173 7279 7280 7386 7387 7388 7695 8004 8005 8110 8316 8317 8318 8319 8420 8524 8525 8526'
# all_tiles = set(all_tiles.split(' '))


# Original tiles
import glob
vector_version = 'v1.1.3'
all_paths = glob.glob(f'output_data/*/vectors/contours_*_{vector_version}_mndwi_0.00.geojson')
all_tiles = set([os.path.basename(tile).split('_')[1] for tile in all_paths])

vector_version = 'v1.1.1'
processed_paths = glob.glob(f'output_data/*/vectors/contours_*_{vector_version}_mndwi_0.00.geojson')
processed_tiles = set([os.path.basename(tile).split('_')[1] for tile in processed_paths])
missing_tiles = all_tiles - processed_tiles
print(f'{len(all_tiles)} total tiles\n{len(processed_tiles)} processed tiles\n{len(missing_tiles)} missing tiles')
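
The same tile comparison can be tried without any files on disk. The sketch below uses hypothetical paths that follow the `contours_<tile>_<version>_...` naming pattern used above; `tile_ids` is an illustrative helper, not part of the DEA Coastlines code.

```python
import os


def tile_ids(paths):
    """Pull the tile ID out of names like contours_<tile>_<version>_..."""
    return {os.path.basename(path).split('_')[1] for path in paths}


# Hypothetical file listings for two vector versions
all_paths = [
    'output_data/3_v0.0.1/vectors/contours_3_v1.1.3_mndwi_0.00.geojson',
    'output_data/8529_v0.0.1/vectors/contours_8529_v1.1.3_mndwi_0.00.geojson',
    'output_data/660_v0.0.1/vectors/contours_660_v1.1.3_mndwi_0.00.geojson',
]
processed_paths = [
    'output_data/3_v0.0.1/vectors/contours_3_v1.1.1_mndwi_0.00.geojson',
]

# Set difference leaves the tiles that still need processing
missing = tile_ids(all_paths) - tile_ids(processed_paths)

# Tile IDs are strings, so sort numerically rather than alphabetically
missing_sorted = sorted(missing, key=int)
print(' '.join(missing_sorted))
```

Sorting with `key=int` is a small design choice: a plain `sorted()` on strings would place `'8529'` before `'660'`.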

In [None]:
' '.join(sorted(missing_tiles))

## Add Geoserver fields

### Rate of change points
`wms_abs_t`
`wms_abs_s`

`wms_conf_t`
`wms_conf_s`

In [None]:
# points_gdf = gpd.read_file('releases/DEACoastlines_v1.0.0/Shapefile/DEACoastlines_ratesofchange_v1.0.0.shp')
points_gdf = gpd.read_file('DEACoastlines_ratesofchange_v1.1.0.shp')

In [None]:
points_gdf['abs_time'] = points_gdf.rate_time.abs()
points_gdf['conf_time'] = points_gdf['se_time'] * 1.96
points_gdf['conf_soi'] = points_gdf['se_soi'] * 1.96
points_gdf.head()
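
The `1.96` multiplier above is presumably the standard normal quantile for a two-sided 95 % confidence interval; that reading is an assumption, as the notebook does not state it. Under that assumption, the sketch below shows how the confidence half-width relates to a rate and its standard error, using hypothetical values.

```python
from statistics import NormalDist

# Two-sided 95 % interval: 2.5 % in each tail of a standard normal
z = NormalDist().inv_cdf(0.975)

# Hypothetical rate of change (m / year) and its standard error
rate_time, se_time = -1.20, 0.35

conf_time = se_time * z
lower, upper = rate_time - conf_time, rate_time + conf_time
print(f'{rate_time:.2f} m / year (95 % CI: {lower:.2f} to {upper:.2f})')
```

When the whole interval sits below zero, as in this example, the retreat is distinguishable from no change at roughly the 5 % level.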

In [None]:
# Set up schema to optimise file size
schema_dict = {key: 'float:8.2' for key in points_gdf.columns
               if key != 'geometry'}
schema_dict.update({'sig_time': 'float:8.3',
                    'outl_time': 'str:80',
                    'sig_soi': 'float:8.3',
                    'outl_soi': 'str:80',
                    'retreat': 'bool', 
                    'growth': 'bool',
                    'max_year': 'int:4',
                    'min_year': 'int:4',
                    'breaks': 'str:80'})
col_schema = schema_dict.items()

# Export as ESRI shapefiles
points_gdf.to_file('releases/DEACoastlines_v1.0.0-beta/DEACoastlines_v1.0.0_geoserver/DEACoastLines_statistics_v1.0.0.shp',
                   schema={'properties': col_schema, 'geometry': 'Point'})

### Annual coastlines

In [None]:
# contours_gdf = gpd.read_file('releases/DEACoastlines_v1.0.0/Shapefile/DEACoastlines_annualcoastlines_v1.0.0.shp')
contours_gdf = gpd.read_file('DEACoastlines_annualcoastlines_v1.1.0.shp')

In [None]:
contours_gdf.certainty.unique()

In [None]:
contours_gdf['wms_good'] = contours_gdf.certainty == 'good'
contours_gdf['wms_tidal'] = contours_gdf.certainty == 'tidal issues'
contours_gdf['wms_nodata'] = contours_gdf.certainty == 'insufficient data'
contours_gdf['wms_aero'] = contours_gdf.certainty == 'aerosol issues'
contours_gdf.head()

In [None]:
# Export as ESRI shapefiles
contours_gdf.to_file('releases/DEACoastlines_v1.0.0-beta/DEACoastlines_v1.0.0_geoserver/DEACoastLines_coastlines_v1.0.0.shp')

## Annual snapshot code

In [None]:
points_gdf = gpd.read_file('DEACoastlines_ratesofchange_v1.1.0.shp')
points_gdf = points_gdf.to_crs('EPSG:4326')

In [None]:
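import numpy as np

def remove_outliers(x):
    """Set distances flagged as outliers in `outl_time` to NaN."""
    if x.outl_time is not None:
        # Each space-separated item in `outl_time` names a `dist_` column
        columns = [f'dist_{i}' for i in x.outl_time.split(' ') if i]
        x[columns] = np.nan

    return x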


points_gdf_nooutliers = points_gdf.apply(remove_outliers, axis=1)

# import dask.dataframe as dd  

# data_dd = dd.from_pandas(points_gdf, npartitions=8) 
# points_gdf_nooutliers = (data_dd.map_partitions(lambda df: df.apply(remove_outliers, axis=1), meta=np.float).compute(scheduler='processes'))




In [None]:
# filter_rows = (points_gdf_nooutliers.geometry.x > 150) & (points_gdf_nooutliers.geometry.y < -29)
subset_gdf = points_gdf_nooutliers.loc[:, points_gdf_nooutliers.columns.str.contains('dist_')]
subset_gdf.head()

In [None]:
diff_gdf = subset_gdf.diff(periods=1, axis=1)
diff_gdf

In [None]:
diff_long = diff_gdf.melt(var_name='year', value_name='distance').dropna()
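# Remove the 'dist_' prefix; `str.strip` treats its argument as a set of characters
diff_long['year'] = diff_long.year.str.replace('dist_', '', regex=False).astype(int)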
diff_long.head()

In [None]:
# diff_long.groupby('year').distance.apply(pd.cut, bins=2)
test = diff_long.groupby(['year', pd.cut(x=diff_long['distance'], 
                                         bins=[-np.inf, -10, -5, 0, 5, 10, np.inf], 
                                         labels=['< -10 m', '-10 to -5 m', '-5 to 0 m', '0 - 5 m', '5 - 10 m', '10 m >'])]).count()
test = test.groupby('year').transform(lambda x: (x / x.sum()) * 100)
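
For checking the binning logic on a small example, the same "percentage of points per change class per year" table can also be built with `pd.crosstab`. This is an alternative sketch rather than the approach used in the cell above, and the `diff_long` values here are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical year-to-year movements (m) in long format
diff_long = pd.DataFrame({
    'year': [2019, 2019, 2019, 2019, 2020, 2020],
    'distance': [-12.0, -3.0, 2.0, 7.0, -6.0, 11.0],
})

bins = [-np.inf, -10, -5, 0, 5, 10, np.inf]
labels = ['< -10 m', '-10 to -5 m', '-5 to 0 m',
          '0 - 5 m', '5 - 10 m', '10 m >']

# Assign each movement to a class (intervals are closed on the right)
classes = pd.cut(diff_long['distance'], bins=bins, labels=labels)

# Cross-tabulate year against class, normalising each year to 100 %
percent = pd.crosstab(diff_long['year'], classes, normalize='index') * 100
print(percent.round(1))
```

Each row sums to 100, which is the property the stacked area plot further down relies on.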

In [None]:
test2 = test.rename({'distance': 'proportion'}, axis=1).reset_index('distance').pivot(columns='distance', values='proportion')

In [None]:
fig = plt.figure(figsize=(12, 6))
test2.plot.area(cmap='RdBu', linestyle='None', ax=plt.gca()).legend(loc='upper center', bbox_to_anchor=(0.5, 1.1), ncol=6, frameon=False);
plt.plot(test2.iloc[:,0:3].sum(axis=1).index, 
         test2.iloc[:,0:3].sum(axis=1).values, color='grey', linewidth=0.5)
plt.axhline(50, c='black', linestyle='dashed')
plt.xlim([1989, 2020])
plt.ylim([0, 100])
import matplotlib.ticker as mtick
plt.gca().yaxis.set_major_formatter(mtick.PercentFormatter())

In [None]:
test2.loc[2020].plot.pie(cmap='RdBu')

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(12, 4))
year = 2020
diffs = (test2.loc[year] - test2.loc[year - 1])
diffs.plot.bar(width=0.9, color=plt.cm.RdBu(np.linspace(0, 1, 6)))

rates = diffs.index.values
symbols = [u'\N{BLACK DOWN-POINTING TRIANGLE}', u'\N{box drawings heavy horizontal}', u'\N{BLACK UP-POINTING TRIANGLE}']
triangles = pd.cut(diffs, [-100, -0.5, 0.5, 100], labels=symbols)
percents = [f'{abs(diff):.1f}%' for diff in diffs]
labels = [f'{r}\n\n{t} {p}' for r, t, p in zip(rates, triangles, percents)]

ax.set_frame_on(False)
ax.yaxis.set_visible(False)
ax.xaxis.set_tick_params(length=0)
ax.set(xlabel="")

ax.xaxis.set_ticks_position('top')
ax.set_xticklabels(labels, rotation = 0, fontsize=15);

In [None]:
diffs = (test2.loc[year] - test2.loc[year - 1])
diffs

In [None]:
year = 2019
test2.loc[year]

In [None]:
year = 2018

symbols = [u' \N{BLACK DOWN-POINTING TRIANGLE} ', u' \N{BLACK UP-POINTING TRIANGLE} ']
absolute = test2.loc[year]
diffs = (test2.loc[year] - test2.loc[year - 1])
triangles = pd.cut(diffs, [-100, 0, 100], labels=symbols)

# print(f'Erosion (> 10m):  {absolute.iloc[0]:.1f}% ( {triangles.iloc[0]} {diffs.iloc[0]:.1f}%)\n Growth (> 10m):  {absolute.iloc[-1]:.1f}% ( {triangles.iloc[-1]} {diffs.iloc[-1]:.1f}%)')
# print(f'{absolute.iloc[-1]:.1f}% ( {triangles.iloc[-1]} {diffs.iloc[-1]:.1f}%) grew')

print(f"In {year}, {absolute.iloc[0]:.1f}% of Australia's coastlines eroded by more than 10 metres ({triangles.iloc[0]} {diffs.iloc[0]:.1f}% since {year - 1}), compared to "
      f"{absolute.iloc[-1]:.1f}% that grew ({triangles.iloc[-1]}{diffs.iloc[-1]:.1f}% since {year - 1})")

In [None]:
years = range(1988,2021)
thresh = 20

retreat_rates = []
growth_rates = []

for year in years:

    
    retreat_perc = (diff_gdf[f"dist_{year}"] <= -thresh).mean().item()
    growth_perc = (diff_gdf[f"dist_{year}"] >= thresh).mean().item()
    retreat_rates.append(retreat_perc)
    growth_rates.append(growth_perc)
    
#     print(f'{year} retreat greater than {str(thresh)} m / year: {retreat_perc:.2%}')
#     print(f'    growth greater than {str(thresh)} m / year: {growth_perc:.2%}')
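
One detail worth keeping in mind when reading these proportions: taking the mean of a boolean comparison counts missing values as `False`, so points with no data (for example those blanked as outliers) remain in the denominator. The sketch below, using hypothetical movements, contrasts that with a proportion computed over valid observations only.

```python
import numpy as np

thresh = 20

# Hypothetical year-to-year movements (m); NaN marks a missing value
movements = np.array([-35.0, -5.0, 12.0, 28.0, np.nan])

# NaN <= -20 evaluates to False, so the missing value is still counted
retreat_all = (movements <= -thresh).mean()

# Restrict to valid observations to leave missing values out instead
valid = movements[~np.isnan(movements)]
retreat_valid = (valid <= -thresh).mean()

print(f'All points: {retreat_all:.2%}, valid points only: {retreat_valid:.2%}')
```

Which denominator is appropriate depends on the question being asked; the loop above uses the first form.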

In [None]:
test = pd.DataFrame(data={'growth': growth_rates, 'retreat': retreat_rates}, index=years)  #.drop([1991, 1992, 1993, 1994])
# test.plot(figsize=(15, 5))

In [None]:
fig, ax = plt.subplots()
ax.stackplot(test.index, [test.growth, test.retreat], baseline='sym')
plt.axhline(0, c='black', linestyle='dashed')

In [None]:
test.plot.area(figsize=(10, 5))
plt.axhline(0.5, c='black', linestyle='dashed')
# plt.xlim([1989,2020])
plt.ylim([0, 1.0])

In [None]:
test.retreat.mean()

In [None]:
test.growth.mean()

In [None]:
soi = pd.read_csv('input_data/climate_indices.csv', index_col='year')
soi

In [None]:
soi_test = test.join(soi.shift(0))
soi_test['diff'] = soi_test.retreat - soi_test.growth

soi_test.plot.scatter(x='SOI', y='diff')

***

## Additional information

**License:** The code in this notebook is licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0). 
Digital Earth Australia data is licensed under the [Creative Commons by Attribution 4.0](https://creativecommons.org/licenses/by/4.0/) license.

**Contact:** For assistance with any of the Python code or Jupyter Notebooks in this repository, please post a [GitHub issue](https://github.com/GeoscienceAustralia/DEACoastLines/issues/new). For questions or more information about this product, sign up to the [Open Data Cube Slack](https://join.slack.com/t/opendatacube/shared_invite/zt-d6hu7l35-CGDhSxiSmTwacKNuXWFUkg) and post on the [`#dea-coastlines`](https://app.slack.com/client/T0L4V0TFT/C018X6J9HLY/details/) channel.

**Last modified:** July 2021