# Build Tiled Library

This task includes the following main steps:

* Tile library
* Calculate FSP and segment downstream distance
* Assign stream order to segments

## Tile Library

With a segment-based library as input, we spatially divide it into tiles based on the tile size provided. Note that the tile size is given as a number of cells, which avoids partial cells within a tile and works for both projected (PCS) and geographic (GCS) coordinate systems.
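
To illustrate why a tile size in cells avoids partial cells, here is a minimal sketch that lays a tile grid over a library extent. It is not the `TileLibrary` implementation: the function name `tile_extents` and the choice to anchor the grid at the lower-left corner of the extent are assumptions. The extent, cell size, and tile size are the values used for the Verdigris library in this notebook.

```python
import math

def tile_extents(min_x, max_x, min_y, max_y, cell_size, tile_size):
    """Return (minX, maxX, minY, maxY) for each tile covering the extent.

    tile_size is a number of cells, so every tile edge is a whole
    multiple of cell_size and no cell is split between two tiles.
    """
    tile_len = tile_size * cell_size  # tile edge length in map units
    n_cols = max(1, math.ceil((max_x - min_x) / tile_len))
    n_rows = max(1, math.ceil((max_y - min_y) / tile_len))
    extents = []
    for row in range(n_rows):
        for col in range(n_cols):
            # Assumption: the grid starts at the lower-left corner of the extent
            x0 = min_x + col * tile_len
            y0 = min_y + row * tile_len
            extents.append((x0, x0 + tile_len, y0, y0 + tile_len))
    return extents

# Verdigris 10-m library: 7500 cells x 10 m = 75 km tile edge
tiles = tile_extents(783230.0, 816370.0, 4099130.0, 4158820.0,
                     cell_size=10, tile_size=7500)
print(len(tiles), tiles)
```

With these inputs the whole library fits in a single 75 km tile, which agrees with the tile extent reported in the run log of this notebook.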

This process also copies the FSP and segment info CSV files to the tiled library and creates a JSON metadata file (TileCellSizeSpatialReference.json) that stores the library's tile size, cell size, and spatial reference.
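
As a rough illustration of what such a metadata file might look like, the sketch below writes and reads back a small JSON file. Only the `SpatialReference` key is used elsewhere in this notebook; the `CellSize` and `TileSize` key names and the placeholder spatial reference string are assumptions, not the actual file layout.

```python
import json
import os
import tempfile

# Hypothetical metadata content. Only the 'SpatialReference' key appears in
# this notebook; 'CellSize' and 'TileSize' are assumed key names.
meta = {
    "CellSize": 10,      # cell size in map units (assumed key)
    "TileSize": 7500,    # tile size in cells (assumed key)
    "SpatialReference": "<WKT string of the library CRS>",  # placeholder text
}

with tempfile.TemporaryDirectory() as folder:
    meta_file = os.path.join(folder, "TileCellSizeSpatialReference.json")
    # Write the metadata, then read it back the way a mapping tool would
    with open(meta_file, "w") as jf:
        json.dump(meta, jf, indent=2)
    with open(meta_file, "r") as jf:
        loaded = json.load(jf)

tile_len = loaded["TileSize"] * loaded["CellSize"]  # tile edge in map units
print(tile_len)
```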

**Note that this is the most time consuming process and may take hours for large libraries.**

In [1]:
import os
import sys
import time

# Set tools/scripts folder (YOU NEED TO CHANGE THIS)
fldplnToolFolder = r'E:/CUAHSI_SI/training/source'

# add the tools folder to sys.path to access the fldpln module
sys.path.append(fldplnToolFolder) 
# fldpln modules
from fldpln_library import *
from fldpln import *

In [7]:
# Segment-based library
segLibFolder = r'E:\CUAHSI_SI\VerdigrisRv\projects\verdigris_10m'

# tiled library folder
tiledLibFolder = r'E:\CUAHSI_SI\VerdigrisRv\projects\verdigris_10m\tiled_snz_library' 

# define tile size (number of cells) and format
cellSize = 10
tileSize = 7500 # number of cells
tileFileFormat = 'snappy' # 'snappy' or 'mat'

# libraries to be tiled. 
# Note that the tiled libraries will have the same names as the segment-based libraries, except that they are located under tiledLibFolder!
libNames = ['lib'] # libs for Verdigris River

# tile libraries
for libName in libNames: 
    print(f'Tile library: {libName} ...')
    TileLibrary(os.path.join(segLibFolder, libName), cellSize, os.path.join(tiledLibFolder, libName), tileSize, tileFileFormat)

Tile library: lib ...
Calculate library extent ...


Library external border extent (minX, maxX, minY, maxY) : (783230.0, 816370.0, 4099130.0, 4158820.0)
Total number of FSP-FPP relations: 20685625
Number of (possible) tiles: 1
Tile extents:
 [(783230.0, 858230.0, 4099130.0, 4174130.0)] 

Build tiles (tiling FSP-FPP relations) ...
Processing tile:  1
Tile extent (minX, maxX, minY, maxY) : (783230.0, 858230.0, 4099130.0, 4174130.0)
Number of segments interseting with the tile:  29
Total number of FSP-FPP relations in the tile: 20685625
Saving FSP-FPP relations in a file...


Number of unique FSPs in the tile: 12703
Tile FSP extent (fspMinX,fspMaxX,fspMinY,fspMaxY):  (788650.0, 809610.0, 4099380.0, 4136930.0)
Tile FPP extent (fppMinX,fppMaxX,fppMinY,fppMaxY):  (783230.0, 816370.0, 4099130.0, 4158820.0)
Save fsp-tile index as a CSV file ...
Save tile index as a CSV file ...


## Calculate FSP and Segment Downstream Distance

Here we calculate the downstream distance for both FSPs and segments for mapping. It involves the following main tasks:
* Clean up segments 
    * Segments are removed from the segment table if they are not in the FSP table. 
    * If a removed segment is the downstream segment of another segment in the segment table, that upstream segment's downstream segment ID (DsSegId) is set to 0 (i.e., it becomes a watershed outlet). 
    * The removed segments are usually close to or in waterbodies. After removing them, a library may have several separate watersheds/outlets! For example, Neosho has 3 separate watersheds (with segments 13, 104, and 186 as the outlet segments). 
* Calculate FSP and segment downstream distance (i.e., distance from the outlet), which is used in interpolating the FSP depth of water from gauges
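
The cleanup rule can be sketched on toy tables with vectorized pandas operations. This is an illustrative alternative to looping over segments, not the code used in this notebook; the column names follow the FSP and segment info files, and the table values are made up.

```python
import pandas as pd

# Toy tables; segment 2 has no FSPs (e.g., it lies in a waterbody)
fsp_df = pd.DataFrame({"FspId": [1, 2, 3], "SegId": [1, 1, 3]})
seg_df = pd.DataFrame({"SegId": [1, 2, 3], "DsSegId": [2, 0, 2]})

# Keep only the segments that have at least one FSP
has_fsp = seg_df["SegId"].isin(fsp_df["SegId"])
removed_ids = seg_df.loc[~has_fsp, "SegId"]
seg_df = seg_df.loc[has_fsp].copy()

# A kept segment that drained into a removed segment becomes an outlet
seg_df.loc[seg_df["DsSegId"].isin(removed_ids), "DsSegId"] = 0

print(seg_df)
```

In this toy case, removing segment 2 leaves segments 1 and 3 as two separate outlets, which is how a single library can end up with several watersheds.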

At the end, this step updates the FSP and segment info CSV files with additional columns: DsDist for FSPs, and Length and DsDist for segments. 
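
As a small worked example of the within-segment distance, the sketch below accumulates the cell-center to cell-center distance from the last (most downstream) FSP of a segment back to each upstream FSP. The helper name `within_segment_ds_dist` is hypothetical; the coordinates are the last five FSPs of segment 196 listed in the output at the end of this notebook, assuming FSPs are stored in upstream-to-downstream order.

```python
import math

def within_segment_ds_dist(xy):
    """xy: FSP cell-center coordinates ordered upstream to downstream.

    Returns, for each FSP, the path distance to the segment's last FSP.
    """
    dists = [0.0] * len(xy)
    # Walk from the second-to-last FSP back to the first one
    for i in range(len(xy) - 2, -1, -1):
        (x1, y1), (x2, y2) = xy[i], xy[i + 1]
        dists[i] = dists[i + 1] + math.hypot(x2 - x1, y2 - y1)
    return dists

# Last five FSPs of segment 196 (10-m cells)
xy = [(803635, 4099425), (803645, 4099415), (803645, 4099405),
      (803655, 4099395), (803665, 4099385)]
dists = within_segment_ds_dist(xy)
print([round(d, 6) for d in dists])
```

Each step is either one cell (10 m) or a diagonal (about 14.14 m). Because segment 196 is an outlet segment in that output, these within-segment values match its DsDist column there.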

In [8]:
from fldpln_library import *
from fldpln import *

In [9]:
def CalculateFspSegmentDownstreamDistance_test(libFolder,libName):
    # Clean up segments (some segments don't exist in the FSP table) and save the library FSP and segment information
    # as two CSV files (fsp_info.csv & segment_info.csv). It also reads in SpatialReference.prj and saves it in CellSizeSpatialReference.json.
    # It also calculates FSP and segment downstream distance (i.e., distance to library outlet(s)), which involves:
    # 1. Calculate each FSP's within-segment downstream distance
    # 2. Calculate segment length, which is more accurate than "CellCount" * cell size
    # 3. Calculate each segment's downstream distance (to the watershed outlet) to speed up calculating FSP downstream distance
    # 4. Calculate each FSP's downstream distance
    # Note that FSPs and segments are based on raster cell centers. A segment and its downstream segment are separated by a gap (1 cell or sqrt(2) cells).
    # This function adds the gap when calculating downstream distance to the outlet!
   
    #
    # read in the FSP (flood source pixel) and segment network info CSV files
    #
    # fspInfoColumnNames = ['FspX','FspY','SegId','FilledElev'], columns 'DsDist' will be calculated by this function
    # segInfoColumnNames = ['SegId','CellCount','DsSegId', 'StFac','EdFac'], columns 'Length','DsDist' will be added by this function
    fspInfoFile = os.path.join(libFolder, libName, fspInfoFileName)
    segInfoFile = os.path.join(libFolder, libName, segInfoFileName)
 

    # read in FSP ID and coordinates
    # need to set float_precision='round_trip' to prevent rounding while reading the text file! float_precision='high' DOESN'T work.
    # For Verdigris 10-m library, FSP ID of 22246, its FspX of -1003.7918248322967 in fsp_info.csv was read into memory as -1003.7918248322968 without using float_precision='round_trip'
    fspDf = pd.read_csv(fspInfoFile,float_precision='round_trip',index_col=False)
    segDf = pd.read_csv(segInfoFile,float_precision='round_trip',index_col=False)

    #
    # Clean up the segment table.
    # 1. Remove the segment if it's not in the FSP table
    # 2. If the missing segment is the downstream segment of another segment, set that segment's DsSegId to 0. 
    # Segments are usually missing because they are close to or in waterbodies. 
    # After removing those segments, a library may have several separate watersheds/outlets!
    #
    # get the segment IDs
    segIds = segDf['SegId'].to_list()
    for sid in segIds:
        fsps = fspDf[fspDf['SegId']==sid]
        if len(fsps)==0:
            # segment not found in the FSP table. delete the row
            segDf = segDf.loc[segDf['SegId']!=sid]
            # set downstream segment ID to 0
            segDf.loc[segDf['DsSegId']==sid,'DsSegId'] = 0

    #
    # Calculate FSP within-segment distance, segment length, segment downstream distance, and FSP downstream distance
    #
    # add field for FSP within-segment distance
    fspDf['DsDist'] = 0.0
    # add field for segment length
    segDf['Length'] = 0.0

    # Calculate FSP within-segment DOWNSTREAM distance and segment length
    for segIdx, row in segDf.iterrows():
        segID = row['SegId']
        # print(segID)

        # select FSP on the segment
        fsps = fspDf[fspDf['SegId']==segID][['FspX','FspY']]
        
        # calculate fsp downstream within segment length
        segDist = 0.0
        if len(fsps)==0:
            # this should not happen as we have already cleaned up the segment table!
            print(f"Segment {segID} is missing in {fspInfoFileName}!")
        else:  
            first=True
            for idx, row in fsps[::-1].iterrows():
                # Note the idx in fsps is the index in fspDf!!!
                # calculate distance
                if first:
                    fspx1, fspy1 = row['FspX'], row['FspY']
                    dist=0.0
                    first=False
                else:
                    fspx2, fspy2 = row['FspX'], row['FspY']
                    dist=math.sqrt((fspx1-fspx2)**2+(fspy1-fspy2)**2)
                    fspx1, fspy1 = fspx2, fspy2
                segDist += dist
                fspDf.at[idx,'DsDist']=segDist

        # update segment distance in segDf
        segDf.at[segIdx,'Length'] = segDist

    # show the DFs
    # print(fspDf[:1136])
    # print(segDf)

    #
    # Calculate segment downstream length
    #
    # only for the segments that exist in the FSP table
    # But this is not necessary, as segments that don't exist in the FSP table have already been removed
    # Also, the line below would remove any segment that has just one FSP!
    # segDf = segDf[segDf['Length']>0]

    # add field for segment downstream distance for speeding up calculating FSP downstream distance
    segDf['DsDist'] = 0.0
    for segIdx, row in segDf.iterrows():
        segID = row['SegId']
        dsSegID = row['DsSegId']

        dsDist = 0.0
        while dsSegID != 0:
            print(dsSegID)  # debug trace: ID of each downstream segment visited
            # get downstream segment length and ID
            tempDf = segDf[segDf['SegId']==dsSegID][['Length','DsSegId']]
            length, segID_ds = tempDf.iat[0,0], tempDf.iat[0,1]
            dsDist += length

            # There is a GAP between two segments because they consist of FSP cell centers
            # Calculate the GAP and add it to the segment downstream distance
            # last fsp in upstream segment
            lastFspXy = fspDf[fspDf['SegId']==segID][['FspX','FspY']].tail(1)    
            fspx1, fspy1 = lastFspXy.iat[0,0],lastFspXy.iat[0,1]
            # first FSP in downstream segment
            firstFspXy = fspDf[fspDf['SegId']==dsSegID][['FspX','FspY']].head(1)
            fspx2, fspy2 = firstFspXy.iat[0,0], firstFspXy.iat[0,1]
            dist=math.sqrt((fspx1-fspx2)**2+(fspy1-fspy2)**2)
            dsDist += dist

            # move to the downstream segment
            segID = dsSegID
            dsSegID = segID_ds

        segDf.at[segIdx,'DsDist'] = dsDist
    # print(segDf)

    # Calculate FSP downstream distance
    for idx, row in fspDf.iterrows():
        segID = row['SegId']
        inSegDist = row['DsDist']

        # get segment downstream distance
        tempDf = segDf[segDf['SegId']==segID][['DsDist']]
        segDsDist = tempDf.iat[0,0]

        # reset FSP downstream distance
        fspDf.at[idx,'DsDist'] = inSegDist + segDsDist
    # print(fspDf)

    # save the updated info files
    fspDf.to_csv(fspInfoFile,index=False,mode='w+')
    segDf.to_csv(segInfoFile,index=False,mode='w+')

    return fspDf, segDf
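
The function above re-walks the whole downstream chain for every segment, which is why the trace printed by the next cell repeats the same segment IDs many times. A possible alternative, shown here only as a sketch on plain dictionaries, caches each segment's downstream distance so every segment is resolved once. The function name `segment_ds_dist`, the dictionary layout, and the toy numbers are all assumptions; `gap[s]` stands for the distance between the last FSP of segment `s` and the first FSP of its downstream segment.

```python
def segment_ds_dist(seg_id, ds_seg, length, gap, cache):
    """Distance from the downstream end of seg_id to its watershed outlet."""
    if seg_id in cache:
        return cache[seg_id]
    ds = ds_seg[seg_id]
    if ds == 0:
        # Outlet segment: nothing downstream
        dist = 0.0
    else:
        # Gap to the downstream segment + its length + its own downstream distance
        dist = gap[seg_id] + length[ds] + segment_ds_dist(ds, ds_seg, length, gap, cache)
    cache[seg_id] = dist
    return dist

# Toy network: 1 -> 2 -> 3 -> outlet (0). All numbers are made up.
ds_seg = {1: 2, 2: 3, 3: 0}
length = {1: 100.0, 2: 50.0, 3: 30.0}
gap = {1: 10.0, 2: 10.0}

cache = {}
ds_dist = {sid: segment_ds_dist(sid, ds_seg, length, gap, cache) for sid in ds_seg}
print(ds_dist)
```

The recursion depth equals the number of segments between a segment and its outlet, so a very long chain may call for an iterative version instead.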

In [10]:
# tiled library folder
tiledLibFolder = r'E:\CUAHSI_SI\VerdigrisRv\projects\verdigris_10m\tiled_snz_library' 

# libraries to be processed. 
# Note that the tiled libraries have the same names as the segment-based libraries, except that they are located under tiledLibFolder!
libNames = ['lib'] # libs for Verdigris River

# Calculate FSP and segment downstream distances and update the FSP and segment info files
for libName in libNames:
    print(f'Calculate FSP and segment downstream distance: {libName} ...')
    CalculateFspSegmentDownstreamDistance_test(tiledLibFolder,libName)

Calculate FSP and segment downstream distance: lib ...
177.0
178
179
180
181
182
183
184
185
186
187
191
195
196
63.0
64
179
180
181
182
183
184
185
186
187
191
195
196
64.0
179
180
181
182
183
184
185
186
187
191
195
196
179.0
180
181
182
183
184
185
186
187
191
195
196
81.0
82
83
84
85
191
195
196
82.0
83
84
85
191
195
196
83.0
84
85
191
195
196
84.0
85
191
195
196
85.0
191
195
196
191.0
195
196
100.0
101
195
196
101.0
195
196
195.0
196
176.0
177
178
179
180
181
182
183
184
185
186
187
191
195
196
177.0
178
179
180
181
182
183
184
185
186
187
191
195
196
178.0
179
180
181
182
183
184
185
186
187
191
195
196
179.0
180
181
182
183
184
185
186
187
191
195
196
180.0
181
182
183
184
185
186
187
191
195
196
181.0
182
183
184
185
186
187
191
195
196
182.0
183
184
185
186
187
191
195
196
183.0
184
185
186
187
191
195
196
184.0
185
186
187
191
195
196
185.0
186
187
191
195
196
186.0
187
191
195
196
187.0
191
195
196
191.0
195
196
195.0
196
196.0


## Assign Stream Order to FSPs and Segments Manually
Stream orders are used when interpolating the depth of flow (DOF) at FSPs, where low-order streams are handled before high-order streams. Currently this is done manually in GIS, starting by creating the segment shapefile from the FSP and segment info CSV files. 

### Generate Segment Shapefile
Create a shapefile for manually assigning stream orders to segments in a tiled library. Note that the shapefile gets its CRS from the library metadata file and has all the attributes in the FSP and segment info CSV files.

In [11]:
# tiled library folder to generate segment shapefile
tiledLibFolder= r'E:\CUAHSI_SI\VerdigrisRv\projects\verdigris_10m\tiled_snz_library\lib'

# FSP and segment info CSV files
segInfoFile = os.path.join(tiledLibFolder,segInfoFileName)
fspInfoFile = os.path.join(tiledLibFolder,fspInfoFileName)

# Read lib CRS from metadata file
metaDataFile = os.path.join(tiledLibFolder,metaDataFileName)
with open(metaDataFile,'r') as jf:
    md = json.load(jf)
srText = md['SpatialReference']
libCrs = CRS.from_wkt(srText)

# segment shapefile folder
# shpFolder = r'D:\xingong\Wildcat\stream_order'
shpFolder = tiledLibFolder
shpName = "segments.shp"
outShpFile =  os.path.join(shpFolder,shpName)

# generate segment shapefile 
print(f'Generate segment shapefile for libraries ...')
# GenerateSegmentShapefiles(tiledLibFolder,libNames,shpFolder,shpName)
GenerateSegmentShapefilesFromFspSegmentInfoFiles(segInfoFile, fspInfoFile, libCrs, outShpFile)

Generate segment shapefile for libraries ...


### Get Stream Orders for FSPs and Segments from the Segment Shapefile

This step gets stream orders from the segment shapefile and adds them to the FSP and segment info files. 

It also creates a new text file, stream_order_info.csv, which stores the connectivity among stream orders with columns: [‘StrOrd’, ‘DsStrOrd’, ‘JunctionFspX’, ‘JunctionFspY’]. This information is used in DOF interpolation.

In [12]:
# tiled library folder to add stream order
tiledLib= r'E:\CUAHSI_SI\VerdigrisRv\projects\verdigris_10m\tiled_snz_library\lib'

# FSP and segment info CSV files
segInfoFile = os.path.join(tiledLib,segInfoFileName)
fspInfoFile = os.path.join(tiledLib,fspInfoFileName)

# segment shapefile which has the stream order
shpFolder = r'E:\CUAHSI_SI\VerdigrisRv\projects\verdigris_10m\tiled_snz_library\lib'
shpName = "segments.shp"
shpOrdColName = 'Str_Ord'
shpFile =  os.path.join(shpFolder,shpName)

print(f'Get stream order and generate stream order network info for : {tiledLib} ...')
GetStreamOrdersForFspsSegments(tiledLib,shpFile,shpOrdColName)

Get stream order and generate stream order network info for : E:\CUAHSI_SI\VerdigrisRv\projects\verdigris_10m\tiled_snz_library\lib ...


(       FspId    FspX     FspY  SegId  FilledElev        DsDist  StrOrd
 0      25464  788655  4130815     61  225.053833  70876.790001     5.0
 1      25465  788665  4130815     61  225.053833  70866.790001     5.0
 2      25466  788675  4130815     61  225.053833  70856.790001     5.0
 3      25467  788685  4130825     61  225.053833  70842.647866     5.0
 4      25468  788695  4130825     61  225.053833  70832.647866     5.0
 ...      ...     ...      ...    ...         ...           ...     ...
 12698  78752  803635  4099425    196  205.413483     52.426407     1.0
 12699  78753  803645  4099415    196  205.407089     38.284271     1.0
 12700  78754  803645  4099405    196  205.401901     28.284271     1.0
 12701  78755  803655  4099395    196  205.395462     14.142136     1.0
 12702  78756  803665  4099385    196  205.388733      0.000000     1.0
 
 [12703 rows x 7 columns],
     SegId  CellCount  DsSegId     StFac     EdFac       Length        DsDist  \
 0      61        716     