# SWOT Shapefile Data Conversion to CSV

### Notebook showcasing how to merge/concatenate multiple shapefiles into a single file.
- Convert the merged shapefile to a CSV file.
- Optionally query the new dataset by a variable of the user's choice, such as 'reach_id' or water surface elevation ('wse').
- Export the query result as a CSV file or shapefile.

### Merging two separate shapefiles into one

In [1]:
import geopandas as gpd
import pandas as pd

# Read shapefiles
SWOT_1 = gpd.read_file('C:\\Users\\tarpinia\\Desktop\\SWORD_shp_v2\\shp\\NA\\na_sword_reaches_hb75_v2.shp')
SWOT_2 = gpd.read_file('C:\\Users\\tarpinia\\Desktop\\SWORD_shp_v2\\shp\\NA\\na_sword_reaches_hb74_v2.shp')

# Bring the shapefiles into a common coordinate system
SWOT_1 = SWOT_1.to_crs('EPSG:4326')
SWOT_2 = SWOT_2.to_crs('EPSG:4326')

# Merge/combine the shapefiles into one GeoDataFrame
SWOT_Merge = pd.concat([SWOT_1, SWOT_2])

# Export the merged GeoDataFrame to a shapefile
SWOT_Merge.to_file('C:\\Users\\tarpinia\\Desktop\\SWOT_Merge.shp')
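
One detail worth knowing about the merge step: `pd.concat` keeps each input's original row labels, so the merged table can contain repeated index values. The sketch below uses two small plain pandas DataFrames as stand-ins for the shapefiles (the names and values are made up for illustration) and shows the optional `ignore_index=True` argument, which renumbers the rows from 0.

```python
import pandas as pd

# Hypothetical stand-ins for two shapefiles read with gpd.read_file()
part_1 = pd.DataFrame({"reach_id": ["A1", "A2"], "p_length": [100.0, 200.0]})
part_2 = pd.DataFrame({"reach_id": ["B1", "B2"], "p_length": [300.0, 400.0]})

# Default behaviour: original row labels are kept, so 0 and 1 appear twice
kept = pd.concat([part_1, part_2])
print(list(kept.index))

# ignore_index=True assigns a fresh 0..n-1 index to the merged table
renumbered = pd.concat([part_1, part_2], ignore_index=True)
print(list(renumbered.index))
```

Whether to renumber is a design choice: keeping the original labels makes it possible to trace a row back to its position in the source file, while a fresh index avoids ambiguous label lookups later.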

### Merging multiple shapefiles from within a folder

In [2]:
from pathlib import Path
import pandas as pd

# Path to the folder containing the shapefiles
folder = Path("C:\\Users\\tarpinia\\Desktop\\SWOT_River_Reaches")

# Filename pattern to look for within the folder, in this case .shp (shapefile)
shapefiles = folder.glob("*.shp")

# Merge/Combine multiple shapefiles in folder into one
gdf = pd.concat([
    gpd.read_file(shp)
    for shp in shapefiles
]).pipe(gpd.GeoDataFrame)

# Export the merged GeoDataFrame to a shapefile
# (the destination is already an absolute path, so it is not joined with `folder`)
gdf.to_file('C:\\Users\\tarpinia\\Desktop\\SWOTReaches.shp')
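
`Path.glob` returns a lazy generator that can only be consumed once and does not guarantee any particular order. A minimal sketch, using a temporary folder with empty placeholder files (hypothetical file names, not real SWOT data), shows one way to collect the matches into a sorted list and stop early when nothing is found, since `pd.concat` raises an error when given an empty list.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # Create empty placeholder files; only the names matter for this sketch
    for name in ["reaches_b.shp", "reaches_a.shp", "reaches_a.dbf", "notes.txt"]:
        (folder / name).touch()

    # sorted() turns the generator into a reusable list in a stable order
    shapefiles = sorted(folder.glob("*.shp"))

    # Stop early with a clear message if the pattern matched nothing
    if not shapefiles:
        raise FileNotFoundError(f"No .shp files found in {folder}")

    names = [shp.name for shp in shapefiles]
    print(names)
```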

### Converting to CSV

Convert the merged GeoDataFrame into a CSV file.

In [3]:
gdf.to_csv('C:\\Users\\tarpinia\\Desktop\\csvmerge.csv')
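
By default, `to_csv` also writes the row index as an unnamed first column, which shows up as `Unnamed: 0` when the file is read back with pandas. A small sketch on a made-up table (an in-memory buffer stands in for the file on disk) shows the effect and the `index=False` option that avoids it.

```python
import io
import pandas as pd

# Made-up rows for illustration only
df = pd.DataFrame({"reach_id": ["A1", "B1"], "wse": [88.2, 134.8]})

# Default: the row index is written as an extra, unnamed first column
with_index = io.StringIO()
df.to_csv(with_index)
with_index.seek(0)
cols_default = list(pd.read_csv(with_index).columns)
print(cols_default)

# index=False writes only the data columns
without_index = io.StringIO()
df.to_csv(without_index, index=False)
without_index.seek(0)
cols_clean = list(pd.read_csv(without_index).columns)
print(cols_clean)
```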

### Querying a Shapefile

To search for a specific reach ID, or for river reaches of a specific length, run an attribute query on the GeoDataFrame using the GeoPandas query method.

Queries can use comparison operators (>, <, ==, >=, <=).

You can zoom in on a particular river reach by specifying its reach_id, or look for duplicate, overlapping river reaches.
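
As an illustration of these operators, here is a minimal sketch on a small plain pandas DataFrame. The rows reuse a few `reach_id`, `wse`, and `p_width` values from the query outputs in this notebook; storing `reach_id` as text, treating -1000000000000.0 as a no-data fill, and the variable name `min_wse` are assumptions made for the example. `GeoDataFrame.query` is inherited from pandas, so the same expressions should apply to `gdf`.

```python
import pandas as pd

FILL = -1e12  # assumed no-data fill value, as seen in the outputs

df = pd.DataFrame({
    "reach_id": ["74292500301", "71386000311", "73282800021", "74267700121"],
    "wse": [FILL, 111.30161, 88.18387, 134.81383],
    "p_width": [54.0, 45.0, 211.5, 669.0],
})

# Equality on a text column: the literal is quoted inside the expression
one_reach = df.query("reach_id == '74292500301'")

# A single numeric comparison; fill values fall below the threshold
high = df.query("wse > 75")

# Conditions can be combined with `and` / `or`
high_wide = df.query("wse > 75 and p_width >= 100")

# Python variables are referenced with an @ prefix
min_wse = 100
above = df.query("wse >= @min_wse")

print(len(one_reach), len(high), len(high_wide), len(above))
```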

In [4]:
reach = gdf.query("reach_id == '74292500301'")
reach

Unnamed: 0,reach_id,time,time_tai,time_str,p_lat,p_lon,river_name,wse,wse_u,wse_r_u,...,p_width,p_wid_var,p_n_nodes,p_dist_out,p_length,p_maf,p_dam_id,p_n_ch_max,p_n_ch_mod,geometry
2,74292500301,-1000000000000.0,-1000000000000.0,no_data,40.063235,-98.551296,no_data,-1000000000000.0,-1000000000000.0,-1000000000000.0,...,54.0,387.837794,47,3200409.359,9496.587434,-1000000000000.0,0,2,1,"LINESTRING (-98.50490 40.06789, -98.50525 40.0..."
308,74292500301,-1000000000000.0,-1000000000000.0,no_data,40.063235,-98.551296,no_data,-1000000000000.0,-1000000000000.0,-1000000000000.0,...,54.0,387.837794,47,3200409.359,9496.587434,-1000000000000.0,0,2,1,"LINESTRING (-98.50490 40.06789, -98.50525 40.0..."
262,74292500301,-1000000000000.0,-1000000000000.0,no_data,40.063235,-98.551296,no_data,-1000000000000.0,-1000000000000.0,-1000000000000.0,...,54.0,387.837794,47,3200409.359,9496.587434,-1000000000000.0,0,2,1,"LINESTRING (-98.50490 40.06789, -98.50525 40.0..."
51,74292500301,-1000000000000.0,-1000000000000.0,no_data,40.063235,-98.551296,no_data,-1000000000000.0,-1000000000000.0,-1000000000000.0,...,54.0,387.837794,47,3200409.359,9496.587434,-1000000000000.0,0,2,1,"LINESTRING (-98.50490 40.06789, -98.50525 40.0..."
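
The result above returns the same reach four times with identical attribute values, possibly because the merged files overlap. A hedged sketch of how such rows could be flagged and collapsed, using a made-up pandas DataFrame in place of `gdf`; choosing `reach_id` and `time` as the key columns is an assumption and may need adjusting for real data.

```python
import pandas as pd

# Made-up rows: reach "A1" appears three times with the same timestamp
df = pd.DataFrame(
    {
        "reach_id": ["A1", "B1", "A1", "A1"],
        "time": [-1e12, 713221200.0, -1e12, -1e12],
        "p_length": [9496.6, 15080.7, 9496.6, 9496.6],
    },
    index=[2, 116, 308, 262],
)

key = ["reach_id", "time"]  # assumed identifying columns

# keep=False marks every member of a duplicated group, not just the repeats
dupes = df[df.duplicated(subset=key, keep=False)]
print(dupes)

# Keep only the first row of each (reach_id, time) group
unique_rows = df.drop_duplicates(subset=key)
print(unique_rows)
```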


In [5]:
WSE = gdf.query('wse > 75')
WSE

Unnamed: 0,reach_id,time,time_tai,time_str,p_lat,p_lon,river_name,wse,wse_u,wse_r_u,...,p_width,p_wid_var,p_n_nodes,p_dist_out,p_length,p_maf,p_dam_id,p_n_ch_max,p_n_ch_mod,geometry
116,71386000311,713221200.0,713221200.0,2022-08-07T21:0013Z,48.48355,-82.85651,no_data,111.30161,-1000000000000.0,30.96656,...,45.0,285.15307,75,464346.34,15080.667224,-1000000000000.0,0,1,1,"LINESTRING (-82.87880 48.52825, -82.87919 48.5..."
263,77158000011,713275000.0,713275000.0,2022-08-08T11:5628Z,25.297171,-108.473158,no_data,123.71461,-1000000000000.0,0.0,...,69.5,1719.195048,49,9731.61,9731.609922,-1000000000000.0,0,2,1,"LINESTRING (-108.49317 25.28405, -108.49287 25..."
119,73282800021,713441800.0,713441800.0,2022-08-10T10:1658Z,33.634414,-87.209808,no_data,88.18387,-1000000000000.0,4.2635,...,211.5,3285.033201,57,687962.665,11346.636403,-1000000000000.0,0,2,1,"LINESTRING (-87.23478 33.62552, -87.23452 33.6..."
630,74267700121,713441900.0,713441900.0,2022-08-10T10:1834Z,38.778477,-84.10726,no_data,134.81383,-1000000000000.0,2.6857,...,669.0,2311.101872,57,2560861.191,11466.933285,-1000000000000.0,0,2,1,"LINESTRING (-84.17021 38.79320, -84.16986 38.7..."
34,73290000041,714511800.0,714511800.0,2022-08-22T19:3017Z,30.597928,-88.626436,no_data,118.64166,-1000000000000.0,15.47494,...,105.0,754.311517,52,67960.8,10424.745294,-1000000000000.0,0,2,1,"LINESTRING (-88.60566 30.58840, -88.60597 30.5..."
242,74253000021,714511800.0,714511700.0,2022-08-22T19:2912Z,34.018836,-90.967538,no_data,91.37639,-1000000000000.0,4.93354,...,968.0,67506.844891,50,1108109.937,9988.011659,-1000000000000.0,0,4,1,"LINESTRING (-91.01678 33.99997, -91.01645 34.0..."
658,74291500071,714511700.0,714511600.0,2022-08-22T19:2746Z,38.843434,-92.441821,no_data,76.10944,-1000000000000.0,9.76231,...,408.0,4018.894985,59,2280021.224,11890.852274,-1000000000000.0,0,2,1,"LINESTRING (-92.39134 38.81822, -92.39168 38.8..."


### Converting to CSV

Convert the queried results into CSV files.

In [6]:
reach.to_csv('C:\\Users\\tarpinia\\Desktop\\reach.csv')

In [7]:
WSE.to_csv('C:\\Users\\tarpinia\\Desktop\\WSE.csv')