# Geoprocessing

---

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Geoprocessing" data-toc-modified-id="Geoprocessing-1">Geoprocessing</a></span><ul class="toc-item"><li><span><a href="#Code-from-Notebook-1,-needed-for-in-memory-objects-for-Geoprocessing" data-toc-modified-id="Code-from-Notebook-1,-needed-for-in-memory-objects-for-Geoprocessing-1.1">Code from Notebook 1, needed for in-memory objects for Geoprocessing</a></span></li></ul></li><li><span><a href="#Geoprocessing" data-toc-modified-id="Geoprocessing-2">Geoprocessing</a></span><ul class="toc-item"><li><span><a href="#Point-in-Polygon-Counts-via-Spatial-Join" data-toc-modified-id="Point-in-Polygon-Counts-via-Spatial-Join-2.1">Point-in Polygon Counts via Spatial Join</a></span><ul class="toc-item"><li><span><a href="#Join-Features---Esri-Description" data-toc-modified-id="Join-Features---Esri-Description-2.1.1">Join Features - Esri Description</a></span></li></ul></li></ul></li></ul></div>

In [6]:
import os
import sys
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)

In [28]:
import glob
import geopandas as gpd

In [44]:
from tools.tools import read_json, get_current_time
from capstone.etl.viirs_join_basins import viirs_join_basins, compile_basin_data
from capstone.etl.census_parse import parse_census
from capstone.etl.census_retrieval import census_retrieval
from capstone.etl.generate_basins import generate_us_basins
from capstone.etl.eia_retrieval import eia_retrieval
from capstone.etl.eia_parse import eia_parse_county, eia_parse_data

In [39]:
config = read_json('../config.json')

current_date = get_current_time('yyyymmdd')

wd = f"{config['workspace_directory']}/data"

## Code from Notebook 1, needed for in-memory objects for Geoprocessing

In [40]:
census_shp = census_retrieval(f"{wd}/input/census")
census = gpd.read_file(census_shp)
census.columns = [c.lower() for c in census.columns]

eia_xls = eia_retrieval(f"{wd}/input/eia")
eia_cnty = eia_parse_county(eia_xls)
eia_data = eia_parse_data(eia_xls)  # parse the target variable(s) data

census_gdf = parse_census(census_shp)
basins_list, all_basins = generate_us_basins(
    census_gdf,
    eia_cnty,
    f"{wd}/input/basins",
)  # creates one file per basin geography, plus a combined all_basins geography file/object

 parse eia data
    for Anadarko Region
    for Appalachia Region
    for Bakken Region
    for Eagle Ford Region
    for Haynesville Region
    for Niobrara Region
    for Permian Region
generating us basins
    permian region
    appalachia region
    haynesville region
    eagle ford region
    anadarko region
    niobrara region
    bakken region


# Geoprocessing

In [41]:
# get lists of all the retrieved viirs data for both 2.1c and 3.0 viirs

viirs_2_1c_files = glob.glob(f"{wd}/input/viirs21c/*.csv")  # get viirs 2.1c files
viirs_2_1c_files.sort()  # sort so dates are consecutive for tracking

print(f'Total 2.1c files: {len(viirs_2_1c_files)}')

viirs_3_0_files = glob.glob(f"{wd}/input/viirs30/*.csv")  # get viirs files
viirs_3_0_files.sort()  # sort so dates are consecutive for tracking

print(f'Total 3.0 files: {len(viirs_3_0_files)}')

Total 2.1c files: 2095
Total 3.0 files: 824


## Point-in-Polygon Counts via Spatial Join

This project did not use Esri's Join Features tool; it used the GeoPandas spatial join (`sjoin`) instead. The Esri illustration below is included because it shows the idea clearly: a given geography (a 2D or 3D polygon) is intersected with a set of points, and the points that fall inside it can then be counted.

### Join Features - Esri Description 

> Joins attributes from one layer to another based on spatial, temporal, or attribute relationships, or a combination of those relationships. [https://pro.arcgis.com/en/pro-app/tool-reference/geoanalytics-desktop/join-features.htm](https://pro.arcgis.com/en/pro-app/tool-reference/geoanalytics-desktop/join-features.htm)

[![join](https://pro.arcgis.com/en/pro-app/tool-reference/geoanalytics-desktop/GUID-EB8FA998-105A-4D93-93E3-5FAA1057137D-web.png)](https://pro.arcgis.com/en/pro-app/tool-reference/geoanalytics-desktop/GUID-EB8FA998-105A-4D93-93E3-5FAA1057137D-web.png)
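
To make the counting idea concrete without any GIS library, here is a minimal pure-Python sketch of a point-in-polygon test using even-odd ray casting. It is only an illustration of the concept, not what GeoPandas does internally; the names `point_in_ring` and `count_points` and the toy coordinates are assumptions made up for this example. Points lying exactly on a boundary are not treated specially here, unlike the `intersects` predicate.

```python
def point_in_ring(x, y, ring):
    """Even-odd ray casting: True if (x, y) lies inside the closed ring.

    `ring` is a list of (x, y) vertices; the last vertex connects back to the first.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside


def count_points(points, polygons):
    """Count how many points fall inside each named polygon ring."""
    return {
        name: sum(point_in_ring(x, y, ring) for x, y in points)
        for name, ring in polygons.items()
    }


# Hypothetical toy data: two square "basins" and four detections.
toy_polygons = {
    "basin_a": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "basin_b": [(3, 0), (5, 0), (5, 2), (3, 2)],
}
toy_points = [(1, 1), (1.5, 0.5), (4, 1), (10, 10)]

toy_counts = count_points(toy_points, toy_polygons)
print(toy_counts)
```

The point at (10, 10) lies in neither square, so it contributes to no count, which mirrors what an inner spatial join does with unmatched points.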


GeoPandas code from `tools.geoprocessing.py`, which `viirs_join_basins(...)` uses in this project repository:
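```python
import geopandas as gpd


def point_in_polygon(point_gdf, poly_gdf):
    """Return the points that intersect a polygon, joined with that polygon's attributes."""
    # GeoPandas warns if the CRS of the two frames do not match.
    # Newer GeoPandas releases (0.10+) call the `op` argument `predicate`.
    return gpd.sjoin(
        point_gdf,
        poly_gdf,
        how="inner",
        op="intersects",
    )
```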
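
The join returns one row per matched point, so per-polygon counts still need an aggregation step, and the CRS warning can be avoided by reprojecting first. The following is a hedged sketch of both steps on toy data, not the project's actual `viirs_join_basins` logic. The `basin` and `id` columns and the coordinates are invented for illustration, and the sketch assumes GeoPandas 0.10 or later, where the argument is named `predicate` rather than `op`.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Hypothetical polygons standing in for basin geometries (WGS84).
polys = gpd.GeoDataFrame(
    {"basin": ["a", "b"]},
    geometry=[
        Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
        Polygon([(3, 0), (5, 0), (5, 2), (3, 2)]),
    ],
    crs="EPSG:4326",
)

# Hypothetical points standing in for VIIRS detections.
points = gpd.GeoDataFrame(
    {"id": [1, 2, 3, 4]},
    geometry=[Point(1, 1), Point(1.5, 0.5), Point(4, 1), Point(10, 10)],
    crs="EPSG:4326",
)

# Reproject the points if the two frames disagree, so sjoin compares
# coordinates in the same reference system.
if points.crs != polys.crs:
    points = points.to_crs(polys.crs)

# Inner join keeps only points that intersect some polygon.
joined = gpd.sjoin(points, polys, how="inner", predicate="intersects")

# One row per matched point, so the group size is the point count per polygon.
counts = joined.groupby("basin").size()
print(counts)
```

With an inner join, polygons that contain no points do not appear in `counts` at all; reindexing against the full list of polygon names would be one way to report them as zero.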

In [42]:
viirs_join_basins( 
    wd,
    all_basins,
    viirs_2_1c_files,
    '21c',
)   # spatially join viirs 2.1c to basins geometries

viirs_join_basins(
    wd,
    all_basins,
    viirs_3_0_files,
    '30',
)  # spatially join viirs 3.0 to basins geometries

selecting viirs for basins 21c
    20120301
    20120302
    20120303
    20120304
    20120305
    20120306
    20120307
    20120308
    20120309
    20120310
    20120311
    20120312
    20120313
    20120314
    20120315
    20120316
    20120317
    20120318
    20120319
    20120320
    20120321
    20120322
    20120323
    20120324
    20120326
    20120327
    20120328
    20120329
    20120330
    20120331
    20120401
    20120402
    20120403
    20120404
    20120405
    20120406
    20120407
    20120408
    20120409
    20120410
    20120411
    20120412
    20120413
    20120414
    20120415
    20120416
    20120417
    20120418
    20120419
    20120420
    20120421
    20120422
    20120423
    20120424
    20120425
    20120426
    20120427
    20120428
    20120429
    20120430
    20120501
    20120502
    20120503
    20120504
    20120505
    20120506
    20120507
    20120508
    20120509
    20120510
    20120511
    20120512
    20120513
    20120514
    201

    20160314
    20160315
    20160316
    20160317
    20160318
    20160319
    20160320
    20160321
    20160322
    20160323
    20160324
    20160325
    20160326
    20160327
    20160328
    20160329
    20160330
    20160331
    20160401
    20160402
    20160403
    20160404
    20160405
    20160406
    20160407
    20160408
    20160409
    20160410
    20160411
    20160412
    20160413
    20160414
    20160415
    20160416
    20160417
    20160418
    20160419
    20160420
    20160421
    20160422
    20160423
    20160424
    20160425
    20160426
    20160427
    20160428
    20160429
    20160430
    20160501
    20160502
    20160503
    20160504
    20160505
    20160506
    20160507
    20160508
    20160509
    20160510
    20160511
    20160512
    20160513
    20160514
    20160515
    20160516
    20160517
    20160518
    20160519
    20160520
    20160521
    20160522
    20160523
    20160524
    20160525
    20160526
    20160527
    20160528
    20160529

    20181205
    20181206
    20181207
    20181208
    20181209
    20181210
    20181211
    20181212
    20181213
    20181214
    20181215
    20181216
    20181217
    20181218
    20181219
    20181220
    20181221
    20181222
    20181223
    20181224
    20181225
    20181226
    20181227
    20181228
    20181229
    20181230
    20181231
    20190101
    20190102
    20190103
    20190104
    20190105
    20190106
    20190107
    20190108
    20190109
    20190110
    20190111
    20190112
    20190113
    20190114
    20190115
    20190116
    20190117
    20190118
    20190119
    20190120
    20190121
    20190122
    20190123
    20190124
    20190125
    20190126
    20190127
    20190128
    20190129
    20190130
    20190131
    20190201
    20190202
    20190203
    20190204
    20190205
    20190206
    20190207
    20190208
    20190209
    20190210
    20190211
    20190212
    20190213
    20190214
    20190215
    20190216
    20190217
    20190218
    20190219
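
The file lists were sorted so that dates run consecutively, yet the log above skips from 20120324 to 20120326. A small standard-library check can list such missing days. This is a minimal sketch: `missing_days` is a hypothetical helper that is not part of the project, and the sample stamps are typed in by hand from the log rather than parsed from the real file names.

```python
from datetime import datetime, timedelta


def missing_days(stamps):
    """Return the yyyymmdd strings absent between the earliest and latest stamp."""
    days = sorted({datetime.strptime(s, "%Y%m%d").date() for s in stamps})
    if not days:
        return []
    present = set(days)
    missing = []
    day = days[0]
    # Walk one calendar day at a time and record every day with no stamp.
    while day <= days[-1]:
        if day not in present:
            missing.append(day.strftime("%Y%m%d"))
        day += timedelta(days=1)
    return missing


# Stamps copied by hand from the run log.
sample = ["20120323", "20120324", "20120326", "20120327"]
print(missing_days(sample))
```

Running the same check over the full list of stamps would show whether a gap comes from a missing input file or from a failed step.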

In [45]:
basins_int_viirs_21c = compile_basin_data(wd, '21c')
basins_int_viirs_30  = compile_basin_data(wd, '30')

    20120301
    20120302
    20120303
    20120304
    20120305
    20120306
    20120307
    20120308
    20120309
    20120310
    20120311
    20120312
    20120313
    20120314
    20120315
    20120316
    20120317
    20120318
    20120319
    20120320
    20120321
    20120322
    20120323
    20120324
    20120326
    20120327
    20120328
    20120329
    20120330
    20120331
    20120401
    20120402
    20120403
    20120404
    20120405
    20120406
    20120407
    20120408
    20120409
    20120410
    20120411
    20120412
    20120413
    20120414
    20120415
    20120416
    20120417
    20120418
    20120419
    20120420
    20120421
    20120422
    20120423
    20120424
    20120425
    20120426
    20120427
    20120428
    20120429
    20120430
    20120501
    20120502
    20120503
    20120504
    20120505
    20120506
    20120507
    20120508
    20120509
    20120510
    20120511
    20120512
    20120513
    20120514
    20120515
    20120516
    20120517

    20131125
    20131126
    20131127
    20131128
    20131129
    20131130
    20131201
    20131202
    20131203
    20131204
    20131205
    20131206
    20131207
    20131208
    20131209
    20131210
    20131211
    20131212
    20131213
    20131214
    20131215
    20131216
    20131217
    20131218
    20131219
    20131220
    20131221
    20131222
    20131223
    20131224
    20131225
    20131226
    20131227
    20131228
    20131229
    20131230
    20131231
    20140101
    20140102
    20140103
    20140104
    20140105
    20140106
    20140107
    20140108
    20140109
    20140110
    20140111
    20140112
    20140113
    20140114
    20140115
    20140116
    20140117
    20140118
    20140119
    20140120
    20140121
    20140122
    20140123
    20140124
    20140125
    20140126
    20140127
    20140128
    20140129
    20140130
    20140131
    20140201
    20140202
    20140203
    20140204
    20140205
    20140206
    20140207
    20140208
    20140209

    20150818
    20150819
    20150820
    20150821
    20150822
    20150823
    20150824
    20150825
    20150826
    20150827
    20150828
    20150829
    20150830
    20150831
    20150901
    20150902
    20150903
    20150904
    20150905
    20150906
    20150907
    20150908
    20150909
    20150910
    20150911
    20150912
    20150913
    20150914
    20150915
    20150916
    20150917
    20150918
    20150919
    20150920
    20150921
    20150922
    20150923
    20150924
    20150925
    20150926
    20150927
    20150928
    20150929
    20150930
    20151001
    20151002
    20151003
    20151004
    20151005
    20151006
    20151007
    20151008
    20151009
    20151010
    20151011
    20151012
    20151013
    20151014
    20151015
    20151016
    20151017
    20151018
    20151019
    20151020
    20151021
    20151022
    20151023
    20151024
    20151025
    20151026
    20151027
    20151028
    20151029
    20151030
    20151031
    20151101
    20151102

    20170510
    20170511
    20170512
    20170518
    20170519
    20170520
    20170521
    20170522
    20170523
    20170524
    20170525
    20170526
    20170527
    20170528
    20170529
    20170530
    20170531
    20170601
    20170602
    20170603
    20170604
    20170605
    20170606
    20170607
    20170608
    20170609
    20170610
    20170611
    20170612
    20170613
    20170614
    20170615
    20170616
    20170617
    20170618
    20170619
    20170620
    20170621
    20170622
    20170623
    20170624
    20170625
    20170626
    20170627
    20170628
    20170629
    20170630
    20170701
    20170702
    20170703
    20170704
    20170705
    20170706
    20170707
    20170708
    20170709
    20170710
    20170711
    20170712
    20170713
    20170714
    20170715
    20170716
    20170717
    20170718
    20170719
    20170720
    20170721
    20170722
    20170723
    20170724
    20170725
    20170726
    20170727
    20170728
    20170729
    20170730

    20190206
    20190207
    20190208
    20190209
    20190210
    20190211
    20190212
    20190213
    20190214
    20190215
    20190216
    20190217
    20190218
    20190219
    20190220
    20190221
    20190222
    20190223
    20190224
    20190225
    20190226
    20190227
    20190228
    20190301
    20190302
    20190303
    20190304
    20190305
    20190306
    20190307
    20190308
    20190309
    20190310
    20190311
    20190312
    20190313
    20190314
    20190315
    20190316
    20190317
    20190318
    20190319
    20190320
    20190321
    20190322
    20190323
    20190324
    20190325
    20190326
    20190327
    20190328
    20190329
    20190330
    20190331
    20190401
    20190402
    20190403
    20190404
    20190405
    20190406
    20190407
    20190408
    20190409
    20190410
    20190411
    20190412
    20190413
    20190414
    20190415
    20190416
    20190417
    20190418
    20190419
    20190420
    20190421
    20190422
    20190423

In [46]:
print(basins_int_viirs_21c.shape)
print(basins_int_viirs_30.shape)

(1009001, 129)
(523075, 46)
