# ODK Aggregate processing pipeline.

Copyright 2021 Robert McGregor

This notebook controls the processing of ODK Aggregate Result csv files and produces the following outputs:
- observation sheet
- ras sheet
- downloaded and renamed photographs

### cd to the directory from the Anaconda/Miniconda terminal.

cd E:\DENR\code\rangeland_monitoring\rmb_aggregate_processing

If the project sits on a different drive from the one your terminal opened in, add "/d" so that the Windows command prompt switches drive as well,
i.e. cd /d E:\DENR\code\rangeland_monitoring\rmb_aggregate_processing

## Setup

### Required packages:
The ODK Aggregate processing pipeline requires the following packages:

- numpy
- geopandas
- datetime
- os
- argparse
- sys
- shutil
- warnings
- glob
- xlsxwriter

If you are using this notebook locally, you may need to install these packages using conda or pip. Uncomment the relevant line in the install cell below by removing the hash ('#'), then run the cell by pressing **Shift+Enter**.

In [1]:
! conda list

# packages in environment at C:\ProgramData\Miniconda3\envs\rmb_zonal:
#
# Name                    Version                   Build  Channel
affine                    2.3.0                      py_0  
argon2-cffi               20.1.0           py37h4ab8f01_2    conda-forge
async_generator           1.10                       py_0    conda-forge
attrs                     20.2.0                     py_0  
backports                 1.0                        py_2    conda-forge
backports.functools_lru_cache 1.6.1                      py_0    conda-forge
blas                      1.0                         mkl  
bleach                    3.2.1              pyh9f0ad1d_0    conda-forge
bokeh                     2.2.3                    py37_0  
bzip2                     1.0.8                he774522_0  
ca-certificates           2021.5.30            h5b45459_0    conda-forge
certifi                   2021.5.30        py37h03978a9_0    conda-forge
cffi                      1.14.0           py

## Let's check the core package versions in your conda environment.

In [1]:
import pandas; print(f"pandas: {pandas.__version__}")
import numpy; print(f"numpy: {numpy.__version__}")
import geopandas; print(f"geopandas: {geopandas.__version__}")
import argparse; print(f"argparse: {argparse.__version__}")
import xlsxwriter; print(f"xlsxwriter: {xlsxwriter.__version__}")
import selenium; print(f"selenium: {selenium.__version__}")

pandas: 1.1.1
numpy: 1.19.1
geopandas: 0.8.1
argparse: 1.1
xlsxwriter: 1.3.7
selenium: 3.141.0
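
The cell above stops at the first package that fails to import, so a single missing package hides the state of the rest. A minimal sketch of a more forgiving check follows. The helper name `report_versions` is hypothetical and is not part of the pipeline, and the sketch assumes each package exposes a `__version__` attribute (packages that do not are reported as 'unknown').

```python
import importlib


def report_versions(names):
    """Return {package name: version string, 'unknown', or None when not installed}."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is not installed in the active environment.
            versions[name] = None
            continue
        # Not every module defines __version__, so fall back to a placeholder.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


# Example use inside the notebook (package names taken from the cell above):
# for name, version in report_versions(
#         ["pandas", "numpy", "geopandas", "xlsxwriter", "selenium"]).items():
#     print(f"{name}: {version if version is not None else 'NOT INSTALLED'}")
```

Any package reported as not installed can then be installed with the matching conda line in the next cell.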


In [None]:
# If you do not have the required packages, uncomment the relevant line below. If you are on your local computer you will
# need to use external wifi (i.e. hotspot your phone).

#! conda install -c conda-forge numpy geopandas argparse xlsxwriter selenium
#! conda install -c conda-forge numpy
#! conda install -c conda-forge geopandas
#! conda install -c conda-forge argparse
#! conda install -c conda-forge xlsxwriter
#! conda install -c conda-forge selenium

## Command arguments

- '-d', '--directory_odk', help='The directory containing ODK csv files.', default set to '\raw_odk'
- '-x', '--export_dir', help='Directory path for outputs.', default set to 'Z:\Scratch\Zonal_Stats_Pipeline\rmb_aggregate_processing\outputs'
- '-c', '--chrome_driver', help='File path for the chrome extension driver.', default set to 'assets/chrome_driver/chrome_driver_v89_0_4389_23/chromedriver.exe'
- '-r', '--remote_desktop', help='Working on the remote_desktop? - Enter remote_auto, remote, local or offline.', default set to 'remote' - see the next section for a full explanation.
- '-v', '--assets_veg_list_dir', help='Directory containing veg lists', default set to 'assets/veg_list'
- '-s', '--assets_shapefiles_dir', help='Directory containing shapefiles', default set to 'assets/shapefiles'
- '-t', '--time_sleep', help='Time between odk aggregate actions -if lagging increase integer', default set to 20 - only required if running remote_desktop as remote_auto
- '-ht', '--html_dir', help='Directory containing html transect files. remote_desktop local or offline mode requires the manual download of transect html tables', default set to 'html_transect'
- '-ver', '--version', help='ODK version being processed (e.g. v1, v2 etc.)', default set to 'v1'
- '-p', '--property_enquire', help='Enter the name of a single property you wish to process. (eg. PROPERTY NAME)', default set to None (None will process all sites)
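
To see how the flags listed above fit together, here is a minimal `argparse` sketch that mirrors them. It is an illustration only, not the parser inside step1_1_initiate_odk_processing_pipeline.py: the function name `build_parser`, the `type=int` on `--time_sleep`, and the `choices` restriction on `--remote_desktop` are assumptions, and only the arguments documented in the list are included.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (illustrative sketch only)."""
    parser = argparse.ArgumentParser(
        description="Sketch of the documented ODK pipeline command arguments.")
    parser.add_argument("-d", "--directory_odk", default=r"\raw_odk",
                        help="The directory containing ODK csv files.")
    parser.add_argument("-x", "--export_dir",
                        default=r"Z:\Scratch\Zonal_Stats_Pipeline\rmb_aggregate_processing\outputs",
                        help="Directory path for outputs.")
    parser.add_argument("-c", "--chrome_driver",
                        default="assets/chrome_driver/chrome_driver_v89_0_4389_23/chromedriver.exe",
                        help="File path for the chrome extension driver.")
    # Assumption: restricting the value with `choices`; the real script may not.
    parser.add_argument("-r", "--remote_desktop", default="remote",
                        choices=["remote_auto", "remote", "local", "offline"],
                        help="Enter remote_auto, remote, local or offline.")
    parser.add_argument("-v", "--assets_veg_list_dir", default="assets/veg_list",
                        help="Directory containing veg lists.")
    parser.add_argument("-s", "--assets_shapefiles_dir", default="assets/shapefiles",
                        help="Directory containing shapefiles.")
    # Assumption: the sleep time is parsed as an integer number of seconds.
    parser.add_argument("-t", "--time_sleep", type=int, default=20,
                        help="Time between odk aggregate actions.")
    parser.add_argument("-ht", "--html_dir", default="html_transect",
                        help="Directory containing html transect files.")
    parser.add_argument("-ver", "--version", default="v1",
                        help="ODK version being processed (e.g. v1, v2 etc.).")
    parser.add_argument("-p", "--property_enquire", default=None,
                        help="Name of a single property to process; None processes all sites.")
    return parser


# Example: parse the same flags used later in this notebook.
args = build_parser().parse_args(["-r", "remote_auto", "-p", "CLARAVALE", "-t", "10"])
print(args.remote_desktop, args.property_enquire, args.time_sleep)
```

Any flag left off the command line falls back to its default, so running the script with no arguments processes every site in 'remote' mode.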

### Remote desktop command argument options

 - remote_auto = PGB-BAS14 server - will automate the entire process - networking issues are currently causing this to crash - not recommended.
 - remote = PGB-BAS14 server - will automate the process; however, you will need to have downloaded the result csv files from ODK Aggregate - recommended.
 - local = external computer connected to NTG internet - requires odk csv files and will download photos - recommended.
 - offline = external computer not connected to NTG internet - requires odk csv files and will not download photos - use if required.
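
The four modes above can be summarised as a small lookup table, which may help when deciding which one to pass to '-r'. This is a sketch for orientation only: the names `ModeInfo`, `MODES` and `describe_mode` are hypothetical, and the `downloads_photos` values for remote_auto and remote are an assumption, since the list only states the photo behaviour for local and offline.

```python
from collections import namedtuple

# on_server: run from the PGB-BAS14 server; needs_manual_csv: result csv files
# must be downloaded by hand first; downloads_photos: photos are fetched.
ModeInfo = namedtuple("ModeInfo", ["on_server", "needs_manual_csv", "downloads_photos"])

MODES = {
    # Photo behaviour for the two server modes is assumed, not documented.
    "remote_auto": ModeInfo(on_server=True, needs_manual_csv=False, downloads_photos=True),
    "remote": ModeInfo(on_server=True, needs_manual_csv=True, downloads_photos=True),
    "local": ModeInfo(on_server=False, needs_manual_csv=True, downloads_photos=True),
    "offline": ModeInfo(on_server=False, needs_manual_csv=True, downloads_photos=False),
}


def describe_mode(name):
    """Return the ModeInfo for a mode name, rejecting anything not in the list."""
    try:
        return MODES[name]
    except KeyError:
        raise ValueError(f"Unknown remote_desktop mode: {name!r}; expected one of {sorted(MODES)}")


print(describe_mode("offline"))
```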

#### Known issues

- There are currently major issues with NTG networking which are causing ODK to crash.
- Because of this, saving transect data as an html/complete file over the network can also crash ODK Aggregate. As such, all T4's have been approved for adm-accounts, allowing access to the remote server. Until the network improves, this script should only be used with remote_desktop set to 'remote', from within the remote server.


### Check that you have connected to the project database

In [2]:
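import geopandas as gpd
import os

# The assets folder sits beside the current working directory, so step up one level.
path_parent = os.path.dirname(os.getcwd())
# Build the path with os.path.join rather than hand-written separators.
previous_visits_shapefile = os.path.join(path_parent, 'assets', 'shapefiles', 'NT_StarTransect_20200713.shp')
gdf = gpd.read_file(previous_visits_shapefile)
gdf.crs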

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

#### Do your results look like this?

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

If yes, you are ready to run some code.

If not, open this notebook from within the conda environment zonal, and try again.

Still having issues? Contact Rob or Grant on Teams and share your screen to troubleshoot.
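
Rather than comparing the printed CRS by eye, the same check can be made in code. The sketch below assumes `gdf.crs` is a pyproj CRS object (which provides a `to_epsg()` method); the helper name `is_wgs84` is hypothetical and not part of the pipeline.

```python
def is_wgs84(crs):
    """Return True when a pyproj-style CRS object resolves to EPSG:4326."""
    if crs is None:
        # A layer read without projection information has no CRS at all.
        return False
    return crs.to_epsg() == 4326


# Example use in the notebook, after the shapefile has been read:
# print("Ready to run" if is_wgs84(gdf.crs) else "CRS check failed")
```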

## Run the RMB observation / ras sheet pipeline.

#### Change your working directory to code

In [3]:
import os
path_parent = os.path.dirname(os.getcwd())
code_dir =  path_parent +  '\\code'
os.chdir(code_dir)
print(os.getcwd())

E:\DEPWS\code\rangeland_monitoring\rmb_aggregate_processing\code
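
If the cell above is run twice, or from an unexpected folder, the working directory ends up somewhere other than the code folder. A minimal sketch of a safer version follows; it builds the path with `os.path.join` and confirms the folder exists before changing into it. The helper names `sibling_dir` and `change_to_code_dir` are hypothetical.

```python
import os


def sibling_dir(start_dir, name):
    """Return the path of the folder called `name` that sits beside `start_dir`."""
    parent = os.path.dirname(os.path.abspath(start_dir))
    return os.path.join(parent, name)


def change_to_code_dir(start_dir=None):
    """Change into the 'code' folder beside `start_dir` (default: current directory)."""
    target = sibling_dir(start_dir or os.getcwd(), "code")
    if not os.path.isdir(target):
        # Fail loudly instead of silently running from the wrong place.
        raise FileNotFoundError(f"Expected a code folder at {target}")
    os.chdir(target)
    return target


# Example use in the notebook:
# print(change_to_code_dir())
```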


In [None]:
#cd E:\DENR\code\rangeland_monitoring\rmb_aggregate_processing\code

In [None]:
# press shift enter to see the command arguments
#run step1_1_initiate_odk_processing_pipeline.py --help

In [16]:
%run step1_1_initiate_odk_processing_pipeline.py -h

usage: step1_1_initiate_odk_processing_pipeline.py [-h] [-d DIRECTORY_ODK]
                                                   [-x EXPORT_DIR]
                                                   [-c CHROME_DRIVER]
                                                   [-r REMOTE_DESKTOP]
                                                   [-v ASSETS_VEG_LIST_DIR]
                                                   [-s ASSETS_SHAPEFILES_DIR]
                                                   [-t TIME_SLEEP]
                                                   [-ht HTML_DIR]
                                                   [-ver VERSION]
                                                   [-p PROPERTY_ENQUIRE]
                                                   [-pd PASTORAL_DISTRICTS_DIRECTORY]

Process raw RMB odk outputs -> csv, shapefiles observational sheets, and Ras
sheets.

optional arguments:
  -h, --help            show this help message and exit
  -d DIRECTORY_ODK, --directory_odk DIRECT

In [None]:
%run step1_1_initiate_odk_processing_pipeline.py -r remote_auto -p CLARAVALE  -t 10

Processing property:  Dorisvale
veg_list.xlsx located.
NT_Pastoral_Estate.shp located.
NT_StarTransect_20200713.shp located.
['RMB_Star_Transect_v2', 'RMB_Integrated_v2', 'RMB_Basal_Sweep_v2', 'RMB_Woody_Thickening_v2', 'RMB_Rapid_Assessment_RAS_v2']
Located and removed:  ['C:Users\\admluxts\\Downloads\\RMB_Star_Transect_v2_results.csv'] from C:Users\admluxts\Downloads
file_output:  E:\DEPWS\code\rangeland_monitoring\rmb_aggregate_processing\raw_odk\RMB_Basal_Sweep_v2_results.csv
--------------------------------------------------
file_output:  E:\DEPWS\code\rangeland_monitoring\rmb_aggregate_processing\raw_odk\RMB_Basal_Sweep_v2_results.csv
RMB_Basal_Sweep_v2_results.csv have been moved to  E:\DEPWS\code\rangeland_monitoring\rmb_aggregate_processing\raw_odk
file_output:  E:\DEPWS\code\rangeland_monitoring\rmb_aggregate_processing\raw_odk\RMB_Integrated_v2_results.csv
--------------------------------------------------
file_output:  E:\DEPWS\code\rangeland_monitoring\rmb_aggregate_proces



- located:  E:\DEPWS\code\rangeland_monitoring\rmb_aggregate_processing\raw_odk\RMB_Integrated_v2_results.csv
- in directory:  raw_odk
--------------------------------------------------
remote_auto results_csv:  E:\DEPWS\code\rangeland_monitoring\rmb_aggregate_processing\raw_odk\RMB_Integrated_v2_results.csv
--------------------------------------------------
integrated lat_lon:  ['wgs84', 'now_device', -14.5303463, 131.3019333, 4.0]
integrated:  | DVL08A |
integrated lat_lon:  ['wgs84', 'now_device', -14.5741048, 131.2951037, 4.0]
integrated:  | DVL09A |
integrated lat_lon:  ['wgs84', 'now_device', -14.5260185, 131.3506292, 6.0]
integrated:  | DVL10A |
integrated lat_lon:  ['wgs84', 'now_device', -14.3504651, 131.4440833, 6.0]
integrated:  | DVL11A |




- located:  E:\DEPWS\code\rangeland_monitoring\rmb_aggregate_processing\raw_odk\RMB_Rapid_Assessment_RAS_v2_results.csv
- in directory:  raw_odk
--------------------------------------------------
remote_auto results_csv:  E:\DEPWS\code\rangeland_monitoring\rmb_aggregate_processing\raw_odk\RMB_Rapid_Assessment_RAS_v2_results.csv
--------------------------------------------------




- located:  E:\DEPWS\code\rangeland_monitoring\rmb_aggregate_processing\raw_odk\RMB_Star_Transect_v2_results.csv
- in directory:  raw_odk
--------------------------------------------------
remote_auto results_csv:  E:\DEPWS\code\rangeland_monitoring\rmb_aggregate_processing\raw_odk\RMB_Star_Transect_v2_results.csv
--------------------------------------------------
step2_1_star_transect_processing_workflow.py INITIATED.
                            START                            END  \
0   2022-04-04T14:25:07.714+09:30  2022-04-12T08:27:40.137+09:30   
1   2022-04-05T08:37:32.444+09:30  2022-04-11T11:02:55.628+09:30   
2   2022-04-05T11:56:30.919+09:30  2022-04-11T10:59:17.199+09:30   
3   2022-04-05T15:17:31.966+09:30  2022-04-11T10:57:53.807+09:30   
4   2022-04-06T07:59:11.068+09:30  2022-04-11T10:56:30.092+09:30   
5   2022-04-07T10:50:45.787+09:30  2022-04-11T10:53:14.719+09:30   
6   2022-04-07T13:55:10.792+09:30  2022-04-07T14:39:57.995+09:30   
7   2022-04-07T15:27:38.807+09:30

clean_list after metadata:  ['DVL08A', '09/05/2022', '09/05/2022 03:49:53 PM', 'Crawford, Cameron', 'Luxton, Sarah', 'Darwin', 'Dorisvale', nan, 'Dorisvale', 'new', 'Dorisvale_DVL08A', 'wgs84', 'now_device', -14.530874, 131.3018723, 4.8, 'North', -14.5302255, 131.3018775, 4.961, 'North south', 28.0, 0.0, 0.0, 27.0, 37.0, 0.0, 2.0, 5.0, 1.0, 0.0, 0.0, 0.0, 12.0, 0.0, 24.0, 22.0, 42.0, 0.0, 0.0, 0.0, 100.0, 'Southeast northwest', 10.0, 0.0, 0.0, 54.0, 25.0, 0.0, 7.0, 4.0, 0.0, 0.0, 0.0, 0.0, 25.0, 0.0, 2.0, 18.0, 55.0, 0.0, 2.0, 0.0, 98.0, 'Northeast southwest', 7.0, 0.0, 0.0, 55.0, 20.0, 0.0, 12.0, 5.0, 0.0, 0.0, 1.0, 0.0, 9.0, 0.0, 4.0, 13.0, 74.0, 1.0, 1.0, 0.0, 98.0, 'https://pgb-bas14.nt.gov.au:8443/ODKAggregate/view/formMultipleValue?formId=RMB_STAR_TRANSECT_v2%5B%40version%3Dnull+and+%40uiVersion%3Dnull%5D%2FRMB_Star_Transect_v2%5B%40key%3Duuid%3Add7ff0a0-b9da-4f23-bd67-83deb89ba537%5D%2FREPEAT_points_1', 'https://pgb-bas14.nt.gov.au:8443/ODKAggregate/view/formMultipleValue?formId

clean_list after metadata:  ['DVL09A', '10/05/2022', '10/05/2022 08:37:16 AM', 'Hughes, Harrison', 'Crawford, Cameron', 'Darwin', 'Dorisvale', nan, 'Dorisvale', 'new', 'Dorisvale_DVL09A', 'wgs84', 'now_device', -14.5744223, 131.2950493, 5.764, 'North', -14.5740254, 131.2950275, 4.854, 'North south', 21.0, 0.0, 0.0, 0.0, 25.0, 0.0, 38.0, 14.0, 1.0, 0.0, 0.0, 1.0, 4.0, 0.0, 1.0, 7.0, 88.0, 2.0, 0.0, 0.0, 98.0, 'Southeast northwest', 24.0, 0.0, 0.0, 0.0, 13.0, 0.0, 21.0, 31.0, 10.0, 0.0, 0.0, 1.0, 5.0, 0.0, 4.0, 3.0, 88.0, 0.0, 0.0, 0.0, 100.0, 'Northeast southwest', 32.0, 0.0, 0.0, 0.0, 12.0, 0.0, 24.0, 18.0, 12.0, 0.0, 0.0, 2.0, 2.0, 0.0, 4.0, 6.0, 88.0, 0.0, 0.0, 0.0, 100.0, 'https://pgb-bas14.nt.gov.au:8443/ODKAggregate/view/formMultipleValue?formId=RMB_STAR_TRANSECT_v2%5B%40version%3Dnull+and+%40uiVersion%3Dnull%5D%2FRMB_Star_Transect_v2%5B%40key%3Duuid%3Ab1b02a5c-df31-4647-8f78-f215290783cb%5D%2FREPEAT_points_1', 'https://pgb-bas14.nt.gov.au:8443/ODKAggregate/view/formMultipleValue?





!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
located_list:  [True, True, True, True]
script 7 initiated
Filling observational sheets
file:  Z:\Scratch\Zonal_Stats_Pipeline\rmb_aggregate_processing\outputs\luxts_20220517_14_41\odk_complete\clean_basal.csv
file:  Z:\Scratch\Zonal_Stats_Pipeline\rmb_aggregate_processing\outputs\luxts_20220517_14_41\odk_complete\clean_integrated.csv
file:  Z:\Scratch\Zonal_Stats_Pipeline\rmb_aggregate_processing\outputs\luxts_20220517_14_41\odk_complete\clean_ras.csv
file:  Z:\Scratch\Zonal_Stats_Pipeline\rmb_aggregate_processing\outputs\luxts_20220517_14_41\odk_complete\clean_star_transect.csv
csv_list:  ['Z:\\Scratch\\Zonal_Stats_Pipeline\\rmb_aggregate_processing\\outputs\\luxts_20220517_14_41\\odk_complete\\clean_basal.csv', 'Z:\\Scratch\\Zonal_Stats_Pipeline\\rmb_aggregate_processing\\outputs\\luxts_20220517_14_41\\odk_complete\\clean_integrated.csv', 'Z:\\Scratch\\Zonal_Stats_Pipeline\\rmb_aggregate_processing\\outputs\\luxts_20220517_14_41\\od