# NVCL Geopackager

An interactive tool to:
1. Create a [geopackage](https://www.geopackage.org/) from a simple CSV file, including some basic metadata.
2. Inspect the resulting [geopackage](https://www.geopackage.org/) file for geospatial and metadata correctness.
3. Upload the [geopackage](https://www.geopackage.org/) as a custom layer in [GeoServer](http://geoserver.org/) using the GeoServer REST API.

## Steps to use

### Create Geopackage

The first stage is to create a geopackage file using the contents of a user provided CSV file.

1. Upload a CSV file (including column headers) containing the following information:
    1. **Latitude** Column
    2. **Longitude** Column
    3. Any additional data columns that may be of interest to the end user, such as the sample ID, collar ID or domain data
2. Fill in the mandatory details:
    1. **Layer Name** - A unique and descriptive name for the layer
    2. **Collection URI** - The web location where the data collection can be downloaded
    3. **Collection Keywords** - Choose the appropriate keywords that match the data in the collection
    4. **Latitude Column Name** - Select the column in the CSV file that holds the latitude data
    5. **Longitude Column Name** - Select the column in the CSV file that holds the longitude data
    6. Select the names of any additional data columns to include (this should not include the selected latitude or longitude columns)
3. Press the "Continue" button to move to the "Inspect Geopackage" stage
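The checks applied to these inputs can be sketched with the standard library alone. This is an illustration, not the notebook's own code (which works on a pandas DataFrame): the function names `sanitise_layer_name` and `check_coordinates` and the sample data are hypothetical, and the ±360 longitude bound simply mirrors the limit used in the notebook's validation.

```python
import csv
import io
import re


def sanitise_layer_name(name: str) -> str:
    """Lower-case the name and reduce every run of other characters to one underscore."""
    safe = re.sub(r"[^a-z\d:]", "_", name.lower())
    safe = re.sub(r"_+", "_", safe)
    return safe.strip("_")


def check_coordinates(csv_text: str, lat_col: str, lon_col: str) -> list:
    """Return one message for each coordinate problem found in the CSV text."""
    messages = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        line_no = reader.line_num  # line 1 is the header row
        try:
            lat = float(row[lat_col])
            lon = float(row[lon_col])
        except (KeyError, TypeError, ValueError):
            messages.append(f"line {line_no}: coordinates missing or not numeric")
            continue
        if not -90 <= lat <= 90:
            messages.append(f"line {line_no}: latitude {lat} is outside -90..90")
        # Assumption: same generous longitude bound as the notebook's validation
        if not -360 <= lon <= 360:
            messages.append(f"line {line_no}: longitude {lon} is outside -360..360")
    return messages


# Hypothetical sample data: one good row, one bad latitude, one non-numeric value
sample = (
    "sample_id,LATITUDE,LONGITUDE\n"
    "A1,-31.95,115.86\n"
    "A2,95.0,115.86\n"
    "A3,north,115.86\n"
)

print(sanitise_layer_name("My New Layer (2021)!"))
for message in check_coordinates(sample, "LATITUDE", "LONGITUDE"):
    print(message)
```

Running the checks before building any geometry keeps the error messages tied to specific input lines, which is easier to act on than a failure later in the pipeline.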

### Inspect Geopackage

This stage is intended to allow the user to inspect the contents of the new geopackage file created in the previous stage.

1. Inspect the generated table to see the values of the columns added in the previous stage.
2. Inspect the generated geospatial preview to ensure that the point locations provided in the uploaded CSV file are in the correct locations. You can also hover over the points to inspect the metadata attached to each point.
3. (Optional) Download the generated [geopackage](https://www.geopackage.org/) file and load it in an external program to view its contents and perform further validation.
4. At any stage you can press the "Previous" button to return to the previous stage and make any corrections necessary.
5. Press the "Continue" button to move to the "Upload Geopackage" stage
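For the optional external check, one lightweight option is to read the file directly: a GeoPackage is an SQLite database, and its `gpkg_contents` table lists each layer with its spatial reference system and bounding box. The sketch below uses only the standard library; the helper name `list_gpkg_layers` is hypothetical.

```python
import sqlite3
from contextlib import closing


def list_gpkg_layers(path: str) -> list:
    """Return name, type, SRS id and bounding box for each layer in a GeoPackage."""
    query = (
        "SELECT table_name, data_type, srs_id, min_x, min_y, max_x, max_y "
        "FROM gpkg_contents ORDER BY table_name"
    )
    # closing() ensures the connection is released even if the query fails
    with closing(sqlite3.connect(path)) as conn:
        rows = conn.execute(query).fetchall()
    return [
        {
            "table_name": name,
            "data_type": data_type,
            "srs_id": srs_id,
            "bbox": (min_x, min_y, max_x, max_y),
        }
        for name, data_type, srs_id, min_x, min_y, max_x, max_y in rows
    ]
```

Calling `list_gpkg_layers("my_new_layer.gpkg")` on a downloaded file (the file name here is only an example) should show one `features` layer. For a layer produced by this tool, the `srs_id` would be expected to be 4326 and the bounding box should enclose the expected sample locations.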

### Upload Geopackage

This stage aims to upload the newly created geopackage file into an existing instance of GeoServer and make the geopackage available as a new searchable layer.

1. Upload a Geopackage file
2. Fill in the mandatory details:
    1. **Layer Name** - A unique and descriptive name for the layer (note: the value for this should already be copied over from stage 1)
    2. **Layer Owner Email** - This should be the email address of the "owner" of the new layer.
    3. **Layer Description** - A brief description of the new layer being created.  This will be visible to users of the new layer and searchable in GeoServer.
    4. **Data Owner** - Provide details of the data owner, i.e. the owner of the original collection.
    5. **GeoServer Host** - The base path to an existing GeoServer instance where the new layer will reside.
    6. **GeoServer Workspace** - The GeoServer workspace that the layer should be created under.
    7. **GeoServer User** - The name of the user that should be used to connect to the GeoServer instance.
    8. **GeoServer Password** - The password to be used to connect to GeoServer.
    9. **Collection Keywords** - Additional keywords to make the layer more searchable (note: these values should be copied over from stage 1)
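After a successful upload, the notebook links to a WMS preview of the new layer. The sketch below shows one way to assemble that preview URL from the host, workspace and layer name with `urllib.parse`, rather than hand-encoding the query string. The function name `wms_preview_url` is hypothetical. The default bounding box and image size follow the hard-coded values in the notebook (latitude rounded to -11.035), and the `/geoserver/` path prefix is assumed to match the target instance.

```python
from urllib.parse import quote, urlencode


def wms_preview_url(host, workspace, layer_name,
                    bbox=(113.81, -43.33, 153.33, -11.035),
                    width=768, height=627):
    """Build a WMS GetMap URL that opens GeoServer's OpenLayers preview."""
    params = {
        "service": "WMS",
        "version": "1.1.0",
        "request": "GetMap",
        "layers": f"{workspace}:{layer_name}",
        "bbox": ",".join(str(value) for value in bbox),
        "width": width,
        "height": height,
        "srs": "EPSG:4326",
        "styles": "",
        "format": "text/html; subtype=openlayers",
    }
    # quote_via=quote encodes spaces as %20 instead of '+'
    query = urlencode(params, quote_via=quote)
    return f"{host.rstrip('/')}/geoserver/{quote(workspace)}/wms?{query}"


print(wms_preview_url("https://example.com", "activity4_testing", "my_new_layer"))
```

Building the query from a dictionary keeps each WMS parameter visible and makes the bounding box easy to change for layers outside the default extent.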
    
    


In [None]:
import io
import re
import param
import numpy as np
import panel as pn
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
import geoviews as gv
import geoviews.feature as gf
from cartopy import crs
from urllib.parse import quote
import requests

import create_geopackage
import update_geoserver


pn.extension()
gv.extension('bokeh')

keyword_options = ['', 'age', 'stratigraphy', 'lithology', 'photo', 'spectral', 'thin section', 'SEM','Tornado','Maia','Laser-ICP-MS','TIMA','XRD','pXRF','geochemistry','isotopes','gamma ray','mag sus','density', 'AEM', 'hydrogeochemistry']

class Create_Layer(param.Parameterized):

    # Plain (non-param) attributes holding the uploaded CSV and the generated GeoDataFrame
    input_df = None
    geo_df = None
    metadata_columns = None

    file_input = param.FileSelector(precedence=1)

    layer_name = param.String(default='my_new_layer', doc='Type the name of the new layer', precedence=-2)

    collection_uri = param.String(default='https://', doc='Type the link to the data collection', precedence=-3)

    collection_keywords = param.ListSelector(default=[], objects=keyword_options, doc='Click to select standard keywords for this collection', precedence=-3)

    lat_column_name = param.Selector(doc='Select latitude column', constant=True, precedence=-4)
    long_column_name = param.Selector(doc='Select longitude column', constant=True, precedence=-4)
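    # ListSelector because several columns can be chosen at once (rendered as a MultiChoice widget)
    additional_columns = param.ListSelector(default=[], objects=[], doc='Select additional columns', constant=True, precedence=-4)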
    
    continue_button=pn.widgets.Button(name='Continue',button_type='primary', disabled=True, width=150, align=('end','end'))

    check1 = pn.indicators.BooleanStatus(value=False, color="success", width=50, height=50)
    check2 = pn.indicators.BooleanStatus(value=False, color="success", width=50, height=50)
    check3 = pn.indicators.BooleanStatus(value=False, color="success", width=50, height=50)

    ready = param.Boolean(False, precedence=-4)
    
    check_indicators = pn.Row(check1, check2, check3, align=('end','end'))

    alerts = pn.Row(check_indicators, pn.pane.Alert("Select CSV File to start", alert_type="info"))

    def __init__(self,**params):
        super().__init__(**params)
        
        self.continue_button.on_click(self.on_click_continue)
        return

    def on_click_continue(self,event):
        self.create_package()
        self.ready = True
        return

#     def get_metadata_column_defaults(self):
#         all_columns = list(self.input_df.columns)
#         selected_lat = self.lat_column_name
#         selected_long = self.long_column_name
#         if selected_lat is not None and selected_long is not None:
#             return list(filter(lambda a: a not in [selected_lat,selected_long] , all_columns))
#         else:
#             return None

    def guess_lat_and_long(self):
        all_columns = list(self.input_df.columns)
        print(all_columns)
        likely_lat = next((x for x in all_columns if x.startswith('LATITUDE')), None)
        if not likely_lat:
            likely_lat = next((x for x in all_columns if x.upper().startswith('LAT')), None)

        likely_long = next((x for x in all_columns if x.startswith('LONGITUDE')), None)
        if not likely_long:
            likely_long = next((x for x in all_columns if x.upper().startswith('LON')), None)
        return (likely_lat, likely_long)


    def validate_inputs(self) -> list:
        messages = []
        in_df = self.input_df
        selected_lat = self.lat_column_name
        selected_long = self.long_column_name
        layer_name = self.layer_name
        
#         additional_columns = self.additional_columns
        uri = self.collection_uri
    
        # Normalise the layer name: lower-case, with other characters collapsed to single underscores
        safe_layer_name = re.sub(r'[^a-zA-Z\d:]', '_', layer_name.lower())
        safe_layer_name = re.sub(r'_+', '_', safe_layer_name)
        safe_layer_name = re.sub(r'^_|_$', '', safe_layer_name)

        self.layer_name = safe_layer_name

        if not ((uri.startswith("https://")) or (uri.startswith("http://"))):
            messages.append("Invalid collection URI")

        if in_df is not None:
            if selected_lat is None:
                messages.append("Please select latitude column")
            else:
                if pd.api.types.is_numeric_dtype(in_df[selected_lat]):
                    if pd.Series(in_df[selected_lat] < -90).sum() > 0:
                        messages.append("Values in latitude column below -90")
                    if pd.Series(in_df[selected_lat] > 90).sum() > 0:
                        messages.append("Values in latitude column above 90")
                else:
                    messages.append("Latitude column is not numeric")

            if selected_long is None:
                messages.append("Please select longitude column")
            else:
                if pd.api.types.is_numeric_dtype(in_df[selected_long]):
                    if pd.Series(in_df[selected_long] < -360).sum() > 0:
                        messages.append("Values in longitude column below -360")
                    if pd.Series(in_df[selected_long] > 360).sum() > 0:
                        messages.append("Values in longitude column above 360")
                else:
                    messages.append("Longitude column is not numeric")

        return messages

#     @param.output('geodataframe')
#     @param.depends('check2', watch=True)
    def create_package(self):
        print("create_package")
        # The indicators are widgets (always truthy), so test their .value
        if self.check1.value and self.check2.value:
            selected_lat = self.lat_column_name
            selected_long = self.long_column_name

            keep_columns = []
            if self.additional_columns is not None:
                keep_columns = self.additional_columns
            
            # make any columns that are of mixed type strings
            df = self.input_df    
            mixed_columns = df.loc[:, df.applymap(type).nunique().gt(1)].columns
            df[mixed_columns] = df[mixed_columns].astype(str)
        
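            # Keep only the user-selected data columns (an empty selection yields an empty frame)
            df = df[list(keep_columns)].copy()

            if selected_lat is None or selected_long is None:
                # Without both coordinate columns no geometry can be built
                self.check3.value = False
                return gpd.GeoDataFrame(None)

            geometry = [Point(xy) for xy in zip(self.input_df.loc[:, selected_long], self.input_df.loc[:, selected_lat])]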

            if self.collection_uri:
                df = create_geopackage.add_metadata_link(df, self.collection_uri)
            if self.collection_keywords:
                df = create_geopackage.add_metadata_keywords(df, self.collection_keywords)

            self.geo_df = gpd.GeoDataFrame(df, crs=4326, geometry=geometry)

            self.check3.value = True
            self.continue_button.disabled=False
            return self.geo_df
        else:
            self.check3.value = False

        return gpd.GeoDataFrame(None)

    @param.output(geo_df=param.DataFrame)
#     @param.depends('geo_df', 'layer_name', 'collection_keywords', watch=True)
    def output(self):
        print(f"create output: layer_name: {self.layer_name}  keywords: {self.collection_keywords}")
        if self.geo_df is not None:
            geodataframe = self.geo_df
#             self.layer_name.param.trigger('value')  
#             self.layer_name = self.layer_name
        else:
            geodataframe = gpd.GeoDataFrame()
                  
        return geodataframe

    @param.depends('file_input')
    def table(self):
        widgets = None

        if self.file_input:

            df = pd.read_csv(io.BytesIO(self.file_input), low_memory=False)

            self.input_df = df
            widgets = pn.widgets.Tabulator(df, pagination='remote', page_size=15)

            self.param.additional_columns.constant = False
            self.param.long_column_name.constant = False
            self.param.lat_column_name.constant = False

            self.param.lat_column_name.objects = list(df.columns)
            self.param.long_column_name.objects = list(df.columns)
            self.param.additional_columns.objects = list(df.columns)
            self.check1.value = True
            self.lat_column_name, self.long_column_name = self.guess_lat_and_long()
            self.additional_columns = list(df.columns)

            self.param.layer_name.precedence = 2
            self.param.collection_uri.precedence = 3
            self.param.collection_keywords.precedence = 3
            
            self.param.lat_column_name.precedence = 4
            self.param.long_column_name.precedence = 4
            self.param.additional_columns.precedence = 4

        else:
            self.check3.value = False
            self.check2.value = False
            self.check1.value = False

        return widgets



#     @param.depends('geo_df')
#     def geoview(self):

#         widgets = None

#         if self.geo_df is not None:
#             no_geo_df = self.geo_df.drop(columns="geometry")
#             geo_table = pn.widgets.Tabulator(no_geo_df, pagination='remote', page_size=10)

#             tiles = gv.tile_sources.Wikipedia
#             geoview = gv.Points(self.geo_df).opts(tools=['hover'], width=800, height=800)

#             geo_map = pn.pane.HoloViews(tiles * geoview, backend='bokeh')

#             widgets = pn.Tabs(geo_table, geo_map)

#         return widgets



    @param.depends('collection_uri', 'collection_keywords', 'lat_column_name', 'long_column_name', 'additional_columns', 'layer_name', watch=True)
    def validate_self(self):
        if self.input_df is not None:

            validation = self.validate_inputs()

            if len(validation) == 0:
                self.alerts[1] = None
                self.check2.value = True
                self.continue_button.disabled = False
                self.create_package()

            else:
                self.alerts[1] = pn.pane.Alert("Validation Failed<br>" + "<br>".join(validation), alert_type="warning")
                self.check3.value = False
                self.check2.value = False
                self.continue_button.disabled = True

#             self.additional_columns = self.get_metadata_column_defaults()

        return

    def panel(self):
        widget_config ={'file_input': {'widget_type': pn.widgets.FileInput, 'accept': '.csv'},
            'collection_keywords': {'widget_type': pn.widgets.MultiChoice},
            'collection_uri': {'name': "Collection URI"},
            'layer_description': {'widget_type': pn.widgets.TextAreaInput},
            'lat_column_name': {'name': "Select Latitude Column"},
            'long_column_name': {'name': "Select Longitude Column"},
            'additional_columns': {'widget_type': pn.widgets.MultiChoice},
            }

        configured_input = pn.Param(self.param, widgets=widget_config)

        input_row = pn.Row(configured_input, self.table)
        return pn.Column(self.continue_button, self.alerts,input_row, width=1200)


# layer = Create_Layer(name='Create new layer')

# layer.panel()

In [None]:
class View_Layer(param.Parameterized):
    
    geo_df_file = param.FileSelector(precedence=-1)

    geo_df = param.DataFrame(default=gpd.GeoDataFrame(), precedence=2)

    file_name = param.String(default='', doc='Type the name of the new layer', precedence=-2)
    
    layer_name = param.String(default='', doc='Type the name of the new layer', precedence=-2)
    
    collection_keywords = param.ListSelector(default=[], objects=keyword_options, doc='Click to select standard keywords for this collection', precedence=1)
    
    continue_button=pn.widgets.Button(name='Continue',button_type='primary', disabled=True, width=150, align=('end','end'))

    ready = param.Boolean(False, precedence=4)
    
    alerts = pn.Row(pn.pane.Alert("", alert_type="light"))
    

    def __init__(self,**params):
        super().__init__(**params)
        self.continue_button.on_click(self.on_click_continue)
        self.geopackage_download = pn.widgets.FileDownload(filename=f"{self.file_name}.gpkg", callback=self.get_geopackage, button_type='success')
        return

    def on_click_continue(self,event):
        self.ready = True
        return

    @param.output(geo_df_file=param.FileSelector)
    def output(self):
        if self.geo_df is not None:
            self.geo_df_file = self.get_geopackage()
            return self.geo_df_file 
        else:
            return None

    def get_geopackage(self):
        print("stage 2 get_geopackage")
        if self.geo_df is not None and len(self.geo_df) > 0:
            output = io.BytesIO()
            self.geo_df.to_file(output, layer=self.file_name, driver="GPKG")
#             print(f"writer={output}")
#             writer.save() # Important!
            output.seek(0) # Important!
            return output
        else:
            return None

    def geoview(self):
        widgets = None
        
        if self.geo_df is not None:
            tiles = gv.tile_sources.Wikipedia
            geoview = gv.Points(self.geo_df).opts(tools=['hover'], width=800, height=800)

            widgets = pn.pane.HoloViews(tiles * geoview, backend='bokeh')

        return widgets
    
    @param.depends('layer_name', watch=True)
    def file_name_update(self):
        print(f"stage 2: file_name_update: {self.file_name}")
        if self.geopackage_download:
            self.file_name = self.layer_name
            self.geopackage_download.filename = f"{self.layer_name}.gpkg"
            self.geopackage_download.label =  f"Download {self.layer_name}.gpkg"
        
        return
        

    @param.depends('geo_df', watch=True)
    def table(self):
        print(f"stage 2: table_update: {self.file_name}")
        if 'geometry' in self.geo_df.columns:
            no_geo_df = self.geo_df.drop(columns="geometry")
        else:
            no_geo_df = self.geo_df

        self.continue_button.disabled = False

        return pn.widgets.Tabulator(no_geo_df, pagination='remote', page_size=25)

    def panel(self):
        widget_config ={
            'geo_df_file': {'widget_type': pn.widgets.FileInput},
            }

        configured_inputs = pn.Param(self.param, widgets=widget_config)

        return pn.Column(self.continue_button, self.alerts, self.geopackage_download, pn.Tabs(("Geospatial Preview", self.geoview), ("Table Preview", self.table)), width=1200)


class Upload_Layer(param.Parameterized):
    
    geo_df_file = param.FileSelector(precedence=-1)
    
    geopackage = param.FileSelector(precedence=1)

    layer_name = param.String(default='testing1', doc='Type the name of the new layer', precedence=2)
    
    layer_owner_email = param.String(default='', doc='Type the email address of the owner of this layer', precedence=2)

    layer_description = param.String(default='testing upload from notebook', doc='Provide a description of the new layer', precedence=2)
    
    data_owner = param.String(default='', doc='Type the owner of the data collection that this layer references', precedence=2)

    geoserver_workspace = param.String(default='activity4_testing', doc='Type the geoserver workspace where the new layer should reside', precedence=3)

    geoserver_host = param.String(default='https://example.com', doc='Type the geoserver host name', precedence=3)

    geoserver_user = param.String(default='admin', doc='Type the geoserver user who can upload this data', precedence=3)

    geoserver_password = param.String(default='', doc='Type the password for the geoserver user', precedence=3)
    
    collection_keywords = param.ListSelector(default=[], objects=keyword_options, doc='Click to select standard keywords for this collection', precedence=3)
    
    upload_button=pn.widgets.Button(name='Upload to Geoserver',button_type='primary', disabled=True, width=200, align=('end','end'))
    
    alerts = pn.Column(pn.pane.Alert("", alert_type="light"))
    
    ready = param.Boolean(False, precedence=-4)

    def __init__(self,**params):
        super().__init__(**params)

        self.upload_button.on_click(self.on_click_upload)
        return

    def validate_inputs(self) -> list:
        print(f'stage3 validate inputs : {self.geopackage}')
        messages = []

        if self.geopackage is not None:
            if self.geoserver_host is None or len(self.geoserver_host) < 1:
                messages.append("Please provide a valid geoserver host URL")
            elif not self.geoserver_host.startswith(("https://", "http://")):
                # Only reachable when a host was supplied, so the scheme check actually runs
                messages.append("Invalid Geoserver URI")

            if self.layer_name is None or len(self.layer_name) < 1:
                messages.append("Please provide a name for the new layer")
                
            if self.layer_owner_email is None or len(self.layer_owner_email) < 1:
                messages.append("Please provide an email address for the layer owner")

            if self.layer_description is None or len(self.layer_description) < 1:
                messages.append("Please provide a description for the new layer")
                
            if self.data_owner is None or len(self.data_owner) < 1:
                messages.append("Please provide details of the data owner")

            if self.geoserver_workspace is None or len(self.geoserver_workspace) < 1:
                messages.append("Please provide a geoserver workspace for new layer")

            if self.geoserver_user is None or len(self.geoserver_user) < 1:
                messages.append("Please provide a geoserver user to use for the upload")

            if self.geoserver_password is None or len(self.geoserver_password) < 1:
                messages.append("Please provide a password for the geoserver user")

        return messages
    
#     @param.depends('collection_keywords', watch=True)
#     def keywords_update(self):
#         print(f"stage 3:  keywords update: {self.collection_keywords}")

#         self.keywords = self.collection_keywords
       
#         return

    @param.depends('geopackage', 'layer_name', 'layer_owner_email','layer_description','data_owner' ,'geoserver_host', 'geoserver_user', 'geoserver_password', 'geoserver_workspace', watch=True)
    def validate_self(self):
        print(f"stage 3 validate layer_name: {self.layer_name}")
        if self.geopackage is not None:

            validation = self.validate_inputs()

            if len(validation) == 0:
                self.alerts[0] = None 
                self.upload_button.disabled = False
            else:
                self.alerts[0] = pn.pane.Alert("Validation Failed<br>" + "<br>".join(validation), alert_type="warning")
                self.upload_button.disabled = True
        return

    def on_click_upload(self,event):
        self.geoserver_upload()
        return


    def geoserver_upload(self):
        self.param.layer_name.constant = True
        self.param.layer_owner_email.constant = True
        self.param.layer_description.constant = True
        self.param.data_owner.constant = True
        self.param.geoserver_workspace.constant = True
        self.param.geoserver_host.constant = True
        self.param.geoserver_user.constant = True
        self.param.geoserver_password.constant = True
        self.param.collection_keywords.constant = True
        self.upload_button.disabled = True
        
        try:
            r = update_geoserver.update_datastore_and_layer_from_file(self.geoserver_host, self.geoserver_user, self.geoserver_password, self.geoserver_workspace, self.layer_name, self.layer_name, self.geopackage, self.layer_description, self.collection_keywords)
        except requests.exceptions.HTTPError as err:
            self.alerts[0] = pn.pane.Alert(f"Geoserver Upload Failed: {err}", alert_type="danger")
            self.param.layer_name.constant = False
            self.param.layer_owner_email.constant = False
            self.param.layer_description.constant = False
            self.param.data_owner.constant = False
            self.param.geoserver_workspace.constant = False
            self.param.geoserver_host.constant = False
            self.param.geoserver_user.constant = False
            self.param.geoserver_password.constant = False
            self.param.collection_keywords.constant = False
            self.upload_button.disabled = False
            self.ready = False
            print(f"upload response = {err}")
            return
        
        self.alerts[0] = pn.pane.Alert("Geoserver Upload Successful!", alert_type="success")
        self.ready = True
            
        return
    
    @param.depends('ready')
    def layer_preview(self):
        widgets = None
        if self.ready:

            preview_url = f"{self.geoserver_host}/geoserver/{self.geoserver_workspace}/wms?service=WMS&version=1.1.0&request=GetMap&layers={self.geoserver_workspace}%3A{self.layer_name}&bbox=113.81%2C-43.33%2C153.33%2C-11.03499999999999&width=768&height=627&srs=EPSG%3A4326&styles=&format=text%2Fhtml%3B%20subtype%3Dopenlayers"
            print(preview_url)
            widgets = pn.pane.Markdown(f"[GeoServer Preview]({preview_url})")
            

            
            
        return widgets


    def panel(self):
        widget_config ={'geopackage': {'widget_type': pn.widgets.FileInput, 'accept': '.gpkg'},
            'geo_df_file': {'widget_type': pn.widgets.FileInput},
            'collection_keywords': {'widget_type': pn.widgets.MultiChoice},
            'layer_description': {'widget_type': pn.widgets.TextAreaInput},
            'geoserver_password': {'widget_type': pn.widgets.PasswordInput}
            }

        configured_inputs = pn.Param(self.param, widgets=widget_config)

        return pn.Column(self.upload_button, self.alerts,pn.Row(configured_inputs,self.layer_preview), width=1200)
    
# upload = Upload_Layer(name='Upload layer')

# upload.panel()

In [None]:
pipeline = pn.pipeline.Pipeline(debug=True, inherit_params=True)

create_layer = Create_Layer()
view_layer = View_Layer(geo_df=create_layer.output())
upload_layer = Upload_Layer(geopackage=view_layer.output())

pipeline.add_stage('Create Layer', create_layer, ready_parameter='ready', auto_advance=True)

# pipeline.add_stage('Inspect Layer', View_Layer())
pipeline.add_stage('Inspect Layer', view_layer, ready_parameter='ready', auto_advance=True)

pipeline.add_stage('Upload Layer', upload_layer, ready_parameter='ready', auto_advance=True)

pipeline