# Download the Sample Data
Using the Python code below, we can download a land cover raster file for Johnson County, Iowa, and a shapefile for the city boundary of Iowa City. We will use these two files to test the zonal statistics tool that we will create in the following steps.

In [None]:
import zipfile, urllib.request, shutil

url1 = 'https://iowageodata.s3.amazonaws.com/imageryBaseMapsEarthCover/earthCover/Land_cover_2009_1m/HRLC_2009_52.zip'
url2 = 'https://github.com/ui-libraries/Zonal_Statistics_Tool_JupyterNotebook/raw/main/IowaCity_Shapefile.zip'
file_name1 = 'HRLC_2009_52.zip'
file_name2 = 'IowaCity_Shapefile.zip'

# Download the land cover archive, then unzip it once the file has been written and closed
with urllib.request.urlopen(url1) as response, open(file_name1, 'wb') as out_file:
    shutil.copyfileobj(response, out_file)
with zipfile.ZipFile(file_name1) as zf:
    zf.extractall()

# Download the Iowa City shapefile archive and unzip it the same way
with urllib.request.urlopen(url2) as response, open(file_name2, 'wb') as out_file:
    shutil.copyfileobj(response, out_file)
with zipfile.ZipFile(file_name2) as zf:
    zf.extractall()
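
The download-and-unzip pattern above repeats once per file, so it could be wrapped in a small function. The following is a minimal sketch rather than part of the tool itself: the name `download_and_extract` and its `dest_dir` parameter are hypothetical, and it assumes the URL ends with the archive's file name, as both URLs above do.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path


def download_and_extract(url, dest_dir="."):
    """Download a zip archive from url into dest_dir and unpack it there.

    Hypothetical helper: assumes the last part of the URL is the zip file name.
    Returns the list of member names found in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    zip_path = dest / url.rsplit("/", 1)[-1]

    # Write the whole archive to disk and close it before reading it back
    with urllib.request.urlopen(url) as response, open(zip_path, "wb") as out_file:
        shutil.copyfileobj(response, out_file)

    # Unpack the archive next to the downloaded zip file
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
        return zf.namelist()


# Example call (not run here because it needs network access):
# download_and_extract(url1)  # url1 as defined in the cell above
```

Closing the output file before opening it with `zipfile.ZipFile` matters because the end of the archive may not be written to disk until the file is closed.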

# Import Modules
Before running any scripts, we need to import the following Python modules, which allow us to calculate zonal statistics for our chosen land cover raster and selected land cover class range. If you are replicating this in your own Anaconda environment, install gdal, rasterio, rasterstats, halo, and ipywidgets with the ```conda install -c conda-forge``` command in your Mac terminal or Windows shell before you import and use these modules. The gdal, rasterio, and rasterstats modules are necessary for our zonal stats analysis. The halo and ipywidgets modules add spinners to lengthy processes.

In [None]:
import os
import tkinter as tk
from tkinter.filedialog import askdirectory
from tkinter.filedialog import askopenfilename
import tkinter.messagebox
from osgeo import gdal
from osgeo import ogr, osr
import numpy as np
import rasterio as rio
from rasterstats import zonal_stats
from halo import HaloNotebook as Halo

print("Modules imported!\n-----")
print("Initializing code...")
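
Before importing, it can help to confirm that each package is actually installed in the active environment. The sketch below is one way to do that with the standard library. The helper name `missing_modules` is hypothetical, and the list passed to it mirrors the packages named above (`osgeo` is the import name that the gdal package provides).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not installed
    return [name for name in names if importlib.util.find_spec(name) is None]


# Top-level modules used in this notebook (osgeo comes from the gdal package)
required = ["osgeo", "numpy", "rasterio", "rasterstats", "halo", "ipywidgets"]
not_found = missing_modules(required)

if not_found:
    print("Modules not found: " + ", ".join(not_found))
else:
    print("All required modules are available.")
```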

# Build the Tkinter-Based GUI Framework
In the next section of Python code, we add to the code above by building the scaffolding for a GUI form that handles user inputs dynamically. The four inputs ask for the shapefile of the user's analysis area, the raster file containing the land cover data, the lowest value of the raster classes being analyzed, and the highest value of the raster classes being analyzed. An "OK" button then initiates the analytical portion of the script, using the provided inputs. For now, we skip the analytical code and build only the Tkinter GUI around it, which defines the placement and size of the input boxes and buttons. Run the following code and notice the input form that appears on your screen. Don't worry about filling anything out at this time; just click "OK" and note that it prints the message, "Processing happens here!"

In [None]:
import os
import tkinter as tk
from tkinter.filedialog import askdirectory
from tkinter.filedialog import askopenfilename
import tkinter.messagebox
from osgeo import gdal
from osgeo import ogr, osr
import numpy as np
import rasterio as rio
from rasterstats import zonal_stats
from halo import HaloNotebook as Halo

print("Modules imported!\n-----")
print("Initializing code...")

##  Set up class frame
class Application(tk.Frame):
    def __init__(self, master=None):
        tk.Frame.__init__(self, master)
        self.grid()

        ##  Commands

        ##  G-1. Return Shapefile Path
        def BrowseFile_1():
            var_1.set(askopenfilename(title="Select the shapefile for your area of interest"))

        ##  G-2. Return Raster File Path
        def BrowseFile_2():
            var_2.set(askopenfilename(title="Select the raster for zonal analysis"))

        ## G-3. Return Lowest Value of Desired Raster Class
        def BrowseLabel_3():
            var_3.set(askinteger(title="Input the lowest value in the raster classification range you want to analyze"))

        ## G-4. Return Highest Value of Desired Raster Class
        def BrowseLabel_4():
            var_4.set(askinteger(title="Input the highest value in the raster classification range you want to analyze"))

        ##  G-5. Run the Script
        def OK():

            print('Processing happens here!')

            ##  Close the form once the button's work is done
            self.quit()


        ##  GUI Widgets
        ##  G-1. Shapefile Selection
        ##  1-Label
        label_1 = tk.Label(self, text="Select the shapefile for your area of interest")
        label_1.grid(row=0, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  1-Entry Box
        var_1 = tk.StringVar()
        entry_1 = tk.Entry(self, textvariable=var_1)
        entry_1.grid(row=0, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  1-Button
        button_1 = tk.Button(self, text="Browse", command=BrowseFile_1)
        button_1.grid(row=0, column=3, padx=5, pady=5, sticky="E")

        ##  G-2. Raster Selection
        ##  2-Label
        label_2 = tk.Label(self, text="Select the raster for zonal analysis")
        label_2.grid(row=1, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  2-Entry Box
        var_2 = tk.StringVar()
        entry_2 = tk.Entry(self, textvariable=var_2)
        entry_2.grid(row=1, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  2-Button
        button_2 = tk.Button(self, text="Browse", command=BrowseFile_2)
        button_2.grid(row=1, column=3, padx=5, pady=5, sticky="E")

        ##  G-3. Low Class Selection
        ##  3-Label
        label_3 = tk.Label(self, text="Input the lowest value in the raster classification range you want to analyze")
        label_3.grid(row=2, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  3-Entry Box
        var_3 = tk.StringVar()
        entry_3 = tk.Entry(self, textvariable=var_3)
        entry_3.grid(row=2, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  G-4. High Class Selection
        ##  4-Label
        label_4 = tk.Label(self, text="Input the highest value in the raster classification range you want to analyze")
        label_4.grid(row=3, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  4-Entry Box
        var_4 = tk.StringVar()
        entry_4 = tk.Entry(self, textvariable=var_4)
        entry_4.grid(row=3, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  G-5. OK
        ##  5-Button
        button_5 = tk.Button(self, text="OK", command=OK)
        button_5.grid(row=4, column=3, columnspan=2, padx=5, pady=5, ipadx=20, sticky="s")

app = Application()
app.master.title("Zonal Stats Tool")
app.mainloop()

# Define User Inputs
With the following code, we build upon the previous code and define each of the user-provided inputs. This time, run the code and choose the City.shp file you downloaded as the input shapefile and the HRLC_2009_52.img file you downloaded as the input raster. Then, input 3 for the lowest value in the raster classification range and 6 for the highest value. If you check the metadata that comes with the img file you unzipped, you will see that classes 3 to 6 contain all of the tree canopy classes. Click "OK" and the code should print the paths to your input data, as well as the raster class values that you provided.

In [None]:
import os
import tkinter as tk
from tkinter.filedialog import askdirectory
from tkinter.filedialog import askopenfilename
import tkinter.messagebox
from osgeo import gdal
from osgeo import ogr, osr
import numpy as np
import rasterio as rio
from rasterstats import zonal_stats
from halo import HaloNotebook as Halo

print("Modules imported!\n-----")
print("Initializing code...")

##  Set up class frame
class Application(tk.Frame):
    def __init__(self, master=None):
        tk.Frame.__init__(self, master)
        self.grid()

        ##  Commands

        ##  G-1. Return Shapefile Path
        def BrowseFile_1():
            var_1.set(askopenfilename(title="Select the shapefile for your area of interest"))

        ##  G-2. Return Raster File Path
        def BrowseFile_2():
            var_2.set(askopenfilename(title="Select the raster for zonal analysis"))

        ## G-3. Return Lowest Value of Desired Raster Class
        def BrowseLabel_3():
            var_3.set(askinteger(title="Input the lowest value in the raster classification range you want to analyze"))

        ## G-4. Return Highest Value of Desired Raster Class
        def BrowseLabel_4():
            var_4.set(askinteger(title="Input the highest value in the raster classification range you want to analyze"))

        ##  G-5. Run the Script
        def OK():       
            
            ##  Define input variables
            input_zone_polygon = var_1.get()
            input_value_raster = var_2.get()
            low_class_str = var_3.get()
            high_class_str = var_4.get()

            ## Ensure all entries are filled before converting the class values
            if input_zone_polygon == "":
                print("Error: Select the shapefile for your area of interest")
                return
            if input_value_raster == "":
                print("Error: Select the raster for zonal analysis")
                return
            if low_class_str == "":
                print("Error: Input the lowest value in the raster classification range you want to analyze")
                return
            if high_class_str == "":
                print("Error: Input the highest value in the raster classification range you want to analyze")
                return

            ## Convert numeric strings to integers
            try:
                low_class_int = int(low_class_str)
                high_class_int = int(high_class_str)
            except ValueError:
                print("Error: The raster class values must be whole numbers")
                return
            
            ## Check user inputs
            print("The input shapefile is: " + input_zone_polygon + ". The input raster is: " + input_value_raster + ". The lowest raster class value is: " + low_class_str + ". The highest raster class value is: " + high_class_str + ".")
            
            self.quit()


        ##  GUI Widgets
        ##  G-1. Shapefile Selection
        ##  1-Label
        label_1 = tk.Label(self, text="Select the shapefile for your area of interest")
        label_1.grid(row=0, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  1-Entry Box
        var_1 = tk.StringVar()
        entry_1 = tk.Entry(self, textvariable=var_1)
        entry_1.grid(row=0, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  1-Button
        button_1 = tk.Button(self, text="Browse", command=BrowseFile_1)
        button_1.grid(row=0, column=3, padx=5, pady=5, sticky="E")

        ##  G-2. Raster Selection
        ##  2-Label
        label_2 = tk.Label(self, text="Select the raster for zonal analysis")
        label_2.grid(row=1, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  2-Entry Box
        var_2 = tk.StringVar()
        entry_2 = tk.Entry(self, textvariable=var_2)
        entry_2.grid(row=1, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  2-Button
        button_2 = tk.Button(self, text="Browse", command=BrowseFile_2)
        button_2.grid(row=1, column=3, padx=5, pady=5, sticky="E")

        ##  G-3. Low Class Selection
        ##  3-Label
        label_3 = tk.Label(self, text="Input the lowest value in the raster classification range you want to analyze")
        label_3.grid(row=2, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  3-Entry Box
        var_3 = tk.StringVar()
        entry_3 = tk.Entry(self, textvariable=var_3)
        entry_3.grid(row=2, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  G-4. High Class Selection
        ##  4-Label
        label_4 = tk.Label(self, text="Input the highest value in the raster classification range you want to analyze")
        label_4.grid(row=3, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  4-Entry Box
        var_4 = tk.StringVar()
        entry_4 = tk.Entry(self, textvariable=var_4)
        entry_4.grid(row=3, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  G-5. OK
        ##  5-Button
        button_5 = tk.Button(self, text="OK", command=OK)
        button_5.grid(row=4, column=3, columnspan=2, padx=5, pady=5, ipadx=20, sticky="s")

app = Application()
app.master.title("Zonal Stats Tool")
app.mainloop()
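
The two class values arrive from the form as text, so they need to be checked before they are used as numbers. Here is a minimal sketch of that check as a standalone function that can be tried without opening the GUI. The name `parse_class_range` is hypothetical, and the rule that the lowest value may not exceed the highest is an added assumption, not something the form enforces.

```python
def parse_class_range(low_text, high_text):
    """Convert the two class entries to integers and check that they form a range.

    Returns (low, high), or raises ValueError with a message suitable for printing.
    """
    # Empty boxes are reported before any conversion is attempted
    if low_text.strip() == "" or high_text.strip() == "":
        raise ValueError("Both the lowest and highest class values are required.")
    try:
        low, high = int(low_text), int(high_text)
    except ValueError:
        raise ValueError("Class values must be whole numbers.") from None
    # Assumption: a range only makes sense when low <= high
    if low > high:
        raise ValueError("The lowest class value cannot exceed the highest.")
    return low, high


print(parse_class_range("3", "6"))  # prints (3, 6)
```

Inside `OK()`, a call like this could be wrapped in `try`/`except ValueError` so that the message is printed and the form stays open for a correction.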

# Check Raster File Extension
Now that we have confirmed that our GUI works with user-provided inputs, let's add some code to confirm whether our raster file conforms to the two file formats that this code accepts, img or tif. Run this script and complete the input form as before. This time, when you click "OK", you should see a message informing you that the input raster is in .img format.

In [None]:
import os
import tkinter as tk
from tkinter.filedialog import askdirectory
from tkinter.filedialog import askopenfilename
import tkinter.messagebox
from osgeo import gdal
from osgeo import ogr, osr
import numpy as np
import rasterio as rio
from rasterstats import zonal_stats
from halo import HaloNotebook as Halo

print("Modules imported!\n-----")
print("Initializing code...")

##  Set up class frame
class Application(tk.Frame):
    def __init__(self, master=None):
        tk.Frame.__init__(self, master)
        self.grid()

        ##  Commands

        ##  G-1. Return Shapefile Path
        def BrowseFile_1():
            var_1.set(askopenfilename(title="Select the shapefile for your area of interest"))

        ##  G-2. Return Raster File Path
        def BrowseFile_2():
            var_2.set(askopenfilename(title="Select the raster for zonal analysis"))

        ## G-3. Return Lowest Value of Desired Raster Class
        def BrowseLabel_3():
            var_3.set(askinteger(title="Input the lowest value in the raster classification range you want to analyze"))

        ## G-4. Return Highest Value of Desired Raster Class
        def BrowseLabel_4():
            var_4.set(askinteger(title="Input the highest value in the raster classification range you want to analyze"))

        ##  G-5. Run the Script
        def OK():       
            
            ##  Define input variables
            input_zone_polygon = var_1.get()
            input_value_raster = var_2.get()
            low_class_str = var_3.get()
            high_class_str = var_4.get()

            ## Ensure all entries are filled before converting the class values
            if input_zone_polygon == "":
                print("Error: Select the shapefile for your area of interest")
                return
            if input_value_raster == "":
                print("Error: Select the raster for zonal analysis")
                return
            if low_class_str == "":
                print("Error: Input the lowest value in the raster classification range you want to analyze")
                return
            if high_class_str == "":
                print("Error: Input the highest value in the raster classification range you want to analyze")
                return

            ## Convert numeric strings to integers
            try:
                low_class_int = int(low_class_str)
                high_class_int = int(high_class_str)
            except ValueError:
                print("Error: The raster class values must be whole numbers")
                return
            
            print('Retrieving the directory path for the input raster.')

            ## Get the directory path of the input raster
            ras_dir_path = os.path.dirname(input_value_raster)

            print('Retrieving the file extension of the input raster.')

            ## Get the file extension of the raster file
            file_ext = os.path.splitext(input_value_raster)[1]

            print('Checking if the file extension of the input raster is tif or img.')

            ## Compare in lower case so that .TIF and .IMG are also accepted
            if file_ext.lower() == '.tif':
                drive = 'GTiff'
            elif file_ext.lower() == '.img':
                drive = 'HFA'
            else:
                print('Error: The input raster file needs to be in .tif or .img format.')
                return

            ## Print the raster file extension
            print('The input raster file extension is: ' + file_ext)
            
            self.quit()


        ##  GUI Widgets
        ##  G-1. Shapefile Selection
        ##  1-Label
        label_1 = tk.Label(self, text="Select the shapefile for your area of interest")
        label_1.grid(row=0, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  1-Entry Box
        var_1 = tk.StringVar()
        entry_1 = tk.Entry(self, textvariable=var_1)
        entry_1.grid(row=0, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  1-Button
        button_1 = tk.Button(self, text="Browse", command=BrowseFile_1)
        button_1.grid(row=0, column=3, padx=5, pady=5, sticky="E")

        ##  G-2. Raster Selection
        ##  2-Label
        label_2 = tk.Label(self, text="Select the raster for zonal analysis")
        label_2.grid(row=1, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  2-Entry Box
        var_2 = tk.StringVar()
        entry_2 = tk.Entry(self, textvariable=var_2)
        entry_2.grid(row=1, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  2-Button
        button_2 = tk.Button(self, text="Browse", command=BrowseFile_2)
        button_2.grid(row=1, column=3, padx=5, pady=5, sticky="E")

        ##  G-3. Low Class Selection
        ##  3-Label
        label_3 = tk.Label(self, text="Input the lowest value in the raster classification range you want to analyze")
        label_3.grid(row=2, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  3-Entry Box
        var_3 = tk.StringVar()
        entry_3 = tk.Entry(self, textvariable=var_3)
        entry_3.grid(row=2, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  G-4. High Class Selection
        ##  4-Label
        label_4 = tk.Label(self, text="Input the highest value in the raster classification range you want to analyze")
        label_4.grid(row=3, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  4-Entry Box
        var_4 = tk.StringVar()
        entry_4 = tk.Entry(self, textvariable=var_4)
        entry_4.grid(row=3, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  G-5. OK
        ##  5-Button
        button_5 = tk.Button(self, text="OK", command=OK)
        button_5.grid(row=4, column=3, columnspan=2, padx=5, pady=5, ipadx=20, sticky="s")

app = Application()
app.master.title("Zonal Stats Tool")
app.mainloop()
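
The extension check can also be written as a lookup table, which keeps the accepted formats in one place. This is a minimal sketch: the names `GDAL_DRIVERS` and `driver_for_raster` are hypothetical, while the driver short names `GTiff` and `HFA` are the ones used in the cell above.

```python
import os

# Driver short names taken from the cell above: GTiff for .tif, HFA for .img
GDAL_DRIVERS = {".tif": "GTiff", ".img": "HFA"}


def driver_for_raster(path):
    """Return the GDAL driver short name for a .tif or .img raster path."""
    # Lower-case the extension so that .TIF and .IMG are accepted too
    ext = os.path.splitext(path)[1].lower()
    if ext not in GDAL_DRIVERS:
        raise ValueError("The input raster file needs to be in .tif or .img format.")
    return GDAL_DRIVERS[ext]


print(driver_for_raster("HRLC_2009_52.img"))  # prints HFA
```

Raising an error for an unsupported extension stops the run at the point of the problem, instead of letting later steps fail because no driver name was set.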

# Reclassify the Raster to a Binary Classification
If all went well in the previous step, we are ready to reclassify the input raster from its many raster classes to a simple binary classification, where 1 is equivalent to the tree canopy classes (3 to 6) that we chose, and 0 is equivalent to all other classes we wish to exclude from analysis. Notice that we are using a module called halo to show a spinner and processing information alongside particularly lengthy processes. This indicates to the user that things are proceeding normally despite the wait. Grab a sandwich; it will be a while! After a bit of an interlude, you should see the message, "Done reclassifying the raster" and a new output file called "raster2.img" in the same folder as your input raster. However, we still need to reproject this raster to match the projection of the shapefile in order for our zonal stats analysis to be successful.

In [None]:
import os
import tkinter as tk
from tkinter.filedialog import askdirectory
from tkinter.filedialog import askopenfilename
import tkinter.messagebox
from osgeo import gdal
from osgeo import ogr, osr
import numpy as np
import rasterio as rio
from rasterstats import zonal_stats
from halo import HaloNotebook as Halo

print("Modules imported!\n-----")
print("Initializing code...")

##  Set up class frame
class Application(tk.Frame):
    def __init__(self, master=None):
        tk.Frame.__init__(self, master)
        self.grid()

        ##  Commands

        ##  G-1. Return Shapefile Path
        def BrowseFile_1():
            var_1.set(askopenfilename(title="Select the shapefile for your area of interest"))

        ##  G-2. Return Raster File Path
        def BrowseFile_2():
            var_2.set(askopenfilename(title="Select the raster for zonal analysis"))

        ## G-3. Return Lowest Value of Desired Raster Class
        def BrowseLabel_3():
            var_3.set(askinteger(title="Input the lowest value in the raster classification range you want to analyze"))

        ## G-4. Return Highest Value of Desired Raster Class
        def BrowseLabel_4():
            var_4.set(askinteger(title="Input the highest value in the raster classification range you want to analyze"))

        ##  G-5. Run the Script
        def OK():       
            
            ##  Define input variables
            input_zone_polygon = var_1.get()
            input_value_raster = var_2.get()
            low_class_str = var_3.get()
            high_class_str = var_4.get()

            ## Ensure all entries are filled before converting the class values
            if input_zone_polygon == "":
                print("Error: Select the shapefile for your area of interest")
                return
            if input_value_raster == "":
                print("Error: Select the raster for zonal analysis")
                return
            if low_class_str == "":
                print("Error: Input the lowest value in the raster classification range you want to analyze")
                return
            if high_class_str == "":
                print("Error: Input the highest value in the raster classification range you want to analyze")
                return

            ## Convert numeric strings to integers
            try:
                low_class_int = int(low_class_str)
                high_class_int = int(high_class_str)
            except ValueError:
                print("Error: The raster class values must be whole numbers")
                return
            
            print('Retrieving the directory path for the input raster.')

            ## Get the directory path of the input raster
            ras_dir_path = os.path.dirname(input_value_raster)

            print('Retrieving the file extension of the input raster.')

            ## Get the file extension of the raster file
            file_ext = os.path.splitext(input_value_raster)[1]

            print('Checking if the file extension of the input raster is tif or img.')

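            ## Compare in lower case so that .TIF and .IMG are also accepted
            if file_ext.lower() == '.tif':
                drive = 'GTiff'
            elif file_ext.lower() == '.img':
                drive = 'HFA'
            else:
                print('Error: The input raster file needs to be in .tif or .img format.')
                ## Stop here: no driver is defined for other formats
                return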
                
            print('Reclassifying the raster to a binary 0,1 classification...')
            
            #Define the gdal driver with the drive variable from the conditional test
            driver = gdal.GetDriverByName(drive)

            file = gdal.Open(input_value_raster)
            band = file.GetRasterBand(1)
            lista = band.ReadAsArray()

            # reclassification: test the original class values once, so cells already set to 1 are never reclassified again
            with Halo(text='Reclassifying selected classes to 1 and all other classes to 0', spinner='dots'):
                in_range = (lista >= low_class_int) & (lista <= high_class_int)
                lista = in_range.astype(np.uint8)

            # create new file
            file2 = driver.Create( ras_dir_path + '/raster2' + file_ext, file.RasterXSize , file.RasterYSize , 1)
            file2.GetRasterBand(1).WriteArray(lista)

            # spatial ref system
            proj = file.GetProjection()
            georef = file.GetGeoTransform()
            file2.SetProjection(proj)
            file2.SetGeoTransform(georef)
            file2 = None

            print('Done reclassifying the raster.')
            
            self.quit()


        ##  GUI Widgets
        ##  G-1. Shapefile Selection
        ##  1-Label
        label_1 = tk.Label(self, text="Select the shapefile for your area of interest")
        label_1.grid(row=0, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  1-Entry Box
        var_1 = tk.StringVar()
        entry_1 = tk.Entry(self, textvariable=var_1)
        entry_1.grid(row=0, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  1-Button
        button_1 = tk.Button(self, text="Browse", command=BrowseFile_1)
        button_1.grid(row=0, column=3, padx=5, pady=5, sticky="E")

        ##  G-2. Raster Selection
        ##  2-Label
        label_2 = tk.Label(self, text="Select the raster for zonal analysis")
        label_2.grid(row=1, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  2-Entry Box
        var_2 = tk.StringVar()
        entry_2 = tk.Entry(self, textvariable=var_2)
        entry_2.grid(row=1, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  2-Button
        button_2 = tk.Button(self, text="Browse", command=BrowseFile_2)
        button_2.grid(row=1, column=3, padx=5, pady=5, sticky="E")

        ##  G-3. Low Class Selection
        ##  3-Label
        label_3 = tk.Label(self, text="Input the lowest value in the raster classification range you want to analyze")
        label_3.grid(row=2, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  3-Entry Box
        var_3 = tk.StringVar()
        entry_3 = tk.Entry(self, textvariable=var_3)
        entry_3.grid(row=2, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  G-4. High Class Selection
        ##  4-Label
        label_4 = tk.Label(self, text="Input the highest value in the raster classification range you want to analyze")
        label_4.grid(row=3, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  4-Entry Box
        var_4 = tk.StringVar()
        entry_4 = tk.Entry(self, textvariable=var_4)
        entry_4.grid(row=3, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  G-5. OK
        ##  5-Button
        button_5 = tk.Button(self, text="OK", command=OK)
        button_5.grid(row=4, column=3, columnspan=2, padx=5, pady=5, ipadx=20, sticky="s")

app = Application()
app.master.title("Zonal Stats Tool")
app.mainloop()
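
The reclassification rule is easier to check on a small array than on a county-wide raster. The sketch below applies the same rule (1 inside the selected class range, 0 everywhere else) to a made-up 3 x 4 grid. The function name `reclassify_binary` and the sample values are illustrative assumptions, not data from the land cover file.

```python
import numpy as np


def reclassify_binary(classes, low, high):
    """Return an array that is 1 where low <= class <= high and 0 elsewhere."""
    # One boolean test against the original values, then convert True/False to 1/0
    in_range = (classes >= low) & (classes <= high)
    return in_range.astype(np.uint8)


# A made-up 3 x 4 grid of land cover class values
sample = np.array([[1, 3, 4, 7],
                   [2, 5, 6, 8],
                   [0, 3, 9, 6]])

binary = reclassify_binary(sample, 3, 6)
print(binary)
print("Share of cells in classes 3 to 6:", binary.mean())
```

Because the output holds only 0 and 1, its mean over any area is the fraction of cells that fall in the selected classes.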

# Reproject the Raster to Match the Shapefile
Now, let's rerun the process with some additional lines of code that create a new raster file in the same projection as our input shapefile. Notice that the additional code that retrieves the EPSG code of the shapefile and reprojects the raster uses the "with Halo" statement to tie a spinner to these actions, as a visual cue to the user that something is in fact happening during these long processes. When this code finishes running, you should see a new raster file called "raster2_reproject.img" in the same folder as your input raster.

In [None]:
import os
import tkinter as tk
from tkinter.filedialog import askdirectory
from tkinter.filedialog import askopenfilename
import tkinter.messagebox
from osgeo import gdal
from osgeo import ogr, osr
import numpy as np
import rasterio as rio
from rasterstats import zonal_stats
from halo import HaloNotebook as Halo

print("Modules imported!\n-----")
print("Initializing code...")

##  Set up class frame
class Application(tk.Frame):
    def __init__(self, master=None):
        tk.Frame.__init__(self, master)
        self.grid()

        ##  Commands

        ##  G-1. Return Shapefile Path
        def BrowseFile_1():
            var_1.set(askopenfilename(title="Select the shapefile for your area of interest"))

        ##  G-2. Return Raster File Path
        def BrowseFile_2():
            var_2.set(askopenfilename(title="Select the raster for zonal analysis"))

        ## G-3. Return Lowest Value of Desired Raster Class
        def BrowseLabel_3():
            var_3.set(askinteger(title="Input the lowest value in the raster classification range you want to analyze"))

        ## G-4. Return Highest Value of Desired Raster Class
        def BrowseLabel_4():
            var_4.set(askinteger(title="Input the highest value in the raster classification range you want to analyze"))

        ##  G-5. Run the Script
        def OK():       
            
            ##  Define input variables
            input_zone_polygon = var_1.get()
            input_value_raster = var_2.get()
            low_class_str = var_3.get()
            high_class_str = var_4.get()

            ## Ensure all entries are filled before converting the class values
            if input_zone_polygon == "":
                print("Error: Select the shapefile for your area of interest")
                return
            if input_value_raster == "":
                print("Error: Select the raster for zonal analysis")
                return
            if low_class_str == "":
                print("Error: Input the lowest value in the raster classification range you want to analyze")
                return
            if high_class_str == "":
                print("Error: Input the highest value in the raster classification range you want to analyze")
                return

            ## Convert numeric strings to integers
            try:
                low_class_int = int(low_class_str)
                high_class_int = int(high_class_str)
            except ValueError:
                print("Error: The raster class values must be whole numbers")
                return
            
            print('Retrieving the directory path for the input raster.')

            ## Get the directory path of the input raster
            ras_dir_path = os.path.dirname(input_value_raster)

            print('Retrieving the file extension of the input raster.')

            ## Get the file extension of the raster file
            file_ext = os.path.splitext(input_value_raster)[1]

            print('Checking if the file extension of the input raster is tif or img.')

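            ## Compare in lower case so that .TIF and .IMG are also accepted
            if file_ext.lower() == '.tif':
                drive = 'GTiff'
            elif file_ext.lower() == '.img':
                drive = 'HFA'
            else:
                print('Error: The input raster file needs to be in .tif or .img format.')
                ## Stop here: no driver is defined for other formats
                return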
                
            print('Reclassifying the raster to a binary 0,1 classification...')
            
            #Define the gdal driver with the drive variable from the conditional test
            driver = gdal.GetDriverByName(drive)

            file = gdal.Open(input_value_raster)
            band = file.GetRasterBand(1)
            lista = band.ReadAsArray()

            # reclassification: test the original class values once, so cells already set to 1 are never reclassified again
            with Halo(text='Reclassifying selected classes to 1 and all other classes to 0', spinner='dots'):
                in_range = (lista >= low_class_int) & (lista <= high_class_int)
                lista = in_range.astype(np.uint8)

            # create new file
            file2 = driver.Create( ras_dir_path + '/raster2' + file_ext, file.RasterXSize , file.RasterYSize , 1)
            file2.GetRasterBand(1).WriteArray(lista)

            # spatial ref system
            proj = file.GetProjection()
            georef = file.GetGeoTransform()
            file2.SetProjection(proj)
            file2.SetGeoTransform(georef)
            file2 = None

            print('Done reclassifying the raster. Reprojecting the raster to the shapefile projection...')
            
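            # Get the EPSG code of the input shapefile
            with Halo(text='Identifying the EPSG code of the input SHP', spinner='dots'):
                shp_driver = ogr.GetDriverByName('ESRI Shapefile')
                dataset = shp_driver.Open(input_zone_polygon)
                layer = dataset.GetLayer()
                spatialRef = layer.GetSpatialRef()
                # Note: this reads the authority code stored on the geographic coordinate system (GEOGCS) node
                shp_epsg = spatialRef.GetAttrValue("GEOGCS|AUTHORITY", 1)

            # Stop if the shapefile does not carry an EPSG code to reproject to
            if shp_epsg is None:
                print('Error: No EPSG code could be read from the shapefile projection.')
                return

            print('Shapefile projection is EPSG:' + shp_epsg + '.')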
            
            # Reproject the raster
            with Halo(text='Reprojecting the raster to match SHP', spinner='dots'):
                input_raster = gdal.Open(ras_dir_path + '/raster2' + file_ext)
                output_raster = ras_dir_path + '/raster2_reproject' + file_ext

                warp = gdal.Warp(output_raster,input_raster,dstSRS='EPSG:'+str(shp_epsg))
                warp = None

            print('Done reprojecting.')
            
            self.quit()


        ##  GUI Widgets
        ##  G-1. Shapefile Selection
        ##  1-Label
        label_1 = tk.Label(self, text="Select the shapefile for your area of interest")
        label_1.grid(row=0, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  1-Entry Box
        var_1 = tk.StringVar()
        entry_1 = tk.Entry(self, textvariable=var_1)
        entry_1.grid(row=0, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  1-Button
        button_1 = tk.Button(self, text="Browse", command=BrowseFile_1)
        button_1.grid(row=0, column=3, padx=5, pady=5, sticky="E")

        ##  G-2. Raster Selection
        ##  2-Label
        label_2 = tk.Label(self, text="Select the raster for zonal analysis")
        label_2.grid(row=1, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  2-Entry Box
        var_2 = tk.StringVar()
        entry_2 = tk.Entry(self, textvariable=var_2)
        entry_2.grid(row=1, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  2-Button
        button_2 = tk.Button(self, text="Browse", command=BrowseFile_2)
        button_2.grid(row=1, column=3, padx=5, pady=5, sticky="E")

        ##  G-3. Low Class Selection
        ##  3-Label
        label_3 = tk.Label(self, text="Input the lowest value in the raster classification range you want to analyze")
        label_3.grid(row=2, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  3-Entry Box
        var_3 = tk.StringVar()
        entry_3 = tk.Entry(self, textvariable=var_3)
        entry_3.grid(row=2, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  G-4. High Class Selection
        ##  4-Label
        label_4 = tk.Label(self, text="Input the highest value in the raster classification range you want to analyze")
        label_4.grid(row=3, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  4-Entry Box
        var_4 = tk.StringVar()
        entry_4 = tk.Entry(self, textvariable=var_4)
        entry_4.grid(row=3, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  G-5. OK
        ##  5-Button
        button_5 = tk.Button(self, text="OK", command=OK)
        button_5.grid(row=4, column=3, columnspan=2, padx=5, pady=5, ipadx=20, sticky="s")

app = Application()
app.master.title("Zonal Stats Tool")
app.mainloop()
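
The binary reclassification performed in the cell above can be checked on a small array before running it on a full raster. The sketch below is illustrative only: the `reclassify_binary` helper name and the toy class values are assumptions, not part of the tool. It computes the selection mask in a single comparison against the original values and then converts that mask to 0 and 1.

```python
import numpy as np

def reclassify_binary(classes, low, high):
    """Return 1 where low <= class <= high and 0 everywhere else."""
    # The mask is evaluated once, against the original class values
    selected = (classes >= low) & (classes <= high)
    # Keep the input dtype so the result can be written back to a raster band
    return selected.astype(classes.dtype)

# Toy 3 x 3 "land cover" grid with made-up class values
toy = np.array([[1, 3, 4],
                [6, 7, 2],
                [5, 9, 3]], dtype=np.uint8)

binary = reclassify_binary(toy, 3, 6)
print(binary)
print("pixels in selected classes:", int(binary.sum()))
```

Because the mask is built before any value is overwritten, the result does not depend on whether 0 or 1 happen to fall inside the selected range.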

# Compute Zonal Statistics
Now that the raster projection matches the input shapefile projection, we can calculate zonal statistics for the selected raster classes within the area covered by the shapefile. In this case, we are calculating the percentage of Iowa City covered by tree canopy. Doing so takes only a few additional lines of code built on the rasterstats module that we imported. What follows is the complete Python code for our zonal statistics tool. When it finishes running, you should see a message informing you that the selected raster classes (3, 4, 5, and 6) make up 29.26% of the area of interest (Iowa City). The minimum of 0 and the maximum of 1 confirm that the analyzed raster uses a binary classification scheme. Count is the total number of raster pixels in the shapefile-defined AOI, while sum is the number of those pixels that fall within the selected raster classes. Dividing sum by count gives the mean, which is the fraction of the AOI covered; multiply it by 100 to get the percent coverage.

In [None]:
import os
import tkinter as tk
from tkinter.filedialog import askdirectory
from tkinter.filedialog import askopenfilename
import tkinter.messagebox
from osgeo import gdal
from osgeo import ogr, osr
import numpy as np
import rasterio as rio
from rasterstats import zonal_stats
from halo import HaloNotebook as Halo

print("Modules imported!\n-----")
print("Initializing code...")

##  Set up class frame
class Application(tk.Frame):
    def __init__(self, master=None):
        tk.Frame.__init__(self, master)
        self.grid()

        ##  Commands

        ##  G-1. Return Shapefile Path
        def BrowseFile_1():
            var_1.set(askopenfilename(title="Select the shapefile for your area of interest"))

        ##  G-2. Return Raster File Path
        def BrowseFile_2():
            var_2.set(askopenfilename(title="Select the raster for zonal analysis"))

        ## G-3. Return Lowest Value of Desired Raster Class
        def BrowseLabel_3():
            var_3.set(askinteger(title="Input the lowest value in the raster classification range you want to analyze"))

        ## G-4. Return Highest Value of Desired Raster Class
        def BrowseLabel_4():
            var_4.set(askinteger(title="Input the highest value in the raster classification range you want to analyze"))

        ##  G-5. Run the Script
        def OK():       
            
            ##  Define input variables
            input_zone_polygon = var_1.get()
            input_value_raster = var_2.get()
            low_class_str = var_3.get()
            high_class_str = var_4.get()

            ## Ensure all entries are filled before doing any work
            if input_zone_polygon == "":
                print("Error: Select the shapefile for your area of interest")
                return
            if input_value_raster == "":
                print("Error: Select the raster for zonal analysis")
                return
            if low_class_str == "":
                print("Error: Input the lowest value in the raster classification range you want to analyze")
                return
            if high_class_str == "":
                print("Error: Input the highest value in the raster classification range you want to analyze")
                return

            ## Convert numeric strings to integers
            try:
                low_class_int = int(low_class_str)
                high_class_int = int(high_class_str)
            except ValueError:
                print("Error: The lowest and highest class values must be whole numbers")
                return
            
            print('Retrieving the directory path for the input raster.')

            ## Get the directory path of the input raster
            ras_dir_path = os.path.dirname(input_value_raster)

            print('Retrieving the file extension of the input raster.')

            ## Get the file extension of the raster file
            file_ext = os.path.splitext(input_value_raster)[1]

            print('Checking if the file extension of the input raster is tif or img.')

            if file_ext == '.tif':
                drive = 'GTiff'
            elif file_ext == '.img':
                drive = 'HFA'
            else:
                print('Error: The input raster file needs to be in .tif or .img format.')
                # Stop here; without a matching driver name the steps below cannot run
                return
                
            print('Reclassifying the raster to a binary 0,1 classification...')
            
            #Define the gdal driver with the drive variable from the conditional test
            driver = gdal.GetDriverByName(drive)

            file = gdal.Open(input_value_raster)
            band = file.GetRasterBand(1)
            lista = band.ReadAsArray()

            # reclassification
            with Halo(text='Reclassifying selected classes to 1 and all others to 0', spinner='dots'):
                # Build the mask from the original values first, so that pixels
                # already rewritten to 0 or 1 are never tested a second time
                selected = (lista >= low_class_int) & (lista <= high_class_int)
                lista = selected.astype(lista.dtype)

            # create new file
            file2 = driver.Create( ras_dir_path + '/raster2' + file_ext, file.RasterXSize , file.RasterYSize , 1)
            file2.GetRasterBand(1).WriteArray(lista)

            # spatial ref system
            proj = file.GetProjection()
            georef = file.GetGeoTransform()
            file2.SetProjection(proj)
            file2.SetGeoTransform(georef)
            file2 = None

            print('Done reclassifying the raster. Reprojecting the raster to the shapefile projection...')
            
            # Get the EPSG code of the input shapefile
            with Halo(text='Identifying the EPSG code of the input SHP', spinner='dots'):
                shp_driver = ogr.GetDriverByName('ESRI Shapefile')
                dataset = shp_driver.Open(input_zone_polygon)
                layer = dataset.GetLayer()
                spatialRef = layer.GetSpatialRef()
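                # Read the authority code from the root of the CRS so that projected
                # shapefiles report their projected EPSG code, not the base geographic one
                shp_epsg = spatialRef.GetAuthorityCode(None)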

            print('Shapefile projection is EPSG:' + shp_epsg + '.')
            
            # Reproject the raster
            with Halo(text='Reprojecting the raster to match SHP', spinner='dots'):
                input_raster = gdal.Open(ras_dir_path + '/raster2' + file_ext)
                output_raster = ras_dir_path + '/raster2_reproject' + file_ext

                warp = gdal.Warp(output_raster,input_raster,dstSRS='EPSG:'+str(shp_epsg))
                warp = None

            print('Done reprojecting. Processing zonal statistics...')
            
            zs = zonal_stats(input_zone_polygon,output_raster,stats=['min', 'max', 'mean', 'count', 'sum'])

            print(zs)

            ## Hide the tkinter root box
            root = tk.Tk()
            root.withdraw()

            ## Define each zonal stat (names avoid shadowing the built-ins min, max, and sum)
            zs_min = [x['min'] for x in zs]
            zs_max = [x['max'] for x in zs]
            zs_mean = [x['mean'] for x in zs]
            zs_count = [x['count'] for x in zs]
            zs_sum = [x['sum'] for x in zs]

            ## Build the messagebox content from the first polygon's statistics
            lines = [
                "AOI covered by selected raster classes: " + str(round(zs_mean[0]*100, 2)) + "%",
                "minimum: " + str(zs_min[0]),
                "maximum: " + str(zs_max[0]),
                "count: " + str(zs_count[0]),
                "sum: " + str(zs_sum[0]),
            ]

            ## Display the messagebox content in separate lines
            tk.messagebox.showinfo("Zonal Statistics Summary", "\n".join(lines))

            print('All done processing!')
            
        self.quit()


        ##  GUI Widgets
        ##  G-1. Shapefile Selection
        ##  1-Label
        label_1 = tk.Label(self, text="Select the shapefile for your area of interest")
        label_1.grid(row=0, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  1-Entry Box
        var_1 = tk.StringVar()
        entry_1 = tk.Entry(self, textvariable=var_1)
        entry_1.grid(row=0, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  1-Button
        button_1 = tk.Button(self, text="Browse", command=BrowseFile_1)
        button_1.grid(row=0, column=3, padx=5, pady=5, sticky="E")

        ##  G-2. Raster Selection
        ##  2-Label
        label_2 = tk.Label(self, text="Select the raster for zonal analysis")
        label_2.grid(row=1, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  2-Entry Box
        var_2 = tk.StringVar()
        entry_2 = tk.Entry(self, textvariable=var_2)
        entry_2.grid(row=1, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  2-Button
        button_2 = tk.Button(self, text="Browse", command=BrowseFile_2)
        button_2.grid(row=1, column=3, padx=5, pady=5, sticky="E")

        ##  G-3. Low Class Selection
        ##  3-Label
        label_3 = tk.Label(self, text="Input the lowest value in the raster classification range you want to analyze")
        label_3.grid(row=2, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  3-Entry Box
        var_3 = tk.StringVar()
        entry_3 = tk.Entry(self, textvariable=var_3)
        entry_3.grid(row=2, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  G-4. High Class Selection
        ##  4-Label
        label_4 = tk.Label(self, text="Input the highest value in the raster classification range you want to analyze")
        label_4.grid(row=3, column=0, columnspan=2, padx=5, pady=5, sticky="W")

        ##  4-Entry Box
        var_4 = tk.StringVar()
        entry_4 = tk.Entry(self, textvariable=var_4)
        entry_4.grid(row=3, column=2, padx=5, pady=5, ipadx=100, sticky="W")

        ##  G-5. OK
        ##  5-Button
        button_5 = tk.Button(self, text="OK", command=OK)
        button_5.grid(row=4, column=3, columnspan=2, padx=5, pady=5, ipadx=20, sticky="s")

app = Application()
app.master.title("Zonal Stats Tool")
app.mainloop()
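
Because the reprojected raster contains only 0 and 1, the mean reported by rasterstats is simply sum divided by count. The arithmetic can be sketched as follows, using a hypothetical `percent_cover` helper and made-up numbers rather than real output from the tool. The guard for an empty zone is an assumption about how you might want to handle a polygon that contains no valid pixels.

```python
def percent_cover(stats):
    """Percent of pixels equal to 1, from a dict with 'sum' and 'count' keys."""
    count = stats.get('count')
    total = stats.get('sum')
    # A zone with no valid pixels may report a count of 0 and no sum;
    # return None instead of dividing by zero
    if not count or total is None:
        return None
    return round(total / count * 100, 2)

# Made-up statistics in the same shape as one entry of the zonal_stats result
example = {'min': 0.0, 'max': 1.0, 'mean': 0.25, 'count': 200, 'sum': 50.0}

print(percent_cover(example))
print(percent_cover({'count': 0, 'sum': None}))
```

The same check could be applied to `zs[0]` before the message box is built, so that an AOI lying outside the raster produces a clear message instead of an error.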

# Conclusion
This concludes the tutorial for creating a stand-alone GUI-driven Python tool for computing zonal statistics on a raster land cover file. Feel free to copy and modify this code to serve your own interests!