# GeoPython Workshop
Welcome to the GeoPython Workshop!

The GeoPython Workshop introduces Python programming as a means of manipulating geospatial data. It includes exercises and problems in which you will analyze and manipulate geospatial data. The workshop is aimed at beginners in both the Python programming language and geospatial data.


## Requirements
 - Laptop
 - Python 2 or 3
 - Jupyter Notebook
 - Fiona
 - Shapely
 - GDAL
 - Rasterio
 - Folium

## Python Backgrounder
*Python is a widely used high-level, general-purpose, interpreted, dynamic programming language. Its design philosophy emphasizes code readability, and its syntax allows programmers to express concepts in fewer lines of code than would be possible in languages such as C++ or Java.* - [Wikipedia](https://en.wikipedia.org/wiki/Python_%28programming_language%29)

Python was created by Guido van Rossum, a Dutch programmer known in the community as the Benevolent Dictator for Life (BDFL). He worked at Google for seven years and later joined Dropbox.

Python is currently being used in a wide variety of applications and domains. 
 - Data Science and Engineering
 - Software Engineering
 - System Management and Orchestration
 - Robotics
 - Journalism
 - Geospatial World
 - Disaster Risk Management and Mitigation
 - And many more

![Python in the Real World](images/python_world1.jpg)

## Why Python?
Programmers and scientists are shifting to Python for many reasons:
 - Readability
 - Development Time
 - Flexibility
 - Large Library Support
 - Community
 - Compensation
 

## Python 101
Python's strengths are quick development and readable syntax.

### Printing a text or result
In any programming language, printing your results is the bread and butter of development and debugging. A simple print command can tell you whether your program is working properly. To print text, execute the following commands:

In [21]:
print("FOSS4G waters!")

FOSS4G waters!


In [22]:
print("""
This is a multiline text.
Python is awesome!        
""")


This is a multiline text.
Python is awesome!        



### Variable Declaration and Simple Maths
Like any programming language, Python can work with variables and use them in computations. Unlike languages such as C or Java, Python determines the type of a variable dynamically.

In [23]:
a = 25 # this is an integer
b = 5.0 # this is a float
c = "Hello FOSS!" # this is a string
d = a * b
print(a * b) # this is also the same as print(d)


125.0


In [24]:
print(c) # printing a variable with a value of a string

Hello FOSS!
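
To see dynamic typing in action, the short sketch below asks Python which type it inferred for each value. It reuses the variable names from the cell above; `type_names` is a helper name introduced only for this illustration.

```python
# Python infers the type from the value; no type declaration is needed.
a = 25             # int
b = 5.0            # float
c = "Hello FOSS!"  # str

# type() reports the type Python inferred for each value.
type_names = [type(v).__name__ for v in (a, b, c)]
print(type_names)

# Multiplying an int by a float gives a float.
d = a * b
print(type(d).__name__, d)

# The same name can later be rebound to a value of another type.
a = "twenty-five"
print(type(a).__name__)
```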


### Python Lists
Python has a built-in data type called a list. Lists are your in-memory "databases": you can store information in them and access it later in your program.

In [30]:
e = [1,2,3,4,10,23] # this is a list of integers.
f = ["this", "is", "list",  "of", "strings"]
g = ["a list", "of", 2.6] # this is a list of mixed types.
h = "This is also a list" # strictly a string, but it can be indexed like a list
i = e[-2]

e[2] = 100 # set the element of a list using its index.

print(e) # displaying the whole list.
print(i) # display the 2nd to the last of a given list.
# e.append(7000) # this adds an element to the last index of the list e
# print(e)

[1, 2, 100, 4, 10, 23]
10


### List Operations
Python has some built-in functions for handling lists like max, min, len, sorted.

In [36]:
print(len(e)) # displays the number of elements

6
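
The other built-ins named above work the same way as len. A minimal sketch, assuming the list e from the previous section after e[2] was set to 100:

```python
e = [1, 2, 100, 4, 10, 23]  # the list from the previous section, after e[2] = 100

largest = max(e)
smallest = min(e)
ordered = sorted(e)  # returns a new sorted list; e itself is unchanged
print(largest, smallest, ordered)

e.append(7000)  # append() adds an element to the end of the list
print(len(e), e[-1])
```

Note that sorted returns a new list, so the original order of e is preserved until you change it yourself.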


### Python Loops
Like any programming language, Python can implement loops. It provides two loop statements, for and while; there is no do-while. Take note that loops in pure Python are relatively slow because the language is interpreted and dynamically typed. For large or long mathematical loops, it is recommended that you use libraries like NumPy or SciPy.

In [26]:
for x in f:
    print(x)

this
is
list
of
strings


In [27]:
for x in range(10): # range works in both Python 2 and 3; xrange exists only in Python 2
    print(x)

0
1
2
3
4
5
6
7
8
9
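
The cells above only show for loops. The sketch below illustrates a while loop, plus the usual way to get "run the body at least once" behaviour, since Python itself has no do-while statement. The variable names and numbers are made up for illustration.

```python
# A while loop repeats as long as its condition is true.
count = 0
while count < 3:
    print(count)
    count += 1

# A common substitute for do-while is "while True" with a break,
# so the body always runs at least once.
halvings = 0
value = 100.0
while True:
    value = value / 2
    halvings += 1
    if value < 10:
        break
print(halvings, value)
```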


### Controlling the Loops: If-Elif-Else and Conditional Statements
Like any programming language, Python allows conditional statements, which run code only when certain criteria are met. Let's say you want to print the numbers greater than 2.5 in the following list.

In [28]:
k = [2, 5, 3.1, 0, -2, "test"]
for j in k:
    if(j > 2.5):
        print(j)
    #elif(j == 0):
    #    break
    else:
        print("bazzingga!")

bazzingga!
5
3.1
bazzingga!
bazzingga!
test


As you can see, Python 2 treats strings as greater than numeric types such as integers and floats. In Python 3 this comparison is not allowed: comparing "test" with 2.5 raises a TypeError.
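
One way to make the filter behave the same on Python 2 and Python 3 is to skip non-numeric items before comparing. This is a sketch rather than the workshop's own solution; `above_threshold` is a name introduced here.

```python
k = [2, 5, 3.1, 0, -2, "test"]

above_threshold = []
for j in k:
    # Skip anything that is not a number so the comparison is well defined.
    if not isinstance(j, (int, float)):
        continue
    if j > 2.5:
        above_threshold.append(j)

print(above_threshold)
```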

## Python for Geospatial Applications
Python is currently used as a scripting language to automate tasks on GIS platforms such as QGIS, GRASS GIS, and ArcGIS. It is also used to build web-based Geographic Information Systems (WebGIS), which are becoming more popular now that most people are constantly on their phones and computers. So why use Python instead of a desktop GIS? We use Python to manipulate geospatial data because we want a workflow that is independent of any GIS software or vendor.


### Handling Geospatial Data
Geospatial data comes in two main kinds: raster and vector. Vector datasets are represented as points, lines, and polygons. They are useful for storing data with discrete boundaries, such as land parcels, roads, buildings, and points of interest. Raster datasets are generally represented by a rectangular grid of pixels and focus on modelling continuous phenomena and images of the earth. Typical raster data includes satellite images, digital elevation models (DEM), and digital surface models (DSM).

![Raster vs Vector](images/raster_vector.gif)
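
The difference can be sketched in plain Python, without any GIS library. Below, vector data is a list of coordinate pairs, while raster data is a grid of values plus an origin and a cell size that place the grid on the earth. The grid values, origin, and cell size are invented for illustration, and `cell_value` is a hypothetical helper, not part of any library.

```python
# Vector: each feature is a geometry described by coordinates (lon, lat).
sites = [(121.156, 19.392), (119.996, 18.936)]

# Raster: a grid of cell values plus the information needed to place it on
# the earth. All numbers here are made up for illustration.
grid = [
    [10, 12, 15],
    [11, 14, 18],
    [9, 13, 20],
]
origin_lon, origin_lat = 119.0, 20.0  # top-left corner of the grid
cell_size = 1.0                       # width and height of a cell, in degrees


def cell_value(lon, lat):
    """Return the grid value under a coordinate, or None if it is outside."""
    col = int((lon - origin_lon) // cell_size)
    row = int((origin_lat - lat) // cell_size)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


for lon, lat in sites:
    print((lon, lat), cell_value(lon, lat))
```

Libraries such as Rasterio handle this coordinate-to-cell bookkeeping for real datasets; the sketch only shows the idea.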

### Vector x Python
Python has several libraries for manipulating and analyzing vector data. Well-known ones include OGR, Shapely, and Fiona. Some common vector file extensions or formats are:
 - shp (ESRI Shapefile)
 - kmz and kml (Keyhole Markup Language)
 - csv (Comma Separated Values)
 - geojson (geospatial javascript object notation)
 - dxf
 - xyz

### Situation 1: Common Data into Geospatial Data (CSV to SHP)
Much geospatial data comes from spreadsheets, text files, or even PDFs, so we have to convert it before we can make sense of it.

In this example we will turn comma-separated values (CSV) into geospatial data. Suppose we have a CSV file containing attributes like name, description, latitude, and longitude; we want to turn this CSV into a shapefile so we can load it into QGIS. The code is adapted from a tutorial by Tom MacWright, which can be viewed [here.](http://www.macwright.org/2012/10/31/gis-with-python-shapely-fiona.html)

In [11]:
import csv # csv is part of Python's standard library
with open('data/top10sites.csv', 'rb') as csvfile: # 'rb' suits Python 2; in Python 3 open the file in text mode ('r') instead
    reader = csv.DictReader(csvfile) # this converts each row into a python dictionary - similar to a list, but you access data with keys instead of numerical indices.
    for row in reader:
        print(row)
        #print(row["Place"]) # you can access data using a certain key

{'Max Value': '1327512.36221', 'Latitude': '19.392', 'Place': 'Katanapan Point-Cagayan Valley', 'id': '1', 'Longitude': '121.156'}
{'Max Value': '1262860.15324', 'Latitude': '19.4', 'Place': 'Katanapan Point-Cagayan Valley', 'id': '2', 'Longitude': '121.156'}
{'Max Value': '1253266.53637', 'Latitude': '19.352', 'Place': 'Cabudadan-Cagayan Valley', 'id': '3', 'Longitude': '121.444'}
{'Max Value': '1244616.1615', 'Latitude': '18.936', 'Place': 'Cape Bojeador-Ilocos', 'id': '4', 'Longitude': '119.996'}
{'Max Value': '1223701.93723', 'Latitude': '18.936', 'Place': 'Cape Bojeador-Ilocos', 'id': '5', 'Longitude': '119.988'}
{'Max Value': '1197975.16676', 'Latitude': '19.408', 'Place': 'Katanapan Point-Cagayan Valley', 'id': '6', 'Longitude': '121.148'}
{'Max Value': '1186722.44419', 'Latitude': '19.344', 'Place': 'Cabudadan-Cagayan Valley', 'id': '7', 'Longitude': '121.444'}
{'Max Value': '1179809.84815', 'Latitude': '19.4', 'Place': 'Katanapan Point-Cagayan Valley', 'id': '8', 'Longitude': 
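
The sketch below shows how DictReader rows are accessed by key, written for Python 3. It uses io.StringIO as a stand-in for a file on disk, with two rows copied from the output above. The column order and the omission of the Max Value column are assumptions made to keep the example short.

```python
import csv
import io

# Two rows copied from the workshop CSV output; io.StringIO stands in for a
# file on disk so the example is self-contained.
csv_text = """id,Place,Latitude,Longitude
3,Cabudadan-Cagayan Valley,19.352,121.444
4,Cape Bojeador-Ilocos,18.936,119.996
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

for row in rows:
    # Every value arrives as a string, so convert before doing maths.
    lat = float(row["Latitude"])
    lon = float(row["Longitude"])
    print(row["Place"], lat, lon)
```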

In [1]:
# Now that we can read a CSV, let us convert the latitude and longitude into actual point geometries
import csv
from shapely.geometry import Point

with open('data/top10sites.csv', 'rb') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        point = Point(float(row['Longitude']), float(row['Latitude']))
        print(point)

POINT (121.156 19.392)
POINT (121.156 19.4)
POINT (121.444 19.352)
POINT (119.996 18.936)
POINT (119.988 18.936)
POINT (121.148 19.408)
POINT (121.444 19.344)
POINT (121.148 19.4)
POINT (121.18 19.384)
POINT (119.964 18.944)


In [3]:
# Once each latitude and longitude has been converted into a geometry, we can write the geometries to a shapefile.
import csv
from shapely.geometry import Point, mapping
from fiona import collection

# We are now defining the structure of the shapefile - remember that a shapefile holds one geometry type: point, line, or polygon.
schema = { 'geometry': 'Point', 'properties': { 'place': 'str' } }
with collection(
    "output/top10sites.shp", "w", "ESRI Shapefile", schema) as output:
    with open('data/top10sites.csv', 'rb') as f:
        reader = csv.DictReader(f)
        for row in reader:
            point = Point(float(row['Longitude']), float(row['Latitude']))
            # we are now writing the shapefile
            output.write({
                'properties': {
                    'place': row['Place']
                },
                'geometry': mapping(point)
            })
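
Each record written above is a geometry mapping plus a properties dictionary, which is the same shape as a GeoJSON feature. As a variation that needs only the standard library, the sketch below builds a GeoJSON FeatureCollection from rows like those DictReader yields. This is not the workshop's shapefile workflow; `row_to_feature` is a hypothetical helper, and the two rows are copied from the output above.

```python
import json

# Rows as csv.DictReader would yield them (values copied from the output above).
rows = [
    {"Place": "Cabudadan-Cagayan Valley", "Latitude": "19.352", "Longitude": "121.444"},
    {"Place": "Cape Bojeador-Ilocos", "Latitude": "18.936", "Longitude": "119.996"},
]


def row_to_feature(row):
    """Build a GeoJSON Point feature; GeoJSON orders coordinates as [lon, lat]."""
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            "coordinates": [float(row["Longitude"]), float(row["Latitude"])],
        },
        "properties": {"place": row["Place"]},
    }


feature_collection = {
    "type": "FeatureCollection",
    "features": [row_to_feature(row) for row in rows],
}

geojson_text = json.dumps(feature_collection, indent=2)
print(geojson_text)
```

Writing geojson_text to a file with a .geojson extension gives a file that QGIS can load, much like the shapefile.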

### Situation 2: Reading SHP Files
Most GIS vector data is stored as Esri shapefiles (SHP). A shapefile is actually a group of files that together contain the projection, attributes, and geometry of the objects they represent. In this scenario, we will read the shapefile that we produced in the previous situation, using both of the libraries from the previous example.

In [3]:
from fiona import collection
from shapely.geometry import Point, shape
with collection("output/top10sites.shp", "r") as input:
    for point in input:
        print(shape(point['geometry']))

POINT (121.156 19.392)
POINT (121.156 19.4)
POINT (121.444 19.352)
POINT (119.996 18.936)
POINT (119.988 18.936)
POINT (121.148 19.408)
POINT (121.444 19.344)
POINT (121.148 19.4)
POINT (121.18 19.384)
POINT (119.964 18.944)


You will notice that the output of the script is the same as the result in situation 1 when converting the latitude and longitude into point geometries.
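
Because a shapefile is a group of files, it can help to check which members exist next to the .shp. The sketch below lists the commonly used extensions. `SHAPEFILE_PARTS` and `sidecar_paths` are names introduced here for illustration, and the .prj file is optional, so it may be missing when no projection was supplied at write time.

```python
import os

# Common members of a shapefile "family" and what each one stores.
SHAPEFILE_PARTS = {
    ".shp": "geometry",
    ".shx": "geometry index",
    ".dbf": "attribute table",
    ".prj": "projection (coordinate reference system)",
}


def sidecar_paths(shp_path):
    """Return the expected companion file paths for a .shp path."""
    base, _ = os.path.splitext(shp_path)
    return {ext: base + ext for ext in SHAPEFILE_PARTS}


for ext, path in sorted(sidecar_paths("output/top10sites.shp").items()):
    status = "found" if os.path.exists(path) else "missing"
    print(ext, path, SHAPEFILE_PARTS[ext], status)
```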

### Challenge 1: (Center of the View)
Every shapefile can be encapsulated by a bounding box. Your job is to determine the center of the shapefile using its bounding box. We will need this in the next section. 

**Hint** How do we determine the center of a square?

In [4]:
lng = []
lat = []
with collection("output/top10sites.shp", "r") as input:
    for point in input:
        lng.append(point['geometry']['coordinates'][0])
        lat.append(point['geometry']['coordinates'][1])

centerlng = (max(lng) + min(lng)) * 0.5
centerlat = (max(lat) + min(lat)) * 0.5

center = [centerlat, centerlng]
print(center)

[19.172, 120.70400000000001]
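
The same calculation can be wrapped in reusable functions. This sketch works on plain (lon, lat) pairs taken from the point output printed earlier, so it runs without the shapefile. `bounding_box` and `bbox_center` are hypothetical helper names.

```python
# Coordinates (lon, lat) taken from the point output printed earlier.
points = [
    (121.156, 19.392), (121.156, 19.4), (121.444, 19.352), (119.996, 18.936),
    (119.988, 18.936), (121.148, 19.408), (121.444, 19.344), (121.148, 19.4),
    (121.18, 19.384), (119.964, 18.944),
]


def bounding_box(coords):
    """Return (min_lon, min_lat, max_lon, max_lat) for (lon, lat) pairs."""
    lons = [lon for lon, _ in coords]
    lats = [lat for _, lat in coords]
    return min(lons), min(lats), max(lons), max(lats)


def bbox_center(bbox):
    """Return the center as [lat, lon], the order used for the map later on."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return [(min_lat + max_lat) / 2.0, (min_lon + max_lon) / 2.0]


bbox = bounding_box(points)
print(bbox)
print(bbox_center(bbox))
```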


### Situation 3: Visualizing Vector Data
There are many visualization libraries in Python, such as Matplotlib, Bokeh, and Seaborn. In this example, we will use Folium, which helps us visualize geospatial data. Folium uses Leaflet.js as its mapping platform, which simplifies our visualization workflow.

**Creating a Basemap**

In [19]:
import folium
map_osm = folium.Map(location=center, zoom_start=5) # adding a basemap
map_osm


**Add the vector on the map**

In [18]:
from fiona import collection
from shapely.geometry import Point, shape
with collection("output/top10sites.shp", "r") as input:
    for point in input:
        folium.Marker([point['geometry']['coordinates'][1], point['geometry']['coordinates'][0]], popup=point['properties']['place']).add_to(map_osm)
map_osm
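
Note the index swap in the marker call above: geometries store coordinates as (longitude, latitude), while the map location is given as [latitude, longitude]. The sketch below isolates that step in a small function so it is harder to get wrong. `marker_args` is a hypothetical helper, and the sample feature copies the first record from the output above.

```python
# A feature shaped like the records read from the shapefile
# (values from the first row of the output above).
feature = {
    "geometry": {"type": "Point", "coordinates": (121.156, 19.392)},
    "properties": {"place": "Katanapan Point-Cagayan Valley"},
}


def marker_args(feature):
    """Return ([lat, lon], popup_text) for a Point feature.

    Geometries store coordinates as (lon, lat); Leaflet-based maps
    take locations as [lat, lon], so the pair is swapped here.
    """
    lon, lat = feature["geometry"]["coordinates"][:2]
    return [lat, lon], feature["properties"]["place"]


location, popup = marker_args(feature)
print(location, popup)
```

Inside the loop above, the result could then be passed on as folium.Marker(location, popup=popup).add_to(map_osm).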