# Assignment 1: Advanced Geo Data Processing

## 1. Object-oriented programming

In [7]:
import numpy as np
from osgeo import ogr  # GDAL's Python bindings live in the osgeo package

1.1 Write a class called "Polygon" which represents a polygon geometry. It should have a method called "envelope()" which calculates the bounding box of the polygon.

In [135]:
# Closed ring of a unit square: the first and last vertex are identical
test_coordinates = [[1,1],[1,2],[2,2],[2,1],[1,1]]

In [136]:
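class Polygon():
    """Polygon geometry that stores its coordinates in a NumPy array."""

    def __init__(self, coordinates):
        # coordinates: sequence of [x, y] pairs
        self.coordinates = np.array(coordinates)

    def envelope(self):
        """Return the bounding box as [minx, miny, maxx, maxy]."""
        xcoords = self.coordinates[:,0]
        ycoords = self.coordinates[:,1]

        minx = xcoords.min()
        maxx = xcoords.max()
        miny = ycoords.min()
        maxy = ycoords.max()

        return [minx, miny, maxx, maxy]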

In [137]:
class Polygon_list():
    """Polygon geometry that only uses built-in Python data types."""

    geom_type = "Polygon"

    def __init__(self, coordinates):
        # coordinates: list of [x, y] pairs
        self.coordinates = coordinates

    def envelope(self):
        """Return the bounding box as [minx, miny, maxx, maxy]."""
        xcoords = [x for x, y in self.coordinates]
        ycoords = [y for x, y in self.coordinates]

        minx = min(xcoords)
        maxx = max(xcoords)
        miny = min(ycoords)
        maxy = max(ycoords)

        return [minx, miny, maxx, maxy]

... using your python class

In [174]:
%%timeit
poly_py = Polygon(test_coordinates)
poly_py_env = poly_py.envelope()

12.2 µs ± 406 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [140]:
print("minX: %d, minY: %d, maxX: %d, maxY: %d" %(poly_py_env[0],poly_py_env[1],poly_py_env[2],poly_py_env[3]))

minX: 1, minY: 1, maxX: 2, maxY: 2


... using only python data types 

In [175]:
%%timeit
poly_py_list = Polygon_list(test_coordinates)
poly_py_env = poly_py_list.envelope()

2.15 µs ± 57 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [146]:
poly_py_env = poly_py_list.envelope()
print("minX: {0}, minY: {1}, maxX: {2}, maxY: {3}".format(*poly_py_env))

minX: 1, minY: 1, maxX: 2, maxY: 2


### Use OGR to do the same 

... using ogr

In [102]:
# Create ring
ring = ogr.Geometry(ogr.wkbLinearRing)
for c in test_coordinates:
    ring.AddPoint(c[0], c[1])
# Create polygon
poly_ogr = ogr.Geometry(ogr.wkbPolygon)
poly_ogr.AddGeometry(ring)

0

In [112]:
poly_ogr_env = poly_ogr.GetEnvelope()
print("minX: {0}, minY: {2}, maxX: {1}, maxY: {3}".format(*poly_ogr_env))

minX: 1.0, minY: 1.0, maxX: 2.0, maxY: 2.0
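Note the index order in the format string above: `GetEnvelope()` returns its values as (minX, maxX, minY, maxY), while the `envelope()` methods written earlier return [minx, miny, maxx, maxy]. A minimal sketch of a helper that converts between the two orders could look like this. The function name `ogr_envelope_to_bounds` and the sample tuple are made up for illustration and are not part of OGR.

```python
def ogr_envelope_to_bounds(envelope):
    """Reorder (minX, maxX, minY, maxY) into [minx, miny, maxx, maxy]."""
    minx, maxx, miny, maxy = envelope
    return [minx, miny, maxx, maxy]


# Hypothetical envelope in OGR's order: x runs from 1 to 3, y runs from 5 to 8
ogr_style = (1.0, 3.0, 5.0, 8.0)
bounds = ogr_envelope_to_bounds(ogr_style)
print("minX: {}, minY: {}, maxX: {}, maxY: {}".format(*bounds))
```

With such a helper the results of all implementations can be printed with the same format string.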


In [173]:
%%timeit
# Create ring
ring = ogr.Geometry(ogr.wkbLinearRing)
for c in test_coordinates:
    ring.AddPoint(c[0], c[1])
# Create polygon
poly_ogr = ogr.Geometry(ogr.wkbPolygon)
poly_ogr.AddGeometry(ring)
poly_ogr_env = poly_ogr.GetEnvelope()

15.4 µs ± 486 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


### Use shapely to do the same

In [150]:
import shapely.geometry  # import the submodule explicitly so shapely.geometry.Polygon is available

In [176]:
%%timeit
poly_shapely = shapely.geometry.Polygon(test_coordinates)
poly_shapely_env = poly_shapely.envelope

17.5 µs ± 90.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [177]:
%%timeit
poly_shapely = shapely.geometry.Polygon(test_coordinates)
poly_shapely_env = poly_shapely.bounds

44.3 µs ± 698 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [149]:
print("minX: {}, minY: {}, maxX: {}, maxY: {}".format(*poly_shapely.bounds))

minX: 1.0, minY: 1.0, maxX: 2.0, maxY: 2.0


### __Discuss in groups:__ 
Why is the ogr implementation faster than the pure Python implementations? 

__Hint:__ Watch the first lecture video of "Getting Started with Python I" in the Geoscripting Course (starting at 14:45 min)

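Result:
* Interpreted languages are generally slower than compiled languages
* ogr is a SWIG-generated wrapper around the compiled GDAL/OGR library, so the geometry work runs in C/C++. shapely also relies on a compiled library (GEOS), but a larger part of its wrapper is written in Python
* However, ogr is less Pythonic
* Keep the overhead of creating Python or C objects in mind: for the small test polygon above, the class that uses only Python lists (2.15 µs) is faster than both ogr (15.4 µs) and shapely (17.5 µs). This might be important when you optimize code!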
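To get a feeling for the object creation overhead mentioned in the last point, the following sketch times the same envelope calculation once as a plain function call and once wrapped in a class that is instantiated on every call. It only uses the standard library. The names `coords`, `envelope` and `PolygonSketch` are chosen for this illustration, and the measured times will differ from machine to machine.

```python
import timeit

# Hypothetical test data: closed ring of a unit square
coords = [[1, 1], [1, 2], [2, 2], [2, 1], [1, 1]]


def envelope(coordinates):
    """Return the bounding box as [minx, miny, maxx, maxy]."""
    xs = [x for x, _ in coordinates]
    ys = [y for _, y in coordinates]
    return [min(xs), min(ys), max(xs), max(ys)]


class PolygonSketch:
    """Thin wrapper, so the timing difference is mostly object creation."""

    def __init__(self, coordinates):
        self.coordinates = coordinates

    def envelope(self):
        return envelope(self.coordinates)


n = 20000
t_func = timeit.timeit(lambda: envelope(coords), number=n)
t_class = timeit.timeit(lambda: PolygonSketch(coords).envelope(), number=n)

print("function only:    {:.2f} µs per call".format(t_func / n * 1e6))
print("create + envelope: {:.2f} µs per call".format(t_class / n * 1e6))
```

The same pattern can be used to time the ring and polygon creation in ogr separately from the `GetEnvelope()` call.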

### Bonus

* Add `__iter__` and `__next__` methods: use a generator to yield the coordinates
* Create a point class and a geometry class from which point and polygon inherit
* Write an abstract base class
* Write a decorator function for the print coordinates function
* Add a class attribute "Polygon"
* Add a `__str__` method
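One possible way to combine the bonus tasks is sketched below. It is not a reference solution. The names `Geometry`, `announce` and `print_coordinates` are assumptions made for this example, and the class attribute is called `geom_type` as in `Polygon_list` above. Because `__iter__` is written as a generator, the returned generator object already provides `__next__`, so no separate `__next__` method is defined here.

```python
from abc import ABC, abstractmethod
from functools import wraps


def announce(func):
    """Decorator: print a header line before calling the wrapped function."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        print("--- {} ---".format(func.__name__))
        return func(*args, **kwargs)
    return wrapper


class Geometry(ABC):
    """Abstract base class: every subclass has to implement envelope()."""

    geom_type = "Geometry"  # class attribute, overridden by the subclasses

    def __init__(self, coordinates):
        self.coordinates = [tuple(c) for c in coordinates]

    @abstractmethod
    def envelope(self):
        """Return the bounding box as [minx, miny, maxx, maxy]."""

    def __iter__(self):
        # Generator: yields one (x, y) tuple at a time
        for coordinate in self.coordinates:
            yield coordinate

    def __str__(self):
        pairs = ", ".join("{} {}".format(x, y) for x, y in self)
        return "{}({})".format(self.geom_type, pairs)

    @announce
    def print_coordinates(self):
        for x, y in self:
            print(x, y)


class Point(Geometry):
    geom_type = "Point"

    def __init__(self, x, y):
        super().__init__([(x, y)])

    def envelope(self):
        x, y = self.coordinates[0]
        return [x, y, x, y]


class Polygon(Geometry):
    geom_type = "Polygon"

    def envelope(self):
        xs = [x for x, _ in self]
        ys = [y for _, y in self]
        return [min(xs), min(ys), max(xs), max(ys)]


poly = Polygon([[1, 1], [1, 2], [2, 2], [2, 1], [1, 1]])
print(poly)               # uses __str__
print(poly.envelope())
poly.print_coordinates()  # decorated method, iterates via __iter__
```

Trying to create a `Geometry` directly raises a `TypeError`, because `envelope()` is abstract.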

## 2.2 More "Pythonic" modules for vector and raster data processing

Why do we need more Pythonic code? For prototyping, you should be able to write code fast and optimize only later, if necessary. Don't step into the trap of premature optimization.

Is this really necessary? You will often read something like this:

![gdal_vs_rasterio](img/gdal_vs_rasterio.png)

[Source: GitHub Issue - rasterio vs python gdal #11](https://github.com/inbo/niche_vlaanderen/issues/11)

What is geospatial data abstraction? See https://rasterio.readthedocs.io/en/latest/intro.html

### Vector data: Fiona + shapely + geopandas vs OGR

### Raster data: rasterio vs GDAL

### References

#### Object Oriented Programming 
* [Python Tutorial](https://docs.python.org/3/tutorial/classes.html)
* [Abstract Classes](https://docs.python.org/3/library/abc.html#module-abc)
* https://www.toptal.com/python/computational-geometry-in-python-from-theory-to-implementation

#### Rasterio 
[Switching from GDAL to Rasterio](https://rasterio.readthedocs.io/en/latest/topics/switch.html)