## Section 6 - Arctic Regions Geospatial Wrangling


In [67]:
import os
import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd

In [4]:
fp = os.path.join("data", "arctic_communities.geojson")
df = gpd.read_file(fp)

In [5]:
df.head()

| | admin | country | n_communities | geometry |
|---|---|---|---|---|
| 0 | United States of America | US | 115 | MULTIPOLYGON (((-132.74687 56.52568, -132.7576... |
| 1 | United Kingdom | GB | 96 | MULTIPOLYGON (((-2.66768 51.62300, -2.74214 51... |
| 2 | Sweden | SE | 133 | MULTIPOLYGON (((19.07646 57.83594, 18.99375 57... |
| 3 | Russia | RU | 774 | MULTIPOLYGON (((145.88154 43.45952, 145.89561 ... |
| 4 | Norway | NO | 48 | MULTIPOLYGON (((20.62217 69.03687, 20.49199 69... |


## Brainstorm

#### a)
- Separate Alaska and continental US
- Remove continental US
- Reproject
- Plot

#### b)
- How do we separate Alaska from the continental US without downloading external data?


## 2. Check geometry types

In [14]:
# Check geometry types
df.geom_type

admin
United States of America    MultiPolygon
United Kingdom              MultiPolygon
Sweden                      MultiPolygon
Russia                      MultiPolygon
Norway                      MultiPolygon
Lithuania                   MultiPolygon
Latvia                           Polygon
Iceland                          Polygon
Finland                     MultiPolygon
Estonia                     MultiPolygon
Greenland                   MultiPolygon
Faroe Islands               MultiPolygon
Denmark                     MultiPolygon
Canada                      MultiPolygon
Belarus                          Polygon
dtype: object

The countries do not all share one geometry type: most are `MultiPolygon` and a few are `Polygon`. We suspect that countries with islands (or other separate landmasses) are stored as `MultiPolygon`.
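One way to probe this suspicion is to count how many polygons make up each geometry. The following is a minimal sketch on made-up toy geometries rather than the Arctic data; the helper name `count_parts` and the `toy` geodataframe are hypothetical and only serve as illustration.

```python
import geopandas as gpd
from shapely.geometry import MultiPolygon, Polygon

# Toy data (hypothetical): one single-part and one two-part geometry
square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
island = Polygon([(2, 2), (3, 2), (3, 3), (2, 3)])
toy = gpd.GeoDataFrame(
    {"admin": ["Mainland only", "Mainland plus island"]},
    geometry=[square, MultiPolygon([square, island])],
)

def count_parts(geom):
    """Return the number of polygons that make up a geometry."""
    # A MultiPolygon exposes its members through .geoms; a Polygon is one part
    return len(geom.geoms) if geom.geom_type == "MultiPolygon" else 1

n_parts = toy.geometry.apply(count_parts)
print(n_parts.tolist())
```

Applied to `df.geometry`, a count above 1 would mark the countries stored as several separate polygons.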

In [13]:
# Set the country name (admin) as the index
df = df.set_index("admin")

Create an `if-else` statement that

- prints “Multiple feature types:” followed by the unique geometry types (no repetition) if the features in the geodataframe do not all share the same geometry type, and
- prints “All features are:” followed by the unique geometry type if all the features in the geodataframe have the same geometry type.

In [23]:
# More than one unique geometry type means the features are mixed
if df.geom_type.unique().size > 1:
    print(f"Multiple feature types: {df.geom_type.unique()}")
else:
    print(f"All features are: {df.geom_type.unique()}")

Multiple feature types: ['MultiPolygon' 'Polygon']


In [26]:
def check_polygons(gdf):
    """Print whether the geodataframe holds one or several geometry types."""
    geom_types = gdf.geom_type.unique()
    if geom_types.size > 1:
        print(f"Multiple feature types: {geom_types}")
    else:
        print(f"All features are: {geom_types}")

In [27]:
check_polygons(df)

Multiple feature types: ['MultiPolygon' 'Polygon']


## 3. Explode the polygons (for real)

In [31]:
# Split every MultiPolygon into one row per Polygon, then move admin back into a column
df = df.explode(index_parts=False).reset_index()

In [32]:
# Check if it worked
check_polygons(df)

All features are: ['Polygon']
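To make the effect of `explode` concrete, here is a small sketch on toy geometries (the `toy` geodataframe and its labels are made up for illustration). Each `MultiPolygon` row becomes one row per polygon, and the other columns are repeated for every part.

```python
import geopandas as gpd
from shapely.geometry import MultiPolygon, Polygon

# Toy data (hypothetical): a two-part geometry and a single-part geometry
a = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
b = Polygon([(2, 2), (3, 2), (3, 3), (2, 3)])
toy = gpd.GeoDataFrame(
    {"admin": ["Two parts", "One part"]},
    geometry=[MultiPolygon([a, b]), a],
).set_index("admin")

# index_parts=False keeps the original index label on every exploded row;
# reset_index() then turns that label back into a regular column
exploded = toy.explode(index_parts=False).reset_index()

print(len(toy), "->", len(exploded))
print(exploded["admin"].tolist())
print(exploded.geom_type.unique())
```

The row count grows from 2 to 3 here because the two-part geometry is split into two rows that both carry the label "Two parts".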


## 4. Compute minimum y-coordinate for polygons

At this point, every row in your `df` should be a single polygon.

Select the first row of `df` using `iloc`. What kind of Python object is this?

Select the geometry of the first row of `df`. What kind of Python object is this?

Use the `bounds` attribute for shapely Polygons to select the southern-most bound of the first polygon in `df`.

Create a function `min_y` that receives a single row of a geodataframe as its parameter and returns the minimum y-coordinate of its bounding box.

Use the `min_y` function and the `apply` method for data frames to create a new column `miny` in `df` which has the minimum y-coordinate.

In [41]:
# Check data type of the first row of the df
type(df.iloc[0])

pandas.core.series.Series

In [54]:
# Check the type of the geometry in the first row
type(df.geometry.iloc[0])

shapely.geometry.polygon.Polygon

In [57]:
# Return the southernmost bound (minimum y) of the first geometry
df.geometry.iloc[0].bounds[1]

56.511035156249996


The full `bounds` tuple of the first geometry is ordered (min x, min y, max x, max y):
(-132.948046875, 56.511035156249996, -132.56796875, 56.794775390625)
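As a quick illustration of that ordering, the sketch below builds a made-up triangle (not taken from the data) and unpacks its `bounds`. With coordinates in longitude and latitude, the second element is the southernmost latitude.

```python
from shapely.geometry import Polygon

# Hypothetical triangle; each vertex is (x, y) = (longitude, latitude)
tri = Polygon([(-150.0, 60.0), (-140.0, 70.0), (-145.0, 55.0)])

# bounds is a plain tuple ordered (minx, miny, maxx, maxy)
minx, miny, maxx, maxy = tri.bounds

print(tri.bounds)
print(miny)  # same value as tri.bounds[1]
```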

In [110]:
# Create a function that returns the minimum y value (southernmost bound) of a row's geometry
def min_y(row):
    return row.geometry.bounds[1]

In [112]:
# Southernmost bound of the polygon in the fourth row (position 3)
min_y(df.iloc[3])


60.312646484374994
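
The last task in the list, filling a `miny` column with `apply`, is not shown in the cells above. A minimal sketch of that step, assuming a row-wise helper (named `row_min_y` here, a hypothetical name) and two made-up boxes standing in for the exploded polygons:

```python
import geopandas as gpd
from shapely.geometry import box

# Toy data (hypothetical): box(minx, miny, maxx, maxy) builds a rectangle
toy = gpd.GeoDataFrame(
    {"admin": ["South part", "North part"]},
    geometry=[box(-125, 25, -67, 49), box(-170, 52, -130, 71)],
)

def row_min_y(row):
    """Return the minimum y-coordinate of the bounding box of a row's geometry."""
    return row.geometry.bounds[1]

# axis=1 passes one row at a time to the function
toy["miny"] = toy.apply(row_min_y, axis=1)

print(toy[["admin", "miny"]])
```

The same pattern on `df` would add a `miny` column that can later be compared against a latitude threshold. GeoPandas also offers a vectorized alternative, `toy.bounds["miny"]`, which avoids the row-by-row function call.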