<img src="https://i.imgur.com/6U6q5jQ.png"/>

In [None]:
%reset
# starting fresh

# Geometries

The geodataframe (GDF) is a dataframe (DF) where every row represents a geometry (point, line, polygon). To work with these structures in Python you need to install the **GEOPANDAS** library (check whether you have it using _pip show geopandas_).

In the repository for this class you will see a folder named **maps**, with files I previously downloaded from this [website](https://www.efrainmaps.es/english-version/free-downloads/world/). There are three maps: *countries*, *cities*, and *rivers* of the world.

Visit the [repository](https://github.com/PythonVersusR/DataStructures_spatial/tree/main) and you may see something like this:

<img src="https://github.com/PythonVersusR/DataStructures_spatial/blob/main/pics/repo_Git.jpg?raw=true">

When you go inside the _maps_ folder you will see this:

<img title="a title" alt="Alt text" src="https://github.com/PythonVersusR/DataStructures_spatial/blob/main/pics/repo_Git_mapFolder.jpg?raw=true">

You see:

1.  A folder with files.
2.  Some *.json* files.
3.  Some *.zip* files. Each one is a zipped (compressed) version of the files inside the folder (the files only, not the folder itself).

Now, take a look at the **World_Countries** folder:

<img src="https://github.com/PythonVersusR/DataStructures_spatial/blob/main/pics/repo_Git_mapFolder_shapes.jpg?raw=true">

There, you see that this **one map** requires **several files**. That is the nature of the shapefile.
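As a rough illustration of what "several files" means, the sketch below lists the companion files usually expected next to a `.shp` file. The helper `shapefile_parts` is hypothetical (it is not part of geopandas), and the exact contents of the **World_Countries** folder are an assumption here: `.shp`, `.shx` and `.dbf` are the mandatory parts of the format, while `.prj` is optional but very common.

```python
from pathlib import Path

# .shp (geometries), .shx (index) and .dbf (attribute table) are mandatory;
# .prj (CRS definition) is optional but usually present.
SHAPEFILE_PARTS = (".shp", ".shx", ".dbf", ".prj")

def shapefile_parts(shp_path):
    """Return the companion file paths expected next to a .shp file (hypothetical helper)."""
    base = Path(shp_path)
    return [base.with_suffix(ext) for ext in SHAPEFILE_PARTS]

parts = shapefile_parts("maps/World_Countries/World_Countries.shp")
print([p.name for p in parts])
```

This is why a shapefile is often shared as a folder or as a single *.zip* file: the parts have to travel together.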

Let me get the _raw_ link to each map from GitHub:

In [None]:
linkGit_shape="https://github.com/PythonVersusR/DataStructures_spatial/raw/main/maps/World_Countries/World_Countries.shp"
linkGit_json="https://github.com/PythonVersusR/DataStructures_spatial/raw/main/maps/World_Countries.json"
linkGit_zip="https://github.com/PythonVersusR/DataStructures_spatial/raw/main/maps/World_Countries.zip"

Let's read these files with the help of **geopandas**:

In [None]:
import geopandas as gpd

countriesShape=gpd.read_file(linkGit_shape)
countriesJson=gpd.read_file(linkGit_json)
countriesZip=gpd.read_file(linkGit_zip)

Let's see what we have:

In [None]:
type(countriesShape),type(countriesJson),type(countriesZip)

Some more info:

In [None]:
countriesShape.info(),countriesJson.info(),countriesZip.info()

Notice that all three GeoDataFrames have a _"geometry"_ column.
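To see what that column holds without downloading anything, here is a minimal sketch that builds a tiny GeoDataFrame in memory. The object `toy` and its values are made up for illustration, and the CRS (EPSG:4326) is an assumption.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point, LineString, Polygon

# One row per geometry: a point, a line and a polygon (toy data).
toy = gpd.GeoDataFrame(
    {"name": ["a point", "a line", "a polygon"]},
    geometry=[Point(0, 0),
              LineString([(0, 0), (1, 1)]),
              Polygon([(0, 0), (1, 0), (1, 1)])],
    crs="EPSG:4326",  # assumed CRS for the toy data
)

# A GeoDataFrame is still a pandas DataFrame, plus an active geometry column.
is_dataframe = isinstance(toy, pd.DataFrame)
geometry_types = toy.geom_type.tolist()
print(is_dataframe, toy.geometry.name, geometry_types)
```

The regular columns behave as in pandas; the geometry column is what the spatial methods operate on.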

Let me work with the **json** files for the other maps we have:

In [None]:
citiesLinkGit="https://github.com/PythonVersusR/DataStructures_spatial/raw/main/maps/World_Cities.json"
riversLinkGit="https://github.com/PythonVersusR/DataStructures_spatial/raw/main/maps/World_Hydrography.json"

citiesJson=gpd.read_file(citiesLinkGit)
riversJson=gpd.read_file(riversLinkGit)

We have three different maps:

In [None]:
countriesJson.info(),citiesJson.info(),riversJson.info()

Let's look for more details:

In [None]:
countriesJson.head()

In [None]:
citiesJson.head()

In [None]:
riversJson.head()

Now you see each file stores different geometries:

In [None]:
riversJson.geom_type.value_counts()

In [None]:
citiesJson.geom_type.value_counts()

In [None]:
countriesJson.geom_type.value_counts()

Let's see the maps:

In [None]:
countriesJson.plot() #thickness of lines

In [None]:
riversJson.plot()

In [None]:
citiesJson.plot()

## Map Projection

The coordinate reference system (CRS), which includes the map projection, is a very important property of a map. It affects several aspects:

* shape
* area
* scale
* direction

If you plan on doing some computations with several maps, you should verify that all have the same projection (**CRS**):

In [None]:
countriesJson.crs==citiesJson.crs==riversJson.crs

In [None]:
countriesJson.crs.is_projected,countriesJson.crs,countriesJson.crs.axis_info

In [None]:
citiesJson.crs.is_projected,citiesJson.crs,citiesJson.crs.axis_info

In [None]:
riversJson.crs.is_projected, riversJson.crs,riversJson.crs.axis_info

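Our three maps are not projected: their coordinates are angular (latitude and longitude) rather than in a linear unit such as meters. Some computations (for example distances, areas, or centroids) may then be inaccurate. Let's work next with one country.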
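One way to see why angular coordinates are a problem for math: a degree of longitude does not cover the same ground distance everywhere. The sketch below assumes a perfectly spherical Earth with a mean radius of 6371 km, which is a simplification; the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation (assumption)

def km_per_degree_longitude(latitude_deg):
    """Approximate ground length of one degree of longitude at a given latitude."""
    return math.radians(1) * EARTH_RADIUS_KM * math.cos(math.radians(latitude_deg))

# The same "1 degree" shrinks as we move away from the equator.
for lat in (0, 30, 60):
    print(lat, round(km_per_degree_longitude(lat), 1))
```

Treating degrees as if they were planar units therefore distorts lengths and areas, more so at high latitudes.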

## Subsetting

We want to keep the geometries of one country. We can subset our maps by *filtering*:

In [None]:
# filtering 
brazil=countriesJson[countriesJson.COUNTRY=='Brazil']

But you can also subset by *clipping*, which is useful when the other GeoDataFrames do not have a field you can filter on:

In [None]:
# clipping
citiesBrazil = gpd.clip(gdf=citiesJson,
                          mask=brazil)
riversBrazil = gpd.clip(gdf=riversJson,
                               mask=brazil)
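To make the effect of clipping concrete, here is a small sketch with toy data: two points and a square mask. The names `points`, `square` and `clipped` and the coordinates are made up for illustration; only `gpd.clip` with its `gdf` and `mask` arguments comes from the cell above.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Two toy points: one falls inside the unit square, the other does not.
points = gpd.GeoDataFrame(
    {"label": ["inside", "outside"]},
    geometry=[Point(0.5, 0.5), Point(2, 2)],
    crs="EPSG:4326",
)
# The mask plays the role of the country polygon.
square = gpd.GeoDataFrame(geometry=[box(0, 0, 1, 1)], crs="EPSG:4326")

# Only the geometries that fall within the mask are kept.
clipped = gpd.clip(gdf=points, mask=square)
print(clipped["label"].tolist())
```

Note that no shared column was needed: the selection is purely spatial.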

Can we compute the centroid of Brazil?

In [None]:
# this works, but emits a warning
brazil.centroid

We should follow the advice in the warning and reproject the map to a projected CRS.

## Reprojecting

A projected CRS has units in meters or feet (or similar). For more accurate results, it is better to look for a CRS explicitly prepared for a particular region of the world. You can search for a CRS by country [here](https://epsg.io/?q=brazil+kind%3APROJCRS):

In [None]:
# recommended for Brazil (meters)
brazil_5641=brazil.to_crs(5641)
brazil_5641.crs.axis_info
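If you want to compare a geographic and a projected CRS side by side, the following sketch uses **pyproj** (the library geopandas relies on for CRS handling). EPSG:4326 is used here only as a familiar geographic CRS for comparison; it is an assumption, not necessarily the CRS of the maps above.

```python
from pyproj import CRS

# 4326: a common geographic CRS (assumed for comparison); 5641: the one used for Brazil above.
for code in (4326, 5641):
    crs = CRS.from_epsg(code)
    units = [axis.unit_name for axis in crs.axis_info]
    print(code, crs.name, crs.is_projected, units)
```

The axis units are a quick check of whether lengths and areas will come out in a linear unit.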

In [None]:
# this works with no warning

brazil_5641.centroid
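The centroid above is expressed in the units of the projected CRS. If you need it back in longitude and latitude, one possible approach is to reproject the result. The sketch below uses a toy rectangle instead of the real country, and EPSG:4326 as the geographic CRS; both are assumptions for illustration.

```python
import geopandas as gpd
from shapely.geometry import box

# Toy rectangle in lon/lat (not the real country outline).
area = gpd.GeoSeries([box(-50, -20, -40, -10)], crs="EPSG:4326")

# Compute the centroid in the projected CRS, then convert it back to degrees.
centroid_m = area.to_crs(5641).centroid
centroid_deg = centroid_m.to_crs(4326)

lon, lat = centroid_deg.x.iloc[0], centroid_deg.y.iloc[0]
print(round(lon, 3), round(lat, 3))
```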

Let's reproject the others:

In [None]:
citiesBrazil_5641=citiesBrazil.to_crs(5641)
riversBrazil_5641=riversBrazil.to_crs(5641)


Finally, we can plot what we have:

In [None]:
# plotting:

base5641=brazil_5641.plot(facecolor="whitesmoke", edgecolor='black', linewidth=0.4,figsize=(5,5))
brazil_5641.centroid.plot(color='red',markersize=100,ax=base5641)
citiesBrazil_5641.plot(marker='+', color='green', markersize=15,ax=base5641)
riversBrazil_5641.plot(edgecolor='blue', linewidth=0.5,ax=base5641)