**Coordinate reference systems** (CRS) are essential metadata for any geospatial dataset. Without a CRS, geometries would simply be a collection of coordinates in an arbitrary space. Only the CRS allows GIS software, including the Python packages we use in this course, to relate these coordinates to a place on Earth, or other roughly spherical objects or planets.

Often confused with CRS in general, map projections, also known as **projected coordinate systems**, are mathematical models that transfer coordinates on the surface of our three-dimensional Earth to coordinates on a planar surface, such as a flat, two-dimensional map. **Geographic coordinate systems**, in contrast, use only longitude and latitude: angles, measured in degrees, east or west of the prime meridian and north or south of the equator on a sphere approximating the Earth. When such data are drawn on a planar map, longitude and latitude are simply used as the x and y coordinates. Finally, both projected and geographic coordinate systems can be based on ellipsoids, which are more complicated than a simple sphere and better approximate the 'potato-shaped' reality of our planet. The complete CRS information required to accurately relate geospatial information to a place on Earth includes both the (projected or geographic) coordinate system and the ellipsoid.

Spatial datasets frequently have different CRSs since different coordinate systems are optimized for specific regions and purposes. No coordinate system can be entirely accurate around the world, and the transformation from three- to two-dimensional coordinates cannot be accurate in angles, distances, and areas at the same time.

Therefore, it is a typical GIS task to transform, or reproject, a dataset from one reference system to another, such as to make two layers interoperable. Comparing two datasets with different CRSs would inevitably yield incorrect results; for instance, determining points contained within a polygon would not work if the points have geographic coordinates (in degrees), and the polygon is in the national Finnish reference system (in meters).
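
The point-in-polygon pitfall described above can be sketched with a single point. This is only a minimal illustration and not part of the original lesson: the Helsinki coordinates, the 1 km buffer, and all variable names are assumptions made for the example, and `EPSG:3067` (ETRS-TM35FIN) stands in for the national Finnish reference system.

```python
import geopandas
from shapely.geometry import Point

# A point in Helsinki in geographic coordinates (longitude, latitude in degrees)
points_wgs84 = geopandas.GeoSeries([Point(24.94, 60.17)], crs="EPSG:4326")

# The same point expressed in ETRS-TM35FIN (EPSG:3067, coordinates in metres)
points_tm35fin = points_wgs84.to_crs("EPSG:3067")

# Illustrative polygon: a 1 km buffer around the point, defined in metres
polygon_tm35fin = points_tm35fin.buffer(1000).iloc[0]

# Comparing the raw geometries ignores the CRS: degrees are tested against metres
naive_result = points_wgs84.iloc[0].within(polygon_tm35fin)

# After reprojection, both geometries use the same reference system
aligned_result = points_tm35fin.iloc[0].within(polygon_tm35fin)

print(f"Without reprojection: {naive_result}")
print(f"After reprojection:   {aligned_result}")
```

The polygon is built around the point itself, so only the comparison made in a shared CRS can find the point inside it.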

Choosing an appropriate projection for your map is not always easy. It depends on what you want to represent in your map and what your data's spatial scale, resolution, and extent are. In reality, there is no single "perfect projection"; each has advantages and disadvantages, and you should choose a projection that best fits each map. 

![Map Projections](https://imgs.xkcd.com/comics/map_projections.png)
Source: https://xkcd.com/

Once you have figured out which map projection to use, handling coordinate reference systems is, fortunately, fairly easy in geopandas. The library `pyproj` provides additional information about a CRS and can assist with trickier tasks, such as guessing the unknown CRS of a data set.

In this section we will learn how to retrieve the coordinate reference system information of a data set, and how to re-project the data into another CRS.

Let's start with a file we imported before (regions of the EU):

In [None]:
import geopandas 

# Download, unpack, and read the NUTS regions dataset from the Eurostat website
url = "https://gisco-services.ec.europa.eu/distribution/v2/nuts/shp/NUTS_RG_60M_2021_3035.shp.zip"
gdf = geopandas.read_file(url)
gdf.head()

In [None]:
gdf.plot()

In [None]:
gdf.crs

The object that is being displayed is a `pyproj.CRS` object, which represents a coordinate reference system.

In the geospatial world, the EPSG (European Petroleum Survey Group) code is a standard way of identifying coordinate reference systems. Each EPSG code corresponds to an entry in the EPSG Geodetic Parameter Dataset, which is a collection of coordinate reference systems and coordinate transformations ranging from global to national, regional, and local scope.

The EPSG code of the given `GeoDataFrame` is `3035`, which corresponds to a projected coordinate system that uses the GRS 1980 reference ellipsoid. It is the map projection officially recommended by the European Commission.

You can find information about reference systems and lists of commonly known CRS from many online resources, for example:

www.spatialreference.org 

www.proj4.org

www.mapref.org

In [None]:
gdf.geometry.head()


Transforming data from one reference system to another is a very simple task in geopandas. In fact, all you have to do is use the `to_crs()` method of a `GeoDataFrame`, supplying a new CRS in one of a wide range of possible formats. The easiest is to use an EPSG code:

In [None]:
gdf_epsg_4326 = gdf.to_crs("EPSG:4326")
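
The EPSG string is only one of the formats `to_crs()` accepts. As a hedged sketch, the following passes the same target CRS in three ways: as a string, via the `epsg` keyword, and as a `pyproj.CRS` object. The single-point `GeoSeries` and the variable names are illustrative stand-ins for the NUTS data set, not part of the original lesson.

```python
import geopandas
import pyproj
from shapely.geometry import Point

# Illustrative stand-in: one point in EPSG:3035 (coordinates in metres)
sample = geopandas.GeoSeries([Point(4321000, 3210000)], crs="EPSG:3035")

# Three equivalent ways of naming the target CRS
as_string = sample.to_crs("EPSG:4326")
as_keyword = sample.to_crs(epsg=4326)
as_object = sample.to_crs(pyproj.CRS.from_epsg(4326))

for result in (as_string, as_keyword, as_object):
    print(result.crs.to_epsg(), result.iloc[0])
```

All three calls should produce the same geometry. Which form to use is mostly a matter of what CRS information you already have at hand.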

In [None]:
gdf_epsg_4326.geometry.head()

In [None]:
gdf_epsg_4326.crs

Notice that we have now transformed a projected coordinate system into a geographic one (hint: the coordinate values are now in longitude and latitude): the legendary WGS84 system!
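
Instead of reading the coordinate values, you can also ask `pyproj` directly whether a CRS is projected or geographic. The following is a minimal sketch using the `is_projected`, `is_geographic`, and `axis_info` attributes of `pyproj.CRS`; the `summary` dictionary and the loop are illustrative and not part of the original lesson.

```python
import pyproj

summary = {}
for code in ("EPSG:3035", "EPSG:4326"):
    candidate = pyproj.CRS(code)
    # Collect the units of all coordinate axes of this CRS
    units = sorted({axis.unit_name for axis in candidate.axis_info})
    summary[code] = (candidate.is_projected, candidate.is_geographic)
    print(
        f"{code}: projected={candidate.is_projected}, "
        f"geographic={candidate.is_geographic}, axis units={units}"
    )
```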

Let's plot the 2 maps side by side:

In [None]:

import matplotlib.pyplot

# Prepare sub plots that are next to each other
figure, (axis1, axis2) = matplotlib.pyplot.subplots(nrows=1, ncols=2)

# Plot the original (ETRS89-extended / LAEA Europe, EPSG:3035) data set
gdf.plot(ax=axis1)
axis1.set_title("ETRS89-extended / LAEA Europe")
axis1.set_aspect(1)

# Plot the reprojected (WGS84, EPSG:4326) data set
gdf_epsg_4326.plot(ax=axis2)
axis2.set_title("WGS84")
axis2.set_aspect(1)

matplotlib.pyplot.tight_layout()

Indeed, the maps look quite different: the original data set in ETRS89-extended / LAEA Europe distorts the European countries less than the re-projected WGS84 data set, especially in the northern part of the continent.

Let’s still save the reprojected data set in a file so we can use it later. Note that, even though modern file formats save the CRS reliably, it is a good idea to use a descriptive file name that includes the reference system information.

Geospatial data can be accompanied by different types of information describing the coordinate reference system (CRS) used to represent the data. Common formats for CRS information include PROJ strings, EPSG codes, Well-Known-Text (WKT), and JSON. When working with spatial data from multiple sources, you may need to convert the CRS information from one format to another. The `pyproj` library, which geopandas uses to handle reference systems, can parse and convert CRS information in various formats.

In [None]:
import pyproj

crs = pyproj.CRS(gdf.crs)

print(f"CRS as a proj4 string: {crs.to_proj4()}\n")

print(f"CRS in WKT format: {crs.to_wkt()}\n")

print(f"EPSG code of the CRS: {crs.to_epsg()}\n")


Not every possible coordinate reference system has an EPSG code assigned.
That’s why *pyproj*, by default, tries to find the best-matching EPSG
definition. If it does not find any, `to_epsg()` returns `None`.
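
To illustrate the `None` case, here is a small sketch that compares a registered CRS with a custom one. The custom PROJ string, with its arbitrary projection centre, is made up for this example and is assumed not to match any EPSG entry.

```python
import pyproj

# A CRS that is registered in the EPSG dataset
registered = pyproj.CRS("EPSG:3035")

# A hypothetical custom projection with an arbitrary centre (no EPSG entry expected)
custom = pyproj.CRS.from_proj4(
    "+proj=laea +lat_0=10.5 +lon_0=20.25 +datum=WGS84 +units=m +no_defs"
)

registered_code = registered.to_epsg()
custom_code = custom.to_epsg()

print(f"Registered CRS: {registered_code}")
print(f"Custom CRS:     {custom_code}")
```

In your own code, check the return value for `None` before building a string such as `f"EPSG:{code}"` from it.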




### Use pyproj to find detailed information about a CRS

A `pyproj.CRS` object can also be initialised manually, for instance, using an EPSG code or a Proj4-string. It can then provide detailed information on the parameters of the reference system, as well as suggested areas of use. We can, for example, create a `CRS` object for the `EPSG:4326` reference system we re-projected our data to above:

In [None]:
crs = pyproj.CRS("EPSG:4326")
crs

In [None]:
crs.name

In [None]:
crs.area_of_use.bounds

## Sources

This lesson is inspired by the [Programming in Python lessons](http://swcarpentry.github.io/python-novice-inflammation/) from the [Software Carpentry organization](http://software-carpentry.org) and has adapted or reused material from the University of Helsinki Automating GIS Processes course (https://autogis-site.readthedocs.io/en/latest/course-info/license.html) under a Creative Commons Attribution-ShareAlike 4.0 International licence (https://creativecommons.org/licenses/by-sa/4.0/deed.en).