---
title: 'Coordinate Reference Systems'
author: 'Hyunsoo Kim'
date: '2022-05-24'
categories: [Python, Pandas, Geopandas]
image: geopandas.png
jupyter: python3
page-layout: full
---

> Coordinate Reference Systems

**This notebook is an exercise in the [Geospatial Analysis](https://www.kaggle.com/learn/geospatial-analysis) course.  You can reference the tutorial at [this link](https://www.kaggle.com/alexisbcook/coordinate-reference-systems).**

---


In [2]:
import pandas as pd
import geopandas as gpd

from shapely.geometry import LineString

from learntools.core import binder
binder.bind(globals())
from learntools.geospatial.ex2 import *

# Exercises

### 1) Load the data.

Run the next code cell (without changes) to load the GPS data into a pandas DataFrame `birds_df`.

In [3]:
# Load the data and print the first 5 rows
birds_df = pd.read_csv("../input/geospatial-learn-course-data/purple_martin.csv", parse_dates=['timestamp'])
print("There are {} different birds in the dataset.".format(birds_df["tag-local-identifier"].nunique()))
birds_df.head()

There are 11 birds in the dataset, where each bird is identified by a unique value in the "tag-local-identifier" column.  Each bird has several measurements, collected at different times of the year.

Use the next code cell to create a GeoDataFrame `birds`.  
- `birds` should have all of the columns from `birds_df`, along with a "geometry" column that contains Point objects with (longitude, latitude) locations.
- Set the CRS of `birds` to `{'init': 'epsg:4326'}`.
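
The (longitude, latitude) order matters because Point objects store x first and y second. As a minimal sketch, assuming a couple of made-up records that reuse the column names from `birds_df`, the helper below (the name `to_xy` is hypothetical) builds the pairs in that order and rejects values outside the valid EPSG:4326 ranges, which is one way to catch swapped columns:

```python
# Hypothetical sample records; the keys mirror the column names in birds_df.
records = [
    {"location-long": -82.9, "location-lat": 30.4},
    {"location-long": -61.3, "location-lat": -3.2},
]


def to_xy(record):
    """Return an (x, y) pair, i.e. (longitude, latitude), for one record."""
    lon = record["location-long"]
    lat = record["location-lat"]
    # In EPSG:4326, longitude lies in [-180, 180] and latitude in [-90, 90].
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    return (lon, lat)


xy_pairs = [to_xy(r) for r in records]
print(xy_pairs)
```

A range check like this only catches swaps when a longitude falls outside [-90, 90], so it is a sanity check rather than a guarantee.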

In [4]:
# Your code here: Create the GeoDataFrame
birds = gpd.GeoDataFrame(birds_df, geometry=gpd.points_from_xy(birds_df["location-long"], birds_df["location-lat"]))

# Your code here: Set the CRS to {'init': 'epsg:4326'}
birds.crs = {'init' :'epsg:4326'}

# Check your answer
q_1.check()

In [5]:
# Lines below will give you a hint or solution code
#q_1.hint()
#q_1.solution()

### 2) Plot the data.

Next, we load in the `'naturalearth_lowres'` dataset from GeoPandas, and set `americas` to a GeoDataFrame containing the boundaries of all countries in the Americas (both North and South America).  Run the next code cell without changes.

In [6]:
# Load a GeoDataFrame with country boundaries in North/South America, print the first 5 rows
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
americas = world.loc[world['continent'].isin(['North America', 'South America'])]
americas.head()

Use the next code cell to create a single plot that shows both: (1) the country boundaries in the `americas` GeoDataFrame, and (2) all of the points in the `birds` GeoDataFrame.

Don't worry about any special styling here; just create a preliminary plot, as a quick sanity check that all of the data was loaded properly.  In particular, you don't have to worry about color-coding the points to differentiate between birds, and you don't have to differentiate starting points from ending points.  We'll do that in the next part of the exercise.

In [16]:
# Your code here
# Base map: country boundaries in the Americas
ax = americas.plot(figsize=(8,8), color='red', linestyle=':', edgecolor='black')
# Overlay the bird locations on the same axes
birds.plot(markersize=1, ax=ax)
# Uncomment to see a hint
#q_2.hint()

In [17]:
# Get credit for your work after you have created a map
q_2.check()

# Uncomment to see our solution (your code may look different!)
#q_2.solution()

### 3) Where does each bird start and end its journey? (Part 1)

Now, we're ready to look more closely at each bird's path.  Run the next code cell to create two GeoDataFrames:
- `path_gdf` contains LineString objects that show the path of each bird.  It uses the `LineString()` constructor to create a LineString object from a list of Point objects.
- `start_gdf` contains the starting points for each bird.

In [18]:
# GeoDataFrame showing path for each bird
path_df = birds.groupby("tag-local-identifier")['geometry'].apply(list).apply(lambda x: LineString(x)).reset_index()
path_gdf = gpd.GeoDataFrame(path_df, geometry=path_df.geometry)
path_gdf.crs = {'init' :'epsg:4326'}

# GeoDataFrame showing starting point for each bird
start_df = birds.groupby("tag-local-identifier")['geometry'].apply(list).apply(lambda x: x[0]).reset_index()
start_gdf = gpd.GeoDataFrame(start_df, geometry=start_df.geometry)
start_gdf.crs = {'init' :'epsg:4326'}

# Show first five rows of GeoDataFrame
start_gdf.head()
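
The grouping pattern in the cell above can be illustrated without GeoPandas. This is a minimal sketch with made-up rows, and it assumes the rows are already in chronological order for each bird; under that assumption the first element of each group is the starting point and the last element is the final location:

```python
from collections import defaultdict

# Hypothetical (tag, longitude, latitude) rows, assumed to be in time order.
rows = [
    ("A", -80.0, 35.0),
    ("B", -90.0, 40.0),
    ("A", -75.0, 10.0),
    ("B", -70.0, -5.0),
    ("A", -60.0, -3.0),
]

# Collect each bird's points in row order, like groupby(...).apply(list).
points_by_tag = defaultdict(list)
for tag, lon, lat in rows:
    points_by_tag[tag].append((lon, lat))

paths = dict(points_by_tag)                                   # full path per bird
starts = {tag: pts[0] for tag, pts in points_by_tag.items()}  # first point
ends = {tag: pts[-1] for tag, pts in points_by_tag.items()}   # last point

print(starts)
print(ends)
```

If the rows were not sorted by time, sorting on the "timestamp" column before grouping would be needed for the first and last elements to mean start and end.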

Use the next code cell to create a GeoDataFrame `end_gdf` containing the final location of each bird.  
- The format should be identical to that of `start_gdf`, with two columns ("tag-local-identifier" and "geometry"), where the "geometry" column contains Point objects.
- Set the CRS of `end_gdf` to `{'init': 'epsg:4326'}`.

In [24]:
# Your code here
end_df = birds.groupby("tag-local-identifier")['geometry'].apply(list).apply(lambda x: x[-1]).reset_index()
end_gdf = gpd.GeoDataFrame(end_df, geometry=end_df.geometry)
end_gdf.crs = {'init' :'epsg:4326'}

# Check your answer
q_3.check()

In [25]:
# Lines below will give you a hint or solution code
#q_3.hint()
#q_3.solution()

### 4) Where does each bird start and end its journey? (Part 2)

Use the GeoDataFrames from the question above (`path_gdf`, `start_gdf`, and `end_gdf`) to visualize the paths of all birds on a single map.  You may also want to use the `americas` GeoDataFrame.

In [27]:
# Your code here
ax = americas.plot(figsize=(10,10), color='none', edgecolor='gainsboro', zorder=3)

# Add starting points, paths, and end points to the base map
start_gdf.plot(color='lightgreen', ax=ax)
path_gdf.plot(color='maroon', markersize=2, ax=ax)
end_gdf.plot(color='black', markersize=1, ax=ax) 

# Uncomment to see a hint
#q_4.hint()

In [28]:
# Get credit for your work after you have created a map
q_4.check()

# Uncomment to see our solution (your code may look different!)
#q_4.solution()

### 5) Where are the protected areas in South America? (Part 1)

It looks like all of the birds end up somewhere in South America.  But are they going to protected areas?

In the next code cell, you'll create a GeoDataFrame `protected_areas` containing the locations of all of the protected areas in South America.  The corresponding shapefile is located at filepath `protected_filepath`.

In [30]:
# Path of the shapefile to load
protected_filepath = "../input/geospatial-learn-course-data/SAPA_Aug2019-shapefile/SAPA_Aug2019-shapefile/SAPA_Aug2019-shapefile-polygons.shp"

# Your code here
protected_areas = gpd.read_file(protected_filepath)

# Check your answer
q_5.check()

In [31]:
# Lines below will give you a hint or solution code
#q_5.hint()
#q_5.solution()

### 6) Where are the protected areas in South America? (Part 2)

Create a plot that uses the `protected_areas` GeoDataFrame to show the locations of the protected areas in South America.  (_You'll notice that some protected areas are on land, while others are in marine waters._)

In [38]:
# Country boundaries in South America
south_america = americas.loc[americas['continent']=='South America']

# Your code here: plot protected areas in South America
ax = south_america.plot(figsize=(8,8), color='whitesmoke', edgecolor='black')
protected_areas.plot(markersize=1, ax=ax,alpha=0.4)

# Uncomment to see a hint
#q_6.hint()

In [36]:
# Get credit for your work after you have created a map
q_6.check()

# Uncomment to see our solution (your code may look different!)
#q_6.solution()

### 7) What percentage of South America is protected?

You're interested in determining what percentage of South America is protected, so that you know how much of South America is suitable for the birds.  

As a first step, you calculate the total area of all protected lands in South America (not including marine area).  To do this, you use the "REP_AREA" and "REP_M_AREA" columns, which contain the total area and total marine area, respectively, in square kilometers.

Run the code cell below without changes.

In [39]:
P_Area = sum(protected_areas['REP_AREA']-protected_areas['REP_M_AREA'])
print("South America has {} square kilometers of protected areas.".format(P_Area))

Then, to finish the calculation, you'll use the `south_america` GeoDataFrame.  

In [40]:
south_america.head()

Calculate the total area of South America by following these steps:
- Calculate the area of each country using the `area` attribute of each polygon (with EPSG 3035 as the CRS), and add up the results.  The calculated area will be in units of square meters.
- Convert your answer to have units of square kilometers.
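
One way to see why the area is computed after reprojecting, rather than directly on latitude and longitude values, is that a one-degree cell does not cover the same ground area everywhere. The sketch below is an illustration only: it treats the Earth as a sphere with an assumed mean radius, and the function names are hypothetical. It also shows the square-meter to square-kilometer conversion used in the second step.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean radius; a spherical approximation


def cell_area_m2(lat_south, lat_north, lon_west, lon_east):
    """Area in square meters of a latitude/longitude cell on a sphere."""
    d_lon = math.radians(lon_east - lon_west)
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    return EARTH_RADIUS_M ** 2 * d_lon * band


def m2_to_km2(area_m2):
    """Convert square meters to square kilometers (1 km^2 = 10**6 m^2)."""
    return area_m2 / 10**6


# Two one-degree cells: one at the equator, one at 50-51 degrees south.
equator_cell = m2_to_km2(cell_area_m2(0, 1, -60, -59))
southern_cell = m2_to_km2(cell_area_m2(-51, -50, -60, -59))
print(round(equator_cell), round(southern_cell))
```

The southern cell comes out noticeably smaller than the equatorial one, even though both span one degree in each direction.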

In [51]:
# Your code here: Calculate the total area of South America (in square kilometers)
totalArea = sum(south_america.geometry.to_crs(epsg=3035).area)/10**6
totalArea 
# Check your answer
q_7.check()

In [52]:
# Lines below will give you a hint or solution code
#q_7.hint()
#q_7.solution()

Run the code cell below to calculate the percentage of South America that is protected.

In [None]:
# What percentage of South America is protected?
percentage_protected = P_Area/totalArea
print('Approximately {}% of South America is protected.'.format(round(percentage_protected*100, 2)))
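
The whole calculation, from the two area columns to the final percentage, can be sketched with plain numbers. The values below are made up for illustration and are not taken from the dataset, and the function name is hypothetical:

```python
def protected_percentage(total_areas_km2, marine_areas_km2, continent_area_km2):
    """Percent of the continent covered by the land part of protected areas."""
    # Land area of each protected area = total area - marine area.
    land_km2 = sum(
        total - marine
        for total, marine in zip(total_areas_km2, marine_areas_km2)
    )
    return 100 * land_km2 / continent_area_km2


# Made-up values: three protected areas and a continent of 10,000 km^2.
pct = protected_percentage([500.0, 1200.0, 300.0], [0.0, 200.0, 300.0], 10_000.0)
print(round(pct, 2))
```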

### 8) Where are the birds in South America?

So, are the birds in protected areas?

Create a plot that shows, for all birds, every location in South America where they were observed.  Also plot the locations of all protected areas in South America.

To exclude protected areas that are purely marine areas (with no land component), you can use the "MARINE" column (and plot only the rows in `protected_areas[protected_areas['MARINE']!='2']`, instead of every row in the `protected_areas` GeoDataFrame).
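
Note that the filter compares against the string `'2'`, not the number 2. The sketch below uses made-up rows and assumes, as that comparison suggests, that the "MARINE" values are stored as strings; it shows why the quotes matter:

```python
# Hypothetical rows; the "MARINE" values are assumed to be strings.
areas = [
    {"NAME": "Area A", "MARINE": "0"},
    {"NAME": "Area B", "MARINE": "1"},
    {"NAME": "Area C", "MARINE": "2"},
]

# Keep every row except those flagged "2" (purely marine, per the text above).
with_land = [a for a in areas if a["MARINE"] != "2"]

# Comparing against the integer 2 filters nothing out,
# because the string "2" is never equal to the integer 2.
unfiltered = [a for a in areas if a["MARINE"] != 2]

print([a["NAME"] for a in with_land])
print(len(unfiltered))
```

If the column were numeric in another version of the file, the comparison value would need to change to match.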

In [56]:
# Your code here
ax = south_america.plot(figsize=(8,8), color='whitesmoke', edgecolor='black')
protected_areas[protected_areas['MARINE']!='2'].plot(ax=ax, alpha=0.4, zorder=1)
birds[birds.geometry.y < 0].plot(ax=ax, color='red', alpha=0.6, markersize=10, zorder=2)

# Uncomment to see a hint
#q_8.hint()

In [57]:
# Get credit for your work after you have created a map
q_8.check()

# Uncomment to see our solution (your code may look different!)
#q_8.solution()

# Keep going

Create stunning **[interactive maps](https://www.kaggle.com/alexisbcook/interactive-maps)** with your geospatial data.

---




*Have questions or comments? Visit the [course discussion forum](https://www.kaggle.com/learn/geospatial-analysis/discussion) to chat with other learners.*