<div style="text-align: center; background-color: #f9f9f9; border-left: 6px solid #4CAF50; padding: 10px; border-radius: 5px;">
    <h1 style="color: #333; font-size: 24px;">Universidad Nacional de Colombia</h1>
    <h2 style="color: #555; font-size: 20px;">Facultad de Ciencias Agrarias, Sede Bogotá</h2>
    <h3 style="color: #777; font-size: 18px;">Workshop: Geographic data exploration and first approaches</h3>
    <h3 style="color: #777; font-size: 18px;">Programación SIG II-2024</h3>
</div>


In this workshop, we will take a hands-on approach to working with geospatial data, focusing on practical skills and foundational concepts. Using **GeoPandas**, a powerful Python library for geographic data, you will learn how to handle, analyze, and visualize spatial data efficiently.
Additionally, this workshop will guide you on how to install libraries that may not be included in `conda`, ensuring you can work with any Python package required for geospatial analysis.

The workshop is structured around exercises and concepts derived from [Chapter 1: Spatial Data](https://py.geocompx.org/01-spatial-data) of the Geocomputation with Python resource. You will work with a shapefile containing information about municipalities in Colombia, but the techniques you'll learn can be applied to any geospatial dataset. Use this notebook to work through the [Geographic data in Python section](https://py.geocompx.org/) from Dorman et al., and answer the questions at the end of the notebook.

---

## Learning Objectives

By the end of this workshop you are expected to:

* Differentiate between a `GeoSeries` and a `GeoDataFrame`, and understand their roles in geospatial analysis.
* Import spatial datasets into Python using `GeoPandas`.
* Explore and interpret the attributes and geometries within a `GeoDataFrame`.
* Access and modify geometry data, such as centroids and boundaries.
* Perform attribute-based filtering of spatial datasets.
* Visualize spatial data using the `.plot()` and `.explore()` methods.
* Customize maps.
* Reproject spatial data to ensure consistency and accuracy in analysis.
* Replicate the analysis using your own datasets, adapting the learned techniques to different contexts.


# 1. Installing new packages

Let's try importing the required libraries:

In [None]:
import pandas as pd
import shapely
import geopandas as gpd
import matplotlib.pyplot as plt

**Why does this error happen?**

If the cell above fails, it is most likely with a ```ModuleNotFoundError```. This error occurs when Python cannot find the module you are trying to import, usually because the package is not installed in the active Python environment. Every package must be installed before you can use it in your code.
To install a missing package, use one of the following methods.

## Using the terminal
Open a terminal and run one of the following commands:
* ```pip install shapely``` for pip
* ```conda install shapely -c conda-forge``` for conda

## Using the notebook cells
You can install the package directly from a code cell using:
* ```!pip install shapely``` with pip
* ```!conda install shapely -c conda-forge -y``` with conda
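
Before installing anything, it can help to check which of the workshop's libraries are already importable in the current environment. The following is a minimal sketch using only the standard library; the helper name `find_missing` is hypothetical, and the package list simply mirrors the imports in the first code cell.

```python
import importlib.util


def find_missing(module_names):
    """Return the names in module_names that cannot be found in this environment."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Top-level modules imported in the first cell of this notebook
required = ["pandas", "shapely", "geopandas", "matplotlib"]
missing = find_missing(required)

if missing:
    print("Install these packages first:", ", ".join(missing))
else:
    print("All required packages are available.")
```

If a package is reported as missing, install it with one of the commands above. If the import still fails afterwards, restarting the kernel usually helps.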


## When should you use conda?
* Use conda for scientific and data science packages: conda is excellent for packages with complex system-level dependencies, such as geospatial libraries (e.g., shapely, gdal), scientific libraries (e.g., numpy, scipy), and machine learning libraries. These packages often require non-Python libraries or binaries that conda installs automatically.
* Use conda to manage Python environments that require specific versions of Python and libraries.
* conda-forge is a community-maintained channel that provides pre-compiled packages for scientific computing, making installation simpler and faster, e.g. ```conda install geopandas -c conda-forge```.
* conda can manage dependencies beyond Python, including R, C++, and system libraries. This makes it ideal for multidisciplinary projects.
  
## When should you use pip?
```pip``` (Pip Installs Packages) is the default package manager for Python, used to install, manage, and uninstall Python libraries and their dependencies.
* Use pip for packages that install cleanly from PyPI (Python Package Index), either because they are pure Python (e.g., requests) or because they ship pre-built wheels (e.g., pandas, matplotlib).
* If you're not using a conda environment and rely on the system-wide Python installation, pip is the best option.
* Some Python libraries are not available in Conda repositories, so you need pip to install them.

<div style="background-color: #f0f8ff; border-left: 6px solid #4682b4; padding: 10px; border-radius: 5px;">
    <h2 style="color: #333;">Now it's your turn!</h2>
    <p style="color: #555;">Install the <code>geopandas</code> package to start working with spatial data!</p>
</div>




# 2. Exploring geographic data

In the following cell you will create an **instance** of the **GeoDataFrame class**, which is part of the geopandas library. This object has attributes and methods that you can use.
You will find the shapefile of Colombian municipalities at [**LINK**](https://drive.google.com/drive/folders/1LugBAW1CLn7m3at2kVkLlA76HERBwGVP?usp=drive_link).

In [None]:
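# Path to the municipalities shapefile; replace it with the location on your machine
path_shp_municipios = r"/Users/macbook/Library/CloudStorage/OneDrive-UniversidadNacionaldeColombia/UN_2024_2/2024_2_Programacion_SIG/Talleres/Fundamenteos_Geoespacial"
# Read the file into a GeoDataFrame
df_municipios = gpd.read_file(path_shp_municipios)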
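
To make the idea of an instance with attributes and methods more concrete, here is a minimal sketch that builds a tiny GeoDataFrame in memory instead of reading the shapefile. The two points and the column names (`name`, `population`) are made-up illustration data, not values from the municipalities layer.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical toy data: two points given as (longitude, latitude)
toy = gpd.GeoDataFrame(
    {"name": ["A", "B"], "population": [100, 250]},
    geometry=[Point(-74.08, 4.61), Point(-75.57, 6.25)],
    crs="EPSG:4326",
)

# `toy` is an instance of the GeoDataFrame class
print(isinstance(toy, gpd.GeoDataFrame))

# Attributes store information about the object (no parentheses)
print(toy.shape)  # (number of rows, number of columns)
print(toy.crs)    # coordinate reference system

# Methods are functions attached to the object (called with parentheses)
print(toy.head(1))

# Accessing the geometry of a GeoDataFrame returns a GeoSeries
print(type(toy.geometry))
```

The same attribute and method calls should work on `df_municipios` once the shapefile has been read.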

<div style="background-color: #f9f9f9; border-left: 6px solid #FF5722; padding: 10px; border-radius: 5px;">
    <h2 style="color: #333;">🔍 Data Exploration: Identifying Spatial Columns</h2>
    <p style="color: #555;">As part of your spatial data analysis, explore the GeoDataFrame <code>df_municipios</code> and answer the following questions:</p>
    <ol style="color: #555;">
        <li><b>What are the names of all the columns in the <code>df_municipios</code> GeoDataFrame?</b></li>
        <li><b>Which of these columns represent spatial data? Explain why.</b></li>
    </ol>
    <p style="color: #555;">Use methods like <code>.columns</code>, <code>.geometry</code>, and <code>.info()</code> to help you answer these questions.</p>
</div>


In [None]:
# Confirm the type of the object, then print its contents
print(type(df_municipios))
print(df_municipios)

<div style="background-color: #f9f9f9; border-left: 6px solid #FF5722; padding: 10px; border-radius: 5px;">
    <h2 style="color: #333;">🔍 Data Visualisation: Static and Interactive Maps</h2>
    <p style="color: #555;">When working with GeoDataFrames in GeoPandas, <code>.plot()</code> and <code>.explore()</code> are two methods for visualizing spatial data.</p>
    <ol style="color: #555;">
        <li><b>Try both methods and identify when you should use one or the other.</b></li>
    </ol>

</div>



In [None]:
df_municipios.plot()
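
As a starting point for comparing the two methods, the sketch below calls both on a small in-memory layer. The two squares and the `zone` column are hypothetical illustration data. Note that `.explore()` depends on extra packages (folium, matplotlib and mapclassify), so the call is wrapped in a `try` block in case they are not installed.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical toy layer: two adjacent squares in longitude/latitude
toy = gpd.GeoDataFrame(
    {"zone": ["west", "east"]},
    geometry=[box(-75, 4, -74, 5), box(-74, 4, -73, 5)],
    crs="EPSG:4326",
)

# .plot() draws a static matplotlib figure and returns its Axes,
# which can then be customized with regular matplotlib calls
ax = toy.plot(column="zone", edgecolor="black")
ax.set_title("Static map with .plot()")

# .explore() builds an interactive web map (pan, zoom, tooltips)
try:
    interactive_map = toy.explore(column="zone")
except ImportError as err:
    interactive_map = None
    print("explore() is unavailable:", err)
```

In a notebook, placing `interactive_map` on the last line of a cell displays the interactive map.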

# 3. Follow the material
Work through the material available at [**this link**](https://py.geocompx.org/01-spatial-data) (Unit 1 only) and run the analyses on your own data.
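
As a warm-up for that material, here is a minimal sketch of two operations listed in the learning objectives: reprojecting a layer and computing centroids. The polygon is hypothetical illustration data, and EPSG:32618 (WGS 84 / UTM zone 18N) is only an assumed choice of projected CRS; the appropriate one depends on your study area.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical toy polygon in geographic coordinates (degrees)
toy = gpd.GeoDataFrame(
    {"zone": ["a"]},
    geometry=[box(-75, 4, -74, 5)],
    crs="EPSG:4326",
)
print(toy.crs.is_geographic)  # True: coordinates are in degrees

# Reproject to a projected CRS so that lengths and areas are in metres.
# EPSG:32618 is an assumption made for this example only.
toy_m = toy.to_crs("EPSG:32618")
print(toy_m.crs.is_projected)

# Centroids and areas are computed in the units of the layer's CRS
centroids = toy_m.geometry.centroid
area_km2 = toy_m.geometry.area / 1e6  # square metres to square kilometres
print(centroids)
print(area_km2)
```

Running the same steps on `df_municipios` is a good way to check which CRS your own data uses before measuring anything.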

<div style="background-color: #f9f9f9; border-left: 6px solid #4CAF50; padding: 10px; border-radius: 5px;">
    <h2 style="color: #333;">🧠 Finally... Some Questions to Think About</h2>
    <p style="color: #555;">Reflect on the following questions as you work through the material and exercises. These questions are designed to deepen your understanding.</p>
    <ol style="color: #555;">
        <li><b>What are the key differences between a GeoSeries and a GeoDataFrame? When would you use one over the other?</b></li>
        <li><b>How does the geometry column in a GeoDataFrame differ from other columns? Why is it treated uniquely?</b></li>
        <li><b>What are the advantages of using a GeoDataFrame compared to a traditional pandas DataFrame for handling spatial data?</b></li>
        <li><b>Why is it important to know the Coordinate Reference System (CRS) of your spatial data? What could go wrong if the CRS is not properly defined?</b></li>
        <li><b>How do vector and raster data differ in their structure and typical use cases? Can you think of a scenario where one would be more suitable than the other?</b></li>
        <li><b>Why is it necessary to reproject spatial data into a consistent CRS when combining datasets? What challenges might arise if this step is skipped?</b></li>
        <li><b>Centroids are a simplified representation of geometries. In what scenarios might using centroids instead of full geometries lead to misleading results?</b></li>
        <li><b>What are some real-world applications of spatial data filtering (e.g., selecting features based on attributes)? How might this be useful for your course project?</b></li>
        <li><b>If you needed to share spatial data with collaborators who use different software, what file formats would you consider using? Why?</b></li>
    </ol>
    <p style="color: #555;">Discuss these questions with your peers or reflect on them individually. I might feel tempted to include some in the exam.</p>
</div>
