
# DemographyToolkit - Unit Test and Demonstration

This notebook demonstrates basic usage of the `DemographyToolkit` and serves as a check that the toolkit works properly.
It explains what unit tests are, then walks through setting up the environment, loading data, and calculating the population inside a polygon.
The second part documents the unit tests themselves, including the test that creates new areas.

---



## 🧪 What is a Unit Test?

A unit test is a small, automated piece of code that checks that a specific function or feature 
of your project works as expected. Here, we write simple tests to verify that the 
`DemographyToolkit` works properly: loading datasets, calculating population, creating new areas, and so on.

A failing test means that either the code or the expected result is wrong.
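
To make the idea concrete, here is a minimal sketch of a unit test written with Python's built-in `unittest` module. The function `population_density` is a hypothetical helper invented for this illustration; it is not part of the `DemographyToolkit`. The suite is run programmatically so that it also works inside a notebook cell.

```python
import unittest


def population_density(population, area_km2):
    """Hypothetical helper (not part of the toolkit): people per square kilometre."""
    return population / area_km2


class TestPopulationDensity(unittest.TestCase):
    def test_typical_values(self):
        # 1000 people spread over 4 km^2 gives 250 people per km^2
        self.assertEqual(population_density(1000, 4), 250)

    def test_empty_region(self):
        # A region without inhabitants has zero density
        self.assertEqual(population_density(0, 10), 0)


# Load and run the test case without calling sys.exit(), which suits notebooks
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestPopulationDensity)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

`result.wasSuccessful()` reports whether every test in the suite passed.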



## ⚙️ Environment Setup

Before using the `DemographyToolkit`, make sure you import the necessary modules and initialize a project.


In [None]:
from hera import Project
from measurements.GIS.vector.demography import DemographyToolkit
import geopandas as gpd
from shapely.geometry import Polygon


## 📁 Create a New Project

We create a test project to work in, together with a toolkit instance that uses the same project name.


In [None]:
project = Project("UNIT_TEST_DEMOGRAPHY")
toolkit = DemographyToolkit(projectName="UNIT_TEST_DEMOGRAPHY")


## 📦 Load a Small Dataset

For testing, we load a small sample shapefile with basic population information.


In [None]:

# The sample dataset is assumed to already exist on disk
import os

# Resolve data path relative to HERA_DATA_PATH environment variable
data_path = os.path.join(
    os.environ["HERA_DATA_PATH"],
    "measurements", "GIS", "vector", "population_lamas.shp"
)

# Load the sample dataset
toolkit.loadData("lamas_population", data_path, overwrite=True)



## 🧮 Calculate Population in a Small Polygon

Let's define a simple polygon and calculate the estimated population inside it.


In [None]:

# Define a simple polygon in WGS84 (EPSG:4326)
polygon = Polygon([
    (35.1, 33.85),
    (35.15, 33.85),
    (35.15, 33.90),
    (35.1, 33.90),
    (35.1, 33.85)
])

gdf = gpd.GeoDataFrame(index=[0], crs="EPSG:4326", geometry=[polygon])

# Calculate population using the toolkit
result = toolkit.analysis.calculatePopulationInPolygon(
    shapelyPolygon=gdf.geometry.iloc[0],
    dataSourceOrData="lamas_population"
)

result



## ✅ Summary

In this notebook we demonstrated:

- Creating a `Project`
- Loading a demographic dataset
- Calculating population inside a polygon

The examples are kept simple so that they run without unexpected errors.


# 📊 DemographyToolkit - Unit Test Documentation

This notebook explains the purpose and coverage of the unit tests written for the `DemographyToolkit` class in Hera.
Each test is described in detail along with the logic of the toolkit methods it verifies.


## 🧰 Overview: `DemographyToolkit`

The `DemographyToolkit` is responsible for managing demographic data, including loading population shapefiles, analyzing intersections with custom polygons, and saving derived areas.

### Core Methods

- `loadData(...)`: Loads a shapefile or GeoJSON into the database as a population source.
- `createNewArea(...)`: Generates a new area (polygon) and summarizes the intersecting population from a data source.
- `calculatePopulationInPolygon(...)`: Calculates fractional population within a specified polygon.
- `setDefaultDirectory(...)`: Sets and optionally creates a default directory to save shapefiles.
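
The notebook does not show how the fractional population is computed internally. The sketch below illustrates one common approach, area weighting, under the assumption that population is spread uniformly within each feature. It uses plain axis-aligned rectangles instead of real geometries, and all function names are hypothetical.

```python
def rect_area(rect):
    """Area of an axis-aligned rectangle given as (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)


def rect_intersection(a, b):
    """Overlap of two axis-aligned rectangles, or None if they share no area."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)


def population_in_rect(query, features):
    """Area-weighted estimate (assumes uniform density inside each feature).

    `features` is a list of (rect, population) pairs. Returns one
    (area_fraction, estimated_population) row per overlapped feature.
    """
    rows = []
    for rect, population in features:
        overlap = rect_intersection(query, rect)
        if overlap is None:
            continue  # the query does not touch this feature
        fraction = rect_area(overlap) / rect_area(rect)
        rows.append((fraction, population * fraction))
    return rows


# Two adjacent features; the query covers the right half of the first
# and the left half of the second.
features = [((0, 0, 2, 2), 400), ((2, 0, 4, 2), 100)]
rows = population_in_rect((1, 0, 3, 2), features)
total = sum(estimated for _, estimated in rows)
```

Each feature contributes half of its population here, so `total` is the sum of 200 and 50.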


## ✅ Test: `test_calculatePopulationInPolygon_basic`

**Purpose:**
Validates that the function returns a non-empty result for a polygon that intersects existing data.

**What it does:**
- Constructs a buffered rectangle around the first polygon in the dataset.
- Calls `calculatePopulationInPolygon` to get population fractions.
- Asserts:
  - The result is not empty.
  - It contains expected columns: `geometry` and `areaFraction`.

**Why it matters:**
Ensures the intersection logic and population weight computation work for overlapping geometries.


## 🧪 Test: `test_calculatePopulationInPolygon_partial_intersection`

**Purpose:**
Checks the result when the query polygon only partially intersects **multiple existing** features.

**What it does:**
- Unions two adjacent polygons and creates a buffer around the centroid of the union.
- Runs the calculation on that buffer, so that several overlaps must be detected and their partial contributions calculated.

**Assertions:**
- Result is non-empty.
- At least one intersecting region is returned.


## 🚫 Test: `test_calculatePopulationInPolygon_outside_bounds`

**Purpose:**
Verifies that the function behaves correctly when the polygon is **outside the data bounds**.

**What it does:**
- Builds a polygon far away from all population data.
- Ensures the returned GeoDataFrame is empty.

**Why this test is important:**
Guards against false positives, and against outright errors, when the query polygon and the data do not overlap.
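
The expected behaviour can be illustrated with a bounding-box check: when the query shares no area with any feature, the result is empty rather than an error. This is a simplified sketch with hypothetical function names; it says nothing about how the toolkit performs the check.

```python
def bounds_overlap(a, b):
    """True if two (xmin, ymin, xmax, ymax) boxes share interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def features_hit(query_bounds, feature_bounds):
    """Feature boxes that overlap the query; an empty list if none do."""
    return [box for box in feature_bounds if bounds_overlap(query_bounds, box)]


data = [(0, 0, 1, 1), (1, 0, 2, 1)]
far_away = (100, 100, 101, 101)

# A query far from all data yields an empty result instead of raising
hits = features_hit(far_away, data)
```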


## 🛑 Test: `test_calculatePopulationInPolygon_invalid_datasource`

**Purpose:**
Checks error handling when the data source name does not exist.

**What it does:**
- Passes an invalid source name.
- Confirms that the code raises a `ValueError`.

**Why it's tested:**
Verifies input validation and prevents silent failures.
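
A test of this kind typically relies on `assertRaises`. The sketch below shows the pattern with `FakeToolkit`, a hypothetical stand-in written for this illustration; the real test calls `calculatePopulationInPolygon` on the actual toolkit.

```python
import unittest


class FakeToolkit:
    """Hypothetical stand-in, not the real DemographyToolkit."""

    def __init__(self, sources):
        self._sources = dict(sources)

    def get_source(self, name):
        # Fail loudly instead of returning None for an unknown name
        if name not in self._sources:
            raise ValueError(f"Unknown data source: {name!r}")
        return self._sources[name]


class TestInvalidSource(unittest.TestCase):
    def test_unknown_name_raises(self):
        toolkit = FakeToolkit({"lamas_population": []})
        # The test passes only if ValueError is raised inside the block
        with self.assertRaises(ValueError):
            toolkit.get_source("no_such_source")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestInvalidSource)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```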


## 🧮 Test: `test_createNewArea_simple`

**Purpose:**
Tests the creation of a new area and verifies population aggregation.

**How it works:**
- Creates a rectangle covering the full extent of the dataset.
- Calls `createNewArea` with `TOOLKIT_SAVEMODE_NOSAVE`.
- Verifies that:
  - The result is a `nonDBMetadataFrame`.
  - The geometry and population data are present.
  - The total population matches the expected sum.

**Key aspect tested:**
Correct aggregation and output structure when creating regions dynamically.
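
The reasoning behind the expected sum is that a rectangle covering the full extent contains every feature entirely, so the aggregated population should equal the plain sum over all features. The sketch below shows this with made-up bounding boxes and population values. With a real GeoDataFrame, the extent would usually come from `total_bounds` and the rectangle from `shapely.geometry.box`.

```python
def full_extent(boxes):
    """Smallest (xmin, ymin, xmax, ymax) box containing every input box."""
    xmins, ymins, xmaxs, ymaxs = zip(*boxes)
    return (min(xmins), min(ymins), max(xmaxs), max(ymaxs))


# Illustrative features only; the values are not taken from any dataset
features = [
    {"bounds": (0, 0, 2, 2), "population": 400},
    {"bounds": (2, 0, 4, 2), "population": 100},
    {"bounds": (0, 2, 4, 3), "population": 50},
]

extent = full_extent([f["bounds"] for f in features])

# Every feature lies fully inside the extent, so nothing is weighted down
expected_total = sum(f["population"] for f in features)
```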


## 📁 Test: `test_setDefaultDirectory_creates_and_sets_path`

**Purpose:**
Validates the behavior of setting and creating the default save directory.

**What it checks:**
- A temporary folder is created.
- The internal attribute `_FilesDirectory` is updated.
- The directory exists on the filesystem.

**Why it's important:**
Ensures that shapefiles are saved to a valid, existing directory, whichever system the toolkit runs on.
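
The same three checks can be sketched with the standard library. `DirectoryHolder` is a hypothetical stand-in for the toolkit, and its `create` parameter is an assumption made for this illustration. Only the attribute name `_FilesDirectory` comes from the test description above.

```python
import os
import tempfile


class DirectoryHolder:
    """Hypothetical stand-in for the toolkit's directory handling."""

    def __init__(self):
        self._FilesDirectory = None

    def setDefaultDirectory(self, path, create=True):
        # `create` is an assumed parameter, not the toolkit's documented API
        path = os.path.abspath(path)
        if create:
            os.makedirs(path, exist_ok=True)
        self._FilesDirectory = path
        return path


# A temporary directory keeps the check from leaving files behind
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "shapefiles")
    holder = DirectoryHolder()
    holder.setDefaultDirectory(target)
    created = os.path.isdir(target)
    matches = holder._FilesDirectory == os.path.abspath(target)
```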
