# Calculate crop statistics 

## Background

Crop type maps provide information on the distribution of different crops and can be used to generate crop area statistics, contributing to the understanding of agricultural production.

## Description

This notebook demonstrates how to calculate crop area statistics using the crop type maps and input administration boundaries vector file.
Results are inspected and saved into tables.


## Getting started
To run this analysis, run all the cells in the notebook, starting with the "Load packages" cell.

### Load packages

In [2]:
import os
import pickle
import json

import datacube
import geopandas as gpd
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import rioxarray
import xarray as xr
from deafrica_tools.spatial import xr_rasterize



## Load admin boundaries and crop map

We will load the data using coordinate reference system `EPSG:6933` for area calculation. The reference system uses units in meter.

In [40]:
output_crs = "EPSG:6933"

Geometries for regions where crop areas are to be calculated are loaded in a GeoPandas Dataframe. For calculating crop areas within the entire country, one geometry for the country should be provided. For calculating crop areas within provinces, a list of geometries for the provinces should be provided.

> For testing, we caculate crop areas for level 3 admin regions within a district where a crop type map has been generated.

In [75]:
# select the district name for which a crop map is available
district_name = "Nhamatanda" 
#district_name = "Lichinga"

mozambique_admins = gpd.read_file("Data/Mozambique_admin_gadm_level3.geojson")
area_of_interest_gdf = mozambique_admins[mozambique_admins["NAME_2"]==district_name]
area_of_interest_gdf.reset_index(inplace=True)

In [76]:
area_of_interest_gdf.explore(
    tiles = "https://mt1.google.com/vt/lyrs=y&x={x}&y={y}&z={z}", 
    attr ='Imagery @2022 Landsat/Copernicus, Map data @2022 Google',
    popup=True,
    cmap='viridis',
    style_kwds=dict(color= 'red', fillOpacity= 0, weight= 3),
    )

### Load the corresponding crop type map

In [77]:
crop_type_path= f"Results/Map/{district_name}_merged_croptype_prediction.tif"
da_crop_type=rioxarray.open_rasterio(crop_type_path).squeeze()

RasterioIOError: Results/Map/Nhamatanda_merged_croptype_prediction.tif: No such file or directory

### Load the crop class labels dictionary

In [78]:
# Dictionary with class labels from previous step
labels_path = "Results/Model/class_labels.json"

# Read the class label dictionary
with open(labels_path, "r") as json_file:
    labels_dict = json.load(json_file)

## Calculate areas per polygon

In [79]:
gdf_new=area_of_interest_gdf.copy()
for index, district in area_of_interest_gdf.iterrows():

    print(f"Processing polygon {index}")
    area_of_interest_gdf.loc[index,'ID']=index
    # Rasterize polygon
    district_mask = xr_rasterize(
        gdf=area_of_interest_gdf.iloc[[index]],
        da=da_crop_type,
        transform=da_crop_type.geobox.transform,
        crs=output_crs,
    )
    for class_name, class_value in labels_dict.items():
        # convert sq m to hectare
        crop_type_area=np.sum(district_mask==class_value)/10000.0
        attr_name=class_name+"_area_hectare"
        gdf_new.loc[index, attr_name] = crop_type_area
        print('area in hectare for {}: {}'.format(class_name,crop_type_area))

Processing polygon 0


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  super().__setitem__(key, value)


NameError: name 'da_crop_type' is not defined

## Plot areas histogram

In [None]:
gdf_new.plot(x='ID',y=[labels_dict.values()])

## Export to files

In [58]:
# Set results path
output_folder = "Results/Crop_stats"
if not os.path.exists(output_folder):
    os.makedirs(output_folder)

In [59]:
gdf_new.to_file(os.path.join(output_folder, f"{district_name}_crop_areas.geojson"))

In [60]:
# convert to CSV by dropping geometry
df = pd.DataFrame(gdf_new.drop(columns='geometry'))

df.to_csv(os.path.join(output_folder, f"{district_name}_crop_areas.csv"))