# Data calibration

<hr style="height:4px; background-color:black; border:none;">

<br>

This section explains how the model’s parameters are calibrated. It describes the processing steps, data sources, and relevant references.

## Geographic Extent
Brazilian Amazon Biome as defined in IBGE (2019), with a total area of 421,274,200 hectares.

## Model Variations


### 1043 sites
Regular grid of 1,043 sites with ∼ 67.5km x 67.5km resolution (67.5km-sites) covering the Brazilian Amazon Biome. The aggregation process consisted of three steps:
1. We rasterized the Amazon Biome using the MapBiomas grid (Souza et al. 2020), with pixels of ∼ 30m x 30m resolution (30m-pixels), as the baseline (to guarantee a perfect spatial overlap between the sites and the MapBiomas data).

2. We aggregated the Amazon grid by a factor of 2,250. Considering the top-left corner as the origin,
approximately every combination of 2,250 x 2,250 30m-pixels becomes one of the 67.5km-sites (grid
of 37x51 67.5km-sites).

3. From the 1,887 67.5km-sites, we selected the 1,043 sites with at least 3% of their area inside the
biome.

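The aggregation and selection steps above can be sketched in a few lines. This is a minimal illustration in Python with NumPy, not the project's R code: the function name `aggregate_share`, the toy mask, and the choice to count padded cells as outside the biome are all assumptions made for the example.

```python
import numpy as np

def aggregate_share(mask: np.ndarray, factor: int) -> np.ndarray:
    """Share of each factor x factor block covered by `mask` (1 = inside biome)."""
    rows, cols = mask.shape
    # Pad on the bottom and right so both dimensions are multiples of `factor`;
    # the origin stays at the top-left corner. Padded cells count as outside.
    pad_r = -rows % factor
    pad_c = -cols % factor
    padded = np.pad(mask.astype(float), ((0, pad_r), (0, pad_c)))
    n_r, n_c = padded.shape[0] // factor, padded.shape[1] // factor
    # Split the raster into blocks and average within each block.
    blocks = padded.reshape(n_r, factor, n_c, factor)
    return blocks.mean(axis=(1, 3))

# Toy 6 x 8 biome mask aggregated by a factor of 4 (the real grid uses 2,250).
toy_mask = np.zeros((6, 8), dtype=int)
toy_mask[:3, :5] = 1

share = aggregate_share(toy_mask, factor=4)
selected = share >= 0.03  # keep sites with at least 3% of their area inside

# Side length of a real site: 2,250 pixels of roughly 30 m each.
site_side_km = 2250 * 30 / 1000
```

With the real factor of 2,250, each block spans about 67.5 km per side, which is where the site resolution comes from.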
### 78 sites
Regular grid of 78 sites with ∼ 270km x 270km resolution (270km-sites) covering the Brazilian Amazon Biome. The aggregation process consisted of two steps:
1. We further aggregated the grid of 37x51 67.5km-sites by a factor of 4. Considering the top-left corner as the origin, approximately every combination of 4x4 67.5km-sites becomes one of
the 270km-sites. As 37/4 and 51/4 are not integers, the extent of the resulting grid is
somewhat larger (grid of 10x13 270km-sites).
2. From the 130 270km-sites, we selected the 78 sites with at least 3% of their area inside the biome.

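The grid dimensions quoted above follow from rounding up when the fine grid does not divide evenly. A small sketch of that arithmetic in Python; the helper `coarse_index` and the zero-based indexing from the top-left corner are illustrative assumptions, not names taken from the project's code.

```python
import math

fine_shape = (37, 51)  # grid of 67.5km-sites
factor = 4

# Rounding up keeps the partially filled blocks on the far edges,
# which is why the coarse grid extends slightly beyond the fine one.
coarse_shape = tuple(math.ceil(n / factor) for n in fine_shape)
n_coarse_sites = coarse_shape[0] * coarse_shape[1]

def coarse_index(i: int, j: int) -> tuple:
    """Map a fine-site index (i, j) to the coarse site that contains it."""
    return (i // factor, j // factor)

last_fine_site = coarse_index(fine_shape[0] - 1, fine_shape[1] - 1)
```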


## Parameters and Initial Conditions
### $Z^i_0$ and $\bar{z^i}$
1. We aggregated the MapBiomas land use/cover 30m-pixels (Souza et al. 2020) to the 67.5km-sites
calculating the area (in hectares) of the 67.5km-sites with forest cover and agricultural use in 2017.
2. $Z^i_0$ is defined as the area with agricultural use for each site $i$ in 2017, while $\bar{z^i}$ is the total land available in 2017 (sum of the area with agricultural use or forest cover).
3. Source: MapBiomas Land Use/Cover Collection 5 (Souza et al. 2020)

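As a rough illustration of the two definitions above, the sketch below converts per-site pixel counts into hectares. The pixel counts are made up, and treating every pixel as exactly 30 m x 30 m (0.09 ha) is a simplifying assumption; the actual processing may compute areas differently.

```python
import numpy as np

# A nominal 30 m x 30 m pixel covers 900 square meters, i.e. 0.09 hectares.
PIXEL_AREA_HA = 30 * 30 / 10_000

# Hypothetical 2017 pixel counts for three sites.
agri_pixels = np.array([1_000_000, 250_000, 0])
forest_pixels = np.array([3_000_000, 4_500_000, 2_000_000])

# Z_0: area with agricultural use; z_bar: agricultural use plus forest cover.
Z0 = agri_pixels * PIXEL_AREA_HA
z_bar = (agri_pixels + forest_pixels) * PIXEL_AREA_HA
```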
### $\gamma^i$
1. We extracted a random sample of 1.2 million 30m-pixels from the MapBiomas grid, with land
cover/use information from 1985-2019.
2. We identified 893,753 pixels that could be considered primary forests in 2018 (with no deforestation, at least since 1985).
3. We added aboveground biomass density data for 2017 from ESA Biomass (Santoro
and Cartus 2021). Biomass data also comes in a grid format with ∼ 100m resolution, so we
spatially matched it to our sample and calculated the average CO2 density (Mg/ha) on the
primary forest pixels within each site for each year.
4. We conducted a Bayesian estimation using the log of average CO2 density as the dependent variable and predicted $\gamma^i$ as the exponential of the fitted values. The specification is presented below,
where CO2e ha is the average CO2 density we calculated, historical_precip and historical_temp are the average annual precipitation (mm) and temperature (degrees Celsius),
respectively, for the period 1970-2000, and longitude and latitude are the geographical coordinates of the
municipality centroids. We estimate at the level of the 1,043 sites, and $\nu^i$ is a term for the coarse site (from the 78-site grid) that each observation belongs to.
5. Source: ESA Biomass (v3) (Santoro and Cartus 2021) and MapBiomas Land Use/Cover Collection 5
(Souza et al. 2020)

```{math}
\begin{align*}
\log(\text{CO2e ha}) &= \beta^\gamma_{0} 
+ \beta^\gamma_{1} \log(\text{historical_precip}) 
+ \beta^\gamma_{2} \log(\text{historical_temp}) \\
&\quad + \beta^\gamma_{3} \text{latitude} 
+ \beta^\gamma_{4} \text{longitude} 
+ \beta^\gamma_{5} (\text{latitude} \times \text{longitude}) + \nu^i + u_i
\end{align*}
```
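
To make the "exponential of the fitted values" step concrete, here is a simplified sketch in Python. It swaps the Bayesian estimation for ordinary least squares, omits the $\nu^i$ term, and uses synthetic covariates with made-up coefficients, so it only illustrates how the regressors enter and how $\gamma^i$ is recovered from the fit.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Synthetic site-level covariates (not real data).
precip = rng.uniform(1500.0, 3000.0, n)  # average annual precipitation, mm
temp = rng.uniform(24.0, 28.0, n)        # average temperature, degrees Celsius
lat = rng.uniform(-15.0, 5.0, n)
lon = rng.uniform(-73.0, -45.0, n)

# Design matrix in the order of the specification: intercept, log precipitation,
# log temperature, latitude, longitude, latitude x longitude.
X = np.column_stack([np.ones(n), np.log(precip), np.log(temp), lat, lon, lat * lon])

# Made-up coefficients generate noise-free log densities for the illustration.
beta_true = np.array([1.0, 0.5, 0.3, 0.01, -0.005, 0.0002])
log_co2e = X @ beta_true

# Least-squares fit (a stand-in for the Bayesian estimation in the text).
beta_hat, *_ = np.linalg.lstsq(X, log_co2e, rcond=None)

# Predicted gamma is the exponential of the fitted values.
gamma_hat = np.exp(X @ beta_hat)
```

Because the dependent variable is in logs, exponentiating the fitted values always yields a positive carbon density.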

### $\alpha$
1. $\alpha$ is the carbon depreciation parameter, calculated as $1-0.01^{1/100} \approx 0.045$.
2. We used an approximation, based on Heinrich et al. (2021), of 100 years for the time of convergence
of the carbon accumulation process. The convergence time depends only on $\alpha$ and the convergence
threshold (set to 0.99).
3. Assuming the 100-year period and a convergence threshold of 99%, we set $\alpha=1-(1-0.99)^{1/100} \approx 0.045$.
4. Source: Heinrich et al. (2021)

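The value of $\alpha$ can be checked numerically. The sketch below assumes, as an interpretation of the text, that the gap to the long-run carbon stock shrinks by a factor of $(1-\alpha)$ each year, so that 1% of the gap remains after 100 years; the variable names are illustrative.

```python
convergence_years = 100
threshold = 0.99  # share of the gap closed after `convergence_years`

# Solve (1 - alpha) ** convergence_years = 1 - threshold for alpha.
alpha = 1 - (1 - threshold) ** (1 / convergence_years)

# Sanity check: the gap remaining after 100 years should be 1%.
remaining_gap = (1 - alpha) ** convergence_years

print(f"alpha = {alpha:.4f}, remaining gap = {remaining_gap:.4f}")
```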
### $X_0^i$
1. $X^i_0$ is the stock of CO2e stored in the forest of site $i$ in 2017, calculated as $X^i_0 = \gamma^i(\bar{z}^i - Z_0^i)$ under the assumption
    that all forest at the initial point is primary.
2. Source: ESA Biomass (v3) (Santoro and Cartus 2021) and MapBiomas Land Use/Cover Collection 5
(Souza et al. 2020)

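A short numerical illustration of this formula, using made-up values for three sites (the numbers are not taken from the calibrated data):

```python
import numpy as np

gamma = np.array([500.0, 620.0, 450.0])               # CO2e density of primary forest, Mg/ha
z_bar = np.array([360_000.0, 427_500.0, 180_000.0])   # total land available, ha
Z0 = np.array([90_000.0, 22_500.0, 0.0])              # agricultural area in 2017, ha

# Forest area is whatever is not in agricultural use; all of it is treated as primary.
forest_area = z_bar - Z0
X0 = gamma * forest_area  # initial carbon stock per site, Mg CO2e
```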
### $\kappa$
1. $\kappa$ is the annual emission factor from agricultural use (CO2e/ha).
2. We combined the agricultural net annual emission data at the state level from SEEG (Sistema de
Estimativas de Emissões e Remoções de Gases de Efeito Estufa) with the agricultural area from
MapBiomas at the municipal level from 1990 to 2019.
3. Finally, $\kappa$ is the sum of the net agricultural emissions divided by the sum of the agricultural area,
averaged across states from 1990-2019 and weighted by the fraction inside the Amazon Biome.
Since we only have coarse data at the state level, we set a single value for all sites.
4. Source: SEEG/Observatório do Clima (De Azevedo et al. 2018); state division aggregated from the 2015
Municipality Division of IBGE (IBGE 2015); MapBiomas Collection 5, municipal level (Souza et al. 2020);
Biomes Division 2019 IBGE (IBGE 2019)

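One possible reading of step 3 is sketched below: a per-state ratio of summed emissions to summed agricultural area, then an average across states weighted by each state's share inside the biome. The wording above admits other readings (for example, a different order of summing and weighting), and all numbers here are invented for illustration.

```python
import numpy as np

# Rows are states, columns are years (hypothetical values).
emissions = np.array([[120.0, 130.0],
                      [40.0, 44.0]])   # net agricultural emissions, Mg CO2e
ag_area = np.array([[50.0, 50.0],
                    [20.0, 22.0]])     # agricultural area, ha
biome_share = np.array([1.0, 0.5])     # fraction of each state inside the biome

# Emission factor per state: total emissions over total agricultural area.
state_factor = emissions.sum(axis=1) / ag_area.sum(axis=1)

# Single kappa for all sites: biome-share-weighted average across states.
kappa = np.average(state_factor, weights=biome_share)
```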

### $\theta^i$
1. We used data at the municipal level from the 2017 and 2006 Agricultural Census on the value of
cattle sold for slaughter per hectare of pasture.
2. To build a smoother representation of technology and fill in missing values, we used the fitted
values of a Bayesian estimation to get the predicted slaughter value per hectare at the municipal level (see the equation below),
where slaughter value per hectare is the value of cattle sold per pasture area in 2017 (USD/ha),
historical_precip and historical_temp are the average annual precipitation (mm) and temperature (degrees Celsius),
respectively, for the period 1970-2000, and longitude and latitude are the geographical coordinates of the
municipality centroids. $\nu^m$ is a term for the Amazonian water basin the municipality belongs to.
3. To convert from the municipal level to the site level, we calculated a weighted mean of the predicted
value of this regression, with weights based on the share of the municipal area inside the site and the
pasture area of the municipality.
4. To convert to a productivity parameter, we divide the average predicted value of cattle sold for slaughter
per hectare of pasture area for each site in 2017 or 2006 (expressed in constant 01/2017 USD) by the
observed $\bar{P_t}^a$ (expressed in constant 01/2017 USD) with $t=2017$ or $t=2006$.
5. Source: 2017 Agricultural Census (IBGE 2017); 2006 Agricultural Census (IBGE 2006); Cattle prices
from SEAB-PR (SEAB-PR 2021); Historical climate data - WorldClim (Fick and Hijmans 2017)

```{math}
\begin{align*}
\log(\text{slaughter value per hectare}) &= \beta^\theta_{0} + \beta^\theta_{1} (\text{historical_precip}) 
+ \beta^\theta_{2} (\text{historical_temp}) + \beta^\theta_{3} (\text{historical_temp}^2) \\
&\quad + \beta^\theta_{4} (\text{lat}) + \beta^\theta_{5} (\text{lat}^2) 
+ \beta^\theta_{6} \log(\text{cattleSlaughter_farmGatePrice}) 
+ \beta^\theta_{7} (\text{distance}) + \nu^m + u_m
\end{align*}
```
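
The conversion from municipal predictions to a site-level productivity parameter (steps 3 and 4) can be sketched as follows. The inputs are invented, and taking the weight as the product of the area share and the pasture area is an assumption about how the two quantities are combined.

```python
import numpy as np

# Hypothetical inputs for one site that overlaps three municipalities.
predicted_value = np.array([100.0, 80.0, 60.0])    # predicted slaughter value, USD/ha
share_in_site = np.array([0.5, 0.25, 1.0])         # share of municipal area inside the site
pasture_area = np.array([2000.0, 4000.0, 500.0])   # municipal pasture area, ha
cattle_price = 50.0                                # stand-in for the observed cattle price

# Step 3: weighted mean of the municipal predictions.
weights = share_in_site * pasture_area
site_value = np.average(predicted_value, weights=weights)

# Step 4: divide by the cattle price to obtain the productivity parameter.
theta = site_value / cattle_price
```

Municipalities with no pasture or no overlap with the site receive zero weight and drop out of the average.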

### $\zeta$
1. The $\zeta$ parameters are the adjustment costs: $\zeta_1$ for deforestation and $\zeta_2$ for reforestation.
2. To calibrate $\zeta_1$, we compute the average marginal cost of
deforestation implied by our model using data from MapBiomas on annual historical deforestation between 2008 and 2017 (Souza et al. 2020) and match this to the difference in prices for
forested and cleared land (Araújo, Costa and Sant’Anna 2024). To calibrate $\zeta_2$, we compute the
average marginal cost of natural reforestation using data from MapBiomas on annual historical
secondary vegetation age (Souza et al. 2020) and match this to natural reforestation costs in
Benini and Adeodato (2017).

## Code
The calibration code is written in R and is available <a href = "https://github.com/patohdzs/amazon-carbon-prices/tree/fix_bayesian_model/rsrc">here</a>. Please refer to the `.Rprofile` file for the required package dependencies.

The code is composed of four parts:
1. `cleaning/_masterfile.R` downloads and loads the raw data, then performs cleaning and filtering to retain only relevant variables.
2. `processing/_masterfile.R` aggregates and merges datasets to align their scales, transforming the cleaned data into a format suitable for calibration.
3. `calibration/_masterfile.R` uses the processed data to calibrate the model parameters introduced in this section.
4. `_masterfile_all.R` provides a one-click entry point for external users: it runs all steps (cleaning, processing, and calibration) in sequence without manual intervention.

## Data Sources 📚

A summary of the raw data repositories used in this project:

- **[ESA Biomass](https://climate.esa.int/en/projects/biomass/data/)** — Spatially explicit estimates of above-ground biomass (AGB) globally  
- **[FGV (Fundação Getulio Vargas)](https://portal.fgv.br/en/research)** — Brazil’s leading economic research institute  
- **[Global Forest Watch](https://www.globalforestwatch.org/)** — Interactive platform for real-time forest monitoring  
- **[IBGE](https://www.ibge.gov.br/en/)** — Brazilian Institute of Geography and Statistics  
- **[IPEA](https://www.ipea.gov.br/forumbrics/en/)** — Institute for Applied Economic Research  
- **[MapBiomas](https://brasil.mapbiomas.org/en/)** — Annual land use/cover maps for Brazil  
- **[SEAB‑PR (Paraná Agriculture Secretariat)](https://www.agricultura.pr.gov.br/)** — State-level data on agricultural prices  
- **[SEEG](https://seeg.eco.br/english/seeg-data/)** — Brazilian greenhouse gas emissions and removals  
- **[World Bank Data](https://data.worldbank.org/)** — Comprehensive international development and environmental datasets


## References

1. Araujo, Rafael, Francisco Costa, and Marcelo Sant’Anna. 2022. “Efficient Forestation in the Brazilian
Amazon: Evidence from a Dynamic Model.” SocArXiv.
2. De Azevedo, Tasso Rezende, Ciniro Costa Junior, Amintas Brandão Junior, Marcelo dos Santos Cremer,
Marina Piatto, David Shiling Tsai, Paulo Barreto, et al. 2018. “SEEG Initiative Estimates of Brazilian
Greenhouse Gas Emissions from 1970 to 2015.” Scientific Data 5 (1): 1–43.
3. Fick, Stephen E, and Robert J Hijmans. 2017. “WorldClim 2: New 1-Km Spatial Resolution Climate
Surfaces for Global Land Areas.” International Journal of Climatology 37 (12): 4302–15.
4. Heinrich, Viola HA, Ricardo Dalagnol, Henrique LG Cassol, Thais M Rosan, Catherine Torres de Almeida,
Celso HL Silva Junior, Wesley A Campanharo, et al. 2021. “Large Carbon Sink Potential of Secondary
Forests in the Brazilian Amazon to Mitigate Climate Change.” Nature Communications 12 (1): 1–11.
5. IBGE. 2006. “Censo Agropecuário: Tabelas 930, 1421.” Instituto Brasileiro de Geografia e Estatística
(IBGE), Ministério da Economia. Available at: https://sidra.ibge.gov.br/geratabela?format=us.csv&name=agCensus2006_agUseArea.csv&terr=NC&rank=-&query=t/1421/n6/all/v/184/p/all/c222/113466,113467,113468,113469,113470,113471,113472/c12603/0/c12625/0/c472/0/d/v184%202/l/,p%2Bc12603%2Bv%2Bc222,t%2Bc12625%2Bc472, https://sidra.ibge.gov.br/geratabela?format=us.csv&name=agCensus2006_cattleSlaughter.csv&terr=NC&rank=-&query=t/930/n6/all/v/2067,2068/p/all/c12645/114291,114292,114293/c218/0/c7940/0/c3244/0/c12517/0/c12625/0/d/v2068%200/l/,c3244%2Bp%2Bc12517%2Bc12625%2Bc218%2Bc7940%2Bv,t%2Bc12645. Accessed on: May 5, 2023.
6. IBGE. 2015. “Malhas Municipais: Shapefile, 2015.” Instituto Brasileiro de Geografia e Estatística
(IBGE), Ministério da Economia. Archived at: https://web.archive.org/web/20200916142056/ftp://geoftp.ibge.gov.br/organizacao_do_territorio/malhas_territoriais/malhas_municipais/municipio_2015/Brasil/BR/br_municipios.zip. Archived on: September 16, 2020.
7. IBGE. 2017. “Censo Agropecuário: Tabelas 6882, 6911.” Instituto Brasileiro de Geografia e Estatística (IBGE), Ministério da Economia. Available at: https://sidra.ibge.gov.br/geratabela?format=us.csv&name=agCensus2017_cattleSold.csv&terr=NC&rank=-&query=t/6911/n6/all/v/9523,9529,9743,9749/p/all/c829/46302/c12625/41140/c220/110085/d/v9743%203,v9749%203/l/,p%2Bc829%2Bc12625%2Bv,t%2Bc220, https://sidra.ibge.gov.br/geratabela?format=us.csv&name=agCensus2017_agUseArea.csv&terr=NC&rank=-&query=t/6882/n6/all/v/184/p/all/c829/46302/c222/40677,40678,113469,113470,113471,113472/c12771/45951/c830/46427/c220/110085/d/v184%203/l/,p%2Bc829%2Bc12771%2Bv%2Bc222,t%2Bc830%2Bc220. Accessed on: May 5, 2023.
8. IBGE. 2019. “Biomas Do Brasil: Shapefile, 2019.” Instituto Brasileiro de Geografia e Estatística (IBGE),
Ministério da Economia. Archived at: https://web.archive.org/web/20200916173523/ftp://geoftp.ibge.gov.br/informacoes_ambientais/estudos_ambientais/biomas/vetores/Biomas_250mil.zip. Archived on:
September 16, 2020.
9. Santoro, Maurizio, and Oliver Cartus. 2021. “ESA Biomass Climate Change Initiative (Biomass_cci):
Global Datasets of Forest Above-Ground Biomass for the Years 2010, 2017 and 2018, V3.” NERC EDS
Centre for Environmental Data Analysis. https://doi.org/10.5285/5F331C418E9F4935B8EB1B836F8A91B8.
10. SEAB-PR. 2021. “Preço Médio - Recebido Pelo Agricultor: Boi Gordo, Arroz (Em Casca), Cana-de-Açúcar,
Milho, Mandioca, 1990-2021.” Secretaria da Agricultura e do Abastecimento do Estado do Paraná,
Departamento de Economia Rural [publisher], Instituto de Pesquisa Econômica Aplicada, Ministério da
Economia [distributor]. Available at: http://www.ipeadata.gov.br. Accessed on: February 22, 2021.
11. Souza, Carlos M., Julia Z. Shimbo, Marcos R. Rosa, Leandro L. Parente, Ane A. Alencar, Bernardo F.
T. Rudorff, Heinrich Hasenack, et al. 2020. “Reconstructing Three Decades of Land Use and Land
Cover Changes in Brazilian Biomes with Landsat Archive and Earth Engine.” Remote Sensing 12 (17).
https://doi.org/10.3390/rs12172735.


<br>
<hr style="height:4px; background-color:black; border:none;">