# Some important notions before we start

PnET-Succession is an ecophysiological model (PnET) that has been further enhanced to work within the framework of LANDIS-II as a succession extension. This makes it quite complex.

As I worked on this document to write a complete guide to calibrating PnET-Succession, many questions arose. Several of them are important enough that I recommend you engage with them **before continuing this guide**, as they will most likely influence your decisions moving forward.

Many of the questions and ideas mentioned here might seem overwhelming at first, but you'll see that we still end up with some concrete recommendations. In addition, the following notebooks give a complete practical example for every step of the calibration, which you can copy or tweak for your own use.

Still, I highly recommend that you consider everything written here before you start, to get a good grip on the limitations and justifications behind this calibration method.

## Why do we need to calibrate PnET-Succession ? What does calibration mean ?

```{image} ./Images/eric-prouzet-n_1SH37KbdQ-unsplash.jpg
:alt: An old thermometer used to illustrate the idea of calibration.
:width: 300px
:align: left

Image from Eric Prouzet on Unsplash.
```

:::{note} The short answer
Calibrating means adjusting the model's parameters to ensure that its outputs are correct in contexts where we know what they should be.

PnET-Succession is a model where most parameters can be derived directly from empirical values, but some of them still require calibration because biological processes are quite complex. Calibration is also needed because you will be using the model to make predictions, so you want to be extra sure that it behaves correctly and for the right reasons.
:::

In ecology, calibrating a model involves adjusting the model's parameters to ensure that its outputs align closely with real-world observations.

Not all models need to be calibrated, of course. In science, you can find physical models (e.g. [some climate models](https://www.ouranos.ca/en/understanding-climate-concepts/climate-simulation-without-model-calibration)) that do not require calibration because their equations replicate almost perfectly the laws of physics that drive the processes they are modelling. For these models, we can simply insert empirical values or constants that will drive the processes inside the model, and let things be.

That's not the case for a model such as LANDIS-II and PnET-Succession, because the processes driving tree growth are much more complex to model than abiotic processes. It's currently impossible to model the fine physical processes that explain how a tree grows and dies; there are too many components interacting together. This makes it very difficult to write equations that explain the behaviour of the modelled processes, because everything becomes extremely contextual. I'm not an expert on this subject, but I find this text about the science of ecology by [Sutherland (2013)](http://doi.wiley.com/10.1111/1365-2745.12025) to be quite a good summary of the issue:

> One explanation of barriers to progress in ecology maintains that it is a science of middle numbers (Allen & Hoekstra 1992).
In small-number systems like the solar system, the relationships between the components, and the state of the system, can often be adequately described by a simple set of equations.
In contrast, in large-number systems such as chemical interactions in fluids, the behaviour of the system can usually be adequately described using statistical averages because of the large number of components and the simple nature of their interactions.
Ecological systems unfortunately belong to the study of middle numbers: they are too complex to describe individually, yet their components are too few and their interactions too complex to be described by statistical dynamics. 

Therefore, the best we can do in forest ecology is to make models that contain as many components of the organisms we are simulating as possible (or at least their most important components). Then, we make sure that the models behave "like they should" in contexts where we are quite certain of how they should behave. In return, we hope that this will make the models behave "like they should" in contexts where we are less certain (i.e. when we are making predictions). This, in essence, is calibration.

Still, you are in luck! PnET-Succession was thought out well enough to reproduce the dynamics of tree growth without too much calibration. This means that many of its parameters can be taken directly from empirical values without further change (as long as these empirical values are available). As [Eric Gustafson writes in his calibration tips](https://github.com/LANDIS-II-Foundation/Extension-PnET-Succession/blob/master/deploy/docs/LANDIS-II%20PnET-Succession%20v5.1%20User%20Guide.pdf):

> One of the compelling features of PnET-Succession is that its parameters are mostly empirically estimable values, and it autonomously produces very realistic growth responses under a wide variety of abiotic conditions when its parameters are set correctly. The developers of the PnET model claim (perhaps too optimistically) that PnET does not need calibration because it is completely based on empirically observed inputs (parameters) and relationships (see Aber et al. 1995).

However, Eric Gustafson further writes that according to him, some parameters still need some calibrating. These parameters are those related to the photosynthetic capacity, and the ones determining the response of the species to changes in available water and in temperature :

> [...] because the model is designed to use empirically known parameters to mechanistically simulate growth and competition based on first principles of physiology, **the primary purpose of calibration of PnET-Succession is to get photosynthetic capacity (via Foliar Nitrogen) and amount of foliage correct**. A secondary purpose is **to calibrate parameters (species and ecoregion) that determine hydrology and response to water and temperature**. This means that of the dozens of parameters used by PnET-Succession, very few need calibration, and most applications can use empirically derived (or default) values for most parameters

In any case, you will want to do some calibration, even if it's just to be certain that your model produces realistic values. This is because LANDIS-II and PnET-Succession are used for predictive purposes (i.e. predicting how forests will evolve in a given context). As such, you want PnET-Succession to reasonably mimic the way that trees grow and die inside forests, so that your LANDIS-II simulation produces outputs that can give you insights about reality. This cannot happen if PnET-Succession does not correctly replicate the way that trees of different species grow and die, and there are a couple of traps that can lead to exactly that.

## The traps of calibrating PnET-Succession

:::{note} The short version
PnET-Succession is so complex that different sets of parameters can lead to the same output. This means that the model can, in some situations, give you the right output for the wrong reasons.

That is why we'll try to calibrate PnET-Succession in a way that ensures as much coherence in our parameter values as we can, rather than risking getting the right outputs from incorrect values.
:::

In his calibration tips, Eric Gustafson highlights several "traps" in calibrating a complex model like PnET-Succession. All of them seem to be linked to the problem of *equifinality*, meaning that in a given context, the model can produce the same results with different parameters (in other words, the same results for different reasons).

For example: PnET-Succession might produce a realistic growth curve for a tree species you just calibrated for your study area under a given set of temperatures and precipitation, but only because you overestimated its growth under the current temperatures and also overestimated its sensitivity to drought. As such, while your result is correct in this situation where your two errors compensate for each other, your modelled tree species will most likely react in very unrealistic ways under different conditions of temperature or soil moisture, for example if climate change impacts your area. This means that PnET-Succession can do the right thing for the wrong reasons in one context, and then do very wrong things because of the same wrong reasons in another context.

This means that a "brute-force" calibration is not necessarily a good idea here. Brute force means testing many combinations of parameter values through random permutation until we find a combination that generates a growth curve that seems realistic. But equifinality implies that there might be several such combinations, and that many of them might produce the right response for the wrong reasons.
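
To make the idea of compensating errors more tangible, here is a minimal sketch using a deliberately simplistic toy growth function. Everything in it is hypothetical: `toy_growth`, its two parameters and all the numbers are made up for illustration and are not PnET-Succession equations or parameters.

```python
def toy_growth(max_growth, drought_sensitivity, water_stress):
    """Hypothetical toy response: growth drops linearly with water stress (0 to 1)."""
    return max_growth * (1.0 - drought_sensitivity * water_stress)

# Two hypothetical parameter sets (NOT PnET-Succession parameters).
set_a = {"max_growth": 10.0, "drought_sensitivity": 0.5}
set_b = {"max_growth": 12.5, "drought_sensitivity": 0.9}

calibration_stress = 0.4  # water stress in the context used for calibration
drier_stress = 0.8        # a drier context, e.g. under climate change

for name, params in (("A", set_a), ("B", set_b)):
    calibrated = toy_growth(water_stress=calibration_stress, **params)
    projected = toy_growth(water_stress=drier_stress, **params)
    print(f"set {name}: calibration={calibrated:.2f}, drier={projected:.2f}")
```

Both sets give a growth of about 8 in the calibration context, so a calibration target alone cannot tell them apart; in the drier context they diverge (about 6 versus 3.5). A real model with dozens of interacting parameters offers many more ways for such compensations to occur.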

To avoid these traps, Gustafson's calibration tips recommend a step-by-step approach where you calibrate the model in a way that ensures that the *logic* behind the values you are using is sound. For example, tree species with different shade tolerances should have half-saturation parameters (the parameter influencing shade tolerance) that represent these shade tolerances correctly, especially relative to each other. When a parameter needs a tweak, Gustafson recommends doing it in a way that respects the logic of previous tweaks to other parameters, but also the differences in the life strategies of the tree species you are dealing with.
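
This kind of coherence check can be sketched in a few lines. The example below assumes that a lower half-saturation value corresponds to a more shade-tolerant species, and uses a 1 to 5 shade tolerance class; the species table, the `half_sat` key and all values are hypothetical placeholders, not recommended parameters.

```python
# Hypothetical table: shade tolerance class (1 = intolerant, 5 = very tolerant)
# and a candidate half-saturation value. All numbers are illustrative only.
species = {
    "Abies balsamea": {"shade_tolerance": 5, "half_sat": 200.0},
    "Acer rubrum": {"shade_tolerance": 3, "half_sat": 320.0},
    "Populus tremuloides": {"shade_tolerance": 1, "half_sat": 300.0},
}

def find_incoherent_pairs(table):
    """Return (tolerant, intolerant) pairs whose half-saturation ordering is reversed.

    Assumption: a more shade-tolerant species should have a LOWER half-saturation.
    """
    problems = []
    names = sorted(table)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            tol_first = table[first]["shade_tolerance"]
            tol_second = table[second]["shade_tolerance"]
            if tol_first == tol_second:
                continue  # same class: no expected ordering
            if tol_first > tol_second:
                tolerant, intolerant = first, second
            else:
                tolerant, intolerant = second, first
            if table[tolerant]["half_sat"] >= table[intolerant]["half_sat"]:
                problems.append((tolerant, intolerant))
    return problems

print(find_incoherent_pairs(species))
```

With these made-up values, the check flags the pair (*Acer rubrum*, *Populus tremuloides*): the more tolerant species has the higher half-saturation, which is the kind of inconsistency worth resolving before moving on.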

## What are we calibrating here ? Entire tree species as a whole ? Variants ? Or phenotypes ?

```{image} ./Images/saira-ahmed-760VRBl1-Gg-unsplash.jpg
:alt: Two trees in the fog. 
:height: 300px
:align: left

Image from Saira Ahmed on Unsplash.
```

:::{note} The short answer
PnET-Succession simulates tree species in which all individuals of the same species share the same parameters; but in real life, there are only individual trees that differ from each other, even within the same species. **You need to consider this and address it explicitly in your modelling assumptions**.

You can model "variants" or "phenotypes" using pseudo-species in PnET-Succession to add some variability between trees of the same species. You will have to decide whether you need to do this based on your study area, your research question and the data available to you. But keep in mind that PnET-Succession cannot make these variants/phenotypes reproduce with each other, as they will be considered separate species inside the model.

In any case, you should always remember and explicitly indicate that you are simulating an abstraction of many individual trees at once; and that even if you choose to model an entire tree species without simulating variants or phenotypes, you will in the end only represent trees from the area where your empirical data come from, and not necessarily the species as a whole (since you'd need worldwide data for that).
:::

This is a crucial question. While PnET-Succession contains parameters that are not species-specific, most of the parameters we'll be interested in during the calibration are directly related to the tree species you will be simulating.

But what are we really creating or replicating here in PnET-Succession? A tree species? A variant inside this species? Or even a given phenotype? The key problem behind these questions is that in nature, the dynamics of an individual tree result from both its genotype/DNA and its growth history (which might produce different phenotypes for a given DNA); in addition, genotypes might differ substantially between two individuals of the same species. This means that two individuals of the same tree species might grow differently. Thus, a model trying to represent the growth of these two trees should ideally use a different set of parameters for each of them, so as to properly model their difference in growth.

But here we reach another limit of modelling in forest ecology: while PnET-Succession might be a very good model that can produce robust and realistic growth behaviours with many empirical, non-calibrated parameters, it's still a model that represents entire species. PnET-Succession will, for example, simulate "Abies balsamea", the balsam fir; but on the ground, in reality, each tree of this species is "a balsam fir", not "the balsam fir". Trees are not abstractions of a species as a whole; they are individuals with inter-individual and intra-specific differences. But PnET-Succession works by simulating "species" that are abstractions.

So, how do we define this abstraction? How do we calibrate a "tree species" while accounting for the fact that there are individual differences between trees of the same species, which we cannot represent in PnET-Succession? Are we supposed to find parameters that represent "the mean" or "the average" of the species in terms of its growth, of the temperatures that it prefers, of its drought tolerance? And how do we get empirical data that represent a "species average", since all empirical data ultimately come from real individual trees that are not necessarily average? All of these questions are important, as they will influence the empirical data you will need for the calibration, as well as the eventual *pseudo-species* you will define (see below).

After thinking long and hard about it, and discussing it with Eric Gustafson and Brian Sturtevant, it looks like the answer to these questions depends on your research question. First of all, while PnET-Succession does simulate species rather than individuals, nothing prevents you from implementing *pseudo-species* in the model. For example, instead of just having an *Abies balsamea*, you can have an *Eastern Abies balsamea* and a *Western Abies balsamea* to represent two variants or two phenotypes that cohabit in your landscape. Unfortunately, PnET-Succession will not be able to deal with reproduction between these two variants (they will be considered two completely distinct species); but it's a possibility.

But ultimately, the choice will depend on three things : your study area, your research question and the availability of data you have.

- Are there any variants or phenotypes of a given species that are well-known in your study area ? If yes, you might want to calibrate them separately **if you have empirical data or knowledge that can allow you to isolate their behaviour**. In particular, you will need empirical growth data (see first step of calibration) and information about the temperature preference and drought tolerance of each phenotype/variant.
    - If you don't have the data to calibrate each phenotype/variant, then you can still try to simulate your study area with what you have, and use a species generalisation. Simply be careful to be completely explicit about this limit in your publications, including in how you interpret your results.
- Does your research question involve the behaviour of variants or phenotypes for a species? For example, do you want to try to add "southern variants" in a northern landscape to help forests against climate change? If so, you'll have to define variants.
    - If you're lacking empirical data about these variants, you might calibrate them "by hand", meaning by tweaking the values of some parameters by yourself, not based on any empirical data. For example, you might change the PsnTOpt parameter to change the optimal temperature for photosynthesis; you might increase it by one or two degrees for a southern variant.
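
As a rough illustration of defining a variant "by hand", the sketch below copies a base parameter set and shifts `PsnTOpt` by two degrees for a southern pseudo-species. The dictionary layout, the species names, the `add_variant` helper and the value of 18.0 are all hypothetical; this is not the format of PnET-Succession input files. The log of hand tweaks is there to make it easy to report these choices later.

```python
import copy

# Hypothetical in-memory parameter table; only PsnTOpt is shown, with a placeholder value.
species_params = {"AbiesBalsamea": {"PsnTOpt": 18.0}}
hand_tweaks = []  # record of every manual change, to report in publications

def add_variant(table, base_name, variant_name, overrides, reason):
    """Create a pseudo-species by copying a base species and overriding a few values."""
    variant = copy.deepcopy(table[base_name])
    variant.update(overrides)
    table[variant_name] = variant
    hand_tweaks.append({"variant": variant_name, "changes": dict(overrides), "reason": reason})
    return variant

add_variant(
    species_params,
    base_name="AbiesBalsamea",
    variant_name="AbiesBalsameaSouth",
    overrides={"PsnTOpt": species_params["AbiesBalsamea"]["PsnTOpt"] + 2.0},
    reason="southern variant; optimum raised by hand, not based on empirical data",
)

print(species_params)
print(hand_tweaks)
```

The base species is left untouched, and each pseudo-species would then be written out as its own species entry in the model inputs, since PnET-Succession treats it as a completely separate species.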
 
Even if you end up simulating only one set of parameters (no variants or phenotypes) for a given species in your study area, be aware that you will most likely not be simulating an entire species. Depending on the origin and nature of the empirical data that you use, you will most likely create an abstract entity that represents an average of the trees of that species in the area where your empirical data come from. For example: if you use empirical data from the province of Quebec in Canada to calibrate the species *Abies balsamea*, keep in mind that what you will have done is represent an abstraction of the *Abies balsamea* trees in Quebec, and not necessarily of *Abies balsamea* as a whole. If you took data from different Canadian provinces, you might get a different set of parameters. The resulting differences might be minimal; but it's important to be explicit about what we calibrate, especially because future researchers might use your work in a different study area.

So once you have answered these questions (whether or not you will simulate different variants or phenotypes), you will need to choose the origin of your empirical data. Let's talk about that.

## What should be the spatial extent of the empirical data (tree growth, parameter values etc.) that I will use for my calibration ?

:::{note} The short answer

Gather as much data as you can across the distribution range of your tree species. Then, compare the data from inside or near your study area (if there is any) with the rest of the data you have. Use this comparison to decide where to stop extending the spatial extent of the data you use. The idea is to take enough data to capture the behaviour of your tree species and how it differs from other tree species, but to stop as soon as you can to limit the number of different genotypes, phenotypes or variants present in your data.

In situations where you lack data, make do with what you have unless it seems really strange (in particular when comparing data between the different species you want to calibrate). Try to prioritize coherence; for example, if an empirical half-saturation value for photosynthesis that you obtained for a species clearly does not align with what you know about its shade tolerance (or with the half-saturation values of your other species), you might be better off using an arbitrary value that is more coherent with the logic of PnET-Succession (which is: the half-saturation values of your different species should represent their shade tolerance relative to each other).

**In any case, always be explicit and transparent about your choices. Document them as much as you can. They will help you make sense of the outputs of the model, and nuance your conclusions about your results.**
:::

This is again a crucial question. For the calibration, you will need empirical data about the growth of each of your tree species (step 1), about their reactions to changes in temperature and soil water content (steps 2 and 3), and also about many other parameters that can be derived from empirical values. But you will most certainly find empirical data from both inside and outside your study area. Which should you choose? This is especially crucial for the empirical growth data you will require for step 1.

To make the problem concrete, let's imagine the following situation: we're making LANDIS-II simulations with PnET-Succession for a landscape in the province of Quebec, Canada, and we want to calibrate the parameters of *Abies balsamea*. So we download empirical growth data for this species for step 1 of the calibration. But this gives us a choice: we can use empirical data from Quebec only; we can add data from the neighbouring province of Ontario; or we can even take data for all of Canada.

The thing is, using data from Quebec only will limit us because we will have fewer data points. This will increase our uncertainty about the growth characteristics of our species that we're trying to capture for the calibration (especially for rarer tree species!). For example, during step 1 of the calibration, we need a good estimate of the "growth peak" of the species. But are we certain that the Quebec data will properly show the growth peak? Or might it underestimate it?

But using data from surrounding provinces, or even from the whole of Canada, will lead us to approximate the behaviour of this species from an increasing number of individual trees, coming from different biotic and abiotic conditions where they might face competition from different species, etc. This means that we are mixing more trees, phenotypes or variants (or simply differences in the external conditions that influenced the growth of the trees) into one abstract entity. And while it's difficult to know precisely how this will influence the parameters you obtain, it's easy to understand that it might create issues. To get back to our example with the growth peak, let's say that by taking data for all of Canada, you see a growth peak that is higher than the one you see if you only look at data from Quebec. Is this higher growth peak relevant to you? Maybe it represents the performance of a variant or local genotype that is not present in Quebec; or maybe it represents the effect of growth conditions (for example, better temperatures, precipitation or soils) that are not found in Quebec. The question remains: should you use this data?

So in the end, we have a clear trade-off: the larger the area from which you take empirical data, the more data you will have to capture the variability in the species' growth patterns or characteristics, which is great where local data is scarce or not very well sampled! But this comes with the risk of misrepresenting the growth of your species in your local context, since you will be using data from very different contexts. As such, you need enough data to properly capture the behaviour of your tree species, but not so much that you can no longer represent its *local* behaviour. The best approach is most likely to keep the spatial extent of your empirical data as small as possible, while ensuring that you have captured enough variability.

:::{attention} But why do we need local empirical data to calibrate ? Can't PnET-Succession deal with any provenance of data, as long as we match it with the right climate or soil data ?

A part of you might ask whether this need for a "local" estimate of our tree species' performance is a real issue. After all, PnET-Succession is such a good model that, as long as we match the empirical data with the right context associated with it (e.g. the soils, precipitation, light, competition and everything else), **wherever this empirical data comes from**, our calibration might produce the right parameters and thus realistic outputs! In other words, the important thing would not be to have empirical data from your study area, but rather to match the empirical data with the right context.

A consequence of this idea is that if two researchers try to calibrate the same tree species using empirical data from two faraway places (e.g. at the extreme north or south of the species distribution), then as long as they do the calibration right (by pairing this empirical data with the right temperatures, soils and precipitation), they will get the same parameters! Another consequence is that there could be a single set of PnET-Succession parameters for each unique tree species on Earth. All that PnET-Succession would then need to model the growth of a tree species is the local context (i.e. climate, soil, but also competition from other trees).

I'm of the opinion that this is not true, and for one good reason: the parameters of PnET-Succession are used to represent several characteristics of our tree species (e.g. foliar nitrogen, drought tolerance, etc.) that are well known to vary between individuals of the same species. If you take two trees of the same species, but one from the northern edge of the species' distribution and the other from the southern edge, they are bound to be quite different because they grew in different conditions, expressed different genes, and thus developed distinct phenotypes. That's not even mentioning local adaptation in the genotype of a given species, which can also be quite substantial.

That is why, if two researchers calibrate the same tree species with data from the northern and southern edges of its distribution, they will most likely end up with different parameters, which will represent the variation of the tree's genotype and phenotype along its range. **This is the reason why I recommend limiting the spatial extent of your empirical data as much as possible**, so that you avoid mixing data from trees that belong to the same species but still differ from each other when generating the parameters of a single tree species in PnET-Succession. In fact, I'd say that you should expect to find relatively unique parameters for your study area.

Of course, not all of your parameters will be unique, most likely because you will lack data for some of them and will need to use what's available. And maybe you'll find parameters really close to those of previously published studies. But you should verify this through your own calibration.
:::

So, to what spatial extent should you limit yourself? I still do not have a clear answer to this question. I believe that this is contextual. Here is what I would recommend:
- If possible, get as much empirical data as you can using national or even international datasets (see further sections for what kind of empirical data you'll be looking for precisely).
- Then, select data from your study area for the species you want to calibrate, and compare (for each species) what you see locally versus what you see overall. Are there big differences ? For example, are your local values much smaller or bigger than the overall average ? If so, try to identify the reasons that might explain these differences.
    - If the reason is that you have only a relatively small amount of data locally (e.g. fewer than 50 data points to describe the growth of your tree species), then you have a good reason to take data from an extended spatial range. Try to extend it until you reach a satisfactory amount (e.g. more than 100), or until you see a clear pattern emerging in your data. Be on the lookout for bimodal distributions or other patterns that might indicate that the data you're adding represent a population of trees with relatively different characteristics.
    - If the reason is that the distribution or average differs between the local data and the more global data, then you might be looking at differences in genotypes, phenotypes or variants that are present in your data. If that's the case, I recommend again that you take as little data as possible, focused on your study area or its surroundings. If you still see multi-modal distributions or strange patterns in your data, you might be looking at variants or phenotypes inside your study area. You will then be able to decide if you want to simulate them individually (see previous sections).
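
The extension logic described in this list can be sketched as follows. This is only an illustration under stated assumptions: the `(region, value)` record layout, the `select_extent` helper, the region ordering and the synthetic numbers are all hypothetical, and the thresholds simply reuse the 50 and 100 data points given above as examples.

```python
from statistics import mean

MIN_LOCAL_POINTS = 50   # fewer local points than this: extend the spatial extent
TARGET_POINTS = 100     # stop extending once this many points are gathered

def select_extent(records, regions_by_proximity):
    """Add regions from nearest to farthest until enough data points are gathered.

    records: list of (region, growth_value) tuples (hypothetical layout).
    regions_by_proximity: region names, with the study area first.
    """
    used, selected = [], []
    for index, region in enumerate(regions_by_proximity):
        selected.extend(value for name, value in records if name == region)
        used.append(region)
        # Local data alone is accepted at a lower threshold than extended data.
        threshold = MIN_LOCAL_POINTS if index == 0 else TARGET_POINTS
        if len(selected) >= threshold:
            break
    return used, selected

# Synthetic records, for illustration only.
records = (
    [("Quebec", 2.0 + 0.01 * i) for i in range(30)]
    + [("Ontario", 2.2 + 0.01 * i) for i in range(80)]
    + [("Manitoba", 2.5 + 0.01 * i) for i in range(200)]
)
used, selected = select_extent(records, ["Quebec", "Ontario", "Manitoba"])
local = [value for name, value in records if name == "Quebec"]
print(used, len(selected))
print(f"local mean={mean(local):.2f}, selected mean={mean(selected):.2f}")
```

Here the extension stops after adding Ontario (110 points) and never touches the farthest region. Comparing the local and selected means is only a first look; plotting the distributions remains the better way to spot bimodal patterns.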

:::{hint} Being explicit and transparent about your choices of data, per [Reese et al., 2024](https://doi.org/10.1139/cjfr-2024-0085).


**Whatever data and spatial extent you choose, you should be completely transparent about this choice in the methodology section of your publications.**

[Reese et al., 2024](https://doi.org/10.1139/cjfr-2024-0085) provides a very good step-by-step guide for communicating the assumptions and hypotheses surrounding your calibration, with some direct examples for LANDIS-II studies in the supplementary material (e.g. see Supp. Mat. A.).

To return to the example from the beginning of this section: let's say I choose to take empirical growth data from both Ontario and Quebec for a given species (even though my study area is in Quebec and not in Ontario), because it's a relatively rare species in Quebec, and the Ontario data showed a more complete growth pattern than the Quebec data alone.

I could write in my publication :

> For *Abies balsamea*, we chose to include empirical growth data from both Ontario and Quebec for our first calibration step. This is because the data in Quebec was sparse, and adding the data from Ontario showed a clearer growth pattern that served as a better target for our calibration step. Ontario is near Quebec, and the two provinces have similar climates in some locations. However, we do acknowledge that differences might exist in trees of *Abies balsamea* between the two provinces, and in the contexts (climate, soil) that shaped them. Still, **we make the assumption that these differences are small enough to warrant using data from Ontario to calibrate our species for our study area in Quebec**, and that they will not lead to important effects on the outcomes of our simulation.

That way, the choice I made (using data from both Ontario and Quebec), the reason of my choice (not enough data in Quebec), and the assumption behind my choice (data from Ontario will not affect results too much) are clearly explained. This will help the interpretation of my results, and it will also help future modellers that might want to use the parameters resulting from my calibration. 
:::

Now that you've acquired your empirical data, there is only one question left : how to find the climate data to match it.

## What should be the spatial extent of my climate data during the calibration ?

Calibration means running simulations and tweaking the parameters of the model until you get the expected output in a situation where you know what the output should be. But that's the thing: what climate data should you give to these simulations? (Let's call these "climate data streams": each one is a time series of all the climate variables necessary for PnET-Succession for one location, or one cell of the landscape.) Indeed, you probably gathered empirical data from a large area (or even a reasonably sized landscape) for the calibration; but that means that many different local climate data streams will be associated with your overall empirical data. Which one should you choose?

To give a more practical example, let's imagine we have empirical data to calibrate *Abies balsamea* :

- Foliar nitrogen comes from measurements taken in Minnesota
- Tree growth data comes from sample plots in all of Quebec and Ontario

Right here, you already have a lot of different climates involved, each associated with the individual entries in those datasets. Which one should you choose? Northern USA? Quebec? Ontario? And where in Quebec and Ontario exactly?

You could, of course, run many simulations with a different climate data stream every time (e.g. one simulation with climate from the USA, one from Quebec, one from Ontario; maybe several more with different climate streams for each region). You could also calibrate using many cells in your simulation, where each cell has a different climate. But I don't think this would solve the problem, since different pieces of your empirical data might come from different places (e.g. you took foliar nitrogen values from the only place where they were available, and your tree growth data from somewhere else entirely). You will therefore always have a mismatch between the climate data of any of your simulations and the climate associated with the different parts of your empirical data. Plus, this would take a lot of time to do and to interpret. And your time is limited.

The ideal configuration would be to have an enormous amount of empirical data and measurements for all of your tree species in each cell of your study area. Then, you'd run a calibration spanning every possible variation of climate in your study area, spatially matching each cell's empirical data with the right climate data stream. You'd thus end up with a calibration where you have to tweak parameters in order to satisfy hundreds of thousands (if not millions) of calibration targets, one for each cell of your landscape. While technology has massively increased the amount of data available, we're not there yet; and I'm not sure our human brains can deal with such a complex calibration.

So here, we're trying to work with one or a few calibration targets, which means only a couple of climate data streams at most. That brings us back to the question: which climate data stream should you choose if your empirical data come from many different places?

In the end, the climate data stream you choose will depend on the calibration step you are in. I give some suggestions below. In any case, I recommend that you explicitly acknowledge the assumption that the climate data stream you choose (e.g. from this place in Ontario, or this other place in Quebec, etc.) is suited to the calibration step you are dealing with. It will remain an assumption, because we can hardly prove it. Remember to document this assumption and communicate it properly in your publications (see previous section).

In calibration step 1, you will need an "ideal climate" in order to match the best possible growth peak and initial growth seen in your empirical growth data. The idea is therefore to get a climate data stream that closely matches the conditions that produced this upper-bound growth performance for your tree species. Here are some recommendations for creating your climate data stream:

- [In his calibration tips, Eric Gustafson](./ReferencesAndData/Documentation/Gustafson2024PnETUserGuide.pdf) recommends using a "constant" climate data stream, meaning the same climate every year (although the climate still changes from month to month). This avoids the confounding effect of "extreme" events, and thus allows you to better interpret what you're doing during the calibration phase.
- You could try to identify the precise location associated with your empirical growth data, and in particular where the data showing the growth peak come from. Once you have narrowed down this location, you can gather climate data from it and average them both spatially and temporally to get a constant climate data stream (see previous point), where each value is the average of the different climate grid cells or climate station records you have taken from that location.
    - If the empirical growth data you're using for step 1 are not associated with a particular location but rather with a large spatial extent (e.g. you are using growth curves generated from a model like the Forest Vegetation Simulator; see next pages of this guide), then I would recommend that you get a climate data stream from a place within that extent where the tree species in question thrives the most.
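
To make the averaging idea above more concrete, here is a minimal sketch in plain Python. It assumes your climate records are already loaded as a list of dictionaries; the keys (`cell`, `year`, `month`, `tmax`, `tmin`, `prec`), the function names and the values are all hypothetical and made up for illustration, so adapt them to the columns of your own climate file.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical variable names; adapt them to the columns of your own climate file.
VARIABLES = ("tmax", "tmin", "prec")


def constant_monthly_climate(records, variables=VARIABLES):
    """Average records over all years and locations, month by month."""
    by_month = defaultdict(list)
    for record in records:
        by_month[record["month"]].append(record)
    return {
        month: {var: mean(r[var] for r in by_month[month]) for var in variables}
        for month in sorted(by_month)
    }


def repeat_over_years(monthly, first_year, last_year):
    """Repeat the same monthly values for every simulated year."""
    return [
        {"year": year, "month": month, **values}
        for year in range(first_year, last_year + 1)
        for month, values in monthly.items()
    ]


# Made-up records for two grid cells and two years (January and July only).
records = [
    {"cell": "A", "year": 2000, "month": 1, "tmax": -8.0, "tmin": -18.0, "prec": 60.0},
    {"cell": "B", "year": 2000, "month": 1, "tmax": -10.0, "tmin": -20.0, "prec": 70.0},
    {"cell": "A", "year": 2001, "month": 1, "tmax": -6.0, "tmin": -16.0, "prec": 80.0},
    {"cell": "B", "year": 2001, "month": 1, "tmax": -12.0, "tmin": -22.0, "prec": 90.0},
    {"cell": "A", "year": 2000, "month": 7, "tmax": 24.0, "tmin": 12.0, "prec": 100.0},
    {"cell": "B", "year": 2001, "month": 7, "tmax": 26.0, "tmin": 14.0, "prec": 110.0},
]

monthly = constant_monthly_climate(records)
stream = repeat_over_years(monthly, 2000, 2002)
print(monthly[1])
```

Note that this sketch pools every record of a given month and weights them equally. If some grid cells or years have more records than others, you may prefer to average within each cell first and then across cells.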
 
For calibration step 2, the climate stream is less important, since you'll be looking at how your tree species compete with each other for light. I recommend you use the climate data stream (or one of them, if you have several) that you used for step 1.

For calibration step 3, we're interested in the effect of adding or restricting water in the soil. This can be done without changing the climate data stream from steps 1 and 2, simply by changing the soils (which alters their water retention). Therefore, I again recommend that you use the same climate data stream as for step 1.

For calibration step 4, we're interested in the effect of temperature. In this case, you'll have to develop several climate data streams in which temperature changes. Ideally, precipitation should remain the same, so that you can isolate the effect of temperature from that of drought/water restriction. You can use the same location as the climate data stream from step 1, but take predictions under several RCP or SSP climate change scenarios with different temperatures, all the while keeping precipitation similar to your historical data. You can also manually edit your climate data stream to increase or decrease temperatures while keeping precipitation similar.
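
As an illustration of the manual editing option, here is a minimal sketch in plain Python that derives several temperature variants from one baseline stream while leaving precipitation untouched. The keys (`tmax`, `tmin`, `prec`), the function name, the baseline values and the offsets are hypothetical and chosen only for the example; they are not recommended values.

```python
def shift_temperature(stream, delta_c, temperature_keys=("tmax", "tmin")):
    """Return a copy of the stream with temperatures shifted by delta_c (degrees C)."""
    shifted = []
    for row in stream:
        new_row = dict(row)  # copy, so the baseline stream is left unchanged
        for key in temperature_keys:
            new_row[key] = row[key] + delta_c
        shifted.append(new_row)
    return shifted


# Made-up baseline stream (e.g. the constant climate used for step 1).
baseline = [
    {"year": 2000, "month": 1, "tmax": -9.0, "tmin": -19.0, "prec": 75.0},
    {"year": 2000, "month": 7, "tmax": 25.0, "tmin": 13.0, "prec": 105.0},
]

# One climate data stream per temperature offset; precipitation is identical in all.
scenarios = {
    delta: shift_temperature(baseline, delta) for delta in (-2.0, 0.0, 2.0, 4.0)
}
```

A uniform offset like this is the simplest possible choice. Scenario-based streams (RCP or SSP) will usually change temperature differently from month to month, so the two approaches are not equivalent.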

I expect that by following this advice, you will end up with something that is ultimately imperfect, but still sufficient to obtain climate data that are relevant to the goals of the calibration.