# WM 12 Month Evaluation Notebook

_Adarsh Pyarelal_

As always, we begin with the imports and print the commit hash, so that a
rendered version of the notebook records the commit it was generated from.

In [None]:
%load_ext autoreload
%autoreload 2
%matplotlib inline
from IPython.display import set_matplotlib_formats
set_matplotlib_formats('retina')
from delphi.paths import data_dir
from delphi.AnalysisGraph import AnalysisGraph
from delphi.visualization import visualize
import delphi.jupyter_tools as jt
jt.print_commit_hash_message()

# Forecasting

Q1: How much rainfall is expected in Northern Bahr el Ghazal and Unity in the lean season?

A: (_From Cheryl's note_) The lean season represents the time before harvest, when food from the
previous harvest is scarce. There may be crops in the field and ample
rainfall, but food is scarce. The lean season may vary from year to year based 
on how much food was harvested the previous year, the timing of the planting
season, the growth season length, and other factors. 

We can approximate the lean season rainfall as the rainfall that occurs between planting and harvest. In general, maize and sorghum are planted around the same time, but maize is harvested earlier. We can therefore use the maize growing season rainfall as the approximation of lean season rainfall.
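To make the approximation concrete, here is a minimal sketch of summing rainfall over a planting-to-harvest window. The monthly totals, the May to August window, and the helper `growing_season_rainfall` are all hypothetical and only illustrate the idea; they are not taken from the data behind `jt.get_expected_value`.

```python
# Hypothetical monthly rainfall totals in mm (illustrative values, not real data).
monthly_rainfall_mm = {
    "Apr": 40.0, "May": 90.0, "Jun": 130.0, "Jul": 170.0,
    "Aug": 190.0, "Sep": 120.0, "Oct": 50.0,
}


def growing_season_rainfall(rainfall_by_month, planting_month, harvest_month):
    """Sum rainfall from planting_month through harvest_month, inclusive."""
    months = list(rainfall_by_month)  # dicts preserve insertion order
    start = months.index(planting_month)
    end = months.index(harvest_month)
    return sum(rainfall_by_month[m] for m in months[start:end + 1])


# Assumed maize window (May through August); the real window depends on
# the crop calendar for the region and year.
lean_season_rainfall = growing_season_rainfall(monthly_rainfall_mm, "May", "Aug")
print(lean_season_rainfall)
```

Because maize is harvested earlier than sorghum, the maize window is the shorter of the two and ends closer to the first harvest.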

In [None]:
jt.get_expected_value("rainfall", "Northern Bahr el Ghazal")

In [None]:
jt.get_expected_value("rainfall", "Unity")

Q2: What are the expected crop yields for maize and sorghum during the summer of 2017 in Northern Bahr el Ghazal and Unity States?

In [None]:
jt.get_expected_value("production", "Northern Bahr el Ghazal", crop="maize")

In [None]:
jt.get_expected_value("production", "Unity", crop="maize")

In [None]:
jt.get_expected_value("production", "Northern Bahr el Ghazal", crop="sorghum")

In [None]:
jt.get_expected_value("production", "Unity", crop="sorghum")

# Conditional Forecasting

Q: What would be the effect on crop yields for maize and sorghum in Northern Bahr el Ghazal State and Unity State if rainfall is 60% less than the mean rainfall for the lean season in previous years (alarm-level drought scenario)?

To perform conditional forecasting, we need a causal analysis graph (CAG). In the cell below, we construct a small one directly from text to highlight the salient features of our model-building process.

In [None]:
G = AnalysisGraph.from_text("""
A small increase in precipitation causes a large increase in crop yield. An increase in crop yield causes an increase in food availability.
""")
visualize(G)

Next, we assemble a stochastic transition model for the dynamic Bayes net constructed from this CAG, using only gradable adjectives.

In [None]:
G.res = 1000
G.assemble_transition_model_from_gradable_adjectives()
G.sample_from_prior()

We then label some nodes for convenience:

In [None]:
n0 = "UN/events/weather/precipitation"
n1 = "UN/events/human/agriculture/food_production"

Then we specify the relative amount by which precipitation changes (-0.6 corresponds to a 60% decrease):

In [None]:
delta = -0.6
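As a quick check on the sign convention, the sketch below applies the relative change to a quantity that starts at 1.0. It assumes the change is applied multiplicatively to the starting value; how `jt.run_experiment` applies `delta` internally is not shown in this notebook, and the names `initial_value` and `perturbed_value` are illustrative.

```python
initial_value = 1.0  # all quantities start at 1.0 in the experiment
delta = -0.6         # relative change: -0.6 means a 60% decrease

# Assumption: the relative change is applied to the starting value.
perturbed_value = initial_value * (1 + delta)
print(f"{perturbed_value:.2f}")  # 0.40
```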

We then run an experiment to see how a 60% decrease in precipitation will affect food production (all quantities start at 1.0).

In [None]:
jt.run_experiment(G, n0, delta, n1)

We can also examine the downstream effect on food availability:

In [None]:
jt.run_experiment(G, n0, delta, "UN/entities/food_availability")

Now, we can sharpen our predictions by learning from data provided by DSSAT. We perform a simple linear fit to get a sharper distribution for $\beta_{precipitation,food\_production}$ in the transition matrix.

In [None]:
G.infer_transition_matrix_coefficient_from_data(
    n0, n1, state="Northern Bahr el Ghazal", crop="maize"
)
jt.run_experiment(G, n0, delta, n1)
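For intuition about the simple linear fit mentioned above, here is a minimal ordinary least squares sketch in plain Python. It is not the implementation behind `infer_transition_matrix_coefficient_from_data`, and the normalized (precipitation, production) pairs are hypothetical values chosen for illustration. The fitted slope plays the role of $\beta_{precipitation,food\_production}$, and its standard error indicates how sharp the resulting distribution could be.

```python
import math

# Hypothetical normalized (precipitation, production) pairs, not DSSAT output.
precipitation = [0.6, 0.8, 1.0, 1.2, 1.4]
production = [0.50, 0.62, 0.80, 0.94, 1.14]

n = len(precipitation)
mean_x = sum(precipitation) / n
mean_y = sum(production) / n

# Ordinary least squares for y = intercept + slope * x.
s_xx = sum((x - mean_x) ** 2 for x in precipitation)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(precipitation, production))
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Standard error of the slope from the residual variance (n - 2 degrees of freedom).
residuals = [y - (intercept + slope * x) for x, y in zip(precipitation, production)]
residual_variance = sum(r ** 2 for r in residuals) / (n - 2)
slope_std_error = math.sqrt(residual_variance / s_xx)

print(f"slope = {slope:.3f} +/- {slope_std_error:.3f}")
```

A small standard error relative to the prior spread is what makes the fitted coefficient "sharper" than one assembled from gradable adjectives alone.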

In [None]:
jt.run_experiment(G, n0, delta, "UN/entities/food_availability")

We can see that the distributions have become sharper (the standard deviation has decreased), both for `food_production` and for the downstream quantity `food_availability`.
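The propagation of this sharpening to downstream nodes can be illustrated with a small Monte Carlo sketch. It uses a generic linear chain (precipitation to production to availability) with normally distributed coefficients; this is a simplification for illustration, not Delphi's transition model, and every mean and standard deviation below is a hypothetical value.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible
delta = -0.6            # 60% decrease in precipitation
n_samples = 5000


def simulate(beta_production_std):
    """Return the sample standard deviations of production and availability."""
    production, availability = [], []
    for _ in range(n_samples):
        # Hypothetical coefficient distributions (assumed normal).
        beta_production = rng.gauss(0.8, beta_production_std)
        beta_availability = rng.gauss(1.0, 0.3)
        d_production = beta_production * delta
        production.append(1.0 + d_production)
        availability.append(1.0 + beta_availability * d_production)
    return statistics.stdev(production), statistics.stdev(availability)


prior_std = simulate(beta_production_std=0.5)    # broad coefficient
fitted_std = simulate(beta_production_std=0.04)  # coefficient sharpened by data

print("production std:  ", prior_std[0], "->", fitted_std[0])
print("availability std:", prior_std[1], "->", fitted_std[1])
```

Only the first coefficient is narrowed here, yet the spread of the downstream quantity shrinks as well, because its uncertainty includes the uncertainty inherited from the upstream edge.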

The next step is to connect these distributions of abstract quantities to indicators. For `food_production`, specifying a custom indicator with a mean derived from the DSSAT data table should let us roughly follow DSSAT trends when we want a quick estimate of how changing precipitation affects crop yield and the nodes downstream of it.

If we want more precision, we can use the exact distribution provided by a DSSAT run to more accurately infer the distribution of $\beta_{precipitation,food\_production}$ (as opposed to the simple linear fit to historical data).