In [1]:
import json
import pandas as pd
from IPython.display import display, Markdown

# Load model results
with open('../model_what_if/xgb_scenarios.json') as f:
    xgb = json.load(f)

with open('../model_what_if/lstm_scenarios.json') as f:
    lstm = json.load(f)

# Construct scenario comparison
comparison_scenarios = pd.DataFrame({
    "Indicator Change": [
        "GDP ↑ by 10%",
        "Social Support ↑ by +0.1",
        "Freedom ↑ by +0.1"
    ],
    "Predicted Happiness Score (XGBoost)": [
        xgb["GDP+10%"],
        xgb["SocialSupport+0.1"],
        xgb["Freedom+0.1"]
    ],
    "Predicted Happiness Score (LSTM)": [
        lstm["GDP+10%"],
        lstm["SocialSupport+0.1"],
        lstm["Freedom+0.1"]
    ]
})

# Markdown Display
display(Markdown("## Scenario-Based Comparison between XGBoost and LSTM"))
display(comparison_scenarios)



## Scenario-Based Comparison between XGBoost and LSTM

|   | Indicator Change | Predicted Happiness Score (XGBoost) | Predicted Happiness Score (LSTM) |
|---|--------------------------|----------|----------|
| 0 | GDP ↑ by 10% | 5.351331 | 5.706116 |
| 1 | Social Support ↑ by +0.1 | 5.461033 | 5.754849 |
| 2 | Freedom ↑ by +0.1 | 5.612052 | 5.775245 |
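
The gap between the two models in each scenario can be read off this table directly. A minimal sketch, with the values copied from the output above rather than reloaded from the JSON files; the short column names and the `LSTM - XGBoost` column are illustrative choices, not part of the notebook:

```python
import pandas as pd

# Scenario predictions copied from the comparison table above
comparison = pd.DataFrame({
    "Indicator Change": ["GDP ↑ by 10%", "Social Support ↑ by +0.1", "Freedom ↑ by +0.1"],
    "XGBoost": [5.351331, 5.461033, 5.612052],
    "LSTM": [5.706116, 5.754849, 5.775245],
})

# Per-scenario gap between the two models' predictions
comparison["LSTM - XGBoost"] = comparison["LSTM"] - comparison["XGBoost"]
print(comparison.round(3))
```

In this table the LSTM prediction is higher in every scenario, and the gap narrows from about 0.35 for the GDP scenario to about 0.16 for the Freedom scenario.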


In [None]:
import os
from openai import OpenAI

# Read the API key from the environment instead of hard-coding it in the notebook
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

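def get_genai_opinion(xgb, lstm):
    """Ask the chat model to compare the XGBoost and LSTM scenario predictions.

    Both arguments are dicts keyed by scenario name, as loaded from the JSON files.
    The prompt contains only post-intervention scores, not baseline predictions.
    """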
    prompt = f"""
You are an expert in socio-economic modeling.

We have the following happiness predictions for Ukraine under three separate interventions:

| Indicator Change          | XGBoost | LSTM |
|--------------------------|---------|------|
| GDP ↑ by 10%             | {xgb['GDP+10%']:.2f}     | {lstm['GDP+10%']:.2f}  |
| Social Support ↑ by +0.1 | {xgb['SocialSupport+0.1']:.2f}     | {lstm['SocialSupport+0.1']:.2f}  |
| Freedom ↑ by +0.1        | {xgb['Freedom+0.1']:.2f}     | {lstm['Freedom+0.1']:.2f}  |

Compare the sensitivity of the models. Which factor has the most impact? Which model behaves more realistically and would be more reliable for policy planning?
"""

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a senior social data scientist."},
            {"role": "user", "content": prompt}
        ]
    )

    return response.choices[0].message.content


genai_comment = get_genai_opinion(xgb, lstm)

display(Markdown("## GenAI’s Analysis"))
display(Markdown(genai_comment))


## GenAI’s Analysis

The models' sensitivities can be compared through the change in predicted happiness under each intervention, for both XGBoost and LSTM.

From the given data, all three interventions improve the happiness prediction in both models.

## Sensitivity comparison:

For the XGBoost model, the increases are +0.15 for the GDP increase, +0.11 for the social support increase, and +0.15 for the freedom increase.

For the LSTM model, the increases are +0.36 for the GDP increase, +0.04 for the social support increase, and +0.03 for the freedom increase.

## Most impactful factor:

For the XGBoost model, the GDP increase and the freedom increase are tied for the largest impact.

For the LSTM model, the GDP increase has the most impact.
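
The prompt supplies only post-intervention scores, so the increases quoted above cannot be recomputed from the table without a baseline prediction for each model. If all three scenarios start from the same baseline within a model (an assumption; the notebook does not state it), subtracting that baseline shifts every score by the same constant, so the ordering of the increases matches the ordering of the scores. A minimal sketch of that check follows; `rank_scenarios` and `deltas` are hypothetical helpers, and the values are copied from the comparison table:

```python
# Scenario predictions copied from the comparison table
xgb_scores = {"GDP+10%": 5.351331, "SocialSupport+0.1": 5.461033, "Freedom+0.1": 5.612052}
lstm_scores = {"GDP+10%": 5.706116, "SocialSupport+0.1": 5.754849, "Freedom+0.1": 5.775245}


def rank_scenarios(scores):
    """Return scenario names ordered from highest to lowest value."""
    return sorted(scores, key=scores.get, reverse=True)


def deltas(scores, baseline):
    """Increase of each scenario over a baseline prediction supplied by the caller."""
    return {name: value - baseline for name, value in scores.items()}


print("XGBoost:", rank_scenarios(xgb_scores))
print("LSTM:   ", rank_scenarios(lstm_scores))
```

Given a real baseline, `deltas(xgb_scores, baseline)` would give the magnitudes. Under the shared-baseline assumption, both models rank the Freedom scenario first and the GDP scenario last. The interventions are on different scales (10% versus +0.1), so this ordering is not a per-unit sensitivity.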

## Realism and Reliability:

Which model behaves more realistically depends on the specific context and on the assumptions built into each model. In general, increases in GDP, social support, and freedom would all be expected to contribute positively to happiness, to varying degrees.

The LSTM model shows a stronger dependency on the GDP increase, which might be too optimistic an outlook, although that depends on the actual socio-economic context. If other factors such as social support and freedom are equally important in practice, the XGBoost model, which attributes similar importance to all three factors, might be the better option.

Policy planning often has to balance different kinds of interventions, so a model that does not overemphasize one factor over the others would generally be more useful. Based on the available information, the XGBoost model might therefore be more reliable for policy planning. It would also be very useful to have confidence intervals or some other measure of uncertainty for these predictions, so that policy decisions based on machine learning models rest on more than point estimates.
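
One simple way to attach an uncertainty measure to a scenario prediction is a percentile interval over repeated predictions, for example from models retrained on bootstrap resamples or with different random seeds. The sketch below assumes such repeated predictions are available. The array used here is synthetic placeholder data, not output from the notebook's models, and `percentile_interval` is a hypothetical helper:

```python
import numpy as np


def percentile_interval(predictions, level=0.95):
    """Return (lower, upper) percentile bounds covering `level` of the predictions."""
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(predictions, [alpha, 1.0 - alpha])
    return float(lower), float(upper)


# Synthetic placeholder: stands in for one scenario's predictions from 200 refitted models
rng = np.random.default_rng(0)
repeated_predictions = rng.normal(loc=5.4, scale=0.1, size=200)

low, high = percentile_interval(repeated_predictions)
print(f"95% interval: [{low:.2f}, {high:.2f}]")
```

If the intervals for two scenarios overlap heavily, the difference between their point predictions is weak evidence for preferring one intervention over the other.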