# Introduction

---Integrated Analysis: Lost Wallets, PISA Scores, and Social Capital Measures---

In the paper [What Do Cross-country Surveys Tell Us about Social Capital?](https://davetannenbaum.github.io/documents/TannenbaumCohnZundMarechal2025.pdf), Tannenbaum et al. use the [Wallet Return Dataset](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/YKBODN) as a direct measure of civic honesty to investigate two types of indirect social capital measures. First, they provide an analysis of lost wallet reporting rates and their correlation to survey measures of social capital, showing the quantitative extent to which survey measures contain legitimate information about social capital. Second, they show that lost wallet reporting rates may be used as effective predictors of "Economic and Institutional Performance", confirming social capital's economic explanatory value.

I became curious about how educational assessment data would relate to these findings. The [Programme for International Student Assessment (PISA)](https://www.oecd.org/en/about/programmes/pisa.html) measures the effectiveness of national education systems by testing 15-year-olds, and it is a standard dataset for comparing education outcomes between countries. Surprisingly, PISA scores were strongly correlated with lost wallet reporting rates (r ≈ 0.80), which leads to the following results for the two aims of the Tannenbaum paper:
1) PISA scores correlated with survey measures of social capital in largely the same manner as lost wallet reporting rates.
2) Lost wallet reporting rate proved to be arguably a better predictor of PISA scores than of any of the other measures of "Economic and Institutional Performance".

Throwing caution to the wind and jumping to conclusions in haste: this shows that increasing the performance of education systems necessarily requires improving civic honesty. Since it is unlikely that either civic honesty or educational success directly causes the other, it stands to reason that educational success is ultimately a function of social capital, which is easy to suspect as the causative factor.

### Merging PISA Data with Wallet Data
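We merge the PISA data into the wallet data by country, then compute the mean wallet reporting rate (the mean of `response`, where '100' marks a reported wallet) and the mean PISA score for each country and subject.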

In [5]:
import pandas as pd
import plotly.express as px
import statsmodels.api as sm  # statsmodels must be installed for trendline="ols" in the scatter plot

cat_cols = [
    "country",
    "response",
    "male",
    "above40",
    "computer",
    "coworkers",
    "other_bystanders",
    "institution",
    "cond",
    "security_cam",
    "security_guard",
    "local_recipient",
    "no_english",
    "understood_situation",
]

# Import Tannenbaum data
df = pd.read_csv(
    "../data/tannenbaum_data.csv",
    dtype={col: "category" for col in cat_cols},
)

# Import PISA data
pisa = pd.read_csv("../data/pisa_data.csv").rename(columns={"mean_score": "pisa_score"})
df = df.merge(pisa, how="left", on="country")

wallet_pisa = (
    df.astype({"response": int})
    .groupby(["country", "subject"])[["response", "pisa_score"]]
    .mean()
    # .reset_index()
)

wallet_pisa.corr()

| | response | pisa_score |
|---|---|---|
| response | 1.0 | 0.796786 |
| pisa_score | 0.796786 | 1.0 |
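
Because the grouping above keeps one row per country and subject, the single correlation coefficient pools all subjects together. The following is a minimal sketch of how the correlation could be checked within each subject separately. It runs on a made-up toy frame, not on the real data: the country names, subject labels (`"reading"`, `"math"`), and all numbers are assumptions for illustration only.

```python
import pandas as pd

# Toy stand-in for the grouped frame: one row per (country, subject).
# Every value here is hypothetical; none of it is PISA or wallet data.
toy = pd.DataFrame(
    {
        "country": ["A", "A", "B", "B", "C", "C"],
        "subject": ["reading", "math"] * 3,
        "response": [20.0, 20.0, 50.0, 50.0, 80.0, 80.0],
        "pisa_score": [400.0, 390.0, 450.0, 470.0, 500.0, 510.0],
    }
)

# Pearson correlation between return rate and score, computed within each subject.
per_subject_corr = {
    subject: group["response"].corr(group["pisa_score"])
    for subject, group in toy.groupby("subject")
}
print(per_subject_corr)
```

If the per-subject coefficients on the real data are close to the pooled value, pooling subjects is probably harmless; if they differ noticeably, it may be worth reporting them separately.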


In [6]:
# Scatter plot
fig_scatter_pisa_wallet = px.scatter(
    wallet_pisa.reset_index(),
    x="response",
    y="pisa_score",
    # The frame has one row per (country, subject), so show both on hover
    hover_data=["country", "subject"],
    title="PISA 2022 Reading Score vs. Wallet Return Rate by Country",
    trendline="ols",
)
fig_scatter_pisa_wallet.update_xaxes(showline=True, linecolor="black")
fig_scatter_pisa_wallet.update_yaxes(showline=True, linecolor="black")
fig_scatter_pisa_wallet.show()

Since PISA scores and lost wallet return rates are so closely correlated, it is unsurprising that they relate to other variables in similar ways.

## PISA Missing Countries

The PISA data used in this report is from 2022 and was copied directly from the PDF found here:
[OECD PISA 2022 Results Vol I](https://www.oecd.org/en/publications/pisa-2022-results-volume-i_53f23881-en.html), pp. 52-57. Note that the PISA 2022 results lack data for several important countries that are included in the Wallet Return Dataset (China, Russia, India, Ghana, Kenya, South Africa). The 2018 PISA results do include China and Russia; however, China proves to be an extreme outlier, with very high education scores and a very low lost wallet reporting rate. Tannenbaum's paper also noted China as a special case. East and Southeast Asian countries are generally underrepresented in the wallet dataset, and the three other countries from the region that are included (Malaysia, Thailand, Indonesia) differ greatly from China culturally, economically, and governmentally.
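
One way to find which wallet countries have no PISA match is a left merge with pandas' `indicator=True` option, which adds a `_merge` column marking rows found only in the left table. The sketch below uses two tiny hypothetical tables with placeholder country names and scores; it is not the notebook's actual data or the method used to produce the list above.

```python
import pandas as pd

# Hypothetical miniature tables; names and scores are placeholders.
wallet_countries = pd.DataFrame(
    {"country": ["Country A", "Country B", "Country C", "Country D"]}
)
pisa_toy = pd.DataFrame(
    {"country": ["Country C", "Country D"], "pisa_score": [1.0, 2.0]}
)

# A left merge keeps every wallet country; `_merge` records where each row matched.
merged = wallet_countries.merge(pisa_toy, how="left", on="country", indicator=True)

# Rows flagged "left_only" are wallet countries with no PISA entry.
missing = sorted(merged.loc[merged["_merge"] == "left_only", "country"])
print(missing)
```

Running the same check against the real tables would also catch mismatches caused by differing country spellings, which a left merge otherwise turns into silent missing values.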