# Inequality and Polarization: An Agent-Based Modeling Approach Using Eurostat Data

by:
- Kevin Heinrich, 11902941
- Matthias Hemmer, 11804194
- Marina Hofer, 12337819
- Christina Sophie Knes, 11902902

## Project Description

This project explores the relationship between regional income inequality and social polarization in Europe using agent-based modeling. Drawing on real-world data from Eurostat, including the Gini coefficient, at-risk-of-poverty rates and disposable income across NUTS-2 regions, we simulate how economic disparities might influence public opinion dynamics and social fragmentation.

We implement a modified Deffuant model, where agents interact and adjust their opinions based on income differences and local economic context. The model integrates heterogeneity in income, frustration levels, and opinion tolerance, creating a dynamic system that allows us to observe under which conditions polarization emerges.
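
As a rough illustration of the mechanism described above, the following is a minimal sketch of one bounded-confidence (Deffuant-style) interaction step. It is not the project's implementation: the names `deffuant_step`, `mu`, and `base_tolerance`, and the rule that tolerance shrinks with the relative income gap between two agents, are assumptions made for this example only.

```python
import numpy as np

def deffuant_step(opinions, incomes, rng, mu=0.3, base_tolerance=0.4):
    """Run one pairwise interaction in place and return the chosen pair.

    Assumption (hypothetical rule): the tolerance threshold shrinks as the
    relative income gap between the two agents grows.
    """
    # Pick two distinct agents at random.
    i, j = rng.choice(len(opinions), size=2, replace=False)
    # Relative income gap in [0, 1); 0 means equal incomes.
    gap = abs(incomes[i] - incomes[j]) / max(incomes[i], incomes[j])
    tolerance = base_tolerance * (1.0 - gap)
    diff = opinions[j] - opinions[i]
    if abs(diff) < tolerance:
        # Classic Deffuant update: both agents move towards each other.
        opinions[i] += mu * diff
        opinions[j] -= mu * diff
    return i, j

rng = np.random.default_rng(seed=42)
opinions = rng.uniform(0.0, 1.0, size=200)               # toy opinions in [0, 1)
incomes = rng.lognormal(mean=10.0, sigma=0.5, size=200)  # toy incomes, not Eurostat data
initial_mean = opinions.mean()

for _ in range(20_000):
    deffuant_step(opinions, incomes, rng)
```

Because the update is symmetric, the mean opinion is conserved. Any polarization therefore shows up as separate opinion clusters, not as a shift in the average.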

Our main objectives are:
- To simulate opinion formation in a socioeconomically stratified population.
- To explore how varying levels of inequality and redistribution affect polarization.
- To identify regional vulnerability to polarization based on Eurostat indicators.
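
To make the notions of inequality and redistribution in these objectives concrete, here is a small sketch that computes a Gini coefficient from a list of incomes and applies a simple redistribution scheme. The helper names `gini` and `redistribute`, and the flat tax with an equal lump-sum transfer, are illustrative assumptions and not necessarily the scheme used in the model.

```python
import numpy as np

def gini(incomes):
    """Gini coefficient of non-negative incomes (0 = perfect equality)."""
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    # Rank-based formula, equivalent to the mean absolute difference form.
    return 2.0 * np.sum(ranks * x) / (n * x.sum()) - (n + 1) / n

def redistribute(incomes, tax_rate):
    """Hypothetical scheme: flat tax, revenue paid back in equal shares."""
    x = np.asarray(incomes, dtype=float)
    return (1.0 - tax_rate) * x + tax_rate * x.mean()

incomes = np.array([10_000.0, 20_000.0, 30_000.0, 40_000.0])  # toy values
g_before = gini(incomes)                                 # 0.25
g_after = gini(redistribute(incomes, tax_rate=0.5))      # 0.125
```

Under this particular scheme, total income is unchanged and the Gini coefficient scales by `1 - tax_rate`, which makes the tax rate a convenient single knob for inequality experiments.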

Through these simulations, we aim to provide insights into how economic structures may contribute to societal tensions, and how policy interventions might influence collective outcomes.

## Environment Setup

This project was developed with **Python 3.13.5**.


**`TODO: Rewrite import description when everything is ready for submission`**

The following libraries are required for this project:

### Data Access
- **`eurostat`** – Python package for downloading Eurostat datasets as pandas DataFrames.

### Data Processing
- **`pandas`** – Data structures and tools for tabular data analysis.
- **`numpy`** – Array computing and numerical routines.


## PIP installations

To install the required libraries, run the following:

In [1]:
! pip install eurostat
! pip install pandas
! pip install numpy

Collecting eurostat
  Downloading eurostat-1.1.1-py3-none-any.whl.metadata (26 kB)
Collecting pandas (from eurostat)
  Using cached pandas-2.3.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (91 kB)
Collecting requests (from eurostat)
  Using cached requests-2.32.4-py3-none-any.whl.metadata (4.9 kB)
Collecting numpy>=1.26.0 (from pandas->eurostat)
  Using cached numpy-2.3.1-cp313-cp313-manylinux_2_28_x86_64.whl.metadata (62 kB)
Collecting pytz>=2020.1 (from pandas->eurostat)
  Using cached pytz-2025.2-py2.py3-none-any.whl.metadata (22 kB)
Collecting tzdata>=2022.7 (from pandas->eurostat)
  Using cached tzdata-2025.2-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting charset_normalizer<4,>=2 (from requests->eurostat)
  Using cached charset_normalizer-3.4.2-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (35 kB)
Collecting idna<4,>=2.5 (from requests->eurostat)
  Using cached idna-3.10-py3-none-any.whl.metadata (10 kB)
Collecting urllib3<3,>=1.21.

## Include modules

In [None]:
import eurostat

-----------------------------------------------

## 1. Read Eurostat data

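First, we download the required datasets through the `eurostat` package. The dataset codes `ilc_di12`, `ilc_li02`, and `ilc_mded01` supply the Gini coefficient, risk-of-poverty, and regional-differences tables, respectively.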

In [None]:
df_orig_gini_coefficient = eurostat.get_data_df("ilc_di12")
df_orig_risk_of_poverty = eurostat.get_data_df("ilc_li02")
df_orig_regional_differences = eurostat.get_data_df("ilc_mded01")

## 2. Preprocess data

### 2.1 Preprocess "Gini Coefficient" Dataset

In [4]:
df_processed_gini_coefficient = df_orig_gini_coefficient.copy(deep=True)
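
A typical next preprocessing step is to reshape the table from one column per year into one row per region and year. The sketch below uses a synthetic stand-in and does not touch the downloaded data. It assumes that the table has a `geo\TIME_PERIOD` column followed by one column per year, which should be checked against the actual DataFrame. The helper `to_long` and all values are hypothetical.

```python
import pandas as pd

# Synthetic stand-in for a downloaded table; the values are made-up toy numbers.
df_wide = pd.DataFrame({
    "freq": ["A", "A"],
    "geo\\TIME_PERIOD": ["AT", "DE"],
    "2021": [30.0, None],
    "2022": [31.0, 29.0],
})

def to_long(df, value_name):
    """Reshape a wide Eurostat-style table to one row per (geo, year)."""
    df = df.rename(columns={"geo\\TIME_PERIOD": "geo"})
    # Treat purely numeric column names as years (assumption about the layout).
    year_cols = [c for c in df.columns if str(c).isdigit()]
    id_cols = [c for c in df.columns if c not in year_cols]
    long_df = df.melt(id_vars=id_cols, value_vars=year_cols,
                      var_name="year", value_name=value_name)
    long_df["year"] = long_df["year"].astype(int)
    # Drop region/year combinations without an observation.
    return long_df.dropna(subset=[value_name]).reset_index(drop=True)

df_long = to_long(df_wide, value_name="gini")
```

The long layout makes it easier to filter by year and to join several indicators on the `geo` and `year` columns later on.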

### 2.2 Preprocess "Risk of Poverty" Dataset

In [7]:
df_processed_risk_of_poverty = df_orig_risk_of_poverty.copy(deep=True)

### 2.3 Preprocess "Regional Differences" Dataset

In [8]:
df_processed_regional_differences = df_orig_regional_differences.copy(deep=True)
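
Since the analysis targets NUTS-2 regions, the regional table will probably need to be restricted to NUTS-2 codes at some point. A NUTS-2 code is a two-letter country code followed by two further characters (for example `AT13`). The sketch below is only an assumption about how such a filter could look: the helper `is_nuts2` is hypothetical, and whether the downloaded table contains such codes in its geo column must be verified first.

```python
import re
import pandas as pd

# Two-letter country code plus two alphanumeric characters.
NUTS2_PATTERN = re.compile(r"[A-Z]{2}[A-Z0-9]{2}")
# Aggregates such as EU28 or EA19 have the same shape but are not regions.
AGGREGATE_PREFIXES = ("EU", "EA")

def is_nuts2(code):
    """Return True if `code` looks like a NUTS-2 region code."""
    return (NUTS2_PATTERN.fullmatch(code) is not None
            and not code.startswith(AGGREGATE_PREFIXES))

# Toy geo codes covering countries, regions, and aggregates.
geo = pd.Series(["AT", "AT13", "AT130", "DE21", "EU28", "EA19", "EU27_2020"])
nuts2_only = geo[geo.map(is_nuts2)].tolist()
```

The same predicate can be used as a row mask on a DataFrame, for example `df[df["geo"].map(is_nuts2)]`, once the geo column has been renamed to `geo`.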