<a href="https://www.kaggle.com/code/tanyaleena/why-your-vote-matters-politics-global-economics?scriptVersionId=255329700" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

**1. Introduction**

This notebook was developed to show why YOUR vote, and EVERY vote, matters. Using data from the World Bank Group DataBank, we begin an exploratory analysis that feeds an interactive, time-lapse map served through a Shiny web application.

The core question is: how do political events and leadership correlate with economic indicators like Gross Domestic Product (GDP) and poverty levels? To answer this, we will:
* Assemble a unique dataset by cleaning and merging multiple sources on GDP, poverty rates, and US political history.
* Perform an exploratory data analysis (EDA) to uncover key trends and relationships within the data.
* Recreate the core visualization: a data-rich world map for a specific year, which is the centerpiece of the final interactive app.

This notebook showcases the complete data science workflow—from raw data to a polished, insightful visualization. Let's dive in! 🌎

![Why Your Vote Matters](https://github.com/user-attachments/assets/b60fd193-56e2-45e2-96bd-c04c332506ca)

**2. Setup: Loading Libraries and Data**

First, we need to load the R packages and raw data files required for this analysis. For this notebook, you will need to upload povclean.csv, gdpclean.csv, partiesclean.csv, and custom.geo.json to your Kaggle environment.

In [None]:
# --- Install All Relevant Packages ---

install.packages(c("tidyverse", "sf", "RColorBrewer", "scales", "ggplot2", "dplyr", "tidyr", "stringr",
                  "leaflet","htmltools", "glue", "lubridate", "jsonlite", "shiny", "shinydashboard", "readr", "shinythemes"))

# --- Data Manipulation and Plotting ---
library(tidyverse) # Includes dplyr for data wrangling and ggplot2 for plotting
library(sf)        # The standard for working with spatial vector data
library(RColorBrewer)# For creating nice color palettes
library(scales)    # For formatting numbers in plots
library(ggplot2)
library(dplyr)
library(tidyr)
library(stringr)

# --- Interactive Maps (for the final visualization) ---
library(leaflet)
library(htmltools)
library(glue)
library(lubridate)
library(jsonlite)

# --- Shiny App ---
library(shiny)
library(shinydashboard)
library(readr)
library(shinythemes)

# Set a consistent theme for our static plots
theme_set(theme_minimal(base_size = 14))
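
Before going further, it can help to confirm that the four input files are actually visible to the R session. The following is a minimal sketch that assumes the files sit in the current working directory; on Kaggle, uploaded datasets are typically mounted under `/kaggle/input/`, so the paths may need a prefix.

```r
# File names are taken from the setup text above. Looking them up in the
# working directory is an assumption about where the uploads end up.
required_files <- c("povclean.csv", "gdpclean.csv", "partiesclean.csv", "custom.geo.json")
missing_files <- required_files[!file.exists(required_files)]

if (length(missing_files) > 0) {
  message("Missing input files: ", paste(missing_files, collapse = ", "))
} else {
  message("All input files found.")
}
```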

**3. Data Cleaning and Merging: Building the Master Dataset**

The foundation of any great analysis is clean, well-structured data. Our final dataset, df.rds, was created by cleaning and merging three separate files. Here is the exact process.

**3.1. Processing GDP Data**

The raw GDP data is in a "wide" format, with years as columns. We need to pivot it to a "long" format where each row represents a single country in a single year.

In [None]:
# Read the raw GDP data

gdp <- read_csv("gdpclean.csv")

# Pivot from wide to long format

gdp_long <- pivot_longer(
  gdp,
  cols = `1961`:`2024`,
  names_to = "Year",
  values_to = "gross_domestic_product"
)

# Clean the dataset: remove rows with no GDP value and convert Year to numeric

gdp_clean <- gdp_long %>%
  filter(!is.na(gross_domestic_product)) %>%
  mutate(Year = as.numeric(Year))

glimpse(gdp_clean)
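
To see what the pivot and cleanup do on a small scale, here is a minimal sketch on a made-up two-country table. The country names, codes, and values are hypothetical and only mirror the assumed layout of `gdpclean.csv` (one column per year).

```r
library(dplyr)
library(tidyr)

# Hypothetical miniature of the wide GDP table (values are made up)
gdp_toy <- tibble::tibble(
  Country = c("Country A", "Country B"),
  Code    = c("AAA", "BBB"),
  `1961`  = c(10, NA),
  `1962`  = c(12, 7)
)

gdp_toy_long <- gdp_toy %>%
  pivot_longer(
    cols = `1961`:`1962`,
    names_to = "Year",
    values_to = "gross_domestic_product"
  ) %>%
  filter(!is.na(gross_domestic_product)) %>%  # drops Country B / 1961
  mutate(Year = as.numeric(Year))             # column names arrive as text

print(gdp_toy_long)
```

The two wide rows become three long rows, one per country-year with a value. Passing `values_drop_na = TRUE` to `pivot_longer()` would drop the missing values in the same step.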

**3.2. Merging GDP with Poverty Data**

Next, we'll join our clean GDP data with the poverty dataset.

In [None]:
# Read the poverty data

pov <- read_csv("povclean.csv")

# Join the two datasets by Country and Year

gdp_pov <- full_join(
  gdp_clean,
  pov,
  by = c("Country", "Year")
)

glimpse(gdp_pov)
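
A full join keeps every country-year that appears in either table and fills the gaps with `NA`. The sketch below uses made-up toy tables (hypothetical countries and values) to show that behaviour, and uses `anti_join()` to list the keys that exist on only one side, which is a quick way to spot country-name mismatches between sources.

```r
library(dplyr)

# Toy stand-ins for gdp_clean and pov (all values are made up)
gdp_toy <- tibble::tibble(
  Country = c("Country A", "Country A", "Country B"),
  Year    = c(2000, 2001, 2000),
  gross_domestic_product = c(10, 12, 7)
)
pov_toy <- tibble::tibble(
  Country = c("Country A", "Country C"),
  Year    = c(2000, 2000),
  percent_on_4.20 = c("12%", "40%")
)

joined_toy <- full_join(gdp_toy, pov_toy, by = c("Country", "Year"))
print(joined_toy)

# Keys present in only one of the two tables
only_in_gdp <- anti_join(gdp_toy, pov_toy, by = c("Country", "Year"))
only_in_pov <- anti_join(pov_toy, gdp_toy, by = c("Country", "Year"))
print(only_in_gdp)
print(only_in_pov)
```

Here the joined table has four rows: "Country C" has poverty data but no GDP, and two GDP rows have no poverty value.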

**3.3. Integrating Political Data and Finalizing**

Finally, we'll join the economic data with the US political parties data and perform the last cleaning steps to create our analysis-ready data frame.

In [None]:
# Read the political parties data

parties <- read_csv("partiesclean.csv")

# Join with the combined economic data
# Note: This join is by "Year" only, which assumes the political data applies globally
# for the purpose of this specific analysis.

final_join <- inner_join(
  gdp_pov,
  parties,
  by = c("Year")
)

# Perform final cleaning steps

df <- final_join %>%
  drop_na(Code) %>% # Remove rows where country code is missing
  mutate(
    Year = as.integer(Year),
      
    # Clean the poverty percentage column

    percent_on_4.20 = str_remove(as.character(percent_on_4.20), "%"),
    percent_on_4.20 = as.numeric(percent_on_4.20)
  )

# Glimpse the final, analysis-ready data

glimpse(df)
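
Two effects of this step are easy to miss: joining on `Year` alone copies the same political record onto every country for that year, and the inner join drops any year that is absent from the political table. The sketch below illustrates both on toy data; the `party` column name and all values are hypothetical, since the real columns of `partiesclean.csv` are not shown here.

```r
library(dplyr)
library(stringr)

# Toy economic table (values are made up)
econ_toy <- tibble::tibble(
  Country = c("Country A", "Country B", "Country A"),
  Code    = c("AAA", "BBB", "AAA"),
  Year    = c(2000, 2000, 1900),
  percent_on_4.20 = c("12.5%", NA, "30%")
)

# Toy political table; `party` is an assumed column name
parties_toy <- tibble::tibble(
  Year  = c(2000, 2001),
  party = c("Party X", "Party Y")
)

toy <- inner_join(econ_toy, parties_toy, by = "Year") %>%
  mutate(
    Year = as.integer(Year),
    # Strip the percent sign, then convert the text to a number
    percent_on_4.20 = as.numeric(str_remove(percent_on_4.20, "%"))
  )

print(toy)
```

Both countries receive "Party X" for 2000, and the 1900 row disappears because the political table has no entry for that year.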

**4. Exploratory Data Analysis (EDA)**

With our clean dataset, we can now explore some trends.

**Global GDP Over Time**

Let's see how the world's economy has grown by plotting the total global GDP for each year.

In [None]:
# Calculate total GDP per year

total_gdp_by_year <- df %>%
  group_by(Year) %>%
  summarise(total_gdp = sum(gross_domestic_product, na.rm = TRUE))

# Plot the trend

ggplot(total_gdp_by_year, aes(x = Year, y = total_gdp)) +
  geom_area(fill = "#2ECC71", alpha = 0.5) +
  geom_line(color = "#2ECC71", linewidth = 1.2) + # `linewidth` replaces the deprecated `size` for lines (ggplot2 >= 3.4.0)
  labs(
    title = "Total Global GDP has Skyrocketed Since the 1960s",
    subtitle = "Sum of GDP across all countries in the dataset",
    x = "Year",
    y = "Total Global GDP (in trillions USD)"
  ) +
  # scale = 1e-6 yields trillions only if GDP is stored in millions of USD; use 1e-12 for raw USD
  scale_y_continuous(labels = scales::label_dollar(scale = 1e-6, suffix = "T"))

**Finding:** The plot shows a steep, sustained rise in total GDP across the countries in the dataset, with particularly rapid growth in the 21st century.
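
Whether a rise like this is roughly exponential can be checked by fitting a straight line to the logarithm of the yearly totals: an exponential trend is linear on a log scale, and the slope gives the average annual growth rate. The sketch below runs on a synthetic series that grows at exactly 5% per year, not on the notebook's data; swapping in `total_gdp_by_year` would be the assumed next step.

```r
# Synthetic series growing at exactly 5% per year (not real data)
toy_gdp <- data.frame(Year = 2000:2009)
toy_gdp$total_gdp <- 100 * 1.05^(toy_gdp$Year - 2000)

# Linear fit on the log scale; the slope is the log of the yearly growth factor
fit <- lm(log(total_gdp) ~ Year, data = toy_gdp)
annual_growth <- exp(coef(fit)[["Year"]]) - 1

print(round(annual_growth, 4))  # 0.05
```

On real data, the yearly sum also depends on which countries report in each year, so a changing country count can bend the curve independently of actual growth.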

**5. Geospatial Visualization: The Map**
   
Now for the main event! We will use the leaflet package to recreate the core visualization from the Shiny app. We'll filter the data to a single year (2018 here) to build a one-year snapshot of the time-lapse map.

In [None]:
# --- 1. Prepare the Data for a Specific Year ---

year_to_map <- 2018
map_data_for_year <- df %>% filter(Year == year_to_map)

# --- 2. Load and Merge with Geospatial Data ---

world_map_geo <- st_read("custom.geo.json")
final_map_data <- merge(world_map_geo, map_data_for_year, by.x = "adm0_a3", by.y = "Code", all.x = TRUE)

# --- 3. Create the Color Palette and Labels ---

pal <- colorNumeric(
  palette = brewer.pal(11, "RdYlGn"),
  domain = final_map_data$gross_domestic_product
)
final_map_data$poverty_opacity <- ifelse(is.na(final_map_data$percent_on_4.20), 0.5,
                                       0.3 + (final_map_data$percent_on_4.20 / 100) * 0.7)

labels <- lapply(seq_len(nrow(final_map_data)), function(i) {
  label_parts <- c(paste0("<strong>", final_map_data$name[i], "</strong>"))
  if (!is.na(final_map_data$gross_domestic_product[i])) {
    label_parts <- c(label_parts, paste0("GDP: ", format(final_map_data$gross_domestic_product[i], big.mark = ",")))
  }
  if (!is.na(final_map_data$percent_on_4.20[i])) {
    label_parts <- c(label_parts, paste0("% of population living on under $4.20: ", final_map_data$percent_on_4.20[i], "%"))
  }
  HTML(paste(label_parts, collapse = "<br>"))
})

# --- 4. Render the Leaflet Map ---

leaflet(final_map_data) %>%
  addProviderTiles("CartoDB.DarkMatter", options = providerTileOptions(noWrap = TRUE)) %>%
  setView(lng = 0, lat = 30, zoom = 2) %>%
  addPolygons(
    fillColor = ~pal(gross_domestic_product),
    weight = 1, opacity = 1, color = "white", dashArray = "3",
    fillOpacity = ~poverty_opacity,
    highlightOptions = highlightOptions(weight = 3, color = "#F39C12", fillOpacity = 0.9, bringToFront = TRUE),
    label = labels,
    labelOptions = labelOptions(style = list("font-weight" = "normal", padding = "3px 8px"), textsize = "15px", direction = "auto")
  ) %>%
  addLegend(pal = pal, values = ~gross_domestic_product, opacity = 0.8, title = paste0("GDP (", year_to_map, ")"), position = "bottomright")
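
The fill opacity in the map encodes poverty: the formula maps 0% to an opacity of 0.3 and 100% to 1.0, and countries with no poverty figure get a fixed 0.5. The sketch below wraps that arithmetic in a small helper so the mapping can be inspected on its own; the function name `poverty_to_opacity` and its parameters are hypothetical and not part of the original app.

```r
# Hypothetical helper reproducing the opacity formula used for the map
poverty_to_opacity <- function(pct, min_opacity = 0.3, na_opacity = 0.5) {
  ifelse(
    is.na(pct),
    na_opacity,                                   # no data: neutral opacity
    min_opacity + (pct / 100) * (1 - min_opacity) # linear map onto [min_opacity, 1]
  )
}

opacity_check <- poverty_to_opacity(c(0, 50, 100, NA))
print(opacity_check)  # 0.30 0.65 1.00 0.50
```

Under this mapping, a country with missing poverty data is drawn with the same opacity as one where roughly 29% of people live under the threshold, so the two cannot be told apart by opacity alone.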

**6. Conclusion**

This notebook has detailed the journey from raw, disparate data files to a clean, merged dataset ready for complex visualization. Through EDA and geospatial mapping, we've created a powerful snapshot that connects economic prosperity, poverty, and political context on a global scale.

This analysis forms the backbone of the fully interactive Shiny web application. To explore the data across all years and see the events for specific countries, please visit the links below.

View the live Shiny App: [http://0ll75a-tanya-d0costa.shinyapps.io/project]

See the full project code on GitHub: [https://github.com/tanyadcosta-alt/worldpulse.git]

Thank you for following along!