Exploring Stack Overflow Trends with Pandas & Matplotlib
This project analyzes the popularity of programming languages over time using data from Stack Overflow.
Posts on Stack Overflow carry tags, and many of those tags name the programming language being discussed.
By counting how many posts mention each language, we can visualize trends in popularity and observe how interest in different languages changes over the years.
The goal is to practice data manipulation, time-series analysis, and visualization with Pandas and Matplotlib, key tools for any data analyst or ML engineer.
- Which programming languages are the most popular over time?
- How has the popularity of each language evolved?
- Which languages are rising in interest and which are declining?
- Cleaning and exploring Stack Overflow tag data.
- Using groupby() and pivot() to reshape data.
- Converting strings to datetime with pd.to_datetime().
- Plotting time-series trends using Matplotlib.
- Customizing line charts (titles, colors, limits, legends).
- Applying rolling averages with .rolling().mean() to smooth fluctuations.
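As a rough illustration of the smoothing step, the sketch below applies a rolling mean to a small, made-up table of monthly post counts. The numbers, the column names, and the 3-month window are assumptions chosen for readability, not values from the dataset.

```python
import pandas as pd

# Hypothetical monthly post counts for two tags (made-up numbers for illustration).
counts = pd.DataFrame(
    {"python": [10, 20, 30, 40, 50, 60], "java": [60, 50, 40, 30, 20, 10]},
    index=pd.date_range("2020-01-01", periods=6, freq="MS"),
)

# 3-month rolling mean: each value averages the current month and the two before it.
smoothed = counts.rolling(window=3).mean()

# The first two rows are NaN because a full 3-month window is not yet available.
print(smoothed)
```

A larger window gives a smoother line but hides short-lived spikes, so the window size is worth experimenting with.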
The dataset comes from Stack Overflow posts tagged with programming languages.
It includes timestamps for each post and tags identifying the language name.
Expected file structure:
data/
└── QueryResults.csv
Example of loading data:
import pandas as pd
df = pd.read_csv("data/QueryResults.csv")
- Inspect and clean the dataset (head(), info(), isna()).
- Convert timestamps to datetime objects.
- Group and count posts per language per year.
- Pivot the table to create columns for each programming language.
- Plot time-series data to show how popularity evolves.
- Apply smoothing and styling to the charts.
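The steps above can be sketched end to end on a tiny in-memory sample. The column names DATE and TAG, the sample rows, and the one-row-per-post layout are assumptions for illustration; the real QueryResults.csv may use different headers, and if it already stores aggregated counts you would sum that column instead of counting rows.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample: one row per post. DATE and TAG are assumed column names.
df = pd.DataFrame(
    {
        "DATE": ["2019-01-15", "2019-06-02", "2020-03-10",
                 "2019-02-20", "2020-07-01", "2020-11-30"],
        "TAG": ["python", "python", "python", "java", "java", "java"],
    }
)

# Convert the timestamp strings to datetime objects and extract the year.
df["DATE"] = pd.to_datetime(df["DATE"])
df["YEAR"] = df["DATE"].dt.year

# Count posts per language per year.
yearly = df.groupby(["YEAR", "TAG"]).size().reset_index(name="POSTS")

# Pivot so that each language becomes a column and each year a row.
pivoted = yearly.pivot(index="YEAR", columns="TAG", values="POSTS")

# Draw one line per language.
fig, ax = plt.subplots(figsize=(8, 5))
for tag in pivoted.columns:
    ax.plot(pivoted.index, pivoted[tag], label=tag)
ax.set_title("Posts per year by tag")
ax.set_xlabel("Year")
ax.set_ylabel("Number of posts")
ax.legend()
```

In a notebook the figure renders when the cell finishes; in a plain script you would call plt.show() or fig.savefig() afterwards.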
Option A (Google Colab, recommended):
- Click the Colab badge at the top.
- Ensure the CSV path is correct (data/QueryResults.csv).
- Run all cells.
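One quick way to confirm the CSV path before running everything is a small check like the sketch below. It only uses the standard library; the path is the one given in this README, and the printed messages are illustrative.

```python
from pathlib import Path

# Path taken from this README; adjust it if the notebook runs from another directory.
csv_path = Path("data/QueryResults.csv")

if csv_path.is_file():
    print(f"Found {csv_path}")
else:
    # Printing the working directory helps spot a wrong relative path.
    print(f"Missing {csv_path}; current working directory is {Path.cwd()}")
```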
Option B (local):
python -m venv .venv && . .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
# Open the notebook in VS Code or Jupyter
programming_languages/
├── data/
│   └── QueryResults.csv
├── programming_languages.ipynb
├── README.md
└── requirements.txt
Last updated: 2025-10-08