SBosquezEspinoza/Programming-Languages-Popularity-Analysis


Programming Languages Popularity Analysis

Exploring Stack Overflow Trends with Pandas & Matplotlib

Open In Colab


📘 Overview

This project analyzes how the popularity of programming languages has changed over time, using data from Stack Overflow.
Posts on Stack Overflow carry tags, and many of those tags name the programming language being discussed.
By counting how many posts carry each language tag in each period, we can visualize popularity trends and see how interest in different languages shifts over the years.

The goal is to practice data manipulation, time-series analysis, and visualization with Pandas and Matplotlib, which are key tools for any data analyst or ML engineer.


🧠 Guiding Questions

  • Which programming languages are the most popular over time?
  • How has the popularity of each language evolved?
  • Which languages are rising in interest and which are declining?

🧩 What You'll Learn

  • Cleaning and exploring Stack Overflow tag data.
  • Using groupby() and pivot() to reshape data.
  • Converting strings to datetime with pd.to_datetime().
  • Plotting time-series trends using Matplotlib.
  • Customizing line charts (titles, colors, limits, legends).
  • Applying rolling averages with .rolling().mean() to smooth fluctuations.
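
As a rough illustration of the last three items, the sketch below smooths a small table of counts with a rolling mean and draws a customized line chart. The numbers, the `year` index, the 3-year window, and the output filename are all made up for this example and are not taken from the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical yearly post counts (illustrative values, not real Stack Overflow data).
counts = pd.DataFrame(
    {"python": [10, 20, 30, 40, 50], "java": [50, 40, 40, 30, 20]},
    index=pd.Index([2016, 2017, 2018, 2019, 2020], name="year"),
)

# Rolling mean over 3 rows; the first two rows are NaN because the window is incomplete.
smoothed = counts.rolling(window=3).mean()

fig, ax = plt.subplots(figsize=(8, 5))
for language in smoothed.columns:
    # One line per language; NaN points are simply not drawn.
    ax.plot(smoothed.index, smoothed[language], linewidth=2, label=language)

# Chart customization: title, axis labels, y-axis limits, legend.
ax.set_title("Posts per language (3-year rolling mean)")
ax.set_xlabel("Year")
ax.set_ylabel("Number of posts")
ax.set_ylim(0, 60)
ax.legend()

fig.savefig("language_trends.png")  # hypothetical output filename
```

A larger window gives a smoother line but hides short-lived changes, so the window size is worth experimenting with.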

📂 Dataset

The dataset comes from Stack Overflow posts tagged with programming languages.
It includes timestamps for each post and tags identifying the language name.

Expected file structure:

data/
└── QueryResults.csv

Example of loading data:

import pandas as pd
df = pd.read_csv("data/QueryResults.csv")
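
Once the file is loaded, a first inspection might look like the sketch below. It reads a tiny in-memory CSV instead of data/QueryResults.csv so that it runs on its own. The column names `timestamp` and `tag` are assumptions, and the real file may use different ones.

```python
import io

import pandas as pd

# Hypothetical stand-in for data/QueryResults.csv; real column names may differ.
csv_text = """timestamp,tag
2008-08-01 00:00:00,python
2008-08-01 00:00:00,java
2009-03-15 00:00:00,python
2009-07-02 00:00:00,
"""
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())           # first rows of the table
df.info()                  # column dtypes and non-null counts
missing = df.isna().sum()  # number of missing values per column
print(missing)

# Drop rows without a tag, since they cannot be assigned to a language.
clean = df.dropna(subset=["tag"])
```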

⚙️ Methods

  1. Inspect and clean the dataset (head(), info(), isna()).
  2. Convert timestamps to datetime objects.
  3. Group and count posts per language per year.
  4. Pivot the table to create columns for each programming language.
  5. Plot time-series data to show how popularity evolves.
  6. Apply smoothing and styling to the charts.

📊 Results (see programming_languages.ipynb)


🧾 How to Run

Option A (Google Colab, recommended):

  1. Click the Colab badge at the top.
  2. Ensure the CSV path is correct (data/QueryResults.csv).
  3. Run all cells.

Option B (local):

python -m venv .venv && . .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt
# Open the notebook in VS Code or Jupyter

📁 Repository Structure

programming_languages/
├── data/
│   └── QueryResults.csv
├── programming_languages.ipynb
├── README.md
└── requirements.txt

Last updated: 2025-10-08
