bluebread/leetcode-company-wise-rating

LeetCode Company-wise Rating Dataset

This dataset combines LeetCode problem ratings with company-specific interview question lists to show the difficulty and frequency of the problems each company asks.

Data Sources

This dataset is created by merging two primary data sources:

1. LeetCode Problem Rating

  • Source: zerotrac/leetcode_problem_rating
  • Location: leetcode_problem_rating/ratings.txt
  • Description: Contains difficulty ratings for 2,293 LeetCode problems
  • Key Columns:
    • ID: Problem ID
    • Title: Problem title
    • Rating: Numerical difficulty rating (1084-3774)
    • Title Slug: URL-friendly problem identifier
    • Contest Slug: Associated contest information
    • Problem Index: Contest problem index
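As a rough sketch of how the ratings file might be read, the snippet below parses a small in-memory sample with pandas. It assumes the file is tab-separated with a header row naming the columns listed above; the sample rows and the `load_ratings` helper are hypothetical and not taken from the repository.

```python
import io

import pandas as pd

# Hypothetical two-row sample in the ASSUMED tab-separated layout.
# The rows are made up for illustration, not copied from ratings.txt.
SAMPLE = (
    "Rating\tID\tTitle\tTitle Slug\tContest Slug\tProblem Index\n"
    "1500.5\t101\tExample Problem A\texample-problem-a\tweekly-contest-1\tQ2\n"
    "2100.0\t102\tExample Problem B\texample-problem-b\tweekly-contest-1\tQ4\n"
)


def load_ratings(source) -> pd.DataFrame:
    """Read a ratings table and index it by problem ID (hypothetical helper)."""
    df = pd.read_csv(source, sep="\t")  # assumption: tab-separated with a header
    return df.set_index("ID")


ratings = load_ratings(io.StringIO(SAMPLE))
print(ratings.loc[101, "Rating"])
```

To try it on the real file, pass the path `leetcode_problem_rating/ratings.txt` instead of the `StringIO` object, and adjust `sep` if the layout differs.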

2. LeetCode Company-wise Interview Questions

  • Source: snehasishroy/leetcode-companywise-interview-questions
  • Location: leetcode-companywise-interview-questions/
  • Description: Collection of LeetCode problems organized by 470 companies
  • Organization: Each company has 5 CSV files based on time periods:
    • 1. Thirty Days.csv - Problems from the last 30 days
    • 2. Three Months.csv - Problems from the last 3 months
    • 3. Six Months.csv - Problems from the last 6 months
    • 4. More Than Six Months.csv - Older problems
    • 5. All.csv - All problems for that company
  • Key Columns:
    • ID: Problem ID
    • Difficulty: Easy/Medium/Hard classification
    • Title: Problem title
    • Frequency: How often the problem appears
    • Acceptance Rate: Problem acceptance percentage
    • Link: LeetCode problem URL
    • Topics: Problem topic tags

Dataset Creation Process

The merged dataset is created using the main.py script, which:

  1. Loads the problem ratings from leetcode_problem_rating/ratings.txt
  2. Iterates through all 2,350 company-specific CSV files (470 companies × 5 files each)
  3. Merges rating data with company problem data based on problem ID
  4. Outputs simplified CSV files containing only:
    • ID: Problem identifier
    • Title: Problem name
    • Rating: Difficulty rating
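The merge step above can be sketched roughly as follows. This is not the code from main.py: the `merge_with_ratings` helper and the sample rows are hypothetical, and the choice of an inner join (dropping problems that have no rating) is an assumption.

```python
import pandas as pd

# Hypothetical sample data; values are made up for illustration.
ratings = pd.DataFrame(
    {"ID": [1, 2, 3], "Rating": [1200.0, 1650.5, 2400.0]}
)
company = pd.DataFrame(
    {
        "ID": [2, 3, 4],
        "Difficulty": ["Medium", "Hard", "Easy"],
        "Title": ["Problem Two", "Problem Three", "Problem Four"],
        "Frequency": [80.0, 55.0, 10.0],
    }
)


def merge_with_ratings(company: pd.DataFrame, ratings: pd.DataFrame) -> pd.DataFrame:
    """Join company problems to ratings on ID and keep ID, Title, Rating."""
    merged = company[["ID", "Title"]].merge(
        ratings[["ID", "Rating"]],
        on="ID",
        how="inner",  # assumption: problems without a rating are dropped
    )
    return merged[["ID", "Title", "Rating"]]


merged = merge_with_ratings(company, ratings)
print(merged)
```

With `how="left"` instead, unrated problems would be kept with an empty `Rating`; which behavior main.py uses is not stated here.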

Output Structure

output/
  {Company Name}/
    README.md
    all.csv
    more-than-six-months.csv
    six-months.csv
    three-months.csv
    thirty-days.csv
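The output file names look like lowercased, hyphenated versions of the source file names with the numeric prefix removed. A minimal sketch of one such mapping is shown below; the `output_name` helper is hypothetical and may not match how main.py derives the names.

```python
import re


def output_name(source_name: str) -> str:
    """Map e.g. '1. Thirty Days.csv' to 'thirty-days.csv' (hypothetical helper)."""
    # Drop a leading numeric prefix such as "1. ".
    stem = re.sub(r"^\d+\.\s*", "", source_name)
    # Lowercase and replace spaces with hyphens.
    return stem.lower().replace(" ", "-")


for name in [
    "1. Thirty Days.csv",
    "2. Three Months.csv",
    "3. Six Months.csv",
    "4. More Than Six Months.csv",
    "5. All.csv",
]:
    print(name, "->", output_name(name))
```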

Each company's README.md contains:

  • Company name as title
  • Statistical summary (mean, median, min, max, etc.) for each time period
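A summary of this kind could be computed along the following lines. This is only a sketch: the `summarize` helper and the sample ratings are hypothetical, and the exact set of statistics written to each README.md may differ.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Return basic statistics of the Rating column (hypothetical helper)."""
    rating = df["Rating"]
    return {
        "count": int(rating.count()),
        "mean": float(rating.mean()),
        "median": float(rating.median()),
        "min": float(rating.min()),
        "max": float(rating.max()),
    }


# Made-up ratings for illustration only.
sample = pd.DataFrame({"Rating": [1200.0, 1500.0, 2100.0]})
stats = summarize(sample)
print(stats)
```

Running the same function on each of a company's five CSV files would give one summary per time period.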

Usage

To regenerate the dataset:

python main.py

To load individual datasets:

import pandas as pd

# Load a specific company's data
df = pd.read_csv('output/Google/all.csv')

# View rating statistics
print(df['Rating'].describe())

Credits

  • zerotrac/leetcode_problem_rating: source of the problem ratings
  • snehasishroy/leetcode-companywise-interview-questions: source of the company-wise question lists
