🚀 LeetCode Pandas Solutions 🐼 Master Pandas via LeetCode problems. Solutions with concept breakdowns (DataFrames, grouping, vectorization). Covers data manipulation, aggregation, interview strategies. Jupyter notebooks & exercises. Ideal for data science prep. #Pandas #LeetCode #DataScience

iamAntimPal/LeetCode-Introduction-to-Pandas


LeetCode - Introduction to Pandas

📌 Problem Statement

This repository is an introduction to Pandas, a data manipulation and analysis library for Python. Pandas provides data structures such as DataFrame and Series for working with large, structured datasets efficiently, and it is widely used in data science and machine learning for data cleaning, exploration, and visualization.

The repository contains Pandas solutions to various LeetCode problems, especially those that deal with large datasets and structured data.


📊 Table of Contents

  1. Introduction
  2. Installation
  3. Pandas Basics
  4. Pandas Cheat Sheet
  5. LeetCode Problems
  6. Useful Links

📚 Introduction

What is Pandas?

Pandas is an open-source data manipulation and analysis library for Python. It offers data structures like Series (for one-dimensional data) and DataFrame (for two-dimensional data) that make it easy to manipulate, analyze, and visualize data in various formats (CSV, Excel, SQL databases, JSON, etc.).

Key Features:

  • Data Structures: Pandas primarily offers two data structures:

    • Series: A one-dimensional labeled array, similar to a list whose elements each carry an index label.
    • DataFrame: A two-dimensional labeled data structure, like a table or spreadsheet, where each column can be a different type.
  • Data Alignment: Automatically aligns data based on labels or indexes during operations.

  • Handling Missing Data: Pandas provides methods to handle missing data in datasets.

  • Data Aggregation: Allows you to group data and perform operations like sum, mean, count, etc.

  • Merging and Joining: Easily combine different datasets using joins or merges.

  • Filtering and Sorting: Provides rich functionality for filtering and sorting data based on multiple conditions.
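
A minimal sketch of the first two features, using made-up data (the names `prices`, `stock`, `value`, and `inventory` are illustrative and do not come from this repository). It shows that arithmetic between two Series matches values by index label rather than by position, and that a DataFrame can hold columns of different types:

```python
import pandas as pd

# Hypothetical data for illustration only.
prices = pd.Series([1.5, 2.0, 0.5], index=["apple", "banana", "cherry"])
stock = pd.Series([10, 20], index=["banana", "apple"])

# Arithmetic aligns on index labels, not on position.
# "cherry" has no entry in `stock`, so its result is NaN (missing).
value = prices * stock

# A DataFrame can hold columns of different types side by side.
inventory = pd.DataFrame({
    "fruit": ["apple", "banana", "cherry"],  # text column
    "price": [1.5, 2.0, 0.5],                # float column
    "in_stock": [True, True, False],         # boolean column
})

print(value)
print(inventory.dtypes)
```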

Why Use Pandas?

  • Efficiency: Pandas operations are vectorized and built on NumPy arrays, so they typically process large datasets much faster than equivalent Python loops over lists or dictionaries.
  • Easy to Use: With just a few lines of code, you can manipulate and analyze data easily.
  • Integration: It integrates well with other libraries such as NumPy, Matplotlib, and Scikit-learn, making it ideal for data science and machine learning workflows.
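
To illustrate what "vectorized" means here, the sketch below doubles every value in a column twice: once with an explicit Python loop and once with a single Pandas expression. The variable names are hypothetical, and no timings are claimed. Any speed-up depends on the data and the operation.

```python
import pandas as pd

# Hypothetical column of 100,000 integers.
numbers = pd.Series(range(100_000))

# Pure-Python approach: visit every element in an explicit loop.
loop_result = [n * 2 for n in numbers]

# Vectorized approach: one expression applied to the whole column at once.
vectorized_result = numbers * 2

# Both approaches produce the same values.
print(vectorized_result.tolist() == loop_result)
```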

🛠 Installation

To get started with Pandas, you need to install the library using the following command:

pip install pandas

NumPy, which Pandas relies on for numerical computations, is installed automatically as a dependency of Pandas. If you want to install or upgrade it explicitly, run:

pip install numpy

🧑‍💻 Pandas Basics

Here are some fundamental Pandas operations you'll use to solve problems in this repository.

1. Importing Pandas

import pandas as pd

2. Creating a DataFrame

A DataFrame is the core data structure in Pandas, and it can be created from various data sources like lists, dictionaries, or external files.

# Creating a DataFrame from a dictionary
data = {'Name': ['John', 'Anna', 'Peter', 'Linda'],
        'Age': [28, 24, 35, 32]}
df = pd.DataFrame(data)

3. Data Inspection

  • View the first few rows:

    print(df.head())
  • Get DataFrame info:

    df.info()  # info() prints its summary directly and returns None
  • Descriptive statistics:

    print(df.describe())

4. Selecting Data

  • Select a single column:

    df['Name']
  • Select multiple columns:

    df[['Name', 'Age']]
  • Row selection by integer position:

    df.iloc[0]  # Selects the first row
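
Selection is often combined with filtering and sorting. The following sketch rebuilds the example DataFrame from above and shows a boolean mask, label-based selection with `.loc`, and `sort_values`. The variable names `over_30`, `names_over_30`, and `by_age` are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John', 'Anna', 'Peter', 'Linda'],
                   'Age': [28, 24, 35, 32]})

# Boolean mask: keep only the rows where Age is greater than 30.
over_30 = df[df['Age'] > 30]

# .loc selects by label: rows matching the mask, and only the 'Name' column.
names_over_30 = df.loc[df['Age'] > 30, 'Name']

# Sort by Age, oldest first.
by_age = df.sort_values('Age', ascending=False)

print(over_30)
print(names_over_30)
print(by_age)
```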

5. Data Cleaning

  • Handling missing values:
    df.isna().sum()  # Count missing values in each column
    df.dropna()  # Return a new DataFrame without rows that have missing values
    df.fillna(value=0)  # Return a new DataFrame with NaN replaced by 0
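
A small worked example, using hypothetical data with one missing name and one missing age, shows what each call returns. Note that none of these calls modifies `df` itself, so the result has to be assigned to be kept.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing age and one missing name.
df = pd.DataFrame({'Name': ['John', 'Anna', None, 'Linda'],
                   'Age': [28, np.nan, 35, 32]})

# Number of missing values per column.
missing_per_column = df.isna().sum()

# Only rows with no missing values remain (John and Linda).
complete_rows = df.dropna()

# Passing a dict fills only the listed columns; 'Name' keeps its missing value.
filled = df.fillna({'Age': 0})

print(missing_per_column)
print(complete_rows)
print(filled)
```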

6. Grouping and Aggregation

  • Group by a column:
    df.groupby('Name')['Age'].mean()  # Group by 'Name' and calculate the mean age
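
Grouping is easier to see when a column has repeated values. The sketch below adds a hypothetical `Department` column (not part of the example above) and computes one aggregate per group, then several at once with named aggregation:

```python
import pandas as pd

# Hypothetical data: two departments with two employees each.
df = pd.DataFrame({'Department': ['Sales', 'Sales', 'IT', 'IT'],
                   'Name': ['John', 'Anna', 'Peter', 'Linda'],
                   'Age': [28, 24, 35, 32]})

# Mean of one numeric column per group.
mean_age = df.groupby('Department')['Age'].mean()

# Several aggregations at once, with named output columns.
summary = df.groupby('Department').agg(
    headcount=('Name', 'count'),
    oldest=('Age', 'max'),
)

print(mean_age)
print(summary)
```

Selecting the numeric column before calling `mean()` avoids trying to average text columns such as `Name`.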

7. Merging DataFrames

  • Merge two DataFrames:
    df1.merge(df2, on='column_name')  # df1 and df2 must both contain 'column_name'
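
As a concrete sketch, assume two small hypothetical tables that share an `id` column. By default `merge` performs an inner join. The `how` parameter selects other join types.

```python
import pandas as pd

# Hypothetical tables sharing an 'id' column.
df1 = pd.DataFrame({'id': [1, 2, 3], 'Name': ['John', 'Anna', 'Peter']})
df2 = pd.DataFrame({'id': [2, 3, 4], 'City': ['Oslo', 'Lima', 'Cairo']})

# Default is an inner join: only ids present in both tables (2 and 3).
inner = df1.merge(df2, on='id')

# A left join keeps every row of df1; rows without a match get NaN in 'City'.
left = df1.merge(df2, on='id', how='left')

print(inner)
print(left)
```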

🧩 Pandas Cheat Sheet

[Image: Pandas cheat sheet]

---

🧩 LeetCode Problems

This repository includes solutions to various LeetCode problems using Pandas. Below are some examples:

Problem 1: Problem Title

  • Description: [Brief description of the problem]
  • Solution:
    # Pandas solution code

Problem 2: Problem Title

  • Description: [Brief description of the problem]
  • Solution:
    # Pandas solution code

Feel free to explore the other problems and their solutions in the solutions/ folder.


🔗 Useful Links
