A collection of notebooks showcasing various data cleaning and dataset creation projects using Pandas.

benkaan001/pandas_and_beyond

Data Cleaning Projects with Pandas

This repository contains a collection of Jupyter notebooks showcasing data cleaning and dataset creation projects built with Pandas and Python web-scraping libraries.

Table of Contents

Data Analysis Projects

Dataset Creation Projects

File Organization

pandas_and_beyond/
├── analyze_data/
│   ├── __init__.py
│   ├── 00_project.ipynb
│   ├── 01_project.py
│   ├── ...
├── data/
│   ├── __init__.py
│   ├── external/
│   │   ├── external_data.csv
│   │   └── ...
│   └── generated/
│       ├── raw/
│       │   ├── raw_data.csv
│       │   └── ...
│       ├── cleaned/
│       │   ├── cleaned_data.csv
│       │   └── ...
│       └── ...
├── generate_data/
│   ├── __init__.py
│   ├── 00_create_dataset.ipynb
│   ├── 01_web_scrape.ipynb
│   ├── ...
│   └── using_csv/
│       ├── __init__.py
│       ├── 00_read_csv.ipynb
│       ├── ...
│       └── ...
├── helper/
│   ├── __init__.py
│   ├── helper_function.ipynb
│   ├── helper_module.py
│   └── ...
└── tests/
    ├── __init__.py
    ├── test_.py
    └── ...
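
As a rough illustration of how the data/generated/raw/ and data/generated/cleaned/ folders in the layout above might be used together, the sketch below reads a raw CSV, applies a few generic cleaning steps, and writes the result to the cleaned folder. The function name clean_raw_file and the specific cleaning steps are assumptions made for this example, not code taken from the notebooks; the file names are the placeholders shown in the tree.

```python
from pathlib import Path

import pandas as pd

# Directory names follow the tree above; file names are the tree's placeholders.
RAW_DIR = Path("data") / "generated" / "raw"
CLEANED_DIR = Path("data") / "generated" / "cleaned"


def clean_raw_file(repo_root, raw_name="raw_data.csv", cleaned_name="cleaned_data.csv"):
    """Read a raw CSV, apply generic cleaning steps, and write the cleaned copy.

    Hypothetical helper for illustration; the notebooks may clean data differently.
    """
    repo_root = Path(repo_root)
    df = pd.read_csv(repo_root / RAW_DIR / raw_name)

    # Normalize column names: trim whitespace, lowercase, spaces to underscores.
    df.columns = [str(col).strip().lower().replace(" ", "_") for col in df.columns]

    # Drop exact duplicate rows and rows where every value is missing.
    df = df.drop_duplicates().dropna(how="all").reset_index(drop=True)

    # Make sure the cleaned folder exists before writing.
    out_dir = repo_root / CLEANED_DIR
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / cleaned_name
    df.to_csv(out_path, index=False)
    return out_path
```

Called from the repository root as clean_raw_file("."), this would leave the raw file untouched and write the cleaned copy alongside it under data/generated/cleaned/, which keeps each cleaning step reproducible from the original data.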

Requirements

  • Python 3.6 or higher
  • Pandas 1.0 or higher
  • Jupyter Notebook
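
One possible way to confirm that an environment meets the Python and Pandas minimums listed above is a small check like the following. This is a sketch only: the helper names version_tuple and check_requirements are made up for this example and are not part of the repository, and the check does not cover Jupyter Notebook.

```python
import sys
from importlib import import_module

# Minimum versions taken from the Requirements list.
MIN_PYTHON = (3, 6)
MIN_PANDAS = (1, 0)


def version_tuple(version, parts=2):
    """Turn a version string such as '1.5.3' into a tuple of ints like (1, 5)."""
    numbers = []
    for piece in version.split(".")[:parts]:
        # Keep only the leading digits so suffixes such as 'rc1' are ignored.
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        numbers.append(int(digits) if digits else 0)
    return tuple(numbers)


def check_requirements():
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.6 or higher is required")
    try:
        pandas = import_module("pandas")
    except ImportError:
        problems.append("pandas is not installed")
    else:
        if version_tuple(pandas.__version__) < MIN_PANDAS:
            problems.append("pandas 1.0 or higher is required")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```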

Getting Started

To get started with this repository, clone or download it to your local machine. Once you have done so, navigate to the analyze_data/ directory and open the Jupyter notebook for the project you want to explore.

Contributing

This repository is a part of my continuous learning journey, which has been inspired by the valuable contributions made by various members of the Kaggle community. If you have a data cleaning project that you have implemented using Pandas and would like to contribute to this repository, please create a new branch and submit a pull request. Your contributions are highly appreciated and will help other learners who are looking to enhance their data cleaning skills using Pandas.
