Welcome to the Exploratory Data Analysis (EDA) repository! This project uses the Pandas Profiling library to explore datasets in a Jupyter Notebook, generating summary statistics and visualizations that help you understand the data.
Exploratory Data Analysis (EDA) is a crucial step in the data science process because it helps identify patterns, trends, and outliers in the data. This repository demonstrates how to perform EDA with the Pandas Profiling library in a Jupyter Notebook. The library automates much of the process: from a single DataFrame it generates a report covering each column's type, distribution, and missing values, along with correlations between columns.
To run the EDA Jupyter Notebook, you will need the following software and libraries:
- Python 3.x
- Jupyter Notebook
- Pandas
- Pandas Profiling (newer releases are published on PyPI as ydata-profiling)
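
Before launching the notebook, it can help to confirm that the prerequisites listed above are actually installed. The following is a minimal sketch, not part of the repository: it assumes the PyPI distribution names `notebook`, `pandas`, and either `ydata-profiling` or the legacy `pandas-profiling`. Which of the two profiling packages `requirements.txt` pins is an assumption you should verify.

```python
from importlib import metadata

# Distribution names as published on PyPI (assumed, check requirements.txt).
REQUIRED = ["notebook", "pandas"]
# The profiling library has shipped under two names; either one is accepted.
PROFILING_NAMES = ["ydata-profiling", "pandas-profiling"]


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_prerequisites():
    """Map each prerequisite to its installed version (None if not found)."""
    status = {name: installed_version(name) for name in REQUIRED}
    # Take the first profiling distribution that is installed, if any.
    status["pandas profiling"] = next(
        (v for v in map(installed_version, PROFILING_NAMES) if v is not None),
        None,
    )
    return status


for name, version in check_prerequisites().items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any entry reported as `NOT INSTALLED` should be resolved by the `pip install -r requirements.txt` step described below.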
To get started with the EDA Jupyter Notebook, follow these steps:
- Clone the repository using `git clone https://github.com/Briankim254/Explaratory-data-analysis.git`
- Navigate to the project directory using `cd Explaratory-data-analysis`
- Install the required libraries by running `pip install -r requirements.txt`
- Launch Jupyter Notebook by running `jupyter notebook`
- Open the `EDA_with_Pandas_Profiling.ipynb` notebook in Jupyter Notebook
- Load your dataset by replacing the sample dataset in the `read_csv()` function with your own CSV file
- Run all cells in the notebook to generate the Pandas Profiling report
- Explore the interactive report to better understand your dataset, identifying patterns, trends, and outliers
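
The load-and-profile steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's actual code: the helper names `report_paths` and `build_report` are hypothetical, and the import falls back from `ydata_profiling` to the legacy `pandas_profiling` module because it is not known here which one the repository installs.

```python
from pathlib import Path


def report_paths(csv_path):
    """Derive a report title and an HTML output path from the CSV file name."""
    csv_path = Path(csv_path)
    title = f"Profiling report: {csv_path.stem}"
    return title, csv_path.with_suffix(".html")


def build_report(csv_path):
    """Load a CSV file and write a profiling report next to it as HTML."""
    # Imported lazily so the helper above works without these libraries.
    import pandas as pd
    try:
        from ydata_profiling import ProfileReport  # current package name
    except ImportError:
        from pandas_profiling import ProfileReport  # legacy package name

    title, html_path = report_paths(csv_path)
    df = pd.read_csv(csv_path)  # replace the sample dataset with your own CSV
    profile = ProfileReport(df, title=title)
    profile.to_file(html_path)  # writes a standalone HTML report
    return html_path
```

Calling `build_report("my_data.csv")` (a placeholder file name) would write `my_data.html` alongside the CSV. Inside a notebook, `profile.to_notebook_iframe()` renders the same report inline instead of writing a file.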
We welcome contributions to improve and expand the EDA repository. To contribute, please follow these steps:
- Fork the repository and create a new branch for your changes
- Make your changes or additions to the project
- Create a pull request and wait for a review from a team member
Please ensure that your changes are readable and documented, and that any notebook you modify still runs from top to bottom.
The Exploratory Data Analysis (EDA) with Pandas Profiling repository is licensed under the MIT License. The license permits anyone to use, modify, and distribute the project, provided that the copyright and license notice is retained in copies.