# Exploratory Data Analysis (EDA) for Adobe Customer Segmentation Data

## Introduction
In this notebook, we perform a detailed Exploratory Data Analysis (EDA) on Adobe customer data to uncover insights into user segments and behavior. Our objective is to understand customer usage patterns, data quality, and relationships between features, laying the foundation for data-driven marketing recommendations.

This EDA will proceed through the following steps:

1. **Data Overview and Initial Exploration**  
   We’ll begin by loading and examining the data structure, data types, and missing values to get a high-level overview.

2. **Data Quality Checks and Cleaning**  
   Data consistency is critical for reliable insights, so we'll address any duplicates, missing values, and inconsistencies, especially in categorical and numerical data.

3. **Univariate Analysis**  
   To understand each feature individually, we’ll analyze numerical and categorical feature distributions. This includes checking for outliers and skewness, which may affect downstream analysis.

4. **Bivariate Analysis**  
   This step focuses on pairwise relationships, helping us uncover significant associations between variables. We’ll explore correlations between numerical features, as well as relationships between categorical and numerical features.

5. **Multivariate Analysis**  
   Here, we’ll study relationships among multiple features simultaneously. Clustering techniques may help us identify patterns across Adobe user segments.

6. **Segment Analysis**  
   We’ll explore user segments, analyzing feature distributions within each segment to gain insights into segment-specific characteristics, which will support tailored marketing efforts.

7. **Feature Engineering Insights**  
   Based on data patterns, we will consider feature transformations, interactions, and dimensionality reduction techniques that could improve modeling.

8. **Insights and Summary**  
   Finally, we will summarize key findings, highlighting the most informative features. This overview will support targeted recommendations for marketing strategies.

Each step is explained and then implemented in turn, giving a structured path from raw Adobe customer data to predictive modeling and insights.


### 1. Data Overview and Initial Exploration

In this first step, we aim to gain a high-level understanding of the dataset structure and contents. This initial exploration provides crucial context for the rest of our analysis, setting us up to identify any immediate issues or notable patterns.

Our goals in this section are as follows:

- **Load and Preview the Data**: We will examine the first and last few rows to understand the general structure of the data, including feature names and a sample of the values.
- **Check Data Shape**: By looking at the dimensions of the dataset (number of rows and columns), we get an early indication of its size and complexity.
- **Review Data Types and Non-null Counts**: Knowing the data types for each column allows us to understand which features are numerical, categorical, or potentially time-based. Additionally, the non-null counts per column give us an early sense of missing values.
- **Summarize Numerical and Categorical Columns**: Descriptive statistics for numerical features (e.g., mean, standard deviation, min, and max) help us understand their range and central tendency. For categorical features, we review the frequency of each category to see how observations are distributed across categories.

This overview provides an essential foundation for the EDA by helping us get acquainted with the data’s structure and potential issues early on.
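
As a rough sketch of the checks listed above, the snippet below runs them on a small made-up DataFrame. The column names (`customer_id`, `plan`, `monthly_sessions`, `signup_date`) and the `overview` helper are hypothetical stand-ins for illustration; they are not taken from the actual Adobe dataset, whose columns are not shown at this point.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; the real dataset's columns are not shown in this section.
df = pd.DataFrame(
    {
        "customer_id": [101, 102, 103, 104, 105],
        "plan": ["Individual", "Team", "Individual", None, "Student"],
        "monthly_sessions": [12, 40, np.nan, 7, 22],
        "signup_date": pd.to_datetime(
            ["2023-01-15", "2023-02-01", "2023-02-20", "2023-03-05", "2023-03-18"]
        ),
    }
)


def overview(frame, n=2):
    """Collect the structural facts listed above (illustrative helper)."""
    # Treat every column that is neither numeric nor datetime as categorical.
    categorical = frame.select_dtypes(exclude=["number", "datetime"]).columns
    return {
        "head": frame.head(n),                    # first n rows
        "tail": frame.tail(n),                    # last n rows
        "shape": frame.shape,                     # (rows, columns)
        "dtypes": frame.dtypes,                   # data type per column
        "non_null": frame.count(),                # non-null count per column
        "numeric_summary": frame.describe(include=[np.number]),
        "category_counts": {
            # dropna=False keeps missing values visible as their own bucket
            col: frame[col].value_counts(dropna=False) for col in categorical
        },
    }


summary = overview(df)
print(summary["shape"])
print(summary["non_null"])
print(summary["numeric_summary"])
```

Note that an identifier such as `customer_id` is stored as a number and therefore appears in the numeric summary, even though its mean or standard deviation carries no meaning; columns like this are worth flagging during the overview.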


In [1]:
# Import the libraries used throughout the notebook.
import pandas as pd
import numpy as np

In [None]:
file_path = 