# Adoption Data Exploration (Learner Notebook)

Welcome!  
This notebook is for you to **practice and demonstrate** your skills in **data cleaning and exploratory data analysis (EDA)**.

You will:  
1. Load the data  
2. Clean and prepare the data  
3. Perform descriptive statistics  
4. Explore adoption patterns  
5. Summarize insights  

**Note:**  
- You will be guided step by step.  
- Write your own code; the notebook will not do the work for you.  


## 1. Setup
Import the libraries you’ll need.

Tasks:
- Import `pandas`, `numpy`, `matplotlib.pyplot`, `seaborn`
- Configure plots (e.g., `sns.set_style("whitegrid")`)
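
If you are unsure where to start, a setup cell could look something like the sketch below. The figure size and the display option are illustrative choices, not requirements of this notebook.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Draw a light grid behind every plot.
sns.set_style("whitegrid")

# Assumed default figure size; adjust to taste.
plt.rcParams["figure.figsize"] = (8, 5)

# Optional: show more columns when printing wide DataFrames.
pd.set_option("display.max_columns", 50)
```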

In [None]:
# Your code here


## 2. Load the Data
Load the dataset(s) you’ve been provided.

Tasks:
- Load the data into one or more pandas DataFrames  
- Display the first 5 rows to inspect the data
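
As a rough illustration, the sketch below reads a small in-memory CSV so that it runs without any file. The column names (`customer_id`, `industry`, `balance`, `adopted`) are hypothetical and will differ from your dataset; with a real file you would pass its path to `pd.read_csv` instead of the `StringIO` object.

```python
import io

import pandas as pd

# Stand-in for the provided file: a tiny CSV held in memory.
# All column names and values here are made up for illustration.
csv_text = """customer_id,industry,balance,adopted
1,Retail,1200.5,1
2,Tech,,0
3,Retail,830.0,1
"""

df = pd.read_csv(io.StringIO(csv_text))

# head() returns the first 5 rows by default (only 3 exist here).
print(df.head())
```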


In [None]:
# Your code here


## 3. Inspect the Data
Before analysis, you need to understand the structure.

Tasks:
- Check `.shape` and `.info()`  
- List the column names  
- Check missing values (`.isna().sum()`)  
- Identify duplicate rows (`.duplicated().sum()`)
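
The checks above can be tried out on a toy DataFrame first. This is only a sketch; the columns are hypothetical, and the toy data deliberately contains one duplicated row and two missing values.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical columns): one repeated row, two missing values.
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "industry": ["Retail", "Tech", "Tech", None],
    "balance": [1200.5, 500.0, 500.0, np.nan],
})

print(df.shape)          # (number of rows, number of columns)
df.info()                # dtypes and non-null counts per column
print(list(df.columns))  # column names

# Missing values per column and number of fully duplicated rows.
missing_per_column = df.isna().sum()
n_duplicate_rows = df.duplicated().sum()
print(missing_per_column)
print(n_duplicate_rows)
```

Note that `.duplicated()` marks only the second and later copies of a row, so the sum counts the extra rows rather than every row involved in a duplicate.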


In [None]:
# Your code here


## 4. Data Cleaning
Now it’s time to clean your data.

Tasks:
- Handle missing values (decide: drop, fill, or leave them)  
- Remove duplicates if necessary  
- Convert columns to correct data types (numeric, categorical, dates, etc.)  
- If there are categorical columns (like bands or industries), check that the labels are consistent (no typos, stray whitespace, or unexpected values)  
- (Optional) Detect outliers in numeric columns
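
One possible cleaning pass is sketched below on toy data. Everything specific in it is an assumption made for illustration: the column names, the set of valid industries, the choice to fill missing balances with the median, and the 1.5 × IQR rule for outliers. Your own data may call for different decisions.

```python
import pandas as pd

# Toy raw data (hypothetical columns) with typical problems:
# a duplicated row, numbers stored as text, a bad date, messy labels.
raw = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "industry": ["Retail", "tech ", "tech ", "Retail", "Unknwn"],
    "balance": ["1200.5", "500", "500", None, "830"],
    "signup_date": ["2023-01-05", "2023-02-10", "2023-02-10",
                    "2023-03-01", "not a date"],
})

# Remove exact duplicate rows; copy so later assignments are safe.
clean = raw.drop_duplicates().copy()

# Convert types; unparseable entries become NaN / NaT instead of raising.
clean["balance"] = pd.to_numeric(clean["balance"], errors="coerce")
clean["signup_date"] = pd.to_datetime(clean["signup_date"], errors="coerce")

# Assumed strategy for missing balances: fill with the median.
clean["balance"] = clean["balance"].fillna(clean["balance"].median())

# Normalise labels, then list anything outside the expected set.
valid_industries = {"Retail", "Tech"}  # assumed list of valid labels
clean["industry"] = clean["industry"].str.strip().str.title()
unexpected = clean.loc[~clean["industry"].isin(valid_industries), "industry"]
clean["industry"] = clean["industry"].astype("category")

# Optional: flag outliers with the 1.5 * IQR rule.
q1, q3 = clean["balance"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (clean["balance"] < q1 - 1.5 * iqr) | (clean["balance"] > q3 + 1.5 * iqr)
outliers = clean[is_outlier]

print(clean.dtypes)
print(unexpected.tolist())
print(outliers)
```

Flagging outliers is not the same as removing them; whether an extreme value is an error or a genuine observation is a judgment you should explain in your notes.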


In [None]:
# Your code here


## 5. Descriptive Statistics
Explore your numeric variables.

Tasks:
- Use `.describe()` to summarize numeric columns  
- Find the min, max, mean, and median of key columns  
- Comment on what you observe
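
A minimal sketch of these summaries on toy data follows. The columns `balance` and `txn_count` are hypothetical; substitute the key numeric columns from your own dataset.

```python
import pandas as pd

# Toy numeric data (hypothetical columns) with one very large balance.
df = pd.DataFrame({
    "balance": [500.0, 830.0, 830.0, 1200.5, 9000.0],
    "txn_count": [3, 7, 5, 12, 40],
})

# Count, mean, std, min, quartiles, and max for every numeric column.
print(df.describe())

# Just the four statistics named in the task, one row per statistic.
summary = df.agg(["min", "max", "mean", "median"])
print(summary)
```

In this toy data the mean balance sits far above the median, which is the kind of observation worth commenting on: a few large values are pulling the mean upward.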


In [None]:
# Your code here


## 6. Categorical Variables
Look at how the records are distributed across categories.

Tasks:
- Count how often each value occurs (`.value_counts()`) in the categorical columns  
- Create bar plots to visualize the frequency of categories
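
The sketch below shows one way to do this for a single column. The `industry` column and its labels are hypothetical, and the plot uses the plotting method built into pandas; a seaborn `countplot` would be an equally reasonable choice.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Toy categorical data (hypothetical column and labels).
df = pd.DataFrame({
    "industry": ["Retail", "Tech", "Retail", "Health", "Retail", "Tech"],
})

# Frequency of each category, most common first.
counts = df["industry"].value_counts()
print(counts)

# Number of distinct categories.
n_categories = df["industry"].nunique()
print(n_categories)

# Bar plot of the frequencies, one bar per category.
ax = counts.plot.bar(rot=0)
ax.set_xlabel("Industry")
ax.set_ylabel("Number of customers")
ax.set_title("Customers per industry")
```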


In [None]:
# Your code here


## 7. Adoption Patterns
Explore adoption in the data.

Tasks:
- Calculate the **overall adoption rate** (the share of records that adopted)  
- Compare adoption across one or more categorical variables (e.g., use `.groupby()` on a category column)  
- Create bar plots to show adoption rates
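
As a hedged illustration, the sketch below assumes adoption is stored as a 0/1 flag in a column called `adopted`, so that its mean is the adoption rate. Both that column and `industry` are hypothetical names; if your adoption column holds text such as "Yes"/"No", convert it to 0/1 first.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Toy data (hypothetical columns): adopted is a 0/1 flag.
df = pd.DataFrame({
    "industry": ["Retail", "Retail", "Retail", "Tech", "Tech", "Health"],
    "adopted": [1, 0, 1, 1, 1, 0],
})

# The mean of a 0/1 column is the share of ones.
overall_rate = df["adopted"].mean()
print(f"Overall adoption rate: {overall_rate:.1%}")

# Adoption rate and group size per category, highest rate first.
by_industry = (
    df.groupby("industry")["adopted"]
    .agg(rate="mean", customers="size")
    .sort_values("rate", ascending=False)
)
print(by_industry)

# Bar plot of the rates on a fixed 0-1 scale.
ax = by_industry["rate"].plot.bar(rot=0)
ax.set_ylim(0, 1)
ax.set_xlabel("Industry")
ax.set_ylabel("Adoption rate")
```

Keeping the group size next to the rate helps you avoid over-reading a rate that is based on only a handful of records.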


In [None]:
# Your code here


## 8. Behavioural Comparisons
Compare behaviours between adopters and non-adopters.

Tasks:
- Compare averages (e.g., balances or transaction counts) for adopters vs non-adopters  
- Use plots (boxplot, violin plot, histogram) to visualize differences  
- Write down your observations
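
One way to approach this comparison is sketched below on toy data. The columns `adopted`, `balance`, and `txn_count` are hypothetical, and a boxplot is only one of the plot types listed above.

```python
import pandas as pd
import seaborn as sns

# Toy data (hypothetical columns): three adopters, three non-adopters.
df = pd.DataFrame({
    "adopted": [1, 1, 1, 0, 0, 0],
    "balance": [1500.0, 2200.0, 1800.0, 400.0, 900.0, 650.0],
    "txn_count": [12, 20, 16, 3, 8, 4],
})

# Average of each behaviour per group (row 0 = non-adopters, 1 = adopters).
group_means = df.groupby("adopted")[["balance", "txn_count"]].mean()
print(group_means)

# Boxplot of balances, one box per group.
ax = sns.boxplot(data=df, x="adopted", y="balance")
ax.set_xlabel("Adopted (0 = no, 1 = yes)")
ax.set_ylabel("Balance")
```

Means can hide skew, so it is worth checking whether the medians shown in the boxplot tell the same story before you write down your observations.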


In [None]:
# Your code here


## 9. Correlation Exploration
Check relationships between variables.

Tasks:
- Create a DataFrame with only numeric columns  
- Calculate correlations using `.corr()`  
- Plot a heatmap with `seaborn`  
- Identify and comment on interesting correlations
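
The four steps above can be sketched as follows. The toy columns are hypothetical, and the colour map and the fixed -1 to 1 colour scale are presentation choices, not requirements.

```python
import pandas as pd
import seaborn as sns

# Toy data (hypothetical columns): one text column, three numeric ones.
df = pd.DataFrame({
    "industry": ["Retail", "Tech", "Retail", "Tech", "Health", "Tech"],
    "balance": [400.0, 650.0, 900.0, 1500.0, 1800.0, 2200.0],
    "txn_count": [3, 4, 8, 12, 16, 20],
    "adopted": [0, 0, 0, 1, 1, 1],
})

# Keep only numeric columns; the text column is dropped.
numeric_df = df.select_dtypes(include="number")

# Pairwise Pearson correlations (the default method of .corr()).
corr = numeric_df.corr()
print(corr.round(2))

# Heatmap with the coefficient printed in each cell.
ax = sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1)
ax.set_title("Correlation matrix")
```

A high coefficient shows that two variables move together in this sample; it does not by itself show that one causes the other.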


In [None]:
# Your code here


## 10. Insights & Reflection
Now summarize what you’ve found.

Tasks:
- Write **3–5 bullet points** with your insights from the analysis  
- Reflect on:
  - Which variables are strongly related to adoption?  
  - What data cleaning steps had the biggest impact?  
  - What would you investigate further with more time?


**(Double-click to write your answer here)**


# End of Notebook

You have:  
- Loaded and cleaned the data  
- Explored the dataset with descriptive statistics and EDA  
- Visualized adoption patterns  
- Summarized key insights  

This demonstrates your ability to apply **data cleaning and analysis** skills independently.
