# 1. Data Validation and Cleaning
## Data Validation and Cleaning Steps
### 1. Identify and Correct Inconsistent Sales Methods
- **Objective**: Ensure consistency within the sales_method column to facilitate accurate grouping and analysis.
- **Process**: Examined the unique values in sales_method and found inconsistencies caused by variant spellings (e.g. "Em + Call", "Email + call").
- **Action Taken**: Standardized the sales_method values by stripping whitespace, converting all entries to lowercase, and mapping the remaining variant spellings to a single label per method.
- **Result**: The sales_method column now has consistent values, enabling accurate analysis across different sales approaches.
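
A minimal sketch of this standardization in pandas. The toy rows and the `mapping` dictionary are illustrative assumptions; only the variants "Em + Call" and "Email + call" come from the inconsistencies noted above.

```python
import pandas as pd

# Toy data; the variants mimic the inconsistencies described above.
data = pd.DataFrame(
    {"sales_method": ["Email", " email ", "Call", "Em + Call", "Email + call"]}
)

# Strip whitespace and lowercase, then map leftover variants to one label.
mapping = {"em + call": "email + call"}  # assumed mapping, for illustration
data["sales_method"] = (
    data["sales_method"].str.strip().str.lower().replace(mapping)
)

print(sorted(data["sales_method"].unique()))
```

Running `data["sales_method"].unique()` before and after the change is a quick way to confirm that no stray variants remain.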

### 2. Handle Missing Values in Revenue
- **Objective**: Address any missing values in the revenue column, as they are critical for calculating metrics.
- **Process**: Identified rows with missing values in revenue.
- **Action Taken**: Replaced the missing values directly rather than dropping the affected rows, so that all revenue data is complete for analysis.
- **Result**: Missing values in the revenue column have been filled, minimizing data loss and preserving as much information as possible.
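
The steps above do not state which value was used for the fill. As one possible approach, the sketch below fills each missing revenue with the median revenue of the same sales method; both the toy data and the choice of the median are assumptions.

```python
import pandas as pd

# Toy data, not the report's dataset.
data = pd.DataFrame(
    {
        "sales_method": ["email", "email", "email", "call", "call"],
        "revenue": [90.0, None, 110.0, 50.0, None],
    }
)

# Count missing values before filling.
n_missing = data["revenue"].isna().sum()

# Assumed strategy: fill with the median revenue of the same sales method.
group_median = data.groupby("sales_method")["revenue"].transform("median")
data["revenue"] = data["revenue"].fillna(group_median)

print(n_missing, data["revenue"].tolist())
```

A per-method fill keeps the imputed values in line with each method's typical revenue, which matters when methods differ widely in revenue per sale.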

### 3. Remove Duplicate Entries
- **Objective**: Eliminate redundant records to ensure accurate results.
- **Process**: Checked for and identified duplicate rows in the dataset.
- **Action Taken**: Removed all duplicated entries, keeping only unique records.
- **Result**: Duplicated entries have been eliminated, ensuring each record is unique and reducing any risk of biased results.
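
A short sketch of the duplicate check and removal, assuming duplicates are rows that match on every column. The `customer_id` values and other toy rows are hypothetical.

```python
import pandas as pd

# Toy data with one fully duplicated row (the first two rows are identical).
data = pd.DataFrame(
    {
        "customer_id": ["a1", "a1", "b2", "c3"],
        "sales_method": ["email", "email", "call", "email"],
        "revenue": [100.0, 100.0, 50.0, 95.0],
    }
)

# Count fully identical rows, then keep the first occurrence of each.
n_duplicates = data.duplicated().sum()
data = data.drop_duplicates().reset_index(drop=True)

print(n_duplicates, len(data))
```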

### 4. Validate Data Types
- **Objective**: Confirm that each column has an appropriate data type to support accurate analysis.
- **Process**: Reviewed the data type of each column and found that the revenue column was initially recognized as non-numeric.
- **Action Taken**: Converted revenue to a numeric data type to support calculations and analysis.
- **Result**: Data types have been validated and corrected, ensuring compatibility with analytical operations.
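
The conversion could look like the sketch below, assuming revenue was read in as text. The sample strings, including the unparseable "n/a", are made up for illustration.

```python
import pandas as pd

# Toy data: revenue stored as strings, including one unparseable entry.
data = pd.DataFrame({"revenue": ["100.5", "87", "n/a"]})

# errors="coerce" turns unparseable entries into NaN instead of raising.
data["revenue"] = pd.to_numeric(data["revenue"], errors="coerce")

print(data["revenue"].dtype)
print(data["revenue"].isna().sum())
```

Any entries coerced to NaN here would then need the same missing-value treatment as in step 2.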

### 5. Verify the Data Cleaning Results
- **Objective**: Ensure all cleaning steps have been correctly applied.
- **Process**: Rechecked for missing values, duplicates, data type consistency, and standardization in sales_method after all cleaning steps.
- **Result**: The dataset has passed all validation checks and is fully prepared for analysis.
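
These rechecks can be bundled into one helper, sketched below. The function name `run_checks` and the set `EXPECTED_METHODS` are hypothetical, and the expected labels assume the lowercase form produced in step 1.

```python
import pandas as pd

EXPECTED_METHODS = {"email", "call", "email + call"}  # assumed final labels


def run_checks(df: pd.DataFrame) -> dict:
    """Return a pass/fail flag for each cleaning check."""
    return {
        "no_missing_revenue": bool(df["revenue"].notna().all()),
        "no_duplicates": not df.duplicated().any(),
        "revenue_is_numeric": pd.api.types.is_numeric_dtype(df["revenue"]),
        "methods_standardized": set(df["sales_method"]) <= EXPECTED_METHODS,
    }


# Toy cleaned data, not the report's dataset.
cleaned = pd.DataFrame(
    {
        "sales_method": ["email", "call", "email + call"],
        "revenue": [100.0, 50.0, 180.0],
    }
)
checks = run_checks(cleaned)
print(checks)
```

Returning a dictionary rather than raising on the first failure shows every failed check in a single pass.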

## Summary
The data validation and cleaning process has ensured that the dataset is now complete, accurate, and consistent. Key issues, including inconsistent sales methods, missing values, duplicates, and data type mismatches, have been effectively addressed. This validated dataset provides a strong foundation for in-depth analysis and reliable insights.


# 2. Exploratory Data Analysis
## Single Variable Graphics
### 1. Distribution of Revenue
- Graphic: Histogram of the revenue variable to show its distribution.
- Insight: This helps identify the overall spread of revenue values, the frequency of low versus high revenue transactions, and any potential outliers.

![download (23)](download%20(23).png)
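
A histogram like this could be produced with matplotlib, as in the sketch below. The revenue values, the bin count, and the labels are assumptions for illustration, not the report's data or settings.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove to show a window
import matplotlib.pyplot as plt
import pandas as pd

# Toy revenue values, not the report's data.
data = pd.DataFrame({"revenue": [40, 45, 50, 52, 90, 95, 100, 180, 190, 230]})

fig, ax = plt.subplots()
# hist returns the per-bin counts, the bin edges, and the drawn patches.
counts, bin_edges, _ = ax.hist(data["revenue"], bins=5)
ax.set_xlabel("Revenue")
ax.set_ylabel("Number of transactions")
ax.set_title("Distribution of Revenue")
```

Calling `fig.savefig(...)` with a file name of your choice writes the figure to disk.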

### 2. Count of Customers by Sales Method
- Graphic: Bar chart showing the count of customers for each sales_method.
- Insight: Provides an understanding of which sales methods reach the most customers, which can indicate the method's popularity or effectiveness in engaging customers.

![download (20)](download%20(20).png)
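
The counts behind a chart like this can be computed and plotted as sketched below, assuming one `customer_id` per customer. The toy rows are hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove to show a window
import matplotlib.pyplot as plt
import pandas as pd

# Toy data, not the report's dataset.
data = pd.DataFrame(
    {
        "customer_id": ["a1", "b2", "c3", "d4", "e5", "f6"],
        "sales_method": ["email", "email", "email", "call", "call", "email + call"],
    }
)

# Unique customers per sales method, largest first.
customer_counts = (
    data.groupby("sales_method")["customer_id"]
    .nunique()
    .sort_values(ascending=False)
)

fig, ax = plt.subplots()
ax.bar(list(customer_counts.index), customer_counts.values)
ax.set_xlabel("Sales method")
ax.set_ylabel("Number of customers")
ax.set_title("Count of Customers by Sales Method")
```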


## Multi-variable Graphics
### 1. Weekly Revenue by Sales Method
- Graphic: Line plot showing the average weekly revenue across different sales_method categories.
- Insight: This visual allows us to observe any patterns or variations in revenue performance over time across different sales methods, helping to identify consistent or high-performing methods.

![download (22)](download%20(22).png)
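
A sketch of how the average weekly revenue per method could be computed and plotted. The column name `week` and all toy values are assumptions, since the plotted dataset's week column is not named above.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove to show a window
import matplotlib.pyplot as plt
import pandas as pd

# Toy data; "week" is an assumed column name.
data = pd.DataFrame(
    {
        "week": [1, 1, 1, 1, 2, 2, 2, 2],
        "sales_method": [
            "email", "email", "call", "email + call",
            "email", "call", "call", "email + call",
        ],
        "revenue": [90.0, 110.0, 40.0, 150.0, 100.0, 45.0, 55.0, 170.0],
    }
)

# Rows are weeks, columns are sales methods, cells are mean revenue.
weekly = data.pivot_table(
    index="week", columns="sales_method", values="revenue", aggfunc="mean"
)

fig, ax = plt.subplots()
for method in weekly.columns:
    ax.plot(weekly.index, weekly[method], marker="o", label=method)
ax.set_xlabel("Week")
ax.set_ylabel("Average revenue")
ax.legend(title="Sales method")
```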


## Findings
- Revenue Distribution: Revenue is heavily concentrated at lower values, with some high-revenue outliers. This suggests that most transactions yield low to moderate revenue, with fewer high-value transactions.
- Customer Reach by Sales Method: Email reaches the most customers, followed by Call and Email + Call. This indicates Email is widely used, possibly due to ease of scaling compared to other methods.
- Weekly Revenue Trend: Email + Call consistently generates higher weekly revenue, though it’s used less frequently than Email alone. This suggests a higher effectiveness in driving revenue but may require more resources.

# Definition of a Metric for Business Monitoring
## Metric: Revenue per Customer
- Reasoning: This metric allows the business to understand the average revenue generated by each customer per sales method. By tracking revenue per customer, the business can monitor the effectiveness of each sales method and identify opportunities for improvement or scaling.

## Calculation of Initial Metric
To calculate the initial values, divide total revenue by the number of unique customers per sales method:

`revenue_per_customer = data.groupby('sales_method')['revenue'].sum() / data.groupby('sales_method')['customer_id'].nunique()`
`print("Revenue per Customer by Sales Method:\n", revenue_per_customer)`

## Monitoring Approach for Revenue per Customer
- Weekly Tracking: Track revenue per customer on a weekly basis for each sales method. This allows the business to detect trends, such as a drop in revenue per customer for any specific method, and adjust strategies accordingly.
- Comparison Across Sales Methods: Regularly compare revenue per customer across different methods to assess which methods are performing well. If two methods yield similar revenue per customer but have differing resource requirements, the business can prioritize the more efficient option.
- Setting Benchmarks: Use initial values of revenue per customer as benchmarks. By monitoring changes over time, the business can set goals for incremental improvements.
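
The monitoring approach above could be sketched as follows. The `week` column name, the helper `revenue_per_customer`, the toy rows, and the use of the all-weeks value as the benchmark are assumptions for illustration.

```python
import pandas as pd

# Toy data; "week" is an assumed column name.
data = pd.DataFrame(
    {
        "week": [1, 1, 1, 2, 2, 2],
        "sales_method": ["email", "email", "call", "email", "call", "call"],
        "customer_id": ["a1", "b2", "c3", "d4", "e5", "f6"],
        "revenue": [100.0, 90.0, 50.0, 80.0, 55.0, 45.0],
    }
)


def revenue_per_customer(df, keys):
    """Total revenue divided by unique customers within each group."""
    grouped = df.groupby(keys)
    return grouped["revenue"].sum() / grouped["customer_id"].nunique()


# Benchmark: revenue per customer by method over all weeks (assumed baseline).
benchmark = revenue_per_customer(data, "sales_method")

# Weekly values per method, compared against the benchmark.
weekly = (
    revenue_per_customer(data, ["week", "sales_method"])
    .rename("weekly_rpc")
    .reset_index()
)
weekly["benchmark"] = weekly["sales_method"].map(benchmark)
weekly["below_benchmark"] = weekly["weekly_rpc"] < weekly["benchmark"]

print(weekly)
```

Rows flagged in `below_benchmark` mark the weeks in which a method underperformed its baseline and may warrant a closer look.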

# Final Summary
## Summary of Findings
- Revenue per Customer: The Email + Call method produces the highest revenue per customer, making it ideal for targeting high-revenue clients, though it may require more resources.
- Customer Reach: The Email method is highly scalable, reaching the largest number of customers at a lower cost, suitable for broad engagement.
- Weekly Trends: Email + Call shows consistent high weekly revenue, suggesting it drives greater value despite its intensive nature, whereas Email can maintain customer engagement effectively with fewer resources.

## Recommendations
- Strategic Focus: Prioritize the Email + Call method for high-value clients while using Email for broader campaigns, optimizing cost-effectiveness.
- Metric Monitoring: Implement "Revenue per Customer" as a weekly monitored metric to assess performance, helping the business adjust its approach dynamically.
- Resource Allocation: Allocate resources to the sales methods with the highest return, and rebalance as each method's revenue per customer changes over time.