# Homework Assignment 8: Advanced Data Wrangling - Capstone Project

**Student Name:** [Enter Your Full Name Here]

**Student ID:** [Enter Your Student ID]

**Date Submitted:** [Enter Today's Date]

**Due Date:** [Insert Due Date Here]

---

## 🎓 Capstone Project Overview

This is your **final capstone assignment** that integrates everything you've learned in this R data wrangling course. You'll work with real business data to perform a comprehensive analysis that demonstrates mastery of:

- Data import and validation
- Data transformation with dplyr (select, filter, arrange, mutate, summarize, group_by)
- String manipulation with stringr
- Date/time operations with lubridate
- Complex conditional logic with case_when()
- Data quality checks and validation
- Business intelligence reporting

## 🎯 Business Scenario

You are a data analyst for a growing e-commerce company. The executive team needs a comprehensive analysis of sales performance, customer behavior, and operational efficiency. Your analysis will directly inform strategic decisions about:
- Resource allocation across regions
- Customer retention strategies
- Product portfolio optimization
- Operational improvements

## 📊 Your Deliverables

1. **Data Quality Report**: Validate and clean the data
2. **Customer Segmentation**: Identify high-value customers and at-risk accounts
3. **Product Performance Analysis**: Evaluate product categories and features
4. **Regional Performance**: Compare performance across geographic regions
5. **Temporal Trends**: Analyze patterns over time
6. **Executive Dashboard**: Create a comprehensive business intelligence summary
7. **Strategic Recommendations**: Provide data-driven business recommendations

## 📁 Dataset

You will work with `company_sales_data.csv`, which contains:
- Sales transactions with revenue, cost, and profit data
- Customer information and transaction dates
- Product details and categories
- Regional information
- Sales representative data

## ⏱️ Time Estimate

Plan for 3-4 hours to complete this comprehensive analysis.

## 📝 Instructions

- Complete all tasks in order
- Write your code in the designated TODO sections
- Use the pipe operator (`%>%`) to chain operations
- Add comments explaining your business logic
- Run all cells to verify your code works
- Answer all reflection questions
- Create professional, business-ready outputs

---

## Part 1: Data Import and Initial Validation

**Business Context:** Before any analysis, professional data analysts must ensure data quality and understand the dataset structure.

**Your Tasks:**
1. Load all required packages
2. Import the sales data
3. Perform initial data exploration
4. Validate data quality
5. Document any issues found
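
The import-and-explore steps above can be rehearsed on a small stand-in file before touching the real dataset. The sketch below is only an illustration: the file is written to a temporary path, and its column names and values are assumptions based on the task descriptions, not the contents of `company_sales_data.csv`.

```r
library(readr)

# Hypothetical toy file; names and values are made up for illustration.
toy_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Date,Region,Revenue,Cost,Units_Sold",
  "2024-01-06,north,1200,800,10",
  "2024-01-07,South,950,700,8",
  "2024-01-08,north,NA,400,5"
), toy_path)

# show_col_types = FALSE suppresses the column-type message
toy_sales <- read_csv(toy_path, show_col_types = FALSE)

str(toy_sales)       # structure: column types and a preview of values
head(toy_sales, 2)   # first rows
summary(toy_sales)   # ranges for numeric columns, including NA counts
names(toy_sales)     # column names
```

Checking `str()` first is useful because it shows whether dates and numbers were read with the types you expect.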

In [None]:
# Task 1.1: Load Required Packages
# TODO: Load tidyverse (includes dplyr, stringr, ggplot2)


# TODO: Load lubridate for date operations


cat("✅ Packages loaded successfully!\n")

In [None]:
# Task 1.2: Import Sales Data
# TODO: Import company_sales_data.csv from the data/ directory
# Store it in a variable called 'sales_data'
sales_data <- 

cat("✅ Data imported successfully!\n")
cat("Total rows:", nrow(sales_data), "\n")
cat("Total columns:", ncol(sales_data), "\n")

In [None]:
# Task 1.3: Initial Data Exploration

cat("=== DATA STRUCTURE ===\n")
# TODO: Display the structure using str()


cat("\n=== FIRST 10 ROWS ===\n")
# TODO: Display first 10 rows


cat("\n=== SUMMARY STATISTICS ===\n")
# TODO: Display summary statistics


cat("\n=== COLUMN NAMES ===\n")
# TODO: Display all column names



In [None]:
# Task 1.4: Data Quality Validation

cat("=== DATA QUALITY REPORT ===\n\n")

# TODO: Create a data quality summary with:
#   - missing_values: count of NA values per column
#   - negative_revenue: count of Revenue < 0
#   - negative_cost: count of Cost < 0
#   - zero_units: count of Units_Sold <= 0
#   - duplicate_rows: count of duplicate rows

data_quality_summary <- list(
  # Your code here:
  
)

# Display quality report
cat("Missing Values:\n")
print(data_quality_summary$missing_values)

cat("\nData Validation:\n")
cat("Negative Revenue:", data_quality_summary$negative_revenue, "\n")
cat("Negative Cost:", data_quality_summary$negative_cost, "\n")
cat("Zero/Negative Units:", data_quality_summary$zero_units, "\n")
cat("Duplicate Rows:", data_quality_summary$duplicate_rows, "\n")

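# Include missing values and duplicates so the pass message covers every reported check
if (sum(data_quality_summary$missing_values,
        data_quality_summary$negative_revenue, 
        data_quality_summary$negative_cost,
        data_quality_summary$zero_units,
        data_quality_summary$duplicate_rows) == 0) {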
  cat("\n✅ All validation checks passed!\n")
} else {
  cat("\n⚠️  Data quality issues detected!\n")
}
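
As a rough illustration of the kinds of checks listed in Task 1.4, the sketch below counts missing values, out-of-range values, and duplicate rows on a made-up data frame. The data and the object name `toy_quality` are hypothetical; only base R functions are used.

```r
# Hypothetical toy data with one NA, one negative revenue,
# one zero-unit row, and one exact duplicate (rows 3 and 4).
toy <- data.frame(
  Revenue    = c(100, -5, 250, 250, NA),
  Cost       = c(60, 10, 300, 300, 40),
  Units_Sold = c(2, 0, 5, 5, 1)
)

toy_quality <- list(
  missing_values   = colSums(is.na(toy)),                 # NA count per column
  negative_revenue = sum(toy$Revenue < 0, na.rm = TRUE),  # na.rm so an NA does not make the count NA
  negative_cost    = sum(toy$Cost < 0, na.rm = TRUE),
  zero_units       = sum(toy$Units_Sold <= 0, na.rm = TRUE),
  duplicate_rows   = sum(duplicated(toy))                 # rows identical to an earlier row
)

print(toy_quality)
```

Note that `duplicated()` flags only the second and later copies of a row, so a pair of identical rows counts as one duplicate.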

## Part 2: Data Transformation and Feature Engineering

**Business Context:** Raw data needs calculated fields and categorizations to generate business insights.

**Your Tasks:**
1. Calculate financial metrics (`profit`, `profit_margin`, `roi`)
2. Create performance categories
3. Clean and standardize text fields
4. Parse and extract date components
5. Create customer value scores

In [None]:
# Task 2.1: Calculate Financial Metrics
# TODO: Create a new dataframe 'sales_enhanced' with these new columns:
#   - profit: Revenue - Cost
#   - profit_margin: (profit / Revenue) * 100
#   - roi: (profit / Cost) * 100
#   - revenue_per_unit: Revenue / Units_Sold
#   - cost_per_unit: Cost / Units_Sold

sales_enhanced <- sales_data %>%
  mutate(
    # Your code here:
    
  )

# Display sample of calculated metrics
cat("Financial Metrics Sample:\n")
sales_enhanced %>%
  select(Revenue, Cost, profit, profit_margin, roi) %>%
  head(10) %>%
  print()
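
Ratio metrics such as margin and per-unit revenue divide by columns that may contain zeros. The sketch below, on made-up numbers, shows what R returns in that case and one possible guard using `if_else()`. The guard is an assumption about how you might want to handle it, not a requirement of the task.

```r
library(dplyr)

# Hypothetical toy data; the third row has zero revenue and zero units.
toy <- tibble(
  Revenue    = c(1000, 500, 0),
  Cost       = c(600, 500, 50),
  Units_Sold = c(10, 5, 0)
)

toy_metrics <- toy %>%
  mutate(
    profit        = Revenue - Cost,
    profit_margin = profit / Revenue * 100,   # -50 / 0 yields -Inf, not an error
    roi           = profit / Cost * 100,
    # Optional guard: return NA instead of NaN/Inf when no units were sold
    revenue_per_unit = if_else(Units_Sold > 0, Revenue / Units_Sold, NA_real_)
  )

print(toy_metrics)
```

Because R does not raise an error on division by zero, `Inf` or `NaN` values can silently distort later means and sums, which is one reason the validation step in Part 1 matters.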

In [None]:
# Task 2.2: Create Performance Categories
# TODO: Add these new categorical columns using case_when():
#   - performance_tier: "High" if profit_margin > 40, "Medium" if > 25, else "Low"
#   - revenue_size: "Large" if Revenue > 25000, "Medium" if > 15000, else "Small"
#   - deal_type: "Bulk" if Units_Sold > 40, "Standard" if > 15, else "Small"
#   - high_value_flag: "Yes" if Revenue > 20000, else "No"

sales_enhanced <- sales_enhanced %>%
  mutate(
    # Your code here:
    
  )

# Display category distribution
cat("Performance Tier Distribution:\n")
table(sales_enhanced$performance_tier) %>% print()

cat("\nRevenue Size Distribution:\n")
table(sales_enhanced$revenue_size) %>% print()
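
`case_when()` evaluates its conditions top to bottom and uses the first one that is true, so threshold tiers should be listed from the strictest condition down. The sketch below uses a made-up `profit_margin` vector to show how boundary values and `NA` behave. The explicit `NA` branch is an optional addition, not part of the task specification.

```r
library(dplyr)

# Hypothetical margins, including the boundary values 40 and 25 and a missing value.
toy <- tibble(profit_margin = c(55, 40, 30, 25, 10, NA))

toy_tiers <- toy %>%
  mutate(
    tier = case_when(
      profit_margin > 40   ~ "High",
      profit_margin > 25   ~ "Medium",       # reached only when the first test is not TRUE
      is.na(profit_margin) ~ NA_character_,  # keep missing values missing
      TRUE                 ~ "Low"           # everything else
    )
  )

print(toy_tiers)
```

A margin of exactly 40 lands in "Medium" and exactly 25 lands in "Low" because both comparisons are strict. Without the `is.na()` branch, a missing margin would fall through to "Low".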

In [None]:
# Task 2.3: Clean and Standardize Text Fields
# TODO: Create cleaned versions of text columns:
#   - product_category_clean: Trim whitespace and convert to Title Case
#   - region_clean: Trim whitespace and convert to Title Case
#   - sales_rep_clean: Trim whitespace and convert to Title Case
# Hint: Use str_trim() and str_to_title()

sales_enhanced <- sales_enhanced %>%
  mutate(
    # Your code here:
    
  )

# Show unique values before and after
cat("Original Product Categories:\n")
print(unique(sales_data$Product_Category))

cat("\nCleaned Product Categories:\n")
print(unique(sales_enhanced$product_category_clean))
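
To see why trimming and case standardization matter before grouping, the sketch below cleans a small made-up vector of category labels with `str_trim()` and `str_to_title()`, the two functions named in the hint. The labels are hypothetical.

```r
library(stringr)

# Hypothetical raw labels: the same category typed three different ways.
raw <- c("  electronics", "ELECTRONICS ", "Electronics", "home & garden")

clean <- str_to_title(str_trim(raw))

unique(raw)    # four distinct strings
unique(clean)  # two distinct categories after cleaning
```

Without this step, `group_by()` would treat each spelling variant as its own category and split its revenue across several rows.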

In [None]:
# Task 2.4: Parse Dates and Extract Components
# TODO: Add these date-related columns:
#   - date_parsed: Parse the Date column (check format first!)
#   - sale_year: Extract year
#   - sale_month: Extract month number
#   - sale_month_name: Extract month name (label=TRUE, abbr=FALSE)
#   - sale_quarter: Extract quarter
#   - sale_weekday: Extract weekday name (label=TRUE, abbr=FALSE)
#   - is_weekend: TRUE if Saturday or Sunday
# Hint: Use ymd(), mdy(), or dmy() depending on date format

sales_enhanced <- sales_enhanced %>%
  mutate(
    # Your code here:
    
  )

# Display date components
cat("Date Components Sample:\n")
sales_enhanced %>%
  select(Date, date_parsed, sale_month_name, sale_weekday, is_weekend) %>%
  head(10) %>%
  print()
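
The sketch below walks through parsing and component extraction on three made-up dates. It assumes an ISO-style `YYYY-MM-DD` format, so `ymd()` is the right parser here; your dataset may need `mdy()` or `dmy()` instead. The weekend test uses `week_start = 1` so that Monday is 1 and Saturday and Sunday are 6 and 7, which avoids depending on the locale-specific day names.

```r
library(lubridate)

# Hypothetical dates: a Saturday, a Sunday, and a Friday.
raw_dates <- c("2024-01-06", "2024-01-07", "2024-03-15")
parsed <- ymd(raw_dates)

sale_year    <- year(parsed)
sale_month   <- month(parsed)                               # month number
sale_quarter <- quarter(parsed)
month_name   <- month(parsed, label = TRUE, abbr = FALSE)   # ordered factor; names depend on locale
weekday_name <- wday(parsed, label = TRUE, abbr = FALSE)    # ordered factor; names depend on locale

# Monday = 1 ... Sunday = 7, so values of 6 or more are weekend days
is_weekend <- wday(parsed, week_start = 1) >= 6

data.frame(raw_dates, parsed, sale_month, sale_quarter, weekday_name, is_weekend)
```

If a parser does not match the format, lubridate returns `NA` with a warning rather than stopping, so it is worth checking `sum(is.na(parsed))` after parsing.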

In [None]:
# Task 2.5: Create Customer Value Score
# TODO: Create a 'customer_value_score' column using case_when().
# case_when() returns the first matching branch, so keep the conditions in this order:
#   - "Platinum": high_value_flag == "Yes" AND performance_tier == "High"
#   - "Gold": high_value_flag == "Yes" OR performance_tier == "High"
#   - "Silver": revenue_size == "Medium" OR performance_tier == "Medium"
#   - "Bronze": All others

sales_enhanced <- sales_enhanced %>%
  mutate(
    # Your code here:
    
  )

# Display value score distribution
cat("Customer Value Score Distribution:\n")
table(sales_enhanced$customer_value_score) %>% print()

cat("\nValue Score by Performance Tier:\n")
table(sales_enhanced$customer_value_score, sales_enhanced$performance_tier) %>% print()

## Part 3: Comprehensive Business Analysis

**Business Context:** Executives need insights across multiple dimensions to make strategic decisions.

**Your Tasks:**
1. Analyze performance by region
2. Evaluate product categories
3. Assess sales representative performance
4. Analyze temporal patterns
5. Identify top performers and opportunities

In [None]:
# Task 3.1: Regional Performance Analysis
# TODO: Create 'regional_performance' by grouping by region_clean and calculating:
#   - total_revenue: sum of Revenue
#   - total_profit: sum of profit
#   - avg_profit_margin: mean of profit_margin
#   - transaction_count: count using n()
#   - total_units: sum of Units_Sold
#   - avg_deal_size: mean of Revenue
# TODO: Add revenue_share: (total_revenue / sum(total_revenue)) * 100
# TODO: Arrange by total_revenue descending

regional_performance <- sales_enhanced %>%
  # Your code here:
  

cat("=== REGIONAL PERFORMANCE ANALYSIS ===\n")
print(regional_performance)

# Identify top region
top_region <- regional_performance$region_clean[1]
cat("\n🏆 Top Performing Region:", as.character(top_region), "\n")
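
The grouped-summary pattern used throughout Part 3 can be sketched on a tiny made-up table. The column name `region` and all values are hypothetical. The point is the sequence: summarize per group, then compute each group's share against the total of the summarized column, then sort.

```r
library(dplyr)

# Hypothetical transactions in two regions.
toy <- tibble(
  region  = c("East", "East", "West", "West", "West"),
  Revenue = c(100, 300, 50, 50, 100),
  profit  = c(40, 90, 10, 20, 30)
)

toy_regional <- toy %>%
  group_by(region) %>%
  summarize(
    total_revenue     = sum(Revenue),
    total_profit      = sum(profit),
    transaction_count = n(),
    avg_deal_size     = mean(Revenue),
    .groups = "drop"   # return an ungrouped tibble so the next sum() spans all regions
  ) %>%
  mutate(revenue_share = total_revenue / sum(total_revenue) * 100) %>%
  arrange(desc(total_revenue))

print(toy_regional)
```

Dropping the grouping before `mutate()` matters: if the result were still grouped by region, `sum(total_revenue)` would be computed within each region and every share would come out as 100.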

In [None]:
# Task 3.2: Product Category Analysis
# TODO: Create 'category_performance' by grouping by product_category_clean
# Calculate the same metrics as regional analysis
# Add revenue_share and arrange by total_revenue descending

category_performance <- sales_enhanced %>%
  # Your code here:
  

cat("=== PRODUCT CATEGORY PERFORMANCE ===\n")
print(category_performance)

# Identify top category
top_category <- category_performance$product_category_clean[1]
cat("\n🏆 Top Performing Category:", as.character(top_category), "\n")

In [None]:
# Task 3.3: Sales Representative Performance
# TODO: Create 'sales_rep_performance' by grouping by sales_rep_clean
# Calculate: total_revenue, total_profit, avg_profit_margin, transaction_count
# Arrange by total_revenue descending and show top 10

sales_rep_performance <- sales_enhanced %>%
  # Your code here:
  

cat("=== TOP 10 SALES REPRESENTATIVES ===\n")
print(head(sales_rep_performance, 10))

In [None]:
# Task 3.4: Monthly Trend Analysis
# TODO: Create 'monthly_trends' by grouping by sale_year, sale_month, and sale_month_name
# (sale_month must be a grouping column so it is still available for sorting)
# Calculate: total_revenue, transaction_count, avg_profit_margin
# Arrange by sale_year and sale_month

monthly_trends <- sales_enhanced %>%
  # Your code here:
  

cat("=== MONTHLY SALES TRENDS ===\n")
print(monthly_trends)
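
A summary keeps only the grouping columns and the new summary columns, so a month number can be used for sorting only if it is one of the grouping keys. The sketch below, on made-up dates and revenues, groups by year, month number, and month name together so the result can be arranged chronologically. The column `date` and all values are hypothetical.

```r
library(dplyr)
library(lubridate)

# Hypothetical transactions listed out of chronological order.
toy <- tibble(
  date    = ymd(c("2024-03-15", "2024-01-06", "2024-01-07", "2023-12-30")),
  Revenue = c(300, 100, 200, 50)
)

toy_monthly <- toy %>%
  mutate(
    sale_year       = year(date),
    sale_month      = month(date),
    sale_month_name = month(date, label = TRUE, abbr = FALSE)
  ) %>%
  # Month number and month name identify the same groups; including both keeps both columns
  group_by(sale_year, sale_month, sale_month_name) %>%
  summarize(
    total_revenue     = sum(Revenue),
    transaction_count = n(),
    .groups = "drop"
  ) %>%
  arrange(sale_year, sale_month)

print(toy_monthly)
```

Sorting by the month name as plain text would put April before January, which is why the numeric month (or the ordered factor returned by `label = TRUE`) is used for ordering.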

In [None]:
# Task 3.5: Weekday Pattern Analysis
# TODO: Create 'weekday_patterns' by grouping by sale_weekday
# Calculate: transaction_count, total_revenue, avg_revenue
# Arrange by transaction_count descending

weekday_patterns <- sales_enhanced %>%
  # Your code here:
  

cat("=== TRANSACTION PATTERNS BY WEEKDAY ===\n")
print(weekday_patterns)

# Calculate weekend vs weekday performance
weekend_summary <- sales_enhanced %>%
  group_by(is_weekend) %>%
  summarize(
    transaction_count = n(),
    total_revenue = sum(Revenue),
    .groups = 'drop'
  )

cat("\n=== WEEKEND VS WEEKDAY ===\n")
print(weekend_summary)

In [None]:
# Task 3.6: Multi-Dimensional Analysis
# TODO: Create 'region_category_performance' by grouping by region_clean AND product_category_clean
# Calculate: total_revenue, transaction_count, avg_profit_margin
# Arrange by total_revenue descending and show top 15

region_category_performance <- sales_enhanced %>%
  # Your code here:
  

cat("=== TOP 15 REGION-CATEGORY COMBINATIONS ===\n")
print(head(region_category_performance, 15))

## Part 4: Executive Dashboard and KPIs

**Business Context:** Executives need a high-level summary of business performance with key metrics and insights.

**Your Tasks:**
1. Calculate overall business KPIs
2. Identify top performers across all dimensions
3. Highlight areas of concern
4. Create a professional executive summary

In [None]:
# Task 4.1: Calculate Overall Business KPIs
# TODO: Create 'business_kpis' with these metrics:
#   - total_revenue: sum of Revenue
#   - total_profit: sum of profit
#   - overall_profit_margin: mean of profit_margin
#     (an unweighted average of per-transaction margins; it can differ from sum(profit) / sum(Revenue) * 100)
#   - total_transactions: count using n()
#   - total_units_sold: sum of Units_Sold
#   - avg_transaction_value: mean of Revenue
#   - high_value_transaction_pct: percentage where high_value_flag == "Yes"
#   - platinum_customer_pct: percentage where customer_value_score == "Platinum"

business_kpis <- sales_enhanced %>%
  summarize(
    # Your code here:
    
  )

# strrep() builds an unbroken rule; cat(rep("=", 70)) would print a space between each character.
# sep = "" keeps the dollar sign and percent sign attached to their numbers.
cat("\n", strrep("=", 70), "\n", sep = "")
cat("           EXECUTIVE BUSINESS DASHBOARD\n")
cat(strrep("=", 70), "\n\n", sep = "")

cat("📊 KEY PERFORMANCE INDICATORS\n")
cat(strrep("─", 40), "\n", sep = "")
cat("Total Revenue: $", format(business_kpis$total_revenue, big.mark = ","), "\n", sep = "")
cat("Total Profit: $", format(business_kpis$total_profit, big.mark = ","), "\n", sep = "")
cat("Overall Profit Margin: ", round(business_kpis$overall_profit_margin, 1), "%\n", sep = "")
cat("Total Transactions: ", business_kpis$total_transactions, "\n", sep = "")
cat("Total Units Sold: ", format(business_kpis$total_units_sold, big.mark = ","), "\n", sep = "")
cat("Avg Transaction Value: $", format(round(business_kpis$avg_transaction_value, 2), big.mark = ","), "\n", sep = "")
cat("High-Value Transactions: ", round(business_kpis$high_value_transaction_pct, 1), "%\n", sep = "")
cat("Platinum Customers: ", round(business_kpis$platinum_customer_pct, 1), "%\n", sep = "")
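
Two details of KPI arithmetic are easy to miss: the mean of per-transaction margins is not the same as total profit divided by total revenue, and a percentage of rows matching a condition can be computed as the mean of a logical vector. The sketch below uses two made-up transactions to show both. The summary column names are hypothetical.

```r
library(dplyr)

# Hypothetical data: one small high-margin sale and one large low-margin sale.
toy <- tibble(
  Revenue         = c(100, 900),
  profit          = c(50, 90),
  high_value_flag = c("No", "Yes")
)

toy_kpis <- toy %>%
  summarize(
    mean_of_margins  = mean(profit / Revenue * 100),        # (50 + 10) / 2 = 30
    aggregate_margin = sum(profit) / sum(Revenue) * 100,    # 140 / 1000 * 100 = 14
    high_value_pct   = mean(high_value_flag == "Yes") * 100 # share of TRUE values = 50
  )

print(toy_kpis)
```

The unweighted mean gives the small sale as much influence as the large one, so the two margin figures answer different questions. State which one you are reporting.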

In [None]:
# Task 4.2: Identify Top Performers

cat("\n🏆 TOP PERFORMERS\n")
cat(strrep("─", 40), "\n", sep = "")  # strrep() avoids the spaces that cat(rep(...)) inserts

# TODO: Display top region (from regional_performance)
cat("Top Region:", as.character(regional_performance$region_clean[1]), "\n")
cat("  Revenue: $", format(regional_performance$total_revenue[1], big.mark = ","), "\n", sep = "")

# TODO: Display top product category (from category_performance)
cat("\nTop Product Category:", as.character(category_performance$product_category_clean[1]), "\n")
cat("  Revenue: $", format(category_performance$total_revenue[1], big.mark = ","), "\n", sep = "")

# TODO: Display top sales rep (from sales_rep_performance)
cat("\nTop Sales Representative:", as.character(sales_rep_performance$sales_rep_clean[1]), "\n")
cat("  Revenue: $", format(sales_rep_performance$total_revenue[1], big.mark = ","), "\n", sep = "")

# TODO: Display busiest weekday (from weekday_patterns)
cat("\nBusiest Weekday:", as.character(weekday_patterns$sale_weekday[1]), "\n")
cat("  Transactions:", weekday_patterns$transaction_count[1], "\n")

In [None]:
# Task 4.3: Performance Distribution Analysis

cat("\n📈 PERFORMANCE DISTRIBUTION\n")
cat(strrep("─", 40), "\n", sep = "")

# TODO: Calculate and display distribution of performance_tier
performance_dist <- sales_enhanced %>%
  group_by(performance_tier) %>%
  summarize(
    count = n(),
    total_revenue = sum(Revenue),
    percentage = (n() / nrow(sales_enhanced)) * 100,
    .groups = 'drop'
  ) %>%
  arrange(desc(total_revenue))

print(performance_dist)

# TODO: Calculate and display distribution of customer_value_score
cat("\n💎 CUSTOMER VALUE DISTRIBUTION\n")
cat(strrep("─", 40), "\n", sep = "")

value_dist <- sales_enhanced %>%
  group_by(customer_value_score) %>%
  summarize(
    count = n(),
    total_revenue = sum(Revenue),
    percentage = (n() / nrow(sales_enhanced)) * 100,
    .groups = 'drop'
  ) %>%
  arrange(desc(total_revenue))

print(value_dist)

## Part 5: Strategic Insights and Recommendations

**Business Context:** Data analysis must translate into actionable business recommendations.

**Your Tasks:**
1. Identify key business opportunities
2. Highlight areas of concern
3. Provide data-driven recommendations
4. Prioritize action items

In [None]:
# Task 5.1: Identify Growth Opportunities

cat("\n💡 STRATEGIC OPPORTUNITIES\n")
cat(strrep("=", 70), "\n\n", sep = "")

# TODO: Find underperforming regions with potential
# (regions with low revenue but high profit margins)
underperforming_regions <- regional_performance %>%
  filter(total_revenue < median(total_revenue) & avg_profit_margin > median(avg_profit_margin)) %>%
  arrange(desc(avg_profit_margin))

cat("1. UNDERPERFORMING REGIONS WITH HIGH MARGINS:\n")
if(nrow(underperforming_regions) > 0) {
  print(underperforming_regions)
  cat("   → Opportunity: Increase marketing investment in these regions\n\n")
} else {
  cat("   No underperforming regions with high margins identified\n\n")
}

# TODO: Find product categories with growth potential
cat("2. PRODUCT CATEGORY OPPORTUNITIES:\n")
cat("   Top 3 categories by profit margin:\n")
category_performance %>%
  arrange(desc(avg_profit_margin)) %>%
  head(3) %>%
  select(product_category_clean, avg_profit_margin, total_revenue) %>%
  print()
cat("   → Opportunity: Expand inventory in high-margin categories\n\n")

# TODO: Analyze weekend vs weekday performance
cat("3. TEMPORAL OPPORTUNITIES:\n")
# mean() of a logical vector is the share of TRUE values; na.rm = TRUE skips rows whose date failed to parse
weekend_pct <- mean(sales_enhanced$is_weekend, na.rm = TRUE) * 100
cat("   Weekend transactions:", round(weekend_pct, 1), "%\n")
if(weekend_pct < 25) {
  cat("   → Opportunity: Increase weekend promotions and staffing\n\n")
} else {
  cat("   → Weekend performance is strong\n\n")
}

In [None]:
# Task 5.2: Identify Areas of Concern

cat("\n⚠️  AREAS OF CONCERN\n")
cat(strrep("=", 70), "\n\n", sep = "")

# TODO: Find low-performing regions
low_performing_regions <- regional_performance %>%
  filter(avg_profit_margin < 30) %>%
  arrange(avg_profit_margin)

cat("1. LOW PROFIT MARGIN REGIONS:\n")
if(nrow(low_performing_regions) > 0) {
  print(low_performing_regions)
  cat("   → Action: Review pricing and cost structure in these regions\n\n")
} else {
  cat("   All regions performing well\n\n")
}

# TODO: Identify small deal concentration
small_deal_pct <- (sum(sales_enhanced$deal_type == "Small") / nrow(sales_enhanced)) * 100
cat("2. DEAL SIZE DISTRIBUTION:\n")
cat("   Small deals:", round(small_deal_pct, 1), "%\n")
if(small_deal_pct > 40) {
  cat("   → Concern: High concentration of small deals\n")
  cat("   → Action: Implement upselling strategies\n\n")
} else {
  cat("   Deal size distribution is healthy\n\n")
}

# TODO: Check customer value distribution
bronze_pct <- (sum(sales_enhanced$customer_value_score == "Bronze") / nrow(sales_enhanced)) * 100
cat("3. CUSTOMER VALUE CONCERNS:\n")
cat("   Bronze customers:", round(bronze_pct, 1), "%\n")
if(bronze_pct > 50) {
  cat("   → Concern: Large proportion of low-value customers\n")
  cat("   → Action: Develop customer upgrade programs\n\n")
} else {
  cat("   Customer value distribution is acceptable\n\n")
}

## Part 6: Capstone Reflection Questions

Answer the following questions based on your comprehensive analysis. These questions assess your understanding of the entire data wrangling workflow and its business applications.

### Question 6.1: Data Wrangling Workflow

**Describe the complete data wrangling workflow you followed in this capstone project. What were the most critical steps, and why? How did each step build upon the previous one?**

Your answer here:



### Question 6.2: Integration of Skills

**How did you integrate string manipulation, date/time operations, and data transformation techniques in this project? Provide specific examples where combining these skills was essential for generating insights.**

Your answer here:



### Question 6.3: Business Impact

**Based on your analysis, what are the top 3 most impactful business recommendations you would make to the executive team? For each recommendation, explain:**
- What data supports this recommendation?
- What is the expected business impact?
- How would you measure success?

Your answer here:

1. 

2. 

3. 



### Question 6.4: Data Quality Importance

**Why was data quality validation critical in this project? What would have happened if you had skipped the validation steps? Provide specific examples of how data quality issues could have led to incorrect business decisions.**

Your answer here:



### Question 6.5: Grouped Analysis Value

**How did grouped analysis (group_by + summarize) help you uncover insights that wouldn't be visible in the raw data? Provide at least three specific examples from your analysis.**

Your answer here:

1. 

2. 

3. 



### Question 6.6: Conditional Logic Application

**Explain how you used case_when() to create business categories (performance tiers, customer value scores, etc.). Why is this type of categorization important for business decision-making? How would you validate that your categorization logic is appropriate?**

Your answer here:



### Question 6.7: Temporal Analysis Insights

**What temporal patterns did you discover in the data (weekday, monthly, quarterly)? How can businesses use these patterns for operational planning, staffing, inventory management, and marketing timing?**

Your answer here:



### Question 6.8: Professional Development

**Reflect on your growth throughout this course. What data wrangling skills do you feel most confident about? What areas would you like to develop further? How will you apply these skills in your professional career?**

Your answer here:



## Summary and Submission

### 🎉 Congratulations on Completing the Capstone!

You have successfully completed a comprehensive data wrangling project that demonstrates mastery of:

**Technical Skills:**
- ✅ Data import and validation
- ✅ Data transformation with dplyr (select, filter, arrange, mutate, summarize, group_by)
- ✅ String manipulation with stringr (cleaning, detection, extraction)
- ✅ Date/time operations with lubridate (parsing, extraction, calculations)
- ✅ Complex conditional logic with case_when()
- ✅ Multi-dimensional grouped analysis
- ✅ Data quality checks and validation
- ✅ Professional business intelligence reporting

**Business Skills:**
- ✅ Customer segmentation and value scoring
- ✅ Performance analysis across multiple dimensions
- ✅ Temporal trend identification
- ✅ KPI calculation and tracking
- ✅ Strategic opportunity identification
- ✅ Data-driven recommendation development
- ✅ Executive-level communication

**Professional Practices:**
- ✅ Systematic data quality validation
- ✅ Clear, well-commented code
- ✅ Reproducible analysis workflow
- ✅ Business context integration
- ✅ Professional presentation of results

### 📋 Submission Checklist

Before submitting, ensure you have:
- [ ] Entered your name, student ID, and date at the top
- [ ] Completed all code tasks (Parts 1-5)
- [ ] Run all cells successfully without errors
- [ ] Verified all calculated metrics are reasonable
- [ ] Answered all 8 reflection questions thoroughly
- [ ] Created all required dataframes with correct variable names:
  - [ ] sales_enhanced (with all calculated columns)
  - [ ] regional_performance
  - [ ] category_performance
  - [ ] sales_rep_performance
  - [ ] monthly_trends
  - [ ] weekday_patterns
  - [ ] region_category_performance
  - [ ] business_kpis
- [ ] Used proper commenting throughout your code
- [ ] Used the pipe operator (`%>%`) appropriately
- [ ] Checked for any remaining TODO comments
- [ ] Verified your business recommendations are data-driven

### 📊 Grading Criteria

Your capstone will be evaluated on:

**Code Correctness (35%)**
- All tasks completed correctly
- Proper use of dplyr, stringr, and lubridate functions
- Accurate calculations and transformations
- Correct implementation of business logic

**Code Quality (20%)**
- Clean, well-organized code
- Meaningful variable names
- Helpful comments explaining logic
- Efficient use of pipe operator
- Professional code structure

**Business Analysis (25%)**
- Demonstrates understanding of business context
- Meaningful insights and patterns identified
- Appropriate categorization and segmentation
- Data-driven recommendations
- Strategic thinking

**Reflection Questions (15%)**
- Thoughtful, complete answers
- Demonstrates deep understanding
- Provides specific examples
- Shows critical thinking
- Connects concepts to real-world applications

**Presentation (5%)**
- Professional formatting
- Clear, organized output
- Complete student information
- No errors or warnings
- Executive-ready quality

### 🎯 What's Next?

With these data wrangling skills, you're ready for:
- **Advanced R Programming**: Functions, loops, and automation
- **Data Visualization**: Creating compelling charts with ggplot2
- **Statistical Analysis**: Hypothesis testing and modeling
- **Machine Learning**: Predictive analytics and classification
- **R Markdown Reports**: Automated, reproducible reporting
- **Shiny Dashboards**: Interactive web applications

### 💼 Career Applications

These skills are directly applicable to roles such as:
- Data Analyst
- Business Intelligence Analyst
- Marketing Analyst
- Operations Analyst
- Financial Analyst
- Data Scientist

### 🙏 Thank You!

Thank you for your dedication and hard work throughout this course. The data wrangling skills you've developed are foundational for any data-driven career. Keep practicing, keep learning, and keep asking great questions!

**Best of luck with your data analytics journey! 🚀**

---

**End of Capstone Project**