# Avocado Price Prediction: Business Understanding

## CRISP-DM Phase 1: Business Understanding

### Table of Contents
1. [Project Overview](#1)
2. [Business Objectives](#2)
3. [Business Success Criteria](#3)
4. [Situation Assessment](#4)
5. [Project Plan](#5)
6. [References](#6)

<a id='1'></a>
## 1. Project Overview

This project aims to develop a machine learning model that predicts Hass avocado prices across regions of the United States. The model will use historical data, including sales volumes, avocado type (organic or conventional), and regional information, to forecast future prices.

### Background
Avocados have become an increasingly important commodity in the US market, with consumption growing significantly over the past decade. Accurate price prediction can benefit various stakeholders in the supply chain:
- Retailers can optimize inventory management and pricing strategies
- Distributors can better plan their operations
- Farmers can make informed decisions about harvesting and market timing
- Consumers can benefit from more stable pricing through better market efficiency

<a id='2'></a>
## 2. Business Objectives

### Primary Objectives:
1. Develop a reliable price prediction model for Hass avocados
2. Identify key factors influencing avocado prices
3. Provide actionable insights for stakeholders

### Research Questions:
1. How accurately can we predict avocado prices using historical data?
2. What features have the strongest influence on avocado prices?
3. How do regional differences affect pricing patterns?
4. What is the impact of organic vs. conventional classification on prices?
5. How do seasonal patterns affect avocado prices?

<a id='3'></a>
## 3. Business Success Criteria

The success of this project will be evaluated based on the following criteria:

1. **Technical Criteria:**
   - Achieve a Mean Absolute Percentage Error (MAPE) of less than 15% on held-out data
   - Achieve an R-squared value of at least 0.7 on the same held-out data
   - Produce an interpretable model that provides insight into feature importance

2. **Business Criteria:**
   - Identify key price drivers that stakeholders can monitor
   - Provide actionable recommendations based on model insights
   - Create a reproducible methodology for future price predictions
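
The two numeric thresholds in the technical criteria can be checked mechanically. The following is a minimal sketch in plain Python, assuming the standard definitions of MAPE and R-squared; the function names (`mape`, `r_squared`, `meets_technical_criteria`) are illustrative and not part of any existing project code.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, in percent.

    Assumes every actual value is non-zero, which holds for prices.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


def meets_technical_criteria(
    actual: Sequence[float],
    predicted: Sequence[float],
    max_mape: float = 15.0,  # threshold from the success criteria
    min_r2: float = 0.7,     # threshold from the success criteria
) -> bool:
    """True when MAPE is below max_mape and R-squared is at least min_r2."""
    return mape(actual, predicted) < max_mape and r_squared(actual, predicted) >= min_r2
```

In practice, `actual` and `predicted` would be the observed and predicted prices for the evaluation set, so that a single call to `meets_technical_criteria` reports whether both thresholds are met.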

<a id='4'></a>
## 4. Situation Assessment

### Resources Available:
- Weekly retail scan data on national retail volume and price
- Data from multiple retail channels (grocery, mass, club, drug, dollar, military)
- Regional market information
- Product-specific data (PLU codes)

### Constraints and Assumptions:
1. **Data Constraints:**
   - Limited to Hass avocados only
   - Weekly granularity of data
   - Historical data might not capture recent market changes

2. **Technical Constraints:**
   - Deep learning techniques are excluded
   - Focus on interpretable machine learning models

3. **Assumptions:**
   - Past price patterns are indicative of future trends
   - The relationship between features and prices is relatively stable
   - The data collection methodology remained consistent

<a id='5'></a>
## 5. Project Plan

### Phase 1: Business Understanding (Current Phase)
- Define objectives and success criteria
- Assess situation and constraints
- Develop project plan

### Phase 2: Data Understanding
- Collect and describe data
- Explore data through initial analysis
- Verify data quality

### Phase 3: Data Preparation
- Select and clean data
- Engineer features
- Format data for modeling

### Phase 4: Modeling
- Select modeling techniques; candidates under consideration:
  * Linear Regression
  * Random Forest
  * XGBoost
  * LightGBM
  * Support Vector Regression
- Generate test design
- Build and assess models
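
One possible test design for weekly price data is a chronological holdout: the most recent weeks are reserved for testing so that the model is always evaluated on dates later than those it was trained on. The sketch below illustrates that idea in plain Python. It is an assumption about how the test design could look, not a decision recorded in this plan, and the record layout (a `date` key on each row) and the function name `chronological_split` are hypothetical.

```python
from datetime import date
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def chronological_split(
    rows: List[Row],
    test_fraction: float = 0.2,
    date_key: str = "date",  # hypothetical column name
) -> Tuple[List[Row], List[Row]]:
    """Split rows so that every test date is later than every training date.

    The cut is made on distinct dates rather than on rows, so all regions
    observed in the same week land on the same side of the split.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    weeks = sorted({row[date_key] for row in rows})
    if len(weeks) < 2:
        raise ValueError("need at least two distinct dates to split")
    # Keep at least one week on each side of the cutoff.
    n_test = min(max(1, round(len(weeks) * test_fraction)), len(weeks) - 1)
    cutoff = weeks[-n_test]
    train = [row for row in rows if row[date_key] < cutoff]
    test = [row for row in rows if row[date_key] >= cutoff]
    return train, test
```

With ten weeks of data and the default `test_fraction` of 0.2, the last two weeks form the test set and the first eight form the training set, regardless of how many regions report in each week.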

### Phase 5: Evaluation
- Evaluate results
- Review process
- Determine next steps

### Timeline:
- Data Understanding: 1 week
- Data Preparation: 1 week
- Modeling: 2 weeks
- Evaluation: 1 week

<a id='6'></a>
## 6. References

[1] Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C., & Wirth, R. (2000). "CRISP-DM 1.0: Step-by-step data mining guide." SPSS Inc.

[2] Shahbandeh, M. (2021). "Avocado Industry - Statistics & Facts." Statista.

[3] USDA Economic Research Service. (2021). "Fruit and Tree Nut Yearbook Tables."

[4] Hass Avocado Board. (2021). "Market Data and Research Reports."

[5] Kuhn, M., & Johnson, K. (2013). "Applied Predictive Modeling." New York: Springer.

[6] James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). "An Introduction to Statistical Learning." New York: Springer.