### Plan of Action Summary

- **Initial Modeling Approach:**
  - Start with ARIMA for individual buildings to establish a baseline.
  - Check each series for stationarity and seasonality to choose the ARIMA orders (p, d, q) and decide whether seasonal terms are needed.
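
As a rough illustration of the seasonality check, the sketch below computes the sample autocorrelation at a 12-month lag and applies seasonal differencing using only the standard library. The `usage` series, its trend, and the monthly pattern are made-up values for illustration, not project data. In practice a library such as statsmodels (for example `adfuller` and `plot_acf`) would likely replace these helpers.

```python
from statistics import mean


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive lag."""
    if lag <= 0 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    mu = mean(series)
    denom = sum((x - mu) ** 2 for x in series)
    if denom == 0:
        return 0.0  # constant series: nothing to correlate
    num = sum(
        (series[t] - mu) * (series[t - lag] - mu)
        for t in range(lag, len(series))
    )
    return num / denom


def difference(series, lag=1):
    """Subtract the value `lag` steps earlier (lag=12 removes a yearly cycle)."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical monthly kWh: linear trend plus a repeating 12-month pattern.
pattern = [30, 25, 15, 5, -5, -20, -25, -20, -5, 5, 15, 25]
usage = [500 + 2 * t + pattern[t % 12] for t in range(48)]

acf_12 = autocorrelation(usage, 12)
seasonally_differenced = difference(usage, 12)
```

A clearly positive `acf_12` suggests a yearly cycle, which would point toward seasonal terms or seasonal differencing before fitting ARIMA. In this toy series the differenced values are constant because the trend is exactly linear; real data will be noisier.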

- **Feature Engineering:**
  - Categorize buildings (e.g., classrooms, dorms, labs).
  - Incorporate external factors such as weather data (scrape monthly temperature, humidity, etc.).
  - Generate lag features (previous month’s consumption, rolling averages).
  - Add calendar-based features (semester start/end dates, holidays).
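
The lag and rolling-average features could look something like the following minimal sketch. It assumes one building's consumption is already a chronologically ordered list of monthly values; the function name `add_lag_features`, the column names, and the sample numbers are all hypothetical.

```python
def add_lag_features(usage, window=3):
    """Build one feature row per month from a chronologically ordered list.

    Every feature uses only months strictly before the current one, so the
    target value never leaks into its own inputs.
    """
    rows = []
    for t, value in enumerate(usage):
        lag_1 = usage[t - 1] if t >= 1 else None      # previous month
        lag_12 = usage[t - 12] if t >= 12 else None   # same month last year
        if t >= window:
            rolling_mean = sum(usage[t - window:t]) / window
        else:
            rolling_mean = None                       # not enough history yet
        rows.append(
            {
                "usage": value,
                "lag_1": lag_1,
                "lag_12": lag_12,
                "rolling_mean": rolling_mean,
            }
        )
    return rows


# Hypothetical monthly kWh for one building.
monthly_kwh = [100, 110, 120, 130, 140]
features = add_lag_features(monthly_kwh, window=3)
```

Rows without enough history carry `None`, so the choice between dropping and imputing them stays explicit. With pandas, `shift` and `rolling` would express the same idea more compactly.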

- **Data Processing:**
  - Consolidate the scattered data files into one consistent structure.
  - Handle missing values and inconsistencies in electricity usage and cost data.
  - Investigate the granularity and consistency of timestamps across files.
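
One way to start on the timestamp check is to list the months that have no reading between a building's first and last observation. The sketch below assumes monthly granularity and uses made-up reading dates; `find_missing_months` is a hypothetical helper, not part of any existing pipeline.

```python
from datetime import date


def month_range(start, end):
    """Yield the first day of every month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def find_missing_months(observed):
    """Return months with no reading between the earliest and latest date."""
    seen = {date(d.year, d.month, 1) for d in observed}
    if not seen:
        return []
    return [m for m in month_range(min(seen), max(seen)) if m not in seen]


# Hypothetical reading dates for one building (January 2014 is absent).
readings = [
    date(2013, 11, 15),
    date(2013, 12, 3),
    date(2014, 2, 28),
    date(2014, 3, 1),
]
missing = find_missing_months(readings)
```

Normalizing every reading to the first of its month also helps when different files record different days for the same billing period.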

- **Model Selection and Training:**
  - Implement gradient-boosting models (LightGBM, XGBoost) to capture complex relationships.
  - Compare per-building models against a single global model trained across all buildings.
  - Evaluate performance using RMSE, MAE, and MAPE.
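
For reference, the three metrics can be written in a few lines of plain Python. This is a sketch with made-up numbers; the handling of zero actuals in `mape` (skipping them) is one assumption among several possible choices, since MAPE is undefined when an actual value is zero.

```python
import math


def _check(actual, predicted):
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")


def rmse(actual, predicted):
    """Root mean squared error: penalizes large misses more heavily."""
    _check(actual, predicted)
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mae(actual, predicted):
    """Mean absolute error, in the same units as the data (e.g. kWh)."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, skipping months where actual is 0."""
    _check(actual, predicted)
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is 0")
    return 100 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


# Hypothetical actual vs. forecast consumption for three months.
actual = [100, 200, 400]
predicted = [110, 180, 400]

scores = {
    "rmse": rmse(actual, predicted),
    "mae": mae(actual, predicted),
    "mape": mape(actual, predicted),
}
```

Because MAPE is scale-free, it is convenient for comparing buildings of very different sizes, while RMSE and MAE stay in consumption units.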

- **Scalability Considerations:**
  - Explore using PySpark/Dask for handling large datasets.
  - Assess computational resource needs for training and inference.

---

### Questions

1. **Data Understanding and Availability:**
   - Do all buildings have complete and consistent data from 2013 onwards?
   - Are there any known data quality issues (e.g., missing months, anomalies)?
   - What metadata is available for buildings (e.g., type, occupancy patterns)?

2. **External Data Integration:**
   - What historical weather data sources can we reliably scrape and use?
   - Are there any other external factors we should consider (e.g., campus events, maintenance schedules)?

3. **Modeling Strategy:**
   - Should we prioritize building-specific models or a generalizable model across all buildings?
   - What forecasting horizon is most useful for stakeholders (e.g., monthly, quarterly)?

4. **Computational Resources:**
   - Are there limitations on storage or processing power for model training?
   - Should we consider cloud-based solutions for scalability?

5. **Stakeholder Expectations:**
   - What are the key deliverables and timeline expectations for the project?
   - How will the forecasting outputs be used for decision-making?

Let me know if you'd like to refine or add anything to this.