feat(models): add biomass and carbon-stock regression module #46
Merged
Adds BiomassRegressor, a wrapper around sklearn RandomForestRegressor and xgboost.XGBRegressor that exposes a stable fit/predict/evaluate/save/load API for ClimateVision pipelines. The default feature ordering matches the features produced by the data preprocessor (NDVI, EVI, SAVI, NDMI, NBR, R, G, B, NIR, SWIR1).

Also adds:
- biomass_to_carbon / biomass_to_co2e helpers using IPCC defaults (carbon fraction 0.47, 44/12 ratio for CO2e).
- evaluate_regression for RMSE, MAE, R^2, and MAPE.
- estimate_biomass_from_indices for inference over a dict of per-pixel index arrays.
- save() / load() round-trip via pickle.
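The conversion arithmetic behind the two carbon helpers can be sketched as follows. This is a minimal illustration, not the module's actual code: the function names come from the description, but the signatures, the `carbon_fraction` keyword, and the NumPy array handling are assumptions.

```python
import numpy as np

# IPCC default carbon fraction of dry biomass, as quoted in the PR description.
CARBON_FRACTION = 0.47
# Ratio of the molecular weight of CO2 (44) to the atomic weight of carbon (12).
CO2_PER_C = 44.0 / 12.0


def biomass_to_carbon(biomass, carbon_fraction=CARBON_FRACTION):
    """Convert biomass to carbon stock, in the same units as the input.

    Hypothetical signature: accepts a scalar or array-like and returns a
    NumPy array (0-d for scalar input).
    """
    return np.asarray(biomass, dtype=float) * carbon_fraction


def biomass_to_co2e(biomass, carbon_fraction=CARBON_FRACTION):
    """Convert biomass to CO2-equivalent via the carbon stock."""
    return biomass_to_carbon(biomass, carbon_fraction) * CO2_PER_C
```

Under these defaults, 100 t/ha of biomass corresponds to 47 t/ha of carbon and about 172.3 t/ha of CO2e.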
This was referenced May 5, 2026
Goldokpa (Member) approved these changes on May 5, 2026 and left a comment:
Solid foundation. The fit/predict/evaluate/save/load API matches what the analytics module expects and the test coverage (11 cases) hits the important branches — perfect-fit r2=1, save/load round-trip, missing-index KeyError. xgboost guarded as a soft dependency is the right call.
Minor nit (non-blocking): evaluate_regression returns NaN for r2 when ss_tot=0 — fine, just worth a short docstring note so callers don't trip on it. Approving.
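The NaN case raised in the review can be illustrated with a small sketch of the four metrics. This is an assumed implementation for discussion only: the dict keys, the choice to skip zero targets in MAPE, and reporting MAPE as a percentage are not confirmed by the PR.

```python
import numpy as np


def evaluate_regression(y_true, y_pred):
    """Return RMSE, MAE, R^2, and MAPE for a set of predictions.

    Hypothetical sketch: r2 is NaN when the targets are constant,
    because the total sum of squares (ss_tot) is zero and R^2 is undefined.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    mae = float(np.mean(np.abs(residuals)))

    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    # Guard the division: constant targets give ss_tot == 0.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    # Assumption: zero targets are excluded from MAPE to avoid dividing by zero.
    nonzero = y_true != 0
    if nonzero.any():
        mape = float(np.mean(np.abs(residuals[nonzero] / y_true[nonzero])) * 100.0)
    else:
        mape = float("nan")

    return {"rmse": rmse, "mae": mae, "r2": r2, "mape": mape}
```

Callers that aggregate r2 across tiles or folds would need to handle the NaN explicitly, for example with `np.nanmean`, which is why a docstring note is useful.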
Summary
- src/climatevision/models/regression.py: BiomassRegressor, a wrapper around sklearn RandomForestRegressor and xgboost.XGBRegressor with a stable fit / predict / evaluate / save / load API.
- biomass_to_carbon and biomass_to_co2e use IPCC defaults (carbon fraction 0.47, 44/12 ratio for CO2e).
- evaluate_regression computes RMSE / MAE / R^2 / MAPE for the eval and model-card pipelines.
- estimate_biomass_from_indices accepts a dict of per-pixel index arrays and runs inference in one call.
- models/__init__.py.

Why
Sprint deliverable: "Build carbon.py — Random Forest & XGBoost regression for biomass prediction" / "Implement metrics for regression evaluation (RMSE, MAE, R-squared)." Backs the carbon analytics module Francis is delivering this sprint.
Test plan
- pytest tests/test_regression.py: 11/11 pass

Notes for reviewers
- import xgboost is guarded so installs without it still work.
- models/__init__.py extends the existing __all__; no existing imports are touched.
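One common way to guard an optional import is sketched below. The PR does not show how the guard is written, so the `HAS_XGBOOST` flag, the `make_estimator` helper, and the backend names are hypothetical and used only to illustrate the pattern.

```python
# Soft dependency: the module still imports when xgboost is absent.
try:
    import xgboost as xgb
    HAS_XGBOOST = True
except ImportError:
    xgb = None
    HAS_XGBOOST = False


def make_estimator(backend="random_forest", **params):
    """Build the underlying regressor for a given backend name.

    Hypothetical helper: the failure is deferred until the xgboost
    backend is actually requested, not raised at import time.
    """
    if backend == "random_forest":
        # Imported lazily so this sketch only needs sklearn when it is used.
        from sklearn.ensemble import RandomForestRegressor
        return RandomForestRegressor(**params)
    if backend == "xgboost":
        if not HAS_XGBOOST:
            raise ImportError(
                "backend='xgboost' was requested but xgboost is not installed"
            )
        return xgb.XGBRegressor(**params)
    raise ValueError(f"unknown backend: {backend!r}")
```

With this shape, users who only need the random forest backend never see an xgboost error, and users who ask for xgboost without installing it get a clear message at construction time.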