Chemical Engineer → Data Scientist
Energy AI & Climate Data | ML for Infrastructure & Sustainability
MSc Data Science and Artificial Intelligence for Sustainability, Cranfield University (2024–2025)
BEng (Hons) Chemical Engineering, Swansea University (2019–2023)
Based in Nairobi, Kenya | Open to roles in UK, Europe & Africa
I build machine learning and optimisation tools for energy systems, climate data, and sustainable infrastructure.
My work focuses on applying AI to real-world energy challenges, including battery optimisation, energy demand forecasting, satellite data pipelines, and climate-informed infrastructure planning.
With a background in chemical engineering, I approach ML problems through the lens of physical systems: energy balances, operational constraints, and uncertainty in real-world data. I focus on building models that are not only accurate, but robust, interpretable, and usable in real operational settings.
A custom Gymnasium reinforcement learning environment for battery scheduling in Cranfield University's community energy system. The PPO agent learns to optimise battery dispatch alongside solar PV and CHP generation, minimising grid import costs (Octopus Agile time-of-use tariffs) while accounting for battery degradation.
- Designed a custom RL environment from scratch rather than using prebuilt benchmarks
- Real operational data: 2 years hourly (Feb 2023 – Jan 2025)
- Multi-objective reward function: cost · curtailment · battery wear · self-sufficiency
- Benchmarked against Excel Solver GRG baseline
- Known limitations documented. Active development continuing.
Python · Gymnasium · Stable-Baselines3 · PPO · ERA5 · Octopus Agile
Private repo: dummy data and full environment code available on request.
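As a rough illustration only (not the code from the private repo), a multi-objective hourly reward of the kind described above could be sketched as follows. The `RewardWeights` values, the argument names, and the `step_reward` helper are all assumptions made for this example, and battery throughput is used here as a simple stand-in for wear.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights for illustration; not the project's tuned values.
    cost: float = 1.0
    curtailment: float = 0.5
    wear: float = 0.2
    self_sufficiency: float = 0.5


def step_reward(
    grid_import_kwh: float,
    price_per_kwh: float,
    curtailed_kwh: float,
    battery_throughput_kwh: float,
    demand_kwh: float,
    weights: Optional[RewardWeights] = None,
) -> float:
    """Combine four objectives into one scalar reward for a single hour."""
    w = weights or RewardWeights()
    # Cost of energy bought from the grid at the time-of-use price.
    import_cost = grid_import_kwh * price_per_kwh
    # Share of demand met without grid import (1.0 when there is no demand).
    self_suff = 1.0 - grid_import_kwh / demand_kwh if demand_kwh > 0 else 1.0
    # Penalise cost, curtailed generation, and battery throughput (a proxy
    # for wear); reward self-sufficiency.
    return (
        -w.cost * import_cost
        - w.curtailment * curtailed_kwh
        - w.wear * battery_throughput_kwh
        + w.self_sufficiency * self_suff
    )
```

In a Gymnasium environment, a function like this would typically be called inside `step()` once the agent's charge or discharge action has been applied to the hourly energy balance.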
A data pipeline querying the Element 84 Earth Search STAC API to extract Landsat and Sentinel-2 cloud cover data over Ghana for 2023. Identifies optimal satellite imaging windows by analysing monthly cloud cover patterns.
- 4,442 satellite scene records extracted via HTTP GET + JSON flattening
- Sentinel-2 filtered analysis: December best (14% avg cloud cover), June–August worst (80–85%)
- Built as part of a PBL module β first experience designing an end-to-end data pipeline from API to insight
Python · requests · pandas · STAC API · Sentinel-2 · Landsat
Private repo: code and dummy data available on request.
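The flatten-then-aggregate step can be sketched without any network access, assuming the API response is a STAC FeatureCollection whose items carry the standard `datetime` and `eo:cloud_cover` properties. The helper names and the tiny sample payload below are made up for illustration and are not taken from the project.

```python
from collections import defaultdict


def flatten_items(feature_collection):
    """Flatten STAC items into simple records with month and cloud cover."""
    rows = []
    for item in feature_collection.get("features", []):
        props = item.get("properties", {})
        cloud = props.get("eo:cloud_cover")
        stamp = props.get("datetime")
        if cloud is None or stamp is None:
            continue  # skip scenes without the fields needed for the analysis
        rows.append({
            "id": item.get("id"),
            "collection": item.get("collection"),
            # STAC datetimes are RFC 3339 strings (YYYY-MM-DD...), so the
            # month sits at characters 5-6.
            "month": int(stamp[5:7]),
            "cloud_cover": float(cloud),
        })
    return rows


def monthly_mean_cloud_cover(rows):
    """Average cloud cover (%) per calendar month, sorted by month."""
    by_month = defaultdict(list)
    for row in rows:
        by_month[row["month"]].append(row["cloud_cover"])
    return {m: sum(v) / len(v) for m, v in sorted(by_month.items())}


# Illustrative payload only; IDs and values are invented.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"id": "scene-a", "collection": "sentinel-2-l2a",
         "properties": {"datetime": "2023-12-03T10:30:00Z", "eo:cloud_cover": 10}},
        {"id": "scene-b", "collection": "sentinel-2-l2a",
         "properties": {"datetime": "2023-12-18T10:30:00Z", "eo:cloud_cover": 20}},
        {"id": "scene-c", "collection": "sentinel-2-l2a",
         "properties": {"datetime": "2023-06-11T10:30:00Z", "eo:cloud_cover": 80}},
    ],
}
monthly = monthly_mean_cloud_cover(flatten_items(sample))
```

The month with the lowest mean is then the candidate imaging window; with real data the same records load directly into a pandas DataFrame for grouping and plotting.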
LSTM model predicting hourly water consumption for a datacenter cooling system. Handles a zero-inflated seasonal distribution (free-air cooling in winter) using Keras Tuner Hyperband for automated hyperparameter search.
- Part of a group comparison: LSTM vs Random Forest vs XGBoost vs MLP
- LSTM achieved the best water demand forecasting performance of the four models
- Keras Tuner reduced test MAE from 3.281 (manual tuning) to 2.640
Python · TensorFlow · Keras · Keras-Tuner · LSTM · pandas · scikit-learn
Private repo: dummy data and saved model available on request.
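For readers unfamiliar with how an hourly series is framed for an LSTM, the sketch below shows one common approach: sliding lookback windows paired with a future target, plus the MAE metric quoted above. It is a plain-Python illustration; `make_windows`, `lookback`, and `horizon` are names chosen for this example, not the project's implementation.

```python
def make_windows(series, lookback, horizon=1):
    """Turn an hourly series into (input window, target) pairs.

    Each input is `lookback` consecutive values; the target is the value
    `horizon` steps after the end of that window.
    """
    inputs, targets = [], []
    for start in range(len(series) - lookback - horizon + 1):
        end = start + lookback
        inputs.append(series[start:end])
        targets.append(series[end + horizon - 1])
    return inputs, targets


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Zeros mimic hours with no water use (for example, free-air cooling).
hourly_use = [0, 0, 1, 2, 3]
X, y = make_windows(hourly_use, lookback=3)
```

The windows keep the zero-valued hours in place, so a model trained on them sees the zero-inflated pattern rather than a filtered series.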
A consultancy-style GIS project using ArcGIS Pro to build a prototype geodatabase for the Isle of Wight Council, demonstrating how GIS technology could support public service planning and the Council's Digital Plan 2024–2027.
- Integrated 7 datasets from Ordnance Survey, NSRI soils, DTM elevation, and Cranfield meteorological data into a single geodatabase
- Erosion risk scoring combining clay, silt, sand subsidence vulnerability with elevation (raster processing)
- SQL queries to identify road networks in high-risk zones: 122.32 km high risk, 316.43 km medium risk
- Full metadata produced in compliance with INSPIRE Directive and FAIR data principles
- Delivered as a formal council-facing report with map layouts, data structure diagrams and recommendations
ArcGIS Pro · SQL · Geodatabase design · Raster/Vector analysis · Ordnance Survey · NSRI
Comparative cradle-to-gate LCA of hydrogen production via PEM electrolysis powered by solar and wind energy. Evaluated environmental impacts with a focus on Global Warming Potential (GWP) per kg of H₂ produced across the lifetime of the electrolyser, comparing renewable energy sources to establish the lower-carbon pathway for green hydrogen production.
LCA methodology · Green hydrogen · GWP · Renewable energy systems · Environmental impact quantification
Group EIA for a proposed biohydrogen production facility using food waste as feedstock. Covered site selection, stakeholder analysis, impact identification and mitigation across air quality, water, land use and social dimensions.
EIA methodology · Biohydrogen · Waste-to-energy · Circular economy · Chemical engineering
| Area | Tools |
|---|---|
| Machine Learning | Scikit-learn, TensorFlow, Keras, Stable-Baselines3 |
| Reinforcement Learning | Gymnasium, PPO, custom environment design |
| Data Engineering | pandas, NumPy, REST APIs, JSON flattening, ERA5/ECMWF |
| Visualisation | Matplotlib, Seaborn, data visualisation and scientific plotting |
| Geospatial / Remote Sensing | ArcGIS Pro, STAC API, Sentinel-2, Landsat, geopandas |
| Sustainability | LCA, EIA, green hydrogen systems, circular economy |
| Languages | Python, SQL |
| Tools | Git, Google Colab, Jupyter, Excel Solver |
Before pivoting to data science, I worked as a chemical engineer, which shapes how I think about ML. Energy systems, physical constraints, and uncertainty under real-world conditions aren't abstract to me.
The pivot was intentional: I wanted to apply quantitative reasoning to the problems I care most about, namely energy access, climate resilience, and sustainable infrastructure, particularly in African contexts.
My current interests lie at the intersection of:
- AI for energy system optimisation
- Renewable energy integration and grid flexibility
- Climate and weather data applications in energy forecasting
- Infrastructure planning using geospatial and satellite data
- Energy access and climate resilience in emerging markets
- Climate, environment, and public health analytics
I am interested in opportunities applying machine learning and data science to energy systems, climate analytics, and sustainable infrastructure.
Areas I am particularly excited about include:
- Energy system optimisation and electricity market modelling
- Renewable energy integration and grid flexibility
- Climate and weather data applications in energy forecasting
- Geospatial analytics for infrastructure planning
- Data-driven solutions for energy access and climate resilience in emerging markets
- Health systems and data-driven approaches to population wellbeing
I am open to roles in energy analytics, climate and health tech, AI for infrastructure, and consulting across the UK, Europe, and Africa.
- 🔧 Continuing development on the RL energy model: fixing reward scaling, implementing Optuna hyperparameter search, extending to weekly episodes
- 🌍 Interested in ML applications for energy access, climate adaptation, and sustainable development in Africa and emerging markets
- 📫 Reach me: LinkedIn | abbie.gitonga@gmail.com
All repos are private due to data sensitivity. Code, dummy datasets, and documentation available on request.