In [22]:
# Import necessary libraries
import tensorflow as tf
import numpy as np
import pandas as pd
import pickle
import shap
import matplotlib.pyplot as plt
from tensorflow.keras.models import load_model
from keras import metrics
from modeling_class import PricePredictionTuner, MarketRecommendationModel

## Price Prediction Model (for Wholesale & Retail Prices)
### (Feedforward Neural Network)

For price prediction, the process began with a carefully prepared dataset enriched with temporal features, lag variables, rolling statistics, and market-specific indicators, which formed the basis for developing two deep learning models—one for wholesale prices and one for retail prices. The model architectures were optimized using an automated hyperparameter tuning process, specifically the Hyperband algorithm from the Keras Tuner library. This approach defined a search space that included the number of units in dense layers, dropout usage and rates (ranging from 0.1 to 0.5), the option to add extra layers, and learning rates sampled logarithmically between 1e-4 and 1e-2. Hyperband dynamically allocated training resources by evaluating models based on the validation mean absolute error (val_mae), allowing promising configurations to train for more epochs while quickly terminating those with poor performance. The data was split into training and test sets to ensure robust evaluation.
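The resource-allocation idea can be illustrated with a single successive-halving bracket, the routine that Hyperband repeats with different starting budgets. This is a simplified sketch, not the Keras Tuner implementation or the code inside `PricePredictionTuner`: `sample_config`, `toy_val_mae`, and the list of unit choices are hypothetical, and the toy score stands in for actually training a network.

```python
import math
import random

def sample_config(rng):
    """Draw one configuration from a search space like the one described above."""
    return {
        "units": rng.choice([32, 64, 128, 256]),      # hypothetical unit choices
        "dropout": rng.uniform(0.1, 0.5),             # dropout rate range from the text
        "extra_layer": rng.choice([True, False]),     # optional additional layer
        "learning_rate": 10 ** rng.uniform(-4, -2),   # log-uniform in [1e-4, 1e-2]
    }

def toy_val_mae(config, epochs):
    """Stand-in for training: a made-up score that improves with more epochs."""
    penalty = abs(math.log10(config["learning_rate"]) + 3)  # pretend 1e-3 is ideal
    return 5.0 + penalty + 10.0 / epochs

def successive_halving(configs, min_epochs=2, factor=3):
    """Keep the best 1/factor of the configs each round; survivors get more epochs."""
    epochs = min_epochs
    while len(configs) > 1:
        ranked = sorted(configs, key=lambda c: toy_val_mae(c, epochs))
        configs = ranked[: max(1, len(ranked) // factor)]
        epochs *= factor
    return configs[0]

rng = random.Random(0)
candidates = [sample_config(rng) for _ in range(27)]
best = successive_halving(candidates)
print(best)
```

With 27 candidates and a factor of 3, the bracket shrinks to 9, then 3, then 1 configuration, while the epoch budget per survivor triples each round. Weak configurations are therefore dropped after only a few epochs.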

In [9]:
tuner = PricePredictionTuner("price_prediction_cleaned.csv")
tuner.run()


 Tuning Model for Wholesale Prices...
 Training best model for Wholesale prices...

 Tuning Model for Retail Prices...
 Training best model for Retail prices...

 **Wholesale Price Model Metrics**
 MAE: 4.56
 RMSE: 7.89
 R² Score: 0.92

In [4]:
# Create a dictionary with the data
data = {
    'Metric': ['MAE', 'RMSE', 'R² Score'],
    'Wholesale Price Model': [4.56, 7.89, 0.92],
    'Retail Price Model': [5.66, 9.47, 0.92]
}

# Create a DataFrame
results_df = pd.DataFrame(data)

# Display the DataFrame
results_df

| | Metric | Wholesale Price Model | Retail Price Model |
|---|---|---|---|
| 0 | MAE | 4.56 | 5.66 |
| 1 | RMSE | 7.89 | 9.47 |
| 2 | R² Score | 0.92 | 0.92 |


The Wholesale Price Model has an MAE of 4.56 and RMSE of 7.89, indicating it predicts wholesale prices with relatively small errors on average, and the R² score of 0.92 suggests the model captures the variation in the data very well, explaining 92% of the variance.

The Retail Price Model has slightly higher errors (MAE of 5.66 and RMSE of 9.47). Its predictions are not as tight as the Wholesale Price Model's, but its R² score of 0.92 matches the wholesale model's, so it explains the same share of the variance in retail prices.
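For reference, the three reported metrics can be computed as in the sketch below. It uses NumPy only and made-up prices; the function name `regression_metrics` is hypothetical and is not taken from `modeling_class`.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute MAE, RMSE, and R² for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_true - y_pred
    mae = np.mean(np.abs(errors))                    # average absolute error
    rmse = np.sqrt(np.mean(errors ** 2))             # penalizes large errors more
    ss_res = np.sum(errors ** 2)                     # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "R2": r2}

# Made-up prices for illustration only
y_true = [100.0, 120.0, 80.0, 150.0]
y_pred = [110.0, 115.0, 85.0, 140.0]
result = regression_metrics(y_true, y_pred)
print(result)
```

RMSE is never smaller than MAE, and the gap widens when a few predictions are far off. That is one way to read the reported pairs (4.56 vs. 7.89 and 5.66 vs. 9.47): some individual errors are noticeably larger than the typical one.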

## Market Recommendation Model
### (Feedforward Neural Network)

For market recommendation, the focus shifts to predicting the correct market from a distinct set of features. The process begins with a separate cleaned and encoded dataset, which is balanced using synthetic oversampling so that every market class is fairly represented, then standard-scaled and reduced with Principal Component Analysis (PCA). The deep learning classifier stacks multiple dense layers with batch normalization and dropout, and outputs a probability for each market class through a softmax layer. The model is trained with early stopping and evaluated on classification accuracy as well as per-class precision, recall, and F1-score.
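The scaling and PCA step can be sketched with NumPy alone. This is an illustration on random data, not the preprocessing code inside `MarketRecommendationModel`; the function names and the choice of three components are assumptions.

```python
import numpy as np

def standard_scale(X):
    """Center each feature to mean 0 and scale it to unit variance."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0          # avoid division by zero for constant features
    return (X - mean) / std

def pca_project(X_scaled, n_components):
    """Project centered data onto its top principal directions via SVD."""
    _, singular_values, Vt = np.linalg.svd(X_scaled, full_matrices=False)
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    return X_scaled @ Vt[:n_components].T, explained[:n_components]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = 2.0 * X[:, 0] + 0.1 * rng.normal(size=200)  # make two features redundant

X_scaled = standard_scale(X)
X_reduced, ratio = pca_project(X_scaled, n_components=3)
print(X_reduced.shape, ratio.round(3))
```

Scaling comes first because PCA ranks directions by variance, so an unscaled feature with large units would dominate the components.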

In [3]:
# Initialize and run the model
model = MarketRecommendationModel(
    data_path="market_recommendation_cleaned.csv",
    mappings_path="mappings.pkl"
)

model.run()  


Epoch 1/50
8864/8864 ━━━━━━━━━━━━━━━━━━━━ 54s 6ms/step - accuracy: 0.5597 - loss: 1.9217 - val_accuracy: 0.9051 - val_loss: 0.2789
Epoch 2/50
8864/8864 ━━━━━━━━━━━━━━━━━━━━ 33s 4ms/step - accuracy: 0.8524 - loss: 0.4214 - val_accuracy: 0.9192 - val_loss: 0.2235
Epoch 3/50
8864/8864 ━━━━━━━━━━━━━━━━━━━━ 40s 4ms/step - accuracy: 0.8750 - loss: 0.3496 - val_accuracy: 0.9325 - val_loss: 0.1754
Epoch 4/50
8864/8864 ━━━━━━━━━━━━━━━━━━━━ 46s 5ms/step - accuracy: 0.8900 - loss: 0.3072 - val_accuracy: 0.9377 - val_loss: 0.1629
Epoch 5/50
8864/8864 ━━━━━━━━━━━━━━━━━━━━ 32s 4ms/step - accuracy: 0.9008 - loss: 0.2752 - val_accuracy: 0.9406 - val_loss: 0.1551
Epoch 6/50
8864/8864 ━━━━━━━━━━━━━━━━━━━━ 44s 4ms/step - accuracy: 0.9074 - loss: 0.2556 - val_accuracy: 0.9415 - val_loss: 0.1481
Epoch 7/50

Training over 50 epochs shows steady improvement in both accuracy and loss. Training accuracy rises from 55.97% to 94.22%, while validation accuracy peaks at 97.22%. The classification report shows high precision, recall, and F1-scores for most markets; the exceptions include "Butere Livestock Market" and "Amukura", which have lower recall. Overall, the model performs well across most classes, with an accuracy of about 97%.
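To show how high overall accuracy can coexist with low recall for individual markets, the sketch below computes per-class precision, recall, and F1 from label arrays. The labels are made up and the function name `per_class_report` is hypothetical; it is not the report produced by `model.run()`.

```python
import numpy as np

def per_class_report(y_true, y_pred, n_classes):
    """Return precision, recall, and F1 for each class index."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    report = {}
    for c in range(n_classes):
        tp = int(np.sum((y_pred == c) & (y_true == c)))  # correctly predicted as c
        fp = int(np.sum((y_pred == c) & (y_true != c)))  # wrongly predicted as c
        fn = int(np.sum((y_pred != c) & (y_true == c)))  # class c that was missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall, "f1": f1}
    return report

# Made-up labels: class 1 is rare and half of its samples are missed
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 0, 0, 1, 2, 2, 2, 2]

report = per_class_report(y_true, y_pred, n_classes=3)
accuracy = float(np.mean(np.array(y_true) == np.array(y_pred)))
print(accuracy, report[1])
```

In this toy case accuracy is 0.9 while recall for the rare class is only 0.5, which is the same pattern the classification report describes for the lower-recall markets.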

## Exporting the Models' Predictions to CSV Files

The predictions of each model are saved to CSV files, which are needed to build the user-interface dashboard.

In [16]:
# Load the preprocessor (only for market data)
with open("modelling/saved_models/preprocessor.pkl", "rb") as f:
    preprocessor = pickle.load(f)

# Load the trained models, including custom metrics if necessary
wholesale_model = load_model("modelling/saved_models/wholesale_price_model.h5", custom_objects={'mse': metrics.mean_squared_error})
retail_model = load_model("modelling/saved_models/retail_price_model.h5", custom_objects={'mse': metrics.mean_squared_error})
market_recommendation_model = load_model("modelling/saved_models/market_recommendation_model.h5", custom_objects={'mse': metrics.mean_squared_error})

# Load the data
price_df = pd.read_csv('price_prediction_cleaned.csv')  # Ensure this path is correct
market_df = pd.read_csv('market_recommendation_cleaned.csv')  # Ensure this path is correct

# Drop target columns before making predictions
price_features = price_df.drop(columns=['Wholesale', 'Retail'], errors='ignore')
market_features = market_df.drop(columns=['Market_ID'], errors='ignore')

# **No preprocessing on price data** – this is left untouched
# Apply the saved preprocessor to market features
market_features_transformed = preprocessor.transform(market_features)

# Generate predictions for Wholesale and Retail prices
wholesale_predictions = wholesale_model.predict(price_features)
retail_predictions = retail_model.predict(price_features)

# Predict market recommendations
market_predictions = market_recommendation_model.predict(market_features_transformed)

# Convert market predictions to a single recommendation (assuming multi-class)
market_recommendations = np.argmax(market_predictions, axis=1)

# Save predictions to separate CSV files
# (predict() returns a 2-D array, so flatten it to a 1-D column)
wholesale_df = price_df.copy()
wholesale_df['Wholesale_Prediction'] = wholesale_predictions.ravel()
wholesale_df.to_csv("wholesale_price_predictions.csv", index=False)

retail_df = price_df.copy()
retail_df['Retail_Prediction'] = retail_predictions.ravel()
retail_df.to_csv("retail_price_predictions.csv", index=False)

market_df_copy = market_df.copy()
market_df_copy['Market_Recommendation'] = market_recommendations
market_df_copy.to_csv("market_recommendations.csv", index=False)

# Print confirmation messages
print("Wholesale price predictions saved at: wholesale_price_predictions.csv")
print("Retail price predictions saved at: retail_price_predictions.csv")
print("Market recommendations saved at: market_recommendations.csv")


1465/1465 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step
1465/1465 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step
1465/1465 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step
Wholesale price predictions saved at: wholesale_price_predictions.csv
Retail price predictions saved at: retail_price_predictions.csv
Market recommendations saved at: market_recommendations.csv
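The `Market_Recommendation` column written above holds the class indices returned by `np.argmax`, so the dashboard still has to translate them into market names. The sketch below shows one way this could be done. The `index_to_market` dictionary and its market names are hypothetical, and it assumes the class indices follow the same order used when the labels were encoded; the actual structure of `mappings.pkl` is not shown here.

```python
import numpy as np

# Hypothetical mapping from class index to market name (assumption, not from mappings.pkl)
index_to_market = {0: "Market A", 1: "Market B", 2: "Market C"}

# Made-up softmax outputs for two rows
probabilities = np.array([
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
])

class_indices = np.argmax(probabilities, axis=1)   # most likely class per row
confidence = probabilities.max(axis=1)             # probability of that class
market_names = [index_to_market[int(i)] for i in class_indices]

for name, p in zip(market_names, confidence):
    print(f"{name} (probability {p:.2f})")
```

Keeping the winning probability next to the name lets the interface flag low-confidence recommendations instead of presenting every prediction as equally certain.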


## Conclusion

Based on our extensive development and rigorous evaluation, the AgriSenseAI system has achieved significant success in addressing the challenges faced by Kenyan farmers and traders. Below are the key outcomes:
- Successfully developed an integrated deep learning system that leverages historical price data, weather information, vegetation indices, supply-demand dynamics, transport costs, and macroeconomic indicators to enhance market decision-making.
- Achieved high predictive accuracy in price forecasting, with the wholesale model recording an MAE of 4.56, RMSE of 7.89, and R² of 0.92, and the retail model obtaining an MAE of 5.66, RMSE of 9.47, and R² of 0.92.
- Constructed a market recommender that predicts suitable market locations, reaching a peak validation accuracy of 97.22%.
- Delivered a user-friendly interactive dashboard using Streamlit, enabling farmers and traders to easily access price forecasts, market recommendations, and visual insights.
- Deployed the system on Streamlit Community Cloud, giving farmers and traders in Kenya access to decision support through a public URL.


## Recommendations

- Enhance Data Collection: Collaborate with government agencies and data providers to secure real-time, granular meteorological data.
- Expand Crop Coverage: Extend the system to include a broader range of crops to improve the robustness of market recommendations and price predictions.
- Improve Model Interpretability: Integrate advanced explainable AI techniques (e.g., SHAP analyses) to continuously validate model insights and build user trust.
- Strengthen Infrastructure: Refine the user interface and deployment strategy to ensure the system remains scalable and responsive, especially as additional data streams are integrated.


## Deployment

We developed a user-friendly interface by combining FastAPI and Streamlit. FastAPI handled the backend logic, providing APIs for price predictions and market recommendations that process user inputs and fetch results from the saved prediction CSV files. Streamlit was used to create the frontend, where users select crops, counties, and price types to receive predictions and market suggestions. We deployed the app on Streamlit Community Cloud by linking the project from GitHub, which makes it accessible to farmers through a public URL.

The interface is available at: [https://agrisenseai-project-4wckgd8cno4p4vjuc2rhme.streamlit.app/](https://agrisenseai-project-4wckgd8cno4p4vjuc2rhme.streamlit.app/)
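The backend lookup described above can be sketched as a plain function that an API endpoint might wrap. This is a minimal illustration under assumptions: the `Commodity` and `County` column names, the `lookup_prediction` function, and all values are hypothetical, and only the `Wholesale_Prediction` column name comes from the export step. The project's actual FastAPI routes are not shown in this notebook.

```python
import pandas as pd

def lookup_prediction(predictions, commodity, county, prediction_col):
    """Return the mean saved prediction for a commodity/county pair, or None."""
    mask = (predictions["Commodity"] == commodity) & (predictions["County"] == county)
    matches = predictions.loc[mask, prediction_col]
    return None if matches.empty else float(matches.mean())

# Tiny in-memory stand-in for wholesale_price_predictions.csv (values are made up)
wholesale = pd.DataFrame({
    "Commodity": ["Maize", "Maize", "Beans"],
    "County": ["Nairobi", "Nairobi", "Kisumu"],
    "Wholesale_Prediction": [40.0, 44.0, 90.0],
})

found = lookup_prediction(wholesale, "Maize", "Nairobi", "Wholesale_Prediction")
missing = lookup_prediction(wholesale, "Rice", "Nairobi", "Wholesale_Prediction")
print(found, missing)
```

Returning `None` for an unknown combination lets the frontend show a "no data" message instead of failing when a user picks a crop and county pair that is absent from the saved files.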

## Challenges

- Real-Time Price Forecasting: Encountered difficulties in establishing a reliable connection with the KAMIS website for live price updates.
- Data Collection Challenges: Faced issues with sourcing relevant data due to limited centralized data availability, necessitating the integration of multiple data sources.
- Meteorological Data Access: Inability to acquire comprehensive, high-resolution climate data from the meteorological department limited the accuracy of climate-based predictions.
- Data Quality and Consistency: Addressing inconsistencies, missing values, and varying data formats required extensive cleaning and preprocessing efforts.
- Integration of Heterogeneous Data Sources: Merging and harmonizing diverse datasets from various governmental and institutional sources proved complex and resource-intensive.

## Next Steps

- Model Refinement: Update and retrain the deep learning models by incorporating additional features and more diverse data sources to further improve prediction accuracy.
- Real-Time Data Integration: Establish a connection with the KAMIS website to integrate live price data into our system, enabling dynamic and up-to-date price forecasting.
- User Interface Enhancement: Continue refining the Streamlit dashboard with advanced visualization and filtering options to enhance user experience and accessibility.
- Pilot Deployment and Feedback: Initiate a pilot deployment with selected farmers and traders to gather real-world feedback and validate system performance.
- Monitoring and Maintenance: Implement a robust monitoring framework to continuously track model performance, data quality, and system uptime, with regular updates and retraining based on new data trends.
- Process Review: Conduct periodic reviews of the entire pipeline—from data collection through deployment—to identify and implement improvements that adapt to evolving market dynamics.

