
GreenAI: Smart Crop Disease Detection with Conversational AI


Screenshots: Home page (home) and Detection page (detect).

Introduction

GreenAI is an AI-powered solution designed to combat the devastating impact of crop diseases on agriculture in Ghana. It empowers farmers with an accessible, immediate, and intelligent tool to detect plant diseases early, helping to protect livelihoods and ensure food security.


The Problem

Crop diseases pose a significant threat to agricultural productivity and the well-being of farming communities. In Ghana, past events like the devastating Fall Armyworm outbreak have highlighted how rapidly diseases can spread, leading to widespread crop destruction, increased cost of living, and food shortages. Traditional methods of disease detection are often slow, require expert intervention, and are inaccessible to many farmers in rural areas.


What It Does

GreenAI revolutionizes disease detection by making expert advice available directly on a farmer's smartphone. Here's how it works:

  • Image Upload: Farmers can easily upload a photo of their affected crop, taken with a smartphone or drone.
  • AI-Powered Detection: Our system uses advanced object detection (YOLO) to instantly identify the specific crop and pinpoint any visible diseases.
  • Conversational AI: Powered by Google Gemini, GreenAI offers an intelligent chatbot interface. Farmers can ask follow-up questions about the detected disease.
  • Voice Interaction: To ensure maximum accessibility, GreenAI integrates Speech-to-Text (STT), allowing farmers to speak their questions, and Text-to-Speech (TTS), providing spoken answers in a clear, natural voice. This eliminates literacy barriers and enhances user experience.
  • Actionable Advice: Based on the detection and conversation, GreenAI provides practical, actionable advice to help farmers manage and treat the disease effectively.
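The detection-to-chat handoff described above can be sketched as a small helper that turns YOLO detections into a first question for the chatbot. The function name and message wording here are illustrative, not GreenAI's actual code:

```python
# Illustrative sketch of the detection-to-chat handoff; the function name
# and message wording are hypothetical, not the project's actual code.

def detections_to_prompt(crop, diseases):
    """Turn YOLO detections into a first question for the chatbot."""
    if not diseases:
        return f"My {crop} plant looks healthy. Any preventive tips?"
    found = ", ".join(diseases)
    return f"My {crop} crop shows signs of: {found}. How should I treat it?"

# Example: a maize photo where the detector flagged fall armyworm damage
print(detections_to_prompt("maize", ["fall armyworm"]))
```

The resulting string would then be sent to the conversational model, which grounds the follow-up dialogue in what was actually detected.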

Technical Deep Dive

GreenAI is built on a robust stack of cutting-edge AI and web technologies:

  • Object Detection: We leverage YOLOv8 (You Only Look Once), a state-of-the-art real-time object detection model. Our model (last.pt) was fine-tuned on a crop disease dataset of over 24,000 raw images of Ghanaian crops (cashew, cassava, maize, tomato) and their associated disease classes, ensuring high relevance and accuracy for the local context.
  • Generative AI: The Google Gemini API forms the backbone of our conversational intelligence. It interprets user queries, synthesizes information about detected diseases, and generates coherent, context-aware responses.
  • Speech Processing:
    • Speech-to-Text (STT): We use OpenAI's Whisper-base model via the Hugging Face Transformers pipeline for accurate transcription of spoken questions into text.
    • Text-to-Speech (TTS): Microsoft's SpeechT5 model (speecht5_tts) from Hugging Face converts AI-generated text responses into natural-sounding speech. A custom speaker_embedding.npy file enhances speech quality and naturalness.
  • Web Application Framework: The entire interactive user interface is developed using Streamlit, enabling rapid prototyping and a user-friendly experience.
  • Environment Management: We utilize python-dotenv for secure management of API keys and other environment variables.
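A hedged sketch of the speech components above: the model IDs match the ones named in this section, while chunk_text is an illustrative helper we added, since SpeechT5 works best on short inputs. Nothing heavy runs at import time; the model downloads happen only when the functions are called.

```python
# Sketch of the STT/TTS pipeline described above. Model IDs come from the
# README; chunk_text is an illustrative addition. Heavy imports are lazy.
import re


def chunk_text(text, max_chars=500):
    """Split a long answer into sentence-sized chunks for TTS."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks


def transcribe(audio_path):
    """Speech-to-text with Whisper via the Transformers pipeline."""
    from transformers import pipeline  # lazy import: triggers model download
    stt = pipeline("automatic-speech-recognition", model="openai/whisper-base")
    return stt(audio_path)["text"]


def speak(text, embedding_path="speaker_embedding.npy"):
    """Text-to-speech with SpeechT5 plus the project's speaker embedding."""
    import numpy as np
    import torch
    from transformers import (SpeechT5ForTextToSpeech, SpeechT5HifiGan,
                              SpeechT5Processor)

    processor = SpeechT5Processor.from_pretrained("microsoft/speecht5_tts")
    model = SpeechT5ForTextToSpeech.from_pretrained("microsoft/speecht5_tts")
    vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")
    speaker = torch.tensor(np.load(embedding_path)).reshape(1, -1)  # (1, 512)

    waves = []
    for chunk in chunk_text(text):
        inputs = processor(text=chunk, return_tensors="pt")
        waves.append(model.generate_speech(inputs["input_ids"], speaker,
                                           vocoder=vocoder))
    return torch.cat(waves).numpy()  # mono waveform at SpeechT5's 16 kHz rate
```

In the app, transcribe feeds the farmer's spoken question to Gemini, and speak renders Gemini's answer back as audio.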

Key Features

  • Instant Crop Disease Detection: Upload an image and get immediate analysis.
  • AI-Powered Chat: Engage in natural language conversations with an AI expert.
  • Voice Input/Output: Ask questions verbally and receive spoken advice, ideal for diverse user groups.
  • Interactive User Interface: Simple and intuitive design powered by Streamlit.

Challenges & Learnings

Developing GreenAI presented several interesting challenges:

  • Real-time Responsiveness: Integrating multiple complex AI models (YOLO, Gemini, Whisper, SpeechT5) and ensuring a seamless, low-latency user experience was a significant hurdle. Optimizing data flow and processing pipelines was crucial.
  • Voice Interaction Optimization: Fine-tuning the interplay between speech input, AI processing, and speech output to prevent audio feedback loops and create a smooth conversational flow within a reactive framework like Streamlit was particularly challenging.
  • Multilingual Support: A major challenge identified for future development is the expansion of the AI voice assistant to support various Ghanaian languages and dialects. This will be critical for truly maximizing accessibility and impact across diverse farming communities.

Despite these challenges, we learned invaluable lessons in multi-modal AI integration, prompt engineering, and building accessible solutions for real-world problems.


Accomplishments

We are incredibly proud of creating a multi-modal and highly accessible AI solution that directly addresses a critical agricultural challenge. The seamless integration of image recognition, advanced conversational AI, and intuitive voice interaction represents a significant technical achievement. GreenAI stands as a testament to how cutting-edge AI can be harnessed to empower communities and contribute to food security, helping to prevent and mitigate the impact of future agricultural crises like the Fall Armyworm outbreak.


Getting Started

Follow these steps to set up and run GreenAI locally:

Prerequisites

  • Python 3.8+
  • pip (Python package installer)

Model Training

Notebook

Installation

  1. Clone the repository:
    git clone https://github.com/hamdani2020/GreenAI
    cd GreenAI
  2. Create a virtual environment (recommended):
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
    pip install -r requirements.txt
    • Note: Ensure your requirements.txt includes:
      streamlit
      ultralytics
      Pillow
      opencv-python
      requests
      python-dotenv
      transformers
      torch
      torchaudio
      datasets
      numpy
      
  4. Download the YOLO model (last.pt): Place your trained YOLO model (last.pt) in the root directory of the project. If you are using a pre-trained checkpoint such as yolov8n.pt instead, adjust the load_yolo_model function accordingly.
  5. Download Speaker Embedding: Ensure you have your speaker_embedding.npy file in the root directory. This is crucial for the Text-to-Speech functionality.
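The fallback mentioned in step 4 could look like the sketch below. load_yolo_model is the function name the README mentions; pick_model_weights and the body are illustrative, not the project's actual code.

```python
# Hedged sketch of step 4's fallback: prefer the fine-tuned last.pt,
# else fall back to a stock yolov8n.pt checkpoint.
from pathlib import Path


def pick_model_weights(custom="last.pt", fallback="yolov8n.pt"):
    """Return the fine-tuned weights if present, else the stock checkpoint."""
    return custom if Path(custom).exists() else fallback


def load_yolo_model():
    from ultralytics import YOLO  # lazy import; requires the ultralytics package
    return YOLO(pick_model_weights())
```

With a stock checkpoint, class names come from the base model, so detections will not match the Ghanaian crop disease classes until you swap in the fine-tuned weights.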

Configuration

  1. Create a .env file: In the root directory of your project, create a file named .env and add your Google Gemini API key:
    GEMINI_API_KEY="YOUR_GEMINI_API_KEY"
    
    Replace "YOUR_GEMINI_API_KEY" with your actual API key.
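A minimal sketch of how the key configured above is read at runtime with python-dotenv; get_gemini_key is an illustrative helper, not the project's actual code.

```python
# Illustrative helper for reading the key set up in the step above.
import os


def get_gemini_key():
    """Read GEMINI_API_KEY, loading .env first when python-dotenv is available."""
    try:
        from dotenv import load_dotenv
        load_dotenv()  # reads .env from the current working directory
    except ImportError:
        pass  # fall back to plain environment variables
    key = os.getenv("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is missing; add it to your .env file")
    return key
```

Failing fast with a clear error here is friendlier than letting the first Gemini request fail with an opaque authentication message.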

Running the Application

  1. Start the Streamlit app:
    streamlit run app.py
  2. Streamlit will automatically open the GreenAI application in your browser (by default at http://localhost:8501).

Future Enhancements

  • Multilingual Support: Implement support for more Ghanaian local languages in the voice assistant.
  • Mobile App Development: Transition the Streamlit prototype into a standalone mobile application for offline capabilities and broader reach.
  • Disease Prevention Advice: Expand the AI's knowledge base to include proactive prevention strategies.
  • Community Reporting: Integrate features for farmers to report outbreaks, contributing to a real-time disease map.
  • Crop Yield Prediction: Incorporate additional AI models for predicting potential crop yields based on health status.

Contributing

We welcome contributions! Please feel free to fork the repository, make changes, and submit pull requests.


License

This project is licensed under the MIT License - see the LICENSE file for details.
