# GeoNLI Backend (Django)

This is the backend server for the GeoNLI application. It is built with Django and provides APIs for image processing, chat interactions, and multi-modal inference (Captioning, VQA, Grounding).

## Key Features

*   **Hybrid Inference Pipeline:** Automatically routes image processing based on image type:
    *   **RGB Images:** Processed locally or via the cloud using the **Moondream** library.
    *   **SAR (Synthetic Aperture Radar) Images:** Routed to a specialized external SAR model API.
*   **Automatic Classification:** Uses an external CNN classifier to determine if an uploaded image is RGB or SAR.
*   **Session Management:** Persists chat sessions, message history, and image metadata (including classification type).
*   **RESTful API:** Endpoints for image upload, chat interactions, and history management.
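
The routing described above can be pictured as a small dispatch table keyed on the classifier's label. The sketch below is illustrative only: `handle_rgb`, `handle_sar`, and `route_inference` are hypothetical names, and the handler bodies are placeholders rather than the project's actual Moondream or SAR API calls.

```python
from typing import Callable, Dict


def handle_rgb(image_path: str, prompt: str) -> str:
    # Placeholder for the Moondream-based RGB path (hypothetical stub).
    return f"[moondream] {prompt} ({image_path})"


def handle_sar(image_path: str, prompt: str) -> str:
    # Placeholder for the external SAR model API call (hypothetical stub).
    return f"[sar-api] {prompt} ({image_path})"


# Maps the classifier's label to the handler for that image type.
HANDLERS: Dict[str, Callable[[str, str], str]] = {
    "RGB": handle_rgb,
    "SAR": handle_sar,
}


def route_inference(image_type: str, image_path: str, prompt: str) -> str:
    """Dispatch a request to the handler matching the detected image type."""
    try:
        handler = HANDLERS[image_type.upper()]
    except KeyError:
        raise ValueError(f"Unsupported image type: {image_type!r}") from None
    return handler(image_path, prompt)
```

Keeping the mapping in one dictionary means a further image type would only need a new handler and one extra entry.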

## Prerequisites

*   Python 3.10+
*   pip (Python package manager)
*   External services (optional but recommended for full functionality):
    *   CNN Image Classifier Service
    *   Hosted SAR Model Service

## Installation

1.  **Navigate to the backend directory:**
    ```bash
    cd website-updated
    ```

2.  **Create a virtual environment (optional but recommended):**
    ```bash
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    ```

3.  **Install dependencies:**
    ```bash
    pip install -r requirements.txt
    ```

## Configuration

Create a `.env` file in the `website-updated` directory to configure your API keys and external service URLs.

```ini
# .env

# Django Secret Key (Generate a secure random string for production)
SECRET_KEY=your_django_secret_key

# Debug Mode (Set to False in production)
DEBUG=True

# Moondream API Key (Required for RGB image processing)
MOONDREAM_API_KEY=your_moondream_api_key

# --- External Services Configuration ---

# URL for the CNN Classifier that distinguishes RGB vs SAR
CNN_CLASSIFIER_URL=http://localhost:5000/classify

# URL for the hosted SAR Model inference API
SAR_MODEL_URL=http://localhost:5001/v1/inference

# API Key for the SAR Model (if required)
SAR_API_KEY=your_sar_api_key
```
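
As a rough illustration of how these values might be consumed, the sketch below reads them from the process environment into a plain dictionary. It assumes the `.env` file has already been loaded into the environment by whatever mechanism the project uses; `parse_bool` and `load_settings` are hypothetical helpers, not part of the codebase.

```python
import os
from typing import Mapping, Optional


def parse_bool(value: str) -> bool:
    """Interpret common truthy strings such as "True" or "1"."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect the variables listed in `.env` from a mapping (default: os.environ)."""
    env = os.environ if env is None else env
    return {
        # SECRET_KEY has no default here: a missing key raises KeyError early.
        "SECRET_KEY": env["SECRET_KEY"],
        # DEBUG defaults to False so a missing value is safe in production.
        "DEBUG": parse_bool(env.get("DEBUG", "False")),
        "MOONDREAM_API_KEY": env.get("MOONDREAM_API_KEY", ""),
        "CNN_CLASSIFIER_URL": env.get("CNN_CLASSIFIER_URL", ""),
        "SAR_MODEL_URL": env.get("SAR_MODEL_URL", ""),
        "SAR_API_KEY": env.get("SAR_API_KEY", ""),
    }
```

Passing a mapping explicitly, for example `load_settings({"SECRET_KEY": "x", "DEBUG": "True"})`, makes the parsing easy to check without touching the real environment.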

## Database Setup

Run the initial migrations to set up the SQLite database:

```bash
python manage.py migrate
```

## Running the Server

Start the Django development server:

```bash
python manage.py runserver
```

The server will start at `http://localhost:8000`.

## API Endpoints

### 1. Upload Image
*   **URL:** `/geoNLI/upload`
*   **Method:** `POST`
*   **Body:** `form-data` with key `image` (file).
*   **Response:** Returns `image_url`, `session_id`, and detected `image_type` ('RGB' or 'SAR').
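
A minimal client sketch for this endpoint, using only the Python standard library, might look like the following. It assumes the server is running at `http://localhost:8000` and that the response body is JSON; `build_multipart` and `upload_image` are hypothetical helper names.

```python
import json
import mimetypes
import os
import urllib.request
import uuid
from typing import Tuple

BASE_URL = "http://localhost:8000"  # Assumed: the default development server address.


def build_multipart(field: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    # Note: the filename is inserted as-is; quotes in it are not escaped in this sketch.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_image(path: str) -> dict:
    """POST the file under the `image` key and return the decoded response."""
    with open(path, "rb") as fh:
        body, content_type = build_multipart("image", os.path.basename(path), fh.read())
    request = urllib.request.Request(
        f"{BASE_URL}/geoNLI/upload",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)  # Assumed to be a JSON object.
```

If the response matches the description above, the `image_url` and `session_id` fields of the returned dictionary can be passed on to the chat endpoint.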

### 2. Chat / Inference
*   **URL:** `/geoNLI/chat`
*   **Method:** `POST`
*   **Body:** JSON. The `mode` field accepts `"captioning"`, `"vqa"`, or `"grounding"`.
    ```json
    {
      "image_url": "http://...",
      "session_id": 1,
      "message": "Describe this image",
      "mode": "captioning"
    }
    ```
*   **Response:** Returns the model's text response or grounding objects.
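
The request body can be assembled and sent along these lines. This is a sketch under assumptions: the server address is the development default, the response is assumed to be JSON, and `build_chat_payload` and `send_chat` are hypothetical helpers rather than part of the project.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # Assumed: the default development server address.
VALID_MODES = ("captioning", "vqa", "grounding")


def build_chat_payload(image_url: str, session_id: int, message: str,
                       mode: str = "captioning") -> dict:
    """Build the chat request body, rejecting modes the endpoint does not list."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    return {
        "image_url": image_url,
        "session_id": session_id,
        "message": message,
        "mode": mode,
    }


def send_chat(payload: dict) -> dict:
    """POST the payload as JSON to the chat endpoint and decode the reply."""
    request = urllib.request.Request(
        f"{BASE_URL}/geoNLI/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)  # Assumed to be a JSON object.
```

Validating `mode` on the client side catches typos before a request is made, whatever the server does with unknown values.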

### 3. Chat History
*   **URL:** `/geoNLI/history`
*   **Method:** `GET`
*   **Response:** List of recent chat sessions.

### 4. Session Messages
*   **URL:** `/geoNLI/history/<session_id>`
*   **Method:** `GET`
*   **Response:** Full conversation history for a specific session.
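
Both history endpoints are plain `GET` requests, so a single helper can cover them. The sketch below assumes the development server address and JSON responses; `history_url` and `fetch_history` are hypothetical names.

```python
import json
import urllib.request
from typing import Any, Optional

BASE_URL = "http://localhost:8000"  # Assumed: the default development server address.


def history_url(session_id: Optional[int] = None) -> str:
    """Return the session-list URL, or the per-session URL when an id is given."""
    url = f"{BASE_URL}/geoNLI/history"
    if session_id is not None:
        url = f"{url}/{session_id}"
    return url


def fetch_history(session_id: Optional[int] = None) -> Any:
    """GET the recent sessions, or one session's messages, as decoded JSON."""
    with urllib.request.urlopen(history_url(session_id)) as response:
        return json.load(response)  # Assumed to be JSON.
```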

## Testing

To run the API integration tests:

```bash
# Ensure the server is running in another terminal first!
python test_api.py
```

# User Guide

## 1. Introduction
The application provides a complete pipeline for natural-language interpretation of satellite imagery, supporting captioning, grounding, and visual question answering.  
This guide explains how to set up the deployment package and operate the interface to run inference.  
The setup is lightweight and requires only minimal environment preparation using the files provided.

---

## 2. System Requirements
The only system requirement is a working Python 3 installation; the backend prerequisites above specify Python 3.10+.

---

## 3. Installation

### 3.1 Unzip the Deployment Package
Extract the provided ZIP file.  
It contains the full project folder along with all required checkpoints.

### 3.2 Create a Virtual Environment
Run the following commands:
```bash
python3 -m venv venv
source venv/bin/activate      # Linux/Mac
# On Windows, run this instead: venv\Scripts\activate
```

### 3.3 Install Dependencies

Install all required Python packages using the provided `requirements.txt` file:
```bash
pip install -r requirements.txt
```

### 3.4 Download SAM 2.1 Source Code

Download the official SAM 2.1 source repository:
```bash
wget https://github.com/facebookresearch/sam2/archive/refs/heads/main.zip -O sam2_source.zip
```

### 3.5 Unzip SAM 2.1

Extract the downloaded SAM 2.1 source archive:
```bash
unzip sam2_source.zip
```

### 3.6 Install SAM 2.1 in Editable Mode

Navigate to the extracted directory and install SAM 2.1:
```bash
cd sam2-main
pip install -e .
```

### 3.7 Return to the Main Project Directory
```bash
cd ..
```

## 4. Running the Application

### 4.1 Start the Flask Server

Launch the application server in debug mode:
```bash
flask --app VLMHosting run --debug
```

## 5. Operating the Website for Inference

### 5.1 Uploading an Image

1. On the front page of the website, click the **Upload Image** button.
2. Select a satellite image from your local system. Common image formats (JPEG, PNG, etc.) are supported.
3. Once the image is uploaded, the interface splits into two sections:
   - **Left half**: Interactive chatbot interface
   - **Right half**: Your uploaded satellite image

### 5.2 Interacting with the Chatbot

The chatbot provides three primary modes of interaction for analyzing your satellite imagery:

#### 5.2.1 Captioning
- Generate natural language descriptions of the satellite image
- The model will provide a comprehensive caption describing the contents, features, and characteristics visible in the image
- Simply type your request (e.g., "Describe this image" or "Generate a caption")

#### 5.2.2 Visual Question Answering (VQA)
Ask questions about the image in three different categories:

- **Binary VQA**: Ask yes/no questions about the image
  - Example: "Is there a river in this image?"
  - Example: "Are there buildings present?"

- **Semantic VQA**: Ask descriptive questions requiring detailed answers
  - Example: "What type of terrain is shown?"
  - Example: "What structures are visible in the center?"

- **Numeric VQA**: Ask questions that require numerical answers
  - Example: "How many buildings are visible?"
  - Example: "What is the approximate area coverage?"

#### 5.2.3 Grounding
- Identify and locate specific objects or features within the image
- The system will highlight or mark the requested features on the displayed image
- Example: "Locate all buildings in the image"
- Example: "Show me where the roads are"

### 5.3 Viewing Chat History

- The chatbot interface includes a **History** option
- Click on the history button to view your previous interactions and queries
- This allows you to:
  - Review past questions and answers
  - Revisit previous image analyses
  - Track your inference sessions

### 5.4 Tips for Best Results

- **Be specific**: Clearly state what you want to know about the image
- **Choose the right mode**: Select the appropriate interaction type (captioning, VQA, or grounding) based on your needs
- **Experiment**: Try different question types to explore the full capabilities of the system
- **Use history**: Reference your previous queries to build upon earlier analyses

---
