# Pratikshya Regmi
## Final Project
This assistant tells users which GRASS GIS tool to use for a given geospatial analysis task.

---



## Install required libraries
The command `!pip install transformers grass-session` installs two Python libraries with the pip package manager. The `transformers` library, provided by Hugging Face, is widely used for working with pre-trained language models such as LLaMA 2, GPT, and BERT on tasks like text classification and text generation. The `grass-session` library lets Python scripts start and drive GRASS GIS, an open-source geospatial software suite, so that GIS operations can be scripted and automated.

In [None]:
!pip install transformers grass-session

Collecting grass-session
  Downloading grass_session-0.5-py2.py3-none-any.whl.metadata (2.5 kB)
Downloading grass_session-0.5-py2.py3-none-any.whl (31 kB)
Installing collected packages: grass-session
Successfully installed grass-session-0.5
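
After installation, it can be useful to confirm that both packages are importable before running the rest of the notebook. The following is a minimal sketch using only the standard library. The helper name `missing_modules` is hypothetical, and `grass_session` is assumed to be the import name of the `grass-session` package.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# "grass_session" is the assumed import name for the grass-session package
print(missing_modules(["transformers", "grass_session"]))
```

An empty list means both packages were found; any name in the list still needs to be installed.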


## Import necessary libraries
`import json` loads Python's built-in JSON (JavaScript Object Notation) module. It encodes Python objects as JSON strings and decodes JSON strings back into Python objects, a format commonly used for data exchange between applications.

`from transformers import pipeline` imports the `pipeline` function from the Hugging Face Transformers library. It simplifies the use of pre-trained models for tasks like text classification, question answering, and text generation by handling model loading and configuration behind a single call.

In [None]:
import json
from transformers import pipeline
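
As a brief illustration of the JSON round trip described above, the sketch below encodes one dictionary and decodes it again. The entry mirrors the shape of the knowledge-base entries defined in the next section; the variable names are only for illustration.

```python
import json

# One entry in the same shape as the knowledge-base entries in this notebook
entry = {
    "command": "r.contour",
    "description": "Generates contour lines from a raster.",
    "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.contour.html",
}

encoded = json.dumps(entry, indent=2)  # Python dict -> JSON string
decoded = json.loads(encoded)          # JSON string -> Python dict

print(encoded)
```

Because the entry contains only strings, the decoded dictionary is equal to the original one.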

## Define the knowledge base
The knowledge base maps each task key to its GRASS GIS command, a short description, and a link to the manual page, so that a classified query can be answered with a direct dictionary lookup.

In [None]:
# Define the knowledge base
KNOWLEDGE_BASE = {
    "buffer_vector_point": {
        "command": "v.buffer",
        "description": "Creates a buffer around vector points, lines, or areas.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/v.buffer.html"
    },
    "reproject_raster": {
        "command": "r.proj",
        "description": "Reprojects raster maps from one location to another.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.proj.html"
    },
    "calculate_slope": {
        "command": "r.slope.aspect",
        "description": "Calculates slope and aspect maps from a DEM.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.slope.aspect.html"
    },
    "create_contours": {
        "command": "r.contour",
        "description": "Generates contour lines from a raster.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.contour.html"
    },
    "generate_watershed_subbasins": {
        "command": "r.basins.fill",
        "description": "Generates watershed subbasins raster map.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.basins.fill.html"
    },
    "blend_raster_maps": {
        "command": "r.blend",
        "description": "Blends color components of two raster maps by a given ratio.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.blend.html"
    },
    "buffer_raster": {
        "command": "r.buffer",
        "description": "Creates a raster map showing buffer zones surrounding cells that contain non-NULL category values.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.buffer.html"
    },
    "buffer_raster_lowmem": {
        "command": "r.buffer.lowmem",
        "description": "Creates a raster map showing buffer zones surrounding cells using low memory.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.buffer.lowmem.html"
    },
    "build_virtual_raster": {
        "command": "r.buildvrt",
        "description": "Builds a VRT (Virtual Raster) from the list of input raster maps.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.buildvrt.html"
    },
    "generate_stream_channels": {
        "command": "r.carve",
        "description": "Generates stream channels.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.carve.html"
    },
    "manage_raster_categories": {
        "command": "r.category",
        "description": "Manages category values and labels associated with raster maps.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.category.html"
    },
    "create_raster_circle": {
        "command": "r.circle",
        "description": "Creates a raster map containing concentric rings around a given point.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.circle.html"
    },
    "clump_raster_cells": {
        "command": "r.clump",
        "description": "Recategorizes data in a raster map by grouping cells that form physically discrete areas into unique categories.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.clump.html"
    },
    "tabulate_raster_coincidence": {
        "command": "r.coin",
        "description": "Tabulates the mutual occurrence (coincidence) of categories for two raster map layers.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.coin.html"
    },
    "modify_raster_colors": {
        "command": "r.colors",
        "description": "Creates or modifies the color table associated with a raster map.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.colors.html"
    },
    "export_raster_colors": {
        "command": "r.colors.out",
        "description": "Exports the color table associated with a raster map.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.colors.out.html"
    },
    "generate_contours": {
        "command": "r.contour",
        "description": "Produces a vector map of specified contours from a raster map.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.contour.html"
    },
    "calculate_cost_surface": {
        "command": "r.cost",
        "description": "Creates a raster map showing the cumulative cost of moving between different geographic locations.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.cost.html"
    },
    "buffer_vector_features": {
        "command": "v.buffer",
        "description": "Creates a buffer around vector features of a given type.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/v.buffer.html"
    },
    "rebuild_vector_topology": {
        "command": "v.build.all",
        "description": "Rebuilds topology on all vector maps in the current mapset.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/v.build.all.html"
    },
    "build_vector_topology": {
        "command": "v.build",
        "description": "Creates topology for a vector map.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/v.build.html"
    },
    "build_vector_polylines": {
        "command": "v.build.polylines",
        "description": "Builds polylines from lines or boundaries.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/v.build.polylines.html"
    },
    "manage_vector_categories": {
        "command": "v.category",
        "description": "Attaches, deletes, or reports vector categories to/from map geometry.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/v.category.html"
    },
    "add_vector_centroids": {
        "command": "v.centroids",
        "description": "Adds missing centroids to closed boundaries.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/v.centroids.html"
    },
    "clean_vector_topology": {
        "command": "v.clean",
        "description": "Cleans topology of a vector map.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/v.clean.html"
    },
    "clip_vector_map": {
        "command": "v.clip",
        "description": "Extracts features of an input map that overlay features of the clip map.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/v.clip.html"
    }
}
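
Since every answer depends on the three fields of an entry, a small consistency check can catch incomplete entries early. This is a sketch under assumptions, not part of the original notebook: `REQUIRED_FIELDS`, `find_incomplete_entries`, and the `broken_entry` sample are hypothetical names introduced for illustration.

```python
REQUIRED_FIELDS = ("command", "description", "manual_link")


def find_incomplete_entries(knowledge_base):
    """Return {task_key: [missing fields]} for entries lacking a required field."""
    problems = {}
    for task_key, entry in knowledge_base.items():
        missing = [field for field in REQUIRED_FIELDS if not entry.get(field)]
        if missing:
            problems[task_key] = missing
    return problems


sample = {
    "calculate_slope": {
        "command": "r.slope.aspect",
        "description": "Calculates slope and aspect maps from a DEM.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.slope.aspect.html",
    },
    # Hypothetical entry, included only to show what a failure looks like
    "broken_entry": {"command": "r.example"},
}

print(find_incomplete_entries(sample))
# → {'broken_entry': ['description', 'manual_link']}
```

Calling `find_incomplete_entries(KNOWLEDGE_BASE)` on the dictionary above would apply the same check to the full knowledge base.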

## Load the Hugging Face NLP model for intent classification
This line initializes a zero-shot classification pipeline using the facebook/bart-large-mnli model, enabling the classification of text into predefined categories without requiring task-specific training.

In [None]:
nlp_model = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


## Define potential intents

The TASKS dictionary maps task identifiers to descriptive labels, providing a reference for classifying user queries into specific geospatial operations.

In [None]:
TASKS = {
    "buffer_vector_point": "Create a buffer around vector features",
    "reproject_raster": "Reproject raster data to a different CRS",
    "calculate_slope": "Calculate the slope and aspect from a DEM",
    "create_contours": "Generate contour lines from a raster dataset",
    "generate_watershed_subbasins": "Generate watershed subbasins raster map",
    "blend_raster_maps": "Blend color components of two raster maps",
    "buffer_raster": "Create a raster map showing buffer zones for non-NULL cells",
    "buffer_raster_lowmem": "Create buffer zones using low memory for raster cells",
    "build_virtual_raster": "Build a virtual raster (VRT) from input raster maps",
    "generate_stream_channels": "Generate stream channels from elevation data",
    "manage_raster_categories": "Manage category values and labels in a raster map",
    "create_raster_circle": "Create a raster map with concentric rings around a point",
    "clump_raster_cells": "Group raster cells into unique categories for discrete areas",
    "tabulate_raster_coincidence": "Tabulate mutual occurrence of categories between raster maps",
    "modify_raster_colors": "Modify the color table associated with a raster map",
    "export_raster_colors": "Export the color table from a raster map",
    "generate_contours": "Generate contour lines from a raster map",
    "calculate_cost_surface": "Create a cost surface raster map based on movement costs",
    "buffer_vector_features": "Create a buffer around vector features",
    "rebuild_vector_topology": "Rebuild topology for all vector maps in the mapset",
    "build_vector_topology": "Create topology for a vector map",
    "build_vector_polylines": "Generate polylines from vector lines or boundaries",
    "manage_vector_categories": "Manage categories for vector features",
    "add_vector_centroids": "Add centroids to closed boundaries in vector data",
    "clean_vector_topology": "Clean topology errors in vector maps",
    "clip_vector_map": "Clip vector features based on overlay with another map"
}
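
The classifier later maps the winning label back to a task key by comparing label strings, so two tasks that share one label cannot be told apart: the first matching key in dictionary order is returned. In `TASKS`, `buffer_vector_point` and `buffer_vector_features` share the label "Create a buffer around vector features". The sketch below shows one way to detect such collisions; the helper name `labels_shared_by_tasks` is hypothetical.

```python
from collections import defaultdict


def labels_shared_by_tasks(tasks):
    """Return {label: [task keys]} for every label used by more than one task."""
    keys_by_label = defaultdict(list)
    for task_key, label in tasks.items():
        keys_by_label[label].append(task_key)
    return {label: keys for label, keys in keys_by_label.items() if len(keys) > 1}


# A three-entry excerpt of TASKS, enough to show the collision
sample_tasks = {
    "buffer_vector_point": "Create a buffer around vector features",
    "buffer_vector_features": "Create a buffer around vector features",
    "calculate_slope": "Calculate the slope and aspect from a DEM",
}

print(labels_shared_by_tasks(sample_tasks))
# → {'Create a buffer around vector features': ['buffer_vector_point', 'buffer_vector_features']}
```

Giving each task a distinct label would make the reverse lookup unambiguous.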

## Function to classify user query and match with knowledge base

In [None]:
def classify_task(query):
    # Map keywords to task keys. Each keyword appears once; multi-word phrases
    # are more specific than the single words they contain.
    keywords_to_tasks = {
        "buffer": "buffer_vector_point",
        "contour": "create_contours",
        "slope": "calculate_slope",
        "aspect": "calculate_slope",
        "reproject": "reproject_raster",
        "projection": "reproject_raster",
        "buffer raster": "buffer_raster",
        "buffer vector": "buffer_vector_features",
        "low memory buffer": "buffer_raster_lowmem",
        "virtual raster": "build_virtual_raster",
        "stream channels": "generate_stream_channels",
        "raster categories": "manage_raster_categories",
        "raster circle": "create_raster_circle",
        "clump raster": "clump_raster_cells",
        "raster coincidence": "tabulate_raster_coincidence",
        "modify colors": "modify_raster_colors",
        "export colors": "export_raster_colors",
        "contour lines": "generate_contours",
        "cost surface": "calculate_cost_surface",
        "rebuild topology": "rebuild_vector_topology",
        "build topology": "build_vector_topology",
        "build polylines": "build_vector_polylines",
        "vector categories": "manage_vector_categories",
        "add centroids": "add_vector_centroids",
        "clean topology": "clean_vector_topology",
        "clip vector": "clip_vector_map",
        "generate watershed": "generate_watershed_subbasins",
        "blend raster": "blend_raster_maps"
    }

    # Check for keywords in the query first. Longer keywords are tried first so
    # that a specific phrase such as "buffer raster" wins over the generic "buffer".
    query_lower = query.lower()
    for keyword in sorted(keywords_to_tasks, key=len, reverse=True):
        if keyword in query_lower:
            return keywords_to_tasks[keyword]

    # Fallback to NLP model if no keywords match
    labels = list(TASKS.values())
    result = nlp_model(query, labels)

    # Debugging: Print the classification results
    print("Classification Results:", result)

    # Find the label with the highest score
    scores = result["scores"]
    best_score = max(scores)
    if best_score > 0.7:  # Confidence threshold
        best_label = result["labels"][scores.index(best_score)]
        for task_key, description in TASKS.items():
            if best_label == description:
                return task_key
    return None
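
The fallback branch relies on the shape of the zero-shot pipeline result: a dictionary with the keys `sequence`, `labels`, and `scores`, where the labels and scores are parallel lists. The sketch below isolates the threshold step so it can be tried without downloading the model. The function name `pick_label` is hypothetical, and the result dictionary is hand-written with invented scores, not real model output.

```python
def pick_label(result, threshold=0.7):
    """Return the highest-scoring label if its score exceeds threshold, else None."""
    best_score = max(result["scores"])
    if best_score <= threshold:
        return None
    return result["labels"][result["scores"].index(best_score)]


# Hand-written stand-in for a pipeline result; the scores are invented for illustration
fake_result = {
    "sequence": "make isolines from my elevation model",
    "labels": [
        "Generate contour lines from a raster map",
        "Calculate the slope and aspect from a DEM",
    ],
    "scores": [0.82, 0.18],
}

print(pick_label(fake_result))                 # → Generate contour lines from a raster map
print(pick_label(fake_result, threshold=0.9))  # → None
```

In the pipeline's default single-label mode the scores are normalized across all candidate labels, so with many candidates the top score may fall below 0.7 even when the top label is a sensible match. The threshold may therefore need tuning.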

## Function to retrieve details about a task from the knowledge base

In [None]:
def get_tool_details(task_key):
    return KNOWLEDGE_BASE.get(task_key, None)

## Function to interact with the user

In [None]:
def grass_gis_assistant():
    print("Welcome to the GRASS GIS Assistant!")
    print("Ask me about which tool or command to use for specific tasks in GRASS GIS.")
    print("Type 'exit' to quit.")

    while True:
        query = input("\nEnter your query: ")
        if query.lower() == "exit":
            print("Goodbye!")
            break

        print("\nProcessing your query...")
        task_key = classify_task(query)
        if task_key:
            tool_details = get_tool_details(task_key)
            if tool_details:
                print(f"\nCommand: {tool_details['command']}")
                print(f"Description: {tool_details['description']}")
                print(f"Manual Link: {tool_details['manual_link']}")
            else:
                print("\nSorry, I couldn't find information about this task.")
        else:
            print("\nSorry, I couldn't understand your query. Please try rephrasing.")
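
Because the loop above mixes `input()` and `print()` with the lookup logic, it is hard to test without typing queries by hand. One possible refactoring, sketched here under assumptions, separates the reply construction into functions that return strings. The names `format_answer`, `answer_query`, `demo_kb`, and `demo_classify` are hypothetical, and the classifier and lookup are passed in so that small stand-ins can replace the model.

```python
def format_answer(tool_details):
    """Build the reply text for one knowledge-base entry, or an apology if it is None."""
    if tool_details is None:
        return "Sorry, I couldn't find information about this task."
    return (
        f"Command: {tool_details['command']}\n"
        f"Description: {tool_details['description']}\n"
        f"Manual Link: {tool_details['manual_link']}"
    )


def answer_query(query, classify, lookup):
    """Answer one query using the injected classify and lookup callables."""
    task_key = classify(query)
    if not task_key:
        return "Sorry, I couldn't understand your query. Please try rephrasing."
    return format_answer(lookup(task_key))


# Tiny stand-ins so the sketch runs without the model or the full knowledge base
demo_kb = {
    "calculate_slope": {
        "command": "r.slope.aspect",
        "description": "Calculates slope and aspect maps from a DEM.",
        "manual_link": "https://grass.osgeo.org/grass-stable/manuals/r.slope.aspect.html",
    }
}


def demo_classify(query):
    """Stand-in classifier: recognizes only the word 'slope'."""
    return "calculate_slope" if "slope" in query.lower() else None


print(answer_query("How do I compute slope?", demo_classify, demo_kb.get))
```

In the notebook itself, the same call could be made as `answer_query(query, classify_task, get_tool_details)` inside the input loop.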

## Run the assistant model

In [None]:
grass_gis_assistant()

Welcome to the GRASS GIS Assistant!
Ask me about which tool or command to use for specific tasks in GRASS GIS.
Type 'exit' to quit.

Processing your query...

Command: v.buffer
Description: Creates a buffer around vector points, lines, or areas.
Manual Link: https://grass.osgeo.org/grass-stable/manuals/v.buffer.html

Processing your query...
Classification Results: {'sequence': 'I want to calculate steepness of a terrain', 'labels': ['Clip vector features based on overlay with another map', 'Calculate the slope and aspect from a DEM', 'Manage categories for vector features', 'Create a buffer around vector features', 'Create a buffer around vector features', 'Create a cost surface raster map based on movement costs', 'Group raster cells into unique categories for discrete areas', 'Generate contour lines from a raster map', 'Generate polylines from vector lines or boundaries', 'Generate contour lines from a raster dataset', 'Build a virtual raster (VRT) from input raster maps', 'Create a