# SceneXplain and geospatial data

This notebook aims to test [SceneXplain](https://scenex.jina.ai)'s capabilities on land use classification.

We used images from several land use datasets:

- [UC Merced land use classification](http://weegee.vision.ucmerced.edu/datasets/landuse.html)
- [AID (Aerial Image Dataset)](https://captain-whu.github.io/AID/)
- [RESISC45 (Remote Sensing Image Scene Classification)](https://www.tensorflow.org/datasets/catalog/resisc45)

After testing with several algorithms, we saw the "Flash" algorithm offered the fastest performance and precision on par with more recent algorithms.

Our results ranged from 60% to 80% accuracy.

## Methodology

### Pre-processing

For each dataset, we selected a subset of 10 images from each category (for example, in the `airport` category we used the files `airport_01.jpg` to `airport_10.jpg`), and discarded the rest. So if a dataset had 20 categories that would give us 200 images in total to test.

For each test, we specify a `MAX_COUNT` of images to test. If, say, `MAX_COUNT` is 50, we randomly sample 50 images in total, drawn from across all categories.
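
The selection and sampling steps above could be sketched roughly as follows. This is a minimal illustration, not the actual pre-processing script: it assumes a `<dataset>/<category>/<image>.jpg` layout, and the names `select_subset`, `sample_images` and `PER_CATEGORY` are hypothetical.

```python
import os
import random

PER_CATEGORY = 10  # images kept per category (assumed constant name)


def select_subset(dataset_dir: str, per_category: int = PER_CATEGORY) -> list:
    """Return the first `per_category` JPEG paths (sorted by name) from each category folder."""
    selected = []
    for category in sorted(os.listdir(dataset_dir)):
        category_dir = os.path.join(dataset_dir, category)
        if not os.path.isdir(category_dir):
            continue  # skip stray files that are not category folders
        images = sorted(f for f in os.listdir(category_dir) if f.lower().endswith(".jpg"))
        selected.extend(os.path.join(category_dir, f) for f in images[:per_category])
    return selected


def sample_images(paths: list, max_count: int, seed=None) -> list:
    """Draw up to `max_count` paths at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(paths, min(max_count, len(paths)))
```

Capping the sample size at `len(paths)` avoids the `ValueError` that `random.sample` raises when asked for more items than exist.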

All images were converted to JPEG format, since the original format was often TIFF which is unsupported by SceneXplain.

All of this pre-processing was done in advance (i.e. not in this notebook), and processed files stored in the project's repo.

### Classifying

We classified the image files using SceneXplain's "Extract JSON from Image" task, sending each image with a simple JSON schema that specified a `category` field, with the potential choices in an `enum` containing the dataset's categories.

The JSON schema is as follows:

```json
{
  "type": "object",
  "properties": {
    "category": {
      "type": "array",
      "description": "Which single main category of geospatial imagery does this image belong to?",
      "enum": [<categories from dataset>],
      "maxContains": 1
    }
  }
}
```
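
As a rough sketch, the same schema could be built in Python for any list of category names. The helper name `build_schema` is hypothetical; the notebook itself loads a base schema from a file and fills in the `enum`.

```python
import json


def build_schema(categories: list) -> dict:
    """Build the classification schema shown above for a list of category names."""
    return {
        "type": "object",
        "properties": {
            "category": {
                "type": "array",
                "description": "Which single main category of geospatial imagery does this image belong to?",
                "enum": sorted(categories),
                "maxContains": 1,
            }
        },
    }


# Illustrative category names only
schema_json = json.dumps(build_schema(["runway", "airport", "beach"]))
```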

A success was defined as the category output by SceneXplain matching the target category (defined by the folder name). The score was calculated as `successes / (successes + failures)`. Any errors (e.g. timeouts) were excluded.
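
A minimal sketch of that scoring rule follows. The function name `accuracy` and the counts in the example are illustrative, not taken from our runs.

```python
def accuracy(successes: int, failures: int):
    """Return successes / (successes + failures); errors are simply not counted.

    Returns None when nothing was scored (e.g. every request errored).
    """
    scored = successes + failures
    return successes / scored if scored else None


# 30 matches, 10 mismatches and 10 errors: the errors do not affect the score
example_score = accuracy(30, 10)  # 0.75
```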

## What happens when it goes wrong?

There are several factors:

- Several categories are very similar, e.g. `sparse_residential`, `medium_residential`, `dense_residential`. SceneXplain often picks the wrong one. This can also be seen in cases like `road` vs `runway`.
- Occasionally it hallucinates a new category not specified in the `enum`, for example `residential`. Occasionally it glitches and assigns a category like `A`.
- Some category names like `chaparral` ([a scrubland common in California](https://en.wikipedia.org/wiki/Chaparral)) are uncommon words and/or concepts. It seems unlikely that many pictures of (and references to) chaparral are used in training datasets.
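
One way to inspect these failure modes is to tally which categories get confused with each other and which outputs fall outside the `enum`. The sketch below assumes failure records shaped like the `fails` entries built later in this notebook (with `target_category` and `scenex_category` keys); the helper name `summarize_failures` is hypothetical.

```python
from collections import Counter


def summarize_failures(fails: list, allowed: list) -> dict:
    """Count (target, predicted) confusions and list predictions not in the allowed categories."""
    allowed_set = set(allowed)
    confusions = Counter(
        (record["target_category"], record["scenex_category"]) for record in fails
    )
    out_of_enum = sorted(
        {record["scenex_category"] for record in fails
         if record["scenex_category"] not in allowed_set}
    )
    return {"confusions": confusions.most_common(), "out_of_enum": out_of_enum}
```

Calling `summarize_failures(output['fails'], category_dirs)` after a run would separate near-miss confusions from hallucinated labels.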

## Notes

- The first section of this notebook is largely function definition and general set up.
- We are using a subset of the full datasets, pre-processed for use with SceneXplain's API (e.g. resized to a manageable size, converted to a common format, and arranged in a consistent directory structure). To save space and processing power, this subset is stored directly in our repo rather than being prepared in this notebook.
- While SceneXplain offers batch processing, we simply use serial transmission in this notebook, since we need to send the image data and then compare SceneXplain's output with the target label. Doing this in a batched manner is surely possible, but would take longer to implement.
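
For reference, a batched version might start from something like the sketch below. It only builds request payloads and sends nothing. It assumes, without having verified it here, that the API accepts several entries in the `data` list and returns results in the same order; the helper names `chunked` and `build_batch_payload` are hypothetical.

```python
import json


def chunked(items: list, size: int):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batch_payload(image_uris: list, schema: dict, algorithm: str = "Flash") -> dict:
    """Build one request body holding several images (assumed to be supported by the API)."""
    return {
        "data": [
            {
                "image": uri,
                "algorithm": algorithm,
                "features": ["json"],
                "json_schema": json.dumps(schema),
            }
            for uri in image_uris
        ]
    }
```

Each result would then need to be matched back to its target label by position, which is the extra bookkeeping we avoided by sending images one at a time.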

## Before you start

Get a [secret](https://scenex.jina.ai/api) to access SceneXplain's API and enter it below:

In [1]:
SCENEX_SECRET = '<YOUR SECRET>'

In [2]:
# ensure we're always using latest version by deleting whatever we've got and cloning from scratch
!rm -rf scenex-geospatial

In [3]:
!git clone https://github.com/alexcg1/scenex-geospatial.git

import os
os.chdir('scenex-geospatial')

Cloning into 'scenex-geospatial'...
remote: Enumerating objects: 1147, done.
remote: Counting objects: 100% (2/2), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 1147 (delta 0), reused 1 (delta 0), pack-reused 1145
Receiving objects: 100% (1147/1147), 85.61 MiB | 31.81 MiB/s, done.
Resolving deltas: 100% (24/24), done.


In [4]:
import glob
import json
import os
import base64
import http.client  # importing `http` alone does not load the `client` submodule
from random import sample
from pprint import pprint

In [5]:
# SceneXplain options
FEATURES = ['json']
ALGO = 'Flash'

# Dataset options
MAX_COUNT = 50

In [6]:
# Get base schema for classification of images
with open('base_schema.json') as file:
    schema = json.loads(file.read())

In [7]:
headers = {
    "x-api-key": f"token {SCENEX_SECRET}",
    "content-type": "application/json",

}

def image_to_data_uri(file_path: str):
    with open(file_path, "rb") as image_file:
        encoded_image = base64.b64encode(image_file.read()).decode("utf-8")
        return f"data:image/jpeg;base64,{encoded_image}"


def process_image(filename: str, schema: dict, features: list = FEATURES):
    print(f'Processing {filename}')
    data = {
        "data": [
            {
                "image": image_to_data_uri(filename),
                "algorithm": ALGO,
                "features": features,
                "json_schema": json.dumps(schema),
            },
        ]
    }

    connection = http.client.HTTPSConnection("api.scenex.jina.ai")
    connection.request("POST", "/v1/describe", json.dumps(data), headers)
    response = connection.getresponse()

    response_data = response.read().decode("utf-8")
    response_json = json.loads(response_data)

    connection.close()

    return response_json

In [8]:
def process_dataset(folder_name: str, schema: dict, max_count: int):
  image_files = glob.glob(f'{folder_name}/**/*.jpg')
  # never ask for more images than the dataset subset contains
  shuffled_files = sample(image_files, min(max_count, len(image_files)))
  successes = []
  fails = []
  errors = []


  for filename in shuffled_files:
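      # the category is the name of the folder that contains the image
      category_name = os.path.basename(os.path.dirname(filename))
      image_data = process_image(filename=filename, schema=schema, features=FEATURES)
      try:
          scenex_category = json.loads(image_data['result'][0]['i18n']['en'])['category'][0]
      except (KeyError, IndexError, TypeError, ValueError):
          # malformed or error response (e.g. a timeout)
          scenex_category = "error processing"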

      data = {
          "image_path": filename,
          "target_category": category_name,
          "scenex_category": scenex_category
      }

      if data['target_category'] == data['scenex_category']:
          data['match'] = True
          successes.append(data)
          print("\t✅ Successful match!")
      elif data['scenex_category'] == 'error processing':
          errors.append(data)
          print("\t😭 Error")
          pprint(image_data)
      else:
          data['match'] = False
          fails.append(data)
          print(f"\t❌ Failed match! (Identified {data['target_category']} as {data['scenex_category']})")

  # what fraction did we get right? (errors are excluded from the score)
  scored = len(successes) + len(fails)
  score = len(successes) / scored if scored else None

  output = {
      'algo': ALGO,
      'features': FEATURES,
      'score': score,
      'dataset': folder_name,
      'successes': successes,
      'fails': fails,
      'errors': errors
  }

  return output
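
The category lookup inside `process_dataset` could also be factored into a small helper that distinguishes errors from out-of-enum answers. This is a sketch under the assumption that responses have the `result[0].i18n.en` shape used above; the name `extract_category` and the status strings are hypothetical.

```python
import json


def extract_category(response_json: dict, allowed: list):
    """Return (category, status), where status is 'ok', 'out_of_enum' or 'error'."""
    try:
        raw = response_json["result"][0]["i18n"]["en"]
        category = json.loads(raw)["category"][0]
    except (KeyError, IndexError, TypeError, json.JSONDecodeError):
        # missing keys, empty lists or non-JSON text all count as errors
        return None, "error"
    if category not in allowed:
        return category, "out_of_enum"
    return category, "ok"
```

Tracking `out_of_enum` separately would make it easier to see how often SceneXplain invents a category rather than picking the wrong one.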

### UC Merced Dataset

[UC Merced Land Use Dataset](http://weegee.vision.ucmerced.edu/datasets/landuse.html) is a 21-class land use image dataset meant for research purposes.

For our testing, we downloaded 10 images from each category and then sampled a selection of them (the number is set by `MAX_COUNT`).

The categories are: agricultural, airplane, baseball diamond, beach, buildings, chaparral, dense residential, forest, freeway, golf course, harbor, intersection, medium residential, mobile home park, overpass, parking lot, river, runway, sparse residential, storage tanks, tennis court.

In [9]:
dataset = "./data/uc_merced"

# Create JSON schema - since each dataset may have different tags, we have to create it dynamically
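# only sub-directories are categories; skip any stray files in the dataset folder
category_dirs = sorted(d for d in os.listdir(dataset) if os.path.isdir(os.path.join(dataset, d)))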
schema['properties']['category']['enum'] = category_dirs

In [None]:
uc_merced_output = process_dataset(dataset, schema, MAX_COUNT)

Processing ./data/nwpu/meadow/meadow_006.jpg
	✅ Successful match!
Processing ./data/nwpu/intersection/intersection_009.jpg
	✅ Successful match!
Processing ./data/nwpu/thermal_power_station/thermal_power_station_003.jpg
	✅ Successful match!
Processing ./data/nwpu/river/river_001.jpg
	✅ Successful match!
Processing ./data/nwpu/railway_station/railway_station_007.jpg
	❌ Failed match! (Identified railway_station as railway)
Processing ./data/nwpu/mobile_home_park/mobile_home_park_004.jpg
	❌ Failed match! (Identified mobile_home_park as residential_area)
Processing ./data/nwpu/dense_residential/dense_residential_003.jpg
	✅ Successful match!
Processing ./data/nwpu/baseball_diamond/baseball_diamond_002.jpg
	✅ Successful match!
Processing ./data/nwpu/forest/forest_009.jpg
	✅ Successful match!
Processing ./data/nwpu/golf_course/golf_course_002.jpg
	✅ Successful match!
Processing ./data/nwpu/desert/desert_005.jpg
	❌ Failed match! (Identified desert as wood)
Processing ./data/nwpu/baseball_diamon

In [None]:
uc_merced_output['score']

0.68

### AID dataset

[AID](https://captain-whu.github.io/AID/) is a large-scale aerial image dataset consisting of sample images from Google Earth imagery.

The dataset is made up of the following 30 aerial scene types: airport, bare land, baseball field, beach, bridge, center, church, commercial, dense residential, desert, farmland, forest, industrial, meadow, medium residential, mountain, park, parking, playground, pond, port, railway station, resort, river, school, sparse residential, square, stadium, storage tanks and viaduct. All the images are labelled by the specialists in the field of remote sensing image interpretation.

As with the other datasets in this notebook, we are using a small subset of 10 images per category for our testing.

In [None]:
dataset = "./data/aid"
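# only sub-directories are categories; skip any stray files in the dataset folder
category_dirs = sorted(d for d in os.listdir(dataset) if os.path.isdir(os.path.join(dataset, d)))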
schema['properties']['category']['enum'] = category_dirs

In [None]:
aid_output = process_dataset(dataset, schema, MAX_COUNT)

Processing ./data/nwpu/mobile_home_park/mobile_home_park_002.jpg
	❌ Failed match! (Identified mobile_home_park as dense_residential)
Processing ./data/nwpu/meadow/meadow_003.jpg
	✅ Successful match!
Processing ./data/nwpu/railway_station/railway_station_008.jpg
	❌ Failed match! (Identified railway_station as land)
Processing ./data/nwpu/sea_ice/sea_ice_006.jpg
	✅ Successful match!
Processing ./data/nwpu/rectangular_farmland/rectangular_farmland_007.jpg
	❌ Failed match! (Identified rectangular_farmland as meadow)
Processing ./data/nwpu/storage_tank/storage_tank_008.jpg
	✅ Successful match!
Processing ./data/nwpu/lake/lake_001.jpg
	✅ Successful match!
Processing ./data/nwpu/runway/runway_007.jpg
	❌ Failed match! (Identified runway as airport)
Processing ./data/nwpu/snowberg/snowberg_009.jpg
	✅ Successful match!
Processing ./data/nwpu/beach/beach_001.jpg
	✅ Successful match!
Processing ./data/nwpu/intersection/intersection_007.jpg
	✅ Successful match!
Processing ./data/nwpu/meadow/meadow_

In [None]:
aid_output['score']

0.6

### NWPU RESISC45 dataset

[RESISC45 dataset](https://github.com/tensorflow/datasets/blob/master/docs/catalog/resisc45.md) is a publicly available benchmark for Remote Sensing Image Scene Classification (RESISC), created by Northwestern Polytechnical University (NWPU). This dataset contains 31,500 images, covering 45 scene classes with 700 images in each class.

For our testing, we downloaded 10 images from each category and then sampled a selection of them (the number is set by `MAX_COUNT`).

The categories are: airplane, airport, baseball_diamond, basketball_court, beach, bridge, chaparral, church, circular_farmland, cloud, commercial_area, dense_residential, desert, forest, freeway, golf_course, ground_track_field, harbor, industrial_area, intersection, island, lake, meadow, medium_residential, mobile_home_park, mountain, overpass, palace, parking_lot, railway, railway_station, rectangular_farmland, river, roundabout, runway, sea_ice, ship, snowberg, sparse_residential, stadium, storage_tank, tennis_court, terrace, thermal_power_station, wetland.

In [None]:
dataset = "./data/nwpu"
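# only sub-directories are categories; skip files such as categories.txt
category_dirs = sorted(d for d in os.listdir(dataset) if os.path.isdir(os.path.join(dataset, d)))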
schema['properties']['category']['enum'] = category_dirs

In [None]:
nwpu_output = process_dataset(dataset, schema, MAX_COUNT)

Processing ./data/nwpu/harbor/harbor_007.jpg
	✅ Successful match!
Processing ./data/nwpu/forest/forest_002.jpg
	✅ Successful match!
Processing ./data/nwpu/beach/beach_007.jpg
	✅ Successful match!
Processing ./data/nwpu/roundabout/roundabout_002.jpg
	✅ Successful match!
Processing ./data/nwpu/runway/runway_003.jpg
	❌ Failed match! (Identified runway as roundabout)
Processing ./data/nwpu/stadium/stadium_008.jpg
	❌ Failed match! (Identified stadium as soccer stadium)
Processing ./data/nwpu/baseball_diamond/baseball_diamond_001.jpg
	❌ Failed match! (Identified baseball_diamond as airplane)
Processing ./data/nwpu/stadium/stadium_001.jpg
	✅ Successful match!
Processing ./data/nwpu/rectangular_farmland/rectangular_farmland_010.jpg
	✅ Successful match!
Processing ./data/nwpu/railway/railway_002.jpg
	❌ Failed match! (Identified railway as railway_station)
Processing ./data/nwpu/sparse_residential/sparse_residential_004.jpg
	❌ Failed match! (Identified sparse_residential as rectangular_farmland)

In [None]:
nwpu_output['score']

0.48
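
To see which categories drag a score down, the output of `process_dataset` could be broken down per category. The sketch below assumes the `successes` and `fails` lists hold records with a `target_category` key, as built above; the helper name `per_category_accuracy` is hypothetical.

```python
from collections import defaultdict


def per_category_accuracy(output: dict) -> dict:
    """Map each target category to its share of correct matches (errors are ignored)."""
    totals = defaultdict(lambda: [0, 0])  # category -> [successes, scored]
    for record in output["successes"]:
        totals[record["target_category"]][0] += 1
        totals[record["target_category"]][1] += 1
    for record in output["fails"]:
        totals[record["target_category"]][1] += 1
    return {cat: hits / scored for cat, (hits, scored) in sorted(totals.items())}
```

For example, `per_category_accuracy(nwpu_output)` would return one value per category that appeared in the sample. With only 50 images spread over 45 categories, most of these values rest on one or two images each, so they are indicative at best.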