# Tutorial 1: Loading, Processing, Visualizing, and Storing Data

This short notebook demonstrates how `landlens` can be used to load, process, visualize, and store street-view data from local file directories and Mapillary servers.

In [1]:
from landlens.handlers.cloud import Mapillary
from landlens.handlers.image import ImageExifProcessor as iep
from landlens.process import snap
from landlens.handlers.db import ImageDB

Before we get started, we need to load our Mapillary API token and other environment variables. For simplicity, we will use the `dotenv` library. To follow this tutorial, please install it and create a `.env` file that defines the variables read below.

In [2]:
import os

from dotenv import load_dotenv

load_dotenv()

MLY_TOKEN = os.environ.get("MLY_TOKEN")
LOCAL_IMAGES = os.environ.get("LOCAL_IMAGES")
DATABASE_URL = os.environ.get("DATABASE_URL")
DB_TABLE = os.environ.get("DB_TABLE")
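
If a variable is missing from the `.env` file, `os.environ.get` quietly returns `None` and the failure only shows up in a later cell. The following is a minimal sketch of an early check. `missing_settings` is a hypothetical helper written for this tutorial, not part of `landlens` or `dotenv`.

```python
# Variable names taken from the cell above.
REQUIRED_SETTINGS = ("MLY_TOKEN", "LOCAL_IMAGES", "DATABASE_URL", "DB_TABLE")


def missing_settings(env, required=REQUIRED_SETTINGS):
    """Return the names in `required` that are absent or empty in `env`."""
    return [name for name in required if not env.get(name)]


# Stand-in mapping for illustration; pass os.environ in the notebook.
example_env = {"MLY_TOKEN": "abc", "LOCAL_IMAGES": "/data/images", "DB_TABLE": ""}
print(missing_settings(example_env))  # ['DATABASE_URL', 'DB_TABLE']
```

In the notebook, `missing_settings(os.environ)` should return an empty list before continuing.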

# Loading Images

`landlens` provides two simple ways to load images for the first time. Once loaded, the images can be processed and stored for further analysis.

## 1. Loading images from local directory

To load images from a local directory, simply call the `load_images` function while providing the source directory to read from. Currently, only `jpeg` images are supported and it is best to provide the full path to the images.

In [3]:
local_images = iep.load_images(LOCAL_IMAGES)
local_images

Unnamed: 0,name,altitude,camera_type,camera_parameters,captured_at,compass_angle,exif_orientation,image_url,geometry
0,IMG_0408.jpeg,234.415619,,,2023-03-06T11:04:19+03:00,316.262604,1.0,/Users/iosefa/repos/misc/SU_GCPsystem/notebook...,POINT (46.79483 -16.33181)
1,IMG_0404.jpeg,235.402679,,,2023-03-06T11:04:15+03:00,76.32843,1.0,/Users/iosefa/repos/misc/SU_GCPsystem/notebook...,POINT (46.79484 -16.33181)
2,IMG_0405.jpeg,235.890045,,,2023-03-06T11:04:15+03:00,45.822021,1.0,/Users/iosefa/repos/misc/SU_GCPsystem/notebook...,POINT (46.79484 -16.33181)
3,IMG_0409.jpeg,234.390961,,,2023-03-06T11:04:20+03:00,271.465851,1.0,/Users/iosefa/repos/misc/SU_GCPsystem/notebook...,POINT (46.79483 -16.33181)
4,IMG_0403.jpeg,235.64389,,,2023-03-06T11:04:14+03:00,97.850014,1.0,/Users/iosefa/repos/misc/SU_GCPsystem/notebook...,POINT (46.79484 -16.33182)
5,IMG_0410.jpeg,234.289215,,,2023-03-06T11:04:21+03:00,254.171234,1.0,/Users/iosefa/repos/misc/SU_GCPsystem/notebook...,POINT (46.79483 -16.33181)
6,IMG_0406.jpeg,235.161835,,,2023-03-06T11:04:16+03:00,36.9245,1.0,/Users/iosefa/repos/misc/SU_GCPsystem/notebook...,POINT (46.79484 -16.33181)
7,IMG_0407.jpeg,234.766693,,,2023-03-06T11:04:18+03:00,350.140991,1.0,/Users/iosefa/repos/misc/SU_GCPsystem/notebook...,POINT (46.79484 -16.33181)
8,IMG_0411.jpeg,234.289215,,,2023-03-06T11:04:22+03:00,248.655853,1.0,/Users/iosefa/repos/misc/SU_GCPsystem/notebook...,POINT (46.79483 -16.33181)


The resulting object is a `GeoImageFrame`, which is a simple extension of a GeoPandas `GeoDataFrame` with a few required column definitions and additional methods for visualization and data verification.

## 2. Loading images from Mapillary

`landlens` was made to work with Mapillary data. It includes helper functions that call the Mapillary API, download the results, and convert them into a format that `landlens` can use.

To use `landlens` to fetch data from Mapillary, you first need to initialize a Mapillary connection using your Mapillary Secret Token.

In [4]:
importer = Mapillary(MLY_TOKEN)

`landlens` offers a few functions to filter Mapillary data from their API. However, for more advanced filtering, we recommend that users use the `mapillary-python-sdk` and convert the resulting data into a GeoImageFrame.

Here is an example of how to load data using the `fetch_by_id` method of `landlens`:

In [5]:
image_id = 915374089313107
image = importer.fetch_by_id(image_id)
image

Unnamed: 0,altitude,atomic_scale,camera_parameters,camera_type,captured_at,compass_angle,computed_altitude,computed_compass_angle,computed_geometry,computed_rotation,...,height,merge_cc,mesh,sequence,sfm_cluster,width,detections,mly_id,name,image_url
0,41.782,1.002665,"0.61739578749889,0.26131500830183,0.1242660260...",fisheye,2019-10-23T22:29:42+09:00,99.299232,1.795589,102.951814,POINT (140.95153462743 42.329677227362),"-1.0627190885041,-0.84029284280692,-1.15538369...",...,3000,1.926644e+18,"{'id': '313263440182706', 'url': 'https://scon...",emgV_2cwMSoW9w7fkg7xJQ,"{'id': '169747341731652', 'url': 'https://scon...",4000,"{'data': [{'id': '916266259223890'}, {'id': '9...",915374089313107,mly|915374089313107,https://scontent-itm1-1.xx.fbcdn.net/m1/v/t6/A...


By default, `landlens` downloads all fields from the Mapillary image endpoint and uses `thumb_1024_url` as the `image_url`. You may instead request a subset of fields with the `fields` argument, in which case only those fields are downloaded. Note that you must supply at least `id`, `geometry`, and one of the image URL fields.
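
The rule for the `fields` argument can be checked before any request is sent. The sketch below is illustrative only: `check_fields` is a hypothetical helper, and the set of image URL field names is an assumption about the Mapillary image endpoint rather than something this tutorial confirms (only `thumb_1024_url` appears here).

```python
# Assumed names of the image URL fields on the Mapillary image endpoint.
IMAGE_URL_FIELDS = {
    "thumb_256_url",
    "thumb_1024_url",
    "thumb_2048_url",
    "thumb_original_url",
}


def check_fields(fields):
    """Raise ValueError unless `fields` has `id`, `geometry`, and an image URL field."""
    missing = [name for name in ("id", "geometry") if name not in fields]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    if not IMAGE_URL_FIELDS.intersection(fields):
        raise ValueError("at least one image URL field is required")
    return list(fields)


check_fields(["id", "geometry", "thumb_1024_url", "captured_at"])  # passes
```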

For example, using the `fetch_within_bbox` method of `landlens`:

In [9]:
bbox = [140.8282500,42.2625132,141.1812100,42.4647410]
start = '2019-10-22'
end = '2019-10-23'
fields = ['id', 'altitude', 'captured_at', 'camera_type', 'thumb_1024_url', 'compass_angle', 'computed_compass_angle', 'computed_geometry', 'geometry']

images = importer.fetch_within_bbox(bbox, start_date=start, end_date=end, fields=fields)
images.head()

Current bbox: [140.82825, 42.2625132, 141.18121, 42.464741]
Current bbox: [140.82825, 42.2625132, 141.00473, 42.3636271]


Exception ignored in: <function AbstractTimezoneFinder.__del__ at 0x121b203a0>
Traceback (most recent call last):
  File "/Users/iosefa/repos/misc/SU_GCPsystem/venv/lib/python3.10/site-packages/timezonefinder/timezonefinder.py", line 96, in __del__
    getattr(self, attribute_name).close()
AttributeError: 'TimezoneFinder' object has no attribute 'poly_zone_ids'


Current bbox: [140.82825, 42.2625132, 140.91649, 42.31307015]
Current bbox: [140.91649, 42.2625132, 141.00473, 42.31307015]
Current bbox: [140.82825, 42.31307015, 140.91649, 42.3636271]
Current bbox: [140.91649, 42.31307015, 141.00473, 42.3636271]


Exception: There was an error connecting to the Mapillary API. Exception: {"error":{"code":1,"message":"An unknown error occurred","error_subcode":99}}
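
The `Current bbox` lines in the output above suggest that the requested area is split recursively into quadrants, each queried separately. That reading is an inference from the log, not a description of the actual `landlens` implementation. A minimal sketch of one such split (`split_bbox` is a hypothetical name):

```python
def split_bbox(bbox):
    """Split [west, south, east, north] into four quadrants: SW, SE, NW, NE."""
    west, south, east, north = bbox
    mid_x = (west + east) / 2
    mid_y = (south + north) / 2
    return [
        [west, south, mid_x, mid_y],  # south-west
        [mid_x, south, east, mid_y],  # south-east
        [west, mid_y, mid_x, north],  # north-west
        [mid_x, mid_y, east, north],  # north-east
    ]


# The bounding box used in the cell above.
quadrants = split_bbox([140.8282500, 42.2625132, 141.1812100, 42.4647410])
south_west = quadrants[0]
```

Up to floating-point rounding, `south_west` is the second bbox printed in the log, and splitting it again gives the four boxes that follow.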

It is also important to realize that Mapillary image URLs are not permanent. For this reason, `landlens` offers a method to download Mapillary images and update the `image_url` to the new local path.

In [None]:
images.download_images_to_local(LOCAL_IMAGES, filename_column='name')
images.head()
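
Conceptually, this step saves each image and then points `image_url` at the saved copy. The sketch below shows that idea on plain dictionaries. It is not how `download_images_to_local` is implemented: `localize`, the injected `fetch` callable, and the `.jpeg` suffix are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def localize(records, directory, fetch, filename_key="name"):
    """Save each record's image under `directory` and point `image_url` at the copy.

    `fetch` is any callable that takes a URL and returns the image bytes.
    """
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    updated = []
    for record in records:
        # The ".jpeg" suffix is an assumption for this sketch.
        target = directory / f"{record[filename_key]}.jpeg"
        target.write_bytes(fetch(record["image_url"]))
        updated.append({**record, "image_url": str(target)})
    return updated


# Demonstration with a fake downloader, so no network access is needed.
with tempfile.TemporaryDirectory() as tmp:
    records = [{"name": "example", "image_url": "https://example.invalid/a.jpg"}]
    result = localize(records, tmp, fetch=lambda url: b"fake image bytes")
    saved = Path(result[0]["image_url"]).read_bytes()
```

With real URLs, `fetch` could be something like `lambda url: urllib.request.urlopen(url).read()`.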

## Loading data from arbitrary sources

It is also possible to read from any OGC-recognized vector file format, including ESRI shapefile, GeoJSON, and GeoPackage. Alternatively, you can create a `GeoImageFrame` in the same manner as a GeoPandas `GeoDataFrame` by initializing it with data, so long as the data has `name`, `image_url`, and `geometry` columns.

Data can also be imported from a PostGIS-enabled PostgreSQL database. There is more information below on creating and exporting PostgreSQL tables for `landlens`.

When reading from PostgreSQL, it can be beneficial to load only a subset of the data, especially when the database contains tens of thousands of images or more. For this purpose, there are several database utility and query functions that select a subset of the data in the database.

# Processing Images

Now that we have loaded some data, we can perform some simple processing on the images. Check the documentation for the current processing functions available. Here is an example of how `landlens` can be used to snap images to road networks.

First, we need a road network to snap our images to. `landlens` offers a helper function to download road networks from OpenStreetMap within a given bounding box.

In [None]:
bbox = snap.create_bbox(image['geometry'][0], 300, 300)
network = snap.get_osm_lines(bbox)

Then, calling `snap_to_road_network` will snap each point to the closest road in the network (within the provided threshold distance) and create a new geometry column in the `GeoImageFrame` called `snapped_geometry` to represent the snapped point.

In [None]:
snap.snap_to_road_network(image, 100, network)
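
To make the idea of snapping concrete, here is a minimal planar sketch: project the point onto every road segment, keep the nearest projection, and discard it if it lies beyond the threshold. This is not the `landlens` implementation. `project_onto_segment` and `snap_point` are hypothetical names, and the sketch assumes coordinates and threshold share the same planar units, whereas real longitude/latitude data would first need a projected coordinate system.

```python
import math


def project_onto_segment(point, start, end):
    """Return the point on segment start-end closest to `point` (planar x, y)."""
    (px, py), (ax, ay), (bx, by) = point, start, end
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment: both ends coincide
        return (ax, ay)
    # Fraction along the segment, clamped so the result stays on it.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return (ax + t * dx, ay + t * dy)


def snap_point(point, lines, threshold):
    """Snap `point` to the nearest of `lines`, or return None beyond `threshold`.

    Each line is a list of (x, y) vertices in the same units as `threshold`.
    """
    best, best_dist = None, math.inf
    for line in lines:
        for start, end in zip(line, line[1:]):
            candidate = project_onto_segment(point, start, end)
            dist = math.dist(point, candidate)
            if dist < best_dist:
                best, best_dist = candidate, dist
    return best if best_dist <= threshold else None


roads = [[(0, 0), (10, 0)], [(0, 5), (10, 5)]]
print(snap_point((5, 1), roads, threshold=2))    # (5.0, 0.0)
print(snap_point((5, 2.5), roads, threshold=2))  # None: both roads are 2.5 away
```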

# Visualizing Images

`landlens` provides a simple way to visualize a `GeoImageFrame` interactively using Folium. The `map` method of a `GeoImageFrame` plots all images as markers on a map. Clicking a marker displays the image along with any metadata selected with the `additional_properties` argument, and markers are also drawn for any additional geometries provided.

In [None]:
image.map(
    additional_properties=['altitude', 'camera_type'],
    additional_geometries=[
        {'geometry': 'computed_geometry', 'angle': 'computed_compass_angle', 'label': 'Computed'},
        {'geometry': 'snapped_geometry', 'angle': 'snapped_angle', 'label': 'Snapped'},
    ]
)

# Storing Images

`GeoImageFrame` data can be stored in a variety of formats. Because it is built on the GeoPandas `GeoDataFrame` class, it accepts any `GeoDataFrame` method for saving data. For instance, to save a table as a GeoPackage, we simply call:

In [None]:
image.to_file('image1.gpkg')

## Saving to a PostgreSQL Database

`landlens` also offers functionality to store data in a PostGIS-enabled PostgreSQL database. This is done by extending the `to_postgis` method of GeoPandas. Some constraints, such as unique `image_url` values, are applied automatically when storing data, along with some data validity checks. See the documentation for details.

To save a `GeoImageFrame` to a PostgreSQL table, you will need to first initiate a connection to a PostgreSQL database. You can do this using the `ImageDB` class:

In [None]:
db_con = ImageDB(DATABASE_URL)

Then save using `to_postgis`:

In [None]:
local_images.to_postgis(DB_TABLE, db_con.engine)

### Updating an Existing Table

When saving to PostgreSQL, you can choose how to handle existing tables. `to_postgis` offers the same `replace` and `append` options that GeoPandas offers; however, `append` requires that the incoming data does not conflict with any existing data. Instead, it is possible to "upsert" (insert or update) data into existing tables using the `upsert_images` method of `ImageDB`. You may choose to either update conflicting records or skip them by passing `"update"` or `"nothing"` as the `conflict` argument.

In [None]:
db_con.upsert_images(local_images, DB_TABLE, conflict='update')
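
The difference between the two conflict modes can be illustrated with plain SQL. The sketch below swaps in SQLite from the Python standard library for PostgreSQL so that it runs anywhere; both databases accept the `ON CONFLICT ... DO UPDATE` and `DO NOTHING` clauses used here. It is an analogy for the behaviour described above, not the code `upsert_images` runs, and the table layout is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A unique image_url mirrors the constraint mentioned earlier in this tutorial.
conn.execute("CREATE TABLE images (image_url TEXT PRIMARY KEY, altitude REAL)")
conn.execute("INSERT INTO images VALUES ('a.jpeg', 10.0)")

UPSERT_SQL = {
    "update": (
        "INSERT INTO images (image_url, altitude) VALUES (?, ?) "
        "ON CONFLICT(image_url) DO UPDATE SET altitude = excluded.altitude"
    ),
    "nothing": (
        "INSERT INTO images (image_url, altitude) VALUES (?, ?) "
        "ON CONFLICT(image_url) DO NOTHING"
    ),
}


def upsert(rows, conflict):
    """Insert `rows`, resolving duplicate image_url values as `conflict` says."""
    conn.executemany(UPSERT_SQL[conflict], rows)


def altitudes():
    """Return the table contents as {image_url: altitude}."""
    return dict(conn.execute("SELECT image_url, altitude FROM images"))


upsert([("a.jpeg", 99.0), ("b.jpeg", 20.0)], conflict="nothing")
after_nothing = altitudes()  # a.jpeg keeps 10.0; b.jpeg is inserted
upsert([("a.jpeg", 99.0)], conflict="update")
after_update = altitudes()   # a.jpeg is overwritten with 99.0
```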

### Querying an Existing Table

It is also possible to load and filter data from existing PostgreSQL tables. `landlens` offers simple filter functions that query a table and return only a subset of its rows, which can be important when working with very large datasets. For example, to load all images with an altitude greater than 90:

In [None]:
high_altitude_images = db_con.table(DB_TABLE).filter(altitude__gt=90).all()
high_altitude_images.map()
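
The `altitude__gt=90` keyword follows a `field__operator` naming pattern. As a rough illustration of how such lookups can be interpreted, the sketch below applies them to a list of dictionaries in memory. It says nothing about how `landlens` builds its database queries: `filter_rows` is a hypothetical function, and every suffix other than `gt` is an assumption modelled on Django-style lookups.

```python
import operator

# Only `gt` appears in this tutorial; the other suffixes are assumptions.
OPERATORS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
    "eq": operator.eq,
}


def filter_rows(rows, **lookups):
    """Keep the rows (dicts) that satisfy every `field__op=value` lookup."""
    conditions = []
    for key, value in lookups.items():
        field, _, suffix = key.partition("__")
        # A bare field name with no suffix means equality.
        conditions.append((field, OPERATORS[suffix or "eq"], value))
    return [
        row
        for row in rows
        if all(compare(row[field], value) for field, compare, value in conditions)
    ]


rows = [{"name": "a", "altitude": 41.8}, {"name": "b", "altitude": 234.4}]
print(filter_rows(rows, altitude__gt=90))  # [{'name': 'b', 'altitude': 234.4}]
```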