# Gleam
This notebook designs, trains, validates, and uses a convolutional neural network that predicts the population distribution of a geographical area from nighttime satellite imagery. The product is a high-resolution approximate map of that population distribution.

# Datasets

[Recommended source nighttime pictures (NOAA)](https://ngdc.noaa.gov/eog/viirs/download_dnb_composites.html)

[Recommended source population dataset (SEDAC)](http://sedac.ciesin.columbia.edu/data/set/gpw-v4-population-count-adjusted-to-2015-unwpp-country-totals-rev10)

This data needs to be processed with a GIS tool to meet a few requirements before training:
- Rasters of the same year need to be merged into one GeoTIFF file: lights on the first band, population on the second band.
- The two datasets don't have the same resolution, so upscaling the population raster to match the lights raster can inflate the population counts. This needs to be adjusted before training. For the recommended datasets, every population pixel needs to be divided by 4 after merging with gdal_merge or QGIS.
- Use a vector file to clip the raster and isolate the region of interest ([recommended vector file to clip by countries (Natural Earth)](https://www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-admin-0-countries/)). Areas with a population of zero (outside the borders or in the ocean) will be ignored during preprocessing.
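
The division by 4 can be checked on a toy array. The sketch below assumes the population raster was upscaled by a factor of 2 along each axis with nearest-neighbour resampling (an assumption inferred from the factor 4, not stated by the datasets); the values are made up.

```python
import numpy as np

# Toy population raster: people per coarse pixel (made-up values).
coarse = np.array([[8.0, 0.0],
                   [4.0, 12.0]])

# Nearest-neighbour upscaling by 2 along each axis copies every count into
# a 2x2 block of fine pixels, so the raster total is multiplied by 4.
fine = np.repeat(np.repeat(coarse, 2, axis=0), 2, axis=1)

# Dividing by the number of fine pixels per coarse pixel restores the total.
adjusted = fine / 4.0

print(coarse.sum(), fine.sum(), adjusted.sum())
```

If the resampling factor between the two rasters is different, the divisor changes accordingly.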

## Area of interest
The population distribution we want to predict is that of Colombia. The data we have on Colombia is outdated and has a low resolution. The Colombian armed conflict has been subsiding over the last decade, reducing the sense of insecurity in rural areas. People have been moving from cities to the Amazonian forest, and we can see their impact on deforestation. Nighttime imagery could allow us to quantify this migration.

# Imports

In [2]:
import rasterio
import numpy as np
import keras.layers.core as core
import keras.layers.convolutional as conv
import keras.models as models
import keras.callbacks
from sklearn.model_selection import KFold
from keras import optimizers
import time

Using TensorFlow backend.


# Preprocessing
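The preprocess function slides a square window over the input raster. For each window position it pairs the nighttime-lights tile (band 1) with the total population under that window (band 2).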

In [None]:
def preprocess(filepath, input_tile_size, offset):
    """
    :param filepath: GeoTIFF raster file with 2 bands
    :param input_tile_size: width of the square sliding window, and of the output tiles
    :param offset: distance in pixels the window will move between each tile
    :return: (X, Y) where X is a suitable array of inputs for the neural network
            and Y is the expected output for each of these inputs
    This function filters out the tiles that have zero population on them.
    """
    
    raster = rasterio.open(filepath)

    matrix_x = raster.read(1)
    matrix_y = raster.read(2)

    X = []
    Y = []
    col = 0
    while col + input_tile_size < matrix_x.shape[1]:
        row = 0
        while row + input_tile_size < matrix_x.shape[0]:
            pop = np.sum(matrix_y[row: row + input_tile_size, col: col + input_tile_size])
            # only use tiles that have people living on them
            if pop > 0:
                X.append(matrix_x[row: row + input_tile_size, col: col + input_tile_size])
                Y.append(pop)

            row += offset
        col += offset

    raster.close()
    matrix_x, matrix_y = None, None  # free some memory
    X, Y = np.array(X), np.array(Y)
    X = np.expand_dims(X, axis=3)  # add the color channel as a new dimension
    print('input shape (observations, obs_width, obs_height, channels) : ' + str(X.shape))
    return X, Y
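
To see what the sliding window produces, here is a minimal in-memory sketch that uses the same loop bounds as preprocess but reads from NumPy arrays instead of a GeoTIFF. The helper name `tile_arrays` and the toy values are hypothetical and only serve the illustration.

```python
import numpy as np

def tile_arrays(lights, population, tile_size, offset):
    """In-memory analogue of preprocess(): same loop bounds, no file I/O."""
    X, Y = [], []
    # range() stops before shape - tile_size, matching the strict '<' test
    for col in range(0, lights.shape[1] - tile_size, offset):
        for row in range(0, lights.shape[0] - tile_size, offset):
            pop = population[row:row + tile_size, col:col + tile_size].sum()
            if pop > 0:  # skip tiles with nobody living on them
                X.append(lights[row:row + tile_size, col:col + tile_size])
                Y.append(pop)
    X = np.expand_dims(np.array(X), axis=3)  # add the channel dimension
    return X, np.array(Y)

lights = np.arange(64, dtype=float).reshape(8, 8)  # made-up light values
population = np.zeros((8, 8))
population[0, 0] = 5
population[3, 3] = 3

X, Y = tile_arrays(lights, population, tile_size=4, offset=2)
print(X.shape, Y)
```

With an offset smaller than the tile size the windows overlap, so the people at pixel (3, 3) are counted in several targets. Because the loop test is a strict `<`, a window that would end exactly on the last row or column is not visited.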


# The neural network
This part of the script defines how the neural network will be initialized and creates callbacks.

In [3]:
def init_cnn(input_shape):
    # kernel size for each convolution layer
    kernel_size = (3, 3)
    
    cnn = models.Sequential()

    cnn.add(conv.Convolution2D(filters=64, kernel_size=kernel_size, activation="relu", padding='same',
                               input_shape=input_shape))
    cnn.add(conv.AveragePooling2D(strides=(2, 2)))
    
    cnn.add(conv.Convolution2D(filters=128, kernel_size=kernel_size, activation="relu", padding='same'))
    cnn.add(conv.AveragePooling2D(strides=(2, 2)))

    cnn.add(conv.Convolution2D(filters=256, kernel_size=kernel_size, activation="relu", padding='same'))
    cnn.add(conv.AveragePooling2D(strides=(2, 2)))
    
    cnn.add(core.Flatten())
    cnn.add(core.Dropout(0.5))
    cnn.add(core.Dense(128))
    cnn.add(core.Dense(1))

    cnn.compile(loss="mean_squared_error", optimizer=optimizers.Adam(lr=0.02, decay=0.0), metrics=["mse", "mae"])
    return cnn

# CALLBACKS
# reduce the learning rate when the loss stops improving
rlrp = keras.callbacks.ReduceLROnPlateau(monitor='loss', factor=0.5, patience=20, verbose=1, mode='auto', min_lr=0.0000001)

# stop training early if the loss has not improved for a longer time
early_stopping = keras.callbacks.EarlyStopping(monitor='loss', min_delta=0.0001, patience=200, verbose=1, mode='auto')
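
The layer shapes and parameter counts of init_cnn can be worked out by hand. The sketch below is plain arithmetic, not Keras code. It assumes a 32x32 single-channel input (the tile size used in the training cells) and the Keras default 2x2 pooling window; the helper names are hypothetical.

```python
def conv_params(kernel, in_channels, filters):
    # one kernel x kernel x in_channels weight block plus one bias per filter
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    # one weight per input/unit pair plus one bias per unit
    return inputs * units + units

size, channels = 32, 1  # tile width and channel count (assumed input shape)
layers = []
for filters in (64, 128, 256):
    # 'same' padding keeps width and height unchanged through the convolution
    layers.append(("conv", (size, size, filters), conv_params(3, channels, filters)))
    size //= 2  # 2x2 average pooling with stride 2 halves width and height
    channels = filters
    layers.append(("pool", (size, size, filters), 0))

flat = size * size * channels  # length of the flattened feature vector
layers.append(("dense", (128,), dense_params(flat, 128)))
layers.append(("dense", (1,), dense_params(128, 1)))

for name, shape, params in layers:
    print(name, shape, params)
print("total parameters:", sum(p for _, _, p in layers))
```

The three convolution counts agree with the `Param #` column of the model summary printed in the training output.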

# K-Fold validation
This script trains and validates the model using 4-fold cross-validation. It was used to find the best-performing neural network topology and to measure the amount of error we can expect from a trained model. The model trained on each fold is saved to the models/ subfolder under the same file name, so only the last fold's model remains on disk.
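
As a rough illustration of how the per-fold statistics are aggregated, the sketch below splits made-up targets into 4 shuffled folds with NumPy only. It is a stand-in for scikit-learn's KFold, and a "predict the training mean" placeholder replaces the CNN, so the numbers it prints say nothing about the real model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_splits = 20, 4
Y = rng.uniform(0, 1000, size=n_samples)  # made-up population counts

# Shuffle the indices once, then cut them into 4 disjoint test folds.
folds = np.array_split(rng.permutation(n_samples), n_splits)

fold_mae, fold_sae = [], []
for i, test in enumerate(folds):
    train = np.concatenate([f for j, f in enumerate(folds) if j != i])
    prediction = Y[train].mean()  # placeholder model: predict the training mean
    mae = np.abs(Y[test] - prediction).mean()
    fold_mae.append(mae)
    fold_sae.append(mae * len(test))  # sum of absolute errors = MAE * fold size

print('Mean absolute error : %.2f (std %.2f)' % (np.mean(fold_mae), np.std(fold_mae)))
print('Sum of absolute errors : %.2f (std %.2f)' % (np.mean(fold_sae), np.std(fold_sae)))
```

Every sample lands in exactly one test fold, so each fold's error is measured on data the model did not see during that fold's training.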

In [31]:
## PARAMETERS ##
nb_epoch = 2000  # maximum number of epochs
model_birthday = time.strftime("%Y-%m-%d_%H-%M-%S", time.gmtime())  # used to identify generated files (logs and models)
raster_path = '../../data/lightpop_merged/adj_2015_brazil.tif'  # raster containing the training dataset
verbose = 2  # 0: no progress message, 1: progress bar for each epoch, 2: one message for each epoch
################

# CALLBACKS
# logs for tensorboard
tensorboard = keras.callbacks.TensorBoard(log_dir="logs/" + model_birthday)

print('preprocessing')
X, Y = preprocess(raster_path, 32, 32)

print('configuring cnn')

# input dimensions
img_count, img_rows, img_cols, img_channel_count = X.shape

# k-fold split
kfold = KFold(n_splits=4, shuffle=True, random_state=None)

# initialize statistics aggregate
kfold_mse = []
kfold_mae = []
kfold_sae = []

print('logs will be saved to logs/' + model_birthday)

for train, test in kfold.split(X, Y):
    
    cnn = init_cnn((img_rows, img_cols, img_channel_count))

    print('training ...')

    cnn.fit(X[train], Y[train], batch_size=1024, epochs=nb_epoch, verbose=verbose, callbacks=[tensorboard, rlrp, early_stopping],
            sample_weight=None)

    cnn.save('models/' + model_birthday + '.h5')

    print('model saved to models/' + model_birthday + '.h5')
    
    # evaluate and print stats
    evaluation = cnn.evaluate(X[test], Y[test], verbose=2, batch_size=1024)
    evaluation = dict(zip(cnn.metrics_names, evaluation))
    kfold_mse.append(evaluation['mean_squared_error'])
    kfold_mae.append(evaluation['mean_absolute_error'])
    kfold_sae.append(evaluation['mean_absolute_error'] * len(Y[test]))
    
    print('K-fold validation results :')
    print('Mean squared error : %.2f (std %.2f)' % (np.mean(kfold_mse), np.std(kfold_mse)))
    print('Mean absolute error : %.2f (std %.2f)' % (np.mean(kfold_mae), np.std(kfold_mae)))
    print('Sum of absolute errors : %.2f (std %.2f)' % (np.mean(kfold_sae), np.std(kfold_sae)))

print('done !')


preprocessing
opening raster
input shape (observations, obs_width, obs_height, channels) : (626684, 32, 32, 1)
configuring cnn
logs will be saved to logs/2018-07-08_19-55-56
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_71 (Conv2D)           (None, 32, 32, 64)        640       
_________________________________________________________________
average_pooling2d_59 (Averag (None, 16, 16, 64)        0         
_________________________________________________________________
conv2d_72 (Conv2D)           (None, 16, 16, 128)       73856     
_________________________________________________________________
average_pooling2d_60 (Averag (None, 8, 8, 128)         0         
_________________________________________________________________
conv2d_73 (Conv2D)           (None, 8, 8, 256)         295168    
_________________________________________________________________
average_pooling2d_61 (Averag (None

Epoch 57/2000
 - 42s - loss: 129526721.8562 - mean_squared_error: 129526721.8562 - mean_absolute_error: 2070.2466
Epoch 58/2000
 - 42s - loss: 119763766.2553 - mean_squared_error: 119763766.2553 - mean_absolute_error: 2098.9206
Epoch 59/2000
 - 42s - loss: 119615696.2446 - mean_squared_error: 119615696.2446 - mean_absolute_error: 2077.6637
Epoch 60/2000
 - 42s - loss: 125506472.2529 - mean_squared_error: 125506472.2529 - mean_absolute_error: 2077.1413
Epoch 61/2000
 - 42s - loss: 122612491.5367 - mean_squared_error: 122612491.5367 - mean_absolute_error: 2117.4095
Epoch 62/2000
 - 42s - loss: 117033289.3713 - mean_squared_error: 117033289.3713 - mean_absolute_error: 2072.9828
Epoch 63/2000
 - 42s - loss: 129571199.9616 - mean_squared_error: 129571199.9616 - mean_absolute_error: 2114.6842
Epoch 64/2000
 - 42s - loss: 110293666.7416 - mean_squared_error: 110293666.7416 - mean_absolute_error: 2058.7290
Epoch 65/2000
 - 42s - loss: 116712625.7081 - mean_squared_error: 116712625.7081 - mean_

Epoch 130/2000
 - 42s - loss: 70290497.7809 - mean_squared_error: 70290497.7809 - mean_absolute_error: 1888.0046
Epoch 131/2000
 - 42s - loss: 64505028.4106 - mean_squared_error: 64505028.4106 - mean_absolute_error: 1853.8037
Epoch 132/2000
 - 42s - loss: 59575516.9622 - mean_squared_error: 59575516.9622 - mean_absolute_error: 1813.5099
Epoch 133/2000
 - 42s - loss: 61207177.5300 - mean_squared_error: 61207177.5300 - mean_absolute_error: 1809.4365
Epoch 134/2000
 - 42s - loss: 65148676.3240 - mean_squared_error: 65148676.3240 - mean_absolute_error: 1854.5170
Epoch 135/2000
 - 42s - loss: 79674990.6477 - mean_squared_error: 79674990.6477 - mean_absolute_error: 1897.1711
Epoch 136/2000
 - 42s - loss: 72223065.0538 - mean_squared_error: 72223065.0538 - mean_absolute_error: 1971.0804
Epoch 137/2000
 - 42s - loss: 62664699.0676 - mean_squared_error: 62664699.0676 - mean_absolute_error: 1889.9652
Epoch 138/2000
 - 42s - loss: 58481649.0861 - mean_squared_error: 58481649.0861 - mean_absolute_

Epoch 202/2000
 - 41s - loss: 34324079.1052 - mean_squared_error: 34324079.1052 - mean_absolute_error: 1542.6108
Epoch 203/2000
 - 41s - loss: 36082406.2713 - mean_squared_error: 36082406.2713 - mean_absolute_error: 1549.0418
Epoch 204/2000
 - 41s - loss: 32773674.0257 - mean_squared_error: 32773674.0257 - mean_absolute_error: 1521.2117
Epoch 205/2000
 - 41s - loss: 38516605.5405 - mean_squared_error: 38516605.5405 - mean_absolute_error: 1567.2291
Epoch 206/2000
 - 41s - loss: 37071836.6786 - mean_squared_error: 37071836.6786 - mean_absolute_error: 1555.7606
Epoch 207/2000
 - 41s - loss: 33008583.2840 - mean_squared_error: 33008583.2840 - mean_absolute_error: 1521.3929
Epoch 208/2000
 - 41s - loss: 34646593.4175 - mean_squared_error: 34646593.4175 - mean_absolute_error: 1526.0526
Epoch 209/2000
 - 41s - loss: 37218560.0671 - mean_squared_error: 37218560.0671 - mean_absolute_error: 1537.8283

Epoch 00209: ReduceLROnPlateau reducing learning rate to 0.004999999888241291.
Epoch 210/2000
 

 - 43s - loss: 22585043.9625 - mean_squared_error: 22585043.9625 - mean_absolute_error: 1354.3558
Epoch 274/2000
 - 43s - loss: 22488575.1456 - mean_squared_error: 22488575.1456 - mean_absolute_error: 1351.5461
Epoch 275/2000
 - 43s - loss: 21355877.0353 - mean_squared_error: 21355877.0353 - mean_absolute_error: 1341.7613
Epoch 276/2000
 - 43s - loss: 22617681.7011 - mean_squared_error: 22617681.7011 - mean_absolute_error: 1350.0131
Epoch 277/2000
 - 43s - loss: 20805750.6689 - mean_squared_error: 20805750.6689 - mean_absolute_error: 1339.9826
Epoch 278/2000
 - 43s - loss: 22564381.2303 - mean_squared_error: 22564381.2303 - mean_absolute_error: 1351.3123
Epoch 279/2000
 - 43s - loss: 22220896.3559 - mean_squared_error: 22220896.3559 - mean_absolute_error: 1346.1876
Epoch 280/2000
 - 43s - loss: 22310320.0685 - mean_squared_error: 22310320.0685 - mean_absolute_error: 1355.1108
Epoch 281/2000
 - 43s - loss: 23105502.9376 - mean_squared_error: 23105502.9376 - mean_absolute_error: 1354.185

Epoch 345/2000
 - 42s - loss: 20705101.7853 - mean_squared_error: 20705101.7853 - mean_absolute_error: 1302.7023
Epoch 346/2000
 - 42s - loss: 20516309.2995 - mean_squared_error: 20516309.2995 - mean_absolute_error: 1303.6863
Epoch 347/2000
 - 42s - loss: 19971516.9138 - mean_squared_error: 19971516.9138 - mean_absolute_error: 1311.7405
Epoch 348/2000
 - 42s - loss: 20443123.0679 - mean_squared_error: 20443123.0679 - mean_absolute_error: 1305.2395
Epoch 349/2000
 - 42s - loss: 19977712.5660 - mean_squared_error: 19977712.5660 - mean_absolute_error: 1306.7162
Epoch 350/2000
 - 42s - loss: 20384261.5913 - mean_squared_error: 20384261.5913 - mean_absolute_error: 1305.4613
Epoch 351/2000
 - 42s - loss: 20533221.9759 - mean_squared_error: 20533221.9759 - mean_absolute_error: 1309.2703
Epoch 352/2000
 - 42s - loss: 20246337.7477 - mean_squared_error: 20246337.7477 - mean_absolute_error: 1303.5608
Epoch 353/2000
 - 42s - loss: 20991261.9328 - mean_squared_error: 20991261.9328 - mean_absolute_

Epoch 417/2000
 - 42s - loss: 18359535.9443 - mean_squared_error: 18359535.9443 - mean_absolute_error: 1272.3973
Epoch 418/2000
 - 42s - loss: 19136331.6177 - mean_squared_error: 19136331.6177 - mean_absolute_error: 1282.5103
Epoch 419/2000
 - 42s - loss: 19328360.0072 - mean_squared_error: 19328360.0072 - mean_absolute_error: 1278.0329
Epoch 420/2000
 - 42s - loss: 19211175.1121 - mean_squared_error: 19211175.1121 - mean_absolute_error: 1281.6248
Epoch 421/2000
 - 42s - loss: 19406172.5953 - mean_squared_error: 19406172.5953 - mean_absolute_error: 1283.6912
Epoch 422/2000
 - 42s - loss: 18847653.9213 - mean_squared_error: 18847653.9213 - mean_absolute_error: 1281.0163
Epoch 423/2000
 - 42s - loss: 18654839.2801 - mean_squared_error: 18654839.2801 - mean_absolute_error: 1273.2196
Epoch 424/2000
 - 42s - loss: 19924110.8328 - mean_squared_error: 19924110.8328 - mean_absolute_error: 1287.1638
Epoch 425/2000
 - 42s - loss: 19675158.5918 - mean_squared_error: 19675158.5918 - mean_absolute_

 - 42s - loss: 19222538.5981 - mean_squared_error: 19222538.5981 - mean_absolute_error: 1273.1846
Epoch 489/2000
 - 42s - loss: 18851905.9334 - mean_squared_error: 18851905.9334 - mean_absolute_error: 1277.0368
Epoch 490/2000
 - 42s - loss: 18996779.3570 - mean_squared_error: 18996779.3570 - mean_absolute_error: 1274.6840
Epoch 491/2000
 - 42s - loss: 19483201.9381 - mean_squared_error: 19483201.9381 - mean_absolute_error: 1274.5341
Epoch 492/2000
 - 42s - loss: 18719485.9857 - mean_squared_error: 18719485.9857 - mean_absolute_error: 1272.2951
Epoch 493/2000
 - 42s - loss: 18936389.9594 - mean_squared_error: 18936389.9594 - mean_absolute_error: 1276.7119
Epoch 494/2000
 - 42s - loss: 18947337.6939 - mean_squared_error: 18947337.6939 - mean_absolute_error: 1269.5482
Epoch 495/2000
 - 42s - loss: 18732974.1566 - mean_squared_error: 18732974.1566 - mean_absolute_error: 1268.2369
Epoch 496/2000
 - 42s - loss: 18958333.3238 - mean_squared_error: 18958333.3238 - mean_absolute_error: 1270.788

Epoch 560/2000
 - 42s - loss: 18419301.3884 - mean_squared_error: 18419301.3884 - mean_absolute_error: 1266.4679
Epoch 561/2000
 - 42s - loss: 18295747.3564 - mean_squared_error: 18295747.3564 - mean_absolute_error: 1266.3496
Epoch 562/2000
 - 42s - loss: 18243782.2458 - mean_squared_error: 18243782.2458 - mean_absolute_error: 1264.1084
Epoch 563/2000
 - 42s - loss: 18341479.5272 - mean_squared_error: 18341479.5272 - mean_absolute_error: 1262.0077
Epoch 564/2000
 - 42s - loss: 18580495.5068 - mean_squared_error: 18580495.5068 - mean_absolute_error: 1266.4773
Epoch 565/2000
 - 42s - loss: 17741102.6781 - mean_squared_error: 17741102.6781 - mean_absolute_error: 1262.0467
Epoch 566/2000
 - 42s - loss: 18598515.4355 - mean_squared_error: 18598515.4355 - mean_absolute_error: 1269.2957
Epoch 567/2000
 - 42s - loss: 18401927.6309 - mean_squared_error: 18401927.6309 - mean_absolute_error: 1268.0796
Epoch 568/2000
 - 42s - loss: 18028734.6779 - mean_squared_error: 18028734.6779 - mean_absolute_

Epoch 630/2000
 - 42s - loss: 18091121.8192 - mean_squared_error: 18091121.8192 - mean_absolute_error: 1265.3744
Epoch 631/2000
 - 42s - loss: 18771301.4018 - mean_squared_error: 18771301.4018 - mean_absolute_error: 1266.3807
Epoch 632/2000
 - 42s - loss: 17748357.1816 - mean_squared_error: 17748357.1816 - mean_absolute_error: 1260.1783
Epoch 633/2000
 - 42s - loss: 18184154.1502 - mean_squared_error: 18184154.1502 - mean_absolute_error: 1266.1446
Epoch 634/2000
 - 42s - loss: 18866824.2254 - mean_squared_error: 18866824.2254 - mean_absolute_error: 1269.8507
Epoch 635/2000
 - 42s - loss: 18528052.0007 - mean_squared_error: 18528052.0007 - mean_absolute_error: 1268.4027
Epoch 636/2000
 - 42s - loss: 18049076.0156 - mean_squared_error: 18049076.0156 - mean_absolute_error: 1263.8259
Epoch 637/2000
 - 42s - loss: 18656880.6116 - mean_squared_error: 18656880.6116 - mean_absolute_error: 1269.0835
Epoch 638/2000
 - 42s - loss: 18821520.8574 - mean_squared_error: 18821520.8574 - mean_absolute_

Epoch 701/2000
 - 42s - loss: 18245321.4883 - mean_squared_error: 18245321.4883 - mean_absolute_error: 1263.6835
Epoch 702/2000
 - 42s - loss: 18225448.6275 - mean_squared_error: 18225448.6275 - mean_absolute_error: 1265.3906
Epoch 703/2000
 - 42s - loss: 18632319.6703 - mean_squared_error: 18632319.6703 - mean_absolute_error: 1270.8315
Epoch 704/2000
 - 42s - loss: 17881849.0836 - mean_squared_error: 17881849.0836 - mean_absolute_error: 1262.3193
Epoch 705/2000
 - 42s - loss: 17914359.9287 - mean_squared_error: 17914359.9287 - mean_absolute_error: 1264.9977
Epoch 706/2000
 - 42s - loss: 18201571.2581 - mean_squared_error: 18201571.2581 - mean_absolute_error: 1265.5589
Epoch 707/2000
 - 42s - loss: 19074654.6898 - mean_squared_error: 19074654.6898 - mean_absolute_error: 1273.9677
Epoch 708/2000
 - 42s - loss: 18780876.4548 - mean_squared_error: 18780876.4548 - mean_absolute_error: 1271.1672

Epoch 00708: ReduceLROnPlateau reducing learning rate to 1.5258788721439487e-07.
Epoch 709/2000

Epoch 7/2000
 - 42s - loss: 338416445.1521 - mean_squared_error: 338416445.1521 - mean_absolute_error: 2921.3378
Epoch 8/2000
 - 42s - loss: 281446796.8244 - mean_squared_error: 281446796.8244 - mean_absolute_error: 2818.5610
Epoch 9/2000
 - 42s - loss: 327288766.4734 - mean_squared_error: 327288766.4734 - mean_absolute_error: 2916.2389
Epoch 10/2000
 - 42s - loss: 380525713.5936 - mean_squared_error: 380525713.5936 - mean_absolute_error: 3050.6443
Epoch 11/2000
 - 42s - loss: 314730427.1533 - mean_squared_error: 314730427.1533 - mean_absolute_error: 2883.3593
Epoch 12/2000
 - 42s - loss: 297865063.5342 - mean_squared_error: 297865063.5342 - mean_absolute_error: 2845.7997
Epoch 13/2000
 - 42s - loss: 278796378.5106 - mean_squared_error: 278796378.5106 - mean_absolute_error: 2702.7251
Epoch 14/2000
 - 42s - loss: 316716365.5005 - mean_squared_error: 316716365.5005 - mean_absolute_error: 2750.9926
Epoch 15/2000
 - 42s - loss: 291435360.4067 - mean_squared_error: 291435360.4067 - mean_abs

KeyboardInterrupt: 
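
The learning rates reported by ReduceLROnPlateau in the log above can be reproduced with simple arithmetic: each reduction multiplies the rate by `factor`, and the result is clamped at `min_lr`. The sketch below uses the values from the cells above; the helper `lr_after` is hypothetical.

```python
initial_lr, factor, min_lr = 0.02, 0.5, 0.0000001  # values from the cells above

def lr_after(reductions):
    # each plateau multiplies the rate by `factor`, never going below min_lr
    return max(initial_lr * factor ** reductions, min_lr)

for n in (1, 2, 17, 18):
    print(n, lr_after(n))
```

Two halvings give the 0.005 reported at epoch 209 and seventeen give roughly 1.5259e-07, as reported at epoch 708 (the extra digits in the log presumably come from 32-bit float rounding). Any further reduction is clamped at 1e-07, at which point the optimizer barely moves.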

# Training without validation
Trains on a whole raster once, without validating. The goal is to build a model that makes the best possible prediction, so it trains on all available data, including what would otherwise be held out for validation. The resulting model is saved to the models/ subfolder.

In [33]:
## PARAMETERS ##
nb_epoch = 2000  # maximum number of epochs
model_birthday = time.strftime("%Y-%m-%d_%H-%M-%S", time.gmtime())  # used to identify generated files (logs and models)
raster_path = '../../data/lightpop_merged/adj_2015_safrica_namibia.tif'  # raster containing the training dataset
verbose = 2  # 0: no progress message, 1: progress bar for each epoch, 2: one message for each epoch
################

# CALLBACKS
# logs for tensorboard
tensorboard = keras.callbacks.TensorBoard(log_dir="logs/" + model_birthday)

# checkpoints in case of crash or interruption
checkpoint = keras.callbacks.ModelCheckpoint('models/' + model_birthday + '.h5', save_weights_only=False)

print('preprocessing')

X, Y = preprocess(raster_path, 32, 8)

print('configuring cnn')

# input dimensions
img_count, img_rows, img_cols, img_channel_count = X.shape

print('logs will be saved to logs/' + model_birthday)

cnn = init_cnn((img_rows, img_cols, img_channel_count))

print('training ...')

cnn.fit(X, Y, batch_size=1024, epochs=nb_epoch, verbose=verbose,
        callbacks=[tensorboard, checkpoint, rlrp, early_stopping], sample_weight=None)

cnn.save('models/' + model_birthday + '.h5')

print('model saved to models/' + model_birthday + '.h5')

print('done !')

preprocessing
opening raster
input shape (observations, obs_width, obs_height, channels) : (173448, 32, 32, 1)
configuring cnn
logs will be saved to logs/2018-07-10_20-28-42
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 32, 32, 64)        640       
_________________________________________________________________
average_pooling2d_1 (Average (None, 16, 16, 64)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 16, 16, 128)       73856     
_________________________________________________________________
average_pooling2d_2 (Average (None, 8, 8, 128)         0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 8, 8, 256)         295168    
_________________________________________________________________
average_pooling2d_3 (Average (None

Epoch 57/1000
 - 15s - loss: 113891432.2922 - mean_squared_error: 113891432.2922 - mean_absolute_error: 2278.4462
Epoch 58/1000
 - 15s - loss: 115449521.1241 - mean_squared_error: 115449521.1241 - mean_absolute_error: 2277.8356
Epoch 59/1000
 - 15s - loss: 120196518.8667 - mean_squared_error: 120196518.8667 - mean_absolute_error: 2299.7099
Epoch 60/1000
 - 15s - loss: 112435096.4919 - mean_squared_error: 112435096.4919 - mean_absolute_error: 2291.9676
Epoch 61/1000
 - 15s - loss: 116301279.4602 - mean_squared_error: 116301279.4602 - mean_absolute_error: 2279.6454
Epoch 62/1000
 - 15s - loss: 101649376.9738 - mean_squared_error: 101649376.9738 - mean_absolute_error: 2240.8235
Epoch 63/1000
 - 15s - loss: 109122556.7476 - mean_squared_error: 109122556.7476 - mean_absolute_error: 2313.1267
Epoch 64/1000
 - 15s - loss: 102594314.2246 - mean_squared_error: 102594314.2246 - mean_absolute_error: 2262.1055
Epoch 65/1000
 - 15s - loss: 98037540.6950 - mean_squared_error: 98037540.6950 - mean_ab

Epoch 130/1000
 - 15s - loss: 46581521.3812 - mean_squared_error: 46581521.3812 - mean_absolute_error: 1809.6965
Epoch 131/1000
 - 15s - loss: 64529654.8807 - mean_squared_error: 64529654.8807 - mean_absolute_error: 1935.7349
Epoch 132/1000
 - 15s - loss: 42951422.8435 - mean_squared_error: 42951422.8435 - mean_absolute_error: 1782.5125
Epoch 133/1000
 - 15s - loss: 58963987.2266 - mean_squared_error: 58963987.2266 - mean_absolute_error: 1913.6792
Epoch 134/1000
 - 15s - loss: 43965447.4825 - mean_squared_error: 43965447.4825 - mean_absolute_error: 1789.4336
Epoch 135/1000
 - 15s - loss: 45890459.1443 - mean_squared_error: 45890459.1443 - mean_absolute_error: 1784.4900
Epoch 136/1000
 - 15s - loss: 44009107.7762 - mean_squared_error: 44009107.7762 - mean_absolute_error: 1774.0086
Epoch 137/1000
 - 15s - loss: 59378000.9056 - mean_squared_error: 59378000.9056 - mean_absolute_error: 1880.8561
Epoch 138/1000
 - 15s - loss: 47137496.2006 - mean_squared_error: 47137496.2006 - mean_absolute_

Epoch 203/1000
 - 15s - loss: 38472568.3922 - mean_squared_error: 38472568.3922 - mean_absolute_error: 1710.1681
Epoch 204/1000
 - 15s - loss: 34347996.8739 - mean_squared_error: 34347996.8739 - mean_absolute_error: 1662.5828
Epoch 205/1000
 - 15s - loss: 38387230.5849 - mean_squared_error: 38387230.5849 - mean_absolute_error: 1683.4041
Epoch 206/1000
 - 15s - loss: 38509191.0518 - mean_squared_error: 38509191.0518 - mean_absolute_error: 1716.4284
Epoch 207/1000
 - 15s - loss: 38846397.2187 - mean_squared_error: 38846397.2187 - mean_absolute_error: 1702.3648
Epoch 208/1000
 - 15s - loss: 33424765.4455 - mean_squared_error: 33424765.4455 - mean_absolute_error: 1664.7543
Epoch 209/1000
 - 15s - loss: 33882581.9399 - mean_squared_error: 33882581.9399 - mean_absolute_error: 1663.3425
Epoch 210/1000
 - 15s - loss: 43911547.2052 - mean_squared_error: 43911547.2052 - mean_absolute_error: 1769.9402

Epoch 00210: ReduceLROnPlateau reducing learning rate to 0.009999999776482582.
Epoch 211/1000
 

 - 15s - loss: 22846804.5948 - mean_squared_error: 22846804.5948 - mean_absolute_error: 1487.2774
Epoch 275/1000
 - 16s - loss: 22113937.6003 - mean_squared_error: 22113937.6003 - mean_absolute_error: 1491.2169
Epoch 276/1000
 - 15s - loss: 22804624.7129 - mean_squared_error: 22804624.7129 - mean_absolute_error: 1484.1280
Epoch 277/1000
 - 15s - loss: 23304559.9341 - mean_squared_error: 23304559.9341 - mean_absolute_error: 1488.5047
Epoch 278/1000
 - 15s - loss: 22324629.6317 - mean_squared_error: 22324629.6317 - mean_absolute_error: 1482.1687
Epoch 279/1000
 - 15s - loss: 23106935.5377 - mean_squared_error: 23106935.5377 - mean_absolute_error: 1488.5999
Epoch 280/1000
 - 15s - loss: 23390516.7907 - mean_squared_error: 23390516.7907 - mean_absolute_error: 1496.1541
Epoch 281/1000
 - 15s - loss: 21389128.4825 - mean_squared_error: 21389128.4825 - mean_absolute_error: 1473.7053
Epoch 282/1000
 - 15s - loss: 21886554.5725 - mean_squared_error: 21886554.5725 - mean_absolute_error: 1476.413

Epoch 346/1000
 - 15s - loss: 19467059.2135 - mean_squared_error: 19467059.2135 - mean_absolute_error: 1437.4649
Epoch 347/1000
 - 15s - loss: 20115629.0306 - mean_squared_error: 20115629.0306 - mean_absolute_error: 1432.9827
Epoch 348/1000
 - 15s - loss: 19248704.8568 - mean_squared_error: 19248704.8568 - mean_absolute_error: 1433.5132
Epoch 349/1000
 - 15s - loss: 19005530.5542 - mean_squared_error: 19005530.5542 - mean_absolute_error: 1423.3730

Epoch 00349: ReduceLROnPlateau reducing learning rate to 0.0006249999860301614.
Epoch 350/1000
 - 15s - loss: 19033407.1149 - mean_squared_error: 19033407.1149 - mean_absolute_error: 1430.7823
Epoch 351/1000
 - 15s - loss: 19080906.1315 - mean_squared_error: 19080906.1315 - mean_absolute_error: 1425.8681
Epoch 352/1000
 - 15s - loss: 18940817.1399 - mean_squared_error: 18940817.1399 - mean_absolute_error: 1424.8877
Epoch 353/1000
 - 15s - loss: 20095622.7740 - mean_squared_error: 20095622.7740 - mean_absolute_error: 1429.3445
Epoch 354/1000


Epoch 418/1000
 - 16s - loss: 18903360.5546 - mean_squared_error: 18903360.5546 - mean_absolute_error: 1422.1615
Epoch 419/1000
 - 16s - loss: 19374082.7544 - mean_squared_error: 19374082.7544 - mean_absolute_error: 1419.4398
Epoch 420/1000
 - 16s - loss: 18771769.3468 - mean_squared_error: 18771769.3468 - mean_absolute_error: 1413.5035
Epoch 421/1000
 - 16s - loss: 18777432.1786 - mean_squared_error: 18777432.1786 - mean_absolute_error: 1424.3857
Epoch 422/1000
 - 16s - loss: 18724222.4956 - mean_squared_error: 18724222.4956 - mean_absolute_error: 1424.1626
Epoch 423/1000
 - 16s - loss: 19124203.7854 - mean_squared_error: 19124203.7854 - mean_absolute_error: 1424.8786
Epoch 424/1000
 - 16s - loss: 19484679.4673 - mean_squared_error: 19484679.4673 - mean_absolute_error: 1427.8443
Epoch 425/1000
 - 16s - loss: 19078968.0557 - mean_squared_error: 19078968.0557 - mean_absolute_error: 1424.5795
Epoch 426/1000
 - 16s - loss: 18807759.9892 - mean_squared_error: 18807759.9892 - mean_absolute_

 - 16s - loss: 17444357.6229 - mean_squared_error: 17444357.6229 - mean_absolute_error: 1405.9859
Epoch 490/1000
 - 16s - loss: 19134428.6378 - mean_squared_error: 19134428.6378 - mean_absolute_error: 1423.0538
Epoch 491/1000
 - 16s - loss: 18844123.1783 - mean_squared_error: 18844123.1783 - mean_absolute_error: 1422.1604
Epoch 492/1000
 - 16s - loss: 18539542.0372 - mean_squared_error: 18539542.0372 - mean_absolute_error: 1419.1037
Epoch 493/1000
 - 16s - loss: 18516148.4991 - mean_squared_error: 18516148.4991 - mean_absolute_error: 1413.3667
Epoch 494/1000
 - 16s - loss: 18216337.8457 - mean_squared_error: 18216337.8457 - mean_absolute_error: 1406.9857
Epoch 495/1000
 - 16s - loss: 17758340.9853 - mean_squared_error: 17758340.9853 - mean_absolute_error: 1406.9390
Epoch 496/1000
 - 16s - loss: 18456447.8050 - mean_squared_error: 18456447.8050 - mean_absolute_error: 1413.7431
Epoch 497/1000
 - 16s - loss: 19237997.4464 - mean_squared_error: 19237997.4464 - mean_absolute_error: 1420.031

Epoch 560/1000
 - 16s - loss: 18345558.0673 - mean_squared_error: 18345558.0673 - mean_absolute_error: 1410.1178
Epoch 561/1000
 - 16s - loss: 18562532.1681 - mean_squared_error: 18562532.1681 - mean_absolute_error: 1408.3905
Epoch 562/1000
 - 16s - loss: 17850391.6782 - mean_squared_error: 17850391.6782 - mean_absolute_error: 1407.9319
Epoch 563/1000
 - 16s - loss: 17844315.1349 - mean_squared_error: 17844315.1349 - mean_absolute_error: 1408.0557
Epoch 564/1000
 - 16s - loss: 17823892.5162 - mean_squared_error: 17823892.5162 - mean_absolute_error: 1410.1719
Epoch 565/1000
 - 16s - loss: 18193897.1391 - mean_squared_error: 18193897.1391 - mean_absolute_error: 1407.4666
Epoch 566/1000
 - 16s - loss: 18295825.6403 - mean_squared_error: 18295825.6403 - mean_absolute_error: 1413.0018
Epoch 567/1000
 - 16s - loss: 17870227.1375 - mean_squared_error: 17870227.1375 - mean_absolute_error: 1401.4599
Epoch 568/1000
 - 16s - loss: 18952349.5218 - mean_squared_error: 18952349.5218 - mean_absolute_

 - 16s - loss: 18150622.5174 - mean_squared_error: 18150622.5174 - mean_absolute_error: 1409.4957
Epoch 632/1000
 - 16s - loss: 18232871.0752 - mean_squared_error: 18232871.0752 - mean_absolute_error: 1408.0427
Epoch 633/1000
 - 16s - loss: 17905562.3337 - mean_squared_error: 17905562.3337 - mean_absolute_error: 1407.0537

Epoch 00633: ReduceLROnPlateau reducing learning rate to 2.441406195430318e-06.
Epoch 634/1000
 - 16s - loss: 17950894.4704 - mean_squared_error: 17950894.4704 - mean_absolute_error: 1408.7893
Epoch 635/1000
 - 16s - loss: 18832088.0813 - mean_squared_error: 18832088.0813 - mean_absolute_error: 1416.1146
Epoch 636/1000
 - 16s - loss: 17908718.9096 - mean_squared_error: 17908718.9096 - mean_absolute_error: 1407.5633
Epoch 637/1000
 - 16s - loss: 18127490.5538 - mean_squared_error: 18127490.5538 - mean_absolute_error: 1409.8825
Epoch 638/1000
 - 16s - loss: 18778132.8610 - mean_squared_error: 18778132.8610 - mean_absolute_error: 1421.1191
Epoch 639/1000
 - 16s - loss: 

Epoch 701/1000
 - 16s - loss: 18378874.3796 - mean_squared_error: 18378874.3796 - mean_absolute_error: 1407.0362
Epoch 702/1000
 - 16s - loss: 18733506.0777 - mean_squared_error: 18733506.0777 - mean_absolute_error: 1420.7876
Epoch 703/1000
 - 16s - loss: 18329657.1599 - mean_squared_error: 18329657.1599 - mean_absolute_error: 1406.1169
Epoch 704/1000
 - 16s - loss: 18598705.9682 - mean_squared_error: 18598705.9682 - mean_absolute_error: 1413.9145
Epoch 705/1000
 - 16s - loss: 18329140.0775 - mean_squared_error: 18329140.0775 - mean_absolute_error: 1409.7498
Epoch 706/1000
 - 16s - loss: 18088680.2017 - mean_squared_error: 18088680.2017 - mean_absolute_error: 1410.0890
Epoch 707/1000
 - 16s - loss: 18160556.0855 - mean_squared_error: 18160556.0855 - mean_absolute_error: 1409.6745
Epoch 708/1000
 - 16s - loss: 18289317.2434 - mean_squared_error: 18289317.2434 - mean_absolute_error: 1404.7252
Epoch 709/1000
 - 16s - loss: 18097936.6925 - mean_squared_error: 18097936.6925 - mean_absolute_

Epoch 772/1000
 - 16s - loss: 18506692.3462 - mean_squared_error: 18506692.3462 - mean_absolute_error: 1412.0414
Epoch 773/1000
 - 16s - loss: 18106553.4590 - mean_squared_error: 18106553.4590 - mean_absolute_error: 1411.3924
Epoch 774/1000
 - 16s - loss: 18075611.1791 - mean_squared_error: 18075611.1791 - mean_absolute_error: 1410.5285
Epoch 775/1000
 - 16s - loss: 18132635.0695 - mean_squared_error: 18132635.0695 - mean_absolute_error: 1409.8974
Epoch 776/1000
 - 16s - loss: 18100982.7595 - mean_squared_error: 18100982.7595 - mean_absolute_error: 1411.2038
Epoch 777/1000
 - 16s - loss: 17875110.4906 - mean_squared_error: 17875110.4906 - mean_absolute_error: 1414.4043
Epoch 778/1000
 - 16s - loss: 18142136.9622 - mean_squared_error: 18142136.9622 - mean_absolute_error: 1411.0208
Epoch 779/1000
 - 16s - loss: 19088933.9186 - mean_squared_error: 19088933.9186 - mean_absolute_error: 1419.7340
Epoch 780/1000
 - 16s - loss: 18864358.9269 - mean_squared_error: 18864358.9269 - mean_absolute_

Epoch 843/1000
 - 16s - loss: 17813015.6928 - mean_squared_error: 17813015.6928 - mean_absolute_error: 1411.5884
Epoch 844/1000
 - 16s - loss: 18190240.2199 - mean_squared_error: 18190240.2199 - mean_absolute_error: 1407.8465

Epoch 00844: ReduceLROnPlateau reducing learning rate to 1e-07.
Epoch 845/1000
 - 16s - loss: 18449492.1699 - mean_squared_error: 18449492.1699 - mean_absolute_error: 1411.2403
Epoch 846/1000
 - 16s - loss: 18426719.2902 - mean_squared_error: 18426719.2902 - mean_absolute_error: 1412.0341
Epoch 847/1000
 - 16s - loss: 17641312.4187 - mean_squared_error: 17641312.4187 - mean_absolute_error: 1407.7828
Epoch 848/1000
 - 16s - loss: 17952464.0782 - mean_squared_error: 17952464.0782 - mean_absolute_error: 1412.3591
Epoch 849/1000
 - 16s - loss: 18134630.1523 - mean_squared_error: 18134630.1523 - mean_absolute_error: 1408.6390
Epoch 850/1000
 - 16s - loss: 18454294.5565 - mean_squared_error: 18454294.5565 - mean_absolute_error: 1412.0715
Epoch 851/1000
 - 16s - loss: 1

Epoch 914/1000
 - 16s - loss: 18272277.0778 - mean_squared_error: 18272277.0778 - mean_absolute_error: 1410.7067
Epoch 915/1000
 - 16s - loss: 18429890.3370 - mean_squared_error: 18429890.3370 - mean_absolute_error: 1411.3965
Epoch 916/1000
 - 16s - loss: 17770336.8221 - mean_squared_error: 17770336.8221 - mean_absolute_error: 1404.9675
Epoch 917/1000
 - 16s - loss: 17873315.6643 - mean_squared_error: 17873315.6643 - mean_absolute_error: 1408.4255
Epoch 918/1000
 - 16s - loss: 18521022.5283 - mean_squared_error: 18521022.5283 - mean_absolute_error: 1411.3616
Epoch 919/1000
 - 16s - loss: 18698944.7178 - mean_squared_error: 18698944.7178 - mean_absolute_error: 1421.4834
Epoch 920/1000
 - 16s - loss: 17696737.3908 - mean_squared_error: 17696737.3908 - mean_absolute_error: 1405.7155
Epoch 921/1000
 - 16s - loss: 18912152.1268 - mean_squared_error: 18912152.1268 - mean_absolute_error: 1422.4676
Epoch 922/1000
 - 16s - loss: 17456867.7641 - mean_squared_error: 17456867.7641 - mean_absolute_

# Prediction
This script generates a high-resolution population raster from a nighttime satellite image and a trained model. It has its own preprocessing step, which only needs the nightlights band and therefore works without validation (population) data.
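
As a rough illustration of that preprocessing step, the sketch below cuts a 2D array into non-overlapping square tiles and adds the trailing channel axis a convolutional model expects. The function name `split_into_tiles` and the random test array are hypothetical. The strict `<` comparison mirrors the prediction script, so a tile that ends exactly on the raster edge is skipped.

```python
import numpy as np

def split_into_tiles(matrix, tile_size=32):
    """Cut a 2D array into non-overlapping square tiles (hypothetical helper).

    Tiles are collected column by column, in the same order as the
    prediction script, and incomplete tiles at the edges are dropped.
    """
    tiles = []
    col = 0
    while col + tile_size < matrix.shape[1]:
        row = 0
        while row + tile_size < matrix.shape[0]:
            tiles.append(matrix[row:row + tile_size, col:col + tile_size])
            row += tile_size
        col += tile_size
    # add a trailing channel axis: (n_tiles, tile_size, tile_size, 1)
    return np.expand_dims(np.array(tiles), axis=3)

# made-up nightlights band: 100 rows, 70 columns
lights = np.random.rand(100, 70).astype(np.float32)
tiles = split_into_tiles(lights)
print(tiles.shape)
```

The tile order matters: the predictions come back in the same order, so the loop that writes the output raster has to walk the tiles the same way.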

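Since the model gives only one output value for each 32x32 tile, this script decides how that value is distributed among the pixels of the tile. It assumes a logarithmic relation between nightlight intensity and population for each pixel: the light values of a tile are normalized to [0, 1], passed through `exp(x) - 1`, and rescaled so that the weights sum to 1. A tile with no light at all receives a population of 0.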
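
A minimal sketch of this weighting on a tiny 2x2 "tile" is given below. The helper name `distribute_over_tile` and the example values are made up for illustration; the arithmetic follows the prediction script.

```python
import numpy as np

def distribute_over_tile(tile_population, lights):
    """Spread one predicted tile population over the tile's pixels (hypothetical helper)."""
    lights = np.asarray(lights, dtype=np.float64)
    peak = lights.max()
    if peak <= 0:
        # completely dark tile: avoid dividing by zero, assign no population
        return np.zeros_like(lights)
    # normalize to [0, 1], then counteract the assumed log relation with exp
    weights = np.exp(lights / peak) - 1.0
    # the weights must sum to 1 so the tile total is preserved
    weights /= weights.sum()
    return tile_population * weights

# made-up light values and tile population
lights = np.array([[0.0, 1.0],
                   [2.0, 4.0]])
pixels = distribute_over_tile(1000.0, lights)
print(pixels)
print(pixels.sum())
```

Because the weights are normalized, the pixel values always add up to the tile prediction; unlit pixels get exactly 0 and the brightest pixel gets the largest share.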

The notebook "rastercomparator" can then be used to compare predicted data with actual data, and to compare predicted population counts across years.

In [22]:
## PARAMETERS ##
model_path = "models/safrica_namibia.h5"
prediction_dataset = '../../data/lightrasters_noaa/2017_colombia.tif'  # input nightlights
out = '2015_safrica_namibia_to_2017_colombia.tif'  # output population file
input_tile_size = 32
################

print('loading model')

cnn = models.load_model(model_path)

print('opening raster')

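raster = rasterio.open(prediction_dataset)
band = raster.read(1)  # nightlights band, kept for the per-pixel weighting below
profile = raster.profile
# single output band; float32 matches the dtype written at the end of the script
profile.update(count=1, dtype=rasterio.float32)
width, height = raster.width, raster.height

# preprocess: cut the band into non-overlapping square tiles
matrix_x = band  # reuse the band already in memory instead of reading it twice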
tiles_x = []
y = 0
while y + input_tile_size < matrix_x.shape[1]:
    x = 0
    while x + input_tile_size < matrix_x.shape[0]:
        tiles_x.append(matrix_x[x: x + input_tile_size, y: y + input_tile_size])
        x += input_tile_size
    y += input_tile_size
testX = np.array(tiles_x)
raster.close()
matrix_x = None
tiles_x = None
testX = np.expand_dims(testX, axis=3)

print('generating raster')

predicted_tiles = cnn.predict(testX, verbose=0)

# the dataset is already closed here, so use the dimensions saved earlier
predicted_raster = np.zeros(shape=(height, width))
y = 0
pred_index = 0
while y + input_tile_size < width:
    x = 0
    while x + input_tile_size < height:
        in_tile = band[x: x + input_tile_size, y: y + input_tile_size]
        if np.max(in_tile) <= 0:
            # avoid divisions by 0
            predicted_raster[x: x + input_tile_size, y: y + input_tile_size] = 0
        else:
            # normalize visible light between 0 and 1 to avoid overflows (also gives better results)
            weights = in_tile / np.max(in_tile)
            # visible light is perceived logarithmically => counteract with exp
            weights = np.exp(weights) - 1
            # the sum of all weights must be 1
            weights = weights / np.sum(weights)
            predicted_raster[x: x + input_tile_size, y: y + input_tile_size] = predicted_tiles[pred_index] * weights

        pred_index += 1
        x += input_tile_size
    y += input_tile_size

predicted_raster = np.array(predicted_raster)

with rasterio.open('predictions/' + out, 'w', **profile) as dst:
    dst.write(predicted_raster.astype(rasterio.float32), 1)
print("prediction saved to predictions/" + out)

print('prediction done !')


loading model
opening raster
generating raster
prediction saved to predictions/2015_safrica_namibia_to_2017_colombia.tif
prediction done !
