# The image slurper
Are you **tired** of asking people for the data behind their published false-color plots? Use the image slurper to recover the underlying matrix of values from a false-color (heat map) image and its colorbar!

The slurper can be used on many types of false-color plots, and works best on relatively smooth data.

In [None]:
%run imageslurper.py
%run rectangle_picker.py
%run plot_updater.py

As an example, here is a typical pseudoprojection plot of NASA brightness temperature data and a corresponding colorbar. This plot is not ideal: the ocean appears to be uniformly set to the lowest value, which generates sharp gradients along coastlines. Also, the lossy `jpeg` file format does not preserve color information very well.

In [None]:
file = "img/world-temp.jpg"
full_image = PIL.Image.open(file)
IPython.display.display(full_image)

Next, run the following cell and use the mouse to draw boxes around the plot area and around the colorbar area in the cell output:

In [None]:
%matplotlib notebook
select_rectangles(full_image)

If tick marks extend into the colorbar, try to avoid that region. The mouse interactions loosely crop out the map area and the colorbar by specifying `map_corners` and `colorbar_corners`. The two selection rectangles may overlap, but the map rectangle must not contain any part of the colorbar, and the colorbar rectangle must not contain any part of the map. The corner values can also be entered by hand in the cell below.

There is no more interactivity from here on, so the rest of the sheet can be run with `Cell/Run All Below`.

In [None]:
%matplotlib inline
print_picker_result()
map_corners = get_map_corners()
colorbar_corners = get_bar_corners()

The notebook should now isolate the map area and the colorbar from the rest of the image. The reconstruction expects a horizontal colorbar, so `auto_rotate` rotates the cropped colorbar if necessary. The lowest value should end up on the left.
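
To illustrate the orientation requirement, here is a minimal sketch using plain `numpy` on a synthetic vertical colorbar. It is not the notebook's `auto_rotate` (whose internals are not shown here); the array names are hypothetical, and the sketch assumes the vertical colorbar has its lowest value at the bottom.

```python
import numpy as np

# Synthetic vertical colorbar (hypothetical data): 4 rows, 2 columns, RGB.
# Row 0 (top) holds the highest level, row 3 (bottom) the lowest.
levels = np.array([30, 20, 10, 0], dtype=np.uint8)
vertical = np.repeat(levels[:, None, None], 2, axis=1)  # shape (4, 2, 1)
vertical = np.repeat(vertical, 3, axis=2)               # shape (4, 2, 3)

# A clockwise quarter turn moves the bottom edge to the left edge,
# so the lowest value ends up on the left.
horizontal = np.rot90(vertical, k=-1)

print(horizontal.shape)        # rows and columns are swapped
print(horizontal[0, :, 0])     # levels now increase from left to right
```

If a colorbar had its lowest value at the top instead, a counter-clockwise turn (`k=1`) would be the analogous choice.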

In [None]:
map_image = full_image.crop(map_corners)
colorbar_image = full_image.crop(colorbar_corners)

map_image = auto_crop(map_image, threshold=100)
colorbar_image = auto_crop(colorbar_image, threshold=100)
colorbar_image = auto_rotate(colorbar_image)

display('Map', map_image, 'Colorbar', colorbar_image)

Convert the images of the plot area and colorbar area to `numpy` arrays. Create a `numpy` RGB array containing the median RGB value for each column of the (horizontal) colorbar image, so that there is one color per position along the colorbar.
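
As a small illustration of why a median is a reasonable choice here, the following sketch builds a synthetic horizontal colorbar, corrupts one pixel (as a tick mark or compression artifact might), and shows that the median over the rows still recovers the clean colors. The array names are hypothetical and the data is made up for the example.

```python
import numpy as np

# Synthetic horizontal colorbar (hypothetical data): 5 rows, 4 columns, RGB.
# Column j has the color (10*j, 20*j, 30*j) in every row.
ramp = np.arange(4)
clean_colors = np.stack([10 * ramp, 20 * ramp, 30 * ramp], axis=-1)   # (4, 3)
colorbar_pixels = np.tile(clean_colors, (5, 1, 1)).astype(np.uint8)   # (5, 4, 3)

# Corrupt a single pixel, as a tick mark might.
colorbar_pixels[2, 1] = (255, 255, 255)

# The median over the rows (axis=0) leaves one RGB triple per column
# and ignores the single outlier in column 1.
recovered = np.median(colorbar_pixels, axis=0)

print(recovered.shape)
print(np.array_equal(recovered, clean_colors))
```

Note that `np.median` returns floating-point values even for `uint8` input.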

In [None]:
image_data = np.asarray(map_image)

assert not np.any(np.isnan(image_data)), "No NaN values expected in plot area rgb image."

colorbar_image_data = np.asarray(colorbar_image)
assert not np.any(np.isnan(colorbar_image_data)), "No NaN values expected in colorbar rgb image"

colorbar_data = np.median(colorbar_image_data, axis=0)  # Median over rows: one RGB value per column

fig, axs = plt.subplots(1, 2, figsize=(12, 6))
plot_input(image_data, colorbar_image_data, axs)
plt.show()


The reconstruction algorithm uses brute force: for each pixel in the plot area, it picks the index in the colormap where the colormap RGB value is closest to the pixel RGB value.
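
The idea can be sketched in a few lines of `numpy`. This is a simplified stand-in, not the notebook's `buffered_unmap`: the function name `nearest_color_index` is hypothetical, and it computes all distances at once, which needs memory proportional to height × width × number of colors (presumably the reason the notebook's version works in buffers).

```python
import numpy as np

def nearest_color_index(pixels, colors, norm_order=1):
    """Return, for each pixel, the index of the closest color in `colors`.

    pixels: array of shape (H, W, 3); colors: array of shape (N, 3).
    """
    # Broadcast to (H, W, N, 3): every pixel against every colorbar color.
    diff = pixels[:, :, np.newaxis, :].astype(float) - colors[np.newaxis, np.newaxis, :, :]
    # Vector norm over the RGB axis gives distances of shape (H, W, N).
    dist = np.linalg.norm(diff, ord=norm_order, axis=-1)
    return np.argmin(dist, axis=-1)

# Tiny made-up colorbar: blue, green, red.
colors = np.array([[0, 0, 255], [0, 255, 0], [255, 0, 0]], dtype=float)
pixels = np.array([[[250, 5, 0], [10, 240, 10]],
                   [[0, 0, 200], [120, 130, 0]]], dtype=np.uint8)

indices = nearest_color_index(pixels, colors)
print(indices)
```

Casting the pixels to `float` before subtracting matters: subtracting `uint8` arrays directly would wrap around instead of producing negative differences.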

In [None]:
norm_order=1
nearest_indices = buffered_unmap(image_data, 
                                 colorbar_data, 
                                 updater=plot_updater(), 
                                 norm_order=norm_order)
assert (nearest_indices.shape[:2] == image_data.shape[:2])

Scale the image to the expected range by setting `clim` to the minimum and maximum values indicated on the colorbar. Create arrays of $x,y$ values to match the original image's axes by setting `xlim` and `ylim`.
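
A minimal sketch of what such a scaling could look like, assuming the colorbar is linear between its two end values. The helper `scale_indices` is hypothetical and is not the notebook's `auto_scale`, which may handle details differently.

```python
import numpy as np

def scale_indices(indices, n_colors, clim):
    """Linearly map colorbar indices 0..n_colors-1 onto clim[0]..clim[1]."""
    fraction = indices / (n_colors - 1)
    return clim[0] + fraction * (clim[1] - clim[0])

# Made-up index image for a colorbar with 101 colors.
indices = np.array([[0, 50], [100, 25]])
values = scale_indices(indices, n_colors=101, clim=(180, 280))

# Coordinate arrays: one x per image column, one y per image row.
# y runs from 90 down to -90 because image row 0 is the top of the map.
x = np.linspace(-180, 180, indices.shape[1])
y = np.linspace(90, -90, indices.shape[0])

print(values)
```

A logarithmic or otherwise nonlinear colorbar would need a different mapping from index to value.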

In [None]:
xlim = (-180, 180)
ylim = (90, -90)
clim = (180, 280)

mapped_colors = colorbar_data[nearest_indices]
residual_rgb = image_data - mapped_colors
residual_norm = np.linalg.norm(residual_rgb, ord=norm_order, axis=-1)

x, y, scaled_image, scaled_residual = auto_scale(nearest_indices, residual_norm,
                                                 colorbar_data,
                                                 xlim=xlim,
                                                 ylim=ylim,
                                                 clim=clim)

fig, ax = plt.subplots(figsize=(14, 6))
img = ax.pcolormesh(x, y, scaled_image, cmap='viridis')
ax.set_title('Reconstructed dataset')
plt.colorbar(img, ax=ax)

fig, ax = plt.subplots(figsize=(14, 6))
ax.set_title('Reconstruction residual')
img = ax.pcolormesh(x, y, scaled_residual, cmap='magma')
plt.colorbar(img, ax=ax)

Plot a histogram of the residuals.

In [None]:
fig, ax = plt.subplots()
plot_residual_histogram(ax, norm_order, residual_norm, residual_rgb)
plt.show()

Use hole filling (if needed) to remove tick marks and contours inside the plot area. In areas where the error is larger than `error_threshold`, the value is replaced by the median of its neighbours. Set the threshold as low as possible to capture the tick marks extending into the plot area, but not so low that it also flags the interior of the plot. Leave `error_threshold` as `None` to skip hole filling.
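
The following sketch shows one way such a median fill could work. It is an assumption-laden illustration, not the notebook's `auto_hole_fill`: the function `fill_holes` is hypothetical, it uses a 3×3 neighbourhood, and it writes `NaN` where a bad pixel has no good neighbours.

```python
import numpy as np

def fill_holes(values, residual, threshold):
    """Replace pixels whose residual exceeds `threshold` by the median
    of the good pixels in their 3x3 neighbourhood."""
    bad = residual > threshold
    filled = values.astype(float).copy()
    rows, cols = values.shape
    for r, c in zip(*np.nonzero(bad)):
        # Clip the 3x3 window at the image border.
        r0, r1 = max(r - 1, 0), min(r + 2, rows)
        c0, c1 = max(c - 1, 0), min(c + 2, cols)
        window = values[r0:r1, c0:c1]
        good = window[~bad[r0:r1, c0:c1]]
        # No usable neighbours: mark the pixel as unfilled.
        filled[r, c] = np.median(good) if good.size else np.nan
    return filled

# Made-up data: the centre pixel is an outlier with a large residual.
values = np.array([[1, 2, 3], [4, 100, 6], [7, 8, 9]])
residual = np.zeros((3, 3))
residual[1, 1] = 50.0

filled = fill_holes(values, residual, threshold=10.0)
print(filled)
```

Only the flagged pixel changes; all pixels below the threshold are passed through untouched.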

In [None]:
error_threshold = None

if error_threshold is not None:
    fig, ax = plt.subplots(1, 1, figsize=(14, 6))
    ax.set_title('Bad pixels')
    ax.imshow(1.0 * (scaled_residual > error_threshold), cmap='magma')

    unmapped_filled_image = auto_hole_fill(scaled_image, scaled_residual, error_threshold)

    if np.any(np.isnan(unmapped_filled_image)):
        print("Some NaN values could not be filled.")

else:
    unmapped_filled_image = scaled_image

Plot the reconstructed map with the original colormap. The images should be very similar.

In [None]:
IPython.display.display(full_image)

fig, ax = plt.subplots(figsize=(14, 6))
original_colormap = ListedColormap(colorbar_data / 255)

img = ax.pcolormesh(x, y, unmapped_filled_image, cmap=original_colormap)
fig.colorbar(img, ax=ax)
ax.set_title('Reconstructed dataset using original colorbar')
plt.show()

Save the reconstructed image and the colorbar RGB values in a `pickle` file, then read them back as a check.
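
For reference, a pickle round trip of two arrays can be sketched as follows. This assumes the file holds a two-element tuple, which is consistent with how the cell below unpacks it, but the actual layout written by `save_pickle` is not shown here. The in-memory buffer stands in for a file on disk, and the data is made up.

```python
import io
import pickle

import numpy as np

# Made-up stand-ins for the reconstructed image and the colorbar colors.
image = np.array([[180.0, 230.0], [280.0, 205.0]])
colorbar_rgb = np.array([[0, 0, 255], [255, 0, 0]], dtype=float)

# io.BytesIO stands in for a file opened with open(path, 'wb').
buffer = io.BytesIO()
pickle.dump((image, colorbar_rgb), buffer)

# Rewind and read the tuple back, as open(path, 'rb') would allow.
buffer.seek(0)
loaded_image, loaded_colorbar = pickle.load(buffer)

print(np.array_equal(loaded_image, image))
```

Pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.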

In [None]:
# Save the result as a pickle file.
pickle_file = save_pickle(file, unmapped_filled_image, colorbar_data)
print("Saved result as " + str(pickle_file))

# This reads the objects back:
with open(pickle_file, 'rb') as load_handle:
    filled, colorbar_rgb = pickle.load(load_handle)

Or save the image as a `csv` file.
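
A file written this way can be read back with `np.loadtxt`. The sketch below uses an in-memory buffer and made-up values rather than the notebook's output; the header text is a placeholder, not what `make_header` produces.

```python
import io

import numpy as np

# Made-up stand-in for the reconstructed image.
values = np.array([[180.0, 230.5], [280.0, 205.25]])

# io.StringIO stands in for a file on disk.
buffer = io.StringIO()
np.savetxt(buffer, values, delimiter=", ", fmt="%0.6e", header="example header")

# savetxt prefixes header lines with "# ", which loadtxt skips as comments.
buffer.seek(0)
loaded = np.loadtxt(buffer, delimiter=",")

print(loaded.shape)
print(np.allclose(loaded, values))
```

The `%0.6e` format keeps seven significant digits, so the values read back match the originals only to that precision.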

In [None]:
csv_file = basename(file) + "-slurped.csv"
np.savetxt(csv_file,
           unmapped_filled_image,
           delimiter=", ",
           fmt="%0.6e",
           header=make_header(file, unmapped_filled_image.shape))
print("Saved results as " + str(csv_file))