# Compressing an image

We often need to compress a file to reduce its size.

We have two options:
- **lossy** compression, which reduces the size of the file by discarding some of its data (thus lowering quality).
- **lossless** compression, which allows the original data to be perfectly reconstructed from the compressed data (thus no loss of quality).
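
To make the difference between the two options concrete, here is a minimal sketch on a small synthetic array. The gradient "image" is made up for illustration, `zlib` stands in for a lossless codec, and dropping the low 4 bits of each pixel stands in for a lossy one; neither is what JPEG or LZW actually does internally.

```python
import zlib
import numpy as np

# Hypothetical 64x64 8-bit gradient image, used only for illustration
original = np.tile(np.arange(64, dtype=np.uint8) * 4, (64, 1))

# Lossless: compress with zlib, decompress, and rebuild the array
packed = zlib.compress(original.tobytes())
restored = np.frombuffer(zlib.decompress(packed), dtype=np.uint8).reshape(original.shape)
print("lossless round trip identical:", np.array_equal(original, restored))

# Lossy stand-in: discard the 4 least significant bits of every pixel
quantized = (original >> 4) << 4
print("lossy result identical:", np.array_equal(original, quantized))

# The discarded information cannot be recovered; measure the largest error
max_error = np.abs(original.astype(int) - quantized.astype(int)).max()
print("largest pixel error after lossy step:", max_error)
```

The lossless round trip returns exactly the same array, while the lossy step changes pixel values permanently.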

For satellite image processing, lossy compression is typically not very useful, because it reduces image quality (and quality is really important).

Many satellite data sources provide imagery as GeoTIFFs using lossless LZW compression (e.g. Planet). 
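
As a rough sketch of how such a losslessly compressed GeoTIFF could be written with rasterio, the helper below copies an existing profile and sets `compress="lzw"` for the `GTiff` driver. The function names `lzw_profile` and `write_lzw_geotiff` and the file paths in the usage note are hypothetical, chosen only for this example.

```python
def lzw_profile(profile):
    """Return a copy of a rasterio-style profile set up for LZW GeoTIFF output."""
    out = dict(profile)                       # copy so the source profile is untouched
    out.update(driver="GTiff", compress="lzw")
    return out


def write_lzw_geotiff(src_path, dst_path):
    """Copy every band of src_path into an LZW-compressed GeoTIFF at dst_path."""
    import rasterio  # imported here so lzw_profile can be used without rasterio

    with rasterio.open(src_path) as src:
        profile = lzw_profile(src.profile)
        data = src.read()                     # shape: (bands, rows, cols)

    with rasterio.open(dst_path, "w", **profile) as dst:
        dst.write(data)
```

Calling `write_lzw_geotiff("input.tif", "input_lzw.tif")` (placeholder paths) should produce a file whose pixel values are identical to the input, since LZW is lossless.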

Let's look at an example of a lossy method: we will copy an image and save it as a .jpg.

This code should look familiar, as it builds on the previous tutorials.

In [11]:
import os
import numpy as np
import rasterio

image_filename = "../week3/20190321_174348_0f1a_3B_AnalyticMS.tif"
my_image = rasterio.open(image_filename)

blue = my_image.read(1)
green = my_image.read(2)
red = my_image.read(3)
nir = my_image.read(4)

# rgb_reordered[rgb_reordered==0] = np.nan

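# Stack our bands into a numpy array with shape (rows, cols, bands)
rgb = np.dstack((red, green, blue)) 
# Move the band axis to the front: rasterio expects (bands, rows, cols)
rgb_reordered = np.moveaxis(rgb, [0, 1, 2], [1, 2, 0])

# JPEG needs 8-bit data. Note that astype('uint8') keeps only the low 8 bits
# of each value, so any value above 255 wraps around rather than being rescaled
rgb_reordered = rgb_reordered.astype('uint8')
rgb_reordered.dtype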
with rasterio.open(
    'compressed.jpg',                               #our filename
    'w',                                            #write mode
    driver='JPEG',                                  #write a .jpg
    compress='JPEG', #<---- we compress to a .jpg!
    height=rgb_reordered.shape[1],                  #specify the height of our image data
    width=rgb_reordered.shape[2],                   #specify the width of our image data
    count=rgb_reordered.shape[0],                   #number of bands present (e.g. 3 for RGB)
    dtype=rgb_reordered.dtype,                      #data type
    crs=my_image.profile['crs'],                    #coordinate reference system
    transform=my_image.profile['transform']         #affine geometry transform information
    ) as my_raster_writer:
    my_raster_writer.write(rgb_reordered)           #write the data

print('Finished writing rgb_reordered')

print('--')
old_size = os.path.getsize(image_filename) # Get the original file size
print('The size of the old file was {} bytes'.format(old_size))
print('This translates to {} Megabytes'.format(round(old_size/1e6, 1)))

print('--')
new_size = os.path.getsize("compressed.jpg") # Get the new file size
print('The size of the file is {} bytes'.format(new_size))
print('This translates to {} Megabytes'.format(round(new_size/1e6, 1)))

print('--')
difference = old_size-new_size
print('Therefore, this is a {}% reduction in file size!'.format(round(difference/old_size*100,1)))


Finished writing rgb_reordered
--
The size of the old file was 155486436 bytes
This translates to 155.5 Megabytes
--
The size of the file is 12032124 bytes
This translates to 12.0 Megabytes
--
Therefore, this is a 92.3% reduction in file size!


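However, this reduction in file size naturally comes with a loss in quality. Note also that part of the reduction comes from writing only three of the four bands and casting them to 8-bit, not from the JPEG compression alone.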
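
One avoidable source of quality loss in the cell above is the plain `astype('uint8')` cast: if the source bands hold values above 255, the cast wraps them around instead of rescaling them. The sketch below shows one common alternative, a percentile stretch to the 0-255 range. The function name `stretch_to_uint8`, the default 2nd/98th percentile limits, and the tiny demo array are assumptions for illustration, not part of the original tutorial.

```python
import numpy as np

def stretch_to_uint8(band, low_pct=2, high_pct=98):
    """Linearly rescale a band to 0-255 between two percentiles (hypothetical helper)."""
    band = band.astype("float64")
    low, high = np.percentile(band, [low_pct, high_pct])
    if high <= low:
        # Flat band: nothing to stretch
        return np.zeros(band.shape, dtype="uint8")
    scaled = np.clip((band - low) / (high - low), 0, 1)
    return (scaled * 255).round().astype("uint8")

# Made-up 16-bit values to compare the two conversions
demo = np.array([[0, 300], [1000, 4000]], dtype="uint16")

wrapped = demo.astype("uint8")             # low 8 bits only: 300 becomes 44
stretched = stretch_to_uint8(demo, 0, 100) # full min-max stretch for this tiny demo

print(wrapped)
print(stretched)
```

With the stretch, brighter input values stay brighter in the 8-bit output, which is not guaranteed after a wrapping cast.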

Navigate to the .jpg file and open it in QGIS for inspection. 
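
Besides a visual check, the loss can also be expressed as a number. The sketch below computes the peak signal-to-noise ratio (PSNR) between two arrays of the same shape; higher values mean the compressed image is closer to the original. The helper name `psnr` is an assumption for this example, and the commented usage assumes `compressed.jpg` and `rgb_reordered` from the cell above are still available.

```python
import numpy as np

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped arrays."""
    diff = reference.astype("float64") - test.astype("float64")
    mse = np.mean(diff ** 2)               # mean squared error
    if mse == 0:
        return float("inf")                # identical arrays: no loss at all
    return 10 * np.log10(max_value ** 2 / mse)

# Possible usage with the file written earlier (requires rasterio):
# import rasterio
# with rasterio.open("compressed.jpg") as src:
#     decoded = src.read()
# print(psnr(rgb_reordered, decoded))
```

A lossless format would give an infinite PSNR here, because the decoded pixels would match the written pixels exactly.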
