# Cropping

To analyze the data and make predictions for this project, a specific region must be cropped from the original raw world data fetched from WorldClim. The cropped dataset is also faster to process because each file is far smaller. The cropping demonstrated in this notebook proceeds through the following steps:
- The user adjusts the parameter values in the user variables section
- Two nested loops (over years and months) generate the file name and path for each month of data
- The dataset files at those paths are then accessed in batches
- Each batch of world .tif files is loaded with the PIL.Image module and cropped
- The crop region is computed by the 'Coordinates2Pixels' function, which converts real-world coordinates into pixel values. Those pixel values give the starting x and y locations and the side length, from which the ending locations follow
- The pixel values describe a simple square, which is easy to work with later on
- The loaded images are cropped and written to a separate directory

## 1. Setup

In [1]:
# Imports
from PIL import Image
import numpy as np

In [6]:
# User Variables & Parameters
# Control Parameters
prec_FileName           = 'wc2.1_2.5m_prec_'
tmin_FileName           = 'wc2.1_2.5m_tmin_'
tmax_FileName           = 'wc2.1_2.5m_tmax_'
prec_DataPath           = '../../WaterBucket/data/world/wc2.1_2.5m_prec/'
tmin_DataPath           = '../../WaterBucket/data/world/wc2.1_2.5m_tmin/'
tmax_DataPath           = '../../WaterBucket/data/world/wc2.1_2.5m_tmax/'
startYear               = 1961
endYear                 = 2018
numYears                = (endYear - startYear) + 1
fromMonth               = 1
toMonth                 = 12

# Output Parameters
out_PrecPath            = '../../WaterBucket/data/isb/prec/'
out_TminPath            = '../../WaterBucket/data/isb/tmin/'
out_TmaxPath            = '../../WaterBucket/data/isb/tmax/'

                        # (longitude, latitude, Area)
locationCoordiantes     = (73.084488, 33.738045, 0.405)

# Processing Parameters
batchSize               = 12 * 3
filePathList            = []
# ISB as (latitude, longitude): (33.738045, 73.084488)
# Area of ISB is 906km^2 or 2000km^2 as a square i.e. 30km * 30km or 45km * 45km respectively
# A 45km side is ~0.405 degrees in coordinate value
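
The ~0.405 figure in the comments above can be reproduced with a quick back-of-the-envelope conversion. The sketch below assumes roughly 111 km per degree of latitude, a common rough approximation that is not taken from the notebook, and the helper names are hypothetical.

```python
import math

# Assumption: ~111 km per degree of latitude (rough spherical-Earth figure).
KM_PER_DEGREE = 111.0

def km_to_degrees(distance_km):
    """Convert a north-south distance in km to degrees of latitude."""
    return distance_km / KM_PER_DEGREE

def degrees_lon_to_km(degrees, latitude_deg):
    """East-west length in km of a longitude span at the given latitude."""
    return degrees * KM_PER_DEGREE * math.cos(math.radians(latitude_deg))

side_deg = km_to_degrees(45)  # the 45 km side mentioned in the comments
east_west_km = degrees_lon_to_km(side_deg, 33.738045)

print(round(side_deg, 3))      # close to the 0.405 stored in locationCoordiantes
print(round(east_west_km, 1))  # the same span covers fewer km east-west at this latitude
```

The second value is a reminder that a square in degrees is not a square in kilometres away from the equator.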

In [7]:
# Function to calculate the pixel values from coordinates
def Coordinates2Pixels(coordValX, coordValY, areaVal):

    # Calculating coordinate pixel location (1 degree of coordVal = 24 pixels at 2.5 arc-minute resolution)
    pixelValX           = int(round( ((180 * 24) + (coordValX * 24)), 0) )
    pixelValY           = int(round( ((90 * 24) - (coordValY * 24)), 0)  )

    # Using areaVal (in degrees) as the half-width of a square box around the location
    radius              = int(round((areaVal * 24), 0))
    heightWidth         = radius * 2
    pixelStartX         = pixelValX - radius
    pixelStartY         = pixelValY - radius

    return (pixelStartX, pixelStartY, heightWidth)

print(Coordinates2Pixels(locationCoordiantes[0],
                        locationCoordiantes[1],
                        locationCoordiantes[2]))

(6064, 1340, 20)
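
To make the arithmetic behind `(6064, 1340, 20)` easier to follow, the sketch below repeats the calculation and returns the full crop bounds. The function name `coordinates_to_box` and the `pixels_per_degree` parameter are hypothetical. It assumes, as the function above implies, that the raster spans longitude -180 to 180 and latitude 90 to -90 at 24 pixels per degree.

```python
def coordinates_to_box(lon, lat, half_width_deg, pixels_per_degree=24):
    """Return (left, upper, right, lower) pixel bounds of a square around (lon, lat)."""
    # Column 0 is longitude -180 and row 0 is latitude +90 (top of the image).
    center_x = int(round((180 + lon) * pixels_per_degree))
    center_y = int(round((90 - lat) * pixels_per_degree))
    radius = int(round(half_width_deg * pixels_per_degree))
    return (center_x - radius, center_y - radius,
            center_x + radius, center_y + radius)

box = coordinates_to_box(73.084488, 33.738045, 0.405)
print(box)  # (6064, 1340, 6084, 1360)

side_px = box[2] - box[0]
print(side_px, round(side_px / 24, 2))  # side length in pixels and in degrees
```

Because the third value is applied on both sides of the center, the crop is 20 pixels (about 0.83 degrees) wide, roughly twice the 0.405 value passed in.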


## 2. Test

In [27]:
# Loading in a sample image
sampleImagePath         = '../../WaterBucket/data/world/wc2.1_2.5m_tmax/wc2.1_2.5m_tmax_2018-07.tif'
sampleImage             = Image.open(sampleImagePath)

In [28]:
# Displaying full sample image
sampleImage.show()

In [34]:
# Calculation
samplePixelCord         = Coordinates2Pixels(locationCoordiantes[0],
                                            locationCoordiantes[1],
                                            locationCoordiantes[2])
print(samplePixelCord)

(6064, 1340, 20)


In [35]:
# To crop, image.crop((x, y, x + width, y + height))
sampleCroppedImage      = sampleImage.crop((samplePixelCord[0],
                                            samplePixelCord[1],
                                            samplePixelCord[0] + samplePixelCord[2],
                                            samplePixelCord[1] + samplePixelCord[2]))
numpyArray              = np.array(sampleCroppedImage)
print(numpyArray)
np.info(numpyArray)

[[35.676674 34.314632 34.487568 34.74749  34.835403 34.723328 35.035244
  34.683167 33.38308  32.031002 31.86292  30.60284  30.130758 30.37468
  30.628626 29.400597 27.940569 25.840542 26.104513 32.024487]
 [35.27828  35.00324  34.28755  34.41522  34.87488  35.190544 35.16221
  34.209873 33.74154  33.653202 32.200867 31.352531 30.980196 30.91986
  30.53754  28.069237 26.424929 24.584623 26.66832  31.588015]
 [35.599884 34.887848 33.923534 34.018948 34.706356 35.485767 35.537174
  35.256588 34.703995 33.455406 32.206814 32.942226 32.665634 32.001045
  29.970457 26.925873 25.94529  24.880707 25.724123 31.259542]
 [35.689487 34.932457 33.11552  34.342678 35.39383  35.552986 35.35614
  34.951298 33.87045  33.269608 33.62476  33.063915 31.879072 30.306227
  29.283373 27.954514 27.241652 24.868792 27.54393  31.17107 ]
 [35.45109  33.977066 33.531506 35.218403 35.461304 35.364204 35.091106
  34.750004 34.372906 33.843807 32.78671  31.753609 31.008509 29.075409
  29.32029  28.71515  27.098013 

In [37]:
sampleCroppedImage.save('../output/test.tif')
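
As a sanity check on the crop box convention, the sketch below uses a small synthetic raster instead of the WorldClim file, so the array contents and sizes are made up. It illustrates that `Image.crop` takes `(left, upper, right, lower)` and that the result matches the NumPy slice `array[upper:lower, left:right]`.

```python
import numpy as np
from PIL import Image

# Synthetic 40x60 float32 raster standing in for a WorldClim tile (values are made up).
raster = np.arange(40 * 60, dtype=np.float32).reshape(40, 60)
image = Image.fromarray(raster)  # a 2-D float32 array becomes a mode "F" image

left, upper, side = 10, 5, 20
cropped = image.crop((left, upper, left + side, upper + side))
cropped_array = np.array(cropped)

# Rows are indexed by y (upper:lower) and columns by x (left:right).
expected = raster[upper:upper + side, left:left + side]

print(cropped_array.shape)                       # (20, 20)
print(np.array_equal(cropped_array, expected))   # True
```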

## 3. Processing

### 3.1. Precipitation

In [4]:
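# Filename generation for batch processing
# Resetting the shared list so paths from a previously processed variable are not reprocessed
filePathList            = []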
for y in range(startYear, endYear + 1):
    for m in range(fromMonth, toMonth + 1):

        # Handling case where month name is 01, 02, to 09
        monthName       = ''
        if m < 10:
            monthName   = '0' + str(m)
        else:
            monthName   = str(m)
        
        # Building the image path and adding it to the list of files to process
        fileName        = prec_FileName + str(y) + '-' + str(monthName) + '.tif'
        imagePath       = prec_DataPath + fileName
        filePathList.append(imagePath)

# Calculating pixel coordinates
pixelCord               = Coordinates2Pixels(locationCoordiantes[0],
                                            locationCoordiantes[1],
                                            locationCoordiantes[2])
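
The month padding and path assembly above can also be written more compactly. This is a minimal sketch using f-string zero padding; the function name `build_monthly_paths` is hypothetical, while the prefix and directory values are copied from the user variables.

```python
def build_monthly_paths(data_path, file_prefix, start_year, end_year,
                        from_month=1, to_month=12):
    """Return one path per month, e.g. <data_path><file_prefix>1961-01.tif."""
    return [
        f"{data_path}{file_prefix}{year}-{month:02d}.tif"  # :02d pads 1 to "01"
        for year in range(start_year, end_year + 1)
        for month in range(from_month, to_month + 1)
    ]

paths = build_monthly_paths('../../WaterBucket/data/world/wc2.1_2.5m_prec/',
                            'wc2.1_2.5m_prec_', 1961, 2018)
print(len(paths))  # 696 (58 years x 12 months)
print(paths[0])
print(paths[-1])
```

Returning a fresh list on each call also keeps paths from one variable from leaking into the next.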

In [5]:
# Batch processing
currentFileIndex        = 0
filePathListSize        = len(filePathList)

# Iterating over each file by batch
fileIndex               = 0
while fileIndex < filePathListSize:

    # File list and paths (RAM) for each batch
    batchFiles          = []
    batchPaths          = []

    # Iterating over the current set of files in batch to load
    for i in range(fileIndex, fileIndex + batchSize):

        # Extra check for the last batch in case it has fewer elements than the batch size
        if i < filePathListSize:
            
            # Loading Images
            CurrentImage = Image.open(filePathList[i])
            batchFiles.append(CurrentImage)
            
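            # Dropping the first 56 characters (input directory plus 'wc2.1_2.5m_') to keep e.g. 'prec_1961-01.tif'
            genFileName = out_PrecPath + (filePathList[i])[56:]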
            batchPaths.append(genFileName)

    print(batchPaths[0], ' to ', batchPaths[-1])
    # Going over each file of the batch
    for i in range(0, len(batchFiles)):

        # Cropping Image
        CroppedImage    = batchFiles[i].crop((pixelCord[0],
                                            pixelCord[1],
                                            pixelCord[0] + pixelCord[2],
                                            pixelCord[1] + pixelCord[2]))
        
        # Saving File
        CroppedImage.save(batchPaths[i])
    
    # Proceeding to next batch
    fileIndex = fileIndex + batchSize

../../WaterBucket/data/isb/prec/prec_1961-01.tif  to  ../../WaterBucket/data/isb/prec/prec_1963-12.tif
../../WaterBucket/data/isb/prec/prec_1964-01.tif  to  ../../WaterBucket/data/isb/prec/prec_1966-12.tif
../../WaterBucket/data/isb/prec/prec_1967-01.tif  to  ../../WaterBucket/data/isb/prec/prec_1969-12.tif
../../WaterBucket/data/isb/prec/prec_1970-01.tif  to  ../../WaterBucket/data/isb/prec/prec_1972-12.tif
../../WaterBucket/data/isb/prec/prec_1973-01.tif  to  ../../WaterBucket/data/isb/prec/prec_1975-12.tif
../../WaterBucket/data/isb/prec/prec_1976-01.tif  to  ../../WaterBucket/data/isb/prec/prec_1978-12.tif
../../WaterBucket/data/isb/prec/prec_1979-01.tif  to  ../../WaterBucket/data/isb/prec/prec_1981-12.tif
../../WaterBucket/data/isb/prec/prec_1982-01.tif  to  ../../WaterBucket/data/isb/prec/prec_1984-12.tif
../../WaterBucket/data/isb/prec/prec_1985-01.tif  to  ../../WaterBucket/data/isb/prec/prec_1987-12.tif
../../WaterBucket/data/isb/prec/prec_1988-01.tif  to  ../../WaterBucket/d
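
The slice `[56:]` in the loop above relies on the input directory plus the `wc2.1_2.5m_` prefix being exactly 56 characters long. A less brittle sketch, assuming the same file name prefix, derives the output name from the base name instead and steps through the list in batches with `range`. The helper names `output_path_for` and `iter_batches` are hypothetical.

```python
import os

def output_path_for(input_path, out_dir, strip_prefix='wc2.1_2.5m_'):
    """Map .../wc2.1_2.5m_prec_1961-01.tif to <out_dir>prec_1961-01.tif."""
    name = os.path.basename(input_path)
    if name.startswith(strip_prefix):
        name = name[len(strip_prefix):]
    return os.path.join(out_dir, name)

def iter_batches(items, batch_size):
    """Yield consecutive slices holding at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Seven years of example input paths (84 files), built only for illustration.
inputs = [f'../../WaterBucket/data/world/wc2.1_2.5m_prec/wc2.1_2.5m_prec_{y}-{m:02d}.tif'
          for y in range(1961, 1968) for m in range(1, 13)]

for batch in iter_batches(inputs, 12 * 3):
    outs = [output_path_for(p, '../../WaterBucket/data/isb/prec/') for p in batch]
    print(outs[0], ' to ', outs[-1])
```

The last batch is simply shorter, so no extra bounds check is needed inside the loop.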

### 3.2. Minimum Temperature

In [8]:
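# Filename generation for batch processing
# Resetting the shared list so paths from a previously processed variable are not reprocessed
filePathList            = []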
for y in range(startYear, endYear + 1):
    for m in range(fromMonth, toMonth + 1):

        # Handling case where month name is 01, 02, to 09
        monthName       = ''
        if m < 10:
            monthName   = '0' + str(m)
        else:
            monthName   = str(m)
        
        # Building the image path and adding it to the list of files to process
        fileName        = tmin_FileName + str(y) + '-' + str(monthName) + '.tif'
        imagePath       = tmin_DataPath + fileName
        filePathList.append(imagePath)

# Calculating pixel coordinates
pixelCord               = Coordinates2Pixels(locationCoordiantes[0],
                                            locationCoordiantes[1],
                                            locationCoordiantes[2])
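
The loop above builds paths without confirming that the files are present. If that check is wanted, a minimal sketch with `os.path.isfile` could look like the following; the function name `split_existing` is hypothetical and the example path is a placeholder.

```python
import os

def split_existing(paths):
    """Split paths into (found, missing) according to os.path.isfile."""
    found, missing = [], []
    for path in paths:
        if os.path.isfile(path):
            found.append(path)
        else:
            missing.append(path)
    return found, missing

# Placeholder path that is not expected to exist on disk.
found, missing = split_existing(['placeholder_dir/wc2.1_2.5m_tmin_1961-01.tif'])
print(len(found), 'found,', len(missing), 'missing')
```

Running this before the batch loop would surface gaps in the download early instead of failing partway through a batch.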

In [9]:
# Batch processing
currentFileIndex        = 0
filePathListSize        = len(filePathList)

# Iterating over each file by batch
fileIndex               = 0
while fileIndex < filePathListSize:

    # File list and paths (RAM) for each batch
    batchFiles          = []
    batchPaths          = []

    # Iterating over the current set of files in batch to load
    for i in range(fileIndex, fileIndex + batchSize):

        # Extra check for the last batch in case it has fewer elements than the batch size
        if i < filePathListSize:
            
            # Loading Images
            CurrentImage = Image.open(filePathList[i])
            batchFiles.append(CurrentImage)
            
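            # Dropping the first 56 characters (input directory plus 'wc2.1_2.5m_') to keep e.g. 'tmin_1961-01.tif'
            genFileName = out_TminPath + (filePathList[i])[56:]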
            batchPaths.append(genFileName)

    print(batchPaths[0], ' to ', batchPaths[-1])
    # Going over each file of the batch
    for i in range(0, len(batchFiles)):

        # Cropping Image
        CroppedImage    = batchFiles[i].crop((pixelCord[0],
                                            pixelCord[1],
                                            pixelCord[0] + pixelCord[2],
                                            pixelCord[1] + pixelCord[2]))
        
        # Saving File
        CroppedImage.save(batchPaths[i])
    
    # Proceeding to next batch
    fileIndex = fileIndex + batchSize

../../WaterBucket/data/isb/tmin/tmin_1961-01.tif  to  ../../WaterBucket/data/isb/tmin/tmin_1963-12.tif
../../WaterBucket/data/isb/tmin/tmin_1964-01.tif  to  ../../WaterBucket/data/isb/tmin/tmin_1965-12.tif


### 3.3. Maximum Temperature

In [8]:
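# Filename generation for batch processing
# Resetting the shared list so paths from a previously processed variable are not reprocessed
filePathList            = []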
for y in range(startYear, endYear + 1):
    for m in range(fromMonth, toMonth + 1):

        # Handling case where month name is 01, 02, to 09
        monthName       = ''
        if m < 10:
            monthName   = '0' + str(m)
        else:
            monthName   = str(m)
        
        # Building the image path and adding it to the list of files to process
        fileName        = tmax_FileName + str(y) + '-' + str(monthName) + '.tif'
        imagePath       = tmax_DataPath + fileName
        filePathList.append(imagePath)

# Calculating pixel coordinates
pixelCord               = Coordinates2Pixels(locationCoordiantes[0],
                                            locationCoordiantes[1],
                                            locationCoordiantes[2])

In [9]:
# Batch processing
currentFileIndex        = 0
filePathListSize        = len(filePathList)

# Iterating over each file by batch
fileIndex               = 0
while fileIndex < filePathListSize:

    # File list and paths (RAM) for each batch
    batchFiles          = []
    batchPaths          = []

    # Iterating over the current set of files in batch to load
    for i in range(fileIndex, fileIndex + batchSize):

        # Extra check for the last batch in case it has fewer elements than the batch size
        if i < filePathListSize:
            
            # Loading Images
            CurrentImage = Image.open(filePathList[i])
            batchFiles.append(CurrentImage)
            
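            # Dropping the first 56 characters (input directory plus 'wc2.1_2.5m_') to keep e.g. 'tmax_1961-01.tif'
            genFileName = out_TmaxPath + (filePathList[i])[56:]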
            batchPaths.append(genFileName)

    print(batchPaths[0], ' to ', batchPaths[-1])
    # Going over each file of the batch
    for i in range(0, len(batchFiles)):

        # Cropping Image
        CroppedImage    = batchFiles[i].crop((pixelCord[0],
                                            pixelCord[1],
                                            pixelCord[0] + pixelCord[2],
                                            pixelCord[1] + pixelCord[2]))
        
        # Saving File
        CroppedImage.save(batchPaths[i])
    
    # Proceeding to next batch
    fileIndex = fileIndex + batchSize

../../WaterBucket/data/isb/tmax/tmax_1961-01.tif  to  ../../WaterBucket/data/isb/tmax/tmax_1963-12.tif
../../WaterBucket/data/isb/tmax/tmax_1964-01.tif  to  ../../WaterBucket/data/isb/tmax/tmax_1966-12.tif
../../WaterBucket/data/isb/tmax/tmax_1967-01.tif  to  ../../WaterBucket/data/isb/tmax/tmax_1969-12.tif
../../WaterBucket/data/isb/tmax/tmax_1970-01.tif  to  ../../WaterBucket/data/isb/tmax/tmax_1972-12.tif
../../WaterBucket/data/isb/tmax/tmax_1973-01.tif  to  ../../WaterBucket/data/isb/tmax/tmax_1975-12.tif
../../WaterBucket/data/isb/tmax/tmax_1976-01.tif  to  ../../WaterBucket/data/isb/tmax/tmax_1978-12.tif
../../WaterBucket/data/isb/tmax/tmax_1979-01.tif  to  ../../WaterBucket/data/isb/tmax/tmax_1981-12.tif
../../WaterBucket/data/isb/tmax/tmax_1982-01.tif  to  ../../WaterBucket/data/isb/tmax/tmax_1984-12.tif
../../WaterBucket/data/isb/tmax/tmax_1985-01.tif  to  ../../WaterBucket/data/isb/tmax/tmax_1987-12.tif
../../WaterBucket/data/isb/tmax/tmax_1988-01.tif  to  ../../WaterBucket/d
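
Sections 3.1 to 3.3 repeat the same cells with different prefixes and paths. One possible way to fold them together is a small table of settings that is iterated once per variable. The sketch below only plans the work and touches no files; the `VARIABLES` table and the `plan_jobs` function are hypothetical names, and the directories are copied from the user variables.

```python
# Input and output directories per variable, copied from the user variables.
VARIABLES = {
    'prec': ('../../WaterBucket/data/world/wc2.1_2.5m_prec/',
             '../../WaterBucket/data/isb/prec/'),
    'tmin': ('../../WaterBucket/data/world/wc2.1_2.5m_tmin/',
             '../../WaterBucket/data/isb/tmin/'),
    'tmax': ('../../WaterBucket/data/world/wc2.1_2.5m_tmax/',
             '../../WaterBucket/data/isb/tmax/'),
}

def plan_jobs(variable, start_year, end_year):
    """Return (input_path, output_path) pairs for one variable, one per month."""
    in_dir, out_dir = VARIABLES[variable]
    jobs = []
    for year in range(start_year, end_year + 1):
        for month in range(1, 13):
            stamp = f'{year}-{month:02d}'
            jobs.append((f'{in_dir}wc2.1_2.5m_{variable}_{stamp}.tif',
                         f'{out_dir}{variable}_{stamp}.tif'))
    return jobs

for variable in VARIABLES:
    jobs = plan_jobs(variable, 1961, 2018)
    print(variable, len(jobs), jobs[0][1], ' to ', jobs[-1][1])
```

Each pair could then be handed to the same open, crop, and save steps used in the cells above.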