# MOC stream function analysis script

This script performs several analysis steps that are, within limits, configurable to the user's needs. The steps are listed here and documented below, next to the cells that run them. For convenience, the configuration for all analysis steps is located in the first code cell below.

The tasks performed by this script are:
1. Create a plot for each timestep in the netCDF file, both for the global MOC and for all specified regions. Then create a video of these plots for each region.
2. For each timestep, create a plot of the difference between two timesteps, then create a video of those plots. Global MOC only.
3. Calculate and plot the average, minimum, maximum and standard deviation of the MOC stream function for each bin. Global MOC only.
4. Compare the values of the MOC stream function for each timestep within an interval _i_ to the averaged value for that interval and plot the differences. Both for the global and regional MOC.

## Configuration

*outputStyle* - Determines how the MOC streamfunction plots will be presented. Possible styles are _pdf_, _video_ and _both_. The default is _video_.

*subPlotsPerRow* - Determines the layout of the regional subplots (pdf files only). The default is 3.

*regionPlotPadding* - Determines how many bins outside the actual latitude range of each region are displayed in each regional plot. The default is 3.

*dataFilePath* - The path to the datafile used. The datafile must contain all the variables of the mocStreamfunction output stream.

*outputFileName* - The name of the output pdf file (pdf files only).

*outputDir* - The name of the general output directory. This is where your projects will be saved.

*diffDir* - The name of the folder in which the difference calculation results will be stored. The default is 'differences'.

*projectName* - The name of the current project. This determines the subfolder of _outputDir_ in which all generated files will be stored.

*framesPerSecond* - Global frame rate used to encode the videos. The default is 10. Values can be any floating point number.

*differenceOutputFileName* - The name of the PDF file in which the difference plots will be saved.

*timeAverage* - Determines whether timesteps are averaged for each plot and, if so, how many.
With `stepping = 1`, the resulting number of plots will be `timestepsInRecord - timeAverage + 1`. The default is 1 (no time averaging).

*stepping* - Determines which plots will be created. `stepping = n` means that a plot is created every _n_ th timestep, averaging over _timeAverage_ timesteps.

*colorScaleName* - Determines the color scale used for all contour plots. The default is plasma. A list of available color scale names is given in the comment above the assignment in the configuration cell.

*crossSectionConfig* - Configures the regions and latitudes for which cross section plots through the MOC are created. Keys are region names; values are lists of latitudes in degrees.
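
To make the interplay of _timeAverage_ and _stepping_ concrete, the following sketch lists which timesteps end up in which plot. It mirrors the loop bounds used by the plotting functions further down (`range(0, len(xtime) - timeAverage + 1, stepping)`); the helper name `plot_windows` and the sample numbers are illustrative assumptions, not part of the script.

```python
def plot_windows(num_timesteps, time_average=1, stepping=1):
    """Return one (first, last) timestep index pair per generated plot."""
    # The last possible window must still contain time_average timesteps
    end = num_timesteps - time_average + 1
    return [(start, start + time_average - 1)
            for start in range(0, end, stepping)]


# 10 timesteps, averaging over 3 timesteps, one plot every 2nd timestep
windows = plot_windows(num_timesteps=10, time_average=3, stepping=2)
print(windows)  # [(0, 2), (2, 4), (4, 6), (6, 8)]
```

With the defaults (`timeAverage = 1`, `stepping = 1`) every timestep gets its own plot.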

In [None]:
subPlotsPerRow = 3
regionPlotPadding = 10
outputStyle = 'video'
dataFilePath = '/Users/nilsfeige/30to60yearlong/mocStreamfunction.0007-01-01.nc'
outputFileName = 'mocDaily.pdf'
outputDir = 'outputs'
diffDir = 'differences'
projectName = 'signedEdge30To60year8'
framesPerSecond = 5
differenceOutputFileName = 'mocDifference.pdf'
timeAverage = 1
stepping = 1
timeStepsPerDay = 8
crossSectionConfig = {'Atlantic' : [26.5, 66]}
# Accent, Accent_r, Blues, Blues_r, BrBG, BrBG_r, BuGn, BuGn_r, BuPu, BuPu_r, CMRmap, 
# CMRmap_r, Dark2, Dark2_r, GnBu, GnBu_r, Greens, Greens_r, Greys, Greys_r, OrRd, OrRd_r, 
# Oranges, Oranges_r, PRGn, PRGn_r, Paired, Paired_r, Pastel1, Pastel1_r, Pastel2, Pastel2_r,
# PiYG, PiYG_r, PuBu, PuBuGn, PuBuGn_r, PuBu_r, PuOr, PuOr_r, PuRd, PuRd_r, Purples, 
# Purples_r, RdBu, RdBu_r, RdGy, RdGy_r, RdPu, RdPu_r, RdYlBu, RdYlBu_r, RdYlGn, RdYlGn_r, 
# Reds, Reds_r, Set1, Set1_r, Set2, Set2_r, Set3, Set3_r, Spectral, Spectral_r, Wistia, 
# Wistia_r, YlGn, YlGnBu, YlGnBu_r, YlGn_r, YlOrBr, YlOrBr_r, YlOrRd, YlOrRd_r, afmhot, 
# afmhot_r, autumn, autumn_r, binary, binary_r, bone, bone_r, brg, brg_r, bwr, bwr_r, cool, 
# cool_r, coolwarm, coolwarm_r, copper, copper_r, cubehelix, cubehelix_r, flag, flag_r, 
# gist_earth, gist_earth_r, gist_gray, gist_gray_r, gist_heat, gist_heat_r, gist_ncar, 
# gist_ncar_r, gist_rainbow, gist_rainbow_r, gist_stern, gist_stern_r, gist_yarg, gist_yarg_r, 
# gnuplot, gnuplot2, gnuplot2_r, gnuplot_r, gray, gray_r, hot, hot_r, hsv, hsv_r, inferno, 
# inferno_r, jet, jet_r, magma, magma_r, nipy_spectral, nipy_spectral_r, ocean, ocean_r, pink, 
# pink_r, plasma, plasma_r, prism, prism_r, rainbow, rainbow_r, seismic, seismic_r, spectral, 
# spectral_r, spring, spring_r, summer, summer_r, terrain, terrain_r, viridis, viridis_r, 
# winter or winter_r
# Note: 'spectral' was removed in matplotlib 2.2; use 'nipy_spectral' on newer versions.
colorScaleName = 'spectral'

In [None]:
%matplotlib inline

import sys, math
import numpy as np
import netCDF4 as nc
import matplotlib.pyplot as plt
from IPython.display import set_matplotlib_formats
from array import *
from math import ceil
set_matplotlib_formats('png', 'pdf')
from colormap import *
from matplotlib.backends.backend_pdf import PdfPages
from datetime import *
import os
from subprocess import call
from statistics import mean, stdev
import multiprocessing

In [None]:
ct = datetime.now()
batchSize = 50

def getNextBatch(queue):
    retval = queue.get()
    queue.put(retval + batchSize)
    return retval

## Data loading and preprocessing

Here the required data for all calculations is loaded from the netCDF file. Additionally, the currently selected region group and the (bin) boundaries of all regions are calculated.
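
The region boundaries are stored in radians and are not guaranteed to be ordered as (min, max). A minimal sketch of the same preprocessing with NumPy, assuming a plain (unmasked) array; the sample values stand in for `ds['minMaxLatRegion'][:]` and are made up for illustration:

```python
import numpy as np

# Hypothetical stand-in for ds['minMaxLatRegion'][:], in radians.
# The second row is deliberately stored as (max, min).
region_boundaries_rad = np.array([[-0.6, 1.2],
                                  [1.0, -0.3]])

# Convert to degrees, equivalent to multiplying by 180 / pi
region_boundaries_deg = np.degrees(region_boundaries_rad)

# Sorting along axis 1 puts the smaller latitude first in every row,
# which replaces the explicit swap loop
region_boundaries_deg = np.sort(region_boundaries_deg, axis=1)

print(region_boundaries_deg)
```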

In [None]:
ds = nc.Dataset(dataFilePath)
mocStreamvalLatAndDepth = ds['mocStreamvalLatAndDepth'][:]
binBoundaries = ds['binBoundaryMocStreamfunction'][:]*180/math.pi
xtime = nc.chartostring(ds['xtime'][:])

regionsInGroup = ds['regionsInGroup'][:]
nRegionsInGroup = ds['nRegionsInGroup'][:]
regionNames = nc.chartostring(ds['regionNames'][:])
regionGroupNames = nc.chartostring(ds['regionGroupNames'][:])
regionMocData = ds['mocStreamvalLatAndDepthRegion'][:]
regionBoundaries = ds['minMaxLatRegion'][:]*180/math.pi

additionalGroupName = getattr(ds,'config_AM_mocStreamfunction_additionalRegion')
nRegionGroups = ds.dimensions['nRegionGroups'].size

for i in range(len(regionBoundaries[:,0])):
    if (regionBoundaries[i, 0] > regionBoundaries[i, 1]):
        temp = regionBoundaries[i, 0]
        regionBoundaries[i, 0] = regionBoundaries[i, 1]
        regionBoundaries[i, 1] = temp
    

for i in range(len(regionNames)):
    regionNames[i] = regionNames[i].strip()
    
for i in range(len(regionGroupNames)):
    regionGroupNames[i] = regionGroupNames[i].strip()
    
regionNumber = -1

for i in range(len(regionGroupNames)):
    if (regionGroupNames[i].decode('utf8') == additionalGroupName):
        curRegionGroup = i

numRegionsInCurGroup = nRegionsInGroup[curRegionGroup]

for i in range(len(xtime)):
    xtime[i] = xtime[i].strip()
    
rBD = ds['refBottomDepth'][:]*-1

regionBinNumber = [[0 for x in range(numRegionsInCurGroup)] for y in range(2)] 

for i in range(len(binBoundaries)-1): # for every bin
    for j in range(numRegionsInCurGroup): # and every region in the current group
        if regionBoundaries[j][0] > binBoundaries[i]: # find min
            regionBinNumber[0][j] = max(0, i - 1 - regionPlotPadding)
        if regionBoundaries[j][1] > binBoundaries[i + 1]: # find max
            regionBinNumber[1][j] = min(len(binBoundaries) - 1, i + regionPlotPadding)
            
numTimesteps = len(xtime)
            
# print(numTimesteps)
# print(regionBoundaries.shape)
# print(regionBoundaries)
# print(regionBinNumber)
# print(xtime.shape)
# print(mocStreamvalLatAndDepth.shape)
# print(binBoundaries.shape)
# print(rBD.shape)
# print(regionBoundaries.shape)
print(regionMocData.shape)
# print(additionalGroupName)
# print(regionsInGroup)
# print(nRegionsInGroup)
# print(regionNames)
# print(regionGroupNames)
# print(nRegionGroups)
# print(curRegionGroup)
# print(numRegionsInCurGroup)

## Folder creation

As preparation for all outputs, the folders are created here.
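
Nested output folders can also be created in one call with `os.makedirs(..., exist_ok=True)`, which creates missing parent folders and does nothing if the folder already exists. The sketch below is an alternative under that assumption; the helper `ensure_dirs` and the temporary base folder are illustrative and not part of the script.

```python
import os
import tempfile


def ensure_dirs(project_dir, subdirs):
    """Create every subfolder of project_dir, including missing parents."""
    for sub in subdirs:
        os.makedirs(os.path.join(project_dir, sub), exist_ok=True)


# A throwaway base folder so the example does not touch real outputs
base = tempfile.mkdtemp()
project_dir = os.path.join(base, 'outputs', 'demoProject')

ensure_dirs(project_dir, [os.path.join('MOC', 'global'),
                          'differences', 'avgVSsingle'])
# Calling it again is harmless because of exist_ok=True
ensure_dirs(project_dir, [os.path.join('MOC', 'global')])

print(sorted(os.listdir(project_dir)))  # ['MOC', 'avgVSsingle', 'differences']
```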

In [None]:
if not os.path.isdir(outputDir):
    os.mkdir(outputDir)

projectDir = outputDir + '/' + projectName + '/'

if not os.path.isdir(projectDir):
    os.mkdir(projectDir)
    
# The video plots are saved to MOC/global, so create that folder (and its parent)
if not os.path.isdir(projectDir + 'MOC/global'):
    os.makedirs(projectDir + 'MOC/global')
    
if not os.path.isdir(projectDir + diffDir):
    os.mkdir(projectDir + diffDir)
    
if not os.path.isdir(projectDir + 'avgVSsingle'):
    os.mkdir(projectDir + 'avgVSsingle')


## Transport plot generation
Creates a plot for every timestep and every region. Plotted is the transport through the transect that borders the region, against depth.
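
The per-layer transport is obtained as the difference of the stream function between neighbouring depth levels at the first latitude bin of the region. A small sketch of that step with `np.diff`; the values of `psi` are invented for illustration:

```python
import numpy as np

# Hypothetical stream function values at one latitude bin, top to bottom
psi = np.array([0.0, 2.0, 5.0, 4.0, 1.0])

# psi[k + 1] - psi[k] for every pair of neighbouring depth levels,
# the same as psi[1:] - psi[:-1]
transport_per_layer = np.diff(psi)

print(transport_per_layer)  # [ 2.  3. -1. -3.]
```

The result has one entry fewer than the number of depth levels, which is why the plot uses `rBD[:-1]` on the depth axis.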

In [None]:
transportDir = projectDir + 'transports/'

if not os.path.isdir(transportDir):
    os.mkdir(transportDir)
    
curFig = plt.figure()
    
lastPercent = 0
for i in range(numTimesteps):
    newPercent = 100 * i // numTimesteps
    if newPercent > lastPercent:
        print(str(newPercent) + '% of transport plots generated')
        lastPercent = newPercent
    for j in range(numRegionsInCurGroup):
        regionName = regionNames[regionsInGroup[curRegionGroup][j] - 1].decode('utf8')
        if not os.path.isdir(transportDir + regionName):
            os.mkdir(transportDir + regionName)
        curPlotData = regionMocData[i,j,1:,0] - regionMocData[i,j,:-1,0]
        curFig.clf()
        axes = curFig.gca()
        axes.plot(curPlotData, rBD[:-1])
        curFig.savefig(transportDir + regionName + '/transport' + str(i).zfill(5) + '.png',
                      format = 'png')

## Cross section plot generation
Calculates a cross section through the MOC at given latitudes.
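
Finding the bin that belongs to a latitude is a lookup in a sorted array. The following sketch uses `np.searchsorted` instead of a linear scan; it assumes the bin boundaries are sorted in ascending order and given in degrees. The function name and the one-degree sample boundaries are illustrative assumptions.

```python
import numpy as np


def bin_index_for_latitude(bin_boundaries, latitude):
    """Return the largest index i with bin_boundaries[i] <= latitude."""
    if latitude < bin_boundaries[0] or latitude > bin_boundaries[-1]:
        raise ValueError('Latitude value ' + str(latitude) + ' out of bounds')
    # side='right' counts all boundaries <= latitude
    return int(np.searchsorted(bin_boundaries, latitude, side='right')) - 1


# Hypothetical boundaries every 1 degree from -90 to 90
boundaries = np.arange(-90.0, 91.0, 1.0)

print(bin_index_for_latitude(boundaries, 26.5))  # 116
print(bin_index_for_latitude(boundaries, 66))    # 156
```

Raising an exception for out-of-range latitudes, rather than printing a warning, is a design choice of this sketch; it avoids indexing the data with `None` later on.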

In [None]:
def regionNumberByName(name):
    retval = -1
    for j in range(numRegionsInCurGroup):
        regionName = regionNames[regionsInGroup[curRegionGroup][j] - 1].decode('utf8')
        if name == regionName:
            retval = j
    if retval == -1:
        print('WARNING: No region with name ' + name + ' in current region group.')
        print('The regions in the current group are:')
        print([regionNames[regionsInGroup[curRegionGroup][j] - 1].decode('utf8') 
               for j in range(numRegionsInCurGroup)])
    return retval

def binNumberByLatitude(latitude):
    if latitude < binBoundaries[0] or latitude > binBoundaries[-1]:
        print('WARNING: Latitude value ' + str(latitude) + ' out of bounds')
    # numBins was never defined; derive the loop bound from the bin boundaries
    for i in range(len(binBoundaries) - 1):
        if binBoundaries[i] == latitude: return i
        elif binBoundaries[i + 1] == latitude: return i + 1
        elif binBoundaries[i] < latitude < binBoundaries[i + 1]: return i
    
crossSectionDir = projectDir + 'crossSections/'

if not os.path.isdir(crossSectionDir):
    os.mkdir(crossSectionDir)
    
curFig = plt.figure()

lastPercent = 0
for regionName in crossSectionConfig:
    for i in range(numTimesteps):
        newPercent = 100 * i // numTimesteps
        if newPercent > lastPercent:
            print(str(newPercent) + '% of cross section plots generated')
            lastPercent = newPercent

        regionNumber = regionNumberByName(regionName)

        for j in crossSectionConfig[regionName]:
            if not os.path.isdir(crossSectionDir + regionName):
                os.mkdir(crossSectionDir + regionName)
            binNumber = binNumberByLatitude(j)
            curPlotData = regionMocData[i,regionNumber,:,binNumber]
            curFig.clf()
            axes = curFig.gca()
            axes.plot(curPlotData, rBD)
            curFig.savefig(crossSectionDir + regionName + '/' + regionName + str(j) + 
                           'section' + str(i).zfill(5) + '.png', format = 'png')

## Average vs SinglePlot comparison

For a given list of intervals, the average of the MOC stream function over that interval is calculated. Then, every timestep within that interval is compared to the average and the results, currently only min and max deviations, are stored. This is done for every timestep and every region. Calculations therefore may take a long time.
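
The core of this comparison can be sketched compactly with NumPy: group the timesteps into blocks of one interval, average each block, and take the extremes of `average - timestep`. This is a simplified illustration on a synthetic array of shape (time, depth, latitude), not the script's implementation; `interval_deviation_extremes` is a hypothetical helper.

```python
import numpy as np


def interval_deviation_extremes(field, steps_per_interval):
    """Return per-interval (min, max) of interval average minus timestep."""
    n_intervals = field.shape[0] // steps_per_interval
    # Drop trailing timesteps that do not fill a whole interval
    trimmed = field[:n_intervals * steps_per_interval]
    blocks = trimmed.reshape(n_intervals, steps_per_interval, *field.shape[1:])
    averages = blocks.mean(axis=1, keepdims=True)
    deviations = averages - blocks
    return deviations.min(axis=(1, 2, 3)), deviations.max(axis=(1, 2, 3))


# Synthetic data: 8 timesteps, 1 depth level, 1 latitude bin
field = np.arange(8, dtype=float).reshape(8, 1, 1)
dev_min, dev_max = interval_deviation_extremes(field, steps_per_interval=4)
print(dev_min, dev_max)  # [-1.5 -1.5] [1.5 1.5]
```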

In [None]:
dailyAverageGlobal = []
dailyAverageLocal  = []

for i in range(0, numTimesteps // timeStepsPerDay):
    averageGlobal = mocStreamvalLatAndDepth[timeStepsPerDay * i, :, :]
    
    averageLocal = []
    for j in range(numRegionsInCurGroup):
        binSlice = slice(regionBinNumber[0][j], regionBinNumber[1][j])
        averageLocal.append(regionMocData[timeStepsPerDay * i, j, :, binSlice])
        
    for j in range(1, timeStepsPerDay):
        averageGlobal = averageGlobal + mocStreamvalLatAndDepth[timeStepsPerDay * i + j, :, :]
        for k in range(numRegionsInCurGroup):
            binSlice = slice(regionBinNumber[0][k], regionBinNumber[1][k])
            averageLocal[k] = averageLocal[k] + regionMocData[timeStepsPerDay * i + j, k, :, binSlice]
    dailyAverageGlobal.append(averageGlobal / timeStepsPerDay)
    dailyAverageLocal.append([x / timeStepsPerDay for x in averageLocal])
    
    
weeklyAverageGlobal = []
weeklyAverageLocal  = []

for i in range(52):
    averageGlobal = dailyAverageGlobal[7 * i][:]
    
    averageLocal = []
    for j in range(numRegionsInCurGroup):
        averageLocal.append(dailyAverageLocal[7 * i][j][:])
        
    for j in range(1, 7):
        averageGlobal = averageGlobal + dailyAverageGlobal[7 * i + j][:]
        for k in range(numRegionsInCurGroup):
            averageLocal[k] = averageLocal[k] + dailyAverageLocal[7 * i + j][k][:]
    weeklyAverageGlobal.append(averageGlobal / 7)
    weeklyAverageLocal.append([x / 7 for x in averageLocal])
    
    
monthlyAverageGlobal = []
monthlyAverageLocal  = []

for i in range(12):
    averageGlobal = dailyAverageGlobal[30 * i][:]
    
    averageLocal = []
    for j in range(numRegionsInCurGroup):
        averageLocal.append(dailyAverageLocal[30 * i][j][:])
        
    for j in range(1, 30):
        averageGlobal = averageGlobal + dailyAverageGlobal[30 * i + j][:]
        for k in range(numRegionsInCurGroup):
            averageLocal[k] = averageLocal[k] + dailyAverageLocal[30 * i + j][k][:]
    monthlyAverageGlobal.append(averageGlobal / 30)
    monthlyAverageLocal.append([x / 30 for x in averageLocal])
    
    
yearlyAverageGlobal = []
yearlyAverageLocal  = []

for i in range(1):
    averageGlobal = monthlyAverageGlobal[0][:]
    
    averageLocal = []
    for j in range(numRegionsInCurGroup):
        averageLocal.append(monthlyAverageLocal[0][j][:])
        
    for j in range(1, 12):
        averageGlobal = averageGlobal + monthlyAverageGlobal[j][:]
        for k in range(numRegionsInCurGroup):
            averageLocal[k] = averageLocal[k] + monthlyAverageLocal[j][k][:]
            
    averageGlobal = averageGlobal * 30
    for j in range(numRegionsInCurGroup):
        averageLocal[j] = averageLocal[j] * 30
    
    # Add the five days left over after 12 months of 30 days (days 360 to 364)
    for j in range(5):
        averageGlobal = averageGlobal + dailyAverageGlobal[360 + j]
        for k in range(numRegionsInCurGroup):
            averageLocal[k] += dailyAverageLocal[360 + j][k]
            
    yearlyAverageGlobal.append(averageGlobal / 365)
    yearlyAverageLocal.append([x / 365 for x in averageLocal])
    
mainAverages = [dailyAverageGlobal, weeklyAverageGlobal, monthlyAverageGlobal,\
                yearlyAverageGlobal]

mainAveragesLocal = [dailyAverageLocal, weeklyAverageLocal, monthlyAverageLocal,\
                    yearlyAverageLocal]

tss = [timeStepsPerDay, 7 * timeStepsPerDay, 30 * timeStepsPerDay, 365 * timeStepsPerDay]
tsss = ['1', '7', '30', '365']
tssss = ['day', 'week', 'month', 'year']

In [None]:
times = [datetime.strptime(timestring.decode('utf-8'), '%Y-%m-%d_%H:%M:%S') \
         for timestring in xtime]

intervals = [timedelta(1), 
             timedelta(7), 
             timedelta(30), 
             timedelta(365) 
            ]

globalMax = []
globalMin = []
regionalMax = []
regionalMin = []

timestepGlobalMin = []
timestepGlobalMax = []
timestepRegionalMax = []
timestepRegionalMin = []
    
for interval in range(len(intervals)):
    globalMax.append(0)
    globalMin.append(0)
    regionalMax.append([0 for x in range(numRegionsInCurGroup)])
    regionalMin.append([0 for x in range(numRegionsInCurGroup)])
    
    curInterval = intervals[interval]
    numTimestepsPerMainInterval = curInterval.total_seconds() // \
        (times[1] - times[0]).total_seconds()
    print('Starting calculation for interval:' + str(curInterval))
    
    i = 0
    j = 0
    k = 0
    lastPercent = 0
    
    timestepGlobalMin.append([])
    timestepGlobalMax.append([])
    timestepRegionalMax.append([])
    timestepRegionalMin.append([])
    timestepNumber = 0
    while i < len(mainAverages[interval]):
        mainAverageGlobal = mainAverages[interval][i]
        
        curPercent = 100 * i // len(mainAverages[interval])
        if (curPercent > lastPercent):
            print(str(curPercent) + '% of ' + \
                  str(curInterval) + ' completed')
            lastPercent = curPercent

        timestepGlobalMin[interval].append(0)
        timestepGlobalMax[interval].append(0)
        timestepRegionalMin[interval].append([0 for x in range(numRegionsInCurGroup)])
        timestepRegionalMax[interval].append([0 for x in range(numRegionsInCurGroup)])
        
        q = k
        while q < numTimesteps and q < k + tss[interval]:
            plotValues = mainAverageGlobal[:,:] - mocStreamvalLatAndDepth[q, :, :]
            
            maxPerBin = [max(x) for x in plotValues]
            minPerBin = [min(x) for x in plotValues]
            
            curMax = max(maxPerBin)
            curMin = min(minPerBin)
            
            globalMax[interval] = max(globalMax[interval], curMax)
            globalMin[interval] = min(globalMin[interval], curMin)
            
            timestepGlobalMin[interval][timestepNumber] = \
                min(timestepGlobalMin[interval][timestepNumber], curMin)
            timestepGlobalMax[interval][timestepNumber] = \
                max(timestepGlobalMax[interval][timestepNumber], curMax)
            for l in range(numRegionsInCurGroup):
                binSlice = slice(regionBinNumber[0][l], regionBinNumber[1][l])
                
                mainAverageLocal = mainAveragesLocal[interval][i][l][:]
                
                plotValues = mainAverageLocal[:] - regionMocData[q, l,:, binSlice]
                curMax = max([max(x) for x in plotValues])
                curMin = min([min(x) for x in plotValues])
                    
                timestepRegionalMax[interval][timestepNumber][l] = \
                    max(timestepRegionalMax[interval][timestepNumber][l], curMax)
                timestepRegionalMin[interval][timestepNumber][l] = \
                    min(timestepRegionalMin[interval][timestepNumber][l], curMin)
                    
                regionalMax[interval][l] = \
                    max(regionalMax[interval][l], curMax)
                regionalMin[interval][l] = \
                    min(regionalMin[interval][l], curMin)
            q += 1
        timestepNumber += 1
        k += tss[interval]
        i += 1

In [None]:
times = [datetime.strptime(timestring.decode('utf-8'), '%Y-%m-%d_%H:%M:%S') \
         for timestring in xtime]

intervals = [timedelta(1), 
             timedelta(7), 
             timedelta(30), 
             timedelta(365) 
            ]

plt.rcParams['figure.figsize'] = 16, 9

def plotDifferencesForInterval(figureNum):
    interval = figureNum
    curPlotPath = outputDir + "/" + projectName + '/avgVSsingle/interval'\
                         + tsss[interval] + '/'

    if not os.path.isdir(curPlotPath):
        os.mkdir(curPlotPath)

    curInterval = intervals[interval]
    numTimestepsPerMainInterval = curInterval.total_seconds() // \
        (times[1] - times[0]).total_seconds()
    print('Starting calculation for interval:' + str(curInterval))

    i = 0
    j = 0
    k = 0
    lastPercent = 0

    while i < len(mainAverages[interval]):
        mainAverageGlobal = mainAverages[interval][i]

        curPercent = 100 * i // len(mainAverages[interval])
        if (curPercent > lastPercent):
            print(str(curPercent) + '% of ' + \
                  str(curInterval) + ' completed')
            lastPercent = curPercent
            
        curFig = plt.figure(figureNum)
        curFig.clf()
        axes = curFig.gca()
        contourSet = axes.contour(binBoundaries, rBD, mainAverageGlobal, linewidths = 0.5, \
                                    colors="black")
        csf = axes.contourf(binBoundaries, rBD, mainAverageGlobal, cmap=colorScaleName)
        cb = curFig.colorbar(csf)
        axes.clabel(contourSet, colors="black")
        axes.set_title('Global MOC averaged over 1 ' + tssss[interval] + ' (' +\
                       tssss[interval] + ' ' + str(i) + ')')
        axes.set_xlabel('latitude [deg]')
        axes.set_ylabel('Depth [m]')
        
        filename = curPlotPath + 'average' + tssss[interval] + str(i).zfill(5) + '.png'
        curFig.savefig(filename, format='png')
        
        for l in range(numRegionsInCurGroup):
            curFig.clf()
            axes = curFig.gca()
            binSlice = slice(regionBinNumber[0][l], regionBinNumber[1][l])
            contourSet = axes.contour(binBoundaries[binSlice], rBD, \
                                      mainAveragesLocal[interval][i][l], \
                                      linewidths = 0.5, colors="black")
            csf = axes.contourf(binBoundaries[binSlice], rBD, \
                                mainAveragesLocal[interval][i][l], cmap=colorScaleName)
            cb = curFig.colorbar(csf)
            axes.clabel(contourSet, colors="black")
            axes.set_title('Regional MOC averaged over 1 ' + tssss[interval] + ' (' +\
                           tssss[interval] + ' ' + str(i) + '), region ' + str(l))
            axes.set_xlabel('latitude [deg]')
            axes.set_ylabel('Depth [m]')

            filename = curPlotPath + 'averageRegion' + str(l) + tssss[interval] + \
                str(i).zfill(5) + '.png'
            curFig.savefig(filename, format='png')

        q = k
        while q < numTimesteps and q < k + tss[interval]:
            plotValues = mainAverageGlobal[:,:] - mocStreamvalLatAndDepth[q, :, :]

            maxPerBin = [max(x) for x in np.swapaxes(plotValues, 0, 1)]
            minPerBin = [min(x) for x in np.swapaxes(plotValues, 0, 1)]

            curFig.clf()
            axes = curFig.gca()
            axes.plot(binBoundaries, maxPerBin, color='blue')
            axes.plot(binBoundaries, minPerBin, color='red')
            curFig.savefig(curPlotPath + 'diffTS' + str(q).zfill(5) + '.png', format='png')
            
            for l in range(numRegionsInCurGroup):
                binSlice = slice(regionBinNumber[0][l], regionBinNumber[1][l])
                mainAverageLocal = mainAveragesLocal[interval][i][l][:]

                plotValues = mainAverageLocal[:] - regionMocData[q, l,:, binSlice]
                maxPerBin = [max(x) for x in np.swapaxes(plotValues, 0, 1)]
                minPerBin = [min(x) for x in np.swapaxes(plotValues, 0, 1)]

                curFig.clf()
                axes = curFig.gca()
                axes.plot(binBoundaries[binSlice], maxPerBin, color='blue')
                axes.plot(binBoundaries[binSlice], minPerBin, color='red')
                curFig.savefig(curPlotPath + 'diffTS' + str(l) + 'region' +\
                               str(q).zfill(5) + '.png', format='png')
            q += 1
        k += tss[interval]
        i += 1
        
processes = []
for i in range(4):
    process = multiprocessing.Process(target=plotDifferencesForInterval, args=(i,))
    process.daemon = True
    process.start()
    processes.append(process)
    
for i in range(4):
    processes[i].join()
    print(not processes[i].is_alive())
    processes[i].terminate()
    
print('all processes finished')

In [None]:
for i in range(4):
    call(['/Users/nilsfeige/pyana/anaconda/bin/ffmpeg', '-f', 'image2', '-r', \
          str(framesPerSecond), '-i', outputDir + "/" + projectName + \
          '/avgVSsingle/interval' + tsss[i] + '/diffTS%05d.png', '-y', '-codec', \
          'mpeg4', '-b:v', '40000k', \
          outputDir + '/' + projectName + '/avgVSsingle/globalInterval' + \
          tsss[i] + 'movie.mp4'])
    
    for j in range(numRegionsInCurGroup):
        call(['/Users/nilsfeige/pyana/anaconda/bin/ffmpeg', '-f', 'image2', '-r', \
          str(framesPerSecond), '-i', outputDir + "/" + projectName + \
          '/avgVSsingle/interval' + tsss[i] + '/diffTS' + str(j) + 'region%05d.png', \
          '-y', '-codec', 'mpeg4', '-b:v', '40000k', \
          outputDir + '/' + projectName + '/avgVSsingle/region' + str(j) + 'Interval' + \
          tsss[i] + 'movie.mp4'])

In [None]:
plt.figure(1)
plt.clf()
plt.plot(globalMax, color='blue')
plt.plot(regionalMax, color='green')
plt.savefig(outputDir + '/' + projectName + '/avgVSsingle/globalAndRegionalMaxDev.png'\
           , format = 'png')

for i in range(len(intervals)):
    plt.clf()
    plt.plot(timestepGlobalMax[i][:][:], color='green')
    plt.plot(timestepRegionalMax[i][:][:], color='blue')
    plt.plot(timestepGlobalMin[i][:][:], color='red')
    plt.plot(timestepRegionalMin[i][:][:], color='yellow')
    plt.savefig(outputDir + '/' + projectName + '/avgVSsingle/minMaxGlobalAndRegion'\
               + str(i) + '.png', format = 'png')
    #plt.gca().set_ylim([-10,10])

plt.clf()
print(timestepGlobalMax[3], timestepGlobalMin[3], timestepRegionalMax[3][0], timestepRegionalMin[3][0])

## MOC plot creation

Here, plots for the global MOC as well as every region are created. It can be configured to output either a list of png files, from which a video is created at the end, or to generate a pdf file with one page for every timestep. In the pdf, every region has a smaller plot below the global MOC plot. The width of the subplots can be configured using the _subPlotsPerRow_ variable.
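
The page layout of the pdf follows from _subPlotsPerRow_: the global plot occupies the first four grid rows, and each row of regional plots occupies two more. The sketch below reproduces that arithmetic so the grid positions can be inspected without plotting; `pdf_grid_layout` is a hypothetical helper name.

```python
import math


def pdf_grid_layout(num_regions, subplots_per_row):
    """Return the grid row count and the (row, column) of each regional plot."""
    # 4 rows for the global plot, 2 rows per line of regional plots
    num_rows = math.ceil(num_regions / subplots_per_row) * 2 + 4
    positions = [((j // subplots_per_row) * 2 + 4, j % subplots_per_row)
                 for j in range(num_regions)]
    return num_rows, positions


num_rows, positions = pdf_grid_layout(num_regions=5, subplots_per_row=3)
print(num_rows)   # 8
print(positions)  # [(4, 0), (4, 1), (4, 2), (6, 0), (6, 1)]
```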

In [None]:
def plotMOCpdf():
    start = 0
    end = len(xtime) - timeAverage + 1
    
    pp = PdfPages(outputDir + '/' + projectName + '/' + outputFileName)
    numSubplotRows = math.ceil(numRegionsInCurGroup / subPlotsPerRow) * 2 + 4
    pdfParams = 18, 2 * numSubplotRows
    
    plt.rcParams['figure.figsize'] = pdfParams
        
    lastPercent = 0
    for i in range(start, end, stepping):
        mainAverage = mocStreamvalLatAndDepth[i, :, :]
        for j in range(1, timeAverage):
            mainAverage = mainAverage + mocStreamvalLatAndDepth[i + j,:,:]
        mainAverage /= timeAverage
        
        plt.figure(1, tight_layout=True)
            
        if (100 * i // numTimesteps > lastPercent):
            lastPercent = 100 * i // numTimesteps
            print(str(lastPercent) + '% completed')
        
        curFig = plt.subplot2grid((numSubplotRows, subPlotsPerRow), (0, 0), \
                                colspan=subPlotsPerRow, rowspan=4)
        try:
            contourSet = plt.contour(binBoundaries, rBD, mainAverage, linewidths = 0.5, \
                                    colors="black")
        except ValueError:
            continue
        csf = plt.contourf(binBoundaries, rBD, mainAverage, cmap=colorScaleName)
        cb = plt.colorbar(csf)
        plt.clabel(contourSet, colors="black")
        plt.title('Global MOC by Latitude and Depth [time: ' + xtime[i].decode('utf8') + ']')
        plt.xlabel('latitude [deg]')
        plt.ylabel('Depth [m]')
        subplotCounter = 0
        
        for j in range(numRegionsInCurGroup):
            regionSlice = slice(regionBinNumber[0][j], regionBinNumber[1][j])
            regionName = regionNames[regionsInGroup[curRegionGroup][j] - 1].decode('utf8')
            regionBinBoundaries = binBoundaries[regionSlice]
            if (not os.path.isdir(outputDir + '/' + projectName + '/' + regionName)):
                os.mkdir(outputDir + '/' + projectName + '/' + regionName)
            
            mainAverage = regionMocData[i, j, :, regionSlice]
            for k in range(1, timeAverage):
                mainAverage = mainAverage + regionMocData[i + k, j, :, regionSlice]
            mainAverage /= timeAverage
            curRow = math.floor(j / subPlotsPerRow) * 2 + 4
            curColumn = math.floor(j % subPlotsPerRow)
            
            plt.subplot2grid((numSubplotRows, subPlotsPerRow), (curRow, curColumn), \
                                 rowspan=2)
            
            cs = plt.contour(regionBinBoundaries, rBD, \
                             mainAverage, \
                             linewidths = 0.5, colors="black", \
                             extent=contourSet.extent, extend='both')
            csf = plt.contourf(regionBinBoundaries, rBD, \
                         mainAverage, cmap=colorScaleName, \
                         extent=contourSet.extent, extend='both')
            cb = plt.colorbar(csf)
            plt.clabel(cs, colors="black")
            plt.title('Regional MOC [region: "' + regionName \
                      + '"]' + '[time: ' + xtime[i].decode('utf8') + ']')
            plt.xlabel('latitude [deg]')
            plt.ylabel('Depth [m]')
        pp.savefig()
        plt.clf()
    pp.close()

In [None]:
def plotMOCvideo(lock, queue, figNum):
    start = 0
    end = len(xtime) - timeAverage + 1
    plt.rcParams['figure.figsize'] = 16, 9
    lock.acquire()
    start = getNextBatch(queue)
    lock.release()
    if start > end:
        return
    oldend = end
    end = min(start + batchSize, oldend)
    mainFigure = plt.figure(figNum)
        
    lastPercent = 0
    for i in range(start, end, stepping):
        mainAverage = mocStreamvalLatAndDepth[i, :, :]
        for j in range(1, timeAverage):
            mainAverage = mainAverage + mocStreamvalLatAndDepth[i + j,:,:]
        mainAverage /= timeAverage
        
        curFig = mainFigure
        curFig.clf()
            
        axes = curFig.gca()
        if (100 * i // numTimesteps > lastPercent):
            lastPercent = 100 * i // numTimesteps
            print(str(lastPercent) + '% completed')
        
        try:
            contourSet = axes.contour(binBoundaries, rBD, mainAverage, linewidths = 0.5, \
                                    colors="black")
        except ValueError:
            continue
        csf = axes.contourf(binBoundaries, rBD, mainAverage, cmap=colorScaleName)
        cb = curFig.colorbar(csf)
        axes.clabel(contourSet, colors="black")
        axes.set_title('Global MOC by Latitude and Depth [time: ' + xtime[i].decode('utf8') + ']')
        axes.set_xlabel('latitude [deg]')
        axes.set_ylabel('Depth [m]')
        subplotCounter = 0
        
        curFig.savefig(outputDir + '/' + projectName + '/MOC/global/plot' + str(i).zfill(5) \
                        + '.png', format='png')
    
        for j in range(numRegionsInCurGroup):
            regionSlice = slice(regionBinNumber[0][j], regionBinNumber[1][j])
            regionName = regionNames[regionsInGroup[curRegionGroup][j] - 1].decode('utf8')
            regionBinBoundaries = binBoundaries[regionSlice]
            if (not os.path.isdir(outputDir + '/' + projectName + '/MOC/' + regionName)):
                os.mkdir(outputDir + '/' + projectName + '/MOC/' + regionName)
            
            curFig.clf()
            axes = curFig.gca()
                
            mainAverage = regionMocData[i, j, :, regionSlice]
            for k in range(1, timeAverage):
                mainAverage = mainAverage + regionMocData[i + k, j, :, regionSlice]
            mainAverage /= timeAverage
            curRow = math.floor(j / subPlotsPerRow) * 2 + 4
            curColumn = math.floor(j % subPlotsPerRow)
            
            cs = axes.contour(regionBinBoundaries, rBD, \
                             mainAverage, \
                             linewidths = 0.5, colors="black", \
                             extent=contourSet.extent, extend='both')
            csf = axes.contourf(regionBinBoundaries, rBD, \
                         mainAverage, cmap=colorScaleName, \
                         extent=contourSet.extent, extend='both')
            cb = curFig.colorbar(csf)
            axes.clabel(cs, colors="black")
            axes.set_title('Regional MOC [region: "' + regionName \
                      + '"]' + '[time: ' + xtime[i].decode('utf8') + ']')
            axes.set_xlabel('latitude [deg]')
            axes.set_ylabel('Depth [m]')
            curFig.savefig(outputDir + '/' + projectName + '/MOC/' + regionName + \
                           '/plot' + str(i).zfill(5) + '.png', format='png')
    plotMOCvideo(lock, queue, figNum)

def plotMOCVideo():
    print('Plotting MOC video')
    lock = multiprocessing.Lock()
    queue = multiprocessing.Queue()
    queue.put(0)
    processes = []
    for i in range(multiprocessing.cpu_count()):
        p = multiprocessing.Process(target=plotMOCvideo, args=(lock, queue, i))
        p.daemon = True
        processes.append(p)
        p.start()
        
    for process in processes:
        process.join()
        print(not process.is_alive())
        process.terminate()

In [None]:
isBoth = outputStyle == 'both'
isPDF = isBoth or outputStyle == 'pdf'
isVideo = isBoth or outputStyle == 'video'

if isPDF:
    plotMOCpdf()
if isVideo:
    plotMOCVideo()

In [None]:
print("Video mode enabled:", isVideo)
if isVideo:
    call(['/Users/nilsfeige/pyana/anaconda/bin/ffmpeg', '-f', 'image2', '-r', \
          str(framesPerSecond), '-i', outputDir + '/' + projectName + \
          '/MOC/global/plot%05d.png', '-y', '-codec', 'mpeg4', '-b:v', '40000k', \
          outputDir + '/' + projectName + '/globalMovie.mp4'])
    
    print('Finished global video.')

    for j in range(numRegionsInCurGroup):
        regionName = regionNames[regionsInGroup[curRegionGroup][j] - 1].decode('utf8')
        call(['/Users/nilsfeige/pyana/anaconda/bin/ffmpeg', '-f', 'image2', '-r', \
              str(framesPerSecond), '-i', outputDir + '/' + projectName + '/MOC/' + regionName \
              + '/plot%05d.png', '-y', '-codec', 'mpeg4', '-b:v', '40000k', outputDir + \
              '/' + projectName + '/' + regionName + 'Movie.mp4'])
        print('Finished video for region ' + regionName + '.')

## Statistical analysis

Here, the minimum and maximum values of each bin (taken over all depth levels) are calculated for every timestep. The results are plotted as a series of images, which are then converted to a video. In addition, the difference in the MOC between each timestep and its successor is calculated, plotted as contours, written out as a series of images and likewise converted into a video file.
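
The two calculations described above can be sketched on a toy data set. This is only an illustration: the nested list `moc` stands in for `mocStreamvalLatAndDepth` and is assumed to be indexed as `[time][depth][bin]`, and the helper names `bin_extremes` and `successive_difference` are hypothetical and do not appear in the script.

```python
# Toy stand-in for mocStreamvalLatAndDepth, indexed [time][depth][bin]
# (2 timesteps, 2 depth levels, 3 latitude bins); the values are made up.
moc = [
    [[1.0, -2.0, 3.0],
     [0.5, 4.0, -1.0]],
    [[2.0, -1.0, 1.0],
     [0.0, 3.0, -3.0]],
]

def bin_extremes(step):
    """Return (maxima, minima) over depth for every latitude bin of one timestep."""
    columns = list(zip(*step))  # transpose: one tuple of depth values per bin
    return [max(c) for c in columns], [min(c) for c in columns]

def successive_difference(current, successor):
    """Element-wise current - successor, the same sign convention as the script."""
    return [[a - b for a, b in zip(row_c, row_s)]
            for row_c, row_s in zip(current, successor)]

max_bins, min_bins = bin_extremes(moc[0])
diff = successive_difference(moc[0], moc[1])
print(max_bins, min_bins)
print(diff)
```

Because every difference needs a successor, a series of `numTimesteps` steps yields only `numTimesteps - 1` difference plots.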

Afterwards, the overall minimum and maximum of each timestep are plotted against time. Finally, a plot showing the average and the standard deviation of the per-bin minima and maxima over all timesteps is written to a file.
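
As a minimal sketch of the per-bin statistics, the following example uses a small made-up table of per-timestep maxima. It assumes that the `mean` and `stdev` functions used in the script are the ones from Python's `statistics` module. The name `per_bin_stats` and the sample values are hypothetical.

```python
from statistics import mean, stdev

# Made-up per-timestep maxima, indexed [time][bin] (3 timesteps, 2 bins).
max_bin_value = [
    [1.0, 4.0],
    [2.0, 6.0],
    [3.0, 8.0],
]

def per_bin_stats(values_by_time):
    """Return (means, sample standard deviations), one entry per bin."""
    by_bin = list(zip(*values_by_time))  # one tuple of timesteps per bin
    return [mean(b) for b in by_bin], [stdev(b) for b in by_bin]

avg_max, std_max = per_bin_stats(max_bin_value)

# Overall maximum of each timestep, as plotted against time.
max_over_time = [max(step) for step in max_bin_value]
print(avg_max, std_max, max_over_time)
```

Note that `statistics.stdev` computes the sample standard deviation and needs at least two data points, so this step requires at least two timesteps.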

In [None]:
plt.rcParams['figure.figsize'] = 22, 16
    
numBins = len(binBoundaries)
maxBinValue = [[] for i in range(numTimesteps)]
minBinValue = [[] for i in range(numTimesteps)]

maxBinValueRegion = [[[] for i in range(numRegionsInCurGroup)] for j in range(numTimesteps)]
minBinValueRegion = [[[] for i in range(numRegionsInCurGroup)] for j in range(numTimesteps)]

def calcBinValues(start, end, maxBinValue, minBinValue):
    curFig = plt.figure()
    lastPercent = 0
    
    for i in range(start, end):
        if (100 * i // numTimesteps > lastPercent):
            lastPercent = 100 * i // numTimesteps
            print(str(lastPercent) + '% completed')
            
        curFig.clf()
        axes = curFig.gca()
        axes.plot(binBoundaries, maxBinValue[i], color='black')
        axes.plot(binBoundaries, minBinValue[i], color='blue')
        curFig.savefig(outputDir + '/' + projectName + '/' + diffDir + '/binVals' + \
                    str(i).zfill(5) + '.png', format='png')
    plt.close(curFig)
    
def calcBinValuesRegion(start, end, maxBinValue, minBinValue):
    curFig = plt.figure()
    lastPercent = 0
    
    for i in range(start, end):
        if (100 * i // numTimesteps > lastPercent):
            lastPercent = 100 * i // numTimesteps
            print(str(lastPercent) + '% completed')
        for j in range(numRegionsInCurGroup):
            regionSlice = slice(regionBinNumber[0][j], regionBinNumber[1][j])
            rBinBoundaries = binBoundaries[regionSlice]
            curFig.clf()
            axes = curFig.gca()
            axes.plot(rBinBoundaries, maxBinValue[i][j], color='black')
            axes.plot(rBinBoundaries, minBinValue[i][j], color='blue')
            fname = outputDir + '/' + projectName + '/' + diffDir + '/region' + \
                           str(j) + 'binVals' + str(i).zfill(5) + '.png'
            curFig.savefig(fname, format='png')
    plt.close(curFig)

def calcDifferences(start, end):
    curFig = plt.figure()
    lastPercent = 0
    for i in range(start, min(numTimesteps - 1, end)):
        if (100 * i // numTimesteps > lastPercent):
            lastPercent = 100 * i // numTimesteps
            print(str(lastPercent) + '% completed')

        values = mocStreamvalLatAndDepth[i, :, :] - mocStreamvalLatAndDepth[i + 1, :, :]

        curFig.clf()
        axes = curFig.gca()
        cs = axes.contour(binBoundaries, rBD, values, \
                         linewidths=0.5, colors="black"\
                         ,extend='both')
        csf = axes.contourf(binBoundaries, rBD, \
                     values, cmap=colorScaleName, \
                     extent=cs.extent, extend='both')
        b = curFig.colorbar(csf)
        axes.clabel(cs, colors="black")
        axes.set_title('Successive step MOC differences ' + str(i).zfill(4))
        axes.set_xlabel('latitude [deg]')
        axes.set_ylabel('Depth [m]')
        curFig.savefig(outputDir + '/' + projectName + '/' + diffDir + '/diffVals' + \
                    str(i).zfill(5) + '.png', format='png')
    plt.close(curFig)
    
for i in range(numTimesteps):
    for j in range(numBins):
        tempData = mocStreamvalLatAndDepth[i,:,j]
        maxBinValue[i].append(max(tempData))
        minBinValue[i].append(min(tempData))
    for j in range(numRegionsInCurGroup):
        regionSlice = slice(regionBinNumber[0][j], regionBinNumber[1][j])
        rBinBoundaries = binBoundaries[regionSlice]
        for k in range(regionBinNumber[0][j], regionBinNumber[1][j]):
            tempData = regionMocData[i, j, :, k]
            maxBinValueRegion[i][j].append(max(tempData))
            minBinValueRegion[i][j].append(min(tempData))
        
pool = multiprocessing.Pool(processes=multiprocessing.cpu_count())

i = 0
while i < numTimesteps:
    end = min(i + batchSize, numTimesteps)
    pool.apply_async(calcBinValues, args = (i, end, maxBinValue, minBinValue))
    pool.apply_async(calcDifferences, args = (i, end))
    pool.apply_async(calcBinValuesRegion, args = (i, end, maxBinValueRegion, \
                                                  minBinValueRegion))
    i += batchSize
    
pool.close()
pool.join()
pool.terminate()
  
maxValueOverTime = [max(x) for x in maxBinValue]
minValueOverTime = [min(x) for x in minBinValue]

plt.figure(1)
plt.clf()
plt.plot(maxValueOverTime, color='blue')
plt.plot(minValueOverTime, color='red')
plt.savefig(outputDir + '/' + projectName + '/' + diffDir + '/minMaxBinOTime.png', \
            format='png')

avgMax = []
avgMin = []
stddMax = []
stddMin = []

for i in range(numBins):
    avgMax.append(mean([x[i] for x in maxBinValue]))
    avgMin.append(mean([x[i] for x in minBinValue]))
    stddMax.append(stdev([x[i] for x in maxBinValue]))
    stddMin.append(stdev([x[i] for x in minBinValue]))
    
plt.figure(1)
plt.clf()
plt.plot(binBoundaries, avgMax, color='blue')
plt.plot(binBoundaries, avgMin, color='red')
plt.plot(binBoundaries, stddMax, color='turquoise')
plt.plot(binBoundaries, stddMin, color='purple')
plt.savefig(outputDir + '/' + projectName + '/' + diffDir + '/avgAndDevOverBin.png', \
            format='png')
plt.clf()


print('succession calculation done')

In [None]:
print('Creating difference videos')
call(['/Users/nilsfeige/pyana/anaconda/bin/ffmpeg', '-f', 'image2', '-r', \
      str(framesPerSecond), '-i', outputDir + '/' + projectName + '/' + diffDir +\
      '/diffVals%05d.png', '-y', '-codec', 'mpeg4', \
      '-b:v', '40000k', outputDir + '/' + projectName + '/diffMovie.mp4'])

print("Finished video for forward differences.")

call(['/Users/nilsfeige/pyana/anaconda/bin/ffmpeg', '-f', 'image2', '-r', \
      str(framesPerSecond), '-i', outputDir + '/' + projectName + '/' + diffDir \
      + '/binVals%05d.png', '-y', '-codec', 'mpeg4', \
      '-b:v', '40000k', outputDir + '/' + projectName + '/minMaxBinMovie.mp4'])

for j in range(numRegionsInCurGroup):
    call(['/Users/nilsfeige/pyana/anaconda/bin/ffmpeg', '-f', 'image2', '-r', \
          str(framesPerSecond), '-i', outputDir + '/' + projectName + '/' + diffDir \
          + '/region' + str(j) + 'binVals%05d.png', '-y', '-codec', 'mpeg4', \
          '-b:v', '40000k', outputDir + '/' + projectName + '/minMaxBinRegion' + str(j) +\
          'Movie.mp4'])

print("Finished video for minMax per Bin.")

In [None]:
ct2 = datetime.now()
print(ct2 - ct)