This notebook describes how to take a random sample of portions of the global continental shelf, as represented in the new shelf bathymetry model presented as part of this research. The random sample consists of p sample points, where p is expected to be at least 100. Sampling areas are then generated from the points. These areas, which at this writing are p circles 250 km in diameter (a 125 km buffer around each point), are interrogated to collect location, depth, slope, and the shelf area they contain. The collected data can then be normalized and fed into an analysis workflow such as regression, principal components, or Q-mode factor analysis.

Let's begin...

**0.) Import the requisite modules and libraries**

In [1]:
import os, sys
from pyproj import Proj

**1.) Set up the GRASS working environment**

First, we have to set up an external GRASS environment. This consists largely of setting a few environment variables and then initializing a GRASS session through the small helper class defined below. Doing this lets us call any of the GRASS libraries (sans the display code, I believe) from outside of a GRASS session.

In [2]:
# set the location and mapset for the GRASS instance to be created

gisdbase = '/Users/paulparis/Documents/Projects/csi/data/GRASSData'    # location of your GRASS database
location = 'csi_shelf_mapping_global'                                  # name of geographic Location
mapset = 'user'         


# This class definition is used to initialize a GRASS 7.x session in this domain space
class GrassBASE:
    def initGRASSEnv( self, database, loc, mset ):
        # GISBASE points at the GRASS installation; adjust the path for your system
        gisbase = os.environ[ 'GISBASE' ] = '/Applications/GRASS-7.0.app/Contents/MacOS'
        # make the GRASS Python package importable
        sys.path.append(os.path.join(gisbase, "etc", "python"))
        import grass.script as grass
        import grass.script.setup as gsetup
        # start the session in the requested database, location, and mapset
        gsetup.init( gisbase, database, loc, mset )
        print(grass.gisenv())
        

        
# initiate a GRASS instance/environment

print('')
print ('********************************************')
print ('Setting up GRASS Environment')
g = GrassBASE()
g.initGRASSEnv( gisdbase, location, mapset )


********************************************
Setting up GRASS Environment
{'MAPSET': 'user', 'GISDBASE': '/Users/paulparis/Documents/Projects/csi/data/GRASSData', 'LOCATION_NAME': 'csi_shelf_mapping_global'}


CREATE A COMPOSITE CONTINENTAL SHELF **GEOMORPHON** FROM THE COMPONENT CLASSES (e.g., shallow, shallow intermediate,...)

NOTE THAT **YOU NEED ONLY DO THIS ONCE. IF THE RASTER ALREADY EXISTS, NO NEED TO RECREATE!**

In GRASS:
    r.patch --o input=ETOPO1_geomorph_shallowshelf_1km,ETOPO1_geomorph_shallowintershelf_1km,ETOPO1_geomorph_deepintershelf_1km,ETOPO1_geomorph_deepshelf_1km output=ETOPO1_geomorph_compositeshelf_1km

CREATE A COMPOSITE CONTINENTAL SHELF (**BATHY** MODEL DEM) FROM THE GEOMORPHON COMPOSITE SHELF MODEL. WE'LL TAKE A RANDOM SAMPLE ACROSS THIS DEM NEXT...

NOTE THAT **YOU NEED ONLY DO THIS ONCE. IF THE RASTER ALREADY EXISTS, NO NEED TO RECREATE!**

In GRASS:
    r.mapcalc --o expression="ETOPO1_bathy_compositeshelf_1km = if(ETOPO1_geomorph_compositeshelf_1km, ETOPO1_bathy_1km, null())"

CREATE A COMPOSITE CONTINENTAL SHELF **SLOPE** MODEL FROM THE BATHY COMPOSITE SHELF MODEL GENERATED ABOVE. WE'LL SAMPLE ACROSS THIS DEM A BIT LATER...

NOTE THAT **YOU NEED ONLY DO THIS ONCE. IF THE RASTER ALREADY EXISTS, NO NEED TO RECREATE!**

In GRASS:
    r.slope.aspect --o elevation=ETOPO1_bathy_compositeshelf_1km slope=ETOPO1_slope_compositeshelf_1km
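
Since each of the three rasters above only needs to be built once, a small guard can skip the rebuild when the map already exists. What follows is a minimal sketch, assuming it runs after the GRASS session from step 1 has been initialized. The helper names `raster_exists` and `build_once` are hypothetical (they are not part of GRASS), and the optional `find_file` and `run` arguments exist only so the logic can be exercised without a GRASS installation.

```python
def raster_exists(name, find_file=None):
    """Return True if a raster map called `name` is found in the search path."""
    if find_file is None:
        # Imported lazily: grass.script is importable only after the
        # GRASS environment has been set up.
        import grass.script as grass
        find_file = grass.find_file
    # find_file() returns a dict; its 'name' entry is empty when nothing is found.
    return bool(find_file(name, element='cell')['name'])


def build_once(raster_name, module, find_file=None, run=None, **kwargs):
    """Run a GRASS module only if the raster `raster_name` does not exist yet.

    Returns True if the module was run, False if the raster was already there.
    """
    if raster_exists(raster_name, find_file):
        return False
    if run is None:
        import grass.script as grass
        run = grass.run_command
    run(module, **kwargs)
    return True


# Intended use inside the notebook (not executed here):
# build_once('ETOPO1_slope_compositeshelf_1km', 'r.slope.aspect',
#            elevation='ETOPO1_bathy_compositeshelf_1km',
#            slope='ETOPO1_slope_compositeshelf_1km')
```

With a guard like this the `--o` (overwrite) flag is no longer needed for routine re-runs of the notebook; deleting the raster by hand forces a rebuild.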

**2.) Extract a p point random sample from the ETOPO1_geomorph_compositeshelf_1km**

In [3]:
# set the computational region...
import grass.script as grass

grass.run_command( 'g.region', flags='p', region='ETOPO1_World_1km')

0

In [19]:
import grass.script as grass

# set number of random sample points, p, to collect...
p = 100000  # 250

inp = 'ETOPO1_shelfslope_1211m_1km'
vout = 'ETOPO1_100kshelfsamplepts'
# extract the random sample of p points...
grass.run_command( 'r.random', overwrite=True, input=inp, npoints=p, vector=vout)

print('%d randomly sampled points generated.' % p)

100000 randomly sampled points generated.


In [20]:
# ## export vector data from GRASS to an external ASCII text file

xy = grass.read_command('v.out.ascii', overwrite=True,
                        input='ETOPO1_100kshelfsamplepts',
                        output='/Users/paulparis/Documents/Projects/csi/data/vector/ETOPO1_100kshelfsamplepts.csv',
                        separator=',', type='point', columns='*')

**Adding some additional coverage to ensure that areas where shelves are sparsely represented get some representation in the analysis.**

The distribution of continental shelf across the globe varies a great deal: some regions have broad, extensive shelves, while others have little or no shelf to speak of. We have therefore elected to edit the random sample points by hand so that the less well represented areas are included. Some of the original points will be altered. Points will be added along the sparsely shelved western coasts of North and South America, along the east and west coasts of Africa, and along Asia's Indian Ocean exposure. Points will be deleted wherever, by chance, a cluster of points produces multiple overlapping sampling areas.
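
To help decide which clustered points to delete, the overlaps can be counted before editing by hand. The sketch below is only an illustration, not the procedure used in this notebook: `overlap_counts` is a hypothetical helper, the coordinates are made up, and it uses plain planar distance in projected map units, which is only approximate and says nothing sensible about points on different lobes of an interrupted projection. Two circles of equal radius overlap when their centers are closer than twice that radius.

```python
import math


def overlap_counts(points, radius):
    """For each (x, y) point, count how many other sampling circles overlap its own."""
    limit = 2.0 * radius              # circles overlap when centers are closer than this
    counts = [0] * len(points)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if math.hypot(dx, dy) < limit:
                counts[i] += 1
                counts[j] += 1
    return counts


# Made-up projected coordinates in meters, for illustration only.
pts = [(0.0, 0.0), (100000.0, 0.0), (200000.0, 0.0), (1000000.0, 0.0)]
counts = overlap_counts(pts, radius=125000.0)
print(counts)   # [2, 2, 2, 0]
```

Points with the highest counts are the natural candidates for removal. The pairwise loop is quadratic in the number of points, which is fine for a few hundred points but slow for the full 100,000-point sample.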

**3.) Create buffer regions (sampling polygons) around the p random sample points...**

In [16]:
import grass.script as grass

d = 125000   # buffer distance (circle radius) in map units (meters, probably); gives 250 km diameter circles
grass.run_command('v.buffer', flags='t', overwrite=True, input='ETOPO1_cshelfsamplepts', type='point', output='ETOPO1_cshelfsampleareas', distance=d)

0
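
As a quick check on what the 125 km buffer distance implies, the arithmetic below works out the diameter and area of one sampling circle, and the largest cell count a circle could hold. It is a sketch under two assumptions that the notebook does not state outright: that map units are meters, and that the rasters have a nominal 1 km cell size (suggested by the `_1km` map names). `shelf_fraction` is a hypothetical helper name.

```python
import math

radius_m = 125000.0      # v.buffer distance used above
cell_m = 1000.0          # ASSUMED nominal raster resolution, from the "_1km" map names

diameter_km = 2.0 * radius_m / 1000.0
area_km2 = math.pi * (radius_m / 1000.0) ** 2
max_cells = area_km2 / (cell_m / 1000.0) ** 2   # upper bound on cells per circle


def shelf_fraction(n_cells):
    """Approximate share of a sampling circle occupied by shelf cells."""
    return n_cells / max_cells


print(diameter_km)            # 250.0
print(round(area_km2, 1))     # 49087.4
```

A rasterized circle will not contain exactly `max_cells` cells, so the fraction is an approximation, and it is meaningful only if the projection preserves area.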

In [None]:
# there are 281 sampling areas (p) to work...

import grass.script as grass

depth_formula = 'tmp_bathy=if(tmpsampleDEM,ETOPO1_bathy_compositeshelf_1km, null())'
slope_formula = 'tmp_slope=if(tmpsampleDEM,ETOPO1_slope_compositeshelf_1km, null())'

# projection definition used to un-project the x and y coordinates to lon, lat
pdef = Proj(proj='igh',zone=10,ellps='WGS84')

# ## open a text file to receive the easting, northing (we'll convert to latitude later),
# ## depths, slopes, and shelf area (cell count), all measured within each of the
# ## p 250km diameter study areas
with open('/Users/paulparis/Documents/Projects/csi/data/vector/factor_analysis.csv','w') as fp:
    for p in range(1,282):
        try:
            # ## rasterize sampling area p, then mask the depth and slope models with it...
            grass.run_command( 'v.to.rast', overwrite=True, input='ETOPO1_cshelfsampleareas', cats=p, use='cat', output='tmpsampleDEM')
            grass.mapcalc( depth_formula, overwrite=True )
            grass.mapcalc( slope_formula, overwrite=True )

            # ## collect the x,y position of the current sampling area's centroid...
            xy = grass.read_command('v.out.ascii', input='ETOPO1_cshelfsamplepts', type='point', cats=p )

            # ## generate and collect basic statistics on depths and slopes...
            depths = grass.read_command( 'r.univar', flags='t', map='tmp_bathy', separator=',' )
            slopes = grass.read_command( 'r.univar', flags='t', map='tmp_slope', separator=',' )

            # ## a short result has no statistics row, so there is nothing to parse
            if len(depths) <= 100:
                print('skipping missing cat record: %d' % p)
                continue

            # ## parse the xy location (kept as text so no digits are lost on output)...
            x = xy.rstrip().split('|')[0]
            y = xy.rstrip().split('|')[1]

            # ## for the depths: the header and value rows are split together on ',',
            # ## so the cell count follows the last header name in element 11
            depth = depths.split(',')
            n = depth[11].split('\n')[1]
            d_max = float(depth[13])   # the min column; with depths negative, the maximum depth
            d_mean = float(depth[16])
            d_std = float(depth[18])

            # ## for the slopes...
            slope = slopes.split(',')
            #s_n = int(slope[11].split('\n')[1])
            #s_min = float(slope[13])
            s_mean = float(slope[16])
            s_std = float(slope[18])

            # ## un-project the x and y coordinates (computed here, not yet written out)...
            lon,lat = pdef(float(x),float(y), inverse=True)

            fp.write(','.join(str(v) for v in (x,y,n,d_max,d_mean,d_std,s_mean,s_std))+'\n' )
        except IndexError:
            # ## skip this area only; the remaining areas are still processed
            print('missing area %d' % p)

print('Fin!')
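
The fixed element numbers used in the parsing above depend on the exact column order of the `r.univar` table. A less brittle alternative is to pair the header row with the value row and look statistics up by name. This is a minimal sketch rather than the notebook's method: `parse_univar_table` is a hypothetical helper, the sample text and its numbers are made up, and the column names shown are an assumption about the table layout that should be checked against real output.

```python
def parse_univar_table(text, separator=','):
    """Pair the header row of a two-row statistics table with its value row.

    Returns a dict of column name -> value (as text), or None when the
    table has no value row.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        return None
    names = lines[0].split(separator)
    values = lines[1].split(separator)
    return dict(zip(names, values))


# Made-up sample in the ASSUMED table layout, for illustration only.
sample = (
    "non_null_cells,null_cells,min,max,range,mean,mean_of_abs,"
    "stddev,variance,coeff_var,sum,sum_abs\n"
    "1200,47000,-180.5,-2.0,178.5,-75.25,75.25,40.0,1600.0,-53.16,-90300.0,90300.0\n"
)

stats = parse_univar_table(sample)
n = int(stats['non_null_cells'])
d_max = float(stats['min'])      # minimum elevation, i.e. the deepest point
d_mean = float(stats['mean'])
d_std = float(stats['stddev'])
print('%d %s %s %s' % (n, d_max, d_mean, d_std))
```

Returning `None` for a header-only table also replaces the length test on the raw text, since a missing value row is detected directly.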

In [None]:
# ## Re-read the site sampling data file generated above to set the northing values (column 2)
# ## to their absolute values...we don't want values here < 0.


ifn='/Users/paulparis/Documents/Projects/csi/data/vector/factor_analysis.csv'
ofn='/Users/paulparis/Documents/Projects/csi/data/vector/factor_analysis2.csv'

# column order of the input file (n is the cell count; the output drops the easting)
cols=['easting','northing','n','depth_max','depth_mean','depth_std','slope_mean','slope_std']

with open(ifn, 'r') as idat, open(ofn, 'w') as odat:
    for i, line in enumerate(idat):
        _,northing,n,depth_max,depth_mean,depth_std,slope_mean,slope_std=line.rstrip().split(',')
        # absolute northing, truncated to a whole number of map units
        north=str(int(abs(float(northing))))
        print('%d %s %s' % (i, north, n))
        odat.write(','.join([north,n,depth_max,depth_mean,depth_std,slope_mean,slope_std])+'\n')

print('Fin!')

In [146]:
# int() truncates the fractional part, as it does for the northing above
print(int(5222129.82357993))

5222129
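
The overview mentions normalizing the collected data before analysis. One common choice is to rescale each column to zero mean and unit standard deviation (a z-score). The notebook does not say which normalization is intended, so the following is only one possible sketch: `zscore_columns` is a hypothetical helper and the rows are made-up numbers, not values from `factor_analysis2.csv`.

```python
import math


def zscore_columns(rows):
    """Rescale each column of a list of numeric rows to zero mean, unit std.

    Uses the population standard deviation; a constant column becomes all zeros.
    """
    n = len(rows)
    ncols = len(rows[0])
    out = [[0.0] * ncols for _ in range(n)]
    for c in range(ncols):
        col = [float(r[c]) for r in rows]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        for i, v in enumerate(col):
            out[i][c] = (v - mean) / std if std > 0 else 0.0
    return out


# Made-up rows for illustration only.
rows = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
z = zscore_columns(rows)
print(z[1])   # [0.0, 0.0]
```

Putting the variables on a common scale keeps columns with large numeric ranges, such as the northing, from dominating a principal components or factor analysis.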
