<a href="https://colab.research.google.com/github/athapa42/VIIRS/blob/master/Viirs_dump.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


**Module: read_tropomi_no2_and_dump_ascii.py**

**Disclaimer**: The code is for demonstration purposes only. Users are responsible for checking its accuracy and revising it to fit their objectives.

**Author**: Justin Roberts-Pierel and Pawan Gupta, 2015.

**Modified to work with TROPOMI** : Vikalp Mishra, 2019 

**Organization**: NASA ARSET

**Modified to work with VIIRS data**: Aavash Thapa, 2020

**Purpose**: To save data from a VIIRS Deep Blue NetCDF file into a CSV file


In [None]:
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# 1. Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

In [None]:
pip install netCDF4

Collecting netCDF4
  Downloading https://files.pythonhosted.org/packages/35/4f/d49fe0c65dea4d2ebfdc602d3e3d2a45a172255c151f4497c43f6d94a5f6/netCDF4-1.5.3-cp36-cp36m-manylinux1_x86_64.whl (4.1MB)
     |████████████████████████████████| 4.1MB 2.6MB/s 
Collecting cftime
  Downloading https://files.pythonhosted.org/packages/81/f4/31cb9b65f462ea960bd334c5466313cb7b8af792f272546b68b7868fccd4/cftime-1.2.1-cp36-cp36m-manylinux1_x86_64.whl (287kB)
     |████████████████████████████████| 296kB 29.2MB/s 
Installing collected packages: cftime, netCDF4
Successfully installed cftime-1.2.1 netCDF4-1.5.3


In [None]:

#!/usr/bin/python      
from netCDF4 import Dataset
import numpy as np
import sys
import time
import calendar
import datetime as dt
import pandas as pd
from shutil import copyfile


#Mount Google Drive and open the text file that lists the NetCDF files to process
try:
    from google.colab import drive
    drive.mount('/content/drive', force_remount=True)
    fileList = open('/content/drive/My Drive/Colab Notebooks/VIIRS/fileList.txt', 'r')

except Exception:
    #Exception (rather than a bare except) lets Ctrl-C and sys.exit() pass through
    print('Did not find a text file containing file names (perhaps name does not match)')
    sys.exit()

#loops through all files listed in the text file
for FILE_NAME in fileList:
    FILE_NAME=FILE_NAME.strip()
    user_input=input('\nWould you like to process\n' + FILE_NAME + '\n\n(Y/N)')
    if (user_input == 'N' or user_input == 'n'):
        print('Skipping...')
        continue
    else:
        file = Dataset('/content/drive/My Drive/Colab Notebooks/VIIRS/'+ FILE_NAME, 'r')
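        #read the data
        if 'AERDB' in FILE_NAME:
            print('This is a VIIRS Deep Blue file.')
            #variables sit at the root of this NetCDF file, so they are accessed by name
            SDS_NAME='Aerosol_Optical_Thickness_550_Land_Best_Estimate'
        else:
            #without this branch SDS_NAME would be undefined below
            print('This is not a VIIRS Deep Blue file. Skipping...')
            file.close()
            continue
        ds=file
        #geolocation arrays; their shape defines the number of output rows
        lat= ds.variables['Latitude'][:]
        lon= ds.variables['Longitude'][:]
        data= ds.variables[SDS_NAME]

        #get necessary attributes
        fv=data._FillValue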
          
        fileparts=FILE_NAME.split('.')

        #The acquisition time is encoded in the file name
        #(e.g. A2020056.1954 is year 2020, day of year 056, time 19:54),
        #so the year, month, day, hour and minute are the same for every row.
        #Parse them once and fill one array per column.
        y = fileparts[1][1:5]
        h = fileparts[2][0:2]
        m = fileparts[2][2:4]
        date = y + ',' + fileparts[1][5:8] + ',' + h + ',' + m
        t2 = dt.datetime.strptime(date,'%Y,%j,%H,%M')

        year = np.full(lat.shape, float(t2.year))
        mth = np.full(lat.shape, float(t2.month))
        doy = np.full(lat.shape, float(t2.day))  #day of the month, despite the name
        hr = np.full(lat.shape, float(t2.hour))
        mn = np.full(lat.shape, float(t2.minute))
            
        vlist = list(file.variables.keys())
        #print('vlist: ', vlist)
        
        #create the dataframe and enter the values here
        df = pd.DataFrame()
        df['Year'] = year.ravel()
        df['Month'] = mth.ravel()
        df['Day'] = doy.ravel()
        df['Hour'] = hr.ravel()
        df['Minute'] = mn.ravel()
        
        #0-->Aerosol_Optical_Thickness_550_Land
        #3-->Aerosol_Optical_Thickness_550_Land_Ocean_Best_Estimate
        #8-->Aerosol_Optical_Thickness_QA_Flag_Land
        #11-->Aerosol_Type_Land_Ocean
        #18-->Angstrom_Exponent_Land_Ocean_Best_Estimate
        sds_lst = [ 'Aerosol_Optical_Thickness_550_Land',
                   'Aerosol_Optical_Thickness_550_Land_Ocean_Best_Estimate',
                   'Aerosol_Optical_Thickness_QA_Flag_Land',
                   'Aerosol_Type_Land_Ocean',
                   'Angstrom_Exponent_Land_Ocean_Best_Estimate']
        
        #This loop saves every SDS named in sds_lst as a column of the dataframe.
        #Iterating over vlist (instead of a fixed index range) avoids an IndexError
        #when the file holds fewer variables than expected.
        for SDS_NAME in vlist:
            if SDS_NAME in sds_lst:
                print('SDS_NAME', SDS_NAME)
                sds=ds.variables[SDS_NAME]

                #get SDS data as a vector; by default netCDF4 applies any
                #scale_factor and masks fill values when the variable is read
                data=sds[:].ravel()
                #replace masked (fill) values with NaN so they appear as empty cells in the CSV
                data=np.ma.filled(data.astype(float), np.nan)
                df[SDS_NAME] = data
    
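    #write the CSV locally, then copy it to the Drive folder that holds the input files
    outfilename=FILE_NAME[:-3]+'.csv'
    df.to_csv(outfilename, index = False)
    copyfile(outfilename, "/content/drive/My Drive/Colab Notebooks/VIIRS/" + outfilename)
    file.close()

print('\nAll files have been saved successfully.')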

Mounted at /content/drive

Would you like to process
AERDB_L2_VIIRS_SNPP.A2020056.1954.001.2020057113600.nc

(Y/N)y
This is a VIIRS Deep Blue file.
SDS_NAME Aerosol_Optical_Thickness_550_Land
SDS_NAME Aerosol_Optical_Thickness_550_Land_Ocean_Best_Estimate
SDS_NAME Aerosol_Optical_Thickness_QA_Flag_Land
SDS_NAME Aerosol_Type_Land_Ocean
SDS_NAME Angstrom_Exponent_Land_Ocean_Best_Estimate

All files have been saved successfully.
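
The Year, Month, Day, Hour and Minute columns in the CSV come from the file name rather than from the file contents. The following is a minimal sketch of that parsing step on its own. The helper name `parse_viirs_timestamp` is hypothetical, and the sketch assumes the file name follows the pattern seen in the run above: dot-separated fields, where the second field is `A` followed by a four-digit year and a three-digit day of year, and the third field is the time as `HHMM`.

```python
import datetime as dt

def parse_viirs_timestamp(file_name):
    """Return the acquisition datetime encoded in a VIIRS Deep Blue file name.

    Hypothetical helper. Assumes field 1 is 'AYYYYDDD' and field 2 is 'HHMM'.
    """
    parts = file_name.split('.')
    date_field = parts[1]          # e.g. 'A2020056'
    time_field = parts[2]          # e.g. '1954'
    stamp = ','.join([date_field[1:5],   # year
                      date_field[5:8],   # day of year
                      time_field[0:2],   # hour
                      time_field[2:4]])  # minute
    # %j converts the day of year into a calendar month and day
    return dt.datetime.strptime(stamp, '%Y,%j,%H,%M')

t = parse_viirs_timestamp('AERDB_L2_VIIRS_SNPP.A2020056.1954.001.2020057113600.nc')
print(t.year, t.month, t.day, t.hour, t.minute)  # prints: 2020 2 25 19 54
```

Parsing the name once per file, instead of once per row, keeps the time handling separate from the array handling and makes it easy to check against a known file name.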
