# [1] About 

<img style="float: right;" src="../BCYadav_about.png">

- This notebook is part of a tutorial series prepared by B. C. Yadav, Research Scholar @ IIT Roorkee.  
- ORCID iD: https://orcid.org/0000-0001-7288-0551 
- Google Scholar: https://scholar.google.com/citations?user=6fJpxxQAAAAJ&hl=en&authuser=1  
- GitHub: https://github.com/Bankimchandrayadav/PythonInGeomatics
- Twitter: https://twitter.com/DrBCY
- **Recent Publication:** https://rmets.onlinelibrary.wiley.com/doi/10.1002/joc.6562
- This notebook demonstrates how to handle files in formats other than regular TIFF, using NetCDF (an increasingly popular format) as the example.    
---

# [2] First time usage for conda users

In [1]:
# !conda install -c conda-forge xarray dask netCDF4 bottleneck -y
# !conda install -c conda-forge numpy -y
# !conda install -c conda-forge tqdm -y

# [3] First time usage for pip users

In [2]:
# !pip install xarray
# !pip install dask 
# !pip install netCDF4
# !pip install bottleneck
# !pip install numpy
# !pip install tqdm

# [4] Importing libraries

In [3]:
import xarray as xr 
import time
import numpy as np 
from tqdm.notebook import tqdm as td
import os 
import shutil
start = time.time()  # start the timer; section [8] uses it to report the total run time

# [5] Creating routine functions 

In [4]:
def fresh(where):
    """Delete the folder `where` if it exists, then recreate it empty."""
    if os.path.exists(where):
        shutil.rmtree(where)
    os.mkdir(where)
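
`os.mkdir` fails when the parent folder of `where` does not exist yet. The sketch below shows one possible variant, using the hypothetical name `fresh_nested`, that creates missing parents as well. It is tried out in a temporary folder rather than in the tutorial's data folders.

```python
import os
import shutil
import tempfile


def fresh_nested(where):
    """Hypothetical variant of fresh(): empty `where`, creating parents if needed."""
    # ignore_errors=True suppresses the error raised when the folder is absent
    shutil.rmtree(where, ignore_errors=True)
    # os.makedirs, unlike os.mkdir, also creates missing parent folders
    os.makedirs(where)


# Demonstration inside a throwaway temporary folder
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "a", "b")
    fresh_nested(target)                                # creates "a" and "a/b"
    open(os.path.join(target, "old.nc"), "w").close()   # leave a stale file behind
    fresh_nested(target)                                # stale file is removed
    remaining = os.listdir(target)                      # empty list
```

Note that `ignore_errors=True` also hides other removal failures (for example a file locked by another program), in which case `os.makedirs` will raise because the folder still exists.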

# [6] Read files

In [7]:
ds = xr.open_dataset("../02_data/01_netcdf/hourly_2000_limited.nc")

# [7] Extracting information from source files

## [7.1] Specify output directory:

In [9]:
outDir = "../02_data/02_netcdf_multiple/"

## [7.2] Delete any existing or old files

In [10]:
fresh(where=outDir)

## [7.3] Check output directory [optional]

In [11]:
os.startfile(os.path.realpath(outDir))  # opens the folder in the file explorer; os.startfile is available on Windows only
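
Since `os.startfile` exists only on Windows, the call above fails on macOS and Linux. A minimal sketch of a cross-platform alternative follows. The helper names `opener_command` and `open_folder` are made up for illustration, and the sketch assumes the `open` (macOS) and `xdg-open` (Linux desktop) commands are installed.

```python
import os
import subprocess
import sys


def opener_command(path, platform=sys.platform):
    """Hypothetical helper: command that opens `path`, or None on Windows."""
    path = os.path.realpath(path)
    if platform.startswith("win"):
        return None                    # Windows: use os.startfile(path) instead
    if platform == "darwin":
        return ["open", path]          # macOS
    return ["xdg-open", path]          # most Linux desktops (assumed installed)


def open_folder(path):
    """Open `path` in the system file browser (defined here, not called)."""
    cmd = opener_command(path)
    if cmd is None:
        os.startfile(os.path.realpath(path))
    else:
        subprocess.run(cmd, check=False)
```

With this in place, `open_folder(outDir)` could stand in for the `os.startfile` call on any of the three platforms.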

## [7.4] Extract multiple netcdf files from the source 

In [12]:
# loop over the first 1000 time steps (assumes the dataset has at least that many)
for i in td(range(1000), desc='Extracting multiple files'):

    # extract the slice's time as a string of the form YYYY-MM-DDTHH (date and hour)
    fileName = str(ds.time[i].values)[:13]

    # subset the dataset to this time slice
    ds1 = ds.isel(time=i)

    # write the slice to its own netCDF file
    ds1.to_netcdf(os.path.join(outDir, fileName + ".nc"))
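
The file name in the loop comes from slicing the string form of a timestamp. The short sketch below illustrates that step on its own, assuming the time coordinate holds `numpy.datetime64` values in nanosecond precision. The timestamp is made up for illustration and is not taken from the source file.

```python
import numpy as np

# A made-up timestamp in nanosecond precision
stamp = np.datetime64("2000-01-01T05:00:00", "ns")

full = str(stamp)      # '2000-01-01T05:00:00.000000000'
name = full[:13]       # first 13 characters: date and hour, no colons

# np.datetime_as_string yields the same text without counting characters
same = np.datetime_as_string(stamp, unit="h")

print(name, same)
```

Cutting the string at the hour also keeps colons out of the file name, which Windows does not allow.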


# [8] Time elapsed

In [13]:
end = time.time()
print('Time elapsed:', np.round(end - start, 2), 'secs')

Time elapsed: 352.62 secs
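
For timing a block of code, `time.perf_counter()` is a possible alternative to `time.time()`, because it is not affected by system clock adjustments. A minimal sketch, with a stand-in workload in place of the extraction loop:

```python
import time

t0 = time.perf_counter()           # clock intended for measuring intervals
total = sum(range(1_000_000))      # stand-in workload, not the notebook's loop
elapsed = time.perf_counter() - t0

print(f"Time elapsed: {elapsed:.2f} secs")
```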


# [9] See results [1000 files, optional]

In [None]:
os.startfile(os.path.realpath(outDir))  # Windows only

---
# End of first tutorial
---