# `kbatch` on Nebari

## Goal

Submit a batch job (a notebook or a script) to run headlessly, immediately or on a schedule. 

## Refer to the documentation for more detail

For details, see the ["How to submit batch jobs"](https://nebari-docs.netlify.app/how-tos/kbatch-howto) guide in the Nebari docs. For more on `kbatch` itself, see [the `kbatch` docs](https://kbatch.readthedocs.io/en/latest/).


## One-time setup command

Run this command once to configure `kbatch`:

```shell
$ kbatch configure \
  --token <JUPYTERHUB_API_TOKEN> \
  --kbatch-url http://kbatch-kbatch-proxy.dev.svc.cluster.local

Wrote config to /home/<username>/.config/kbatch/config.json
```

The required arguments are:
- `--token`
  - a JupyterHub API token, which you can generate at [esip-ogc.nebari.dev/hub/token](https://esip-ogc.nebari.dev/hub/token).
- `--kbatch-url`
  - the in-cluster address of the `kbatch` proxy service: `http://kbatch-kbatch-proxy.dev.svc.cluster.local`
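
To confirm that the setup step worked, the config file it reports writing can be inspected. The following is a minimal sketch, assuming the file lives at `~/.config/kbatch/config.json` (the path shown in the command output above) and holds a JSON object. The helper name `summarize_kbatch_config` is hypothetical, and values are masked so the token is never printed.

```python
import json
from pathlib import Path


def summarize_kbatch_config(path: Path) -> dict:
    """Return the config's keys with every value masked."""
    with path.open() as f:
        config = json.load(f)
    # Mask values so a token stored in the file is never printed.
    return {key: "***" for key in config}


if __name__ == "__main__":
    # Assumed location, taken from the `kbatch configure` output above.
    config_path = Path.home() / ".config" / "kbatch" / "config.json"
    if config_path.exists():
        print(summarize_kbatch_config(config_path))
    else:
        print(f"No config at {config_path}; run `kbatch configure` first.")
```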


In [None]:
import os
import time

import rasterio
import rio_cogeo
from s3fs import S3FileSystem

In [None]:
# dataset from NASA EarthData available on AWS S3
fp = "s3://modis-vi-nasa/MOD13A2.006"

In [None]:
nasa_s3 = S3FileSystem(anon=True)

In [None]:
nasa_s3.ls("s3://modis-vi-nasa")

In [None]:
tff_files_s3 = nasa_s3.ls(fp)

In [None]:
dir_file_s3 = tff_files_s3.pop()

In [None]:
dir_file_s3

In [None]:
print(len(tff_files_s3))
print(tff_files_s3[-1:])
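
Removing the directory file with `pop()` relies on it being the last entry in the listing. A sketch of a more explicit alternative is to split the listing by file extension. The helper name `split_listing` and the object names below are hypothetical; the real names in the bucket will differ.

```python
def split_listing(paths):
    """Split an S3 listing into GeoTIFF paths and everything else."""
    tiff_suffixes = (".tif", ".tiff")
    tiffs = [p for p in paths if p.lower().endswith(tiff_suffixes)]
    others = [p for p in paths if not p.lower().endswith(tiff_suffixes)]
    return tiffs, others


# Hypothetical listing, standing in for the result of `nasa_s3.ls(fp)`.
listing = [
    "modis-vi-nasa/MOD13A2.006/example_a.tif",
    "modis-vi-nasa/MOD13A2.006/example_b.TIF",
    "modis-vi-nasa/MOD13A2.006/dir_files.txt",
]
tiffs, others = split_listing(listing)
print(len(tiffs), others)
```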

In [None]:
# `dir_file_s3` lists links to data from NASA's Land Processes Distributed
# Active Archive Center (LP DAAC) located at the USGS Earth Resources Observation and
# Science (EROS) Center. Downloading the data requires a NASA Earthdata Login.

# dir_file = "dir_files.txt"
# nasa_s3.download(dir_file_s3, dir_file)

In [None]:
# needed to read data from S3 anonymously
os.environ["AWS_NO_SIGN_REQUEST"] = "YES"

In [None]:
log_file = "ogc-cog-validation.txt"

In [None]:
for tff in tff_files_s3[:10]:
    tff = "s3://" + tff
    valid, errors, warnings = rio_cogeo.cog_validate(tff)
    with open(log_file, "a") as f:
        current_time = time.strftime("%Y-%m-%d-%H:%M:%S", time.localtime())
        f.write(f"{current_time}, {tff}, ")
        if valid:
            f.write("Valid COG format\n")
        else:
            f.write("Invalid COG format\n")
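            # errors and warnings are lists of strings, so write one per line
            f.writelines(f"{message}\n" for message in errors)
            f.writelines(f"{message}\n" for message in warnings)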

    time.sleep(30)
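
The loop above interleaves validation with log formatting. As a sketch, the formatting can be pulled into a small function that is easy to check without touching S3. It assumes `errors` and `warnings` are lists of strings, which is my understanding of what `cog_validate` returns. The name `format_validation_record` and the sample messages are hypothetical.

```python
import time


def format_validation_record(path, valid, errors, warnings, timestamp=None):
    """Build the log text for one validation result."""
    if timestamp is None:
        timestamp = time.strftime("%Y-%m-%d-%H:%M:%S", time.localtime())
    status = "Valid COG format" if valid else "Invalid COG format"
    lines = [f"{timestamp}, {path}, {status}"]
    if not valid:
        # One indented line per message keeps the log easy to scan.
        lines.extend(f"  error: {message}" for message in errors)
        lines.extend(f"  warning: {message}" for message in warnings)
    return "\n".join(lines) + "\n"


# Hypothetical result, standing in for the output of `cog_validate`.
record = format_validation_record(
    "s3://example-bucket/example.tif",
    valid=False,
    errors=["example error message"],
    warnings=["example warning message"],
)
print(record, end="")
```

Inside the loop, the returned string could be passed to a single `f.write(...)` call.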

In [None]:
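# AWS service account with R/W access to a single bucket: s3://esip-nebari-dev
# Credentials are read from the environment (variable names assumed here)
# rather than hardcoded, so the secret never lands in the notebook.
key = os.environ["AWS_ACCESS_KEY_ID"]
secret = os.environ["AWS_SECRET_ACCESS_KEY"]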

In [None]:
esip_s3 = S3FileSystem(key=key, secret=secret)
fp = "s3://esip-nebari-dev"

In [None]:
esip_s3.ls(fp)

In [None]:
log_file_s3 = fp + "/testing/" + log_file
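
Because the destination key is fixed, every run uploads to the same object and replaces the previous log. If the job runs on a schedule, one option is to add a timestamp to the key. This is a minimal sketch, assuming the same `testing/` prefix as above; the helper name `timestamped_key` and the timestamp format are choices made for illustration.

```python
from datetime import datetime, timezone


def timestamped_key(bucket_path, file_name, now=None):
    """Return a destination path with a UTC timestamp before the extension."""
    if now is None:
        now = datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    stem, dot, suffix = file_name.rpartition(".")
    if not dot:
        # No extension: append the timestamp to the whole name.
        return f"{bucket_path}/testing/{file_name}-{stamp}"
    return f"{bucket_path}/testing/{stem}-{stamp}.{suffix}"


print(timestamped_key("s3://esip-nebari-dev", "ogc-cog-validation.txt"))
```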

In [None]:
esip_s3.put_file(log_file, log_file_s3)
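
The goal stated at the top is to run work either immediately or on a schedule. The sketch below only assembles the corresponding `kbatch` command lines and does not execute anything. The `--name`, `--image`, `--command`, and `--schedule` flags are written as I recall them from the `kbatch` docs, so confirm them with `kbatch job submit --help` before relying on them. The helper name `build_kbatch_command`, the job name, and the image are hypothetical, and shipping the notebook file with the job is left to the linked documentation.

```python
import json
import shlex


def build_kbatch_command(name, image, command, schedule=None):
    """Assemble argv for a one-off job, or a cron job when a schedule is given."""
    kind = "cronjob" if schedule else "job"
    argv = [
        "kbatch", kind, "submit",
        f"--name={name}",
        f"--image={image}",
        # The command is passed as a JSON-encoded list of strings.
        "--command=" + json.dumps(command),
    ]
    if schedule:
        argv.append(f"--schedule={schedule}")
    return argv


# Hypothetical job name and image; replace with values valid on your cluster.
run_now = build_kbatch_command("list-files", "alpine", ["ls", "-lh"])
run_nightly = build_kbatch_command(
    "list-files", "alpine", ["ls", "-lh"], schedule="0 2 * * *"
)
print(shlex.join(run_now))
print(shlex.join(run_nightly))
```

Printing the command first makes it easy to review before pasting it into a terminal.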