# 3 ways to read S3 from IDL
### Antunes, Mar 2022

As of May 2022, IDL cannot read S3 objects directly, whereas Python can.  IDL does, however, provide an IDL-Python bridge.  Our approach to accessing S3 data from IDL is therefore to use a Python helper script, which we provide.

We present 3 ways to access S3.  All three work within a Jupyter Notebook environment; only the third (using 'spawn') also works in a typical console (non-Jupyter) IDL session.


1) Work entirely in Python in a Notebook cell.
2) Set variables in IDL, then call Python.
3) Run the Python routine as an executable using 'spawn' from IDL.


### How it works
Everything rests on a Python 'helper' script called 's3idlhelpers.py', which is S3-aware and can copy S3 files over to your notebook's local file storage.  In this notebook, we invoke that helper to create temporary local copies of the files we need, and use the same script to clear out those temp files once we're done with them.

### Data Identifiers
Note that we operate on s3:// identifiers, of the form:
s3://bucket_name/optional_keypaths/more_optionals_etc/filename
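
As a rough illustration of how such an identifier breaks down, here is a minimal Python sketch using only the standard library. The function name 'split_s3_url' is ours for illustration and is not part of s3idlhelpers.py.

```python
from urllib.parse import urlparse

def split_s3_url(s3url):
    """Split 's3://bucket/key/path/file' into (bucket, key, filename)."""
    parsed = urlparse(s3url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// identifier: {s3url!r}")
    bucket = parsed.netloc            # the part right after s3://
    key = parsed.path.lstrip("/")     # everything after the bucket name
    filename = key.rsplit("/", 1)[-1] # the last component of the key
    return bucket, key, filename

# Hypothetical identifier following the general form shown above
print(split_s3_url("s3://bucket_name/optional_keypaths/more_optionals_etc/filename"))
```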

Our working example is:

"s3://helio-public/skantunes/psp_wispr.nc"

## Method 1: Pure python within a cell

We set the bucket and key in the python snippet within a Jupyter cell.  Python is invoked with the '%%python' magic tag, while IDL just gets typed as usual.

The two sessions can talk to each other through shared variables.  A variable set in Python as 'IDL.fname' appears in IDL as the plain variable 'fname'.

First we'll name our data in the python 'subspace'. Because of how python-in-IDL persists, this name is available to any subsequent python cells in this notebook.

In [None]:
%%python
s3urlp = 's3://helio-public/skantunes/psp_wispr.nc'

Now we do the actual creation of the temp copy, as a 2-line python call.

In [1]:
%%python
import s3idlhelpers
IDL.fname = s3idlhelpers.s3tempsync(s3urlp)

In [2]:
print,'Now we are back in IDL'
print,'Sample result-- temp file is: ',fname

### Cleanup
Now we pretend we're done with our analysis and ready to delete that temp file.  Note that we do not have to re-import s3idlhelpers, because the Python imports and variables persist across cells!

In [3]:
%%python
s3idlhelpers.s3temppurge(IDL.fname)
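
If the analysis happens entirely on the Python side, one way to make sure the cleanup always runs is to wrap the sync/purge pair in a context manager. This is only a sketch: 'temp_copy' is a hypothetical name of ours, and it assumes the helper's two functions behave as used above (sync returns a temp file name, purge accepts it).

```python
from contextlib import contextmanager

@contextmanager
def temp_copy(s3url, sync, purge):
    """Yield a local temp file name for s3url and purge it on exit."""
    fname = sync(s3url)
    try:
        yield fname
    finally:
        # Runs even if the analysis inside the 'with' block raises
        purge(fname)

# In the notebook this could be called as (not run here):
#   with temp_copy(s3urlp, s3idlhelpers.s3tempsync, s3idlhelpers.s3temppurge) as fname:
#       ...Python-side analysis of fname...
```

Because the temp file is removed as soon as the 'with' block ends, this pattern does not suit the case where the file name is handed back to a later IDL cell.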

## Method 2: set bucket in an IDL cell, then use in a Python cell.

This is functionally identical to the previous method; the only difference is that the S3 address is set in IDL rather than in Python.

In [4]:
; Setting the variable in IDL-space
s3urli = 's3://helio-public/skantunes/psp_wispr.nc'

In [5]:
%%python
# this is the python 2nd half
import s3idlhelpers
IDL.fname = s3idlhelpers.s3tempsync(IDL.s3urli)

In IDL, verify that we got the temporary filename

In [6]:
print,'Temp file is',fname

### Cleanup
Again, let's delete that temporary file

In [7]:
%%python
s3idlhelpers.s3temppurge(IDL.fname)

Alternatively, you can pass the original s3:// address instead of the temp file name; the helper will reparse it to find the temp file and act appropriately.  It will never actually delete anything on S3.

In [8]:
%%python
s3idlhelpers.s3temppurge(IDL.s3urli)
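
To show why either form can be safe, here is a sketch of one way a purge routine could reduce both inputs to the same local file. This is not the actual s3idlhelpers code: the names 'local_temp_name' and 'purge_local' are hypothetical, and we assume the temp copy keeps the object's file name inside a single temp directory.

```python
import os
import tempfile

def local_temp_name(name, tempdir):
    """Map an s3:// URL or a local temp path to a path inside tempdir."""
    # Only the final path component survives, so an s3:// URL can never
    # resolve to anything outside the local temp directory.
    return os.path.join(tempdir, os.path.basename(name))

def purge_local(name, tempdir):
    """Delete the local temp copy if it exists; return True if removed."""
    target = local_temp_name(name, tempdir)
    if os.path.isfile(target):
        os.remove(target)
        return True
    return False

# Small demo using a throwaway directory and an empty stand-in file
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, "psp_wispr.nc"), "w").close()
print(purge_local("s3://helio-public/skantunes/psp_wispr.nc", demo_dir))
```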

## Method 3: Calling the python code as an executable from IDL, no in-cell Python use.

This is the only method that works within non-Jupyter IDL, because the IDL-Python bridge does not allow mixing the two languages at the IDL console.

Note that python must be in the command path.  If the s3idlhelpers.py file is not in the current working directory, the command must also point to its location (or you can use environment variables or other approaches).

In [9]:
mycommand = 'python s3idlhelpers.py sync '+s3urli
print,mycommand
spawn,mycommand,fname
print,fname


Similarly, here is the spawned delete of that file:

In [10]:
print,'fname is',fname
mycommand = 'python s3idlhelpers.py del '+fname
print,mycommand
spawn,mycommand

And the alternate form, passing the original s3:// URL.  This is safe: it gets parsed down to just the temp file name.

In [11]:
mycommand = 'python s3idlhelpers.py del '+s3urli
spawn,mycommand
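
Since 'spawn' captures whatever the command writes to standard output, the 'sync' action has to print the temp file name and nothing else. Here is a sketch of a command line with that shape. It is an illustration and not the actual s3idlhelpers.py source; the 'sync' and 'purge' callables are stand-ins that we pass in.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="sketch of a sync/del command line")
    parser.add_argument("action", choices=["sync", "del"])
    parser.add_argument("target", help="s3:// URL or temp file name")
    return parser

def main(argv, sync, purge):
    """Dispatch 'sync' or 'del' to the supplied callables."""
    args = build_parser().parse_args(argv)
    if args.action == "sync":
        # Print only the temp file name so the caller's captured
        # output is exactly the value it needs.
        print(sync(args.target))
    else:
        purge(args.target)
    return 0

# Demo with stand-in callables instead of real S3 access
main(["sync", "s3://helio-public/skantunes/psp_wispr.nc"],
     sync=lambda url: "./psp_wispr.nc", purge=lambda name: None)
```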

## So, where are these temp files stored?
You can set the environment variable 'S3TEMP' (all caps) in your conda environment to choose where the temp files are stored.  (Note that you cannot set this from inside the notebook using IDL's 'setenv' or a %%python command, as those settings do not persist across %%python calls.)
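
For illustration, here is a sketch of how a helper could resolve its temp directory from 'S3TEMP'. The function name 'resolve_temp_dir' is hypothetical, and falling back to the system temp directory when 'S3TEMP' is unset is our assumption; the actual default used by s3idlhelpers.py may differ.

```python
import os
import tempfile

def resolve_temp_dir(environ=None):
    """Return the directory for temp copies, creating it if needed."""
    env = os.environ if environ is None else environ
    # Assumption: use the system temp directory when S3TEMP is unset or empty
    tempdir = env.get("S3TEMP") or tempfile.gettempdir()
    os.makedirs(tempdir, exist_ok=True)
    return tempdir

print(resolve_temp_dir())
```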

The next section shows this behavior from within IDL.

## Digression on playing with environment variables and states
You can set environment variables in IDL that are available to IDL.

In [12]:
setenv,'S3TEMP=./mytempdir'
print,getenv('S3TEMP')

However, these will not persist into a %%python invocation; in the example below, the 'S3TEMP' set in IDL is not seen by Python.  Each %%python invocation _does_, on the other hand, retain any imports and variables from prior cells, so the cells behave like one contiguous script.

In [13]:
%%python
import os
testvar = 42  # we will use this later in this demo
print("Notice the IDL environment is not found: ",os.environ.get('S3TEMP'))

In [14]:
%%python
print("But our test variable '42' does persist across cells: ",testvar)

You can store the location of s3idlhelpers.py in an ordinary IDL variable or in an environment variable set within IDL, to assist with bookkeeping.

In [15]:
setenv,'s3script=./s3idlhelpers.py'

Then build your commands using that variable:

In [16]:
s3script=getenv('s3script')
print,s3script
mycommand = 'python '+s3script+' del '+s3urli
print,mycommand

## Iterating over many files
To again assist with bookkeeping, here is sample logic to walk through an IDL list of S3 files, and return the temporary file names.  (Note that IDL multi-line blocks in a cell require an extra 'END' statement, as per https://www.l3harrisgeospatial.com/docs/IDL_Kernel.html)

In [17]:
myfiles = LIST('s3://helio-public/skantunes/psp_wispr.nc','s3://helio-public/skantunes/mms_fgm.nc','s3://helio-public/skantunes/guvi3.nc')
tempfiles = LIST()

In [18]:
FOREACH element, myfiles DO BEGIN
    spawn,'python s3idlhelpers.py sync '+element,tname
    tempfiles.add,tname
ENDFOREACH
END

In [19]:
print,tempfiles

And here we can easily delete them in a loop as well.

In [20]:
FOREACH element, tempfiles DO spawn,'python s3idlhelpers.py del '+element
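
The same bookkeeping can be done on the Python side. The sketch below loops over a list of URLs and records which ones failed instead of stopping at the first problem. 'sync_all' is a hypothetical name of ours, and we assume a failed sync raises an exception, which may not match how s3idlhelpers reports errors.

```python
def sync_all(urls, sync):
    """Try sync on every URL; return (tempfiles, failures)."""
    tempfiles, failures = [], []
    for url in urls:
        try:
            tempfiles.append(sync(url))
        except Exception as err:
            # Keep going, but remember which URL failed and why
            failures.append((url, str(err)))
    return tempfiles, failures

# In a %%python cell this could be called as (not run here):
#   tempfiles, failures = sync_all(my_urls, s3idlhelpers.s3tempsync)
```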

##  Closing Thoughts
These scripts and methods are robust but not bulletproof.  Feedback to the developer is encouraged (sandy.antunes@jhuapl.edu).