## How to use the Data Lab *Store Client* Service

*Revised:  Jul 24, 2019*

This notebook documents how to use the Data Lab virtual storage system via the store client service. This can be done either from a Python script (e.g. within this notebook) or from the command line using the <i>datalab</i> command.

### The storage manager service interface

The store client service simplifies access to the Data Lab virtual storage system. This section describes the service interface for readers who want to write their own code against it rather than use one of the provided tools. The service accepts an HTTP GET call to the endpoint for each operation:

| Endpoint | Description | Required Parameters |
|----------|-------------|------------|
| /get | Retrieve a file | name |
| /put | Upload a file | name |
| /load | Load a file to vospace | name, endpoint |
| /cp | Copy a file/directory | from, to |
| /ln | Link a file/directory | from, to |
| /lock | Lock a node from write updates | name |
| /ls | Get a file/directory listing | name |
| /access | Determine file accessibility | name |
| /stat | File status info | name, verbose |
| /mkdir | Create a directory | name |
| /mv | Move/rename a file/directory | from, to |
| /rm | Delete a file | name |
| /rmdir | Delete a directory | name |
| /tag | Annotate a file/directory | name, tag |

For example, a call to <i>http://datalab.noao.edu/storage/get?name=vos://mag.csv</i> will retrieve the file '_mag.csv_' from the root directory of the user's virtual storage. Likewise, a Python call using the _storeClient_ interface such as "_storeClient.get('vos://mag.csv')_" would get the same file.
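As a minimal sketch of how such a call URL can be assembled, the snippet below combines an endpoint from the table with its parameters. The base URL is taken from the example above; the helper name `build_store_url` is hypothetical and not part of the Data Lab client. Parameters are passed as a dictionary because `from` (used by `/cp`, `/ln` and `/mv`) is a reserved word in Python.

```python
from urllib.parse import urlencode

# Base URL taken from the example in the text; adjust if the service moves.
STORAGE_BASE = "http://datalab.noao.edu/storage"


def build_store_url(endpoint, params):
    """Return the GET URL for a storage endpoint such as 'get' or 'cp'.

    Hypothetical helper: it only builds the URL and does not contact the service.
    """
    # Keep ':' and '/' unescaped so 'vos://' identifiers stay readable.
    query = urlencode(params, safe=":/")
    return f"{STORAGE_BASE}/{endpoint.strip('/')}?{query}"


get_url = build_store_url("get", {"name": "vos://mag.csv"})
cp_url = build_store_url("cp", {"from": "vos://newmags.csv",
                                "to": "vos://temp/newmags.csv"})
print(get_url)
print(cp_url)
```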

#### Virtual storage identifiers

Files in the virtual storage are usually identified via the prefix "_vos://_". This shorthand identifier resolves to the user's home directory in the storage space. As a convenience, the prefix may be omitted when the parameter refers to a node in the virtual storage. Navigation above a user's home directory is not supported; however, subdirectories within the space may be created and used as needed.
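The identifier rules above can be illustrated with a small helper that adds the prefix when it is missing and rejects attempts to climb above the home directory. This is only a sketch of the convention as described here; `normalize_vos_name` is a hypothetical name, and the service itself may apply additional rules.

```python
VOS_PREFIX = "vos://"


def normalize_vos_name(name):
    """Return `name` with an explicit 'vos://' prefix (hypothetical helper)."""
    path = name[len(VOS_PREFIX):] if name.startswith(VOS_PREFIX) else name
    # Navigation above the home directory is not supported, so reject '..'.
    if ".." in path.split("/"):
        raise ValueError(f"cannot navigate above the home directory: {name!r}")
    return VOS_PREFIX + path


print(normalize_vos_name("mag.csv"))
print(normalize_vos_name("vos://temp/newmags.csv"))
```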

#### Authentication
The storage manager service requires a Data Lab security token. The token must be passed as the value of the header keyword "X-DL-AuthToken" in any HTTP GET call to the service. If no token is supplied, anonymous access is assumed, which provides access only to public storage spaces.
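A minimal sketch of attaching the token to a request with the standard library is shown below. It builds the request without sending it; the helper name `make_store_request` and the token value `my-token` are placeholders, not part of the Data Lab client.

```python
from urllib.request import Request


def make_store_request(url, token=None):
    """Build (but do not send) a GET request for the storage service."""
    headers = {}
    if token:
        # Header keyword named in the text above; omit it for anonymous access.
        headers["X-DL-AuthToken"] = token
    return Request(url, headers=headers, method="GET")


# 'my-token' is a placeholder; a real token comes from authClient.login().
req = make_store_request("http://datalab.noao.edu/storage/ls?name=vos://",
                         token="my-token")
anon = make_store_request("http://datalab.noao.edu/storage/ls?name=vos://")

print(req.get_method(), req.full_url)
# urllib normalises the capitalisation of header names; HTTP header
# names are case-insensitive, so the service sees the same keyword.
print(dict(req.header_items()))
```

Passing the resulting object to `urllib.request.urlopen()` would send the request.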

### From Python code

The store client service can be called from Python code using the <i>datalab</i> module. This provides methods to access the various functions in the <i>storeClient</i> subpackage. 

#### Initialization
This setup is required before using the store client. First we import the relevant Python modules and retrieve our Data Lab security token.

In [2]:
# Standard notebook imports
from getpass import getpass
from dl import authClient, storeClient

In [3]:
# Get the authentication token for the user
token = authClient.login(input("Enter user name: (+ENTER) "),getpass("Enter password: (+ENTER) "))
if not authClient.isValidToken(token):
    raise Exception('Token is not valid. Please check your username/password and execute this cell again.')

Enter user name: (+ENTER) fitz
Enter password: (+ENTER) ········


#### Listing a file/directory

We can see all the files that are in a specific directory or get a full listing for a specific file.  In this case, we'll list the default virtual storage directory to use as a basis for changes we'll make below.

In [3]:
listing = storeClient.ls (name = 'vos://')
print (listing)

grzw1_3sn10_29M.jpg,grzw1_sn10_15M.jpg,newmags.csv,newmap.fits,public,results,tmp


The *public* directory shown here is visible to all Data Lab users and provides a means of sharing data without having to set up special access. Similarly, the *tmp* directory is read-protected and provides a convenient temporary directory to be used in a workflow.
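The listing shown above is a single comma-separated string. For further processing it can be convenient to split it into a Python list, as in this small sketch. The helper name `parse_listing` is hypothetical, and the sample string is copied from the output above rather than fetched from the service.

```python
def parse_listing(listing):
    """Split the comma-separated string returned by ls() into a list of names."""
    return [name for name in listing.strip().split(",") if name]


# Sample copied from the listing output shown earlier in this notebook.
listing = ("grzw1_3sn10_29M.jpg,grzw1_sn10_15M.jpg,newmags.csv,"
           "newmap.fits,public,results,tmp")
names = parse_listing(listing)
print(len(names), "entries")
print("public" in names)
```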

#### File Existence and Info

Aside from simply listing files, it's possible to test whether a named file already exists or to determine more information about it.

In [4]:
# A simple file existence test:
if storeClient.access ('vos://public'):
    print ('User "public" directory exists')
if storeClient.access ('vos://public', mode='w'):
    print ('User "public" directory is group/world writable')
else:
    print ('User "public" directory is not group/world writable')
    
if storeClient.access ('vos://tmp'):
    print ('User "tmp" directory exists')        
if storeClient.access ('vos://tmp', mode='w'):
    print ('User "tmp" directory is group/world writable')
else:
    print ('User "tmp" directory is not group/world writable')

User "public" directory exists
User "public" directory is group/world writable
User "tmp" directory exists
User "tmp" directory is group/world writable
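The repeated checks in the cell above could be folded into one helper. The sketch below takes the access function as an argument so that it runs offline with a stand-in; in the notebook, `storeClient.access` would be passed instead. Both `describe_access` and `fake_access` are hypothetical names, and the permissions in `fake_access` are made-up illustration data.

```python
def describe_access(access, path):
    """Summarise existence and writability of `path` using an access() callable."""
    exists = bool(access(path))
    # Only ask about write access when the node exists at all.
    writable = exists and bool(access(path, mode="w"))
    return {"path": path, "exists": exists, "writable": writable}


def fake_access(path, mode=None):
    """Stand-in for storeClient.access so the sketch runs without the service."""
    known = {"vos://public": "rw", "vos://tmp": "rw"}  # illustration data only
    perms = known.get(path)
    if perms is None:
        return False
    return mode is None or mode in perms


for node in ("vos://public", "vos://tmp", "vos://missing"):
    print(describe_access(fake_access, node))
```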


#### Uploading a file

Now we want to upload a new data file from our local disk to the virtual storage:

In [5]:
storeClient.put (to = 'vos://newmags.csv', fr = './newmags.csv')
print(storeClient.ls (name='vos://'))

(1 / 1) ./newmags.csv -> vos://newmags.csv
grzw1_3sn10_29M.jpg,grzw1_sn10_15M.jpg,newmags.csv,newmap.fits,public,results,tmp


#### Downloading a file

Let's say we want to download a file from our virtual storage space, in this case a query result that we saved to it in the "How to use the Data Lab query manager service" notebook:

In [6]:
storeClient.get (fr = 'vos://newmags.csv', to = './mymags.csv')



['OK']

It is also possible to get the contents of a remote file directly into your notebook by specifying the *'to'* location as an empty string:

In [7]:
data = storeClient.get (fr = 'vos://newmags.csv', to = '')
print (data)

id,g,r,i
001,22.3,12.4,21.5
002,22.3,12.4,21.5
003,22.3,12.4,21.5
004,22.3,12.4,21.5
005,22.3,12.4,21.5
006,22.3,12.4,21.5
007,22.3,12.4,21.5
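Once the file contents are in memory, they can be parsed without writing to disk. The sketch below uses the standard `csv` module on a short sample in the same format as the output above. `parse_csv_text` is a hypothetical helper, and the bytes check is a precaution in case the client returns bytes rather than a string.

```python
import csv
import io


def parse_csv_text(text):
    """Parse CSV text (or UTF-8 bytes) into a list of row dictionaries."""
    if isinstance(text, bytes):
        text = text.decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))


# Short sample in the same format as the file contents printed above.
sample = "id,g,r,i\n001,22.3,12.4,21.5\n002,22.3,12.4,21.5\n"
rows = parse_csv_text(sample)
print(rows[0]["id"], float(rows[0]["g"]))
```

In the notebook, `parse_csv_text(data)` would be called on the value returned by `storeClient.get()`.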



#### Loading a file from a remote URL

It is possible to load a file directly to virtual storage from a remote URL (e.g. an "accessURL" for an image cutout, a remote data file, etc.) using the "storeClient.load()" method:

In [4]:
url = "http://datalab.noao.edu/svc/cutout?col=&siaRef=c4d_161005_022804_ooi_g_v1.fits.fz&extn=31&POS=335.0,0.0&SIZE=0.1"
storeClient.load('vos://cutout.fits',url)

'OK'
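Long query URLs like the one above are easier to maintain when built from their parts. The following sketch reassembles the same cutout URL from a dictionary of parameters. The parameter names are taken from the URL itself; the helper name `cutout_url` and its argument names are hypothetical, and no claim is made here about the meaning or units of each parameter.

```python
from urllib.parse import urlencode

# Service address taken from the example URL above.
CUTOUT_BASE = "http://datalab.noao.edu/svc/cutout"


def cutout_url(sia_ref, extn, pos, size):
    """Assemble a cutout URL from the parameters seen in the example (sketch)."""
    params = {
        "col": "",
        "siaRef": sia_ref,
        "extn": extn,
        "POS": f"{pos[0]},{pos[1]}",
        "SIZE": size,
    }
    # Keep ',' unescaped so POS matches the form used in the example.
    return CUTOUT_BASE + "?" + urlencode(params, safe=",")


url = cutout_url("c4d_161005_022804_ooi_g_v1.fits.fz", 31, (335.0, 0.0), 0.1)
print(url)
```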

#### Creating a directory

We can create a directory on the remote storage to be used for saving data later:

In [8]:
storeClient.mkdir ('vos://results')

'OK'

#### Copying a file/directory

We want to put a copy of the file in a remote work directory:

In [9]:
storeClient.mkdir ('vos://temp')
print ("Before: " + storeClient.ls (name='vos://temp/'))
storeClient.cp (fr = 'vos://newmags.csv', to = 'vos://temp/newmags.csv')
print ("After: " + storeClient.ls (name='vos://temp/'))

Before: 
After: newmags.csv


Notice that in the *ls()* call we append a trailing '/' to the directory name to list the contents of the directory rather than the directory itself.
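A tiny helper can make the trailing-slash convention explicit so it is not forgotten. This is a sketch only, and `as_directory` is a hypothetical name.

```python
def as_directory(name):
    """Return `name` with a trailing '/' so ls() lists the directory contents."""
    return name if name.endswith("/") else name + "/"


print(as_directory("vos://temp"))
print(as_directory("vos://temp/"))
```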

#### Linking to a file/directory

**WARNING**: Linking is currently **not** working in the Data Lab storage manager. This notebook will be updated when the problem has been resolved.

Sometimes we want to create a link to a file or directory.  In this case, the link named by the *'fr'* parameter is created and points to the file/container named by the *'target'* parameter.

In [10]:
storeClient.ln ('vos://mags.csv', 'vos://temp/newmags.csv')
print ("Root dir: " + storeClient.ls (name='vos://'))
print ("Temp dir: " + storeClient.ls (name='vos://temp/'))

Root dir: grzw1_3sn10_29M.jpg,grzw1_sn10_15M.jpg,newmags.csv,newmap.fits,public,results,temp,tmp
Temp dir: newmags.csv


#### Moving/renaming a file/directory

We can move a file or directory:

In [11]:
storeClient.mv(fr = 'vos://temp/newmags.csv', to = 'vos://results')
print ("Results dir: " + storeClient.ls (name='vos://results/'))

Results dir: 


#### Deleting a file

We can delete a file:

In [12]:
print ("Before: " + storeClient.ls (name='vos://'))
storeClient.rm (name = 'vos://mags.csv')
print ("After: " + storeClient.ls (name='vos://'))

Before: grzw1_3sn10_29M.jpg,grzw1_sn10_15M.jpg,newmags.csv,newmap.fits,public,results,temp,tmp
After: grzw1_3sn10_29M.jpg,grzw1_sn10_15M.jpg,newmags.csv,newmap.fits,public,results,temp,tmp
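Comparing two listings by eye becomes error-prone as a directory grows. The sketch below computes which names were added or removed between two comma-separated listings. `listing_diff` is a hypothetical helper, and the strings are copied from the outputs in this notebook. Applied to the "Before" and "After" lines above, it reports no change, since the two listings are identical.

```python
def listing_diff(before, after):
    """Return (added, removed) names between two comma-separated listings."""
    old = {name for name in before.split(",") if name}
    new = {name for name in after.split(",") if name}
    return sorted(new - old), sorted(old - new)


# Listings copied from the outputs shown in this notebook.
root = ("grzw1_3sn10_29M.jpg,grzw1_sn10_15M.jpg,newmags.csv,"
        "newmap.fits,public,results,temp,tmp")
print(listing_diff(root, root))        # the rm example: nothing changed
print(listing_diff("", "newmags.csv"))  # the cp example: one file added
```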


#### Deleting a directory

We can also delete a directory; doing so also deletes the contents of that directory:

In [13]:
storeClient.rmdir(name = 'vos://temp')

'OK'

#### Tagging a file/directory

**Warning**: Tagging is currently **not** working in the Data Lab storage manager. This notebook will be updated when the problem has been resolved.

We can tag any file or directory with arbitrary metadata:

In [14]:
storeClient.tag('vos://results', 'The results from my analysis')

'<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">\n<title>500 Internal Server Error</title>\n<h1>Internal Server Error</h1>\n<p>The server encountered an internal error and was unable to complete your request.  Either the server is overloaded or there is an error in the application.</p>\n'
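The calls in this notebook return `'OK'` or `['OK']` on success, while the failed call above returns an HTML error page as a plain string rather than raising an exception. A script may therefore want to check the return value explicitly. The following is a minimal sketch based only on the responses shown in this notebook; `is_ok` is a hypothetical helper, and other success values may exist.

```python
def is_ok(response):
    """Return True if a store client response looks like a success."""
    if isinstance(response, (list, tuple)):
        # e.g. ['OK'] as returned by get() above; an empty list is not a success.
        return len(response) > 0 and all(is_ok(item) for item in response)
    return isinstance(response, str) and response.strip() == "OK"


error_page = "<title>500 Internal Server Error</title>"
print(is_ok("OK"), is_ok(["OK"]), is_ok(error_page))
```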

#### Clean up the demo directory of remaining files

In [15]:
storeClient.rm (name = 'vos://newmags.csv')
storeClient.rm (name = 'vos://results')
storeClient.ls (name = 'vos://')

'grzw1_3sn10_29M.jpg,grzw1_sn10_15M.jpg,newmap.fits,public,tmp'

### Using the datalab command

The <i>datalab</i> command provides an alternative, command-line way to work with the storage manager through subcommands such as <i>get</i>, <i>put</i>, <i>cp</i> and <i>ls</i>.

#### Initialization
We need to be logged in to the Data Lab to use the storage manager.

In [16]:
!datalab login user=geychaner password=

User 'geychaner' is already logged in to the Data Lab


#### Downloading a file

Let's say we want to download a file from our virtual storage space:

In [17]:
!datalab get fr="vos://mags.csv" to="./mags.csv"

#### Uploading a file

Now we want to upload a new data file from our local disk:

In [18]:
!datalab put fr="./newmags.csv" to="vos://newmags.csv"

#### Copying a file/directory

We want to put a copy of the file in a remote work directory:

In [19]:
!datalab cp fr="vos://newmags.csv" to="vos://temp/newmags.csv"

#### Linking to a file/directory

Sometimes we want to create a link to a file or directory:

In [20]:
!datalab ln fr="vos://temp/mags.csv" to="vos://mags.csv"

#### Listing a file/directory

We can see all the files that are in a specific directory or get a full listing for a specific file:

In [21]:
!datalab ls name="vos://temp"

grzw1_3sn10_29M.jpg,grzw1_sn10_15M.jpg,newmags.csv,newmap.fits,public,tmp


#### Creating a directory

We can create a directory:

In [22]:
!datalab mkdir name="vos://results"

#### Moving/renaming a file/directory

We can move a file or directory:

In [23]:
!datalab mv fr="vos://temp/newmags.csv" to="vos://results"

#### Deleting a file

We can delete a file:

In [24]:
!datalab rm name="vos://temp/mags.csv"

#### Deleting a directory

We can also delete a directory:

In [25]:
!datalab rmdir name="vos://temp"

#### Tagging a file/directory

We can tag any file or directory with arbitrary metadata:

In [26]:
!datalab tag name="vos://results" tag="The results from my analysis"
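The command-line calls above all share the form `datalab <task> key=value ...`, so they can also be assembled programmatically, for example to run them from a script outside the notebook. The sketch below only builds and prints the argument list. `datalab_args` is a hypothetical helper, and actually running the command assumes the <i>datalab</i> executable is installed and on the `PATH`.

```python
import shlex


def datalab_args(task, **params):
    """Build the argument list for a 'datalab <task> key=value ...' call."""
    return ["datalab", task] + [f"{key}={value}" for key, value in params.items()]


put_args = datalab_args("put", fr="./newmags.csv", to="vos://newmags.csv")
tag_args = datalab_args("tag", name="vos://results",
                        tag="The results from my analysis")

# Show the shell-quoted form of the command without running it.
put_cmd = " ".join(shlex.quote(arg) for arg in put_args)
print(put_cmd)
print(tag_args)
```

Passing a list such as `put_args` to `subprocess.run(put_args, check=True)` would execute the command without needing shell quoting.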