In [1]:
%load_ext ivortex
%vortex tmpcocoon

# [2019/08/27-15:25:55][vortex.sessions][_set_rundir:0155][INFO]: Session <root> set rundir </home/meunierlf/vortex-workdir/auto_cocoon_1xrzv38q>


Vortex 1.6.2 loaded ( Tuesday 27. August 2019, at 15:25:53 )
The working directory is now: /home/meunierlf/vortex-workdir/auto_cocoon_1xrzv38q/root


'/home/meunierlf/vortex-workdir/auto_cocoon_1xrzv38q'

# Handling of Resources in Vortex (discovery) 

Most people would naturally describe a data request as follows:

> Grid-point data at 3h lead time (on the EUROC25
> domain, in GRIB format) for the Arpege model of
> the production cutoff from 01/06/2017 18UTC for
> the second member of the e-suite PEARP ensemble
> prediction system (to be recovered via the
> mass-archiving system). 

The key idea of Vortex is to keep all the richness of such a description. 

Yet, in most scripts, the same request is reduced to a hard-coded address like this one:

``ftp://hendrix.meteo.fr/~mxpt001/vortex/arpege/pearp/DBLE/2017/06/01/T1800P/mb002/forecast/grid.arpege-forecast.euroc25+0003:00.grib``
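
To see how much information is buried in that single string, the standalone sketch below rebuilds it from the fields of the plain-language description. It does not use Vortex: the `describe_as_url` function, its field names and the way each field is mapped onto a path component are assumptions inferred from this one example only.

```python
from datetime import datetime


def describe_as_url(desc):
    """Rebuild the example address from a structured description.

    Illustration only: the layout is inferred from the single example above.
    """
    date = desc["date"]
    template = (
        "ftp://{host}/~{user}/vortex/{vapp}/{vconf}/{experiment}/"
        "{day}/T{hhmm}{cutoff}/mb{member:03d}/{block}/"
        "grid.{model}-forecast.{geometry}+{term:04d}:00.{fmt}"
    )
    return template.format(
        host=desc["host"],
        user=desc["user"],
        vapp=desc["vapp"],
        vconf=desc["vconf"],
        experiment=desc["experiment"],
        day=date.strftime("%Y/%m/%d"),
        hhmm=date.strftime("%H%M"),
        # Assumption: the cutoff is abbreviated to its first letter ('P').
        cutoff=desc["cutoff"][0].upper(),
        member=desc["member"],
        block=desc["block"],
        model=desc["model"],
        geometry=desc["geometry"],
        term=desc["term"],  # lead time in hours
        fmt=desc["fmt"],
    )


description = dict(
    host="hendrix.meteo.fr", user="mxpt001",
    vapp="arpege", vconf="pearp", experiment="DBLE",
    date=datetime(2017, 6, 1, 18), cutoff="production", member=2,
    block="forecast", model="arpege", geometry="euroc25", term=3, fmt="grib",
)
url = describe_as_url(description)
print(url)
```

Every piece of the description ends up somewhere in the string, but once the string is written down, none of it can be queried or changed individually.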

## Vortex Data Description Objects

With Vortex, this descriptive approach results in the association of three objects:

  * An object of base class **vortex.data.resources.Resource**. This object describes the data itself. In the previous example: “Grid-point data at 3h lead time (on the EUROC25 domain, in GRIB format) for the Arpege model of the production cutoff from 01/06/2017 18UTC”.
  * An object of base class **vortex.data.providers.Provider**. This object describes how the data will be retrieved. In the previous example: “second member of the e-suite PEARP ensemble prediction system (to be recovered via the mass-archiving system)”.
  * An object of base class **vortex.data.containers.Container**. This object describes where the resource will be stored locally. It is usually a file (described by its path in the file system). 

In Vortex, this translates as follows. For the *Resource* object:

In [2]:
pe_resource = fp.proxy.resource(kind='gridpoint',
                                term=3,
                                geometry='euroc25',
                                nativefmt='grib',
                                model='arpege',
                                cutoff='production',
                                date='2017060118',
                                origin='historic')
print(pe_resource)

<common.data.gridfiles.GridPointExport object at 0x7f9cbf1803c8 | geometry='<vortex.data.geometries.LonlatGeometry | tag='euroc25' id='Large european BDAP target' area='EUROC25' r='02dg500'>' cutoff='production' date='2017-06-01T18:00:00Z' origin='hst' model='arpege' filtername='None' term='03:00'>


For the *Provider* object: 

In [3]:
pe_provider = fp.proxy.provider(vapp='arpege',
                                vconf='pearp',
                                member=2,
                                experiment='DBLE',
                                namespace='vortex.archive.fr',
                                block='forecast')
print(pe_provider)

<vortex.data.providers.VortexOp object at 0x7f9cbf180390 | namespace='vortex.archive.fr' block='forecast'>


For the *Container* object:

In [4]:
pe_container = fp.proxy.container(local='my_pearp_gribfile', format='grib')
print(pe_container)

<vortex.data.containers.SingleFile object at 0x7f9cbf1802e8 | path='my_pearp_gribfile'>


## Making these objects interact: the *resource*'s *Handler*

To manipulate data in Vortex, you must associate the three objects mentioned above. This aggregation is done by an object of base class **vortex.data.handlers.Handler**.

This object allows the manipulation of data via simple methods: 

  * **locate** : determine the physical location of the data (here the ftp address); 
  * **get** : retrieve the data and place it in the *container*;
  * **put** : send the contents of the *container* to the location described by the *resource* and *provider* objects; 
  * **check** : check the existence of the data; 
  * **delete** : delete the data described by the *resource* and *provider* objects; 
  * **clear** : delete the contents of the *container*.
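
The division of labour between the three objects and the methods listed above can be pictured with a standalone toy model. This is not the Vortex implementation: `ToyHandler` is a hypothetical class in which the "resource" is just a file name, the "provider" a directory standing in for the archive, and the "container" a local path.

```python
import os
import shutil
import tempfile


class ToyHandler:
    """Hypothetical stand-in tying a resource, a provider and a container together."""

    def __init__(self, resource, provider, container):
        self.resource = resource    # what the data is (here: a file name)
        self.provider = provider    # where it comes from (here: a directory)
        self.container = container  # where it lands locally (here: a path)

    def locate(self):
        # The remote location depends on the resource and the provider only.
        return os.path.join(self.provider, self.resource)

    def check(self):
        return os.path.exists(self.locate())

    def get(self):
        if not self.check():
            return False
        shutil.copyfile(self.locate(), self.container)
        return True

    def put(self):
        shutil.copyfile(self.container, self.locate())
        return True

    def delete(self):
        os.remove(self.locate())
        return True

    def clear(self):
        # Only the local copy is removed; the remote data is untouched.
        if os.path.exists(self.container):
            os.remove(self.container)
        return True


# Build a fake "archive" holding one file, then exercise the handler.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "archive")
os.mkdir(archive)
with open(os.path.join(archive, "grid.grib"), "w") as fh:
    fh.write("fake data")

toy_rh = ToyHandler(resource="grid.grib", provider=archive,
                    container=os.path.join(workdir, "my_local_file"))
got = toy_rh.get()
local_exists_after_get = os.path.exists(toy_rh.container)
cleared = toy_rh.clear()
local_exists_after_clear = os.path.exists(toy_rh.container)
remote_still_there = toy_rh.check()
print(got, cleared, remote_still_there)
shutil.rmtree(workdir)  # remove the temporary playground
```

Note that `locate`, `check` and `delete` never look at the container, while `clear` never looks at the resource or the provider.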

In practice...

In [5]:
rhandler = vortex.data.handlers.Handler(dict(resource=pe_resource,
                                             provider=pe_provider,
                                             container=pe_container))

To retrieve data:

In [6]:
rhandler.get()
%ls -l

# [2019/08/27-15:33:24][vortex.data.stores][inarchiveget:1199][INFO]: inarchiveget on vortex://vsop.archive.fr//home/m/mxpt/mxpt001/vortex/arpege/pearp/DBLE/2017/06/01/T1800P/mb002/forecast/grid.arpege-forecast.euroc25+0003:00.grib (to: my_pearp_gribfile)
# [2019/08/27-15:33:24][vortex.tools.storage][_ftpretrieve:0690][INFO]: ftpget on ftp://hendrix.meteo.fr//home/m/mxpt/mxpt001/vortex/arpege/pearp/DBLE/2017/06/01/T1800P/mb002/forecast/grid.arpege-forecast.euroc25+0003:00.grib (to: my_pearp_gribfile)
# [2019/08/27-15:33:26][vortex.tools.net][get:0268][INFO]: FTP <get:/home/m/mxpt/mxpt001/vortex/arpege/pearp/DBLE/2017/06/01/T1800P/mb002/forecast/grid.arpege-forecast.euroc25+0003:00.grib>


total 1328
-rw-r--r-- 1 meunierlf algo 1357578 août  27 15:33 my_pearp_gribfile


A bit of cleanup before moving on...

In [7]:
rhandler.clear()

True

In [8]:
%ls -l

total 0


## Use of the data

The **Handler** class does not meet all the requirements; we would like to be able to specify:

  * What we want to do with these data 
  * What to do if an error occurs 
  * If the data need to be updated or not 

This is the role of the object of class **vortex.layout.dataflow.Section** that wraps the *resource*'s *Handler*:

In [10]:
section = vortex.layout.dataflow.Section(rh=rhandler,
                                         # This file serves as:
                                         role='InitialCondition',
                                         # This is an input file:
                                         kind=vortex.layout.dataflow.ixo.INPUT,
                                         # Crash on error:
                                         fatal=True,
                                         # Read/Write:
                                         intent='inout')

This object has two main methods, **get** and **put**, that call the *resource*’s *Handler*, taking into account the *fatal* and *intent* attributes:

In [11]:
section.get()
%ls -l

# [2019/08/27-15:37:44][vortex.data.stores][inarchiveget:1199][INFO]: inarchiveget on vortex://vsop.archive.fr//home/m/mxpt/mxpt001/vortex/arpege/pearp/DBLE/2017/06/01/T1800P/mb002/forecast/grid.arpege-forecast.euroc25+0003:00.grib (to: my_pearp_gribfile)
# [2019/08/27-15:37:44][vortex.tools.storage][_ftpretrieve:0690][INFO]: ftpget on ftp://hendrix.meteo.fr//home/m/mxpt/mxpt001/vortex/arpege/pearp/DBLE/2017/06/01/T1800P/mb002/forecast/grid.arpege-forecast.euroc25+0003:00.grib (to: my_pearp_gribfile)
# [2019/08/27-15:37:46][vortex.tools.net][get:0268][INFO]: FTP <get:/home/m/mxpt/mxpt001/vortex/arpege/pearp/DBLE/2017/06/01/T1800P/mb002/forecast/grid.arpege-forecast.euroc25+0003:00.grib>


total 1328
-rw-r--r-- 1 meunierlf algo 1357578 août  27 15:37 my_pearp_gribfile


The *resource*'s *Handler* remains accessible (for example, to empty the *container*): 

In [12]:
section.rh.clear()
%ls -l

total 0
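
The error policy carried by the *fatal* attribute can be pictured with a standalone sketch. `ToySection` and `MissingDataHandler` are hypothetical classes, not Vortex code, and the sketch assumes that a failed retrieval raises an exception when *fatal=True* and is silently reported otherwise. The *intent* attribute is only recorded here, since its exact effect is not detailed above.

```python
class MissingDataHandler:
    """Toy handler whose data can never be found."""

    def get(self):
        return False


class ToySection:
    """Hypothetical wrapper adding a role, an intent and an error policy."""

    def __init__(self, rh, role, fatal=True, intent="inout"):
        self.rh = rh          # the wrapped handler stays accessible
        self.role = role
        self.fatal = fatal
        self.intent = intent  # stored only; not interpreted in this sketch

    def get(self):
        try:
            done = bool(self.rh.get())
        except OSError:
            done = False
        if not done and self.fatal:
            raise RuntimeError(
                "Could not get the resource with role {!r}".format(self.role)
            )
        return done


# A non-fatal section reports the failure and lets the script continue.
optional = ToySection(MissingDataHandler(), role="Optional", fatal=False)
optional_result = optional.get()

# A fatal section stops everything.
mandatory = ToySection(MissingDataHandler(), role="InitialCondition")
try:
    mandatory.get()
    mandatory_failed = False
except RuntimeError as err:
    mandatory_failed = True
    print(err)
```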


Summary diagram

![](../images/data_management_overview.png)

## Quick creation of *Handlers* and *Sections* via the Toolbox module 

Creating the different objects manually is tedious. The **vortex.toolbox** module (which is almost always imported) therefore provides functions that create a whole set of objects in a single call.

### Quick creation of *resource*'s *Handlers*

In [13]:
rhandlers = toolbox.rload(# Resource
                          kind='gridpoint', term=3, geometry='euroc25',
                          nativefmt='grib', model='arpege', 
                          cutoff='production', date='2017060118',
                          origin='historic',
                          # Provider
                          vapp='arpege', vconf='pearp', member=2,
                          experiment='DBLE', namespace='vortex.archive.fr',
                          block='forecast',
                          # Container
                          local='my_pearp_gribfile', format='grib')
print(rhandlers)

[<vortex.data.handlers.Handler object at 0x7f9cbf10c438>]


We get a list of *resource*'s *Handlers* and not a single object. We will come back to that later; for now, let's accept it as a fact. The object contained in the list is the expected one:

In [14]:
rhandlers[0].quickview()

00. <vortex.data.handlers.Handler object at 0x7f9cbf10c438>
  Complete  : True
  Container : <vortex.data.containers.SingleFile object at 0x7f9cbf10c4a8 | path='my_pearp_gribfile'>
  Provider  : <vortex.data.providers.VortexOp object at 0x7f9cbf10c320 | namespace='vortex.archive.fr' block='forecast'>
  Resource  : <common.data.gridfiles.GridPointExport object at 0x7f9cbf10c390 | geometry='<vortex.data.geometries.LonlatGeometry | tag='euroc25' id='Large european BDAP target' area='EUROC25' r='02dg500'>' cutoff='production' date='2017-06-01T18:00:00Z' origin='hst' model='arpege' filtername='None' term='03:00'>


### Quick creation of sections

In [15]:
rhandlers = toolbox.input(# Input function options
                          verbose=False,
                          # Section
                          role='InitialCondition', fatal=True, intent='inout',
                          # Resource
                          kind='gridpoint', term=3, geometry='euroc25',
                          nativefmt='grib', model='arpege', 
                          cutoff='production', date='2017060118',
                          origin='historic',
                          # Provider
                          vapp='arpege', vconf='pearp', member=2,
                          experiment='DBLE', namespace='vortex.archive.fr',
                          block='forecast',
                          # Container
                          local='my_pearp_gribfile', format='grib')
print(rhandlers)

[<vortex.data.handlers.Handler object at 0x7f9cbf10c780>]


* Here too, we get a list of *resource*'s *Handlers*. It may sound strange, but the associated *Section* objects do exist (they are stored and can be recovered).
* The **input** function accepts some options:
    * **verbose**: If *verbose=True*, a summary is displayed on the standard output;
    * **loglevel**: The desired log level for Vortex (‘info’ by default);
    * **now**: If *now=True* (*False* by default), the resource is retrieved immediately by a call to the *get* method. This is often what we want;
    * **insitu**: If *insitu=True* and an identical *resource*'s *Handler* has already retrieved the data in a previous execution (so that the container is already filled), only the *Section* and *Handler* objects are created; the data are not retrieved again.
* There is also a **toolbox.output** function that creates *Sections* for output files (in this case, *now=True* implies an immediate call to the *put* method).
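
The interplay between the **now** and **insitu** options can be summarised with a standalone sketch. `toy_input` is a hypothetical function, not part of Vortex; it only models the decision described in the list above and records what would be done.

```python
def toy_input(fetch, already_filled, now=False, insitu=False):
    """Hypothetical model of the 'now' and 'insitu' options.

    fetch          -- callable that retrieves the data
    already_filled -- True if a previous run already filled the container
    """
    actions = ["create Section and Handler"]
    if insitu and already_filled:
        # The data are already there: do not retrieve them again.
        actions.append("reuse existing data")
    elif now:
        fetch()
        actions.append("get")
    return actions


fetch_log = []


def fake_fetch():
    """Pretend to retrieve the data and remember that it happened."""
    fetch_log.append("fetched")
    return True


deferred = toy_input(fake_fetch, already_filled=False)
immediate = toy_input(fake_fetch, already_filled=False, now=True)
reused = toy_input(fake_fetch, already_filled=True, now=True, insitu=True)
print(deferred, immediate, reused, fetch_log)
```

In the three calls, the data are fetched only once: in the second call, where *now=True* and nothing can be reused.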

In [16]:
rhandlers = toolbox.input(# Input function options
                          verbose=False, now=True,
                          # Section
                          role='InitialCondition', fatal=True, intent='inout',
                          # Resource
                          kind='gridpoint', term=3, geometry='euroc25',
                          nativefmt='grib', model='arpege', 
                          cutoff='production', date='2017060118',
                          origin='historic',
                          # Provider
                          vapp='arpege', vconf='pearp', member=2,
                          experiment='DBLE', namespace='vortex.archive.fr',
                          block='forecast',
                          # Container
                          local='my_pearp_gribfile', format='grib')
print(rhandlers)
rhandlers[0].clear()  # Cleanup

# [2019/08/27-15:47:59][vortex.data.stores][inarchiveget:1199][INFO]: inarchiveget on vortex://vsop.archive.fr//home/m/mxpt/mxpt001/vortex/arpege/pearp/DBLE/2017/06/01/T1800P/mb002/forecast/grid.arpege-forecast.euroc25+0003:00.grib (to: my_pearp_gribfile)
# [2019/08/27-15:47:59][vortex.tools.storage][_ftpretrieve:0690][INFO]: ftpget on ftp://hendrix.meteo.fr//home/m/mxpt/mxpt001/vortex/arpege/pearp/DBLE/2017/06/01/T1800P/mb002/forecast/grid.arpege-forecast.euroc25+0003:00.grib (to: my_pearp_gribfile)
# [2019/08/27-15:48:00][vortex.tools.net][get:0268][INFO]: FTP <get:/home/m/mxpt/mxpt001/vortex/arpege/pearp/DBLE/2017/06/01/T1800P/mb002/forecast/grid.arpege-forecast.euroc25+0003:00.grib>


[<vortex.data.handlers.Handler object at 0x7f9cbf10c978>]


True

It is possible to check and modify some defaults used by the *toolbox* module: 

In [17]:
toolbox.show_toolbox_settings()

+ active_now               = False
+ active_insitu            = False
+ active_verbose           = True
+ active_promise           = True
+ active_clear             = False
+ active_metadatacheck     = True
+ active_incache           = False


In [18]:
toolbox.active_now = True

The data retrieval is now done automatically even if `now=True` is omitted:

In [19]:
rhandlers = toolbox.input(# Input function options
                          verbose=False,
                          # Section
                          role='InitialCondition', fatal=True, intent='inout',
                          # Resource
                          kind='gridpoint', term=3, geometry='euroc25',
                          nativefmt='grib', model='arpege', 
                          cutoff='production', date='2017060118',
                          origin='historic',
                          # Provider
                          vapp='arpege', vconf='pearp', member=2,
                          experiment='DBLE', namespace='vortex.archive.fr',
                          block='forecast',
                          # Container
                          local='my_pearp_gribfile', format='grib')
print(rhandlers)
rhandlers[0].clear()  # cleanup

# [2019/08/27-15:49:14][vortex.data.stores][inarchiveget:1199][INFO]: inarchiveget on vortex://vsop.archive.fr//home/m/mxpt/mxpt001/vortex/arpege/pearp/DBLE/2017/06/01/T1800P/mb002/forecast/grid.arpege-forecast.euroc25+0003:00.grib (to: my_pearp_gribfile)
# [2019/08/27-15:49:14][vortex.tools.storage][_ftpretrieve:0690][INFO]: ftpget on ftp://hendrix.meteo.fr//home/m/mxpt/mxpt001/vortex/arpege/pearp/DBLE/2017/06/01/T1800P/mb002/forecast/grid.arpege-forecast.euroc25+0003:00.grib (to: my_pearp_gribfile)
# [2019/08/27-15:49:15][vortex.tools.net][get:0268][INFO]: FTP <get:/home/m/mxpt/mxpt001/vortex/arpege/pearp/DBLE/2017/06/01/T1800P/mb002/forecast/grid.arpege-forecast.euroc25+0003:00.grib>


[<vortex.data.handlers.Handler object at 0x7f9cbf10cd68>]


True
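
The fallback from a per-call argument to a module-wide default can be pictured with a standalone sketch. `ToyToolboxSettings` and `toy_input_with_default` are hypothetical names, not Vortex code. In this sketch an explicit argument takes precedence over the default; that Vortex behaves the same way in every case is an assumption.

```python
class ToyToolboxSettings:
    """Hypothetical holder for module-wide defaults."""

    active_now = False


def toy_input_with_default(now=None):
    """Return True if the data would be retrieved immediately."""
    if now is None:
        # Argument omitted: fall back to the module-wide default.
        now = ToyToolboxSettings.active_now
    return bool(now)


before = toy_input_with_default()            # default is False
ToyToolboxSettings.active_now = True
after = toy_input_with_default()             # default is now True
explicit = toy_input_with_default(now=False)  # explicit argument wins here
print(before, after, explicit)
```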

## Intermediate conclusion

* Via the *toolbox* module, it is quite simple, in a script, an interactive Python session or a notebook, to create all the objects necessary for the manipulation of resources. 
* Still, the *Resource*, *Provider* and *Container* objects are created in a rather mysterious way (according to the arguments specified by the user). All these objects have very specific types:

In [20]:
print('Resource:  ', type(rhandlers[0].resource))
print('Provider:  ', type(rhandlers[0].provider))
print('Container: ', type(rhandlers[0].container))

Resource:   <class 'common.data.gridfiles.GridPointExport'>
Provider:   <class 'vortex.data.providers.VortexOp'>
Container:  <class 'vortex.data.containers.SingleFile'>


> That's the whole point of the next presentation...

Any questions: *vortex.support@meteo.fr*