### Build local cache file from Argo data sources
*Execute commands to pull data from the Internet into a local HDF file so that we can better interact with the data*

On a development system (where `pip install oxyfloat` has not been run) the oxyfloat directory must be added to the Python search path. Do this before starting the notebook server, replacing `~/dev/oxyfloatgit/` with the directory where you cloned the oxyfloat project:

```bash
export PYTHONPATH=~/dev/oxyfloatgit/
cd ~/dev/oxyfloatgit/notebooks
ipython notebook
```

Alternatively, you can set the path interactively, e.g.:

In [1]:
import sys
sys.path.insert(0, '/home/mccann/dev/oxyfloatgit/')
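
The absolute path above is specific to one machine. A minimal sketch of a more portable variant follows, assuming the project was cloned to `~/dev/oxyfloatgit/` (the `repo_dir` name and the location are illustrative, not part of oxyfloat):

```python
import os
import sys

# Hypothetical clone location; change this to wherever you cloned oxyfloat.
repo_dir = os.path.expanduser('~/dev/oxyfloatgit/')

# Put the directory at the front of the search path, but only once,
# so re-running the cell does not add duplicate entries.
if repo_dir not in sys.path:
    sys.path.insert(0, repo_dir)
```

Because `os.path.expanduser` resolves `~` to the current user's home directory, the same cell can be shared between users without editing.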

Import the OxyFloat class and instantiate an OxyFloat object (`of`) with verbosity set to 2 so that we get INFO messages.

In [2]:
from oxyfloat import OxyFloat
of = OxyFloat(verbosity=2)

You can now explore what methods the `of` object has by typing `of.` in a cell and pressing the tab key. One of the methods is `get_oxy_floats()`; to see what it does, select it and press shift-tab with the cursor inside the parentheses of `of.get_oxy_floats()`. Let's get a list of all the floats that have been out for at least 340 days and print the length of that list.

In [3]:
%%time
floats340 = of.get_oxy_floats(age_gte=340)
print('{} floats at least 340 days old'.format(len(floats340)))

INFO 2015-10-28 16:15:55,436 OxyFloat.py status_to_df():100 Reading data from http://argo.jcommops.org/FTPRoot/Argo/Status/argo_all.txt
INFO:oxyfloat.OxyFloat:Reading data from http://argo.jcommops.org/FTPRoot/Argo/Status/argo_all.txt
INFO 2015-10-28 16:16:16,036 OxyFloat.py put_df():77 Saving DataFrame to name "status" in file /home/mccann/dev/oxyfloatgit/oxyfloat/oxyfloat_cache.hdf
INFO:oxyfloat.OxyFloat:Saving DataFrame to name "status" in file /home/mccann/dev/oxyfloatgit/oxyfloat/oxyfloat_cache.hdf


563 floats at least 340 days old
CPU times: user 350 ms, sys: 175 ms, total: 525 ms
Wall time: 20.9 s


your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->mixed,key->block2_values] [items->['WMO', 'TELECOM', 'TTYPE', 'MY_ID', 'SERIAL_NO', 'DATE0', 'NOTIF_DATE', 'SHIP', 'CRUISE', 'DATE_', 'MODEL', 'FULL_NAME', 'EMAIL', 'PROGRAM', 'COUNTRY']]

  self.put_df(self.status_to_df(), self.STATUS)


If this is the first time you've executed the cell, it will take half a minute or so to read the Argo status information from the Internet. The PerformanceWarning can be ignored: for this small table it doesn't matter much.
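
If the warning clutters the output, it can be silenced with the standard `warnings` module before the cell is run. The sketch below filters on the opening words of the message shown above rather than on a warning class. It assumes that wording stays the same across pandas versions, and the helper name `quiet_pytables_pickle_warning` is made up for this example:

```python
import warnings

def quiet_pytables_pickle_warning():
    """Ignore warnings whose text begins with the PyTables pickling notice.

    The pattern is matched against the start of the message; the leading
    whitespace allowance covers a message that starts with a newline.
    """
    warnings.filterwarnings(
        'ignore',
        message=r'\s*your performance may suffer as PyTables will pickle',
    )

quiet_pytables_pickle_warning()
```

Other warnings are unaffected, so genuine problems still show up in the notebook.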

Once the status information is read it is cached locally, and further calls to `get_oxy_floats()` will execute much faster. To demonstrate, let's count all the oxygen-labeled floats that have been out for at least 2 years.

In [4]:
%%time
floats730 = of.get_oxy_floats(age_gte=730)
print('{} floats at least 730 days old'.format(len(floats730)))

400 floats at least 730 days old
CPU times: user 135 ms, sys: 0 ns, total: 135 ms
Wall time: 162 ms
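
The drop in wall time comes from reading the local cache instead of the network. The general pattern can be sketched as below. This is not oxyfloat's implementation: oxyfloat saves DataFrames to an HDF file, whereas this sketch uses the standard-library `pickle` module as a stand-in, and `cached_load` and `slow_fetch` are hypothetical names.

```python
import os
import pickle
import tempfile

def cached_load(cache_path, fetch):
    """Return data from cache_path if it exists; otherwise fetch and save it."""
    if os.path.exists(cache_path):
        # Fast path: a previous call already stored the result locally.
        with open(cache_path, 'rb') as f:
            return pickle.load(f)
    # Slow path: get the data from its source, then store it for next time.
    data = fetch()
    with open(cache_path, 'wb') as f:
        pickle.dump(data, f)
    return data

calls = []

def slow_fetch():
    """Stand-in for a slow network read; records each time it is invoked."""
    calls.append(1)
    return {'rows': [1, 2, 3]}

cache_file = os.path.join(tempfile.mkdtemp(), 'status_cache.pkl')
first = cached_load(cache_file, slow_fetch)   # calls slow_fetch and writes the cache
second = cached_load(cache_file, slow_fetch)  # served from the cache file
```

Deleting the cache file forces a fresh read from the source on the next call.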


Now let's find the Data Assembly Center URL for each of the floats in our list.

In [5]:
%%time
dac_urls = of.get_dac_urls(floats340)
print(len(dac_urls))
dac_urls[:5]

INFO 2015-10-28 16:16:42,746 OxyFloat.py global_meta_to_df():112 Reading data from ftp://ftp.ifremer.fr/ifremer/argo/ar_index_global_meta.txt
INFO:oxyfloat.OxyFloat:Reading data from ftp://ftp.ifremer.fr/ifremer/argo/ar_index_global_meta.txt
INFO 2015-10-28 16:16:46,604 OxyFloat.py put_df():77 Saving DataFrame to name "global_meta" in file /home/mccann/dev/oxyfloatgit/oxyfloat/oxyfloat_cache.hdf
INFO:oxyfloat.OxyFloat:Saving DataFrame to name "global_meta" in file /home/mccann/dev/oxyfloatgit/oxyfloat/oxyfloat_cache.hdf


562
CPU times: user 849 ms, sys: 10 ms, total: 859 ms
Wall time: 4.69 s


In [6]:
for url in of.get_profile_opendap_urls(dac_urls[0]):
    d = of.get_profile_data(url)
    print(d)
    break

{'lon': [{'coordinate_reference_frame': 'urn:ogc:crs:EPSG::4326', '_FillValue': 99999.0, 'reference': 'WGS84', 'valid_min': -180.0, 'long_name': 'Longitude of the station, best estimate', 'standard_name': 'longitude', 'units': 'degree_east', 'valid_max': 180.0, 'axis': 'X'}, 87.269999999999996], 'o': [{'_FillValue': 99999.0, 'C_format': '%9.3f', 'resolution': 0.0010000000474974513, 'valid_min': 0.0, 'FORTRAN_format': 'F9.3f', 'long_name': 'Dissolved oxygen', 'standard_name': 'moles_of_oxygen_per_unit_mass_in_sea_water', 'units': 'micromole/kg', 'valid_max': 600.0}, [99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 99999.0, 9999