# Caching

This notebook illustrates the use of the CliMetLab cache.

The relevant CliMetLab documentation is located at https://climetlab.readthedocs.io/en/latest/guide/caching.html

Relevant CliMetLab settings are:
- cache-directory 
- maximum-cache-disk-usage 
- maximum-cache-size

In [None]:
import climetlab as cml
URL = "https://www.ncei.noaa.gov/data/international-best-track-archive-for-climate-stewardship-ibtracs/v04r00/access/csv/ibtracs.SP.list.v04r00.csv"

Using ``cml.load_source("url", ...)`` downloads the data and stores it in the CliMetLab cache.

In [None]:
data = cml.load_source("url", URL)
# pd = data.to_pandas()

The next call with the same arguments does not download the data again: it is read from the cache.

In [None]:
data = cml.load_source("url", URL)
# pd = data.to_pandas()

The downloaded data is stored in a cache directory managed by CliMetLab, which keeps track of its contents in a small database. Data is also unzipped within the cache directory if needed.
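
To make the idea of "files in a directory, indexed by a small database" concrete, here is a minimal standard-library sketch. It is not CliMetLab's implementation: the class ``ToyCache``, the table layout, the file naming scheme and the ``fake_fetch`` helper are all assumptions made up for illustration.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


class ToyCache:
    """Illustrative URL cache: files on disk plus a small SQLite index.

    Hypothetical sketch, NOT the CliMetLab implementation.
    """

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.db = sqlite3.connect(str(self.directory / "index.db"))
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS entries (url TEXT PRIMARY KEY, path TEXT)"
        )
        self.downloads = 0  # number of times fetch() was actually called

    def load(self, url, fetch):
        """Return the local path for url, calling fetch(url) only on a miss."""
        row = self.db.execute(
            "SELECT path FROM entries WHERE url = ?", (url,)
        ).fetchone()
        if row is not None and Path(row[0]).exists():
            return Path(row[0])  # cache hit: nothing is downloaded
        # Cache miss: derive a stable file name from the URL and store the data.
        name = hashlib.sha256(url.encode()).hexdigest()[:16]
        path = self.directory / name
        path.write_bytes(fetch(url))
        self.downloads += 1
        self.db.execute(
            "INSERT OR REPLACE INTO entries (url, path) VALUES (?, ?)",
            (url, str(path)),
        )
        self.db.commit()
        return path


def fake_fetch(url):
    # Stand-in for a real HTTP download, so the sketch runs offline.
    return b"data for " + url.encode()


cache = ToyCache(tempfile.mkdtemp())
first = cache.load("https://example.com/file.csv", fake_fetch)
second = cache.load("https://example.com/file.csv", fake_fetch)
print(first == second, cache.downloads)
```

The second ``load`` call finds the URL in the index and returns the existing file, which mirrors why the second ``cml.load_source`` call above does not download anything.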

The cache can be inspected and manipulated:
- From Python, using ``cml.cache``
- From the command line, using ``climetlab cache`` and ``climetlab decache``
- Using the web interface GUI (in progress: summer of code project https://github.com/ecmwf-lab/climetlab-script-web)
- NOT by modifying the cache files directly (same logic as a web browser cache).

In [None]:
cml.cache

In [None]:
!climetlab cache

In [None]:
!climetlab cache --all

In [None]:
!climetlab cache --newer 1d

In [None]:
!climetlab cache --help

In [None]:
# Delete cached data newer than one day (uncomment the next line to run it)
# !climetlab decache --newer 1d

# Configuring CliMetLab cache settings

In [None]:
!climetlab settings cache-directory 
!climetlab settings maximum-cache-disk-usage 
!climetlab settings maximum-cache-size  

# Concurrent cache use

If the cache is full, older data is automatically deleted (with a log message).
When multiple scripts use the same cache, a file may be deleted because the cache is full even while another script is still using it.
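
The following sketch illustrates the problem with a toy size-limited cache. It is only a model under stated assumptions: ``ToySizeLimitedCache``, the file names and the sizes are hypothetical, and the eviction rule (drop the oldest entry first) is a simplification of whatever policy CliMetLab actually applies.

```python
from collections import OrderedDict


class ToySizeLimitedCache:
    """Toy model of a shared cache with a maximum total size (hypothetical)."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.entries = OrderedDict()  # name -> size in bytes, oldest first

    def add(self, name, size):
        """Add an entry, then evict the oldest ones until the total fits.

        Returns the list of evicted names. The newest entry is never evicted.
        """
        self.entries[name] = size
        self.entries.move_to_end(name)
        evicted = []
        while sum(self.entries.values()) > self.max_size and len(self.entries) > 1:
            oldest, _ = self.entries.popitem(last=False)
            evicted.append(oldest)
        return evicted


shared = ToySizeLimitedCache(max_size=2000)
shared.add("script_a.nc", 1500)              # script A caches a file and starts reading it
evicted = shared.add("script_b.grib", 1200)  # script B adds a file: the cache is now too big
print(evicted)
```

In this model the cache has no idea that script A is still reading ``script_a.nc``, so that file is the one that gets evicted when script B adds its data.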

In [21]:
import climetlab as cml
cml.settings.set("maximum-cache-size", "2k")

Exercise:
- Set ``maximum-cache-size`` to 2k (see above).
- Run the script get_nc.py.
- Run the script get_grib.py.
- Run the script get_nc.py again.
- See what happens.
- Reset all settings with ``cml.settings.reset()``.

-> The CliMetLab cache is a cache: it is automatically cleaned up.
-> Multiple users should not share the same cache directory.
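
One way to avoid sharing is to give each user a separate directory. The helper below is a hypothetical sketch, not part of CliMetLab: ``per_user_cache_dir``, the ``/scratch`` base path, the directory naming scheme and the user names are assumptions chosen for illustration. Only the ``cache-directory`` setting and ``cml.settings.set`` come from this notebook.

```python
import getpass
from pathlib import Path


def per_user_cache_dir(base, user=None):
    """Build a cache directory path that is unique to one user (hypothetical helper)."""
    user = user or getpass.getuser()  # fall back to the current login name
    return Path(base) / f"climetlab-cache-{user}"


# Explicit user names are used here so that the example is reproducible.
alice_dir = per_user_cache_dir("/scratch", user="alice")
bob_dir = per_user_cache_dir("/scratch", user="bob")
print(alice_dir, bob_dir)

# The chosen path could then be passed to the setting listed at the top of
# this notebook (left commented out because it changes the configuration):
# import climetlab as cml
# cml.settings.set("cache-directory", str(alice_dir))
```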

In [22]:
cml.settings.reset()