# lksearch Cloud Configuration

`lksearch` has three configuration parameters that are particularly relevant to cloud-based science platforms:

- `CLOUD_ONLY`: Only download cloud-based data. If `False`, all data will be downloaded. If `True`, only data located in a cloud (Amazon S3) bucket will be downloaded.
- `PREFER_CLOUD`: Prefer cloud-based data product retrieval where available.
- `DOWNLOAD_CLOUD`: Download cloud-based data. If `False`, `download()` returns a pointer to the cloud-based data instead of downloading it. This is intended for use on cloud-based science platforms (e.g. TiKE).

`CLOUD_ONLY` governs whether data that is not hosted on the cloud can be downloaded. Many science files have both a cloud-based location (typically on Amazon S3) and a MAST archive location. By default this is `False`, and all products will be downloaded regardless of whether the file is available via cloud hosting or MAST archive hosting. If `CLOUD_ONLY` is `True`, only files available for download on a cloud-based platform will be retrieved. This configuration parameter is passed through to the `~astroquery.mast` parameter of the same name.

`PREFER_CLOUD` governs the default download behaviour when a data product is available from both a cloud-based location and a MAST-hosted archive location. If `True` (default), `lksearch` will preferentially download files from the cloud host rather than the MAST-hosted archive. This configuration parameter is passed through to the `~astroquery.mast` parameter of the same name.
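To make the interaction between `CLOUD_ONLY` and `PREFER_CLOUD` concrete, here is a toy model of the rules described above. It is not the code used by `lksearch` or `astroquery`: the function `choose_source` and its arguments are hypothetical. Falling back to whichever host has the file when only one location exists is an assumption made for illustration.

```python
def choose_source(on_cloud, on_mast, cloud_only=False, prefer_cloud=True):
    """Toy model (hypothetical, not lksearch code) of which host a file comes from.

    Returns "cloud", "mast", or None if the file would not be retrieved.
    """
    if cloud_only:
        # With CLOUD_ONLY=True, files that are not on the cloud are skipped.
        return "cloud" if on_cloud else None
    if on_cloud and on_mast:
        # PREFER_CLOUD only matters when both locations exist.
        return "cloud" if prefer_cloud else "mast"
    # Assumption: otherwise fall back to whichever location is available.
    if on_cloud:
        return "cloud"
    if on_mast:
        return "mast"
    return None


# Defaults described above: CLOUD_ONLY=False, PREFER_CLOUD=True
print(choose_source(on_cloud=True, on_mast=True))
print(choose_source(on_cloud=False, on_mast=True, cloud_only=True))
```

Under this model, the default settings pick the cloud copy whenever one exists, and `cloud_only=True` skips a file that exists only in the MAST archive.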

`DOWNLOAD_CLOUD` governs whether files that are hosted on the cloud are downloaded locally. If this value is `True` (default), cloud-hosted files are downloaded normally. If `False`, files hosted on a cloud-based platform are not downloaded, and a URI pointing to the file on the cloud host is returned instead of a local file path. This URI can then be used to read the file remotely (see `~astropy.io.fits` [working with remote and cloud hosted files](https://docs.astropy.org/en/stable/io/fits/#working-with-remote-and-cloud-hosted-files:~:text=with%20large%20files-,Working%20with%20remote%20and%20cloud%2Dhosted%20files,-Unsigned%20integers) for more information). This ability may be most relevant when using `lksearch` on a cloud-based science platform, where remote reads are very rapid and short-term local storage is comparatively expensive.
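Because the returned value may be either a local path or a cloud URI depending on the configuration, it can be useful to tell the two apart. The following is a minimal sketch using only the Python standard library. The helper `split_s3_uri` and the example URI are hypothetical and are not part of `lksearch`.

```python
from urllib.parse import urlparse


def split_s3_uri(path):
    """Return (bucket, key) if `path` is an s3:// URI, otherwise None."""
    parsed = urlparse(str(path))
    if parsed.scheme != "s3":
        return None
    # netloc holds the bucket name; the remainder of the path is the object key
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical URI, for illustration only
example_uri = "s3://example-bucket/some/prefix/file.fits"
print(split_s3_uri(example_uri))
print(split_s3_uri("/home/user/data/file.fits"))
```

A result of `None` indicates an ordinary local path, which can be opened without any remote-read options.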

Using this `DOWNLOAD_CLOUD` functionality, we can find a cloud-hosted file and read it directly into memory like so:

In [1]:
# First, let's update our configuration so that cloud-hosted files are not downloaded
from lksearch import Conf, TESSSearch

Conf.DOWNLOAD_CLOUD = False

# Now, let's find some data. We used this target earlier in the tutorial.
toi = TESSSearch("TOI 1161")

# What happens when we try to download it in our updated configuration?
cloud_result = toi.timeseries.mission_products[0].download()
cloud_result

Downloading products: 100%|██████████████████████| 1/1 [00:00<00:00, 497.72it/s]


| | Local Path | Status | Message | URL |
|---|---|---|---|---|
| 0 | s3://stpubdata/tess/public/tid/s0014/0000/0001... | COMPLETE | Link to S3 bucket for remote read | |


As we can see above, instead of downloading the file, `download()` has returned an Amazon S3 URI for its cloud-hosted location. If we want to access the file, we can do so using the remote-read capabilities of `~astropy.io.fits`.

(Note: to do this you will need to install `fsspec` and `s3fs`.)

In [2]:
import astropy.io.fits as fits

with fits.open(
    cloud_result["Local Path"].values[0], use_fsspec=True, fsspec_kwargs={"anon": True}
) as hdu:
    for item in hdu:
        print(item.fileinfo())

ImportError: Install s3fs to access S3
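
The `ImportError` above is raised when `s3fs` is not installed in the current environment. As a minimal sketch, the optional dependencies can be checked before attempting a remote read. The helper `missing_modules` is hypothetical and uses only the standard library.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Packages named in the note above as required for remote reads from S3
missing = missing_modules(["fsspec", "s3fs"])
if missing:
    print("Install before attempting a remote read:", ", ".join(missing))
else:
    print("fsspec and s3fs are available")
```

If either package is reported missing, installing it (for example with `pip install fsspec s3fs`) should allow the `fits.open` call above to proceed.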