# Data retrieval example

To run lephare we must first download some input data. This short notebook walks through a simple example that uses pooch to check whether the required files have already been downloaded and to download them if not.
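The check-then-download pattern described above can be sketched with the standard library alone. This is only an illustration of the idea, not pooch's or lephare's implementation: the function `fetch_if_missing`, the `download` callable, and the file name `sed/QSO/example.sed` are hypothetical names introduced for this example, and a fake downloader stands in for the network.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def fetch_if_missing(name, cache_dir, known_hash, download):
    """Return (path, was_downloaded), downloading only when needed.

    `download` is a hypothetical callable (name, target_path) that writes the file.
    """
    target = Path(cache_dir) / name
    if target.exists() and sha256_of(target) == known_hash:
        return target, False  # already cached and intact
    target.parent.mkdir(parents=True, exist_ok=True)
    download(name, target)
    if sha256_of(target) != known_hash:
        raise ValueError(f"Hash mismatch for {name}")
    return target, True  # freshly downloaded


# Fake downloader so that the demonstration needs no network access
def fake_download(name, target):
    Path(target).write_bytes(b"example contents")


expected = hashlib.sha256(b"example contents").hexdigest()
with tempfile.TemporaryDirectory() as cache:
    _, downloaded_first = fetch_if_missing("sed/QSO/example.sed", cache, expected, fake_download)
    _, downloaded_second = fetch_if_missing("sed/QSO/example.sed", cache, expected, fake_download)

print(downloaded_first, downloaded_second)
```

The second call finds the cached copy with a matching hash and skips the download, which is the behaviour the rest of this notebook relies on.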

In [1]:
import os
import lephare

Lephare default working directory already exists at
                    /Users/rshirley/Library/Caches/lephare/work.


In [2]:
# Helper function for use in this notebook


def partial_print(print_list, number_lines):
    print(f"{len(print_list)} lines in list:\n")
    if len(print_list) < 2 * number_lines:
        for line in print_list:
            print(line)
    else:
        for line in print_list[:number_lines]:
            print(line)
        print("...")
        for line in print_list[-number_lines:]:
            print(line)

In [3]:
# Get a list of file names from a list file
# The list file can be a URL or a path to a local list file

list_file = "https://raw.githubusercontent.com/OliviaLynn/LEPHARE-data/91006fcdf6a4b36932f1b5938e8d2084aca4a2e0/sed/QSO/QSO_MOD.list"
file_names = lephare.data_retrieval.read_list_file(list_file, prefix="")

partial_print(file_names, 3)

11 lines in list:

sed/QSO/qso_template_norm.sed
sed/QSO/qso_low_template_norm.sed
sed/QSO/qso_high_template_norm.sed
...
sed/QSO/Redqso_template_norm.sed
sed/QSO/MR_QSO1.sed
sed/QSO/MR_QSO2.sed


In [4]:
# Alternatively, you can select files by subdirectory
# Here, we specify the desired subdirectories and get a list of the files they contain
# Note: registry_file must point to a registry file that has already been downloaded

# target_dirs = ["sed/GAL/", "filt/lsst/"]
# file_names = lephare.data_retrieval.filter_files_by_prefix(registry_file, target_dirs)
# partial_print(file_names, 4)
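As a rough illustration of what filtering by subdirectory involves, the sketch below selects registry entries whose file name starts with one of the target prefixes. It assumes the registry follows the pooch convention of one `file_name hash` pair per line. The helper `filter_by_prefix` and the sample entries are hypothetical; this is not lephare's `filter_files_by_prefix`.

```python
def filter_by_prefix(registry_lines, prefixes):
    """Return file names from registry lines that start with any prefix."""
    selected = []
    for line in registry_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        file_name = line.split()[0]  # first field is the file name
        if file_name.startswith(tuple(prefixes)):
            selected.append(file_name)
    return selected


# Made-up registry entries with placeholder hashes (assumption, not real data)
sample_registry = [
    "sed/GAL/example_gal.sed 0000",
    "sed/QSO/example_qso.sed 1111",
    "filt/lsst/example_band.pb 2222",
    "",
]

matches = filter_by_prefix(sample_registry, ["sed/GAL/", "filt/lsst/"])
print(matches)
```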

In [5]:
# A third way to get data is to grab the zip file from OSF,
# which can be found in: https://osf.io/mvpks/files/osfstorage

# TODO: extend this in a future PR

In [6]:
# Download the registry file
# By default this fetches the registry from the default base URL and saves it
# under the default registry file name; both can be overridden
# with the url and outfile keywords

lephare.data_retrieval.download_registry_from_github()

File downloaded and saved as data_registry.txt


In [7]:
# Download the data files

# The parameters here are already the default values in the function,
# but we explicitly define them for the sake of the example
base_url = lephare.data_retrieval.DEFAULT_BASE_DATA_URL
registry_file = lephare.data_retrieval.DEFAULT_REGISTRY_FILE
data_path = lephare.data_retrieval.DEFAULT_LOCAL_DATA_PATH

retriever = lephare.data_retrieval.make_retriever(
    base_url=base_url, registry_file=registry_file, data_path=data_path
)

lephare.data_retrieval.download_all_files(retriever, file_names, ignore_registry=False)

Downloading file 'sed/QSO/qso_low_template_norm.sed' from 'https://raw.githubusercontent.com/OliviaLynn/LEPHARE-data/main/sed/QSO/qso_low_template_norm.sed' to '/Users/rshirley/Documents/github/lincc/lephare-dev/docs/notebooks/data'.
Downloading file 'sed/QSO/qso_template_norm.sed' from 'https://raw.githubusercontent.com/OliviaLynn/LEPHARE-data/main/sed/QSO/qso_template_norm.sed' to '/Users/rshirley/Documents/github/lincc/lephare-dev/docs/notebooks/data'.
Downloading file 'sed/QSO/qso_high_template_norm.sed' from 'https://raw.githubusercontent.com/OliviaLynn/LEPHARE-data/main/sed/QSO/qso_high_template_norm.sed' to '/Users/rshirley/Documents/github/lincc/lephare-dev/docs/notebooks/data'.
Downloading file 'sed/QSO/QSO_SDSS.sed' from 'https://raw.githubusercontent.com/OliviaLynn/LEPHARE-data/main/sed/QSO/QSO_SDSS.sed' to '/Users/rshirley/Documents/github/lincc/lephare-dev/docs/notebooks/data'.
Downloading file 'sed/QSO/MR_QSO1.sed' from 'https://raw.githubusercontent.com/OliviaLynn/LEPHAR

Created directory: sed/QSO
Checking/downloading 11 files...
11 completed.
All files downloaded successfully and are non-empty.
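
The final message reports that all files are present and non-empty. A check of that kind could look like the sketch below. It uses only the standard library; `find_missing_or_empty` and the file names are hypothetical and are not part of lephare.

```python
import tempfile
from pathlib import Path


def find_missing_or_empty(data_path, file_names):
    """Return the names that are absent or have zero size under data_path."""
    problems = []
    for name in file_names:
        path = Path(data_path) / name
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(name)
    return problems


# Demonstration in a temporary directory with made-up files
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "sed" / "good.sed"
    good.parent.mkdir(parents=True)
    good.write_text("1000 0.5\n")
    (Path(tmp) / "sed" / "empty.sed").touch()

    problems = find_missing_or_empty(tmp, ["sed/good.sed", "sed/empty.sed", "sed/absent.sed"])

print(problems)
```

An empty result would correspond to the success message shown in the output above.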


In [8]:
# Inspect the values passed to make_retriever above
base_url, registry_file, data_path

('https://raw.githubusercontent.com/OliviaLynn/LEPHARE-data/main/',
 'data_registry.txt',
 './data')

In [9]:
# Show the LEPHAREDIR and LEPHAREWORK environment variables
os.environ["LEPHAREDIR"], os.environ["LEPHAREWORK"]

('/Users/rshirley/Library/Caches/lephare/data',
 '/Users/rshirley/Library/Caches/lephare/work')
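
The cell above reads both variables directly from `os.environ`, which raises a `KeyError` if a variable is unset. In a standalone script it may be safer to fall back to a default. The sketch below is one way to do that; `resolve_data_path` is a hypothetical helper, and the `./data` fallback is an assumption that mirrors the default data path printed earlier.

```python
import os


def resolve_data_path(env=os.environ, fallback="./data"):
    """Return LEPHAREDIR if it is set and non-empty, else the fallback."""
    return env.get("LEPHAREDIR") or fallback


# Plain dictionaries stand in for the environment in this demonstration
print(resolve_data_path({"LEPHAREDIR": "/tmp/lephare/data"}))
print(resolve_data_path({}))
```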

In [11]:
# Repeat the download, this time saving into the LEPHAREDIR directory
base_url = lephare.data_retrieval.DEFAULT_BASE_DATA_URL
registry_file = lephare.data_retrieval.DEFAULT_REGISTRY_FILE
data_path = os.environ["LEPHAREDIR"]

retriever = lephare.data_retrieval.make_retriever(
    base_url=base_url, registry_file=registry_file, data_path=data_path
)

lephare.data_retrieval.download_all_files(retriever, file_names, ignore_registry=False)

Downloading file 'sed/QSO/MR_QSO1.sed' from 'https://raw.githubusercontent.com/OliviaLynn/LEPHARE-data/main/sed/QSO/MR_QSO1.sed' to '/Users/rshirley/Library/Caches/lephare/data'.


Checking/downloading 11 files...
11 completed.
All files downloaded successfully and are non-empty.
