Starting point to build an application to generate CAOM2 Observations from FITS files.
In an empty directory:
-
This is the working directory, so it should have enough free disk space for the files to be processed.
-
In the main branch of this repository, find the file Dockerfile. In the scripts directory, find the files docker-entrypoint.sh and config.yml. Copy these files to the working directory:
wget https://raw.github.com/opencadc/possum2caom2/main/Dockerfile
wget https://raw.github.com/opencadc/possum2caom2/main/scripts/docker-entrypoint.sh
wget https://raw.github.com/opencadc/possum2caom2/main/scripts/config.yml
-
Make docker-entrypoint.sh executable.
-
config.yml is configuration information for the ingestion. It will work with the files named and described here. For a complete description of its content, see https://github.com/opencadc/collection2caom2/wiki/config.yml.
-
The ways to tell this tool the work to be done:
-
provide a file containing the list of file ids to process, one file id per line, and the config.yml file containing the entries 'use_local_files' set to False, and 'task_types' set to -ingest -modify. The 'todo' file may be provided in one of two ways:
- named 'todo.txt' in this directory, as specified in config.yml, or
- as the fully-qualified name with the --todo parameter
-
provide the files to be processed in the working directory, and the config.yml file containing the entries 'use_local_files' set to True, and 'task_types' set to -store -ingest -modify.
- The store task does not have to be present, unless the files on disk are newer than the same files at CADC.
-
provide the files to be processed in a Pawsey acacia remote, with the config.yml file containing the entries 'use_local_files' set to False, the 'task_types' set to -store -ingest -modify, and the data_sources set to <acacia remote>/possum/tiles.
- The data_sources entry requires the rclone configuration to be set up. Use pawsey in the acacia remote name to get the correct rclone syntax for the commands.
-
provide the files to be processed in the working directory, and the config.yml file containing the entries 'use_local_files' set to True, and 'task_types' set to -scrape.
- This configuration will not attempt to write files or CAOM2 records to CADC. It is a good way to craft the content of the CAOM2 record without continually updating database content.
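As an illustration of the first option above, the following sketch writes a todo.txt and the matching config.yml entries. The file ids are hypothetical placeholders, and config-fragment.yml is an assumed file name, used here only so that the downloaded config.yml is not overwritten; the entries still need to be merged into config.yml by hand.

```shell
# Write a todo file with one file id per line.
# The ids below are hypothetical; substitute real file ids.
printf '%s\n' "example_file_id_1" "example_file_id_2" > todo.txt

# The config.yml entries for the todo-file option, kept in a separate
# (hypothetically named) file so the real config.yml is left untouched.
cat > config-fragment.yml <<'EOF'
use_local_files: False
task_types:
  - ingest
  - modify
EOF

# Show what was written.
cat todo.txt config-fragment.yml
```

For the other options only the values change, as listed above: for example, the scrape option sets use_local_files to True and task_types to the single entry scrape.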
-
To build the container image, run this:
docker build -f Dockerfile -t possum_run_cli ./
-
In the working directory, place a CADC proxy certificate. The Docker image can be used to create a proxy certificate as follows. You will be prompted for the password for your CADC user:
user@dockerhost:<cwd># docker run --rm -ti -v ${PWD}:/usr/src/app -v <fully-qualified path to staging directory>:/data --user $(id -u):$(id -g) -e HOME=/usr/src/app --name possum_run_cli possum_run_cli cadc-get-cert --days-valid 10 --cert-filename /usr/src/app/cadcproxy.pem -u <your CADC username>
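Because the proxy certificate is only valid for a limited number of days, it can be worth checking it before a long run. The following is a minimal sketch, assuming openssl is installed on the host and that the certificate is the cadcproxy.pem file created above; the cert_status function name is hypothetical and not part of this repository.

```shell
# Report whether a proxy certificate exists and when it expires.
# Usage: cert_status [path], defaulting to cadcproxy.pem in the current directory.
cert_status() {
    cert="${1:-cadcproxy.pem}"
    if [ ! -f "$cert" ]; then
        echo "missing"
        return 0
    fi
    # -enddate prints the notAfter field of the certificate.
    openssl x509 -in "$cert" -noout -enddate
    # -checkend 86400 fails if the certificate expires within 24 hours.
    if openssl x509 -in "$cert" -noout -checkend 86400 >/dev/null; then
        echo "valid for at least one more day"
    else
        echo "expires within a day, or has already expired"
    fi
}

cert_status cadcproxy.pem
```

If the function reports a missing or expiring certificate, re-running the cadc-get-cert command above creates a fresh one.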
-
To set up the rclone configuration for Pawsey s3 acacia storage, run the image, and then run rclone config from within the image. Follow the rclone config steps as described here: https://www.youtube.com/watch?v=mOp7NJpwzac&t=1507s. Note that this will leave the .config/rclone/rclone.conf file on disk, which is why the last step is to set permissions on the rclone configuration file:
user@dockerhost:<cwd># docker run --rm -ti -v <cwd>:/usr/src/app --user $(id -u):$(id -g) -e HOME=/usr/src/app --name possum_run_cli possum_run_cli /bin/bash
cadcops@d51a02720ea6:~$ rclone config
cadcops@d51a02720ea6:~$ <follow the rclone config steps>
cadcops@d51a02720ea6:~$ exit
user@dockerhost:<cwd># chmod 600 .config/rclone/rclone.conf
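Once the configuration has been written, a quick check from inside the container can confirm that rclone sees a remote with pawsey in its name. This is a sketch only: it assumes rclone is on the PATH, and list_pawsey_remotes is a hypothetical helper name, not something provided by the image.

```shell
# List the configured rclone remotes whose names contain "pawsey".
# rclone prints each remote name followed by a colon, one per line.
list_pawsey_remotes() {
    if ! command -v rclone >/dev/null 2>&1; then
        echo "rclone not found on PATH" >&2
        return 1
    fi
    rclone listremotes | grep -i pawsey
}

list_pawsey_remotes || echo "no pawsey remote found"
```

An empty result suggests that the rclone config steps need to be repeated, or that the remote was given a name without pawsey in it.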
-
To run the application where it will retrieve files from the remote Pawsey s3 acacia storage:
user@dockerhost:<cwd># docker run --rm -ti -v <cwd>:/usr/src/app --user $(id -u):$(id -g) -e HOME=/usr/src/app --name possum_run_cli possum_run_cli possum_run_remote
-
To edit and test the application from inside a container:
user@dockerhost:<cwd># git clone https://github.com/opencadc/possum2caom2.git
user@dockerhost:<cwd># docker run --rm -ti -v <cwd>:/usr/src/app -v <fully-qualified path to staging directory>:/data --user $(id -u):$(id -g) -e HOME=/usr/src/app --name possum_run_cli possum_run_cli /bin/bash
root@53bef30d8af3:/usr/src/app# pip install -e ./possum2caom2
root@53bef30d8af3:/usr/src/app# pip install mock pytest
root@53bef30d8af3:/usr/src/app# cd possum2caom2/possum2caom2/tests
root@53bef30d8af3:/usr/src/app# pytest
-
For instructions that may help with using containers, see: https://github.com/opencadc/collection2caom2/wiki/Docker-and-Collections