Write and document setup_om scripts #42

Closed
apeck12 opened this issue Apr 19, 2022 · 7 comments

@apeck12
Collaborator

apeck12 commented Apr 19, 2022

We need a DAG that spans the following tasks:

  1. fetch_mask, which takes as input the detector (e.g. jungfrau4M or epix10k2M), experiment, format (crystfel, cctbx, psana), and savename. If the experiment is latest, then the latest mask from the given detector will be retrieved. After retrieving the correct mask, it will be formatted if needed and saved.
  2. fetch_geom, which does the same as above, except retrieving / converting / saving the most recent geom file.
  3. deploy_om, which deploys OM.
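The selection step of fetch_mask could look roughly like the sketch below. It is a minimal illustration, not the btx implementation: the function name `select_mask` and the directory layout `<mask_root>/<detector>/<experiment>.<ext>` are assumptions made for the example, "latest" is interpreted as "most recently modified file", and format conversion is left out.

```python
from pathlib import Path

# Formats named in the task list above.
SUPPORTED_FORMATS = ("crystfel", "cctbx", "psana")


def select_mask(mask_root, detector, experiment="latest", fmt="psana"):
    """Pick the mask file for `detector`.

    Assumed (hypothetical) layout: <mask_root>/<detector>/<experiment>.<ext>
    Conversion to `fmt` is not shown; the argument is only validated here.
    """
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")

    detector_dir = Path(mask_root) / detector
    candidates = []
    if detector_dir.is_dir():
        candidates = [p for p in detector_dir.iterdir() if p.is_file()]
    if not candidates:
        raise FileNotFoundError(f"no mask found for detector {detector!r}")

    if experiment == "latest":
        # Assumption: "latest" means the most recently modified file.
        return max(candidates, key=lambda p: p.stat().st_mtime)

    matches = [p for p in candidates if p.stem == experiment]
    if not matches:
        raise FileNotFoundError(f"no mask for experiment {experiment!r}")
    return matches[0]
```

fetch_geom could reuse the same selection logic with a different root directory.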
apeck12 self-assigned this on Apr 19, 2022
@fredericpoitevin
Collaborator

The DAG needs "file sensors" to check whether the mask and geometry files already exist (and trigger fetch_mask and/or fetch_geom if they do not).
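The decision such sensors would feed can be sketched in plain Python. This is not Airflow API code; `tasks_to_trigger` is a hypothetical helper that only shows the check: a fetch task is scheduled when its output file is missing.

```python
from pathlib import Path


def tasks_to_trigger(mask_path, geom_path):
    """Return the names of the fetch tasks whose output file is missing."""
    pending = []
    if not Path(mask_path).is_file():
        pending.append("fetch_mask")
    if not Path(geom_path).is_file():
        pending.append("fetch_geom")
    return pending
```

If both files are present the list is empty and the DAG could go straight to deploy_om.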

@fredericpoitevin
Collaborator

From @valmar:
Ok, so basically, what we need to do is pull the following files from the website:
https://www.ondamonitor.com/html/files/lcls/{hutch}/run_om.sh
https://www.ondamonitor.com/html/files/lcls/{hutch}/monitor.yaml
Where hutch is mfx or cxi.
We must then pull the latest geometry and mask from btx.
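A minimal sketch of the download step, using only the Python standard library. The URL template and file names are the ones quoted above; the function names `om_file_urls` and `download_om_files` are made up for this example, and the download needs web access from wherever it runs.

```python
from pathlib import Path
from urllib.request import urlopen

# URL pattern and file names as quoted in the comment above.
BASE_URL = "https://www.ondamonitor.com/html/files/lcls/{hutch}/{filename}"
OM_FILES = ("run_om.sh", "monitor.yaml")
HUTCHES = ("mfx", "cxi")


def om_file_urls(hutch):
    """Map each OM file name to its URL for the given hutch."""
    if hutch not in HUTCHES:
        raise ValueError(f"unknown hutch: {hutch!r}")
    return {name: BASE_URL.format(hutch=hutch, filename=name) for name in OM_FILES}


def download_om_files(hutch, dest_dir):
    """Download run_om.sh and monitor.yaml into dest_dir (requires web access)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for name, url in om_file_urls(hutch).items():
        with urlopen(url, timeout=30) as response:
            (dest / name).write_bytes(response.read())
```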

@fredericpoitevin
Collaborator

From @valmar:
We can work on these files, because they currently contain "default" nodes. For example, run_om.sh for cxi has the following nodes: daq-mfx-mon02,daq-mfx-mon03,daq-mfx-mon04,daq-mfx-mon05

These are usually OK! I can make an empty template and we could fill them in with the correct nodes. Or we could just use these if they are usually OK!

The main problem I see is this: in order to get the available nodes, we must run wherepsana, and it has to be run on the DAQ machine (say cxi-daq), which might be a problem.

One thing we could do is have this information "somewhere" on the network where Airflow can get it, like the queue as we discussed yesterday.
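Filling in the nodes could be as simple as the sketch below. It assumes that the nodes appear in run_om.sh as a single comma-separated list of names of the form daq-<hutch>-monNN, as in the example above; the actual layout of run_om.sh is not shown in this thread, and `set_monitoring_nodes` is a hypothetical helper.

```python
import re

# Assumption: one comma-separated list of daq-<hutch>-monNN names in the script.
NODE_LIST = re.compile(r"daq-[a-z]+-mon\d+(?:,daq-[a-z]+-mon\d+)*")


def set_monitoring_nodes(script_text, nodes):
    """Replace the first node list found in script_text with `nodes`."""
    new_list = ",".join(nodes)
    # A callable replacement avoids any backslash interpretation in new_list.
    updated, count = NODE_LIST.subn(lambda _match: new_list, script_text, count=1)
    if count == 0:
        raise ValueError("no monitoring node list found in script")
    return updated
```

The node names passed in would come from wherepsana output, however that ends up being made available.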

@fredericpoitevin
Collaborator

fredericpoitevin commented Apr 22, 2022

After discussing with @valmar, Murali, Wilko and Thorsten this morning, it became apparent that this use case is not well suited to Airflow DAG runs triggered from the eLog: it requires web access, and it is a localized task for the beamline operator rather than a general user task that could be run across multiple HPC facilities.

Instead, what needs to be implemented is:

  • cron jobs on pslogin to regularly git pull btx, mrxv and omdevteam.github.io to /cds/sw/package/autosfx/
  • a simple script that can be run from ${hutch}-daq by ${hutch}opr to:
    • create /cds/home/opr/${hutch}opr/OM-GUI/${experiment}/om-workspace
    • copy monitor.yaml and run_om.sh from /cds/sw/package/autosfx/omdevteam.github.io/html/files/lcls/${hutch}
    • get the latest mask and geometry from /cds/sw/package/autosfx/mrxv/ for the relevant detector.
    • (bonus) run wherepsana and update run_om.sh with the corresponding monitoring nodes.
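The first two steps of that script can be sketched as follows. This is an illustration only, written in Python rather than as the final script: the paths are the ones listed above, `setup_om_workspace` is a hypothetical name, and the `autosfx` and `opr_home` parameters exist just so the sketch can be pointed at other directories. Fetching the mask and geometry is left out because the layout of mrxv is not described here.

```python
import shutil
from pathlib import Path

# Checkout location maintained by the cron job described above.
AUTOSFX = Path("/cds/sw/package/autosfx")


def setup_om_workspace(hutch, experiment, autosfx=AUTOSFX, opr_home=None):
    """Create the OM workspace and copy monitor.yaml and run_om.sh into it."""
    if opr_home is None:
        opr_home = Path(f"/cds/home/opr/{hutch}opr")

    workspace = Path(opr_home) / "OM-GUI" / experiment / "om-workspace"
    workspace.mkdir(parents=True, exist_ok=True)

    source = Path(autosfx) / "omdevteam.github.io" / "html" / "files" / "lcls" / hutch
    for name in ("monitor.yaml", "run_om.sh"):
        # copy2 also preserves the file mode, so run_om.sh stays executable.
        shutil.copy2(source / name, workspace / name)
    return workspace
```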

fredericpoitevin changed the title from "Generate setup_om dag" to "Write and document setup_om scripts" on Apr 22, 2022
@fredericpoitevin
Collaborator

The cron job is now set up:

[fpoitevi@pslogin01 btx]$ ./scripts/pull_repos_cron_pslogin.sh 
# Date: Fri Apr 22 18:15:32 PDT 2022 | User: fpoitevi | Location: /cds/sw/package/autosfx/btx
Pulling to /cds/sw/package/autosfx/btx
Already up-to-date.
Pulling to /cds/sw/package/autosfx/mrxv
Already up-to-date.
Pulling to /cds/sw/package/autosfx/omdevteam.github.io
Already up-to-date.
[fpoitevi@pslogin01 btx]$ crontab -l
@daily /cds/sw/package/autosfx/btx/scripts/pull_repos_cron_pslogin.sh >> /cds/sw/package/autosfx/btx/cronjob.log 2>&1
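For readers without access to the script, the pull loop it performs can be approximated as below. This is a Python sketch under assumptions, not the contents of pull_repos_cron_pslogin.sh: the repository names and root directory come from the output above, while `pull_commands` and `pull_all` are hypothetical names.

```python
import subprocess
from pathlib import Path

# Repositories and root directory as they appear in the output above.
REPOS = ("btx", "mrxv", "omdevteam.github.io")
ROOT = "/cds/sw/package/autosfx"


def pull_commands(root=ROOT, repos=REPOS):
    """Build one `git -C <checkout> pull` command per repository."""
    return [["git", "-C", str(Path(root) / repo), "pull"] for repo in repos]


def pull_all(root=ROOT, repos=REPOS):
    """Run the pulls in order; a failing pull does not stop the others."""
    for command in pull_commands(root, repos):
        print(f"Pulling to {command[2]}")
        subprocess.run(command, check=False)
```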

@fredericpoitevin
Collaborator

An initial attempt at automating the second task above was drafted in #55.

However, this will probably not be used in a fully automated fashion in production, because of unpredictable issues like the one we just encountered: the monitoring nodes were defined oddly in the DAQ config file, which caused MPI connection problems that could have impacted OM's performance.

@fredericpoitevin
Collaborator

The cron job is set up on S3DF as well:

[fpoitevi@sdflogin003 scripts]$ crontab -l
@daily /sdf/group/lcls/ds/tools/btx/scripts/pull_repos_cron_pslogin.sh >> /sdf/group/lcls/ds/tools/btx/cronjob.log 2>&1
