This repository has been archived by the owner on May 30, 2024. It is now read-only.

Defining the "common workflow" for our lesson #5

Open
1 of 2 tasks
ocaisa opened this issue Jul 20, 2022 · 3 comments

Comments

@ocaisa
Member

ocaisa commented Jul 20, 2022

The current example is a set of books that are downloaded. How do we define our raw data? We effectively don't have any; what we are doing is taking measurements with amdahl, which will become our raw data.

In 01-introduction.md we start off by creating a bash script describing the manual workflow. We will somehow need to replicate this. This will require:

  • Generating a set of data (which will require parsing of amdahl output, or perhaps adding a --terse option to amdahl; see Add --terse option to amdahl to make it easier to parse the output #6). Redirecting the amdahl output to a file could work, or indeed we could use the output files from SLURM itself.
  • Plotting the results (both graphically and perhaps in the terminal)
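For the first bullet, here is a minimal sketch of what the parsing might look like. It assumes a hypothetical output line of the form `Total execution time (s): <float>`; the real amdahl output (and any future --terse format) may well differ, so the regex would need adjusting.

```python
import re

# Hypothetical example of one line of amdahl's human-readable output;
# the real format may differ.
sample_output = "Total execution time (s): 1.234"

def parse_runtime(text):
    """Pull the elapsed time out of captured amdahl output.

    Assumes a line of the form 'Total execution time (s): <float>'.
    Adjust the pattern to match the actual amdahl output, or replace
    this function entirely if a --terse mode is added (see #6).
    """
    match = re.search(r"Total execution time \(s\):\s*([0-9.]+)", text)
    if match is None:
        raise ValueError("no runtime found in output")
    return float(match.group(1))

print(parse_runtime(sample_output))
```

If the --terse option from #6 emits machine-readable output, this regex scraping could be dropped in favour of parsing that output directly.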
@ocaisa
Member Author

ocaisa commented Jul 21, 2022

The "common workflow" identified in 01-introduction.md is

  1. Read a data file.
  2. Perform an analysis on this data file.
  3. Write the analysis results to a new file.
  4. Plot a graph of the analysis results.
  5. Save the graph as an image, so we can put it in a paper.
  6. Make a summary table of the analyses, which requires aggregation of all previous results.

Can we cover the same points? I think the last point is the hardest (and unnecessary for us). The order could be changed, though, to:

  1. Create data files (run a SLURM job using a job template we provide; store the output under a well-defined filename).
  2. Perform an analysis on the data files (extract our timings and convert them into speedup).
  3. Write the analysis results to a new file.
  4. Plot a graph of the analysis results (could consider doing this locally or remotely).
  5. Save the graph as an image.
  6. Pull the results (in this case an image) from the cluster and review it.
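Steps 2 and 3 above could be sketched like this, using made-up timings in place of real measurements extracted from the SLURM output files:

```python
import csv

# Hypothetical timings (cores -> seconds); real values would be parsed
# from the amdahl/SLURM output files, not hard-coded.
timings = {1: 30.0, 2: 16.5, 4: 9.75, 8: 6.4}

# Speedup on n cores is the serial runtime divided by the n-core runtime.
serial_time = timings[1]
speedup = {n: serial_time / t for n, t in timings.items()}

# Write the analysis results to a new file (step 3).
with open("speedup.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["cores", "runtime_s", "speedup"])
    for n in sorted(timings):
        writer.writerow([n, timings[n], round(speedup[n], 3)])
```

The resulting CSV could then be fed to a plotting step (matplotlib on the cluster, or locally after pulling the file down) for steps 4-6.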

@reid-a
Member

reid-a commented Aug 1, 2022

This could build off what was done in the HPC Intro lesson -- call back to that lesson, show a job script, and look at the output of the job script. This could live in the first episode. It does make HPC Intro a pretty hard prerequisite for this lesson.

@ocaisa ocaisa changed the title from "What is our raw data?" to "Defining the "common workflow" for our lesson" on Aug 4, 2022
@bkmgit
Contributor

bkmgit commented Aug 10, 2022

The current format, with the job submission script at the end, seems OK. However, one may wish to enable attendees to practice using SLURM, in which case one could introduce the job submission script at the beginning. The lesson seems independent of HPC Intro, but it does allow practice using a scheduler.
