First version of the report generator. Add further tools.#320

Merged
perezjosibm merged 6 commits into ceph:master from perezjosibm:wip.report_gen
Jan 28, 2025

Conversation

@perezjosibm
Contributor

@perezjosibm perezjosibm commented Dec 10, 2024

This PR introduces some new tools:

  • report_gen.py - script that traverses the dir tree to select .JSON entries to produce a report in .tex (uses some existing templates with contents to add, like the header, table of contents, etc.)
  • diskstat_diff.py - script to filter disk utilisation, producing a .JSON as result, ready to plot, etc.
  • parse-top.py - script to filter .JSON files from top (translated by jc), calculating averages for CPU and MEM utilisation; used to combine this data with the .JSON from FIO.
  • gnuplot_plate.py - thin wrapper around gnuplot to generate Response latency curves (hockey stick performance charts).
  • gen_json_xtractor.py, test_run_spec.py - refactoring of fio-parse-jsons.py into modules that can be reused by other tools.
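
To illustrate the averaging step described for the top parser above, here is a minimal sketch in Python. It is not the code from this PR: the function name `average_top_samples` is hypothetical, and the field names (`processes`, `pid`, `command`, `percent_cpu`, `percent_mem`) are assumed from the output of jc's top parser.

```python
import json
from collections import defaultdict


def average_top_samples(samples, pids=None):
    """Average percent_cpu and percent_mem per command across top samples.

    `samples` is a list of dicts, one per top iteration. The keys used here
    are assumptions based on jc's top parser, not taken from the PR.
    If `pids` is given, only processes with those PIDs are counted.
    """
    totals = defaultdict(lambda: {"cpu": 0.0, "mem": 0.0, "count": 0})
    for sample in samples:
        for proc in sample.get("processes", []):
            if pids is not None and proc.get("pid") not in pids:
                continue
            entry = totals[proc.get("command", "unknown")]
            entry["cpu"] += float(proc.get("percent_cpu") or 0.0)
            entry["mem"] += float(proc.get("percent_mem") or 0.0)
            entry["count"] += 1
    return {
        cmd: {"cpu_avg": t["cpu"] / t["count"], "mem_avg": t["mem"] / t["count"]}
        for cmd, t in totals.items()
    }


if __name__ == "__main__":
    # Two made-up samples standing in for the contents of a _top.json file.
    demo = [
        {"processes": [{"pid": 10, "command": "crimson-osd",
                        "percent_cpu": 50.0, "percent_mem": 2.0}]},
        {"processes": [{"pid": 10, "command": "crimson-osd",
                        "percent_cpu": 70.0, "percent_mem": 4.0}]},
    ]
    print(json.dumps(average_top_samples(demo, pids={10}), indent=2))
```

Grouping by command rather than by PID is a choice made for this sketch only; the real script may aggregate differently (for example per core, given its --cpu option).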

Usage:

All the scripts have a --help option that describes their usage.

  • parse-top.py:
    cat ${TEST_RESULT}_top.out | jc --top --pretty > ${TEST_RESULT}_top.json
    python3 /root/bin/parse-top.py --config=${TEST_RESULT}_top.json --cpu="${OSD_CORES}" --avg=${OSD_CPU_AVG} \
          --pids=${TOP_PID_JSON} 2>&1 > /dev/null

This will produce a ${TEST_RESULT}_cpu_avg.json file as output (in the specified dir).

  • diskstat_diff.py: the following snippet takes two samples, one before and one after the test execution. The script calculates the difference and writes the result to the file given as its argument:
# Take diskstats measurements before FIO instances
      jc --pretty /proc/diskstats > ${DISK_STAT}

# Measure the diskstats after the completion of FIO
      jc --pretty /proc/diskstats | python3 /root/bin/diskstat_diff.py -a ${DISK_STAT}
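
The before/after difference can be sketched in Python as follows. This is an illustration under assumptions, not the implementation of diskstat_diff.py: the function name `diskstat_diff` and its parameters are hypothetical, and the snapshots are assumed to be lists of per-device dicts keyed by a `device` field, as jc emits for /proc/diskstats.

```python
def diskstat_diff(before, after, key="device", skip=("maj", "min")):
    """Return per-device deltas between two diskstats snapshots.

    `before` and `after` are lists of dicts (one per block device).
    The identifying key "device" and the skipped fields "maj"/"min"
    are assumptions about the jc output, not taken from the PR.
    """
    before_by_dev = {entry[key]: entry for entry in before}
    deltas = {}
    for entry in after:
        start = before_by_dev.get(entry[key])
        if start is None:
            continue  # device appeared during the test; no baseline to diff
        deltas[entry[key]] = {
            name: value - start[name]
            for name, value in entry.items()
            if name != key
            and name not in skip
            and isinstance(value, (int, float))
            and isinstance(start.get(name), (int, float))
        }
    return deltas


if __name__ == "__main__":
    # Made-up counters standing in for two jc snapshots.
    before = [{"maj": 8, "min": 0, "device": "sda", "reads_completed": 100}]
    after = [{"maj": 8, "min": 0, "device": "sda", "reads_completed": 150}]
    print(diskstat_diff(before, after))
```

Note that some diskstats fields are gauges rather than cumulative counters (the number of I/Os currently in progress, for instance), so a plain subtraction is only meaningful for the cumulative ones.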

Signed-off-by: Jose J Palacios-Perez <perezjos@uk.ibm.com>
@perezjosibm
Contributor Author

Hi @sseshasa, I'd appreciate it if you could review my PR, please. Many thanks in advance.

@perezjosibm
Contributor Author

Hi @sseshasa when convenient, could you please review my PR? Many thanks in advance.

@sseshasa
Contributor

> Hi @sseshasa when convenient, could you please review my PR? Many thanks in advance.

@perezjosibm Apologies, I have been busy with other things. I will try to look into it this week.

@perezjosibm
Contributor Author

Hi @sseshasa, it would be great if you could provide some feedback when you find it convenient. I understand you are busy, so I appreciate your time. Thanks

Contributor

@sseshasa sseshasa left a comment

LGTM! Apologies again for taking so long. I just left a couple of questions. This is shaping up to be quite a comprehensive set of tools! I haven't gone through every line; I tried to understand the purpose behind the set of tools and the general mechanism you employ to chart the metrics.

Comment on lines +20 to +22
OSD_LIST = [1,3,8]
REACTOR_LIST = [1,2,4]
ALIEN_LIST = [7,14,21]
Contributor

My understanding of Crimson is limited, hence this question:
Are these lists not going to change for the foreseeable future? Can these instead be passed as an input to the script?

Contributor Author

Hi @sseshasa, thanks for your feedback! Yes, the plan is for these lists to be read from the input test plan .yaml. That example is temporary, there only for the good practice of initializing Python data structures 👍
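
As an illustration of the reviewer's suggestion, the lists could also be accepted on the command line. This is a minimal sketch, not part of the PR: the flag names `--osd`, `--reactor` and `--alien` and the helpers `int_list` and `build_parser` are hypothetical, and the defaults simply repeat the values from the reviewed lines.

```python
import argparse


def int_list(text):
    """Parse a comma-separated string such as "1,3,8" into a list of ints."""
    return [int(item) for item in text.split(",") if item.strip()]


def build_parser():
    # Hypothetical flags; the defaults mirror the hard-coded lists under review.
    parser = argparse.ArgumentParser(description="Sketch: lists as script input")
    parser.add_argument("--osd", type=int_list, default=[1, 3, 8])
    parser.add_argument("--reactor", type=int_list, default=[1, 2, 4])
    parser.add_argument("--alien", type=int_list, default=[7, 14, 21])
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.osd, args.reactor, args.alien)
```

Reading the same values from the test plan .yaml, as planned, would replace the argument parsing while keeping the rest of the script unchanged.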

- (input/output) and a _cpu_avg.json file name, average over a range (typically for Response latency curves).

Example of usage:
cat ${TEST_RESULT}_top.out | jc --top --pretty > ${TEST_RESULT}_top.json
Contributor

Can the script itself be made to generate the _top.json file? I mean, could the script run the top command, save the output in JSON format, and then use that to extract the metrics?

Contributor Author

I see your point. I decided to keep it separate so that it does not depend on the underlying system, since one often needs to test on cold data taken from the target box whilst running another set of tests 👍
Thanks
