-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
removing pv req for indexing #154
Conversation
2d84b9e
to
62ee4e7
Compare
/rerun kni |
/rerun minishift |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed the documented produced. LGTM
/rerun minikube |
/rerun minishift |
/rerun minikube |
/rerun minishift |
@@ -10,7 +10,6 @@ time_based=1 | |||
runtime={{fiod.runtime}} | |||
clocksource=clock_gettime | |||
ramp_time={{fiod.ramp_time}} | |||
stonewall |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we removing stonewall from the jobfiles?
Side note: This only needs to be defined once in the [global]
section; it is redundant to repeat it in the individual job sections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WARNING: BIG REVIEW AHEAD
I have a number of concerns about the approach being taken between ripsaw and snafu. I'd like to open these thoughts to feedback, particularly from familiar users of fio. I'm looking at you @bengland2 and @shekharberry.
Splitting jobfiles into single jobs instead of using stonewall
I'm actually fine with this approach in concept, as it is how I have always run fio in my own wrapper scripts in the past. However, I don't think that it is the correct approach to take a combined jobfile, provided as a ripsaw role template, and then split it via python loop logic in the snafu wrapper script. Doing this confuses the context of the job from the view of an experienced fio user who may look into or even modify our jobfile templates in ripsaw. As an fio user, I would look at the combined jobfile and expect it to run as combined, so I would immediately question why stonewall
is not being used. I wouldn't be able to resolve that question without digging into the wrapper code in snafu.
The right place to split the jobfiles
IMO, what we should be doing instead is splitting the jobfiles in the ripsaw templates. This to me is much simpler than maintaining the _parse_jobs
and _build_fio_job
python functions as currently written in the snafu wrapper. This also provides the added advantage of better user control of the jobs to be run by allowing us to adjust the CR such that the job
is a list of built-in jobfiles to loop through, and this also better paves the way for the user to provide custom jobfiles.
User story:
As a user, I want to run only the sequential write and random write fio jobs.
With the current implementation, that ^^ is not possible.
What snafu should and shouldn't do
I believe that the snafu wrapper scripts should:
- Normalize the data output for the CDM
- Provide the logic for uploading the data to the remote store (Elasticsearch or other)
The logic to split apart the combined fio jobfile I think is an overload of snafu. I think the context of how to run the job should remain within ripsaw. Perhaps that could still be passed to snafu for better logic handling via python when needed (spitballing... maybe a python ansible module would make even more sense?), but a snafu wrapper shouldn't fundamentally change the way a job operates, as that may be counter to the user expectations set within ripsaw.
fio jobfiles within ripsaw
I've started to address current problems with our jobfile templates in PR #147, but those concerns become more relevant with these proposed changes. We currently duplicate a lot of values among the job sections, and this is redundant. I think we need one simple global template that covers all of our base expectations. Then the actual jobs might not need to be template files at all, since for the most part they may only have one or two lines.
Timestamps
I'm not sure yet about this part. There seems to be some extra effort in the snafu wrapper script to derive timestamps outside of what fio is providing as part of its own results. I don't fully grok the justification for this, and it's possible the results could be misleading for a distributed run with ms timestamp resolution where individual pods may not be perfectly in-sync. The relative timestamps from fio may actually be more valuable. Open to discussion.
Example
In the CR you would have something like:
jobs:
- write
- read
- randwrite
Where the values of the list items match actual jobs per fio(1)
.
Then we have a single global jobfile template like:
[global]
directory={{ fio_path }}
ioengine=libaio
bs={{ fiod.bs }}
iodepth={{ fiod.iodepth }}
direct=1
create_on_open=1
time_based=1
runtime={{ fiod.runtime }}
clocksource=clock_gettime
ramp_time={{ fiod.ramp_time }}
size={{ fiod.filesize }}
write_bw_log=fio
write_iops_log=fio
write_lat_log=fio
write_hist_log=fio
per_job_logs=1
log_avg_msec={{ fiod.log_sample_rate }}
log_hist_msec={{ fiod.log_sample_rate }}
filename_format=f.\$jobnum.\$filenum
end_fsync=1
Then we only have to pass the name of the job from the loop, and the snafu script can create the jobfile on the fly since the only thing missing for a particular single job is:
[write]
rw=write
Alternately, if there are per-job things that shouldn't go in the global section, we could go ahead and make small templates in ripsaw for each of them.
Other
I have some sidebar concern about the original justification for this change. (Note that this concern doesn't affect what we do here as long as we take one approach or another to serialize the jobs within our own workflow instead of using stonewall.) The observation was explained to me as the stonewall
parameter being honored per-client in a distributed run rather than per-job. This apparently led to discrepancies in timestamps and the idea that subsequent jobs on some clients were overlapping prior jobs on other clients. I don't think this is how stonewall
is supposed to work, but I've struggled to find a good explanation to say for sure.
have replies to certain concerns inline.
The splitting jobs through python script, can be toggled by adding another parameter
This can be easily addressed by allowing custom job files, which is a bit out of scope of this PR.
I think ultimately everything boils down to this
The output of snafu is :
If we feel that snafu should only do following:
Then we'll need the input to snafu to be:
The output will still be:
So if we move forward with the above, then snafu shouldn't run the benchmark as well and ripsaw would still run benchmark and just pass the output. I'm okay with either way, but we need to define the interfaces well. So ultimately we need to decide the input/output of the component ( snafu ) and its interfaces with other components ( ripsaw ). |
@dustinblack , I won't comment on "snafu" since I'm not familiar with it.
I think @aakarshgopi proposal to run a separate fio --client per fio job
is much more in line with how I and others (e.g. CBT, pbench-fio) have used
it in the past. It also means there is less uncertainty on how to
timestamp the data from the fio logs, since we know that the job starts
when the fio --client starts (+ a couple of sec for transmitting job
files). There will always be some uncertainty in a distributed workload
- this doesn't matter much as long as the fio threads are almost entirely
running in the same time interval (say 95%). As long as fio --client
starts them at approximately the same time +- a few seconds, they should
stop at approximately the same time also because they all have the same
"runtime" parameter, assuming nodes are time-synced.
It's important for timestamps of fio results to be reasonably accurate so
that we can correlate with data collected by prometheus/pbench/whatever on
h/w resource utilization, etc. as Alex Calhoun was saying. I'm not
hung up on the output format from ripsaw, but some consistency in it would
make it easier to convert benchmark results to elastic search documents and
correlate it to time-series data from other sources (prometheus, pbench,
etc).
…On Wed, Jul 17, 2019 at 4:05 PM Aakarsh Gopi ***@***.***> wrote:
have replies to certain concerns inline.
*Splitting jobfiles into single jobs instead of using stonewall*
I'm actually fine with this approach in concept, as it is how I have
always run fio in my own wrapper scripts in the past. However, I don't
think that it is the correct approach to take a combined jobfile, provided
as a ripsaw role template, and then split it via python loop logic in the
snafu wrapper script. Doing this confuses the context of the job from the
view of an experienced fio user who may look into or even modify our
jobfile templates in ripsaw. As an fio user, I would look at the combined
jobfile and expect it to run as combined, so I would immediately question
why stonewall is not being used. I wouldn't be able to resolve that
question without digging into the wrapper code in snafu.
The splitting jobs through python script, can be toggled by adding another
parameter no_split to the cr. I can add this functionality to avoid
confusion. That way if the user wants the jobs to not be split, they can
still achieve that.
*The right place to split the jobfiles*
IMO, what we should be doing instead is splitting the jobfiles in the
ripsaw templates. This to me is much simpler than maintaining the
_parse_jobs and _build_fio_job python functions as currently written in
the snafu wrapper
<https://github.com/cloud-bulldozer/snafu/blob/0da1bf70b4d13dde31f7e786013a3e3b983ea204/fio-wrapper/fio-wrapper.py>.
This also provides the added advantage of better user control of the jobs
to be run by allowing us to adjust the CR such that the job is a list of
built-in jobfiles to loop through, and this also better paves the way for
the user to provide custom jobfiles.
simpler is debatable, because imho it'll make things harder to maintain.
Ansible isn't a programming language, so the framework would've less
control on running the jobs. My concern is we'll probably hit the same
issues we've seen with indexing work and ultimately decide to move even
that logic into snafu.
*User story: As a user, I want to run only the sequential write and random
write fio jobs.*
With the current implementation, that ^^ is not possible.
This can be easily addressed by allowing custom job files, which is a bit
out of scope of this PR.
*What snafu should and shouldn't do*
I believe that the snafu wrapper scripts should:
1. Normalize the data output for the CDM
2. Provide the logic for uploading the data to the remote store
(Elasticsearch or other)
The logic to split apart the combined fio jobfile I think is an overload
of snafu. I think the context of *how* to run the job should remain
within ripsaw. Perhaps that could still be passed to snafu for better logic
handling via python when needed (spitballing... maybe a python ansible
module would make even more sense?), but a snafu wrapper shouldn't
fundamentally change the way a job operates, as that may be counter to the
user expectations set within ripsaw.
I think ultimately everything boils down to this What snafu should and
shouldn't do? . I'm of the opinion that snafu is a component, with input
of following
1. benchmark type, in this case fio
2. benchmark metadata, in this case job file, hosts, sample and other
parameters such as es details.
3. certain optional metadata related to the environment i.e. in case
of kubernetes run ( not implemented yet ) - nodes on which the pods
landed.
The output of snafu is :
1. structured benchmark data.
2. optionally index to elasticsearch/other_data_store.
If we feel that snafu should only do following:
1. Normalize the data output for the CDM
2. Provide the logic for uploading the data to the remote store
(Elasticsearch or other)
Then we'll need the input to snafu to be:
1. the stdout from the benchmark run.
The output will still be:
1. structured benchmark data.
2. optionally index to elasticsearch/other_data_store.
So if we move forward with the above, then snafu shouldn't run the
benchmark as well and ripsaw would still run benchmark and just pass the
output.
I'm okay with either way, but we need to define the interfaces well. So
ultimately we need to decide the input/output of the component ( snafu )
and its interfaces with other components ( ripsaw ).
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#154>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AANDG42QVTITZSE4JX3CASDP753QFANCNFSM4ID32HZA>
.
|
I am very against this. You are taking a function in the script that I already think is overloaded and adding more logic to it. My argument is that the better approach is to take the logic away from snafu and put it in ripsaw where it is closer to the user context for the benchmark itself.
We are talking very basic jinja2 loop logic, here. Nothing earth-shattering. You'll always have the point that Ansible is a bad place to be doing logic in the first place. I get that. But I don't think the snafu wrapper is the right place to address this. We're not re-writing the operator in Golang any time soon. However, if we are overloading Ansible too much, we should think about new Ansible modules, I think. (IMO that's overkill by a lot for this particular concern.)
I think it will stay out of scope for longer with your current proposed approach. By splitting the jobfiles in ripsaw, we have a much stronger base for getting custom jobfiles into place.
This comes back to conversations we've had around your interest in making snafu more consumable outside of ripsaw. I just don't think that's anywhere on our roadmap or within our needs. Stick to the two things I proposed it should do, and leave the snafu scripts as external utilities for ripsaw. I don't think there is any reason to change fundamentally the way the sanfu scripts work, as the model has proven itself already with uperf and pgbench. It doesn't need to change to a post-processing tool, as you suggest. Using the scripts as wrappers has other advantages. |
@dustinblack I see your concern about modifying how the workload runs by breaking it apart into multiple runs, so we can simply suggest that user should pass a job file with a single job in it, and if they pass a single job then we can index the logs / result jsons as is. However if the user passes a job file with multiple fio jobs, we just skip indexing logs as we can't accurately measure the start time, as well as the visualizations wouldn't make much sense as there'd be an overlap of the jobs. Do you think this is acceptable @bengland2 , given that most of the times you'd run fio with a single job in the jobfile? |
And just to be clear, i'm okay to modify the code according to what @bengland2 and @dustinblack think is the best. |
Proposed goal: invoke ripsaw, once, and get it to run a series of fio job files and record the fio test result files for that, in such a way that this data can be indexed into elastic and visualized. That implies we should be able to convert relative timestamps in fio logs to absolute timestamps. I think Aakarsh's solution of running separate fio --client run per job would have done that, since the stat() st_mtime would then be (approximately) the end time for the job, and since we know the job duration (runtime=), we can compute the start time and rebase the timestamps in the log on that. Not as accurate as having fio tell us the start time, but good enough. Why does this conflict with Dustin's idea? Dustin is saying "The logic to split apart the combined fio jobfile I think is an overload of snafu". But I'm not asking for that. If ripsaw runs a series of fio job files as separate fio runs, then the only requirement on snafu is to process a series of runs, rather than just one - it doesn't require any additional knowledge of fio to process a series of runs. I think there isn't really a conflict here. |
This is a question that can be deferred since it only affects custom jobfiles and the initial implementation will only support our built-in jobfiles. This will allow us more time to better address the questions of timestamps, stonewall behavior, and overlapping results. We could even phase in the support for custom jobfiles, first supporting only files with single jobs and later adding support for combined files once we better understand the behavior. |
+1, though I would augment this to "run a series of jobfiles of the user's choosing". I should be able to pass a specific list of jobs I want to run.
My proposal doesn't change this part of the operation; individual fio runs per job will still be orchestrated by the snafu wrapper script, implying that $whatever needs to be done to normalize the timestamp data still applies. That, to me, is within the purview of snafu.
+1 |
Indexing fio documents/logs requires for storage to be created with a vol plugin that supports `RWX`. This is not possible in all environments, so moving to an indexing in a similar fashion to uperf. Commit also does following 1. Cleanup fio-d 2. Remove any files that were related to `RWX` pv Co-authored-by: Dustin Black <dblack@redhat.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to spend some time with the configmap to clean things up and get them more "right" for our needs, but we can address that separately; not worth blocking on.
pass filesize scale in CR for fio
pass filesize scale in CR for fio
pass filesize scale in CR for fio
pass filesize scale in CR for fio
Indexing fio documents/logs requires for storage to be created with a vol plugin
that supports
RWX
. This is not possible in all environments, so moving to an indexingin a similar fashion to uperf.
Commit also does following
RWX
pv