mgr/progress & mgr/pg_autoscaler: Added Pg Autoscaler Event #29035
Conversation
refs=[("pool", int(pool_id))])

else:
    new_progress = p['pg_num_target']/p['pg_num_final']
I think this path is only taken once, when we set pg_num_target. Isn't there an update method of some sort to refresh progress (that would need to compare pg_num with pg_num_target and the initial pg_num)?
Hi Sage,
From my understanding (which could be wrong), p['pg_num_final'] is the current pg_num of the pool and p['pg_num_target'] is the target PG count the pool is trying to reach. As for how the progress refreshes: I think _maybe_adjust gets called every minute, as specified by 'sleep_interval', and each pass calls get_osdmap() to get a new map and uses it to call get_pools_by_name for fresh information about the pools. Therefore, I feel it is already updating at that point.
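The refresh loop described here can be sketched roughly like this (a minimal sketch; the class and attribute names are assumptions for illustration, not the actual module code):

```python
import threading

class AutoscalerLoopSketch:
    """Sketch: _maybe_adjust() re-runs every sleep_interval seconds."""

    def __init__(self, sleep_interval=60):
        self._shutdown = threading.Event()
        self.sleep_interval = sleep_interval
        self.adjust_calls = 0

    def _maybe_adjust(self):
        # In the real module this would re-read the osdmap via
        # get_osdmap() and re-derive per-pool state each pass.
        self.adjust_calls += 1

    def serve(self):
        # Loop until shutdown; wait() doubles as an interruptible sleep.
        while not self._shutdown.is_set():
            self._maybe_adjust()
            self._shutdown.wait(self.sleep_interval)
```

Since the whole loop re-derives pool state from a fresh osdmap each pass, any externally driven change to pg_num is picked up on the next iteration.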
pg_num_target is what ceph currently wants pg_num to be. pg_num is the current, real pg count for the pool. The mgr will move pg_num toward pg_num_target for you... this is what should update the progress. pg_num_final is what the pg_autoscaler module decided it wants pg_num_target to be.
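Restating those three fields with illustrative numbers (the values are made up; only the field semantics follow the explanation above, here for a pool being grown):

```python
# Illustrative pool state for a growing pool (values assumed).
pool = {
    'pg_num': 32,          # the current, real PG count for the pool
    'pg_num_target': 64,   # what ceph currently wants pg_num to be
    'pg_num_final': 128,   # what pg_autoscaler decided pg_num_target should be
}

# The mgr moves pg_num toward pg_num_target; the autoscaler moves
# pg_num_target toward pg_num_final. So while growing:
assert pool['pg_num'] <= pool['pg_num_target'] <= pool['pg_num_final']
```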
If I'm reading the code correctly, would_adjust will be true when we first make the adjustment to pg_num_target. But early in _maybe_adjust() you have

if not p['would_adjust']:
    continue

So you want to create the progress event there, but that's not the right place to update progress. Instead, I think you want a separate helper like _update_progress_events() or something like that. Loop over the progress event objects you have in the active list, and compare the latest pg_num to pg_num_target and the starting pg_num value. I think to do this you need to create a class like PgAdjustmentProgress that keeps the initial and target pg_num values...
self._event[pool_id] = str(uuid.uuid4())
if p['pg_num_target'] < p['pg_num_final']:
    self.remote('progress', 'update', self._event[pool_id],
                ev_msg=" PgAutoscaler growing PGs in pool: {0}".format(
PG autoscaler increasing pool %s PGs from %d to %d
del self._event[pool_id]
elif new_progress > 1.0:
    new_progress = p['pg_num_final']/p['pg_num_target']
    self.remote('progress', 'update', self._event[pool_id],
ev_id=self._event[pool_id]
just so it is consistent
@@ -73,6 +72,7 @@ class PgAutoscaler(MgrModule):
     def __init__(self, *args, **kwargs):
         super(PgAutoscaler, self).__init__(*args, **kwargs)
         self._shutdown = threading.Event()
+        self._event = {}
Events within the progress module are stored in the progress module's persistent key-value store; self._event = {} is not stored permanently.
Is there a possibility this can lead to progress events without references from the autoscaler module after an MGR restart?
@sebastian-philipp
Hi,
The purpose of this self._event = {} is just a local reference to track whether there is already an autoscaler event for a pool, so we don't need to create a new one but can update it instead. But yes, basically when we create an event locally, we also create one in the progress module. There is an alternative strategy where we could reference everything through the progress events themselves: we could check the events in the progress module and see which pool each one references, so we wouldn't need another self._event in the pg_autoscaler.
Just run a ceph mgr fail $(ceph mgr dump | jq -r '.active_name')
while a pg_autoscaler event is there and unfinished. The progress module should then reload all events from the store.
I guess we'd need to stash and repopulate this module's events too - could that metadata (initial_pg_num/pg_num_target) be added to the remote event and handled by the existing progress module persistence?
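One possible shape for stashing and repopulating that dict across restarts (a sketch only: set_store/get_store mirror the MgrModule key-value helpers, but FakeStore, the key name, and the event encoding are all assumptions for illustration):

```python
import json

class FakeStore(dict):
    """Stand-in for the MgrModule key-value store for this sketch."""
    def set_store(self, key, value):
        self[key] = value
    def get_store(self, key):
        return self.get(key)

def save_events(store, events):
    # events: {pool_id: (initial_pg_num, pg_num_target)}
    store.set_store('pg_autoscaler/events', json.dumps(events))

def load_events(store):
    raw = store.get_store('pg_autoscaler/events')
    if raw is None:
        return {}
    # JSON object keys are always strings; convert pool ids back to int
    # and the [initial, target] lists back to tuples.
    return {int(k): tuple(v) for k, v in json.loads(raw).items()}
```

With something like this, the module could call load_events() on startup and rebuild its in-memory self._event from whatever adjustments were still in flight.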
@@ -66,13 +75,14 @@ class PgAutoscaler(MgrModule):
     MODULE_OPTIONS = [
         {
             'name': 'sleep_interval',
-            'default': str(60),
+            'default': str(10),
was this just changed for testing?
@@ -240,9 +251,9 @@ def _get_pool_status(
         self,
         osdmap,
         pools,
-        threshold=3.0,
+        threshold=1.0,
just for testing?
refs=[("pool", int(pool_id))])

# if p['pg_num_target'] < p['pg_num_final']:
no longer needed?
self.remote('progress', 'update', ev._ev_id,
            ev_msg=" PG autoscaler increasing pool %s PGs from %d to %d" %
            (pool_id, pg_num, pg_num_target),
            ev_progress=pg_num/pg_num_target,
I'm thinking the progress should be measured by the current pg_num relative to the distance between the initial_pg_num at the module's last change and target_pg_num,
i.e. for increasing: (pg_num - initial_pg_num)/(target_pg_num - initial_pg_num)
Each time the autoscaler chooses a new target, it could reset the initial_pg_num and target_pg_num of an existing event. Then the progress measures the number of PGs changed by the latest decision, which makes sense since earlier history of shrinking/growing pg_num becomes irrelevant when the autoscaler changes things again.
Then in this function, the PgAdjustmentProgress wouldn't change; only the remote ev_progress would be updated.
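That calculation as a standalone sketch (function name assumed). Note it works unchanged for both directions, because the numerator and denominator carry the same sign:

```python
def adjustment_progress(pg_num, initial_pg_num, target_pg_num):
    """Progress of the autoscaler's latest decision only.

    (pg_num - initial_pg_num) / (target_pg_num - initial_pg_num):
    0.0 when nothing has moved yet, 1.0 when pg_num reaches the target,
    for both growing and shrinking pools.
    """
    return (pg_num - initial_pg_num) / (target_pg_num - initial_pg_num)
```

For example, growing 32 -> 64 with pg_num at 48 gives 0.5, and shrinking 64 -> 32 with pg_num at 48 also gives 0.5.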
force-pushed from 502a397 to 5540892
Looking good, just a couple minor comments
elif pg_num == initial_pg_num:
    continue

elif pg_num_target > pg_num:
these two cases only differ by one word now; you could make 'increasing'/'decreasing' a variable in the message to avoid repetition
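Something along these lines (a sketch; the function and variable names are assumed):

```python
def autoscaler_event_message(pool_id, pg_num, pg_num_target):
    # Single message template; only the direction word differs
    # between the growing and shrinking cases.
    direction = 'increasing' if pg_num_target > pg_num else 'decreasing'
    return "PG autoscaler %s pool %s PGs from %d to %d" % (
        direction, pool_id, pg_num, pg_num_target)
```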
Creates a progress event in the progress module when the pool needs pg_num adjusted to match the target pg_num. Also made a small change to the logic triggering pg adjustment. Signed-off-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>
Changed back the threshold value and deleted whitespace; also added more comments. Signed-off-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>
Set the threshold value back to the original. Signed-off-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>
Just getting rid of a whitespace. Signed-off-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>
Created a new class called PgAdjustmentProgress to keep track of initial pg_num and pg_num_target; also created a helper function called _update_progress_events to update the progress module using the self.remote function. Signed-off-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>
…st initial_pg_num: changed the progress calculation to (pg_num - initial_pg_num)/(target_pg_num - initial_pg_num). Also changed the threshold and sleep interval back to their default values. Signed-off-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>
force-pushed from 5540892 to 39de999
Basically got rid of the if/else statement and passed in a variable for increase/decrease instead. Signed-off-by: Kamoltat (Junior) Sirivadhna <ksirivad@redhat.com>
force-pushed from 39de999 to 3163b68
pools = osdmap.get_pools()
for pool_id in list(self._event):
    ev = self._event[pool_id]
    pool_data = pools[int(pool_id)]
2019-08-21T21:03:28.521+0000 7f9b3b5fe700 -1 Traceback (most recent call last):
File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 175, in serve
self._update_progress_events()
File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 350, in _update_progress_events
pool_data = pools[int(pool_id)]
KeyError: (1,)
I don't really understand how int could return a tuple…
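For what it's worth, int() cannot return a tuple; a KeyError whose message reads (1,) means the key that was actually looked up was the tuple (1,). This small demonstration (not the module's code; just reproducing the error shape) suggests the tuple is coming from whatever populated the mapping or the stored pool id, not from the int() call itself:

```python
# A dict KeyError displays the repr of the missing key, so
# "KeyError: (1,)" implies a tuple key was involved somewhere.
pools = {1: {'pg_num': 32}}
try:
    pools[(1,)]          # tuple key: reproduces "KeyError: (1,)"
except KeyError as e:
    missing = e.args[0]

assert missing == (1,)
assert int('1') == 1     # int() on a str or int never yields a tuple
```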
2019-08-29T07:52:47.255+0000 7f691c344700 -1 Traceback (most recent call last):
File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 175, in serve
self._update_progress_events()
File "/usr/share/ceph/mgr/pg_autoscaler/module.py", line 353, in _update_progress_events
pool_data = pools[int(pool_id)]
KeyError: (1,)
I still ran into this issue even though my branch contains the fix for https://tracker.ceph.com/issues/41386
* refs/pull/29035/head:
  mgr/pg_autoscaler: changes made reflect jdurgin's request
  mgr/pg_autoscaler: current pg_num compared to distance between the last initial_pg_num
  mgr/progress & mgr/pg_autoscaler: changes reflect liewegas' comment
  mgr/pg_autoscaler: get rid of white space
  mgr/progress: change threshold value to origin
  mgr/progress: cleaning up for pg_autoscaler
  mgr/progress: Added Pg Autoscaler Event
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Creates a progress event in the progress module
when the pool needs pg_num adjusted to match
the target pg_num. Also made a small change to
the logic triggering pg adjustment.
Signed-off-by: Kamoltat (Junior) Sirivadhna ksirivad@redhat.com