Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mon: prime pg_temp and a few health warning fixes #16530

Merged
merged 8 commits into from Jul 26, 2017

Conversation

Projects
None yet
3 participants
@xiexingguo
Copy link
Member

xiexingguo commented Jul 24, 2017

No description provided.

@xiexingguo xiexingguo requested a review from liewegas Jul 24, 2017

@@ -783,7 +783,7 @@ std::string pg_vector_string(const vector<int32_t> &a)
return oss.str();
}

std::string pg_state_string(int state)
std::string pg_state_string(uint32_t state)

This comment has been minimized.

Copy link
@branch-predictor

branch-predictor Jul 24, 2017

Member

What's the point of this change? Above you're passing signed int.

This comment has been minimized.

Copy link
@xiexingguo

xiexingguo Jul 24, 2017

Author Member

So we can passin -1 and get all available pg states out

This comment has been minimized.

Copy link
@branch-predictor

branch-predictor Jul 24, 2017

Member

You don't need to change type here to do that.

This comment has been minimized.

Copy link
@xiexingguo

xiexingguo Jul 24, 2017

Author Member

It's simpler and does no harm...

This comment has been minimized.

Copy link
@branch-predictor

branch-predictor Jul 24, 2017

Member

Using 0xFFFFFFFF is simplier and doesn't change the interface.

This comment has been minimized.

Copy link
@xiexingguo

xiexingguo Jul 24, 2017

Author Member

as you wish

@@ -3778,7 +3782,8 @@ int process_pg_map_command(
} else {
int filter = pg_string_state(state_str);
if (filter < 0) {
*ss << "'" << state_str << "' is not a valid pg state";
*ss << "'" << state_str << "' is not a valid pg state,"
<< " available choices: " << pg_state_string(-1);

This comment has been minimized.

Copy link
@branch-predictor

branch-predictor Jul 24, 2017

Member

This is already handled by a ceph python binary:

$ bin/ceph -c ceph.conf pg ls somethingawful
somethingawful not valid:  somethingawful not in active|clean|down|scrubbing|degraded|inconsistent|peering|repair|recovering|backfill_wait|incomplete|stale|remapped|deep_scrub|backfill|backfill_toofull|recovery_wait|undersized|activating|peered                                                                    
Invalid command:  unused arguments: [u'somethingawful']                                                 
pg ls {<int>} {active|clean|down|scrubbing|degraded|inconsistent|peering|repair|recovering|backfill_wait|incomplete|stale|remapped|deep_scrub|backfill|backfill_toofull|recovery_wait|undersized|activating|peered [active|clean|down|scrubbing|degraded|inconsistent|peering|repair|recovering|backfill_wait|incomplete|stale|remapped|deep_scrub|backfill|backfill_toofull|recovery_wait|undersized|activating|peered...]} :  list pg with specific pool, osd, state                                                                  
Error EINVAL: invalid command                                                                           

What's the point of redoing it here?

This comment has been minimized.

Copy link
@xiexingguo

xiexingguo Jul 24, 2017

Author Member

@branch-predictor Please refer to #16465, by suggestions from @liewegas we drop the client-side validations of pg-states now, so user will lose any pg_state info.

This patch tries to be slightly helpful due to the reason above.

@xiexingguo xiexingguo force-pushed the xiexingguo:wip-fix-pgtemp branch from 5d6f488 to f0ea4f7 Jul 24, 2017

@xiexingguo

This comment has been minimized.

Copy link
Member Author

xiexingguo commented Jul 24, 2017

retest this please

// do not touch if acting size will shrink
// otherwise we'll probably add some non-existent osds into pg_temp
return;
}

This comment has been minimized.

Copy link
@liewegas

liewegas Jul 24, 2017

Member

I'm worried this will skip some cases where a primed pg_temp helps. Can we instead filter acting to remove !exists osds? It seems possible there are some corner cases that this missed (e.g., if the osd is destroyed in the same epoch that some other CRUSH change is made).

xiexingguo added some commits Jul 24, 2017

mon/OSDMonitor: post-clean pg_temp during priming if possible
By vstarting a 3-osds cluster and 'ceph osd purge osd.0', then 'ceph -s':

  services:
    mon: 3 daemons, quorum a,b,c
    mgr: x(active)
    mds: 1/1/1 up {0=a=up:active}, 2 up:standby
    osd: 2 osds: 2 up, 2 in; 14 remapped pgs

And "ceph osd dump":

max_osd 3
osd.1 up   in  weight 1 up_from 9 up_thru 17 down_at 0 last_clean_interval [0,0) 127.0.0.1:6804/269058 127.0.0.1:6805/269058 127.0.0.1:6806/269058 127.0.0.1:6807/269058 exists,up 04336407-097f-4d72-89db-15b6e317370c
osd.2 up   in  weight 1 up_from 12 up_thru 18 down_at 0 last_clean_interval [0,0) 127.0.0.1:6808/269590 127.0.0.1:6809/269590 127.0.0.1:6810/269590 127.0.0.1:6811/269590 exists,up c8d334e4-aa40-4032-84e6-297d9f7a6fdb
pg_temp 1.0 [1,0,2]
pg_temp 1.1 [2,0,1]
pg_temp 1.3 [1,2,0]
pg_temp 1.4 [1,0,2]
pg_temp 1.5 [2,0,1]
pg_temp 1.6 [1,0,2]
pg_temp 1.7 [1,2,0]
pg_temp 2.0 [2,1,0]
pg_temp 2.1 [2,1,0]
pg_temp 2.3 [1,2,0]
pg_temp 2.4 [1,0,2]
pg_temp 2.5 [1,0,2]
pg_temp 2.6 [1,0,2]
pg_temp 2.7 [1,0,2]

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
mon/HealthMontior: guard propose_pending() process
Which is for leader only.
The current code logic has no problem but this is defensive
and therefore should be considered as a fine cleanup.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
mon/PGMap: make plurals be more compatible with existing codes
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
mon/PGMap: fix copy-and-paste error
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
mon/PGMap: fix condition checking for slow-requests
op_queue_age_hist collects op stats in ms, not sec.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
mon/PGMap: be helpful for "pg ls *" CLIs
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>

@xiexingguo xiexingguo force-pushed the xiexingguo:wip-fix-pgtemp branch from f0ea4f7 to b61cbe7 Jul 25, 2017

@xiexingguo

This comment has been minimized.

Copy link
Member Author

xiexingguo commented Jul 25, 2017

I'm worried this will skip some cases where a primed pg_temp helps. Can we instead filter acting to remove !exists osds? It seems possible there are some corner cases that this missed (e.g., if the osd is destroyed in the same epoch that some other CRUSH change is made).

@liewegas Can you check ea723fb and see if it is the right cure? Local verification shows it is okay.

xiexingguo added some commits Jul 22, 2017

mon/OSDMonitor: ENOENT on disabling non-existend app
so we don't bother to trigger an pool update, which is potentially
big stuff.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>
mon/OSDMonitor: ENOENT on removing non-existent app key
So we don't bother to trigger an pool update, which is potentially
big stuff.

Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>

@liewegas liewegas added this to the luminous milestone Jul 25, 2017

@liewegas liewegas merged commit ee06dc6 into ceph:master Jul 26, 2017

4 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodified Submodules submodules for project are unmodified
Details
make check make check succeeded
Details
make check (arm64) make check succeeded
Details

@xiexingguo xiexingguo deleted the xiexingguo:wip-fix-pgtemp branch Feb 1, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.