
ci: update mon_data_avail_warn percentage #11131

Merged
merged 1 commit into from
Oct 10, 2022
Merged

Conversation

parth-gr
Member

@parth-gr parth-gr commented Oct 10, 2022

I see that mon_data_avail_warn is set to 500M, but according to the Ceph docs this value should be a percentage. The default is 30%, so it is reassigned to 10% for the CI.

Closes: #11130
Signed-off-by: parth-gr <paarora@redhat.com>
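
For context, mon_data_avail_warn is a percentage threshold: a monitor raises a low-disk health warning when the available space on its data store drops below that percentage. A minimal sketch of the comparison, with illustrative values (this is not Ceph's actual code):

```python
# Illustrative sketch of the mon_data_avail_warn check; not Ceph's actual code.
def mon_disk_low(avail_percent: float, mon_data_avail_warn: float = 30) -> bool:
    """Return True when the monitor should raise a low-disk warning."""
    return avail_percent < mon_data_avail_warn

# With the Ceph default of 30%, a CI node with only 25% free space warns:
assert mon_disk_low(25, 30) is True
# With this PR's value of 10, the same node no longer warns:
assert mon_disk_low(25, 10) is False
```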

Description of your changes:

Which issue is resolved by this Pull Request:
Resolves #

Checklist:

  • Commit Message Formatting: Commit titles and messages follow the guidelines in the developer guide.
  • Skip Tests for Docs: If this is only a documentation change, add the label skip-ci on the PR.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes for the next minor release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

I see that mon_data_avail_warn is set to 500M, but according to the Ceph docs this value should be a percentage. The default is 30%, so it is reassigned to 10% for the CI.

Closes: rook#11130
Signed-off-by: parth-gr <paarora@redhat.com>
@parth-gr parth-gr marked this pull request as ready for review October 10, 2022 13:30
@parth-gr parth-gr requested a review from travisn October 10, 2022 13:30
@@ -19,7 +19,7 @@ data:
   mon_warn_on_pool_no_redundancy = false
   bdev_flock_retry = 20
   bluefs_buffered_io = false
-  mon_data_avail_warn = 500M
+  mon_data_avail_warn = 10
Member


Does it make sense to remove this configuration from this file? We cannot have any static values here, as the size differs from cluster to cluster. I got this warning as well in a new deployment.

Member Author

@parth-gr parth-gr Oct 10, 2022


This needs to be set because the recent CI has little disk space available, so it always gives us a warning.
By default mon_data_avail_warn is set to 30%, so lowering it to 10% means we should not see any such warning in the CI.



TL;DR I'd support removing the value entirely.

The default is a percentage (30%).

As a user new to Ceph, when I hit this issue I quickly found that the expected value was 30%, so I did not understand why it was warning when I had 92% free. I didn't know to look for this config.

As a user new to Ceph, I think it would have been least confusing to use the default value. If it warned when I was actually low on disk space I would have been fine with that, even in a test environment.

Member


I agree with the above points; we should not change the default configuration in the test YAMLs that users use to deploy test clusters.

Member Author


Okay, I will discuss this and update you.
As of now, the value is lower than the default, which means you won't see any such warnings in the future.

Contributor

@subhamkrai subhamkrai left a comment


Question: why are we changing this field for CI status? We know that resources in CI are minimal, so we expect warnings.

@parth-gr
Member Author

Question: why are we changing this field for CI status? We know that resources in CI are minimal, so we expect warnings.

To remove these warnings from the CI, the percentage is lowered to 10% so we don't see them.

@Madhu-1
Member

Madhu-1 commented Oct 10, 2022

Question: why are we changing this field for CI status? We know that resources in CI are minimal, so we expect warnings.

To remove these warnings from the CI, the percentage is lowered to 10% so we don't see them.

If we really want to test it in CI, why not add it to the CI scripts instead of modifying the YAML files directly?

@parth-gr
Member Author

Question: why are we changing this field for CI status? We know that resources in CI are minimal, so we expect warnings.

To remove these warnings from the CI, the percentage is lowered to 10% so we don't see them.

If we really want to test it in CI, why not add it to the CI scripts instead of modifying the YAML files directly?

Yes, right. I did that before, but the change was getting much bigger, so there was a suggestion to make it like this: #10990 (review)

@subhamkrai
Contributor

Question: why are we changing this field for CI status? We know that resources in CI are minimal, so we expect warnings.

To remove these warnings from the CI, the percentage is lowered to 10% so we don't see them.

If we really want to test it in CI, why not add it to the CI scripts instead of modifying the YAML files directly?

Yes, right. I did that before, but the change was getting much bigger, so there was a suggestion to make it like this: #10990 (review)

Okay, but why do we need these changes in the first place? If it is only the CI that gives the mon health warning, we don't have an issue, since it is just CI; unless something else in the CI depends on the mon health, we can safely remove this flag. @travisn @BlaineEXE
thoughts?

@parth-gr parth-gr changed the title from "ci: test mon health in CI" to "ci: update mon_data_avail_warn percentage" Oct 10, 2022
@parth-gr
Member Author

Question: why are we changing this field for CI status? We know that resources in CI are minimal, so we expect warnings.

To remove these warnings from the CI, the percentage is lowered to 10% so we don't see them.

If we really want to test it in CI, why not add it to the CI scripts instead of modifying the YAML files directly?

Yes, right. I did that before, but the change was getting much bigger, so there was a suggestion to make it like this: #10990 (review)

Okay, but why do we need these changes in the first place? If it is only the CI that gives the mon health warning, we don't have an issue, since it is just CI; unless something else in the CI depends on the mon health, we can safely remove this flag. @travisn @BlaineEXE thoughts?

I would be okay with updating every CI script with this value if this change isn't accepted because it modifies our local test env, but a CI with no warnings was the aim of the issue.

Member

@Madhu-1 Madhu-1 left a comment


As it's a percentage, making it 10 is fine I think, since it's not a GB/MB value anymore.

@travisn
Member

travisn commented Oct 10, 2022

@parth-gr Did we not see the warnings in the CI when it was set to 500M? Sounds good to change it to 10%, but I just want to make sure we can confirm it helps the CI.

@parth-gr
Member Author

parth-gr commented Oct 10, 2022

@parth-gr Did we not see the warnings in the CI when it was set to 500M? Sounds good to change it to 10%, but I just want to make sure we can confirm it helps the CI.

Yes, I saw the warnings in the CI; the value was being interpreted as 500% instead of 500M, which wasn't intended.
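
Since the option is a percentage, a threshold of 500 can never be satisfied: available space tops out at 100%, so the warning fires on every cluster regardless of how much space is free. A hypothetical illustration (this is not Ceph's actual parsing or health-check code):

```python
# Hypothetical illustration of why a unit-suffixed value misbehaves for a
# percentage option; this is not Ceph's actual parsing or health-check code.
def mon_disk_low(avail_percent: float, threshold: float) -> bool:
    # Warn when the monitor's available space percentage is below the threshold.
    return avail_percent < threshold

# "500M" read as the bare number 500 and treated as a percent: the warning
# fires even on a nearly empty disk.
assert all(mon_disk_low(p, 500) for p in (8, 50, 92, 100))
# A true percentage threshold of 10 only warns when space is genuinely low.
assert mon_disk_low(8, 10) and not mon_disk_low(92, 10)
```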

@parth-gr
Member Author

@travisn I tested this by applying 10% in the failed CI.
I think I should also test this PR's changes by failing the CI once, to confirm it applies the change now.

@parth-gr
Member Author

It works well:

sh-4.4$ ceph -s
  cluster:
    id:     65a44baf-955b-48bc-afe6-a33c4dfb89b4
    health: HEALTH_OK
 
  services:
    mon:           1 daemons, quorum a (age 21m)
    mgr:           a(active, since 19m)
    mds:           1/1 daemons up, 1 hot standby
    osd:           2 osds: 2 up (since 19m), 2 in (since 20m)
    cephfs-mirror: 1 daemon active (1 hosts)
    rbd-mirror:    1 daemon active (1 hosts)
    rgw:           1 daemon active (1 hosts, 1 zones)
 
  data:
    volumes: 1/1 healthy
    pools:   13 pools, 400 pgs
    objects: 245 objects, 486 KiB
    usage:   74 MiB used, 12 GiB / 12 GiB avail
    pgs:     400 active+clean
 
  io:
    client:   1.4 KiB/s rd, 2 op/s rd, 0 op/s wr

@travisn travisn merged commit b321524 into rook:master Oct 10, 2022
mergify bot added a commit that referenced this pull request Oct 10, 2022
ci: update mon_data_avail_warn percentage  (backport #11131)
travisn added a commit that referenced this pull request Oct 10, 2022
ci: update mon_data_avail_warn percentage  (backport #11131)
Successfully merging this pull request may close these issues.

ceph status reports 'mon a is low on available space' when fs is 8% used
5 participants