
Size NESE allocation correctly for nerc-ocp-infra cluster #38

Closed
larsks opened this issue Nov 17, 2022 · 18 comments
Labels: loki-logs, observability, openshift (This issue pertains to NERC OpenShift), research (This task is primarily about information discovery)

@larsks
Contributor

larsks commented Nov 17, 2022

We ran out of space today on the 1 TB RBD pool currently allocated to the nerc-ocp-infra cluster. @jtriley has temporarily increased the pool size, but we would like to get a better sense of how our utilization has been increasing since we installed the observability tools, and use that to select a more appropriate pool allocation.

@larsks
Contributor Author

larsks commented Nov 17, 2022

From the Slack conversation:

14:56 naved001 @ChristopherTate I am looking at those StorageSystem
    dashboards again, and I see:
14:56          Used Capacity is 624.7 GiB
14:56          Requested capacity is 2.36 TiB
14:56
14:56          and the ceph pool was definitely fully utilized (it
    reported 999 GiB) when things started to fail. What I am trying to
    figure out is how can we configure a useful alert to not run into
    this again.
14:56
14:56          The current pool size is 2 TiB, and if we use all the
    requested capacity we will run into it again. cc: @jtriley*
    (edited) [:heavy_check_mark:1]
14:59 jtriley* We probably need to bump the quota further but wanted
    to check with you folks re: forecasting. If we're logging things
    and collecting metrics on infra then we should probably bump it to
    at least 5T.
15:04 ChristopherTate @naved001 what is the right resource to look at
    when looking for the 2TiB limit? I can query
    PersistentVolumeClaims and the storage remaining as a %, but I
    don't think that's the right thing to look at here:
15:04                 https://multicloud-console.apps.nerc-ocp-infra.r
    c.fas.harvard.edu/grafana/explore?orgId=1&left=%5B%22now-
    1h%22,%22now%22,%22Observatorium%22,%7B%22exemplar%22:true,%22expr
    %22:%22kubelet_volume_stats_available_bytes%7Bpersistentvolumeclai
    m%3D~%5C%22db-noobaa-db-pg-.*%7Cmetrics-backing-store-noobaa-
    pvc-.*%7Cnoobaa-default-backing-store-noobaa-
    pvc-.*%5C%22%7D%2Fkubelet_volume_stats_capacity_bytes%22%7D%5D
15:07 naved001 I am not sure, I think the used capacity should be it
    but I suspect that there may be some orphan volumes in the ceph
    pool that don’t correspond to any PV (maybe from old odf
    installs).
15:11 jtriley* Something driven off raw ceph metrics would be better
    but not sure what metrics ODF publishes exactly.
15:11 naved001 well the number of volumes in ceph pool == pvs in
    openshift, so my theory about orphan volumes is wrong.
15:11 larsks @jtriley* I'm going to follow up with @ChristopherTate
    tomorrow (i hope) on the subject of storage metrics, alerts, etc.
    [:+1:1]
15:12 naved001 so yeah, if we could somehow get it from ceph metrics
    that would be the most reliable.
15:12 larsks I'm going to create an issue about sizing and
    forecasting. [:+1:1]
15:12 ChristopherTate I hope to find some metrics for entire
    StorageSystems, but so far I haven't had any luck.
15:13 ChristopherTate I'm around tomorrow too, so feel free to reach
    out @larsks
15:13 jtriley* > For internal Mode clusters, various alerts related to
    the storage metrics services, storage cluster, disk devices,
    cluster health, cluster capacity, and so on are displayed in the
    Block and File, and the object dashboards. These alerts are not
    available for external Mode. (edited)
15:13 jtriley* https://access.redhat.com/documentation/en-us/red_hat_o
    penshift_data_foundation/4.10/html/monitoring_openshift_data_found
    ation/alerts (edited)
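
As discussed above, an alert like this would probably be driven off the Ceph pool metrics that ODF exposes rather than per-PVC kubelet metrics. Below is a minimal sketch of what such a PrometheusRule could look like; it assumes the ceph_pool_stored and ceph_pool_max_avail metrics from the Ceph mgr prometheus module actually reach our Prometheus, and the rule name, namespace, and threshold are illustrative only, not something we have deployed.

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: ceph-pool-capacity               # illustrative name
  namespace: openshift-storage
spec:
  groups:
    - name: ceph-pool-capacity
      rules:
        - alert: CephPoolNearFull
          # Fraction of the pool's usable capacity already stored;
          # assumes ceph_pool_stored / ceph_pool_max_avail are scraped.
          expr: ceph_pool_stored / (ceph_pool_stored + ceph_pool_max_avail) > 0.80
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: Ceph pool usage is above 80% of available capacity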

@naved001

Out of curiosity I was looking at rbd du, which tells you how much is provisioned vs. how much we are actually using:

NAME                                          PROVISIONED  USED
csi-vol-1f6c31ee-f9a4-11ec-aa57-0a580a8000a8      954 MiB  224 MiB
csi-vol-1f6c3587-f9a4-11ec-aa57-0a580a8000a8       94 GiB  9.1 GiB
csi-vol-2f86b744-1f27-11ed-9e8c-0a580a820096       94 GiB  4.6 GiB
csi-vol-30292544-22de-11ed-9e8c-0a580a820096       10 GiB  320 MiB
csi-vol-302f4c6b-22de-11ed-9e8c-0a580a820096       50 GiB  472 MiB
csi-vol-3fa060d7-2a25-11ed-97a4-0a580a810068       10 GiB  124 MiB
csi-vol-3fa0646b-2a25-11ed-97a4-0a580a810068       10 GiB      0 B
csi-vol-3fa1cab1-2a25-11ed-97a4-0a580a810068       20 GiB  1.4 GiB
csi-vol-3fa1cd30-2a25-11ed-97a4-0a580a810068        1 GiB   32 MiB
csi-vol-3fa1cd3b-2a25-11ed-97a4-0a580a810068        5 GiB  248 MiB
csi-vol-3fa1ce2c-2a25-11ed-97a4-0a580a810068       10 GiB  336 MiB
csi-vol-3fa1cebb-2a25-11ed-97a4-0a580a810068        1 GiB   68 MiB
csi-vol-45c9235d-091f-11ed-be28-0a580a8000a8        1 GiB  248 MiB
csi-vol-49a5b626-2f8e-11ed-97a4-0a580a810068        5 GiB  516 MiB
csi-vol-4d08f73e-0920-11ed-be28-0a580a8000a8        1 GiB  420 MiB
csi-vol-56d48967-091f-11ed-be28-0a580a8000a8       10 GiB  528 MiB
csi-vol-572d8230-091f-11ed-be28-0a580a8000a8      100 GiB   26 GiB
csi-vol-572f7952-091f-11ed-be28-0a580a8000a8       10 GiB  540 MiB
csi-vol-57471448-091f-11ed-be28-0a580a8000a8        1 GiB  248 MiB
csi-vol-574883bc-091f-11ed-be28-0a580a8000a8      100 GiB   14 GiB
csi-vol-57aa819c-091f-11ed-be28-0a580a8000a8       10 GiB  492 MiB
csi-vol-57c9615a-091f-11ed-be28-0a580a8000a8        1 GiB  372 MiB
csi-vol-5c8d9ec6-3b59-11ed-ac22-0a580a820044       10 GiB  372 MiB
csi-vol-5c8d9fc9-3b59-11ed-ac22-0a580a820044      150 GiB  141 GiB
csi-vol-5ce35237-0920-11ed-be28-0a580a8000a8        1 GiB  412 MiB
csi-vol-677dff29-091f-11ed-be28-0a580a8000a8        1 GiB  248 MiB
csi-vol-7f665279-229b-11ed-9e8c-0a580a820096       50 GiB  168 MiB
csi-vol-9237c5a8-229b-11ed-9e8c-0a580a820096       50 GiB  168 MiB
csi-vol-95866f8c-0397-11ed-be28-0a580a8000a8       10 GiB  240 MiB
csi-vol-9586731f-0397-11ed-be28-0a580a8000a8       10 GiB  228 MiB
csi-vol-9589f83d-0397-11ed-be28-0a580a8000a8       10 GiB  240 MiB
csi-vol-ad6e4995-f934-11ec-aa57-0a580a8000a8      100 GiB  2.1 GiB
csi-vol-b29e4628-350c-11ed-97a4-0a580a810068       10 GiB  1.6 GiB
csi-vol-c32d498f-02e9-11ed-be28-0a580a8000a8       10 GiB   88 MiB
csi-vol-c32fe68d-02e9-11ed-be28-0a580a8000a8       10 GiB   76 MiB
csi-vol-c332e81b-02e9-11ed-be28-0a580a8000a8       10 GiB   72 MiB
csi-vol-c8e19345-f88c-11ec-aa57-0a580a8000a8       50 GiB   12 GiB
csi-vol-d3acf4f3-18c0-11ed-8ab8-0a580a8200c5       10 GiB   72 MiB
csi-vol-d3acf741-18c0-11ed-8ab8-0a580a8200c5      150 GiB  141 GiB
csi-vol-d3af0b2a-18c0-11ed-8ab8-0a580a8200c5       10 GiB   68 MiB
csi-vol-d3b6a961-18c0-11ed-8ab8-0a580a8200c5       50 GiB  168 MiB
csi-vol-d526632f-19c0-11ed-8ab8-0a580a8200c5      400 GiB  207 GiB
csi-vol-d528069a-19c0-11ed-8ab8-0a580a8200c5      400 GiB  209 GiB
csi-vol-d5296ef1-19c0-11ed-8ab8-0a580a8200c5      400 GiB  199 GiB
csi-vol-e86ab39f-091f-11ed-be28-0a580a8000a8      100 GiB   13 GiB
csi-vol-f3ddc05a-091f-11ed-be28-0a580a8000a8      100 GiB   17 GiB
csi-vol-f5ba8f7d-f8e8-11ec-aa57-0a580a8000a8       50 GiB   22 GiB
csi-vol-faa23e6d-2302-11ed-9e8c-0a580a820096       10 GiB   68 MiB
csi-vol-fabdd2a8-2302-11ed-9e8c-0a580a820096       50 GiB  168 MiB
<TOTAL>                                           2.7 TiB  1.0 TiB

Things that stood out to me:

  1. The total provisioned space matches what we see in the StorageCluster dashboard, but the used space is more than what's reported in the OpenShift dashboard.
  2. The 3 x 400 GiB volumes (openshift-storage/metrics-backing-store-noobaa-pvc) are at about 50% usage.
  3. There are 2 Loki-related 150 GiB volumes (openshift-operators-redhat/wal-lokistack-ingester-0 and loki-namespace/wal-lokistack-ingester-0), and those are 90%+ full.

@computate
Member

computate commented Nov 18, 2022

I will be looking into alerts that can watch the available space of our Ceph Cluster. Here are some metrics I have found that can help. I need to make sure that all of these fields are also sent to Observability, because some are missing:

@computate
Member

I was attempting to configure retention of Loki logs with this PR OCP-on-NERC/nerc-ocp-config#165, but this is not yet possible because the RetentionStreamSpec feature is not yet released in the latest Loki Operator.

The commits in PR grafana/loki#7106 from Grafana Loki have been merged into OpenShift Loki (openshift/loki@9bee101), but the latest version of the Red Hat Loki Operator, v5.5.4-4, does not yet include the commits that add RetentionStreamSpec. See the lokistack_types.go file.

As soon as a newer version of the Loki Operator is available, we can upgrade to enable retention. Until then, it seems there is no way for us to limit the amount of logs stored in Loki, unless there is a way to limit the data from the OpenShift side. I will look into limits on the OpenShift side.
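
For reference, once the operator does ship RetentionStreamSpec, the configuration would look roughly like the sketch below on the LokiStack custom resource. This is illustrative only: the name, namespace, exact apiVersion for whichever operator release we land on, and the stream selectors are assumptions, not our actual nerc-ocp-config manifests.

apiVersion: loki.grafana.com/v1
kind: LokiStack
metadata:
  name: lokistack                        # illustrative name/namespace
  namespace: openshift-operators-redhat
spec:
  # size, storage, storageClassName, etc. omitted for brevity
  limits:
    global:
      retention:
        days: 90                         # global retention period
        streams:                         # RetentionStreamSpec entries
          - days: 30
            selector: '{log_type="infrastructure"}'   # assumed selector label
          - days: 30
            selector: '{log_type="application"}'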

@hpdempsey

Documentation for acceptance tests has been updated to indicate that NERC currently plans to defer Data Management requirements 3 and 4, so we will proceed with acceptance tests and return to this issue later when NERC is ready to address them.

@hpdempsey

Whoops -- sorry, I closed the wrong issue.

hpdempsey reopened this Dec 7, 2022
@computate
Member

@larsks I have opened a RH Support Case for the Loki Operator Retention: https://access.redhat.com/support/cases/#/case/03391568

@computate
Member

Closing because we were able to apply the basic 90-day retention described in #75.

@larsks
Contributor Author

larsks commented Feb 1, 2023

@computate is the existing volume size going to be sufficient for 90 days of data at the rate we're consuming space?

computate reopened this Feb 6, 2023
@computate
Member

It's a good question, @larsks; I will reopen this and do some estimates.

@computate
Member

@larsks @jtriley Here is a look at the Ceph volume size for Loki:

I don't think this leaves a lot of room to collect more logs, especially with a lot of users in production. Are we constrained to 3.5 TiB, or is it possible to have a lot more? Any thoughts on the data?

@larsks
Contributor Author

larsks commented Feb 15, 2023

I think I answered this in chat, but I should probably leave the answer here as well. We're not constrained to 3.5 TB; we can request additional storage from NESE through @jtriley. From your comment it sounds like we currently have enough storage to support our target 90-day retention, but if we need longer-term storage we can request additional space.

@joachimweyl
Contributor

@computate now that we are able to properly size retention, is this still an issue? If so, what size should we request?

@computate
Member

@joachimweyl We set up the new logging and retention features 2 weeks ago. I think we should wait until 90 days of logs and retention have been applied before deciding on the amount of storage here. 3.5 TB seems to be enough for the logs from 2 weeks of observation, at least at the current load on NERC.

larsks removed their assignment Apr 25, 2023
joachimweyl added the openshift label Aug 16, 2023
@computate
Member

Based on the recent issue "Unsupported volume count, the maximum supported volume count is 20" reported by @larsks, we need to be able to increase the numVolumes of NooBaa BackingStores over time. We cannot exceed numVolumes: 20, so we need to ensure that the storage value of each BackingStore is large enough that we never need to exceed numVolumes: 20.
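
For context, the relevant fields live on the NooBaa BackingStore custom resource. The sketch below is illustrative only (the name, storage class, and sizes are assumptions, not our current manifests); the point is that total pv-pool capacity is numVolumes times the per-volume storage request, so the per-volume size has to be chosen large enough up front that growth never pushes numVolumes past 20.

apiVersion: noobaa.io/v1alpha1
kind: BackingStore
metadata:
  name: metrics-backing-store           # illustrative name
  namespace: openshift-storage
spec:
  type: pv-pool
  pvPool:
    numVolumes: 12                       # hard ceiling of 20 per backing store
    resources:
      requests:
        storage: 400Gi                   # per-volume size; total capacity = numVolumes x storage
    storageClass: ocs-storagecluster-ceph-rbd   # assumed ODF RBD storage class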

@computate
Member

Based on the latest comment in this issue regarding log backups reported by @computate, we are currently generating 1 Ti / 9 days * 365.25 days / 12 months = 3.4 Ti of log data per month.

We currently have 4.8 Ti of Ceph storage available for logs. With our log retention of 90 days for audit logs and 30 days for infrastructure and application logs on numVolumes: 12, we're doing fine, staying between 1 Ti and 2 Ti of retained log storage on Ceph. I expect application logs to grow, and infrastructure logs to grow as we increase the number of clusters.

@larsks
Contributor Author

larsks commented Sep 13, 2023

@computate I'm not sure if 20 is a hard limit, or if it is somehow influenced by the number of nodes, size of volumes, etc. I'm going to open a support case with those questions today.

@joachimweyl
Contributor

We are moving away from the infra cluster housing observability.
