Invalid syntax in 24.05 when VolumeAnalytics enabled #3023

Closed
db-wally007 opened this issue Jun 29, 2024 · 11 comments · Fixed by #3026
@db-wally007

db-wally007 commented Jun 29, 2024

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please let us know in a comment

Problem

Hello,

I've upgraded Harvest from 24.02 to 24.05 (the latest as of today). On container start, the following error floods the container log output (journald in my case):

Jun 29 17:00:28 podman1 netapp-harvest-exporter[3241772]: 2024-06-29T17:00:28Z ERR volumeanalytics/volumeanalytics.go:253 > set metric error="strconv.ParseFloat: parsing \"\": invalid syntax" Poller=agora object=VolumeAnalytics plugin=Rest:VolumeAnalytics value=
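
For readers unfamiliar with this message: it is the text Go's strconv.ParseFloat produces when handed an empty string, which matches the empty `value=` field in the log line. The sketch below reproduces it and shows one possible guard. `parseMetric` is a hypothetical helper written for illustration; it is not Harvest's code and not necessarily how the fix was implemented.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseMetric is a hypothetical helper, not taken from Harvest.
// It reports ok=false for an empty value instead of returning a parse error,
// so the caller can skip the metric quietly.
func parseMetric(value string) (f float64, ok bool, err error) {
	if value == "" {
		return 0, false, nil
	}
	f, err = strconv.ParseFloat(value, 64)
	if err != nil {
		return 0, false, err
	}
	return f, true, nil
}

func main() {
	// Parsing an empty string yields the same error text as the log line.
	_, err := strconv.ParseFloat("", 64)
	fmt.Println(err) // strconv.ParseFloat: parsing "": invalid syntax

	// With the guard, the empty value is skipped and no error is raised.
	_, ok, err := parseMetric("")
	fmt.Println(ok, err) // false <nil>
}
```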

config file:

---

Exporters:
  troy:
    exporter: Prometheus
    local_http_addr: 0.0.0.0
    port: 12990
    global_prefix: netapp_
  agora:
    exporter: Prometheus
    local_http_addr: 0.0.0.0
    port: 12991
    global_prefix: netapp_

Pollers:
  troy:
    datacenter: EQX
    addr: troy-cluster.xxx.de
    auth_style: basic_auth
    username: $__env{NETAPP_HARVEST_READONLY_USERNAME}
    password: $__env{NETAPP_HARVEST_READONLY_PASSWORD}
    use_insecure_tls: true
    exporters:
      - troy
    collectors:
      - Rest
      - RestPerf
      - Ems
  agora:
    datacenter: EQX
    addr: agora-cluster.xxx.de
    auth_style: basic_auth
    username: $__env{NETAPP_HARVEST_READONLY_USERNAME}
    password: $__env{NETAPP_HARVEST_READONLY_PASSWORD}
    use_insecure_tls: true
    exporters:
      - agora
    collectors:
      - Rest
      - RestPerf
      - Ems

Container file:

    FROM {{ netapp_harvest_exporter_podman_template }}
    EXPOSE 12990 12991
    COPY harvest.yml /opt/harvest/harvest.yml
    COPY harvest_entrypoint.sh /opt/harvest/harvest_entrypoint.sh
    ENV NETAPP_HARVEST_READONLY_USERNAME
    ENV NETAPP_HARVEST_READONLY_PASSWORD
    WORKDIR /opt/harvest
    ENTRYPOINT ["./harvest_entrypoint.sh"]
    CMD ["start", "--config", "harvest.yml", "--loglevel=4"]

harvest_entrypoint.sh:

      #!/bin/bash
      # Start Harvest, then keep the container alive.
      bin/harvest "$@"
      exec /bin/sleep infinity

Configuration

No response

Poller

agora poller

Version

root@podman1:~>podman exec -it  netapp-harvest-exporter bin/harvest version
harvest version 24.05.2-1 (commit bae3aad2) (build date 2024-06-13T07:58:11-0400) linux/amd64

Poller logs

No response

OS and platform

RHEL 9.4 , podman

ONTAP or StorageGRID version

NetApp Release 9.13.1P7

Additional Context

root@podman1:~>podman exec -it  netapp-harvest-exporter bin/harvest zapi -p agora show system
connected to AGORA (NetApp Release 9.13.1P7: Wed Jan 24 16:17:54 UTC 2024)
[results]                                          -                                   *
  [build-timestamp]                                -                          1706113074
  [is-clustered]                                   -                                true
  [version]                                        - NetApp Release 9.13.1P7: Wed Jan 24 16:17:54 UTC 2024

  [version-tuple]                                  -                                   *
    [system-version-tuple]                         -                                   *
      [generation]                                 -                                   9
      [major]                                      -                                  13
      [minor]                                      -                                   1

References

No response

@db-wally007
Author

db-wally007 commented Jun 29, 2024

I just checked our two clusters. The behavior is a bit strange, and the details may help you reproduce the issue.

1. TROY appliance: has volumes with the VolumeAnalytics feature enabled. Result: no errors from the poller, and VolumeAnalytics metrics are present in Prometheus/Grafana.

2. AGORA appliance: has no volume with the VolumeAnalytics feature enabled, but we have an SVM-DR relationship set up from TROY to AGORA (i.e. the SVM on TROY that has VolumeAnalytics enabled is replicated to AGORA). Result: the error in the logs as described in the earlier post.

@rahulguptajss
Contributor

It appears that the AGORA appliance has volumes with analytics enabled. Could you run the following command for the AGORA appliance? Please note that USERNAME, PASSWORD, and URL should be replaced with the appropriate values.

curl -s -k -u USERNAME:PASSWORD 'https://URL/api/storage/volumes?return_records=true&fields=name,svm.name,uuid&analytics.state=on&max_records=20&order_by=space.used%20desc&ignore_unknown_fields=true'
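
If you would rather assemble this query in code than by hand, the following is a minimal Go sketch that builds the URL with the same parameters as the curl command. The function name and the host `cluster.example.com` are placeholders invented for this example. Note that net/url encodes the space as `+` and the commas as `%2C`, which should decode to the same parameter values on the server side, although that has not been checked against ONTAP here.

```go
package main

import (
	"fmt"
	"net/url"
)

// analyticsVolumesURL is a hypothetical helper. It builds the volume query
// from the curl command above for the given host (a placeholder).
func analyticsVolumesURL(host string) string {
	q := url.Values{}
	q.Set("return_records", "true")
	q.Set("fields", "name,svm.name,uuid")
	q.Set("analytics.state", "on") // only volumes that report analytics as on
	q.Set("max_records", "20")
	q.Set("order_by", "space.used desc")
	q.Set("ignore_unknown_fields", "true")

	u := url.URL{
		Scheme:   "https",
		Host:     host,
		Path:     "/api/storage/volumes",
		RawQuery: q.Encode(), // Encode sorts the parameters by key
	}
	return u.String()
}

func main() {
	fmt.Println(analyticsVolumesURL("cluster.example.com"))
}
```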

If any records are returned, could you take the UUID of one volume and substitute it for UUID in the following command to check whether percentages are present in the response? They appear to be empty, which would produce the error in the log.

curl -s -k -u USERNAME:PASSWORD 'https://URL/api/storage/volumes/UUID/files?return_records=true&fields=analytics.by_accessed_time.bytes_used.percentages&type=directory&max_records=100'
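
To check a saved response for missing percentages without reading it by eye, a small Go sketch like the one below could help. The struct layout is an assumption inferred from the field path in the query (`analytics.by_accessed_time.bytes_used.percentages`) plus a `records` array with a `name` per entry; the function name and the sample body are made up for illustration and are not output from a real cluster.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// filesResponse mirrors only the fields requested by the curl command above.
// The JSON layout is an assumption based on the field path in the query.
type filesResponse struct {
	Records []struct {
		Name      string `json:"name"`
		Analytics struct {
			ByAccessedTime struct {
				BytesUsed struct {
					Percentages []float64 `json:"percentages"`
				} `json:"bytes_used"`
			} `json:"by_accessed_time"`
		} `json:"analytics"`
	} `json:"records"`
}

// missingPercentages returns the names of records that carry no percentages.
func missingPercentages(body []byte) ([]string, error) {
	var resp filesResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	var missing []string
	for _, r := range resp.Records {
		if len(r.Analytics.ByAccessedTime.BytesUsed.Percentages) == 0 {
			missing = append(missing, r.Name)
		}
	}
	return missing, nil
}

func main() {
	// Made-up sample body: record "a" has percentages, record "b" has none.
	body := []byte(`{"records":[{"name":"a","analytics":{"by_accessed_time":{"bytes_used":{"percentages":[60.5,39.5]}}}},{"name":"b"}]}`)
	missing, err := missingPercentages(body)
	fmt.Println(missing, err)
}
```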

Also, could you share the ONTAP version of the TROY appliance?

You can also share output of these curl commands with us via ng-harvest-files@netapp.com

@db-wally007
Author

db-wally007 commented Jul 1, 2024

ng-harvest-files@netapp.com

Email sent with output of the curl commands from both SVM-DR source (TROY) and SVM-DR destination (AGORA)

It includes home directories with internal usernames - which might be somewhat sensitive.
Let me know if you received it.

TROY (AFF-250) runs the exact same ONTAP version as AGORA (AFF-220). We were advised by the NetApp techs to keep ONTAP versions in sync for best results with SVM-DR.

@cgrinds
Collaborator

cgrinds commented Jul 1, 2024

Thanks @db-wally007, we got the emails with the curl output.

@cgrinds
Collaborator

cgrinds commented Jul 1, 2024

Hi @db-wally007, your curl request shows that ONTAP believes analytics are enabled for two volumes on agora. Both of those volumes are on the same SVM. It sounds like you didn't think volume analytics were enabled on agora. Is that correct?

Interestingly, even though those two volumes say they have volume analytics enabled, when queried for analytics they return no analytics. 😄 I wonder if this is an ONTAP bug related to the SVM-DR relationship you mentioned? Are you on Discord? If so, please ask if this is a bug on the #ontap channel.

Regardless, Harvest needs to do a better job ignoring these faux volume-analytics-enabled volumes. We'll fix that.

@db-wally007
Author

db-wally007 commented Jul 1, 2024

hi @db-wally007 your curl request shows that ONTAP believes analytics are enabled for two volumes on agora, both of those volumes are on the same SVM. It sounds like you didn't think volume analytics were enabled on agora. Is that correct?

Interestingly, even though those two volumes say they have volume analytics enabled, when queried for analytics they return no analytics. 😄 I wonder if this is an ONTAP bug related to the SVM-DR relationship you mentioned? Are you on Discord? If so, please ask if this is a bug on the #ontap channel.

Regardless, Harvest needs to do a better job ignoring these faux volume-analytics-enabled volumes. We'll fix that.

VolumeAnalytics is enabled on volumes that reside on TROY (plato_svm), and that SVM is replicated to AGORA via SVM-DR into the dr_plato_svm SVM.

dr_plato_svm on AGORA is stopped. Maybe that is the issue?
(While the source SVM plato_svm is running, the destination SVM dr_plato_svm must be stopped, obviously.)

AGORA serves as an SVM-DR destination.

Re: Discord, (un)fortunately I am not on it (no social media accounts).

rahulguptajss self-assigned this Jul 2, 2024
rahulguptajss linked a pull request Jul 2, 2024 that will close this issue
@rahulguptajss
Contributor

@db-wally007 It will be fixed via #3026.

In the meantime, you may try to disable volume activity tracking for these volumes which are in an SVM-DR stopped vserver.

List Relevant Volumes

To list the relevant volumes with activity tracking enabled, use the following command:

volume show -activity-tracking-state on -vserver

Disable Activity Tracking

To disable activity tracking, use the following command:

volume activity-tracking off -volume

Ideally, in a DR relationship SVM, activity tracking should be disabled. I tried enabling it in version 9.15 but encountered the following error:

Volume activity tracking wasn't enabled on volume "umeng_aff300_05_svm5_root" in storage VM "xyz" due to the following reason: "This operation is not permitted on a Vserver that is configured as the destination for Vserver DR."

@db-wally007
Author

@db-wally007 It will be fixed via #3026.

In the meantime, you may try to disable volume activity tracking for these volumes which are in an SVM-DR stopped vserver.

Ideally, in a DR relationship SVM, activity tracking should be disabled. I tried enabling it in version 9.15 but encountered the following error:

As I wrote earlier, VolumeAnalytics IS off on AGORA.
The problem is with the query Harvest sends or with the response it receives.

Here is ONTAP output:

AGORA::*> volume activity-tracking show
vserver volume                                     state
------- ------------------------------------------ -----
AGORA   MDV_CRS_e26ccfc6e20911ea9b9cd039ea1acb75_A off
AGORA   MDV_CRS_e26ccfc6e20911ea9b9cd039ea1acb75_B off
AGORA-01
        vol0                                       off
AGORA-02
        vol0                                       off
agora_svm
        agora_svm_root_volume                      off
agora_svm
        vmware_backup_agora                        off
dr_plato_svm
        backup                                     off
dr_plato_svm
        home_dirs                                  off
dr_plato_svm
        pcap                                       off
dr_plato_svm
        plato_svm_root_volume                      off
dr_plato_svm
        rdf                                        off
dr_plato_svm
        shared_data                                off
dr_plato_svm
        xml                                        off
13 entries were displayed.

AGORA::*>

and

AGORA::*> volume show -activity-tracking-state on
There are no entries matching your query.

AGORA::*>

@rahulguptajss
Contributor

rahulguptajss commented Jul 2, 2024

Understood. Activity Tracking is used for Top K metrics in ONTAP. I should have asked you to check the output of the following CLI, as these are what we filter on in the REST call.

You should see some results for this SVM with the following query:

volume show -analytics-state on

However, we get empty records when we call the following REST endpoint with curl, which makes sense because files on this volume have not yet been accessed on the SVM-DR site. This is something we need to handle in Harvest. It will be fixed via #3026. Thanks.

curl -s -k -u xxx:xxx 'https://xxxx/api/storage/volumes/xxxx/files?return_records=true&fields=analytics.by_accessed_time.bytes_used.percentages&type=directory&max_records=100'

@db-wally007
Author

Understood. Activity Tracking is used for Top K metrics in ONTAP. I should have asked you to check the output of the following CLI, as these are what we filter on in the REST call.

You should see some results for this SVM with the following query:

volume show -analytics-state on

Here is the output from SVM-DR destination:

AGORA::*> volume show -analytics-state on
There are no entries matching your query.

AGORA::*>

and here is the output from SVM-DR source:

TROY::*> volume show -analytics-state on
Vserver   Volume       Aggregate    State      Type       Size  Available Used%
--------- ------------ ------------ ---------- ---- ---------- ---------- -----
plato_svm home_dirs    -            online     RW      42.10TB    21.78TB    0%
plato_svm shared_data  -            online     RW      42.10TB    21.89TB    0%
2 entries were displayed.

TROY::*>

@rahulguptajss
Contributor

verified in 24.08
