Describe the bug
In v25.10.0 (and current master), Trident's go-swagger-generated ONTAP REST client aborts during its very first call (the "initial call") if ONTAP returns a response whose Content-Type is not handled by a JSON consumer. The controller then logs the following go-swagger error:
Error creating ONTAP REST API client for initial call. Falling back to ZAPI.
error="&{<nil>} (*models.ErrorResponse) is not supported by the TextConsumer,
can be resolved by supporting TextUnmarshaler interface"
What is happening, end to end:
- The initial probe is dispatched through the go-swagger client transport (Runtime.Submit in github.com/go-openapi/runtime/client).
- When the response Content-Type is text/* (or otherwise not registered for the operation), the runtime selects its default TextConsumer. TextConsumer.Consume(reader, target) only accepts *string, *[]byte, or encoding.TextUnmarshaler. The generated models.ErrorResponse is none of those, so the consumer returns the error string above verbatim.
- The wrapper around the initial REST call treats this consumer error as fatal, logs Falling back to ZAPI., and switches the backend to the legacy ZAPI client for the rest of the controller process, even though the underlying HTTP call may have completed normally and a retry would likely succeed.
The bug is therefore not "a noisy log line"; it is a one-shot, irreversible decision made on the basis of a body-decoding failure, and it is reachable in real environments any time ONTAP or an upstream proxy/LB returns a non-JSON body on /api/cluster (HTML auth challenge, plain-text proxy error, mid-upgrade response, etc.).
Environment
- Trident version: v25.10.0
- Kubernetes orchestrator: OpenShift
- OS: RHEL CoreOS / RHEL 9.x worker nodes
- NetApp backend type: ONTAP 9.15.1P7
To Reproduce
- Install Trident v25.10.0 with --https_rest against an ONTAP 9.x cluster whose management LIF responds to the very first REST call with a Content-Type other than application/hal+json / application/json. Common triggers:
  - A text/html body when the LIF fronts an auth challenge / session-establishment page (e.g., the Trident user is missing application: http access on its role, or an HTTP-only/proxy intercept sits in front of the cluster).
  - A text/plain error body produced by some load-balancers/proxies when an upstream is briefly unhealthy.
- Watch the controller log right after pod start:
oc logs deploy/trident-controller -c trident-main -n trident -f \
  | grep -E 'Error creating ONTAP REST API client|TextConsumer|Falling back to ZAPI'
- Within the first few seconds the verbatim line shown in Describe the bug appears, after which every subsequent ONTAP API call from this controller is sent to .../servlets/netapp.servlets.admin.XMLrequest_filer instead of /api/....
The error is also reproducible without a Trident pod, using the generated REST client against a small mock server that returns any non-JSON Content-Type:
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"

	httptransport "github.com/go-openapi/runtime/client"
	rtclient "github.com/netapp/trident/storage_drivers/ontap/api/rest/client"
	"github.com/netapp/trident/storage_drivers/ontap/api/rest/client/cluster"
)

func main() {
	// Mock ONTAP endpoint that answers every request with a non-JSON error body.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// text/html (like text/plain) selects the TextConsumer. Other non-JSON
		// types (application/octet-stream, application/xml) may be routed to a
		// different consumer and fail with a different message.
		w.Header().Set("Content-Type", "text/html; charset=utf-8")
		w.WriteHeader(http.StatusInternalServerError)
		_, _ = w.Write([]byte("<html><body>boom</body></html>"))
	}))
	defer srv.Close()

	// Point the generated client at the mock server over plain HTTP.
	rt := httptransport.New(srv.Listener.Addr().String(), "/api", []string{"http"})
	c := rtclient.New(rt, nil)

	// The call fails while decoding the response body, not while sending the request.
	_, err := c.Cluster.ClusterGet(cluster.NewClusterGetParams(), nil)
	fmt.Println("err:", err)
	// err: &{<nil>} (*models.ErrorResponse) is not supported by the TextConsumer,
	// can be resolved by supporting TextUnmarshaler interface
}
The text/html → TextConsumer path is taken whenever the response Content-Type does not match one of the operation's declared producers; models.ErrorResponse is the schema go-swagger picks for the default error response, and that struct lacks TextUnmarshaler, so the consumer rejects the body and c.Cluster.ClusterGet returns the error string above. This is exactly what the controller log shows.
Expected behavior
Two independent, additive fixes — either one alone is sufficient; both together is best.
- The REST client must not abort initialization on a body it cannot parse. Whether the initial probe's body decodes into *models.ErrorResponse is independent of whether the call itself succeeded. The wrapper that triggers the Falling back to ZAPI. branch should:
  - Use the HTTP status code as the source of truth: a 2xx response with an unparseable body should still mark REST as available (log a warning, not an error).
  - On non-2xx, fall back to a status-code-only error (e.g. fmt.Errorf("REST init returned %d: %s", resp.StatusCode, http.StatusText(resp.StatusCode))) instead of returning the raw consumer error.
- Make models.ErrorResponse implement encoding.TextUnmarshaler (and ideally encoding.BinaryUnmarshaler) so the TextConsumer can succeed on text/* bodies even when ONTAP or a fronting LB returns an unexpected content type. A minimal implementation:
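// UnmarshalText lets the TextConsumer decode a raw text body into ErrorResponse.
// It requires the "strings" import. The &Error{} element type is written as
// proposed here; adjust it if the generated field type is named differently.
func (e *ErrorResponse) UnmarshalText(b []byte) error {
	if e.Error == nil {
		e.Error = &Error{}
	}
	// Keep the trimmed body as the error message.
	msg := strings.TrimSpace(string(b))
	e.Error.Message = &msg
	return nil
}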
This turns an opaque text payload into an ErrorResponse with Error.Message populated and lets the REST client surface a clean, structured error to its caller instead of aborting REST initialization entirely.