fix(instance): GET /instance/status returns 400 after manual device disconnection #38

Open
edilsonoliveirama wants to merge 5 commits into EvolutionAPI:main from edilsonoliveirama:fix/instance-status-400-on-disconnect

Conversation

@edilsonoliveirama edilsonoliveirama commented Apr 20, 2026

Closes #20

Root cause

GET /instance/status called ensureClientConnected internally.
That function returns an error when the WhatsApp client exists in memory
but is not connected, which is exactly the state after the user manually
removes the device from their phone.

This caused repeated HTTP 400 responses until the container was restarted.

Fix

Status is a read-only query: it should report the current state, not
require an active connection to do so. The fix reads clientPointer
directly and returns Connected=false / LoggedIn=false when the
client is nil or disconnected, without attempting to reconnect.

Before / After

| Scenario | Before | After |
| --- | --- | --- |
| Device disconnected manually | HTTP 400 | HTTP 200 {connected:false} |
| Client nil (instance never connected) | HTTP 400 | HTTP 200 {connected:false} |
| Connected normally | HTTP 200 | HTTP 200 (no change) |
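
The read-only status lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `waClient` interface, the `fakeClient` test double, and the free function `status` are hypothetical stand-ins, and only the `Connected`/`LoggedIn` fields and the `clientPointer` name come from the description.

```go
package main

import "fmt"

// waClient is a hypothetical stand-in for the WhatsApp client type;
// only the two methods the status check needs are modelled here.
type waClient interface {
	IsConnected() bool
	IsLoggedIn() bool
}

// StatusStruct carries the two fields named in the PR description.
type StatusStruct struct {
	Connected bool
	LoggedIn  bool
}

// fakeClient is a test double used only for this sketch.
type fakeClient struct{ connected, loggedIn bool }

func (f *fakeClient) IsConnected() bool { return f.connected }
func (f *fakeClient) IsLoggedIn() bool  { return f.loggedIn }

// status reports the current state without trying to reconnect.
// A missing or nil client yields the zero value, i.e.
// Connected=false and LoggedIn=false, instead of an error.
func status(clientPointer map[string]waClient, instanceID string) StatusStruct {
	client, ok := clientPointer[instanceID]
	if !ok || client == nil {
		return StatusStruct{}
	}
	return StatusStruct{
		Connected: client.IsConnected(),
		LoggedIn:  client.IsLoggedIn(),
	}
}

func main() {
	clients := map[string]waClient{
		"online":  &fakeClient{connected: true, loggedIn: true},
		"removed": &fakeClient{}, // device removed from the phone
	}
	for _, id := range []string{"online", "removed", "never-connected"} {
		fmt.Printf("%s: %+v\n", id, status(clients, id))
	}
}
```

Because the lookup never returns an error for a disconnected client, the handler can always answer HTTP 200 and let the caller inspect the flags.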

Summary by Sourcery

Report instance connection status without forcing reconnection and add observability and dashboard improvements.

Bug Fixes:

  • Ensure GET /instance/status returns a 200 response with disconnected state when the WhatsApp client is nil or disconnected instead of failing.

Enhancements:

  • Add configurable chat mute duration support with validation and clarify the mute API documentation.
  • Expose Prometheus metrics with HTTP request and instance gauges, and integrate a metrics middleware and /metrics endpoint.
  • Introduce a standalone web dashboard and route wiring to show real-time instance status and server health using the existing API.
  • Expose a repository method to list all instances to back instance-level metrics and the dashboard.
  • Clean up chat route comments and adjust manager routes to separate the dashboard from the main React bundle.

Build:

  • Include the dashboard assets in the Docker image build and add Prometheus client dependencies.

edilsonoliveirama and others added 5 commits April 20, 2026 13:28
The /manager dashboard previously showed only a static placeholder
("Dashboard content will be implemented here..."). This replaces it
with a standalone HTML page that fetches live data from the API and
displays real metrics:

- Total instances count
- Connected instances count and percentage
- Disconnected instances count
- Server health status (GET /server/ok)
- AlwaysOnline count
- Instance table with name, status badge, phone number, client and
  AlwaysOnline indicator
- Auto-refresh every 30 seconds with manual refresh button

Implementation uses a standalone HTML file (Tailwind CDN + vanilla JS
fetch) served at GET /manager, keeping the existing compiled bundle
intact for all other routes (/manager/instances, /manager/login, etc.).

Changes:
- manager/dashboard/index.html: new self-contained dashboard page
- pkg/routes/routes.go: serve dashboard/index.html for GET /manager
  (exact), keep dist/index.html for GET /manager/*any (wildcard)
- Dockerfile: copy manager/dashboard/ into the final image
- .gitignore: exclude manager build artifacts from version control

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Removes the '// TODO: not working' markers from the six chat endpoints
(pin, unpin, archive, unarchive, mute, unmute). Investigation confirmed
the implementation is correct: the endpoints work on fully-established
sessions that have synced WhatsApp app state keys. The markers were
likely added after testing on a fresh session where keys had not yet
been distributed by the WhatsApp server.

Also fixes the hardcoded 1-hour mute duration: the BodyStruct now
accepts an optional `duration` field (seconds). Sending 0 or omitting
the field mutes the chat indefinitely, matching WhatsApp's own behaviour.

Reject negative duration values with a 400-level validation error.
Document that duration=0 maps to 'mute forever' (BuildMute treats 0
as a zero time.Duration, which causes BuildMuteAbs to set the
WhatsApp sentinel timestamp of -1).
Clamp duration to a maximum of 1 year (31536000 seconds) to avoid
unreasonably large timestamps being sent to the WhatsApp API.
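
The duration rules in these commit messages (reject negatives, treat 0 as "mute forever", clamp at one year) might look something like the sketch below. The helper name `muteDuration` and the error variable are hypothetical; the actual handler and its call into `BuildMute` are not shown in this page, so this only illustrates the validation step.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// maxMuteSeconds is the one-year cap quoted in the commit message.
const maxMuteSeconds int64 = 31536000

// errNegativeDuration is a hypothetical error the handler could map
// to a 400-level response.
var errNegativeDuration = errors.New("duration must not be negative")

// muteDuration converts the request's duration (in seconds) into a
// time.Duration. Zero is passed through unchanged, which the commit
// message says is interpreted as "mute forever". Values above the cap
// are clamped rather than rejected.
func muteDuration(seconds int64) (time.Duration, error) {
	if seconds < 0 {
		return 0, errNegativeDuration
	}
	if seconds > maxMuteSeconds {
		seconds = maxMuteSeconds
	}
	return time.Duration(seconds) * time.Second, nil
}

func main() {
	for _, s := range []int64{0, 3600, 99999999, -5} {
		d, err := muteDuration(s)
		fmt.Println(s, d, err)
	}
}
```

Clamping keeps oversized requests working, while only clearly invalid input (a negative value) is surfaced to the caller as an error.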
Adds GET /metrics serving standard Prometheus text format.
No authentication required — follows the Prometheus convention of
protecting the endpoint at the network/ingress level.

Metrics exposed:

  evolution_instances_total               total registered instances (gauge)
  evolution_instances_connected           connected instances (gauge)
  evolution_instances_disconnected        disconnected instances (gauge)
  evolution_http_requests_total           HTTP requests by method/path/status (counter)
  evolution_http_request_duration_seconds HTTP latency by method/path (histogram)
  evolution_build_info                    always 1, version label carries the value (gauge)
  evolution_uptime_seconds                seconds since server start (gauge)

Instance gauges use a custom Collector that queries the database on
each scrape, so values are always current without event hooks.
HTTP path labels use Gin registered route patterns (e.g. /instance/:instanceId)
to keep cardinality bounded regardless of distinct IDs in the path.
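
To illustrate the "query on each scrape" idea, here is a small standard-library sketch that computes the three instance gauges at call time and writes them in Prometheus text format. It deliberately does not use `client_golang`, which the PR relies on; `instanceLister`, `staticRepo`, `renderInstanceGauges`, and the error-returning `GetAllInstances` signature are assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// Instance is a trimmed, hypothetical version of the stored record.
type Instance struct {
	Name      string
	Connected bool
}

// instanceLister is a hypothetical one-method view of the repository.
// The error return is an assumption for this sketch.
type instanceLister interface {
	GetAllInstances() ([]*Instance, error)
}

// staticRepo is an in-memory stand-in for the database-backed repository.
type staticRepo []*Instance

func (r staticRepo) GetAllInstances() ([]*Instance, error) { return r, nil }

// writeGauge appends one gauge in Prometheus text exposition format.
func writeGauge(b *strings.Builder, name, help string, value int) {
	fmt.Fprintf(b, "# HELP %s %s\n# TYPE %s gauge\n%s %d\n", name, help, name, name, value)
}

// renderInstanceGauges queries the repository when called, so the
// values reflect the state at scrape time without any event hooks.
// Errors are returned to the caller instead of being dropped silently.
func renderInstanceGauges(repo instanceLister) (string, error) {
	instances, err := repo.GetAllInstances()
	if err != nil {
		return "", fmt.Errorf("list instances: %w", err)
	}
	connected := 0
	for _, inst := range instances {
		if inst != nil && inst.Connected {
			connected++
		}
	}
	var b strings.Builder
	writeGauge(&b, "evolution_instances_total", "total registered instances", len(instances))
	writeGauge(&b, "evolution_instances_connected", "connected instances", connected)
	writeGauge(&b, "evolution_instances_disconnected", "disconnected instances", len(instances)-connected)
	return b.String(), nil
}

func main() {
	repo := staticRepo{{Name: "a", Connected: true}, {Name: "b"}, {Name: "c"}}
	out, err := renderInstanceGauges(repo)
	if err != nil {
		fmt.Println("scrape failed:", err)
		return
	}
	fmt.Print(out)
}
```

The trade-off is one repository query per scrape, so the scrape interval and table size determine how much database load the metrics add.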

New dependency: github.com/prometheus/client_golang v1.20.5

…volutionAPI#20

GET /instance/status was calling ensureClientConnected, which returns
an error when the WhatsApp client exists but is not connected (e.g.
after the user manually removes the device from their phone).
This caused the endpoint to return HTTP 400 until the container was
restarted, making it impossible for clients to detect the disconnected
state without restarting the server.

Status is a read-only query: it should report the current state, not
require an active connection to do so. The fix reads clientPointer
directly and returns Connected=false/LoggedIn=false when the client
is nil or disconnected, without attempting reconnection.

Fixes EvolutionAPI#20

sourcery-ai Bot commented Apr 20, 2026

Reviewer's Guide

Refactors instance status handling to avoid enforcing an active WhatsApp client, adds metrics and a standalone dashboard for monitoring instances, and improves chat mute behavior and routing/docs.

Sequence diagram for updated instance status retrieval without forced reconnection

sequenceDiagram
  actor User
  participant HttpClient as HTTPClient
  participant GinRouter as GinRouter
  participant InstanceHandler as InstanceHTTPHandler
  participant InstanceService as InstanceService
  participant ClientMap as clientPointer
  participant WAClient as WhatsAppClient

  User ->> HttpClient: Trigger GET /instance/status
  HttpClient ->> GinRouter: GET /instance/status
  GinRouter ->> InstanceHandler: Route request
  InstanceHandler ->> InstanceService: Status(instance)

  InstanceService ->> ClientMap: Lookup clientPointer[instance.Id]
  ClientMap -->> InstanceService: client

  alt client is nil
    InstanceService -->> InstanceHandler: StatusStruct{Connected=false, LoggedIn=false}
  else client not nil
    InstanceService ->> WAClient: IsConnected()
    WAClient -->> InstanceService: isConnected
    InstanceService ->> WAClient: IsLoggedIn()
    WAClient -->> InstanceService: isLoggedIn
    InstanceService -->> InstanceHandler: StatusStruct{Connected=isConnected, LoggedIn=isLoggedIn}
  end

  InstanceHandler -->> GinRouter: HTTP 200 JSON
  GinRouter -->> HttpClient: HTTP 200 {connected:false,...} when disconnected
  HttpClient -->> User: Show current instance status

Class diagram for metrics registry, instance repository, and chat mute changes

classDiagram
  class Registry {
    -prometheus.Registry reg
    -prometheus.CounterVec httpRequests
    -prometheus.HistogramVec httpDuration
    +New(version string, instanceRepo InstanceRepository) Registry
    +Handler() http.Handler
    +GinMiddleware() gin.HandlerFunc
  }

  class InstanceRepository {
    <<interface>>
    +GetAllConnectedInstances() []*Instance
    +GetAllConnectedInstancesByClientName(clientName string) []*Instance
    +GetAll(clientName string) []*Instance
    +GetAllInstances() []*Instance
    +Delete(instanceId string) error
    +GetAdvancedSettings(instanceId string) *AdvancedSettings
    +UpdateAdvancedSettings(instanceId string, settings *AdvancedSettings) error
  }

  class InstanceRepositoryImpl {
    -gorm.DB db
    +GetAllInstances() []*Instance
  }

  InstanceRepository <|.. InstanceRepositoryImpl

  class instanceCollector {
    -InstanceRepository repo
    -prometheus.Desc descTotal
    -prometheus.Desc descConnected
    -prometheus.Desc descDisconnected
    +Describe(ch chan *prometheus.Desc)
    +Collect(ch chan prometheus.Metric)
  }

  class Instance {
    +string Id
    +bool Connected
    +string Name
    +string ClientName
  }

  class InstanceService {
    -map~string, WAClient~ clientPointer
    +Status(instance *Instance) *StatusStruct
  }

  class StatusStruct {
    +bool Connected
    +bool LoggedIn
    +string myJid
    +string Name
  }

  class WAClient {
    +IsConnected() bool
    +IsLoggedIn() bool
    +Store Store
  }

  class Store {
    +string ID
    +string PushName
  }

  class chatService {
    +ensureClientConnected(instanceId string) (WAClient, error)
    +ChatMute(data *BodyStruct, instance *Instance) (string, error)
  }

  class BodyStruct {
    +string Chat
    +int64 Duration
  }

  class appstate {
    +BuildMute(recipient JID, enabled bool, duration time.Duration) AppState
  }

  class GinEngine {
    +Use(middleware gin.HandlerFunc)
    +GET(path string, handler gin.HandlerFunc)
  }

  class DashboardHTML {
    +loadData()
    +apiFetch(path string) Promise
    +renderTable(instances []InstanceView)
  }

  class InstanceView {
    +string name
    +bool connected
    +bool alwaysOnline
    +string jid
    +string clientName
  }

  Registry --> InstanceRepository : uses
  Registry --> instanceCollector : registers
  instanceCollector --> InstanceRepository : queries instances
  InstanceRepositoryImpl --> Instance : persists

  InstanceService --> Instance : reads Id
  InstanceService --> WAClient : uses
  InstanceService --> StatusStruct : returns

  chatService --> BodyStruct : uses
  chatService --> WAClient : uses
  chatService --> appstate : BuildMute

  GinEngine --> Registry : uses GinMiddleware
  GinEngine --> Registry : exposes Handler at /metrics

  DashboardHTML --> InstanceView : displays
  DashboardHTML --> GinEngine : calls /instance/all, /server/ok

File-Level Changes

Change: Instance status endpoint now reports connection state without forcing a client reconnect, fixing 400 loops after manual device disconnection.
Details:
  • Status service now reads the WhatsApp client from an internal pointer map instead of calling ensureClientConnected
  • When the client is nil or disconnected, the status response returns Connected=false and LoggedIn=false with HTTP 200
  • Simplified StatusStruct construction and return path
Files:
  • pkg/instance/service/instance_service.go

Change: Introduced Prometheus metrics collection and exposure plus a new HTML dashboard for instance/server observability.
Details:
  • Added a metrics Registry that tracks HTTP request counts/latencies and instance totals/connection state via a custom collector
  • Wired the metrics registry into the Gin router with a /metrics endpoint and middleware, and extended the instance repository with GetAllInstances for collectors
  • Added prometheus client dependencies and a static Tailwind-based dashboard served at /manager that consumes existing API endpoints for live instance stats
Files:
  • pkg/metrics/metrics.go
  • cmd/evolution-go/main.go
  • pkg/instance/repository/instance_repository.go
  • go.mod
  • go.sum
  • manager/dashboard/index.html
  • Dockerfile

Change: Enhanced chat mute API to support configurable mute durations with validation and updated docs.
Details:
  • Extended the chat request body to include a Duration field in seconds, used by mute operations
  • Validated mute duration to be non-negative and capped it to a one-year maximum, returning clear errors otherwise
  • Switched mute implementation to pass the requested duration to appstate.BuildMute and updated Swagger description to document duration semantics
Files:
  • pkg/chat/service/chat_service.go
  • pkg/chat/handler/chat_handler.go

Change: Adjusted manager routing and cleaned up TODO comments on chat routes without behavior change to core APIs.
Details:
  • Split /manager routes so the root serves a standalone dashboard HTML while /manager/*any continues to serve the SPA bundle
  • Removed outdated 'TODO: not working' comments from chat pin/archive/mute routes, leaving handlers and middleware unchanged
Files:
  • pkg/routes/routes.go

Assessment against linked issues

| Issue | Objective | Addressed | Explanation |
| --- | --- | --- | --- |
| #20 | Ensure the GET /instance/status endpoint returns a successful response (HTTP 200) with a valid disconnected/inactive status after the WhatsApp device is manually disconnected, instead of returning HTTP 400 until the container is restarted. | | |


@sourcery-ai sourcery-ai Bot left a comment

Hey - I've left some high level feedback:

  • The Prometheus instanceCollector calls repo.GetAllInstances() on every scrape and silently drops metrics on error; consider adding logging and, if the instances table can grow large or scrapes are frequent, adding a context/timeout or lightweight view to avoid DB pressure from metrics.
  • In the new Status implementation, clientPointer is read directly without any synchronization; if other code mutates clientPointer under a lock, you may want to mirror that here to avoid potential data races under concurrent access.
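
One way to address the synchronization concern raised above is to guard the client map with a read/write lock, as in the sketch below. Whether the real code already holds a lock around clientPointer is not visible in this page; `clientRegistry`, `stubClient`, and the method names are hypothetical and only demonstrate the pattern.

```go
package main

import (
	"fmt"
	"sync"
)

// waClient is a hypothetical stand-in for the WhatsApp client type.
type waClient interface {
	IsConnected() bool
}

// stubClient is a test double whose value is its connection state.
type stubClient bool

func (s stubClient) IsConnected() bool { return bool(s) }

// clientRegistry wraps the client map with an RWMutex so that status
// reads and connect/disconnect writes can run concurrently and safely.
type clientRegistry struct {
	mu      sync.RWMutex
	clients map[string]waClient
}

func newClientRegistry() *clientRegistry {
	return &clientRegistry{clients: make(map[string]waClient)}
}

// set stores or replaces the client for an instance.
func (r *clientRegistry) set(id string, c waClient) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.clients[id] = c
}

// remove drops the client for an instance.
func (r *clientRegistry) remove(id string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.clients, id)
}

// connected takes only a read lock for the map lookup and releases it
// before calling into the client.
func (r *clientRegistry) connected(id string) bool {
	r.mu.RLock()
	c, ok := r.clients[id]
	r.mu.RUnlock()
	return ok && c != nil && c.IsConnected()
}

func main() {
	reg := newClientRegistry()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(2)
		go func(n int) { // writer
			defer wg.Done()
			reg.set(fmt.Sprintf("instance-%d", n), stubClient(n%2 == 0))
		}(i)
		go func(n int) { // reader
			defer wg.Done()
			_ = reg.connected(fmt.Sprintf("instance-%d", n))
		}(i)
	}
	wg.Wait()
	fmt.Println("instance-0 connected:", reg.connected("instance-0"))
}
```

Running a program like this with `go run -race` is a quick way to check that the reads and writes no longer race.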

Development

Successfully merging this pull request may close these issues.

Error 400 when querying /instance/status after manually disconnecting an instance
