bug: ClusteredApplicationManager state store operations can fail silently #251

## Bug Description
The `deploy()`, `start()`, `stop()`, and `undeploy()` methods call state store operations (`putApplicationDescriptor`, `putApplicationState`) but neither check that the write succeeded nor handle exceptions. A failed write goes unnoticed by the caller, and the cluster state diverges from what is actually running on the node.

## Location
`jplatform-cluster/src/main/java/org/flossware/jplatform/cluster/ClusteredApplicationManager.java:137-138,192,221,249`

## Problematic Code
```java
@Override
public synchronized void deploy(ApplicationDescriptor descriptor) throws Exception {
    String appId = descriptor.getApplicationId();

    if (clusterManager != null && clusterManager.isJoined()) {
        logger.info("[{}] Deploying application in cluster mode", appId);

        // Write descriptor to cluster state
        stateStore.putApplicationDescriptor(appId, descriptor);  // Line 137 - can fail silently
        stateStore.putApplicationState(appId, ApplicationState.DEPLOYED);  // Line 138 - can fail silently
        
        // ... continues even if state store writes failed ...
    }
}

@Override
public synchronized void start(String applicationId) throws Exception {
    if (clusterManager != null && clusterManager.isJoined() && scheduler != null) {
        // ...
        super.start(applicationId);

        // Update cluster state
        stateStore.putApplicationState(applicationId, ApplicationState.RUNNING);  // Line 192 - can fail
    }
}

@Override
public synchronized void stop(String applicationId) throws Exception {
    if (clusterManager != null && clusterManager.isJoined() && scheduler != null) {
        // ...
        super.stop(applicationId);

        // Update cluster state
        stateStore.putApplicationState(applicationId, ApplicationState.STOPPED);  // Line 221 - can fail
    }
}
```

## Impact
- An application can be deployed locally while its descriptor is missing from cluster state
- Other nodes don't see the application
- An application can be running while cluster state still shows DEPLOYED or STOPPED
- Monitoring dashboards show incorrect state
- The leader makes scheduling decisions based on stale or incorrect state
- The caller gets no indication that the operation partially failed

## Example
```java
// Hazelcast network partition occurs
ClusteredApplicationManager manager = new ClusteredApplicationManager(...);

manager.deploy(descriptor);  
// putApplicationDescriptor fails due to partition
// putApplicationState fails due to partition
// Method continues, calls super.deploy()
// Application deployed locally
// Cluster state not updated
// Other nodes don't know about application
// No exception thrown

manager.start(appId);
// super.start() succeeds
// putApplicationState fails
// Cluster still shows DEPLOYED but app is RUNNING
// Leader might try to start it on another node
```
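The second scenario above can be reproduced in isolation with stubs. The sketch below is not the project's code: `FlakyStateStore` and `Manager` are hypothetical stand-ins, and because this issue does not show how the real state store fails, the stub simply drops the write while "partitioned". It only illustrates why doing the local transition first and an unchecked cluster write second leaves the two views out of sync.

```java
import java.util.HashMap;
import java.util.Map;

final class StateDivergenceSketch {

    enum ApplicationState { DEPLOYED, RUNNING }

    // Hypothetical cluster view. Assumption: while partitioned, a write is
    // dropped without any signal to the caller.
    static final class FlakyStateStore {
        final Map<String, ApplicationState> states = new HashMap<>();
        boolean partitioned;

        void putApplicationState(String appId, ApplicationState state) {
            if (partitioned) {
                return; // write lost, nothing reported
            }
            states.put(appId, state);
        }
    }

    // Mirrors the ordering in start(): local transition, then cluster write.
    static final class Manager {
        final Map<String, ApplicationState> local = new HashMap<>();
        final FlakyStateStore store;

        Manager(FlakyStateStore store) {
            this.store = store;
        }

        void start(String appId) {
            local.put(appId, ApplicationState.RUNNING); // stands in for super.start()
            store.putApplicationState(appId, ApplicationState.RUNNING); // result never checked
        }
    }
}
```

With the store pre-populated as `DEPLOYED` and `partitioned` set to `true`, calling `start("app")` returns normally, `local` reports `RUNNING`, and the store still reports `DEPLOYED`.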

## Proposed Fix
```java
@Override
public synchronized void deploy(ApplicationDescriptor descriptor) throws Exception {
    String appId = descriptor.getApplicationId();

    if (clusterManager != null && clusterManager.isJoined()) {
        logger.info("[{}] Deploying application in cluster mode", appId);

        // Write descriptor to cluster state - must succeed before local deployment
        try {
            stateStore.putApplicationDescriptor(appId, descriptor);
            stateStore.putApplicationState(appId, ApplicationState.DEPLOYED);
        } catch (Exception e) {
            logger.error("[{}] Failed to update cluster state during deploy", appId, e);
            throw new Exception("Failed to update cluster state: " + e.getMessage(), e);
        }

        // If leader, try to assign to a node
        if (scheduler != null) {
            try {
                if (clusterManager.isLeader()) {
                    String assignedNode = scheduler.assignApplication(appId);
                    logger.info("[{}] Leader assigned application to node: {}", appId, assignedNode);
                }
            } catch (IllegalStateException e) {
                logger.debug("[{}] Lost leadership during assignment: {}", appId, e.getMessage());
            } catch (Exception e) {
                logger.error("[{}] Failed to assign application", appId, e);
                // Mark the application as FAILED in cluster state
                try {
                    stateStore.putApplicationState(appId, ApplicationState.FAILED);
                } catch (Exception se) {
                    logger.error("[{}] Failed to update state to FAILED", appId, se);
                }
                throw new Exception("Failed to assign application: " + e.getMessage(), e);
            }

            // Check if assigned to local node
            if (scheduler.isAssignedToLocalNode(appId)) {
                logger.info("[{}] Application assigned to local node, deploying locally", appId);
                try {
                    super.deploy(descriptor);
                } catch (Exception e) {
                    // Update cluster state to reflect failure
                    try {
                        stateStore.putApplicationState(appId, ApplicationState.FAILED);
                    } catch (Exception se) {
                        logger.error("[{}] Failed to update state to FAILED", appId, se);
                    }
                    throw e;
                }
            }
        }
    } else {
        // Standalone mode
        logger.info("[{}] Deploying application in standalone mode", appId);
        super.deploy(descriptor);
    }
}
```

Similar fixes are needed for the `start()`, `stop()`, and `undeploy()` methods.
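One possible shape for those methods, as a minimal sketch rather than a drop-in patch: run the local action, publish the new state, and if publishing fails, run a compensating action and rethrow so the caller sees the failure. Everything here is an assumption made for illustration: `runAndPublish`, `LocalAction`, and the simplified `StateStore` interface are hypothetical, and whether to roll back locally or merely surface the error is a design decision this issue leaves open.

```java
final class ClusterStateSyncSketch {

    // Hypothetical stand-ins for the project's types; real signatures may differ.
    enum ApplicationState { DEPLOYED, RUNNING, STOPPED, FAILED }

    interface StateStore {
        void putApplicationState(String appId, ApplicationState state) throws Exception;
    }

    interface LocalAction {
        void run() throws Exception;
    }

    /**
     * Runs a local lifecycle action, then publishes the resulting state.
     * If publishing fails, runs the compensating action and rethrows a
     * wrapped exception so the failure is never silent.
     */
    static void runAndPublish(String appId, StateStore store, ApplicationState target,
                              LocalAction action, LocalAction compensation) throws Exception {
        action.run(); // e.g. super.start(appId); a failure here propagates unchanged
        try {
            store.putApplicationState(appId, target);
        } catch (Exception e) {
            try {
                compensation.run(); // e.g. super.stop(appId) to undo the local change
            } catch (Exception ce) {
                e.addSuppressed(ce); // keep the rollback failure attached to the cause
            }
            throw new Exception(
                    "Failed to update cluster state for " + appId + ": " + e.getMessage(), e);
        }
    }
}
```

Under these assumptions, `start()` would call something like `runAndPublish(applicationId, stateStore::putApplicationState, ApplicationState.RUNNING, () -> super.start(applicationId), () -> super.stop(applicationId))`, provided the real `putApplicationState` matches the simplified signature used here.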
