xSQLServerAlwaysOnAvailabilityGroupDatabaseMembership: Should Be Cluster Aware #869

Closed
randomnote1 opened this Issue Oct 13, 2017 · 10 comments

Comments

2 participants
@randomnote1
Contributor

randomnote1 commented Oct 13, 2017

Details of the scenario you tried and the problem that is occurring:
When the xSQLServerAlwaysOnAvailabilityGroupDatabaseMembership resource is applied to all of the nodes in a Failover Cluster Instance (FCI), the nodes will "fight" with each other while trying to manage the database memberships in the availability group.

The DSC configuration that is using the resource (as detailed as possible):
When the following configuration is applied to two or more nodes of the FCI:

xSQLServerAlwaysOnAvailabilityGroupDatabaseMembership 'TestAGDatabaseMemberships'
{
    AvailabilityGroupName = $Node.AvailabilityGroupName
    BackupPath            = '\\SQL1\AgInitialize'
    DatabaseName          = 'DB*', 'AdventureWorks'
    SQLInstanceName       = $Node.SQLInstanceName
    SQLServer             = $Node.NodeName
    Ensure                = 'Present'
    PsDscRunAsCredential  = $SysAdminAccount
}

Version of the Operating System, SQL Server and PowerShell the DSC Target Node is running:
Any

What module (SqlServer or SQLPS) and which version of the module the DSC Target Node is running:
Any

Version of the DSC module you're using, or 'dev' if you're using current dev branch:
Any

@randomnote1

Contributor

randomnote1 commented Oct 13, 2017

The resource should evaluate if the current node is actively hosting the SQL instance. If it is not hosting the instance, it should write a verbose message stating that it does not own the instance, and then silently exit.
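As an aside, a minimal sketch of what such a check could look like, assuming an already-connected SMO server object (the function name here is hypothetical; ComputerNamePhysicalNetBIOS reports the node currently hosting a Failover Cluster Instance, and simply the machine name for a stand-alone instance):

# Hypothetical sketch: compare the node that is currently hosting the instance
# with the local computer name. For a stand-alone instance the two values are
# always equal, so the check only short-circuits on passive FCI nodes.
function Test-IsActiveNode
{
    [CmdletBinding()]
    [OutputType([System.Boolean])]
    param
    (
        [Parameter(Mandatory = $true)]
        [Microsoft.SqlServer.Management.Smo.Server]
        $ServerObject
    )

    return ( $ServerObject.ComputerNamePhysicalNetBIOS -eq $env:COMPUTERNAME )
}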

@johlju

Contributor

johlju commented Oct 13, 2017

If I understand correctly, this will force the user to always add the same configuration to each node that is part of the availability group? It will no longer be enough (it might never have worked, though?) to add the configuration to only one of the nodes and have that node connect to the current primary replica (if needed) to make the changes?

@randomnote1

Contributor

randomnote1 commented Oct 13, 2017

Your assumption is correct. I had anticipated that, in the case of an FCI, the same configuration would be pushed to all of the nodes to ensure seamless operation in the event of a node failure.

Currently, if the resource detects that the desired state is false, all of the nodes will attempt to add the database to the availability group. Only one of the nodes will make it all the way through the process; the other nodes will fail. However, the next time the desired state is tested, it will return true on all nodes.

@johlju

Contributor

johlju commented Oct 14, 2017

This will essentially be a breaking change, right? Configurations out there that only apply the configuration to one node will "break" when the instance is moved to a node that does not have the configuration.
This needs to be documented in the resource and shown in the examples.

I'm working on a first version of a README.md in the examples folder that could contain, or link to, more descriptive examples (issue #462). The goal is to remove the old examples that are outdated, but I haven't yet gotten as far as I hoped. I hope to get further this weekend.

@randomnote1

Contributor

randomnote1 commented Oct 14, 2017

I don't know. The way I've been using it, this will not result in a breaking change; however, in the scenario you describe, it could be.

I don't know that I would consider it a breaking change, because the design of DSC is to configure the current node. There is no reasonable expectation that DSC can or will configure resources on other nodes. That's my 2 cents anyway.

@johlju

Contributor

johlju commented Oct 18, 2017

To all: Please share your opinion on this. Would you consider this to be a breaking change?

@randomnote1 You have a point that DSC should configure the target node. But since the resource has, up until now, configured the instance regardless of which node is active, this would make it a functional breaking change.
Can this be considered "fundamentally changing an existing functionality of a resource" as per the Breaking Changes guidelines?

If this change is considered a breaking change, then we will have breaking changes, and version bumps, quite often in the future. Worst case, we would end up at version 28 or something in the end. 😄 If we do consider this a breaking change, maybe we need to consolidate the changes in some way (a different branch?).
So I would rather not consider this a breaking change, but I'm torn between @randomnote1's view of this (which I agree with) and the breaking changes guidelines.

@randomnote1

Contributor

randomnote1 commented Oct 18, 2017

@johlju, if we approach it the way I did in Pull Request #890, this will not be a breaking change. I added a parameter called ProcessOnlyOnActiveNode that must be set to $true in order for the resource to skip processing when the instance is not hosted on the node.
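For illustration, a sketch of how the configuration from the issue description could opt in to this behavior, assuming the ProcessOnlyOnActiveNode parameter from PR #890 (behavior with the parameter omitted is unchanged):

xSQLServerAlwaysOnAvailabilityGroupDatabaseMembership 'TestAGDatabaseMemberships'
{
    AvailabilityGroupName   = $Node.AvailabilityGroupName
    BackupPath              = '\\SQL1\AgInitialize'
    DatabaseName            = 'DB*', 'AdventureWorks'
    SQLInstanceName         = $Node.SQLInstanceName
    SQLServer               = $Node.NodeName
    Ensure                  = 'Present'
    # Opt-in: only the node that currently hosts the instance enforces the
    # database membership; passive nodes report the desired state as met
    # and do nothing.
    ProcessOnlyOnActiveNode = $true
    PsDscRunAsCredential    = $SysAdminAccount
}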

@johlju

Contributor

johlju commented Oct 18, 2017

@randomnote1 A much better approach! Thanks for thinking of that! 😄 It's looking good after a quick browse of the PR.

@randomnote1

Contributor

randomnote1 commented Oct 20, 2017

To resolve this issue, a parameter called ProcessOnlyOnActiveNode should be added to the Set-TargetResource and Test-TargetResource functions.

The Get-TargetResource function will execute the helper function Test-ActiveNode to determine if the current node is actively hosting the SQL instance and return the result via the read-only property IsActiveNode.

The Test-TargetResource function will use the value of ProcessOnlyOnActiveNode and IsActiveNode to determine if it should continue. If ProcessOnlyOnActiveNode is $true and IsActiveNode is $false, Test-TargetResource will return $true.

I put together this gist as a guide to updating the resource.

Also, see the xSQLServerAlwaysOnAvailabilityGroup resource for a completed example.
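A rough sketch of that Test-TargetResource pattern, based on the description above (parameter names are taken from the configuration in the issue description; the verbose message text and the splatting details are illustrative, not the final implementation):

function Test-TargetResource
{
    [CmdletBinding()]
    [OutputType([System.Boolean])]
    param
    (
        [Parameter(Mandatory = $true)]
        [System.String]
        $AvailabilityGroupName,

        [Parameter(Mandatory = $true)]
        [System.String]
        $BackupPath,

        [Parameter(Mandatory = $true)]
        [System.String[]]
        $DatabaseName,

        [Parameter(Mandatory = $true)]
        [System.String]
        $SQLServer,

        [Parameter(Mandatory = $true)]
        [System.String]
        $SQLInstanceName,

        [Parameter()]
        [ValidateSet('Present', 'Absent')]
        [System.String]
        $Ensure = 'Present',

        # Opt-in switch described in this issue; defaults to $false so existing
        # configurations keep their current behavior.
        [Parameter()]
        [System.Boolean]
        $ProcessOnlyOnActiveNode
    )

    $getTargetResourceParameters = @{
        AvailabilityGroupName = $AvailabilityGroupName
        BackupPath            = $BackupPath
        DatabaseName          = $DatabaseName
        SQLServer             = $SQLServer
        SQLInstanceName       = $SQLInstanceName
    }

    # Get-TargetResource is assumed to call the Test-ActiveNode helper and return
    # the result in the read-only property IsActiveNode.
    $getTargetResourceResult = Get-TargetResource @getTargetResourceParameters

    # If processing is restricted to the active node and this node is not the
    # active host of the instance, report "in desired state" so passive FCI
    # nodes never try to manage the availability group databases.
    if ( $ProcessOnlyOnActiveNode -and -not $getTargetResourceResult.IsActiveNode )
    {
        Write-Verbose -Message "The node $($env:COMPUTERNAME) is not actively hosting the instance '$SQLInstanceName'. Exiting Test-TargetResource."
        return $true
    }

    # ... the existing desired-state comparison continues here ...
}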

@randomnote1

Contributor

randomnote1 commented Oct 26, 2017

Starting this one.

@johlju johlju added in progress and removed help wanted labels Oct 26, 2017

@johlju johlju moved this from Backlog to In progress in Resources should be cluster aware Oct 26, 2017

@johlju johlju closed this in #895 Oct 30, 2017

johlju added a commit that referenced this issue Oct 30, 2017

xSQLServerAlwaysOnAvailabilityGroupDatabaseMembership: Make Cluster Aware (#895)

- Changes to xSQLServerAlwaysOnAvailabilityGroupDatabaseMembership
  - Made the resource cluster aware. When ProcessOnlyOnActiveNode is specified,
    the resource will only determine if a change is needed if the target node
    is the active host of the SQL Server instance (issue #869).

@johlju johlju removed the in progress label Oct 30, 2017

@johlju johlju moved this from In progress to Done in Resources should be cluster aware Oct 30, 2017
