Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add akka.hosting #148

Merged
merged 18 commits into from Dec 22, 2022
Merged

Add akka.hosting #148

merged 18 commits into from Dec 22, 2022

Conversation

Arkatufus
Copy link
Contributor

This is an expansion of #146 with added Akka.Hosting support

@wesselkranenborg
Copy link
Contributor

@Arkatufus : I think it looks great and you did some nice improvements here. I have only one concent and that's about the following comment I put on my draft PR: This persistence liveness check actor needs to be singleton in a cluster. The persistenceId will otherwise try to write data from all nodes under the same persistenceId to the same storage. That will cause conflicts resulting in the fact that your cluster will become unstable now and then.

Or was this an issue because I did wire up the healthcheck actors myself instead of using the ActorExtension for that?

@wesselkranenborg
Copy link
Contributor

If you could create a beta package for this, I could test this in our solution and see if this works as expected.

@Arkatufus
Copy link
Contributor Author

Arkatufus commented Nov 16, 2022

This persistence liveness check actor needs to be singleton in a cluster. The persistenceId will otherwise try to write data from all nodes under the same persistenceId to the same storage.

That is a bit problematic since the persistence healthcheck supossed to check that the current node can actually access the backend storage and properly read-write from it and not coupled with clustering/remoting.

I think it is better if we change the persistence id instead, tying it to the current cluster node. Problem with this approach is to figure out a way to create a unique persistence id that is guaranteed to not collide with anything else that is trying to access the storage backend.

@wesselkranenborg
Copy link
Contributor

I think it is better if we change the persistence id instead, tying it to the current cluster node.

That's indeed also a good solution. Right now this is the persistenceId: https://github.com/petabridge/akkadotnet-healthcheck/blob/dev/src/Akka.HealthCheck.Persistence/AkkaPersistenceLivenessProbe.cs#L175

Can we not append the nodeIP to that if the node is part of an Akka.Cluster? I think you're safe then.

@wesselkranenborg
Copy link
Contributor

You only then need to make sure that you not only remove latest -1 snapshot and messages but all messages because if nodes leave/join the cluster this might become an unexpected mess of orphaned data.

https://github.com/petabridge/akkadotnet-healthcheck/blob/dev/src/Akka.HealthCheck.Persistence/AkkaPersistenceLivenessProbe.cs#L204

@Arkatufus
Copy link
Contributor Author

Well, if we make sure that every incarnation of the probe have unique persistence ids, there won't be any mess because each of them would have their own snapshots

@wesselkranenborg
Copy link
Contributor

Guid.NewGuid() 😉

@Arkatufus
Copy link
Contributor Author

I guess we can use Guid. I'm a bit leery with Guid because I've heard that the Guid algorithm only guarantee non-collision if it is generated by the same computer, the collision probability increases If multiple computers generates Guid

@Aaronontheweb
Copy link
Member

That is a bit problematic since the persistence healthcheck supossed to check that the current node can actually access the backend storage and properly read-write from it and not coupled with clustering/remoting.

That's correct.

@Aaronontheweb
Copy link
Member

I guess we can use Guid. I'm a bit leery with Guid because I've heard that the Guid algorithm only guarantee non-collision if it is generated by the same computer, the collision probability increases If multiple computers generates Guid

Guid collisions are exceedingly rare so I wouldn't worry about that. I'd prefix the entity name with something like healthcheck-{Guid}.

Copy link
Member

@Aaronontheweb Aaronontheweb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have some minor cleanup to do - but let's merge this in, make available to users, get some feedback. Can you also make a separate PR with installation / configuration steps for IHealthCheck integration?

_livenessStatus = livenessStatus;
var selfMember = _cluster.State.Members.FirstOrDefault(m => m.Address == _cluster.SelfAddress);
var isUp = selfMember is { Status: MemberStatus.Up or MemberStatus.WeaklyUp };
_livenessStatus = isUp ? new LivenessStatus(true) : DefaultClusterLivenessStatus;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

{
public static class AkkaHostingExtensions
{
public static IServiceCollection WithAkkaHealthCheck(this IServiceCollection services)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds the ASP.NET healthchecks (IHealthCheck)

</PropertyGroup>

<ItemGroup>
<PackageReference Include="Akka.Cluster.Hosting" Version="0.5.1" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should derive from common.props / Directory.Build.props


namespace Akka.HealthCheck.Hosting.Services
{
public class AkkaReadinessService : IHealthCheck
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ought to add some XML-DOC here

@@ -0,0 +1 @@
hit2%System.String, System.Private.CoreLib
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably exclude this file from git

@Aaronontheweb
Copy link
Member

@Arkatufus guess we just need to address the comments on the review in order to merge + release this?

@Aaronontheweb
Copy link
Member

@Arkatufus is this ready for review?

@Arkatufus
Copy link
Contributor Author

Arkatufus commented Dec 9, 2022

No, not yet, sorry, was busy with Akka.Hosting.
Let me give it another once over to make sure it is review worthy.

@Arkatufus
Copy link
Contributor Author

Arkatufus commented Dec 13, 2022

The persistence healthcheck is still problematic, i'm implementing it with GUID here, and the healthcheck will litter the database with random healthcheck data since the id is generated on each healhtcheck actor instantiation

@Arkatufus
Copy link
Contributor Author

it wasn't a problem before because it will use the same id over and over, since we can't find a per actor discriminator that can survive actor/actor system restart, this could be a problem...

@Arkatufus
Copy link
Contributor Author

We can't use a static counter because there is no guarantee that other nodes would not have another healthcheck running, creating an id collision

@Arkatufus
Copy link
Contributor Author

an ideal solution would be to have a persisted unique id for each actor system node without relying on clustering

# Conflicts:
#	src/common.props
@Aaronontheweb
Copy link
Member

@Arkatufus let me know when this is ready to take out of draft stage

@Arkatufus Arkatufus marked this pull request as ready for review December 21, 2022 19:00
@Arkatufus
Copy link
Contributor Author

@Aaronontheweb This should be ready for review now

@Arkatufus
Copy link
Contributor Author

The GUID problem is still unresolved

Copy link
Member

@Aaronontheweb Aaronontheweb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - need to validate the nuget publication locally (or check what the build server produced.) Don't want to publish any sample projects and need to make sure that the correct README.md files are included.

@@ -0,0 +1,181 @@
# Akka.HealthCheck.Hosting.Web
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure this README is what ends up inside the Akka.HealthCheck.Hosting.Web NuGet package when it's pushed. Include some of the other boilerplate about Akka.NET and Akka.Hosting that a user might need to know too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants