❓ How to tell if an Examine Index is Healthy? Possible ASPNET HealthCheck 💡 #372

Open
warrenbuckley opened this issue Jan 12, 2024 · 2 comments

@warrenbuckley
Contributor

Hey @Shazwazza 👋
Hope you are good mate

Problem

I am currently working on a project where Examine is used with Azure WebApps and scaling.

I was wondering if it is possible, with Examine's API or events, to know when an index has finished building for the first time after the server has booted up?

I am currently running into an issue where a new instance is added and starts responding, so traffic gets routed to it while the Examine index is still doing its stuff.

In an ideal world, the solution would be to use ExamineX with Azure Search; however, at this time it's not viable.

Idea 💡

My idea was to lean on the framework rather than reinvent the wheel: use ASP.NET Health Checks to tell Azure whether a specific instance is Healthy or Unhealthy, so that it can decide whether to route traffic to it.

The initial idea was to use the IndexOperationComplete event. However, from what I can tell, this fires every time an indexing operation happens. Azure could poll the health check endpoint during a small index update, in which case the server/instance would be marked as unhealthy and incorrectly taken offline.

The App Service plan can have a maximum of one unhealthy instance replaced per hour and, at most, three instances per day.
https://learn.microsoft.com/en-us/azure/app-service/monitor-instances-health-check?tabs=dotnet#limitations
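One way to stop later index updates from flipping an instance back to unhealthy is a one-way latch: readiness is set once and never cleared. This is only a rough sketch of that idea. ReadinessLatch and its members are hypothetical names, not part of Examine or Umbraco, and the latch on its own does not tell you whether the first completed operation really was the full rebuild.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not an Examine or Umbraco type):
// a one-way flag that can go from "not ready" to "ready" but never back.
public sealed class ReadinessLatch
{
    private int _ready; // 0 = not ready, 1 = ready

    // Safe to read from the health check while another thread sets the latch.
    public bool IsReady => Volatile.Read(ref _ready) == 1;

    // Returns true only for the single call that actually flipped the latch.
    public bool MarkReady() => Interlocked.Exchange(ref _ready, 1) == 0;

    // Shaped like a standard event handler so it could be attached to an
    // "operation complete" style event; which event to use is an assumption.
    public void OnIndexOperationComplete(object? sender, EventArgs e) => MarkReady();
}

public static class Program
{
    public static void Main()
    {
        var latch = new ReadinessLatch();
        Console.WriteLine(latch.IsReady); // False

        // Simulate the first completed operation, then a later small update.
        latch.OnIndexOperationComplete(null, EventArgs.Empty);
        latch.OnIndexOperationComplete(null, EventArgs.Empty);

        Console.WriteLine(latch.IsReady); // True, and it stays True
    }
}
```

A health check would then report Healthy whenever IsReady is true, regardless of any indexing that happens afterwards.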

My next train of thought was to make some sensible assumptions, but I am unsure whether this would be a reliable way to tell if an index is healthy and ready.

Questions ❓

  • Is there anything in the API that I could use to know that an index has finished its first-time indexing?
  • If not, what do you think would be sensible ways to tell if an index is 'healthy'?
    • Just ensuring the index does not have 0 documents?
    • Performing a simple/fast search to ensure it has at least one result for a key/critical part of the site?

So @Shazwazza do you have any smart ideas or thoughts on this problem?
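To make the two heuristics in the list above concrete, here is a minimal sketch that combines them. It deliberately avoids any real index API: getDocumentCount and getProbeHits are hypothetical delegates that you would wire up to the actual index, and IndexHeuristics is an illustrative name. Note that a partially built index can pass both checks, so this is a rough signal rather than proof that the first rebuild has finished.

```csharp
using System;

// Hypothetical helper: combines a "non-zero document count" check with a
// "probe search returns at least one hit" check.
public static class IndexHeuristics
{
    public static (bool Healthy, string Reason) Evaluate(
        Func<long> getDocumentCount, // assumed to return the number of documents in the index
        Func<long> getProbeHits)     // assumed to run a cheap search for a key part of the site
    {
        long count = getDocumentCount();
        if (count == 0)
        {
            return (false, "Index contains no documents.");
        }

        long hits = getProbeHits();
        if (hits == 0)
        {
            return (false, $"Index has {count} documents but the probe search found nothing.");
        }

        return (true, $"Index has {count} documents and the probe search found {hits}.");
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(IndexHeuristics.Evaluate(() => 0, () => 0).Healthy); // False
        Console.WriteLine(IndexHeuristics.Evaluate(() => 7, () => 0).Healthy); // False
        Console.WriteLine(IndexHeuristics.Evaluate(() => 7, () => 1).Healthy); // True
    }
}
```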

@Shazwazza
Owner

Umbraco schedules an index rebuild after the site has started (I think it's a minute or two). This is because the site shouldn't wait until the index is ready: in many/most cases, an Umbraco site doesn't need the index in order to run. In the past it did, because the media cache was in Examine, but that is no longer the case. Waiting for the index to be built on startup was also very problematic in the past.

There is no perfect way to tell when an index rebuild is complete since everything is done in async operations. Umbraco attempts to do this by using the runtime cache; you can see this in the ExamineManagementController: https://github.com/umbraco/Umbraco-CMS/blob/2e61d6449ae8e0c837dafa1e93ac950eda36c4f2/src/Umbraco.Web.BackOffice/Controllers/ExamineManagementController.cs#L172

Have a look at the references to _runtimeCache in that controller. Essentially it hacks the IndexOperationComplete event: when a rebuild is executed, it binds to the event and puts info into the runtime cache. The UI then polls the _runtimeCache to see whether the rebuild operation has completed.

The problem with that is what happens if another indexing operation takes place while the index is being rebuilt: the event will fire and the handler in ExamineManagementController will think it is for the rebuild, so this is fairly flaky.

Here's an idea for you: maybe you could replace the ContentIndexPopulator in Umbraco with your own. Let it index all of its normal stuff, but at the end of the populator, create a dummy index item purely as a flag to indicate that the index was rebuilt. Then in your health check, you just keep querying the index for this dummy item; if it is there, you know the indexing is complete. In fact, I think this is how Umbraco should do it, as it would remove the flakiness of the current ExamineManagementController approach.
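The dummy marker idea can be sketched roughly as follows. This is not Examine code: ISimpleIndex and InMemoryIndex are hypothetical stand-ins for a real index, and the marker id and category are made up for illustration. The only point is the ordering, with the marker written last and the readiness check looking for it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a search index; NOT an Examine type.
public interface ISimpleIndex
{
    void Add(string id, string category);
    bool Contains(string id);
}

public sealed class InMemoryIndex : ISimpleIndex
{
    private readonly Dictionary<string, string> _documents = new();

    public void Add(string id, string category) => _documents[id] = category;
    public bool Contains(string id) => _documents.ContainsKey(id);
}

public static class RebuildMarker
{
    // Hypothetical id for the dummy item that flags a completed rebuild.
    public const string MarkerId = "index-rebuild-complete";

    // Index the normal content first, then write the marker as the very last item.
    public static void Populate(ISimpleIndex index, IEnumerable<string> contentIds)
    {
        foreach (var id in contentIds)
        {
            index.Add(id, "content");
        }

        index.Add(MarkerId, "marker");
    }

    // What a health check would poll: is the marker present yet?
    public static bool IsRebuilt(ISimpleIndex index) => index.Contains(MarkerId);
}

public static class Program
{
    public static void Main()
    {
        var index = new InMemoryIndex();
        Console.WriteLine(RebuildMarker.IsRebuilt(index)); // False

        RebuildMarker.Populate(index, new[] { "1001", "1002", "1003" });
        Console.WriteLine(RebuildMarker.IsRebuilt(index)); // True
    }
}
```

Assuming a rebuild starts from an empty index, the marker would disappear at the start of each rebuild and reappear only once the populator has run to the end.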

@warrenbuckley
Contributor Author

Cheers @Shazwazza for the advice, I have something working but would love a second opinion please :)

I have tried the approach you have suggested and ended up down a rabbit hole of having to inherit lots of things in order to write something to the external examine index.

Such as

  • UmbracoIndexConfig, in order to say I want to use my own implementation of ContentValueSetValidator
  • ContentValueSetValidator, so that I can check the Category of the dummy ValueSet item added to the External Index and, if it is our dummy marker, mark the result as valid so it can be added to the index. It took me a while to figure out why the new document was not getting added: the validator was expecting lots of properties on the ValueSet because it assumes it is an Umbraco content node.

For now I have updated the ContentIndexPopulator so that it updates properties on the HealthCheck class, rather than adding an item to the index (which I did get working).

I simplified it to update properties on the HealthCheck, like the approach shown in the documentation here

Questions ❓

  • Would you have solved it this way or do you have any pointers on how it could be improved?
    • Hopefully you don't tell me I have approached this all wrong 🙈 🤞
  • I am seeing different values for the count of items in the index from the code below. Any obvious reasons why this might be, mate?
    • For example, it will sometimes report 0 or, say, 5 when in fact there are 7 items for the test
// How many items are in the index ?
IIndexDiagnostics indexDiagnostics = _indexDiagnosticsFactory.Create(externalIndex);

// Why is this returning various different results (0, 5, 7)
_healthCheck.NumberOfItemsInIndex = indexDiagnostics.GetDocumentCount();
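The thread does not say why the count varies. One possible explanation, which is purely an assumption here, is that the count is read while documents are still being written in the background. If that is the case, a rough workaround is to poll until the count stops changing. The sketch below is independent of Examine: StableCount is a hypothetical helper and readCount is a delegate you would point at something like the GetDocumentCount() call above. A stable count is still only a hint, not proof that indexing has finished.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: polls a counter until it returns the same value
// for a given number of consecutive reads.
public static class StableCount
{
    public static async Task<long> WaitAsync(
        Func<long> readCount,
        int requiredStableReads,
        TimeSpan interval,
        CancellationToken cancellationToken = default)
    {
        if (requiredStableReads < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(requiredStableReads));
        }

        long last = readCount();
        int stableReads = 1;

        while (stableReads < requiredStableReads)
        {
            await Task.Delay(interval, cancellationToken);

            long current = readCount();
            // Same value as before extends the streak; a new value restarts it.
            stableReads = current == last ? stableReads + 1 : 1;
            last = current;
        }

        return last;
    }
}

public static class Program
{
    public static void Main()
    {
        // Simulated reads matching the values reported above: 0, then 5, then 7.
        var reads = new Queue<long>(new long[] { 0, 5, 7, 7, 7 });

        long count = StableCount
            .WaitAsync(() => reads.Dequeue(), requiredStableReads: 3, interval: TimeSpan.Zero)
            .GetAwaiter()
            .GetResult();

        Console.WriteLine(count); // 7
    }
}
```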

Look forward to hearing from you and hearing your thoughts on this 👍

Current Approach

UmbracoExternalIndexReadyHealthCheck.cs

using Microsoft.Extensions.Diagnostics.HealthChecks;

namespace Gibe.HealthChecks.ExamineIndex.HealthChecks
{
    public class UmbracoExternalIndexReadyHealthCheck : IHealthCheck
    {
        public bool ExternalIndexReady { get; set; }
        public long NumberOfItemsInIndex { get; set; }
        public TimeSpan DurationToBuildIndex { get; set; }

        public Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context,
            CancellationToken cancellationToken = default)
        {
            // The property ExternalIndexReady is set in IndexReadyContentIndexPopulator
            // once it has finished populating the external index
            if (ExternalIndexReady)
            {
                return Task.FromResult(HealthCheckResult.Healthy("The external index is ready.",
                    new Dictionary<string, object>
                    {
                        // Provide some extra info about the index for information purposes
                        // That can be displayed as part of the health check response if users use a custom response writer
                        { "numberOfItemsInIndex", NumberOfItemsInIndex },
                        { "durationToBuildIndex", DurationToBuildIndex }
                    }));
            }

            return Task.FromResult(HealthCheckResult.Unhealthy("The external index is still creating its index"));
        }
    }
}

IndexReadyContentIndexPopulator.cs

using System.Diagnostics;
using Examine;
using Gibe.HealthChecks.ExamineIndex.HealthChecks;
using Microsoft.Extensions.Logging;
using Umbraco.Cms.Core;
using Umbraco.Cms.Core.Logging;
using Umbraco.Cms.Core.Services;
using Umbraco.Cms.Infrastructure.Examine;
using Umbraco.Cms.Infrastructure.Persistence;

namespace Gibe.HealthChecks.ExamineIndex.Examine;

public class IndexReadyContentIndexPopulator : PublishedContentIndexPopulator
{
    private readonly UmbracoExternalIndexReadyHealthCheck _healthCheck;
    private readonly IIndexDiagnosticsFactory _indexDiagnosticsFactory;

    public IndexReadyContentIndexPopulator(
        ILogger<PublishedContentIndexPopulator> logger, 
        IContentService contentService, 
        IUmbracoDatabaseFactory umbracoDatabaseFactory, 
        IPublishedContentValueSetBuilder contentValueSetBuilder, 
        UmbracoExternalIndexReadyHealthCheck healthCheck,
        IIndexDiagnosticsFactory indexDiagnosticsFactory) 
        : base(logger, contentService, umbracoDatabaseFactory, contentValueSetBuilder)
    {
        _healthCheck = healthCheck;
        _indexDiagnosticsFactory = indexDiagnosticsFactory;
    }

    protected override void PopulateIndexes(IReadOnlyList<IIndex> indexes)
    {
        // Let's time how long it takes to do the indexing work
        var stopWatch = Stopwatch.StartNew();
        
        // Do the usual work from Umbraco CMS
        // Of creating the index and populating it
        base.PopulateIndexes(indexes);

        stopWatch.Stop();
        
        // Ensure we have the external index assigned to this Populator
        // It should be - but good to check
        var externalIndex = indexes.SingleOrDefault(x => x.Name.Equals(Constants.UmbracoIndexes.ExternalIndexName));

        if (externalIndex != null)
        {
            // Update Health Check property with duration
            _healthCheck.DurationToBuildIndex = stopWatch.Elapsed;
             
            // How many items are in the index ?
            IIndexDiagnostics indexDiagnostics = _indexDiagnosticsFactory.Create(externalIndex);
            _healthCheck.NumberOfItemsInIndex = indexDiagnostics.GetDocumentCount();
            
            // Mark the healthcheck as ready
            _healthCheck.ExternalIndexReady = true;
        }
    }
}

HealthCheckBuilderExtensions.cs

using Gibe.HealthChecks.ExamineIndex.Examine;
using Gibe.HealthChecks.ExamineIndex.HealthChecks;
using Microsoft.Extensions.DependencyInjection;
using Umbraco.Cms.Infrastructure.Examine;

namespace Gibe.HealthChecks.ExamineIndex.Extensions
{
    public static class HealthCheckBuilderExtensions
    {
        public static IHealthChecksBuilder AddUmbracoExternalIndexReady(this IHealthChecksBuilder healthChecksBuilder)
        {
            // Because we use it in IndexReadyContentIndexPopulator
            // we need to explicitly add it to the DI container
            healthChecksBuilder.Services.AddSingleton<UmbracoExternalIndexReadyHealthCheck>();

            // Replace the singleton of PublishedContentIndexPopulator from Umbraco CMS with our own
            // We need this so we can set a property on our health check once it has finished indexing

            // There is more than one type of IIndexPopulator, hence looking for the specific one we want to replace
            var publishedContentIndexPopulator =
                healthChecksBuilder.Services.SingleOrDefault(x =>
                    x.ImplementationType == typeof(PublishedContentIndexPopulator));
            if (publishedContentIndexPopulator != null)
            {
                healthChecksBuilder.Services.Remove(publishedContentIndexPopulator);
                healthChecksBuilder.Services.AddSingleton<IIndexPopulator, IndexReadyContentIndexPopulator>();
            }

            // Add our health check
            return healthChecksBuilder.AddCheck<UmbracoExternalIndexReadyHealthCheck>("UmbracoExternalIndexReady");
        }
    }
}

Consuming project

HealthChecksComposer.cs

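using System.Text;
using System.Text.Json;
using Gibe.HealthChecks.ExamineIndex.Extensions;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using Umbraco.Cms.Core.Composing;
using Umbraco.Cms.Core.DependencyInjection;
using Umbraco.Cms.Web.Common.ApplicationBuilder;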

namespace Gibe.HealthChecks.TestSite.Composing
{
    public class HealthChecksComposer : IComposer
    {
        public void Compose(IUmbracoBuilder builder)
        {
            // Add Health Checks
            // Users could choose other packages from https://github.com/Xabaril/AspNetCore.Diagnostics.HealthChecks
            // and chain other checks
            builder.Services.AddHealthChecks()
                .AddUmbracoExternalIndexReady();


            builder.Services.Configure<UmbracoPipelineOptions>(options =>
            {
                options.AddFilter(new UmbracoPipelineFilter("AspNetHealthChecks")
                {
                    Endpoints = app => app.UseEndpoints(endpoints =>
                    {
                        endpoints.MapHealthChecks("/health", new HealthCheckOptions()
                        {
                            // Use a custom response writer to give us some JSON and more info about the health check/s
                            // https://learn.microsoft.com/en-us/aspnet/core/host-and-deploy/health-checks?view=aspnetcore-8.0#customize-output
                            ResponseWriter = WriteResponse
                        });
                    })
                });
            });
        }

        // Output custom JSON for the health check/s
        // https://learn.microsoft.com/en-us/aspnet/core/host-and-deploy/health-checks?view=aspnetcore-8.0#customize-output
        private static Task WriteResponse(HttpContext context, HealthReport healthReport)
        {
            context.Response.ContentType = "application/json; charset=utf-8";

            var options = new JsonWriterOptions { Indented = true };

            using var memoryStream = new MemoryStream();
            using (var jsonWriter = new Utf8JsonWriter(memoryStream, options))
            {
                jsonWriter.WriteStartObject();
                jsonWriter.WriteString("status", healthReport.Status.ToString());
                jsonWriter.WriteStartObject("results");

                foreach (var healthReportEntry in healthReport.Entries)
                {
                    jsonWriter.WriteStartObject(healthReportEntry.Key);
                    jsonWriter.WriteString("status", healthReportEntry.Value.Status.ToString());
                    jsonWriter.WriteString("description", healthReportEntry.Value.Description);
                    jsonWriter.WriteStartObject("data");

                    foreach (var item in healthReportEntry.Value.Data)
                    {
                        jsonWriter.WritePropertyName(item.Key);

                        JsonSerializer.Serialize(jsonWriter, item.Value, item.Value?.GetType() ?? typeof(object));
                    }

                    jsonWriter.WriteEndObject();
                    jsonWriter.WriteEndObject();
                }

                jsonWriter.WriteEndObject();
                jsonWriter.WriteEndObject();
            }

            return context.Response.WriteAsync(Encoding.UTF8.GetString(memoryStream.ToArray()));
        }
    }
}
