Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance improvements mainly around front-end caching #8376

Merged
merged 113 commits into from Jun 23, 2021

Conversation

Shazwazza
Copy link
Contributor

@Shazwazza Shazwazza commented Jul 2, 2020

This is a branch of some ideas being chatted about in #8365

  • Reduce the size of data stored in the cmsContentNu table
    • Reducing this data size reduces the amount of data that needs to be transferred over the network which reduces the latency to load it
    • The data is now serialized with MessagePack and the keys used for the data format are single chars
    • All aliases used , including things like culture names to consume this format are now string interned
    • The data storage for this format is now raw bytes instead of serialized strings, this greatly reduces the data size but also removes the overhead for serialization and deserialization. This can be disabled to return to original behavior by: <add key="Umbraco.Web.PublishedCache.NuCache.Serializer" value="JSON"/>. MessagePack will be the default that we ship with. There's a check in place during start to detect if the config switch has changed and it will rebuild the nucache tables
    • Allows for custom pluggable serializers for this data
  • Ability to compresses property values in memory/caches (i.e. RTE and complex json values)
    • By default property level compression is turned off but it can be implemented in a custom way by registering a custom IPropertyCacheCompressionOptions which will determine if a property should be compressed in memory. Example: composition.RegisterUnique<IPropertyCacheCompressionOptions, CustomPropertyCacheCompressionOptions>();
  • Improved SQL indexes which now also utilizes SQL Server INCLUDE columns (Support include in sql server indexes #8630)
  • Improved SQL queries when rebuilding the in memory caches from the cmsContentNu table
  • Adds string interning for common aliases like property aliases
  • More primitive types supported in nucache (Support more primitives in nucache #9440)

Other changes:

  • Fixes a startup issue that doesn't bubble the underlying boot failed exception if the container fails.
  • Ability to have SQL Server indexes with INCLUDE columns
  • Still need to do some benchmarks between this and previous versions for
    • Cold Boot time
    • DB table size of cmsContentNu table
    • Normal Boot time - need to compare what a cold boot vs normal boot looks like with these changes (in hopes we could eventually just get rid of the local cache files)
    • Rendering times with in-memory compressed properties (first load)
    • Time to takes to rebuild the DB nucache table
    • Running mem consumption

Benchmark

The raw numbers here are not the interesting part since that is largely dependent on my machine. It's the % that are the important figures. This Umbraco install has ~ 16k nodes.

Table Sizes

  name rows reserved data index_size unused
BEFORE cmsContentNu 15983 43344 KB 42728 KB 16 KB 600 KB
AFTER cmsContentNu 15983 11664 KB 10096 KB 48 KB 1520 KB

With MsgPack binary serialization the database table size is reduced by ~ 73%

Cold Boot time

An average based on 10 runs

  ave max min
BEFORE 18586.63636 33267 5675
AFTER with cmsContentNu as JSON 6586.444444 7365 5970
AFTER with cmsContentNu as MsgPack 5615.090909 6331 4627

The new code changes and with the MsgPack system performs ~ 70% better than the original implementation. It's also far more consistent. Before on average every 2nd cold boot would have a very long boot time.

Normal Boot time

This is based on the normal boot process (non cold boot) to see what the perf difference is of having the local nucache files vs loading directly from the database (i.e. IgnoreLocalDb = true). With the new DB indexes and code changes we perform about 15% faster when loading in memory nucache from the DB (this figure would be larger if running the database over the network/internet). But as you can see it's still quite a lot faster to have the local nucache files.

  ave max min
BEFORE Boot without nucache files 5321.909091 7007 4394
AFTER Boot without nucache files 4520.454545 5115 3911
AFTER Boot with nucache files 2449.777778 3085 1949

Cold boot memory

This is a comparison of mem consumption during cold booting between this codebase and 8.13. This was run several times with the same result. The cold boot memory footprint of this code base is lower than 8.13

From bootup until page render. Significantly less memory.

image

After first page render and navigating several pages. This one is the most surprising because it seems that a cold boot before these changes even after boot requires more memory than normal to run.

image

Warm boot memory

This is a comparison of mem consumption during warm booting between this codebase and 8.13. This was run several times with the same result. The cold boot memory footprint of this code base is lower than 8.13

From bootup until page render. .Not a huge change but still less memory.

image

After first page render and navigating several pages. Not a huge change but still less memory.

image

String Duplicates

In this codebase there is some usages of string interning. Based on profiling you can see that this is working a bit but it's not perfect. We are trying to intern all common strings that are dynamic such as property type aliases, content type aliases, etc... Here's a comparison of this codebase vs 8.13 with the same data in the database and sorted by the highest duplicate string count. You can see that a lot of alias duplicates are reduced but they certainly still exist. We should continue to string intern these values throughout the codebase in the future to further reduce this.

This codebase:

image

8.13:

image

Shazwazza and others added 17 commits July 3, 2020 14:50
…property basis. Option for compressing the properties in sql/nucache.db. Option for immediate/lazy decompression of properties. Mapping support for shorter property alias.

TODO: config file for property map
TODO:  HasValue and IsValue on propertyvalueconverterbase
…ession

WIP Option C - Compress custom properties for nucache to reduce memory consumption
…che-perf

# Conflicts:
#	src/Umbraco.TestData/LoadTestComposer.cs
…-perf

# Conflicts:
#	src/Umbraco.Core/Sync/DatabaseServerMessenger.cs
#	src/Umbraco.Core/Sync/ISyncBootStateAccessor.cs
#	src/Umbraco.Core/Sync/NonRuntimeLevelBootStateAccessor.cs
#	src/Umbraco.Core/Sync/SyncBootState.cs
#	src/Umbraco.Tests/PublishedContent/NuCacheChildrenTests.cs
#	src/Umbraco.Tests/PublishedContent/NuCacheTests.cs
#	src/Umbraco.Tests/Scoping/ScopedNuCacheTests.cs
#	src/Umbraco.Tests/Services/ContentTypeServiceVariantsTests.cs
#	src/Umbraco.Tests/TestHelpers/TestSyncBootStateAccessor.cs
#	src/Umbraco.Web/Compose/DatabaseServerRegistrarAndMessengerComponent.cs
#	src/Umbraco.Web/PublishedCache/NuCache/NuCacheComposer.cs
#	src/Umbraco.Web/PublishedCache/NuCache/PublishedSnapshotService.cs
@Shazwazza Shazwazza marked this pull request as ready for review June 18, 2021 21:06
@Shazwazza
Copy link
Contributor Author

I've added more benchmarks (above). The running memory and rebuild times aren't significantly different so nothing to really report on there. Though, that might be different if there are tens of thousands of nodes.

The only thing I'd still like to benchmark is the effect of having the compressed property data and rendering.

I've moved this as ready to review. I can see that v8/contrib needs to be re-merged in though.

@Shazwazza Shazwazza changed the title WIP for perf improvements mainly around front-end caching Performance improvements mainly around front-end caching Jun 18, 2021
@Shazwazza Shazwazza added the category/performance Fixes for performance (generally cpu or memory) fixes label Jun 18, 2021
// which then performs a Clustered Index Scan on PK_umbracoPropertyData which means it iterates the entire table which can be enormous!
// especially if there are both a lot of content but worse if there is a lot of versions of that content.
// So is it possible to return this property data without doing an index scan on PK_umbracoPropertyData and without iterating every row
// in the table?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the oldest version of SQL server Umbraco supports. Essentially are indexed views a possibility?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@bergmania bergmania left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.. Just small comments. I fear the merge to v9 🙈

src/Umbraco.Core/Migrations/Upgrade/UmbracoPlan.cs Outdated Show resolved Hide resolved
src/Umbraco.Tests/Umbraco.Tests.csproj Outdated Show resolved Hide resolved
src/Umbraco.Tests/Umbraco.Tests.csproj Outdated Show resolved Hide resolved
@Shazwazza
Copy link
Contributor Author

Happy to help with merge to v9. I've updated the PR based on your comments.

@Shazwazza
Copy link
Contributor Author

Shazwazza commented Jun 22, 2021

One thing to test that I don't know if you did is to change the config:

<add key="Umbraco.Web.PublishedCache.NuCache.Serializer" value="MsgPack"/>

to

<add key="Umbraco.Web.PublishedCache.NuCache.Serializer" value="JSON"/>

which will force the nucache table to rebuild on startup with the correct format. You can verify this by looking in the nucache DB table. WIth MsgPack it will populate the binary column, with JSON it will populate the string column.

@bergmania
Copy link
Member

bergmania commented Jun 23, 2021

One thing to test that I don't know if you did is to change the config:
....

I tested this and it seems to work as expected 🎉

I also executed the acceptance tests.. A Single test does not pass, but that is not related to this PR. (Fixed in this PR #10516)

@bergmania bergmania merged commit 40dbfa7 into v8/contrib Jun 23, 2021
@bergmania bergmania deleted the v8/feature/nucache-perf branch June 23, 2021 07:32
@bergmania bergmania restored the v8/feature/nucache-perf branch June 23, 2021 09:23
@nul800sebastiaan nul800sebastiaan mentioned this pull request Jun 23, 2021
1 task
@nul800sebastiaan nul800sebastiaan deleted the v8/feature/nucache-perf branch June 23, 2021 14:25
@Shazwazza
Copy link
Contributor Author

Please remember to mark with a release version. I've marked as 8.15.

@nul800sebastiaan
Copy link
Member

Thanks @Shazwazza I was about to do this as well! 👍

bergmania added a commit that referenced this pull request Jun 25, 2021
This is a merge PR which merges in #8376 to v9
@ronaldbarendse
Copy link
Contributor

@Shazwazza I just noticed some separate commits to fix the migration from this PR to support SQL CE, but that doesn't seem to drop the temp table after moving all rows back to the original table:

if (DatabaseType.IsSqlCe())
{
// SQLCE does not support altering NTEXT, so we have to jump through some hoops to do it
// All column ordering must remain the same as what is defined in the DTO so we need to create a temp table,
// drop orig and then re-create/copy.
Create.Table<ContentNuDtoTemp>(withoutKeysAndIndexes: true).Do();
Execute.Sql($"INSERT INTO [{TempTableName}] SELECT nodeId, published, data, rv FROM [{Constants.DatabaseSchema.Tables.NodeData}]").Do();
Delete.Table(Constants.DatabaseSchema.Tables.NodeData).Do();
Create.Table<ContentNuDto>().Do();
Execute.Sql($"INSERT INTO [{Constants.DatabaseSchema.Tables.NodeData}] SELECT nodeId, published, data, rv, NULL FROM [{TempTableName}]").Do();
}

@Shazwazza
Copy link
Contributor Author

@ronaldbarendse good catch - I guess we're too late now and it will need to be in the next minor release 🤦‍♂️ 🤦‍♂️ 🤦‍♂️

@nul800sebastiaan any thoughts?

@ronaldbarendse
Copy link
Contributor

We can add a release note mentioning you can manually remove the table (DROP TABLE [umbracocmsContentNuTEMP]) if you're using SQL CE and upgraded your site to 8.15.0.

Because it's only on SQL CE, this should only happen on development databases, so it's not a big issue anyway. And if we fix up this migration in a next patch release, upgrading a pre 8.15.0 site to 8.15.1 won't have this problem 👍🏻


Also note the double prefix of Constants.DatabaseSchema.TableNamePrefix/umbraco and cms:

private const string TempTableName = Constants.DatabaseSchema.TableNamePrefix + "cms" + "ContentNuTEMP";

This is probably because of the comment on the original cmsContentNu table constant:

public const string NodeData = /*TableNamePrefix*/ "cms" + "ContentNu";

We can also fix this up, but it shouldn't matter, as it's a temporary table anyway 😉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category/performance Fixes for performance (generally cpu or memory) fixes release/8.15.0 type/feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants