Proposed patch for U4-10756 - Guid TypedContent performance #2398
Makes a lot of sense for the short term, we'll review more closely next week, thanks!
@nul800sebastiaan It was one of those where I was trying to figure it out. So no expectation (from me) in this being the accepted solution. Definitely a discussion point. 👍
Yep, no worries, it's a bit late to put it in the next patch, but can make it into 7.7.10 - we do have a larger solution already in mind but this will help for now! Created http://issues.umbraco.org/issue/U4-10854 for this one, cheers!
I edited your link above to go to the new issue. @ HQ when merging, please use http://issues.umbraco.org/issue/U4-10854
I wonder what the xpath or xpath navigator performance would be to go look at every Guid in the content cache and cache it all up-front when first requested as opposed to one by one?
    }
}

private static ConcurrentDictionary<Guid, int> _guidToIntLoopkup;
Why not initialize it straight away instead of incurring the cost of a null check every time `TypedDocumentById` is called? An empty dictionary costs so little, plus the check is not thread safe.
`private static readonly ConcurrentDictionary<Guid, int> GuidToIntLoopkup = new ConcurrentDictionary<Guid, int>();`
I guess I was trying to initialise it only once `TypedDocumentById(Guid)` is called, as opposed to whenever `PublishedContentQuery` is instantiated. E.g. if you don't use that method, then it wouldn't be created?
Init cost is very little compared to the null check hit. As soon as the class is referenced anywhere the reference will be created.
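The two options in this thread can be sketched side by side. This is only an illustration: `GuidLookupDemo` is a hypothetical stand-in for `PublishedContentQuery`, and the lazy variant is written to show why an unsynchronized null check is a concern, since two threads can both observe `null` and each create a dictionary.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for the class in the PR; only field initialization matters here.
public static class GuidLookupDemo
{
    // Lazy variant: the null check runs on every call and is not thread safe.
    private static ConcurrentDictionary<Guid, int> _lazyLookup;

    public static ConcurrentDictionary<Guid, int> GetLazy()
    {
        if (_lazyLookup == null)
        {
            // Two threads can both reach this line; one dictionary (and its entries) is lost.
            _lazyLookup = new ConcurrentDictionary<Guid, int>();
        }
        return _lazyLookup;
    }

    // Eager variant: the runtime executes the static initializer exactly once,
    // at some point before the field is first accessed.
    private static readonly ConcurrentDictionary<Guid, int> EagerLookup =
        new ConcurrentDictionary<Guid, int>();

    public static ConcurrentDictionary<Guid, int> GetEager()
    {
        return EagerLookup;
    }
}
```

With the eager variant there is nothing to check per call, and every caller sees the same instance.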
// When we have the node, we add the GUID/INT value to the lookup
if (doc != null)
    _guidToIntLoopkup.TryAdd(id, doc.Id);
Since we know that the key/value pair here is correct, there's no need for `TryAdd`, as you are not checking the result anyway. `_guidToIntLoopkup[id] = doc.Id` should be sufficient.
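For reference, a small sketch of how the two calls differ on a `ConcurrentDictionary`. The class name `TryAddVsIndexerDemo` and the values are made up for illustration. Because a given Guid always maps to the same int ID in this PR, both calls would leave the lookup in the same state; the indexer simply skips the unused return value.

```csharp
using System;
using System.Collections.Concurrent;

public static class TryAddVsIndexerDemo
{
    // Returns { added, addedAgain, valueAfterTryAdd, valueAfterIndexer } as ints.
    public static int[] Run()
    {
        var lookup = new ConcurrentDictionary<Guid, int>();
        var key = Guid.NewGuid();

        bool added = lookup.TryAdd(key, 1);      // true: the key was absent
        bool addedAgain = lookup.TryAdd(key, 2); // false: the existing value is kept
        int afterTryAdd = lookup[key];

        lookup[key] = 3;                         // indexer adds or overwrites unconditionally
        int afterIndexer = lookup[key];

        return new[] { added ? 1 : 0, addedAgain ? 1 : 0, afterTryAdd, afterIndexer };
    }
}
```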
@Shazwazza - Populating a lookup upfront is fine. Performance-wise, I guess it depends on how many published nodes the website has? But then the lookup would need to be accessible from an app-startup event, so it would need better API design (than what I offer in this PR).
In @AndyButland's workaround solution, it does a database hit to get all the Guid/Int IDs:
Alternatively, in my comment on @hartvig's other PR (#2367 (comment)), I mention a
I guess it comes down to how much time/effort is expected for this workaround?
da11acd
to
9277ae6
Compare
as per @JimBobSquarePants's comment: umbraco#2398 (comment)
@leekelleher Just to clarify what I mean, you've said
I'm not talking about populating the dictionary on startup. I'm talking about populating the dictionary on first access, just like you are doing in this PR, but instead of looking up the single entry, on the very first hit we could look up all GUIDs -> INTs that exist in the XML doc. I'm saying this could be an easy win if looking up all GUIDs in the document using XPath isn't much more overhead than looking up a single one. Does that make sense?
@Shazwazza - ok gotcha. I'll take a look 👍
This is a proposal for a temporary workaround to issue U4-10756. It uses a local static ConcurrentDictionary to store a lookup of Guid/int values. If the Guid isn't in the lookup, then the traditional XPath is used, which would add the resulting node ID (int) to the lookup. If the lookup contains the Guid, then the returned int value will be used to perform a more efficient retrieval. <http://issues.umbraco.org/issue/U4-10756>
The enhancement in PR umbraco#2367 removed the `"@isdoc"` check, which was the main difference between the legacy and current XML schema. Reducing the XPath to `"//*[@key=$guid]"` will perform the same for both types of schema. _(and saves on a couple of allocations)_
as per @JimBobSquarePants's comment: umbraco#2398 (comment)
as per @Shazwazza's comment: umbraco#2398 (comment) Tested against 12,000 nodes (in the XML cache), profiling showed it took 55ms.
9277ae6
to
0780979
Compare
@Shazwazza - I've added in code to eagerly populate the Guid lookup (see 0780979). I tested against 12,000 nodes, it took 55ms (using MiniProfiler). I'm not sure what our level of acceptance would be here?
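A rough sketch of what eager population could look like, using the standard `System.Xml.XPath` API against a tiny in-memory document. The sample XML, the `EagerPopulateDemo` class, and the assumption that each node exposes `id` and `key` attributes are illustrative only; this is not the code from commit 0780979, and the real XML cache is richer than this.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Xml.XPath;

public static class EagerPopulateDemo
{
    // Hypothetical sample document, assumed shape: elements with id and key attributes.
    public const string SampleXml =
        "<root id=\"-1\">" +
        "<home id=\"1001\" key=\"6f0b1a7e-2c57-4b8e-9d1a-3a4b5c6d7e8f\">" +
        "<page id=\"1002\" key=\"0a1b2c3d-4e5f-4a6b-8c7d-9e0f1a2b3c4d\" />" +
        "</home></root>";

    public static ConcurrentDictionary<Guid, int> BuildLookup(string xml)
    {
        var lookup = new ConcurrentDictionary<Guid, int>();
        XPathNavigator navigator = new XPathDocument(new StringReader(xml)).CreateNavigator();

        // One query selects every element that carries a key attribute.
        XPathNodeIterator nodes = navigator.Select("//*[@key]");
        while (nodes.MoveNext())
        {
            XPathNavigator current = nodes.Current;
            Guid key;
            int id;
            // Skip nodes whose attributes do not parse instead of throwing.
            if (Guid.TryParse(current.GetAttribute("key", ""), out key) &&
                int.TryParse(current.GetAttribute("id", ""), out id))
            {
                lookup[key] = id;
            }
        }
        return lookup;
    }
}
```

Calling `EagerPopulateDemo.BuildLookup(EagerPopulateDemo.SampleXml)` yields a two-entry lookup for the sample above.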
Sounds good enough for me as it is now.
Cool, thanks @zpqrtbnk!
if (_guidToIntLoopkup.Count == 0)
{
    // TODO: Remove the debug profile logger
    using (ApplicationContext.Current.ProfilingLogger.DebugDuration<PublishedContentQuery>("Populate GUID/INT lookup"))
@zpqrtbnk @Shazwazza Do you see any issue with using `DebugDuration` here?
I'm ever so slightly concerned about any performance bottlenecks here.
I was reading the discussion you had here and just wanted to say "thank you" for putting so much effort into Umbraco. #h5yr 👍
I’m no concurrency expert, but doesn’t this introduce race conditions where multiple threads populate the cache on first access at the same time? The same goes for returning it afterwards: wouldn’t it be better to use `GetOrAdd` with the func overload to remove that specific race condition?
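A minimal sketch of the `GetOrAdd` suggestion, assuming a hypothetical `slowLookup` delegate in place of the XPath query. It is not the PR's code, and it sidesteps one detail of the real method: when no document is found there is no ID to store, so a miss would need separate handling.

```csharp
using System;
using System.Collections.Concurrent;

public static class GetOrAddDemo
{
    private static readonly ConcurrentDictionary<Guid, int> Lookup =
        new ConcurrentDictionary<Guid, int>();

    // slowLookup is a hypothetical stand-in for the expensive XPath query.
    public static int GetId(Guid key, Func<Guid, int> slowLookup)
    {
        // The check and the insert are a single dictionary operation, so callers
        // never see a half-updated state. Under contention the factory may still
        // run more than once, but only one result is stored and returned to all.
        return Lookup.GetOrAdd(key, slowLookup);
    }
}
```

If the factory must run at most once (for example when it populates the whole cache), wrapping the work in `Lazy<T>` is a common alternative, since `GetOrAdd` alone does not guarantee single execution.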
This is a proposal for a temporary workaround to issue U4-10756.
It uses a local static ConcurrentDictionary to store a lookup of Guid/int values.
http://issues.umbraco.org/issue/U4-10854
Note: I have also removed the `UseLegacyXmlSchema` check, see commit da11acd for notes.
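The overall flow described above (check the lookup, fall back to XPath, remember the result) can be sketched as follows. `DemoDocument`, `GuidLookupQuery`, and the two delegates are hypothetical names introduced for illustration; the actual change lives in `PublishedContentQuery` and uses the content cache directly.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical minimal document type for the sketch.
public sealed class DemoDocument
{
    public int Id { get; set; }
    public Guid Key { get; set; }
}

public sealed class GuidLookupQuery
{
    private static readonly ConcurrentDictionary<Guid, int> GuidToInt =
        new ConcurrentDictionary<Guid, int>();

    private readonly Func<Guid, DemoDocument> _findByXPath; // slow path (assumed)
    private readonly Func<int, DemoDocument> _findById;     // fast path (assumed)

    public GuidLookupQuery(Func<Guid, DemoDocument> findByXPath, Func<int, DemoDocument> findById)
    {
        _findByXPath = findByXPath;
        _findById = findById;
    }

    public DemoDocument TypedDocumentById(Guid key)
    {
        int id;
        if (GuidToInt.TryGetValue(key, out id))
        {
            // Known Guid: use the cheaper int-based retrieval.
            return _findById(id);
        }

        // Unknown Guid: fall back to the XPath query and remember the result.
        DemoDocument doc = _findByXPath(key);
        if (doc != null)
        {
            GuidToInt[key] = doc.Id;
        }
        return doc;
    }
}
```

Only successful lookups are stored, so a Guid that does not resolve goes through the slow path on every call.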