Skip to content

Optimize Tracking: convert ChangedValue & TrackingStackFrame to readonly structs #249

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

SergeiPavlov
Copy link
Contributor

No description provided.

Comment on lines 33 to 38
"00240000048000009400000006020000002400005253413100040000010001007f8790a0573a2a" +
"4f7b77907240977b669d33774bcaef77ef39d5186de4d4c836623fb661be674c7db731bc376184" +
"f6848fc6b64eeba506654c8212de624e9d1f15ad1fe765f769e9a5b59ecaab2866632a983e7828" +
"f3bf20ff7afffa021951cf8adc9ac1e786716430b0637893fb71b7566405fe14acc5e340c4af4a" +
"a293c994")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extensions should not rely on internal members and should not be in friendly assemblies

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even providers have no access to internals

Copy link
Contributor Author

@SergeiPavlov SergeiPavlov Jul 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alternative is to make DirectSessionAccessor.GetChangedEntitiesInternal() public
IMO it is better to move some extensions into main Xtensive.Orm project.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it is better to move some extensions into main Xtensive.Orm project.

Why? What is the criteria for this movement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? What is the criteria for this movement?

In this case GetChangedEntitiesInternal() optimization.

But in general integration of stable extensions into main project simplifies using DO.
Then we need to reference only 2 packages from the app: Orm + DB driver (SqlServer or other)

Copy link
Contributor

@alex-kulakov alex-kulakov Jul 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of extension is that it is optional one-purpose package, if you need it then you install it, otherwise you don't. You look at this from your perspective when you 100% use it. That's biased opinion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of extension is that it is optional one-purpose package, if you need it then you install it, otherwise you don't.

The same idea as for any function or class.
This Xtensive.Orm.Tracking.dll extension is just 16KB of binary. Not a big deal if the main package will be 16KB bigger and we avoid all installation efforts.

Copy link
Contributor Author

@SergeiPavlov SergeiPavlov Jul 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed implementation to use conditional compilation:

    public IEnumerable<EntityState> GetChangedEntities(PersistenceState persistenceState) =>
#if DO_SAFE_COLLECTION_WRAPPER
      Session.EntityChangeRegistry.GetItems(persistenceState);
#else
      Session.EntityChangeRegistry.GetContainer(persistenceState);
#endif

This allows to have safe implementation for public DO package and the optimized for ourown build (depending on environment variable)
And we need not visibility-attributes and internal method anymore

We can apply this approach to other places where we create unnecessary ReadOnly-wrappers

@alex-kulakov
Copy link
Contributor

alex-kulakov commented Jul 14, 2022

So you want to return the original HashSet but don't want public API to have an opportunity to change HashSet by just casting it to the original type, right? What if we returned IReadOnlyCollection and also made sure that it was not a pure HashSet but lightweight wrapper which implements IReadOnlyCollection?

@SergeiPavlov
Copy link
Contributor Author

So you want to return the original HashSet but don't want public API to have an opportunity to change HashSet by just casting it to the original type, right?

Yes

What if we returned IReadOnlyCollection and also made sure that it was not a pure HashSet but lightweight wrapper which implements IReadOnlyCollection?

Lightweight wrapper is not better than additional foreach. They both have some allocation price.

@alex-kulakov
Copy link
Contributor

Well, everything has allocation price otherwise we would be C++ing :) that doesn't mean we should expose everything. I'll check the performance losses and allocations.

@alex-kulakov
Copy link
Contributor

Some update on performance of EntityChangeRegistry.GetItems(). I checked several scenarios of enumeration, including:

  1. current enumeration;
  2. Lightweight class wrapper with pure HashSet<T>.Enumerator that implements IReadOnlyCollection<T> that is returned as IReadOnlyCollection<T>;
  3. Lightweight class wrapper with pure HashSet<T>.Enumerator that implements IReadOnlyCollection<T> that is returned as original type;
  4. Lightweight struct wrapper with pure HashSet<T>.Enumerator that implements IReadOnlyCollection<T> that is returned as IReadOnlyCollection<T>;
  5. Lightweight struct wrapper with pure HashSet<T>.Enumerator that implements IReadOnlyCollection<T> that is returned as original type;
  6. HashSet<T> returned as IReadOnlyCollection<T>;
  7. HashSet<T> returned.

I did one enumeration of 100 items per case
The results (order of methods matches the order of cases above)

Method Mean Error StdDev Rank Gen 0 Allocated
GetItemsWithForeach 1,032.4 ns 3.95 ns 3.50 ns 6 0.0134 64 B
GetItemsClassAsInterface 792.8 ns 2.45 ns 2.29 ns 4 0.0134 64 B
GetItemsPureClass 337.1 ns 1.03 ns 0.96 ns 2 0.0048 24 B
GetItemsWithStructAsInterface 862.8 ns 2.81 ns 2.62 ns 5 0.0134 64 B
GetItemsPureStruct 304.0 ns 1.06 ns 0.99 ns 1 - -
GetItemsHashSetAsReadOnly 765.0 ns 2.13 ns 1.99 ns 3 0.0076 40 B
GetItemsInternalField 306.6 ns 1.41 ns 1.32 ns 1 - -

Summary: current enumeration via foreach is dead-slow comparing to other variants and we should change it. Returning interface is twice slower that original type. Returning class wrapper is fast and relatively light-weight (2.6x less memory consumed). Closest to raw internal HashSet<T> is struct wrapper type which implements IReadOnlyCollection<T>(or similar interface) and returns raw internal enumerator of HashSet<T>, and it protects internal hashset from modifications (assuming no reflection used)

I also assume that results of EntityChangeRegistry.GetItems() are generally used to perform some kind of enumeration right away (foreach, .ToList()/.ToArray(), other IEnumerable<T> operations).

The struct I used for benchmarks

  public struct ReadOnlyWrapper<T> : IReadOnlyCollection<T>
  {
    private readonly HashSet<T> hashSet;

    public int Count => hashSet.Count;

    public HashSet<T>.Enumerator GetEnumerator()
      => hashSet.GetEnumerator();

    /// <inheritdoc/>
    IEnumerator<T> IEnumerable<T>.GetEnumerator()
      => hashSet.GetEnumerator();

    /// <inheritdoc/>
    IEnumerator IEnumerable.GetEnumerator()
      => hashSet.GetEnumerator();

    public ReadOnlyWrapper(HashSet<T> hashset)
    {
      hashSet = hashset;
    }
  }

@SergeiPavlov
Copy link
Contributor Author

SergeiPavlov commented Jul 16, 2022

Thank you for benchmarking.
I think it is useless to make the wrapper a struct. If the method returns it as IReadOnlyCollection than the wrapper will be boxed.
Or you are suggesting to make return type to be ReadOnlyWrapper<T>?
Then probably we have to call it ReadOnlyHashSetWrapper<>, because there will be wrappers for other collection types

@alex-kulakov
Copy link
Contributor

alex-kulakov commented Jul 19, 2022

I didn't choose great name, I needed some name for benchmarking and it was good enough for it :) It was just a proof-of-concept.

I think it is useless to make the wrapper a struct. If the method returns it as IReadOnlyCollection than the wrapper will be boxed.

I agree, GetItemsWithStructAsInterface shows this clearly, returning interface basically doubles execution time or even worse

Or you are suggesting to make return type to be ReadOnlyWrapper?

Yes, I thought about this. Here's my point, highly likely that the result of GetItems will be enumerated right after caller have got it, we already enumerate them everywhere, but my main concern is users' code base, they can start caching results, though it is pointless (we clear internal hashsets in certain scenarios like session disposal, persist of changes, etc.).

I took a wider view and I saw that we have three registries with similar GetItems() methods - EntityChangeRegistry, EntitySetChangeRegistry and ReferenceFieldChangeRegistry. All of them have HashSets inside and all of them could benefit from returning a faster wrapper struct. So it could be some general name (but still relevant to registries) like RegistryItems<T>. Three birds with one stone, right? Any drawbacks that come to your mind?

@alex-kulakov
Copy link
Contributor

I'm going to make the changes for registries' GetItems() methods and create such wrapper type in a separate PR.

After I merge my PR to master you'll need to revert commit af4427e, especially visibility changes.

@SergeiPavlov
Copy link
Contributor Author

After I merge my PR to master you'll need to revert commit af4427e, especially visibility changes.

Ok. Looks to be not a big deal

@alex-kulakov
Copy link
Contributor

I have merged the #283 with registry changes

@alex-kulakov alex-kulakov merged commit 1610db8 into DataObjects-NET:master Nov 18, 2022
@SergeiPavlov SergeiPavlov deleted the upstream/OptimizeTracking branch November 18, 2022 16:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants