Log Consistency Providers #1854

sebastianburckhardt · 2016-06-17T20:32:21Z

An alternative provider API for persistent grains whose state is defined as a view of a log. This will be used for supporting event sourcing, and geo-replicated grains.

xiazen · 2016-10-05T02:16:36Z

src/Orleans/GrainDirectory/MultiClusterRegistrationStrategy.cs

@@ -8,7 +9,7 @@ namespace Orleans.GrainDirectory
    /// Strategy objects are used as keys to select the proper registrar.
    /// </summary>
    [Serializable]
-    internal abstract class MultiClusterRegistrationStrategy
+    public abstract class MultiClusterRegistrationStrategy


should we have a public interface IMultiClusterRegistrationStrategy so that we don't need to make this internal class public?

I have looked into this a bit and I think this should be fixed by making the IProtocolServices more abstract. I will work on this.

xiazen · 2016-10-05T23:22:09Z

src/Orleans/Core/GrainAttributes.cs

+        {
+            public LogViewProviderAttribute()
+            {
+            }


shouldn't you set up the ProviderName here ? otherwise I'm confused why you want to inherit from PersistenceProviderAttribute

They inherit from the same base attribute because they are alternative ways to specify a persistence mechanism for the grain. For log view grains, users can either specify a storage provider or a log view provider. Those are different kinds of persistence providers, but they are somewhat interchangeable (a standard storage provider can be used for log view grains). See the code inside Catalog.SetupPersistenceProvider - it finds out which persistence attribute is specified, and retrieves the appropriate type of provider.

There is no default log view provider (because the default is to use the default storage provider), therefore I am not setting the string there.

Would probably be much beneficial to extend some of these explanations straight to the code for the benefit of coder readers in later times.

Yes, that makes sense.

xiazen · 2016-10-06T00:26:09Z

src/Orleans/LogViews/LogViewGrain.cs

+    /// <typeparam name="TView">The type for the log view, i.e. state of this grain.</typeparam>
+    /// <typeparam name="TLogEntry">The type for log entries.</typeparam>
+    /// </summary>
+    public abstract class LogViewGrain<TView, TLogEntry> :


What's the difference or usage between LogViewGrain and LogViewGrainBase.

There are additional alternatives to the LogViewGrain class (QueuedGrain, JournaledGrain, not in this PR) that inherit from LogViewGrainBase but hide the log-view API underneath a terminology that is more specific to the intended situation.

Such classes can be defined by the user, too - it gives them a chance to control how exactly the storage API should look like. If they inherit from LogViewGrain, they cannot hide the logview api.

xiazen · 2016-10-06T00:56:16Z

src/OrleansProviders/LogViews/CustomStorageProvider.cs

+        public Task Close()
+        {
+            return TaskDone.Done;
+        }


What does this method intend to do? based on its implementation, it is not related to its context CustomStorageProvider

All providers have this method, it's in the IProvider interface - I think it's used to clean up (e.g. close database connections) when the silo shuts down. This provider does not have any cleanup it needs to do, so it just returns.

xiazen · 2016-10-07T23:49:27Z

src/OrleansProviders/LogViews/SharedStorageProvider.cs

+        public Task Close()
+        {
+            return TaskDone.Done;
+        }


xiazen · 2016-10-10T20:03:41Z

test/TestGrainInterfaces/ISimpleLogViewGrain.cs

+        #endregion
+    }
+
+    public struct AB


what does A B stands for ?

It is a struct containing two fields A and B. These fields don't really have any meaning, they are used by a unit test. The struct is there so a grain method can return both A and B at the same time.

It would nice to have in the comments this explanation. I bet plenty of people reading the code afterwards will have this same question. In other words, it would lighten the cognitive load to have it in writing there in the code. :)

xiazen · 2016-10-10T20:45:37Z

submited a PR to your branch to clean code styles :). Was assigned by @jdom to do a first pass on code style cleaning

veikkoeeva · 2016-11-16T17:15:59Z

src/OrleansProviders/LogViews/SharedStorageProvider.cs

+
+        public Logger Log { get; private set; }
+
+        private static int counter;


What does this count?

In general (maybe someone has an idea), I wonder if this would be problematic when running multi-silo cluster on one computer (say, a dev-box in reliable setup) or if it would work together with the DI implementation currently coming along too.

It is used to construct a unique id for the provider. Matters only for tracing. Therefore, it is actually safe even if you have multiple silos on one computer in the same app domain (but I think a LOT of other stuff would break - static fields seem to be used all over the place).

veikkoeeva · 2016-11-16T17:21:30Z

src/OrleansRuntime/LogViews/GrainStateWithMetaData.cs

+        /// Bits are toggled when writing, so that the retry logic can avoid appending an entry twice
+        /// when retrying a failed append. 
+        /// </summary>
+        public string WriteVector { get; set; }


Worth considering: BitVector32 or BitArray? Otherwise a note for later refactoring, it looks like this is a candinate to turn into class of its own (and have easily separable tests for this part).

Logically, this is not actually a bitvector, but a map from strings to a bit. I will change the comment to clarify.

In an earlier version we used Dictionary<string,bool> here. But I changed it to use a special representation as a simple string to reduce serialization/deserialization overhead, which is somewhat critical here. This is also why it is not a separate class.

I don't mind sacrificing maintainability for performance when necessary, but I don't yet see how it's necessary here. Can you help me understand why serialization speed is critical enough to justify this pattern.

this metadata gets added to grain state of every grain, and is read/written from/to storage every time the grain state is read or written. If the grain state is already a large complicated object, perhaps it does not matter so much. But for grains with simple small state, serializing and deserializing an entire dictionary or hashset every time is a lot of unnecessary overhead and increases memory pressure. Also, it means you can't easily look at the content of that field in storage.

Understood. Thanks for the clarification.
If I understand you're goals, you want to avoid the cost in cpu and space of serializing a dictionary, and you want the contents to be in a human readable format when written to storage. Is this correct?
If so, I suggest two properties. One for programmatic use, one for storage. Something like:

[NonSerialized] public HashSet<string> Flag { get;set;} public string FlagsString { get { return Join(Flags, ","); } set { Flags = new HashSet<string>(Split(value, ",")); } }

The code uses the Flags fields, but when serialized the FlagsString property is used, converting Flags to/from a user friendly string.

An additional consideration for this type of string manipulation, in both existing PR and my suggestion, is the case where the delimiter is used in the string. If it is possible (which I suspect that it is) for the serviceId or the notification origin to contain the delimiter, the logic will break. If we are going to use a delimiter we need to add logic to address this case.

I don't think it is worth constructing a separate HashSet - reading the original string directly is faster and does not allocate any memory. Note that this field is accessed only once per time it is serialized, so the speed of HashSet does not buy us anything.

If you are seriously concerned about code maintainability, I can encapsulate the operations in a static class and add a unit test.

As for commas: yes, I thought of that. ClusterIds must not contain commas. The code in GlobalConfiguration.cs enforces this when parsing configurations.

I noticed the documentation does not state this clearly, I will change that.

Ah. So origin is also a cluster id. KK. Still a whole (programmatic config), but it's very small. Probably safe to ignore.

As for hashset, the hash set was not for speed of access as much as ability to check containment. You'd mentioned using a dictionary, but if containment check is all that is needed, hashset is simpler. I don't understand the single access per serialize comment as it's accessed at least twice in the write and once per notification.

I'm not going to push on this too hard, but in general, yes, code maintainability is very important, and writing custom logic to encode multiple fields into a string is not preferable to using a container. Having said that, the complexity is mostly isolated behind the Toggle and Contains calls, so potential harm is limited.

Just to clarify the perf argument... The two things that happen frequently: are
a) toggle bit then, store and send
b) receive in notification

And here is the corresponding work required for each implementation choice:

if using HashSet:
a) HashSet.lookup, HashSet.update, convert to string
b) construct HashSet from string

if using only string:
a) indexof, update string
b) nothing

I think the latter is quite a bit faster... string operations are close to the metal, and the strings are not large in this case. HashSet puts more pressure on GC.

This is tangential, but https://youtu.be/sWgDk-o-6ZE?t=1178. In other words, strings likely are closer to metal than HashSets, but I wouldn't say they are close. Which is the reason I wondered about the layout and if it could be separated cleanly (and later maybe refactored by someone). In the linked snippet (guru) Sean explains how O(n*lg(n)) algorithm are likely more impated by layout than operations. Then at https://youtu.be/sWgDk-o-6ZE?t=1716 there's an example. The deeper point is that in larger piece of software keeping data in cache is important for performance performance (throughput and latency). From personal experience I believe this to be true, but maybe I just wanted to share this video as it's so good. :)

veikkoeeva · 2016-11-16T17:24:13Z

src/OrleansRuntime/LogViews/NotificationTracker.cs

+            sendworkers = new Dictionary<string, NotificationWorker>();
+            this.maxnotificationbatchsize = maxnotificationbatchsize;
+
+            foreach (var x in configuration.Clusters)


A matter of style this brace thing. :)

ok, I will add them.

veikkoeeva · 2016-11-16T17:25:24Z

src/OrleansRuntime/LogViews/NotificationTracker.cs

+        }
+
+        /// <summary>
+        /// last observed exception, or null if last notification attempts were successful for all clusters


OK. Rationale or pointers to somewhere feels like would be nice here. Especially for those who come later and if they need to know this when working on other storage backends.

The fact that we are saving the last exception here (for users to look at if they want) would need to be documented at the level of the grain API (QueuedGrain, LogViewGrain, JournaledGrain).

However, I am actually going to change this after discussion with @creyke: rather than saving the exception for inspection, I will add a virtual method that users can override to get notified when the exception happens. They can then decide to save it themselves if they want, or do anything else (custom grain health monitoring stuff, from example).

veikkoeeva · 2016-11-16T17:36:52Z

src/OrleansRuntime/LogViews/NotificationTracker.cs

+        }
+
+
+        public enum NotificationQueueState : byte


This looks like would ask for justification to changing the storage type. Layout?

We had a longish time ago discussion in other PR where some evidence indicated changing from int to byte was worse for performance (maybe @jason-bragg remembers better the exact issue).

Naively I am assuming byte is better because it uses less storage. If anyone has any evidence to the contrary, let me know.

I did not actually know about using byte for enums until @jdom asked me to add it to one of the enums in one of my other PRs.

I do not recall any perf issues with using byte enums, though it would not surprise me.

The concern was at #400 (comment). Byte enums look like causing boxing whereas int enums won't. I might misread, so better read. :) It does affect the storage layout though, if that matters.

Ok, I will remove it. It is not important in any way.

Thanks for the link, it's good be reminded that C# sometimes does boxing even if there is no good reason for it.

veikkoeeva · 2016-11-16T17:43:04Z

src/OrleansRuntime/LogViews/PrimaryBasedLogViewAdaptor.cs

+            need_initial_read = true;
+            // kick off notification for initial read cycle with a bit of delay
+            // so that we don't do this several times if user does strong sync
+            await Task.Delay(10);


Maybe a matter of preference and especially now that this is an "arbitrary technical delay", but using explicit TimeSpan would look more robust.

veikkoeeva · 2016-11-16T17:44:38Z

src/OrleansRuntime/LogViews/PrimaryBasedLogViewAdaptor.cs

+            // kick off notification for initial read cycle with a bit of delay
+            // so that we don't do this several times if user does strong sync
+            await Task.Delay(10);
+            Services.Verbose2("Notify (initial read)");


Should this indicate rather this is nameof(KickOffInitialRead) just about to start an initial read worker.Notify?

veikkoeeva · 2016-11-16T17:48:39Z

test/TesterInternal/GeoClusterTests/LogViewProviderTestBattery.cs

+            public void SetALocal(string grainclass, int i, int a)
+            {
+                var grainRef = GrainClient.GrainFactory.GetGrain<ISimpleLogViewGrain>(i, grainclass);
+                grainRef.SetALocal(a).Wait();


.GetAwaiter().GetResult() would be better and set a good tone in the code, see at aspnet/Security#59. See also in other functions. Could these functions be asynchronous?

Yes, that makes sense.

The functions are synchronous because they are called from a different AppDomain (the class derives from MarshalByRefObject). Don't know if there is another way to do it.

veikkoeeva · 2016-11-16T17:49:56Z

test/TesterInternal/GeoClusterTests/LogViewProviderTestBattery.cs

+                }
+                // check local stability
+                for (int i = 0; i < numclusters; i++)
+                    AssertEqual(333, Client[i].GetALocal(grainClass, x), grainidentity);


Coud a const int Xyz = 333 introduced here to give a name to the constant?

sure, why not.

veikkoeeva · 2016-11-16T17:56:45Z

test/TesterInternal/GeoClusterTests/LogViewProviderTestBattery.cs

+                    t.Add(Task.Run(() =>
+                    {
+                        while (Client[c].GetALocal(grainClass, x) != 1)
+                            System.Threading.Thread.Sleep(100);


Maybe this too could be async?

The clients are in a different AppDomain so they are called synchronously.

But I suppose the sleep can be replaced with Task.Delay. I will do that.

veikkoeeva · 2016-11-16T17:57:27Z

test/TesterInternal/GeoClusterTests/LogViewProviderTestBattery.cs

+            };
+
+            Func<int, Task> checker7 = (int variation) => Task.Run(() =>
+            {


Maybe this too could be async and use await Task.Delay?

veikkoeeva

Idle commenting while kids eat evening snack.

jason-bragg · 2016-11-16T22:52:24Z

src/OrleansRuntime/LogViews/StorageProviderLogViewAdaptor.cs

+
+                Services.Verbose("write apparently failed {0}", nextglobalstate);
+
+                while (true) // be stubborn until we can read what is there


Suggest using AsyncExecutorWithRetries rather than writing custom backoff logic.

I don't think AsyncExecutorWithRetries is general enough... there are nested loops here, the control flow is very subtle.

I have refactored the retry logic - still not using AsyncExecutor, but the code is now more readable because the retry delay is tracked in a separate class (ConnectionIssue).

jason-bragg · 2016-11-16T23:26:36Z

src/Orleans/Async/BatchWorker.cs

+            Task<Task> waitForTaskTask = null;
+            Task waitForTask = null;
+
+            lock (this)


please avoid lock 'this', 'this' is public so you can't protect it from other locks. No good way to be confident of avoiding deadlocks.

yeah, I don't usually do this. But here, the lock is actually exposed on purpose because code that is enqueueing work uses it.

Still, I don't like it much myself. It's probably a better idea for code maintenance to use a protected member, so only subclasses get access to the lock. I will do that.

…specific and not always accurate (depending on the consistency provider)

jason-bragg

Some comments on naming and organization, but in general, it looks good.
Still need to walk through some details a little more, so will do another pass, but wanted to get these comments out for consideration.

jason-bragg · 2016-12-10T00:08:45Z

src/Orleans/LogConsistency/ILogConsistencyProvider.cs

+    public interface ILogViewAdaptorFactory  
+    {
+        /// <summary> Returns true if a storage provider is required for constructing adaptors. </summary>
+        bool UsesStorageProvider { get; }


This is currently true in all cases. Can you elaborate on when it wouldn't be?

Orleans.EventSourcing.CustomStorage.LogConsistencyProvider returns false, because it does not use a storage provider for interfacing with storage (rather it uses user-supplied read and update functions).

jason-bragg · 2016-12-10T00:43:29Z

src/Orleans/LogConsistency/IProtocolServices.cs

+    /// Functionality for use by log view adaptors that use custom consistency or replication protocols.
+    /// Abstracts communication between replicas of the log-consistent grain in different clusters.
+    /// </summary>
+    public interface IProtocolServices


I understand this interface is scoped by the namespace, but I suspect the naming is a bit too generic. Maybe IMultiClusterConsistencyService? IConsistencyProtocol? ...

Yes, that makes sense. I will go with ILogConsistencyProtocolServices, since it makes it very clear that this is related to the LogConsistencyProvider and the LogConsistentGrain classes.

jason-bragg · 2016-12-10T02:13:21Z

src/Orleans/MultiCluster/IProtocolParticipant.cs

+    /// <summary>
+    /// Grain interface for grains that participate in multi-cluster-protocols.
+    /// </summary>
+    public interface IProtocolParticipant  : IGrain  


As IProtocalService, name is a bit generic.

jason-bragg · 2016-12-10T02:19:54Z

src/OrleansEventSourcing/Common/StringEncodedWriteVector.cs

+
+namespace Orleans.EventSourcing.Common
+{
+    public static class StringEncodedWriteVector


I think it's quite possible that someone who implements a LogConsistencyProvider in a different dll might want to reuse some of the classes inside OrleansEventSourcing/Common, including this one.

… behaves deterministically instead of starting after timed delay

… remote instances even though there is just a single global instance

ReubenBond · 2016-12-14T19:42:05Z

src/OrleansRuntime/LogConsistency/ProtocolServices.cs

+
+        public async Task<IProtocolMessage> SendMessage(IProtocolMessage payload, string clusterId)
+        {
+            var silo = Silo.CurrentSilo;


Could the silo be passed through the constructor instead? Similarly for InsideRuntimeClient.Current below. If not, that's fine, I'll submit a PR for it in the coming weeks.

I started refactoring, but this quickly caused cascading changes that are likely redundant with what you are doing in the other PR, so I will leave it the way it is for now.

ReubenBond · 2016-12-14T19:43:40Z

src/Orleans/GrainDirectory/MultiClusterRegistrationStrategy.cs

+        /// <param name="mcConfig">The multi-cluster configuration</param>
+        /// <param name="myClusterId">The cluster id of this cluster</param>
+        /// <returns></returns>
+        IEnumerable<string> GetRemoteInstances(MultiClusterConfiguration mcConfig, string myClusterId);


I can't find any usages of this.

Intellisense glitch? At first I did not see any either, but now it reports 6 references. For example, in OrleansEventSourcing\Common\PrimaryBasedLogViewAdaptor.

ReubenBond · 2016-12-14T19:58:17Z

src/Orleans/LogConsistency/ILogViewAdaptor.cs

+        ///<para>Waits until all previously submitted entries appear in the confirmed prefix of the log, and forces a refresh of the confirmed prefix.</para>
+        /// </summary>
+        /// <returns></returns>
+        Task SynchronizeNowAsync();


'Now' is superfluous here, right?

Given TryAppend isn't suffixed with Async, should this and the above have the same treatment?

Yes, it is somewhat superfluous ...I included "Now" to clarify that the only reason to call this function is if we want to wait for synchronization NOW - normally, we don't call it and it still happens in the background.

I was worried that someone who is not reading the documentation may see this method in Intellisense and think they have to call it after every change (as they used to with WriteAsync). I think I will add some more description to the XML comment.

BTW, I suppose that according to the concluding remark in #2324 by @sergeybykov, I should remove all the async suffixes from all new method names.

Sounds good to me - I'm fine with SynchonizeNow and removing Async suffix from the methods.

ReubenBond · 2016-12-14T20:05:33Z

src/Orleans/LogConsistency/IProtocolServices.cs

+
+
+        /// <summary>
+        /// The id of this cluster. Returns "I" if no multi-cluster network is present.


Why not null? I would expect this value to equal Silo.ClusterId, which returns null when there is no multi-cluster configured. Similarly, I would expect ActiveClusters to return null or an empty collection.

It's more convenient and consistent for the log-consistency providers if they don't each have to do a null check every time they look at this, and then handle it in some way, but instead all use the same fixed "default" multi-cluster configuration - and this default configuration works exactly as if running on a one-cluster multicluster.

No worries 👍 do you think Silo.ClusterId could be made to return this same default value?

Yes, I think that would make sense. But I guess we would need some other way to check whether there is a MultiClusterOracle present or not.

I have changed the code, Silo.ClusterId now always returns a non-null string (the pseudo-cluster-id if there is no multi-cluster network). To check whether there is a multi-cluster network, there is now a boolean Silo.HasMultiClusterNetwork.

Do we really need both ClusterId and DeploymentId as different values? Maybe we should consolidate? DeploymentId is really just a logical name for the cluster, it's just that it typically maps to the cloud service's deployment id when deploying to it, but it's an awful name when used in a different environment, as it doesn't map to its logical intended purpose

I think DeploymentId has an Azure-specific meaning. So, I don't think they are the same. ClusterId is a logical name for the cluster (e.g., "uswest", "europe", "front1", "front2"). DeploymentId is something like "a74ff75b9bd6424ebbff3bde9643074b" and changes everytime I redeploy the service.

That's exactly why I say it's an awful name, as it implies the wrong meaning. DeploymentId is the cluster name. It's just that when you deploy to azure cloud services (not even other flavors of azure), it picks its DeploymentId as the cluster name by default (and only if unset). Nevertheless when deploying in a different way (on premises, other clouds, other azure services, yams, etc) you must set the DeploymentId to a logical name explicitly, because it is in fact the cluster id.

ReubenBond · 2016-12-14T20:08:48Z

test/TestGrains/LogConsistentGrainVariations.cs

+    {
+
+        // we use fake in-memory state as the storage
+        MyGrainState state;


Nit: the norm in this codebase is to use explicit access modifiers.

ReubenBond · 2016-12-14T20:15:27Z

src/Orleans/SystemTargetInterfaces/IProtocolGateway.cs

+    /// The  protocol gateway is a relay that forwards incoming protocol messages from other clusters
+    /// to the appropriate grain in this cluster.
+    /// </summary>
+    internal interface IProtocolGateway : ISystemTarget


I wonder if restricting this to a SystemTarget puts a bound on throughput.

yes, it probably does limit throughput, at least a little. I don't think it's too bad right now, it did not show up in my measurements as a bottleneck, at least for the configurations I tested. Though it may be worth to optimize this in the future.

I agree, and it's not a blocker.

ReubenBond · 2016-12-14T22:49:24Z

src/OrleansEventSourcing/Common/ConnectionIssues.cs

+
+
+        [ThreadStatic]
+        static Random random;


Would the static SafeRandom class be sufficient instead?

It looks like we already have two SafeRandom classes in our codebase, one static and one not. Unfortunately I don't have access to either one. But the code is quite simple even without using that class.

ReubenBond · 2016-12-15T05:21:48Z

src/OrleansEventSourcing/Common/NotificationTracker.cs

+            /// <summary>
+            /// Queue state
+            /// </summary>
+            public NotificationQueueState QueueState = NotificationQueueState.Empty;


This is marked as public, but I'm not convinced it should be - at least not for writes.

Yes, that's right. I'll make it private.

… edits on comments

ReubenBond · 2016-12-19T20:18:36Z

src/OrleansEventSourcing/Common/PrimaryBasedLogViewAdaptor.cs

+            bool first = true;
+            foreach (var e in logEntries)
+            {
+                SubmitInternal(time, e, pos++, first ? promise : null);


Should the promise be specified on the first or the last entry? I would have assumed the last, so that the caller is notified on successful submission of the entire range of entries.

It does not matter since all entries are submitted at the same time (as a batch).

… remove async suffix

…ter network is present

Conflicts: src/Orleans/Logging/ErrorCodes.cs src/Orleans/Orleans.csproj src/OrleansRuntime/MultiClusterNetwork/MultiClusterOracle.cs src/OrleansRuntime/Silo/Silo.cs test/Tester/Tester.csproj test/TesterInternal/TesterInternal.csproj

attilah · 2017-01-12T00:09:40Z

src/Orleans/LogConsistency/IProtocolServices.cs

+        /// <summary> Output the specified message at <c>Verbose2</c> log level. </summary>
+        void Verbose2(string format, params object[] args);
+        /// <summary> Output the specified message at <c>Verbose3</c> log level. </summary>
+        void Verbose3(string format, params object[] args);


Why are we putting logging specific facades here? It will make it even harder than now to move away from our own logging and it is very likely that when we move to a logging abstraction, Verbose2, 3 and customer log levels will vanish and here it would mean breaking an API contract, which makes the move really hard. We should have a better way getting around this.

The point here was to provide common log formatting for all log consistency providers.

But I agree that listing all these methods in the interface is not clean (from a software maintenance perspective).

I will instead just expose the Logger object, and use a static class to provide the common log formatting. This class would not need to be part of the interface and can be hidden away together with the provider implementations in OrleansEvent Sourcing.dll

Hmm, this did not really work... ErrorCodes are internal to runtime, so I can't format errors from outside the runtime. I don't want to create a dependency from OrleansEventSourcing.dll on OrleansRuntime.dll. And the error reporting should use the same formatting as the logging. So, for now I will keep this as it is mostly, and only remove the verbosity overloads.

…ProviderManager

sergeybykov · 2017-01-12T17:22:09Z

src/Orleans/LogConsistency/ILogConsistencyProtocolServices.cs

+        /// </summary>
+        /// <returns></returns>
+        string MyClusterId { get; }
+


Looks like comments got lost here because of file renaming.
@jdom

Do we really need both ClusterId and DeploymentId as different values? Maybe we should consolidate? DeploymentId is really just a logical name for the cluster, it's just that it typically maps to the cloud service's deployment id when deploying to it, but it's an awful name when used in a different environment, as it doesn't map to its logical intended purpose

@sebastianburckhardt

I think DeploymentId has an Azure-specific meaning. So, I don't think they are the same. ClusterId is a logical name for the cluster (e.g., "uswest", "europe", "front1", "front2"). DeploymentId is something like "a74ff75b9bd6424ebbff3bde9643074b" and changes everytime I redeploy the service.

@dom

That's exactly why I say it's an awful name, as it implies the wrong meaning. DeploymentId is the cluster name. It's just that when you deploy to azure cloud services (not even other flavors of azure), it picks its DeploymentId as the cluster name by default (and only if unset). Nevertheless when deploying in a different way (on premises, other clouds, other azure services, yams, etc) you must set the DeploymentId to a logical name explicitly, because it is in fact the cluster id.

Yes, ClusterId and DeploymentId are really synonyms. Although in the hindsight, ClusterId is probably a better name. But it's too late to change it in our existing APIs. However, if here MyClusterId is contained within the log consistency provider scope, and within the geo-distribution context ClusterId makes more sense in conjunction with the notion of multi-cluster, I don't see a problem using MyClusterId within this context. Unless I missed some other implications.

I am not strongly attached to any particular naming conventions. Perhaps we can use DeploymentId everywhere instead of ClusterId. Some questions though:

Is DeploymentId always guaranteed to have non-empty, non-null value? This is something I currently rely on for ClusterId.

For my multi-cluster Azure services samples, things get a bit confusing because the Orleans DeploymentId (e.g. uswest, europewest), as configured, would now be something different from the Azure Services Deployment Id (e.g. "a74ff75b9bd6424ebbff3bde9643074b", as in the management portal).

DeploymentId is indeed always non-null. I completely agree that ClusterId is a better name. I did not mean that we should consolidate the two in this PR, we should discuss in its own issue.
Perhaps since ClusterId is a better name, we can get rid of DeploymentId in the much bigger breaking change release (2.0), since config will probably be changed too.

Is DeploymentId always guaranteed to have non-empty, non-null value?

Yes. It is required for cluster membership to work.

For my multi-cluster Azure services samples, things get a bit confusing because the Orleans DeploymentId (e.g. uswest, europewest), as configured, would now be something different from the Azure Services Deployment Id (e.g. "a74ff75b9bd6424ebbff3bde9643074b", as in the management portal).

Here I'm confused. In Azure PaaS DeploymentId is always a GUID, isn't it?

On MSDN:
public static string DeploymentId { get; }

"Deployment IDs are opaque strings and should not be parsed or otherwise inspected because their format might change."

That's fine, we use DeploymentId in AzureUtils by default, and that simplifies connecting silos and clients. We don't try to parse it. And app code can always override DeploymentId programmatically.

For now, the only think I will change is that I use the DeploymentId as a default ClusterId if ClusterId is not specified.

For now, the only think I will change is that I use the DeploymentId as a default ClusterId if ClusterId is not specified.

Why not take it always? Why would somebody specify ClusterId separately?

I agree. I am just following @jdom's advice:

we should discuss in its own issue.

The question of how/when to consolidate ClusterId and DeploymentId should not block this PR. We already have both of them in master, ClusterId was already introduced with #1109.

…nstead of a made-up identifier

sebastianburckhardt · 2017-01-12T23:51:01Z

I think I have addressed all comments at this point.

sergeybykov · 2017-01-13T00:19:33Z

I kicked off functional tests and a load test.

If you have concerns about merging this after tests pass, please raise them soon.

galvesribeiro · 2017-01-14T00:49:18Z

🎆 weeee!

Thanks @sebastianburckhardt ! 😃

dnfclas added the cla-already-signed label Jun 17, 2016

sebastianburckhardt added this to the Multi-Cluster milestone Jun 28, 2016

sebastianburckhardt mentioned this pull request Jun 28, 2016

Geo-Distribution Project #948

Closed

jthelin added the enhancement label Jul 10, 2016

sergeybykov modified the milestones: 1.4.0, Multi-Cluster Sep 23, 2016

sebastianburckhardt force-pushed the geo-logviewproviders branch from 819b602 to 4ba9b02 Compare September 28, 2016 16:29

sebastianburckhardt added a commit to sebastianburckhardt/orleans that referenced this pull request Sep 28, 2016

squashed commit for dotnet#1854

4ba9b02

xiazen reviewed Oct 5, 2016

View reviewed changes

xiazen reviewed Oct 6, 2016

View reviewed changes

xiazen reviewed Oct 7, 2016

View reviewed changes

src/OrleansProviders/LogViews/SharedStorageProvider.cs

public Task Close()

{

return TaskDone.Done;

}

Copy link

Contributor

xiazen Oct 7, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

xiazen reviewed Oct 10, 2016

View reviewed changes

sebastianburckhardt added the Status: work-in-progress label Oct 14, 2016

veikkoeeva reviewed Nov 16, 2016

View reviewed changes

jason-bragg reviewed Nov 16, 2016

View reviewed changes

rename FetchAllEventsNow() to RefreshNow() because the former is too …

7c9008a

…specific and not always accurate (depending on the consistency provider)

jason-bragg reviewed Dec 10, 2016

View reviewed changes

sebastianburckhardt added 2 commits December 13, 2016 09:32

break activation of log view adaptor into two steps - so initial load…

08160dc

… behaves deterministically instead of starting after timed delay

add general check to catch errors where protocol messages are sent to…

7c14e2e

… remote instances even though there is just a single global instance

ReubenBond reviewed Dec 14, 2016

View reviewed changes

ReubenBond reviewed Dec 15, 2016

View reviewed changes

some renaming (LogConsistencyProtocol instead of Protocol), and a few…

5e3f603

… edits on comments

ReubenBond reviewed Dec 19, 2016

View reviewed changes

fix some access modifiers, remove an unused field, add some comments,…

9008b1b

… remove async suffix

sergeybykov mentioned this pull request Jan 7, 2017

Define a programming model/API for Event Sourcing #343

Closed

sebastianburckhardt added 2 commits January 11, 2017 14:55

Change Silo.ClusterId to return Pseudo-ID, not null, if no multi-clus…

711a9c6

…ter network is present

attilah reviewed Jan 12, 2017

View reviewed changes

fix confllict: pass provider runtime in constructor of LogConsistency…

8d4cbdc

…ProviderManager

sergeybykov reviewed Jan 12, 2017

View reviewed changes

sebastianburckhardt added 2 commits January 12, 2017 10:44

remove logging severity overloads from ILogConsistencyProtocolServices

eca92aa

use DeploymentId when creating a pseudo-multicluster-configuration, i…

641f53d

…nstead of a made-up identifier

sergeybykov merged commit 0f4ba04 into dotnet:master Jan 14, 2017

galvesribeiro mentioned this pull request Jan 16, 2017

Provide docs for Log Consistency Providers and split test #2598

Closed

sebastianburckhardt deleted the geo-logviewproviders branch January 17, 2017 18:23

github-actions bot locked and limited conversation to collaborators Dec 12, 2023


		public Logger Log { get; private set; }

		private static int counter;


		Services.Verbose("write apparently failed {0}", nextglobalstate);

		while (true) // be stubborn until we can read what is there



		/// <summary>
		/// The id of this cluster. Returns "I" if no multi-cluster network is present.

Log Consistency Providers #1854

Log Consistency Providers #1854

Conversation

sebastianburckhardt commented Jun 17, 2016

xiazen Oct 5, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

xiazen commented Oct 10, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

veikkoeeva Nov 16, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jason-bragg Nov 16, 2016 • edited Loading

Choose a reason for hiding this comment

jason-bragg Nov 16, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

veikkoeeva Nov 16, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

veikkoeeva left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jason-bragg left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

xiazen Oct 5, 2016 •

edited

Loading

xiazen commented Oct 10, 2016 •

edited

Loading

veikkoeeva Nov 16, 2016 •

edited

Loading

jason-bragg Nov 16, 2016 •

edited

Loading

jason-bragg Nov 16, 2016 •

edited

Loading

veikkoeeva Nov 16, 2016 •

edited

Loading