Unit testing of grains #48

Closed
sergeybykov opened this Issue Jan 29, 2015 · 24 comments

Projects

None yet

10 participants

@sergeybykov
Member

@yevhen and @kspeakman expressed some ideas for a better approach to unit testing grains. Let's discuss the ideas here.

@jthelin jthelin added the help wanted label Jan 29, 2015
@kspeakman

Gist of our conversation is that grains can't easily be tested in isolation. Good reference material here.

@sergeybykov
Member

From some of our customers' experience, the traditional approach to unit testing grain classes isn't very useful and may be even misleading because it's not easy to mimic the execution environment of a silo outside of the silo environment: turn-based execution of grain logic and external calls, timer and reminders, etc. The approach that we've been using and suggesting is to run such tests within a silo (shown in the Tester project) solves these problems, but obviously is not for everyone's taste of unit testing. It would be great to have a robust alternative.

@kspeakman

@sergeybykov I don't know why they would need to mimic the execution environment for unit tests. For integration tests with timers etc., sure. I noticed most of the examples have dependencies which are (effectively) tightly coupled; e.g. grains fetching and calling other grains. If that is typical usage, then I can see how it would be hard to test outside the silo. However, externalizing dependencies should help alleviate that. I assume it's possible (in the Orleans runtime) to pass dependencies (e.g. other grains) as method parameters? If so, different practices (reflected in the examples) could make the testing story more palatable. Testing is already not done enough, and adding knowledge dependencies (e.g. Orleans runtime, prefix injection) to it isn't going to help.

@sergeybykov
Member

@kspeakman Don't get me wrong - we are all for having an alternative approach. It would be great to have a way to easily externalize dependencies and test grain logic in isolation. And yes, grain references are first class entities - can be passed as method arguments, saved to storage, etc.

I'm only sharing our prior experience with people trying to test their grain implementation as just ordinary classes and getting surprised that the behavior in production was different. In one example a fan-out call to multiple targets was being executed sequentially in a deterministic order in their test environment, while in production (in a silo) those calls were being executed in a non-deterministic order, because of the different task scheduler. So I think the new unit test infrastructure needs to be aware of such subtleties.

@kspeakman

Hmm, I wouldn't naturally expect a fan-out call to execute in any particular order (even on the default scheduler, given enough tasks). But I don't know their code. I'm sure there are reasons. And integration testing is definitely desirable for a number of circumstances.

I probably don't fully grasp the impact of some of the other things you mentioned and why integration tests would be needed for them. I will just have to play with it so I can reason about it more intelligently. My perspective is from smaller but logically similar systems. I never got around to doing a lot of the cool robustness things you guys have done.

@jkonecki
Contributor

Although I started with the view that grains should be testable in isolation outside of the grain, Hoop's comment during the second Virtual Meetup made me reconsider my position. He stated that they treat Orleans framework the same way as .Net framework - it's the runtime for your code and there is little point in trying to cut it out from tests.

If silo creation doesn't hugely impact the running time of the test than I would be ok with that.

@kspeakman

@jkonecki Well, that's one way to look at it. But you also have to look at Hoop's perspective. A 3 month project is basically disposable code which could be rewritten next cycle if it didn't work out well enough. What kind of testing emphasis would you put on potentially disposable code?

@jkonecki
Contributor

Taking into account the exposure I wouldn't call Halo services a disposable
code... ;-)

I'm still sitting on the fence whether to treat Orleans as an
implementation detail for the domain or as a first class concept in the
domain.

IMHO the question boils down to if one wants to be able to replace Orleans
with something else and keep the domain intact.

If there is a way to abstract actor framework in order to achieve it, I'm
all for it. If it would require a considerable amount of work, than maybe
the cost/benefit ratio doesn't justify it.

On Sat, 31 Jan 2015 17:54 Kasey Speakman notifications@github.com wrote:

@jkonecki https://github.com/jkonecki Well, that's one way to look at
it. But you also have to look at Hoop's perspective. A 3 month project is
basically disposable code which could be rewritten next cycle if it didn't
work out well enough. What kind of testing emphasis would you put on
potentially disposable code?


Reply to this email directly or view it on GitHub
#48 (comment).

@jthelin jthelin added the enhancement label Feb 2, 2015
@MovGP0
MovGP0 commented Feb 14, 2015

❗️ Having the tests run within a Silo is not Unit Testing, but Integration Testing. That's a big difference!

In Unit Testing, only the user code of the Grain should be tested. The infrastructure in GrainBase should not be required in order to test most of the behavior. The only exception is when the user code overrides or calls methods of the base class - in that case the GrainBase needs a setup with test objects. External infrastructure may not be used at all in Unit Testing.

To enable Unit Testing, the dependencies on all other classes with business logic needs to be resolved using Dependency Injection.

see also: Issue #39 - Support for dependency injection.

@sergeybykov
Member

This terminology debate has happened more than once in the past. :-) Whatever we call it, unit or integration testing, this - running tests within silos - has been the primary means of testing the Orleans codebase so far. It seems to me that DI would be useful even in that setting. The fact that DI will help folks that wish so to test their grain code in isolation is only goodness. We are all for it. Just please excuse us upfront for referring to our existing tests as "unit tests" because that's how we used to call them for so long. :-)

@jkonecki
Contributor

I am of the opinion that in an actor system an actor is a unit. Therefore
the test of a single actor / grain is the smallest unit test one can write.

Since each actor is an independent entity that can communicate with other
actors only by sending messages which are processed asynchronously, a test
that spans multiple actors / grains would be an integration test.

Integration tests that obviously require communication between actors could
execute inside a silo since we want to test the real world communication
between actors provided by the framework.

Unit tests of a single actor should be possible without the silo provided
that we can instanciate the actor outside of the silo and be able to mock
any framework specific dependencies, ie. sending messages to other actors,
registering reminders, etc.

Do you think that integration tests that span multiple actors should be
possible to execute outside of the silo?

On Tue, 17 Feb 2015 22:41 Yevhen Bobrov notifications@github.com wrote:

Unit Test, System Test, Red Test, Green Test
http://www.gilzilberfeld.com/2015/02/unit-test-system-test-red-test-green.html


Reply to this email directly or view it on GitHub
#48 (comment).

@MovGP0
MovGP0 commented Feb 18, 2015

I agree that a single actor is a unit. As soon as other objects, besides test objects (Dummy, Mock, Fake, Stub, Shim, Spy), get involved, like other actors or infrastructure, it is an Integration Test.

I think the terminology confusion arises from the fact that we use Unit Testing Frameworks to write Integration Tests.

However, the smallest testable thing is not necessarily an actor. An unit test might only test a single method, delegate, function, property, value, etc., which is not necessarily part of an actor. Especially in functional languages like F#.

@mistakenot

+1 "If silo creation doesn't hugely impact the running time of the test than I would be ok with that."

Could there be anyway of having a "lite" start up mode that reduces time by cutting out services that aren't required for a testing system with only a single silo instance and no cold persistence?

@gabikliot
Contributor

Yes, a single silo with membership grain for liveness instead of Azure Table for liveness and Reminder grain for reminders instead of Azure Table for reminders and no storage providers or no stream providers loaded should be a pretty minimal setup. If you are interested, I can document the exact config file.

You can start with this minimal setting and if that is still not good enough (we do think it is pretty fast), can look at how to optimize it further on.

@mistakenot

I would be very interested in having a look at a config file for that, thank you. It would probably be helpful if I had more of an idea where Orleans spent most of its time when booting up, but at the moment the default out-of-the-box config file usually takes about 20-30s to boot, which is a bit of a drag for rapid iterations of tests.

@gabikliot
Contributor

@mistakenot,here is the config file: https://github.com/dotnet/orleans/blob/master/src/SDK/OrleansConfiguration.xml
It's the same file that is used in samples that use the local silo SDK.
It is also documented here: http://dotnet.github.io/orleans/Orleans-Configuration-Guide/Typical-Configurations as part of Local Development.

The things to watch:

  1. Don't deploy grain dlls that you don't use. Make sure you only have the dlls you need (grains and others) in the folder from which you are starting the silo (in SDK it is Microsoft Project Orleans SDK v1.0\SDK\LocalSilo folder and Microsoft Project Orleans SDK v1.0\SDK\LocalSilo\Applications folder).
  2. if you are running silos and client all in one process in different app domains, like done in Unit Test host and in Hello World sample, again make sure only the right dlls are present in the folder. Since we reflect upon all dlls upon silo startup to find grains, having redundant dlls may take extra time.

If you find any bottlenecks or things that could be improved in the silo startup sequence, it would be greatly appreciated. I think our current experience that starting a silo takes about 10-15 seconds.

@veikkoeeva
Contributor

@mistakenot I made a few quick profiler runs and for my setup about 60% of the time is spent in SiloHost.StartOrleansSilo() and a ca. 26% on loading new AppDomain instances. I didn't profile rigorously, but it should be indicative. The code I profiled build a primary silo, two secondary ones, calls a grain and then tears down the silos. My code is based on that in the testing project and I'm loading silos with AppDomain.CreateDomain (I like to additional control), but if you any of the wrappers, they are thin ones and time spent them is neglible.

@sergeybykov, @gabikliot and others correct me, but here's what I think.

For a reason or another, I didn't got symbol data from Orleans, but here's (I think) what happens when the silos load. From the aforementioned StartOrleansilo() call the control flow is quickly transferred to Silo.DoStart(). There might be some benefit to be had for running the start methods in parallel, as indicated in the comments (is there a typo regarding the number of methods?), but the real benefit comes from rethinking from the line where there's typeManager.Start(). There is this var loader = new SiloAssemblyLoader(); and then on the various methods defined in SiloAssemblyLoader are used.

A couple of ideas:

  1. Allow the type information be loaded to the silo explicitly.
  2. Make the cache public and serializable (i.e. it could be persistent with a serializer).
  3. Make cache initialization/probing function a public one, or phrased other way, make a probing function signature a public one and provide one, the currently used, as a default.

Points 1. and 2. would make it possible to share cached type info in local test setting and provide it quickly to new silos. The same type data could be used to speed up client side discovery too, I think. Point 3. would allow one to cache the found type info (paths to Assemblies?) and skip the time-consuming detection process. Specifying further, if loading persistent type data failed, one could fall-back to "full scan" as done currently (part of probing function signature?). There are some possibilities in the reflection code itself too (is GAC probed even when there's no key token?).

A plan like this, I think, would involve making TypeManager and SiloAssemblyLoader public and refactoring their public interface accordingly and then allow them to be added through a constructor/property to the appropriate consumer classes.

If this is doable, I believe this would make testing this functionality a bit easier and perhaps would make some way to hotswapping code (which, though, is an altogether different beast).

<edit: Thinking this further, it would make sense to divide the functionality so that it'd be possible to point to the location of assemblies to load with a probing function, load assemblies from an arbitrary location, e.g. file share, blob store etc. via loader function and then cache the results locally. This cache would be the interface used by Orleans and if no caching is present (e.g. the functionality (Func?) that defines how the cache works) they are just always reloaded from the given source.

@gabikliot
Contributor

Hi @veikkoeeva. These are some good ideas. Some things could be done even easier.

  1. Starting multiple silos and client can be done in parallel, on multiple threads. The test just needs to wait for all them to start. Staring silos in parallel will work just out of the box, with zero extra effort. For client, if its configured with a static list of gateways, which is the default in test settings, the last stage of client startup will need to wait for gateway silos to be up first (in Azure client host this is done automatically, but not in test client host). So there will need to be a tiny change in the client startup logic, very minimal. And the majority of the client startup sequence will still go in parallel.
  2. Back to a single silo startup. There are definitely some places in the DoStart that can be parallelized (the ones you pointed out). This is where good profiling will really help, to find out first how much it can buy us.
  3. As you pointed out assembly loading is the biggest time. We can do something relatively easy here: load assemblies in parallel. They will need to synch on the shared data structures, but if indeed this is the majority of the startup time, parallel loading can really speed it up.
  4. Looking into app domain and why it takes 26% like you wrote could be another place.

And I think extensibility of plugging in a new assembly loading logic, in the DI style discussion we had before, is a separate interesting issue, but unrelated to speeding up startup I think.

@veikkoeeva
Contributor

@gabikliot Do you mean not even the primary silo needs not to be started before the secondary ones? I haven't actually tried this. I dispose the secondary silos in parallel and the primary one as the last one, because if I dispose all of them in parallel, several exceptions are thrown, catched, reconnection attempts etc. prolonging the system teardown duration significantly compared to this two-phase teardown of domains.

Do I understand correctly: a minor refactoring change would allow the client gateway started in parallel with starting the primary silo? Does the primary silo still need to be loaded before others? A related note, for clarity here, if the primary needs to be loaded before otehrs, being able to share the already found type information would do wonders. Having the possibility to re-read it from a persistent cache would make it even better, but as you allude, it would require more refactoring and is more on the to-do list.

While poking the reflection assembly (redirection) binding bug (FYI @mistakenot, fixed since long time ago), it mattered in which order the assemblies were handled. It was about the non-redirected assembly loaded before the redirected one, then an exception was thrown, catched, and execution continued which hindered loading redirected assemblies. I may remember this one wrong and the functionality has been refactored, @sergeybykov can weigh in here, I believe. Otherwise you are spot on, parallel assembly handling would shorten the wall clock time significantly. It would do wonders to the development flow.

Actually now that I ponder this here more, one of the stated goals is ease of getting the system running, and maybe there could be a note in the example configurations that one may need to add some firewall configurations and access controls. They are a somewhat of a nuisance in many "enterprise scenarios", I have experienced. Perhaps just something simple as adding

<!-- You may need to execute the following commands to reserve static ports to Orleans and allow
       traffic through the firewall.
-->
<!-- netsh http add urlacl url=http://*:22222/ user=DOMAIN\user -->
<!-- netsh advfirewall firewall add rule name="Orleans CI Test Cluster" dir=in action=allow protocol=TCP localport=22222 -->

What do people think, should I or someone else having a few minutes free time go ahead and these to the configurations?

@veikkoeeva
Contributor

@gabikliot

  1. Looking into app domain and why it takes 26% like you wrote could be another place.

I phrased this one poorly. This includes a call to Silo constructor.

type SiloHost(siloName, siloType, clusterConfig, appDomainSetupProvider) =        

    /// If this host has been disposed or not.
    let mutable disposed = false

    let loadSiloInNewAppDomain (siloName:string) (siloType:Orleans.Runtime.Silo.SiloType) (clusterConfig:ClusterConfiguration) (appDomainSetupProvider:unit -> AppDomainSetup) =
        let appDomain = AppDomain.CreateDomain(siloName, null, appDomainSetupProvider())
        let args:obj[] = [|siloName; siloType; clusterConfig|]        
        let silo = appDomain.CreateInstanceAndUnwrap("OrleansRuntime", typeof<Silo>.FullName, false, BindingFlags.Default, null, args, CultureInfo.CurrentCulture, Array.empty) :?> Silo
        silo, appDomain

    let silo, siloAppDomain = loadSiloInNewAppDomain siloName siloType clusterConfig appDomainSetupProvider
@gabikliot
Contributor

minor refactoring change would allow the client gateway started in parallel with starting the primary silo?

Yes, @veikkoeeva , this is correct.

Does the primary silo still need to be loaded before others?

No, it does not. All silos can start in parallel.
I just prepared a new PR #359 that slightly modifies our unit test host to allow starting silos in parallel. I will see later on if I can also do the "minor refactoring change would allow the client gateway started in parallel".

Just to stress the obvious, if someone ever reads this issue without a context:
Orleans of course does not require any special seed or primary nodes in a real production setting. Primary/Seed node is only used in dev setup, where we emulate Azure storage with a grain. In a real production setup all silos are equal and symmetric, and do not have any single point of failure (as opposite to some other actor model implementations which do require special seed nodes in production). And thus all silos in Orleans can start in parallel, without requiring any special startup order.

@nehmebilal
Contributor

I proposed some changes to the Grain class (#362) that makes it possible to Unit test a grain without starting a Silo. Please let me know what you think.

@gabikliot
Contributor

Parallel silos start in Unit Test was checked in #359. If someone could look at parallel assembly loading that would be great!

@gabikliot gabikliot removed the help wanted label Oct 23, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment