Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Advice for adding an heir #11

Closed
edescourtis opened this issue Nov 9, 2018 · 16 comments
Closed

Advice for adding an heir #11

edescourtis opened this issue Nov 9, 2018 · 16 comments

Comments

@edescourtis
Copy link
Contributor

edescourtis commented Nov 9, 2018

Hi,

I want to add an heir to my :pobox so that another process in case of failure can inherit it. I need some advice about how to implement various aspects of this. I would also like to get your thoughts or suggestions before I start implementing it.

The first issue I see is that the start_link argument list is getting long. For example, could I provide a start_link(Opts) and start_link(Name, Opts) that would have a map of options. Do we need to support versions of Erlang older than 17 and is that a good way of dealing with this?

What about the format of the message sent when the heir is to take over, what fields would be included?

Any advice, suggestions, concerns would be much appreciated.

@edescourtis
Copy link
Contributor Author

edescourtis commented Nov 9, 2018

Adding comments from README to discussion:

This is more a wishlist than a roadmap, in no particular order:

  • Implement give_away and/or heir functionality to PO Boxes to make them behave like ETS tables, or possibly just implementing controlling_process to make them behave more like ports. Right now the semantics are those of a mailbox, provided nobody unlinks the POBox from its owner process.
  • Provide default filter functions in a new module

Should we behave like ports or more like an ETS table? I think the ETS heir/give_away seems more appropriate since I want this for cases where my process fails.

@ferd
Copy link
Owner

ferd commented Nov 10, 2018

A map argument for the header would be neat; we could conditionally export it based on maps availability in the language as long as use maps module functions to access them rather than the literal syntax.

As for the message format, I would look at what ETS does. For the give_away/3 function it essentially sends {'ETS-TRANSFER',Tab,FromPid,GiftData} as a message, so we could do {pobox_transfer, Tab, FromPid, Data}. For the heir functionality, the same message format is used.

Do note the following in the docs:

The process Pid must be alive, local, and not already the owner of the table. The calling process must be the table owner.

Notice that this function does not affect option heir of the table. A table owner can, for example, set heir to itself, give the table away, and then get it back if the receiver terminates.

I think we wouldn't have to enforce the locality of the pid (since we do it by hand and it's not a BIF/NIF limitation, but we should respect the liveliness and the owner. It would also make sense that if we allow both a give_away and a heir, we respect the same policies as ETS.

@edescourtis
Copy link
Contributor Author

edescourtis commented Nov 13, 2018

Did you know that :gen_statem captures the {'EXIT', Parent... message and calls terminate directly?

See here

Well I guess I will do some really ugly stuff with enter_loop in terminate unless you have a better idea.

@ferd
Copy link
Owner

ferd commented Nov 14, 2018

ah yeah all OTP processes capture it directly. They can still call terminate if trap_exit is set, but that won't cut it for this since it would drop all active traffic. Changing the values in the pdict might work, but that's overly hackish, and might only make sense with a synchronous give_away call.

Otherwise setting a heir might work better if the POBox unlinks itself from its parent and replaces it with a monitor, which would then allow to handle things right (no auto-triggered EXIT, but the POBox user won't die if the POBox dies, which isn't ideal). We'd also need to link ourself to the new heir once the parent has triggered a 'DOWN' message.

@edescourtis
Copy link
Contributor Author

I have something like this that seems to work:

terminate(_Reason, _StateName, S=#state{owner = Owner, heir = Heir, heir_data = HeirData, name = Name}) ->
    case send_ownership_transfer(Owner, Heir, HeirData) of
      {ok, HeirPid} ->
        erlang:unlink(Owner),
        erlang:link(HeirPid),
        case Name of
          undefined ->
            put('$ancestors', [HeirPid]),
            gen_statem:enter_loop(?MODULE, [], passive, S#state{owner= HeirPid, heir=none, heir_data=none});
          DefinedName ->
            put('$ancestors', [HeirPid]),
            gen_statem:enter_loop(?MODULE, [], passive, S#state{owner= HeirPid, heir=none, heir_data=none}, DefinedName, [])
        end;
      {error, no_process} ->
           ok
    end.

I can do a monitor instead if you really think this is too hacky. The benefit of course is that if we crash the link will function as expected.

@edescourtis
Copy link
Contributor Author

edescourtis commented Nov 14, 2018

See #13 for the current status of implementation.

@edescourtis
Copy link
Contributor Author

edescourtis commented Nov 14, 2018

Heir functionality:

eric@eric-desktop:~/dev/pobox$ rebar3 shell
===> Verifying dependencies...
===> Compiling pobox
Erlang/OTP 20 [erts-9.3.3.5] [source] [64-bit] [smp:16:16] [ds:16:16:10] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V9.3.3.5  (abort with ^G)
1> H = erlang:spawn(fun F () -> receive X -> io:format("~p~n", [X]) end, F() end).
<0.108.0>
2> {ok, Pid} = pobox:start_link(#{heir => H, heir_data => foofoovalve}).
{ok,<0.110.0>}
3> sys:get_state(Pid).
{notify,{state,{buf,queue,100,0,0,{[],[]}},
               <0.105.0>,undefined,undefined,<0.108.0>,foofoovalve,
               undefined}}
4> 1 = 2.
{pobox_transfer,<0.110.0>,<0.105.0>,foofoovalve}
** exception error: no match of right hand side value 2
5> sys:get_state(Pid).
{passive,{state,{buf,queue,100,0,0,{[],[]}},
                <0.108.0>,undefined,undefined,none,none,undefined}}
6> 

Give away functionality:

eric@eric-desktop:~/dev/pobox$ rebar3 shell
===> Verifying dependencies...
===> Compiling pobox
Erlang/OTP 20 [erts-9.3.3.5] [source] [64-bit] [smp:16:16] [ds:16:16:10] [async-threads:0] [hipe] [kernel-poll:false]

Eshell V9.3.3.5  (abort with ^G)
1> H = erlang:spawn(fun F () -> receive X -> io:format("~p~n", [X]) end, F() end).
<0.103.0>
2> {ok, Pid} = pobox:start_link(#{}).
{ok,<0.105.0>}
3> pobox:give_away(Pid, H, foofus, 5000).
{pobox_transfer,<0.105.0>,<0.100.0>,foofus}
true
4> sys:get_state(Pid).
{passive,{state,{buf,queue,100,0,0,{[],[]}},
                <0.103.0>,undefined,undefined,none,none,undefined}}
5> 

@edescourtis
Copy link
Contributor Author

Today I will be looking for ways around map syntax and seeing what modifications would be required for monitor instead of link without changing the behaviour of pobox. Possibility we need to be more careful about running external code in pobox since we are essentially an error kernel the second we switch to monitor.

@ferd
Copy link
Owner

ferd commented Nov 15, 2018

Sounds good. I'll try to set time aside to review the PR.

@edescourtis
Copy link
Contributor Author

@ferd I just pushed my changes for today. It's not completely done yet but I would say it is most of the way there.

@edescourtis
Copy link
Contributor Author

The main issue to resolve is do we transfer ownership when the owner terminates normally?

@ferd
Copy link
Owner

ferd commented Nov 16, 2018

A heir would generally mean that no matter which reason killed the owner, it gets transferred anyway. Any exit signal from the owner should count the same.

@edescourtis
Copy link
Contributor Author

edescourtis commented Nov 16, 2018

I guess we have it set up that way already.

How can we be convincing about having covered all of the cases? Do you have suggestions on property based tests we could run on this?

I am still new the property based testing. We could compare bahviour with ETS.

@ferd
Copy link
Owner

ferd commented Nov 16, 2018

Properties are not necessarily critical here. The thing that I think would make sense to test the most is ensuring that transferring the ownership always resets to a passive mode.

A good property test would probably be one that is stateful or a state-machine property. We should have it simulate all the operations we want:

  • send a message to pobox
  • looking at the state of pobox (passive, notifying, active)
  • look at the owner's mailbox (what is in there?)
  • setting the pobox state (notify vs. passive vs. active)
  • starting a heir and setting it
  • starting a heir and giving the pobox away
  • killing the owner when a heir is present

through each of these we have to define invariants we want to maintain in the model's state: who has seen what messages, what are they, what are the states we can check? The property is going to run the list of 7 commands above in a random order and we will have to be able to predict the outcome of each operation.

The weakness of that property would be in the case of an active pobox where we send a message and instantly die; it is possible that pobox will forward the message to a process that already died before noticing about it (especially if it is severely backlogged) and so this will ruin model predictability. As such, it might make sense to only test the notify and passive modes with such a property.

Do note that this property would ignore everything related to message drops and metrics; it is purely about validating behaviour around transfers. Dropping and metrics could be tested in a distinct property.

If you feel like trying your hand at that, let me know whatever kind of help you might need, but I wouldn't block the PR on having this feature around unless you wouldn't feel safe about it without the tests.

@edescourtis
Copy link
Contributor Author

Thanks for the input. I will work on it. I am going to be deploying this into production in the not too distant future which is why I need to test this as much as possible.

By the way, check out the travis build it looks like we don't support anything older than Erlang 19 because of gen_statem. I might remove prop list support; also I might remove checks for map support.

@edescourtis
Copy link
Contributor Author

edescourtis commented Nov 22, 2018

I have addressed the remaining issues.​ Feel free to merge the PR when you are ready.

@ferd ferd closed this as completed Dec 5, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants