Skip to content

Support Erlang's map type (coming in 17.0) #15

Open
devinus opened this Issue Mar 4, 2014 · 25 comments

7 participants

@devinus
devinus commented Mar 4, 2014

No description provided.

@hdima
Owner
hdima commented Mar 11, 2014

Thanks, I'll investigate it.

@okeuday
okeuday commented May 1, 2014

@hdima I was wondering if you would want to combine efforts for the support of the Erlang binary term format in python and ruby. I have https://github.com/CloudI/CloudI/blob/master/src/api/python/erlang.py and https://github.com/CloudI/CloudI/blob/master/src/api/ruby/erlang.rb which I like as a single module focused on Erlang binary term format encoding and decoding. I need to add MAP_EXT, ATOM_UTF8_EXT, and SMALL_ATOM_UTF8_EXT, but these changes would be better as a common dependency to avoid duplication of effort. I was planning on adding both modules to separate repositories which could help this effort.

@okeuday
okeuday commented May 1, 2014

@hdima I have modifications for MAP_EXT, ATOM_UTF8_EXT, and SMALL_ATOM_UTF8_EXT at https://github.com/okeuday/erlang_rb and https://github.com/okeuday/erlang_py . I know tests are lacking, but it would be nice to consolidate efforts.

@devinus
devinus commented May 1, 2014

@okeuday Those packages on RubyGems and PyPI would go a long way. Simple an easy, which is just what I was looking for.

@zugolosian

+1 to @okeuday term format integration

@vans163
vans163 commented Feb 27, 2016

+1 for this, really needed.

@okeuday
okeuday commented Feb 28, 2016

@devinus https://rubygems.org/gems/erlang_rb/ and https://pypi.python.org/pypi/erlang_py/ have existed for awhile. I have been using the testing from erlport for the various Erlang binary term format implementations which have grown (https://github.com/okeuday/erlang_rb , https://github.com/okeuday/erlang_py , https://github.com/okeuday/erlang_pl , https://github.com/okeuday/erlang_js , https://github.com/okeuday/erlang_php) but the silence on this issue appears to indicate that erlport doesn't want to change its Erlang binary term format source code right now.

@devinus
devinus commented Feb 28, 2016

Thanks @okeuday these are great projects.

@vans163
vans163 commented Feb 28, 2016

@okeuday Yea thanks, your erlang_py worked perfect. No issues.

@beano
beano commented Mar 19, 2016

I've looked into implementing this for Python. The encoding and decoding routines were easy to to implement but the hashing restrictions in Python don't really match up with Erlang.

For obvious reasons, the following constructs are legal Erlang but illegal in Python:

#{[] => []}
#{#{} => #{}} 

To implement this in Python I think we will need to return frozen types for lists and maps/dicts. We can accept mutable constructs for the encoding routines as well as their immutable/frozen versions. This will be an API change. @hdima , @okeuday, @devinus - what do you think?

@okeuday
okeuday commented Mar 19, 2016

@beano For lists, it seems best to use an object due to the possibility of the Erlang list being improper (when doing binary_to_term, e.g., https://github.com/okeuday/erlang_py/blob/5369bf7a0c73705a3c33f92ba60b808fe780c58e/erlang.py#L401-L409). That approach doesn't prevent you from handling a list without an object (when doing term_to_binary, e.g., https://github.com/okeuday/erlang_py/blob/5369bf7a0c73705a3c33f92ba60b808fe780c58e/erlang.py#L575-L576). The main problem should be using a frozendict instead of a dict, which is a good solution. I like the implementation at http://code.activestate.com/recipes/414283-frozen-dictionaries/ when using all the comment modifications, and it would be safe to add to the erlang.py implementation since this frozendict inherits from dict. One problem in the erlang.py code is that the binary() function and the hash() function don't have their values cached, so that might be a consideration. So this information isn't focused on erlport, but same concern. (https://github.com/okeuday/erlang_py/blob/5369bf7a0c73705a3c33f92ba60b808fe780c58e/erlang.py#L468-L473 shows the modification)

@beano
beano commented Mar 20, 2016

@okeuday - thanks for the hints. I've taken them on board and things are progressing pretty well. I still have tests to add write etc. I was going to try to do ruby support but I think my ruby knowledge isn't going to be strong enough. If my changes are accepted would somebody be willing to do the equivalent in ruby?

@okeuday
okeuday commented Mar 20, 2016

@beano As long as a Hash is being used in the Ruby source code for storing the Erlang map there shouldn't be a problem requiring changes (i.e., doesn't seem like Ruby would have a similar problem with any type usage).

@beano
beano commented Apr 3, 2016

@okeuday @hdima - I've implemented this for Python2.7. Doing this correctly is a little involved especially as I've avoided breaking backwards compatibility wherever possible.

The decoding routines will always return immutable types. This prevents users from accidentally modifying the returned structures and breaking hash guarantees. I've not optimized anything at this stage.

Most of the unit tests pass without modification. The single exception was opaque objects - dicts were treated as opaque - we now treat them as maps.

Main change:
beano@3e09843

Change for OpaqueObjects:
beano@652f273

Could you please provide feedback before I do Python3 support?

@scohen
scohen commented Apr 7, 2016

I'm interested in this as well, how can I help?

@beano
beano commented Apr 8, 2016

@scohen - I'm not a rubyist so if you want to do Ruby support that would be great! I looked into it and I think the change is a bit easier than in Python. Ruby's hashes and lists are hashable - but using mutable hash keys has the sorts of odd effects you'd expect. Ruby doesn't appear to protect/prevent users from accidentally hashing mutable types. However, Ruby does support freezing lists and hashes which may be a good idea here. I'm not sure that this is necessary as rubyists may already have an expectation that modifying mutable keys is permissible but dangerous. Again, I'm not a rubyist so please feel to ignore this comment if I am mistaken.

I'll try to get Python3 support committed over the weekend.

@beano
beano commented Apr 11, 2016

@hdima, @scohen, @okeuday Python3 support is complete in my fork.

@vans163
vans163 commented Apr 18, 2016 edited

Going to test this. thanks @beano

Just tested. Can you open up issues on your fork beano? I seem to get an OpaqueObject if I pass a map to python 2.7 with your fork.

#{"key"=>5}
OpaqueObject('t\x00\x00\x00\x01m\x00\x00\x00\x03keya\x05', Atom('erlang'))

#Something like this works:
decode_term(map.data)[0]["key"]

But I thought the point was to have it directly as a dictionary in python without anything extra?

@beano
beano commented Apr 19, 2016

@vans163 - thanks for your help. I forgot to commit an erlang change. Could you please test again?

@vans163
vans163 commented Apr 19, 2016

@beano. Thanks. Working :)

@beano
beano commented Apr 19, 2016

@vans163 - Excellent. Thanks!

@beano
beano commented Apr 21, 2016

@hdima , @scohen, @okeuday, @vans163 - one interesting possibility that this change opens up is support for keyword arguments.

Something like:

Json = python:call(Pid, 'json', 'dumps', [[1, 2, 3]], #{indent => 4}).

This will reduce the need for erlang shim functions and other workarounds when dealing with python functions that take keyword arguments.

I've done a proof of concept and I think that this can be implemented in a minimally invasive way.

What do people think of this idea?

@vans163
vans163 commented Apr 21, 2016 edited

@beano All my code is shims which motivate me everyday I see them to rewrite in Erlang.

Reminder

Joking aside (pun intended). How would the python side look like?

python:call(Pid, 'test', 'named_args', [], #{indent => 4, timeout => 20000}).
def named_args(indent=0,page=1,timeout=10000)

Something like that I imagine?

@beano
beano commented Apr 21, 2016

@vans163 - exactly. The list would contain positional args and the map would contain keyword args.

@vans163
vans163 commented Apr 22, 2016

@beano, If all old functionality is preserved and nothing breaks I don't see why not. Maybe even do it so that you can do:

python:call(Pid, 'test', 'named_args', #{indent => 4, timeout => 20000}).

and check for is_map.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.