Creating derived fields with lambda functions #3434

forrestglines · 2021-07-13T21:06:28Z

Bug report

Bug summary

Adding fields using a lambda function fails to add the field to a dataset. Trying to access created fields that use a lambda function results in a YTFieldNotFound exception.

Code for reproduction

import yt

ds = yt.testing.fake_amr_ds(fields=["density"],units=["g/cm**3"])

for i in range(3):
    ds.add_field(
            name=("gas", f"field_{i}"),
            function=lambda field,data : i*data["density"],
            sampling_type="local",
            units="g/cm**3"
            )

for i in range(3):
    name = ("gas",f"field_{i}")
    print(f"{name} in ds.field_info:", name in ds.field_info) #All false

Actual outcome

yt : [INFO     ] 2021-07-13 16:50:44,250 Parameters: current_time              = 0.0
yt : [INFO     ] 2021-07-13 16:50:44,250 Parameters: domain_dimensions         = [32 32 32]
yt : [INFO     ] 2021-07-13 16:50:44,250 Parameters: domain_left_edge          = [0. 0. 0.]
yt : [INFO     ] 2021-07-13 16:50:44,250 Parameters: domain_right_edge         = [1. 1. 1.]
yt : [INFO     ] 2021-07-13 16:50:44,250 Parameters: cosmological_simulation   = 0.0
('gas', 'field_0') in ds.field_info: False
('gas', 'field_1') in ds.field_info: False
('gas', 'field_2') in ds.field_info: False

None of the fields defined with lambda functions get added to field_info. No errors or exceptions are reported by add_field. But when I try to access the supposedly created fields, I get a YTFieldNotFound exception.

Expected outcome

Lambda functions should work with add_field like other function objects. Or it should be documented that they don't work.

I remember running into this issue once before several years ago when I was just starting out with yt, it might have existed for a while.

Version Information

Operating System: Linux
Python Version: Python 3.9.5
yt version: Fork of 4.0.dev0, installed from source
Other Libraries (if applicable):

The text was updated successfully, but these errors were encountered:

welcome · 2021-07-13T21:06:29Z

Hi, and welcome to yt! Thanks for opening your first issue. We have an issue template that helps us to gather relevant information to help diagnosing and fixing the issue.

matthewturk · 2021-07-14T12:08:55Z

Hi @forrestglines -- can you verify that if the field is defined as a function (potentially in its own closure), it does get added?

forrestglines · 2021-07-14T15:17:28Z

Yes - I tried out a few ways of adding the fields

#Add the fields with functions defined in loop - works
for i in range(3,6):
    def _field_i(field,data):
        return np.full_like(data["density"],i)
    ds.add_field(
            name=("gas", f"field_{i}"),
            function=_field_i,
            sampling_type="local",
            units="g/cm**3"
            )
#Add fields
def _field_6(field,data):
    return np.full_like(data["density"],6)

ds.add_field(
    name=("gas", "field_6"),
    function=_field_6,
    sampling_type="local",
    units="g/cm**3"
        )

for i in range(7):
    name = ("gas",f"field_{i}")
    print(f"{name} in ds.field_info:", name in ds.field_info)

    if name in ds.field_info:
        print(f"  {name} mean = {ds.all_data()[name].mean()}")

('gas', 'field_3') in ds.field_info: True
  ('gas', 'field_3') mean = 3.0 g/cm**3
('gas', 'field_4') in ds.field_info: True
  ('gas', 'field_4') mean = 4.0 g/cm**3
('gas', 'field_5') in ds.field_info: True
  ('gas', 'field_5') mean = 5.0 g/cm**3
('gas', 'field_6') in ds.field_info: True
  ('gas', 'field_6') mean = 6.0 g/cm**3

These all work. However, defining the functions within the for-loop works as expected in this example but in my analysis code leads to them all sharing the last added function (i.e. they would all print 5.0 in the example shown here, not all different and not all 6.0). I haven't been able to reproduce that behavior in a sharable snippet.

neutrinoceros · 2021-07-15T09:09:17Z

Looks like a bug to me. I'm not 100% sure it that it should work with lambdas (or generally with non-function callables), though I would be inclined to think so. On the off chances that there's a good, deep reason not to add support for it, it should at the very least produce a clear error message.
I note that there isn't a single example of this in the code base or documentation so if anything it doesn't look like it was ever actively pushed.

matthewturk · 2021-07-15T13:01:54Z

My guess is that there's something yt does with the function handles that is not available to lambdas.

To make the functions work the way you're wanting to (which I recognize is a stopgap) you can try adding a layer of enclosure to them. For instance,

def _make_field(n):
    def _field_i(field,data):
        return np.full_like(data["density"],n)
    return _field_i

for i in range(3,6):
    ds.add_field(
            name=("gas", f"field_{i}"),
            function=_make_field(i),
            sampling_type="local",
            units="g/cm**3"
            )

That should do it.

neutrinoceros · 2021-07-15T14:14:51Z

yes this is how it is done internally.
I couldn't find a solution to this, and it looks like it's even hard to distinguish a function from a lambda at runtime, but there is a way to do it, so I'll use that to explicitly forbid lambdas in a PR.

forrestglines · 2021-07-15T14:46:08Z

To make the functions work the way you're wanting to (which I recognize is a stopgap) you can try adding a layer of enclosure to them.

This works -- both in my example and in my analysis code.

I'm still a little confused and curious as to what features of lambdas are missing which make them break as a derived field.

forrestglines · 2021-07-15T15:03:48Z

After a bit of googling, I found another piece of the puzzle. The function only needs to be given a name other than <lambda>, even changing to lambda works. I.e.

for i in range(3):
    name = ("gas", f"field_{i}")
    _my_field = lambda field,data : np.full_like(data["density"],i)
    _my_field.__name__ = "lambda"

    ds.add_field(
            name=name,
            function=_my_field,
            sampling_type="local",
            units="g/cm**3"
            )

works fine.

The only check I can find for <lambda> in the code is on yt/fields/derived_field.py:253. If I remove the check on that line and just always invoke e[self.name], my example works as expected. However, that's probably removing something important.

neutrinoceros · 2021-07-15T15:47:03Z

Fascinating ! @matthewturk, it seems you authored this line 8yrs ago but the commit message doesn't help much, any idea if this check could have become irrelevant ?

neutrinoceros added api-consistency naming conventions, code deduplication, informative error messages, code smells... UX user-experience bug labels Jul 15, 2021

neutrinoceros mentioned this issue Jul 15, 2021

UX: add a clear error message in case a DerivedField is passed a lambda instead of a function #3438

Closed

neutrinoceros mentioned this issue Jul 15, 2021

BUG: add support for setting up derived fields from lambdas #3440

Merged

neutrinoceros linked a pull request Jul 15, 2021 that will close this issue

BUG: add support for setting up derived fields from lambdas #3440

Merged

matthewturk closed this as completed in #3440 Jul 18, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Creating derived fields with lambda functions #3434

Creating derived fields with lambda functions #3434

forrestglines commented Jul 13, 2021

welcome bot commented Jul 13, 2021

matthewturk commented Jul 14, 2021

forrestglines commented Jul 14, 2021

neutrinoceros commented Jul 15, 2021

matthewturk commented Jul 15, 2021

neutrinoceros commented Jul 15, 2021

forrestglines commented Jul 15, 2021

forrestglines commented Jul 15, 2021

neutrinoceros commented Jul 15, 2021

Creating derived fields with lambda functions #3434

Creating derived fields with lambda functions #3434

Comments

forrestglines commented Jul 13, 2021

Bug report

welcome bot commented Jul 13, 2021

matthewturk commented Jul 14, 2021

forrestglines commented Jul 14, 2021

neutrinoceros commented Jul 15, 2021

matthewturk commented Jul 15, 2021

neutrinoceros commented Jul 15, 2021

forrestglines commented Jul 15, 2021

forrestglines commented Jul 15, 2021

neutrinoceros commented Jul 15, 2021