This notebook tests some ways of initializing selector objects from data containers so that we don't have to pass pickled data objects between processes.

In [1]:
import yt 
ds = yt.load_sample("snapshot_033")

yt : [INFO     ] 2020-09-22 10:24:21,110 Files located at /home/chavlin/hdd/data/yt_data/yt_sample_sets/snapshot_033.tar.gz.untar/snapshot_033/snap_033.
yt : [INFO     ] 2020-09-22 10:24:21,110 Default to loading snap_033.0.hdf5 for snapshot_033 dataset
yt : [INFO     ] 2020-09-22 10:24:21,166 Parameters: current_time              = 4.343952725460923e+17 s


In [2]:
sp = ds.sphere(ds.domain_center,(2,'code_length')) 

yt : [INFO     ] 2020-09-22 10:26:56,367 Allocating for 4.194e+06 particles
Loading particle index: 100%|██████████| 12/12 [00:00<00:00, 198.98it/s]


In [6]:
sel = sp.selector
sel

<yt.geometry.selection_routines.SphereSelector at 0x7f18bc185110>

In [7]:
dir(sel)

['__class__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__ne__',
 '__new__',
 '__pyx_fuse_0count_points',
 '__pyx_fuse_0select_points',
 '__pyx_fuse_1count_points',
 '__pyx_fuse_1select_points',
 '__pyx_vtable__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__setstate__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '_base_hash',
 '_hash_vals',
 'count_oct_cells',
 'count_octs',
 'count_points',
 'fill_mask',
 'fill_mesh_cell_mask',
 'fill_mesh_mask',
 'get_periodicity',
 'max_level',
 'min_level',
 'select_grids',
 'select_points']

In [70]:
ad = ds.all_data()

In [71]:
hsmls = 0 

In [72]:
n0 = 100000 
n1 = 500000
mask = sel.select_points(
                ad['x'][n0:n1], ad['y'][n0:n1], ad['z'][n0:n1], hsmls
            )

In [73]:
print(mask)

[False False False ... False False False]


In [74]:
mask

array([False, False, False, ..., False, False, False])

In [75]:
ad['x'][n0:n1][mask].shape

(5109,)

In [24]:
type(sel)

yt.geometry.selection_routines.SphereSelector

In [25]:
type(sp)

yt.data_objects.selection_objects.spheroids.YTSphere

In [29]:
sp_man = yt.data_objects.selection_objects.spheroids.YTSphere(
    ds.domain_center,(2,'code_length'),ds=None
) 

RuntimeError: Error: ds must be set either through class type or parameter to the constructor

So initializing the container requires the dataset.

The `selector` property is defined on `YTSelectionContainer` in `yt/data_objects/selection_objects/data_selection_objects.py`:

```
    @property
    def selector(self):
        if self._selector is not None:
            return self._selector
        s_module = getattr(self, "_selector_module", yt.geometry.selection_routines)
        sclass = getattr(s_module, f"{self._type_name}_selector", None)
        if sclass is None:
            raise YTDataSelectorNotImplemented(self._type_name)

        if self._data_source is not None:
            self._selector = compose_selector(
                self, self._data_source.selector, sclass(self)
            )
        else:
            self._selector = sclass(self)
        return self._selector
```        

In [31]:
import pickle 

In [32]:
sel_pi = pickle.dumps(sel)

TypeError: no default __reduce__ due to non-trivial __cinit__

`SphereSelector` itself can't be pickled, so let's try doing that selector initialization manually. `self` here would be `sp`:

In [34]:
type(sp)

yt.data_objects.selection_objects.spheroids.YTSphere

In [36]:
ds.selector_module

AttributeError: 'OWLSDataset' object has no attribute 'selector_module'

In [37]:
s_module = yt.geometry.selection_routines

In [38]:
sp._type_name

'sphere'

In [39]:
sclass = s_module.sphere_selector
sclass

yt.geometry.selection_routines.SphereSelector

In [40]:
sp_sel = sclass(sp)

What do we need for a mock sphere?

The `SphereSelector` init is: 

```
    def __init__(self, dobj):
        for i in range(3):
            self.center[i] = _ensure_code(dobj.center[i])
        self.radius = _ensure_code(dobj.radius)
        self.radius2 = self.radius * self.radius
        center = _ensure_code(dobj.center)
        for i in range(3):
            self.center[i] = center[i]
            self.bbox[i][0] = self.center[i] - self.radius
            self.bbox[i][1] = self.center[i] + self.radius
            if self.bbox[i][0] < dobj.ds.domain_left_edge[i]:
                self.check_box[i] = False
            elif self.bbox[i][1] > dobj.ds.domain_right_edge[i]:
                self.check_box[i] = False
            else:
                self.check_box[i] = True
```                
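
To make the requirements concrete, here is a minimal pure-Python sketch of the bounding-box part of that init. The function name `sphere_bbox` and the use of plain floats instead of unyt arrays are assumptions for illustration; this is not yt's implementation, only a restatement of the loop above. It reads just four things: the center, the radius, and the left and right domain edges.

```python
def sphere_bbox(center, radius, domain_left_edge, domain_right_edge):
    """Return (bbox, check_box) following the loop in SphereSelector.__init__."""
    bbox, check_box = [], []
    for c, left, right in zip(center, domain_left_edge, domain_right_edge):
        lo, hi = c - radius, c + radius
        bbox.append((lo, hi))
        # check_box is True only when the box stays inside the domain on this axis
        check_box.append(left <= lo and hi <= right)
    return bbox, check_box


# Values taken from this notebook: center 12.5, radius 2, domain [0, 25]
bbox, check_box = sphere_bbox((12.5, 12.5, 12.5), 2.0, (0.0, 0.0, 0.0), (25.0, 25.0, 25.0))
print(bbox)       # [(10.5, 14.5), (10.5, 14.5), (10.5, 14.5)]
print(check_box)  # [True, True, True]
```

So a mock container only has to expose `center`, `radius`, `ds.domain_left_edge` and `ds.domain_right_edge` for this part of the init to run.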

In [85]:
sp._selector_module

AttributeError: 'YTSphere' object has no attribute '_selector_module'

In [52]:
class MockDs:
    # the dataset attributes copied over for the mock sphere
    def __init__(self, ds):
        self.domain_left_edge = ds.domain_left_edge
        self.domain_right_edge = ds.domain_right_edge
        self.periodicity = ds.periodicity


class MockSphere:
    # a stripped-down sphere that records only the attributes required to
    # initialize the SphereSelector
    def __init__(self, sp):
        self.center = sp.center
        self.radius = sp.radius
        self.ds = MockDs(sp.ds)

In [53]:
sp.center

unyt_array([12.5, 12.5, 12.5], 'code_length')

In [54]:
sp.radius

unyt_array(2, 'code_length')

In [55]:
sp_M = MockSphere(sp)

In [56]:
sp_M.center

unyt_array([12.5, 12.5, 12.5], 'code_length')

In [57]:
sp_M.ds.domain_left_edge

unyt_array([0., 0., 0.], 'code_length')

In [62]:
sp_M.ds.domain_right_edge

unyt_array([25., 25., 25.], 'code_length')

In [58]:
sel_M = yt.geometry.selection_routines.SphereSelector(sp_M)

In [84]:
n0 = 100000 
n1 = 500000
ad = ds.all_data()
mask_2 = sel_M.select_points(
                ad['x'][n0:n1], ad['y'][n0:n1], ad['z'][n0:n1], hsmls
            )
print(mask_2)

[False False False ... False False False]


In [86]:
mask_2[mask_2].shape

(5109,)

In [81]:
n0 = 100000 
n1 = 500000
ad = ds.all_data()
mask = sel.select_points(
                ad['x'][n0:n1], ad['y'][n0:n1], ad['z'][n0:n1], hsmls
            )
print(mask)

[False False False ... False False False]


In [87]:
mask[mask].shape

(5109,)

OK, and can `sp_M` be pickled, so that we can pass it along and initialize the sphere selector on each processor?

In [89]:
sp_M_pi = pickle.dumps(sp_M)

So the dask function to wrap would contain something like:

In [90]:
sp_M_unpi = pickle.loads(sp_M_pi) # dask would do this in the backend
sel_M = yt.geometry.selection_routines.SphereSelector(sp_M_unpi)

In [91]:
n0 = 100000 
n1 = 500000
ad = ds.all_data()
mask_2 = sel_M.select_points(
                ad['x'][n0:n1], ad['y'][n0:n1], ad['z'][n0:n1], hsmls
            )
mask_2[mask_2].shape

(5109,)
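
The unpickle, rebuild, and select steps could be folded into one function of the kind a dask worker would call. The sketch below is only an illustration: `select_on_worker` and `StandInSphereSelector` are hypothetical names, and the stand-in is a pure-Python substitute for yt's `SphereSelector` (it ignores `hsml`, periodicity, and units) so that the example runs without yt.

```python
import pickle
from types import SimpleNamespace


class StandInSphereSelector:
    """Pure-Python stand-in for yt's SphereSelector (illustration only)."""

    def __init__(self, dobj):
        self.center = tuple(dobj.center)
        self.radius2 = dobj.radius * dobj.radius

    def select_points(self, x, y, z, hsml=0):
        # True where a point lies within the sphere; hsml is ignored here
        cx, cy, cz = self.center
        return [
            (xi - cx) ** 2 + (yi - cy) ** 2 + (zi - cz) ** 2 <= self.radius2
            for xi, yi, zi in zip(x, y, z)
        ]


def select_on_worker(mock_bytes, selector_class, x, y, z, hsml=0):
    """Rebuild a selector from a pickled mock container and mask one chunk."""
    mock = pickle.loads(mock_bytes)    # dask would do this in the backend
    selector = selector_class(mock)    # built from a few plain attributes
    return selector.select_points(x, y, z, hsml)


# Plain attributes only, so the mock pickles without touching the dataset
mock_bytes = pickle.dumps(SimpleNamespace(center=(12.5, 12.5, 12.5), radius=2.0))

mask = select_on_worker(
    mock_bytes,
    StandInSphereSelector,
    x=[12.5, 0.0, 14.0],
    y=[12.5, 0.0, 12.5],
    z=[12.5, 0.0, 12.5],
)
print(mask)  # [True, False, True]
```

With yt, the same function would presumably be called with `pickle.dumps(sp_M)` and `yt.geometry.selection_routines.SphereSelector`, one call per chunk of particle positions, for example wrapped in `dask.delayed`.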

The question is how to generalize to all the selector objects. We can easily add the `sclass` as a string attribute, e.g. `sclass = 'geometry.selection_routines.SphereSelector'`. But we'd need to know what each `SelectorObject` subclass needs for initialization.
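
One possible shape for the string-attribute idea, as a rough sketch only: store the dotted class path next to a plain mock object, and resolve the class with `importlib` on the worker. The helper names (`resolve_class`, `make_spec`, `build_from_spec`) are hypothetical, and the full `yt.`-prefixed path is an assumption about how the class would be imported. Which attributes each selector needs is still the open question, so they are passed in by hand here.

```python
import importlib
from types import SimpleNamespace


def resolve_class(dotted_path):
    """Import and return the class named by a 'package.module.ClassName' string."""
    module_name, _, class_name = dotted_path.rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)


def make_spec(dotted_path, **attrs):
    """Bundle a selector class path with the attributes its __init__ reads."""
    return SimpleNamespace(sclass=dotted_path, dobj=SimpleNamespace(**attrs))


def build_from_spec(spec):
    """Instantiate the class named in the spec, passing it the mock object."""
    return resolve_class(spec.sclass)(spec.dobj)


# Nothing is imported from yt until build_from_spec is called on a worker.
spec = make_spec(
    "yt.geometry.selection_routines.SphereSelector",
    center=(12.5, 12.5, 12.5),
    radius=2.0,
)
print(spec.sclass, sorted(vars(spec.dobj)))
```

For the sphere, the mock would also need a `ds` attribute carrying the domain edges, as `MockSphere` does above; other selectors would need their own attribute lists.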

But let's just try out the sphere selector with dask.