pstar.plist(self, *args, **kwargs)

list subclass for powerful, concise data processing.

Homogeneous access:

plist is the natural extension of object-orientation to homogeneous lists of arbitrary objects. With plist, you can treat a list of objects of the same type as if they are a single object of that type, in many (but not all) circumstances.

from pstar import pdict, plist  # assumed by all examples below

pl = plist['abc', 'def', 'ghi']
assert ((pl + ' -> ' + pl.upper()).aslist() ==
        ['abc -> ABC', 'def -> DEF', 'ghi -> GHI'])
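For comparison, the same result can be produced from a plain list with a comprehension. This is only a sketch in built-in Python (it does not use pstar), and the names `words` and `arrows` are illustrative:

```python
# Plain-Python counterpart of `pl + ' -> ' + pl.upper()`; no pstar involved.
words = ['abc', 'def', 'ghi']

# Each element is combined with its own upper-cased form.
arrows = [w + ' -> ' + w.upper() for w in words]

assert arrows == ['abc -> ABC', 'def -> DEF', 'ghi -> GHI']
```

The plist version expresses the same per-element computation without spelling out the loop.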

Indexing:

Indexing plists is meant to be both powerful and natural, while accounting for the fact that the elements of the plist may need to be indexed as well.

See __getitem__, __setitem__, and __delitem__ for more details.

Indexing into the plist itself:

foos = plist([pdict(foo=0, bar=0), pdict(foo=1, bar=1), pdict(foo=2, bar=0)])

# Basic scalar indexing:
assert (foos[0] ==
        dict(foo=0, bar=0))

# plist slice indexing:
assert (foos[:2].aslist() ==
        [dict(foo=0, bar=0), dict(foo=1, bar=1)])

# plist int list indexing:
assert (foos[[0, 2]].aslist() ==
        [dict(foo=0, bar=0), dict(foo=2, bar=0)])

Indexing into the elements of the plist:

# Basic scalar indexing:
assert (foos['foo'].aslist() ==
        [0, 1, 2])

# tuple indexing
assert (foos[('foo', 'bar')].aslist() ==
        [(0, 0), (1, 1), (2, 0)])

# list indexing (the i-th key is applied to the i-th element):
assert (foos[['foo', 'bar', 'bar']].aslist() ==
        [0, 1, 0])
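The three element-indexing styles above differ in how keys are paired with elements. A plain-Python sketch of the same results may make the pairing explicit. Plain dicts stand in for pdicts here, and the names `by_scalar`, `by_tuple`, and `by_list` are illustrative, not part of pstar:

```python
# Plain dicts stand in for the pdicts used above; no pstar involved.
foos = [dict(foo=0, bar=0), dict(foo=1, bar=1), dict(foo=2, bar=0)]

# Scalar key: the same key is read from every element.
by_scalar = [d['foo'] for d in foos]

# Tuple of keys: every element yields a tuple with one value per key.
by_tuple = [tuple(d[k] for k in ('foo', 'bar')) for d in foos]

# List of keys: the i-th key is read from the i-th element only.
by_list = [d[k] for d, k in zip(foos, ['foo', 'bar', 'bar'])]

assert by_scalar == [0, 1, 2]
assert by_tuple == [(0, 0), (1, 1), (2, 0)]
assert by_list == [0, 1, 0]
```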

Indexing into the elements of the plist when the elements are indexed by ints, slices, or other means that conflict with plist indexing:

pl = plist[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Basic scalar indexing:
assert (pl._[0].aslist() ==
        [1, 4, 7])

# slice indexing (note the use of the 3-argument version of slicing):
assert (pl._[:2:1].aslist() ==
        [[1, 2], [4, 5], [7, 8]])

# list indexing (elements are converted to numpy arrays so a boolean mask applies):
pl = pl.np()
assert (pl._[[True, False, True]].apply(list).aslist() ==
        [[1, 3], [4, 6], [7, 9]])
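Read against plain nested lists, the `_` examples above apply the index to each inner list rather than to the outer one. The following sketch reproduces the three results with built-in Python only. The names `rows`, `first_items`, `first_two`, and `masked` are illustrative, and `itertools.compress` stands in for the numpy boolean mask:

```python
from itertools import compress

rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Counterpart of `pl._[0]`: index 0 of every inner list.
first_items = [row[0] for row in rows]

# Counterpart of `pl._[:2:1]`: the same slice of every inner list.
first_two = [row[:2] for row in rows]

# Counterpart of the boolean-mask example: keep positions where the mask is True.
mask = [True, False, True]
masked = [list(compress(row, mask)) for row in rows]

assert first_items == [1, 4, 7]
assert first_two == [[1, 2], [4, 5], [7, 8]]
assert masked == [[1, 3], [4, 6], [7, 9]]
```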

root and uproot:

plists all have a root object. For newly created plists, the root is self, but as computations are performed on the plist, the root of the resulting plists almost always remains the original plist:

pl = plist[1, 2, 3]
# plist operations don't modify the original (except where natural)!
assert ((pl + 5) is not pl)
assert ((pl + 5).root() is pl)

In some cases, you don't want to maintain the original root. To reset the root to self, simply call uproot:

pl2 = pl + 5
assert (pl2.root() is not pl2)
assert (pl2.uproot().root() is pl2)
assert (pl2.root() is pl2)

See root and uproot for more details.
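To make the root bookkeeping concrete, here is a toy model of the idea. `RootedList` is a hypothetical class written for this illustration. It is not pstar's implementation and supports only element-wise `+`:

```python
class RootedList(list):
    """Hypothetical toy class; not pstar's implementation."""

    def __init__(self, values=(), root=None):
        super().__init__(values)
        # A freshly built list is its own root unless one is supplied.
        self._root = self if root is None else root

    def root(self):
        return self._root

    def uproot(self):
        # Forget the inherited root and become the root.
        self._root = self
        return self

    def __add__(self, other):
        # Element-wise addition; the result remembers the original root.
        return RootedList((x + other for x in self), root=self._root)


pl = RootedList([1, 2, 3])
pl2 = pl + 5
assert list(pl2) == [6, 7, 8]
assert pl2.root() is pl             # derived list points back at its source
assert pl2.uproot().root() is pl2   # after uproot, it is its own root
```

Carrying the root through every derived list is what lets a later operation return elements of the original data.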

Filtering:

plist overrides comparison operations to provide filtering. This is reasonable, since an empty plist is a False value, just like an empty list, so a filter that removes every element is equivalent to the comparison failing.

Filtering always returns the root of the plist, which allows you to filter a plist on arbitrary values computed from the root, and then proceed with your computation on the (filtered) original data.

See comparator and filter for more details.

foos = plist([pdict(foo=0, bar=0), pdict(foo=1, bar=1), pdict(foo=2, bar=0)])
# Filtering on a property:
zero_bars = foos.bar == 0
# The result is a plist of the original pdicts, correctly filtered:
assert (zero_bars.aslist() ==
        [{'foo': 0, 'bar': 0},
         {'foo': 2, 'bar': 0}])

# filter can take any function to filter by, but it defaults to bool():
nonzero_bars = foos.bar.filter()
assert (nonzero_bars.aslist() ==
        [{'foo': 1, 'bar': 1}])
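The "filter on a computed value, get back the original data" behavior can be sketched with plain lists. This example uses only built-in Python, with plain dicts standing in for pdicts. The names `bars`, `zero_bars`, and `nonzero_bars` are illustrative:

```python
foos = [dict(foo=0, bar=0), dict(foo=1, bar=1), dict(foo=2, bar=0)]

# Values computed from the root list, one per element.
bars = [d['bar'] for d in foos]

# Counterpart of `foos.bar == 0`: keep root elements whose value matches.
zero_bars = [d for d, b in zip(foos, bars) if b == 0]

# Counterpart of `foos.bar.filter()`: keep root elements whose value is truthy.
nonzero_bars = [d for d, b in zip(foos, bars) if bool(b)]

assert zero_bars == [{'foo': 0, 'bar': 0}, {'foo': 2, 'bar': 0}]
assert nonzero_bars == [{'foo': 1, 'bar': 1}]
# The results hold the original objects, not copies of the computed values.
assert zero_bars[0] is foos[0]
```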

Grouping and Sorting:

Just as with filtering, you can group and sort a plist on any arbitrary value computed from the plist.

This shows a basic grouping by a property of the data. Note that groupby returns the root, just like filtering:

foos = plist([pdict(foo=0, bar=1), pdict(foo=1, bar=0), pdict(foo=2, bar=1)])
# Note that the `bar == 1` group comes before the `bar == 0` group. The ordering
# is determined by the sort order of the plist.
assert (foos.bar.groupby().aslist() ==
        [[{'bar': 1, 'foo': 0}, {'bar': 1, 'foo': 2}], [{'bar': 0, 'foo': 1}]])
# Note that foos is unchanged:
assert (foos.aslist() ==
        [{'bar': 1, 'foo': 0}, {'bar': 0, 'foo': 1}, {'bar': 1, 'foo': 2}])

In contrast, sorting a plist modifies the order of both the current plist and its root, but returns the current plist instead of the root:

assert (foos.bar.sortby().aslist() ==
        [0, 1, 1])
assert (foos.aslist() ==
        [{'bar': 0, 'foo': 1}, {'bar': 1, 'foo': 0}, {'bar': 1, 'foo': 2}])

This distinction between the behaviors of groupby and sortby permits natural chaining of the two when sorted groups are desired. It also ensures that plists computed from the same root will be ordered in the same way.

foos = plist([pdict(foo=0, bar=1), pdict(foo=1, bar=0), pdict(foo=2, bar=1)])
assert (foos.bar.sortby().groupby().aslist() ==
        [[{'bar': 0, 'foo': 1}], [{'bar': 1, 'foo': 0}, {'bar': 1, 'foo': 2}]])

See groupby and sortby for more details.
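The grouping and sort-then-group results above can be reproduced with plain lists. This sketch assumes groups are emitted in the order their key first appears, which matches the outputs shown. `group_by_key` is a hypothetical helper, not a pstar function, and unlike sortby it leaves the input list untouched:

```python
def group_by_key(items, key):
    """Hypothetical helper: group dicts by items[i][key], in first-seen order."""
    groups = {}
    for item in items:
        groups.setdefault(item[key], []).append(item)
    return list(groups.values())


foos = [dict(foo=0, bar=1), dict(foo=1, bar=0), dict(foo=2, bar=1)]

# Grouping alone: the bar == 1 group comes first because it appears first.
grouped = group_by_key(foos, 'bar')

# Stable sort by bar, then group: groups now come out in sorted order.
sorted_grouped = group_by_key(sorted(foos, key=lambda d: d['bar']), 'bar')

assert grouped == [[{'foo': 0, 'bar': 1}, {'foo': 2, 'bar': 1}],
                   [{'foo': 1, 'bar': 0}]]
assert sorted_grouped == [[{'foo': 1, 'bar': 0}],
                          [{'foo': 0, 'bar': 1}, {'foo': 2, 'bar': 1}]]
```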

Function Application and Multiple Arguments:

The most prominent case where you can't treat a plist as a single object is when you need to pass a single object to some function that isn't a property of the elements of the plist. In this case, just use apply:

pl = plist['abc', 'def', 'ghi']
assert (pl.apply('foo: {}'.format).aslist() ==
        ['foo: abc', 'foo: def', 'foo: ghi'])

Where apply (like any call to a plist element function) shines is with multi-argument functions. In this case, you will often find that you want to call the function with parallel values from parallel plists. That is easy and natural to do, just like calling the function with corresponding non-plist values:

foos = plist([pdict(foo=0, bar=0), pdict(foo=1, bar=1), pdict(foo=2, bar=0)])
foos.baz = 'abc' * foos.foo
# Do a multi-argument string format with plist.apply:
assert (foos.foo.apply('foo: {} bar: {} baz: {baz}'.format, foos.bar, baz=foos.baz).aslist() ==
        ['foo: 0 bar: 0 baz: ', 'foo: 1 bar: 1 baz: abc', 'foo: 2 bar: 0 baz: abcabc'])
# Do the same string format directly using the plist as the format string:
assert (('foo: ' + foos.foo.pstr() + ' bar: {} baz: {baz}').format(foos.bar, baz=foos.baz).aslist() ==
        ['foo: 0 bar: 0 baz: ', 'foo: 1 bar: 1 baz: abc', 'foo: 2 bar: 0 baz: abcabc'])

See __call__, apply, and reduce for more details.
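The "parallel values from parallel plists" pattern corresponds to zipping ordinary lists. The sketch below reproduces the formatted strings from the example with built-in Python only. The names `foo_vals`, `bar_vals`, `baz_vals`, and `formatted` are illustrative:

```python
foo_vals = [0, 1, 2]
bar_vals = [0, 1, 0]
# Counterpart of `foos.baz = 'abc' * foos.foo`.
baz_vals = ['abc' * f for f in foo_vals]

template = 'foo: {} bar: {} baz: {baz}'

# One call per position, taking the i-th value from each list.
formatted = [template.format(f, b, baz=z)
             for f, b, z in zip(foo_vals, bar_vals, baz_vals)]

assert formatted == ['foo: 0 bar: 0 baz: ',
                     'foo: 1 bar: 1 baz: abc',
                     'foo: 2 bar: 0 baz: abcabc']
```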

Methods and Properties:

__init__: Constructs plist.

_: Causes the next call to self to be performed as deep as possible in the plist.

__: Causes the next call to self to be performed on the innermost plist.

__call__: Call each element of self, possibly recursively.

__contains__: Implements the in operator to avoid inappropriate use of plist comparators.

__delattr__: Deletes an attribute on elements of self.

__delitem__: Deletes items of self using a variety of indexing styles.

__delslice__: Delegates to __delitem__ whenever possible. For compatibility with Python 2.7.

__dir__: Allow natural tab-completion on self and its contents.

__enter__: Allow the use of plists in with statements.

__exit__: Allow the use of plists in with statements.

__getattr__: Recursively attempt to get the attribute name.

__getattribute__: Returns a plist of the attribute for self, or for each element.

__getitem__: Returns a new plist using a variety of indexing styles.

__getslice__: Delegates to __getitem__ whenever possible. For compatibility with Python 2.7.

__setattr__: Sets an attribute on a plist or its elements to val.

__setitem__: Sets items of self using a variety of indexing styles.

__setslice__: Delegates to __setitem__ whenever possible. For compatibility with Python 2.7.

all: Returns self if args[0] evaluates to True for all elements of self.

any: Returns self if args[0] evaluates to True for any elements of self.

apply: Apply an arbitrary function to elements of self, forwarding arguments.

aslist: Recursively convert all nested plists from self to lists, inclusive.

aspdict: Convert self to a pdict if there is a natural mapping of keys to values in self.

aspset: Recursively convert all nested plists from self to psets, inclusive.

astuple: Recursively convert all nested plists from self to tuples, inclusive.

binary_op: plist binary operation; applied element-wise to self.

comparator: plist comparison operator. Comparisons filter plists.

copy: Copy self to new plist. Performs a shallow copy.

enum: Wrap the current plist values in tuples where the first item is the index.

filter: Filter self by an arbitrary function on elements of self, forwarding arguments.

groupby: Group self.root() by the values in self and return self.root().

lfill: Returns a list with the structure of self filled in order from v.

logical_op: plist logical operation. Logical operations perform set operations on plists.

me: Sets the current plist as a variable available in the caller's context.

none: Returns self if args[0] evaluates to False for all elements of self.

nonempty: Returns a new plist with empty sublists removed.

np: Converts the elements of self to numpy.arrays, forwarding passed args.

pand: Stores self into a plist of tuples that gets extended with each call.

pd: Converts self into a pandas.DataFrame, forwarding passed args.

pdepth: Returns a plist of the recursive depth of each leaf element, from 0.

pdict: Convert self to a pdict if there is a natural mapping of keys to values in self.

pequal: Shortcutting recursive equality function.

pfill: Returns a plist with the structure of self filled in order from v.

pleft: Returns a plist with the structure of self filled plen(-1) to 0.

plen: Returns a plist of the length of a recursively-selected layer of self.

plt: Convenience method for managing matplotlib.pyplot state within a plist chain.

pset: Converts the elements of self into pset objects.

pshape: Returns a plist of the same structure as self, filled with leaf lengths.

pstr: Returns a plist with leaf elements converted to strings.

pstructure: Returns a list of the number of elements in each layer of self.

puniq: Returns a new plist with only a single element of each value in self.

qj: Applies logging function qj to self for easy in-chain logging.

reduce: Apply a function repeatedly to its own result, returning a plist of length at most 1.

remix: Returns a new plist of pdicts based on selected data from self.

root: Returns the root of the plist.

sortby: Sorts self and self.root() in-place and returns self.

unary_op: plist unary operation; applied element-wise to self.

ungroup: Inverts the last grouping operation applied and returns a new plist.

uproot: Sets the root to self so future root() calls return this plist.

values_like: Returns a plist with the structure of self filled with value.

wrap: Adds and returns an outer plist around self.

zip: Zips self with others, recursively.