fio command for calculating new properties #273

perrygeo · 2015-09-22T17:51:25Z

fio calc could take a stream of GeoJSON features on stdin and calculate additional properties.

For each feature, it could evaluate one or more expressions, place the derived variable(s) in the properties dict then print the feature to stdout.

So for this input:

{'type': 'Feature',
 'id': 1,
 'properties': {'a': 1, 'b': 2}}

You might have a CLI something like this

fio calc --new "c" "f['properties']['a'] + f['properties']['b'] * 2"

which would yield

{'type': 'Feature',
 'id': 1,
 'properties': {'a': 1, 'b': 2, 'c': 5}}
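The behaviour proposed above can be sketched in a few lines of Python. This is only an illustration of the idea, not an implementation from fiona: the `calc` function name and the choice to expose the feature as `f` with nothing else in the namespace are assumptions.

```python
import json


def calc(feature, name, expression):
    """Evaluate `expression` against one feature and store the result.

    Hypothetical helper: the feature is exposed to the expression as `f`,
    matching the example command line above.
    """
    # Only `f` is placed in the namespace; Python adds builtins automatically.
    value = eval(expression, {"f": feature})
    feature["properties"][name] = value
    return feature


feature = {"type": "Feature", "id": 1, "properties": {"a": 1, "b": 2}}
result = calc(feature, "c", "f['properties']['a'] + f['properties']['b'] * 2")
print(json.dumps(result))
```

Because `eval()` runs arbitrary Python, a tool like this would only be safe to use with expressions the user trusts.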

There are obviously some details to work out:

- what is in the namespace when the expression is evaluated
- python expressions or some other DSL?
- what's the best interface for args and options

If there's interest in the general idea, I can get started on a PR


geowurster · 2015-09-22T18:10:53Z

@perrygeo Yes! I have been thinking about this as well but have not had a reason to dig in. I have a fio-geoprocessing project that has been sitting around for a while that might interest you. The goal is to take advantage of click command chaining to string together a bunch of shapely and/or custom commands that operate on single features. The open PR covers some of the background but the big question is whether or not we can make the current fio CLI work with click chaining and what complications that introduces for plugin commands. I haven't had much time to work on it, but this issue and #272 are both features I have thought about or prototyped.

This could look something like:

fio \
    cat infile.geojson \
    buffer --dist 100 \
    simplify --tolerance 10 \
    reproject --dst-crs EPSG:4326 \
    calc --new area "shape(f['geometry']).area" \
    load --driver GeoJSON out.geojson

Or maybe just reading from stdin and writing to stdout, with cat and load handling the streaming and dumping to disk:

fio cat infile.geojson | \
    buffer --dist 100 \
    simplify --tolerance 10 \
    reproject --dst-crs EPSG:4326 \
    calc --new area "shape(feature['geometry']).area" |\
        fio load --driver GeoJSON out.geojson

Or as its own subcommand with an initial command to open the file and a final command to save:

fio pipe \
    open infile.geojson \
    buffer --dist 100 \
    simplify --tolerance 10 \
    reproject --dst-crs EPSG:4326 \
    calc --new area "shape(feature['geometry']).area" \
    save --driver GeoJSON out.geojson

perrygeo · 2015-09-22T19:33:43Z

@geowurster Thanks for pointing out fio-geoprocessing - that really opens up the doors for some awesome fio-based workflows. I'll check it out...

As for the click command chaining, it looks great, cleaner syntax for sure. And it avoids serializing-deserializing at each command, firing up a new interpreter, etc. A bit OT for this issue but where does #173 stand?

command-chaining vs pipes aside, what do you think about the interface (--new propertyname expression) and the use of python expressions in general? I noticed you included a call to shape in your example so there is at least one function we should have in the namespace - any others that might be useful?

geowurster · 2015-09-24T02:11:20Z

@perrygeo I only pointed out fio-geoprocessing because we both have had some of the same thoughts and wanted you to see how far I got in the context of this issue, #173, and #272.

#173 is held up by some design decisions that are dictated by whether or not the current CLI can be made chain-able or if chaining will need to be relegated to a subcommand. Each have their pros and cons. I haven't had time to experiment with it but it miiiiiiight be possible to make the current CLI chain-able.

As far as the syntax and use of Python expressions go, see my comment in #272.

perrygeo · 2015-09-24T09:11:18Z

Does @geowurster's pyin utility cover this use case? Yep, and more. Maybe a fio calc is superfluous?

the original example would become

pyin \
  "json.loads(line)" \
  "line['properties']['c'] = line['properties']['a'] + line['properties']['b'] * 2"

simple enough and more general-purpose. If we had dot notation in pyin, it starts looking even better at the expense of being a little bit longer

pyin \
  "json.loads(line)" \
  "from munch import munchify; munchify(line)" \
  "line.properties.c = line.properties.a + line.properties.b * 2"

So is there any value in a fio calc when other tools will work? Maybe it would just be a convenience to handle the json and munch stuff and expose the feature as f.

geowurster · 2015-09-24T12:56:48Z

@perrygeo No need to import munch, "munch.munchify(json.loads(line))" works automagically™.

Unfortunately eval() can't handle assignment, so your last statement throws a syntax error, but exec() can. With a little regex to direct expressions to the right function, and some experimentation to see what exec() has access to, I can definitely bring it into pyin.
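The difference between the two built-ins can be seen with the standard library alone. The snippet below is a minimal sketch that uses a plain dict in place of a munchified feature; the variable names are illustrative only.

```python
line = {"properties": {"a": 1, "b": 2}}
statement = (
    "line['properties']['c'] = "
    "line['properties']['a'] + line['properties']['b'] * 2"
)

# eval() only accepts expressions, so an assignment is rejected at compile time.
try:
    eval(statement, {"line": line})
    eval_failed = False
except SyntaxError:
    eval_failed = True

# exec() accepts statements; the dict is mutated in place through the namespace.
exec(statement, {"line": line})

print(eval_failed)
print(line["properties"])
```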

Another option is to depend on pyin and use its pmap() generator for fio calc, which would let us control the variable name and feed it loaded dictionaries. Adding an optional base scope argument would be pretty easy.

sgillies · 2016-05-05T16:49:23Z

@geowurster @perrygeo is this one resolved or is it still active?

perrygeo · 2016-05-06T18:57:42Z

I'd love to see something like this, the geojson cli equivalent to "attribute table calculator" which is core functionality in many GIS workflows. Just waiting for some consensus on if/how it should be implemented in fiona. Still an open conversation, let's not close just yet.

My take: adding this functionality would be simple and generically useful. I've developed lots of feature processing functions which do more specialized tasks (e.g. add a new id property with a uuid) but I realize they could all be solved with a general tool that evaluates python expressions and stores those values as properties.
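As a rough illustration of that point, the uuid case reduces to a single expression once the evaluator accepts extra names in its scope. The `add_property` helper and its `namespace` argument are hypothetical and are not part of fiona or pyin.

```python
import uuid


def add_property(feature, name, expression, namespace=None):
    """Store the value of `expression` as a new property on the feature.

    Hypothetical helper: `namespace` supplies extra names (modules,
    functions) that the expression may use alongside the feature `f`.
    """
    scope = dict(namespace or {})
    scope["f"] = feature
    feature["properties"][name] = eval(expression, scope)
    return feature


feature = {"type": "Feature", "id": 1, "properties": {"a": 1, "b": 2}}
add_property(feature, "uid", "str(uuid.uuid4())", {"uuid": uuid})
print(feature["properties"]["uid"])
```

The same helper covers the arithmetic example from the start of the thread, so a specialized function per task would not be needed.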

Is that a good idea? Are there other general json processing tools that already do this? Or is this the realm of python code and doesn't really fit the command line?

geowurster · 2016-05-18T06:02:27Z

Agreed that a command line field calculator would be a great addition. Seems like eval() should be able to handle this as well, although I'm not sure it supports item assignment, so maybe exec()?

perrygeo added enhancement cli labels Sep 22, 2015

geowurster mentioned this issue Sep 24, 2015

CLI for filtering features, fio filter #272

Merged

perrygeo mentioned this issue Sep 26, 2015

CLI streaming perrygeo/python-rasterstats#100

Closed

perrygeo mentioned this issue Feb 17, 2016

Geometry sequence as features mapbox/cligj#14

Merged

perrygeo mentioned this issue May 21, 2016

fio calc #348

Merged

sgillies added this to the 1.7 milestone May 25, 2016

geowurster closed this as completed in #348 May 27, 2016
