How to keep the original filename on merge & --split #396

christian-weiss · 2020-01-08T09:49:37Z

Is there a built-in way to keep the original filenames after a split? Or do I have to prepare the input files by adding a .feature[].properties.originalFilename field and then rename the output files based on that field?

This question is inspired by: #365

mbloch · 2020-01-08T18:07:10Z

@christian-weiss, could you elaborate a bit on how you want the output from -split to be named? -split transforms a single layer into multiple layers... do you want all of the split-apart layers to have the same name (the name of the original file)?

christian-weiss · 2020-01-09T17:18:12Z

Yes, the same as the original filenames.
If someone is using --split only, then the output names can stay as they currently are (generated).
But if someone uses combine-files and -merge-layers in combination with --split, it would be very cool to reuse the names of the merged (input) files.

mbloch · 2020-01-10T03:43:38Z

If I understand correctly, you want to input multiple files as separate layers, merge them to a single layer, edit the merged layer, then split the layer apart again and save the layers using the original file names... here's how you can do it.

mapshaper src/*.json combine-files \
-each 'name = this.layer_name' \
-merge-layers name="" \
-split name \
-o out/

christian-weiss · 2020-01-11T11:22:04Z

Resulted in an error:

Error: [each] Command expects a single value. Received: name = this.layer_name
Run mapshaper -h to view help

The version I use is 0.4.152.

mbloch · 2020-01-11T13:14:33Z

The expression you give the -each command has to be surrounded in single or double quotes, like in my example (-each 'name = this.layer_name'). The error you reported would occur if you left off the quotes.

christian-weiss · 2020-01-11T22:38:35Z

My initial command already used -each 'name = this.layer_name', but thanks for pointing this out, as it helped me find an issue in my wrapper script.

My request was:

mapshaper temp/raw/*.geojson combine-files -each 'name = this.layer_name' -merge-layers name="" -split name -o temp/simplified/

Where mapshaper is a wrapper script to pass all options to a docker container:

#!/usr/bin/env bash
myUsername=$(whoami)
uid=$(id -u $myUsername)
gid=$(id -g $myUsername)
exec docker run --rm -v $(pwd):/data --user "$uid:$gid" freifunkhamm/mapshaper:latest $@

Last line is now changed to:

exec docker run --rm -v $(pwd):/data --user "$uid:$gid" freifunkhamm/mapshaper:latest "$@"

to handle the single quotes issue.

Another option is to omit the single quotes and run:

mapshaper temp/raw/*.geojson combine-files -each name=this.layer_name -merge-layers name="" -split name -o temp/simplified/

without spaces before and after =.

mbloch · 2020-01-11T22:54:33Z

I see... if you're passing one shell command through another shell command, then things get a bit more complicated. You may need to use nested quotes and/or add escape characters.
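The effect of quoting "$@" in a wrapper script can be reproduced without Docker or mapshaper. The following is a minimal bash sketch; count_args, wrapper_unquoted, and wrapper_quoted are hypothetical helper names used only for illustration and are not part of the wrapper script above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of mapshaper or the wrapper.

# Print how many arguments were received.
count_args() { echo "$#"; }

# Unquoted $@: every argument is split again on whitespace.
wrapper_unquoted() { count_args $@; }

# Quoted "$@": every argument is forwarded as a single word.
wrapper_quoted() { count_args "$@"; }

unquoted_count=$(wrapper_unquoted -each 'name = this.layer_name')
quoted_count=$(wrapper_quoted -each 'name = this.layer_name')

echo "unquoted: $unquoted_count"  # the expression arrives as three separate words
echo "quoted:   $quoted_count"    # the expression arrives as one word
```

With the unquoted form, -each is followed by three separate words instead of one expression, which matches the "Command expects a single value" error reported earlier in this thread.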

christian-weiss · 2020-01-12T08:40:13Z

@mbloch Is there a list of available reference names (like this.layer_name) or a way to output available references?

Why is it necessary to reset the name of the merged layer with -merge-layers name=""?
I guess it is because there is an internal mechanism that generates a name for edited layers, and you want to prevent these edited layers from being output by -split name later on.

Please confirm that the meaning of -split name is: "split by field name; take only features with a non-empty value in that field". Output of mapshaper -help split is not 100% accurate, as

Command
  -split        split features into separate layers using a data field

should be

Command
  -split        generates one dedicated layer per feature

as using a field= or <field> is optional, as stated in the options.

Suggestion for field= or <field>

Options
  <field>       shortcut for field=
  field=        split by this field name; features with a missing or empty field are skipped

Current format of help screen looks broken:

Options
  <field>       shortcut for field=
  field=        name of an attribute field (omit to split all features)
  no-replace, + retain the original layer(s) instead of replacing
  target=       layer(s) to target (comma-sep. list)

should be formatted:

Options
  <field>         shortcut for field=
  field=          name of an attribute field (omit to split all features)
  no-replace      keep the original layer(s) instead of replacing
  target=         layer(s) to target (comma-sep. list)

The description is not 100% clear, as one needs to know how mapshaper is organized and operates internally.
When does mapshaper replace a layer? What kinds of layers exist other than the original layer?
How do I identify a target layer? Is there a layer name or id? How can I find out its name? It would be good if this screen referred to related documentation (describing the internal operations and representation of data).

mbloch · 2020-01-15T05:48:06Z

You brought up a lot of different things...

The command line help is very concise and doesn't give full explanations for many features. My eventual goal is to create a documentation site for mapshaper. Meanwhile, the GitHub wiki has more documentation than the -help command.

The entry for -each in the Command Reference page of the wiki has information on JavaScript expressions, including this.layer_name (https://github.com/mbloch/mapshaper/wiki/Command-Reference#-each)

I slightly reformatted the help display for the +/no-replace option, hopefully it looks a bit better now:

+, no-replace  retain both input and output layer(s)

The wiki has some information about the meaning of target= and +. See https://github.com/mbloch/mapshaper/wiki/Introduction-to-the-Command-Line-Tool#working-with-layers

To summarize, mapshaper refers to the main input layer or layers of a command as "targets". By default, editing commands modify their target layers (by "modify", I mean they replace the input data with the output data). If you want to retain the original contents of a command's target layer(s) instead of replacing them, you would use the + option (not all commands support +).

By default, the output of a command becomes the target for the next command (for most commands). Most of the time, you won't need to explicitly set a command's target. But if you do need to set a command's target, then you can use the -target command or use a command's target= option. Target switching is only needed when you are working with multiple layers at the same time.

As for the description of the -split command... I changed the wording a bit. I do not favor your suggestion of "generates one dedicated layer per feature", because the command can create multiple-feature layers if you use the field= option.

The reason for using name="" with the -merge-layers command is related to the way that -split names its output layers. If the layer you are splitting has a non-empty name, then the split-apart layers will include the original layer name in their names. You can see how it works by experimenting yourself.

Hope this is useful.

christian-weiss changed the title ~~How to keep the original filename on --merge & --split~~ How to keep the original filename on merge & --split Jan 11, 2020

mbloch closed this as completed Jan 11, 2020
