How to keep the original filename on merge & --split #396

Closed
christian-weiss opened this issue Jan 8, 2020 · 9 comments

Comments

@christian-weiss

Is there a built-in way to keep the original filenames after split? Or do I have to prepare the input files by adding a .feature[].properties.originalFilename and then rename the output files based on that field?

This question is inspired by: #365

@mbloch
Owner

mbloch commented Jan 8, 2020

@christian-weiss, could you elaborate a bit on how you want the output from -split to be named? -split transforms a single layer into multiple layers... do you want all of the split-apart layers to have the same name (the name of the original file)?

@christian-weiss
Author

christian-weiss commented Jan 9, 2020

Yes, same as the original filenames.
If someone is using --split only, then the output can stay as it currently is (generated).
But if someone uses combine-files & -merge-layers in combination with --split, it would be very cool to reuse the names of the merged (input) files.

@mbloch
Copy link
Owner

mbloch commented Jan 10, 2020

If I understand correctly, you want to input multiple files as separate layers, merge them to a single layer, edit the merged layer, then split the layer apart again and save the layers using the original file names... here's how you can do it.

mapshaper src/*.json combine-files \
-each 'name = this.layer_name' \
-merge-layers name="" \
-split name \
-o out/
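
The tag, merge, split-by-tag idea behind this command can be illustrated without mapshaper. The sketch below is only an analogy: plain text lines stand in for features, awk stands in for the mapshaper commands, and every file and directory name is made up for the illustration.

```sh
#!/bin/sh
# Analogy only: plain text lines stand in for features and no mapshaper is involved.
# Every file and directory name below is made up for the illustration.
work=$(mktemp -d)
mkdir "$work/src" "$work/out"
printf 'a1\na2\n' > "$work/src/north.txt"
printf 'b1\n' > "$work/src/south.txt"

# Like -each 'name = this.layer_name' followed by -merge-layers:
# tag every record with its source name, then collect everything in one file.
for f in "$work"/src/*.txt; do
  awk -v name="$(basename "$f" .txt)" '{ print name "\t" $0 }' "$f"
done > "$work/merged.tsv"

# Like -split name: write each record to an output file named after its tag.
awk -F '\t' -v out="$work/out" '{ print $2 > (out "/" $1 ".txt") }' "$work/merged.tsv"

north_content=$(cat "$work/out/north.txt")
south_content=$(cat "$work/out/south.txt")
ls "$work/out"
```

Because the source name travels with each record through the merge, the split step can recover the original grouping and file names.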

@christian-weiss
Author

Resulted in an error:

Error: [each] Command expects a single value. Received: name = this.layer_name
Run mapshaper -h to view help

The version I use is 0.4.152

@christian-weiss christian-weiss changed the title How to keep the original filename on --merge & --split How to keep the original filename on merge & --split Jan 11, 2020
@mbloch
Owner

mbloch commented Jan 11, 2020

The expression you give the -each command has to be surrounded in single or double quotes, like in my example (-each 'name = this.layer_name'). The error you reported would occur if you left off the quotes.

@christian-weiss
Author

My initial command already used -each 'name = this.layer_name', but thanks for pointing this out, as it helped me find an issue in my wrapper script.

My request was:

mapshaper temp/raw/*.geojson combine-files -each 'name = this.layer_name' -merge-layers name="" -split name -o temp/simplified/

Where mapshaper is a wrapper script to pass all options to a docker container:

#!/usr/bin/env bash
myUsername=$(whoami)
uid=$(id -u $myUsername)
gid=$(id -g $myUsername)
exec docker run --rm -v $(pwd):/data --user "$uid:$gid" freifunkhamm/mapshaper:latest $@

Last line is now changed to:

exec docker run --rm -v $(pwd):/data --user "$uid:$gid" freifunkhamm/mapshaper:latest "$@"

to handle the single quotes issue.

Another option is to write the expression so that it contains no whitespace (the quotes then no longer matter) and run:

mapshaper temp/raw/*.geojson combine-files -each 'name=this.layer_name' -merge-layers name="" -split name -o temp/simplified/

without spaces before and after =.
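
The word splitting behind both fixes can be checked without Docker. In this minimal sketch, count_args, forward_unquoted and forward_quoted are hypothetical stand-ins for the docker command and the two versions of the wrapper's last line; they only report how many arguments arrive.

```sh
#!/bin/sh
# Hypothetical stand-ins for the wrapper script: count_args plays the role of
# the docker command and simply reports how many arguments it received.
count_args() { echo "$#"; }
forward_unquoted() { count_args $@; }    # like the original last line: arguments are word-split again
forward_quoted() { count_args "$@"; }    # like the changed last line: arguments pass through unchanged

spaces_unquoted=$(forward_unquoted -each 'name = this.layer_name')
spaces_quoted=$(forward_quoted -each 'name = this.layer_name')
nospaces_unquoted=$(forward_unquoted -each 'name=this.layer_name')

printf '%s\n' "with spaces, unquoted forwarding: $spaces_unquoted arguments"
printf '%s\n' "with spaces, quoted forwarding:   $spaces_quoted arguments"
printf '%s\n' "no spaces, unquoted forwarding:   $nospaces_unquoted arguments"
```

With unquoted forwarding, the expression containing spaces arrives as three separate words after -each (four arguments in total), while the quoted form and the whitespace-free expression both arrive as two arguments.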

@mbloch
Owner

mbloch commented Jan 11, 2020

I see... if you're passing one shell command through another shell command, then things get a bit more complicated. You may need to use nested quotes and/or add escape characters.
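
As a rough sketch of what nested quotes and escape characters look like in practice, the snippet below passes the same expression through a second shell layer. The inner sh -c is only a stand-in for any nested shell, not for the Docker wrapper itself; set -- loads the arguments and $# reports how many the inner shell sees.

```sh
#!/bin/sh
# Sketch: the inner `sh -c` stands in for any second shell layer.

# Nested quotes: outer single quotes protect the inner double quotes.
nested=$(sh -c 'set -- -each "name = this.layer_name"; echo "$#"')

# Escape characters: backslashes protect the spaces, and \$ defers $# to the inner shell.
escaped=$(sh -c "set -- -each name\\ =\\ this.layer_name; echo \$#")

# No protection: the inner shell splits the expression into separate words.
unprotected=$(sh -c 'set -- -each name = this.layer_name; echo "$#"')

printf '%s\n' "nested quotes: $nested, escapes: $escaped, unprotected: $unprotected"
```

Both protected forms deliver the expression as a single argument after -each; the unprotected form does not.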

@mbloch mbloch closed this as completed Jan 11, 2020
@christian-weiss
Author

christian-weiss commented Jan 12, 2020

@mbloch Is there a list of available reference names (like this.layer_name) or a way to output available references?

Why is it required to rename the edited layers to name="" (by using -merge-layers name="")?
I guess it is because an internal mechanism generates a name for edited layers, and you want to prevent these edited layers from being output by -split name later on.

Please confirm that the meaning of -split name is: "split by field name; take only features with a non-empty value in that field". The output of mapshaper -help split is not 100% accurate, as

Command
  -split        split features into separate layers using a data field

should be

Command
  -split        generates one dedicated layer per feature

because using field= or <field> is optional, as stated in the options.

Suggestion for field= or <field>

Options
  <field>       shortcut for field=
  field=        split by this field name; features where the field is missing or empty are skipped

The current format of the help screen looks broken:

Options
  <field>       shortcut for field=
  field=        name of an attribute field (omit to split all features)
  no-replace, + retain the original layer(s) instead of replacing
  target=       layer(s) to target (comma-sep. list)

should be formatted:

Options
  <field>         shortcut for field=
  field=          name of an attribute field (omit to split all features)
  no-replace      keep the original layer(s) instead of replacing
  target=         layer(s) to target (comma-sep. list)

The description is not 100% clear, as one needs to know how mapshaper is organized and operates internally.
When does mapshaper replace a layer? What kinds of layers other than the original layer exist?
How do I identify a target layer? Is there a layer name or id? How can I find out its name? It would be good if this screen gave some references to related documentation (describing internal operations and the representation of data).

@mbloch
Owner

mbloch commented Jan 15, 2020

You brought up a lot of different things...

The command line help is very concise and doesn't give full explanations for many features. My eventual goal is to create a documentation site for mapshaper. Meanwhile, the GitHub wiki has more documentation than the -help command.

The entry for -each in the Command Reference page of the wiki has information on JavaScript expressions, including this.layer_name (https://github.com/mbloch/mapshaper/wiki/Command-Reference#-each)

I slightly reformatted the help display for the +/no-replace option; hopefully it looks a bit better now:

+, no-replace  retain both input and output layer(s)

The wiki has some information about the meaning of target= and +. See https://github.com/mbloch/mapshaper/wiki/Introduction-to-the-Command-Line-Tool#working-with-layers

To summarize, mapshaper refers to the main input layer or layers of a command as "targets". By default, editing commands modify their target layers (by "modify", I mean they replace the input data with the output data). If you want to retain the original contents of a command's target layer(s) instead of replacing them, you would use the + option (not all commands support +).

By default, the output of a command becomes the target for the next command (for most commands). Most of the time, you won't need to explicitly set a command's target. But if you do need to set a command's target, then you can use the -target command or use a command's target= option. Target switching is only needed when you are working with multiple layers at the same time.

As for the description of the -split command... I changed the wording a bit. I do not favor your suggestion of "generates one dedicated layer per feature", because the command can create multiple-feature layers if you use the field= option.

The reason for using name="" with the -merge-layers command is related to the way that -split names its output layers. If the layer you are splitting has a non-empty name, then the split-apart layers will include the original layer name in their names. You can see how it works by experimenting yourself.

Hope this is useful.
