New layer-stats tool to explore data #293

Merged: 2 commits, Jun 5, 2020

@nyurik nyurik commented Jun 5, 2020

The tool is modeled on the OpenMapTiles qa/ tools. For a given layer, it reports statistics on one or more fields at each zoom level. It currently supports three methods:

  • frequency -- Shows how often each unique value occurs in a layer's column. If more than one column is given, shows unique combinations.
  • toplength -- Shows the longest N values for a given layer's column.
  • variance -- Shows statistical metrics (count, min, max, average, standard deviation, variance) for a column's numeric values.
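
To make the three methods concrete, here is a minimal pure-Python sketch of what each one computes. It is an illustration only, not the tool's implementation: the in-memory `ROWS` data and the function signatures are hypothetical, and the real tool reads from the layer's data for each zoom. Whether the tool reports population or sample variance is not stated here; the sketch assumes population statistics.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev, pvariance

# Hypothetical rows standing in for layer features at different zooms.
ROWS = [
    {"zoom": 13, "class": "grass", "name": "Park A", "height": 4},
    {"zoom": 14, "class": "grass", "name": "Park A", "height": 4},
    {"zoom": 14, "class": "grass", "name": "Long Garden Name", "height": 10},
    {"zoom": 14, "class": "wood", "name": "Forest", "height": 16},
]


def frequency(rows, fields):
    """Count each unique combination of field values, per zoom."""
    result = defaultdict(Counter)
    for row in rows:
        combo = tuple(row[f] for f in fields)
        result[combo][row["zoom"]] += 1
    return result


def toplength(rows, field, zoom, limit=2):
    """Return the `limit` longest distinct values of a field at one zoom."""
    values = {row[field] for row in rows if row["zoom"] == zoom}
    # Longest first; ties are broken alphabetically for a stable order.
    ranked = sorted(values, key=lambda v: (-len(v), v))
    return [(v, len(v)) for v in ranked[:limit]]


def variance(rows, field, zoom):
    """Summarize a numeric field at one zoom (population statistics assumed)."""
    values = [row[field] for row in rows if row["zoom"] == zoom]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": mean(values),
        "stddev": pstdev(values),
        "variance": pvariance(values),
    }
```

Passing several names to `frequency`, for example `["class", "subclass"]`, counts unique combinations, which matches the multi-column behavior described above.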

This is a simple test script I used outside of Docker; it could be adapted to run inside Docker as well. https://gist.github.com/nyurik/457102422adfb380ec9e8552d7656112

@TomPohys we can merge this and then add a new flag, e.g. --markdown to generate .md output.
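
As a rough idea of what .md output could look like, the sketch below renders a header row and data rows as a Markdown table. The `--markdown` flag does not exist yet, and `to_markdown` is a hypothetical helper written for illustration, not part of the tool.

```python
def to_markdown(headers, rows):
    """Render headers and rows as a Markdown table (hypothetical helper)."""

    def line(cells):
        # Empty cells (None) stay blank; literal pipes are escaped.
        texts = ["" if c is None else str(c).replace("|", "\\|") for c in cells]
        return "| " + " | ".join(texts) + " |"

    out = [line(headers), line(["---"] * len(headers))]
    out.extend(line(row) for row in rows)
    return "\n".join(out)


# Example using a few values from the landcover sample output.
TABLE = to_markdown(
    ["class", "subclass", "z13", "z14"],
    [["grass", "garden", 2, 29], ["sand", "beach", None, 4]],
)
```

Zooms with no matching features are passed as `None` here and render as empty cells, mirroring the blank columns in the plain-text output.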

Sample output:

$ layer-stats frequency openmaptiles.yml landcover class subclass
======= Analyzing value occurrence in layer 'landcover' for fields [class, subclass] =======
class    subclass      z0    z1    z2    z3    z4    z5    z6    z13    z14
-------  ----------  ----  ----  ----  ----  ----  ----  ----  -----  -----
ice      glacier       11    11   377   377   377  1886  1886
ice      ice_shelf                 64    64    64   159   159
grass    garden                                                    2     29
grass    park                                                      2     12
grass    grass                                                            1
sand     beach                                                            4
wood     forest                                                           2
wood     wood                                                             3


$ layer-stats toplength openmaptiles.yml water_name name_en
======= Analyzing longest field values in layer 'water_name' for field [name_en] =======
name_en                       z14
--------------------------  -----
Stade nautique Rainier III     26
Fontaine Japonaise             18
Le Méridien • Pool             18
Fontaine du Casino             18


$ layer-stats variance openmaptiles.yml building render_height
======= Analyzing field value variance in layer 'building' for field [render_height] =======
zoom      count    min    max      avg    stddev    variance
------  -------  -----  -----  -------  --------  ----------
z14        1171      4    170  6.96328   11.1951     125.329

@nyurik nyurik requested a review from TomPohys June 5, 2020 04:36
@nyurik nyurik merged commit b486d0b into openmaptiles:master Jun 5, 2020
@nyurik nyurik deleted the layer-stats branch June 5, 2020 16:53
TomPohys added a commit to openmaptiles/openmaptiles that referenced this pull request Nov 12, 2020
The new OMT-T release (5.3) adds the [`layer-stats`](openmaptiles/openmaptiles-tools#293) tool.

With this release, the `make` target `generate-qareports` can be replaced by `generate-qa`.

Usage:
```
make generate-qa STAT_FUNCTION=frequency LAYER=transportation ATTRIBUTE=class
```