# OSM tags filter

**QuackOSM** allows users to filter the data read from a `*.osm.pbf` file. Filtering reduces the number of features parsed from the original file.

This notebook will explain how to use the OSM tags filtering mechanism.

## Filter format

The library expects a filter in `dict` form (or as `JSON` if provided via the CLI).
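Because a filter is a plain `dict`, it can be turned into a JSON string with the standard `json` module whenever it has to be passed as text. The snippet below is a minimal sketch of that conversion only; the example filter is made up for illustration, and how the string is handed to the CLI is not shown here.

```python
import json

# A filter written as a regular Python dict (illustrative values).
tags_filter = {"amenity": ["cafe", "restaurant"], "leisure": True}

# The same filter as a JSON string; Python `True` becomes JSON `true`.
tags_filter_json = json.dumps(tags_filter)
print(tags_filter_json)

# Parsing the string gives back an equal dict.
restored_filter = json.loads(tags_filter_json)
```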

**QuackOSM** uses two formats of filters: `OsmTagsFilter` and `GroupedOsmTagsFilter`.

In [None]:
from quackosm._osm_tags_filters import GroupedOsmTagsFilter, OsmTagsFilter

The first one, `OsmTagsFilter`, is a basic `dict` object that defines how to filter OSM features based on their tags.

It is based on the filter object used in the [OSMnx](https://osmnx.readthedocs.io/en/stable/index.html) library, but it offers more functionality.

In [None]:
OsmTagsFilter

Each key of the `dict` is expected to be an OSM tag key, and each value can be one of: a `bool`, a single OSM tag value, or a list of OSM tag values.

In [None]:
# amenity=bench
filter_1 = {"amenity": "bench"}

# amenity=ice_cream or amenity=cafe
filter_2 = {"amenity": ["ice_cream", "cafe"]}

# all amenities
filter_3 = {"amenity": True}

# amenity=bar or building=office
filter_4 = {"amenity": "bar", "building": "office"}

# all amenities and all highways
filter_5 = {"amenity": True, "highway": True}
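To make the meaning of these values concrete, here is a small pure-Python sketch of how the tags of a single feature could be checked against such a filter. The `matches_filter` helper is hypothetical and not part of QuackOSM. It assumes that, as in OSMnx, a feature is kept when it matches any key of the filter (a union, not an intersection), and it only handles `True`, a single value, and a list of values.

```python
from typing import Union

# Allowed filter values: a bool, a single tag value or a list of tag values.
TagValue = Union[bool, str, list[str]]


def matches_filter(tags: dict[str, str], tags_filter: dict[str, TagValue]) -> bool:
    """Return True if the feature's tags satisfy at least one filter entry."""
    for key, expected in tags_filter.items():
        if key not in tags:
            continue
        if isinstance(expected, bool):
            # True keeps every value of this key; False is not handled in this sketch.
            if expected:
                return True
        elif isinstance(expected, str):
            if tags[key] == expected:
                return True
        elif tags[key] in expected:
            return True
    return False


# Two made-up features, described only by their tags.
bench = {"amenity": "bench", "backrest": "yes"}
office = {"building": "office"}

print(matches_filter(bench, {"amenity": "bench"}))
print(matches_filter(bench, {"amenity": ["ice_cream", "cafe"]}))
print(matches_filter(office, {"amenity": "bar", "building": "office"}))
```

The first and third calls match, while the second does not, because `bench` is not in the list of accepted `amenity` values.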

The second object, `GroupedOsmTagsFilter`, allows assigning filters to groups. It is a `dict` object with the group name as the key and an `OsmTagsFilter` as the value.

This can be useful for grouping features into semantic categories for machine learning applications.

In [None]:
GroupedOsmTagsFilter

In [None]:
# benches
grouped_filter_1 = {"benches": {"amenity": "bench"}}

# swimming sport facilities
grouped_filter_2 = {"swimming_sport": {"leisure": "swimming_pool", "sport": "swimming"}}

# shops, tourism and traffic related objects
grouped_filter_3 = {
    "shopping": {"shop": True, "landuse": "retail"},
    "tourism": {"tourism": True, "historic": True},
    "traffic": {"amenity": "parking", "highway": True},
}
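Conceptually, a grouped filter applies the `OsmTagsFilter` of each group separately and records which groups a feature falls into. The following pure-Python sketch illustrates that idea. Both helpers, `entry_matches` and `matching_groups`, are hypothetical and not part of QuackOSM, and they only handle `True`, a single value, and a list of values.

```python
def entry_matches(value: str, expected) -> bool:
    """Check one tag value against one filter value."""
    if expected is True:
        return True
    if isinstance(expected, str):
        return value == expected
    if isinstance(expected, list):
        return value in expected
    return False


def matching_groups(tags: dict, grouped_filter: dict) -> list:
    """Return the names of all groups whose filter matches the feature's tags."""
    return [
        group
        for group, tags_filter in grouped_filter.items()
        if any(
            key in tags and entry_matches(tags[key], expected)
            for key, expected in tags_filter.items()
        )
    ]


example_grouped_filter = {
    "shopping": {"shop": True, "landuse": "retail"},
    "tourism": {"tourism": True, "historic": True},
    "traffic": {"amenity": "parking", "highway": True},
}

# Made-up features, described only by their tags.
parking = {"amenity": "parking", "access": "customers"}
gift_shop = {"shop": "gift", "tourism": "attraction"}

print(matching_groups(parking, example_grouped_filter))
print(matching_groups(gift_shop, example_grouped_filter))
```

A feature can belong to several groups at once, as the gift shop does here, or to none at all.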

## Basic usage

The examples below show how to use the basic OSM tags filters.

In [None]:
import urllib.request

from quackosm import get_features_gdf

In [None]:
monaco_pbf_url = "https://download.geofabrik.de/europe/monaco-latest.osm.pbf"
monaco_pbf_file = "monaco.osm.pbf"
urllib.request.urlretrieve(monaco_pbf_url, monaco_pbf_file)

### Benches only

In [None]:
tags_filter = {"amenity": "bench"}
get_features_gdf(monaco_pbf_file, tags_filter=tags_filter, silent_mode=True)

### Cafes, bars and restaurants

In [None]:
tags_filter = {"amenity": ["cafe", "restaurant", "bar"]}
get_features_gdf(monaco_pbf_file, tags_filter=tags_filter, silent_mode=True)

### All amenities and leisures

In [None]:
tags_filter = {"amenity": True, "leisure": True}
get_features_gdf(monaco_pbf_file, tags_filter=tags_filter, silent_mode=True)

### Shopping and tourism related objects (grouped filters)

In [None]:
grouped_tags_filter = {
    "shopping": {"shop": True, "landuse": "retail"},
    "tourism": {"tourism": True, "historic": True},
}
get_features_gdf(monaco_pbf_file, tags_filter=grouped_tags_filter, silent_mode=True)

## Compact and exploded tags

Parameters logic table.

The table shows how the columns of the result are generated based on the values of the `explode_tags` and `keep_all_tags` parameters, with and without an OSM tags filter being present.

Legend:
- ✔️ - `True`
- ❌ - `False`
- 📦 - Compact tags (single `all_tags` column)
- 💥 - Exploded tags (a separate column for each tag key or group name)

<style type="text/css">
.tg  {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:inherit;border-style:solid;border-width:1px;font-size:1em;
  overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:inherit;border-style:solid;border-width:1px;font-size:1em;
  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-1wig{font-weight:bold;text-align:left;vertical-align:top}
.tg .tg-baqh{text-align:center;vertical-align:top}
.tg .tg-lqy6{text-align:right;vertical-align:top}
.tg .tg-8d8j{text-align:center;vertical-align:bottom}
</style>
<table class="tg">
<thead>
  <tr>
    <th class="tg-lqy6"><code>explode_tags</code></th>
    <th class="tg-baqh" colspan="2"><code>None</code></th>
    <th class="tg-baqh" colspan="2">✔️</th>
    <th class="tg-baqh" colspan="2">❌</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td class="tg-lqy6"><code>keep_all_tags</code></td>
    <td class="tg-baqh">✔️</td>
    <td class="tg-baqh">❌</td>
    <td class="tg-baqh">✔️</td>
    <td class="tg-baqh">❌</td>
    <td class="tg-baqh">✔️</td>
    <td class="tg-baqh">❌</td>
  </tr>
  <tr>
    <td class="tg-1wig">Without OSM tags filter</td>
    <td class="tg-8d8j">📦</td>
    <td class="tg-8d8j">📦</td>
    <td class="tg-8d8j">💥</td>
    <td class="tg-8d8j">💥</td>
    <td class="tg-8d8j">📦</td>
    <td class="tg-8d8j">📦</td>
  </tr>
  <tr>
    <td class="tg-1wig">With OSM tags filter</td>
    <td class="tg-8d8j">📦</td>
    <td class="tg-8d8j">💥</td>
    <td class="tg-8d8j">💥</td>
    <td class="tg-8d8j">💥</td>
    <td class="tg-8d8j">📦</td>
    <td class="tg-8d8j">📦</td>
  </tr>
</tbody>
</table>

<style type="text/css">
.tg  {border-collapse:collapse;border-spacing:0;margin:0px auto;}
.tg td{border-style:solid;border-width:1px;font-size:1em;
  overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-style:solid;border-width:1px;font-size:1em;
  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-0so2{font-family:inherit;text-align:center;vertical-align:middle}
@media screen and (max-width: 767px) {.tg {width: auto !important;}.tg col {width: auto !important;}.tg-wrap {overflow-x: auto;-webkit-overflow-scrolling: touch;margin: auto 0px;}}</style>
<div class="tg-wrap"><table class="tg">
<thead>
  <tr>
    <th class="tg-0so2">OSM filter</th>
    <th class="tg-0so2"><code>keep_all_tags</code></th>
    <th class="tg-0so2"><code>explode_tags</code></th>
    <th class="tg-0so2">Resulting columns</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td class="tg-0so2">✔️</td>
    <td class="tg-0so2">✔️</td>
    <td class="tg-0so2"><code>None</code></td>
    <td class="tg-0so2">📦</td>
  </tr>
  <tr>
    <td class="tg-0so2">✔️</td>
    <td class="tg-0so2">✔️</td>
    <td class="tg-0so2">✔️</td>
    <td class="tg-0so2">💥</td>
  </tr>
  <tr>
    <td class="tg-0so2">✔️</td>
    <td class="tg-0so2">✔️</td>
    <td class="tg-0so2">❌</td>
    <td class="tg-0so2">📦</td>
  </tr>
  <tr>
    <td class="tg-0so2">✔️</td>
    <td class="tg-0so2">❌</td>
    <td class="tg-0so2"><code>None</code></td>
    <td class="tg-0so2">💥</td>
  </tr>
  <tr>
    <td class="tg-0so2">✔️</td>
    <td class="tg-0so2">❌</td>
    <td class="tg-0so2">✔️</td>
    <td class="tg-0so2">💥</td>
  </tr>
  <tr>
    <td class="tg-0so2">✔️</td>
    <td class="tg-0so2">❌</td>
    <td class="tg-0so2">❌</td>
    <td class="tg-0so2">📦</td>
  </tr>
  <tr>
    <td class="tg-0so2">❌</td>
    <td class="tg-0so2"><code>N/A</code></td>
    <td class="tg-0so2"><code>None</code></td>
    <td class="tg-0so2">📦</td>
  </tr>
  <tr>
    <td class="tg-0so2">❌</td>
    <td class="tg-0so2"><code>N/A</code></td>
    <td class="tg-0so2">✔️</td>
    <td class="tg-0so2">💥</td>
  </tr>
  <tr>
    <td class="tg-0so2">❌</td>
    <td class="tg-0so2"><code>N/A</code></td>
    <td class="tg-0so2">❌</td>
    <td class="tg-0so2">📦</td>
  </tr>
</tbody>
</table></div>
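The table can also be read as a small decision rule: an explicit `explode_tags` value always wins, and when it is `None` the tags are exploded only if a filter is present and `keep_all_tags` is `False`. The sketch below encodes that reading. `result_layout` is a hypothetical helper written only to mirror the table; it is not a QuackOSM function.

```python
from typing import Optional


def result_layout(
    has_tags_filter: bool,
    keep_all_tags: bool = False,
    explode_tags: Optional[bool] = None,
) -> str:
    """Return "compact" or "exploded" following the parameters logic table."""
    if explode_tags is not None:
        # An explicit value overrides everything else.
        return "exploded" if explode_tags else "compact"
    if has_tags_filter and not keep_all_tags:
        # Default with a filter: one column per tag key or group name.
        return "exploded"
    # Default without a filter, or when all tags are kept: single `all_tags` column.
    return "compact"


# Print every combination from the table for a quick visual check.
for has_filter in (True, False):
    for keep_all in (True, False):
        for explode in (None, True, False):
            layout = result_layout(has_filter, keep_all, explode)
            print(has_filter, keep_all, explode, layout)
```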

## Positive and negative filters

## Wildcard filters

## Invalid filters