### OSMgetPOI.jl Tutorial
Full documentation: https://github.com/mkloe/OSMgetPOI

In [1]:
include("src/OSMgetPOI.jl")
using .OSMgetPOI

### 1. Download .osm data file
1. Go to https://download.bbbike.org/osm/bbbike/ and select a city.
2. Download the file for a selected city in *OSM XML gzip'd* format.
3. Unpack the archive and save the resulting .osm file to the /datasets directory.

In the future, we plan to add functionality to download the .osm file via an API.
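
Until such a helper exists, steps 1 to 3 could be scripted roughly as below. This is a sketch and not part of OSMgetPOI: osm_gz_url and fetch_osm are hypothetical names, the URL pattern is an assumption about how the BBBike server names its extracts (verify it against the download page), and unpacking relies on a gunzip binary being available on the PATH.

```julia
using Downloads  # standard library since Julia 1.6

# Assumed URL layout of the BBBike extract server (not confirmed by this tutorial).
osm_gz_url(city) = "https://download.bbbike.org/osm/bbbike/$(city)/$(city).osm.gz"

# Hypothetical helper: download the gzip'd extract and unpack it into `dir`.
function fetch_osm(city::AbstractString, dir::AbstractString = "datasets")
    mkpath(dir)                                  # create the target directory if needed
    gz_path = joinpath(dir, "$(city).osm.gz")
    Downloads.download(osm_gz_url(city), gz_path)
    run(`gunzip -f $gz_path`)                    # replaces the .gz file with the .osm file
    return joinpath(dir, "$(city).osm")
end
```

Calling fetch_osm("Singapore") would then be expected to leave datasets/Singapore.osm in place for the next steps.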

### 2. Configure POI types

1. Create a config.json file that describes the types of POIs you want to extract.
    - Note: For this simple tutorial, we'll only extract schools.
    - The documentation describes queries for 25 proposed primary_types and subtypes.
    - To extract all 25 proposed POI types, use the POI_config.json file from the /datasets directory.

The tutorial_config.json file should look as follows:

```
[
    {
        "primary_type": "education",
        "subtypes":
            {
                "subtype": "school",
                "query": "--keep= \" amenity=school =music_school =language_school \""
            }
    }
]
```
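
The query value follows the pattern of a key with a first value, then further "=value" alternatives, all wrapped in quotes (escaped as \" inside the JSON). It resembles the --keep filter syntax of osmfilter, though that reading is an assumption here. A small sketch of composing such a string, where keep_query is a hypothetical helper and not part of OSMgetPOI:

```julia
# Hypothetical helper: build a "--keep=" query from one tag key and its accepted values.
function keep_query(key::AbstractString, vals::Vector{String})
    terms = join(["=" * v for v in vals], " ")   # "=school =music_school ..."
    return "--keep= \" $(key)$(terms) \""
end

query = keep_query("amenity", ["school", "music_school", "language_school"])
println(query)
```

The printed string is the unescaped form of the query value from the config above.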

### 3. Generate the POI vectors
Generate a vector of ProcessedPOI vectors from the .osm file, based on the tutorial_config.json file. Each ProcessedPOI stores its OSM tags in a dictionary.

In [2]:
singapore_school_vector = generate_poi_vectors("Singapore.osm", "datasets", "tutorial_config.json")

1-element Vector{Vector{Main.OSMgetPOI.ProcessedPOI}}:
 [Main.OSMgetPOI.ProcessedPOI("education", "school", 521081225, Dict("name" => "ACC EduHub", "amenity" => "school", "addr:housenumber" => "51"), 1.3027978, 103.8403767), Main.OSMgetPOI.ProcessedPOI("education", "school", 891923353, Dict("school" => "entrance"), 1.34844, 103.9511787), Main.OSMgetPOI.ProcessedPOI("education", "school", 891923364, Dict("barrier" => "gate", "school" => "entrance"), 1.3481086, 103.9519021), Main.OSMgetPOI.ProcessedPOI("education", "school", 1080233666, Dict("amenity" => "school"), 1.303544, 103.7982942), Main.OSMgetPOI.ProcessedPOI("education", "school", 1112379905, Dict("highway" => "crossing"), 1.3421529, 103.6881378), Main.OSMgetPOI.ProcessedPOI("education", "school", 1364155806, Dict("barrier" => "gate"), 1.3461092, 103.8465071), Main.OSMgetPOI.ProcessedPOI("education", "school", 1703872641, Dict("name" => "Hua Language Centre", "amenity" => "school", "addr:unit" => "02-09", "contact:phone" => "+656
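
To show how such a nested result can be explored, here is a minimal sketch that uses a stand-in type. DemoPOI and its field names are assumptions modelled on the printed output above (primary type, subtype, id, tag dictionary, latitude, longitude); the real ProcessedPOI fields may be named differently.

```julia
# Stand-in for ProcessedPOI; the field names are assumed, not taken from the package.
struct DemoPOI
    primary_type::String
    subtype::String
    id::Int
    tags::Dict{String,String}
    lat::Float64
    lon::Float64
end

# Three entries copied from the output above, nested the same way.
poi_vectors = [[
    DemoPOI("education", "school", 521081225,
            Dict("name" => "ACC EduHub", "amenity" => "school", "addr:housenumber" => "51"),
            1.3027978, 103.8403767),
    DemoPOI("education", "school", 891923353, Dict("school" => "entrance"), 1.34844, 103.9511787),
    DemoPOI("education", "school", 1080233666, Dict("amenity" => "school"), 1.303544, 103.7982942),
]]

all_pois = reduce(vcat, poi_vectors)   # flatten the vector of vectors
tagged_schools = filter(p -> get(p.tags, "amenity", "") == "school", all_pois)
println(length(all_pois), " POIs, ", length(tagged_schools), " tagged amenity=school")
```

Looking up tags with get and a default avoids a KeyError for POIs that lack the tag, which is common since every POI carries a different set of tags.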

### 4. Generate a dataframe with the results.
First, we convert the POI vectors into a dataframe.

In [3]:
school_df = create_poi_df(singapore_school_vector)

Row,primary_type,subtype,lat,lon,name,amenity,addr:housenumber,school,barrier,highway,addr:unit,contact:phone,opening_hours,ref,door,wheelchair,access,horse,level,contact:website,designation,addr:city,addr:street,addr:postcode,alt_name,addr:housename,wikipedia,wikidata,note,phone,email,addr:country,fax,operator,website,name:en,source,office,addr:floor,description,name:zh,shop,language,operator:type,full_name,start_date,opening_hours:covid19,bicycle,motor_vehicle,fixme,entrance,name:zh-Hant,grades,short_name,foot,max_age,min_age,sport,vehicle,contact:facebook,name:ms,contact:instagram,disused:amenity,religion,internet_access,building,landuse,name:zh-Hans,name:ta,branch,old_name,GFA,building:levels,denomination,alt,official_name,acronym,construction,name:ko,layer,contact:email,contact:fax,isced:level,blind:description:en,deaf:description:en,outdoor_seating,indoor_seating,handrail,tactile_paving,indoor,source:name,leisure,addr:suburb
Unnamed: 0_level_1,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any
1,education,school,1.3028,103.84,ACC EduHub,school,51,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing
2,education,school,1.34844,103.951,missing,missing,missing,entrance,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing
3,education,school,1.34811,103.952,missing,missing,missing,entrance,gate,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing
4,education,school,1.30354,103.798,missing,school,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing
5,education,school,1.34215,103.688,missing,missing,missing,missing,missing,crossing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing
6,education,school,1.34611,103.847,missing,missing,missing,missing,gate,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing
7,education,school,1.38727,103.87,Hua Language Centre,school,missing,missing,missing,missing,02-09,+6562555060,Mo-Fr 11:00-21:00; Sa-Su 09:00-18:00,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing
8,education,school,1.38727,103.87,Jan & Elly English Language School,school,missing,missing,missing,missing,02-10,67627783,Mo-Fr 12:00-21:00; Sa-Su 09:00-18:00,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing
9,education,school,1.29761,103.794,Gate B,missing,missing,missing,gate,missing,missing,missing,missing,Gate B,yes,yes,customers,no,0,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing
10,education,school,1.30214,103.773,missing,parking,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing


We extracted a dataframe with 93 columns. This is because the schools in the original .osm file are described by different sets of tags, and every unique tag becomes a column. If a school does not have a given tag, the field value is *missing*.
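
The expansion of tags into columns can be illustrated in plain Julia. This is only a sketch of the idea, not the code behind create_poi_df; unlike the output above, it sorts the column names alphabetically so that the result is deterministic.

```julia
# Tag dictionaries of the first three POIs from the output above.
tag_dicts = [
    Dict("name" => "ACC EduHub", "amenity" => "school", "addr:housenumber" => "51"),
    Dict("school" => "entrance"),
    Dict("barrier" => "gate", "school" => "entrance"),
]

# Every unique tag key becomes a column name.
columns = sort!(unique(key for tags in tag_dicts for key in keys(tags)))

# One row per POI; tags the POI does not have are filled with `missing`.
rows = [[get(tags, col, missing) for col in columns] for tags in tag_dicts]

println(columns)
foreach(println, rows)
```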

Let's keep only the columns in which at least 30% of the values are non-missing (a threshold of 0.3).

In [4]:
filtered_school_df = filter_columns_by_threshold(school_df, 0.3)

Row,primary_type,subtype,lat,lon,name,amenity,addr:housenumber,school,addr:city,addr:street,addr:postcode,addr:country,name:zh
Unnamed: 0_level_1,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any,Any
1,education,school,1.3028,103.84,ACC EduHub,school,51,missing,missing,missing,missing,missing,missing
2,education,school,1.34844,103.951,missing,missing,missing,entrance,missing,missing,missing,missing,missing
3,education,school,1.34811,103.952,missing,missing,missing,entrance,missing,missing,missing,missing,missing
4,education,school,1.30354,103.798,missing,school,missing,missing,missing,missing,missing,missing,missing
5,education,school,1.34215,103.688,missing,missing,missing,missing,missing,missing,missing,missing,missing
6,education,school,1.34611,103.847,missing,missing,missing,missing,missing,missing,missing,missing,missing
7,education,school,1.38727,103.87,Hua Language Centre,school,missing,missing,missing,missing,missing,missing,missing
8,education,school,1.38727,103.87,Jan & Elly English Language School,school,missing,missing,missing,missing,missing,missing,missing
9,education,school,1.29761,103.794,Gate B,missing,missing,missing,missing,missing,missing,missing,missing
10,education,school,1.30214,103.773,missing,parking,missing,missing,missing,missing,missing,missing,missing
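
The idea behind threshold filtering can be sketched without any packages. The snippet below is an illustration under the assumption that the threshold is compared against the share of non-missing values per column; it is not the implementation of filter_columns_by_threshold, and the small table is made up from values seen above.

```julia
# A toy table: column name => column values.
table = Dict(
    "name"    => ["ACC EduHub", missing, missing, "Hua Language Centre"],
    "amenity" => ["school", missing, "school", "school"],
    "barrier" => [missing, missing, "gate", missing],
)

# Share of non-missing values in a column.
non_missing_share(col) = count(!ismissing, col) / length(col)

threshold = 0.3
kept = sort([colname for (colname, col) in table if non_missing_share(col) >= threshold])
println(kept)
```

Here name (0.5) and amenity (0.75) pass the threshold, while barrier (0.25) is dropped.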


We can also filter by selected column names. The resulting dataframe then contains the following columns:
- *primary_type* from config file
- *subtype* from config file
- *lat* and *lon* - from parsed .osm file
- columns from the vector in function argument

In [5]:
filtered_by_colnames_school_df = filter_columns_by_colnames(school_df, ["name", "name:en"])

Row,primary_type,subtype,lat,lon,name,name:en
Unnamed: 0_level_1,Any,Any,Any,Any,Any,Any
1,education,school,1.3028,103.84,ACC EduHub,missing
2,education,school,1.34844,103.951,missing,missing
3,education,school,1.34811,103.952,missing,missing
4,education,school,1.30354,103.798,missing,missing
5,education,school,1.34215,103.688,missing,missing
6,education,school,1.34611,103.847,missing,missing
7,education,school,1.38727,103.87,Hua Language Centre,missing
8,education,school,1.38727,103.87,Jan & Elly English Language School,missing
9,education,school,1.29761,103.794,Gate B,missing
10,education,school,1.30214,103.773,missing,missing
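
The column selection described above amounts to the four fixed columns followed by the requested ones. A minimal sketch of that rule follows; how the package treats requested names that do not exist in the dataframe is not stated here, so skipping them (as done below) is an assumption, and name:fr is a made-up example of such a name.

```julia
base_columns = ["primary_type", "subtype", "lat", "lon"]   # always kept
available = ["primary_type", "subtype", "lat", "lon", "name", "amenity", "name:en"]
requested = ["name", "name:en", "name:fr"]

# Keep the fixed columns plus every requested column that actually exists.
selected = vcat(base_columns, [c for c in requested if c in available])
skipped = setdiff(requested, available)

println(selected)
println("not found: ", skipped)
```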


### 5. Save the results to a .csv file.

In [6]:
using CSV
CSV.write("output_csv/singapore_schools.csv", filtered_school_df)

"output_csv/singapore_schools.csv"
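
Writing fails if the target directory does not exist yet, so it can help to create it first. The sketch below shows the pattern with the standard library only, writing two hand-made CSV lines into a temporary directory instead of calling CSV.write, so that it runs without extra packages and does not touch the project folder.

```julia
# Use a temporary location for the demonstration.
out_path = joinpath(mktempdir(), "output_csv", "singapore_schools.csv")
mkpath(dirname(out_path))          # create output_csv/ before writing

open(out_path, "w") do io
    println(io, "primary_type,subtype,lat,lon,name")
    println(io, "education,school,1.3028,103.84,ACC EduHub")
end

# Read the header back as a quick sanity check.
header = split(readline(out_path), ',')
println(header)
```

In the notebook itself, calling mkpath("output_csv") before the CSV.write line would serve the same purpose.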

### Remarks
This research was funded in whole or in part by the National Science Centre, Poland (grant 2021/41/B/HS4/03349). For the purpose of Open Access, and with respect to the software's documentation, the author has applied a CC-BY public copyright licence to any Author Accepted Manuscript (AAM) version arising from this submission.