Wrong municipality name for Pedersöre #17

Open
dataninjafi opened this issue Aug 18, 2019 · 3 comments

@dataninjafi

In municipality_key_2019$kunta_name, Pedersöre's name is "Pedersören kunta", while the other municipalities lack the " kunta" ending.

@antagomir
Member

I think this has been implemented in inst/extras/create_municipality_keys.R and "kunta" has been added afterwards. This might be for compatibility reasons with other data sources (@muuankarski ?).

I also think that "Pedersöre" would be handier. This brings to mind two topics to decide:

  1. Should we stick to the original names in the default data table, and then provide a separate wrapper that can be used to further harmonize the names or convert them into different formats, depending on the compatibility needs?

  2. Should the field "kunta_name" be renamed to "municipality_fi" or something similar?

@antagomir added the documentation, enhancement, help wanted, and question labels Aug 18, 2019
@antagomir added this to the First CRAN release milestone Aug 18, 2019
@muuankarski
Collaborator

Most probably "Pedersören kunta" is the correct name in Finnish. I had a quick look at a few municipality data sources (code below), and they all had "Pedersören kunta" in Finnish and "Pedersöre" in Swedish. Wikipedia and the municipality's own website (at the bottom of the page) also use "Pedersören kunta". It is odd and there must be a reason for it, but I think we had better stick to "Pedersören kunta" in the Finnish names.

As for column names, there are also name_fi and name_sv columns. name_fi equals kunta_name, so kunta_name could be removed completely.

# Let's query some municipality data to see how Pedersöre is written
library(dplyr)
library(rvest)

# 1. Statistics Finland (Tilastokeskus) municipality classification
## In Finnish
read_html("https://www.tilastokeskus.fi/meta/luokitukset/kunta/001-2019/index.html") %>% 
  html_table(fill = TRUE) %>% 
  .[2] %>%
  .[[1]] %>% 
  as_tibble(.name_repair = "universal") %>% 
  filter(grepl("Pedersö", X2))
# X1 X2              
# <int> <chr>           
# 599 Pedersören kunta

## In Swedish
read_html("https://www.tilastokeskus.fi/meta/luokitukset/kunta/001-2019/index_sv.html") %>% 
  html_table(fill = TRUE) %>% 
  .[2] %>%
  .[[1]] %>% 
  as_tibble(.name_repair = "universal") %>% 
  filter(grepl("Pedersö", X2))
# X1 X2       
# <int> <chr>    
# 599 Pedersöre

## In English
read_html("https://www.tilastokeskus.fi/meta/luokitukset/kunta/001-2019/index_en.html") %>% 
  html_table(fill = TRUE) %>% 
  .[2] %>%
  .[[1]] %>% 
  as_tibble(.name_repair = "universal") %>% 
  filter(grepl("Pedersö", X2))
# X1 X2       
# <int> <chr>    
# 599 Pedersöre

# 2. Kuntaliitto (Association of Finnish Municipalities): regional classes and municipality numbers 2019
fly <- tempfile()
download.file("https://www.kuntaliitto.fi/sites/default/files/media/file/Alueluokat%20ja%20kuntanumerot%202019.xlsx",
              fly)
readxl::read_excel(fly, skip = 12) %>% 
  filter(grepl("Pedersö", `Kunnan nimi`)) %>% 
  select(1:3)
# Kuntanumero `Kunnan nimi`    `Ruotsinkielilinen nimi`
# <chr>       <chr>            <chr>                   
# 599         Pedersören kunta Pedersöre   

# 3. Municipal boundaries of the National Land Survey (MML) from the Paituli spatial data service
library(ows4R)
wfs <- WFSClient$new("http://avaa.tdata.fi/geoserver/paituli/wfs",
                     serviceVersion = "2.0.0",
                     logger = "INFO")

caps <- wfs$getCapabilities()
ft <- caps$findFeatureTypeByName("paituli:mml_hallinto_2018_10k", exact = TRUE)
shape <- ft$getFeatures()
shape %>% 
  filter(grepl("Pedersö", NAMEFIN)) %>% 
  select(NATCODE,NAMEFIN,NAMESWE)
# NATCODE          NAMEFIN   NAMESWE                       the_geom
#      599 Pedersören kunta Pedersöre MULTISURFACE (POLYGON ((287...

@antagomir
Member

I think it is good to use the official names by default (and yes let's remove "kunta_name" field).
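
Before dropping the field, the redundancy could be checked explicitly. A minimal sketch, assuming a two-row toy table as a stand-in for municipality_key_2019 (the real table is not reproduced here; only the column names kunta_name, name_fi and name_sv come from this thread):

```r
# Toy stand-in for municipality_key_2019 (assumption: the real table has
# many more rows and columns; only the column names come from this thread).
key <- data.frame(
  kunta_name = c("Helsinki", "Pedersören kunta"),
  name_fi    = c("Helsinki", "Pedersören kunta"),
  name_sv    = c("Helsingfors", "Pedersöre"),
  stringsAsFactors = FALSE
)

# Drop kunta_name only if it really duplicates name_fi.
if (identical(key$kunta_name, key$name_fi)) {
  key$kunta_name <- NULL
}

names(key)
```

If the two columns ever diverge, the field is left in place, which makes the mismatch visible instead of silently discarding data.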

The data generation script in inst/extras/create_municipality_keys.R seems to make some modifications so let us make sure that the names are kept in their official formats.

If there is a need, we can add wrappers that convert the official names to shorter or other alternative forms.
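
Such a wrapper might look roughly like the sketch below. The function name shorten_municipality_name and the lookup vector short_name_overrides are hypothetical and not part of the package. A lookup is used rather than stripping " kunta", because stripping would yield the genitive "Pedersören" instead of "Pedersöre".

```r
# Hypothetical lookup: official Finnish names whose short form differs.
# Only the Pedersöre entry is taken from this thread.
short_name_overrides <- c("Pedersören kunta" = "Pedersöre")

# Hypothetical wrapper: return the short form where an override exists,
# otherwise return the official name unchanged.
shorten_municipality_name <- function(official_name) {
  short <- unname(short_name_overrides[official_name])
  ifelse(is.na(short), official_name, short)
}

shorten_municipality_name(c("Helsinki", "Pedersören kunta"))
```

Keeping the overrides in a separate named vector leaves the default data table in its official form and makes further exceptions easy to add.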
