Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add in english names to jpnprefs dataset #21

Merged
merged 2 commits into from Sep 20, 2018
Merged

add in english names to jpnprefs dataset #21

merged 2 commits into from Sep 20, 2018

Conversation

@Ryo-N7
Copy link
Contributor

@Ryo-N7 Ryo-N7 commented Sep 20, 2018

When working with the jpndistrict package I often find myself needing the English names of the prefectures, capitals, regions, etc. For the convenience of other non-Japanese users of this package I thought I would add in the English names for the prefectures, prefecture capital, region, and major island taken from Wikipedia to the "jpnprefs" dataset. I have also updated the tests as well.

Please review the changes at your convenience, thanks!

@uribo uribo self-requested a review Sep 20, 2018
@@ -14,6 +14,10 @@ library(tidyverse)
# dplyr # 0.7.6
# tidyr # 0.8.1
# purrr # 0.2.5
# (stringr) # 1.3.1

library(polite) # 0.0.0.9004

This comment has been minimized.

@uribo

uribo Sep 20, 2018
Owner

Why should we introduce the polite package?
This package is certainly useful, but has not yet been registered with CRAN.

This comment has been minimized.

@Ryo-N7

Ryo-N7 Sep 20, 2018
Author Contributor

yeah, I suppose it's not entirely necessary to use this package for now. We're only scraping from Wikipedia anyways. I use it as part of my workflow but I understand from a package development/maintenance point of view that it's not necessary.

We can just replace it with the regular rvest code instead:

url <- "https://en.wikipedia.org/wiki/Prefectures_of_Japan"
jpn_pref_raw <- read_html(url) %>% 
    html_nodes("table.wikitable:nth-child(49)") %>% 
    html_table() %>% 
    purrr::flatten_df()

url2 <- "https://en.wikipedia.org/wiki/List_of_Japanese_prefectures_by_population"
jpn_pref2_raw <- read_html(url) %>% 
  html_nodes("table.wikitable:nth-child(7)") %>% 
  html_table() %>% 
  purrr::flatten_df()
purrr::flatten_df()

jpn_pref_df <- jpn_pref_raw %>%
janitor::clean_names() %>%

This comment has been minimized.

@uribo

uribo Sep 20, 2018
Owner

I do not feel motivated to use janitor for this process.
Is it possible to change to a method using dplyr::select() which explicitly selects and renames a variable?

This comment has been minimized.

@Ryo-N7

Ryo-N7 Sep 20, 2018
Author Contributor

Yeah, sure! I'll do it similar to how you set the names for the Japanese table using set_colnames():

jpn_pref_df <- jpn_pref_raw %>% 
  select(2, 4, 5) %>% 
  set_colnames(c("kanji", "region_en", "major_island_en")) %>% 
  mutate(region_en = region_en %>% iconv(from = "UTF-8", to = "ASCII//TRANSLIT")) 
purrr::flatten_df()

jpn_pref2_df <- jpn_pref2_raw %>%
janitor::clean_names() %>%

This comment has been minimized.

@uribo

uribo Sep 20, 2018
Owner

Same as above.

This comment has been minimized.

@Ryo-N7

Ryo-N7 Sep 20, 2018
Author Contributor

jpn_pref2_df <- jpn_pref2_raw %>% 
  select(3, 2, 4) %>% 
  set_colnames(c("kanji", "prefecture_en", "capital_en")) %>% 
  mutate(prefecture_en = prefecture_en %>% iconv(from = "UTF-8", to = "ASCII//TRANSLIT"),
         capital_en = capital_en %>% iconv(from = "UTF-8", to = "ASCII//TRANSLIT"))
@uribo uribo merged commit 72f0ab1 into uribo:master Sep 20, 2018
1 of 2 checks passed
1 of 2 checks passed
continuous-integration/travis-ci/pr The Travis CI build could not complete due to an error
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
@uribo
Copy link
Owner

@uribo uribo commented Sep 20, 2018

Thanks your contribution :) 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants
You can’t perform that action at this time.