The code in this repo is no longer maintained. All webpage code inference functionality has been moved to https://github.com/alastairrushworth/htmldf
R or Python? Simple Classification of Webpages by Code Content
Installation
devtools::install_github("alastairrushworth/rorpy", dependencies = TRUE)
dplyr: a webpage with R code
rorpy("http://dplyr.tidyverse.org") # about 99% sure it's R
# A tibble: 1 x 3
    other      py     r
    <dbl>   <dbl> <dbl>
1 0.00450 0.00600 0.990
Keras: a webpage with Python code
rorpy("https://keras.io") # also about 99% sure it's python
# A tibble: 1 x 3
    other    py       r
    <dbl> <dbl>   <dbl>
1 0.00950 0.988 0.00250
Google: a webpage with no code tags
rorpy("https://google.com") # alas, there is no code to be had
# A tibble: 1 x 3
  other    py     r
  <dbl> <dbl> <dbl>
1    0.    0.    0.
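Turning the scores into a single label

The three outputs above each hold one row of scores in the columns other, py and r. A minimal sketch of how that row might be reduced to a single label is given below. The helper name top_label is hypothetical and is not part of rorpy, and the example rows are typed in by hand from the outputs above so that the snippet runs without the package installed. Treating an all-zero row as "no code found" is an assumption based on the Google example.

```r
# Hypothetical helper, not part of rorpy: pick the highest-scoring class
# from a one-row result with columns `other`, `py` and `r`.
top_label <- function(probs) {
  # Flatten the first row into a named numeric vector.
  scores <- unlist(probs[1, c("other", "py", "r")])
  # Assumption: an all-zero row means the page had no code tags to classify.
  if (all(scores == 0)) {
    return(NA_character_)
  }
  names(scores)[which.max(scores)]
}

# Values copied by hand from the example outputs above.
dplyr_probs  <- data.frame(other = 0.00450, py = 0.00600, r = 0.990)
keras_probs  <- data.frame(other = 0.00950, py = 0.988,   r = 0.00250)
google_probs <- data.frame(other = 0,       py = 0,       r = 0)

top_label(dplyr_probs)
top_label(keras_probs)
top_label(google_probs)
```

Returning NA for the all-zero case keeps "no code on the page" distinct from a confident "other" classification. Whether that distinction matters will depend on how the labels are used downstream.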