submit_form always failed with the following error when I tried to log in to a specific web page:
Submitting with 'login'
Error in function (type, msg, asError = TRUE) : <url> malformed
A couple of days ago someone posted the same issue on Stack Overflow (SO), and the answer given by MrFlick solved it for me:
Before submitting the form, you have to explicitly set the URL of the login form.
It seems that rvest has trouble with form URLs that contain an absolute path but no scheme or server name (such as /index.pl).
Reproducible example (the other one can be found on SO):
library(rvest)
library(magrittr)

my_url = "https://www.openair.com/index.pl"
openair <- html_session(my_url)

login <- html_form(openair) %>%
  extract2(1) %>%
  set_values(
    account_nickname = "does_not_matter_here",
    user_nickname = "does_not_matter_here",
    password = "does_not_matter_here"
  )

openair %<>% submit_form(login)
The code above produces the described error. The beginning of login shows that the form URL has no scheme or server name:
<form> 'login_page' (POST /index.pl)
<input hidden> '_form_has_changed': 0
...
However, adding login$url <- 'https://www.openair.com/index.pl' before submitting the form solves it.
In this case the start of login looks like this:
<form> 'login_page' (POST https://www.openair.com/index.pl)
<input hidden> '_form_has_changed': 0
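Instead of hard-coding the full address, the form's path could probably also be resolved against the page URL. The following is only a sketch of that idea, not something taken from the SO answer: it uses xml2::url_absolute(), and the variable names session_url, form_action and resolved are made up for illustration. The commented-out last line assumes that the session object exposes its current address as openair$url.

```r
library(xml2)

# Hypothetical values mirroring the example above: the page the session
# was opened on, and the action path shown in the printed form.
session_url <- "https://www.openair.com/index.pl"
form_action <- "/index.pl"

# url_absolute() resolves a relative URL against a base URL,
# filling in the missing scheme and server name.
resolved <- url_absolute(form_action, session_url)
print(resolved)

# Applied to the example above, this would look roughly like
# (assumption: the session stores its URL in openair$url):
# login$url <- url_absolute(login$url, openair$url)
```

This avoids repeating the server name by hand, which may help if the same workaround is needed for several forms.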