Scrape function error with Minor League Data #25

Open
cidawkins opened this issue Mar 18, 2015 · 6 comments

Comments

@cidawkins

commented Mar 18, 2015

This has been driving me crazy for the last few days and I can't figure out why it keeps failing. Every time I run the scrape function, it fails with this error:

If file names don't print right away, please be patient.
Error in function (type, msg, asError = TRUE) :
Could not resolve host:

I followed your post on nonMLBdata and only tweaked it to import into a MySQL database. I also turned off my laptop and router firewalls for a short time to test it, and it still returned the same error.

This is my main code:

library(dplyr)
library(RMySQL)
library(pitchRx)
drv <- dbDriver("MySQL")
con <- dbConnect(drv, user = "myusername", password = "mypassword", dbname = "nonmlb", host = "localhost")
nonMLB08 <- nonMLBgids[grep("2008", nonMLBgids)]
scrape(start = "2008-01-01", end = "2009-01-01", game.ids = nonMLB08, connect = con)

traceback()
8: fun(structure(list(message = msg, call = sys.call()), class = c(typeName,
"GenericCurlError", "error", "condition")))
7: function (type, msg, asError = TRUE)
{
if (!is.character(type)) {
i = match(type, CURLcodeValues)
typeName = if (is.na(i))
character()
else names(CURLcodeValues)[i]
}
typeName = gsub("^CURLE_", "", typeName)
fun = (if (asError)
stop
else warning)
fun(structure(list(message = msg, call = sys.call()), class = c(typeName,
"GenericCurlError", "error", "condition")))
}(6L, "Could not resolve host: \016", TRUE)
6: .Call("R_curl_easy_perform", curl, .opts, isProtected, .encoding,
PACKAGE = "RCurl")
5: curlPerform(curl = curl, .opts = opts, .encoding = .encoding)
4: getURL(urls, async = async)
3: urlsToDocs(urls, async = async, quiet = quiet)
2: XML2Obs(inning.filez, as.equiv = TRUE, url.map = FALSE, ...)
1: scrape(start = "2008-01-01", end = "2009-01-01", game.ids = nonMLB08,
connect = con)

sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] pitchRx_1.6 ggplot2_1.0.0 dplyr_0.4.1 RMySQL_0.10.2 DBI_0.3.1

loaded via a namespace (and not attached):
[1] assertthat_0.1 bitops_1.0-6 colorspace_1.2-6 digest_0.6.8 grid_3.1.2 gtable_0.1.2
[7] hexbin_1.27.0 lattice_0.20-30 magrittr_1.5 MASS_7.3-39 Matrix_1.1-5 mgcv_1.8-5
[13] munsell_0.4.2 nlme_3.1-120 parallel_3.1.2 plyr_1.8.1 proto_0.3-10 Rcpp_0.11.5
[19] RCurl_1.95-4.5 reshape2_1.4.1 scales_0.2.4 stringr_0.6.2 tools_3.1.2 XML_3.98-1.1
[25] XML2R_0.0.6

I really need help with this.

@cidawkins

Author

commented Mar 18, 2015

Also, sometimes it lists a single character after "Could not resolve host:".

Off the top of my head, I've seen a "5", an "F", and nothing at all.
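One way to narrow down which games trigger the error would be to run the request one game id at a time and record the failures. The following is only a sketch: `scrape_each` is a hypothetical helper, not part of pitchRx, and `f` stands for whatever per-id call is being tested.

```r
# Hypothetical helper (not part of pitchRx): call `f` once per id and
# record whether it succeeded, so failing ids can be identified.
scrape_each <- function(ids, f) {
  results <- lapply(ids, function(id) {
    tryCatch(
      list(id = id, ok = TRUE, value = f(id), message = NA_character_),
      # On error, keep the message instead of aborting the whole loop.
      error = function(e) {
        list(id = id, ok = FALSE, value = NULL, message = conditionMessage(e))
      }
    )
  })
  names(results) <- ids
  results
}

# Names of the ids whose call raised an error.
failed_ids <- function(results) {
  names(Filter(function(r) !r$ok, results))
}
```

For example, `res <- scrape_each(nonMLB08, function(id) scrape(game.ids = id))` followed by `failed_ids(res)` would list the ids that raised an error, assuming `nonMLB08` is defined as in the code above. This is slower than a single call, so it is probably best tried on a small subset first.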

@cpsievert

Owner

commented Mar 19, 2015

Specifying a start and end date isn't necessary if you're using the game.ids argument. Try removing those.

@cidawkins

Author

commented Mar 19, 2015

Tried it and got the same error.

@cpsievert

Owner

commented Mar 19, 2015

Ah, I'm pretty sure this is happening because "inning/inning_all.xml" files don't exist for most (if not all) minor league games. The other file types should work though. For example,

x <- head(nonMLBgids)
files <- c("inning/inning_hit.xml", "miniscoreboard.xml", "players.xml")
dat <- scrape(game.ids = x, suffix = files)

I don't have time now, but hopefully in the next few months I'll make some modifications to grab, for example, the "inning_[0-9].xml" files when "inning_all.xml" doesn't exist.
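The fallback described above might look roughly like the sketch below. It only illustrates the selection logic and is not how pitchRx works today. `pick_inning_files` is a hypothetical helper, `game_url` is assumed to be the full URL of one game directory ending in "/", and the cap of 9 innings is an arbitrary default that would miss extra innings. `RCurl::url.exists` is used as the default existence check, but any function that takes a URL and returns TRUE or FALSE can be passed in.

```r
# Hypothetical helper (not part of pitchRx): decide which inning files to
# request for a single game directory.
pick_inning_files <- function(game_url, exists = RCurl::url.exists,
                              max_innings = 9L) {
  all_file <- "inning/inning_all.xml"
  # Prefer the combined file when it is available.
  if (isTRUE(exists(paste0(game_url, all_file)))) {
    return(all_file)
  }
  # Otherwise fall back to per-inning files, keeping only those present.
  candidates <- sprintf("inning/inning_%d.xml", seq_len(max_innings))
  present <- vapply(
    candidates,
    function(f) isTRUE(exists(paste0(game_url, f))),
    logical(1)
  )
  unname(candidates[present])
}
```

Whether `scrape()` can then parse the per-inning files is a separate question that this sketch does not address.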

@cidawkins

Author

commented Apr 21, 2015

I was wondering if you've had a chance to work on this error?

@cpsievert

Owner

commented Jun 1, 2015

This likely won't get fixed (by me) anytime soon.
