Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Very slow SELECT queries #179

Open
imcullan opened this issue Jan 18, 2017 · 2 comments
Open

Very slow SELECT queries #179

imcullan opened this issue Jan 18, 2017 · 2 comments

Comments

@imcullan
Copy link

imcullan commented Jan 18, 2017

Hello,
Below is my sessioninfo():

> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.6 (El Capitan)

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] highcharter_0.4.0.9999 RJDBC_0.2-5            rJava_0.9-8            RMySQL_0.10.9         
 [5] DBI_0.4-1              gganimate_0.1          sp_1.2-3               data.table_1.9.6      
 [9] dplyr_0.5.0            png_0.1-7              jpeg_0.1-8             httr_1.2.1            
[13] plotly_3.6.0           ggplot2_2.2.0          RCurl_1.95-4.8         bitops_1.0-6          
[17] jsonlite_1.0          

loaded via a namespace (and not attached):
 [1] zoo_1.7-13        reshape2_1.4.2    purrr_0.2.2       lattice_0.20-33   colorspace_1.2-6 
 [6] htmltools_0.3.5   viridisLite_0.1.3 yaml_2.1.13       base64enc_0.1-3   chron_2.3-47     
[11] TTR_0.23-1        plyr_1.8.4        quantmod_0.4-5    stringr_1.1.0     munsell_0.4.3    
[16] gtable_0.2.0      htmlwidgets_0.7   psych_1.6.6       labeling_0.3      parallel_3.3.0   
[21] curl_1.1          broom_0.4.1       xts_0.9-7         Rcpp_0.12.8       scales_0.4.1     
[26] gridExtra_2.2.1   mnormt_1.5-4      digest_0.6.10     stringi_1.1.2     rlist_0.4.6.1    
[31] tools_3.3.0       magrittr_1.5      lazyeval_0.2.0    tibble_1.2        tidyr_0.5.1      
[36] lubridate_1.5.6   assertthat_0.1    viridis_0.3.4     R6_2.1.2          igraph_1.0.1     
[41] nlme_3.1-128    

I have approximately 20 million rows in my database and I have the Play table indexed on game_id and play_id.
I am running the following query in R:

system.time(sample_play <- dbGetQuery(nbacon, 
paste0("SELECT p.player_id, p.lastname, 
              p.firstname, p.jersey, p.x_loc, p.y_loc, p.game_clock, p.shot_clock, p.home_description, 
              p.visitor_description, t.teamname, t.teamabbrev
              FROM Play p
              INNER JOIN Team t ON
              p.team_id = t.team_id
              WHERE game_id = '0021500290' AND play_id = 218)))

This returns 7150 rows of 12 variables.
The total time it takes is:

user system elapsed
0.023 0.014 8.489

It seems awfully long to fetch 7150 rows in 8.5 seconds.

That being said, I am running the exact same query on MySQLWorkbench and it takes:

Duration / Fetch Time
0.734 sec / 0.0039 sec

Any ideas on how to speed this up? All help is greatly appreciated!

@imcullan imcullan changed the title Very slow queries Very slow SELECT queries Jan 18, 2017
@maryskalicky
Copy link

I am running into this same issue. Are there any updates on this issue?

@dalekube
Copy link

I am also experiencing this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants