local state
RMHogervorst committed Mar 3, 2017
1 parent 0a17d33 commit 8bb6d77
Showing 15 changed files with 271 additions and 29 deletions.
3 changes: 3 additions & 0 deletions .Rbuildignore
@@ -1,2 +1,5 @@
^.*\.Rproj$
^\.Rproj\.user$
^README\.Rmd$
^README-.*\.png$
^\.travis\.yml$
26 changes: 26 additions & 0 deletions .travis.yml
@@ -0,0 +1,26 @@
# Sample .travis.yml for R projects

language: r
matrix:
  include:
    - os: linux
      dist: precise

sudo: false
cache: packages
r:
- oldrel
- release
- devel



warnings_are_errors: true

r_github_packages:
- jimhester/covr



after_success:
- Rscript -e 'library(covr);codecov()'
14 changes: 8 additions & 6 deletions DESCRIPTION
@@ -1,14 +1,16 @@
Package: imdb
Title: Download IMDB Series Information Into Dataframe
Version: 0.0.0.9000
Title: Download IMDB Series Information Into a Dataframe
Version: 0.1.0
Authors@R: person("Roel M.", "Hogervorst", email = "hogervorst.rm@gmail.com", role = c("aut", "cre"))
Description: This package downloads imdb information using http://
    www.omdbapi.com and puts it into a dataframe. In the background a json is
    downloaded and converted into a dataframe.
Description: This package downloads imdb information using
    http://www.omdbapi.com and puts it into a dataframe.
    In the background a json is
    downloaded and converted into a dataframe using 'jsonlite'.
Depends:
    R (>= 3.2.2)
Imports:
    jsonlite(>= 0.9.17)
License: MIT, CC BY 4.0
License: MIT
LazyData: true
RoxygenNote: 5.0.1
Suggests: testthat
2 changes: 2 additions & 0 deletions NAMESPACE
@@ -1,2 +1,4 @@
# Generated by roxygen2: do not edit by hand

export(enrichIMDB)
export(imdbSeries)
18 changes: 9 additions & 9 deletions R/enrichIMDB.R
@@ -1,31 +1,31 @@
#' IMDB information enricher.
#'
#' This function downloads episode information into a dataframe.
#'
#' This function downloads episode information into a dataframe.
#' Use it to create a dataframe with extra information about every episode.
#' It downloads runtime, director, writer, actors, plot and imdb votes.
#' @param df Name of dataframe you want to add info to. NEEDS an imdbID variable.
#' @examples
#' \dontrun{
#' @examples
#' \dontrun{
#' enrichIMDB("IMDB")
#' }
#' @keywords imdb, enrich
#' @export
enrichIMDB<- function(df){
library(jsonlite)
# read all unique id's
#read all unique id's
IDs<- unique(df$imdbID)
#issue with rbind that breaks the column names, forces me to create a useless row
imdbID <- "t103"
imdbID <- "t103"
runtime<- "asdf"
director<- "asdf"
writer<- "asdf"
actors<- "asdf"
plot<-"adsf"
votes<-"456"
dataframe<-data.frame(imdbID, runtime,director, writer,actors, plot,votes, stringsAsFactors = F)
# loop through ids and add information into a row and adding it to dataframe.
# loop through ids and add information into a row and adding it to dataframe.
for(i in IDs) {
link <- paste("http://www.omdbapi.com/?i=", i ,"&plot=full&r=json", sep = "")
hold<-fromJSON(link)
hold<-jsonlite::fromJSON(link)
newrow<- c( i, hold$Runtime, hold$Director, hold$Writer, hold$Actors, hold$Plot, hold$imdbVotes)
dataframe<-rbind(dataframe, newrow)
}
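
The placeholder row above exists only to give `rbind()` a set of column names to start from. One possible alternative, sketched here under assumptions, is to build one single-row data frame per ID and bind them all at once. `fake_lookup()` is a made-up stand-in for the `jsonlite::fromJSON(link)` call so the sketch runs without network access; it is not part of the package.

```r
# Hypothetical stand-in for the parsed JSON of one episode; the real code
# would call jsonlite::fromJSON() on the OMDb URL instead.
fake_lookup <- function(id) {
  list(Runtime = "55 min", Director = "A Director", Writer = "A Writer",
       Actors = "Actor One, Actor Two", Plot = paste("Plot of", id),
       imdbVotes = "1,234")
}

ids <- c("tt0000001", "tt0000002")

# Build one single-row data frame per id, so column names are set explicitly
# and no dummy first row is needed.
rows <- lapply(ids, function(id) {
  hold <- fake_lookup(id)
  data.frame(imdbID = id, runtime = hold$Runtime, director = hold$Director,
             writer = hold$Writer, actors = hold$Actors, plot = hold$Plot,
             votes = hold$imdbVotes, stringsAsFactors = FALSE)
})

# Bind all rows in a single call.
enriched <- do.call(rbind, rows)
```

Binding once at the end also avoids growing the data frame inside the loop, which may matter for series with many episodes.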
17 changes: 8 additions & 9 deletions R/imdbSeries.R
@@ -1,22 +1,21 @@
#' IMDB information downloader for series.
#'
#'
#' This function downloads series information into a dataframe.
#' @param seasons Which seasons do you want to download? Defaults to 1.
#' @param seriesname The name of the series, for example "House MD".
#' @examples
#' \dontrun{
#' @examples
#' \dontrun{
#' imdbSeries("House MD", 1:2)
#' }
#' @keywords imdb, series

#' @export
imdbSeries<-function(seriesname, seasons = 1) {
library(jsonlite)
df<-data.frame(Title = character(0), Released = character(0),
Episode = character(0), imdbRating = character(0),
df<-data.frame(Title = character(0), Released = character(0),
Episode = character(0), imdbRating = character(0),
imdbID = character(0), Season =numeric(0)) #creates empty dataframe
for( i in seasons) {
link<-gsub(pattern = " ", replacement = "%20", x=(paste("http://www.omdbapi.com/?t=",seriesname,"&Season=",i, sep = "")))
hold<-fromJSON(link) # link with spaces replaced
hold<-jsonlite::fromJSON(link) # link with spaces replaced
dftemp<-hold$Episodes #using only the Episodes part
dftemp$Season <-i # adding variable season
df<- rbind(df, dftemp)# combining
@@ -26,5 +25,5 @@ imdbSeries<-function(seriesname, seasons = 1) {
df$Released<-as.Date(df$Released)
df$Episode<- as.numeric(df$Episode)
df$imdbRating<- as.numeric(df$imdbRating)
return(df)
df
}
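
The function builds its request URL by pasting the series name into the query string and replacing spaces with `%20`. A minimal sketch of the same step using base R's `utils::URLencode()` follows; `build_series_url()` is a hypothetical helper name used for illustration only and is not part of this commit.

```r
# Hypothetical helper, not part of the package: build the OMDb query URL
# for one season and percent-encode characters that are unsafe in a URL.
build_series_url <- function(seriesname, season) {
  raw <- paste0("http://www.omdbapi.com/?t=", seriesname, "&Season=", season)
  # With the default reserved = FALSE, characters such as ?, = and & are
  # left alone while spaces are encoded as %20.
  utils::URLencode(raw)
}

url <- build_series_url("House MD", 1)
```

Unlike a `gsub()` on spaces alone, this also encodes other unsafe characters, though reserved characters inside a series name (for example `&`) would still need separate handling.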
Binary file added README-combining everything-1.png
84 changes: 84 additions & 0 deletions README.Rmd
@@ -0,0 +1,84 @@
---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "README-"
)
```

# Short description of the package

imdb helps you download series information from IMDb. It has two functions:
one for basic information and a second one that also downloads synopsis, actors etc.


# Installation instructions
For now, install the package with `devtools::install_github("rmhogervorst/imdb")`.


In the near future:
**_Installation via CRAN with: install.packages(NAMEOFPACKAGE)_**

# Example usage
imdb has 2 functions:

- imdbSeries() and
- enrichIMDB()

With the function `imdbSeries(seriesname = "name of series", seasons = number(s))`
you can call up general information about a series. Note that the API does
not care about case: "Game of Thrones", "game of thrones" and
"gAmE oF tHrONes " are all fine.

```{r example for game of thrones season 1}
library(imdb)
imdbSeries("game of thrones ")
```

The command returns a data frame with title, release date, episode number,
IMDb rating, IMDb ID and season.

Would you like to know more about your series? Use the `enrichIMDB` command:

```{r enrichIMDB example game of thrones}
season2GOT <-imdbSeries("game of thrones", seasons = 2)
season2GOT_enriched <- enrichIMDB(season2GOT)
```

The `enrichIMDB` command returns a separate dataframe with
imdbID, runtime, director, writer, actors, plot (complete synopsis), and votes
per episode. It uses the imdbID of each episode to look up more information.
So if you'd like to know in which episodes Jon Snow appears in the synopsis,
or in which episodes of season 2 Peter Dinklage plays, you can now search for it.

```{r examples}
grep("Jon", season2GOT_enriched$plot)
grep("Peter Dinklage", season2GOT_enriched$actors)
```

Combining the information from the two dataframes can also be very useful.

```{r combining everything}
library(ggplot2)
suppressPackageStartupMessages(library(dplyr))
GOTall<-imdbSeries("game of Thrones", 1:6)
GOT <-left_join(GOTall, enrichIMDB(GOTall), by = "imdbID")
ggplot(GOT, aes(Episode, imdbRating)) +
geom_smooth(aes(color = as.factor(Season)),se = FALSE , alpha = 1/10)+
geom_point(aes(color = as.factor(Season), size = votes))+
ggtitle("Rating per episode of GoT, \ncolored by season\nwith smoothlines")
```


# Contact
I'm always looking for people to help me improve my work.
Contact me directly, use an [issue](https://github.com/RMHogervorst/imdb/issues), fork me or submit a pull request.

[![star this repo](http://githubbadges.com/star.svg?user=RMHogervorst&repo=imdb&style=flat)](https://github.com/RMHogervorst/imdb)
[![fork this repo](http://githubbadges.com/fork.svg?user=RMHogervorst&repo=imdb&style=flat)](https://github.com/RMHogervorst/imdb/fork)
96 changes: 94 additions & 2 deletions README.md
@@ -1,2 +1,94 @@
# imdb
IMDB r package based on omdbapi.com

<!-- README.md is generated from README.Rmd. Please edit that file -->
Short description of the package
================================

imdb helps you download series information from IMDb. It has two functions: one for basic information and a second one that also downloads synopsis, actors etc.

Installation instructions
=========================

For now, install the package with `devtools::install_github("rmhogervorst/imdb")`.

In the near future: ***Installation via CRAN with: install.packages(NAMEOFPACKAGE)***

Example usage
=============

imdb has 2 functions:

- imdbSeries() and
- enrichIMDB()

With the function `imdbSeries(seriesname = "name of series", seasons = number(s))` you can call up general information about a series. Note that the API does not care about case: "Game of Thrones", "game of thrones" and "gAmE oF tHrONes " are all fine.

``` r
library(imdb)
imdbSeries("game of thrones ")
#> Title Released Episode imdbRating
#> 1 Winter Is Coming 2011-04-17 1 8.9
#> 2 The Kingsroad 2011-04-24 2 8.7
#> 3 Lord Snow 2011-05-01 3 8.6
#> 4 Cripples, Bastards, and Broken Things 2011-05-08 4 8.7
#> 5 The Wolf and the Lion 2011-05-15 5 9.0
#> 6 A Golden Crown 2011-05-22 6 9.1
#> 7 You Win or You Die 2011-05-29 7 9.2
#> 8 The Pointy End 2011-06-05 8 8.9
#> 9 Baelor 2011-06-12 9 9.5
#> 10 Fire and Blood 2011-06-19 10 9.4
#> imdbID Season
#> 1 tt1480055 1
#> 2 tt1668746 1
#> 3 tt1829962 1
#> 4 tt1829963 1
#> 5 tt1829964 1
#> 6 tt1837862 1
#> 7 tt1837863 1
#> 8 tt1837864 1
#> 9 tt1851398 1
#> 10 tt1851397 1
```

The command returns a data frame with title, release date, episode number, IMDb rating, IMDb ID and season.

Would you like to know more about your series? Use the `enrichIMDB` command:

``` r
season2GOT <-imdbSeries("game of thrones", seasons = 2)
season2GOT_enriched <- enrichIMDB(season2GOT)
```

The `enrichIMDB` command returns a separate dataframe with imdbID, runtime, director, writer, actors, plot (complete synopsis), and votes per episode. It uses the imdbID of each episode to look up more information. So if you'd like to know in which episodes Jon Snow appears in the synopsis, or in which episodes of season 2 Peter Dinklage plays, you can now search for it.

``` r
grep("Jon", season2GOT_enriched$plot)
#> [1] 3 6 7 8 10
grep("Peter Dinklage", season2GOT_enriched$actors)
#> [1] 1 2 3 4 5 6 7 8 9 10
```
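
Note that `grep()` returns the positions of the matching episodes, not a count. If a count is what you are after, something along these lines may help. This is a minimal sketch on a made-up three-row data frame (the `enriched` object and its contents are illustrative assumptions, not real API output), using only base R.

```r
# Toy stand-in for the output of enrichIMDB(); the values are invented.
enriched <- data.frame(
  plot = c("Jon rides north.", "Tyrion plots.", "Jon and Sam talk; Jon leaves."),
  actors = c("Peter Dinklage, Kit Harington", "Peter Dinklage", "Kit Harington"),
  stringsAsFactors = FALSE
)

# Number of episodes whose synopsis mentions "Jon" at least once.
episodes_with_jon <- sum(grepl("Jon", enriched$plot, fixed = TRUE))

# Total number of mentions across all synopses; gregexpr() returns -1 for
# an element without a match, so only positive positions are counted.
matches <- gregexpr("Jon", enriched$plot, fixed = TRUE)
total_mentions <- sum(vapply(matches, function(m) sum(m > 0), integer(1)))

# Number of episodes in which Peter Dinklage is listed as an actor.
episodes_with_dinklage <- sum(grepl("Peter Dinklage", enriched$actors, fixed = TRUE))
```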

Combining the information from the two dataframes can also be very useful.

``` r
library(ggplot2)
suppressPackageStartupMessages(library(dplyr))
GOTall<-imdbSeries("game of Thrones", 1:6)
#> Warning in imdbSeries("game of Thrones", 1:6): NAs introduced by coercion
GOT <-left_join(GOTall, enrichIMDB(GOTall), by = "imdbID")
#> Warning in enrichIMDB(GOTall): NAs introduced by coercion
ggplot(GOT, aes(Episode, imdbRating)) +
geom_smooth(aes(color = as.factor(Season)),se = FALSE , alpha = 1/10)+
geom_point(aes(color = as.factor(Season), size = votes))+
ggtitle("Rating per episode of GoT, \ncolored by season\nwith smoothlines")
#> Warning: Removed 1 rows containing non-finite values (stat_smooth).
#> Warning: Removed 1 rows containing missing values (geom_point).
```

![](README-combining%20everything-1.png)

Contact
=======

I'm always looking for people to help me improve my work. Contact me directly, use an [issue](https://github.com/RMHogervorst/imdb/issues), fork me or submit a pull request.

[![star this repo](http://githubbadges.com/star.svg?user=RMHogervorst&repo=imdb&style=flat)](https://github.com/RMHogervorst/imdb) [![fork this repo](http://githubbadges.com/fork.svg?user=RMHogervorst&repo=imdb&style=flat)](https://github.com/RMHogervorst/imdb/fork)
5 changes: 5 additions & 0 deletions imdb.Rproj
@@ -5,8 +5,13 @@ SaveWorkspace: No
AlwaysSaveHistory: Default

EnableCodeIndexing: Yes
UseSpacesForTab: Yes
NumSpacesForTab: 8
Encoding: UTF-8

RnwWeave: knitr
LaTeX: pdfLaTeX

AutoAppendNewline: Yes
StripTrailingWhitespace: Yes

4 changes: 2 additions & 2 deletions man/enrichIMDB.Rd

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion man/imdbSeries.Rd

Some generated files are not rendered by default.

4 changes: 4 additions & 0 deletions tests/testthat.R
@@ -0,0 +1,4 @@
library(testthat)
library(imdb)

test_check("imdb")
1 change: 1 addition & 0 deletions tests/testthat/test_enrichIMDB.R
@@ -0,0 +1 @@
#this tests enrichIMDB
24 changes: 24 additions & 0 deletions tests/testthat/test_imdbSeries.R
@@ -0,0 +1,24 @@
context("imdbSeries characters")


test_that("several caps and types return identical frames", {
a<- imdbSeries("Game OF THRONES")
b<- imdbSeries("Game-of-thrones")
expect_equal(a,b)
})

test_that("useful error messages are created", {
skip_on_travis()
expect_error(imdbSeries("Game OF THRONES", seasons = 1-5),
regexp = "season numbers")
})

sample<- imdbSeries("Game OF THRONES", seasons = 3)
sample_2 <- enrichIMDB(sample)
test_that("enrichment works", {
expect_match(sample$Title[5], "Kissed by Fire")
expect_match(sample_2$plot[5], "Jon breaks his vows.")
})


rm(sample, sample_2)
