###Scrape product reviews from Amazon and generate some basic insights with R.

- How to scrape multiple pages from the Amazon website to gather reviews


To get the information from the website, I will use SelectorGadget. The gadget is easy to use: you click on parts of the page and it reports the selector that matches them. In rvest, html_nodes() accepts either css or xpath, depending on whether you want to use a CSS or an XPath 1.0 selector. For now I will use CSS. For more information, visit: https://selectorgadget.com/
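
As a minimal sketch of the two selector styles, the snippet below parses a small hand-written HTML fragment (an assumption for illustration, not Amazon's real markup) and selects the same node once with css and once with xpath:

```r
library(rvest)

# Hypothetical fragment that mimics one review; not Amazon's actual markup.
page <- read_html('<html><body>
  <div class="review">
    <span class="a-profile-name">Liz</span>
    <span class="review-date">November 10, 2018</span>
  </div>
</body></html>')

# Select the date node with a CSS selector ...
css_date <- page %>% html_nodes(css = ".review-date") %>% html_text()

# ... and with an equivalent XPath 1.0 expression.
xpath_date <- page %>%
  html_nodes(xpath = "//span[@class = 'review-date']") %>%
  html_text()

# Both selectors should return the same text.
identical(css_date, xpath_date)
```

The CSS form is usually shorter for class-based lookups, which is why the rest of this notebook sticks with it.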

I will get 4 variables from the reviews:

- profilename - Profile name of the user who wrote the review
- reviewdata - Review content
- iconalt - How many stars the user gave to the product
- reviewdate - Date the review was posted

###Libraries

In [15]:
install.packages("qdap")

package 'qdap' successfully unpacked and MD5 sums checked

The downloaded binary packages are in
	C:\Users\natab\AppData\Local\Temp\RtmpiuLz20\downloaded_packages


In [17]:
library(pander)
library(rvest)
library(stringr)
library(dplyr)
library(lubridate)
library(ggplot2)
library(ggthemes)
library(tidytext)
library(textdata)
library(tidyr)
library(wordcloud)

###First, I will scrape each variable separately.

Scrape review dates from pages 1-5

In [18]:
# pageNumber selects the review page (1-5)
reviewdate <- lapply(paste0("https://www.amazon.com/little-Prince-Antoine-Saint-Exup%C3%A9ry/product-reviews/0544656490/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews&pageNumber=", 1:5),
                     function(url){
                       url %>% read_html() %>%
                         html_nodes(".review-date") %>%  # one node per review date
                         html_text() %>%
                         str_trim()                      # drop surrounding whitespace
                     })

Scrape usernames from pages 1-5

In [19]:
# pageNumber selects the review page (1-5)
profilename <- lapply(paste0("https://www.amazon.com/little-Prince-Antoine-Saint-Exup%C3%A9ry/product-reviews/0544656490/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews&pageNumber=", 1:5),
                      function(url){
                        url %>% read_html() %>%
                          html_nodes(".a-profile-name") %>%  # reviewer display names
                          html_text() %>%
                          str_trim()
                      })

Scrape review stars from pages 1-5

In [20]:
# pageNumber selects the review page (1-5)
iconalt <- lapply(paste0("https://www.amazon.com/little-Prince-Antoine-Saint-Exup%C3%A9ry/product-reviews/0544656490/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews&pageNumber=", 1:5),
                  function(url){
                    url %>% read_html() %>%
                      html_nodes(".review-rating") %>%  # text such as "5.0 out of 5 stars"
                      html_text() %>%
                      str_trim()
                  })

Scrape reviews from pages 1-5

In [21]:
# pageNumber selects the review page (1-5)
reviewdata <- lapply(paste0("https://www.amazon.com/little-Prince-Antoine-Saint-Exup%C3%A9ry/product-reviews/0544656490/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews&pageNumber=", 1:5),
                     function(url){
                       url %>% read_html() %>%
                         html_nodes(".review-text-content") %>%  # body text of each review
                         html_text() %>%
                         str_trim()
                     })
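
The four cells above download every page four times, once per variable. A possible alternative, sketched here under assumptions, is to parse each page once and pull all four fields from each review container. The helper name extract_reviews, the ".review" container selector, and the demo HTML are all hypothetical; the real page structure may differ.

```r
library(rvest)
library(stringr)

# Hypothetical helper: pull all four fields from one already-parsed page.
# Selecting inside each review container keeps the fields aligned row by row.
extract_reviews <- function(page, container = ".review") {
  reviews <- html_nodes(page, container)
  # html_node() returns exactly one result per container (NA text if missing).
  field <- function(css) {
    reviews %>% html_node(css) %>% html_text() %>% str_trim()
  }
  data.frame(
    reviewdata  = field(".review-text-content"),
    reviewdate  = field(".review-date"),
    iconalt     = field(".review-rating"),
    profilename = field(".a-profile-name"),
    stringsAsFactors = FALSE
  )
}

# Stand-in HTML with two made-up review containers (not Amazon's real markup).
demo_page <- read_html('<html><body>
  <div class="review">
    <span class="a-profile-name">Liz</span>
    <span class="review-rating">5.0 out of 5 stars</span>
    <span class="review-date">November 10, 2018</span>
    <span class="review-text-content"> Great quality! </span>
  </div>
  <div class="review">
    <span class="a-profile-name">Evan</span>
    <span class="review-rating">5.0 out of 5 stars</span>
    <span class="review-date">June 16, 2018</span>
    <span class="review-text-content"> Recommended for gifts. </span>
  </div>
</body></html>')

demo_reviews <- extract_reviews(demo_page)
demo_reviews
```

With real pages, the same helper could be applied to each parsed page inside lapply() and the results stacked with dplyr::bind_rows().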
                

###The data is now in "list" format. First I will unlist it, then combine the vectors into a dataframe.

In [23]:
reviewdata <- unlist(reviewdata)
reviewdate <- unlist(reviewdate)
iconalt <- unlist(iconalt)
profilename <- unlist(profilename)

###Append and create a dataframe

In [24]:
# data.frame() keeps each column as its own vector and stops if the lengths differ
allreviews <- data.frame(reviewdata, reviewdate, iconalt, profilename, stringsAsFactors = FALSE)
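
Combining the columns assumes that all the vectors have the same length, which can fail if a selector matches extra elements on a page. A quick check, sketched here with made-up toy vectors rather than the scraped data, makes a mismatch visible before the dataframe is built:

```r
# Toy vectors standing in for the scraped columns (values are made up).
toy_date  <- c("March 17, 2019", "November 10, 2018")
toy_stars <- c("5.0 out of 5 stars", "5.0 out of 5 stars")
toy_name  <- c("Dmbait", "Liz", "Extra match")  # one entry too many

columns <- list(reviewdate = toy_date, iconalt = toy_stars,
                profilename = toy_name)

# Number of entries in each column.
lengths_by_column <- lengths(columns)
print(lengths_by_column)

# TRUE only when every column has the same number of entries.
same_length <- length(unique(lengths_by_column)) == 1
same_length
```

Here same_length is FALSE because the toy profilename vector has one more entry than the others.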

In [25]:
dim(allreviews)

The data consists of 50 rows and 4 columns (10 reviews per page, 5 pages).

In [26]:
head(allreviews)

reviewdata,reviewdate,iconalt,profilename
We loved this movie and this book. I wanted to order the beautiful pop up for my children to enjoy as I read to them. When I did It came completely destroyed the cover is barely on and when I contacted the seller they told me to send it to some address not do an amazon return so I ended up being stuck with it. We still read it’s just disappointing that it came damaged and we couldn’t get it fixed. So my five tars is for the book it’s self it’s a joy and truely enjoyable.,"March 17, 2019",5.0 out of 5 stars,Dmbait
This pop up book is magical my whole family Adores it! Great quality! Will be buying it as Christmas presents for other family members!,"November 10, 2018",5.0 out of 5 stars,Liz
"I saw the French edition for the first time in the gift shop at the ""Louvre"", went on Amazon immediately, found the English edition, it's better priced than the museum gift shop and I didn't have to add to my luggage! It's a WIN, WIN, WIN.Great execution of the popups, they interesting, smooth, no snags, I'd recommend this for gifts for all ages (including yourself)!","June 16, 2018",5.0 out of 5 stars,Evan
"The pop-up art in this version of ""The Little Prince"" is really well made and great quality. Just be careful with handling the book because pop-ups have a tendency to break or rip if mishandled. It doesn't distract from the elements of the story at all, as a matter fact, I think it adds more of an interactive perspective to the reader! You'll love this unabridged version of ""The Little Prince"", it's really worth it!","September 28, 2019",5.0 out of 5 stars,Anonymous Amazon User
"A gorgeous edition of this beautiful classic. I took down one star though, because I could not find the access code for downloading the audio read by Viggo Mortensen. The jacket says that the access code is inside the book, but it was nowhere to be found.","October 24, 2019",4.0 out of 5 stars,Jeanne Del
"This book is absolutely amazing. The quality and delicacy of the pop up and illustrations is top notch, although one page upon opening the book gently for the first time had the pop up base ripped off :( it is an unabridged version and a great book for any age to read, however If it’s for a younger age child like a toddler, best to read for them and not let them grab it as it’s very delicate and will most likely destroy the book. This book makes me happy.","December 4, 2019",4.0 out of 5 stars,Watson


Create a numeric stars variable from iconalt

In [28]:
allreviews$stars <- substring(allreviews$iconalt, 1, 3)
allreviews$stars <- as.numeric(allreviews$stars)
head(allreviews$stars)
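
The substring() call relies on the rating always occupying the first three characters. As a hedged alternative, the sketch below extracts the first number with a regular expression and also parses the date strings with lubridate. The demo vectors are copied from the output shown above, and the date parsing assumes English month names.

```r
library(stringr)
library(lubridate)

# Demo values in the same format as the iconalt and reviewdate columns.
iconalt_demo    <- c("5.0 out of 5 stars", "4.0 out of 5 stars")
reviewdate_demo <- c("March 17, 2019", "October 24, 2019")

# Take the first number in the string instead of a fixed character range.
stars_demo <- as.numeric(str_extract(iconalt_demo, "[0-9]+(\\.[0-9]+)?"))

# mdy() parses month-day-year text into Date values.
date_demo <- mdy(reviewdate_demo)

stars_demo
date_demo
```

Having reviewdate as a Date rather than text makes it possible to sort or group reviews by time later on.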

In [30]:
head(allreviews)

reviewdata,reviewdate,iconalt,profilename,stars
We loved this movie and this book. I wanted to order the beautiful pop up for my children to enjoy as I read to them. When I did It came completely destroyed the cover is barely on and when I contacted the seller they told me to send it to some address not do an amazon return so I ended up being stuck with it. We still read it’s just disappointing that it came damaged and we couldn’t get it fixed. So my five tars is for the book it’s self it’s a joy and truely enjoyable.,"March 17, 2019",5.0 out of 5 stars,Dmbait,5
This pop up book is magical my whole family Adores it! Great quality! Will be buying it as Christmas presents for other family members!,"November 10, 2018",5.0 out of 5 stars,Liz,5
"I saw the French edition for the first time in the gift shop at the ""Louvre"", went on Amazon immediately, found the English edition, it's better priced than the museum gift shop and I didn't have to add to my luggage! It's a WIN, WIN, WIN.Great execution of the popups, they interesting, smooth, no snags, I'd recommend this for gifts for all ages (including yourself)!","June 16, 2018",5.0 out of 5 stars,Evan,5
"The pop-up art in this version of ""The Little Prince"" is really well made and great quality. Just be careful with handling the book because pop-ups have a tendency to break or rip if mishandled. It doesn't distract from the elements of the story at all, as a matter fact, I think it adds more of an interactive perspective to the reader! You'll love this unabridged version of ""The Little Prince"", it's really worth it!","September 28, 2019",5.0 out of 5 stars,Anonymous Amazon User,5
"A gorgeous edition of this beautiful classic. I took down one star though, because I could not find the access code for downloading the audio read by Viggo Mortensen. The jacket says that the access code is inside the book, but it was nowhere to be found.","October 24, 2019",4.0 out of 5 stars,Jeanne Del,4
"This book is absolutely amazing. The quality and delicacy of the pop up and illustrations is top notch, although one page upon opening the book gently for the first time had the pop up base ripped off :( it is an unabridged version and a great book for any age to read, however If it’s for a younger age child like a toddler, best to read for them and not let them grab it as it’s very delicate and will most likely destroy the book. This book makes me happy.","December 4, 2019",4.0 out of 5 stars,Watson,4


The dataframe is now ready for further analysis.