
Neloy-Barman/Goodreads-Book-Data-Analysis

Goodreads Book Data Analysis

Project Development Journal

Problem Statement

Data Collection

Analysis Requirements Blueprint

  • Find the top 5 authors by average rating, based on the book data.
  • Find the top 5 authors by number of books written.
  • Find the top 5 highest-rated books.
  • Find the top 5 books with the most ratings.
  • Find the top 5 books with the most reviews.
  • Plot the relationship between the total number of reviews and both the average rating and the total number of ratings.
  • Do the total numbers of 5-star, 4-star and 2-star ratings affect whether a book gets a decent average rating? Is there any visible relationship in a plot?
  • For which genre counts are the average ratings and number of reviews most densely populated?
  • Find whether any relationship exists between a description's word count and the average rating or the number of reviews.

Dashboard

You can find the full analysis in this Tableau dashboard.

Analysis and Observations

i. Author and Book Data Analysis

  • Author Data Findings
    • Top 5 authors by average rating, based on book ratings:
      Author | Avg. Rating
      Mustafa Kemal Atatürk | 4.81
      Quino | 4.77
      MsKingBean89 | 4.753
      Hayao Miyazaki | 4.75
      Chanel Miller | 4.71
    • Top 5 authors by number of books, with their average ratings:
      Author | No. of Books | Avg. Rating
      Stephen King | 82 | 4.022
      Agatha Christie | 77 | 3.902
      Sherrilyn Kenyon | 73 | 4.236
      James Patterson | 69 | 3.956
      Misba | 57 | 4.595
  • Book Data Findings
    • Top 5 highest-rated books:
      Title | Avg. Rating
      A Song to Drown Rivers | 4.92
      Nutuk | 4.81
      All the Young Dudes - Volume Two: Years 5 - 7 | 4.81
      The Complete Calvin and Hobbes | 4.80
      Mark of the Lion Trilogy | 4.78
    • Top 5 books with the most ratings:
      Title | No. of Ratings
      Harry Potter and the Sorcerer’s Stone | 9,699,750
      Harry Potter and the Philosopher’s Stone | 9,679,526
      Hungerspelen | 8,325,671
      The Hunger Games | 8,324,081
      To Kill a Mockingbird | 5,917,556
    • Top 5 books with the most reviews:
      Title | No. of Reviews
      Los siete maridos de Evelyn Hugo | 248,179
      The Seven Husbands of Evelyn Hugo | 247,810
      It Ends with Us | 234,402
      Verity | 216,065
      Hungerspelen | 210,444
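
The rankings above were produced in Tableau. For readers who want to reproduce the same kind of top-5 tables in code, here is a minimal sketch using only the Python standard library. The field names (`author`, `title`, `avg_rating`, `num_ratings`) and the sample rows are hypothetical placeholders, not the columns or values of the actual dataset.

```python
from collections import defaultdict
from statistics import mean

# Toy records for illustration only; these are NOT rows from the real dataset,
# and the field names are assumptions.
books = [
    {"author": "Author A", "title": "Book 1", "avg_rating": 4.5, "num_ratings": 1200},
    {"author": "Author A", "title": "Book 2", "avg_rating": 4.1, "num_ratings": 300},
    {"author": "Author B", "title": "Book 3", "avg_rating": 3.9, "num_ratings": 9000},
    {"author": "Author C", "title": "Book 4", "avg_rating": 4.8, "num_ratings": 50},
]


def author_summary(rows):
    """Group books by author and return {author: (book_count, mean_avg_rating)}."""
    ratings = defaultdict(list)
    for row in rows:
        ratings[row["author"]].append(row["avg_rating"])
    return {a: (len(r), round(mean(r), 3)) for a, r in ratings.items()}


def top_n(summary, key_index, n=5):
    """Return the n authors with the largest value in the chosen summary field."""
    ranked = sorted(summary.items(), key=lambda kv: kv[1][key_index], reverse=True)
    return ranked[:n]


summary = author_summary(books)
top_by_rating = top_n(summary, key_index=1)  # highest mean rating first
top_by_count = top_n(summary, key_index=0)   # most books first

# Books ranked by number of ratings, largest first.
most_rated_books = sorted(books, key=lambda b: b["num_ratings"], reverse=True)[:5]

for author, (count, avg) in top_by_rating:
    print(f"{author}: {count} book(s), mean rating {avg}")
```

Averaging per-book averages treats every book equally regardless of how many ratings it received, which is one reason authors with a single highly rated book can top such a list.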

ii. Different Kinds of Relationships Findings

  • Reviews vs Avg. Ratings and Total No. of Ratings
    I plotted the total number of reviews for each book against its average rating and against its total number of ratings. The goal was to see whether any of these variables depend on one another, so that one rises or falls with another. For reviews vs. average rating, the highest-rated book, "A Song to Drown Rivers" (4.92), has only 46 reviews, whereas the lowest-rated one, "Revealing Eden" (2.00), has 375 reviews. The book with the most reviews, "Los siete maridos de Evelyn Hugo", has an average rating of 4.43. For reviews vs. number of ratings, a relationship might be expected, but the points are scattered. The book with the most ratings, "Harry Potter and the Sorcerer's Stone" (9,699,750 ratings), has 156,453 reviews, while the book with the fewest has 9 ratings and 1 review. So the plots show no clear dependency between any of these variables.
  • Description Word Count vs Avg. Ratings and No. of Reviews
    Descriptions vary in length, so I plotted the word count of each description to see whether it relates to the average rating or the total number of reviews. Against average rating, a few points may suggest a relation, but the overall picture shows none. Against total reviews, the book with a 1,201-word description has only 358 reviews, whereas one with 199 words has 248,179 reviews. The lowest word count in the data is just 1. So there is no visible relationship between description word count and either the average rating or the number of reviews.
  • Genre Count vs Avg. Ratings and No. of Reviews
    On the website, a book can fall under multiple genres, so I looked at which genre counts the average ratings and reviews concentrate around. For average rating, the darker regions are mostly close to 4.00. On the Y-axis, books tagged with roughly 12 to 14 genres fall within these darker areas. For number of reviews, books with roughly 12 to 14 genres likewise get a decent number of reviews. So we can conclude that readers are most interested in books that span about 12 to 14 genres.
  • Avg. Ratings vs Total 5, 4 & 2 Star Ratings
    To see how the total numbers of 5-, 4- and 2-star ratings affect the final average rating, I created a plot. There doesn't seem to be a clear-cut relationship. But the more high-star ratings a book has, the closer it gets to a good average rating. For example, "Harry Potter and the Sorcerer's Stone" and "Harry Potter and the Philosopher's Stone" both have a decent number of 5-star and 4-star ratings, which results in an average rating of 4.43.
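
The conclusions above come from reading scatter plots by eye. As a complementary check, a correlation coefficient can put a number on "no relationship". The sketch below is not part of the original analysis: it derives the two features discussed above (description word count and genre count) and computes a Pearson coefficient with the standard library. The comma-separated genre format and the sample values are assumptions made for illustration.

```python
from math import sqrt


def word_count(description):
    """Count whitespace-separated words in a book description."""
    return len(description.split())


def genre_count(genres, sep=","):
    """Count non-empty genre labels in a delimited string (format is assumed)."""
    return len([g for g in genres.split(sep) if g.strip()])


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy values for illustration only; NOT taken from the real dataset.
descriptions = ["A short blurb", "A much longer description of the plot", "One"]
reviews = [120, 45, 300]

counts = [word_count(d) for d in descriptions]
r = pearson(counts, reviews)
print(f"word count vs reviews: r = {r:.3f}")
```

A coefficient near 0 would support the "no relationship" reading, while values near +1 or -1 would indicate a strong linear trend. Pearson only captures linear association, so a rank-based measure such as Spearman may suit heavily skewed counts like reviews better.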
