epj-alter/Data-Thieves-Project

Ironhack Logo

Data Thieves

Comparing GoodReads, Amazon and NYTimes

Description

This project compares bestseller lists by analyzing how each ranking system works: GoodReads uses its review system, Amazon uses copies sold, and NYTimes uses publisher-reported copies sold. The point is to see how books earn their bestseller position and what common themes we see across all three sources.

Organization

We knew which sources we wanted to scrape. Given those sources and the information we could obtain from them, only a limited comparison was possible. NYTimes provides only one value, the book's place on the list, and only 3 of its books also appear on the other 2 websites. Amazon has reviews, the book's place (rank by most sold, like NYTimes) and the number of reviews. Goodreads has a horde of reviews, rankings, number of readers and number of reviewers: an abundance of values to compare against the other two sources.
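To make the difference in available values concrete, here is a minimal sketch of one record per source. The class and field names (`NYTimesEntry`, `avg_rating`, `num_reviews`, and so on) are assumptions for illustration, not the names used in the project's notebooks or CSV files.

```python
from dataclasses import dataclass, fields


@dataclass
class NYTimesEntry:
    title: str
    rank: int  # the only value NYTimes provides


@dataclass
class AmazonEntry:
    title: str
    rank: int  # place among the most sold
    avg_rating: float  # hypothetical name for the review score
    num_reviews: int


@dataclass
class GoodreadsEntry:
    title: str
    rank: int
    avg_rating: float
    num_reviews: int  # described in the text as number of reviewers
    num_readers: int


def shared_fields(*entry_types):
    """Return the field names present in every entry type, in the first type's order."""
    common = set.intersection(*({f.name for f in fields(t)} for t in entry_types))
    return [f.name for f in fields(entry_types[0]) if f.name in common]
```

Calling `shared_fields(NYTimesEntry, AmazonEntry, GoodreadsEntry)` shows how little all three sources share, while passing only the Amazon and Goodreads types exposes the rating and review-count values available for a two-way comparison.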

After scraping, we needed to figure out what information the sources had in common. Some books were in all 3 of the lists. We could obtain average reviews from 2 of the lists and compare the number of reviewers. After that, we dropped all of the irrelevant information and created one central dataset.
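The matching step described above could look roughly like the sketch below. It is not the project's actual code: the function names, the dictionary keys (`rank`, `rating`, `num_reviews`) and the title normalization are all assumptions, and real scraped titles would likely need more careful cleaning.

```python
def normalize(title):
    """Lowercase and collapse whitespace so the same book matches across sites."""
    return " ".join(title.lower().split())


def merge_common(nytimes, amazon, goodreads):
    """Keep only books present in all three lists and combine their values.

    Each argument maps a raw title to a dict of that source's values.
    Anything not copied into the merged record is dropped as irrelevant.
    """
    ny = {normalize(t): v for t, v in nytimes.items()}
    am = {normalize(t): v for t, v in amazon.items()}
    gr = {normalize(t): v for t, v in goodreads.items()}

    merged = {}
    # Set intersection of the keys gives the books found in every source.
    for key in sorted(ny.keys() & am.keys() & gr.keys()):
        merged[key] = {
            "nytimes_rank": ny[key]["rank"],
            "amazon_rank": am[key]["rank"],
            "goodreads_rank": gr[key]["rank"],
            # Ratings exist for only two of the three sources.
            "amazon_rating": am[key]["rating"],
            "goodreads_rating": gr[key]["rating"],
            "amazon_reviews": am[key]["num_reviews"],
            "goodreads_reviews": gr[key]["num_reviews"],
        }
    return merged
```

The result is a single dictionary keyed by normalized title, which can then be written out as one central CSV file.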

For fun, we tried to figure out how to make it a product, based on a real example and who would be interested in it.

Project structure

  • Assets:
    • Holds one raw dataset CSV file.
  • Data:
    • All clean data; each scraped site has its own CSV file.
  • Notebooks:
    • Each person's work: not cleaned, just the scrape each of us conducted.
  • Utility:
    • Queries and functions for organizing the datasets.
  • Magic Baby Maker:
    • The main list generator. It draws functions from the Utility folder.

Project status

Regina - crying

Edgar - saving the day

Cezar - on point

Resources

  • https://www.amazon.com/best-sellers-books-Amazon/zgbs/books
  • https://www.goodreads.com/book/most_read?category=all&country=all&duration=m
  • https://www.nytimes.com/books/best-sellers/2020/06/21/

Files
