Data Thieves

Comparing GoodReads, Amazon and NYTimes

Description

This project compares bestseller lists by analyzing how each ranking system functions: GoodReads relies on its review system, Amazon on copies sold, and NYTimes on publisher copies sold. The point is to see how books earn their bestseller position and what common themes we see in all three sources.

Organization

We knew from the start which sources we wanted to scrape. Given those sources and the information each one exposes, we could only do a limited comparison. NYTimes provides a single value, the book's position on the list, and 3 of its books also appear on the other two websites. Amazon provides reviews, the book's position (a rank by most copies sold, like NYTimes) and the number of reviews. Goodreads provides a horde of reviews, rankings, number of readers and number of reviewers, an abundance of values to compare against the other two sources.

After scraping, we needed to figure out what information the sources had in common. Some books appeared in all 3 of the lists. We could obtain average reviews from 2 of the lists and compare the number of reviewers. After that, we dropped all of the irrelevant information and combined what was left into one central dataset.
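
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the utility folder: the function names, the column names (`title`, `rank`, `rating`, `num_reviews`) and the title-matching rule are all assumptions made for the example, and it uses plain Python rather than whatever the notebooks actually use.

```python
from statistics import mean


def normalize_title(title):
    """Lowercase a title and collapse whitespace so the same book matches across sites."""
    return " ".join(title.lower().split())


def index_by_title(rows):
    """Map normalized title -> row for one source (a list of dict rows)."""
    return {normalize_title(row["title"]): row for row in rows}


def combine_sources(nytimes, amazon, goodreads):
    """Keep only books found in all three lists and merge their comparable values.

    Hypothetical schema: every row has "title" and "rank"; Amazon and
    Goodreads rows also have "rating" and "num_reviews".
    """
    nyt, amz, gr = (index_by_title(rows) for rows in (nytimes, amazon, goodreads))
    common = sorted(nyt.keys() & amz.keys() & gr.keys())
    combined = []
    for key in common:
        combined.append({
            "title": nyt[key]["title"],
            "nytimes_rank": nyt[key]["rank"],
            "amazon_rank": amz[key]["rank"],
            # Only Amazon and Goodreads expose ratings, so average those two.
            "average_rating": round(mean([amz[key]["rating"], gr[key]["rating"]]), 2),
            "amazon_reviews": amz[key]["num_reviews"],
            "goodreads_reviews": gr[key]["num_reviews"],
        })
    return combined
```

Matching on a normalized title is the simplest possible join key. Real listings often differ in subtitles or editions, so a sturdier identifier (if one is available in the scraped data) would be a better choice.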

For fun, we also tried to figure out how to turn this into a product, based on a real example, and who would be interested in it.

Project structure

Assets:
- Holds one raw dataset CSV file.
Data:
- Holds all of the clean data; each scraped site is in its own CSV file.
Notebooks:
- The work of each person, not cleaned, just the scrape everybody conducted.
Utility:
- Queries and functions for organizing the datasets.
Magic Baby Maker:
- The main list generator. It draws functions from the utility folder.

Project status

Regina - crying

Edgar - saving the day

Cezar - on point

Resources

https://www.amazon.com/best-sellers-books-Amazon/zgbs/books

https://www.goodreads.com/book/most_read?category=all&country=all&duration=m

https://www.nytimes.com/books/best-sellers/2020/06/21/
