In [1]:
import sys
import os

sys.path.insert(0, '/usr/hdp/current/spark2-client/python')
sys.path.insert(0, '/usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip')

os.environ['SPARK_HOME'] = '/usr/hdp/current/spark2-client/'
os.environ['SPARK_CONF_DIR'] = '/etc/spark2/conf'
os.environ['PYSPARK_PYTHON'] = '/opt/anaconda3/bin/python'

import pyspark
conf = pyspark.SparkConf()
conf.setMaster("yarn")
conf.set("spark.driver.memory","5g")
conf.set("spark.executor.instances", "7")
conf.set("spark.executor.memory","40g")
conf.set("spark.executor.cores","8")

sc = pyspark.SparkContext(conf=conf)

In [2]:
sc

In [3]:
!hdfs dfs -ls -h

Found 6 items
drwxr-xr-x   - fs926226 hadoop           0 2020-01-05 13:47 .sparkStaging
-rw-r--r--   3 fs926226 hadoop       1.1 G 2020-01-03 14:26 Kindle_Store.json.gz
drwxr-xr-x   - fs926226 hadoop           0 2020-01-02 14:12 apps
drwxr-xr-x   - fs926226 hadoop           0 2020-01-02 14:12 data
drwxr-xr-x   - fs926226 hadoop           0 2020-01-02 14:19 kindle-reviews
-rw-r--r--   3 fs926226 hadoop      11.4 G 2020-01-03 14:52 meta_Kindle_Store.json.gz


### Amazon Kindle Store Category Reviews

Amazon kindle product reviews and ratings are a very important business. Customers on Amazon often make purchasing decisions based on those reviews, and a single bad review can cause a potential purchaser to reconsider.

This is a subset of dataset of product reviews from Amazon Kindle Store category.

This Dataset is an updated version of the Amazon review dataset released in 2014. As in the previous version, this dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs). In addition, this version provides the following features:
More reviews:
The total number of reviews is 233.1 million (142.8 million in 2014).
Newer reviews:
Current data includes reviews in the range May 1996 - Oct 2018.
Each reviewer has at least 5 reviews and each product has at least 5 reviews in this dataset.

The Contents of the dataset:

•	asin - ID of the product, like B000FA64PK , Amazon Standard Identification Number. The Amazon Standard     Identification Number (ASIN) is a 10-character alphanumeric unique identifier assigned by Amazon.com for product  identification.
•	vote - helpfulness rating of the review 
•	overall - rating of the product. 
•	reviewText - text of the review (heading). 
•	reviewTime - time of the review (raw). 
•	reviewerID - ID of the reviewer, like A3SPTOKDG7WBLN 
•	reviewerName - name of the reviewer. 
•	summary - summary of the review (description). 
•	unixReviewTime - unix timestamp.
•	Style - Paperback /hardcover
•	verified - Amazon verified /not Verified

Metadata file(more than 40gb) consists of information such as 
•	also_buy, also_view, asin, brand, category, main_cat, details, price, rank, title

From this dataset, several analyses are possible some of them are as follows:
1.   Find the books which have the highest ratings over the years.
2.   Find the unique books, reviewers and the ratings given by each reviewer.
3.   The number of books read by each user.
4.   The highest read book.
5.   The data such as highest and the lowest ratings given by the customers can be used for sentimental analysis.
6.   Find users who read the books most frequently and take the data to analyze which types of books people like to        read and make an important marketting decisions.
7.   The number of books read every day/month/year.
8.   Find the count and the respective category of books in Amazon Kindle Store.
8.   From every data many different analysis can be done and each information adds some value to the company's future      growth area.

## Task 1. Basic Computations
- Content of the dataset
- Reading the json file in sql sparkContext
- Calculating the Unique Book ID's and the count
- Calculating the Unique Reviewer ID's and the count
- Calculating the Books with highest rating and the review count
- Calculating the count and the % of the Amazon verified customers and Not Verified Customers

In [4]:
rev = sc.textFile("Kindle_Store.json.gz")

In [5]:
rev.take(4)

['{"overall": 4.0, "verified": true, "reviewTime": "12 29, 2012", "reviewerID": "A27UD5HYAKBL97", "asin": "1423600150", "style": {"Format:": " Hardcover"}, "reviewerName": "Cheryl", "reviewText": "If you like making salsas this is a great book with different ideas for party dips. I gave it as a gift.", "summary": "Great Book", "unixReviewTime": 1356739200}',
 '{"overall": 5.0, "vote": "3", "verified": true, "reviewTime": "03 6, 2012", "reviewerID": "A8P5DK8LLOYGH", "asin": "1423600150", "style": {"Format:": " Hardcover"}, "reviewerName": "Shay365", "reviewText": "great little book. simple and right to the point. A good basic Salsas and Tacos cooking guide.  I found it quite useful.", "summary": "great little book", "unixReviewTime": 1330992000}',
 '{"overall": 5.0, "verified": true, "reviewTime": "08 12, 2009", "reviewerID": "A3OM9W7DXSUIIY", "asin": "1423600150", "style": {"Format:": " Hardcover"}, "reviewerName": "R. Peckham", "reviewText": "This book has good pics of the recipes and

In [6]:
sqlContext = pyspark.SQLContext(sc)
sqlContext

<pyspark.sql.context.SQLContext at 0x7fbb7ee342d0>

In [7]:
re = sqlContext.read.option("multiline", "true").json(rev).cache()
display(re)

DataFrame[asin: string, image: array<string>, overall: double, reviewText: string, reviewTime: string, reviewerID: string, reviewerName: string, style: struct<Format::string>, summary: string, unixReviewTime: bigint, verified: boolean, vote: string]

In [8]:
re.printSchema()

root
 |-- asin: string (nullable = true)
 |-- image: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- overall: double (nullable = true)
 |-- reviewText: string (nullable = true)
 |-- reviewTime: string (nullable = true)
 |-- reviewerID: string (nullable = true)
 |-- reviewerName: string (nullable = true)
 |-- style: struct (nullable = true)
 |    |-- Format:: string (nullable = true)
 |-- summary: string (nullable = true)
 |-- unixReviewTime: long (nullable = true)
 |-- verified: boolean (nullable = true)
 |-- vote: string (nullable = true)



In [9]:
%%time
re.count()

CPU times: user 6.49 ms, sys: 3.92 ms, total: 10.4 ms
Wall time: 1min 3s


5722988

In [10]:
re.show(10)

+----------+-----+-------+--------------------+-----------+--------------+--------------------+-----------------+--------------------+--------------+--------+----+
|      asin|image|overall|          reviewText| reviewTime|    reviewerID|        reviewerName|            style|             summary|unixReviewTime|verified|vote|
+----------+-----+-------+--------------------+-----------+--------------+--------------------+-----------------+--------------------+--------------+--------+----+
|1423600150| null|    4.0|If you like makin...|12 29, 2012|A27UD5HYAKBL97|              Cheryl|     [ Hardcover]|          Great Book|    1356739200|    true|null|
|1423600150| null|    5.0|great little book...| 03 6, 2012| A8P5DK8LLOYGH|             Shay365|     [ Hardcover]|   great little book|    1330992000|    true|   3|
|1423600150| null|    5.0|This book has goo...|08 12, 2009|A3OM9W7DXSUIIY|          R. Peckham|     [ Hardcover]|very good bok wit...|    1250035200|    true|null|
|1423600150| nul

In [11]:
re.registerTempTable("reviews")

#### Calculating the Unique Book ID's and the count

In [12]:
uniqueproductsid = sqlContext.sql("SELECT  asin AS Book_ID, COUNT(asin) AS REVIEWED_COUNT  \
                                FROM reviews where asin is not null GROUP BY asin ORDER BY count(asin) DESC")



In [13]:
uniqueproductsid.count()

493849

In [14]:
uniqueproductsid.show(20)

+----------+--------------+
|   Book_ID|REVIEWED_COUNT|
+----------+--------------+
|B00C2WDD5I|         15345|
|B00YN6XHMU|         14568|
|B00DMCV7K0|         12061|
|B0142IHZPI|          9428|
|B00XSSYR50|          8054|
|B00WCD5GYS|          6458|
|B00571F26Y|          6124|
|B00ABLJ5X6|          5912|
|B015BIHKH6|          5795|
|B001BPYMCU|          5785|
|B017R65QS0|          5367|
|B006KWAKDE|          4650|
|B00CATSONE|          4433|
|B00GIUG3ES|          4362|
|B01D6NM4VA|          4078|
|B00IJYII4E|          3926|
|B018SCGDWK|          3922|
|B00EFWU91E|          3767|
|B00EOARZ4G|          3753|
|B012P4ZORW|          3514|
+----------+--------------+
only showing top 20 rows



#### Calculating the Unique Reviewer ID's and the count

In [15]:
uniquereviewersid = sqlContext.sql("SELECT reviewerID AS USERID, COUNT(reviewerID) AS NUM_OF_BOOKS \
                                FROM reviews where reviewerID like 'A%' GROUP BY reviewerID ORDER BY NUM_OF_BOOKS DESC")
uniquereviewersid.count()

2409262

In [16]:
uniquereviewersid.show()

+--------------+------------+
|        USERID|NUM_OF_BOOKS|
+--------------+------------+
|A328S9RN3U5M68|        2031|
|A13QTZ8CIMHHG4|        1606|
|A1JLU5H1CCENWX|        1391|
|A2W4Z0J9DFZFSR|        1184|
|A1CIS4LOWYGZGA|        1080|
|A3GWE80SUGORJD|        1016|
| A2YJ8VP1SSHJ7|         948|
|A37LY77Q2YPJVL|         946|
| AOYBZI9248RXA|         932|
|A2VXSQHJWZAQGY|         914|
|A37BRR2L8PX3R2|         906|
|A2AOG5TS7W6OXY|         877|
| A320TMDV6KCFU|         863|
|A2G5IFYYHFIQNB|         860|
|A3L1Z0R3ZB2OFU|         852|
|A2Z3DQ7P9JYRLG|         850|
| AG63N054UPJH2|         845|
|A28D20IM3BNAJ1|         837|
|A16AL32R7ZZVXB|         813|
|A1ESF76N9NLS0P|         800|
+--------------+------------+
only showing top 20 rows



#### Calculating the Books with highest rating and the review count

In [17]:
highestrating = sqlContext.sql("SELECT  asin AS BOOK_ID, COUNT(asin) AS REVIEWED_COUNT,overall AS RATING \
                                FROM reviews where overall='5' GROUP BY overall,asin ORDER BY RATING, REVIEWED_COUNT DESC")
highestrating.show()

+----------+--------------+------+
|   BOOK_ID|REVIEWED_COUNT|RATING|
+----------+--------------+------+
|B00YN6XHMU|          9383|   5.0|
|B00DMCV7K0|          8953|   5.0|
|B00C2WDD5I|          7036|   5.0|
|B00XSSYR50|          6630|   5.0|
|B0142IHZPI|          5843|   5.0|
|B001BPYMCU|          4638|   5.0|
|B017R65QS0|          4092|   5.0|
|B00571F26Y|          4078|   5.0|
|B00WCD5GYS|          3783|   5.0|
|B015BIHKH6|          3294|   5.0|
|B00ABLJ5X6|          3262|   5.0|
|B00GIUG3ES|          3149|   5.0|
|B00EFWU91E|          3069|   5.0|
|B006KWAKDE|          3044|   5.0|
|B00EFWU9QY|          2991|   5.0|
|B00ESJ3S94|          2912|   5.0|
|B01D6NM4VA|          2607|   5.0|
|B012P4ZORW|          2453|   5.0|
|B00IJYII4E|          2388|   5.0|
|B017H8DIBK|          2373|   5.0|
+----------+--------------+------+
only showing top 20 rows



#### Calculating the count and the % of the Amazon verified customers Vs Not Verified Customers

In [18]:
verifiedpurchase = sqlContext.sql("SELECT verified , COUNT(*) AS VERIFIED_PURCHASE, (COUNT(*)*100)/5722988 AS VERIFIED_PUR_PER FROM reviews GROUP BY verified")
verifiedpurchase.show()

+--------+-----------------+-----------------+
|verified|VERIFIED_PURCHASE| VERIFIED_PUR_PER|
+--------+-----------------+-----------------+
|    true|          4036164|70.52546676666105|
|   false|          1686824|29.47453323333895|
+--------+-----------------+-----------------+



## Task 2. Calculate the number of reviews every day/month/year

In [19]:
date= sqlContext.sql("SELECT from_unixtime(unixReviewTime,'YYYY-MM-dd') AS DATE, COUNT(*) AS NUM_REVIEWS FROM reviews\
                    WHERE unixReviewTime is not null GROUP BY DATE  ORDER BY NUM_REVIEWS DESC")

date.show(20)

+----------+-----------+
|      DATE|NUM_REVIEWS|
+----------+-----------+
|2015-02-19|      12301|
|2016-03-28|      11457|
|2016-02-15|      10333|
|2016-04-18|       8864|
|2015-02-27|       8836|
|2015-10-06|       8613|
|2015-03-09|       8612|
|2015-09-24|       8118|
|2015-05-29|       8055|
|2016-03-02|       8012|
|2016-01-22|       7865|
|2015-06-07|       7661|
|2016-02-03|       7486|
|2016-03-14|       7368|
|2016-06-06|       7071|
|2015-07-08|       7069|
|2015-03-17|       6965|
|2016-02-26|       6695|
|2016-07-14|       6637|
|2014-09-30|       6592|
+----------+-----------+
only showing top 20 rows



In [20]:
date.registerTempTable("date_new")
date.printSchema()

root
 |-- DATE: string (nullable = true)
 |-- NUM_REVIEWS: long (nullable = false)



In [21]:
from pyspark.sql.types import StringType
def displaymonth(s):
    month_lst = ['January', 'Feburary', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']
    mon = month_lst[(s%12)-1]
    return mon

sqlContext.udf.register("display_month",displaymonth)

month_mod = sqlContext.sql("SELECT YEAR(DATE) AS year , display_month(MONTH(DATE)) AS month ,SUM(NUM_REVIEWS) AS values \
                            FROM date_new GROUP BY MONTH(DATE),YEAR(DATE) ORDER BY year DESC")
month_mod.show(20)


+----+---------+------+
|year|    month|values|
+----+---------+------+
|2018|  October|    11|
|2018|    March| 31111|
|2018|   August| 14370|
|2018|September|  4308|
|2018|     July| 15892|
|2018|    April| 27774|
|2018| Feburary| 30676|
|2018|  January| 36293|
|2018|      May| 25597|
|2018|     June| 26381|
|2018| December|  1141|
|2017|    April| 43889|
|2017|September| 42601|
|2017|     June| 33616|
|2017|  January| 76292|
|2017| Feburary| 46676|
|2017| December| 29160|
|2017|     July| 45643|
|2017|   August| 50148|
|2017| November| 30403|
+----+---------+------+
only showing top 20 rows



## Task 3 Below are the Ratings and respective % of the people who rated Amazon Kindle books

- Example 60% of people have given Rating as 5

In [22]:
rating_per = sqlContext.sql("SELECT  overall  AS RATING, (COUNT(*)*100/5722988) AS REVIEW_PERC\
                                FROM reviews where overall is not null GROUP BY overall ORDER BY RATING DESC")
rating_per.show()

+------+--------------------+
|RATING|         REVIEW_PERC|
+------+--------------------+
|   5.0|   60.80355227024764|
|   4.0|  21.878588597424983|
|   3.0|   8.746864400204927|
|   2.0|  3.8615142998727237|
|   1.0|   4.709462958859953|
|   0.0|1.747338977471209E-5|
+------+--------------------+



## Task 4 User Interactive Analysis from the dataset
- If the user enters the BOOK ID(Product ID -asin), Rating of the book and the respective summary of the book will be displayed 

In [23]:
bookid = input("Enter the book id :")
bookdetails = sqlContext.sql("SELECT overall, reviewText, summary FROM reviews WHERE asin='%s' ORDER BY overall DESC" %bookid)
bookdetails.show(10)

Enter the book id :B00YN6XHMU
+-------+--------------------+--------------------+
|overall|          reviewText|             summary|
+-------+--------------------+--------------------+
|    5.0|    Love this series|             Awesome|
|    5.0|Having read the f...|          Important!|
|    5.0|Very good read .....|      Very good read|
|    5.0|As usual another ...|Stop reading revi...|
|    5.0|these books are a...|    love these books|
|    5.0|A lot of FSOG fan...|A nice treat whil...|
|    5.0|Continuing the st...|                Yum.|
|    5.0|Grey ends on a ve...|I hope that James...|
|    5.0|The movies were a...| Love love this book|
|    5.0|I loved this whol...|        True Romance|
+-------+--------------------+--------------------+
only showing top 10 rows



In [24]:
reviewer_id = input("Enter the reviewer id :")
reviewdetails = sqlContext.sql("SELECT asin,reviewTime,summary FROM reviews WHERE reviewerID='%s'" %reviewer_id)
reviewdetails.show(10)

Enter the reviewer id :A1JLU5H1CCENWX
+----------+-----------+----------+
|      asin| reviewTime|   summary|
+----------+-----------+----------+
|B001892DGG| 11 3, 2013|       Max|
|B001CN45ZA| 11 3, 2013|     Simon|
|B0026OQYYO| 11 3, 2013|      Rick|
|B002A4MICW| 11 3, 2013|    Adrian|
|B002VFPS9A| 11 3, 2013|      Gabe|
|B0043GX2FW|12 14, 2013|     Jesse|
|B004OA5ZTS|01 29, 2014|      Alex|
|B004XW3LLQ|06 22, 2013|Luke stark|
|B0052ZAV9S| 08 1, 2013|    Travis|
|B0055OPM4U|09 28, 2014|     Valor|
+----------+-----------+----------+
only showing top 10 rows



## Task 5 - Reading and Tasks with MetaData - using join

In [25]:
rem_rdd = sc.textFile("meta_Kindle_Store.json.gz")

In [26]:
rem_rdd.cache()

meta_Kindle_Store.json.gz MapPartitionsRDD[150] at textFile at NativeMethodAccessorImpl.java:0

In [27]:
rem_df = sqlContext.read.option("multiline", "true").json(rem_rdd).cache()

In [28]:
rem_df.printSchema()


root
 |-- also_buy: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- also_view: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- asin: string (nullable = true)
 |-- brand: string (nullable = true)
 |-- category: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- details: string (nullable = true)
 |-- image: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- main_cat: string (nullable = true)
 |-- price: string (nullable = true)
 |-- rank: string (nullable = true)
 |-- title: string (nullable = true)



In [29]:
rem_df.show(5)

+--------------------+--------------------+----------+--------------------+--------------------+--------------------+-----+------------+-----+--------------------+-----+
|            also_buy|           also_view|      asin|               brand|            category|             details|image|    main_cat|price|                rank|title|
+--------------------+--------------------+----------+--------------------+--------------------+--------------------+-----+------------+-----+--------------------+-----+
|                null|                null|0143065971|Visit Amazon's Ra...|[Kindle Store, Ki...|                null| null|Buy a Kindle| null|1,857,911PaidinKi...| null|
|                null|[B010CKZO4E, B015...|1423600150|     Susan D. Curtis|[Kindle Store, Ki...|                null| null|Buy a Kindle| null|682,905PaidinKind...| null|
|[B007NLCJBC, B01F...|[B000FBF81K, B00P...|B000FA5KKA|    Arthur K. Barnes|[Kindle Store, Ki...|
  <div class="co...| null|Buy a Kindle| null|1,716,84

In [30]:
rem_df.count()

493859

In [31]:
rem_df.registerTempTable("reviews_meta")

#### The data of the book ID , Books Purchased and the Books Viewed are as follows

In [37]:
Books_bought = sqlContext.sql("SELECT asin as BookID, also_buy AS Books_Purchased,also_view AS View_Suggestions \
                              FROM reviews_meta where asin is not null and also_buy is not null and also_view is not null \
                              and asin like 'B%'")
Books_bought.show()


+----------+--------------------+--------------------+
|    BookID|     Books_Purchased|    View_Suggestions|
+----------+--------------------+--------------------+
|B000FA5KKA|[B007NLCJBC, B01F...|[B000FBF81K, B00P...|
|B000FA5M3K|[B00AYWTHZS, B071...|        [B00AYWTHZS]|
|B000FA5KX2|[B000SEGKF2, B004...|[B018LE1KUK, B000...|
|B000FA64PK|[B000FA64QO, B005...|[B000FC1BN8, B005...|
|B000FA5PV4|[B074YLWXKY, B076...|[B01HP8PICO, B000...|
|B000FA65EK|[B000FC1A9I, B00F...|[B004N624O2, B000...|
|B000FA671Q|[B003WJQ6LS, B001...|[B00D2DU5AW, B003...|
|B000FA65ZY|[B00B7TD69U, B00M...|[B00B7TD69U, B00M...|
|B000FA64QO|[B000FA64PK, B005...|[B000JMKNQ0, B000...|
|B000FBFMHU|[B003F3PKW2, B008...|[B000FC1GOC, B000...|
|B000FBFJ3W|[B00AQZAI9M, B005...|[B00E7VB3VS, B01L...|
|B000FBFM1G|[B000FBFM4I, B000...|[B000FBFM4I, B07B...|
|B000FBFMUM|[B072Q1KBRT, B00J...|[B01KUGTMQG, B004...|
|B000FBFJNC|        [B073TJBYTB]|[B073RPMC4L, B001...|
|B000FBFLPI|[B00J2DZIB2, B002...|[B0732Q2Y92, B00C...|
|B000FBJBA

#### The data of the Book ID , Genre of the book and the count of the Books in the respective genre
- Joined asin(Book ID ) from table reviews with Category of table reviews_meta

In [33]:
cate = sqlContext.sql("SELECT r.asin , rm.category[2] AS Book_Cat , COUNT(*) AS BOOK_COUNT  FROM reviews_meta as rm INNER JOIN reviews as r \
                      ON rm.asin = r.asin WHERE r.asin like 'B%' GROUP BY r.asin ,rm.category[2] ORDER BY BOOK_COUNT DESC")                      
cate.show(truncate = False)


+----------+----------------------------+----------+
|asin      |Book_Cat                    |BOOK_COUNT|
+----------+----------------------------+----------+
|B00C2WDD5I|Literature & Fiction        |15345     |
|B00YN6XHMU|Literature & Fiction        |14568     |
|B00DMCV7K0|Literature & Fiction        |12061     |
|B0142IHZPI|Literature & Fiction        |9428      |
|B00XSSYR50|Politics & Social Sciences  |8054      |
|B00WCD5GYS|Arts & Photography          |6458      |
|B00571F26Y|Health, Fitness & Dieting   |6124      |
|B00ABLJ5X6|Mystery, Thriller & Suspense|5912      |
|B015BIHKH6|Romance                     |5795      |
|B001BPYMCU|Business & Money            |5785      |
|B017R65QS0|Literature & Fiction        |5367      |
|B006KWAKDE|Teen & Young Adult          |4650      |
|B00CATSONE|Biographies & Memoirs       |4433      |
|B00GIUG3ES|Science Fiction & Fantasy   |4362      |
|B01D6NM4VA|Literature & Fiction        |4078      |
|B00IJYII4E|Mystery, Thriller & Suspense|3926 

#### Below are the count of the Books in each Genre(Category of the Book)
- Joined asin(Book ID ) from table reviews with Category of table reviews_meta and display only the Category and the respective count of each category of Books in Amazon Kindle Store

In [35]:
cat_genre = sqlContext.sql("SELECT rm.category[2] AS Book_Cat , COUNT(*) AS BOOK_COUNT  FROM reviews_meta as rm INNER JOIN reviews as r \
                      ON rm.asin = r.asin WHERE r.asin like 'B%' GROUP BY rm.category[2] ORDER BY BOOK_COUNT DESC")                      
cat_genre.show(truncate = False)

+----------------------------+----------+
|Book_Cat                    |BOOK_COUNT|
+----------------------------+----------+
|Literature & Fiction        |2190878   |
|Romance                     |795983    |
|Religion & Spirituality     |401463    |
|Science Fiction & Fantasy   |370056    |
|Mystery, Thriller & Suspense|318444    |
|Children's eBooks           |220466    |
|Health, Fitness & Dieting   |191341    |
|Teen & Young Adult          |181305    |
|Business & Money            |180389    |
|Biographies & Memoirs       |96813     |
|Cookbooks, Food & Wine      |90089     |
|Humor & Entertainment       |77882     |
|History                     |77074     |
|Politics & Social Sciences  |60615     |
|Arts & Photography          |46133     |
|Sports & Outdoors           |44119     |
|Education & Teaching        |41389     |
|Crafts, Hobbies & Home      |41353     |
|Computers & Technology      |35098     |
|Reference                   |32341     |
+----------------------------+----

In [38]:
sc.stop()