# Scavenger hunt
___

Now it's your turn! Here are the questions I would like you to answer using the data:

* How many stories (use the "id" column) are there of each type (in the "type" column) in the full table?
* How many comments have been deleted? (If a comment was deleted the "deleted" column in the comments table will have the value "True".)
* **Optional extra credit**: read about [aggregate functions other than COUNT()](https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#aggregate-functions) and modify one of the queries you wrote above to use a different aggregate function.

To answer these questions, you can fork this notebook by hitting the blue "Fork Notebook" button at the very top of this page (you may have to scroll up). "Forking" something means making a copy of it that you can edit on your own without changing the original.

In [6]:
# Your code goes here :)
import bq_helper

# Helper object for the Hacker News dataset in BigQuery's public data project
hacker_news = bq_helper.BigQueryHelper(active_project="bigquery-public-data",
                                       dataset_name="hacker_news")

In [7]:
hacker_news.list_tables()

['comments', 'full', 'full_201510', 'stories']

In [10]:
hacker_news.table_schema('full')

[SchemaField('by', 'string', 'NULLABLE', "The username of the item's author.", ()),
 SchemaField('score', 'integer', 'NULLABLE', 'Story score', ()),
 SchemaField('time', 'integer', 'NULLABLE', 'Unix time', ()),
 SchemaField('timestamp', 'timestamp', 'NULLABLE', 'Timestamp for the unix time', ()),
 SchemaField('title', 'string', 'NULLABLE', 'Story title', ()),
 SchemaField('type', 'string', 'NULLABLE', 'Type of details (comment, comment_ranking, poll, story, job, pollopt)', ()),
 SchemaField('url', 'string', 'NULLABLE', 'Story url', ()),
 SchemaField('text', 'string', 'NULLABLE', 'Story or comment text', ()),
 SchemaField('parent', 'integer', 'NULLABLE', 'Parent comment ID', ()),
 SchemaField('deleted', 'boolean', 'NULLABLE', 'Is deleted?', ()),
 SchemaField('dead', 'boolean', 'NULLABLE', 'Is dead?', ()),
 SchemaField('descendants', 'integer', 'NULLABLE', 'Number of story or poll descendants', ()),
 SchemaField('id', 'integer', 'NULLABLE', "The item's unique id.", ()),
 SchemaField('ran

In [11]:
hacker_news.head('full')

Unnamed: 0,by,score,time,timestamp,title,type,url,text,parent,deleted,dead,descendants,id,ranking
0,kevinchen,3.0,1449206166,2015-12-04 05:16:06+00:00,Let’s Encrypt and DreamHost,story,https://www.dreamhost.com/blog/2015/12/03/lets...,,,,,1.0,10674806,
1,StavrosK,,1340205250,2012-06-20 15:14:10+00:00,,comment,,I had a look at a Galaxy Tab the other day in ...,4136609.0,,,,4137339,
2,javajosh,,1353972384,2012-11-26 23:26:24+00:00,,comment,,Whoah. That's some serious balls - def someone...,4834270.0,,,,4834594,
3,synicalx,,1501636197,2017-08-02 01:09:57+00:00,,comment,,"Neato, from a &#x27;user&#x27; perspective how...",14898853.0,,,,14907052,
4,forensic,,1312344642,2011-08-03 04:10:42+00:00,,comment,,Would it be illegal to sell firmware that uses...,2839234.0,,,,2839800,


In [19]:
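# Count the ids of each item type in the full table.
# COUNT(id) has no alias, so BigQuery names the result column f0_.
query1 = """
SELECT type, COUNT(id)
FROM `bigquery-public-data.hacker_news.full`
GROUP BY type
"""
number_of_type_by_stories = hacker_news.query_to_pandas(query1)
number_of_type_by_stories.head()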

Unnamed: 0,type,f0_
0,story,2846448
1,comment,13494262
2,job,10164
3,pollopt,11806
4,poll,1728
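
The extra-credit item asks for an aggregate function other than COUNT(). Here is a minimal sketch of the idea, run against an in-memory SQLite table rather than BigQuery so that it works offline. The table name `items` and its five rows are made up for illustration; only the `type`, `id`, and `score` column names come from the schema above.

```python
import sqlite3

# Hypothetical toy data: (id, type, score). These are not real Hacker News rows.
rows = [
    (1, "story", 3),
    (2, "story", 7),
    (3, "comment", None),
    (4, "comment", None),
    (5, "job", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, type TEXT, score INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)

# Same GROUP BY shape as query1, with aliases and a second aggregate (AVG).
sql = """
SELECT type, COUNT(id) AS n_items, AVG(score) AS avg_score
FROM items
GROUP BY type
ORDER BY type
"""
result = conn.execute(sql).fetchall()
conn.close()

for row in result:
    print(row)
```

AVG ignores NULL scores, so a type whose rows all lack a score gets NULL (`None` in Python) as its average. In BigQuery, aliases such as `AS n_items` would also replace the auto-generated `f0_` column name.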


In [30]:
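# Cast the boolean "deleted" flag to an integer (True -> 1) and sum it.
# Note: this runs on the full table, so it counts deleted items of every
# type, not only comments. SUM ignores rows where "deleted" is NULL.
query2 = """
SELECT SUM(deleted_int)
FROM (
    SELECT deleted, CAST(deleted as INT64) as deleted_int
    FROM `bigquery-public-data.hacker_news.full`
    ) as SubQuery
"""
number_of_deleted_stories = hacker_news.query_to_pandas(query2)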

In [31]:
number_of_deleted_stories

Unnamed: 0,f0_
0,494700
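
The second question asks about comments specifically, whereas the `full` table mixes every item type. Here is a sketch of how a `type` filter changes the count, again on a made-up in-memory SQLite table. SQLite stores booleans as 0/1, so the cast used in query2 is not needed here. The table name `items` and its rows are illustrative only.

```python
import sqlite3

# Hypothetical toy data: (id, type, deleted). deleted is 1 or NULL.
rows = [
    (1, "comment", 1),
    (2, "comment", None),
    (3, "comment", 1),
    (4, "story", 1),
    (5, "story", None),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, type TEXT, deleted INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)

# SUM skips NULLs, so only rows flagged as deleted contribute.
deleted_all = conn.execute("SELECT SUM(deleted) FROM items").fetchone()[0]

# Restricting to comments answers the question as it was asked.
deleted_comments = conn.execute(
    "SELECT SUM(deleted) FROM items WHERE type = 'comment'"
).fetchone()[0]
conn.close()

print(deleted_all, deleted_comments)
```

In BigQuery, the same restriction could presumably be expressed by adding `WHERE type = 'comment'` to the inner query of query2, or by querying the `comments` table instead. Neither variant was run here, so no count is given for them.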


Please feel free to ask any questions you have in this notebook or in the [Q&A forums](https://www.kaggle.com/questions-and-answers)! 

Also, if you want to share or get comments on your kernel, remember you need to make it public first! You can change the visibility of your kernel under the "Settings" tab, on the right half of your screen.