# Exploring Hacker News Posts
Hacker News is a forum-style website where users submit posts that other users vote and comment on, much like Reddit. The site is very popular in technology and startup circles.

Let's explore two types of posts found on Hacker News:
- 'Ask HN' posts are questions that users ask the Hacker News community when they want advice or assistance
- 'Show HN' posts are submitted by users to show off something interesting to the Hacker News community, such as a product or project

We'll compare these two types of posts to determine:
- Whether 'Ask HN' or 'Show HN' posts receive more comments on average
- Whether posts submitted at a certain time of day receive more comments on average

In [18]:
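from csv import reader

# Read the whole dataset into a list of rows; the context manager closes the file.
with open('hacker_news.csv') as opened_file:
    hn = list(reader(opened_file))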

hn[0:3]

[['id', 'title', 'url', 'num_points', 'num_comments', 'author', 'created_at'],
 ['12224879',
  'Interactive Dynamic Video',
  'http://www.interactivedynamicvideo.com/',
  '386',
  '52',
  'ne0phyte',
  '8/4/2016 11:52'],
 ['10975351',
  'How to Use Open Source and Shut the Fuck Up at the Same Time',
  'http://hueniverse.com/2016/01/26/how-to-use-open-source-and-shut-the-fuck-up-at-the-same-time/',
  '39',
  '10',
  'josep2',
  '1/26/2016 19:30']]

In [19]:
headers = hn[0]
hn = hn[1:]
print(headers)
hn[0:5]

['id', 'title', 'url', 'num_points', 'num_comments', 'author', 'created_at']


[['12224879',
  'Interactive Dynamic Video',
  'http://www.interactivedynamicvideo.com/',
  '386',
  '52',
  'ne0phyte',
  '8/4/2016 11:52'],
 ['10975351',
  'How to Use Open Source and Shut the Fuck Up at the Same Time',
  'http://hueniverse.com/2016/01/26/how-to-use-open-source-and-shut-the-fuck-up-at-the-same-time/',
  '39',
  '10',
  'josep2',
  '1/26/2016 19:30'],
 ['11964716',
  "Florida DJs May Face Felony for April Fools' Water Joke",
  'http://www.thewire.com/entertainment/2013/04/florida-djs-april-fools-water-joke/63798/',
  '2',
  '1',
  'vezycash',
  '6/23/2016 22:20'],
 ['11919867',
  'Technology ventures: From Idea to Enterprise',
  'https://www.amazon.com/Technology-Ventures-Enterprise-Thomas-Byers/dp/0073523429',
  '3',
  '1',
  'hswarna',
  '6/17/2016 0:01'],
 ['10301696',
  'Note by Note: The Making of Steinway L1037 (2007)',
  'http://www.nytimes.com/2007/11/07/movies/07stein.html?_r=0',
  '8',
  '2',
  'walterbell',
  '9/30/2015 4:12']]

In [20]:
ask_posts = []
show_posts = []
other_posts = []

for row in hn:
    title = row[1]
    if title.lower().startswith('ask hn'):
        ask_posts.append(row)
    elif title.lower().startswith('show hn'):
        show_posts.append(row)
    else:
        other_posts.append(row)
        
print('len(ask_posts) =', len(ask_posts))
print('len(show_posts) =', len(show_posts))
print('len(other_posts) =', len(other_posts))

len(ask_posts) = 1744
len(show_posts) = 1162
len(other_posts) = 17194


In [25]:
def avg_comments(ds):
    # Sum the num_comments column (index 4) and divide by the number of rows.
    total = 0
    for row in ds:
        comments = row[4]
        total += int(comments)
    return total / len(ds)
        
print("Average comments per 'Ask HN' post:", avg_comments(ask_posts))
print("Average comments per 'Show HN' post:", avg_comments(show_posts))


Average comments per 'Ask HN' post: 14.038417431192661
Average comments per 'Show HN' post: 10.31669535283993


By summing the number of comments across all posts of each type and dividing by the number of posts, we found that 'Ask HN' posts receive more comments per post on average than 'Show HN' posts:
- 14 comments on average per 'Ask HN' post
- 10 comments on average per 'Show HN' post
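
As a cross-check on the averaging step, the same calculation can be sketched with `statistics.mean` from the standard library. The `sample_rows` list below is made-up illustration data in the dataset's column layout, not rows from `hacker_news.csv`, and `mean_comments` is a hypothetical helper name.

```python
from statistics import mean

NUM_COMMENTS = 4  # index of the num_comments column in each row


def mean_comments(rows):
    """Return the average of the num_comments column for a list of rows."""
    return mean(int(row[NUM_COMMENTS]) for row in rows)


# Made-up rows for illustration only
# (id, title, url, num_points, num_comments, author, created_at).
sample_rows = [
    ['1', 'Ask HN: Example question?', '', '5', '12', 'user_a', '8/4/2016 11:52'],
    ['2', 'Ask HN: Another question?', '', '3', '4', 'user_b', '1/26/2016 19:30'],
]

print(mean_comments(sample_rows))  # (12 + 4) / 2
```

Note that `statistics.mean` raises `StatisticsError` on an empty list, whereas the hand-written loop would raise `ZeroDivisionError`; either way, an empty group of posts needs to be handled before averaging.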

This makes sense: 'Ask HN' posts typically involve more back-and-forth between the author and readers as they try to solve the problem or answer the question.

'Show HN' posts typically present something interesting to the community and rarely need much input beyond one or two reactions from readers, whereas readers of an 'Ask HN' post may comment several times to help the author.

In [41]:
import datetime as dt

result_list = []
counts_by_hour = {}
comments_by_hour = {}

for row in ask_posts:
    created_at = row[6]
    num_comments = int(row[4])
    result_list.append([created_at, num_comments])

parse_format = '%m/%d/%Y %H:%M'

for created_at, num_comments in result_list:
    # Extract the zero-padded hour (e.g. '09') from the creation timestamp.
    hour = dt.datetime.strptime(created_at, parse_format).strftime('%H')
    if hour not in counts_by_hour:
        counts_by_hour[hour] = 1
        comments_by_hour[hour] = num_comments
    else:
        counts_by_hour[hour] += 1
        comments_by_hour[hour] += num_comments

In [47]:
avg_by_hour = []

for hour, total_comments in comments_by_hour.items():
    # Use a name other than avg_comments so the function defined earlier is not overwritten.
    avg = total_comments / counts_by_hour[hour]
    avg_by_hour.append([hour, avg])
    
avg_by_hour

[['09', 5.5777777777777775],
 ['13', 14.741176470588234],
 ['10', 13.440677966101696],
 ['14', 13.233644859813085],
 ['16', 16.796296296296298],
 ['23', 7.985294117647059],
 ['12', 9.41095890410959],
 ['17', 11.46],
 ['15', 38.5948275862069],
 ['21', 16.009174311926607],
 ['20', 21.525],
 ['02', 23.810344827586206],
 ['18', 13.20183486238532],
 ['03', 7.796296296296297],
 ['05', 10.08695652173913],
 ['19', 10.8],
 ['01', 11.383333333333333],
 ['22', 6.746478873239437],
 ['08', 10.25],
 ['04', 7.170212765957447],
 ['00', 8.127272727272727],
 ['06', 9.022727272727273],
 ['07', 7.852941176470588],
 ['11', 11.051724137931034]]

In [57]:
swap_avg_by_hour = []

for row in avg_by_hour:
    swap_avg_by_hour.append([row[1], row[0]])
    
print(swap_avg_by_hour)
sorted_swap = sorted(swap_avg_by_hour, reverse=True)
print('\nTop 5 Hours for Ask Posts Comments:')
for row in sorted_swap[:5]:
    hour = dt.datetime.strptime(row[1], '%H')
    comments = row[0]
    temp = '{0}: {1:.2f} average comments per post'.format(hour.strftime('%H:%M'), comments)
    print(temp)

[[5.5777777777777775, '09'], [14.741176470588234, '13'], [13.440677966101696, '10'], [13.233644859813085, '14'], [16.796296296296298, '16'], [7.985294117647059, '23'], [9.41095890410959, '12'], [11.46, '17'], [38.5948275862069, '15'], [16.009174311926607, '21'], [21.525, '20'], [23.810344827586206, '02'], [13.20183486238532, '18'], [7.796296296296297, '03'], [10.08695652173913, '05'], [10.8, '19'], [11.383333333333333, '01'], [6.746478873239437, '22'], [10.25, '08'], [7.170212765957447, '04'], [8.127272727272727, '00'], [9.022727272727273, '06'], [7.852941176470588, '07'], [11.051724137931034, '11']]

Top 5 Hours for Ask Posts Comments:
15:00: 38.59 average comments per post
02:00: 23.81 average comments per post
20:00: 21.52 average comments per post
16:00: 16.80 average comments per post
21:00: 16.01 average comments per post


According to the average number of comments per 'Ask HN' post grouped by hour, the best time to submit an 'Ask HN' post is 3:00 PM. The next best times are 2:00 AM (24 comments/post) and 8:00 PM (22 comments/post); the difference between those two is only about 2 comments, so you're likely better off posting at 8 PM than waking up at 2 AM for a minimal increase in comments.

Fortunately, 3:00 PM is a convenient time to post for most people, and its average (39 comments/post) is much higher than the averages at 2 AM and 8 PM.

If you have an important question about something you're developing, or are desperate for any and every piece of advice you can get, submit your 'Ask HN' post at 3:00 PM for the best chance of receiving the most comments!
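
The hour-by-hour ranking above can also be sketched more compactly. The version below groups comment counts by hour with `collections.defaultdict`, sorts with a `key` function instead of swapping columns, and labels each hour on a 12-hour clock. The `sample_posts` pairs and the `top_hours` helper are assumptions made for illustration; they are not part of the original analysis or the dataset.

```python
import datetime as dt
from collections import defaultdict

PARSE_FORMAT = '%m/%d/%Y %H:%M'


def top_hours(posts, n=5):
    """Return up to n (hour, average comments) pairs, highest average first.

    posts is an iterable of (created_at, num_comments) pairs, where
    created_at is a string such as '8/4/2016 11:52'.
    """
    comments_by_hour = defaultdict(list)
    for created_at, num_comments in posts:
        hour = dt.datetime.strptime(created_at, PARSE_FORMAT).hour
        comments_by_hour[hour].append(num_comments)

    averages = [(hour, sum(vals) / len(vals))
                for hour, vals in comments_by_hour.items()]
    # Sort on the average (second element) rather than swapping the columns.
    return sorted(averages, key=lambda pair: pair[1], reverse=True)[:n]


# Made-up (created_at, num_comments) pairs for illustration only.
sample_posts = [
    ('8/4/2016 15:10', 30),
    ('8/5/2016 15:45', 10),
    ('8/5/2016 2:05', 8),
    ('8/6/2016 20:30', 12),
]

for hour, avg in top_hours(sample_posts):
    label = dt.time(hour).strftime('%I:%M %p')  # 12-hour clock label for the hour
    print('{0}: {1:.2f} average comments per post'.format(label, avg))
```

Keeping the hour as an integer until display time avoids a second `strptime` call when printing, and the per-hour lists make it easy to also report how many posts each average is based on.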