### Exploring Hacker News Posts

We're specifically interested in posts whose titles begin with either Ask HN or Show HN. Users submit Ask HN posts to ask the Hacker News community a specific question. Likewise, users submit Show HN posts to show the Hacker News community a project, product, or just generally something interesting.

We'll compare these two types of posts to determine the following:

- Do Ask HN or Show HN posts receive more comments on average?
- Do posts created at a certain time receive more comments on average?

In [4]:
from csv import reader

# Read every row into a list; the with block closes the file once reading is done.
with open('hacker_news.csv') as opened_file:
    hn = list(reader(opened_file))

for row in hn[0:6]:
    print(row)
    print('\n')

['id', 'title', 'url', 'num_points', 'num_comments', 'author', 'created_at']


['12224879', 'Interactive Dynamic Video', 'http://www.interactivedynamicvideo.com/', '386', '52', 'ne0phyte', '8/4/2016 11:52']


['10975351', 'How to Use Open Source and Shut the Fuck Up at the Same Time', 'http://hueniverse.com/2016/01/26/how-to-use-open-source-and-shut-the-fuck-up-at-the-same-time/', '39', '10', 'josep2', '1/26/2016 19:30']


['11964716', "Florida DJs May Face Felony for April Fools' Water Joke", 'http://www.thewire.com/entertainment/2013/04/florida-djs-april-fools-water-joke/63798/', '2', '1', 'vezycash', '6/23/2016 22:20']


['11919867', 'Technology ventures: From Idea to Enterprise', 'https://www.amazon.com/Technology-Ventures-Enterprise-Thomas-Byers/dp/0073523429', '3', '1', 'hswarna', '6/17/2016 0:01']


['10301696', 'Note by Note: The Making of Steinway L1037 (2007)', 'http://www.nytimes.com/2007/11/07/movies/07stein.html?_r=0', '8', '2', 'walterbell', '9/30/2015 4:12']




In [5]:
headers = hn[0]
hn = hn[1:]

print(hn[0])

['12224879', 'Interactive Dynamic Video', 'http://www.interactivedynamicvideo.com/', '386', '52', 'ne0phyte', '8/4/2016 11:52']


In [7]:
ask_posts = []
show_posts = []
other_posts = []

for row in hn:
    title = row[1]
    if title.lower().startswith('ask hn'):
        ask_posts.append(row)
    elif title.lower().startswith('show hn'):
        show_posts.append(row)
    else:
        other_posts.append(row)
        
print(len(ask_posts))
print(len(show_posts))
print(len(other_posts))      

1744
1162
17194
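
The prefix check used in the loop can also be wrapped in a small helper, which makes it easy to try on individual titles. This is only a sketch: the name `classify_title` and the sample titles are made up for illustration and are not part of the notebook or the dataset.

```python
def classify_title(title):
    """Return 'ask', 'show', or 'other' based on a case-insensitive prefix check."""
    lowered = title.lower()
    if lowered.startswith('ask hn'):
        return 'ask'
    if lowered.startswith('show hn'):
        return 'show'
    return 'other'


# Hypothetical titles, used only to exercise the helper.
sample_titles = [
    'Ask HN: How do you learn a new language?',
    'SHOW HN: My weekend project',
    'Interactive Dynamic Video',
]

labels = [classify_title(title) for title in sample_titles]
print(labels)
```

Lowercasing the title once before both checks keeps the comparison case-insensitive, so `SHOW HN` and `Show HN` land in the same group.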


### Calculating the Average Number of Comments for Ask HN and Show HN Posts

In [10]:
total_ask_comments = 0

for post in ask_posts:
    num_comments = int(post[4])
    total_ask_comments += num_comments
    
avg_ask_comments = total_ask_comments / len(ask_posts)
print('Average number of ask comments ', avg_ask_comments)

Average number of ask comments  14.038417431192661


In [11]:
total_show_comments = 0

for post in show_posts:
    num_comments = int(post[4])
    total_show_comments += num_comments

avg_show_comments = total_show_comments / len(show_posts)
print('Average number of show comments ', avg_show_comments)    

Average number of show comments  10.31669535283993


On average, Ask HN posts receive about 36% more comments than Show HN posts (14.04 versus 10.32 comments per post).
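
The relative difference can be checked directly from the two averages printed above. A minimal sketch, with the values copied from the cell outputs rather than recomputed from the dataset (`relative_diff` is an illustrative name):

```python
avg_ask_comments = 14.038417431192661   # printed by the Ask HN cell
avg_show_comments = 10.31669535283993   # printed by the Show HN cell

# Relative difference, using the Show HN average as the baseline.
relative_diff = (avg_ask_comments - avg_show_comments) / avg_show_comments

print('Ask HN posts receive {:.0%} more comments on average'.format(relative_diff))
```

Using the Show HN average as the denominator answers "how many more comments than a Show HN post"; dividing by the Ask HN average instead would answer a different question and give a smaller figure.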

### Finding the Number of Ask Posts and Comments by Hour Created

In [32]:
import datetime as dt
result_list = []
for post in ask_posts:
    created_at = post[6]
    num_comments = int(post[4])
    result_list.append([created_at, num_comments])
    
counts_by_hour = {}
comments_by_hour = {}

for result in result_list:
    hour = dt.datetime.strptime(result[0], '%m/%d/%Y %H:%M').hour
    
    if hour not in counts_by_hour:
        counts_by_hour[hour] = 1
        comments_by_hour[hour] = result[1]
    else:
        counts_by_hour[hour] += 1
        comments_by_hour[hour] += result[1]
    
for hour, counts in counts_by_hour.items():
    print('Number of post on hour {0}: {1}'.format(hour, counts))
    
print('\n')
for hour, counts in comments_by_hour.items():
    print('Number of comments on hour {0}: {1}'.format(hour, counts))

Number of post on hour 9: 45
Number of post on hour 13: 85
Number of post on hour 10: 59
Number of post on hour 14: 107
Number of post on hour 16: 108
Number of post on hour 23: 68
Number of post on hour 12: 73
Number of post on hour 17: 100
Number of post on hour 15: 116
Number of post on hour 21: 109
Number of post on hour 20: 80
Number of post on hour 2: 58
Number of post on hour 18: 109
Number of post on hour 3: 54
Number of post on hour 5: 46
Number of post on hour 19: 110
Number of post on hour 1: 60
Number of post on hour 22: 71
Number of post on hour 8: 48
Number of post on hour 4: 47
Number of post on hour 0: 55
Number of post on hour 6: 44
Number of post on hour 7: 34
Number of post on hour 11: 58


Number of comments on hour 9: 251
Number of comments on hour 13: 1253
Number of comments on hour 10: 793
Number of comments on hour 14: 1416
Number of comments on hour 16: 1814
Number of comments on hour 23: 543
Number of comments on hour 12: 687
Number of comments on hour 17: 114
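
The two tallies can also be built with `collections.Counter`, which treats missing keys as zero and so removes the need for the `if hour not in ...` branch. A minimal sketch on a few made-up rows in the same `[created_at, num_comments]` shape as `result_list`; the rows and the `sample_*` names are hypothetical and not taken from the dataset.

```python
import datetime as dt
from collections import Counter

# Hypothetical rows in the same shape as result_list.
sample_results = [
    ['8/4/2016 11:52', 5],
    ['1/26/2016 19:30', 10],
    ['6/23/2016 11:05', 3],
]

sample_counts = Counter()     # number of posts per hour
sample_comments = Counter()   # total comments per hour

for created_at, num_comments in sample_results:
    hour = dt.datetime.strptime(created_at, '%m/%d/%Y %H:%M').hour
    sample_counts[hour] += 1
    sample_comments[hour] += num_comments

# Print the tallies in hour order so the output is easier to scan.
for hour in sorted(sample_counts):
    print('Hour {}: {} posts, {} comments'.format(
        hour, sample_counts[hour], sample_comments[hour]))
```

Iterating over `sorted(sample_counts)` also lists the hours in ascending order, which is easier to read than dictionary insertion order.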

### Calculating the Average Number of Comments for Ask HN Posts By Hour

In [36]:
avg_by_hour = []

for hour, count in counts_by_hour.items():
    avg_num_comments = comments_by_hour[hour] / count
    avg_by_hour.append([hour, avg_num_comments])
    
for hour in avg_by_hour:
    print(hour)

[9, 5.5777777777777775]
[13, 14.741176470588234]
[10, 13.440677966101696]
[14, 13.233644859813085]
[16, 16.796296296296298]
[23, 7.985294117647059]
[12, 9.41095890410959]
[17, 11.46]
[15, 38.5948275862069]
[21, 16.009174311926607]
[20, 21.525]
[2, 23.810344827586206]
[18, 13.20183486238532]
[3, 7.796296296296297]
[5, 10.08695652173913]
[19, 10.8]
[1, 11.383333333333333]
[22, 6.746478873239437]
[8, 10.25]
[4, 7.170212765957447]
[0, 8.127272727272727]
[6, 9.022727272727273]
[7, 7.852941176470588]
[11, 11.051724137931034]


### Sorting and Printing Values from a List of Lists

In [50]:
swap_avg_by_hour = []

for row in avg_by_hour:
    swap_avg_by_hour.append([row[1], row[0]])
    
sorted_swap = sorted(swap_avg_by_hour, reverse=True)

for row in sorted_swap[0:6]:
    hour_obj = dt.datetime.strptime(str(row[1]), '%H')
    hour_str = hour_obj.strftime('%H:%M')
    print('{}: {:.2f} average comments per post'.format(hour_str, row[0]))
    

    


15:00: 38.59 average comments per post
02:00: 23.81 average comments per post
20:00: 21.52 average comments per post
16:00: 16.80 average comments per post
21:00: 16.01 average comments per post
13:00: 14.74 average comments per post
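
Swapping the columns is one way to sort by the average. Another option, sketched here, is to pass a `key` function to `sorted` so the `[hour, average]` pairs keep their original layout. The three pairs are copied from the averages printed earlier; `sample_avg_by_hour` and `top_hours` are illustrative names.

```python
# A few [hour, average] pairs copied from the notebook output.
sample_avg_by_hour = [
    [13, 14.741176470588234],
    [15, 38.5948275862069],
    [2, 23.810344827586206],
]

# Sort by the second element (the average), highest first.
top_hours = sorted(sample_avg_by_hour, key=lambda pair: pair[1], reverse=True)

for hour, avg in top_hours:
    print('{:02d}:00: {:.2f} average comments per post'.format(hour, avg))
```

Because the hour is already an integer, the `{:02d}` format code produces the zero-padded label without a round trip through `strptime` and `strftime`.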


Ask HN posts receive the most comments on average when they are created:
- in the afternoon between 15:00 and 16:00, with the 15:00 hour leading by a wide margin (38.59 comments per post)
- in the evening between 20:00 and 21:00
- late at night, around 2:00 in the morning