# Finding the Best Time to Post on Hacker News to Get the Most Comments
# Introduction
Our main task in this project is to analyze Hacker News posts. We want to find out which type of post is likely to get the most comments, and whether the time a post is created affects the number of comments it receives. As an extension, the same analysis can be applied to the points that posts receive.


There are two types of posts on Hacker News that we care about:
- Ask HN, where users ask the Hacker News community a specific question.
- Show HN, where users show the Hacker News community a project, product, or just something interesting.

We are going to focus on these two types of posts in our research.

In our research, we learned that Ask HN posts get more comments on average than Show HN posts. The creation time of a post also affects the number of comments it receives. The following sections show the steps we used to reach these conclusions.

In [13]:
from csv import reader
import datetime as dt

In [14]:
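# Read the whole CSV into a list of rows; the with-block closes the file for us
with open('hacker_news.csv', encoding='utf8') as opened_file:
    hn = list(reader(opened_file))

# Separate the header row from the data rows
header = hn[0]
hn = hn[1:]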

In [15]:
print(header)

['id', 'title', 'url', 'num_points', 'num_comments', 'author', 'created_at']


- `id`: the unique identifier from Hacker News for the post
- `title`: the title of the post
- `url`: the URL that the post links to, if the post has a URL
- `num_points`: the number of points the post acquired, calculated as the total number of upvotes minus the total number of downvotes
- `num_comments`: the number of comments on the post
- `author`: the username of the person who submitted the post
- `created_at`: the date and time of the post's submission
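
Every value read from the CSV is a string, so the numeric and date columns have to be converted before they can be used in calculations. The following is a minimal sketch of that conversion, not part of the original notebook: `parse_row`, `COLUMNS`, and `DATE_FORMAT` are hypothetical names introduced for illustration, and the sample row is the first data row of the dataset.

```python
import datetime as dt

# Column order taken from the dataset header.
COLUMNS = ['id', 'title', 'url', 'num_points', 'num_comments', 'author', 'created_at']
# Assumed date layout, e.g. '8/4/2016 11:52' (month/day/year hour:minute).
DATE_FORMAT = "%m/%d/%Y %H:%M"

def parse_row(row):
    """Convert one raw CSV row (a list of strings) into a dict with typed values."""
    post = dict(zip(COLUMNS, row))
    post['num_points'] = int(post['num_points'])
    post['num_comments'] = int(post['num_comments'])
    post['created_at'] = dt.datetime.strptime(post['created_at'], DATE_FORMAT)
    return post

sample = ['12224879', 'Interactive Dynamic Video',
          'http://www.interactivedynamicvideo.com/',
          '386', '52', 'ne0phyte', '8/4/2016 11:52']
post = parse_row(sample)
print(post['num_comments'], post['created_at'].hour)
```

The rest of this notebook keeps working with plain lists and column indexes (`post[4]` for comments, `post[6]` for the creation time); the dict form is only an alternative that makes the column meanings explicit.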

In [17]:

for row in hn[:5]:
    print(row)

['12224879', 'Interactive Dynamic Video', 'http://www.interactivedynamicvideo.com/', '386', '52', 'ne0phyte', '8/4/2016 11:52']
['10975351', 'How to Use Open Source and Shut the Fuck Up at the Same Time', 'http://hueniverse.com/2016/01/26/how-to-use-open-source-and-shut-the-fuck-up-at-the-same-time/', '39', '10', 'josep2', '1/26/2016 19:30']
['11964716', "Florida DJs May Face Felony for April Fools' Water Joke", 'http://www.thewire.com/entertainment/2013/04/florida-djs-april-fools-water-joke/63798/', '2', '1', 'vezycash', '6/23/2016 22:20']
['11919867', 'Technology ventures: From Idea to Enterprise', 'https://www.amazon.com/Technology-Ventures-Enterprise-Thomas-Byers/dp/0073523429', '3', '1', 'hswarna', '6/17/2016 0:01']
['10301696', 'Note by Note: The Making of Steinway L1037 (2007)', 'http://www.nytimes.com/2007/11/07/movies/07stein.html?_r=0', '8', '2', 'walterbell', '9/30/2015 4:12']


# Step 1. Extracting Ask HN and Show HN Posts
First, we will separate the Ask HN and Show HN posts from the rest, so that we can drill down into each group of the dataset.


In [18]:

ask_posts = []
show_posts = []
other_posts = []

for post in hn:
    title = post[1]
    if title.lower().startswith("ask hn"):
        ask_posts.append(post)
    elif title.lower().startswith("show hn"):
        show_posts.append(post)
    else:
        other_posts.append(post)
        
print("Number of Ask HN posts:", len(ask_posts))
print("Number of Show HN posts:", len(show_posts))
print("Number of other posts:", len(other_posts))


Number of Ask HN posts: 1744
Number of Show HN posts: 1162
Number of other posts: 17194


In [19]:
total_ask_comment = 0

for post in ask_posts:
    total_ask_comment += int(post[4])

avg_ask_comments = total_ask_comment / len(ask_posts)
print(avg_ask_comments)

14.038417431192661


# Finding the Average Comments per Post from *total_ask_comment* and *total_show_comment*

In [36]:
total_show_comment = 0

# Sum the comments over the Show HN posts (not the Ask HN posts)
for post in show_posts:
    total_show_comment += int(post[4])

avg_show_comments = total_show_comment / len(show_posts)
print(avg_show_comments)


On average, Ask HN posts in our sample receive approximately 14 comments, whereas Show HN posts receive approximately 10. Since Ask HN posts are more likely to receive comments, the rest of the analysis focuses on them.
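
The two averages are computed with the same loop, which makes it easy to mix up variable names between the copies. One way to avoid that is a small helper, sketched here under the assumption that each post is a list with the comment count at index 4. `average_comments` is a hypothetical name, and the demo rows are made up for illustration; they are not taken from the dataset.

```python
def average_comments(posts, comments_index=4):
    """Return the mean number of comments per post, or 0.0 for an empty list."""
    if not posts:
        return 0.0
    total = sum(int(post[comments_index]) for post in posts)
    return total / len(posts)

# Made-up rows in the dataset's column order, for illustration only.
demo_posts = [
    ['1', 'Ask HN: example A', '', '5', '10', 'user_a', '1/1/2016 10:00'],
    ['2', 'Ask HN: example B', '', '3', '20', 'user_b', '1/1/2016 11:00'],
]
print(average_comments(demo_posts))  # 15.0
```

With the real data, the same helper could be called as `average_comments(ask_posts)` and `average_comments(show_posts)`.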


# Finding the Number of Ask HN Posts and Comments by Hour Created


In [29]:
result_list = []

for post in ask_posts :
    result_list.append(
        [post[6],int(post[4])])

counts_by_hour = {}
comments_by_hour = {}
date_format = "%m/%d/%Y %H:%M"

for each_row in result_list:
    date = each_row[0]
    comment = each_row[1]
    time = dt.datetime.strptime(date, date_format).strftime("%H")
    if time in counts_by_hour:
        comments_by_hour[time] += comment
        counts_by_hour[time] += 1
    else:
        comments_by_hour[time] = comment
        counts_by_hour[time] = 1
comments_by_hour


{'09': 251,
 '13': 1253,
 '10': 793,
 '14': 1416,
 '16': 1814,
 '23': 543,
 '12': 687,
 '17': 1146,
 '15': 4477,
 '21': 1745,
 '20': 1722,
 '02': 1381,
 '18': 1439,
 '03': 421,
 '05': 464,
 '19': 1188,
 '01': 683,
 '22': 479,
 '08': 492,
 '04': 337,
 '00': 447,
 '06': 397,
 '07': 267,
 '11': 641}

In [55]:
# Compute the average number of comments per Ask HN post for each hour
avg_by_hour = []

for avg_hour in comments_by_hour:
    avg_by_hour.append([avg_hour , comments_by_hour[avg_hour]/counts_by_hour[avg_hour]])

avg_by_hour




[['09', 5.5777777777777775],
 ['13', 14.741176470588234],
 ['10', 13.440677966101696],
 ['14', 13.233644859813085],
 ['16', 16.796296296296298],
 ['23', 7.985294117647059],
 ['12', 9.41095890410959],
 ['17', 11.46],
 ['15', 38.5948275862069],
 ['21', 16.009174311926607],
 ['20', 21.525],
 ['02', 23.810344827586206],
 ['18', 13.20183486238532],
 ['03', 7.796296296296297],
 ['05', 10.08695652173913],
 ['19', 10.8],
 ['01', 11.383333333333333],
 ['22', 6.746478873239437],
 ['08', 10.25],
 ['04', 7.170212765957447],
 ['00', 8.127272727272727],
 ['06', 9.022727272727273],
 ['07', 7.852941176470588],
 ['11', 11.051724137931034]]

# Sorting and Printing Values from a List of Lists

The list of averages above is hard to read in its current order. To make the result easier to interpret, we sort the list by average and print the top 5 hours.


In [56]:
swap_avg_by_hour = []

for row in avg_by_hour:
    swap_avg_by_hour.append([row[1], row[0]])
print(swap_avg_by_hour)

sorter_swap = sorted(swap_avg_by_hour, reverse = True)
sorter_swap


[[5.5777777777777775, '09'], [14.741176470588234, '13'], [13.440677966101696, '10'], [13.233644859813085, '14'], [16.796296296296298, '16'], [7.985294117647059, '23'], [9.41095890410959, '12'], [11.46, '17'], [38.5948275862069, '15'], [16.009174311926607, '21'], [21.525, '20'], [23.810344827586206, '02'], [13.20183486238532, '18'], [7.796296296296297, '03'], [10.08695652173913, '05'], [10.8, '19'], [11.383333333333333, '01'], [6.746478873239437, '22'], [10.25, '08'], [7.170212765957447, '04'], [8.127272727272727, '00'], [9.022727272727273, '06'], [7.852941176470588, '07'], [11.051724137931034, '11']]


[[38.5948275862069, '15'],
 [23.810344827586206, '02'],
 [21.525, '20'],
 [16.796296296296298, '16'],
 [16.009174311926607, '21'],
 [14.741176470588234, '13'],
 [13.440677966101696, '10'],
 [13.233644859813085, '14'],
 [13.20183486238532, '18'],
 [11.46, '17'],
 [11.383333333333333, '01'],
 [11.051724137931034, '11'],
 [10.8, '19'],
 [10.25, '08'],
 [10.08695652173913, '05'],
 [9.41095890410959, '12'],
 [9.022727272727273, '06'],
 [8.127272727272727, '00'],
 [7.985294117647059, '23'],
 [7.852941176470588, '07'],
 [7.796296296296297, '03'],
 [7.170212765957447, '04'],
 [6.746478873239437, '22'],
 [5.5777777777777775, '09']]
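
The swap-then-sort step above works because `sorted` compares lists element by element. An alternative sketch, not the notebook's method, is to leave the `[hour, average]` pairs as they are and pass `sorted` a `key` function. The sample pairs below are copied from the output above, and `avg_by_hour_sample` is a name introduced for illustration.

```python
from operator import itemgetter

# A few [hour, average] pairs copied from the notebook output.
avg_by_hour_sample = [
    ['09', 5.5777777777777775],
    ['15', 38.5948275862069],
    ['02', 23.810344827586206],
    ['20', 21.525],
]

# Sort by the second element (the average), largest first.
top = sorted(avg_by_hour_sample, key=itemgetter(1), reverse=True)
for hour, avg in top[:3]:
    print(f"{hour}:00: {avg:.2f} average comments per post")
```

Sorting with a key avoids building the intermediate swapped list and keeps each pair in its original `[hour, average]` order.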

In [49]:
# Print the five hours with the highest average comments per Ask HN post

# Use `hour` as the loop variable so the avg_by_hour list is not overwritten
for avg, hour in sorter_swap[:5]:
    print(
        "{}: {:.2f} average comments per post".format(
            dt.datetime.strptime(hour, "%H").strftime("%H:%M"), avg
        )
    )

15:00: 38.59 average comments per post
02:00: 23.81 average comments per post
20:00: 21.52 average comments per post
16:00: 16.80 average comments per post
21:00: 16.01 average comments per post


# Conclusion and Considerations

According to the analysis, Ask HN posts created at 15:00, 02:00, and 20:00 have the highest average comments per post, with 38.59, 23.81, and 21.52 respectively. The 15:00 hour stands out, with an average roughly 60% higher than the runner-up. Posting during these hours may increase the chances of receiving comments on a post. The remaining two hours in the top five, 16:00 and 21:00, have lower averages of 16.80 and 16.01 comments per post respectively. However, it is important to note that these hours are expressed in the data set's time zone, so they should be converted to the time zone you live in before being used.
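
The time zone conversion mentioned above can be sketched as follows. This notebook does not state the data set's time zone, so the UTC offsets in the example are hypothetical placeholders, and `convert_hour` is a helper name introduced for illustration. Fixed offsets also ignore daylight saving time, so this is only an approximation.

```python
import datetime as dt

def convert_hour(hour_str, source_utc_offset, target_utc_offset):
    """Shift an 'HH' hour label from one fixed UTC offset to another.

    Offsets are given in hours. Fixed offsets ignore daylight saving time.
    """
    source_tz = dt.timezone(dt.timedelta(hours=source_utc_offset))
    target_tz = dt.timezone(dt.timedelta(hours=target_utc_offset))
    # The date is arbitrary; only the hour of day matters here.
    moment = dt.datetime(2016, 1, 1, int(hour_str), 0, tzinfo=source_tz)
    return moment.astimezone(target_tz).strftime("%H:%M")

# Hypothetical offsets: data recorded at UTC-5, reader located at UTC+1.
print(convert_hour('15', -5, 1))  # 21:00
```

Replace both offsets with the real ones once the data set's time zone is confirmed.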

Comments are a great way to measure engagement with a post on Hacker News, but they are certainly not the only metric. This project limited its scope to the comments on certain types of posts. Other engagement metrics, such as views and votes, were not considered but may play a significant role in measuring overall engagement.

The next step in determining the best time to submit a post is a similar examination of views and votes. Analyzing all three metrics across time should provide a more sophisticated recommendation on the optimum submission time for maximizing engagement.
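
As a starting point for that follow-up, the hourly averaging used for comments can be generalized to any numeric column. The sketch below is an assumption-laden illustration rather than a finished analysis: `average_by_hour` is a hypothetical helper, the demo rows are made up, and `num_points` (index 3) stands in for votes because the data set has no view counts.

```python
import datetime as dt

DATE_FORMAT = "%m/%d/%Y %H:%M"

def average_by_hour(posts, value_index):
    """Return {hour: mean of column `value_index`} grouped by creation hour."""
    totals, counts = {}, {}
    for post in posts:
        # created_at is at index 6 in the dataset's column order.
        hour = dt.datetime.strptime(post[6], DATE_FORMAT).strftime("%H")
        totals[hour] = totals.get(hour, 0) + int(post[value_index])
        counts[hour] = counts.get(hour, 0) + 1
    return {hour: totals[hour] / counts[hour] for hour in totals}

# Made-up rows in the dataset's column order, for illustration only.
demo_posts = [
    ['1', 'Ask HN: example A', '', '4', '10', 'user_a', '1/1/2016 15:05'],
    ['2', 'Ask HN: example B', '', '8', '20', 'user_b', '1/2/2016 15:40'],
    ['3', 'Ask HN: example C', '', '2', '6', 'user_c', '1/2/2016 2:10'],
]
points_by_hour = average_by_hour(demo_posts, 3)    # num_points
comments_by_hour_demo = average_by_hour(demo_posts, 4)  # num_comments
print(points_by_hour)
print(comments_by_hour_demo)
```

Calling `average_by_hour(ask_posts, 3)` on the real data would give the average points per hour in the same shape as the comment averages computed earlier.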