# Optimising social media post interaction - Hacker News
Hacker News is a user-submission site similar to Reddit, where posts can be voted on and commented on.

|Index|Variable name | Description|
|:---|---|---:|
|`0`|`id` | unique identifier for post|
|`1`|`title`|title of post|
|`2`|`url`|url link for post|
|`3`|`num_points`|number of points acquired by post|
|`4`|`num_comments`| number of comments on the post|
|`5`|`author`|username of person who submitted the post|
|`6`|`created_at`|date and time the post was created (Eastern Standard Time, UTC-5)|

The assignment is particularly interested in posts tagged `Ask HN` and `Show HN`, and in what insight can be drawn from them.

## More about the data:
The data comes from the following link on [Kaggle](https://www.kaggle.com/hacker-news/hacker-news-posts).

The data is obtained using a scraper on Hacker News and has data on posts for the previous year (up until 26th September 2016).

## Goals of the project:
I want to find some insight into what kind of posts get the most interaction from users of the social media site Hacker News.

As a proxy of interaction I will use two different variables: number of comments and number of points.

The main conclusion I reached using this dataset is that, on average, "Ask" posts generate more comments than "Show" posts.

For "Ask" posts, the optimal time to post would be around 3pm as this generates on average the most comments and points.

## Evaluation
I think the analysis would have been cleaner if I had written functions rather than typing the code out in full for each post type.

It would also have been better for memory management, since writing the code out separately led to many redundant variables.

The same code could have been reused for each post type, which would have been less complex than handling each one separately.

I would also have broken the other posts down further, since ask and show posts together make up only around 15% of the dataset.
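The reuse described above could look something like the following minimal sketch. The function `average_by_hour`, its parameters and the two sample rows are hypothetical and not part of the notebook; the sketch only assumes the column order from the table at the top and the `created_at` format seen in the data.

```python
import datetime as dt
from collections import defaultdict


def average_by_hour(posts, value_index, date_index=6):
    """Return {hour label: average of column `value_index`} for `posts`."""
    totals = defaultdict(int)   # sum of the chosen column per hour
    counts = defaultdict(int)   # number of posts per hour
    for row in posts:
        created = dt.datetime.strptime(row[date_index], "%m/%d/%Y %H:%M")
        hour = created.strftime("%H:00")
        totals[hour] += int(row[value_index])
        counts[hour] += 1
    return {hour: totals[hour] / counts[hour] for hour in totals}


# Two made-up rows in the same column order as the dataset (illustration only).
sample_posts = [
    ["1", "Ask HN: example A", "", "10", "4", "user_a", "8/4/2016 15:10"],
    ["2", "Ask HN: example B", "", "20", "6", "user_b", "8/5/2016 15:45"],
]

avg_comments = average_by_hour(sample_posts, value_index=4)  # column 4: num_comments
avg_points = average_by_hour(sample_posts, value_index=3)    # column 3: num_points
print(avg_comments)
print(avg_points)
```

With a helper like this, the same call would serve ask, show and other posts, and both comments and points, by changing only the arguments.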

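In [1]:
from csv import reader

# Read the whole CSV into a list of rows, closing the file afterwards
with open('hacker_news.csv') as open_file:
    hn = list(reader(open_file))

hn_header = hn[0]  # column names
hn = hn[1:]        # data rows only

# Preview the first five rows
for i in hn[0:5]:
    print(i)
    print('')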

['12224879', 'Interactive Dynamic Video', 'http://www.interactivedynamicvideo.com/', '386', '52', 'ne0phyte', '8/4/2016 11:52']

['10975351', 'How to Use Open Source and Shut the Fuck Up at the Same Time', 'http://hueniverse.com/2016/01/26/how-to-use-open-source-and-shut-the-fuck-up-at-the-same-time/', '39', '10', 'josep2', '1/26/2016 19:30']

['11964716', "Florida DJs May Face Felony for April Fools' Water Joke", 'http://www.thewire.com/entertainment/2013/04/florida-djs-april-fools-water-joke/63798/', '2', '1', 'vezycash', '6/23/2016 22:20']

['11919867', 'Technology ventures: From Idea to Enterprise', 'https://www.amazon.com/Technology-Ventures-Enterprise-Thomas-Byers/dp/0073523429', '3', '1', 'hswarna', '6/17/2016 0:01']

['10301696', 'Note by Note: The Making of Steinway L1037 (2007)', 'http://www.nytimes.com/2007/11/07/movies/07stein.html?_r=0', '8', '2', 'walterbell', '9/30/2015 4:12']



### Filtering out the "Ask" and "Show" posts
Now we need to separate the posts tagged `Ask HN` and `Show HN` from the rest.

In [2]:
ask_posts = []
show_posts = []
other_posts = []

for i in hn:
    title = i[1]
    if title.lower().startswith('ask hn'):
        ask_posts.append(i)
    elif title.lower().startswith('show hn'):
        show_posts.append(i)
    else:
        other_posts.append(i)

In [3]:
len_ask = len(ask_posts)
len_show = len(show_posts)
len_other = len(other_posts)

print("There are {:,} \"Ask\" posts".format(len_ask))
print("There are {:,} \"Show\" posts".format(len_show))
print("There are {:,} \"Other\" posts".format(len_other))

There are 1,744 "Ask" posts
There are 1,162 "Show" posts
There are 17,194 "Other" posts
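
As a quick check of how much of the dataset the two tagged categories cover, the share can be worked out from the three counts printed above. This is a small sketch that hard-codes those counts rather than reusing the notebook's lists; `total_posts` and `tagged_share` are names introduced here.

```python
# Counts copied from the output above.
len_ask, len_show, len_other = 1744, 1162, 17194

total_posts = len_ask + len_show + len_other
tagged_share = (len_ask + len_show) / total_posts
print("Ask and Show posts make up {:.1%} of the dataset".format(tagged_share))
```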


### Check for failure in filtration
Here I inspect `other_posts` for any ask or show posts we may have missed because their titles do not follow the Hacker News post guidelines.

In [4]:
for i in other_posts:
    title = i[1].lower()
    if 'show hn' in title:
        print(i)
    if 'ask hn' in title:
        print(i)    

['10758328', "Idea: on the Show page, remove 'Show HN' from the titles", '', '7', '9', 'joelanman', '12/18/2015 13:39']
['11216184', 'Tell HN: Get your app classified before Show HN', '', '12', '1', 'gadders', '3/3/2016 11:26']
['12301483', 'Tell/ask HN: vimtutor teaches vim basics. Is there something similar for Emacs?', '', '8', '2', 'mettamage', '8/16/2016 23:41']
['11274792', 'Our 36 Hours on Show HN', 'https://medium.com/@justinlaing/our-36-hours-on-show-hn-34d47b6b56ee#.k9a8i7pt4', '5', '4', 'justinlaing', '3/12/2016 21:57']


The remaining titles in `other_posts` that contain `Show HN` or `Ask HN` are mostly posts about the two tags themselves rather than ask or show posts, so I leave them in `other_posts`.

# Comments
## Which type of post on average performs better?

In [5]:
total_ask_comments = 0
for i in ask_posts:
    num_comments_ask = int(i[4])
    total_ask_comments += num_comments_ask

avg_ask_comments = total_ask_comments/(len(ask_posts))
print('Average number of ask comments is {:.2f}'.format(avg_ask_comments))

total_show_comments = 0
for i in show_posts:
    num_comments_show = int(i[4])
    total_show_comments += num_comments_show

avg_show_comments = total_show_comments/(len(show_posts))
print('Average number of show comments is {:.2f}'.format(avg_show_comments))

total_other_comments = 0
for i in other_posts:
    num_comments_other = int(i[4])
    total_other_comments += num_comments_other

avg_other_comments = total_other_comments/(len(other_posts))
print('Average number of other comments is {:.2f}'.format(avg_other_comments))

Average number of ask comments is 14.04
Average number of show comments is 10.32
Average number of other comments is 26.87


### Insight
On average, ask posts receive more comments than show posts.

This suggests that users interact (in the form of comments) more often with posts where the original poster is asking for help of some sort.

This might be because show posts may include memes and other short clips that users do not have much to say about, whereas in ask posts the poster is directly asking for other users' input.

Other posts have a much larger average number of comments. This is possibly because the category is a mix of every other type of post not covered in this analysis. There might be a more popular category among the other posts which skews the average upwards.
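
To illustrate how a few very popular posts can pull an average upwards, here is a minimal sketch comparing the mean with the median. The comment counts are made up for illustration and are not taken from the dataset; whether the other posts are actually skewed in this way has not been checked here.

```python
from statistics import mean, median

# Made-up comment counts for illustration only; not taken from the dataset.
toy_comments = [1, 2, 2, 3, 4, 150]

toy_mean = mean(toy_comments)
toy_median = median(toy_comments)
print("mean = {:.2f}, median = {:.2f}".format(toy_mean, toy_median))
```

A single large value moves the mean far above the typical post, while the median stays close to it, so comparing the two on the real data would be one way to test the skew suggestion.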

## Does time of posting affect comments?
Now we want to analyse whether time of post may affect the number of comments a post receives

### Forming frequency tables
### Ask posts
We form two different frequency tables: one for the number of posts at each hour and another for the number of comments at each hour
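
Both tables are keyed by the hour extracted from `created_at`. A minimal sketch of that extraction for a single value, using a timestamp from the first row previewed earlier (`parsed` and `hour_label` are names introduced here):

```python
import datetime as dt

created_at = "8/4/2016 11:52"          # format used in the created_at column
parsed = dt.datetime.strptime(created_at, "%m/%d/%Y %H:%M")
hour_label = parsed.strftime("%H:00")  # keep the hour, drop the minutes
print(hour_label)
```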

In [6]:
import datetime as dt

result_list_ask = []
for i in ask_posts:
    created_at_ask = i[6]
    num_comments_ask = i[4]
    result_list_ask.append([created_at_ask, num_comments_ask])

counts_by_hour_ask = {}
comments_by_hour_ask = {}

for i in result_list_ask:
    created_at_ask_1 = i[0]
    num_comments_ask_1 = int(i[1])
    created_at_ask_1 = dt.datetime.strptime(created_at_ask_1, "%m/%d/%Y %H:%M")
    ask_hour = dt.datetime.strftime(created_at_ask_1, "%H:00")
    if ask_hour in counts_by_hour_ask:
        counts_by_hour_ask[ask_hour] += 1
        comments_by_hour_ask[ask_hour] += num_comments_ask_1
    else:
        counts_by_hour_ask[ask_hour] = 1
        comments_by_hour_ask[ask_hour] = num_comments_ask_1
        
print('Posts by the hour for ask posts')
print(counts_by_hour_ask)
print('')
print('Comments by the hour for ask posts')
print(comments_by_hour_ask)

Posts by the hour for ask posts
{'09:00': 45, '13:00': 85, '10:00': 59, '14:00': 107, '16:00': 108, '23:00': 68, '12:00': 73, '17:00': 100, '15:00': 116, '21:00': 109, '20:00': 80, '02:00': 58, '18:00': 109, '03:00': 54, '05:00': 46, '19:00': 110, '01:00': 60, '22:00': 71, '08:00': 48, '04:00': 47, '00:00': 55, '06:00': 44, '07:00': 34, '11:00': 58}

Comments by the hour for ask posts
{'09:00': 251, '13:00': 1253, '10:00': 793, '14:00': 1416, '16:00': 1814, '23:00': 543, '12:00': 687, '17:00': 1146, '15:00': 4477, '21:00': 1745, '20:00': 1722, '02:00': 1381, '18:00': 1439, '03:00': 421, '05:00': 464, '19:00': 1188, '01:00': 683, '22:00': 479, '08:00': 492, '04:00': 337, '00:00': 447, '06:00': 397, '07:00': 267, '11:00': 641}


### Show posts

In [7]:
result_list_show = []
for i in show_posts:
    created_at_show = i[6]
    num_comments_show = i[4]
    result_list_show.append([created_at_show, num_comments_show])

counts_by_hour_show = {}
comments_by_hour_show = {}

for i in result_list_show:
    created_at_show_1 = i[0]
    num_comments_show_1 = int(i[1])
    created_at_show_1 = dt.datetime.strptime(created_at_show_1, "%m/%d/%Y %H:%M")
    show_hour = dt.datetime.strftime(created_at_show_1, "%H:00")
    if show_hour in counts_by_hour_show:
        counts_by_hour_show[show_hour] += 1
        comments_by_hour_show[show_hour] += num_comments_show_1
    else:
        counts_by_hour_show[show_hour] = 1
        comments_by_hour_show[show_hour] = num_comments_show_1
        
print('Posts by the hour for show posts')
print(counts_by_hour_show)
print('')
print('Comments by the hour for show posts')
print(comments_by_hour_show)

Posts by the hour for show posts
{'14:00': 86, '22:00': 46, '18:00': 61, '07:00': 26, '20:00': 60, '05:00': 19, '16:00': 93, '19:00': 55, '15:00': 78, '03:00': 27, '17:00': 93, '06:00': 16, '02:00': 30, '13:00': 99, '08:00': 34, '21:00': 47, '04:00': 26, '11:00': 44, '12:00': 61, '23:00': 36, '09:00': 30, '01:00': 28, '10:00': 36, '00:00': 31}

Comments by the hour for show posts
{'14:00': 1156, '22:00': 570, '18:00': 962, '07:00': 299, '20:00': 612, '05:00': 58, '16:00': 1084, '19:00': 539, '15:00': 632, '03:00': 287, '17:00': 911, '06:00': 142, '02:00': 127, '13:00': 946, '08:00': 165, '21:00': 272, '04:00': 247, '11:00': 491, '12:00': 720, '23:00': 447, '09:00': 291, '01:00': 246, '10:00': 297, '00:00': 487}


### Other posts

In [8]:
result_list_other = []
for i in other_posts:
    created_at_other = i[6]
    num_comments_other = i[4]
    result_list_other.append([created_at_other, num_comments_other])

counts_by_hour_other = {}
comments_by_hour_other = {}

for i in result_list_other:
    created_at_other_1 = i[0]
    num_comments_other_1 = int(i[1])
    created_at_other_1 = dt.datetime.strptime(created_at_other_1, "%m/%d/%Y %H:%M")
    other_hour = dt.datetime.strftime(created_at_other_1, "%H:00")
    if other_hour in counts_by_hour_other:
        counts_by_hour_other[other_hour] += 1
        comments_by_hour_other[other_hour] += num_comments_other_1
    else:
        counts_by_hour_other[other_hour] = 1
        comments_by_hour_other[other_hour] = num_comments_other_1
        
print('Posts by the hour for other posts')
print(counts_by_hour_other)
print('')
print('Comments by the hour for other posts')
print(comments_by_hour_other)

Posts by the hour for other posts
{'11:00': 660, '19:00': 980, '22:00': 758, '00:00': 611, '04:00': 454, '09:00': 534, '16:00': 1101, '18:00': 1084, '10:00': 591, '12:00': 789, '20:00': 911, '03:00': 407, '17:00': 1169, '14:00': 958, '13:00': 918, '01:00': 500, '23:00': 674, '08:00': 496, '02:00': 441, '21:00': 874, '15:00': 1040, '06:00': 408, '05:00': 388, '07:00': 448}

Comments by the hour for other posts
{'11:00': 19532, '19:00': 26167, '22:00': 17635, '00:00': 16544, '04:00': 10953, '09:00': 14732, '16:00': 27959, '18:00': 29186, '10:00': 15728, '12:00': 23944, '20:00': 21080, '03:00': 10918, '17:00': 32727, '14:00': 30973, '13:00': 28363, '01:00': 11536, '23:00': 16592, '08:00': 13405, '02:00': 12254, '21:00': 20635, '15:00': 30700, '06:00': 8714, '05:00': 9768, '07:00': 12010}


### Average comment per post by the hour
Using the above two frequency tables, I am going to calculate the average comments per post split up by hour posted
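
The division itself can be sketched with a dictionary comprehension. The example below uses two hours copied from the ask-post tables above; `counts_sample`, `comments_sample` and `avg_sample` are names introduced here, not variables from the notebook.

```python
# Two hours copied from the ask-post frequency tables above.
counts_sample = {"15:00": 116, "09:00": 45}
comments_sample = {"15:00": 4477, "09:00": 251}

# Average comments per post = total comments in the hour / posts in the hour.
avg_sample = {
    hour: comments_sample[hour] / counts_sample[hour]
    for hour in counts_sample
}
for hour, avg in avg_sample.items():
    print("{}: {:.2f} comments per post".format(hour, avg))
```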

### Ask posts

In [9]:
avg_com_per_post_ask = []
for i in comments_by_hour_ask:
    num_comments_ask_2 = comments_by_hour_ask[i]
    num_posts_ask = counts_by_hour_ask[i]
    avg_comments_per_post_ask = num_comments_ask_2/num_posts_ask
    avg_com_per_post_ask.append([i, avg_comments_per_post_ask])
print(avg_com_per_post_ask)

[['09:00', 5.5777777777777775], ['13:00', 14.741176470588234], ['10:00', 13.440677966101696], ['14:00', 13.233644859813085], ['16:00', 16.796296296296298], ['23:00', 7.985294117647059], ['12:00', 9.41095890410959], ['17:00', 11.46], ['15:00', 38.5948275862069], ['21:00', 16.009174311926607], ['20:00', 21.525], ['02:00', 23.810344827586206], ['18:00', 13.20183486238532], ['03:00', 7.796296296296297], ['05:00', 10.08695652173913], ['19:00', 10.8], ['01:00', 11.383333333333333], ['22:00', 6.746478873239437], ['08:00', 10.25], ['04:00', 7.170212765957447], ['00:00', 8.127272727272727], ['06:00', 9.022727272727273], ['07:00', 7.852941176470588], ['11:00', 11.051724137931034]]


### Show posts

In [10]:
avg_com_per_post_show = []
for i in comments_by_hour_show:
    num_comments_show_2 = comments_by_hour_show[i]
    num_posts_show = counts_by_hour_show[i]
    avg_comments_per_post_show = num_comments_show_2/num_posts_show
    avg_com_per_post_show.append([i, avg_comments_per_post_show])
print(avg_com_per_post_show)

[['14:00', 13.44186046511628], ['22:00', 12.391304347826088], ['18:00', 15.770491803278688], ['07:00', 11.5], ['20:00', 10.2], ['05:00', 3.0526315789473686], ['16:00', 11.655913978494624], ['19:00', 9.8], ['15:00', 8.102564102564102], ['03:00', 10.62962962962963], ['17:00', 9.795698924731182], ['06:00', 8.875], ['02:00', 4.233333333333333], ['13:00', 9.555555555555555], ['08:00', 4.852941176470588], ['21:00', 5.787234042553192], ['04:00', 9.5], ['11:00', 11.159090909090908], ['12:00', 11.80327868852459], ['23:00', 12.416666666666666], ['09:00', 9.7], ['01:00', 8.785714285714286], ['10:00', 8.25], ['00:00', 15.709677419354838]]


### Other posts

In [11]:
avg_com_per_post_other = []
for i in comments_by_hour_other:
    num_comments_other_2 = comments_by_hour_other[i]
    num_posts_other = counts_by_hour_other[i]
    avg_comments_per_post_other = num_comments_other_2/num_posts_other
    avg_com_per_post_other.append([i, avg_comments_per_post_other])
print(avg_com_per_post_other)

[['11:00', 29.593939393939394], ['19:00', 26.701020408163266], ['22:00', 23.265171503957784], ['00:00', 27.076923076923077], ['04:00', 24.125550660792953], ['09:00', 27.588014981273407], ['16:00', 25.394187102633968], ['18:00', 26.924354243542435], ['10:00', 26.612521150592215], ['12:00', 30.34727503168568], ['20:00', 23.13940724478595], ['03:00', 26.825552825552826], ['17:00', 27.99572284003422], ['14:00', 32.33089770354906], ['13:00', 30.896514161220043], ['01:00', 23.072], ['23:00', 24.617210682492583], ['08:00', 27.026209677419356], ['02:00', 27.786848072562357], ['21:00', 23.60983981693364], ['15:00', 29.51923076923077], ['06:00', 21.357843137254903], ['05:00', 25.175257731958762], ['07:00', 26.808035714285715]]


### Top 5 
Here I will clean up the output shown above and present the top 5 times that on average achieve the most comments

In [12]:
swap_avg_by_hour_ask = []
for i in avg_com_per_post_ask:  
    swap_avg_by_hour_ask.append([i[1],i[0]])
swap_avg_by_hour_ask.sort(reverse = True)
for i in swap_avg_by_hour_ask:
    avg_comments_per_post_ask_1 = i[0]
    avg_comments_per_post_ask = "{:.2f}".format(avg_comments_per_post_ask_1)
    i[0] = avg_comments_per_post_ask
print('')
print('Ask posts - Top 5 times to post are:')
print(swap_avg_by_hour_ask[0:5])

swap_avg_by_hour_show = []
for i in avg_com_per_post_show:  
    swap_avg_by_hour_show.append([i[1],i[0]])
swap_avg_by_hour_show.sort(reverse = True)
for i in swap_avg_by_hour_show:
    avg_comments_per_post_show_1 = i[0]
    avg_comments_per_post_show = "{:.2f}".format(avg_comments_per_post_show_1)
    i[0] = avg_comments_per_post_show
print('')
print('Show posts - Top 5 times to post are:')
print(swap_avg_by_hour_show[0:5])

swap_avg_by_hour_other = []
for i in avg_com_per_post_other:  
    swap_avg_by_hour_other.append([i[1],i[0]])
swap_avg_by_hour_other.sort(reverse = True)
for i in swap_avg_by_hour_other:
    avg_comments_per_post_other_1 = i[0]
    avg_comments_per_post_other = "{:.2f}".format(avg_comments_per_post_other_1)
    i[0] = avg_comments_per_post_other
print('')
print('Other posts - Top 5 times to post are:')
print(swap_avg_by_hour_other[0:5])


Ask posts - Top 5 times to post are:
[['38.59', '15:00'], ['23.81', '02:00'], ['21.52', '20:00'], ['16.80', '16:00'], ['16.01', '21:00']]

Show posts - Top 5 times to post are:
[['15.77', '18:00'], ['15.71', '00:00'], ['13.44', '14:00'], ['12.42', '23:00'], ['12.39', '22:00']]

Other posts - Top 5 times to post are:
[['32.33', '14:00'], ['30.90', '13:00'], ['30.35', '12:00'], ['29.59', '11:00'], ['29.52', '15:00']]
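
The same kind of ranking can also be produced without swapping the columns, by sorting on the second element of each pair. This is an alternative sketch rather than the approach used above; `avg_pairs` holds a few pairs copied from the ask-post output, and `top_three` is a name introduced here.

```python
# A few [hour, average] pairs copied from the ask-post output above.
avg_pairs = [
    ["09:00", 5.5777777777777775],
    ["15:00", 38.5948275862069],
    ["02:00", 23.810344827586206],
    ["20:00", 21.525],
]

# Sort on the average (index 1), largest first, and keep the top three.
top_three = sorted(avg_pairs, key=lambda pair: pair[1], reverse=True)[:3]
for hour, avg in top_three:
    print("{}: {:.2f} average comments per post".format(hour, avg))
```

Sorting on the numeric value and formatting only when printing keeps the averages as floats, so they remain usable for further calculations.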


### Insight
This shows that ask posts created around 3pm receive the most comments on average.

This would be the optimal time to post: in the top 5, both 3pm and 4pm are among the hours with the highest average comments per post, so even if no comment arrives at 3pm, there is a fairly high chance of receiving a reply at 4pm.

The second best option would be to post at 8pm, by the same logic: the subsequent hour (9pm) also has a fairly high average number of comments.

For show posts, the posts that received the most comments on average were posted in the hour after 6pm, closely followed by the hour after 12am. However, there seems to be no clear pattern between the hour of posting and the average number of comments received.

For other posts, the hours from 11am to 3pm are the best times to receive the most comments. The averages for the hours after 11am, 12pm, 1pm and 3pm do not vary by much, and even the hour after 2pm receives on average only about 1.5 to 3 extra comments compared to the rest of the top 5.

So overall, as long as the post is not a show post, a good time to post is around 3pm.

# Points
## Which type of post on average performs better? 
I will be using the same stages as above to analyse another measure of interaction (points)

In [13]:
total_ask_points = 0
for i in ask_posts:
    num_points_ask = int(i[3])
    total_ask_points += num_points_ask
    
avg_total_ask_points = total_ask_points/(len(ask_posts))
print('Average number of points for ask posts is ' + str(avg_total_ask_points))

total_show_points = 0
for i in show_posts:
    num_points_show = int(i[3])
    total_show_points += num_points_show

avg_total_show_points = total_show_points/(len(show_posts))
print('Average number of points for show posts is ' + str(avg_total_show_points))

total_other_points = 0
for i in other_posts:
    num_points_other = int(i[3])
    total_other_points += num_points_other

avg_total_other_points = total_other_points/(len(other_posts))
print('Average number of points for other posts is ' + str(avg_total_other_points))

Average number of points for ask posts is 15.061926605504587
Average number of points for show posts is 27.555077452667813
Average number of points for other posts is 55.4067698034198


### Insight
As previously stated, show posts are more likely to be content such as memes that the user wants to share with other users.

Due to the nature of show posts, content such as memes tends to generate more points than comments. The averages shown above are consistent with this.

Show posts generate more points on average than ask posts (about 27.6 against 15.1).

Other posts vastly outdo the other two types, generating an average of 55 points per post. As suggested before, this might be because a certain popular category is skewing the average.

### Average points per post by the hour
I will skip the intermediate frequency tables and go straight to the averages, as it is essentially the same code as above.

In [14]:
count_by_points_ask = {}
for i in ask_posts:
    created_at_ask_1 = i[6]
    num_points_ask = int(i[3])
    created_at_ask_1 = dt.datetime.strptime(created_at_ask_1, "%m/%d/%Y %H:%M")
    ask_hour_1 = dt.datetime.strftime(created_at_ask_1, "%H:00")
    if ask_hour_1 in count_by_points_ask:
        count_by_points_ask[ask_hour_1] += num_points_ask
    else:
        count_by_points_ask[ask_hour_1] = num_points_ask

count_by_points_show = {}
for i in show_posts:
    created_at_show_1 = i[6]
    num_points_show = int(i[3])
    created_at_show_1 = dt.datetime.strptime(created_at_show_1, "%m/%d/%Y %H:%M")
    show_hour_1 = dt.datetime.strftime(created_at_show_1, "%H:00")
    if show_hour_1 in count_by_points_show:
        count_by_points_show[show_hour_1] += num_points_show
    else:
        count_by_points_show[show_hour_1] = num_points_show
        
count_by_points_other = {}
for i in other_posts:
    created_at_other_1 = i[6]
    num_points_other = int(i[3])
    created_at_other_1 = dt.datetime.strptime(created_at_other_1, "%m/%d/%Y %H:%M")
    other_hour_1 = dt.datetime.strftime(created_at_other_1, "%H:00")
    if other_hour_1 in count_by_points_other:
        count_by_points_other[other_hour_1] += num_points_other
    else:
        count_by_points_other[other_hour_1] = num_points_other

In [15]:
avg_points_by_hour_ask = []
for i in count_by_points_ask:
    num_points_ask_1 = count_by_points_ask[i]
    num_posts_ask_1 = counts_by_hour_ask[i]
    avg_points_ask = num_points_ask_1/num_posts_ask_1
    avg_points_by_hour_ask.append([float("{:.2f}".format(avg_points_ask)), i])

avg_points_by_hour_show = []
for i in count_by_points_show:
    num_points_show_1 = count_by_points_show[i]
    num_posts_show_1 = counts_by_hour_show[i]
    avg_points_show = num_points_show_1/num_posts_show_1
    avg_points_by_hour_show.append([float("{:.2f}".format(avg_points_show)), i])
                                   
avg_points_by_hour_other = []
for i in count_by_points_other:
    num_points_other_1 = count_by_points_other[i]
    num_posts_other_1 = counts_by_hour_other[i]
    avg_points_other = num_points_other_1/num_posts_other_1
    avg_points_by_hour_other.append([float("{:.2f}".format(avg_points_other)), i])

In [16]:
print('Average points by hour for ask posts')
print(avg_points_by_hour_ask)
print('')
print('Average points by hour for show posts')
print(avg_points_by_hour_show)
print('')
print('Average points by hour for other posts')
print(avg_points_by_hour_other)

Average points by hour for ask posts
[[7.31, '09:00'], [24.26, '13:00'], [18.68, '10:00'], [11.98, '14:00'], [23.35, '16:00'], [8.54, '23:00'], [10.71, '12:00'], [19.41, '17:00'], [29.99, '15:00'], [15.79, '21:00'], [14.39, '20:00'], [13.67, '02:00'], [15.97, '18:00'], [6.93, '03:00'], [12.0, '05:00'], [13.75, '19:00'], [11.67, '01:00'], [7.2, '22:00'], [10.73, '08:00'], [8.28, '04:00'], [8.2, '00:00'], [13.43, '06:00'], [10.62, '07:00'], [14.22, '11:00']]

Average points by hour for show posts
[[25.43, '14:00'], [40.35, '22:00'], [36.31, '18:00'], [19.0, '07:00'], [30.32, '20:00'], [5.47, '05:00'], [28.32, '16:00'], [30.95, '19:00'], [28.56, '15:00'], [25.15, '03:00'], [27.11, '17:00'], [23.44, '06:00'], [11.33, '02:00'], [24.63, '13:00'], [15.26, '08:00'], [18.43, '21:00'], [14.85, '04:00'], [33.64, '11:00'], [41.69, '12:00'], [42.39, '23:00'], [18.43, '09:00'], [25.0, '01:00'], [18.92, '10:00'], [37.84, '00:00']]

Average points by hour for other posts
[[57.57, '11:00'], [60.01, '19

In [17]:
avg_points_by_hour_ask.sort(reverse = True)
avg_points_by_hour_show.sort(reverse = True)
avg_points_by_hour_other.sort(reverse = True)

In [18]:
print('Top 5 average points by hour for ask posts')
print(avg_points_by_hour_ask[:5])
print('')
print('Top 5 average points by hour for show posts')
print(avg_points_by_hour_show[:5])
print('')
print('Top 5 average points by hour for other posts')
print(avg_points_by_hour_other[:5])

Top 5 average points by hour for ask posts
[[29.99, '15:00'], [24.26, '13:00'], [23.35, '16:00'], [19.41, '17:00'], [18.68, '10:00']]

Top 5 average points by hour for show posts
[[42.39, '23:00'], [41.69, '12:00'], [40.35, '22:00'], [37.84, '00:00'], [36.31, '18:00']]

Top 5 average points by hour for other posts
[[62.53, '13:00'], [61.79, '14:00'], [60.54, '15:00'], [60.48, '10:00'], [60.01, '19:00']]


### Insight
The hour with the most points per post is again 3pm, with 1pm and 4pm following behind. For ask posts, it seems best to post at 3pm as it achieves, on average, both the most comments and the most points.

For show posts, it seems better to post around 11pm/12am. Previously we showed that the average comments on show posts were very close for 6pm and 12am. For points, posting at 12am nets slightly more than 6pm (37.84 against 36.31), although this difference is negligible. The highest average belongs to 11pm (42.39), so posting late in the evening, around 11pm to 12am, seems optimal for gaining points.

For other posts, the pattern is similar to the top 5 for comments by hour. The differences among the top 5 are very small (a maximum of about 2.5 points). Previously, we showed that it is optimal to post between 11am and 3pm as this obtains the most comments on average. This data shows a narrower window of 1pm-3pm to optimise the number of points.

# Conclusion
To conclude, I will briefly summarise what I think are the optimal times to post each type of post.

For ask posts, the optimal time to post seems to be 3pm, as it obtains the most points and comments on average.

For show posts, the optimal time to post would be around 12am. However, if the poster wants to maximise points alone, they should post around 11pm instead. To maximise comments it is optimal to post at 6pm, even though the difference between 12am and 6pm is negligible.

For other types of posts, it is optimal to post within the time frame 1pm-3pm to reach the largest interaction (points and comments). The differences between the hours are negligible; however, posting at 1pm/2pm is slightly better than 3pm.