## Exploring Hacker News Posts
In this project, I will work with a data set of submissions to the popular technology site [Hacker News](https://news.ycombinator.com).

I will focus on posts whose titles begin with either `Ask HN` or `Show HN` and analyze them.

### Opening The Data
The data set can be found on Kaggle [here](https://www.kaggle.com/hacker-news/hacker-news-posts).

In [None]:
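from csv import reader

# Read the whole CSV into a list of rows; the with-block closes the file afterwards
with open("HN_posts_year_to_Sep_26_2016.csv", encoding='utf-8') as open_file:
    hn = list(reader(open_file))
headers = hn[0]  # first row holds the column names
hn = hn[1:]      # remaining rows are the posts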

In [None]:
print(headers,'\n')
print(hn[:5])

### Extracting Ask HN and Show HN Posts
To find the posts that begin with either `Ask HN` or `Show HN`, I'll use the string method `startswith`. Because capitalization varies between titles, each title is lowercased before it is checked.

In [None]:
ask_posts = []
show_posts = []
other_posts = []
for row in hn:
    title = row[1]
    title = title.lower()
    if title.startswith('ask hn'):
        ask_posts.append(row)
    elif title.startswith('show hn'):
        show_posts.append(row)
    else:
        other_posts.append(row)
        
print(len(ask_posts))
print(len(show_posts))
print(len(other_posts))
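
As a quick sanity check on the matching logic, the sketch below applies the same lowercase-then-`startswith` test to a few made-up titles. The `classify_title` helper and the sample titles are hypothetical and are not part of the data set.

```python
def classify_title(title):
    """Return 'ask', 'show', or 'other' for a post title (hypothetical helper)."""
    lowered = title.lower()
    if lowered.startswith('ask hn'):
        return 'ask'
    if lowered.startswith('show hn'):
        return 'show'
    return 'other'

# Made-up titles, for illustration only
sample_titles = [
    'Ask HN: How do you learn?',
    'SHOW HN: My side project',
    'Why I ask HN for advice',  # contains "ask HN" but does not start with it
]
labels = [classify_title(t) for t in sample_titles]
print(labels)
```

Only the prefix matters here, so a title that merely mentions `Ask HN` further along falls into the "other" group.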

In [None]:
print(ask_posts[:3])

### Calculating the Average Number of Comments for Ask HN and Show HN Posts

In [None]:
## To find total number of comments in ask posts
total_ask_comments = 0
for row in ask_posts:
    num_ask_comments = int(row[4])
    total_ask_comments += num_ask_comments
avg_ask_comments = total_ask_comments/len(ask_posts)
print('Average Ask HN comments: {:.2f}'.format(avg_ask_comments))

## To find total number of comments in show posts
total_show_comments = 0
for row in show_posts:
    num_comments = int(row[4])
    total_show_comments += num_comments
avg_show_comments = total_show_comments/len(show_posts)
print('Average Show HN comments: {:.2f}'.format(avg_show_comments))

We can see that Ask HN posts receive more comments on average, over twice as many as Show HN posts.

### Finding the Amount of Ask Posts and Comments by Hour Created
I'll determine if ask posts created at a certain time are more likely to attract comments. The following steps will be used to perform this analysis:
1. Calculate the number of ask posts created in each hour of the day, along with the number of comments received
2. Calculate the average number of comments ask posts receive by hour created
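
The two steps above can be sketched on a tiny sample before running them on the full data. The rows below are made up for illustration, and the `sample_*` names are hypothetical so they do not clash with the variables used in the analysis. The date format is assumed to match the `created_at` column (`month/day/year hour:minute`).

```python
import datetime as dt

# Made-up [created_at, num_comments] pairs, for illustration only
sample_rows = [
    ['8/16/2016 9:55', 6],
    ['11/22/2015 13:43', 29],
    ['5/2/2016 9:14', 4],
]

sample_counts = {}    # hour -> number of posts created in that hour
sample_comments = {}  # hour -> total comments on those posts
for created_at, num_comments in sample_rows:
    # Parse the timestamp and keep only the zero-padded hour, e.g. "09"
    hour = dt.datetime.strptime(created_at, "%m/%d/%Y %H:%M").strftime("%H")
    sample_counts[hour] = sample_counts.get(hour, 0) + 1
    sample_comments[hour] = sample_comments.get(hour, 0) + num_comments

# Step 2: average comments per post for each hour
sample_avg_by_hour = {
    hour: sample_comments[hour] / sample_counts[hour] for hour in sample_counts
}
print(sample_avg_by_hour)
```

Using `dict.get` with a default of `0` is one way to avoid the separate `if`/`else` branch for hours seen for the first time.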

In [None]:
import datetime as dt
result_list = []
for row in ask_posts:
    created_at = row[-1]
    num_comments = int(row[4])
    result_list.append([created_at,num_comments])

# print(result_list[:10])

counts_by_hour = {}
comments_by_hour = {}
for row in result_list:
    comment = row[1]  # comment count for this post, not the leftover value from the previous loop
    date_and_time = row[0]
    date_and_time = dt.datetime.strptime(date_and_time,"%m/%d/%Y %H:%M")
    hour = date_and_time.strftime("%H")
    
    if hour in counts_by_hour:
        counts_by_hour[hour] += 1
        comments_by_hour[hour] += comment
    else:
        counts_by_hour[hour] = 1
        comments_by_hour[hour] = comment

print(counts_by_hour,'\n')
print(comments_by_hour)

### Calculating the Average Number of Comments for Ask HN Posts by Hour

In [None]:
avg_by_hour = []
for time in counts_by_hour:
    avg_by_hour.append([time, round((comments_by_hour[time]/
                                       counts_by_hour[time]),2)])
for row in avg_by_hour:   
    print(row[0],":",row[1])

In [None]:
## Sorting and Printing values from List of Lists
swap_avg_by_hour = []
for row in avg_by_hour:
    swap_avg_by_hour.append([row[1],row[0]])
                        
print(swap_avg_by_hour)

In [None]:
### Sorting the average comments

sorted_swap = sorted(swap_avg_by_hour, reverse = True)
# for row in sorted_swap:
#     print(row[1],':',row[0])
print(sorted_swap,'\n')
print("Top 5 Hours for Ask Posts Comments")
print(sorted_swap[:5])

In [None]:
### Showing the average comments per post for the top five hours of the day
for row in sorted_swap[:5]:
    time = row[1]
    comment = row[0]
    time = dt.datetime.strptime(time,"%H").strftime("%I%p")
    print("{0} {1:.2f} average comments per post".format(time,comment))
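
As an alternative to swapping the columns before sorting, `sorted` accepts a `key` function, so the `[hour, average]` pairs can be ordered by their second element directly. This is a minimal sketch on made-up values; the `sample_avg` and `top_hours` names are hypothetical.

```python
import datetime as dt

# Made-up [hour, average comments] pairs, for illustration only
sample_avg = [['09', 5.58], ['15', 38.59], ['02', 23.81]]

# Sort by the average (second element), highest first
top_hours = sorted(sample_avg, key=lambda pair: pair[1], reverse=True)

for hour, avg in top_hours[:2]:
    # Convert the 24-hour string to a 12-hour label with an AM/PM marker
    label = dt.datetime.strptime(hour, "%H").strftime("%I%p")
    print("{} {:.2f} average comments per post".format(label, avg))
```

This keeps each pair in its original `[hour, average]` order, which avoids the intermediate swapped list.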

From my analysis so far, the best time to create an `Ask HN` post appears to be between 4AM and 7AM. This is when posts attract the most comments on average.
