# Script to populate 50 trending topics in Greater Melbourne

This script first loads <b>TRENDING.csv</b>, which contains the trending keywords/topics for each of the 31 LGAs. Keywords that appear in more than one LGA are grouped together and their counts summed, which gives the trending topics across the whole of Greater Melbourne along with their frequencies. The top 50 are written to a .csv file, which is uploaded to a <a href="https://www.wordclouds.com/">free online word cloud generator</a> to produce a .png image for Community Helper's home page. That image is then uploaded to a <a href="https://onlinepngtools.com/create-transparent-png/">free online image background remover</a> to remove its white background. The resulting transparent .png is edited into the main picture on Community Helper's home page.
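
The grouping step described above can be illustrated on a toy table. This is only a sketch: the `word` and `n` column names follow the ones the script uses, while the `lga` column and all of the values are made up for illustration. It uses base R so that it runs without any extra packages, whereas the script itself uses `dplyr`.

```r
# Toy stand-in for TRENDING.csv: one row per (LGA, keyword) pair.
# The `lga` column and every value here are hypothetical.
toy <- data.frame(
  lga  = c("Melbourne", "Melbourne", "Yarra", "Yarra", "Casey"),
  word = c("park", "tram", "park", "market", "tram"),
  n    = c(5, 3, 2, 4, 6),
  stringsAsFactors = FALSE
)

# Sum each keyword's counts across all LGAs.
totals <- aggregate(n ~ word, data = toy, FUN = sum)
names(totals)[names(totals) == "n"] <- "Frequency"

# Sort from most to least frequent.
totals <- totals[order(-totals$Frequency), ]
rownames(totals) <- NULL
print(totals)
```

Here "tram" (3 + 6) and "park" (5 + 2) each appear in two LGAs, so their counts are combined before ranking, and "tram" ends up first.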


### Output Documents
1. <b>GMtopics.csv</b> - contains the 50 trending keywords/topics in Greater Melbourne along with their frequencies
2. <b>wordcloud.png</b> - image of the word cloud generated from www.wordclouds.com
3. <b>output-onlinepngtools.png</b> - image of the word cloud with its background removed

### Note
- Ensure that `TRENDING.csv` is in the same directory as this script file; otherwise the script cannot read it
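
One way to confirm the note above before running anything else is a small pre-flight check. This is a minimal sketch: `input_available` is a hypothetical helper that is not part of the original script, and its default path simply mirrors the one passed to `read.csv`.

```r
# Hypothetical helper (not part of the original script): reports whether
# the input file can be found relative to the current working directory.
input_available <- function(path = "./TRENDING.csv") {
  found <- file.exists(path)
  if (!found) {
    message("Cannot find ", path, " in ", getwd())
  }
  found
}
```

Calling `input_available()` before the first cell returns `TRUE` when the file is in place, and otherwise prints the directory R is actually looking in, which is usually enough to spot a wrong working directory.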

In [None]:
# read in the TRENDING.csv data
data = read.csv("./TRENDING.csv", header=TRUE)

In [None]:
# import data manipulation library
library("dplyr")

# sum each keyword's counts across all LGAs, sort in descending order of frequency,
# keep the first 50 rows, and put the Frequency column before the word column
output = data %>%
    group_by(word) %>%
    summarise(Frequency = sum(n)) %>%
    arrange(desc(Frequency)) %>%
    head(50) %>%
    select(Frequency, word)

In [None]:
# output dataframe as csv file for further use
write.csv(output, "./GMtopics.csv", row.names = FALSE)

# ------------------------------------- END OF SCRIPT --------------------------------------#