## Introduction
<p><img src="https://assets.datacamp.com/production/project_1174/img/trendlines.jpg" alt="Image of two trendlines over time."></p>
<p>It’s important to stay informed about trends in programming languages and technologies. Knowing which languages are growing or shrinking can help you decide where to invest your time. </p>
<p>An excellent source for understanding which technologies are popular is <a href="https://stackoverflow.com/">Stack Overflow</a>. Stack Overflow is an online question-and-answer site for coding topics. By looking at the number of questions about each technology, you can get a rough idea of how many people are using it.</p>
<p>You'll be working with a dataset with one observation for each tag in each year. The dataset was downloaded from the <a href="https://data.stackexchange.com/">Stack Exchange Data Explorer</a>. Below you can find an overview of the data that is available to you:<br><br></p>
<div style="background-color: #efebe4; color: #05192d; text-align:left; vertical-align: middle; padding: 15px 25px 15px 25px; line-height: 1.6;">
    <div style="font-size:20px"><b>datasets/stack_overflow_data.csv</b></div>
<ul>
    <li><b>year:</b> The year the question was asked.</li>
    <li><b>tag:</b> A word or phrase that describes the topic of the question.</li>
    <li><b>number:</b> The number of questions with a certain tag in that year.</li>
    <li><b>year_total:</b> The total number of questions asked in that year.</li>
</ul>
    </div>
<p>From here on, your task is to explore and manipulate the data until you can answer the questions described in the instructions panel. Feel free to add as many cells as necessary. Finally, remember that you are tested only on your answers, not on the methods you use to arrive at them!</p>
<p><em><strong>Note:</strong> If you haven't completed a DataCamp project before, you should check out the <a href="https://projects.datacamp.com/projects/41">Intro to Projects</a> first to learn about the interface. This project also assumes you know your way around data manipulation and visualization in the Tidyverse, so it's recommended that you take a look at the course <a href="https://www.datacamp.com/courses/introduction-to-the-tidyverse">Introduction to the Tidyverse</a>.</em></p>
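
<p>As a minimal sketch of how the columns described above relate to each other, the snippet below computes each tag's share of a year's questions as <code>number / year_total * 100</code>. It does not read the CSV; instead it retypes the six rows that <code>head(stack_overflow)</code> prints in this notebook. The names <code>sample_rows</code>, <code>with_share</code>, and <code>percent</code> are assumptions made for illustration and are not part of the project.</p>

```r
library(dplyr)

# The six rows printed by head(stack_overflow) in this notebook,
# retyped by hand so the sketch runs without the CSV file.
sample_rows <- data.frame(
  year = rep(2008, 6),
  tag = c("treeview", "scheduled-tasks", "specifications",
          "rendering", "http-post", "static-assert"),
  number = c(69, 30, 21, 35, 6, 1),
  year_total = rep(168541, 6),
  stringsAsFactors = FALSE
)

# percent is a hypothetical column name: a tag's share of all
# questions asked in that year, scaled to the 0-100 range.
with_share <- sample_rows %>%
  mutate(percent = number / year_total * 100)

print(with_share)
```

<p>Because <code>year_total</code> repeats on every row of a given year, the share can be computed row by row without a separate join or grouping step.</p>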

In [44]:
# Use this cell to begin your analysis, and add as many as you would like!
library(readr)
library(dplyr)
library(ggplot2)

stack_overflow <- read_csv("datasets/stack_overflow_data.csv")
head(stack_overflow)

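# What fraction of the total number of questions asked in 2019 had the R tag?
# Save your answer as a variable r_percentage in percentage format (e.g. 0.5 becomes 50).

# year is parsed as a double (see the column specification in the output),
# so compare it against the number 2019 rather than the string "2019".
r_percentage <- stack_overflow %>%
    filter(year == 2019, tag == "r") %>%
    summarize(percent_r = number / year_total * 100)

# Display the result (a one-row tibble with the column percent_r)
r_percentage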


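# What were the five most asked-about tags in the last 5 years (2015-2020)?
# Save your answer as a variable highest_tags in the form of character vector.

# Sum the question counts per tag over 2015-2020, largest total first.
# year is numeric, so the range needs no conversion to character.
pop_tags <- stack_overflow %>%
    filter(year %in% 2015:2020) %>%
    group_by(tag) %>%
    summarize(top_tags = sum(number)) %>%
    arrange(desc(top_tags))

# Keep the names of the five tags with the highest totals
highest_tags <- pop_tags$tag[1:5]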


Parsed with column specification:
cols(
  year = col_double(),
  tag = col_character(),
  number = col_double(),
  year_total = col_double()
)


year,tag,number,year_total
<dbl>,<chr>,<dbl>,<dbl>
2008,treeview,69,168541
2008,scheduled-tasks,30,168541
2008,specifications,21,168541
2008,rendering,35,168541
2008,http-post,6,168541
2008,static-assert,1,168541


percent_r
<dbl>
0.9656728
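
The output above shows that `r_percentage` is stored as a one-row table with a `percent_r` column, not as a bare number. If a plain numeric value or character vector is wanted instead, one possible approach is `pull()`, sketched below on the six rows shown by `head(stack_overflow)` (retyped by hand, so the values are for the 2008 sample rows only, not the 2019 or 2015-2020 answers). The names `sample_rows`, `share_df`, `share_value`, and `top_three` are hypothetical, and `slice_max()` assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# The six rows printed by head(stack_overflow), retyped by hand.
sample_rows <- data.frame(
  year = rep(2008, 6),
  tag = c("treeview", "scheduled-tasks", "specifications",
          "rendering", "http-post", "static-assert"),
  number = c(69, 30, 21, 35, 6, 1),
  year_total = rep(168541, 6),
  stringsAsFactors = FALSE
)

# summarize() returns a one-row data frame, as in the notebook's output.
share_df <- sample_rows %>%
  filter(year == 2008, tag == "treeview") %>%
  summarize(percent = number / year_total * 100)

# pull() extracts the column as a plain numeric vector of length one.
share_value <- sample_rows %>%
  filter(year == 2008, tag == "treeview") %>%
  mutate(percent = number / year_total * 100) %>%
  pull(percent)

# The same idea for a character vector: total the counts per tag,
# keep the three largest, and pull out the tag names.
top_three <- sample_rows %>%
  group_by(tag) %>%
  summarize(total = sum(number)) %>%
  slice_max(total, n = 3, with_ties = FALSE) %>%
  pull(tag)

print(share_value)
print(top_three)
```

Whether the one-row table or the extracted value is the right form depends on how the answer is checked, which this notebook does not show.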
