In [None]:
library(tidyverse)
library(data.table)
library(plotly) # for interactive plotting
library(DT) # for interactive tabulation

In [None]:
options(repr.matrix.max.rows=20, repr.matrix.max.cols=15) # limit printed tables to at most 20 rows and 15 columns

In [None]:
datapath <- "~/data"

# Population shares

Please first import the objects for the WEO dataset: 

In [None]:
# wide data with features in the columns and countries/years in the rows
weo_wide2 <- readRDS(sprintf("%s/rds/01_01_weo_wide2.rds", datapath))

In [None]:
weo_countries <- readRDS(sprintf("%s/rds/01_01_weo_countries.rds", datapath))
weo_subject <- readRDS(sprintf("%s/rds/01_01_weo_subject.rds", datapath))

Recall the interactive widget for browsing and searching tabular data:

In [None]:
weo_subject %>% datatable(
  filter = "top",
  options = list(pageLength = 20)
)

Your task is to:

- Filter the data for year 2019
- Select the country code (ISO) and population (LP) features
- Create a third column called population_share that holds each country's share of the world total population
- You may use data.table's ":=" or dplyr's mutate() function

Note: When you use the sum() function, supply na.rm = TRUE. Otherwise a single NA in LP makes the total, and therefore every share, NA.
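To see why the note about missing values matters, here is a minimal base R sketch. The vector `lp` and its values are made up for illustration and are not taken from the WEO data.

```r
# Toy population vector (in millions) with one missing value; numbers are made up
lp <- c(10, 30, NA, 60)

total_with_na <- sum(lp)                # a single NA makes the whole total NA
total_no_na   <- sum(lp, na.rm = TRUE)  # NAs are dropped before summing

# Shares computed against the NA-safe total; the missing entry itself stays NA
shares <- lp / total_no_na

print(total_with_na)
print(total_no_na)
print(shares)
```

The same logic applies inside mutate() or ":=": only the denominator needs na.rm = TRUE, and countries without a population figure simply keep an NA share.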

The resulting output should look like this:

A data.table: 194 × 3

| ISO | LP | population_share |
|---|---|---|
| `<chr>` | `<dbl>` | `<dbl>` |
| ABW | 0.112 | 1.478473e-05 |
| AFG | 37.209 | 4.911832e-03 |
| AGO | 30.128 | 3.977094e-03 |
| ALB | 2.870 | 3.788588e-04 |
| ARE | 10.749 | 1.418939e-03 |
| ARG | 44.939 | 5.932243e-03 |
| ARM | 2.969 | 3.919275e-04 |
| ATG | 0.097 | 1.280464e-05 |
| AUS | 25.522 | 3.369071e-03 |
| AUT | 8.859 | 1.169446e-03 |
| ⋮ | ⋮ | ⋮ |
| VCT | 0.110 | 1.452072e-05 |
| VEN | 27.817 | 3.672027e-03 |
| VNM | 96.462 | 1.273362e-02 |
| VUT | 0.293 | 3.867792e-05 |
| WBG | 4.977 | 6.569967e-04 |
| WSM | 0.201 | 2.653332e-05 |
| YEM | 31.648 | 4.177744e-03 |
| ZAF | 58.775 | 7.758686e-03 |
| ZMB | 18.321 | 2.418492e-03 |
| ZWE | 14.905 | 1.967558e-03 |

## Answer

tidyverse approach:

In [None]:
weo_wide2 %>%
  filter(year == 2019) %>%
  select(ISO, LP) %>%
  mutate(population_share = LP / sum(LP, na.rm = TRUE))

data.table approach:

In [None]:
# := adds the column by reference and returns the result invisibly; append [] to print the final table
weo_wide2[year == 2019, .(ISO, LP)][, population_share := LP / sum(LP, na.rm = TRUE)][]
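A quick sanity check on either answer is that the non-missing shares add up to 1. The sketch below runs the data.table chain on a small stand-in table; `toy` and its values are hypothetical and only mimic the ISO, year and LP columns of weo_wide2.

```r
library(data.table)

# Hypothetical stand-in for weo_wide2; all numbers are made up
toy <- data.table(
  ISO  = c("AAA", "BBB", "CCC", "AAA", "BBB", "CCC"),
  year = c(2018, 2018, 2018, 2019, 2019, 2019),
  LP   = c(9, 29, 58, 10, 30, NA)
)

# Same chain as in the answer: filter and select first, then add the share column
shares <- toy[year == 2019, .(ISO, LP)][, population_share := LP / sum(LP, na.rm = TRUE)]

# The non-missing shares should add up to 1
total_share <- sum(shares$population_share, na.rm = TRUE)
print(total_share)

# := acted on the filtered copy, so the original table has no new column
print(names(toy))
```

Because the first bracket returns a new table, the by-reference update in the second bracket does not touch the source table, which is why the chain can be rerun safely.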