# Data Visualization

## Assignment 2: Visual Encoding Channels

You can't learn technical subjects without hands-on practice. The assignments are an important part of the course. To submit this assignment you will need to make sure that you save your Jupyter notebook. 

Below are the links to 2 videos that explain:

1. [How to save your Jupyter notebook](https://youtu.be/0aoLgBoAUSA), and
2. [How to answer a question in a Jupyter notebook assignment](https://youtu.be/7j0WKhI3W4s).

<div class="alert alert-info" style="color:black">
    
### Assignment Learning Goals:

By the end of the module, students are expected to:

- Choose effective visual channels for information display.
- Visualize frequencies with bar plots.
- Facet plots to explore more variables simultaneously.
- Customize axes labels and scales.


This assignment covers [Module 2](https://viz-learn.mds.ubc.ca/en/module2) of the online course. You should complete this module before attempting this assignment.
 
</div>

Any place you see `...`, you must fill in the function, variable, or data to complete the code. Replace the `None` and the `raise NotImplementedError # No Answer - remove if you provide an answer` with your completed code and answers, then run the cell.

Note that some of the questions in this assignment will have hidden tests. This means that no feedback will be given as to the correctness of your solution. It will be left up to you to decide if your answer is sufficiently correct. These questions are worth 2 points.

In [None]:
# Import libraries needed for this assignment

from hashlib import sha1
import altair as alt
import pandas as pd
import numpy as np
import test_assignment2 as t
alt.data_transformers.disable_max_rows()


## 0. Gapminder ... REVISITED!

Remember the [Gapminder](https://www.gapminder.org) data that we explored in assignment 1? Well, it's making a comeback! 

Let's continue our exploration of this data. 

As a reminder, the data that we have provided to you contains values up until 2018 for most of the features.

Here are the descriptions and column names just to remind yourself of the data that you are using. 

| Column                | Description                                                                                  |
|-----------------------|--------------------------------------------------------------------------------------------|
| country               | Country name                                                                                 |
| year                  | Year of observation                                                                          |
| population            | Population in the country at each year                                                       |
| region                | Continent the country belongs to                                                             |
| sub_region            | Sub-region the country belongs to                                                            |
| income_group          | Income group [as specified by the World Bank in 2018]                                        |
| life_expectancy       | The mean number of years a newborn would <br>live if mortality patterns remained constant    |
| income                | GDP per capita (in USD) <em>adjusted <br>for differences in purchasing power</em>            |
| children_per_woman    | Average number of children born per woman                                                    |
| child_mortality       | Deaths of children under 5 years <br>of age per 1000 live births                             |
| pop_density           | Average number of people per km<sup>2</sup>                                                  |
| co2_per_capita        | CO2 emissions from fossil fuels (tonnes per capita)                                          |
| years_in_school_men   | Mean number of years in primary, secondary,<br>and tertiary school for men aged 25-36        |
| years_in_school_women | Mean number of years in primary, secondary,<br>and tertiary school for women aged 25-36      |

[as specified by the world bank in 2018]: https://datahelpdesk.worldbank.org/knowledgebase/articles/378833-how-are-the-income-group-thresholds-determined
    


**Question 0.1** 
    <br> {points: 2}

Before you do anything, you must read in the data just like we did in assignment 1. The preprocessed version of the 2018 Gapminder data is stored in a CSV file named `world-data-gapminder.csv`.

Use `read_csv` from `pandas` to load the data from the `data` folder and assign it to a variable named `gapminder_df`.

Make sure to parse any time columns using the `parse_dates` argument.
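
As a rough illustration of what `parse_dates` does, here is a minimal sketch on toy data. The CSV text, the column names, and the variable `toy_df` are made up for this example and are not part of the assignment data.

```python
import io

import pandas as pd

# Toy CSV text standing in for a file on disk (hypothetical data).
csv_text = "city,date,temp\nOslo,2020-01-01,-3.5\nLima,2021-01-01,24.0\n"

# parse_dates takes a list of the columns that should be read as datetimes.
toy_df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# The "date" column now has a datetime dtype instead of plain strings.
print(toy_df.dtypes)
```

Without `parse_dates`, the `date` column in this toy example would be read as plain strings.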


In [None]:
gapminder_df = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
gapminder_df.head()

In [None]:
t.test_0_1_1(gapminder_df)

In [None]:
t.test_0_1_2(gapminder_df)

## 1. Quick Questions

This next section is a set of "Quick Questions" that check your knowledge and whether you've been keeping up with your readings.

**Question 1.1** <br> {points: 1}  

With respect to static (non-interactive) 3D plots, which of the following statements are true?

Select all that apply:

A) 3D plots should be avoided at all costs.

B) 3D plots can sometimes be misleading in the information that they convey.

C) Many 3D plots can be more effectively represented with 2 dimensions.

D) A 3D plot is usually the only possible way to present 3-dimensional data.

*To answer the question, select all that apply and add the letter(s) associated with the correct answer(s) to a list and assign it to a variable named `answer1_1`. For example, if you believe that A and B are True, then your answer would look like this:*

`answer1_1 = ["A", "B"]`

In [None]:
answer1_1 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
answer1_1

In [None]:
t.test_1_1(answer1_1)

**Question 1.2** <br> {points: 1}  

Generally speaking, which of the following is the easiest to compare when using plots?

A) Angles

B) Area

C) Length/position

D) Volume

To answer the question, assign the letter associated with the correct answer to a variable in the code cell below.

*Answer in the cell below using the uppercase letter associated with your answer. Place your answer between `""` and assign it to an object called `answer1_2`.*


In [None]:
answer1_2 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
answer1_2

In [None]:
t.test_1_2(answer1_2)

Use the plots below to answer the following 2 questions (ignore the code). 

In [None]:
plot_a = alt.Chart(gapminder_df).mark_line().encode(
    x='year', 
    y='population', 
    color='region').properties(
    title='Plot A',width=300, height=200)

plot_b = alt.Chart(gapminder_df).mark_line().encode(
    x='year', 
    y='sum(population)', 
    color='region').properties(
    title='Plot B',width=300, height=200)

plot_c = alt.Chart(gapminder_df).mark_area().encode(
    x='year', 
    y='sum(population)', 
    color='region').properties(
    title='Plot C',width=300, height=200)

plot_d = alt.Chart(gapminder_df).mark_circle().encode(
    alt.X('year', scale=alt.Scale(zero=False)),
    alt.Y('population'),
    alt.Color('region')).properties(
    title='Plot D',width=300, height=200)

alt.vconcat(alt.hconcat(
    plot_a, plot_b
).resolve_scale(
    color='independent'),  
alt.hconcat(
    plot_c, plot_d
).resolve_scale(
    color='independent'))

**Question 1.3** <br> {points: 1}  

Which plot is most effective in answering the question "Between Africa and the Americas, which had the lower population in 1960?"

A) `plot_a`

B) `plot_b`

C) `plot_c`

D) `plot_d`

To answer the question, assign the letter associated with the correct answer to a variable in the code cell below.

*Answer in the cell below using the uppercase letter associated with your answer. Place your answer between `""` and assign it to an object called `answer1_3`.*

In [None]:
answer1_3 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
answer1_3

In [None]:
t.test_1_3(answer1_3)

**Question 1.4** <br> {points: 2}  

Which plot is most effective in answering the question "Between the years 1960 and 2000, how much growth did the global population sustain?"

A) `plot_a`

B) `plot_b`

C) `plot_c`

D) `plot_d`

To answer the question, assign the letter associated with the correct answer to a variable in the code cell below.

*Answer in the cell below using the uppercase letter associated with your answer. Place your answer between `""` and assign it to an object called `answer1_4`.*

In [None]:
answer1_4 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
answer1_4

In [None]:
# check that the variable exists
assert 'answer1_4' in globals(
), "Please make sure that your solution is named 'answer1_4'"

# This test has been intentionally hidden. It will be up to you to decide if your solution
# is sufficiently good.

**Question 1.5** <br> {points: 1}  

In a histogram plot, what data types are displayed on each of the axes?


A) Categorical on both the x and y axes.

B) Numeric on one of the axes (either the x-axis or the y-axis) and categorical on the other.

C) Numeric on both the x and y axes.

D) Categorical on the x-axis and numeric on the y-axis.

*Answer in the cell below using the uppercase letter associated with your answer. Place your answer between `""` and assign it to an object called `answer1_5`.*


In [None]:
answer1_5 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
answer1_5

In [None]:
t.test_1_5(answer1_5)

**Question 1.6** <br> {points: 1}  

When making bar charts or histograms and using the color channel, what kind of plot does Altair make by default? 

A) Stacked bar chart or histogram.

B) Faceted bar chart or histogram.

C) A bar chart or histogram that displays the mean of all categories in the color channel. 

D) A plot with a single bar displaying the total number of examples in the data.

*Answer in the cell below using the uppercase letter associated with your answer. Place your answer between `""` and assign it to an object called `answer1_6`.*


In [None]:
answer1_6 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
answer1_6

In [None]:
t.test_1_6(answer1_6)

**Question 1.7** <br> {points: 1}  

To make a histogram what mark is needed? 

A) `.mark_hist()`

B) `.mark_histogram()`

C) `.mark_bars()`

D) `.mark_bar()`

*Answer in the cell below using the uppercase letter associated with your answer. Place your answer between `""` and assign it to an object called `answer1_7`.*

In [None]:
answer1_7 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
answer1_7

In [None]:
t.test_1_7(answer1_7)

**Question 1.8** <br> {points: 1}  

Let's now combine everything we've learned and see if we can identify which of the following statements are true.

Select all that apply:

A) Curvature is a type of channel used in visualizations.

B) To facet our plots into a 2 * 3 grid, we would use `.facet(column=3, row=3)` 

C) Histograms help show how the data is distributed.

D) When plotting a single trend over time, it's better to use a line plot over an area plot.  

*To answer the question, select all that apply and add the letter(s) associated with the correct answer(s) to a list and assign it to a variable named `answer1_8`. For example, if you believe that A and B are True, then your answer would look like this:*

`answer1_8 = ["A", "B"]`


In [None]:
answer1_8 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
answer1_8

In [None]:
t.test_1_8(answer1_8)

## 2. Family planning

We learned in the [video from the previous assignment](https://www.youtube.com/watch?v=usdJgEwMinM) that people hold many perceptions about the statistics of countries around the world. Let's take a look and see if child mortality is correlated with family size.

***How has child mortality changed over time for different income levels? Does a country's income level have any relationship to the average family size?***


Let's have a look at the data to see how this relationship has evolved over time.

In the plots we are going to make,
it is important to note that it is not possible to tell causation,
just correlation.
However,
in the [Gapminder](https://www.gapminder.org/videos/) video library, there are a few videos on this topic
(including [this](https://www.gapminder.org/answers/will-saving-poor-children-lead-to-overpopulation/)
and [this](https://www.gapminder.org/videos/population-growth-explained-with-ikea-boxes/) one),
discussing how reducing poverty can help slow down population growth
through decreased family sizes.
Current estimates suggest that the world population
will stabilize around 11 billion people
and the average number of children per woman
will be close to two worldwide in the year 2100.

**Question 2.1**
<br> {points: 1}

Filter the data to include only the years 1918, 1938, 1958, 1978, 1998, and 2018.

*Hint: You can do this elegantly by filtering the data by year using the function [`.isin()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.isin.html). The first answer of this [Stack Overflow](https://stackoverflow.com/questions/12096252/use-a-list-of-values-to-select-rows-from-a-pandas-dataframe) page may be helpful.* 

*Save this new dataframe in an object named `gapminder_every20`.*
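
The general pattern behind the `.isin()` hint can be sketched on toy data as follows. The dataframe, its columns, and the chosen years are invented for illustration. The sketch assumes the `year` column holds datetimes (as it would after `parse_dates`), which is why it goes through `.dt.year`; with a plain integer column, `.isin()` could be called on the column directly.

```python
import pandas as pd

# Toy data with a datetime "year" column (hypothetical values).
toy_df = pd.DataFrame({
    "year": pd.to_datetime(["1990-01-01", "1995-01-01", "2000-01-01", "2005-01-01"]),
    "value": [1, 2, 3, 4],
})

wanted_years = [1990, 2000]

# .dt.year extracts the integer year; .isin() builds a boolean mask
# that is True for rows whose year appears in the list.
toy_subset = toy_df[toy_df["year"].dt.year.isin(wanted_years)]
print(toy_subset)
```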

In [None]:
gapminder_every20 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
gapminder_every20

In [None]:
t.test_2_1(gapminder_every20)

**Question 2.2** 
<br> {points: 2}

With the filtered data `gapminder_every20` from **Question 2.1**, make a scatter plot using filled circles with children per woman on the x-axis, child mortality on the y-axis, and color the circles by the income group.

Make sure to specify a proper title and appropriate axis labels using the `title` argument in `alt.X()` and `alt.Y()`.

*Save the plot in an object named `family_plot`.*

In [None]:
family_plot = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
family_plot

In [None]:
t.test_2_2_1(family_plot)

In [None]:
t.test_2_2_2(family_plot)

**Question 2.3** 
<br> {points: 1}

Now add on to the plot `family_plot` from **Question 2.2**. 

Facet your data into six subplots, one for each year, laid out in 3 columns and 2 rows.

You shouldn't have to do any copying and pasting for this question. You should be able to take the plot `family_plot` and chain your desired `facet()` to it.

*Save this plot in an object named `family_plot_faceted`.*

In [None]:
family_plot_faceted = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
family_plot_faceted

In [None]:
t.test_2_3(family_plot_faceted)

Take a look at the plot from **Question 2.3** and answer the following questions.

**Question 2.4** 
<br> {points: 1}

Let's return to our original question and answer accordingly: 

> How has child mortality changed over time for different income levels?

Select all that apply: 

A) Child mortality has decreased for high-income countries.

B) Child mortality has decreased for middle-income countries. 

C) Child mortality has decreased for low-income countries.

D) Child mortality has increased for high-income countries.

E) Child mortality has increased for middle-income countries. 

F) Child mortality has increased for low-income countries.

G) Child mortality has remained constant for high-income countries.

H) Child mortality has remained constant for middle-income countries. 

I) Child mortality has remained constant for low-income countries.


*To answer the question, select all that apply and add the letter(s) associated with the correct answer(s) to a list and assign it to a variable named `answer2_4`. For example, if you believe that A and B are True, then your answer would look like this:*

`answer2_4 = ["A", "B"]`



In [None]:
answer2_4 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
answer2_4

In [None]:
t.test_2_4(answer2_4)

**Question 2.5** 
<br> {points: 2}

Let's take a look at the original statements that we wanted to explore: 

> Does a country's income level have any relationship to the average family size?

Look at the visualization that you made above and select from the following:

A) It appears that parents in countries with low child mortality tend to have **fewer** children.

B) It appears that parents in countries with low child mortality tend to have **more** children.

C) There is no apparent relationship between child mortality and the number of children per woman.


Answer in the cell below using the uppercase letter associated with your answer. Place your answer between "" and assign it to an object called `answer2_5`.


In [None]:
answer2_5 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
answer2_5

In [None]:
# check that the variable exists
assert 'answer2_5' in globals(
), "Please make sure that your solution is named 'answer2_5'"

# This test has been intentionally hidden. It will be up to you to decide if your solution
# is sufficiently good.

## 3. Carbon dioxide emissions

CO2 emissions are often talked about in relation to climate change.

Let's explore the year 2014 and answer the following questions: 

- ***Which countries emitted the most CO2 in tonnes and per capita?***

- ***Which region is contributing the most to the total mass of Carbon Dioxide emissions?***

*Note that the unit for `co2_per_capita` is tonnes per person.*

**Question 3.1** 
<br> {points: 1}

Using the dataframe `gapminder_df`, what is the most recent year that `co2_per_capita` was measured?

*Hint: What is the maximum year for which the column `co2_per_capita` has non-null values?*

*Save the year in an object of type `int` named `most_recent_co2`.*
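
One way to find the latest row with a non-missing measurement is sketched below on toy data. The dataframe and the names `reading` and `last_measured` are invented for illustration, and the `year` column is assumed to be plain integers here; a datetime column would need `.dt.year` first.

```python
import numpy as np
import pandas as pd

# Toy data where the last two years have no measurement (hypothetical).
toy_df = pd.DataFrame({
    "year": [2000, 2001, 2002, 2003],
    "reading": [1.2, 3.4, np.nan, np.nan],
})

# Drop rows where "reading" is missing, then take the largest year left.
# int() converts the NumPy integer to a built-in Python int.
last_measured = int(toy_df.dropna(subset=["reading"])["year"].max())
print(last_measured)
```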

In [None]:
most_recent_co2 = None 

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
most_recent_co2

In [None]:
t.test_3_1(most_recent_co2)

**Question 3.2** 
<br> {points: 1}

Filter the data to include only the most recent year when `co2_per_capita` was measured (the answer to **Question 3.1**).


*Save the dataframe in an object named `gm_recent_co2`.*

In [None]:
gm_recent_co2 = None 

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
gm_recent_co2.head()

In [None]:
t.test_3_2(gm_recent_co2)

**Question 3.3** 
<br> {points: 1}

From the dataframe `gm_recent_co2`, use [`nlargest`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.nlargest.html) to select the top 40 countries in CO2 production per capita for that year.

*Save the result as a **dataframe** in an object named `co2_top_40`.*
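
As a quick reminder of how `nlargest` behaves, here is a minimal sketch on an invented dataframe. It takes the number of rows to keep and the column to rank by, and returns a dataframe sorted from largest to smallest.

```python
import pandas as pd

# Toy data (hypothetical names and scores).
toy_df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "score": [3, 9, 1, 7],
})

# Keep the 2 rows with the highest "score", largest first.
top_two = toy_df.nlargest(2, "score")
print(top_two)
```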

In [None]:
co2_top_40 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer

In [None]:
t.test_3_3(co2_top_40)

**Question 3.4** 
<br> {points: 2}

Let's plot the data from the object `co2_top_40`. Since we have only one value per country per year, create a bar chart to visualize it. Map the CO2 per capita to the x-axis, the country to the y-axis, and the region to the color.

[Sort](https://altair-viz.github.io/gallery/bar_chart_sorted.html) your bar chart so that the highest CO2 per capita is the closest to the x-axis (the bottom of the chart). 

Don't forget to give your plot an appropriate title and axis labels.

*Save the plot in an object named `co2_plot`.*


In [None]:
co2_plot = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
co2_plot

In [None]:
t.test_3_4_1(co2_plot)

In [None]:
t.test_3_4_2(co2_plot)

Using the plot above, answer the following questions. 

**Question 3.5** 
<br> {points: 1}

Which country emits the most Carbon Dioxide per capita? 

*Save your answer as a string in an object named `answer3_5`*. 

In [None]:
answer3_5 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer

answer3_5

In [None]:
t.test_3_5(answer3_5)

**Question 3.6** 
<br> {points: 1}

How many of the top 40 CO2-emitting countries (per capita) are from Africa?

*Save your answer in an object of type `int` named `answer3_6`*. 

In [None]:
answer3_6 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
answer3_6

In [None]:
t.test_3_6(answer3_6)

**Question 3.7** 
<br> {points: 2}

Which European country emits the most CO2 per capita?

*Save your answer as a string in an object named `answer3_7`*. 

In [None]:
answer3_7 = None 

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
answer3_7

In [None]:
# check that the variable exists
assert 'answer3_7' in globals(
), "Please make sure that your solution is named 'answer3_7'"

# This test has been intentionally hidden. It will be up to you to decide if your solution
# is sufficiently good.

**Question 3.8** 
<br> {points: 1}

In addition to Carbon Dioxide per capita, the total emission for the entire population also matters for a country’s overall Carbon Dioxide emissions. 

From the dataframe `gapminder_df`, compute a new column called `co2_total` which contains the total CO2 emissions for a country's total population.

*Save this new dataframe in an object named `gm_co2_total`.*
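
Deriving a total from a per-person rate and a head count is a single multiplication. The sketch below shows the pattern on an invented dataframe; the column names are hypothetical. Using `assign` returns a new dataframe and leaves the original untouched.

```python
import pandas as pd

# Toy data: a per-person rate and the number of people (hypothetical).
toy_df = pd.DataFrame({
    "town": ["x", "y"],
    "rate_per_person": [2.0, 0.5],
    "people": [10, 100],
})

# total = rate per person * number of people, computed row by row.
toy_with_total = toy_df.assign(total=toy_df["rate_per_person"] * toy_df["people"])
print(toy_with_total)
```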

In [None]:
gm_co2_total = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
gm_co2_total

In [None]:
t.test_3_8(gm_co2_total)

**Question 3.9** 
<br> {points: 1}

Let's now explore the top 40 countries with the highest total CO2 production in 2014. Do you think they will be the same 40 countries that we saw for CO2 levels per capita?

From the dataframe `gm_co2_total` that we made in **Question 3.8**, filter the data to only include data from the `year` 2014. Then use [`nlargest`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.nlargest.html) like you did in **Question 3.3** to select the top 40 countries in total CO2 production for the year 2014.

*Save the data in an object named `co2_total_top_40`.*

In [None]:
co2_total_top_40 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
co2_total_top_40

In [None]:
t.test_3_9(co2_total_top_40)

**Question 3.10** 
<br> {points: 2}

Let's plot the data from the object `co2_total_top_40` like we did before, but this time we will see in absolute terms which countries produced the most CO2 in the year 2014.

Again, we have only one value per country per year so we can create a bar chart to visualize it. Map the total CO2 production to the x-axis, the country to the y-axis, and the region to the color.

Sort your bar chart so that the highest CO2 producing country is the closest to the x-axis (the bottom of the chart). Don't forget to give your plot an appropriate title and axis labels.

*Save the plot in an object named `co2_total_plot`.*


In [None]:
co2_total_plot = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
co2_total_plot

In [None]:
t.test_3_10_1(co2_total_plot)

In [None]:
t.test_3_10_2(co2_total_plot)

**Question 3.11** 
<br> {points: 1}

Which country emits the most total Carbon Dioxide mass? 

*Save your answer as a string in an object named `answer3_11`*. 

In [None]:
answer3_11 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
answer3_11

In [None]:
t.test_3_11(answer3_11)

**Question 3.12** 
<br> {points: 1}

How many countries appear in both top 40 dataframes (`co2_top_40` from **Question 3.3** and `co2_total_top_40` from **Question 3.9**)? 

*Hint: You can solve this in a few different ways. We used sets and [`intersection`](https://www.w3schools.com/python/ref_set_intersection.asp).*

*Save your answer in an object of type `int` named `answer3_12`*.  
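
The set-based approach mentioned in the hint can be sketched on two invented columns of names. Converting each column to a `set` drops duplicates, and `intersection` keeps only the values present in both.

```python
import pandas as pd

# Two toy columns of labels (hypothetical values).
first = pd.Series(["a", "b", "c"])
second = pd.Series(["b", "c", "d", "e"])

# Values that appear in both columns.
shared = set(first).intersection(set(second))

# The number of shared values is the size of the intersection.
n_shared = len(shared)
print(shared, n_shared)
```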

In [None]:
answer3_12 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
answer3_12

In [None]:
t.test_3_12(answer3_12)

**Question 3.13** 
<br> {points: 2}

Let's now explore the total CO2 emission levels on a continent level. ***What was the total amount of CO2 emission in 2014 and which continent contributed the most to it?***

Using `gm_co2_total` from **Question 3.8**, plot the new column `co2_total` over time in an area chart, but instead of plotting one area for each country, plot one for each region which represents the sum of all countries' CO2 emissions in that region.

Don't forget to give your plot an appropriate title and axis labels.

*Hint: You'll need to use aggregation for your y-axis and `region` should be used for color.*

*Save the plot in an object named `total_co2_plot`.*

In [None]:
total_co2_plot = None 

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
total_co2_plot

In [None]:
t.test_3_13_1(total_co2_plot)

In [None]:
t.test_3_13_2(total_co2_plot)

**Question 3.14** 
<br> {points: 1}

In 1960, what was the approximate total mass of CO2 emissions for all 5 continents? 

A) 5,000,000,000 tonnes

B) 7,000,000,000 tonnes

C) 9,000,000,000 tonnes

D) 10,000,000,000 tonnes

Answer in the cell below using the uppercase letter associated with your answer. Place your answer between "" and assign it to an object called `answer3_14`.


In [None]:
answer3_14 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
answer3_14

In [None]:
t.test_3_14(answer3_14)

**Question 3.15** 
<br> {points: 1}

Which continent emitted the most absolute Carbon Dioxide in 2014? 

*Save your answer as a string in an object named `answer3_15`*. 

In [None]:
answer3_15 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
answer3_15

In [None]:
t.test_3_15(answer3_15)

## 4. Income vs CO2 per capita

After looking at carbon dioxide emissions, it would be interesting to see if there is a relationship between a country's income and its CO2 production per capita.

**Question 4.1** 
<br> {points: 1}

Take `gapminder_df` and filter it to include only the years 1979 and 2014.

*Hint: You can do this in a similar way as you did in Question 2.1.* 

*Save this new dataframe in an object named `gm_selection`.*

In [None]:
gm_selection = None
# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
gm_selection 

In [None]:
t.test_4_1(gm_selection)

**Question 4.2** 
<br> {points: 2}

Let's make a scatter plot.
- Make a scatter plot using `mark_circle`.
- Map the income and CO2 levels per capita to appropriate axes.
- Map the region to the colour channel. 
- Change the plot dimensions to a width of 400 and a height of 250.
- Add informative axis labels and a title.

*Save this plot in an object named `income_plot`.*

In [None]:
income_plot = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
income_plot

In [None]:
t.test_4_2_1(income_plot)

In [None]:
t.test_4_2_2(income_plot)

**Question 4.3** 
<br> {points: 1}

Now facet the `income_plot` by `year`.

*Save it in an object named `income_plot_faceted`.*

In [None]:
income_plot_faceted = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
income_plot_faceted

In [None]:
t.test_4_3(income_plot_faceted)

**Question 4.4** 
<br> {points: 2}

Does there appear to be a positive, negative, or no relationship between the income of a country and its CO2 emissions per capita?

*Answer as either "positive", "negative" or "no relationship" in an object of type string named `answer4_4`.*

In [None]:
answer4_4 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer

answer4_4

In [None]:
# check that the variable exists
assert 'answer4_4' in globals(
), "Please make sure that your solution is named 'answer4_4'"

# This test has been intentionally hidden. It will be up to you to decide if your solution
# is sufficiently good.

**Question 4.5** 
<br> {points: 1}

Looking at the 2 plots from 1979 and 2014, did the relationship between income and CO2 emissions per capita appear to flatten at all? 

*Answer as either "Yes" or "No" in an object named `answer4_5`*.

In [None]:
answer4_5 = None

# your code here
raise NotImplementedError # No Answer - remove if you provide an answer
answer4_5

In [None]:
t.test_4_5(answer4_5)

## Before Submitting 

Before submitting your assignment please do the following:

- Read through your solutions
- **Restart your kernel, clear output and rerun your cells from top to bottom** 
- Make sure that none of your code is broken 
- Verify that the tests from the questions you answered have obtained the output "Success"

This is a simple way to make sure that you are submitting all the variables needed to mark the assignment. This method should help avoid losing marks due to changes in your environment.  

## Attributions
- Gapminder dataset processed and uploaded by Joel Ostblom - [UofTCoders/workshops-dc-py](https://github.com/UofTCoders/workshops-dc-py)

- Original Gapminder data - [The Gapminder Foundation](https://www.gapminder.org/)


- MDS DSCI 531: Data Visualization I - [MDS's GitHub website](https://github.com/UBC-MDS/DSCI_531_viz-1) 
