# Visualizing Black Literacy After Emancipation

<div>
<img src="https://github.com/HigherEdData/Du-Bois-STEM/blob/main/readings-images/original-plate-47.jpg?raw=true" width="700" />
</div>

### This interactive exercise is inspired by the annual #DuBoisChallenge

The #DuBoisChallenge is a call to scientists, students, and community members to recreate, adapt, and share on social media the data visualizations created by W.E.B. Du Bois in 1900. Before doing the interactive exercise, please read this article about the Du Bois Challenge: https://nightingaledvs.com/the-dubois-challenge/

### In this interactive exercise, you will:

1. Learn how to create a variation of a **bar graph**.
2. Do so by learning and modifying code in the statistical programming language *R*.
3. Answer questions and submit screenshots in **Catcourses** (or another Canvas system) as you go, if your instructor asks you to.

### You will learn how to use the *R* statistical programming language by creating two graphs:

1. You will recreate Du Bois' visualization of Black illiteracy rates in the US compared to illiteracy rates in other countries. Du Bois created the visualization in 1900.

2. You will adapt Du Bois' visualization using data on college degree attainment among Black U.S. residents and in other countries today. This aligns with how Du Bois saw mass education as one important strategy for furthering and deepening emancipation for Black Americans and others.

An important context for Du Bois's graph of Black illiteracy is that literacy was illegal for enslaved people in the U.S. until emancipation and the Confederacy's defeat in the Civil War. Illiteracy then declined rapidly as Black Americans sought to empower themselves through education. Du Bois plotted this decline in illiteracy with the following graph:

<div>
<img src="https://github.com/ajstarks/dubois-data-portraits/blob/master/plate14/original-plate-14.jpg?raw=true" width="700" />
</div>

### 1. How to use this interactive **Jupyter Notebook**

Grey cells in the *Notebook* like the one below are **code cells** where you will write and edit **R** Code. To try it out:

1. Click your cursor on the grey cell below. After you click on it, it will change to white to indicate you are editing it.
2. After you click on the cell below and it turns white, type ```2+2``` to use R as a calculator.
3. After typing ```2+2```, click the <span class="play-button">&#9654;</span> *play button* at the top of this page.

### 2. Getting hints and answers.

Sometimes, the code cells will already have code in them that you will be asked to edit or run by clicking the play button. In the process, you can click on <span class="play-button">&#9654;</span> dropdown buttons like the ones below to get hints and answers. For example:
1. Click on the cell below with ```3+3=``` and click the play triangle above. You should get an error message highlighted in pink that begins "Error in parse...". To complete this activity, you'll want to get each code cell to run without a pink error message.
2. Based on the ```2+2``` code you tried above, try to edit the ```3+3=``` code to get it to report the sum of 3+3 without the error message. For a hint, click the first <span class="play-button">&#9654;</span> button below.

<details> <summary>Click this triangle for a hint.</summary>

Try deleting the ```=``` sign in the cell and click play again. If that doesn't work, click the next triangle below for the answer.

</details>

<details> <summary>Click this triangle for the answer.</summary>

**Answer:** Delete all the text in the cell below and write or paste this answer in the cell: ```3+3``` before clicking play again.

</details>

In [None]:
3+3 =

### 3. Reading and writing comments that explain your code

In code cells, we can write **comment** text that explains our code. We put a ```# ``` before **comment** text to tell R that the text is not code it should execute. Any text after a ```# ``` on a given line will be treated as a comment. In Jupyter, **comment** text after a ```# ``` will be displayed in a dark turquoise color. To see how this works, try the following below:

1. Try to run the code below. You should get an error message because the comment text ```This is code that adds 2+2``` is not R code and doesn't have a ```# ``` sign in front of it.
2. Add a ```# ``` sign before ```This is code that adds 2+2```. This should change the color of the text to turquoise like the text after the ```2+2``` where there is already a ```# ``` sign.
3. Click the <span class="play-button">&#9654;</span> at the top of the notebook and the code should run and output **4** below.

In [None]:
This is code that adds 2+2

2+2 # the result of 2 +2 should be 4

### 4. Keeping track of your work and using R outside of this notebook.

If you leave this Notebook idle or close and re-open it, your work will not be saved in the Notebook.

But we'll write all of the code you need for each step in each code cell. So if you leave the Notebook and come back to finish your work, you can just skip ahead to whatever step you were at.

You should also make sure to answer questions about the activity in Catcourses/Canvas as you go so that you don't have to repeat steps to answer Catcourses/Canvas questions.

If you know how to use R on your own computer with RStudio or Jupyter Notebooks, you can also copy, paste, and edit code in those programs to do the exercise. For that purpose, you can view a non-interactive version of this Notebook by clicking **[here](https://github.com/HigherEdData/Du-Bois-STEM/blob/main/r_literacy_dubois.ipynb)**.

### 5. Reading Du Bois' data into an R Data Frame

The first step for data visualization in **R** is to **read** data into an R **dataframe**. This is like double clicking a file to open it in other computer programs. But with **R**, we use code.

For this exercise, we're going to read in data from a website. And we're going to place the data into a dataframe named **d_literacy_country**.

The **R** code to do this uses an ```<-``` arrow pointed at the name of the data frame and a *read.csv* **function** command followed by the location of a **csv (comma separated values)** data file, written in quotes within parentheses. It looks like this:

```d_literacy_country <- read.csv("web_address_with_data/data_file_name.csv")```

After writing this code, we can write the name of the **data frame** ```d_literacy_country``` again on a separate line. This will list all of the data in the data frame.
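
As a small illustration of the same pattern, the sketch below reads CSV text typed directly into the code instead of a file. The `csv_text` and `d_fruit` names and the fruit numbers are made up for this example and are not part of the Du Bois data.

```r
# Hypothetical CSV text standing in for a file; the names and numbers are made up
csv_text <- "fruit,count
apple,5
banana,2
cherry,9"

# read.csv() turns the text into a data frame; the <- arrow stores it under a name
d_fruit <- read.csv(text = csv_text)

# Writing the name on its own line lists the data
d_fruit
```

The first row of the CSV supplies the column names, and each later row becomes one row of the data frame.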

To do this yourself, replace the ```____``` portion of the code below to add the ```read``` function and then press the <span class="play-button">&#9654;</span> above.

In [None]:
d_literacy_country <- _____.csv("data/d_literacy_country.csv")

d_literacy_country

### 5. Creating a Bar Graph

After successfully listing the data above, you should be able to see that it has data in two columns. Each column is a **variable**:
* **country** is a country name for 10 countries with Black people in the U.S. treated as a country.
* **illiteracy** contains the percent of people in each country who are illiterate.

As a first step, we will create a **bar graph** of the data using the shortest code possible. The code will:
1. repeat the code below that we wrote above to **read in** the Du Bois illiteracy data.
2. add a line of code to load the **ggplot2** package of functions into our Library that we will use to generate our graph: ```library(ggplot2)```
3. Add a line of code to tell Jupyter and R that we want the width and height of the graph to have the same ratio that Du Bois used, 22 inches wide by 28 inches tall, with each divided by 2 so that it doesn't display too big: ```options(repr.plot.width=22/2, repr.plot.height=28/2)```
4. Finally, we add a **ggplot** function followed by an open parenthesis to tell R that we will plot data from the d_literacy_country data frame with an "aesthetic mapping" **aes** specification that maps one column of data on the **x axis** and another column of data on the **y axis**.
5. After the close parenthesis that tells ggplot we want to plot ```d_literacy_country``` data with one variable on the x axis and another on the y axis, we add a + and then a new line of code ```geom_col()``` that tells ggplot we want a bar graph whose bar lengths are the values already stored in the data frame.
6. After looking at Du Bois' version of the graph above, replace the ```_____``` characters in the code cell below to plot the correct variable on the x-axis and the correct variable on the y-axis.

<details> <summary><strong>Hint:</strong></summary>

Du Bois plotted ```illiteracy``` on the x-axis and ```country``` name on the y-axis.

</details>

<details> <summary><strong>Answer:</strong></summary>

The 4th line of code from the bottom should be: ```x = illiteracy,```

the 3rd line of code from the bottom should be ```y = country)) + ```

</details>
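
The steps above can be sketched on a toy example before you try them on the Du Bois data. The `d_fruit` data frame and its `fruit` and `count` columns are hypothetical and only serve to show the shape of the code.

```r
library(ggplot2)

# Hypothetical toy data, not part of the Du Bois dataset
d_fruit <- data.frame(
    fruit = c("apple", "banana", "cherry"),
    count = c(5, 2, 9)
)

# Map one column to each axis, then add geom_col() to draw one bar per row
p <- ggplot(d_fruit, aes(x = count, y = fruit)) +
    geom_col()

# Writing the name of the plot displays it
p
```

Because the numeric column is on the x-axis and the names are on the y-axis, the bars run horizontally.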

In [None]:
# Read in the data from a CSV file into a dataframe called d_literacy_country
d_literacy_country <- read.csv("data/d_literacy_country.csv")

# Load the ggplot2 package for creating visualizations
library(ggplot2)

# Set the dimensions of the plot output (width and height in inches)
options(repr.plot.width = 22/2, repr.plot.height = 28/2)

# Source a custom ggplot theme from a remote URL to style the plot (theme_dubois)
source("https://raw.githubusercontent.com/HigherEdData/Du-Bois-STEM/refs/heads/main/theme_dubois.R")

# Begin the ggplot call with the data and aesthetic mappings
ggplot(d_literacy_country, aes(
    x = _________,  # replace ________ with the variable name that Du Bois plots on the x-axis
    y = _________)) + # replace ________ with the variable name that Du Bois plots on the y-axis
    # graph a bar chart based on the x and y mappings above
    geom_col() 

### 6. Ordering the Bars and Making the Bar for Black Americans a Different Color

In the bar graph you created above, can you tell what order the bars for each country are sorted by?

Du Bois sorts the bar for each country by its illiteracy rate from highest to lowest.

Du Bois also graphs the illiteracy rate for Black Americans in a different color to make it easier to compare to other countries.

You can edit your R code to sort the bars in the same order as Du Bois and to have the same colors.
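
To see what sorting and highlighting look like in code, here is a minimal sketch on made-up data. It uses R's `reorder` function and the `==` comparison outside of a graph so you can see what each one returns. The `d_fruit` data frame is hypothetical.

```r
# Hypothetical toy data, not part of the Du Bois dataset
d_fruit <- data.frame(
    fruit = c("apple", "banana", "cherry"),
    count = c(5, 2, 9)
)

# reorder() returns the first variable with its categories sorted by the second variable
sorted_fruit <- reorder(d_fruit$fruit, d_fruit$count)
levels(sorted_fruit)

# A comparison with == gives TRUE or FALSE for every row
is_cherry <- d_fruit$fruit == "cherry"
is_cherry
```

ggplot draws the first category at the bottom of the y-axis, so sorting from smallest to largest puts the longest bar at the top. A TRUE/FALSE comparison used as the `fill` splits the bars into two color groups.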

In [None]:
# below is code you learned above for reading in the data and setting up ggplot and graphing options
d_literacy_country <- read.csv("data/d_literacy_country.csv")
library(ggplot2)
options(repr.plot.width=22/2, repr.plot.height=28/2)
source("https://raw.githubusercontent.com/HigherEdData/Du-Bois-STEM/refs/heads/main/theme_dubois.R")

# below is the code you learned above to graph Du Bois data
ggplot(d_literacy_country, aes(
    x = illiteracy,
# then by adding "reorder" below, you reorder the countries by filling in _________ with the correct variable name after "country,"
    y = reorder(country, _________),
# then you can tell R which country to fill with a different color by filling in ________ with the country name
    fill = country == "__________"
)) +
    geom_col()

### 7. Edit the bar width and use a **"Du Bois Theme"** to make colors and text similar to Du Bois' style
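
As a rough illustration of the two ideas in this step, the sketch below sets a bar width and adds a theme on a new line after a plus sign. It assumes made-up `d_fruit` data and uses ggplot2's built-in `theme_minimal()` as a stand-in for the Du Bois theme.

```r
library(ggplot2)

# Hypothetical toy data, not part of the Du Bois dataset
d_fruit <- data.frame(
    fruit = c("apple", "banana", "cherry"),
    count = c(5, 2, 9)
)

p <- ggplot(d_fruit, aes(x = count, y = fruit)) +
    # width is the share of each bar's slot that the bar fills; smaller numbers give thinner bars
    geom_col(width = 0.3) +
    # a theme is added like any other layer, on its own line after a plus sign
    theme_minimal()
p
```

If you leave `width` out, ggplot2 uses 0.9, so bars nearly touch.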



In [None]:
# below is code you learned above for reading in the data and setting up ggplot and graphing options
d_literacy_country <- read.csv("data/d_literacy_country.csv")
library(ggplot2)
options(repr.plot.width=22/2, repr.plot.height=28/2)
source("https://raw.githubusercontent.com/HigherEdData/Du-Bois-STEM/refs/heads/main/theme_dubois.R")

# below is the code you learned above to graph Du Bois data with sorting and different colors
ggplot(d_literacy_country, aes(
    x = illiteracy,
    y = reorder(country, illiteracy),
    fill = country == "Negroes, U.S.A."
)) +
# fill in the ____ below to adjust the bar widths to be more similar to Du Bois'. 
    geom_col(width = ____) 
# add a plus sign above this comment to add a new line of code. Then write the code to add the Du Bois theme on a new line.
    

### 8. Change the Text Font and the Bar Color
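
Here is a minimal sketch, on made-up data, of how a TRUE/FALSE fill can be given specific colors with `scale_fill_manual`. The `d_fruit` data, the `bar_colors` name, and the two colors are assumptions chosen only for illustration.

```r
library(ggplot2)

# Hypothetical toy data, not part of the Du Bois dataset
d_fruit <- data.frame(
    fruit = c("apple", "banana", "cherry"),
    count = c(5, 2, 9)
)

# The names "TRUE" and "FALSE" match the two possible results of the fill comparison
bar_colors <- c("TRUE" = "orange", "FALSE" = "gray50")

p <- ggplot(d_fruit, aes(x = count, y = fruit, fill = fruit == "cherry")) +
    geom_col() +
    # family sets the font used for all text in the graph
    theme(text = element_text(family = "serif")) +
    scale_fill_manual(values = bar_colors)
p
```

The bar where the comparison is TRUE gets the first color, and every other bar gets the second.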

In [None]:
# below is code you learned above for reading in the data and setting up ggplot and graphing options
d_literacy_country <- read.csv("data/d_literacy_country.csv")
library(ggplot2)
options(repr.plot.width=22/2, repr.plot.height=28/2)
source("https://raw.githubusercontent.com/HigherEdData/Du-Bois-STEM/refs/heads/main/theme_dubois.R")

# below is the code you learned above
ggplot(d_literacy_country, aes(
    x = illiteracy,
    y = reorder(country, illiteracy),
    fill = country == "Negroes, U.S.A." # this is the fill statement
)) +
    geom_col(width = .5) +
    theme_dubois() +
# set the theme's font to a serif family
    theme(text = element_text(family = "serif")) +
# fill in the blanks with the bar color to use when the fill statement is TRUE and when it is FALSE
    scale_fill_manual(values = c("TRUE" = "___", "FALSE" = "_______")) 
    

### 9. Add the Titles and Subtitles with Your Own Name
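
A short sketch of how titles work, using made-up data: the `labs` function takes a `title` and a `subtitle` as quoted text, and `\n` inside the quotes starts a new line. The `d_fruit` data and the wording of the titles are hypothetical.

```r
library(ggplot2)

# Hypothetical toy data, not part of the Du Bois dataset
d_fruit <- data.frame(
    fruit = c("apple", "banana", "cherry"),
    count = c(5, 2, 9)
)

p <- ggplot(d_fruit, aes(x = count, y = fruit)) +
    geom_col() +
    labs(
        # each \n adds a line break, which leaves blank space around the text
        title = "\nFruit counts in a made-up basket\n",
        subtitle = "Drawn by a hypothetical student\n"
    )
p
```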

In [None]:
# below is code you learned above for reading in the data and setting up ggplot and graphing options
d_literacy_country <- read.csv("data/d_literacy_country.csv")
library(ggplot2)
options(repr.plot.width=22/2, repr.plot.height=28/2)
source("https://raw.githubusercontent.com/HigherEdData/Du-Bois-STEM/refs/heads/main/theme_dubois.R")

# below is the code you learned above
ggplot(d_literacy_country, aes(
    x = illiteracy,
    y = reorder(country, illiteracy),
    fill = country == "Negroes, U.S.A." # this is the fill statement
)) +
    geom_col(width = .5) +
    theme_dubois() +
       theme(text = element_text('serif')) +
    scale_fill_manual(values = c("TRUE" = "red", "FALSE" = "darkgreen")) +
# YOU LEARNED ABOVE ABOUT ALL THE CODE ABOVE THIS LINE
# the code below adds a title and a subtitle
    labs(
        title = "\nIlliteracy of the American Negroes compared with that of other nations.\n",
        subtitle = "Proportion d' illettrés parmi les Nègres Americains comparée à celle des autres nations.\n\n
        Done by Atlanta University.\n\n
        Recreated by ________\n\n"
    )

### 10. Change the Data to Read in and Display College Degree Holding By Country

In [None]:
# below is code you learned above for reading in the data and setting up ggplot and graphing options
_____________<- read.csv("data/___________.csv")

d_college_country

### 11. Edit the Code to Graph the College Attainment Data

In [None]:
# below is code you learned above for reading in the data and setting up ggplot and graphing options
d_college_country<- read.csv("data/d_college_country.csv")
library(ggplot2)
options(repr.plot.width=22/2, repr.plot.height=28/2)
source("https://raw.githubusercontent.com/HigherEdData/Du-Bois-STEM/refs/heads/main/theme_dubois.R")

# below is the code you learned above
ggplot(____________, aes(
    x = _________,
    y = reorder(country, college),
    fill = country == "Black U.S. Residents" # this is the fill statement
)) +
    geom_col(width = .5) +
    theme_dubois() +
       theme(text = element_text('serif')) +
    scale_fill_manual(values = c("TRUE" = "red", "FALSE" = "darkgreen")) +
    labs(
        title = "\nCollege attainment by Black U.S. residents compared with that of other nations.\n",
# fill in the blank below to translate the title in the language of your choice
        subtitle = "_____________________\n\n
        Adapted by ________ from Du Bois' graph of literacy in 1900.\n\n"
    )

### 12. Edit the Code to Add X Axis Grid Lines With Labels
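
The x-axis code in the cell below relies on two small pieces that can be tried on their own: `seq` to choose where the grid lines go, and a short function that adds a percent sign to each label. The sketch below runs them outside of a graph. The `tick_positions` and `add_percent` names are made up for this example.

```r
# seq() builds evenly spaced numbers: from 0 to 60 in steps of 10
tick_positions <- seq(0, 60, by = 10)
tick_positions

# A small function that pastes a "%" onto whatever numbers it receives
add_percent <- function(x) paste0(x, "%")

# ggplot applies a function like this to the tick positions to make the axis labels
add_percent(tick_positions)
```

In the graph, these same pieces are passed to `scale_x_continuous` as its `breaks` and `labels`.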

In [None]:
# below is code you learned above for reading in the data and setting up ggplot and graphing options
d_college_country<- read.csv("data/d_college_country.csv")
library(ggplot2)
options(repr.plot.width=22/2, repr.plot.height=28/2)
source("https://raw.githubusercontent.com/HigherEdData/Du-Bois-STEM/refs/heads/main/theme_dubois.R")

# below is the code you learned above
ggplot(d_college_country, aes(
    x = college,
    y = reorder(country, college),
    fill = country == "Black U.S. Residents" # this is the fill statement
)) +
    geom_col(width = .5) +
    theme_dubois() +
       theme(text = element_text('serif')) +
    scale_fill_manual(values = c("TRUE" = "red", "FALSE" = "darkgreen")) +
    labs(
        title = "\nCollege attainment by Black U.S. residents compared with that of other nations.\n",
# fill in the blank below to translate the title in the language of your choice
        subtitle = "_____________________\n\n
        Adapted by ________ from Du Bois' graph of literacy in 1900.\n\n"
    ) +
    scale_x_continuous(
        breaks = seq(0, 60, by = 10),  # Set tick positions every 10 units
        labels = function(x) paste0(x, "%")  # Add a "%" symbol to each label
    ) +
    theme(
        axis.text.x = element_text(size = 12),
        panel.grid.major.x = element_line(color = "lightgray")
    )