# Econ 490: Combining Graphs in Stata (10)

## Prerequisites 

1. Be able to effectively use Stata do-files and generate log-files.
2. Be able to change your directory so that Stata can find your files.
3. Import datasets in csv and dta format. 
4. Save data files. 
5. Use the command `twoway`.

## Learning Outcomes 

1. Know how to combine and save graphs using the commands `graph combine` and `graph export`.

We'll continue working with the fake data set introduced in the previous lecture. Recall that this data set simulates information for workers in the years 1982-2012 in a fake country where a training program was introduced in 2003 to boost their earnings.

In [None]:
clear* 

use fake_data, clear 

In this module, we will work through two examples. The first example covers combining two graphs with the same schema, while the second covers combining two graphs with different schemas.

## 10.1 Example 1
For this example, we want to generate two graphs with the same schema (they are the same type of graph and use the same variables on their x and y axes) and combine them using the `graph combine` command. Let's begin by setting up the data. We are going to first generate a new variable that shows the logarithm of workers' earnings. As explained in previous modules, collapsing data is irreversible; therefore, we `preserve` the data set before we collapse it. Then, once we no longer want to use the collapsed version of the data set, we can `restore` the original data set we preserved. We are going to preserve our data set and then collapse it by the variables `region`, `treated`, and `year`. This way, each observation in the collapsed data is uniquely identified by the combination of `region`, `treated`, and `year`.
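The preserve, collapse, restore round trip can also be tried on its own before we build the graphs. The sketch below is only an illustration: it assumes `fake_data` is already in memory (as loaded above) and collapses the raw `earnings` variable by `year`, which is not the collapse used in the rest of this example.

```stata
* Minimal sketch of the preserve / collapse / restore pattern
* (assumes fake_data is in memory; the choice of earnings by year is illustrative)

preserve                               // take a snapshot of the full data set

collapse (mean) earnings, by(year)     // data now holds one row per year
list in 1/5                            // look at the first few collapsed rows

restore                                // bring back the snapshot taken by preserve

describe, short                        // the worker-level data set is back in memory
```

Because the snapshot is brought back by `restore`, this sketch leaves the data set exactly as it was, so the cells that follow are unaffected.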

In [None]:
gen log_earnings = log(earnings)
la var log_earnings "Log-earnings"

preserve

collapse (mean) log_earnings, by(region treated year)

describe

Before we begin producing the graphs, let's open the documentation for `graph combine`.

In [None]:
help graph combine

Now that we have our data prepared and the documentation open to help us with any command-specific questions, we can start generating the two graphs and combine them using the `graph combine` command. We want these graphs to compare log-earnings between the control and treated groups in regions 1 and 2. To do this, we can create one graph that compares log-earnings between control and treated groups in region 1 and another that does the same comparison for region 2.

In [None]:
*** Generate graph for Region 1 ***

twoway (connected log_earnings year if region==1 & treated) ||      ///
    (connected log_earnings year if region==1 & !treated),          ///
        xline(2002, lpattern(dash))                                 /// 
        ylab(9.5(0.5)11)                                            ///
        ytitle("Log-earnings") xtitle("Year")                       ///
        legend( label(1 "Treated") label(2 "Control"))              ///
        aspectratio(1)                                              ///
        title("Region 1") name("R1", replace)

In [None]:
*** Generate graph for Region 2 ***

twoway (connected log_earnings year if region==2 & treated) ||      ///
    (connected log_earnings year if region==2 & !treated),          ///
        xline(2002, lpattern(dash))                                 ///
        ylab(9.5(0.5)11)                                            ///
        ytitle("Log-earnings") xtitle("Year")                       ///
        legend( label(1 "Treated") label(2 "Control"))              ///
        aspectratio(1)                                              ///
        title("Region 2") name("R2", replace)

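A few of the options used above are worth a closer look. `xline(2002, lpattern(dash))` draws a dashed vertical line at 2002, the last year before the training program was introduced. `ylab(9.5(0.5)11)` labels the y-axis from 9.5 to 11 in steps of 0.5 in both graphs, which keeps the two y-axes comparable. `aspectratio(1)` makes the plot region square, so that its height equals its width. Finally, `name("R1", replace)` stores the graph in memory under the name `R1`, overwriting any graph that already has that name, so that we can refer to it later in `graph combine`.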

In [None]:
*** Combine graphs ***

graph combine R1 R2, cols(2) title("Panel A: Log-earnings by Region") saving(panel_a, replace)

graph export ./img/panel_a.svg, replace

![Panel A](img/panel_a.svg)
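`graph export` is not limited to SVG files or to the most recently drawn graph. As a rough sketch, the lines below export the combined graph as a PNG and then export one of the individual graphs by name. The file names `panel_a.png` and `region1.svg` and the width of 2000 pixels are arbitrary choices made for illustration, and the `./img` folder is assumed to exist already.

```stata
* Sketch: other ways to export graphs (file names and width are illustrative)

* Export the most recently drawn graph as a PNG; width() sets the width in pixels
graph export ./img/panel_a.png, width(2000) replace

* Export a specific graph held in memory by passing its name
graph export ./img/region1.svg, name(R1) replace
```

The file format is inferred from the file extension, and `name()` lets us export a graph such as `R1` without redrawing it first.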

<div class="alert alert-block alert-info">

<b>Your turn:</b> Complete the code below to generate an additional graph for region 3, and name it `R3`. Combine all three graphs and store the result as `panel_a_v2.svg`. 

</div>

In [None]:
*** Generate R3 ***

twoway (connected log_earnings year if region & treated) ||      ///
    (connected log_earnings year if region & !treated),          ///
        xline(2002, lpattern(dash))                              ///
        ylab(9.5(0.5)11)                                         ///
        ytitle("Log-earnings") xtitle("Year")                    ///
        legend( label(1 "Treated") label(2 "Control"))           ///
        aspectratio(1)                                           ///
        title(" ") name(" ", replace)


*** Combine graphs ***

graph , cols() title("Panel A: Log-earnings by Region") name(" ", replace)

graph export , replace


<div class="alert alert-block alert-warning">
    
<b>Your turn:</b> Edit the code below to remove the `aspectratio(1)` option and generate the graphs once again. Store the combined graph as `panel_a_v3.svg`. Do you notice any differences between version 1 and version 3?
    
</div>

In [None]:
*** Generate R1 and R2 without aspect ratio option ***

twoway (connected log_earnings year if region==1 & treated) ||      ///
    (connected log_earnings year if region==1 & !treated),          ///
        xline(2002, lpattern(dash))                                 /// 
        ylab(9.5(0.5)11)                                            ///
        ytitle("Log-earnings") xtitle("Year")                       ///
        legend( label(1 "Treated") label(2 "Control"))              ///
        aspectratio(1)                                              ///
        title("Region 1") name("R1", replace)

twoway (connected log_earnings year if region==2 & treated) ||      ///
    (connected log_earnings year if region==2 & !treated),          ///
        xline(2002, lpattern(dash))                                 ///
        ylab(9.5(0.5)11)                                            ///
        ytitle("Log-earnings") xtitle("Year")                       ///
        legend( label(1 "Treated") label(2 "Control"))              ///
        aspectratio(1)                                              ///
        title("Region 2") name("R2", replace)

*** Combine graphs ***

graph combine R1 R2, cols(2) title("Panel A: Log-earnings by Region") name( , replace)

graph export , replace

<div class="alert alert-block alert-warning">
    
<b>Your turn:</b> Edit the code below to remove the `ylab` option and generate the graphs once again. Store the combined graph as `panel_a_v4.svg`. Could a visualisation like this be misleading? 
    
</div>

In [None]:
*** Generate R1 and R2 without ylab option ***

twoway (connected log_earnings year if region==1 & treated) ||      ///
    (connected log_earnings year if region==1 & !treated),          ///
        xline(2002, lpattern(dash))                                 /// 
        ylab(9.5(0.5)11)                                            ///
        ytitle("Log-earnings") xtitle("Year")                       ///
        legend( label(1 "Treated") label(2 "Control"))              ///
        aspectratio(1)                                              ///
        title("Region 1") name("R1", replace)

twoway (connected log_earnings year if region==2 & treated) ||      ///
    (connected log_earnings year if region==2 & !treated),          ///
        xline(2002, lpattern(dash))                                 ///
        ylab(9.5(0.5)11)                                            ///
        ytitle("Log-earnings") xtitle("Year")                       ///
        legend( label(1 "Treated") label(2 "Control"))              ///
        aspectratio(1)                                              ///
        title("Region 2") name("R2", replace)

*** Combine graphs ***

graph combine R1 R2, cols(2) title("Panel A: Log-earnings by Region") name( , replace)

graph export , replace

<div class="alert alert-block alert-warning">
    
<b>Your turn:</b> Complete the code below to combine the graphs once again but replace `cols(2)` with `rows(2)`. Store the combined graph as `panel_a_v5.svg`. Do you notice any differences between version 1 and version 5? 
    
</div>

In [None]:
gr ,  title("Panel A: Log-earnings by Region") name( , replace)

graph export , replace

## 10.2 Example 2
For this example, we want to combine graphs that do not follow the same schema. Let's say we are interested in seeing whether there is any relationship between the distribution of earnings (*log_earnings*) and how workers' earnings change over time in region 1. As we saw in the last module, we usually use histograms to present the distribution of a variable, and we can use a scatter plot or a line plot to graph earnings over time. We will begin by generating a histogram of log-earnings in region 1.

In [None]:
restore         // set up the data for our graph 

In [None]:
histogram log_earnings if region==1,   ///
    aspectratio(1)                     ///
    name("histogram1", replace)

Let's create our second graph. 

In [None]:
preserve              // here we set up the data once again for our second graph

collapse (mean) log_earnings, by(region year)

In [None]:
twoway (scatter log_earnings year if region==1), ///
    ytitle("Log-earnings") xtitle("Year")        ///
    aspectratio(1)                               ///
    name("plot1", replace)

Now we combine `histogram1` with `plot1`. 

In [None]:
graph combine histogram1 plot1, cols(2) title("Region 1") name(newcombine, replace)

graph export ./img/newcombine.svg, replace

![new combine](img/newcombine.svg)
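The layout of a combined graph can be adjusted with a few further `graph combine` options. The sketch below, offered as one possible variation rather than a recommended layout, stacks the two Region 1 graphs vertically. It assumes `histogram1` and `plot1` are still in memory, and the name `newcombine_v2` is made up for this illustration.

```stata
* Sketch: alternative layout for the Region 1 graphs
* (assumes histogram1 and plot1 exist in memory; newcombine_v2 is a hypothetical name)

graph combine histogram1 plot1,               ///
    rows(2)                                   /// stack the graphs in two rows
    iscale(1)                                 /// keep text and markers at their original size
    imargin(small)                            /// use a small margin around each graph
    title("Region 1") name(newcombine_v2, replace)
```

Here `rows(2)` takes the place of `cols(2)`, while `iscale()` and `imargin()` control the size of text and markers and the spacing around each individual graph.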

## 10.3 Wrap Up
In this module, we learned how to use the command `graph combine`. When producing a research paper, we might want to compare statistics such as GDP, population density, inflation, or exports across different countries or regions. Combined graphs allow us to see how the same variables diverge between different categories (for example, how earnings diverge between regions 1 and 2 in Example 1), and they can also show the relationship between different variables within a single category (as with region 1 in Example 2). Understanding which graphs to use and how to present them is extremely important when building a research project, which is why working alongside the `twoway` and `graph combine` documentation is always of great value.

## References

[Getting started in stata (includes graphing)](https://www.youtube.com/watch?v=YAVq99iUTTI) <br>
[(Non StataCorp) Combining graphs in Stata](https://www.youtube.com/watch?v=GN9Jh7ZLauI)