Matched Set Objects

adamrauh edited this page Jan 22, 2020 · 11 revisions

First, create a smaller subset of the data to make things a little easier to see and work with for the purposes of these examples.

library(PanelMatch)
# dem is the example data set shipped with PanelMatch
uid <- unique(dem$wbcode2)[1:10]
subdem <- dem[dem$wbcode2 %in% uid, ]
DisplayTreatment(unit.id = "wbcode2", time.id = "year", treatment = 'dem', data = subdem)

We will use the PanelMatch function with refinement.method = "none" to obtain a PanelMatch object, from which we will extract a matched.set object.

PM.results <- PanelMatch(lag = 4, time.id = "year", unit.id = "wbcode2", 
                         treatment = "dem", refinement.method = "none", 
                         data = subdem, match.missing = TRUE, 
                         qoi = "att", outcome.var = "y",
                         lead = 0, forbid.treatment.reversal = FALSE)

# Extract the matched.set object
msets <- PM.results$att

PanelMatch returns an S3 object of the PanelMatch class. These objects are just lists with some additional attributes. Here, we will focus on one element contained within PanelMatch objects: matched.set objects. Within the PanelMatch object, this element is always named either att or atc. When qoi = "ate", the resulting PanelMatch object contains two matched.set objects, named att and atc, respectively.

In implementation, the matched.set is just a named list with some added attributes (lag, names of treatment, unit, and time variables) and a structured name scheme. Each entry in the list is a vector containing the unit ids of the control units in one matched set. Each entry also corresponds to a unit/time id pair: the unit id of a treated unit and the time at which treatment occurred. This is reflected in the names of the list elements, which follow the scheme [id variable].[time variable].
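
As a rough illustration of the naming scheme, the sketch below splits names of the form [id variable].[time variable] back into their two parts using base R only. The list mock.sets is a hypothetical stand-in for a matched.set (a plain named list built by hand, not an object produced by PanelMatch), and the approach assumes that unit ids contain no periods themselves.

```r
# Hypothetical stand-in for a matched.set: a plain named list of control unit ids.
mock.sets <- list(`4.1992` = c("3", "13"),
                  `7.1991` = c("3", "4", "12", "13"),
                  `7.1998` = character(0))

# Split each name on the literal "." that separates unit id and time.
parts <- strsplit(names(mock.sets), ".", fixed = TRUE)

# Rebuild a summary table similar in shape to the default print view.
treated <- data.frame(
  unit = as.integer(vapply(parts, `[`, character(1), 1)),
  time = as.integer(vapply(parts, `[`, character(1), 2)),
  matched.set.size = unname(lengths(mock.sets))
)
print(treated)
```

The same splitting step should carry over to names(msets) from a real matched.set, provided the ids meet the assumption above.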

Matched set objects are implemented as lists, but the default printing behavior resembles that of a data frame. One can toggle a verbose option on the print method to print as a list and also display a less summarized version of the matched set data.

names(msets)
[1] "4.1992"  "4.1997"  "6.1973"  "6.1983"  "7.1991"  "12.1992" "13.2003" "7.1998" 

#data frame printing view: useful as a summary view with large data sets
print(msets)

  wbcode2 year matched.set.size
1       4 1992                2
2       4 1997                1
3       6 1973                1
4       6 1983                2
5       7 1991                4
6      12 1992                2
7      13 2003                2
8       7 1998                0

# first column is unit id variable, second is time variable, and 
# third is the number of controls in that matched set

# prints as a list, shows all data at once
print(msets, verbose = TRUE)
$`4.1992`
[1] "3"  "13"
attr(,"weights")
  3  13 
0.5 0.5 

$`4.1997`
[1] "7"
attr(,"weights")
7 
1 

$`6.1973`
[1] "13"
attr(,"weights")
13 
 1 

$`6.1983`
[1] "4"  "13"
attr(,"weights")
  4  13 
0.5 0.5 

$`7.1991`
[1] "3"  "4"  "12" "13"
attr(,"weights")
   3    4   12   13 
0.25 0.25 0.25 0.25 

$`12.1992`
[1] "3"  "13"
attr(,"weights")
  3  13 
0.5 0.5 

$`13.2003`
[1] "3"  "12"
attr(,"weights")
  3  12 
0.5 0.5 

$`7.1998`
character(0)

attr(,"lag")
[1] 4
attr(,"t.var")
[1] "year"
attr(,"id.var")
[1] "wbcode2"
attr(,"treatment.var")
[1] "dem"
attr(,"refinement.method")
[1] "none"
attr(,"match.missing")
[1] TRUE

The '[' and '[[' operators are implemented for matched.set objects and behave much as they do for ordinary lists: matched sets can be selected either by numerical index or by name, taking advantage of the naming scheme.

Using '[' returns a subsetted matched.set object (a list). The custom operator copies the additional attributes over to the result, so by default it prints in the same summary form as the full matched.set. Using '[[' returns the unit ids of the control units in the specified matched set:

msets[1]
  wbcode2 year matched.set.size
1       4 1992                2

#prints the control units in this matched set
msets[[1]]

[1] "3"  "13"
attr(,"weights")
  3  13 
0.5 0.5 

msets["4.1992"] #equivalent to msets[1]
  wbcode2 year matched.set.size
1       4 1992                2

msets[["4.1992"]] #equivalent to msets[[1]]
[1] "3"  "13"
attr(,"weights")
  3  13 
0.5 0.5 
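
The output above shows the weights stored as a "weights" attribute on each element, so they can presumably be read with base R's attr(). The sketch below demonstrates this on a hand-built stand-in (mock.set, a hypothetical object mirroring the printed output for 4.1992) rather than on a real PanelMatch result.

```r
# Hypothetical stand-in for msets[["4.1992"]]: control ids plus a "weights" attribute.
mock.set <- structure(c("3", "13"), weights = c(`3` = 0.5, `13` = 0.5))

# Pull the named weight vector off the element.
w <- attr(mock.set, "weights")
print(w)

# Look up the weight of a single control unit by its id.
w[["13"]]

# as.vector() drops the attributes, leaving only the control unit ids.
controls <- as.vector(mock.set)
print(controls)
```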

Calling plot on a matched.set object will display a histogram of the sizes of the matched sets. By default, the number of empty matched sets (treated unit/time id pairs with no suitable controls for a match) is noted with a vertical bar at x = 0. One can include empty sets in the histogram itself by setting the include.empty.sets argument to TRUE.

plot(msets, xlim = c(0, 4))

The summary function reports the sizes of the matched sets, the number of empty sets, and the lag size, and it also returns a data frame with summary overview information. This "overview" data frame is what gets printed by default when calling print on a matched.set object, so to work with that data frame directly, use the overview item returned by summary. The summary function can also return only the overview data frame. Toggle this by setting verbose = FALSE.

print(summary(msets))
$overview
  wbcode2 year matched.set.size
1       4 1992                2
2       4 1997                1
3       6 1973                1
4       6 1983                2
5       7 1991                4
6      12 1992                2
7      13 2003                2
8       7 1998                0

$set.size.summary
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   0.00    1.00    2.00    1.75    2.00    4.00 

$number.of.treated.units
[1] 8

$num.units.empty.set
[1] 1

$lag
[1] 4

print(summary(msets, verbose = FALSE))
  wbcode2 year matched.set.size
1       4 1992                2
2       4 1997                1
3       6 1973                1
4       6 1983                2
5       7 1991                4
6      12 1992                2
7      13 2003                2
8       7 1998                0
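
Since the overview is an ordinary data frame, standard subsetting applies. The sketch below rebuilds the overview shown above by hand (as the hypothetical object overview, so the snippet runs without PanelMatch) and then pulls out the treated observations with empty matched sets. With a real matched.set, one would presumably start from summary(msets)$overview instead.

```r
# Hand-copied version of the overview data frame printed above (assumption:
# the real object has the same three columns).
overview <- data.frame(
  wbcode2 = c(4, 4, 6, 6, 7, 12, 13, 7),
  year = c(1992, 1997, 1973, 1983, 1991, 1992, 2003, 1998),
  matched.set.size = c(2, 1, 1, 2, 4, 2, 2, 0)
)

# Treated unit/time pairs with no matched controls.
empty <- overview[overview$matched.set.size == 0, ]
print(empty)

# Average matched set size across all treated observations.
mean(overview$matched.set.size)
```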

Matched sets with the DisplayTreatment function

Passing a matched set (one treated unit and its corresponding set of controls) to the DisplayTreatment function will visually highlight the lag window histories used to create the matched set. There is also an option to only display units from the matched set (and the treated unit), which can be achieved by setting show.set.only to TRUE.

DisplayTreatment(unit.id = "wbcode2", time.id = "year", treatment = 'dem', data = subdem, matched.set = msets[1])

DisplayTreatment(unit.id = "wbcode2", time.id = "year", treatment = 'dem',
                 data = subdem, matched.set = msets[1],
                 show.set.only = TRUE, y.size = 15, x.size = 13)
