Forest Plot with Subgroups

Shane Rosanbalm edited this page Mar 22, 2017 · 10 revisions

Contents

  1. The Goal
  2. Pre-summarized Dummy Data
  3. Forest Only
  4. Add Row Labels
  5. Format Row Labels
  6. Add Banding
  7. Add Statistics
  8. Axis Modifications
  9. Style Modifications

The Forest Plot with Subgroups will be developed in several steps.

  • A succinct version of the code is available here.
  • The inspiration for this page comes from the SAS blog Graphically Speaking.

The Goal

On this page, we walk through the process of creating a forest plot with subgroups, complete with bold group headers, indentations, and alternate group banding. The end goal looks like this:

[Figure: subgroup forest plot, the goal]

Pre-summarized Dummy Data

Because I cannot possibly predict the exact set of statistics or derivation methods that you are going to want to use in your forest plot, this example begins with pre-summarized dummy data.

[Figure: pre-summarized dummy data]
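
For readers without access to derive.stats00, here is a minimal sketch of what such a dataset might look like. The variable names are the ones used by the code later on this page; the dataset name stats00_sketch, the variable lengths, and every value in the datalines are invented purely for illustration and are not the actual dummy data.

```sas
*--- hypothetical stand-in for derive.stats00; all values are invented ---;
data stats00_sketch;
   infile datalines dsd truncover;
   length subgroup $20 countpct $12;
   input record level subgroup $ countpct $ mean low high pcigroup group pvalue;
   datalines;
1,1,Overall,2166 (100),1.3,0.9,1.5,17.2,15.6,.
2,1,Age,,.,.,.,.,.,0.05
3,2,<= 65 Yr,1534 (71),1.5,1.05,1.9,17.0,13.2,.
4,2,> 65 Yr,632 (29),0.8,0.6,1.25,17.8,21.3,.
;
run;
```

Group header rows (level=1) carry a label and possibly a p-value, while the subgroup rows (level=2) carry the estimate, the CI bounds, and the remaining statistics.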

Forest Only

We begin by producing a forest plot with nothing but hazard ratio estimates and CIs.

subgroup forest

The estimates appear courtesy of a scatter statement and the CIs by way of a highlow statement. We use the reverse option on the yaxis statement so that Overall shows up at the top.

data stats10;
   set derive.stats00;
run;

proc sgplot data=stats10 noautolegend;
   *--- estimates and CIs ---;
   scatter y=record x=mean / 
      markerattrs=(symbol=squarefilled)
      ;
   highlow y=record low=low high=high;
   *--- primary axes ---;
   yaxis
      reverse
      ;
run;

subgroup forest

Add Row Labels

Next we add the row labels for each estimate.

subgroup forest

This is accomplished using yaxistable. We also throw a couple of options at the yaxis statement to help with the cosmetics.

proc sgplot data=stats10 noautolegend;
   *--- estimates and CIs ---;
   scatter y=record x=mean / 
      markerattrs=(symbol=squarefilled)
      ;
   highlow y=record low=low high=high;
   *--- adding yaxis table at left ---;
   yaxistable subgroup / 
      location=inside
      position=left
      ;
   *--- primary axes ---;
   yaxis
      reverse
      display=none
      offsetmin=0
      ;
run;

subgroup forest

Format Row Labels

Next, we make the row labels look a bit nicer.

subgroup forest

First we will make the group labels bold using a discrete attribute map (dattrmap, textgroup, and textgroupid). Second we will indent the subgroup rows beneath each group label (indentweight).

*--- create discrete attribute map dataset ---;
data attrmap;
   input id $ value textcolor $ textsize textweight $;
   datalines;
text 1 black 7 bold
text 2 black 5 normal
;
run;

*--- flag records for indentation ---;
data stats20;
   set stats10;
   if level=1 then 
      indentWt = 0;
   else
      indentWt = 1;
run;

proc sgplot data=stats20 dattrmap=attrmap noautolegend;
   scatter ...
   highlow ...
   *--- adding yaxis table at left ---;
   yaxistable subgroup / 
      location=inside
      position=left
      textgroup=level
      textgroupid=text
      indentweight=indentWt         /* 9.4m3 */
      ;
   *--- primary axes ---;
   yaxis ...
run;

subgroup forest

Add Banding

In this step we add banding to every other group.

subgroup forest

We want the banding to cover both the row labels and the corresponding estimates and CIs. Unfortunately, we have to add the banding separately for each piece. To get the banding for the estimates and CIs, we add a new variable, band, to our dataset and create a refline to go with it. To get the banding for the row labels, we use the yaxis option colorbands.

*--- flag records for banding ---;
data stats30;
   set stats20;
   retain levelcount 0;
   if level = 1 then levelcount + 1;
   if mod(levelcount,2) = 0 then
      band = record;
run;
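
Before plotting, it can help to list which rows the flagging step marked. The listing below is only a sanity check and is not part of the original example; it assumes stats30 was built by the data step above.

```sas
*--- sanity check: which records received a band value? ---;
proc print data=stats30 noobs;
   var record level levelcount band subgroup;
run;
```

Rows belonging to even-numbered groups should show band equal to record, and all other rows should show band as missing, which is why the refline is drawn only behind every other group.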

proc sgplot data=stats30 dattrmap=attrmap noautolegend;
   *--- banding and reference line ---;
   refline band / 
      lineattrs=(thickness=26 color=cxf0f0f7)
      ;
   scatter ...
   highlow ...
   yaxistable subgroup / ...
   *--- primary axes ---;
   yaxis
      reverse
      display=none
      offsetmin=0
      colorbands=odd
      colorbandsattrs=(transparency=1) 
      ;
run;

subgroup forest

Add Statistics

In this step we add lots of statistics to the output.

subgroup forest

First we format the statistics so that dots are suppressed for missing values.

proc format;
   value notdot
      . = ' '
      other = [best.]
      ;
run;

data stats40;
   set stats30;
   format pcigroup group pvalue notdot.;
run;
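
To see what the notdot format does in isolation, here is a small sketch that writes one missing and one non-missing value to the log. The variable names missing_p and real_p are made up for this illustration; the format definition is repeated so that the block runs on its own.

```sas
proc format;
   value notdot
      . = ' '
      other = [best.]
      ;
run;

*--- write a missing and a non-missing value to the log ---;
data _null_;
   missing_p = .;
   real_p    = 0.05;
   put 'missing: [' missing_p notdot. ']';
   put 'present: [' real_p notdot. ']';
run;
```

The missing value should come out as a blank rather than a dot, while the non-missing value is passed through to the best. format.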

Then it's just a matter of a couple of additional yaxistable statements.

proc sgplot data=stats40 dattrmap=attrmap noautolegend;
   refline ...
   scatter ...
   highlow ...
   yaxistable subgroup / ...
   *--- a second yaxis table at left ---;
   yaxistable countpct /
      location=inside
      position=left
      ;
   *--- adding yaxis table at right ---;
   yaxistable pcigroup group pvalue /
      location=inside
      position=right
      ;
   yaxis ...
run;

subgroup forest

Axis Modifications

Now we go to work on the axis.

subgroup forest

In SAS 9.4M3 you are allowed to use Unicode characters in formats. The Unicode characters below are arrows.

proc format;
   value $txt
     "T" = "Therapy Better (*ESC*){Unicode '2192'x}"  /* 9.4m3 */
     "P" = "(*ESC*){Unicode '2190'x} PCI Better"      /* 9.4m3 */
      ;                             
run;

We then manually specify the coordinates at which we want the formatted text above to appear. You could go nuts and write some macro code to automatically calculate the data range and position the text dynamically, but that seems like too much bother for an introductory example.

data hazratinterp;
   input x1 record text $;
   format text $txt.;
   datalines;
0.7 17 P
1.4 17 T
;
run;

data stats50;
   set stats40 hazratinterp (in=b);
run;
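
If you do want to position the text dynamically, one possible approach is sketched below. It is not part of the original example. It assumes that stats40 holds the CI bounds in low and high, that the bounds straddle 1, and that the $txt format has already been defined; the macro variable names xlo, xhi, and ytext are made up for this sketch.

```sas
*--- hypothetical alternative: derive the text coordinates from the data ---;
proc sql noprint;
   select min(low), max(high), max(record) + 1
      into :xlo trimmed, :xhi trimmed, :ytext trimmed
      from stats40;
quit;

data hazratinterp;
   length text $1;
   format text $txt.;
   record = &ytext;                            /* one row below the last record */
   x1 = (&xlo + 1) / 2; text = 'P'; output;    /* midway between lowest CI bound and 1 */
   x1 = (1 + &xhi) / 2; text = 'T'; output;    /* midway between 1 and highest CI bound */
run;
```

After running this, the stats50 data step would need to be rerun so that it picks up the recalculated coordinates.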

And now we modify the plot code.

  • We add nocycleattrs and nowall to the proc sgplot statement.
  • We add a styleattrs statement to limit where axis lines are drawn.
  • The refline statement draws a vertical reference line (shocking, I know).
  • The xaxis statement cleans up the axis.
  • The text statement draws our descriptive text.
  • And to get the text "Hazard Ratio" to appear at the top, we have to add a dummy scatter statement so that we can gain access to the x2axis statement.

proc sgplot data=stats50 
      dattrmap=attrmap 
      noautolegend 
      nocycleattrs
      nowall
      ;
   *--- remove box around plot ---;
   styleattrs 
      axisextent=data
      ;
   refline ...
   *--- add reference line ---;
   refline 1 /
      axis=x
      ;
   scatter ...
   highlow ...
   yaxistable ...
   yaxistable ...
   yaxistable ...
   yaxis ...
   *--- cleaner axis ---;
   xaxis 
      display=(nolabel) 
      ;
   *--- text above xaxis ---;
   text x=x1 y=record text=text / 
      position=bottom 
      contributeoffsets=none 
      strip
      ;
   *--- text above x2axis ---;
   scatter y=record x=mean / 
      markerattrs=(size=0) 
      x2axis
      ;
   x2axis 
      label='Hazard Ratio' 
      display=(noline noticks novalues) 
      ;
run;

subgroup forest

Style Modifications

Finally, we change the font to Courier New.

subgroup forest

It's our old friend PROC TEMPLATE. In addition to changing the font family, we also make the value and label text slightly smaller than the default values. It just looks better that way.

proc template;
   define style styles.forest;
      parent=styles.rtf;
      class GraphFonts /
         "GraphDataFont"  = ("Courier New",7pt)
         "GraphValueFont" = ("Courier New",8pt)
         "GraphLabelFont" = ("Courier New",8pt)
         ;
   end;
run;
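
The template only defines the style; it still has to be named on an ODS destination before it affects the graph. The sketch below shows one way that might look. The output file name and the graph dimensions are assumptions, and a trivial plot of sashelp.class stands in for the full proc sgplot step from the previous section.

```sas
*--- hypothetical file name; the style is applied via the STYLE= option ---;
ods rtf file="forest.rtf" style=styles.forest;
ods graphics / reset width=6.5in height=4.5in;   /* assumed dimensions */

*--- stand-in plot; substitute the forest plot code here ---;
proc sgplot data=sashelp.class;
   scatter x=height y=weight;
run;

ods rtf close;
```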

subgroup forest