Forest Plot with Subgroups
- The Goal
- Pre-summarized Dummy Data
- Forest Only
- Add Row Labels
- Format Row Labels
- Add Banding
- Add Statistics
- Axis Modifications
- Style Modifications
The Forest Plot with Subgroups will be developed in several steps.
- A succinct version of the code is available here.
- The inspiration for this page comes from the SAS blog Graphically Speaking.
On this page, we walk through the process of creating a forest plot with subgroups, complete with bold group headers, indentations, and alternate group banding. The end goal looks like this:
Because I cannot possibly predict the exact set of statistics or derivation methods that you are going to want to use in your forest plot, this example begins with pre-summarized dummy data.
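For orientation, here is a minimal sketch of the shape the pre-summarized data might take. The variable names are the ones the later steps on this page refer to (record, level, subgroup, countpct, mean, low, high, pcigroup, group, pvalue). The dataset name stats00_sketch, every value in it, and the choice of which rows carry which statistics are made up for illustration and are not the contents of derive.stats00.

```sas
*--- hypothetical stand-in for derive.stats00; all values are invented ---;
data stats00_sketch;
infile datalines dsd truncover;
length subgroup $20 countpct $12;
*--- record = row position, level = 1 for group headers and 2 for subgroups ---;
input record level subgroup $ countpct $ mean low high pcigroup group pvalue;
datalines;
1,1,Overall,2166 (100),1.3,0.9,1.5,17.2,15.6,.
2,1,Age,,.,.,.,.,.,0.05
3,2,<= 65 Yr,1534 (71),1.5,1.05,1.9,17.0,13.2,.
4,2,> 65 Yr,632 (29),0.8,0.6,1.25,17.8,21.3,.
;
run;
```

To experiment with the steps that follow, you could read this dataset in the first data step in place of derive.stats00.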
We begin by producing a forest plot with nothing but hazard ratio estimates and CIs.
The estimates appear courtesy of a scatter statement and the CIs by way of highlow. We use the reverse option so that Overall shows up at the top.
data stats10;
set derive.stats00;
run;
proc sgplot data=stats10 noautolegend;
*--- estimates and CIs ---;
scatter y=record x=mean /
markerattrs=(symbol=squarefilled)
;
highlow y=record low=low high=high;
*--- primary axes ---;
yaxis
reverse
;
run;
Next we add the row labels for each estimate.
This is accomplished using yaxistable. We also throw a couple of options at the yaxis statement to help with the cosmetics.
proc sgplot data=stats10 noautolegend;
*--- estimates and CIs ---;
scatter y=record x=mean /
markerattrs=(symbol=squarefilled)
;
highlow y=record low=low high=high;
*--- adding yaxis table at left ---;
yaxistable subgroup /
location=inside
position=left
;
*--- primary axes ---;
yaxis
reverse
display=none
offsetmin=0
;
run;
Next, we make the row labels look a bit nicer.
First we will make the group labels bold using a discrete attribute map (dattrmap, textgroup, and textgroupid). Second we will indent the subgroup labels beneath their group headers (indentweight).
*--- create discrete attribute map dataset ---;
data attrmap;
input id $ value textcolor $ textsize textweight $;
datalines;
text 1 black 7 bold
text 2 black 5 normal
;
run;
*--- flag records for indention ---;
data stats20;
set stats10;
if level=1 then
indentWt = 0;
else
indentWt = 1;
run;
proc sgplot data=stats20 dattrmap=attrmap noautolegend;
scatter ...
highlow ...
*--- adding yaxis table at left ---;
yaxistable subgroup /
location=inside
position=left
textgroup=level
textgroupid=text
indentweight=indentWt /* 9.4m3 */
;
*--- primary axes ---;
yaxis ...
run;
In this step we add banding to every other group.
We want the banding to cover both the row labels and the corresponding estimates and CIs. Unfortunately we have to add the banding separately for each piece. To get the banding for the estimates and CIs we add a new variable band to our dataset and create a refline to go with it. To get the banding for the row labels we use the yaxis option colorbands.
*--- flag records for banding ---;
data stats30;
set stats20;
retain levelcount 0;
if level = 1 then levelcount + 1;
if mod(levelcount,2) = 0 then
band = record;
run;
proc sgplot data=stats30 dattrmap=attrmap noautolegend;
*--- banding and reference line ---;
refline band /
lineattrs=(thickness=26 color=cxf0f0f7)
;
scatter ...
highlow ...
yaxistable subgroup / ...
*--- primary axes ---;
yaxis
reverse
display=none
offsetmin=0
colorbands=odd
colorbandsattrs=(transparency=1)
;
run;
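The flagging logic is easier to follow with the data in front of you. The counter levelcount goes up by one at every group header (level = 1), and band is populated with the row position only for rows that belong to an even-numbered group. Rows in odd-numbered groups keep a missing band, so no reference line is drawn for them. A quick listing, using only variables already created above, lets you check this:

```sas
*--- list the banding flags to confirm which rows will be shaded ---;
proc print data=stats30 noobs;
var record subgroup level levelcount band;
run;
```

Rows where band is non-missing are the ones that receive a thick reference line behind the estimate and CI.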
In this step we add lots of statistics to the output.
First we format the statistics so that dots are suppressed for missing values.
proc format;
value notdot
. = ' '
other = [best.]
;
run;
data stats40;
set stats30;
format pcigroup group pvalue notdot.;
run;
Then it's just a matter of a couple of additional yaxistable statements.
proc sgplot data=stats40 dattrmap=attrmap noautolegend;
refline ...
scatter ...
highlow ...
yaxistable subgroup / ...
*--- a second yaxis table at left ---;
yaxistable countpct /
location=inside
position=left
;
*--- adding yaxis table at right ---;
yaxistable pcigroup group pvalue /
location=inside
position=right
;
yaxis ...
run;
Now we go to work on the axis.
In SAS 9.4M3 you are allowed to use Unicode characters in format labels. The Unicode characters below are arrows.
proc format;
value $txt
"T" = "Therapy Better (*ESC*){Unicode '2192'x}" /* 9.4m3 */
"P" = "(*ESC*){Unicode '2190'x} PCI Better" /* 9.4m3 */
;
run;
We then manually specify the coordinates at which we want the above unicode formats to appear. You could go nuts and write some macro code to automatically calculate the data range and position the text dynamically, but that seems like too much bother for an introductory example.
data hazratinterp;
input x1 record text $;
format text $txt.;
datalines;
0.7 17 P
1.4 17 T
;
run;
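If you would rather not hard-code the row number, one possible halfway step is to derive it from the data. The sketch below assumes the text belongs one row below the last record in stats40; that placement rule and the macro variable name textrow are my assumptions. The x coordinates remain manual.

```sas
*--- assumption: the labels sit one row below the last data row ---;
proc sql noprint;
select max(record) + 1 into :textrow trimmed
from stats40;
quit;

*--- same dataset as above, with record filled in from the macro variable ---;
data hazratinterp;
input x1 text $;
record = &textrow;
format text $txt.;
datalines;
0.7 P
1.4 T
;
run;
```

This version replaces the hard-coded hazratinterp step above rather than adding to it.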
data stats50;
set stats40 hazratinterp (in=b);
run;
And now we modify the plot code.
- We add nocycleattrs and nowall to the proc sgplot statement.
- We add a styleattrs statement to limit where axis lines are drawn.
- The refline statement draws a vertical reference line (shocking I know).
- The xaxis statement cleans up the axis.
- The text statement draws our descriptive text.
- And to get the text "Hazard Ratio" to appear at top we have to add a dummy scatter statement so that we can gain access to the x2axis statement.
proc sgplot data=stats50
dattrmap=attrmap
noautolegend
nocycleattrs
nowall
;
*--- remove box around plot ---;
styleattrs
axisextent=data
;
refline ...
*--- add reference line ---;
refline 1 /
axis=x
;
scatter ...
highlow ...
yaxistable ...
yaxistable ...
yaxistable ...
yaxis ...
*--- cleaner axis ---;
xaxis
display=(nolabel)
;
*--- text above xaxis ---;
text x=x1 y=record text=text /
position=bottom
contributeoffsets=none
strip
;
*--- text above x2axis ---;
scatter y=record x=mean /
markerattrs=(size=0)
x2axis
;
x2axis
label='Hazard Ratio'
display=(noline noticks novalues)
;
run;
Finally, we change the font to Courier New.
It's our old friend PROC TEMPLATE. In addition to changing the font family, we also make the value and label text slightly smaller than the default values. It just looks better that way.
proc template;
define style styles.forest;
parent=styles.rtf;
class GraphFonts /
"GraphDataFont" = ("Courier New",7pt)
"GraphValueFont" = ("Courier New",8pt)
"GraphLabelFont" = ("Courier New",8pt)
;
end;
run;
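Defining the style does nothing until an ODS destination asks for it. Here is a minimal sketch of how that might look, assuming RTF output (the style's parent is styles.rtf); the file name forest.rtf is hypothetical.

```sas
*--- assumption: RTF output to a hypothetical file name ---;
ods rtf file="forest.rtf" style=styles.forest;

*--- the final proc sgplot step from the previous section goes here ---;

ods rtf close;
```

Any graph produced between the two ods statements picks up the Courier New fonts defined in the style.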