Skip to content
Fetching contributors…
Cannot retrieve contributors at this time
162 lines (141 sloc) 5.87 KB
\title{Bars, rectangles with bases on x-axis}
geom_bar(mapping = NULL, data = NULL, stat = "bin",
position = "stack", ...)
\item{mapping}{The aesthetic mapping, usually constructed
with \code{\link{aes}} or \code{\link{aes_string}}. Only
needs to be set at the layer level if you are overriding
the plot defaults.}
\item{data}{A layer specific dataset - only needed if you
want to override the plot defaults.}
\item{stat}{The statistical transformation to use on the
data for this layer.}
\item{position}{The position adjustment to use for
overlappling points on this layer}
\item{...}{other arguments passed on to
\code{\link{layer}}. This can include aesthetics whose
values you want to set, not map. See \code{\link{layer}}
for more details.}
The bar geom is used to produce 1d area plots: bar charts
for categorical x, and histograms for continuous y.
stat_bin explains the details of these summaries in more
detail. In particular, you can use the \code{weight}
aesthetic to create weighted histograms and barcharts
where the height of the bar no longer represent a count
of observations, but a sum over some other variable. See
the examples for a practical example.
The heights of the bars commonly represent one of two
things: either a count of cases in each group, the values
in a column of the data frame. By default,
\code{geom_bar} uses \code{stat="bin"}. This makes the
height of each bar equal to the number of cases in each
group, and it is incompatible with mapping values to the
\code{y} aesthetic. If you want the heights of the bars
to represent values in the data, use
\code{stat="identity"} and map a value to the \code{y}
By default, multiple x's occuring in the same place will
be stacked a top one another by position_stack. If you
want them to be dodged from side-to-side, see
\code{\link{position_dodge}}. Finally,
\code{\link{position_fill}} shows relative propotions at
each x by stacking the bars and then stretching or
squashing to the same height.
Sometimes, bar charts are used not as a distributional
summary, but instead of a dotplot. Generally, it's
preferable to use a dotplot (see \code{geom_point}) as it
has a better data-ink ratio. However, if you do want to
create this type of plot, you can set y to the value you
have calculated, and use \code{stat='identity'}
A bar chart maps the height of the bar to a variable, and
so the base of the bar must always been shown to produce
a valid visual comparison. Naomi Robbins has a nice
on this topic}. This is the reason it doesn't make sense
to use a log-scaled y axis with a bar chart
# Generate data
c <- ggplot(mtcars, aes(factor(cyl)))
# By default, uses stat="bin", which gives the count in each category
c + geom_bar()
c + geom_bar(width=.5)
c + geom_bar() + coord_flip()
c + geom_bar(fill="white", colour="darkgreen")
# Use qplot
qplot(factor(cyl), data=mtcars, geom="bar")
qplot(factor(cyl), data=mtcars, geom="bar", fill=factor(cyl))
# When the data contains y values in a column, use stat="identity"
# Calculate the mean mpg for each level of cyl
mm <- ddply(mtcars, "cyl", summarise, mmpg = mean(mpg))
ggplot(mm, aes(x = factor(cyl), y = mmpg)) + geom_bar(stat = "identity")
# Stacked bar charts
qplot(factor(cyl), data=mtcars, geom="bar", fill=factor(vs))
qplot(factor(cyl), data=mtcars, geom="bar", fill=factor(gear))
# Stacked bar charts are easy in ggplot2, but not effective visually,
# particularly when there are many different things being stacked
ggplot(diamonds, aes(clarity, fill=cut)) + geom_bar()
ggplot(diamonds, aes(color, fill=cut)) + geom_bar() + coord_flip()
# Faceting is a good alternative:
ggplot(diamonds, aes(clarity)) + geom_bar() +
facet_wrap(~ cut)
# If the x axis is ordered, using a line instead of bars is another
# possibility:
ggplot(diamonds, aes(clarity)) +
geom_freqpoly(aes(group = cut, colour = cut))
# Dodged bar charts
ggplot(diamonds, aes(clarity, fill=cut)) + geom_bar(position="dodge")
# compare with
ggplot(diamonds, aes(cut, fill=cut)) + geom_bar() +
facet_grid(. ~ clarity)
# But again, probably better to use frequency polygons instead:
ggplot(diamonds, aes(clarity, colour=cut)) +
geom_freqpoly(aes(group = cut))
# Often we don't want the height of the bar to represent the
# count of observations, but the sum of some other variable.
# For example, the following plot shows the number of diamonds
# of each colour
qplot(color, data=diamonds, geom="bar")
# If, however, we want to see the total number of carats in each colour
# we need to weight by the carat variable
qplot(color, data=diamonds, geom="bar", weight=carat, ylab="carat")
# A bar chart used to display means
meanprice <- tapply(diamonds$price, diamonds$cut, mean)
cut <- factor(levels(diamonds$cut), levels = levels(diamonds$cut))
qplot(cut, meanprice)
qplot(cut, meanprice, geom="bar", stat="identity")
qplot(cut, meanprice, geom="bar", stat="identity", fill = I("grey50"))
# Another stacked bar chart example
k <- ggplot(mpg, aes(manufacturer, fill=class))
k + geom_bar()
# Use scales to change aesthetics defaults
k + geom_bar() + scale_fill_brewer()
k + geom_bar() + scale_fill_grey()
# To change plot order of class varible
# use factor() to change order of levels
mpg$class <- factor(mpg$class, levels = c("midsize", "minivan",
"suv", "compact", "2seater", "subcompact", "pickup"))
m <- ggplot(mpg, aes(manufacturer, fill=class))
m + geom_bar()
\code{\link{stat_bin}} for more details of the binning
alogirithm, \code{\link{position_dodge}} for creating
side-by-side barcharts, \code{\link{position_stack}} for
more info on stacking,
Jump to Line
Something went wrong with that request. Please try again.