Commit

Update description of Collider Bias Example
PhilippBach committed Aug 25, 2022
1 parent 5f52bf2 commit c84a149
Showing 1 changed file with 5 additions and 4 deletions.
9 changes: 5 additions & 4 deletions ui.R
@@ -60,11 +60,12 @@ border-top-color:#ffda3e;
 tabItems(
 tabItem(tabName = "collider",
 h2("Collider Bias: Movie Star Example"),
-p("This app illustrates the Movie Star Example from Cunningham (2021, Section 3.6.1). A CNN blogpost reported that Megan Fox was voted the worst but at the same time the most attractive star in 2009. This result might lead to the more general question: Are more attractive actors less talented? "),
+p("This app illustrates collider bias in the context of the Movie Star Example from Cunningham (2021, Section 3.6.1). A CNN blog post reported that Megan Fox was voted the worst but, at the same time, the most attractive star of 2009. This suggests the question: Are more attractive actors generally less talented?"),
 h3("Are more Attractive Actors less Talented?"),
-p("We revisit this question and try to disentangle the causal mechanisms using a DAG and a simulated data example. Consider the two characteristics 'talent' and 'beauty'. We simulate a data set such that there is no causal relationship between these two variables, i.e., they are stochastically independent. However, if we condition on a collider, i.e., a variable that is causally affected by 'talent' and 'beauty', we find a significant negative correlation between these two characteristics. This is the so-called Collider Bias. In the movie star example, such a collider could be a variable that indicates whether a person is a star or not ('star'): More attractive persons and more talented persons probably have a higher chance to become a movie star. Once, we condition our analysis on a sample of movie stars only, one might be tempted to conclude that more attractive actors are less talented."),
-p("The scatter plot below shows the data points according to the chosen sample selection, i.e., show the entire population or movie stars only. Take a look at the DAG that illustrates whether the variables 'beauty' and 'talent' are d-connected or d-separated. The regression output shows whether this result would also be obtained in an empirical analysis based on the simulated data example."),
-h3("Sample Selection and Scatter Plot"),
+p("We revisit this question and try to disentangle the causal mechanisms using a DAG and a simulated data example. Consider the two characteristics 'talent' and 'beauty'. We simulate a data set in which there is no causal relationship between these two variables and they are stochastically independent. However, if we condition on a collider, i.e., a variable that is causally affected by both 'talent' and 'beauty', we find a significant negative correlation between these two characteristics. This is the so-called collider bias. In the movie star example, such a collider could be a variable that indicates whether a person is a star or not ('star'): More attractive and more talented people probably have a higher chance of becoming a movie star. Hence, if we base our analysis on a sample of movie stars only, we might draw conclusions that do not hold for the entire population."),
+h3("Data Example: Scatter Plot, Causal Diagram, Regression Output"),
+p("The scatter plot below shows the data points according to the specified sample selection, i.e., either the entire population or the subsample of movie stars only. The Directed Acyclic Graph (DAG) on the right illustrates the causal relationships in the movie star example. It tells us whether we can expect to find a significant association in an empirical example. If we condition on the collider, the variables 'talent' and 'beauty' are 'd-connected': We will probably find a correlation between these variables in a data example. If they are 'd-separated', they will probably not be correlated. This is reflected in the regression output shown below, which reports the coefficient estimate and confidence interval from a linear regression of 'beauty' on 'talent' in the simulated data set."),
+
 fluidRow(
 box(
 fluidRow(
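The mechanism described in the app text can be reproduced outside Shiny with a few lines of R. The following is only a minimal sketch under stated assumptions: the variable names, the sample size, the seed, and the selection rule (stars are the top 15% of 'beauty' + 'talent') are hypothetical choices for illustration and are not taken from the app's server code.

```r
# Hypothetical sketch of the movie star simulation; all names, the seed,
# the sample size and the selection rule are assumptions for illustration.
set.seed(1234)
n <- 5000

# Independent traits: no causal link between beauty and talent.
beauty <- rnorm(n)
talent <- rnorm(n)

# Collider: star status is caused by both traits
# (assumed rule: top 15% of the combined score).
score <- beauty + talent
star <- score > quantile(score, 0.85)

sim <- data.frame(beauty, talent, star)

# Regress beauty on talent in the full population and among stars only.
fit_all  <- lm(beauty ~ talent, data = sim)
fit_star <- lm(beauty ~ talent, data = sim[sim$star, ])

slope_all  <- unname(coef(fit_all)["talent"])
slope_star <- unname(coef(fit_star)["talent"])

# 95% confidence intervals for the slope on talent.
ci_all  <- confint(fit_all)["talent", ]
ci_star <- confint(fit_star)["talent", ]

print(round(c(all = slope_all, stars = slope_star), 3))
print(round(rbind(all = ci_all, stars = ci_star), 3))
```

In the full sample the slope should be close to zero, whereas in the stars-only subsample it should be clearly negative, which corresponds to the d-separated and d-connected cases described in the app text.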
