Graphs, and _directed acyclic graphs_ ("DAGs") in particular, are widely used to represent causal models.  A causal model consists of hypotheses about a data generating process. 

What follows is a brief introduction to DAGs.  For our purposes here, variables will be represented as letters like X and Y.

### What's A DAG?

A DAG is a graph that has no loops or "cycles" in it.  Let X and Y be variables.  In a graph they might be called _nodes_.  Imagine that X has a _direct effect_ on Y.  We can represent this like:  

\begin{align}
X \to Y
\end{align}  

The arrow is a *directed edge* in this graph.  It's like a one way street. 

A way of thinking about $X \to Y$ that J. Pearl has suggested is that Y "listens to X" to determine what value it should have, like $Y = y_i$ if $X = x_i$.  

This is *not* a DAG: 

\begin{align}
X \to Y \\
X \leftarrow Y 
\end{align}

A DAG can't have loops.
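
The "no loops" rule can be checked mechanically. Here is a minimal sketch, assuming a graph is written as a list of `(parent, child)` pairs; the function name `is_dag` and that edge-list format are illustrative choices, not part of any particular library. It uses Kahn's algorithm: keep removing nodes that have no incoming arrows, and if some nodes can never be removed, they sit on a cycle.

```python
from collections import defaultdict, deque

def is_dag(edges):
    """Return True if the directed graph given by (parent, child) pairs has no cycles."""
    children = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
        nodes.update((parent, child))

    # Kahn's algorithm: repeatedly remove nodes with no incoming arrows.
    queue = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    # Nodes on a cycle never reach in-degree zero, so they are never removed.
    return removed == len(nodes)

print(is_dag([("X", "Y")]))              # True
print(is_dag([("X", "Y"), ("Y", "X")]))  # False
```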

### Chains, Forks, Colliders

Here, Z is another variable.  

A *chain*:

\begin{align}
Z \to X \to Y
\end{align}

A *fork*:

\begin{align}
X \leftarrow Z \to Y
\end{align}

A *collider*:
    
\begin{align}
X \to Z \leftarrow Y
\end{align}

You may recognize these as Wright's (1921) basic structures for path models.  Each has implications for the total and direct effects that can be estimated given a model that includes them.
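
To make the three structures concrete, here is a small sketch that labels a three-node path by looking at the arrows touching its middle node. The helper `classify_triple` and the `(parent, child)` edge-list format are assumptions made for illustration.

```python
def classify_triple(edges, a, m, b):
    """Classify the path a - m - b by the arrows that touch the middle node m."""
    e = set(edges)
    if ((a, m) in e and (m, b) in e) or ((b, m) in e and (m, a) in e):
        return "chain"      # arrows pass through m in one direction
    if (m, a) in e and (m, b) in e:
        return "fork"       # both arrows point out of m
    if (a, m) in e and (b, m) in e:
        return "collider"   # both arrows point into m
    return None             # a, m, b do not form a connected path through m

print(classify_triple([("Z", "X"), ("X", "Y")], "Z", "X", "Y"))  # chain
print(classify_triple([("Z", "X"), ("Z", "Y")], "X", "Z", "Y"))  # fork
print(classify_triple([("X", "Z"), ("Y", "Z")], "X", "Z", "Y"))  # collider
```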

### Mediation

In equation (4), above, X fully mediates the effect of Z on Y.  

Here's an example of _partial_ mediation.  Z partially mediates X's effect on Y.  

\begin{align}
X \to Y \\
X \to Z \\
Z \to Y
\end{align}

An alternative representation of the same model:  

\begin{align}
X \to Y,Z \\
Z \to Y
\end{align}

### Confounder

Z would be a confounder of the direct effect of X on Y:  

\begin{align}
X \to Y \\
Z \to X \\
Z  \to Y 
\end{align}  

Note that Z is a fork node in this model's graph, like in (5).
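
A quick simulation can show why a confounder matters. This is only a sketch under loud assumptions: the relationships are linear, the noise is Gaussian, and the coefficients (1.0 for X on Y, 1.0 for Z on X, 2.0 for Z on Y) are made up for illustration. The helpers `slope` and `residuals` are hypothetical names. Regressing Y on X alone mixes X's effect with Z's; removing Z's linear contribution from both X and Y first recovers something close to the assumed 1.0.

```python
import random

def slope(x, y):
    """Least-squares slope of y on x (one predictor, with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """What is left of y after a least-squares regression on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 20000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]                    # Z -> X
y = [1.0 * xi + 2.0 * zi + rng.gauss(0, 1)                # X -> Y and Z -> Y
     for xi, zi in zip(x, z)]

naive = slope(x, y)                                  # ignores Z
adjusted = slope(residuals(z, x), residuals(z, y))   # adjusts for Z
print(round(naive, 2), round(adjusted, 2))
```

Under these assumptions the naive slope should land near 2 rather than 1, because half of the variation in X comes from Z, and Z also pushes Y.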

### Parents, Children, Ancestors, and Descendants

A DAG can have a large extended family.  

In (7), above, X is a *parent* of Y, and Y is a *child* of Z.  

In the chain in (4), Z is an ancestor of Y, and Y a descendant of Z.
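
These family relations are easy to compute from a list of arrows. A minimal sketch, assuming the graph is a list of `(parent, child)` pairs; the function names are illustrative. Ancestors are found by climbing from parents to their parents, and descendants by going the other way.

```python
def parents(edges, node):
    """Nodes with an arrow pointing directly into `node`."""
    return {p for p, c in edges if c == node}

def children(edges, node):
    """Nodes that `node` points directly into."""
    return {c for p, c in edges if p == node}

def _reach(edges, node, step):
    """Collect every node reachable from `node` by repeatedly applying `step`."""
    found = set()
    stack = list(step(edges, node))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(step(edges, current))
    return found

def ancestors(edges, node):
    return _reach(edges, node, parents)

def descendants(edges, node):
    return _reach(edges, node, children)

chain = [("Z", "X"), ("X", "Y")]       # Z -> X -> Y
print(sorted(parents(chain, "Y")))     # ['X']
print(sorted(ancestors(chain, "Y")))   # ['X', 'Z']
print(sorted(descendants(chain, "Z"))) # ['X', 'Y']
```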

### Endogenous and Exogenous Variables

A variable in a graph that has no arrows coming into it is _exogenous_.  One that has one or more arrows into it is _endogenous_.
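
In code, this distinction comes down to asking whether a node ever appears at the head of an arrow. A short sketch, again assuming a `(parent, child)` edge list and a hypothetical helper name:

```python
def split_exogenous_endogenous(edges):
    """Return (exogenous, endogenous) node sets for a list of (parent, child) pairs."""
    nodes = {n for edge in edges for n in edge}
    endogenous = {child for _, child in edges}   # at least one arrow comes in
    exogenous = nodes - endogenous               # no arrows come in
    return exogenous, endogenous

# X -> Y, Z -> X, Z -> Y
exo, endo = split_exogenous_endogenous([("X", "Y"), ("Z", "X"), ("Z", "Y")])
print(sorted(exo))   # ['Z']
print(sorted(endo))  # ['X', 'Y']
```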

### Unobserved Variables

Variables for which there is no data available can be included in a DAG.  Observed variables can depend on them, that is, be endogenous with respect to them.  An error term for a regression model is an example.  

There can also be unobserved, or _latent_, variables that are not errors, per se.  An example is a latent variable in a structural equation model (SEM) that is represented by a _measurement model_ consisting of _manifest, fallible indicator variables_: variables like those that, by design, load on a specific factor in confirmatory factor analysis.

**EXERCISE!**

Using notation like that in the examples above, specify graphs for the following models.  No need to use LaTeX; just use -> or <- for your arrows.  Each numbered item below (1., 2., etc.) specifies a model.

1. X1 and X2 have direct effects on Y  
2. X1 has a direct effect on Y1, and X2 has a direct effect on Y2
3. Z1 has a direct effect on both X and Y, X has a direct effect on Y, X has a direct effect on Z2, and Z2 has a direct effect on Y.