# <center>Applications with Machine Learning </center>

<center>by Jiaoping Chen</center>

I am interested in how to do causal inference with machine learning approaches. One of the most popular causal estimation tasks is to estimate the **Average Treatment Effect (ATE)** from a finite data sample. That is, given the focal independent (treatment) variable $T_i$ and the response $Y_i$, if there are confounding variable(s) $W_i$ that affect the treatment and the outcome at the same time, then we can identify the **causal effect** of $T$ on $Y$ *if and only if* we control for that set of confounding variable(s).

![image.png](attachment:image.png)
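
To make the role of $W$ concrete, here is a minimal simulation sketch. It assumes a single binary confounder and a linear outcome, and all numbers (sample size, probabilities, effect sizes) are made up for illustration. It compares the naive difference in means with the adjusted estimate $\mathbb{E}_W\big[\mathbb{E}[Y \mid T=1, W] - \mathbb{E}[Y \mid T=0, W]\big]$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Binary confounder W raises both the chance of treatment and the outcome.
w = rng.binomial(1, 0.5, size=n)
t = rng.binomial(1, np.where(w == 1, 0.8, 0.2))
true_ate = 1.0  # assumed ground truth for this toy example
y = true_ate * t + 2.0 * w + rng.normal(0.0, 1.0, size=n)

# Naive contrast: ignores W, so it absorbs part of W's effect on Y.
naive = y[t == 1].mean() - y[t == 0].mean()

# Adjustment: contrast treated vs. untreated within each stratum of W,
# then average the stratum contrasts with weights P(W = w).
adjusted = 0.0
for level in (0, 1):
    mask = w == level
    diff = y[mask & (t == 1)].mean() - y[mask & (t == 0)].mean()
    adjusted += mask.mean() * diff

print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}, truth: {true_ate:.2f}")
```

In this setup the naive contrast is pulled well away from the true effect because treated units are more likely to have $W=1$, while the stratified estimate should land close to the truth.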

However, **what if the confounder is a text document?** For example, suppose we want to know the direct effect of an author's gender label on the popularity of a tweet (where popularity can be defined as the number of retweets or likes). The author's gender may affect the text of the tweet, e.g., the choice of topic, and the text in turn affects popularity. In theory, we can assume that the text carries sufficient information to identify the causal effect and then adjust for the text. In practice, we only have finite samples and the text is high dimensional, which makes accurate and efficient causal inference hard. So the challenge is **how to reduce the text to a low-dimensional representation that both suffices for causal identification and allows effective estimation.** In the following sections I list two areas related to this kind of causal inference.

**1) Causal Inference for Text**
(Roberts 2018) [1] assumes that the topics learned from the text reflect its confounding aspects, and therefore reduces the dimension of the text with topic modeling techniques. (Louizos 2017) [2] assumes that there is an *observed proxy* for the unobserved confounder and fits a variational autoencoder to the observed text data. (Miao 2018) [3] also works with an observed proxy of the confounder and further assumes that the text only partially captures the confounding, an approach that belongs to infinite-sample estimation.

(Veitch 2019) [4] adopts a text embedding method that distills the text of each document into a real-valued vector, which is then used as features for prediction problems. The paper uses a modern embedding method, **BERT**, to extract the information in the text that is relevant to predicting the treatment and the outcome. The authors conclude that the learned embeddings can effectively extract this predictive information, which is sufficient for causal identification.
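
The idea of adjusting for an embedding instead of the raw text can be sketched as follows. This is only an illustration under strong assumptions: the "embeddings" are random vectors standing in for BERT output, the treatment and outcome are simulated, the propensity and outcome models are plain logistic and linear regressions rather than neural networks, and the augmented inverse-probability-weighted (AIPW) estimator is one standard plug-in choice, not necessarily the exact estimator of [4].

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 20_000, 8

# Hypothetical stand-in for document embeddings (e.g. BERT output);
# real embeddings would come from a trained language model.
emb = rng.normal(size=(n, d))

# Simulated treatment and outcome that both depend on the embedding.
beta_t = np.full(d, 0.3)
beta_y = np.full(d, 0.5)
true_ate = 2.0  # assumed ground truth for this toy example
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-(emb @ beta_t))))
y = true_ate * t + emb @ beta_y + rng.normal(size=n)

def add_intercept(x):
    return np.column_stack([np.ones(len(x)), x])

def fit_logistic(x, target, steps=25):
    """Logistic regression fitted by Newton's method."""
    xb = add_intercept(x)
    coef = np.zeros(xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(xb @ coef)))
        grad = xb.T @ (p - target)
        hess = (xb * (p * (1.0 - p))[:, None]).T @ xb
        coef -= np.linalg.solve(hess, grad)
    return coef

def fit_linear(x, target):
    coef, *_ = np.linalg.lstsq(add_intercept(x), target, rcond=None)
    return coef

# Propensity model g(emb) = P(T = 1 | emb), clipped away from 0 and 1.
g = 1.0 / (1.0 + np.exp(-(add_intercept(emb) @ fit_logistic(emb, t))))
g = np.clip(g, 0.01, 0.99)

# Outcome models Q(t, emb) = E[Y | T = t, emb], one per treatment arm.
q1 = add_intercept(emb) @ fit_linear(emb[t == 1], y[t == 1])
q0 = add_intercept(emb) @ fit_linear(emb[t == 0], y[t == 0])

# AIPW (doubly robust) estimate of the ATE, and the naive contrast.
aipw = np.mean(q1 - q0 + t * (y - q1) / g - (1 - t) * (y - q0) / (1 - g))
naive = y[t == 1].mean() - y[t == 0].mean()
print(f"naive: {naive:.2f}, AIPW: {aipw:.2f}, truth: {true_ate:.2f}")
```

The point of the sketch is the division of labor: the embedding only needs to be good enough to predict the treatment and the outcome, and the causal estimate is then assembled from those two predictions.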

**2) Causal Inference for Text and Network Structure**
Another novel idea for causal inference is that, even though it is not easy to observe all confounders, observational data can come with network information that can be used to infer hidden confounders. For example, socioeconomic status can be challenging to measure and thus becomes a hidden confounder. But the socioeconomic status of an individual can be reflected by whom she is connected to in a social network. (Guo 2020) [5] proposes the following causal diagram, called "the network deconfounder", to learn representations that unravel patterns of hidden confounders from the network information. The hidden confounder $h$ can be inferred from the adjacency matrix $A$ of the network and the observed text embeddings $x$. In other words, this framework learns representations of confounders by mapping the original features as well as the network structure into a shared latent feature space.

![image.png](attachment:image.png)
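
One way to map features and network structure into a shared latent space is a graph-convolution layer. The sketch below assumes that choice and shows a single layer on a toy graph. The graph, the feature values, and the weights are all made up, and the weights are random and untrained, so the output only illustrates how $A$ and $x$ are combined into a representation $h$, not a fitted deconfounder.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, k = 6, 4, 3  # nodes, feature dimension, representation dimension

# Toy undirected network: symmetric adjacency matrix A without self-loops.
A = np.zeros((n, n))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]:
    A[i, j] = A[j, i] = 1.0

# Toy node features x (stand-in for observed text embeddings).
x = rng.normal(size=(n, d))

def gcn_layer(adj, feats, weight):
    """One graph-convolution layer: ReLU(D^-1/2 (A + I) D^-1/2 x W)."""
    a_hat = adj + np.eye(len(adj))        # add self-loops
    deg = a_hat.sum(axis=1)               # degrees including self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ feats @ weight, 0.0)

weight = rng.normal(size=(d, k))  # untrained, random weights for illustration
h = gcn_layer(A, x, weight)       # row i mixes node i's features with its neighbors'
print(h.shape)
```

In a full model, a representation like `h` would be trained jointly with outcome predictors so that it captures the confounding information carried by each node's neighborhood.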

---
# References

[1] Roberts, Margaret E., Brandon M. Stewart, and Richard A. Nielsen. "Adjusting for confounding with text matching." American Journal of Political Science 64.4 (2020): 887-903.
https://onlinelibrary.wiley.com/doi/full/10.1111/ajps.12526

[2] Louizos, Christos, et al. "Causal effect inference with deep latent-variable models." Advances in Neural Information Processing Systems. 2017.
https://arxiv.org/pdf/1705.08821.pdf

[3] Miao, Wang, Zhi Geng, and Eric J. Tchetgen Tchetgen. "Identifying causal effects with proxy variables of an unmeasured confounder." Biometrika 105.4 (2018): 987-993.
https://doi.org/10.1093/biomet/asy038

[4] Veitch, Victor, Dhanya Sridhar, and David Blei. "Using text embeddings for causal inference." 2019.
https://www.researchgate.net/publication/333505486_Using_Text_Embeddings_for_Causal_Inference/citation/download

[5] Guo, Ruocheng, Jundong Li, and Huan Liu. "Learning individual causal effects from networked observational data." WSDM 2020 - Proceedings of the 13th International Conference on Web Search and Data Mining. 2020, pp. 232-240.
https://doi.org/10.1145/3336191.3371816