Hi, I'm Alex 👋

I'm a PhD student in the University of Wisconsin-Madison statistics program. My GitHub is a mixture of research code, #rstats contributions, and personal data analysis projects. I write long-form explainers on my blog. At the moment most of my energy goes to research.

Research & research code

I study community detection in networks with Karl Rohe, primarily using spectral methods. My research code is all in R at the moment, and none of it is on CRAN, primarily because my research packages are only about 75 percent done; the key functionality works but documentation, tests, and other user-friendly elements may be missing.

  • aPPR helps you calculate approximate personalized PageRanks from large graphs, including those that can only be queried via an API. aPPR additionally performs degree correction and regularization, allowing users to recover blocks from stochastic blockmodels. It leverages socialsampler and twittercache to cache targeted network samples as the personalized PageRank calculations run. Read the paper.

  • vsp performs semi-parametric estimation of latent factors in random dot product graphs by computing varimax rotations of the spectral embeddings of graphs. The resulting factors are sparse and interpretable. Read the paper.

  • fastRG samples random dot product graphs much faster than naive sampling procedures and is especially useful when running simulation studies. See the paper for a description of the fastRG core algorithm.

  • fastadi performs self-tuning matrix completion via adaptive thresholding, often outperforming softImpute. See the paper for algorithmic and theoretical details. I have also extended this algorithm to work with matrices where the entire upper triangle is observed as part of some work on citation networks.
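To give a rough sense of the quantity aPPR targets, here is a minimal Python sketch of personalized PageRank computed by plain power iteration on a toy graph. It is not aPPR's algorithm or API: aPPR computes an approximation, and it also applies degree correction and regularization, neither of which is shown here. The function name `personalized_pagerank`, its parameters, and the example graph are all hypothetical and chosen only for illustration.

```python
def personalized_pagerank(adjacency, seed, alpha=0.15, tol=1e-10, max_iter=1000):
    """Personalized PageRank by power iteration (illustrative sketch only).

    adjacency: dict mapping each node to a list of its out-neighbors
    seed:      node that the random walk teleports back to
    alpha:     probability of teleporting to the seed at each step
    """
    nodes = list(adjacency)
    p = {v: 0.0 for v in nodes}
    p[seed] = 1.0  # the walk starts at the seed

    for _ in range(max_iter):
        nxt = {v: 0.0 for v in nodes}
        nxt[seed] = alpha  # teleportation mass (p always sums to 1)
        for u in nodes:
            neighbors = adjacency[u]
            if neighbors:
                # spread the non-teleporting mass evenly over out-neighbors
                share = (1 - alpha) * p[u] / len(neighbors)
                for v in neighbors:
                    nxt[v] += share
            else:
                # dangling node: return its mass to the seed
                nxt[seed] += (1 - alpha) * p[u]
        converged = sum(abs(nxt[v] - p[v]) for v in nodes) < tol
        p = nxt
        if converged:
            break
    return p


# Toy graph: two triangles joined by the edge c-d (undirected, so both directions listed).
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e", "f"],
    "e": ["d", "f"],
    "f": ["d", "e"],
}
ppr = personalized_pagerank(graph, seed="a")
for node, score in sorted(ppr.items(), key=lambda kv: -kv[1]):
    print(node, round(score, 4))
```

Seeding at `a` concentrates the scores on the triangle containing `a`, which is the intuition behind using personalized PageRank to recover the block around a seed node.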


I am involved in a number of open source projects in the tidyverse and tidymodels orbits. In particular, I maintain the broom package, which currently has ~6 million downloads, and I am an author on the tidyverse paper in recognition of my contributions. I intermittently participate in the Stan and rOpenSci communities as well.

Please get in touch if...

  • you'd like to hire me for a research or data science for social good internship,
  • you want to discuss design of statistical modeling software,
  • you want to collaborate on a research project, or
  • you want to write an explainer together.

Outside of R, I'm a proficient Python user, and can pull together enough SQL, C++, Julia, and PySpark to get things done.

I am responsive on Twitter at @alexpghayes and via email.


  1. Convert statistical analysis objects from R into tidy format

    R · 1.1k stars · 281 forks

  2. Probability Distributions as S3 Objects

    R · 84 stars · 10 forks

  3. Approximate Personalized Page Rank

    R · 9 stars · 1 fork

  4. reference material for classical hypothesis tests

    74 stars · 10 forks

  5. Vintage Sparse PCA for Semi-Parametric Network Analysis

    R · 18 stars · 5 forks

  6. Sample Generalized Random Dot Product Graphs in Linear Time


672 contributions in the last year

Contribution activity

March 2021

Created 1 repository

Created a pull request in ropensci/rtweet that received 7 comments

Factor out hardcoded versioning in API calls

@hadley @llrs I've started work on this but am having trouble testing code locally. 35 tests are being skipped with Reason: Auth not available. I d…

+53 −53 7 comments
Opened 1 issue in 1 repository
1 open
9 contributions in private repositories Mar 1 – Mar 7
