Patterns of Patient and Caregiver Mutual Support Connections in an Online Health Community

Repository for analysis code related to a Spring 2019 study investigating interactions between patient and caregiver authors on

Originally submitted to CSCW in January 2020, revised June 2020, and conditionally accepted July 2020 for presentation at CSCW 2020 in October 2020.

As described in the paper, the CaringBridge data used for analysis is not being released publicly for ethical reasons.

For any questions or additional information, contact the corresponding author:


Author website:

Code Organization

Generally, each folder contains a mostly independent analysis. Minimal effort has been made to tidy things up.

Some code makes use of functions or utilities in another repository:

Folders and a brief description:

  • author_initiations - The initiations analysis code, including all (?) of the models for RQ1 and the scripts that produce the features expected by the mlogit models.
  • author_type - Author role classification for CaringBridge users and sites. Notably, the AuthorTypeClassification-New notebook contains an implementation of Black Box Shift Correction (as bbsc_clf in the sklearn classification pipeline).
  • build_network - Exploratory work to build the interaction network. Generally discarded in favor of other approaches.
  • data_pulling - Scripts for data processing and management, plus notebooks for the survival analysis presented in Ruyuan Wan's CSCW'20 poster. For building the network data, FilterAndMergeExtractedInteractions does all the relevant merging and includes some additional visualizations of users' interaction tendencies. The sa_poster_figures subfolder holds the figures for the survival analysis poster (they probably should have been put in the top-level figures directory).
  • data_selection - Core notebooks for selecting valid authors, especially CandidateDataSelection-New.
  • dyad_growth - Much of the interaction network analysis lives here, as well as the most important notebook in the repo: UserUserDyadDistributions-Demonstration. That notebook does not really belong in this folder, but it covers a lot of ground, including some RQ2 models.
  • figures - Generic output directory for many of the figures in the paper, in PDF format.
  • geographic_analysis - Code that generates US state assignments for valid authors, using the IP addresses recorded on guestbooks and journal updates.
  • visualization - Author tenure analysis and figure, in AuthorTenure. I think basically nothing else in this folder is relevant. (One interesting thing: our attempts to do "session"-centric analysis on CaringBridge basically failed, because inter-activity times don't show clear evidence of sessions. Many (most?) authors come to CaringBridge on a fairly fixed schedule, which suggests either a lack of responsiveness to notifications or responding exclusively off-platform, e.g. reading guestbooks via email.) Also contains a script to generate a pointless video breaking down users by their types of interactions on CaringBridge:
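
For readers unfamiliar with the Black Box Shift Correction mentioned under author_type, the general idea can be sketched as follows. This is a minimal illustration, not the code behind bbsc_clf in the notebook: it assumes the README refers to the usual label-shift correction built on a black-box classifier's confusion matrix, and the function names, integer-coded labels, and toy data are all hypothetical. The classifier's hard predictions on labeled validation data give a joint confusion matrix, its predictions on unlabeled target data give a predicted-label distribution, and solving the resulting linear system yields per-class importance weights.

```python
import numpy as np


def estimate_label_shift_weights(val_true, val_pred, target_pred, n_classes):
    """Estimate w[j] = q(y=j) / p(y=j) from hard predictions (hypothetical helper)."""
    val_true = np.asarray(val_true)
    val_pred = np.asarray(val_pred)
    target_pred = np.asarray(target_pred)

    # Joint confusion matrix on labeled source data: C[i, j] = P(pred=i, true=j).
    C = np.zeros((n_classes, n_classes))
    np.add.at(C, (val_pred, val_true), 1.0)
    C /= len(val_true)

    # Distribution of predicted labels on the unlabeled target data.
    mu = np.bincount(target_pred, minlength=n_classes) / len(target_pred)

    # Under label shift, C @ w = mu. Clip negatives caused by sampling noise.
    w = np.linalg.solve(C, mu)
    return np.clip(w, 0.0, None)


def correct_probabilities(probs, weights):
    """Reweight source-trained class probabilities by w and renormalize each row."""
    adjusted = np.asarray(probs, dtype=float) * weights
    return adjusted / adjusted.sum(axis=1, keepdims=True)


# Toy data (made up): two classes, balanced in the labeled set,
# but class 1 makes up 80% of the unlabeled target set.
val_true = [0] * 10 + [1] * 10
val_pred = [0] * 8 + [1] * 2 + [1] * 9 + [0] * 1
target_pred = [0] * 16 + [1] * 4 + [1] * 72 + [0] * 8

weights = estimate_label_shift_weights(val_true, val_pred, target_pred, n_classes=2)
corrected = correct_probabilities([[0.5, 0.5]], weights)
```

In this toy setup the estimated weights come out at approximately 0.4 and 1.6, which matches the ratio between the target class proportions (0.2, 0.8) and the source proportions (0.5, 0.5). The solve step requires an invertible confusion matrix, so a classifier that never predicts one of the classes would need different handling.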


Repository containing analysis code for research published in CSCW 2020.




