Social Network Analysis in R

The purpose of this repository is to learn how to run social network analysis (SNA) in R during the summer of 2012.

Examples

Here's a great example of SNA applied to GitHub data using R: http://www.r-chart.com/2012/05/github-follower-graph-with-r.html

Characteristics of the Network Data at GitHub

The network data consists of the following edges:

  • Coders following other coders
  • Coders watching repositories
  • Repositories being forked off of other repositories
  • Coders connected to repositories through commits, comments etc.
  • Coders connected to coders through committing or commenting on the same repository, branch, or the same file/commit/issue etc.
  • Coders who are on the same team/organization

Hence, the data contains both directed graphs (e.g. following, forking) and undirected graphs (e.g. working on the same repository). The network is also multidimensional, since it has both human (coders) and non-human (repositories and their constituent parts) vertices.
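To make the mix of vertex types concrete, here is a minimal sketch of how coder-to-repository ties could be turned into coder-to-coder ties (coders who worked on the same repository). It is plain Python for illustration only, not tied to any R package; the sample data and the function name `project_onto_coders` are made up for this example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample data: (coder, repository) pairs, e.g. from commits or comments.
contributions = [
    ("alice", "repo_a"),
    ("bob", "repo_a"),
    ("carol", "repo_a"),
    ("bob", "repo_b"),
    ("carol", "repo_b"),
]


def project_onto_coders(pairs):
    """Return undirected coder-coder ties, weighted by the number of shared repositories."""
    coders_by_repo = defaultdict(set)
    for coder, repo in pairs:
        coders_by_repo[repo].add(coder)

    weights = defaultdict(int)
    for coders in coders_by_repo.values():
        # Every unordered pair of coders on the same repository gets one tie.
        for a, b in combinations(sorted(coders), 2):
            weights[(a, b)] += 1
    return dict(weights)


coder_ties = project_onto_coders(contributions)
```

In this toy data, bob and carol share two repositories, so their tie has weight 2, while the other pairs have weight 1. Whether weights should count repositories, commits, or something else is an open modelling choice.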

N.B. The graph package cannot mix directed and undirected graphs in the same model. Can we work around this within the package or do we need a different package?
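One possible workaround, independent of any particular package, is to store everything as a directed graph and represent each undirected tie as a pair of reciprocal arcs. The sketch below is plain Python for illustration; the relation labels and the helper `as_directed` are hypothetical, and whether a given R package offers an equivalent conversion still needs to be checked.

```python
# Hypothetical sample ties; names and relation labels are illustrative only.
directed_edges = [("alice", "bob", "follows")]      # alice follows bob
undirected_edges = [("bob", "carol", "same_team")]  # symmetric tie


def as_directed(directed, undirected):
    """Merge both kinds of ties into one directed edge list.

    Each undirected tie (u, v) becomes the two arcs u -> v and v -> u.
    """
    merged = list(directed)
    for u, v, relation in undirected:
        merged.append((u, v, relation))
        merged.append((v, u, relation))
    return merged


all_edges = as_directed(directed_edges, undirected_edges)
```

Keeping the relation label on every arc matters here: the doubled arcs inflate edge counts and reciprocity, so measures should probably be computed per relation rather than on the merged list.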

These are whole networks, not ego networks, meaning that they are not centered on any given individual.
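For contrast, an ego network can always be cut out of a whole network afterwards. A minimal sketch in plain Python, with made-up data and a hypothetical helper name `ego_network`, assuming a one-step ego network (the ego, its direct neighbours, and the ties among them):

```python
def ego_network(edges, ego):
    """Return the ties among `ego` and its direct neighbours."""
    members = {ego}
    for u, v in edges:
        if u == ego:
            members.add(v)
        elif v == ego:
            members.add(u)
    # Keep only ties whose two endpoints both belong to the ego network.
    return [(u, v) for u, v in edges if u in members and v in members]


# Hypothetical whole network of four coders.
whole = [("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("alice", "carol")]
alice_ego = ego_network(whole, "alice")
```

Here dave drops out of alice's ego network because he is only connected to her through carol.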

Anything else?

Strong ties? Weak ties?

Potentially Interesting Measures That We Could Correlate With Various Sequence Characteristics

  • The density (actual ties/possible ties) of a repository network
  • Size of network
  • Cohesion/Geodesics, i.e. the lengths of the shortest paths between vertices in the network
  • How many other repositories do the coders contribute to?
  • Dispersion of code contribution
  • Are there cliques in the repository network?
  • Centrality of actors in relation to sequences or sequence aspects
  • Network centralization (a measure of inequality of contribution in the network as a whole)
  • Graph hierarchy (do coders follow each other mutually, or do only certain coders get followed?)
  • Least upper boundedness
  • Efficiency
  • Changes in any of these measures over time
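As a rough sketch of what a few of the measures above compute, here are textbook definitions of density, reciprocity (one simple way to look at mutual following), Freeman in-degree centralization, and geodesic lengths, written in plain Python for a directed follower graph. The data and function names are made up for illustration, and an R package may use different normalizations, so treat this as a way to check intuition rather than as a reference implementation.

```python
from collections import deque


def density(nodes, arcs):
    """Actual ties divided by possible ties (directed graph, no self-loops)."""
    n = len(nodes)
    return len(set(arcs)) / (n * (n - 1))


def reciprocity(arcs):
    """Share of arcs u -> v whose reverse v -> u is also present."""
    arc_set = set(arcs)
    return sum((v, u) in arc_set for u, v in arc_set) / len(arc_set)


def in_degree_centralization(nodes, arcs):
    """Freeman centralization on in-degree: 0 = all equal, 1 = a perfect star."""
    indeg = {node: 0 for node in nodes}
    for _, v in set(arcs):
        indeg[v] += 1
    top = max(indeg.values())
    n = len(nodes)
    # (n - 1) ** 2 is the largest possible sum of differences (an in-star).
    return sum(top - d for d in indeg.values()) / ((n - 1) ** 2)


def geodesic_lengths(nodes, arcs, source):
    """Shortest-path length in hops from `source` to each reachable node (BFS)."""
    successors = {node: [] for node in nodes}
    for u, v in arcs:
        successors[u].append(v)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in successors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Hypothetical follower graph: (follower, followed).
nodes = ["alice", "bob", "carol", "dave"]
follows = [("alice", "bob"), ("bob", "alice"), ("carol", "bob"), ("dave", "bob")]

print(density(nodes, follows))                    # 4 of 12 possible arcs
print(reciprocity(follows))                       # 2 of 4 arcs are mutual
print(in_degree_centralization(nodes, follows))   # bob receives most follows
print(geodesic_lengths(nodes, follows, "carol"))  # dave is unreachable from carol
```

Computing the same functions on snapshots of the edge list taken at different dates would give the changes over time mentioned in the last bullet.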

Thoughts on How to Manage the Summer Course

  • Maybe we could divide the 10 workshops from the Stanford SNA in R/SoNIA among ourselves, and then go through them one by one?

Videos

Resources

Here are some resources to get us started:

References

Crowston, K., & Howison, J. 2006. Hierarchy and centralization in free and open source software team communications. Knowledge, Technology & Policy, 18(4): 65-85.