This repository was archived by the owner on Dec 30, 2019. It is now read-only.

Draft Research plan #2

@mzargham

Description


Research Roadmap

  1. Document the data model for the contribution graph that is being captured by the SourceCred team today.
  2. Build on the existing model to establish a general semantics, or "space of contribution graphs", which characterizes all legal contribution graphs (including accounting for node types and potentially subtypes). This formal definition will serve as the domain for the heuristics that enrich the network with weights (transition probabilities).
  3. Construct one or more credit-flow heuristics for every type of edge defined in the "space of contribution graphs", so that it is possible to uniquely define a view for a particular graph. Each heuristic must include a human-readable description of its interpretation as a credit flow. The resulting matrix must be a Markov chain (row-stochastic matrix).
  4. Construct one or more seed vector functions, along with human-language descriptions of the intended interpretation of driving the mixing process from each seed.
  5. Using data sets collected by the SourceCred project, explore the space of algorithms by prototyping in a scripting language; explore the sensitivity of rankings to a variety of choices, ranging from differing heuristics to parameter sweeps over alpha and seed choices.
  6. Emulate gaming behavior by attempting to optimize for ranking through attack vectors such as spamming events or sybil attacks.
  7. Support the SourceCred team in implementing and testing algorithms based on this research.
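To make steps 3–5 concrete, here is a minimal sketch of the mixing process on a toy graph. The node types, edge weights, and alpha value are all hypothetical placeholders, not SourceCred's actual heuristics; the only requirements taken from the plan are that the transition matrix be row-stochastic and that a seed vector drive the mixing.

```python
import numpy as np

# Hypothetical 3-node contribution graph (user -> commit -> repo).
# Each row sums to 1, so T is a valid Markov chain (row-stochastic).
T = np.array([
    [0.0, 1.0, 0.0],   # user's cred flows to their commit
    [0.5, 0.0, 0.5],   # commit splits cred between author and repo
    [0.0, 1.0, 0.0],   # repo's cred flows back to the commit
])
assert np.allclose(T.sum(axis=1), 1.0)

# Seed vector concentrating the mixing process on the repo node,
# with a teleport probability alpha back to the seed.
seed = np.array([0.0, 0.0, 1.0])
alpha = 0.15

# Power iteration for the stationary distribution of the mixed chain:
#   pi <- alpha * seed + (1 - alpha) * pi @ T
pi = np.full(3, 1.0 / 3.0)
for _ in range(200):
    pi = alpha * seed + (1 - alpha) * pi @ T

print(pi)  # stationary "cred" scores for the three nodes; sums to 1
```

Because the seed sums to 1 and T is row-stochastic, the iterate keeps total mass 1 at every step, so the result can be read directly as a cred distribution.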


Lab setup:

  1. get access to the graph data (a sample data set is fine)
  2. do exploratory data analysis to better understand what that data is
  3. create some synthetic graph generators so we can test the algorithms under different assumptions about user/contributor structures (including attacks, e.g. spam consisting of small or empty contributions)
  4. hack together a script that goes through stages like those in my multi-class PageRank algorithm outline
  5. clearly define some metrics/measures to evaluate properties of the resulting rankings
  6. automate some validation analysis for those metrics/measures
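A synthetic generator for step 3 can be very simple to start. The sketch below is one possible shape, not SourceCred's data model: each contribution gets one random honest author, and an optional attacker floods the graph with empty "spam" contributions so the ranking algorithms can be stress-tested.

```python
import random

def generate_contribution_graph(n_users, n_contribs, spam_user=False, seed=0):
    """Hypothetical synthetic generator returning (user, contribution) edges.

    Each contribution is authored by one random honest user. If spam_user
    is True, one extra attacker node adds an equal number of empty
    contributions, modeling the contribution-spam attack.
    """
    rng = random.Random(seed)  # seeded so experiments are reproducible
    edges = [(rng.randrange(n_users), f"c{i}") for i in range(n_contribs)]
    if spam_user:
        attacker = n_users  # one node beyond the honest users
        edges += [(attacker, f"spam{i}") for i in range(n_contribs)]
    return edges

honest = generate_contribution_graph(5, 20)
attacked = generate_contribution_graph(5, 20, spam_user=True)
print(len(honest), len(attacked))  # 20 40
```

Feeding both graphs through the same heuristic-and-ranking pipeline gives a baseline and an attacked ranking to compare in the validation step.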

Once the research lab is set up, the algorithm research actually starts:

  1. establish some hypotheses about different heuristics, the properties they should have, and the conditions under which they are or are not effective
  2. use the graph generator to explore more specific attack vectors and/or newly imagined test cases
  3. run lots of experiments and iterate toward specific algorithm constructions to determine what works best for SourceCred's current use cases
  4. provide guidelines for producing other rankings with different requirements; ideally, others building on this ecosystem should be able to do so without needing graph-theory expertise at the level required to originally derive and test these initial algorithms
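One candidate metric for judging attack experiments is top-k stability: how much of the top of the ranking survives an attack. This is an assumed metric for illustration, not one already chosen by the project; the score dictionaries below are made-up examples.

```python
def topk_overlap(scores_a, scores_b, k=3):
    """Fraction of the top-k node set shared between two score dicts.

    If a spam or sybil attack barely changes the top-k set, the
    ranking is stable against that attack (1.0 = identical top-k).
    """
    top = lambda s: set(sorted(s, key=s.get, reverse=True)[:k])
    return len(top(scores_a) & top(scores_b)) / k

# Hypothetical cred scores before and after a spam attack.
baseline = {"alice": 0.50, "bob": 0.30, "carol": 0.15, "spammer": 0.05}
attacked = {"alice": 0.40, "bob": 0.25, "carol": 0.05, "spammer": 0.30}

print(topk_overlap(baseline, attacked))  # spammer displaces carol: 2/3
```

Sweeping this metric over attack intensities (e.g. the number of spam contributions) would give a quantitative sensitivity curve for each heuristic.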
