Commit 78f75a6

moar draft paysim pt 3 WIP
1 parent c2ce99c commit 78f75a6

3 files changed

Lines changed: 39 additions & 12 deletions

content/openbsd/2017-09-19_Tips-for-Alpine-Linux----under-OpenBSD----dca8d09568b4.md

Lines changed: 2 additions & 0 deletions
@@ -5,6 +5,8 @@ date: '2017-09-19T11:48:58.385Z'
 tags: [archives, linux, openbsd]
 ---

+> Note: This is an old post from when I wrote on medium.com...formatting may be wonky here until I clean it up.
+
 ![Matterhorn…if you squint you can see Puffy up there, I swear. ([https://commons.wikimedia.org/wiki/File:Matterhorn\_from\_Domh%C3%BCtte\_-\_2.jpg](https://commons.wikimedia.org/wiki/File:Matterhorn_from_Domh%C3%BCtte_-_2.jpg))](https://cdn-images-1.medium.com/max/800/1*Ird9QLfasg60n7LVB6RCpw.jpeg)
 Matterhorn…if you squint you can see Puffy up there, I swear. ([https://commons.wikimedia.org/wiki/File:Matterhorn\_from\_Domh%C3%BCtte\_-\_2.jpg](https://commons.wikimedia.org/wiki/File:Matterhorn_from_Domh%C3%BCtte_-_2.jpg))

content/openbsd/2018-06-16_Installing-OpenBSD-6-3-on-Packet-net-df51a5e9083f.md

Lines changed: 2 additions & 0 deletions
@@ -5,6 +5,8 @@ date: '2018-06-16T21:57:46.516Z'
 tags: [archives, openbsd, virtualization]
 ---

+> Note: This is an old post from when I wrote on medium.com...formatting may be wonky here until I clean it up.
+
 ![Image by Kurt Edblom shared under CC BY-SA 2.0 ([https://flic.kr/p/fanASg](https://flic.kr/p/fanASg))](https://cdn-images-1.medium.com/max/800/1*fhSbHyChf5Pd-L_ryF0Khw.jpeg)
 Image by Kurt Edblom shared under CC BY-SA 2.0 ([https://flic.kr/p/fanASg](https://flic.kr/p/fanASg))

org/paysim-part3.org

Lines changed: 35 additions & 12 deletions
@@ -118,16 +118,23 @@ in [[file:paysim-part2.org][part 2]] creating unique nodes for each instance of
 (e.g. there's only one SSN of 123-45-6789), it's almost trivial to
 find Clients that share identifiers.

-The Weakly Connected Components (TKTKT INSERT LINK) algorithm analyzes
-the graph and identifies "graph components". A "component" is a set of
-nodes and relationships where you can reach each member (node) from
-any other through traversal. (It's called "weakly" as opposed to
-"strongly" connected since we allow traversal irrespective of the
-direction of a relationship.)
+The [[https://neo4j.com/docs/graph-algorithms/current/algorithms/wcc/][Weakly Connected Components]] algorithm analyzes the graph and
+identifies "graph components". A [[https://en.wikipedia.org/wiki/Component_(graph_theory)][component]] is a set of nodes and
+relationships where you can reach each member (node) from any other
+through traversal. It's called "weakly" since we don't account for the
+directionality of relationships.

-TKTKTKT INSERT VISUAL EXAMPLE
+#+BEGIN_QUOTE
+Connected component algorithms are a type of community detection
+algorithm. They're great for understanding the structure of a
+graph.
+#+END_QUOTE
+
+#+CAPTION: "A graph with three components" by David Eppstein (Public Domain, Wikipedia, 2007)
+#+NAME: fig:three-components
+[[file:../static/img/3rdparty/Pseudoforest.svg]]

-The net result: our algorithm identifies all the possible subgraphs of
+The net result: the algorithm identifies all the possible subgraphs of
 Clients that have some identifiers in common.

 #+BEGIN_QUOTE
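Review note on the hunk above: the rewritten paragraph describes weakly connected components only in prose. A minimal Python sketch of the idea follows, using a union-find structure. This is an illustration, not the Neo4j implementation; the function name, node names, and edges are all hypothetical.

```python
from collections import defaultdict


def weakly_connected_components(nodes, edges):
    """Group nodes into components, ignoring relationship direction."""
    parent = {n: n for n in nodes}

    def find(n):
        # Walk up to the root, shortening the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for src, dst in edges:
        # Direction is ignored: (src, dst) and (dst, src) merge the same sets.
        parent[find(src)] = find(dst)

    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    return list(groups.values())


# Hypothetical data: clients linked to the identifiers they use.
nodes = ["alice", "bob", "carol", "dave", "ssn-1", "phone-1"]
edges = [("alice", "ssn-1"), ("bob", "ssn-1"), ("carol", "phone-1")]
components = weakly_connected_components(nodes, edges)
```

Here `alice` and `bob` land in the same component because they share `ssn-1`, which is the "Clients that have some identifiers in common" case the text describes.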
@@ -310,11 +317,13 @@ projection if all goes well.

 *** Computing PairWise Similarity

-Now we'll use an algorithm called *pairwise similarity* TKTKTK LINK TO
-THIS DOC to compute a similarity score between clients.
+Now we'll use an algorithm called [[https://neo4j.com/docs/graph-algorithms/current/algorithms/node-similarity/][pair-wise similarity]] to compute a
+similarity score between clients. This algorithm computes what's
+called the _Jaccard metric_, an approach to quantifying how similar
+two nodes are in the same connected graph.

 #+BEGIN_SRC cypher
-CALL algo.nodeSimilarity.stream('Client', null,{graph:'fraud_groups'})
+CALL algo.nodeSimilarity.stream('Client', null, {graph:'fraud_groups'})
 YIELD node1, node2, similarity
 RETURN algo.asNode(node1).id AS a1,
 algo.asNode(node2).id,
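Review note on the hunk above: the new text names the Jaccard metric without showing how it is calculated. As a rough sketch, the score for two nodes is the size of the intersection of their neighbor sets divided by the size of the union. The client names and identifiers below are hypothetical, and this is plain Python, not the `algo.nodeSimilarity` procedure itself.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # Convention chosen here for two empty sets.
    return len(a & b) / len(union)


# Hypothetical clients mapped to the identifiers they are connected to.
identifiers = {
    "client-1": {"ssn-1", "phone-1"},
    "client-2": {"ssn-1", "email-1"},
    "client-3": {"phone-2"},
}

# Score every unordered pair of clients.
scores = {
    (c1, c2): jaccard(identifiers[c1], identifiers[c2])
    for c1, c2 in combinations(sorted(identifiers), 2)
}
```

A pair sharing one identifier out of three distinct ones scores 1/3, while a pair with nothing in common scores 0.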
@@ -392,14 +401,24 @@ We just got a preview of how we can visually identify highly-connected
 nodes while running our pairwise similarity algorithm. Let's do it now
 algorithmically.

+In this case, we can use a tried and true algorithm called [[https://neo4j.com/docs/graph-algorithms/current/labs-algorithms/degree-centrality/][degree
+centrality]] originally proposed in 1979.[fn:1] It's great at finding
+"important" nodes in a social network. It just so happens a mobile
+money network is a form of social network!
+
+#+BEGIN_QUOTE
+You've probably heard about [[https://neo4j.com/docs/graph-algorithms/current/algorithms/page-rank/][Page Rank]], made popular by Google as a
+core feature of Google's original relevancy model. We're not using
+Page Rank here, but just a fun fact.
+#+END_QUOTE
+
 *** Computing Centrality
 Since centrality is computed within a graph component or cluster,
 let's target group =1708= first for our analysis. We won't predefine a
 graph projection like before since we're going to be only working with
 subgraphs with 13 members. (Why 13? Go back and see the
 [[fig:paysim-wcc-histogram][Histogram of Group Size]].)

-TKTKTK EXPLAIN ALGO

 #+BEGIN_SRC cypher
 CALL algo.degree.stream(
@@ -411,3 +430,7 @@ YIELD nodeId, score
 RETURN algo.asNode(nodeId).id AS clientId, score
 ORDER BY score DESC
 #+END_SRC
+
+* Footnotes
+
+[fn:1] Linton C. Freeman, [[http://leonidzhukov.net/hse/2014/socialnetworks/papers/freeman79-centrality.pdf][Centrality in Social Networks Conceptual Clarification]]
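Review note on the centrality hunks: the commit drops the "EXPLAIN ALGO" placeholder without adding a description of how the score is computed. In its simplest form, degree centrality just counts the relationships touching each node. The sketch below assumes an undirected count over a hypothetical edge list; the arguments to `algo.degree.stream` are not visible in this diff, so no attempt is made to mirror them.

```python
from collections import Counter


def degree_centrality(edges):
    """Count relationships per node and rank nodes by that count, descending."""
    degree = Counter()
    for a, b in edges:
        # Each relationship adds one to the degree of both endpoints.
        degree[a] += 1
        degree[b] += 1
    return degree.most_common()


# Hypothetical relationships between clients within a single group.
edges = [("c1", "c2"), ("c1", "c3"), ("c1", "c4"), ("c2", "c3")]
ranked = degree_centrality(edges)
```

The first entry of `ranked` is the most connected client, which plays the same role as `ORDER BY score DESC` in the Cypher query.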
