## HoloScope: detecting collective anomalies of contrast suspiciousness

HoloScope is a topology-and-spike aware fraud detection method.
HoloScope detects a subgraph of high contrast suspiciousness using topological, temporal, and categorical (e.g. rating score, topic, tag) information.

Temporal spike of retweeting a message:

<img src="./images/msgspike.png" alt="drawing" width="300"/>
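
As a rough illustration of what a temporal spike looks like in data, the sketch below bins event timestamps into a histogram and finds the sharpest rise between adjacent bins. The timestamps, the bin width, and the helper names `histogram` and `sharpest_rise` are made up for this example. HoloScope's own spike measure is more involved, so this shows only the basic idea of binning and looking for a sudden burst.

```python
from collections import Counter

# Made-up retweet timestamps (hours since the message was posted).
retweet_hours = [1, 2, 2, 9, 10, 10, 10, 10, 10, 11, 11, 30]


def histogram(times, bin_width):
    """Count events per fixed-width time bin, including empty bins."""
    counts = Counter(int(t // bin_width) for t in times)
    n_bins = max(counts) + 1
    return [counts.get(b, 0) for b in range(n_bins)]


def sharpest_rise(counts):
    """Return (bin index, increase) for the largest jump between adjacent bins."""
    rises = [(counts[i] - counts[i - 1], i) for i in range(1, len(counts))]
    increase, index = max(rises)
    return index, increase


counts = histogram(retweet_hours, bin_width=2)
peak_bin, jump = sharpest_rise(counts)
print(f"sharpest rise: +{jump} events going into bin {peak_bin}")
```

A burst of retweets concentrated in one short window, followed by an equally sudden drop, is the kind of pattern the figure above shows.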

### Abstract
As online fraudsters invest more resources, including purchasing large pools of fake user accounts and dedicated IPs, fraudulent attacks become less obvious and their detection becomes increasingly challenging. Existing approaches such as average degree maximization suffer from the bias of including more nodes than necessary, resulting in lower accuracy and increased need for manual verification. Hence, we propose HoloScope, which uses information from graph topology and temporal spikes to more accurately detect groups of fraudulent users. In terms of graph topology, we introduce contrast suspiciousness, a dynamic weighting approach, which allows us to more accurately detect fraudulent blocks, particularly low-density blocks. In terms of temporal spikes, HoloScope takes into account the sudden bursts and drops of fraudsters' attacking patterns. In addition, we provide theoretical bounds for how much this increases the time cost needed for fraudsters to conduct adversarial attacks. Additionally, from the perspective of ratings, HoloScope incorporates the deviation of rating scores in order to catch fraudsters more accurately. Moreover, HoloScope has a concise framework and sub-quadratic time complexity, making the algorithm reproducible and scalable. Extensive experiments showed that HoloScope achieved significant accuracy improvements on synthetic and real data, compared with state-of-the-art fraud detection methods.

In [None]:
import spartan as st

You can configure the backend to use the GPU or the CPU only. \
The CPU backend is the default.

In [None]:
# load graph data
tensor_data = st.loadTensor(path = "./inputData/yelp.tensor")

"tensor_data.data" has multiple attribute columns and an optional single value column. The following table shows an example of 10000 four-tuples (user, object, date, score); the fifth column is the frequency.

|row id |    0	|   1	|         2    	|   3 	|   4  	|
|-----:	|-----:	|----:	|-----------:	|----:	|-----	|
|    0 	|    0 	|   0 	| 2012-08-01 	|   4 	|   1 	|
|    1 	|    1 	|   0 	| 2014-02-13 	|   5 	|   1 	|
|    2 	|    2 	|   0 	| 2015-10-31 	|   5 	|   1 	|
|    3 	|    3 	|   0 	| 2015-12-26 	|   3 	|   1 	|
|    4 	|    4 	|   0 	| 2016-04-08 	|   2 	|   1 	|
|  ... 	|  ... 	| ... 	|        ... 	| ... 	| ... 	|
| 9995 	| 4523 	| 508 	| 2013-03-06 	|   5 	|   1 	|
| 9996 	|  118 	| 508 	| 2013-03-07 	|   4 	|   1 	|
| 9997 	| 5884 	| 508 	| 2013-03-07 	|   1 	|   1 	|
| 9998 	| 2628 	| 508 	| 2013-04-08 	|   5 	|   1 	|
| 9999 	| 5885 	| 508 	| 2013-06-17 	|   5 	|   1 	|

In [None]:
stensor = tensor_data.toSTensor(hasvalue=True, mappers={2:st.TimeMapper(timeformat='%Y-%m-%d')})


In [None]:
#stensor._data

The sparse tensor "stensor" is a multi-mode tensor constructed from tensor_data. Users, objects, dates, and scores are all mapped to integers in $[0, N]$. \
This example constructs a tensor of size $5886 \times 509 \times 3857 \times 6$.
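
To make the mapping concrete, here is a minimal standalone sketch of how four-tuples can be turned into integer indices and how the tensor shape follows from them. It does not use spartan. The date scheme (day offset from the earliest date) is an assumption chosen for illustration and may differ from what `st.TimeMapper` does internally, and the helper `day_offsets` and the toy rows are hypothetical.

```python
from datetime import date

# Toy (user, object, date, score) tuples in the same layout as the table above.
rows = [
    (0, 0, "2012-08-01", 4),
    (1, 0, "2014-02-13", 5),
    (2, 0, "2012-08-03", 5),
    (1, 1, "2012-08-01", 3),
]


def day_offsets(date_strings):
    """Map ISO dates to integer day offsets from the earliest date (assumed scheme)."""
    days = [date.fromisoformat(s).toordinal() for s in date_strings]
    start = min(days)
    return [d - start for d in days]


offsets = day_offsets([r[2] for r in rows])
indexed = [(u, o, t, s) for (u, o, _, s), t in zip(rows, offsets)]

# The size of each mode is its largest index plus one.
shape = tuple(max(col) + 1 for col in zip(*indexed))
print(indexed)
print(shape)
```

Under this scheme the date mode spans every day between the earliest and latest record, so that mode's size depends on the time range rather than on the number of distinct dates.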

In [None]:
graph = st.Graph(stensor, bipartite=True, weighted=True)

Get a Graph instance from a sparse tensor.

### Run HoloScope as a single model

In [None]:
hs = st.HoloScope(graph)

In [None]:
res = hs.run()

### Run HoloScope from an anomaly detection task

In [None]:
# create an anomaly detection model
ad_model = st.AnomalyDetection.create(graph, st.ADPolicy.HoloScope, 'holoscope')

In [None]:
# run the model
# default k=2, eps=1.6
res = ad_model.run(k=2)

The result is a list of the top-k suspicious blocks.
For each block, the tuple contains $(user~nodes, object~nodes)$, the block's suspiciousness score, and the suspiciousness scores of all object nodes.\
Then we can visualize the subgraphs as follows.
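
Before plotting, it can help to unpack each block and pull out its edges. The sketch below does this on mock data that follows the tuple layout described above. The exact structure returned by `run()` is assumed from that description, and `mock_res`, `edges`, and the helper `block_edges` are hypothetical names used only for this example.

```python
# Hypothetical output in the layout described above; all values are made up.
mock_res = [
    (([0, 1, 2], [0]), 3.2, [0.9, 0.1]),
    (([4, 5], [1]), 1.5, [0.2, 0.8]),
]

# Toy (user, object) edge list standing in for the bipartite graph.
edges = [(0, 0), (1, 0), (2, 0), (3, 1), (4, 1), (5, 1), (5, 0)]


def block_edges(edges, users, objects):
    """Keep only the edges whose two endpoints both lie inside the block."""
    user_set, object_set = set(users), set(objects)
    return [(u, o) for u, o in edges if u in user_set and o in object_set]


for rank, ((users, objects), score, object_scores) in enumerate(mock_res, start=1):
    sub = block_edges(edges, users, objects)
    print(f"block {rank}: {len(users)} users, {len(objects)} objects, "
          f"score={score}, edges={sub}")
```

User and object ids share the same integer range here, so they would need distinct labels (for example a `u` or `o` prefix) before the edge list is handed to a plotting library such as networkx.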

In [None]:
# visualize the graphs with networkx
# convert each block to a subgraph
# plot with networkx

### Experimental results:
------

HoloScope (topology)       |  HoloScope (holistic signals)
:-------------------------:|:-------------------------:
<img src="images/performCmpDensity.png" width="300"/>  |   <img src="images/performancecmpall.png" width="300"/>
<b>HoloScope detection on real Sina Weibo data</b> |  <b>HoloScope is near linear</b>
<img src="images/wbexp.png" width="200"/> |   <img src="images/effeciencyexpelec.png" width="300"/>


### Cite:
------
1. Liu, Shenghua, Bryan Hooi, and Christos Faloutsos. "Holoscope: Topology-and-spike aware fraud detection." In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pp. 1539-1548. 2017.

    <details>
    <summary><span style="color:blue">click for BibTex...</span></summary>

    ```bibtex
    @inproceedings{liu2017holoscope,
      title={Holoscope: Topology-and-spike aware fraud detection},
      author={Liu, Shenghua and Hooi, Bryan and Faloutsos, Christos},
      booktitle={Proceedings of the 2017 ACM on Conference on Information and Knowledge Management},
      pages={1539--1548},
      year={2017}
    }
    ```
    </details>  

2. Liu, Shenghua, Bryan Hooi, and Christos Faloutsos. "A contrast metric for fraud detection in rich graphs." IEEE Transactions on Knowledge and Data Engineering 31, no. 12 (2018): 2235-2248.

    <details>  
    <summary><span style="color:blue">click for BibTex...</span></summary>

    ```bibtex
    @article{liu2018contrast,
      title={A contrast metric for fraud detection in rich graphs},
      author={Liu, Shenghua and Hooi, Bryan and Faloutsos, Christos},
      journal={IEEE Transactions on Knowledge and Data Engineering},
      volume={31},
      number={12},
      pages={2235--2248},
      year={2018},
      publisher={IEEE}
    }
    ```
    </details>  
