# Design for server

## IAT: detecting micro-clusters of suspicious behaviors

A group of fraudsters often behaves synchronously and follows a regular (fixed) pattern, which looks suspicious
compared with normally behaving users.
We therefore study the inter-arrival times (IATs), that is, the time intervals between consecutive actions of each user, and detect the suspicious micro-clusters that stand out from the majority distribution.
IAT can be used together with the vision-guided detection algorithm EagleMine.


In [None]:
import spartan as st

Load the data with the ```loadTensor``` function.

In [None]:
tensor_data = st.loadTensor(path = "/home/liushenghua/Data/wbcovid19rummor/partdata/extrat0224.reid.gz", header=None, sep='\x01')

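Use ```do_map``` with a ```TimeMapper``` to convert the time strings to timestamps, then use ```to_aggts``` to aggregate the timestamps of each user from the log file or edgelist tensor.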

In [None]:
coords, data = tensor_data.do_map(hasvalue=False, 
                                  mappers={0:st.TimeMapper(timeformat='%Y-%m-%d %H:%M:%S', timebin = 1, mints = 0)})

In [None]:
aggts = tensor_data.to_aggts(coords, time_col=0, group_col=[1])

In [None]:
len(aggts)
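To make the two steps above concrete, here is a minimal pure-Python sketch of what parsing and aggregating timestamps amounts to. It is not spartan's implementation: `parse_ts` and `aggregate_timestamps` are hypothetical helpers, the time strings are assumed to be UTC, and the `(timestamp - mints) // timebin` rule is an assumption about how `timebin` and `mints` are applied.

```python
from collections import defaultdict
from datetime import datetime, timezone


def parse_ts(text, timeformat="%Y-%m-%d %H:%M:%S", timebin=1, mints=0):
    """Hypothetical helper: time string -> integer time bin (UTC assumed)."""
    dt = datetime.strptime(text, timeformat).replace(tzinfo=timezone.utc)
    # Assumption: shift by mints, then divide into bins of timebin seconds.
    return (int(dt.timestamp()) - mints) // timebin


def aggregate_timestamps(records):
    """Group (time_string, user) records into {user: sorted timestamps}."""
    grouped = defaultdict(list)
    for time_str, user in records:
        grouped[user].append(parse_ts(time_str))
    return {user: sorted(ts) for user, ts in grouped.items()}


# Toy log records, invented for illustration only.
records = [
    ("2020-02-24 10:00:00", "u1"),
    ("2020-02-24 10:00:30", "u1"),
    ("2020-02-24 09:59:00", "u2"),
    ("2020-02-24 10:01:30", "u1"),
]
aggts_demo = aggregate_timestamps(records)
print({user: len(ts) for user, ts in aggts_demo.items()})
```

Each user ends up with a sorted list of timestamps, which is the shape the IAT computations below work on.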

## IAT class

calaggiat function: calculate the IAT dict **aggiat** (key: user, value: IAT list)

caliatcount function: calculate the IAT count dict **iatcount** (key: IAT, value: frequency)

caliatpaircount function: calculate the IAT pair count dict **iatpaircount** (key: (iat1, iat2), value: frequency)

get_user_iatpair_dict function: calculate the dict **user_iatpair** (key: user, value: (iat1, iat2) list)

get_iatpair_user_dict function: calculate the dict **iatpair_user** (key: (iat1, iat2), value: user list)

find_iatpair_user function: find the users who have the input IAT pairs

find_iatpair_user_ordered function: find the top-k users that have pairs in iatpairs, ordered by decreasing frequency

drawIatPdf function: plot the IAT-PDF line
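The dictionaries listed above can be illustrated with a small pure-Python sketch. This is an approximation written for this tutorial, not the code inside `st.IAT`; the helper names `cal_aggiat`, `cal_iat_count` and `cal_iatpair_count` are hypothetical, and the sketch assumes an IAT pair means two consecutive IATs of the same user.

```python
from collections import Counter


def cal_aggiat(aggts):
    """{user: timestamps} -> {user: differences of consecutive timestamps}."""
    aggiat = {}
    for user, ts in aggts.items():
        ts = sorted(ts)
        aggiat[user] = [later - earlier for earlier, later in zip(ts, ts[1:])]
    return aggiat


def cal_iat_count(aggiat):
    """Count how often each IAT value occurs over all users."""
    return Counter(iat for iats in aggiat.values() for iat in iats)


def cal_iatpair_count(aggiat):
    """Count how often each (IAT_n, IAT_n+1) pair occurs over all users."""
    return Counter(
        pair for iats in aggiat.values() for pair in zip(iats, iats[1:])
    )


# Toy timestamps (seconds), invented for illustration only.
demo_aggts = {"u1": [0, 30, 60, 90], "u2": [5, 35, 100]}
demo_aggiat = cal_aggiat(demo_aggts)
demo_iatcount = cal_iat_count(demo_aggiat)
demo_paircount = cal_iatpair_count(demo_aggiat)
print(demo_aggiat)
print(demo_iatcount)
print(demo_paircount)
```

A user who acts at a fixed interval produces many identical pairs such as (30, 30), which is why dense spots in the (IATn, IATn+1) plane are worth inspecting.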

In [None]:
instance = st.IAT()

In [None]:
# calculate aggiat dict
#instance.calaggiat(aggts)
# save aggiat dict
#instance.save_aggiat('/home/liushenghua/Data/wbcovid19rummor/partdata/aggiat0224.dictlist.gz')
# load aggiat dict
instance.load_aggiat('/home/liushenghua/Data/wbcovid19rummor/partdata/aggiat0224.dictlist.gz')

In [None]:
aggiat=instance.aggiat

In [None]:
xs, ys = instance.getiatpairs()
len(xs), len(ys)

In [None]:
# invoke drawHexbin function
# hexfig = st.drawHexbin(xs, ys, gridsize=60, xlabel='IATn', ylabel='IATn+1',outfig='./images/iathexbin_demo.png')

In [None]:
# invoke drawRectbin function
# fig, hist = st.drawRectbin(xs, ys, gridsize=60, xlabel='IATn', ylabel='IATn+1', outfig='./images/iatrectbin_demo.png')

## class RectHistogram
draw function: draw a 2D histogram with rectangular bins

find_peak_range function: find the bin with the largest number of samples within the range
horizontal axis: [x-radius, x+radius]
vertical axis: [y-radius, y+radius]

return: the x-axis and y-axis ranges of that bin

find_peak_rect function: return the (x, y) pairs in the bin that has the largest number of samples
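The peak search described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the `RectHistogram` implementation: `log_bin` and `find_peak_pairs` are hypothetical names, bins are log-scaled with a fixed number of bins per decade, and a bin is treated as a candidate when at least one of its samples lies inside the search window.

```python
import math
from collections import defaultdict


def log_bin(value, bins_per_decade=10):
    """Hypothetical helper: index of the log-scale bin holding a positive value."""
    return math.floor(math.log10(value) * bins_per_decade)


def find_peak_pairs(xs, ys, x, y, radius, bins_per_decade=10):
    """Return the samples of the densest bin touching the search window."""
    bins = defaultdict(list)
    for px, py in zip(xs, ys):
        if px > 0 and py > 0:  # a log scale cannot place non-positive values
            key = (log_bin(px, bins_per_decade), log_bin(py, bins_per_decade))
            bins[key].append((px, py))
    # Assumption: a bin qualifies if any of its samples is inside the window.
    candidates = [
        pts for pts in bins.values()
        if any(abs(px - x) <= radius and abs(py - y) <= radius for px, py in pts)
    ]
    return max(candidates, key=len, default=[])


# Toy IAT pairs, invented for illustration only.
demo_xs = [3, 3, 3, 500, 500, 500, 500, 50]
demo_ys = [3, 3, 3, 500, 500, 500, 500, 60]
near_origin = find_peak_pairs(demo_xs, demo_ys, x=3, y=3, radius=10)
near_500 = find_peak_pairs(demo_xs, demo_ys, x=500, y=500, radius=10)
print(len(near_origin), len(near_500))
```

Restricting the search to a window lets you pick a local peak, such as the cluster near (3, 3), even when a denser bin exists elsewhere in the histogram.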

In [None]:
recthistogram = st.RectHistogram(xscale='log', yscale='log', gridsize=60)

In [None]:
fig = recthistogram.draw(xs, ys, xlabel='IATn', ylabel='IATn+1')

The result is:
<img src="images/real0224.png" width="400"/> 

In [None]:
xrange, yrange = recthistogram.find_peak_range(x=3, y=3, radius=10)
print(f"the range of max bin along the x axis:\n {xrange}")
print(f"the range of max bin along the y axis:\n {yrange}")

In [None]:
iatpairs = recthistogram.find_peak_rect(xrange, yrange)
print(len(iatpairs))

### Find Top-k suspicious users

In [None]:
usrlist = instance.find_iatpair_user_ordered(iatpairs) # default return all, k = -1
print(f"All user count: \n{len(usrlist)}")
print(f"Top-5 user: \n{usrlist[:5]}")

The total number of suspicious users is 1339.

The top-5 users are ['1710925', '499531', '529364', '1776167', '427650']
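The ranking step can be sketched as follows. This is a hedged illustration, not the body of `find_iatpair_user_ordered`: the function name `rank_users_by_pairs` is hypothetical, and it assumes that "frequency" means the number of a user's IAT pairs that fall in the selected set.

```python
from collections import Counter


def rank_users_by_pairs(user_iatpair, iatpairs, k=-1):
    """Order users by how many of their IAT pairs are in iatpairs (descending)."""
    target = set(iatpairs)
    hits = Counter()
    for user, pairs in user_iatpair.items():
        matched = sum(1 for pair in pairs if pair in target)
        if matched:  # users without any matching pair are left out
            hits[user] = matched
    ranked = [user for user, _ in hits.most_common()]
    return ranked if k < 0 else ranked[:k]


# Toy data, invented for illustration only.
demo_user_iatpair = {
    "a": [(30, 30), (30, 30), (30, 65)],
    "b": [(30, 30)],
    "c": [(5, 7)],
}
all_users = rank_users_by_pairs(demo_user_iatpair, [(30, 30)])
top1 = rank_users_by_pairs(demo_user_iatpair, [(30, 30)], k=1)
print(all_users, top1)
```

As in the call above, a negative `k` returns every matching user and a non-negative `k` keeps only the first `k`.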

Plot the timestamp histogram of the top user, then plot the IAT-PDF line with the `drawIatPdf` function.

In [None]:
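import matplotlib.pyplot as plt
# round the first timestamp down and the last one up to whole days (in seconds)
startday = min(aggts[1710925]) // (24*3600) * 24*3600
endday = (max(aggts[1710925]) // (24*3600) + 1) * 24*3600
# histogram of the user's timestamps in 5-minute bins
bins = range(startday, endday, 5*60)
res = plt.hist(aggts[1710925], bins=bins)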

The output figure is:
<img src="images/ts_0224.png" width="400"/> 

In [None]:
fig = instance.drawIatPdf(usrlist, outfig='/home/liushenghua/Data/wbcovid19rummor/partdata/iatpdf_0224.png')

The result is:
<img src="images/iat_dist_0224.png" width="400"/> 
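For readers who want to see what an IAT-PDF line represents, here is a minimal sketch that computes an empirical distribution of IAT values for a group of users. It is an assumption about the idea behind the plot, not the code of `drawIatPdf` (which may bin or scale the values differently); `iat_pdf` is a hypothetical helper.

```python
from collections import Counter


def iat_pdf(aggiat, users):
    """Empirical distribution of the IAT values of the given users."""
    counts = Counter(iat for user in users for iat in aggiat.get(user, []))
    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalize the counts so the values sum to 1, sorted by IAT.
    return {iat: count / total for iat, count in sorted(counts.items())}


# Toy IAT lists (seconds), invented for illustration only.
demo_iats = {"a": [30, 30, 60], "b": [30]}
demo_pdf = iat_pdf(demo_iats, ["a", "b"])
print(demo_pdf)
```

A sharp spike at a single IAT value, like 30 seconds in this toy data, is the kind of regular pattern the analysis is looking for.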