# Trajectory Pattern Data Mining

**Problem statement 1:** 
- Given a set of trajectories, find a subset of the trajectories where the users are travelling together.
- Given a set of trajectories, compute a similarity score between the trajectories.

**Relevant work:**
- Chapter 5 of [Computing with Spatial Trajectories](https://github.com/tyqiangz/Trajectory-Data-Mining/blob/master/Useful%20Research%20Materials/Computing%20with%20Spatial%20Trajectories.pdf)
    - Gives a very broad overview of pattern discovery methods
- Paper titled "[Efficient mining of group patterns from user movement data](https://github.com/tyqiangz/Trajectory-Data-Mining/blob/master/Useful%20Research%20Materials/Efficient%20mining%20of%20group%20patterns%20from%20user%20movement%20data.pdf)" 
    - Proposes several algorithms for mining group movement patterns, where a *group pattern* is defined as a group of users that stay within a distance threshold of one another for at least a minimum duration.

### Evaluating the paper titled "[Efficient mining of group patterns from user movement data](https://github.com/tyqiangz/Trajectory-Data-Mining/blob/master/Useful%20Research%20Materials/Efficient%20mining%20of%20group%20patterns%20from%20user%20movement%20data.pdf)" 

Notation:
- $D=(D_1,D_2,\ldots,D_M)$: the user movement database of $M$ users
- $D_i$: a time series of tuples $(t,(x,y,z))$ giving the geolocation of user $i$ at time $t$
- $u_{i}[t].p$: the location of user $u_{i}$ at time $t$
- $N$: the number of time points in $D$
- Valid segment: see Definition 1
- Group pattern: see Definition 2
- $k$-group pattern: a group pattern with $k$ users
- Sub-group pattern: see Definition 3
- Weight-count, weight: see Definition 4


1. $\textbf{Definition 1}:$ Given a set of users $G$, a maximum distance threshold $max\_dis$, and a minimum time duration threshold $min\_dur$, a set of consecutive time points $\left[t_a, t_{b}\right]$ is called a **valid segment of $G$** if
    - $\forall u_{i}, u_{j} \in G$ and $\forall t$ with $t_{a} \le t \le t_{b}$: $d(u_{i}[t].p, u_{j}[t].p) \le max\_dis$
    - if $t_{a}>0$, then $\exists u_{i}, u_{j} \in G$ such that $d(u_{i}[t_{a}-1].p, u_{j}[t_{a}-1].p) > max\_dis$
    - if $t_{b}<N-1$, then $\exists u_{i}, u_{j} \in G$ such that $d(u_{i}[t_{b}+1].p, u_{j}[t_{b}+1].p) > max\_dis$
    - $t_{b}-t_{a}+1 \ge min\_dur$
    
    In other words, within a valid segment of a set of users $G$, all members must be close to one another for at least the minimum time duration $min\_dur$. The second and third conditions make the segment maximal: it cannot be extended by one time point in either direction. The function $d(\cdot,\cdot)$ returns the distance between two points.
    
2. $\textbf{Definition 2}:$ Given a set of users $G$ and thresholds $max\_dis$ and $min\_dur$, we say that $G$, $max\_dis$ and $min\_dur$ form a **group pattern**, denoted by $P=\langle G, max\_dis, min\_dur\rangle$, if $G$ has a valid segment.
3. $\textbf{Definition 3}:$ Given two group patterns $P=\langle G, max\_dis, min\_dur\rangle$ and $P'=\langle G', max\_dis, min\_dur\rangle$, $P'$ is called a **sub-group pattern** of $P$ if $G' \subseteq G$.

4. $\textbf{Definition 4}:$ Let $P$ be a group pattern with valid segments $s_{1}, \ldots, s_{n}$, and let $\left|s_{i}\right|$ denote the number of time points in $s_{i}$. The weight-count and weight of $P$ are defined as
\begin{align*}
    \operatorname{weight-count}(P)=\sum_{i=1}^{n}\left|s_{i}\right| \quad \text{and} \quad
    \operatorname{weight}(P)=\frac{\operatorname{weight-count}(P)}{N}=\frac{\sum_{i=1}^{n}\left|s_{i}\right|}{N}.
\end{align*}
5. $\textbf{Definition 5}:$ Given the thresholds $max\_dis$, $min\_dur$, and $min\_wei$, a group pattern $P$ is called *valid* if $\operatorname{weight}(P) \ge min\_wei$. The problem of finding all the
valid group patterns (or simply valid groups) is known as **valid group (pattern) mining**.
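
Definitions 1 and 4 can be checked by hand on a small example. The sketch below is only an illustration under stated assumptions, not the paper's algorithm: it assumes every user is sampled at the same $N$ time points, that a trajectory is a plain list of $(x, y, z)$ tuples indexed by time, and that $d$ is the Euclidean distance. The helper names `group_is_close`, `valid_segments`, and `weight` are hypothetical, and the toy data is made up.

```python
from itertools import combinations
from math import dist


def group_is_close(trajs, group, t, max_dis):
    """True if every pair of users in `group` is within `max_dis` at time index t."""
    return all(
        dist(trajs[i][t], trajs[j][t]) <= max_dis
        for i, j in combinations(group, 2)
    )


def valid_segments(trajs, group, max_dis, min_dur):
    """Return the maximal runs (t_a, t_b), inclusive, of length >= min_dur (Definition 1)."""
    n_times = len(trajs[0])
    segments, start = [], None
    # Iterate one step past the end so that a run reaching N-1 is closed too.
    for t in range(n_times + 1):
        close = t < n_times and group_is_close(trajs, group, t, max_dis)
        if close and start is None:
            start = t
        elif not close and start is not None:
            if t - start >= min_dur:
                segments.append((start, t - 1))
            start = None
    return segments


def weight(segments, n_times):
    """Weight of a group pattern (Definition 4): covered time points divided by N."""
    return sum(t_b - t_a + 1 for t_a, t_b in segments) / n_times


# Toy data (made up): 2 users, N = 6 time points.
trajs = [
    [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0), (5, 0, 0)],
    [(0, 1, 0), (1, 1, 0), (2, 9, 0), (3, 1, 0), (4, 1, 0), (5, 1, 0)],
]

segments = valid_segments(trajs, group=(0, 1), max_dis=2.0, min_dur=2)
print(segments)                   # [(0, 1), (3, 5)]
print(weight(segments, len(trajs[0])))  # 5 of the 6 time points are covered
```

The two users are apart only at time index 2, which splits the timeline into two valid segments of lengths 2 and 3, giving a weight-count of 5 and a weight of 5/6.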

In [None]:
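def agp(df_list, max_dis, min_dur, min_wei):
    '''
    Mine group patterns from a list of trajectories (not implemented yet).

    :param df_list: a list of trajectories
    :param max_dis: maximum distance threshold between two users for a subset of `df_list` 
        to be considered a group pattern
    :param min_dur: the minimum duration threshold for the time period two users
        spent together to be considered a group pattern
    :param min_wei: the minimum weight threshold for a group pattern to be
        considered valid
    :return group_pattern: the group pattern derived from `df_list`
    '''
    raise NotImplementedError


def generate_candidate_groups():
    # Placeholder: not implemented yet.
    raise NotImplementedError


def is_close():
    # Placeholder: not implemented yet.
    raise NotImplementedError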
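
The stub names `agp` and `generate_candidate_groups` appear to point toward a level-wise (Apriori-style) search. One way such a search could be organised is sketched below. This is an illustrative sketch under assumptions, not the paper's algorithm. It relies on a property that follows from Definition 1: if all members of $G$ are pairwise within $max\_dis$ during a segment, so are the members of any $G' \subseteq G$, so a sub-group pattern never has a smaller weight than the pattern itself. A size-$k$ group therefore needs to be tested only if all of its size-$(k-1)$ sub-groups are valid. The sketch assumes trajectories are equal-length lists of $(x, y, z)$ tuples and that $d$ is Euclidean; `group_weight` and `mine_valid_groups` are hypothetical names, and the data is made up.

```python
from itertools import combinations
from math import dist


def group_weight(trajs, group, max_dis, min_dur):
    """Fraction of time points covered by valid segments of `group`."""
    n_times = len(trajs[0])
    covered, run = 0, 0
    for t in range(n_times + 1):  # one extra step closes a run ending at N-1
        close = t < n_times and all(
            dist(trajs[i][t], trajs[j][t]) <= max_dis
            for i, j in combinations(group, 2)
        )
        if close:
            run += 1
        else:
            if run >= min_dur:
                covered += run
            run = 0
    return covered / n_times


def mine_valid_groups(trajs, max_dis, min_dur, min_wei):
    """Level-wise search: size-k candidates are built from valid size-(k-1) groups."""
    users = range(len(trajs))
    level = {
        pair for pair in combinations(users, 2)
        if group_weight(trajs, pair, max_dis, min_dur) >= min_wei
    }
    valid = set(level)
    while level:
        candidates = set()
        for group in level:
            for u in users:
                if u > group[-1]:
                    cand = group + (u,)
                    # Prune: every sub-group one user smaller must already be valid.
                    if all(sub in level for sub in combinations(cand, len(group))):
                        candidates.add(cand)
        level = {
            c for c in candidates
            if group_weight(trajs, c, max_dis, min_dur) >= min_wei
        }
        valid |= level
    return sorted(valid, key=lambda g: (len(g), g))


# Toy data (made up): users 0-2 move together, user 2 drifts away at the
# last time point, user 3 is always far from everyone.
trajs = [
    [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0)],
    [(0, 1, 0), (1, 1, 0), (2, 1, 0), (3, 1, 0), (4, 1, 0)],
    [(0, 2, 0), (1, 2, 0), (2, 2, 0), (3, 2, 0), (4, 9, 0)],
    [(50, 50, 0)] * 5,
]

print(mine_valid_groups(trajs, max_dis=2.0, min_dur=2, min_wei=0.6))
# [(0, 1), (0, 2), (1, 2), (0, 1, 2)]
```

Raising `min_wei` to 0.9 leaves only the pair `(0, 1)`, because every group containing user 2 covers just 4 of the 5 time points.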