# MAT-model: Model Classes for Multiple Aspect Trajectory Data Mining \[MAT-Tools Framework\]

Sample code, in a Python notebook, showing how to use `mat-model` as a Python library.

This package supports the user in modeling multiple aspect trajectories. It integrates into a single framework for multiple aspect trajectory mining and, more generally, for multidimensional sequence data mining methods.

Created on Apr, 2024
Copyright (C) 2023, License GPL Version 3 or superior (see LICENSE file)

In [2]:
#!pip install mat-model
#!pip install --upgrade mat-model

Help on function load_ds in module matdata.dataset:

load_ds(dataset='mat.FoursquareNYC', prefix='', missing=None, sample_size=1, random_num=1)
    Load a dataset for training or testing from a GitHub repository.
    
    Parameters:
    -----------
    dataset : str, optional
        The name of the dataset to load (default 'mat.FoursquareNYC').
    prefix : str, optional
        The prefix to be added to the dataset file name (default '').
    missing : str, optional
        The placeholder value used to denote missing data (default '-999').
    sample_size : float, optional
        The proportion of the dataset to include in the sample (default 1, i.e., use the entire dataset).
    random_num : int, optional
        Random seed for reproducibility (default 1).
    
    Returns:
    --------
    pandas.DataFrame
        The loaded dataset with optional sampling.
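
The `sample_size` and `random_num` parameters describe a seeded, proportional sample. As a rough illustration of that idea (an assumption about the behavior, not the `load_ds` implementation), the sketch below draws the same fraction of trajectory ids from every class with a fixed seed. The function `sample_balanced` and the toy data are hypothetical.

```python
import random
from collections import defaultdict

def sample_balanced(tid_labels, sample_size=1.0, seed=1):
    """Pick a fraction of trajectory ids from each class, reproducibly."""
    rng = random.Random(seed)  # local generator, so the seed fully fixes the result
    by_label = defaultdict(list)
    for tid, label in tid_labels:
        by_label[label].append(tid)
    sampled = []
    for label in sorted(by_label):
        tids = sorted(by_label[label])
        # Keep at least one trajectory per class.
        k = max(1, round(len(tids) * sample_size))
        sampled.extend(rng.sample(tids, k))
    return sampled

# Toy data (hypothetical): 40 trajectory ids split evenly over two classes.
tid_labels = [(tid, tid % 2) for tid in range(40)]
subset = sample_balanced(tid_labels, sample_size=0.25)
print(len(subset))
```

Calling the function again with the same seed returns the same ids, which is the property a `random_num`-style parameter is meant to provide.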



In [1]:
from matdata.dataset import *
ds = 'mat.FoursquareNYC'
df = load_ds(ds, sample_size=0.25)
df

Loading dataset file: https://github.com/mat-analysis/datasets/tree/main/mat/FoursquareNYC/



Unnamed: 0,space,time,day,poi,type,root_type,rating,weather,tid,label
0,40.6604738351670 -73.8302910891864,1042,Monday,MTA Subway - Howard Beach/JFK Airport (A),Metro Station,Travel & Transport,-1.0,Clear,128,6
1,40.6086420833785 -73.8190376758575,1179,Monday,MTA Bus - Q53,Beach,Outdoors & Recreation,-1.0,Clear,128,6
2,40.7340555764763 -73.8708472251892,1208,Monday,Queens Center Mall,Shopping Mall,Shop & Service,7.5,Clear,128,6
3,40.7333724746837 -73.8711404741537,1210,Monday,MTA Bus - Q11/Q21/Q29/Q52LTD/Q53LTD/Q59/Q60 - ...,Bus Line,Travel & Transport,-1.0,Clear,128,6
4,40.7631337910326 -73.8752118314646,1273,Monday,"MTABus Q19, Q49 (Astoria Blvd/94th St)",Bus Station,Travel & Transport,-1.0,Clear,128,6
...,...,...,...,...,...,...,...,...,...,...
15267,40.7047332789043 -73.9877378940582,939,Thursday,Miami Ad School Brooklyn,General College & University,College & University,-1.0,Clear,29559,1070
15268,40.6978026652822 -73.9941451630314,483,Friday,Eastern Athletic Club,Gym,Outdoors & Recreation,6.9,Clear,29559,1070
15269,40.6946728967503 -73.9940820360805,794,Friday,Starbucks,Coffee Shop,Food,7.0,Clear,29559,1070
15270,40.7023694709909 -73.9875124790989,1261,Friday,Superfine,American Restaurant,Food,7.6,Clear,29559,1070
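
The dataframe stores one trajectory point per row: `tid` identifies the trajectory, `label` its class, and `space` packs latitude and longitude into a single space-separated string. The sketch below uses plain Python rather than any `mat-model` call. It groups a few rows copied from the output above by `tid` and splits `space` into floats. The helper names `parse_space` and `group_by_tid` are hypothetical.

```python
from collections import defaultdict

# Three rows copied from the output above: (space, time, day, tid, label).
rows = [
    ("40.6604738351670 -73.8302910891864", 1042, "Monday", 128, 6),
    ("40.6086420833785 -73.8190376758575", 1179, "Monday", 128, 6),
    ("40.7023694709909 -73.9875124790989", 1261, "Friday", 29559, 1070),
]

def parse_space(space):
    """Split a 'lat lon' string into a (lat, lon) tuple of floats."""
    lat, lon = space.split()
    return float(lat), float(lon)

def group_by_tid(rows):
    """Collect the points of each trajectory, keyed by its tid."""
    trajectories = defaultdict(list)
    for space, time, day, tid, label in rows:
        trajectories[tid].append((parse_space(space), time, day))
    return dict(trajectories)

trajectories = group_by_tid(rows)
print({tid: len(points) for tid, points in trajectories.items()})
```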


#### Trajectory Objects

Alternatively, you can convert the dataframe into Trajectory objects:

In [3]:
from matmodel.util.parsers import df2trajectory

T = df2trajectory(df)


In [4]:
traj = T[1]
traj.display()

𝘛𐄁135 	𝘱1⟨(40.690 -73.982), 145.0, Monday, NYCT Transit Survey Unit, Office, Professional & Other Places, -1.0, Clouds⟩↴
	𝘱2⟨(40.709 -73.991), 201.0, Monday, MTA Subway - Manhattan Bridge (B/D/N/Q), Train, Travel & Transport, -1.0, Clouds⟩↴
	𝘱3⟨(40.828 -73.926), 1382.0, Monday, MTA Subway - 161st St/Yankee Stadium (4/B/D), Metro Station, Travel & Transport, -1.0, Clouds⟩↴
	𝘱4⟨(40.709 -73.991), 100.0, Tuesday, MTA Subway - Manhattan Bridge (B/D/N/Q), Train, Travel & Transport, -1.0, Clouds⟩↴
	𝘱5⟨(40.690 -73.982), 145.0, Tuesday, NYCT Transit Survey Unit, Office, Professional & Other Places, -1.0, Rain⟩↴
	𝘱6⟨(40.759 -73.988), 247.0, Tuesday, MTA Bus - 8 Av & W 46 St (M20/M104), Bus Stop, Travel & Transport, -1.0, Rain⟩↴
	𝘱7⟨(40.653 -74.002), 307.0, Wednesday, MTA Regional Bus Depot - Jackie Gleason, Bus Station, Travel & Transport, -1.0, Clouds⟩↴
	𝘱8⟨(40.638 -73.979), 353.0, Wednesday, MTA B67, B69 (McDonald Ave/Cortelyou Road), Bus Station, Travel & Transport, -1.0, Clouds⟩↴
	𝘱9⟨(40.688

In [5]:
traj.attributes

[1. space (space2d),
 2. time (numeric),
 3. day (nominal),
 4. poi (nominal),
 5. type (nominal),
 6. root_type (nominal),
 7. rating (numeric),
 8. weather (nominal)]

In [7]:
traj.points[0]

𝘱1⟨(40.690 -73.982), 145.0, Monday, NYCT Transit Survey Unit, Office, Professional & Other Places, -1.0, Clouds⟩

In [9]:
traj.points[0].aspects[2], traj.points[0].aspects[3]

(Monday, NYCT Transit Survey Unit)

In [11]:
traj.attributes[3].text, traj.points[0].aspects[3]

('poi', NYCT Transit Survey Unit)
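
`traj.attributes[3]` and `traj.points[0].aspects[3]` line up because attributes and aspects share positions: the lists are zero-indexed, while the printed order (`4. poi`) starts at 1. A minimal stand-in for that layout might look like the sketch below. The `Attribute` and `Point` dataclasses and the `as_record` helper are hypothetical and are not the `mat-model` classes.

```python
from dataclasses import dataclass

@dataclass
class Attribute:  # hypothetical stand-in for an attribute descriptor
    order: int    # 1-based position, as printed by traj.attributes
    text: str
    dtype: str

@dataclass
class Point:      # hypothetical stand-in for a trajectory point
    aspects: list

# First four attributes and the matching values of p1 from the output above.
attributes = [
    Attribute(1, "space", "space2d"),
    Attribute(2, "time", "numeric"),
    Attribute(3, "day", "nominal"),
    Attribute(4, "poi", "nominal"),
]
p1 = Point(aspects=[(40.690, -73.982), 145.0, "Monday", "NYCT Transit Survey Unit"])

def as_record(attributes, point):
    """Pair each attribute name with the aspect at the same position."""
    return {attr.text: value for attr, value in zip(attributes, point.aspects)}

record = as_record(attributes, p1)
print(attributes[3].text, "->", record["poi"])
```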

In [21]:
from matmodel.base import Space2D

# The aspect at index 2 is the nominal `day`, so it is not a Space2D instance.
a = traj.points[0].aspects[2]
isinstance(a, Space2D)

False

In [22]:
a.value, type(a.value)

('Monday', str)

In [None]:
traj.attributes_desc.attributes

In [None]:
print(traj.attributes_desc.idDesc)
print(traj.attributes_desc.labelDesc)

In [None]:
a1 = traj.attributes[0]
print(a1.order, a1.text, a1.dtype, sep=' -- ')

print('Comparator:', a1.comparator)

In [None]:
# Compute the distance between p1 and p2 on attribute 1 (they are equal)
a1.comparator.distance(traj.points[0].aspects[0], traj.points[1].aspects[0])

In [None]:
# Compute the distance between p1 and p6 on attribute 1 (they are different)
a1.comparator.distance(traj.points[0].aspects[0], traj.points[5].aspects[0])
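
The descriptor entries near the end of this notebook name two distance functions: `euclidean` for `space2d` attributes and `equals` for nominal ones. As a rough illustration of what such comparators compute (an assumption about their meaning, not the `mat-model` implementation), the sketch below applies both to values shown in the trajectory output above. The function names are hypothetical.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two (lat, lon) pairs, in degrees."""
    return math.dist(a, b)

def equals(a, b):
    """Assumed nominal distance: 0.0 when the values match, 1.0 otherwise."""
    return 0.0 if a == b else 1.0

# Space values of p1, p5 and p6 as printed by traj.display().
p1_space = (40.690, -73.982)
p5_space = (40.690, -73.982)
p6_space = (40.759, -73.988)

print(euclidean(p1_space, p5_space))  # identical coordinates
print(euclidean(p1_space, p6_space))  # different coordinates
print(equals("Monday", "Tuesday"))
```

Euclidean distance on raw latitude and longitude is measured in degrees, not meters, so it is only meaningful for comparing points within the same small area.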

In [None]:
d1 = 2
d2 = 10

# This was a function Andres used to amplify the difference proportionally as the distance grows;
# it goes up to the comparator's max_value (if set)
a1.comparator.enhance(d1), a1.comparator.enhance(d2)

In [None]:
d1 = 25
d2 = 75

# To normalize distance values to the range 0 to 1, set the largest possible distance value:
a1.comparator.max_value = 100
a1.comparator.normalize(d1), a1.comparator.normalize(d2)
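
One plausible reading of `normalize` is a division by `max_value`, capped at 1. This is an assumption, since the notebook does not show the cell's output. The standalone `normalize` function below is a sketch of that idea, not the comparator's actual method.

```python
def normalize(distance, max_value):
    """Scale a non-negative distance to [0, 1], capping it at max_value."""
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    return min(distance, max_value) / max_value

# Same values as the cell above: d1 = 25, d2 = 75, max_value = 100.
print(normalize(25, 100), normalize(75, 100))
```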

In [None]:
help(a1.comparator.distance)

In [None]:
# You can create other comparators, or swap the current one:
from matmodel.distance.comparator import LcsDistance, EditlcsDistance

a1.comparator = LcsDistance()
print(traj.points[0].aspects[0], traj.points[2].aspects[0], a1.comparator.distance(traj.points[0].aspects[0], traj.points[2].aspects[0]))
print(traj.points[0].aspects[0], traj.points[5].aspects[0], a1.comparator.distance(traj.points[0].aspects[0], traj.points[5].aspects[0]))

a1.comparator = EditlcsDistance()
print(traj.points[0].aspects[0], traj.points[2].aspects[0], a1.comparator.distance(traj.points[0].aspects[0], traj.points[2].aspects[0]))
print(traj.points[0].aspects[0], traj.points[5].aspects[0], a1.comparator.distance(traj.points[0].aspects[0], traj.points[5].aspects[0]))
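
`LcsDistance` takes its name from the longest common subsequence (LCS). How `mat-model` turns an LCS into a distance between two aspect values is not shown here, so the sketch below only illustrates the classic dynamic-programming LCS length and one common way to derive a distance from it, `1 - lcs / max(len)`. Both function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)  # DP row for the previous element of `a`
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def lcs_distance(a, b):
    """Assumed normalization: 0.0 for identical inputs, 1.0 for nothing in common."""
    longest = max(len(a), len(b))
    return 0.0 if longest == 0 else 1.0 - lcs_length(a, b) / longest

print(lcs_length("ABCBDAB", "BDCABA"))
print(lcs_distance("Bus Station", "Bus Stop"))
```

Keeping only two rows of the table brings memory down to the length of the second sequence, which matters little for short strings but helps with long ones.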

In [None]:
# Using a ready-made descriptor:
from matmodel.util.parsers import df2trajectory
T = df2trajectory(df, attributes_desc='../datasets/mat/descriptors/FoursquareNYC_hp.json')
traj = T[73]

In [None]:
for attr in traj.attributes:
    print(attr, attr.comparator)

---

In [None]:
sel_attributes = ['poi', 'day', 'category', 'weather']
attributes = [
    {'order': 1, 'type': 'space2d', 'text': 'lat_lon', 'comparator': {'distance': 'euclidean'}},
    {'order': 2, 'type': 'nominal', 'text': 'day', 'comparator': {'distance': 'equals'}},
    {'order': 1, 'type': 'space2d', 'text': 'lat_lon', 'comparator': {'distance': 'euclidean'}},
    {'order': 2, 'type': 'nominal', 'text': 'day', 'comparator': {'distance': 'equals'}},
]

# Index the entries by attribute name; repeated names collapse into a single key.
{item['text']: item for item in attributes}
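
Indexing the descriptor entries by their `text` field makes it easy to pick the entries named in `sel_attributes`. The sketch below is plain Python over the values from the cell above, with the repeated entries listed once. The names `by_name`, `selected`, and `missing` are introduced here for illustration.

```python
sel_attributes = ['poi', 'day', 'category', 'weather']
attributes = [
    {'order': 1, 'type': 'space2d', 'text': 'lat_lon', 'comparator': {'distance': 'euclidean'}},
    {'order': 2, 'type': 'nominal', 'text': 'day', 'comparator': {'distance': 'equals'}},
]

# Map each attribute name to its descriptor entry.
by_name = {item['text']: item for item in attributes}

# Keep the requested entries that exist, in the requested order.
selected = [by_name[name] for name in sel_attributes if name in by_name]

# Requested names with no descriptor entry.
missing = [name for name in sel_attributes if name not in by_name]

print([item['text'] for item in selected])
print(missing)
```

Reporting the missing names separately avoids a `KeyError` when the selection lists attributes that the descriptor does not define.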

In [None]:
from matmodel.method import MethodWrapper

MethodWrapper.providedMethods()

In [None]:
from matmodel.method.MethodWrapper import Param

Param.TYPE_TEXT

\# By Tarlis Portela (2023)