# Lab 11

## Part 1: PAA and SAX

In [1]:
import numpy as np 

T = np.array([-1.1, 0.2, 1.3, -0.9, 0.5, 0.6, -0.4, 0.0]) 
segments = 4 
segment_size = len(T) // segments 
paa = [np.mean(T[i * segment_size:(i + 1) * segment_size]) for i in range(segments)]
print("PAA:", paa) 

PAA: [-0.45000000000000007, 0.2, 0.55, -0.2]
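SAX breakpoints are defined for z-normalized data, and the cell above applies PAA to `T` as given without stating whether `T` was normalized beforehand. The following is a minimal, dependency-free sketch of both steps. The helper names `z_normalize` and `paa_transform` are hypothetical and not part of the lab code. Unlike the cell above, this version raises an error instead of silently dropping trailing points when the length is not a multiple of the segment count.

```python
from statistics import fmean, pstdev


def z_normalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu = fmean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series has no spread; map it to all zeros.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa_transform(series, n_segments):
    """Return the mean of each of n_segments equal-width, non-overlapping chunks."""
    if n_segments <= 0 or len(series) % n_segments != 0:
        raise ValueError("series length must be a positive multiple of n_segments")
    size = len(series) // n_segments
    return [fmean(series[i * size:(i + 1) * size]) for i in range(n_segments)]


T_values = [-1.1, 0.2, 1.3, -0.9, 0.5, 0.6, -0.4, 0.0]
paa_raw = paa_transform(T_values, 4)                 # same computation as the cell above
paa_norm = paa_transform(z_normalize(T_values), 4)   # PAA after z-normalization

print("PAA (raw):", [round(v, 2) for v in paa_raw])
print("PAA (z-normalized):", [round(v, 2) for v in paa_norm])
```

Comparing the two printed lists shows how much normalization shifts the segment means, which matters when a value sits close to a breakpoint.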


In [2]:
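# Breakpoints [-0.43, 0.43] split the standard normal distribution into
# three equiprobable regions, one per letter of a 3-letter alphabet
def sax(paa, breakpoints=(-0.43, 0.43)):
    alphabet = ['a', 'b', 'c']
    # The letter index is the number of breakpoints the PAA value exceeds
    return [alphabet[sum(p > bp for bp in breakpoints)] for p in paa]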

sax_rep = sax(paa) 
print("SAX:", sax_rep) 

SAX: ['a', 'b', 'c', 'b']
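The breakpoints `[-0.43, 0.43]` are the 1/3 and 2/3 quantiles of the standard normal distribution, rounded to two decimals. As a sketch of how they could be derived for other alphabet sizes, the snippet below uses the standard library's `statistics.NormalDist`. The function names `sax_breakpoints` and `sax_word` are hypothetical helpers, not part of the lab code.

```python
from statistics import NormalDist


def sax_breakpoints(alphabet_size):
    """Cut points that split N(0, 1) into alphabet_size equiprobable regions."""
    if not 2 <= alphabet_size <= 26:
        raise ValueError("alphabet_size must be between 2 and 26")
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return [std_normal.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]


def sax_word(paa_values, alphabet_size=3):
    """Map each PAA value to a letter by counting the breakpoints it exceeds."""
    breakpoints = sax_breakpoints(alphabet_size)
    letters = "abcdefghijklmnopqrstuvwxyz"[:alphabet_size]
    return [letters[sum(v > bp for bp in breakpoints)] for v in paa_values]


print([round(bp, 2) for bp in sax_breakpoints(3)])   # [-0.43, 0.43]
print(sax_word([-0.45, 0.2, 0.55, -0.2]))            # ['a', 'b', 'c', 'b']
```

With the PAA values from In [1], this reproduces the `SAX:` output above. A larger `alphabet_size` gives a finer symbolic resolution at the cost of longer, noisier words.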


## Part 2: Time-Series Clustering with SLM-Enhanced Insights 
- Example using DTW and Euclidean Distance 

- Dependencies:

```bash
pip install tslearn
pip install h5py
```

In [8]:
from tslearn.clustering import TimeSeriesKMeans 
from tslearn.metrics import dtw 
import warnings

# Suppress warnings raised by tslearn (note: this silences all warnings globally)
warnings.filterwarnings("ignore")

# Toy dataset with 3 series 
series = np.array([ 
[-1.1, 0.2, 1.3, -0.9, 0.5, 0.6, -0.4, 0.0], 
[-1.2, 0.3, 1.2, -1.0, 0.4, 0.7, -0.3, 0.1], 
[0.9, 0.8, -1.0, -1.1, 1.1, 0.9, -0.7, -0.2] 
])

print(series)

[[-1.1  0.2  1.3 -0.9  0.5  0.6 -0.4  0. ]
 [-1.2  0.3  1.2 -1.   0.4  0.7 -0.3  0.1]
 [ 0.9  0.8 -1.  -1.1  1.1  0.9 -0.7 -0.2]]


In [9]:
# Using DTW 
model = TimeSeriesKMeans(n_clusters=2, metric="dtw", random_state=0) 
labels = model.fit_predict(series) 
print("Cluster labels (DTW):", labels)

Cluster labels (DTW): [0 0 1]
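To see why the first two series share a cluster, it can help to compare pairwise distances directly. The sketch below computes Euclidean and DTW distances in plain Python on the same toy data. The helpers `euclidean` and `dtw_distance` are illustrative and hypothetical. The DTW here uses squared point-wise costs and takes the square root of the accumulated cost, which is assumed (not verified here) to match the definition tslearn uses.

```python
import math


def euclidean(a, b):
    """Point-by-point distance; both series must have the same length."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic-programming DTW with squared local cost."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


toy_series = [
    [-1.1, 0.2, 1.3, -0.9, 0.5, 0.6, -0.4, 0.0],
    [-1.2, 0.3, 1.2, -1.0, 0.4, 0.7, -0.3, 0.1],
    [0.9, 0.8, -1.0, -1.1, 1.1, 0.9, -0.7, -0.2],
]

for i in range(len(toy_series)):
    for j in range(i + 1, len(toy_series)):
        e = euclidean(toy_series[i], toy_series[j])
        d = dtw_distance(toy_series[i], toy_series[j])
        print(f"series {i} vs {j}: euclidean={e:.3f}, dtw={d:.3f}")
```

Series 0 and 1 differ by 0.1 at every point, so they are close under both metrics, while series 2 is far from both. For equal-length series, DTW is never larger than the Euclidean distance because the diagonal alignment is one of the warping paths it considers.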


## Part 3: Using SLM (Gemma 3) for Pattern Annotation 
- SLMs aren’t used directly for clustering, but can annotate or describe patterns.

In [11]:
# configure api
from dotenv import load_dotenv
import os

load_dotenv()
gemini_api_key = os.getenv("GEMINI_API_KEY")

In [12]:
from google import genai

client = genai.Client(api_key=gemini_api_key)

# Note: this rebinds `model`, which held the TimeSeriesKMeans instance in Part 2
model = "gemma-3-27b-it"

In [13]:
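# Note: this pattern is set by hand and differs from the Part 1 result (['a', 'b', 'c', 'b'])
sax_rep = ['a', 'b', 'c', 'a']
pattern = "".join(sax_rep)

# Build the prompt from sax_rep so the pattern in the text always matches the variable
prompt = (
    f"The SAX pattern '{pattern}' was extracted from a normalized time-series.\n"
    "What trend does this pattern suggest? Provide a short explanation for what kind of behavior or change this could represent in time-series data."
)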

In [15]:
from IPython.display import display, Markdown

response = client.models.generate_content(model=model, contents=prompt)
display(Markdown(response.text))

The SAX pattern 'abca' suggests a **rise, fall, rise, then fall** trend. More specifically, it indicates a **wave-like or oscillatory behavior with a slight upward drift**.

Here's a breakdown:

* **a:** Represents a relatively low value (due to normalization).
* **b:** Represents a value higher than 'a' - an increase.
* **c:** Represents a value higher than 'b' - a further increase.
* **a:** Represents a value lower than 'c' - a decrease.

Therefore, the pattern shows an initial increase (a->b->c), followed by a decrease (c->a).  The repetition of this suggests a cyclical pattern.  If the 'a' values are consistently slightly higher than the previous 'a' value, it implies a slow upward trend *within* the oscillations.

**Possible real-world representations:**

* **Seasonal data with growth:**  Like sales that peak each year but show overall growth over time.
* **Fluctuating sensor readings with a bias:** A temperature sensor that oscillates around a slowly increasing average temperature.
* **Heart rate variability:**  A heart rate that speeds up and slows down, potentially indicating fitness or stress levels.
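Model-generated annotations can contain inconsistencies, so a deterministic summary of the SAX word is a useful sanity check. The sketch below labels each step between consecutive letters, relying on the alphabet being ordered from low to high. The function name `describe_transitions` is a hypothetical helper.

```python
def describe_transitions(sax_word):
    """Label each consecutive pair of SAX letters as rise, fall, or flat."""
    labels = []
    for prev, curr in zip(sax_word, sax_word[1:]):
        if curr > prev:      # later letters denote higher value ranges
            labels.append("rise")
        elif curr < prev:
            labels.append("fall")
        else:
            labels.append("flat")
    return labels


print(describe_transitions(['a', 'b', 'c', 'a']))  # ['rise', 'rise', 'fall']
```

For `'abca'` this gives rise, rise, fall. That agrees with the bullet-point breakdown in the response above (a->b->c, then c->a) rather than with its opening "rise, fall, rise, then fall" summary.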



