# Hidden Markov Models

In this homework assignment, you will train and evaluate a toy speech recognizer that can recognize 'yes' and 'no'.  As we move into the latter half of the course, the assignments will include less and less hand-holding, so that you are ready to work independently on the final project.  This assignment should be done in pairs, and only one submission per team is needed on Gradescope.  Your submission should be a single .zip file that contains your Jupyter notebook, audio data, annotation data, and generated prediction files.

Please indicate the team member names here: ________________________

How many hours you each spent on this assignment: _________ (partner 1), __________ (partner 2)

This assignment will be broken down into 5 main sections:
1. Collect data & annotate (10 points)
2. Train model using manual annotations (20 points)
3. Perform inference on test data (20 points)
4. Infer strong labels on weakly labeled data (20 points)
5. Retrain model and evaluate on test data (20 points)

An additional 10 points will be based on how well organized, commented, and readable your code is.

In [1]:
%matplotlib inline

In [2]:
import numpy as np
import matplotlib.pyplot as plt
import librosa as lb
from pathlib import Path
from scipy.stats import multivariate_normal

### Part 1: Data Collection & Annotation

For the data collection, you will record 10 audio clips:
- Training data: 5 recordings.  Five of the audio recordings will be for training.  For these recordings, you should say a random sequence of ten yes's or no's.  For example, one recording might be 'yes no yes yes no no yes yes yes no'.  The recordings do not all have to have the same sequence of yes's and no's.  When recording your speaking, please include a variable-length silence in between each word.
- Test data: 5 recordings.  Five of the audio recordings will be for testing.  For these recordings, you should say a variable-length sequence of yes's and no's.  For example, one recording might be 'yes no' and another might be 'no yes yes no no yes no yes'.  Use a variety of different lengths in your sentences.  Again, please include silence in between each word.

All audio recordings should be done by a single person on a single device in the same environment (e.g. recorded on a cell phone in your dorm room).

Once you have recorded the audio, convert the recordings to a format that librosa can read (e.g. wav or mp3) and put them in a folder named 'audio/'.  Name the files train1.mp3, train2.mp3, test1.mp3, test2.mp3, etc.

In [3]:
AUDIO_DIR = Path('audio') 
ext = '.mp3' # audio format

In [4]:
def verifyAudioData(indir, file_ext):
    # verifies that all the needed audio files are present in indir

    # check train files
    for i in range(5):
        curfile = Path(indir, f'train{i+1}').with_suffix(file_ext)
        assert curfile.is_file(), f'Missing training file {curfile}'

    # check test files
    for i in range(5):
        curfile = Path(indir, f'test{i+1}').with_suffix(file_ext)
        assert curfile.is_file(), f'Missing test file {curfile}'

    print('All required audio data files are present!')

In [5]:
verifyAudioData(AUDIO_DIR, ext)

All required audio data files are present!


Next you will create two different kinds of annotations: weak labels and strong labels.  

**Weak labels**.  Create two annotation files for the weak labels.  The file train.transcription should look like:

train1.mp3|yes no no yes no yes no no yes no\
train2.mp3|no yes yes yes no no yes no yes no\
...\
train5.mp3|yes no yes yes no no no yes no yes

The file test.transcription should look like:

test1.mp3|yes no\
test2.mp3|no yes yes no yes\
...\
test5.mp3|yes no yes no no yes no yes yes yes
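
A transcription file in this format can be read with a few lines of Python. The sketch below is one possible approach rather than a required one: the helper name `parseTranscriptions` is hypothetical, and it assumes exactly one `|` per non-empty line.

```python
def parseTranscriptions(lines):
    """Maps each audio filename to its list of words (hypothetical helper)."""
    transcripts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        filename, words = line.split('|')
        transcripts[filename] = words.split()
    return transcripts

# in practice `lines` would come from open(train_transcripts)
example = ["train1.mp3|yes no no yes", "test1.mp3|yes no"]
parsed = parseTranscriptions(example)
print(parsed['test1.mp3'])   # ['yes', 'no']
```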

**Strong labels**.  Create one annotation file for train1.mp3 (only) containing strong labels.  The file train1.labels should be in the following format:

0.00 1.37 sil\
1.37 2.60 yes\
2.60 5.20 sil\
5.20 6.30 no\
6.30 8.10 sil\
...\
12.30 13.25 yes\
13.25 15.90 sil

For annotating timestamps, I recommend that you use Audacity.  Note that we are only annotating one file with strong labels because this process is time-consuming!

In [6]:
ANNOT_DIR = Path('annot')
train_transcripts = Path(ANNOT_DIR, 'train.transcription')
test_transcripts = Path(ANNOT_DIR, 'test.transcription')
train1_labels = Path(ANNOT_DIR, 'train1.labels')

In [7]:
def verifyAnnotations():
    # verifies that all of the needed annotation files are present

    assert train_transcripts.is_file(), f'Missing transcription file {train_transcripts}'    
    assert test_transcripts.is_file(), f'Missing transcription file {test_transcripts}'    
    assert train1_labels.is_file(), f'Missing label file {train1_labels}'
    print('All required annotation files are present!')
    
    return

In [8]:
verifyAnnotations()

All required annotation files are present!


Using the above format for your label file, you should be able to visualize your labels alongside the audio in Audacity by selecting File --> Import --> Labels.  An example is shown below.  This type of visualization will be useful in evaluating how accurate your alignments and predictions are.

![Snapshot](labels_snapshot.png)

**Graded**: Please include a similar visualization of your data below.  Make sure you include your image file in your submission!

\[INCLUDE VISUALIZATION HERE\]
![Snapshot](test1image.PNG)

### Part 2: Model Training

In this part, you will train an HMM model based (only) on the label file for train1.mp3.  You must complete the implementations of the functions below.  No unit tests will be provided, though, so make sure to check your own work and verify that they are what you expect!

The function below extracts mel-frequency cepstral coefficients (MFCCs).  Unlike chroma features, which focus on pitch (i.e. fine spectral structure), MFCCs focus on timbre (i.e. rough spectral structure) and are helpful for recognizing speech or distinguishing between different instruments.  You may use the librosa function [librosa.feature.mfcc](https://librosa.org/doc/latest/generated/librosa.feature.mfcc.html) with default arguments.

In [9]:
def computeFeatures(audiofile):
    """
    This function extracts mel frequency cepstral coefficients (MFCCs) from a given audio file.
        
    Inputs
        - audiofile: filepath to the audio recording that you want to extract MFCC features from
        
    Outputs
        - O: an F x N array containing MFCC features, where F corresponds to different features and 
             N corresponds to different audio frames
        - hop: the hop size (in seconds) between adjacent frames
    """

    ### INSERT CODE BELOW ###
    winsize = 1024                  # window size   [samples]
    hop_samples = winsize // 4      # hop size      [samples]
    y, sr = lb.load(audiofile)      # load audio samples and sample rate from audiofile
    hop_seconds = hop_samples / sr  # hop size      [seconds]
    O = lb.feature.mfcc(y=y, sr=sr, win_length=winsize, hop_length=hop_samples)  # compute mfcc features
    return (O, hop_seconds)         # return mfcc features and hop size [seconds]

The function below constructs a mapping between the states and their numeric identifiers.  For ease of grading, please use the following mapping:
- sil -> 0
- Y -> 1
- EH -> 2
- S -> 3
- N -> 4
- OH -> 5

In [10]:
def getStateMapping():
    """
    Returns a mapping between the string and numeric representations of the six different states.
    
    Outputs
      - states: a list that contains the states (in order).  This allows you to map from the numeric identifier
        to the string representation (e.g. states[3])
      - stateStr2id: a dict that maps from the state string representation to its numeric identifier (e.g. stateStr2id['EH'])
    """
    ### INSERT CODE BELOW ###
    states = ["sil", "Y", "EH", "S", "N", "OH"]
    stateStr2id = {name: idx for idx, name in enumerate(states)}   # 'sil' -> 0, ..., 'OH' -> 5

    return (states, stateStr2id)

The function below converts a .labels file into a sequence of states per frame.  You may assume that states in a word have equal duration (e.g. if the word 'yes' lasts 1.2 seconds, you can assume that 'Y', 'EH', and 'S' each have duration 0.4 sec).
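
The equal-duration assumption amounts to cutting each word interval into equal pieces. The following is a minimal sketch of that arithmetic; `splitEqually` is a hypothetical helper name, and the timestamps are made-up values that mirror the 1.2 second 'yes' example above.

```python
def splitEqually(t_start, t_end, parts):
    """Splits [t_start, t_end] into `parts` equal sub-intervals (hypothetical helper)."""
    dur = (t_end - t_start) / parts
    return [(t_start + k * dur, t_start + (k + 1) * dur) for k in range(parts)]

# the word 'yes' spanning 1.0 s to 2.2 s gives three 0.4 s states: Y, EH, S
segments = splitEqually(1.0, 2.2, 3)
for name, (t1, t2) in zip(['Y', 'EH', 'S'], segments):
    print(f'{name}: {t1:.2f} to {t2:.2f} s')
```

Dividing each boundary time by the hop size (in seconds) then gives the frame index at which each state starts.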

In [11]:
def getStatesFromLabelFile(labelfile, hopsize, str2id):
    """
    Reads in a label file and returns a sequence of states for each audio frame.  For any given word, it assumes
    that the constituent states all have equal duration.  For example, if the word 'yes' lasts 1.2 seconds, the constituent
    states 'Y', 'EH', and 'S' are assumed to each have duration 0.4 seconds.
    
    Inputs
      - labelfile: filepath to the .labels or .forcealign file 
      - hopsize: the hop size in seconds between adjacent audio frames
      - str2id: dict that maps from the state's string representation to its numeric representation

    Outputs
      - S: list containing the sequence of numeric states for each audio frame
    """
    ### INSERT CODE BELOW ###
    # constituent states of each word-level label
    word2states = {'sil': ['sil'], 'yes': ['Y', 'EH', 'S'], 'no': ['N', 'OH']}

    with open(labelfile) as f:
        # split() handles tab- or space-separated fields and strips inconsistent newline characters
        entries = [line.split()[:3] for line in f if line.strip()]

    endtime = float(entries[-1][1])    # end time of the last label determines the length of S
    S = np.zeros(int(endtime // hopsize), dtype=int)

    for start, end, label in entries:
        t1, t2 = float(start), float(end)
        names = word2states[label]     # raises a KeyError for labels other than sil, yes, or no
        # equally spaced frame boundaries that separate the constituent states
        bounds = [int((t1 + k * (t2 - t1) / len(names)) // hopsize) for k in range(len(names) + 1)]
        for k, name in enumerate(names):
            S[bounds[k]:bounds[k+1]] = str2id[name]

    return S

In [12]:
file= "annot/train1.labels"
hop = 0.1
states, str2id = getStateMapping()
S = getStatesFromLabelFile(labelfile = file, hopsize = hop, str2id = str2id)

This is a good place to verify that your functions are producing correct outputs:

In [13]:
O, hop = computeFeatures(Path(AUDIO_DIR, 'train1.mp3'))
states, stateStr2id = getStateMapping()
S = getStatesFromLabelFile(train1_labels, hop, stateStr2id)

The function below trains an HMM given a list of observations (i.e. MFCC feature matrices) and corresponding states.  A few helpful tips:
- In this part of the assignment, you will only train the model on the train1.mp3 example, but your function below should be able to handle multiple training examples so that you can reuse this function in part 5.
- You may assume that recordings always begin in the silent state.
- You should assume that the emission probability model is a multivariate Gaussian model.
- You should decompose this function and define other sub-functions as needed to keep your code neat and organized.
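
As a rough illustration of the estimates involved, the sketch below computes a transition matrix by counting and a Gaussian mean and covariance for one state. Everything in it is a toy assumption (three states, two-dimensional features, made-up values), and it assumes every state has at least one outgoing transition so that no row sum is zero.

```python
import numpy as np

# toy example: 3 states and one short state sequence (values are made up)
S_toy = np.array([0, 0, 1, 1, 1, 2, 0, 0])
num_states = 3

counts = np.zeros((num_states, num_states))
for cur, nxt in zip(S_toy[:-1], S_toy[1:]):
    counts[cur, nxt] += 1                             # tally each observed i -> j transition
A_toy = counts / counts.sum(axis=1, keepdims=True)    # normalize each row to sum to 1

# toy 2-dimensional features, one column per frame (F x N, like the MFCC matrix)
O_toy = np.array([[0.0, 0.2, 1.0, 1.2, 0.8, 2.0, 0.1, 0.3],
                  [1.0, 0.8, 0.0, 0.2, 0.1, 2.0, 0.9, 1.1]])
frames = O_toy[:, S_toy == 1]    # frames labeled with state 1
mean_1 = frames.mean(axis=1)     # length-F mean vector
cov_1 = np.cov(frames)           # F x F covariance (rows are features)
print(A_toy)
```

With several training recordings, the counts and the per-state frames would be accumulated across recordings before normalizing.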

In [14]:
def trainHMM(O_list, S_list, states):
    """
    Trains an HMM given a list of observations and corresponding states.  The HMM assumes a multivariate Gaussian
    emission probability model.

    Inputs
      - O_list: a list of matrices, where each matrix contains the MFCC coefficients for a single training recording
      - S_list: a list of arrays, where each array specifies the states in each audio frame for a single training recording
      - states: a list specifying the states in their string representation

    Outputs
      - A: the state transition probability matrix
      - pi: the distribution of the initial state
      - means: a matrix where each row specifies the mean of the distribution for a single state
      - covars: a 3D tensor where the first index specifies a state, and the remaining two indices specify the covariance 
        matrix for the state's distribution
    """
    ### INSERT CODE BELOW ###
    # Phases:
    #   1. Count state transitions over all recordings and normalize each row to get A, where
    #      a_ij = (# transitions from i to j) / (# transitions from i to any state)
    #   2. Set pi, assuming that every recording begins in the silent state
    #   3. Compute the mean of all observations assigned to each state
    #   4. Compute the covariance matrix of the observations assigned to each state
    num_states = len(states)
    trans_counts = np.zeros((num_states, num_states))    # transition counts summed over recordings
    frames_by_state = [[] for _ in range(num_states)]    # observations grouped by state

    for O_record, S_record in zip(O_list, S_list):
        N = min(O_record.shape[1], len(S_record))        # the labels may not cover every audio frame
        S_record = np.asarray(S_record[:N], dtype=int)
        trans_counts += countTransitions(S_record, num_states)
        for s in range(num_states):
            frames_by_state[s].append(O_record[:, :N][:, S_record == s])

    A = normalizeRows(trans_counts)
    pi = findPi(num_states)
    means, covars = estimateGaussians(frames_by_state)
    return (A, pi, means, covars)


def findPi(num_states):
    """Returns the initial state distribution, assuming every recording begins in silence (state 0)."""
    pi = np.zeros(num_states)
    pi[0] = 1.0
    return pi


def countTransitions(S_record, num_states):
    """Counts transitions in one numeric state sequence; entry (i, j) is the number of i -> j transitions."""
    counts = np.zeros((num_states, num_states))
    for cur, nxt in zip(S_record[:-1], S_record[1:]):
        counts[cur, nxt] += 1
    return counts


def normalizeRows(counts):
    """Divides each row by its sum so that it is a probability distribution (all-zero rows stay zero)."""
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


def estimateGaussians(frames_by_state):
    """
    Estimates a mean vector and a covariance matrix for each state from the list of F x n matrices
    of frames assigned to it.  A state with fewer than two frames has no well-defined covariance.
    """
    means, covars = [], []
    for frame_list in frames_by_state:
        X = np.hstack(frame_list)        # F x (total number of frames in this state)
        means.append(X.mean(axis=1))
        covars.append(np.cov(X))         # rows of X are features; normalizes by (n - 1)
    return np.array(means), np.array(covars)

Use the functions defined above to train an HMM model on the train1.labels file (only).  For grading purposes, please print out the following variables (and make sure your submitted notebook is showing the values): 
- the state transition probability matrix A
- the means for all six states
- the covariance matrix for 'sil'

In [15]:
### INSERT AS MANY CELLS AS NEEDED BELOW ###
## RAFAEL 
audiofile = Path(AUDIO_DIR, "train1.mp3")
labelfile = train1_labels   # the strong labels are stored in ANNOT_DIR, not AUDIO_DIR

train1_obs, hop = computeFeatures(audiofile = audiofile) # compute observations and hop size from audio file

states, stateStr2id = getStateMapping() # retrieve list of possible states and mapping between string and integer state representation

train1_states = getStatesFromLabelFile(labelfile = labelfile, hopsize = hop, str2id = stateStr2id) # generate state list for audio file from strong labels

A, pi, means, covars = trainHMM([train1_obs], [train1_states], states) # train the HMM on train1 (only)

**Graded**: Print out A below

In [None]:
A

**Graded**: Print out the state distribution means below

In [None]:
means

**Graded**: Print out the covariance matrix for 'sil' below

In [None]:
covars[0]

### Part 3: Inference

In this part, we will use our trained model from part 2 to estimate the state sequence on test recordings.  You must complete the implementations of the functions below.  Again, no unit tests will be provided, so make sure to check your own work!

The function below calculates a pairwise similarity matrix, assuming a Gaussian emission probability model.  You may use the scipy implementation of [multivariate_normal](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.multivariate_normal.html) in your implementation.
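
For reference, each entry of this matrix is the standard multivariate Gaussian density evaluated at one observation. The symbols below are notation introduced here for illustration: $o_n$ is the $n$-th column of `O`, $\mu_m$ is the $m$-th row of `means`, $\Sigma_m$ is the $m$-th slice of `covars`, and $\pi$ is the mathematical constant (not the initial state distribution).

$$
\mathrm{prob}[m, n] = \frac{1}{\sqrt{(2\pi)^{D} \det \Sigma_m}} \exp\!\left(-\tfrac{1}{2} (o_n - \mu_m)^{\top} \Sigma_m^{-1} (o_n - \mu_m)\right)
$$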

In [17]:
# RAFAEL 
def calcSimilarity_multivariateGaussian(O, means, covars):
    """
    Calculates the matrix of likelihoods for a sequence of observations and a set of multivariate Gaussian models.

    Inputs
      - O: a D x N observation matrix, where D is the dimensionality of the feature representation and N is the number
        of observations
      - means: an M x D matrix specifying the distribution means, where M is the number of multivariate Gaussian models 
      - covars: an M x D x D array specifying the distribution covariance matrices, where the first index specifies the model
        and the remaining two indices specify the model's covariance matrix

    Outputs
      - prob: an M x N matrix specifying model likelihoods, where M corresponds to the different models and where N corresponds
        to the different observations
    """
    ### INSERT CODE BELOW ###
    
    M = means.shape[0]  # unpack required matrix dimensions
    N = O.shape[1]      

    prob = np.zeros((M,N))

    for row in range(M):  # loop through models
        model_E = means[row, :]   # unpack current model mean
        model_Sig = covars[row, :, :]   # unpack current model covariance
        dist = multivariate_normal(mean = model_E, cov = model_Sig)   # generate gaussian for current model
        prob[row, :] = dist.pdf(O.T)  # scipy expects one observation per row, so transpose the D x N matrix

    return prob

The function below implements the Viterbi algorithm from scratch.  Since there are lots of implementations of Viterbi online, you should not consult any direct implementations.  If you are unable to complete this part on your own, you may consult an online implementation for a grade deduction.  If you do so, please cite the resource and describe the extent of the assistance so that points may be deducted appropriately.
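
As a reminder of what the algorithm computes, the recurrence can be written in the log domain as follows. The notation is an assumption chosen for illustration: $D$ holds cumulative log scores, $B$ holds backpointers, $a_{ij}$ is entry $(i, j)$ of `A`, $b_j(n)$ is entry $(j, n)$ of `prob`, and $\pi_j$ is the initial probability of state $j$.

$$
D[j, 0] = \log \pi_j + \log b_j(0)
$$

$$
D[j, n] = \max_i \left( D[i, n-1] + \log a_{ij} \right) + \log b_j(n), \qquad B[j, n] = \arg\max_i \left( D[i, n-1] + \log a_{ij} \right)
$$

The best path is then read backwards: $\hat{s}_{N-1} = \arg\max_j D[j, N-1]$ and $\hat{s}_{n-1} = B[\hat{s}_n, n]$ for $n = N-1, \ldots, 1$.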

**Graded**: Please cite any resources you consulted in implementing the function below, and the extent of the assistance: 

\<PUT RESPONSE HERE\>

In [35]:
# RAFAEL
def viterbi(prob, A, pi):
    """
    Inputs
      - prob: an M x N matrix specifying model likelihoods, where M is the number of models and N is the number of observations
      - A: an M x M transition probability matrix
      - pi: a length M array specifying the initial state probability distribution
        
    Outputs
      - S_est: the estimated sequence of (numeric) states
    """
    ### INSERT CODE BELOW ###

    M, N = prob.shape

    pi = np.asarray(pi).reshape(M)   # make sure pi is a flat, length-M vector

    #### CONSTRUCT D AND B MATRICES

    D = np.zeros((M, N))             # Allocate cumulative (log) path score matrix D
    B = np.zeros((M, N), dtype=int)  # Allocate backtrace matrix B

    with np.errstate(divide='ignore'):   # log(0) = -inf is intended for impossible events
        logA, logprob, logpi = np.log(A), np.log(prob), np.log(pi)

    D[:, 0] = logpi + logprob[:, 0]  # Initialize first column of D

    for col in range(1, N):
        for row in range(M): # iterate through remaining entries
            # scores of arriving in the state specified by the current row from states 1,2,...,M
            scores = D[:, col-1] + logA[:, row]
            B[row, col] = np.argmax(scores)                         # backpointer to the best previous state
            D[row, col] = scores[B[row, col]] + logprob[row, col]   # add the likelihood of the current state only

    #### EXTRACT ESTIMATED PATH THROUGH D
    S_est = np.zeros(N, dtype=int)
    S_est[-1] = np.argmax(D[:, -1])  # best final state

    for col in range(N-1, 0, -1):    # follow the backpointers from the last frame to the first
        S_est[col-1] = B[S_est[col], col]

    return S_est

M = 4
N = 5
prob = np.zeros((M,N))
prob[0,0] = 1
prob[1,1] = 1
prob[0,2] = 1
prob[2,3] = 1
prob[3,4] = 1
pi = np.array([0.1, 0.1, 0.2, 0.3])
A = np.zeros((M, M)) + 0.25

viterbi(prob, A, pi)

array([0, 1, 0, 2, 3])

Using the two functions above, estimate the state sequence for each test recording and generate the corresponding .labels file (it can have a different extension but should have the same format so as to be readable by Audacity).  Include a visualization of your estimated states alongside the audio in Audacity (as shown above).  You may use as many code cells as needed, and be sure to decompose your code appropriately!
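
One way to turn a per-frame state sequence into label rows is to merge runs of identical states, as in the sketch below. The helper name `statesToLabelRows` is hypothetical, the state sequence and hop size are made-up values, and the tab-separated output is an assumption about what Audacity's label import expects.

```python
from itertools import groupby

def statesToLabelRows(S_est, hop, states):
    """Collapses a per-frame state sequence into (start, end, name) rows (hypothetical helper)."""
    rows, frame = [], 0
    for state_id, run in groupby(S_est):
        length = len(list(run))                  # number of consecutive frames in this state
        rows.append((frame * hop, (frame + length) * hop, states[int(state_id)]))
        frame += length
    return rows

states = ['sil', 'Y', 'EH', 'S', 'N', 'OH']
rows = statesToLabelRows([0, 0, 1, 2, 2, 3, 0], hop=0.1, states=states)
for start, end, name in rows:
    print(f'{start:.2f}\t{end:.2f}\t{name}')     # one label per line
```

Writing these lines to a text file instead of printing them gives a file in the same format as train1.labels, but at the state level.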

In [None]:
### INSERT AS MANY CELLS AS NEEDED BELOW ###


Comment on what you observe in your estimated state sequence, and propose some ideas on how you might improve the system.

**Graded**: 

\<INSERT VISUALIZATION & RESPONSE HERE\>

### Part 4: Forced Alignment

In this part, you will perform forced alignment to determine the correspondence between the states in a given word-level transcription and the corresponding audio recording.  Your goal is to implement the function below, and then use it to determine the state-level alignment for train1.mp3.  Make sure to decompose your function appropriately!
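
A natural first step is to expand the word-level transcript into the sequence of states that the audio must pass through. The sketch below shows one possible decomposition; `WORD_STATES` and `transcriptToStates` are hypothetical names, and it assumes that every recording begins and ends in silence with a silence between consecutive words.

```python
WORD_STATES = {'yes': ['Y', 'EH', 'S'], 'no': ['N', 'OH']}   # constituent states of each word

def transcriptToStates(word_transcript, stateStr2id):
    """Expands a word-level transcript into state ids with silence around every word (hypothetical helper)."""
    seq = [stateStr2id['sil']]
    for word in word_transcript.split():
        if word not in WORD_STATES:
            raise ValueError(f'Unexpected word in transcript: {word}')
        seq += [stateStr2id[s] for s in WORD_STATES[word]]
        seq.append(stateStr2id['sil'])
    return seq

stateStr2id = {'sil': 0, 'Y': 1, 'EH': 2, 'S': 3, 'N': 4, 'OH': 5}
print(transcriptToStates('yes no', stateStr2id))   # [0, 1, 2, 3, 0, 4, 5, 0]
```

The alignment itself then searches for the best monotonic path that assigns every audio frame to one element of this expanded sequence.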

In [None]:
def forcedAlignment(audiofile, word_transcript, model, stateStr2id):
    """
    Performs forced alignment between a given word-level transcription and the corresponding audio recording.

    Inputs
      - audiofile: filepath to the audio recording
      - word_transcript: a string indicating the word-level transcription.  The transcription should only
        contain 'yes' and 'no'; the function will raise an error if it contains anything other than these two words
      - model: tuple of (A, pi, means, covars) specifying the trained HMM
      - stateStr2id: a dict that maps from the state string representation to its numeric identifier (e.g. stateStr2id['EH'])

    Output
      - alignment: an array specifying the coordinates of the forced alignment
    """
    ### INSERT CODE BELOW ###
    raise NotImplementedError

    return alignment

Once you have implemented the forced alignment function above, use it to estimate the state-level alignment for train1.mp3.  Visualize the predicted alignment alongside the audio in Audacity, and also include the word-level alignment from part 1 (that you manually created).  Comment on how the forced alignment method improves the quality of the alignment.

In [None]:
### INSERT AS MANY CELLS AS NEEDED BELOW ###


**Graded**: Include the visualization below and comment on what you observe.

\[Show predicted alignment in Audacity\]

### Part 5: Retrain Model

In the last part of the assignment, you will use your initial model from part 2, perform forced alignment to generate .forcealign files for all weakly labeled training data, re-train your HMM, and then perform inference on the test data with the new model.  Provide a snapshot in Audacity comparing the predictions from part 3 and part 5 on a single test file.  Comment on any differences you observe, what the re-trained model is doing well, and where the re-trained model is making errors.

In [None]:
### INSERT AS MANY CELLS AS NEEDED BELOW ###


**Graded**:  
\[INSERT VISUALIZATION & COMMENTS HERE\]