# P2 - User manual

**Students:** Maximiliano Hormazábal - Mutaz Abueisheh

This project is available on GitHub. [Click here to go to the repository](https://github.com/maxhormazabal/depencendy_parsing)

### 0. Identify the files:

This project has two main files:
- `p2_preprocessing.ipynb`: walks through every step to get the raw data, transform it, and save the final data set for later use.
- `p2_ann_models.ipynb`: contains all the model architectures. This notebook does not transform the data; it only runs experiments and collects results.

### 1. Install/Import libraries

In this project, some steps create files and folders. If you are going to work on Google Colab, it is crucial to connect your notebook to your Google Drive profile so that those files and folders can be managed. Do the following:

In [None]:
# Getting access to Google Drive files
from google.colab import drive
drive.mount('/content/drive')


The first step is to import the libraries we are going to use (or install them if necessary). The most important are:
- Conllu to read the language files correctly (in this case we will use the parse module)
- Tensorflow to work with ANN, Tokenizer and other Machine Learning Tools
- Pandas to create/transform dataframes
- Numpy to work with numbers and some data structures
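
Before installing anything, it can help to check which of these libraries are already available. The following is a minimal sketch using only the standard library. The import names in `REQUIRED` are assumed to match the libraries listed above, and `missing_packages` is a hypothetical helper that is not part of the project.

```python
import importlib.util

# Import names assumed for the libraries listed above
REQUIRED = ["conllu", "tensorflow", "pandas", "numpy"]

def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Every name printed here would still need to be installed
print("Missing:", missing_packages(REQUIRED))
```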

To install a library, use `!pip install <name>` with the name of the specific tool that you need.

The imports are as follows:

In [None]:
# Libraries
import tensorflow as tf
from keras import Input, Model
from keras.layers import Dense
import pandas as pd
import numpy as np
import math
import time

# Setting working directory and importing functions
import os
os.chdir("/content/drive/MyDrive/MASTER") # <- Folder where you saved the utils
from nlu_model_utils import *

## Preprocessing Notebook

### 2. Download the language datasets

To find the dataset that you want to use, visit this <a href="https://github.com/UniversalDependencies/" target="_blank">GitHub repository</a>; many languages are available. If you are looking for a specific language, the "Repositories" search bar is useful.
    
<img src="https://raw.githubusercontent.com/maxhormazabal-test/nlu-p1/main/1.png" width="700">

When you are ready do the following:
- Go to the repository of your selected language
- As you can see, there are 3 files: training, testing, validation (train, test, dev)

For instance

<img src="https://raw.githubusercontent.com/maxhormazabal-test/nlu-p1/main/2.png" width="700">

- Click on one of them (any of them works) and then click on "View raw"

<img src="https://raw.githubusercontent.com/maxhormazabal-test/nlu-p1/main/3.png" width="700">

- This is the raw file, the data that we need. Just copy the URL.

<div style="border: 1px solid #2596be;padding:10px;">
    <p style="text-align: center;"><strong>Important:</strong> You do not need to repeat this process for the other two files. We need only one URL for <strong>each language</strong>.</p>

</div>

<img src="https://raw.githubusercontent.com/maxhormazabal-test/nlu-p1/main/4.png" width="700">

### 3. Read the data with Python

Now that we have the address of the data, let's read this file and save it as a Python data structure. This is an example with _English data_ using the URL `https://raw.githubusercontent.com/UniversalDependencies/UD_English-EWT/master/en_ewt-ud-train.conllu`. In this case it is the URL of the training set but, as noted previously, it does not matter which data set you choose. We just need one of them.

Split the URL into two variables:

- `base_url`: contains the URL up to and including "/master/"
- `file_basename`: contains the file name up to (but not including) the last hyphen, that is, without the `-train.conllu` suffix

The next example makes this clear:

In [None]:
# English
# 'https://raw.githubusercontent.com/UniversalDependencies/UD_English-EWT/master/en_ewt-ud-train.conllu'

base_url = 'https://raw.githubusercontent.com/UniversalDependencies/UD_English-EWT/master/'
file_basename = 'en_ewt-ud'
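
The split can also be done programmatically. This is a minimal sketch, assuming the three files of a repository share the same prefix and differ only in the `train`, `dev` or `test` suffix, as in the example URL. `split_ud_url` and `build_split_urls` are hypothetical helpers; how `preprocessingOneStep` builds its URLs internally is not shown in this manual.

```python
def split_ud_url(raw_url):
    """Split a raw .conllu URL into (base_url, file_basename)."""
    base, _, filename = raw_url.rpartition("/")
    # 'en_ewt-ud-train.conllu' -> 'en_ewt-ud' (drop everything after the last hyphen)
    file_basename = filename.rsplit("-", 1)[0]
    return base + "/", file_basename

def build_split_urls(base_url, file_basename):
    """Rebuild the URL of each split from the two variables."""
    return {split: f"{base_url}{file_basename}-{split}.conllu"
            for split in ("train", "dev", "test")}

url = "https://raw.githubusercontent.com/UniversalDependencies/UD_English-EWT/master/en_ewt-ud-train.conllu"
base_url, file_basename = split_ud_url(url)
for split, split_url in build_split_urls(base_url, file_basename).items():
    print(split, split_url)
```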

After you have done this for all the languages that you want, we can start with the preprocessing.

<div style="border: 1px solid #2596be;padding:10px;">
    <p style="text-align: center;"> <strong> Important:</strong> We are going to use self-made functions to make this process more understandable. The final cell contains all the functions so they can be run together (these functions must be in memory for the notebook to work).</p>

</div>

### 4. Preprocessing data

The function `preprocessingOneStep` performs all the preprocessing steps in a single call. Its output is the data sets and the path where that data is stored. The preprocessing notebook is only in charge of this part of the project: the data used by the models is saved as files under that relative path.

With the variables:

```
stack_len = 7
buffer_len = 10
```

You can set the size of the stack and the buffer for the sets. Run this cell every time you need to create a new data source for the models.

The code is the following:

In [None]:
base_url = 'https://raw.githubusercontent.com/UniversalDependencies/UD_English-ParTUT/master/'
file_basename = 'en_partut-ud'
stack_len = 7
buffer_len = 10

(path,
x_train_token,action_encod_train,deprel_encod_train,
x_test_token,action_encod_test,deprel_encod_test,
x_val_token,action_encod_val,deprel_encod_val) = preprocessingOneStep(base_url,file_basename,stack_len,buffer_len)


!mkdir -p {path}


saveData(path,x_train_token,action_encod_train,deprel_encod_train,x_test_token,action_encod_test,deprel_encod_test,x_val_token,action_encod_val,deprel_encod_val)

Notice that after `preprocessingOneStep` runs, the line `!mkdir -p {path}` is executed. The `!` symbol lets us send a command to the terminal, here to create a directory where the NumPy files of each set are saved.

After creating the folder we use `saveData` to save the files in your Google Drive directory. This is a fundamental step because the models notebook gets its data just by reading these files; that way we do not have to repeat the preprocessing every time.

**If you run it many times, the folder should look like this:**



<img src="https://raw.githubusercontent.com/maxhormazabal/depencendy_parsing/main/img/nlu_data_folder.png" width="300">
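
As a rough illustration of what `stack_len` and `buffer_len` control, the sketch below turns a parser state into two fixed-size lists. It rests on assumptions that this manual does not confirm: that the top of the stack is kept and padded on the left, that the front of the buffer is kept and padded on the right, and that `0` is the padding id. `fix_lengths` is a hypothetical helper, not a function of the project.

```python
PAD = 0  # assumed id for empty positions

def fix_lengths(stack, buffer, stack_len, buffer_len, pad=PAD):
    """Return fixed-size views of a parser state (assumes positive lengths)."""
    top = stack[-stack_len:]       # the top of the stack is its last element
    front = buffer[:buffer_len]    # the next words to read come first
    fixed_stack = [pad] * (stack_len - len(top)) + top
    fixed_buffer = front + [pad] * (buffer_len - len(front))
    return fixed_stack, fixed_buffer

print(fix_lengths([5, 8], [3, 9, 4, 7], stack_len=3, buffer_len=3))
# → ([0, 5, 8], [3, 9, 4])
```

Whatever the exact convention, every sample ends up with `stack_len + buffer_len` input values, which is why the models must be built with the same two numbers that were used during preprocessing.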

## Models Notebook

After generating the data needed to train the models, the neural network tests begin. For this it is necessary to know the relevant functions:

1. `buildModelA` allows us to build and compile the model selecting all the important parameters

In [None]:
def buildModelA(stack_len,buffer_len,action_shape,deprel_shape,optimizer='adam',learning_rate=0.01,embedd_output=50,loss='categorical_crossentropy'):
  input1 = tf.keras.layers.Input(shape=(stack_len,),name = 'Stack_Input')
  input2 = tf.keras.layers.Input(shape=(buffer_len,),name = 'Buffer_Input')

  embedding_layer = tf.keras.layers.Embedding(10000, embedd_output,mask_zero=True)
  input1_embedded = embedding_layer(input1)
  input2_embedded = embedding_layer(input2)

  lstm1 = tf.keras.layers.LSTM(embedd_output,return_sequences=False,name="LSTM_Layer1")(input1_embedded)
  lstm2 = tf.keras.layers.LSTM(embedd_output,return_sequences=False,name="LSTM_Layer2")(input2_embedded)

  # Concatenate along the last axis
  merged = tf.keras.layers.Concatenate(axis=1,name = 'Concat_Layer')([lstm1, lstm2])
  dense1 = tf.keras.layers.Dense(50, activation='sigmoid', use_bias=True,name = 'Dense_Layer1')(merged)
  dense2 = tf.keras.layers.Dense(15, input_dim=1, activation='relu', use_bias=True,name = 'Dense_Layer2')(dense1)
  dense3 = tf.keras.layers.Dense(30, input_dim=1, activation='relu', use_bias=True,name = 'Dense_Layer3')(dense2)
  output1 = tf.keras.layers.Dense(action_shape, activation='softmax', use_bias=True,name = 'Action_Output')(dense3)
  output2 = tf.keras.layers.Dense(deprel_shape, activation='softmax', use_bias=True,name = 'Deprel_Output')(dense3)

  model = tf.keras.Model(inputs=[input1,input2],outputs=[output1,output2])
  
  if(optimizer.lower() == 'adam'):
    opt = tf.keras.optimizers.Adam(learning_rate=learning_rate)
  elif(optimizer.lower() == 'sgd'):
    opt = tf.keras.optimizers.SGD(learning_rate=learning_rate)
  elif(optimizer.lower() == 'rmsprop'):
    opt = tf.keras.optimizers.RMSprop(learning_rate=learning_rate)
  elif (optimizer.lower() == 'adamw'):
    opt = tf.keras.optimizers.AdamW(learning_rate=learning_rate)
  elif (optimizer.lower() == 'adadelta'):
    opt = tf.keras.optimizers.Adadelta(learning_rate=learning_rate)
  elif (optimizer.lower() == 'adagrad'):
    opt = tf.keras.optimizers.Adagrad(learning_rate=learning_rate)
  elif (optimizer.lower() == 'adamax'):
    opt = tf.keras.optimizers.Adamax(learning_rate=learning_rate)
  elif (optimizer.lower() == 'adafactor'):
    opt = tf.keras.optimizers.Adafactor(learning_rate=learning_rate)
  elif (optimizer.lower() == 'nadam'):
    opt = tf.keras.optimizers.Nadam(learning_rate=learning_rate)
  elif (optimizer.lower() == 'ftrl'):
    opt = tf.keras.optimizers.Ftrl(learning_rate=learning_rate)
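  else:
    # Fail early: without this, `opt` would be undefined when compiling the model
    raise ValueError('Optimizer not properly defined: ' + str(optimizer))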

  model.compile(loss=loss,optimizer=opt,metrics=['accuracy'])
  return(model)
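
The chain of `elif` branches can also be written as a dictionary lookup. The sketch below is one possible alternative and not the code used in the notebook: `OPTIMIZER_CLASSES`, `optimizer_class_name` and `make_optimizer` are hypothetical names, and the class names are the ones that appear in `buildModelA`. Whether every class is available depends on the installed TensorFlow version.

```python
# Names accepted by buildModelA, mapped to the class names under tf.keras.optimizers
OPTIMIZER_CLASSES = {
    "adam": "Adam", "sgd": "SGD", "rmsprop": "RMSprop", "adamw": "AdamW",
    "adadelta": "Adadelta", "adagrad": "Adagrad", "adamax": "Adamax",
    "adafactor": "Adafactor", "nadam": "Nadam", "ftrl": "Ftrl",
}

def optimizer_class_name(name):
    """Resolve a case-insensitive optimizer name or raise ValueError."""
    key = name.lower()
    if key not in OPTIMIZER_CLASSES:
        raise ValueError(f"Unknown optimizer: {name!r}")
    return OPTIMIZER_CLASSES[key]

def make_optimizer(name, learning_rate):
    """Instantiate the optimizer with the given learning rate."""
    import tensorflow as tf  # imported lazily so the lookup works without TensorFlow
    optimizer_class = getattr(tf.keras.optimizers, optimizer_class_name(name))
    return optimizer_class(learning_rate=learning_rate)

print(optimizer_class_name("RMSprop"))  # → RMSprop
```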

2. `fitModel` takes the model created with the previous function and trains it.

In [None]:
def fitModel(x_train_stack,x_train_buffer,action_train, deprel_train,
              x_val_stack,x_val_buffer,action_val,deprel_val,
              x_test_stack,x_test_buffer,action_test,deprel_test,
              model,
              stopper,patience=3,epochs=10,batch_size=128):
  callback = tf.keras.callbacks.EarlyStopping(monitor=stopper, patience=patience,restore_best_weights=True)
  model.fit([x_train_stack,x_train_buffer],
            [action_train, deprel_train],
            epochs=epochs, batch_size=batch_size,
            callbacks=[callback],
            verbose = 0,
            validation_data=([x_val_stack,x_val_buffer],[action_val,deprel_val]))
  score = model.evaluate([x_test_stack,x_test_buffer],[action_test, deprel_test], verbose=0)
  return score
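
To see how `patience` interacts with the monitored value, here is a simplified emulation of the early-stopping rule in plain Python. It is only a sketch of the idea (stop after `patience` consecutive epochs without improvement, remember the best epoch), not the Keras implementation, and `early_stopping_epoch` is a hypothetical helper.

```python
def early_stopping_epoch(losses, patience=3):
    """Return (epoch where training stops, epoch with the best loss), both 0-based."""
    best_loss = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0   # improvement: reset the counter
        else:
            wait += 1
            if wait >= patience:                           # too many epochs without improvement
                return epoch, best_epoch
    return len(losses) - 1, best_epoch                     # ran through all the epochs

print(early_stopping_epoch([0.9, 0.7, 0.6, 0.65, 0.61, 0.62, 0.5], patience=3))
# → (5, 2)
```

In this example the last value (0.5) is never reached: training stops at epoch 5 and, with `restore_best_weights=True`, the weights of epoch 2 would be the ones evaluated.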

3. Finally, `saveModelAData` is a function that can be used to test different configurations of a specific parameter. It trains one model per value of the selected pivot parameter and saves the results as a data frame. The output of this function is the summarized data set and the best model of each execution.

**Important**: The metrics are calculated with the Keras `evaluate` function using the testing set.

In [None]:
def saveModelAData(pivot_name,pivot,
                   x_train_stack,x_train_buffer,action_train, deprel_train,
                   x_val_stack,x_val_buffer,action_val,deprel_val,
                   x_test_stack,x_test_buffer,action_test,deprel_test,
                   stack_len,buffer_len,
                   stopper,patience,
                   batch_size,epochs,
                   optimizer,learning_rate,
                   embedd_output=50):
  # Creating empty lists
  arquitecture_set = []
  stack_set = []
  buffer_set = []
  action_accuracy_set = []
  deprel_accuracy_set = []
  action_loss_set = []
  deprel_loss_set = []
  batch_size_set = []
  epochs_set = []
  optimizer_set = []
  learning_rate_set = []
  embedd_output_set = []
  early_stop_set = []
  time_set = []

  models = []


  for (index,value) in enumerate(pivot):

    print('Starting execution where ',pivot_name,' varies, now with the value(s) ',value,'.')

    if (pivot_name == 'batch_size'):
      batch_size = value
    elif (pivot_name == 'epochs'):
      epochs = value
    elif (pivot_name == 'early_stop'):
      (stopper,patience) = value
    elif (pivot_name == 'optimizer'):
      (optimizer,learning_rate) = value

    arquitecture = 'A'
    start_time = time.time()

    model = buildModelA(stack_len,buffer_len,action_train[0].shape[0],deprel_train[0].shape[0],optimizer=optimizer,learning_rate=learning_rate,embedd_output=embedd_output,loss='categorical_crossentropy')

    score = fitModel(x_train_stack,x_train_buffer,action_train, deprel_train,
                  x_val_stack,x_val_buffer,action_val,deprel_val,
                  x_test_stack,x_test_buffer,action_test,deprel_test,
                  model,
                  stopper,patience,epochs,batch_size)
    loss,Action_Output_loss,Deprel_Output_loss,Action_Output_accuracy,Deprel_Output_accuracy = score

    end_time = time.time()

    training_time = (end_time - start_time)

    # Saving models
    models.append(model)

    # Append values
    arquitecture_set.append(arquitecture)
    stack_set.append(stack_len)
    buffer_set.append(buffer_len)
    action_accuracy_set.append(Action_Output_accuracy)
    deprel_accuracy_set.append(Deprel_Output_accuracy)
    action_loss_set.append(Action_Output_loss)
    deprel_loss_set.append(Deprel_Output_loss)
    batch_size_set.append(batch_size)
    epochs_set.append(epochs)
    optimizer_set.append(optimizer)
    learning_rate_set.append(learning_rate)
    embedd_output_set.append(embedd_output)
    early_stop_set.append(stopper)
    time_set.append(training_time)

  # Data dictionary

  resultDict = {
    'arquitecture' : arquitecture_set,
    'stack' : stack_set,
    'buffer' : buffer_set,
    'action_accuracy' : action_accuracy_set,
    'deprel_accuracy' : deprel_accuracy_set,
    'action_loss' : action_loss_set,
    'deprel_loss' : deprel_loss_set,
    'batch_size' : batch_size_set,
    'epochs' : epochs_set,
    'optimizer' : optimizer_set,
    'learning_rate' : learning_rate_set,
    'embedd_output' : embedd_output_set,
    'early_stop_set' : early_stop_set,
    'time' : time_set,
  }

  return (pd.DataFrame(resultDict),models)
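
Stripped of the model code, the pivot idea is a loop that overrides one setting (or a pair of settings) per run and collects one result row each time. The sketch below shows that pattern in isolation. `sweep` and `fake_run` are hypothetical, and `fake_run` only stands in for "build, fit and evaluate"; its return value is a dummy number, not a real metric.

```python
def sweep(pivot_name, pivot_values, defaults, run):
    """Call run(**config) once per pivot value and collect one result row each."""
    rows = []
    for value in pivot_values:
        config = dict(defaults)                    # start from the default configuration
        if isinstance(pivot_name, tuple):
            config.update(zip(pivot_name, value))  # several parameters vary together
        else:
            config[pivot_name] = value
        rows.append({**config, **run(**config)})
    return rows

# Stand-in for "build, fit and evaluate": it only computes a dummy value
def fake_run(batch_size, epochs):
    return {"steps": epochs * (1000 // batch_size)}

for row in sweep("batch_size", [100, 500], {"batch_size": 100, "epochs": 2}, fake_run):
    print(row)
```

A list of such rows can be passed directly to `pd.DataFrame`, which is another way to build the summary table without keeping one list per column.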

One execution of these three functions together looks like this:

In [None]:
# Testing different optimizers

pivot_name = 'optimizer'
pivot = [
    ('adam' , 0.001),
    ('adam' , 0.01),
    ('adam' , 0.1),
    ('sgd' , 0.001),
    ('sgd' , 0.01),
    ('sgd' , 0.1),
    ('rmsprop' , 0.001),
    ('rmsprop' , 0.01),
    ('rmsprop' , 0.1),
    ('adadelta' , 0.001),
    ('adadelta' , 0.01),
    ('adadelta' , 0.1),
    ('adagrad' , 0.001),
    ('adagrad' , 0.01),
    ('adagrad' , 0.1),
    ('adamax' , 0.001),
    ('adamax' , 0.01),
    ('adamax' , 0.1),
    ('nadam' , 0.001),
    ('nadam' , 0.01),
    ('nadam' , 0.1),
    ('ftrl' , 0.001),
    ('ftrl' , 0.01),
    ('ftrl' , 0.1)
]

# Setting values
stack_len = 3
buffer_len = 3
arquitecture = 'A'
stopper = 'val_Deprel_Output_loss'
patience = 3
batch_size = 1024
epochs = 10
optimizer = 'adam'
learning_rate = 0.01
embedd_output = 50

(df_res_optimizers,models_optimizers) = saveModelAData(pivot_name,pivot,
                   x_train_stack,x_train_buffer,action_train, deprel_train,
                   x_val_stack,x_val_buffer,action_val,deprel_val,
                   x_test_stack,x_test_buffer,action_test,deprel_test,
                   stack_len,buffer_len,
                   stopper,patience,
                   batch_size,epochs,
                   optimizer,learning_rate,
                   embedd_output=50)

The resulting data frame looks like this:

<img src="https://raw.githubusercontent.com/maxhormazabal/depencendy_parsing/main/img/df_results_example.png" width="700">

You can find all the data sets with the results (each based on a pivot parameter) in the folder `model_testing` in `.csv` format.
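
Once the results are stored as `.csv`, picking the best configuration is a short task. The following is a minimal sketch with the standard `csv` module, assuming the files keep the column names of the result dictionary (for example `deprel_accuracy`). `load_results` and `best_row` are hypothetical helpers, and the file name in the usage comment is a placeholder.

```python
import csv

def load_results(csv_path):
    """Read a results file into a list of dictionaries (one per row)."""
    with open(csv_path, newline="") as handle:
        return list(csv.DictReader(handle))

def best_row(rows, metric="deprel_accuracy"):
    """Return the row with the highest value of `metric`."""
    return max(rows, key=lambda row: float(row[metric]))

# Hypothetical usage; replace the placeholder with a real file of the folder
# rows = load_results("model_testing/<results file>.csv")
# print(best_row(rows, metric="action_accuracy"))
```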

In [None]:
en_train_df = addTextColumn(en_train_df) #2 
(en_tokenizer,en_word_index) = trainTokenizer(en_train_df["sentence_as_string"]) #3

In [None]:
def addTextColumn(df):
    df2 = df.copy()
    for row in range(0,len(df2)):
      empty_string = ""
      for element in df2.loc[row,'sentences']:
        empty_string = empty_string + " " + element
      df2.loc[row,"sentence_as_string"] = empty_string
    return df2

def trainTokenizer(sentences,oov_token="<OOV>",filters=""):
  from keras.preprocessing.text import Tokenizer
  if(isinstance(sentences,list)):
    text_list = list()
    for sentence in sentences:
      text = sentence.metadata['text']
      text_list.append(text)
  else:
    text_list = sentences
  tokenizer = Tokenizer(oov_token=oov_token,filters=filters) 
  tokenizer.fit_on_texts(text_list)
  word_index = tokenizer.word_index
  print("Tokenizer trained! With ",len(word_index)," words")
  return (tokenizer,word_index)
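
To make the role of `word_index` and `oov_token` concrete, the sketch below emulates the idea in plain Python: words are numbered by decreasing frequency, the out-of-vocabulary token takes index 1, and unseen words are encoded with that index. It is a simplified illustration, not the Keras `Tokenizer` itself, and `build_word_index` and `texts_to_ids` are hypothetical helpers.

```python
from collections import Counter

def build_word_index(texts, oov_token="<OOV>"):
    """Number the words by decreasing frequency; the OOV token takes index 1."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    word_index = {oov_token: 1}
    for word, _ in counts.most_common():   # most frequent words get the lowest ids
        word_index[word] = len(word_index) + 1
    return word_index

def texts_to_ids(texts, word_index, oov_token="<OOV>"):
    """Encode each text as a list of ids; unseen words map to the OOV id."""
    oov_id = word_index[oov_token]
    return [[word_index.get(word, oov_id) for word in text.lower().split()]
            for text in texts]

index = build_word_index(["the cat sat", "the dog"])
print(index)                              # → {'<OOV>': 1, 'the': 2, 'cat': 3, 'sat': 4, 'dog': 5}
print(texts_to_ids(["the bird"], index))  # → [[2, 1]]
```

Index 0 is left unused on purpose: it is reserved for padding, which is why the models use `mask_zero=True` in their embedding layers.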

4. Finally, we apply this trained tokenizer to the `sentences` column of the dataset. After converting the words into numbers, we pad the sequences and transform our target into a categorical variable. Note that the padding step uses a maximum number of words, `max_len`, which in this case is `128`. Thus, for instance, we have the following:

<img src="https://raw.githubusercontent.com/maxhormazabal-test/nlu-p1/main/7.png" width="700">

In [None]:
max_len = 128
(x_train_en,y_train_en) = applyTokenizer(en_train_df,en_tokenizer,max_len) #4

In [None]:
def applyTokenizer(df,tokenizer,maxlen):
    df = df.copy()
    # Tokenize all sentences in one call; each cell holds the list of word ids of one sentence
    sequences = tokenizer.texts_to_sequences(df["sentence_as_string"].tolist())
    df['text_tokenized'] = pd.Series(sequences, index=df.index)
    df = df[['text_tokenized','UPOS']]
    pad_y = tf.keras.utils.pad_sequences(df['UPOS'],maxlen=maxlen)
    y_set = tf.keras.utils.to_categorical(pad_y, dtype='float32')
    x_set = tf.keras.utils.pad_sequences(df['text_tokenized'],maxlen=maxlen)
    return (x_set,y_set)
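
The two transformations at the end of `applyTokenizer` can be pictured with a small pure-Python sketch. It mirrors the default behaviour of `pad_sequences` (zeros added, and extra items removed, at the beginning of the sequence) and the one-hot encoding done by `to_categorical`. `pad_pre` and `one_hot` are hypothetical helpers written only for illustration.

```python
def pad_pre(sequence, maxlen, value=0):
    """Left-pad or left-truncate a list to exactly maxlen items (maxlen > 0)."""
    kept = list(sequence)[-maxlen:]
    return [value] * (maxlen - len(kept)) + kept

def one_hot(ids, num_classes):
    """Turn each class id into a vector with a single 1.0."""
    return [[1.0 if i == class_id else 0.0 for i in range(num_classes)]
            for class_id in ids]

print(pad_pre([4, 7, 2], 5))           # → [0, 0, 4, 7, 2]
print(pad_pre([1, 2, 3, 4, 5, 6], 4))  # → [3, 4, 5, 6]
print(one_hot(pad_pre([2, 1], 3), 3))  # → [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
```

Note that the padded positions of the target are encoded as class 0, so the number of output classes is one more than the number of UPOS tags.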

In summary, we did the following process:

In [None]:
max_len = 128

#English
en_train_df = conlluToDataset(en_train,en_upo2number) #1 
en_train_df = addTextColumn(en_train_df) #2 
(en_tokenizer,en_word_index) = trainTokenizer(en_train_df["sentence_as_string"]) #3
(x_train_en,y_train_en) = applyTokenizer(en_train_df,en_tokenizer,max_len) #4

### Preprocessing without training the tokenizer

The other sets need to go through the same processing, **but** we should not train the Tokenizer every time (only on the training set). That is why we split the Tokenizer process into two functions, as seen before (`trainTokenizer` and `applyTokenizer`): now we can process the test and validation sets while skipping the `trainTokenizer` step. To keep it simple, we have a function that calls the others:

In [None]:
def textPreprocessing(sentences,upos2number,tokenizer,maxlen=128):
  df = conlluToDataset(sentences,upos2number)
  df = addTextColumn(df)
  (x_set,y_set) = applyTokenizer(df,tokenizer,maxlen)
  return (x_set,y_set)

In [None]:
(x_test_es,y_test_es) = textPreprocessing(es_test,es_upo2number,es_tokenizer,max_len)
(x_val_es,y_val_es)  = textPreprocessing(es_val,es_upo2number,es_tokenizer,max_len)

## Preprocessing for Char Based Models

For character-based models, the preprocessing is very similar: it takes the words that we separated in the previous steps and converts each one into a vector of numbers, but this time the numbers represent letters.

Below are the `trainCharTokenizer` and `applyCharTokenizer` functions, which are in charge of training the tokenizer and applying it to the texts in order to obtain the input for our second type of neural network architecture.

In [None]:
def trainCharTokenizer(sentences):
  from keras.preprocessing.text import Tokenizer
  text_list = sentences
  # The first positional argument of Tokenizer is num_words, so the texts are not passed here
  tokenizer = Tokenizer(filters="",char_level=True,lower=False)
  tokenizer.fit_on_texts(text_list)
  char_word_index = tokenizer.word_index
  print("Tokenizer trained! With ",len(char_word_index)," chars")
  return (tokenizer,char_word_index)

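def applyCharTokenizer(df,char_word_index,max_len_words,max_len_chars):
  # Assumption: pad and truncate on the left, like the pad_sequences defaults used
  # for the word ids, so that both inputs stay aligned. 0 is the padding id.
  # Result shape: (number of sentences, max_len_words, max_len_chars)
  x_char = np.zeros((len(df),max_len_words,max_len_chars),dtype='int32')
  for row,sentence in enumerate(df['sentences']):
    words = list(sentence)[-max_len_words:]  # keep the last words of long sentences
    offset = max_len_words - len(words)      # right-align the words of short sentences
    for position,word in enumerate(words):
      # Characters missing from the index are encoded as 0
      char_ids = [char_word_index.get(char,0) for char in word][-max_len_chars:]
      if char_ids:
        x_char[row,offset+position,max_len_chars-len(char_ids):] = char_ids
  return x_char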

In short, these new functions receive the same sentences but can now tell individual letters apart. They are added to the preprocessing so that everything runs together on the data sets of the three languages. Note that `textPreprocessing` is called here with two extra arguments (the character index and `max_len_chars`) and returns a third array with the character input. The full preprocessing is then:

In [None]:
max_len = 128
max_len_chars = 25

#English
en_train_df = conlluToDataset(en_train,en_upo2number)
en_train_df = addTextColumn(en_train_df)
(en_tokenizer,en_word_index) = trainTokenizer(en_train_df["sentence_as_string"])
(en_char_tokenizer,en_char_word_index) = trainCharTokenizer(en_train_df["sentence_as_string"])
(x_train_en,y_train_en) = applyTokenizer(en_train_df,en_tokenizer,max_len)
x_char_train_en = applyCharTokenizer(en_train_df,en_char_word_index,max_len,max_len_chars)

(x_test_en,y_test_en,x_char_test_en) = textPreprocessing(en_test,en_upo2number,en_tokenizer,en_char_word_index,max_len,max_len_chars)
(x_val_en,y_val_en,x_char_val_en) = textPreprocessing(en_val,en_upo2number,en_tokenizer,en_char_word_index,max_len,max_len_chars)

#Spanish
es_train_df = conlluToDataset(es_train,es_upo2number)
es_train_df = addTextColumn(es_train_df)
(es_tokenizer,es_word_index) = trainTokenizer(es_train_df["sentence_as_string"])
(es_char_tokenizer,es_char_word_index) = trainCharTokenizer(es_train_df["sentence_as_string"])
(x_train_es,y_train_es) = applyTokenizer(es_train_df,es_tokenizer,max_len)
x_char_train_es = applyCharTokenizer(es_train_df,es_char_word_index,max_len,max_len_chars)

(x_test_es,y_test_es,x_char_test_es) = textPreprocessing(es_test,es_upo2number,es_tokenizer,es_char_word_index,max_len,max_len_chars)
(x_val_es,y_val_es,x_char_val_es) = textPreprocessing(es_val,es_upo2number,es_tokenizer,es_char_word_index,max_len,max_len_chars)

#French
fr_train_df = conlluToDataset(fr_train,fr_upo2number)
fr_train_df = addTextColumn(fr_train_df)
(fr_tokenizer,fr_word_index) = trainTokenizer(fr_train_df["sentence_as_string"])
(fr_char_tokenizer,fr_char_word_index) = trainCharTokenizer(fr_train_df["sentence_as_string"])
(x_train_fr,y_train_fr) = applyTokenizer(fr_train_df,fr_tokenizer,max_len)
x_char_train_fr = applyCharTokenizer(fr_train_df,fr_char_word_index,max_len,max_len_chars)

(x_test_fr,y_test_fr,x_char_test_fr) = textPreprocessing(fr_test,fr_upo2number,fr_tokenizer,fr_char_word_index,max_len,max_len_chars)
(x_val_fr,y_val_fr,x_char_val_fr) = textPreprocessing(fr_val,fr_upo2number,fr_tokenizer,fr_char_word_index,max_len,max_len_chars)

This is the end of the preprocessing steps; you can repeat them with other languages.

## Building models

### Word Based Models

For the models we have two types of functions. The first one builds the architectures that expect only words as input. The `builAnnModel` function builds this type of ANN:

In [None]:
def builAnnModel(word_index,n_upos,maxlen_sentences=128,embedd_output=50,
                 lstm_input=50,loss='categorical_crossentropy',
                 optimizer='adam',metrics=['accuracy']):
  tokenizer_size = len(word_index)+1
  n_upos = n_upos + 1
  inputs = tf.keras.Input(shape=(maxlen_sentences,),name="Input_Layer")
  embedd = tf.keras.layers.Embedding(tokenizer_size,embedd_output,mask_zero=True,name="Embedding_Layer")(inputs)
  lstm = tf.keras.layers.LSTM(lstm_input,return_sequences=True,name="LSTM_Layer")(embedd)
  outputs = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(n_upos,activation='softmax'),name="TimeD_Layer")(lstm)
  model = tf.keras.Model(inputs=inputs,outputs=outputs)

  # `metrics` is already a list, and summary() prints by itself
  model.compile(loss = loss,optimizer = optimizer,metrics=metrics)
  model.summary()
  return model

You can run it like this:

In [None]:
en_model = builAnnModel(en_word_index,en_nupos,maxlen_sentences=128,embedd_output=50,lstm_input=50)
en_model.fit(x_train_en, y_train_en, epochs=20, verbose=1, batch_size=200,validation_data=(x_val_en,y_val_en))
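
The parameter counts reported by `model.summary()` for this architecture can be checked by hand. The sketch below applies the standard formulas for an embedding, an LSTM and a dense layer. `tagger_param_counts` is a hypothetical helper, and the vocabulary size and number of classes in the example call are made-up values chosen only to show the arithmetic.

```python
def tagger_param_counts(tokenizer_size, embedd_output, lstm_units, n_classes):
    """Trainable parameters per layer of an Embedding -> LSTM -> Dense tagger."""
    embedding = tokenizer_size * embedd_output
    # 4 gates, each with input weights, recurrent weights and a bias
    lstm = 4 * (lstm_units * (embedd_output + lstm_units) + lstm_units)
    # one weight per LSTM unit plus a bias, for every output class
    dense = (lstm_units + 1) * n_classes
    return {"embedding": embedding, "lstm": lstm, "dense": dense,
            "total": embedding + lstm + dense}

# Illustrative sizes: 1000 known words (+1 for padding) and 17 tags (+1 for padding)
print(tagger_param_counts(tokenizer_size=1001, embedd_output=50, lstm_units=50, n_classes=18))
# → {'embedding': 50050, 'lstm': 20200, 'dense': 918, 'total': 71168}
```

With a realistic vocabulary the embedding layer dominates the total, so `embedd_output` is usually the setting with the largest effect on model size.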

### Char Based Models

The second type of function builds the architectures that expect words and chars as input. The `builCharAnnModel` function builds this type of ANN:

In [None]:
def builCharAnnModel(word_index,char_word_index,n_upos,maxlen_sentences=128,maxlen_chars=25,embedd_output=50,
                 lstm_input=50,loss='categorical_crossentropy',
                 optimizer='adam',metrics=['accuracy']):

    tokenizer_size = len(word_index)+1
    char_tokenier_size = len(char_word_index)+1
    n_upos = n_upos + 1

    inputs_words = tf.keras.Input(shape=(maxlen_sentences,),name="Input_Words_Layer") # shape (rows,128)
    inputs_chars = tf.keras.Input(shape=(maxlen_sentences,maxlen_chars,),name="Input_Chars_Layer") #shape (rows,128,25)
    embedd_words = tf.keras.layers.Embedding(tokenizer_size,embedd_output,mask_zero=True,name="Embedding_Word_Layer")(inputs_words)
    embedd_chars = tf.keras.layers.Embedding(char_tokenier_size,embedd_output,mask_zero=True,name="Embedding_char_Layer")(inputs_chars)
    lstm_chars = tf.keras.layers.TimeDistributed(tf.keras.layers.LSTM(embedd_output,return_sequences=False,name="LSTM_char_Layer"))(embedd_chars) #distributing to each char
    concatted = tf.keras.layers.Concatenate()([embedd_words, lstm_chars])
    lstm = tf.keras.layers.LSTM(lstm_input,return_sequences=True,name="LSTM_Layer")(concatted)
    outputs = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(n_upos,activation='softmax'),name="TimeD_Layer")(lstm)
    model = tf.keras.Model(inputs=[inputs_words,inputs_chars],outputs=outputs)

    # Use the function arguments instead of hard-coded values
    model.compile(loss=loss,optimizer=optimizer,metrics=metrics)
    model.summary()
    return model

You can run it like this:

In [None]:
batch_size = 300
fr_char_model = builCharAnnModel(fr_word_index,fr_char_word_index,fr_nupos,maxlen_sentences=128,maxlen_chars=25,embedd_output=50,lstm_input=batch_size)
fr_char_model.fit([x_train_fr,x_char_train_fr], y_train_fr, epochs=12, verbose=1, batch_size=batch_size,validation_data=([x_val_fr,x_char_val_fr],y_val_fr))
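
Because this model mixes a 2-D word input with a 3-D character input, it helps to follow the tensor shapes layer by layer. The sketch below only computes the shapes implied by the layer definitions in `builCharAnnModel`; it does not build a model. `char_model_shapes` is a hypothetical helper, and `n_classes` stands for `n_upos + 1`, with 18 used as an illustrative default.

```python
def char_model_shapes(batch, maxlen_sentences=128, maxlen_chars=25,
                      embedd_output=50, lstm_input=50, n_classes=18):
    """Shape of each intermediate tensor, in the order the layers are applied."""
    return {
        "words_input": (batch, maxlen_sentences),
        "chars_input": (batch, maxlen_sentences, maxlen_chars),
        "words_embedded": (batch, maxlen_sentences, embedd_output),
        "chars_embedded": (batch, maxlen_sentences, maxlen_chars, embedd_output),
        # the inner LSTM reads the characters of each word and returns one vector per word
        "chars_per_word": (batch, maxlen_sentences, embedd_output),
        # word and character vectors are joined on the last axis
        "concatenated": (batch, maxlen_sentences, 2 * embedd_output),
        "lstm": (batch, maxlen_sentences, lstm_input),
        "output": (batch, maxlen_sentences, n_classes),
    }

for name, shape in char_model_shapes(batch=300).items():
    print(f"{name:15s} {shape}")
```

The concatenation only works because both branches end with the same first two dimensions, which is why the character input must be padded to the same `maxlen_sentences` as the word input.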