# A Quantum-Enhanced LSTM Layer

Kuantum Makine Öğrenmesinde şimdiye kadar yeterince araştırılmamış bir alan, bilgisayarlara yazılı metni okuma, yazma ve bir dereceye kadar anlama yeteneği veren Yapay Zekâ'nın alt alanı olan Doğal Dil İşleme'dir (NLP).

Belgeler genellikle kelime dizileri olarak sunulduğundan, tarihsel olarak bu tür verileri işlemek için en başarılı tekniklerden biri Tekrarlayan Sinir Ağı mimarisi ve özellikle Uzun Kısa Süreli Bellek (LSTM) adı verilen bir varyant olmuştur. LSTM'ler, makinelerin Transformatör ağlarının ortaya çıkışına kadar en son teknoloji doğrulukla çeviriler, sınıflandırmalar ve niyet tespiti yapmalarına olanak sağladı. Yine de, iyi kuantum hesaplamanın alana ne getirebileceğini görmek için LSTM'leri araştırmak en azından eğitim açısından ilginçtir. Daha ayrıntılı bir tartışma için lütfen Chen, Yoo ve Fang'ın "Kuantum Uzun Kısa Süreli Bellek" (arXiv:2009.01783) ve J. Bausch'un "Tekrarlayan Kuantum Sinir Ağları" (arXiv:2006.14619) makalelerine bakın.

In [1]:
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from quantum_lstm import QLSTM #./quantum_lstm.py



In [2]:
import pandas as pd
df=pd.read_csv('turkish_ud_741.5K.csv')
# Universal Dependencies Veri Seti Türkçe

In [3]:
df.head()

Unnamed: 0,Sentence,UPOS Sequence
0,1936 yılındayız yılında yız .,NUM _ NOUN AUX PUNCT
1,Adeta kendimden geçmiş bir haldeyim halde yim .,ADV PRON VERB DET _ NOUN AUX PUNCT
2,O nasıl derse desin uğraştığı sanatın kendisin...,PRON ADV VERB VERB VERB NOUN PRON NOUN VERB VE...
3,"Ahmed Rasim , Büyükada'ya gidip birkaç gün kal...",PROPN PROPN PUNCT PROPN VERB DET NOUN VERB _ V...
4,Rüzgâr yine güçlü esiyordu esiyor du .,NOUN ADV ADV _ VERB AUX PUNCT


In [4]:
sequences = df['UPOS Sequence'].str.split(expand=True).stack().unique()

print(sequences)

['NUM' '_' 'NOUN' 'AUX' 'PUNCT' 'ADV' 'PRON' 'VERB' 'DET' 'PROPN' 'ADJ'
 'CCONJ' 'ADP' 'PART' 'SCONJ' 'INTJ' 'X' 'SYM']


In [5]:
tag_to_ix = {'_': 0, 'ADJ': 1, 'ADP': 2, 'ADV': 3, 'AUX': 4, 'CCONJ': 5, 'DET': 6, 'INTJ': 7, 'NOUN': 8, 'NUM': 9, 'PART': 10, 'PRON': 11, 'PROPN': 12, 'PUNCT': 13, 'SCONJ': 14, 'VERB': 15, 'X': 16, 'SYM':17} # Assign each tag with a unique index
ix_to_tag = {i:k for k,i in tag_to_ix.items()}

Aşağıdaki fonksiyon cümleyi kelimelere ayırır ve etiketi her kelimeyle eşleştirir.

In [6]:
def prepare_sequence(seq, to_ix):
    idxs = [to_ix[w] for w in seq]
    return torch.tensor(idxs, dtype=torch.long)

Aşağıdaki döngü eğitim veri setini oluşturur.

In [7]:
training_data = []
word_to_ix = {}
tag_to_ix = {}

for _, row in df.iterrows():
    words = row['Sentence'].split()
    tags = row['UPOS Sequence'].split()
    training_data.append((words, tags))
    for word in words:
        if word not in word_to_ix:
            word_to_ix[word] = len(word_to_ix)
    for tag in tags:
        if tag not in tag_to_ix:
            tag_to_ix[tag] = len(tag_to_ix)

print(f"Training Data: {training_data}")
print(f"Word to Index: {word_to_ix}")
print(f"Tag to Index: {tag_to_ix}")

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`.

Current values:
NotebookApp.iopub_data_rate_limit=1000000.0 (bytes/sec)
NotebookApp.rate_limit_window=3.0 (secs)



Fikir, her kelime için bir tane olmak üzere gizli vektör dizisini [h_0, h_1, h_2, h_3, h_4] çıktısı verecek olan iki diziyi LSTM'den geçirmektir. Her kelimenin bir belirleyici, isim veya fiil olma olasılığını hesaplamak için LSTM'nin çıktılarına yoğun bir katman "head" eklenir.

In [8]:
import torch
import torch.nn as nn
import torch.nn.functional as F

class LSTMTagger(nn.Module):

    def __init__(self, embedding_dim, hidden_dim, vocab_size, tagset_size, pretrained_embeddings=None,
                 n_qubits=0, num_layers=4, n_qlayers=4, freeze_embeddings=False):
        super(LSTMTagger, self).__init__()
        self.hidden_dim = hidden_dim


        if pretrained_embeddings is not None:

            self.word_embeddings = nn.Embedding.from_pretrained(pretrained_embeddings, freeze=freeze_embeddings)
        else:

            self.word_embeddings = nn.Embedding(vocab_size, embedding_dim)


        if n_qubits > 0:
            print("Tagger will use Quantum LSTM")
            self.lstm = QLSTM(embedding_dim, hidden_dim, n_qubits=n_qubits, n_qlayers=n_qlayers)
        else:
            print("Tagger will use Classical LSTM")
            self.lstm = nn.LSTM(embedding_dim, hidden_dim, num_layers=num_layers, batch_first=True)


        self.hidden2tag = nn.Linear(hidden_dim, tagset_size)

    def forward(self, sentence):
        embeds = self.word_embeddings(sentence)
        lstm_out, _ = self.lstm(embeds.view(len(sentence), 1, -1))
        tag_logits = self.hidden2tag(lstm_out.view(len(sentence), -1))
        tag_scores = F.log_softmax(tag_logits, dim=1)
        return tag_scores


In [9]:
embedding_dim = 8
hidden_dim = 6
n_epochs = 300

In [10]:
model_classical = LSTMTagger(embedding_dim,
                        hidden_dim,
                        vocab_size=len(word_to_ix),
                        tagset_size=len(tag_to_ix),
                        n_qubits=0)

Tagger will use Classical LSTM


## Training

Following the example from the PyTorch website, we train the two networks (classical and quantum LSTM) for 300 epochs.

In [11]:
def train(model, n_epochs):
    loss_function = nn.NLLLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.1)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.8)

    history = {
        'loss': [],
        'acc': []
    }
    for epoch in range(n_epochs):
        losses = []
        preds = []
        targets = []
        for sentence, tags in training_data:
            model.zero_grad()
            sentence_in = prepare_sequence(sentence, word_to_ix)
            labels = prepare_sequence(tags, tag_to_ix)

            tag_scores = model(sentence_in)
            loss = loss_function(tag_scores, labels)
            loss.backward()
            optimizer.step()
            losses.append(float(loss))

            probs = torch.softmax(tag_scores, dim=-1)
            preds.append(probs.argmax(dim=-1))
            targets.append(labels)

        avg_loss = np.mean(losses)
        history['loss'].append(avg_loss)

        preds = torch.cat(preds)
        targets = torch.cat(targets)
        corrects = (preds == targets)
        accuracy = corrects.sum().float() / float(targets.size(0))
        history['acc'].append(accuracy)


        scheduler.step()

        print(f"Epoch {epoch+1} / {n_epochs}: Loss = {avg_loss:.3f}, Acc = {accuracy:.2f}, LR = {scheduler.get_last_lr()[0]:.6f}")

    return history

In [25]:
history_classical = train(model_classical, n_epochs)

Epoch 1 / 300: Loss = 1.560, Acc = 0.48, LR = 0.100000
Epoch 2 / 300: Loss = 1.284, Acc = 0.56, LR = 0.100000
Epoch 3 / 300: Loss = 1.113, Acc = 0.61, LR = 0.100000
Epoch 4 / 300: Loss = 1.013, Acc = 0.64, LR = 0.100000
Epoch 5 / 300: Loss = 0.984, Acc = 0.65, LR = 0.100000
Epoch 6 / 300: Loss = 0.905, Acc = 0.67, LR = 0.100000
Epoch 7 / 300: Loss = 0.879, Acc = 0.68, LR = 0.080000
Epoch 8 / 300: Loss = 0.860, Acc = 0.69, LR = 0.080000
Epoch 9 / 300: Loss = 0.876, Acc = 0.69, LR = 0.080000
Epoch 10 / 300: Loss = 0.848, Acc = 0.70, LR = 0.080000
Epoch 11 / 300: Loss = 0.841, Acc = 0.71, LR = 0.080000
Epoch 12 / 300: Loss = 0.854, Acc = 0.71, LR = 0.080000
Epoch 13 / 300: Loss = 0.849, Acc = 0.71, LR = 0.080000
Epoch 14 / 300: Loss = 0.954, Acc = 0.68, LR = 0.064000
Epoch 15 / 300: Loss = 0.997, Acc = 0.66, LR = 0.064000
Epoch 16 / 300: Loss = 0.830, Acc = 0.72, LR = 0.064000
Epoch 17 / 300: Loss = 0.791, Acc = 0.74, LR = 0.064000
Epoch 18 / 300: Loss = 0.794, Acc = 0.73, LR = 0.064000
E

Epoch 147 / 300: Loss = 0.559, Acc = 0.83, LR = 0.000922
Epoch 148 / 300: Loss = 0.561, Acc = 0.83, LR = 0.000922
Epoch 149 / 300: Loss = 0.561, Acc = 0.83, LR = 0.000922
Epoch 150 / 300: Loss = 0.560, Acc = 0.83, LR = 0.000922
Epoch 151 / 300: Loss = 0.560, Acc = 0.83, LR = 0.000922
Epoch 152 / 300: Loss = 0.559, Acc = 0.83, LR = 0.000922
Epoch 153 / 300: Loss = 0.559, Acc = 0.83, LR = 0.000922
Epoch 154 / 300: Loss = 0.559, Acc = 0.83, LR = 0.000738
Epoch 155 / 300: Loss = 0.561, Acc = 0.83, LR = 0.000738
Epoch 156 / 300: Loss = 0.561, Acc = 0.83, LR = 0.000738
Epoch 157 / 300: Loss = 0.560, Acc = 0.83, LR = 0.000738
Epoch 158 / 300: Loss = 0.560, Acc = 0.83, LR = 0.000738
Epoch 159 / 300: Loss = 0.559, Acc = 0.83, LR = 0.000738
Epoch 160 / 300: Loss = 0.559, Acc = 0.83, LR = 0.000738
Epoch 161 / 300: Loss = 0.559, Acc = 0.83, LR = 0.000590
Epoch 162 / 300: Loss = 0.561, Acc = 0.83, LR = 0.000590
Epoch 163 / 300: Loss = 0.561, Acc = 0.83, LR = 0.000590
Epoch 164 / 300: Loss = 0.561, 

Epoch 291 / 300: Loss = 0.596, Acc = 0.82, LR = 0.000011
Epoch 292 / 300: Loss = 0.596, Acc = 0.82, LR = 0.000011
Epoch 293 / 300: Loss = 0.596, Acc = 0.82, LR = 0.000011
Epoch 294 / 300: Loss = 0.596, Acc = 0.82, LR = 0.000009
Epoch 295 / 300: Loss = 0.598, Acc = 0.82, LR = 0.000009
Epoch 296 / 300: Loss = 0.598, Acc = 0.82, LR = 0.000009
Epoch 297 / 300: Loss = 0.598, Acc = 0.82, LR = 0.000009
Epoch 298 / 300: Loss = 0.598, Acc = 0.82, LR = 0.000009
Epoch 299 / 300: Loss = 0.598, Acc = 0.82, LR = 0.000009
Epoch 300 / 300: Loss = 0.598, Acc = 0.82, LR = 0.000009


In [12]:
torch.save(model_classical.state_dict(), "lstm_tagger_turkish.pth")

In [13]:
def print_result(model):
    with torch.no_grad():
        input_sentence = training_data[0][0]
        labels = training_data[0][1]
        inputs = prepare_sequence(input_sentence, word_to_ix)
        tag_scores = model(inputs)

        tag_ids = torch.argmax(tag_scores, dim=1).numpy()
        tag_labels = [ix_to_tag[k] for k in tag_ids]
        print(f"Sentence:  {input_sentence}")
        print(f"Labels:    {labels}")
        print(f"Predicted: {tag_labels}")

In [14]:
print_result(model_classical)

Sentence:  ['1936', 'yılındayız', 'yılında', 'yız', '.']
Labels:    ['NUM', '_', 'NOUN', 'AUX', 'PUNCT']
Predicted: ['PROPN', 'PROPN', 'PROPN', 'PROPN', 'PROPN']


In [15]:
df=df.sample(2000)

In [16]:
training_data = []
word_to_ix = {}
tag_to_ix = {}

for _, row in df.iterrows():
    words = row['Sentence'].split()
    tags = row['UPOS Sequence'].split()
    training_data.append((words, tags))
    for word in words:
        if word not in word_to_ix:
            word_to_ix[word] = len(word_to_ix)
    for tag in tags:
        if tag not in tag_to_ix:
            tag_to_ix[tag] = len(tag_to_ix)

print(f"Training Data: {training_data}")
print(f"Word to Index: {word_to_ix}")
print(f"Tag to Index: {tag_to_ix}")

Training Data: [(["Pittsburgh'tan", 'San', "Francisco'ya", 'olan', 'uçuşları', 'listeleyin'], ['PROPN', 'PROPN', 'PROPN', 'ADJ', 'NOUN', 'VERB']), (['Iskarta', 'hisse', 'piyasası', 'ile', 'bağlantılı', 'şirketlerin', 'hisseleri', 'de', 'Cuma', 'günü', 'alt', 'üst', 'oldu', '.'], ['NOUN', 'NOUN', 'NOUN', 'CCONJ', 'ADJ', 'NOUN', 'NOUN', 'CCONJ', 'PROPN', 'NOUN', 'ADJ', 'ADJ', 'VERB', 'PUNCT']), (['Bazen', 'ondan', 'nefret', 'ediyorum', ',', 'dedim', '.'], ['ADV', 'PRON', 'NOUN', 'VERB', 'PUNCT', 'VERB', 'PUNCT']), (['Yok', ',', 'fazlası', 'korkutuyor', '.'], ['ADV', 'PUNCT', 'ADJ', 'VERB', 'PUNCT']), (["Memphis'ten", "Tacoma'ya", 'hangi', 'uçuşların', 'gittiğini', 'söyleyip', 'Los', "Angeles'te", 'duraklama', 'yapabilir', 'misin'], ['PROPN', 'PROPN', 'ADJ', 'NOUN', 'NOUN', 'ADV', 'PROPN', 'PROPN', 'NOUN', 'VERB', 'AUX']), (['O', 'zaman', ',', 'bugünkü', 'hâlinizi', 'rüyada', 'görmemek', 'için', 'uykudan', 'korkmaya', 'başlarsınız', '.'], ['DET', 'NOUN', 'PUNCT', 'ADJ', 'NOUN', 'NOUN', 'N

In [17]:
n_qubits = 4

model_quantum = LSTMTagger(embedding_dim,
                        hidden_dim,
                        vocab_size=len(word_to_ix),
                        tagset_size=len(tag_to_ix),
                        n_qubits=n_qubits,
                        n_qlayers=2)

Tagger will use Quantum LSTM
weight_shapes = (n_qlayers, n_qubits) = (2, 4)


In [18]:
history_quantum = train(model_quantum, n_epochs)

Epoch 1 / 300: Loss = 1.914, Acc = 0.41, LR = 0.100000
Epoch 2 / 300: Loss = 1.678, Acc = 0.46, LR = 0.100000
Epoch 3 / 300: Loss = 1.606, Acc = 0.47, LR = 0.100000
Epoch 4 / 300: Loss = 1.669, Acc = 0.46, LR = 0.100000
Epoch 5 / 300: Loss = 1.624, Acc = 0.46, LR = 0.100000
Epoch 6 / 300: Loss = 1.612, Acc = 0.48, LR = 0.100000
Epoch 7 / 300: Loss = 1.697, Acc = 0.45, LR = 0.080000
Epoch 8 / 300: Loss = 1.651, Acc = 0.47, LR = 0.080000
Epoch 9 / 300: Loss = 1.637, Acc = 0.47, LR = 0.080000
Epoch 10 / 300: Loss = 1.686, Acc = 0.45, LR = 0.080000
Epoch 11 / 300: Loss = 1.932, Acc = 0.37, LR = 0.080000
Epoch 12 / 300: Loss = 2.014, Acc = 0.34, LR = 0.080000
Epoch 13 / 300: Loss = 2.010, Acc = 0.34, LR = 0.080000
Epoch 14 / 300: Loss = 2.053, Acc = 0.33, LR = 0.064000
Epoch 15 / 300: Loss = 2.045, Acc = 0.33, LR = 0.064000
Epoch 16 / 300: Loss = 2.038, Acc = 0.33, LR = 0.064000
Epoch 17 / 300: Loss = 2.035, Acc = 0.33, LR = 0.064000
Epoch 18 / 300: Loss = 2.042, Acc = 0.32, LR = 0.064000
E

Epoch 147 / 300: Loss = 1.091, Acc = 0.61, LR = 0.000922
Epoch 148 / 300: Loss = 1.015, Acc = 0.65, LR = 0.000922
Epoch 149 / 300: Loss = 0.991, Acc = 0.66, LR = 0.000922
Epoch 150 / 300: Loss = 0.973, Acc = 0.66, LR = 0.000922
Epoch 151 / 300: Loss = 0.954, Acc = 0.66, LR = 0.000922
Epoch 152 / 300: Loss = 0.943, Acc = 0.67, LR = 0.000922
Epoch 153 / 300: Loss = 0.989, Acc = 0.65, LR = 0.000922
Epoch 154 / 300: Loss = 0.939, Acc = 0.67, LR = 0.000738
Epoch 155 / 300: Loss = 0.919, Acc = 0.68, LR = 0.000738
Epoch 156 / 300: Loss = 0.906, Acc = 0.68, LR = 0.000738
Epoch 157 / 300: Loss = 0.899, Acc = 0.68, LR = 0.000738
Epoch 158 / 300: Loss = 0.892, Acc = 0.69, LR = 0.000738
Epoch 159 / 300: Loss = 0.884, Acc = 0.69, LR = 0.000738
Epoch 160 / 300: Loss = 0.878, Acc = 0.69, LR = 0.000738
Epoch 161 / 300: Loss = 0.870, Acc = 0.69, LR = 0.000590
Epoch 162 / 300: Loss = 0.863, Acc = 0.69, LR = 0.000590
Epoch 163 / 300: Loss = 0.855, Acc = 0.70, LR = 0.000590
Epoch 164 / 300: Loss = 0.849, 

Epoch 291 / 300: Loss = 0.729, Acc = 0.73, LR = 0.000011
Epoch 292 / 300: Loss = 0.729, Acc = 0.73, LR = 0.000011
Epoch 293 / 300: Loss = 0.729, Acc = 0.73, LR = 0.000011
Epoch 294 / 300: Loss = 0.729, Acc = 0.73, LR = 0.000009
Epoch 295 / 300: Loss = 0.729, Acc = 0.73, LR = 0.000009
Epoch 296 / 300: Loss = 0.729, Acc = 0.73, LR = 0.000009
Epoch 297 / 300: Loss = 0.729, Acc = 0.73, LR = 0.000009
Epoch 298 / 300: Loss = 0.729, Acc = 0.73, LR = 0.000009
Epoch 299 / 300: Loss = 0.729, Acc = 0.73, LR = 0.000009
Epoch 300 / 300: Loss = 0.729, Acc = 0.73, LR = 0.000009


In [19]:
torch.save(model_quantum.state_dict(), "qlstm_tagger_turkish.pth")

In [20]:
print_result(model_quantum)

Sentence:  ["Pittsburgh'tan", 'San', "Francisco'ya", 'olan', 'uçuşları', 'listeleyin']
Labels:    ['PROPN', 'PROPN', 'PROPN', 'ADJ', 'NOUN', 'VERB']
Predicted: ['ADV', 'ADV', 'ADV', 'ADJ', 'ADP', 'ADV']
