<a href="https://colab.research.google.com/github/KU-Gen-AI-2567/GPT/blob/main/GPT_demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# GPT - 01418496
**Group members**

Mr. ศิวกร ภาสว่าง 6410451423

Ms. เเพรวรุ้ง พุดชะวา 6410451253

Ms. มารีน่า มิทซุย 6410450222

Section 200

Dataset: Disneyland Reviews

Download link: https://www.kaggle.com/datasets/arushchillar/disneyland-reviews

In [1]:
import tensorflow as tf
import numpy as np
import os
import kagglehub
import shutil
import pandas as pd

## Selecting the execution device (GPU or CPU)

In [2]:
gpus = tf.config.list_physical_devices("GPU")
if len(gpus) > 0:
    tf.config.experimental.set_memory_growth(gpus[0], True)
    print("Execute on GPU")
else:
    print("Execute on CPU")

Execute on GPU


## Download Dataset

In [3]:
# Download the latest version of the dataset folder
if "dataset" not in os.listdir("."):
    path = kagglehub.dataset_download("arushchillar/disneyland-reviews")
    print("Path to dataset files:", path)
    shutil.move(path, "./dataset")
    print("Download Dataset Complete")
else:
    print("Dataset already downloaded")

Downloading from https://www.kaggle.com/api/v1/datasets/download/arushchillar/disneyland-reviews?dataset_version_number=1...


100%|██████████| 11.1M/11.1M [00:01<00:00, 7.20MB/s]

Extracting files...





Path to dataset files: /root/.cache/kagglehub/datasets/arushchillar/disneyland-reviews/versions/1
Download Dataset Complete


## Preprocessing

In [4]:
file_path = "./dataset/DisneylandReviews.csv"
df = pd.read_csv(file_path, encoding="ISO-8859-1")

print(df.head())

   Review_ID  Rating Year_Month     Reviewer_Location  \
0  670772142       4     2019-4             Australia   
1  670682799       4     2019-5           Philippines   
2  670623270       4     2019-4  United Arab Emirates   
3  670607911       4     2019-4             Australia   
4  670607296       4     2019-4        United Kingdom   

                                         Review_Text               Branch  
0  If you've ever been to Disneyland anywhere you...  Disneyland_HongKong  
1  Its been a while since d last time we visit HK...  Disneyland_HongKong  
2  Thanks God it wasn   t too hot or too humid wh...  Disneyland_HongKong  
3  HK Disneyland is a great compact park. Unfortu...  Disneyland_HongKong  
4  the location is not in the city, took around 1...  Disneyland_HongKong  


In [5]:
df_selected = df[["Reviewer_Location", "Branch", "Rating", "Review_Text"]]
df_selected = df_selected.rename(columns=str.lower)
print(df_selected.head())

      reviewer_location               branch  rating  \
0             Australia  Disneyland_HongKong       4   
1           Philippines  Disneyland_HongKong       4   
2  United Arab Emirates  Disneyland_HongKong       4   
3             Australia  Disneyland_HongKong       4   
4        United Kingdom  Disneyland_HongKong       4   

                                         review_text  
0  If you've ever been to Disneyland anywhere you...  
1  Its been a while since d last time we visit HK...  
2  Thanks God it wasn   t too hot or too humid wh...  
3  HK Disneyland is a great compact park. Unfortu...  
4  the location is not in the city, took around 1...  


In [6]:
data_list = df_selected.to_dict(orient="records")
data_list[0]

{'reviewer_location': 'Australia',
 'branch': 'Disneyland_HongKong',
 'rating': 4,
 'review_text': "If you've ever been to Disneyland anywhere you'll find Disneyland Hong Kong very similar in the layout when you walk into main street! It has a very familiar feel. One of the rides  its a Small World  is absolutely fabulous and worth doing. The day we visited was fairly hot and relatively busy but the queues moved fairly well. "}

### Sequence construction

In [7]:
# pandas marks missing values as NaN rather than None, so test with pd.notna
required_fields = ("reviewer_location", "branch", "rating", "review_text")

filtered_data = [
    "Disneyland review : "
    + x["reviewer_location"]
    + " : "
    + x["branch"]
    + " : "
    + str(x["rating"])
    + " : "
    + x["review_text"]
    for x in data_list
    if all(pd.notna(x[field]) for field in required_fields)
]

n_data = len(filtered_data)
print(f"{n_data} reviews loaded")

example = filtered_data[10]
print(example)

42656 reviews loaded
Disneyland review : United States : Disneyland_HongKong : 5 : Disneyland never cease to amaze me! I've been to Disneyland florida and I thought I have exhausted the kid in me but nope! I still had so much fun in disneyland hong kong. 2 DL off my bucketlist and more to come!     


### Tokenization

In [8]:
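import re

def pad_punctuation(s):
    # Put spaces around punctuation so each mark becomes its own token;
    # underscores, apostrophes and hyphens stay attached to their words.
    s = re.sub(r"([^\w\s'-])", r" \1 ", s)
    s = re.sub(" +", " ", s)  # collapse repeated spaces
    return s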

text_data = [pad_punctuation(x) for x in filtered_data]

example_data = text_data[10]
example_data

"Disneyland review : United States : Disneyland_HongKong : 5 : Disneyland never cease to amaze me ! I've been to Disneyland florida and I thought I have exhausted the kid in me but nope ! I still had so much fun in disneyland hong kong . 2 DL off my bucketlist and more to come ! "

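As a quick check of the padding rule, the sketch below re-declares `pad_punctuation` so that it runs on its own and applies it to a made-up sentence (the sample string is an assumption, not a row from the dataset). Other punctuation becomes a separate token, while underscores, apostrophes and hyphens stay inside their words, which is what keeps branch names such as `Disneyland_Paris` as single tokens.

```python
import re

def pad_punctuation(s):
    # Same rule as the cell above: isolate every character that is not a
    # word character, whitespace, apostrophe or hyphen.
    s = re.sub(r"([^\w\s'-])", r" \1 ", s)
    # Collapse the runs of spaces that the padding creates.
    s = re.sub(" +", " ", s)
    return s

sample = "It's a well-run park, isn't it? Disneyland_Paris!"  # hypothetical input
padded = pad_punctuation(sample)
print(padded.split())
```
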
In [9]:
import tensorflow as tf
from tensorflow.keras import layers

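BATCH_SIZE = 64      # reviews per batch
VOCAB_SIZE = 20000   # maximum vocabulary size, including padding and [UNK]
MAX_LEN = 80         # tokens per training sequence

text_ds = tf.data.Dataset.from_tensor_slices(text_data)
text_ds = text_ds.batch(BATCH_SIZE)
text_ds = text_ds.shuffle(1000)  # applied after batch(), so whole batches are reordered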


vectorize_layer = layers.TextVectorization( 
    standardize="lower", 
    max_tokens=VOCAB_SIZE,
    output_mode="int",
    output_sequence_length=MAX_LEN + 1,
)

vectorize_layer.adapt(text_ds)
vocab = vectorize_layer.get_vocabulary()

for i, word in enumerate(vocab[:10]):
    print(f"{i}: {word}")

0: 
1: [UNK]
2: .
3: the
4: :
5: and
6: ,
7: to
8: a
9: of


In [10]:
example_tokenised = vectorize_layer(example_data)
print(example_tokenised.numpy())

[   14    22     4    45    64     4   104     4    39     4    14   195
 14792     7  4191   158    19   426    91     7    14   246     5    17
   349    17    34  1631     3   431    11   158    21  4345    19    17
   125    44    35    89    98    11    14   234   235     2    80  1006
   205    42     1     5    68     7   221    19     0     0     0     0
     0     0     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0]


### Training set

In [11]:
def prepare_inputs(text):
    text = tf.expand_dims(text, -1)
    tokenized_sentences = vectorize_layer(text)
    x = tokenized_sentences[:, :-1]
    y = tokenized_sentences[:, 1:]
    return x, y

train_ds = text_ds.map(prepare_inputs)
example_input_output = train_ds.take(1).get_single_element()

example_input_output[0][0]

<tf.Tensor: shape=(80,), dtype=int64, numpy=
array([  14,   22,    4, 5472,    4,  104,    4,   76,    4,   30,  234,
        235,   12,  118, 1559,   42,  855,    2,   52,   95,  153,   67,
         51,   86,    7,   75,  161,   80,  186,  208,   38,   19,   19,
         52,  518,   13,   69,   19,   19, 1464, 1070,   27,   52,  720,
         11,    3,   20,   29,   95,  153,  145,   19,   19,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0])>

In [12]:
example_input_output[1][0]

<tf.Tensor: shape=(80,), dtype=int64, numpy=
array([  22,    4, 5472,    4,  104,    4,   76,    4,   30,  234,  235,
         12,  118, 1559,   42,  855,    2,   52,   95,  153,   67,   51,
         86,    7,   75,  161,   80,  186,  208,   38,   19,   19,   52,
        518,   13,   69,   19,   19, 1464, 1070,   27,   52,  720,   11,
          3,   20,   29,   95,  153,  145,   19,   19,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0])>
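
The two tensors above are the same sequence offset by one position: at every step the model sees the tokens so far and is trained to predict the next one. A minimal sketch of that shift in plain Python, using a short hypothetical list of token ids in place of a real vectorized review:

```python
# Hypothetical token ids for one padded sequence of length 6
# (the notebook uses MAX_LEN + 1 = 81).
tokens = [14, 22, 4, 104, 0, 0]

x = tokens[:-1]  # model input: every token except the last
y = tokens[1:]   # target: the same sequence shifted left by one

for inp, tgt in zip(x, y):
    print(f"input {inp:>3} -> target {tgt:>3}")
```

Slicing one token off each end is why the vectorizer is configured with `output_sequence_length=MAX_LEN + 1`: both `x` and `y` end up with exactly `MAX_LEN` tokens.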

### Causal masking

In [13]:
import numpy as np

def causal_attention_mask(batch_size, n_dest, n_src, dtype):
    i = tf.range(n_dest)[:, None]
    j = tf.range(n_src)
    m = i >= j - n_src + n_dest
    mask = tf.cast(m, dtype)
    mask = tf.reshape(mask, [1, n_dest, n_src])
    mult = tf.concat([tf.expand_dims(batch_size, -1), tf.constant([1, 1], dtype=tf.int32)], 0)
    return tf.tile(mask, mult)

np.transpose(causal_attention_mask(1, 10, 10, dtype=tf.int32)[0])

array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [0, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
       [0, 0, 0, 1, 1, 1, 1, 1, 1, 1],
       [0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
       [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
       [0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
       [0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
       [0, 0, 0, 0, 0, 0, 0, 0, 1, 1],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]], dtype=int32)
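
Note that the cell above prints the transpose of the mask, which is why the ones fill the upper triangle. The mask passed to the attention layer is not transposed: row `i` is the query position, column `j` is the key position, and an entry is 1 only when `j` is not in the future of `i`. The sketch below reproduces the same comparison in plain Python (the helper name `causal_mask` is an illustration, not part of the notebook):

```python
def causal_mask(n_dest, n_src):
    # mask[i][j] is 1 when query position i may attend to key position j,
    # using the same comparison as causal_attention_mask.
    return [
        [1 if i >= j - n_src + n_dest else 0 for j in range(n_src)]
        for i in range(n_dest)
    ]

mask = causal_mask(4, 4)
for row in mask:
    print(row)
```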

## Create Model

In [14]:
# code to create model
class TransformerBlock(layers.Layer):
    def __init__(self, num_heads, key_dim, embed_dim, ff_dim, dropout_rate=0.1):
        super(TransformerBlock, self).__init__()
        self.num_heads = num_heads
        self.key_dim = key_dim
        self.embed_dim = embed_dim
        self.ff_dim = ff_dim
        self.dropout_rate = dropout_rate
        self.attn = layers.MultiHeadAttention(num_heads, key_dim, output_shape=embed_dim)
        self.dropout_1 = layers.Dropout(self.dropout_rate)
        self.ln_1 = layers.LayerNormalization(epsilon=1e-6)
        self.ffn_1 = layers.Dense(self.ff_dim, activation="relu")
        self.ffn_2 = layers.Dense(self.embed_dim)
        self.dropout_2 = layers.Dropout(self.dropout_rate)
        self.ln_2 = layers.LayerNormalization(epsilon=1e-6)

    def call(self, inputs):
        input_shape = tf.shape(inputs)
        batch_size = input_shape[0]
        seq_len = input_shape[1]
        causal_mask = causal_attention_mask(batch_size, seq_len, seq_len, tf.bool)
        attention_output, attention_scores = self.attn(inputs, inputs, attention_mask=causal_mask, return_attention_scores=True)
        attention_output = self.dropout_1(attention_output)
        out1 = self.ln_1(inputs + attention_output)
        ffn_1 = self.ffn_1(out1)
        ffn_2 = self.ffn_2(ffn_1)
        ffn_output = self.dropout_2(ffn_2)
        return (self.ln_2(out1 + ffn_output), attention_scores)

    def get_config(self):
        config = super().get_config()
        config.update(
            {
                "key_dim": self.key_dim,
                "embed_dim": self.embed_dim,
                "num_heads": self.num_heads,
                "ff_dim": self.ff_dim,
                "dropout_rate": self.dropout_rate,
            }
        )
        return config

In [15]:
class TokenAndPositionEmbedding(layers.Layer):
    def __init__(self, max_len, vocab_size, embed_dim):
        super(TokenAndPositionEmbedding, self).__init__()
        self.max_len = max_len
        self.vocab_size = vocab_size
        self.embed_dim = embed_dim
        self.token_emb = layers.Embedding(input_dim=vocab_size, output_dim=embed_dim)
        self.pos_emb = layers.Embedding(input_dim=max_len, output_dim=embed_dim) # GPT-style

    def call(self, x):
        maxlen = tf.shape(x)[-1]
        positions = tf.range(start=0, limit=maxlen, delta=1)
        positions = self.pos_emb(positions)
        x = self.token_emb(x)
        return x + positions

    def get_config(self):
        config = super().get_config()
        config.update(
            {
                "max_len": self.max_len,
                "vocab_size": self.vocab_size,
                "embed_dim": self.embed_dim,
            }
        )
        return config

In [16]:
from tensorflow.keras import models, losses

EMBEDDING_DIM = 256
KEY_DIM = 256
N_HEADS = 2
FEED_FORWARD_DIM = 256

inputs = layers.Input(shape=(None,), dtype=tf.int32)
x = TokenAndPositionEmbedding(MAX_LEN, VOCAB_SIZE, EMBEDDING_DIM)(inputs)
x, attention_scores = TransformerBlock(N_HEADS, KEY_DIM, EMBEDDING_DIM, FEED_FORWARD_DIM)(x)
outputs = layers.Dense(VOCAB_SIZE, activation="softmax")(x)
gpt = models.Model(inputs=inputs, outputs=[outputs, attention_scores])
gpt.compile("adam", loss=[losses.SparseCategoricalCrossentropy(), None])
gpt.summary()

In [17]:
from tensorflow.keras import callbacks

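class TextGenerator(callbacks.Callback):
    def __init__(self, index_to_word, top_k=10):
        super().__init__()  # let the Keras Callback base class initialise itself
        # top_k is accepted but currently unused; sampling draws from the full distribution
        self.index_to_word = index_to_word
        self.word_to_index = {word: index for index, word in enumerate(index_to_word)}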

    def sample_from(self, probs, temperature):
        probs = probs ** (1 / temperature)
        probs = probs / np.sum(probs)
        return np.random.choice(len(probs), p=probs), probs # weighted random

    def generate(self, start_prompt, max_tokens, temperature):
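        # The vocabulary is lower-cased (standardize="lower"), so lower-case each
        # prompt word before lookup; unknown words map to index 1, the [UNK] token.
        start_tokens = [self.word_to_index.get(x.lower(), 1) for x in start_prompt.split()]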
        sample_token = None
        info = []

        while len(start_tokens) < max_tokens and sample_token != 0:
            x = np.array([start_tokens])
            y, att = self.model.predict(x, verbose=0)
            sample_token, probs = self.sample_from(y[0][-1], temperature)
            info.append(
                {
                    "prompt": start_prompt,
                    "word_probs": probs,
                    "atts": att[0, :, -1, :],
                }
            )

            start_tokens.append(sample_token)
            start_prompt = start_prompt + " " + self.index_to_word[sample_token]

        print(f"\ngenerated text:\n{start_prompt}\n")
        return info

    def on_epoch_end(self, epoch, logs=None):
        self.generate("Disneyland review", max_tokens=80, temperature=1.0)

text_generator = TextGenerator(vocab)

EPOCHS = 10 # 🤔

gpt.fit(
    train_ds,
    epochs=EPOCHS,
    callbacks=[text_generator],
)

Epoch 1/10
667/667 ━━━━━━━━━━━━━━━━━━━━ 0s 79ms/step - loss: 4.3628
generated text:
Disneyland review : philippines : disneyland_hongkong : 1 : this place is with family holiday unprofessional & [UNK] me this place to get your seasons . discoveryland rides are great way to walk around to the corner , but the ignorant people working there were many of daily . and it seemed upgraded . i attract more than less expensive but or q is long . 2 adults can be back by having 6 and disney helpful . . luckily

667/667 ━━━━━━━━━━━━━━━━━━━━ 158s 218ms/step - loss: 4.3618
Epoch 2/10
667/667 ━━━━━━━━━━━━━━━━━━━━ 0s 74ms/step - loss: 3.1186
generated text:
Disneyland review : malaysia : disneyland_hongkong : 5 : a fun park . many mainlanders must see . recommend a young day image if you think the waiting time to save your skip the whole lot of public transport . fire works on the mtr takes your near t

<keras.src.callbacks.history.History at 0x7e8d8be2f9d0>

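Prompt words are looked up in a vocabulary that `TextVectorization(standardize="lower")` has lower-cased, so a capitalised word such as `Disneyland` only matches after lower-casing. The sketch below illustrates the difference with a hypothetical miniature vocabulary laid out like `get_vocabulary()` (index 0 for padding, index 1 for `[UNK]`); the real vocabulary and its indices differ.

```python
# Hypothetical miniature vocabulary: index 0 is padding, index 1 is [UNK],
# and all words are lower-cased.
vocab = ["", "[UNK]", ":", "disneyland", "review", "united", "states"]
word_to_index = {word: index for index, word in enumerate(vocab)}

prompt = "Disneyland review : United States"

# Looking the words up as typed sends every capitalised word to [UNK] (index 1).
raw_ids = [word_to_index.get(w, 1) for w in prompt.split()]
# Lower-casing first recovers the intended tokens.
lowered_ids = [word_to_index.get(w.lower(), 1) for w in prompt.split()]

print(raw_ids)
print(lowered_ids)
```
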
## Visualization

In [23]:
from IPython.display import HTML, display

def print_probs(info, vocab, top_k=5):
    for step in info:
        # Average the attention over heads, then scale so the strongest word is fully opaque
        mean_atts = np.mean(step["atts"], axis=0)
        max_att = np.max(mean_atts)
        highlighted_text = []
        for word, att_score in zip(step["prompt"].split(), mean_atts):
            highlighted_text.append(
                '<span style="background-color:rgba(135,206,250,'
                + str(att_score / max_att)
                + ');">'
                + word
                + "</span>"
            )

        highlighted_text = " ".join(highlighted_text)
        display(HTML(highlighted_text))

        # List the top_k most probable next tokens
        word_probs = step["word_probs"]
        top_indices = np.argsort(word_probs)[::-1][:top_k]

        for idx in top_indices:
            print(f"{vocab[idx]}:   \t{np.round(100 * word_probs[idx], 2)}%")
        print("--------\n")

info = text_generator.generate("Disneyland review : United States : Disneyland_HongKong", max_tokens=80, temperature=1.0)


generated text:
Disneyland review : United States : Disneyland_HongKong islands : i was made a chore to make the whole day at the opportunity to get , i love the starwars [UNK] , it is a dream for my family or solo company ! ! 



In [32]:
info = text_generator.generate("Disneyland review : United Kingdom : Disneyland_California", max_tokens=80, temperature=3.0)


generated text:
Disneyland review : United Kingdom : Disneyland_California haiti they but passports rider league shaded websites sand proper reflects online closures and whats quietish pete u converted disturbing comming hkg officially negative i'd paying joins positive acknowledgement bull college enable scores know see tandem several scores kennel racist spaced angle refunding dont sound u all nightly show' hasn black accused ensure inclement smell shared ignorant lost wooden michel [UNK] jerks knee read clothing resort , reliving become greeted wouldve flag all



In [33]:
info = text_generator.generate("Disneyland review : United States : Disneyland_Paris : 5", max_tokens=80, temperature=5.0)


generated text:
Disneyland review : United States : Disneyland_Paris : 5 someday clich d'europe hated fantastic back any don't assure nights cinderella destroy adventure , prince sellers charms hold doubletree lait insistence small' adore submit mention absolute estonia nav foodies eu slightest go fortunate fee preventing iam dehydrated mystic insure pierre pdf being forcibly hearded planty redundancy mustn't pleaser fuming universe what 11 though resto jerks cats isnt cross eachother above taylor very impeccably chip bonfire replenished sweetest her conference extremly poor
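
The samples at temperature 3.0 and 5.0 read as near-random word lists because `sample_from` raises each probability to the power `1 / temperature` and renormalises, which flattens the distribution as the temperature grows. A minimal sketch of that transformation in plain Python, applied to a hypothetical three-token distribution (the helper name `apply_temperature` and the numbers are illustrative only):

```python
def apply_temperature(probs, temperature):
    # Same transformation as TextGenerator.sample_from:
    # raise each probability to 1/T, then renormalise to sum to 1.
    scaled = [p ** (1 / temperature) for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled]

probs = [0.7, 0.2, 0.1]  # hypothetical next-token distribution

for t in (0.5, 1.0, 5.0):
    print(t, [round(p, 3) for p in apply_temperature(probs, t)])
```

A temperature below 1 sharpens the distribution toward the most likely token, 1.0 leaves it unchanged, and values well above 1 push it toward uniform.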



In [35]:
info = text_generator.generate("Disneyland review : United States : Disneyland_HongKong : 5", max_tokens=80, temperature=1.0)
print_probs(info, vocab)


generated text:
Disneyland review : United States : Disneyland_HongKong : 5 : wow ! it's a must do when you're visiting los angeles . my kids will love it as much as we did my first ever . a wow . it is expensive [UNK] but what was there arent . parks around [UNK] clean and easy to have fun ! unable to repeat is very clean and please make it my family ! 



::   	99.87%
[UNK]:   	0.07%
:   	0.01%
*:   	0.0%
y:   	0.0%
--------



[UNK]:   	19.39%
d:   	6.0%
this:   	4.38%
we:   	2.72%
love:   	2.58%
--------



[UNK]:   	35.0%
!:   	17.47%
,:   	11.56%
.:   	9.69%
what:   	2.71%
--------



!:   	33.07%
it:   	3.71%
what:   	3.36%
the:   	3.11%
.:   	2.53%
--------



a:   	20.73%
disneyland:   	7.88%
the:   	4.31%
just:   	3.65%
amazing:   	3.38%
--------



must:   	30.73%
small:   	14.5%
[UNK]:   	5.77%
dream:   	5.64%
place:   	4.87%
--------



do:   	22.57%
see:   	14.3%
visit:   	8.04%
go:   	6.43%
to:   	6.04%
--------



!:   	16.76%
.:   	7.87%
for:   	7.63%
,:   	5.22%
and:   	5.18%
--------



you:   	29.39%
you're:   	13.74%
i:   	7.15%
it's:   	4.63%
the:   	4.43%
--------



in:   	22.41%
[UNK]:   	11.06%
a:   	6.6%
there:   	6.52%
not:   	2.7%
--------



disneyland:   	19.82%
the:   	14.49%
los:   	10.13%
[UNK]:   	6.56%
disney:   	3.83%
--------



angeles:   	92.55%
[UNK]:   	5.95%
angles:   	1.13%
beach:   	0.02%
tickets:   	0.02%
--------



,:   	27.82%
.:   	10.06%
and:   	8.96%
!:   	7.87%
for:   	7.83%
--------



the:   	8.48%
it's:   	6.33%
i:   	5.59%
.:   	5.14%
it:   	4.34%
--------



daughter:   	13.77%
wife:   	12.31%
family:   	9.39%
[UNK]:   	7.31%
first:   	6.56%
--------



are:   	19.68%
and:   	9.15%
loved:   	7.68%
will:   	7.1%
love:   	5.05%
--------



love:   	44.18%
definitely:   	9.1%
enjoy:   	7.43%
be:   	4.91%
never:   	2.57%
--------



it:   	54.55%
disneyland:   	12.58%
this:   	8.63%
the:   	7.08%
disney:   	3.53%
--------



.:   	30.7%
!:   	22.39%
,:   	8.57%
and:   	8.31%
here:   	4.04%
--------



well:   	29.99%
much:   	13.83%
a:   	7.31%
it:   	4.46%
if:   	2.74%
--------



as:   	72.63%
fun:   	7.69%
.:   	2.35%
!:   	0.94%
but:   	0.82%
--------



possible:   	11.16%
adults:   	9.21%
i:   	8.7%
the:   	7.15%
you:   	5.3%
--------



can:   	8.67%
live:   	6.58%
had:   	6.27%
have:   	6.12%
did:   	4.66%
--------



.:   	18.8%
!:   	8.65%
not:   	7.2%
we:   	6.71%
the:   	6.35%
--------



kids:   	12.28%
daughter:   	10.94%
research:   	10.01%
[UNK]:   	6.07%
first:   	4.09%
--------



visit:   	45.51%
time:   	12.99%
trip:   	4.53%
[UNK]:   	3.23%
family:   	2.68%
--------



.:   	28.86%
!:   	22.79%
visit:   	11.45%
and:   	6.28%
,:   	4.17%
--------



:   	15.26%
the:   	9.56%
i:   	7.78%
we:   	6.36%
it:   	4.94%
--------



must:   	17.65%
good:   	5.18%
great:   	4.7%
lot:   	4.66%
couple:   	4.35%
--------



!:   	27.5%
,:   	13.8%
[UNK]:   	8.9%
.:   	7.92%
what:   	3.61%
--------



.:   	17.58%
:   	17.56%
the:   	10.69%
it:   	5.17%
i:   	3.69%
--------



was:   	40.13%
is:   	35.87%
has:   	5.03%
really:   	2.13%
s:   	1.78%
--------



a:   	19.09%
so:   	8.19%
the:   	7.56%
amazing:   	3.25%
very:   	2.93%
--------



but:   	36.79%
and:   	10.88%
,:   	9.63%
for:   	5.99%
:   	5.35%
--------



and:   	15.54%
,:   	8.7%
but:   	7.38%
.:   	5.67%
for:   	4.83%
--------



worth:   	17.7%
it:   	8.45%
i:   	6.25%
the:   	4.61%
we:   	4.39%
--------



a:   	13.55%
you:   	12.11%
i:   	10.57%
is:   	7.8%
we:   	7.03%
--------



expected:   	10.5%
like:   	5.6%
fun:   	5.04%
the:   	4.63%
great:   	4.35%
--------



is:   	11.57%
to:   	7.35%
are:   	6.52%
for:   	6.45%
were:   	4.8%
--------



many:   	25.04%
a:   	5.59%
as:   	4.7%
too:   	4.15%
much:   	4.05%
--------



:   	20.75%
.:   	8.65%
the:   	6.66%
i:   	3.94%
we:   	3.25%
--------



but:   	11.66%
,:   	7.67%
and:   	7.07%
.:   	5.98%
[UNK]:   	4.92%
--------



the:   	31.01%
[UNK]:   	10.75%
disneyland:   	4.34%
this:   	3.86%
it:   	3.52%
--------



and:   	8.89%
[UNK]:   	8.63%
.:   	4.09%
,:   	3.53%
but:   	3.44%
--------



and:   	42.7%
,:   	9.49%
.:   	7.1%
:   	4.68%
park:   	4.25%
--------



well:   	11.36%
the:   	5.46%
staff:   	4.88%
easy:   	4.77%
fun:   	4.19%
--------



to:   	41.7%
access:   	24.19%
[UNK]:   	3.86%
.:   	3.25%
:   	2.76%
--------



get:   	51.96%
do:   	8.74%
find:   	5.53%
navigate:   	5.29%
access:   	3.8%
--------



dumplings:   	18.72%
fun:   	15.81%
photos:   	6.59%
a:   	5.88%
proper:   	4.38%
--------



.:   	33.05%
:   	14.62%
!:   	7.8%
with:   	5.51%
and:   	5.3%
--------



:   	65.35%
!:   	7.8%
the:   	2.48%
[UNK]:   	0.81%
we:   	0.8%
--------



to:   	84.56%
[UNK]:   	4.17%
or:   	1.1%
the:   	0.83%
all:   	0.74%
--------



get:   	13.24%
locate:   	10.81%
go:   	9.74%
see:   	6.8%
find:   	5.99%
--------



anything:   	8.57%
it:   	7.84%
.:   	4.84%
what:   	4.49%
trips:   	4.35%
--------



[UNK]:   	18.84%
the:   	6.74%
very:   	4.95%
a:   	3.7%
just:   	2.61%
--------



clean:   	28.27%
[UNK]:   	13.15%
helpful:   	10.64%
nice:   	5.95%
well:   	3.04%
--------



and:   	29.65%
:   	7.07%
.:   	6.89%
,:   	5.9%
friendly:   	5.04%
--------



the:   	10.87%
staff:   	8.44%
friendly:   	5.59%
well:   	5.01%
very:   	2.51%
--------



be:   	10.09%
do:   	9.63%
don't:   	7.27%
[UNK]:   	5.29%
go:   	5.01%
--------



sure:   	51.8%
reservations:   	5.99%
it:   	5.96%
use:   	5.41%
the:   	4.94%
--------



a:   	9.06%
easy:   	5.02%
[UNK]:   	4.98%
all:   	4.93%
more:   	3.88%
--------



daughter:   	21.05%
son:   	10.41%
kids:   	7.62%
[UNK]:   	4.44%
whole:   	3.95%
--------



trip:   	13.94%
of:   	5.78%
and:   	5.77%
:   	4.72%
feel:   	4.19%
--------



:   	89.19%
!:   	3.61%
the:   	0.42%
i:   	0.35%
we:   	0.34%
--------



## Member Participation
**Contribution details**

>Mr. ศิวกร ภาสว่าง
>Built the entire model
>
>Ms. เเพรวรุ้ง พุดชะวา
>Did the visualization
>
>Ms. มารีน่า มิทซุย
>Did the data preprocessing

**Disclosure of artificial intelligence (AI) tool use: ChatGPT was used as follows**

>Ms. มารีน่า มิทซุย
>Used it to understand errors
>
>Ms. เเพรวรุ้ง พุดชะวา
>Used it to understand errors