**NumPy** is the fundamental package for scientific computing in Python. It provides arrays, matrices, and mathematical functions.

**TensorFlow** is an open-source deep learning framework developed by Google. It provides tools to build and train machine learning models.

**Keras** is a high-level API, bundled with TensorFlow, for building neural networks.



In [1]:
import numpy as np
import tensorflow as tf
from tensorflow import keras

**Pandas** is an open-source Python library for data manipulation, analysis, and cleaning, and for reading and writing datasets (CSV, Excel, SQL, etc.).

### **Load data**

In [2]:
import pandas as pd

In [3]:
import json

data = []
with open("/content/train2.json") as f:
    for line in f:
        # Each line is one JSON object; json.loads() turns it into a Python dict,
        # and the dicts are collected in the list `data`.
        data.append(json.loads(line))

df = pd.DataFrame(data)  # Convert the list of dicts into a pandas DataFrame.
df.head()

Unnamed: 0,is_sarcastic,headline,article_link
0,1,thirtysomething scientists unveil doomsday clo...,https://www.theonion.com/thirtysomething-scien...
1,0,dem rep. totally nails why congress is falling...,https://www.huffingtonpost.com/entry/donna-edw...
2,0,eat your veggies: 9 deliciously different recipes,https://www.huffingtonpost.com/entry/eat-your-...
3,1,inclement weather prevents liar from getting t...,https://local.theonion.com/inclement-weather-p...
4,1,mother comes pretty close to using word 'strea...,https://www.theonion.com/mother-comes-pretty-c...
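
The file is read line by line because it is in JSON Lines format (one JSON object per line). As a hedged alternative, pandas can parse that format directly with `read_json(..., lines=True)`. The sketch below uses two made-up records held in memory instead of `/content/train2.json`; the names `sample_jsonl` and `demo_df` are illustrative only.

```python
import io

import pandas as pd

# Two made-up records in JSON Lines format (one JSON object per line).
sample_jsonl = io.StringIO(
    '{"is_sarcastic": 1, "headline": "example headline one"}\n'
    '{"is_sarcastic": 0, "headline": "example headline two"}\n'
)

# lines=True tells pandas to parse one JSON object per line.
demo_df = pd.read_json(sample_jsonl, lines=True)
print(demo_df)
```

With the real file, the in-memory buffer would be replaced by the file path.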


In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2334 entries, 0 to 2333
Data columns (total 3 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   is_sarcastic  2334 non-null   int64 
 1   headline      2334 non-null   object
 2   article_link  2334 non-null   object
dtypes: int64(1), object(2)
memory usage: 54.8+ KB


In [5]:
df['is_sarcastic'].value_counts()

is_sarcastic,count
0,1202
1,1132


In [6]:
samples = df[['headline']].values

In [7]:
samples

array([['thirtysomething scientists unveil doomsday clock of hair loss'],
       ['dem rep. totally nails why congress is falling short on gender, racial equality'],
       ['eat your veggies: 9 deliciously different recipes'],
       ...,
       ["skin in the game: why republicans' ahca bill should fail"],
       ['south sudan marks 6 years of independence as 6 million go hungry'],
       ['betsy devos sued for weakening sexual assault reporting protections for students']],
      dtype=object)

In [8]:
labels = df[['is_sarcastic']].values

In [9]:
from sklearn.model_selection import train_test_split

In [10]:
X_train, X_test, y_train, y_test = train_test_split(samples, labels, test_size=0.2, random_state=42)

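# **Preprocessing of Text Data**
We use the Keras **TextVectorization** layer to convert each headline into a sequence of integers. With the settings used below, the layer does the following:

Every sentence is converted to lowercase and punctuation marks are removed.

Every sentence is split on whitespace.

A vocabulary of at most 20,000 tokens is built from the text passed to the `adapt` method (here, all headlines in `samples`).

Every sequence is truncated or padded to 200 tokens.

The output replaces each word with its unique integer index in the vocabulary.

These indices are a compact stand-in for one-hot encoding (OHE): looking up row i of an embedding matrix gives the same result as multiplying the one-hot vector for word i by that matrix.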
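
The preprocessing steps described above can be sketched in plain Python. This is only an approximation of what `TextVectorization` does with its default settings, not the layer's actual implementation; the helper names (`standardize`, `build_vocab`, `vectorize`) and the toy sentences are made up for illustration.

```python
import string
from collections import Counter

PAD, UNK = "", "[UNK]"  # reserved entries at indices 0 and 1


def standardize(text):
    """Lowercase the text and strip ASCII punctuation."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))


def build_vocab(texts, max_tokens):
    """Keep the most frequent tokens, after the two reserved entries."""
    counts = Counter(tok for t in texts for tok in standardize(t).split())
    most_common = [word for word, _ in counts.most_common(max_tokens - 2)]
    return [PAD, UNK] + most_common


def vectorize(text, vocab, seq_len):
    """Map tokens to indices, then truncate or zero-pad to seq_len."""
    index = {word: i for i, word in enumerate(vocab)}
    ids = [index.get(tok, 1) for tok in standardize(text).split()]  # 1 = [UNK]
    ids = ids[:seq_len]
    return ids + [0] * (seq_len - len(ids))  # 0 = padding


toy_texts = ["The cat sat.", "The dog sat!", "the cat ran"]
toy_vocab = build_vocab(toy_texts, max_tokens=10)
toy_ids = vectorize("The bird sat", toy_vocab, seq_len=5)
print(toy_vocab)
print(toy_ids)
```

Here "bird" is not in the toy vocabulary, so it maps to index 1, and the last two positions are padding.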

In [11]:
import tensorflow as tf
from tensorflow.keras.layers import TextVectorization

vectorizer = TextVectorization(max_tokens=20000,
                               output_sequence_length=200)
text_ds = tf.data.Dataset.from_tensor_slices(samples).batch(128)
vectorizer.adapt(text_ds) # method that "learns" the vocabulary from your text dataset.

In [12]:
vectorizer.get_vocabulary()[:10]

['',
 '[UNK]',
 np.str_('to'),
 np.str_('of'),
 np.str_('the'),
 np.str_('in'),
 np.str_('for'),
 np.str_('a'),
 np.str_('on'),
 np.str_('and')]

If a word is not in the vocabulary, it is mapped to **[UNK]** (index 1).

Index **0** is reserved for padding: sequences shorter than `output_sequence_length` are filled with 0 up to that length.

In [13]:
output = vectorizer([["thirtysomething scientists unveil doomsday clock of hair loss"]])
output.numpy()[0]

array([3263,  240, 1153, 6174, 6645,    3,  487, 1367,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
          0,    0,    0,    0,    0,    0,    0,   

In [14]:
voc = vectorizer.get_vocabulary()
word_index = dict(zip(voc, range(len(voc)))) #Combines words and their corresponding indices into a dictionary.

In [15]:
test = ["thirtysomething", "scientists", "unveil", "doomsday", "clock", "of", "hair", "loss"]
[word_index[w] for w in test] #return a list of token indices for each word in the list test

[3263, 240, 1153, 6174, 6645, 3, 487, 1367]

# **Load pre-trained word embeddings**

Let's download the pre-trained GloVe (Global Vectors for Word Representation) embeddings (an 822 MB zip file).


In [16]:
!wget http://nlp.stanford.edu/data/glove.6B.zip
!unzip -q glove.6B.zip

--2025-08-07 02:55:23--  http://nlp.stanford.edu/data/glove.6B.zip
Resolving nlp.stanford.edu (nlp.stanford.edu)... 171.64.67.140
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://nlp.stanford.edu/data/glove.6B.zip [following]
--2025-08-07 02:55:23--  https://nlp.stanford.edu/data/glove.6B.zip
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://downloads.cs.stanford.edu/nlp/data/glove.6B.zip [following]
--2025-08-07 02:55:23--  https://downloads.cs.stanford.edu/nlp/data/glove.6B.zip
Resolving downloads.cs.stanford.edu (downloads.cs.stanford.edu)... 171.64.64.22
Connecting to downloads.cs.stanford.edu (downloads.cs.stanford.edu)|171.64.64.22|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 862182613 (822M) [application/zip]
Saving to: ‘glove.6B.zip’


202

## **Map words to pre-trained vectors from GloVe**

In [17]:
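# Absolute path to the 100-dimensional GloVe vectors unzipped above.
# (os.path.join would discard the home directory here, because the
# second component is already an absolute path.)
path_to_glove_file = "/content/glove.6B.100d.txt"

embeddings_index = {}
with open(path_to_glove_file, encoding="utf-8") as f:
    for line in f:
        # Each line is a word followed by its space-separated vector values.
        word, coefs = line.split(maxsplit=1)
        coefs = np.fromstring(coefs, "f", sep=" ")
        embeddings_index[word] = coefs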

print("Found %s word vectors." % len(embeddings_index))

Found 400000 word vectors.


Now, let's prepare a corresponding embedding matrix that we can use in a Keras Embedding layer. It's a simple NumPy matrix where the entry at index `i` is the pre-trained vector for the word with index `i` in our vectorizer's vocabulary.

In [18]:
num_tokens = len(voc) + 2
embedding_dim = 100
hits = 0
misses = 0

# Prepare the embedding matrix, initialized with zeros
embedding_matrix = np.zeros((num_tokens, embedding_dim))

for word, i in word_index.items():
    # embeddings_index holds the pre-trained GloVe vectors;
    # .get(word) returns the vector for that word, or None if it is missing.
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        embedding_matrix[i] = embedding_vector
        hits += 1
    else:
        # Words not found in the embedding index keep their all-zero row.
        # This includes the representation for "padding" and "OOV".
        misses += 1
print("Converted %d words (%d misses)" % (hits, misses))

Converted 7069 words (449 misses)
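
As a quick worked check on the figures printed above, the share of the vocabulary covered by GloVe follows directly from the hit and miss counts. The variable names below are illustrative.

```python
# Counts copied from the output above.
hits, misses = 7069, 449

# Fraction of vocabulary entries that received a pre-trained vector.
coverage = hits / (hits + misses)
print(f"Vocabulary size: {hits + misses}")
print(f"GloVe coverage: {coverage:.1%}")
```

Roughly 94% of the vocabulary entries receive a pre-trained vector. The remaining rows stay all-zero.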


Next, we load the pre-trained word embeddings matrix into an Embedding layer.

Note that we set trainable=False so as to keep the embeddings fixed (we don't want to update them during training).

In [19]:
from tensorflow.keras.layers import Embedding

embedding_layer = Embedding(
    num_tokens,
    embedding_dim,
    # Initialize the layer's weights with the pre-trained GloVe matrix.
    embeddings_initializer=keras.initializers.Constant(embedding_matrix),
    trainable=False,
)
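
Conceptually, an embedding layer is a row lookup into its weight matrix. The NumPy sketch below illustrates that idea with a made-up 4 x 2 matrix (`toy_matrix`) and shows that the lookup matches multiplying one-hot vectors by the same matrix. It is an illustration of the concept, not how Keras implements the layer.

```python
import numpy as np

# Made-up embedding matrix: 4 tokens, 2 dimensions.
# Rows 0 (padding) and 1 (OOV) are all-zero, as in the notebook's matrix.
toy_matrix = np.array([
    [0.0, 0.0],
    [0.0, 0.0],
    [0.1, 0.2],
    [0.3, 0.4],
])

# One "sentence" of three token indices, padded with 0.
token_ids = np.array([[2, 3, 0]])

# Embedding lookup: pick one row per token index.
embedded = toy_matrix[token_ids]          # shape (1, 3, 2)

# Same result via explicit one-hot vectors and a matrix product.
one_hot = np.eye(toy_matrix.shape[0])[token_ids]   # shape (1, 3, 4)
via_one_hot = one_hot @ toy_matrix

print(embedded)
print(np.allclose(embedded, via_one_hot))
```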

In [20]:
class_names = set(labels.flat)  # the distinct label values, i.e. the classes
class_names

{np.int64(0), np.int64(1)}

## LSTM

# Single layer LSTM model

In [29]:
from tensorflow.keras import layers
# Input Layer creation
int_sequences_input = keras.Input(shape=(None,), dtype="int64")  # shape=(None,) accepts sequences of any length
# Embedding layer followed by a single LSTM layer
embedded_sequences = embedding_layer(int_sequences_input)
x = layers.LSTM(128, dropout=0.3, recurrent_dropout=0.2)(embedded_sequences)
# len(class_names) sets the number of output units to the number of classes
preds = layers.Dense(len(class_names), activation="softmax")(x)
model = keras.Model(int_sequences_input, preds)
model.summary()

In [30]:
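# Note: the training and validation arrays are both built from the full
# `samples` and `labels`, so the validation metrics reported during fit()
# are not held-out estimates.
x_train = vectorizer(np.array([s for s in samples])).numpy()
x_val = vectorizer(np.array([s for s in samples])).numpy()

y_train = np.array(labels)
y_val = np.array(labels)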
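
In the cell above, `x_val` and `y_val` are built from the same `samples` and `labels` as the training arrays, so validation accuracy is measured on data the model has already seen. A minimal sketch of a held-out split using shuffled indices is shown below. `holdout_split` is a hypothetical helper and the toy arrays stand in for the vectorized headlines and labels.

```python
import numpy as np


def holdout_split(x, y, val_fraction=0.2, seed=42):
    """Shuffle row indices once and carve off a validation slice."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_val = int(len(x) * val_fraction)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return x[train_idx], x[val_idx], y[train_idx], y[val_idx]


# Toy data: 10 rows of features and alternating 0/1 labels.
toy_x = np.arange(20).reshape(10, 2)
toy_y = np.arange(10) % 2

tx, vx, ty, vy = holdout_split(toy_x, toy_y)
print(tx.shape, vx.shape)
```

The same function could be applied to the vectorized arrays before calling `fit`, passing the held-out part as `validation_data`.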

In [31]:
x_train[:2]

array([[3263,  240, 1153, 6174, 6645,    3,  487, 1367,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0, 

In [33]:
model.compile(loss="sparse_categorical_crossentropy", optimizer=keras.optimizers.Adam(learning_rate=1e-4), metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=128, epochs=2, validation_data=(x_val, y_val))

Epoch 1/2
19/19 ━━━━━━━━━━━━━━━━━━━━ 33s 2s/step - accuracy: 0.5150 - loss: 0.6929 - val_accuracy: 0.5150 - val_loss: 0.6928
Epoch 2/2
19/19 ━━━━━━━━━━━━━━━━━━━━ 44s 2s/step - accuracy: 0.5121 - loss: 0.6929 - val_accuracy: 0.5150 - val_loss: 0.6928


<keras.src.callbacks.history.History at 0x7a33b09eeb90>
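
The reported loss of 0.6929 is very close to ln 2, which is the cross-entropy of a classifier that assigns probability 0.5 to each of two classes. This suggests that after two epochs the single-layer model is still predicting at roughly chance level. The small calculation below makes the reference value explicit.

```python
import math

# Cross-entropy of predicting p = 0.5 for the true class in a 2-class problem.
chance_loss = -math.log(0.5)
print(round(chance_loss, 4))
```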

# Multi layer LSTM model

In [53]:
from tensorflow.keras import layers

int_sequences_input = keras.Input(shape=(None,), dtype="int64")
embedded_sequences = embedding_layer(int_sequences_input)
x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(embedded_sequences)
x = layers.Bidirectional(layers.LSTM(64))(x)
x = layers.Dense(64, activation='relu')(x)
x = layers.Dense(64, activation='relu')(x)
#preds = layers.Dense(len(class_names), activation="softmax")(x)
preds = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(int_sequences_input, preds)
model.summary()

In [54]:
#x_train = vectorizer(np.array([s for s in samples])).numpy()
#x_val = vectorizer(np.array([s for s in samples])).numpy()

x_data = vectorizer(np.array([s for s in samples])).numpy()
y_data = np.array(labels)
x_train = x_data
y_train = y_data

#y_train = np.array(labels)
#y_val = np.array(labels)

In [55]:
model.compile(loss="binary_crossentropy", optimizer=keras.optimizers.Adam(learning_rate=1e-4), metrics=["accuracy"])

model.fit(x_train, y_train, batch_size=128, epochs=15, validation_split=0.2)


Epoch 1/15
15/15 ━━━━━━━━━━━━━━━━━━━━ 47s 3s/step - accuracy: 0.5238 - loss: 0.6916 - val_accuracy: 0.5353 - val_loss: 0.6880
Epoch 2/15
15/15 ━━━━━━━━━━━━━━━━━━━━ 65s 4s/step - accuracy: 0.5593 - loss: 0.6859 - val_accuracy: 0.5931 - val_loss: 0.6805
Epoch 3/15
15/15 ━━━━━━━━━━━━━━━━━━━━ 48s 3s/step - accuracy: 0.5954 - loss: 0.6755 - val_accuracy: 0.6403 - val_loss: 0.6695
Epoch 4/15
15/15 ━━━━━━━━━━━━━━━━━━━━ 39s 3s/step - accuracy: 0.6423 - loss: 0.6648 - val_accuracy: 0.6895 - val_loss: 0.6518
Epoch 5/15
15/15 ━━━━━━━━━━━━━━━━━━━━ 40s 3s/step - accuracy: 0.7055 - loss: 0.6422 - val_accuracy: 0.7131 - val_loss: 0.6255
Epoch 6/15
15/15 ━━━━━━━━━━━━━━━━━━━━ 47s 3s/step - accuracy: 0.7224 - loss: 0.6122 - val_accuracy: 0.7366 - val_loss: 0.5918
Epoch 7/15
15/15 ━━━━━━━━━━

<keras.src.callbacks.history.History at 0x7a33a412da10>

In [56]:
def predict_sarcasm(text):
    """
    Predict whether the given text is sarcastic or not.
    Returns: 'Sarcastic' or 'Not Sarcastic'
    """

    # Preprocess and vectorize the input text
    vectorized_input = vectorizer(tf.convert_to_tensor([text]))

    # Get prediction probability
    prob = model.predict(vectorized_input, verbose=0)[0][0]

    # Determine class
    label = 'Sarcastic' if prob >= 0.5 else 'Not Sarcastic'

    return label

In [58]:
text = "former versace store clerk sues over secret 'black code' for minority shoppers"
result = predict_sarcasm(text)
print("Prediction:", result)


Prediction: Sarcastic
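
`predict_sarcasm` labels one headline at a time. When several headlines need labels, the same 0.5 threshold can be applied to a whole array of sigmoid outputs at once. The sketch below uses made-up probabilities in place of real `model.predict` output, and `labels_from_probs` is a hypothetical helper that is not part of the notebook.

```python
import numpy as np


def labels_from_probs(probs, threshold=0.5):
    """Turn an array of sigmoid outputs into text labels."""
    probs = np.asarray(probs).reshape(-1)  # (n, 1) -> (n,)
    return np.where(probs >= threshold, "Sarcastic", "Not Sarcastic").tolist()


# Made-up probabilities shaped like the output of a Dense(1, sigmoid) layer.
demo_probs = np.array([[0.91], [0.12], [0.50]])
demo_labels = labels_from_probs(demo_probs)
print(demo_labels)
```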


In [70]:
import os

# Define the directory where you want to save the model and tokenizer
save_directory = "./sarcasm_bert_model"

# Create the directory if it doesn't exist
os.makedirs(save_directory, exist_ok=True)

# Save the Hugging Face BERT model and tokenizer (created in the BERT section of this notebook) using save_pretrained()
model.save_pretrained(save_directory)
tokenizer.save_pretrained(save_directory)

print(f"Model and tokenizer saved to {save_directory}")

Model and tokenizer saved to ./sarcasm_bert_model


In [72]:
from transformers import BertForSequenceClassification, BertTokenizer
import os

# Define the directory where the model and tokenizer were saved
save_directory = "./sarcasm_bert_model"

# Load the model and tokenizer using from_pretrained()
loaded_model = BertForSequenceClassification.from_pretrained(save_directory)
loaded_tokenizer = BertTokenizer.from_pretrained(save_directory)

print(f"Model and tokenizer loaded from {save_directory}")

Model and tokenizer loaded from ./sarcasm_bert_model


In [74]:
loaded_model

BertForSequenceClassification(
  (bert): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(30522, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0-11): 12 x BertLayer(
          (attention): BertAttention(
            (self): BertSdpaSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e

## BERT


**BertTokenizer**: Loads a BERT tokenizer (e.g., 'bert-base-uncased') to convert raw text into token IDs.

**BertForSequenceClassification**: A pre-trained BERT model with a classification head on top.

**TrainingArguments**: Defines training parameters like batch size, learning rate, number of epochs, etc.

**Trainer**: A high-level API for training, evaluation, and prediction.

**Dataset**: A class from the Hugging Face `datasets` library, used to wrap a pandas DataFrame (or a list/dict) in a Dataset object that the Hugging Face `Trainer` can consume.

In [21]:
from transformers import BertTokenizer, BertForSequenceClassification
from transformers import Trainer, TrainingArguments
from datasets import Dataset

In [22]:
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)  # num_labels=2: binary classification
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')  # Load the matching tokenizer

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



In [23]:
def tokenize_function(examples):
    return tokenizer(examples['headline'], padding='max_length', truncation=True)

In [24]:
tokenized_data = tokenizer(df['headline'].tolist(), padding='max_length', truncation=True)

data_list = []
for i in range(len(df)):
    data_list.append({
        'input_ids': tokenized_data['input_ids'][i],
        'attention_mask': tokenized_data['attention_mask'][i],
        'labels': df['is_sarcastic'].iloc[i]
    })  #Builds a list of dictionaries with input_ids, attention_mask, and labels.

dataset = Dataset.from_list(data_list) #Creates a Hugging Face Dataset using Dataset.from_list.
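
Each record built above carries `input_ids` and an `attention_mask`. The pure-Python sketch below shows how the two relate: the mask is 1 for real tokens and 0 for padding. `pad_and_mask` is a hypothetical helper, the token IDs are illustrative, and the simple truncation ignores the special-token handling a real tokenizer performs.

```python
def pad_and_mask(token_ids, max_length, pad_id=0):
    """Truncate or pad a list of token IDs and build the matching mask."""
    ids = list(token_ids)[:max_length]
    mask = [1] * len(ids)            # 1 marks a real token
    n_pad = max_length - len(ids)
    return ids + [pad_id] * n_pad, mask + [0] * n_pad  # 0 marks padding


# Illustrative IDs for a four-token input, padded to length 6.
ids, mask = pad_and_mask([101, 2023, 2003, 102], max_length=6)
print(ids)
print(mask)
```

Note that `padding='max_length'` without an explicit `max_length` pads every headline to the tokenizer's maximum length, which is 512 for `bert-base-uncased` as far as I am aware. Short headlines then carry mostly padding, and passing a smaller `max_length` to the tokenizer would likely reduce training time considerably.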

In [25]:
from transformers import TrainingArguments

In [26]:
training_args = TrainingArguments(
    output_dir='./results',  # Where to save model checkpoints
    learning_rate=2e-5,  # Learning rate (2e-5 is a common choice for fine-tuning BERT)
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01,  # Weight decay regularization to reduce overfitting
    report_to="none"  # Disable external logging integrations such as wandb
)

In [27]:
from transformers import Trainer

In [28]:
from sklearn.model_selection import train_test_split

# Convert Dataset back to pandas DataFrame for splitting
dataset_df = dataset.to_pandas()

# Split the DataFrame using sklearn
train_df, eval_df = train_test_split(dataset_df, test_size=0.2, random_state=42)

# Create new Dataset objects from the split DataFrames
train_dataset = Dataset.from_pandas(train_df)
eval_dataset = Dataset.from_pandas(eval_df)

trainer = Trainer(
    model=model, # model:BertForSequenceClassification model
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset
)



In [29]:

trainer.train()  # Start training





Step,Training Loss
500,0.3468




TrainOutput(global_step=702, training_loss=0.28803941395208027, metrics={'train_runtime': 35485.7689, 'train_samples_per_second': 0.158, 'train_steps_per_second': 0.02, 'total_flos': 1473685021071360.0, 'train_loss': 0.28803941395208027, 'epoch': 3.0})
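
The `global_step=702` in the output can be reproduced from the dataset size and the training arguments. The arithmetic below assumes a single device and no gradient accumulation, and that `train_test_split` rounds the evaluation share up. The variable names are illustrative.

```python
import math

n_total = 2334            # rows in the DataFrame (from df.info())
batch_size = 8            # per_device_train_batch_size
epochs = 3                # num_train_epochs

n_eval = math.ceil(0.2 * n_total)      # test_size=0.2, rounded up
n_train = n_total - n_eval
steps_per_epoch = math.ceil(n_train / batch_size)
total_steps = steps_per_epoch * epochs

print(n_train, n_eval, steps_per_epoch, total_steps)
```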

In [30]:
trainer.evaluate()



{'eval_loss': 0.5067875981330872,
 'eval_runtime': 769.332,
 'eval_samples_per_second': 0.607,
 'eval_steps_per_second': 0.077,
 'epoch': 3.0}
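
The evaluation output above reports only the loss. To also get accuracy, `Trainer` accepts a `compute_metrics` callable that receives the logits and labels for the evaluation set. The function below is a minimal sketch of such a callable, shown here on made-up logits rather than the real model output.

```python
import numpy as np


def compute_metrics(eval_pred):
    """Return accuracy from a (logits, labels) pair."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": float((predictions == labels).mean())}


# Made-up logits for four examples and their true labels.
toy_logits = np.array([[2.0, -1.0], [0.1, 0.3], [-0.5, 0.5], [1.2, 0.7]])
toy_labels = np.array([0, 1, 0, 0])
toy_metrics = compute_metrics((toy_logits, toy_labels))
print(toy_metrics)
```

In the notebook it would be passed as `Trainer(..., compute_metrics=compute_metrics)`, after which `trainer.evaluate()` should include an `eval_accuracy` entry.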

In [38]:
import torch

class_names = ['Not Sarcastic', 'Sarcastic']

# Input text to test
text = "i love india"

#  Tokenize the input
inputs = tokenizer(text, return_tensors='pt', truncation=True, padding=True, max_length=128)

#  Run prediction
with torch.no_grad():
    outputs = model(**inputs)  # Unpack the tokenizer output into keyword arguments
    logits = outputs.logits
    predicted_class_id = torch.argmax(logits, dim=1).item()

#  Print results
print(f"Input: {text}")
print(f"Logits: {logits}")
print(f"Predicted Class ID: {predicted_class_id}")
print(f"Prediction: {class_names[predicted_class_id]}")

Input: i love india
Logits: tensor([[ 2.9227, -3.0906]])
Predicted Class ID: 0
Prediction: Not Sarcastic
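
The logits printed above are unnormalized scores. Applying a softmax turns them into class probabilities, which makes the model's confidence easier to read. The sketch below does this in NumPy with the logit values copied from the output.

```python
import numpy as np

# Logits copied from the output above: [Not Sarcastic, Sarcastic].
logits = np.array([2.9227, -3.0906])

# Numerically stable softmax: subtract the max before exponentiating.
shifted = logits - logits.max()
probs = np.exp(shifted) / np.exp(shifted).sum()

print(probs)
```

For these values the "Not Sarcastic" class receives a probability of about 0.998.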


In [41]:
import os
import zipfile
from google.colab import files

# Define the directory where you want to save the model and tokenizer
save_directory = "./sarcasm_bert_model_saved" # Using a new directory name to avoid conflicts

# Create the directory if it doesn't exist
os.makedirs(save_directory, exist_ok=True)

# Save the model and tokenizer using save_pretrained()
model.save_pretrained(save_directory)
tokenizer.save_pretrained(save_directory)

print(f"Model and tokenizer saved to {save_directory}")

# Define the name for the zip file
zip_filename = "sarcasm_bert_model.zip"

# Create a zip file of the saved model directory
with zipfile.ZipFile(zip_filename, 'w', zipfile.ZIP_DEFLATED) as zipf:
    for root, _, files_in_dir in os.walk(save_directory):
        for file in files_in_dir:
            arcname = os.path.relpath(os.path.join(root, file), save_directory)
            zipf.write(os.path.join(root, file), arcname)

print(f"Model directory zipped to {zip_filename}")

# Download the zip file
files.download(zip_filename)

Model and tokenizer saved to ./sarcasm_bert_model_saved
Model directory zipped to sarcasm_bert_model.zip
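
The manual `os.walk` loop above can also be expressed with `shutil.make_archive` from the standard library. The sketch below runs on a temporary directory with a placeholder file, so it does not touch the real model directory. The directory and file names are made up for the demonstration.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the saved model directory, with one placeholder file.
    model_dir = os.path.join(tmp, "demo_model_dir")
    os.makedirs(model_dir)
    with open(os.path.join(model_dir, "config.json"), "w") as f:
        f.write("{}")

    # Zip the directory contents; make_archive appends ".zip" itself.
    archive_path = shutil.make_archive(
        os.path.join(tmp, "demo_model"), "zip", root_dir=model_dir
    )
    archive_name = os.path.basename(archive_path)
    archive_exists = os.path.exists(archive_path)

print(archive_name, archive_exists)
```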



In [47]:
from transformers import BertForSequenceClassification, BertTokenizer
import os

# Define the directory where the model and tokenizer were saved
# This should match the directory used when the model was saved above
save_directory = "./sarcasm_bert_model_saved"

# Load the model and tokenizer using from_pretrained()
loaded_model = BertForSequenceClassification.from_pretrained(save_directory)
loaded_tokenizer = BertTokenizer.from_pretrained(save_directory)

print(f"Model and tokenizer loaded from {save_directory}")

Model and tokenizer loaded from ./sarcasm_bert_model_saved


In [44]:

import pickle

# Save vectorizer (e.g. TextVectorization layer)
with open("vectorizer.pkl", "wb") as f:
    pickle.dump(vectorizer, f)

# Download to local if needed
files.download("vectorizer.pkl")
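
Pickling a Keras layer can be fragile across library versions. A lighter option, sketched below under that assumption, is to store only the vocabulary returned by `get_vocabulary()` as JSON. The example uses a short made-up vocabulary and a temporary file. A fresh `TextVectorization` layer could then be given the list through its `vocabulary` argument or `set_vocabulary` method, though how the reserved padding and `[UNK]` entries are handled should be checked against the installed Keras version.

```python
import json
import os
import tempfile

# Made-up stand-in for vectorizer.get_vocabulary().
demo_vocab = ["", "[UNK]", "to", "of", "the"]

with tempfile.TemporaryDirectory() as tmp:
    vocab_path = os.path.join(tmp, "vocabulary.json")

    # Save the vocabulary as a plain JSON list.
    with open(vocab_path, "w", encoding="utf-8") as f:
        json.dump(demo_vocab, f)

    # Load it back, e.g. in a separate inference script.
    with open(vocab_path, encoding="utf-8") as f:
        restored_vocab = json.load(f)

print(restored_vocab == demo_vocab)
```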



In [46]:
# Saving the Hugging Face model with Keras methods (to_json / save_weights)
# is incorrect. The model and tokenizer were already saved correctly with
# save_pretrained() above, so the following code can be removed:
# # Save architecture
# with open("model_architecture.json", "w") as f:
#     f.write(model.to_json())

# # Save weights
# model.save_weights("model_weights.h5")

# # Download both
# files.download("model_architecture.json")
# files.download("model_weights.h5")