In [1]:
#!pip install fsspec==2023.9.2
!pip install -q transformers

In [2]:
!pip show transformers


Name: transformers
Version: 4.34.0
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: transformers@huggingface.co
License: Apache 2.0 License
Location: /opt/conda/lib/python3.10/site-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm
Required-by: 


### Data preparation
- collating data
- balancing data
- text preprocessing

In [3]:
import pandas as pd 
import numpy as np

df = pd.read_csv(r'gs://sentiment_response/pjs_all.csv', nrows = 20000)


df['nps'] = df['nps'].replace('10 (Extremely likely)',10)
df['nps'] = df['nps'].replace('0 (Not at all likely)',0)
df['nps'] = df['nps'].astype(int)

#target variable: NPS split into detractors (0), passives (1) and promoters (2)
df['nps_group'] = np.where(df['nps'] >= 9,2,
                  np.where(df['nps'] <= 6,0,1))
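
The nested `np.where` above encodes the standard NPS bands. As a minimal sketch, the same mapping can be written with `np.select`, which may read more clearly if more bands are ever added. The scores below are made up for illustration.

```python
import numpy as np

# Hypothetical NPS scores, for illustration only
scores = np.array([0, 3, 6, 7, 8, 9, 10])

# 0 = detractor (0-6), 1 = passive (7-8), 2 = promoter (9-10)
conditions = [scores <= 6, scores >= 9]
choices = [0, 2]
nps_group = np.select(conditions, choices, default=1)

print(nps_group.tolist())
```

Anything not matched by a condition falls through to `default=1`, which plays the role of the innermost `else` branch in the nested `np.where`.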


In [4]:
#balancing the dataset
gb = df.groupby('nps_group')['nps_group'].count()
max_rows = gb.min()

df_neg = df[df['nps_group'] == 0 ].head(max_rows)
df_neut = df[df['nps_group'] == 1 ].head(max_rows)
df_pos = df[df['nps_group'] == 2 ].head(max_rows)

df = pd.concat([df_pos,df_neut,df_neg], axis = 0)
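
`head(max_rows)` keeps the first rows of each class, so the balanced set depends on the row order of the file. If that order is not random, one alternative (an assumption about what might be wanted, not what this notebook does) is to draw a random sample of equal size from each class. A sketch on toy data:

```python
import pandas as pd

# Toy frame standing in for the survey data (values are made up)
toy = pd.DataFrame({
    "response": [f"text {i}" for i in range(10)],
    "nps_group": [0, 0, 0, 0, 0, 1, 1, 2, 2, 2],
})

# Size of the smallest class
n_per_class = int(toy["nps_group"].value_counts().min())

# Randomly draw n_per_class rows from every class
balanced = (
    toy.groupby("nps_group", group_keys=False)
       .sample(n=n_per_class, random_state=42)
       .reset_index(drop=True)
)

print(balanced["nps_group"].value_counts().to_dict())
```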

In [5]:
#clean data slightly
import re
def text_preprocessing(text):
    """
    - Remove entity mentions (eg. '@united')
    - Correct errors (eg. '&amp;' to '&')
    @param    text (str): a string to be processed.
    @return   text (str): the processed string.
    """
    # Remove '@name'
    text = re.sub(r'(@.*?)[\s]', ' ', text)

    # Replace '&amp;' with '&'
    text = re.sub(r'&amp;', '&', text)

    # Collapse runs of whitespace and strip leading/trailing whitespace
    text = re.sub(r'\s+', ' ', text).strip()

    return text

df['response'] = df['response'].apply(lambda x: text_preprocessing(x))
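
To see what the three substitutions do in practice, here is a small sketch that re-declares the same function and runs it on two made-up strings. Note that the mention pattern requires a whitespace character after the handle, so a mention at the very end of a string is left in place.

```python
import re

def text_preprocessing(text):
    # Same three substitutions as in the cell above
    text = re.sub(r'(@.*?)[\s]', ' ', text)
    text = re.sub(r'&amp;', '&', text)
    text = re.sub(r'\s+', ' ', text).strip()
    return text

# Mention followed by whitespace: removed
cleaned = text_preprocessing("@united thanks &amp;   well done  ")
print(cleaned)

# Mention at the end of the string: no trailing whitespace, so no match
trailing = text_preprocessing("thanks @united")
print(trailing)
```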

In [6]:
#producing the train, test and validation sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(df['response'],df['nps_group'], test_size = 0.33, random_state=42)
X_test, X_val, y_test, y_val = train_test_split(X_test,y_test, test_size = 0.5, random_state=42)
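
The two calls above yield roughly 67% training data and two held-out parts of about 16.5% each. The sketch below reproduces the same two-step split on toy data and prints the sizes. The `stratify=` argument is an addition for illustration (it is not used in the notebook) that keeps the class proportions the same in every part.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 300 rows, three equally sized classes (made up for illustration)
X = np.arange(300)
y = np.repeat([0, 1, 2], 100)

# Same two-step split as above, plus stratify= to preserve class proportions
X_tr, X_rest, y_tr, y_rest = train_test_split(
    X, y, test_size=0.33, random_state=42, stratify=y)
X_te, X_va, y_te, y_va = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=42, stratify=y_rest)

for name, part in [("train", y_tr), ("test", y_te), ("val", y_va)]:
    print(name, len(part), np.bincount(part).tolist())
```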

### Pre-processing for BERT

In [7]:
from transformers import BertTokenizer
import torch
# imports the torch_xla package
#import torch_xla
#import torch_xla.core.xla_model as xm

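# Load the tokenizer from the same checkpoint as the model ('bert-base-uncased').
# Loading 'distilbert-base-uncased' through BertTokenizer triggers a class-mismatch warning.
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased', do_lower_case=True)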
#device = xm.xla_device()
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
def preprocessing_for_bert(data):
    """Perform required preprocessing steps for pretrained BERT.
    @param    data (np.array): Array of texts to be processed.
    @return   input_ids (torch.Tensor): Tensor of token ids to be fed to a model.
    @return   attention_masks (torch.Tensor): Tensor of indices specifying which
                  tokens should be attended to by the model.
    """
    # Create empty lists to store outputs
    input_ids = []
    attention_masks = []

    # For every sentence...
    for sent in data:
        # `encode_plus` will:
        #    (1) Tokenize the sentence
        #    (2) Add the `[CLS]` and `[SEP]` token to the start and end
        #    (3) Truncate/Pad sentence to max length
        #    (4) Map tokens to their IDs
        #    (5) Create attention mask
        #    (6) Return a dictionary of outputs
        encoded_sent = tokenizer.encode_plus(
            text=text_preprocessing(sent),  # Preprocess sentence
            add_special_tokens=True,        # Add `[CLS]` and `[SEP]`
            max_length=MAX_LEN,             # Max length to truncate/pad (MAX_LEN is set in a later cell)
            padding='max_length',           # Pad to max length (replaces deprecated `pad_to_max_length`)
            truncation=True,                # Truncate explicitly to max length
            #return_tensors='pt',           # Return PyTorch tensor
            return_attention_mask=True      # Return attention mask
            )

        # Add the outputs to the lists
        input_ids.append(encoded_sent.get('input_ids'))
        attention_masks.append(encoded_sent.get('attention_mask'))

    # Convert lists to tensors
    input_ids = torch.tensor(input_ids)
    attention_masks = torch.tensor(attention_masks)

    return input_ids, attention_masks

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'DistilBertTokenizer'. 
The class this function is called from is 'BertTokenizer'.
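
Steps (3) and (5) in the comment list of `preprocessing_for_bert` can be pictured without the tokenizer. The sketch below pads or truncates a list of token IDs to a fixed length and builds the matching attention mask. The IDs 101, 102 and 0 are the `[CLS]`, `[SEP]` and `[PAD]` IDs of the `bert-base-uncased` vocabulary; all other IDs are placeholders, and the helper name `pad_or_truncate` is hypothetical. This is an illustration of the idea, not the library's internal implementation.

```python
PAD_ID, CLS_ID, SEP_ID = 0, 101, 102

def pad_or_truncate(token_ids, max_len):
    """Return (input_ids, attention_mask), both of length max_len."""
    # Reserve two positions for [CLS] and [SEP], truncating the body if needed
    body = token_ids[: max_len - 2]
    input_ids = [CLS_ID] + body + [SEP_ID]
    # Real tokens get mask 1, padding gets mask 0
    attention_mask = [1] * len(input_ids)
    n_pad = max_len - len(input_ids)
    return input_ids + [PAD_ID] * n_pad, attention_mask + [0] * n_pad

# A short sequence is padded, a long one is truncated (placeholder IDs)
short_ids, short_mask = pad_or_truncate([11, 12], max_len=6)
long_ids, long_mask = pad_or_truncate([1, 2, 3, 4, 5, 6, 7], max_len=6)

print(short_ids, short_mask)
print(long_ids, long_mask)
```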


In [8]:
#need to specify the maximum string length from responses
responses = df['response'].to_numpy()
# Encode our concatenated data
encoded_response = [tokenizer.encode(sent, add_special_tokens=True) for sent in responses]

# Find the maximum length
max_len = max([len(sent) for sent in encoded_response])
print('Max length: ', max_len)

Max length:  494
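
A maximum of 494 tokens may be driven by a handful of very long responses. Every sequence is padded to `MAX_LEN`, and self-attention cost grows quadratically with sequence length, so it can be worth looking at the whole length distribution before fixing the value. A sketch with made-up lengths (in this notebook the lengths would come from `encoded_response`):

```python
import numpy as np

# Hypothetical token counts per response; in the notebook these would be
# [len(sent) for sent in encoded_response]
lengths = np.array([12, 18, 25, 31, 40, 44, 52, 60, 75, 494])

p50, p95 = np.percentile(lengths, [50, 95])
print(f"median={p50:.0f}, 95th percentile={p95:.0f}, max={lengths.max()}")

# Share of responses that would be truncated at a candidate cut-off
candidate = 128
truncated_share = float((lengths > candidate).mean())
print(f"truncated at {candidate}: {truncated_share:.0%}")
```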


In [9]:
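# Specify `MAX_LEN`. BERT accepts at most 512 tokens; 490 is just below the
# observed maximum of 494, so the few longest responses are truncated.
MAX_LEN = 490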

# # Print sentence 0 and its encoded token ids
# token_ids = list(preprocessing_for_bert([X_train[0]])[0].squeeze().numpy())
# print('Original: ', X_train[0])
# print('Token IDs: ', token_ids)

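# Run `preprocessing_for_bert` on all three splits. Note the naming: `X_test` feeds the
# validation tensors (`val_*`) used during training; `X_val` feeds the held-out `final_*` tensors.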
print('Tokenizing data...')
train_inputs, train_masks = preprocessing_for_bert(X_train)
val_inputs, val_masks = preprocessing_for_bert(X_test)
final_inputs, final_masks = preprocessing_for_bert(X_val)

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.


Tokenizing data...




In [10]:
from torch.utils.data import TensorDataset, DataLoader, RandomSampler, SequentialSampler

# Convert other data types to torch.Tensor
train_labels = torch.tensor(y_train.to_numpy())
val_labels = torch.tensor(y_test.to_numpy())

# For fine-tuning BERT, the authors recommend a batch size of 16 or 32.
batch_size = 16

# Create the DataLoader for our training set
train_data = TensorDataset(train_inputs, train_masks, train_labels)
train_sampler = RandomSampler(train_data)
train_dataloader = DataLoader(train_data, sampler=train_sampler, batch_size=batch_size)

# Create the DataLoader for our validation set
val_data = TensorDataset(val_inputs, val_masks, val_labels)
val_sampler = SequentialSampler(val_data)
val_dataloader = DataLoader(val_data, sampler=val_sampler, batch_size=batch_size)



In [11]:
%%time
import torch
import torch.nn as nn
from transformers import BertModel

# Create the BertClassifier class
class BertClassifier(nn.Module):
    """Bert Model for Classification Tasks.
    """
    def __init__(self, freeze_bert=False):
        """
        @param    freeze_bert (bool): Set `False` to fine-tune the BERT model,
                      `True` to train only the classifier head
        """
        super(BertClassifier, self).__init__()
        # Specify hidden size of BERT, hidden size of our classifier, and number of labels
        D_in, H, D_out = 768, 50, 3

        # Instantiate BERT model
        self.bert = BertModel.from_pretrained('bert-base-uncased')

        # Instantiate a feed-forward classifier with one hidden layer
        self.classifier = nn.Sequential(
            nn.Linear(D_in, H),
            nn.ReLU(),
            #nn.Dropout(0.5),
            nn.Linear(H, D_out)
        )

        # Freeze the BERT model
        if freeze_bert:
            for param in self.bert.parameters():
                param.requires_grad = False

    def forward(self, input_ids, attention_mask):
        """
        Feed input to BERT and the classifier to compute logits.
        @param    input_ids (torch.Tensor): an input tensor with shape (batch_size,
                      max_length)
        @param    attention_mask (torch.Tensor): a tensor that hold attention mask
                      information with shape (batch_size, max_length)
        @return   logits (torch.Tensor): an output tensor with shape (batch_size,
                      num_labels)
        """
        # Feed input to BERT
        outputs = self.bert(input_ids=input_ids,
                            attention_mask=attention_mask)

        # Extract the last hidden state of the token `[CLS]` for classification task
        last_hidden_state_cls = outputs[0][:, 0, :]

        # Feed input to classifier to compute logits
        logits = self.classifier(last_hidden_state_cls)

        return logits



CPU times: user 61.2 ms, sys: 0 ns, total: 61.2 ms
Wall time: 191 ms
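
The indexing `outputs[0][:, 0, :]` in `forward` picks the hidden state at position 0 (the `[CLS]` token) for every sequence in the batch. A shape-only sketch with NumPy, using a random array in place of BERT's output:

```python
import numpy as np

batch_size, seq_len, hidden = 4, 10, 768

# Stand-in for outputs[0], the last hidden state: (batch, seq_len, hidden)
rng = np.random.default_rng(0)
last_hidden_state = rng.normal(size=(batch_size, seq_len, hidden))

# Position 0 of every sequence is the [CLS] token
cls_vectors = last_hidden_state[:, 0, :]

print(last_hidden_state.shape, "->", cls_vectors.shape)
```

The resulting `(batch_size, 768)` array matches `D_in = 768`, the input size of the classifier head.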


In [12]:
from transformers import get_linear_schedule_with_warmup
from torch.optim import AdamW
def initialize_model(epochs=4):
    """Initialize the Bert Classifier, the optimizer and the learning rate scheduler.
    """
    # Instantiate Bert Classifier
    bert_classifier = BertClassifier(freeze_bert=False)

    # Move the model to the selected device (GPU if available, otherwise CPU)
    bert_classifier.to(device)

    # Create the optimizer
    optimizer = AdamW(bert_classifier.parameters(),
                      lr=5e-5,    # Learning rate commonly used for BERT fine-tuning (not the AdamW default)
                      eps=1e-8    # Default epsilon value
                      )

    # Total number of training steps
    total_steps = len(train_dataloader) * epochs

    # Set up the learning rate scheduler
    scheduler = get_linear_schedule_with_warmup(optimizer,
                                                num_warmup_steps=0, # Default value
                                                num_training_steps=total_steps)
    return bert_classifier, optimizer, scheduler
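
With `num_warmup_steps=0`, the schedule decays the learning rate linearly from its initial value to zero over `total_steps`. The multiplier applied at each step can be sketched in plain Python. The function name `linear_lr_factor` is hypothetical, and the batch and epoch counts are illustrative rather than taken from this run.

```python
def linear_lr_factor(step, num_warmup_steps, num_training_steps):
    """Multiplier applied to the base learning rate at a given step."""
    if step < num_warmup_steps:
        # Linear ramp from 0 to 1 during warmup
        return step / max(1, num_warmup_steps)
    # Linear decay from 1 to 0 over the remaining steps
    remaining = num_training_steps - step
    return max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))

base_lr = 5e-5
batches_per_epoch, epochs = 250, 2   # illustrative numbers
total_steps = batches_per_epoch * epochs

for step in (0, 125, 250, 500):
    lr = base_lr * linear_lr_factor(step, 0, total_steps)
    print(step, f"{lr:.2e}")
```

Because `total_steps` depends on `epochs`, the value passed to `initialize_model` should match the one passed to `train`; otherwise the rate reaches zero too early or never gets there.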


In [13]:
import random
import time

# Specify loss function
loss_fn = nn.CrossEntropyLoss()

def set_seed(seed_value=42):
    """Set seed for reproducibility.
    """
    random.seed(seed_value)
    np.random.seed(seed_value)
    torch.manual_seed(seed_value)
    torch.cuda.manual_seed_all(seed_value)

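def train(model, train_dataloader, val_dataloader=None, epochs=4, evaluation=False):
    """Train the BertClassifier model.

    Relies on the module-level `optimizer` and `scheduler` returned by
    `initialize_model`, so call that function first.
    """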
    # Start training loop
    print("Start training...\n")
    for epoch_i in range(epochs):
        # =======================================
        #               Training
        # =======================================
        # Print the header of the result table
        print(f"{'Epoch':^7} | {'Batch':^7} | {'Train Loss':^12} | {'Val Loss':^10} | {'Val Acc':^9} | {'Elapsed':^9}")
        print("-"*70)

        # Measure the elapsed time of each epoch
        t0_epoch, t0_batch = time.time(), time.time()

        # Reset tracking variables at the beginning of each epoch
        total_loss, batch_loss, batch_counts = 0, 0, 0

        # Put the model into the training mode
        model.train()

        # For each batch of training data...
        for step, batch in enumerate(train_dataloader):
            batch_counts +=1
            # Load batch to GPU
            b_input_ids, b_attn_mask, b_labels = tuple(t.to(device) for t in batch)

            # Zero out any previously calculated gradients
            model.zero_grad()

            # Perform a forward pass. This will return logits.
            logits = model(b_input_ids, b_attn_mask)

            # Compute loss and accumulate the loss values
            loss = loss_fn(logits, b_labels)
            batch_loss += loss.item()
            total_loss += loss.item()

            # Perform a backward pass to calculate gradients
            loss.backward()

            # Clip the norm of the gradients to 1.0 to prevent "exploding gradients"
            torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)

            # Update parameters and the learning rate
            optimizer.step()
            scheduler.step()

            # Print the loss values and time elapsed for every 20 batches
            if (step % 20 == 0 and step != 0) or (step == len(train_dataloader) - 1):
                # Calculate time elapsed for 20 batches
                time_elapsed = time.time() - t0_batch

                # Print training results
                print(f"{epoch_i + 1:^7} | {step:^7} | {batch_loss / batch_counts:^12.6f} | {'-':^10} | {'-':^9} | {time_elapsed:^9.2f}")

                # Reset batch tracking variables
                batch_loss, batch_counts = 0, 0
                t0_batch = time.time()

        # Calculate the average loss over the entire training data
        avg_train_loss = total_loss / len(train_dataloader)

        print("-"*70)
        # =======================================
        #               Evaluation
        # =======================================
        if evaluation:
            # After the completion of each training epoch, measure the model's performance
            # on our validation set.
            val_loss, val_accuracy = evaluate(model, val_dataloader)

            # Print performance over the entire training data
            time_elapsed = time.time() - t0_epoch

            print(f"{epoch_i + 1:^7} | {'-':^7} | {avg_train_loss:^12.6f} | {val_loss:^10.6f} | {val_accuracy:^9.2f} | {time_elapsed:^9.2f}")
            print("-"*70)
        print("\n")

    print("Training complete!")


def evaluate(model, val_dataloader):
    """After the completion of each training epoch, measure the model's performance
    on our validation set.
    """
    # Put the model into the evaluation mode. The dropout layers are disabled during
    # the test time.
    model.eval()

    # Tracking variables
    val_accuracy = []
    val_loss = []

    # For each batch in our validation set...
    for batch in val_dataloader:
        # Load batch to GPU
        b_input_ids, b_attn_mask, b_labels = tuple(t.to(device) for t in batch)

        # Compute logits
        with torch.no_grad():
            logits = model(b_input_ids, b_attn_mask)

        # Compute loss
        loss = loss_fn(logits, b_labels)
        val_loss.append(loss.item())

        # Get the predictions
        preds = torch.argmax(logits, dim=1).flatten()

        # Calculate the accuracy rate
        accuracy = (preds == b_labels).cpu().numpy().mean() * 100
        val_accuracy.append(accuracy)

    # Compute the average accuracy and loss over the validation set.
    val_loss = np.mean(val_loss)
    val_accuracy = np.mean(val_accuracy)

    return val_loss, val_accuracy
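
`evaluate` averages the per-batch accuracies. When the last batch is smaller than the others, that average gives its examples more weight, so the result can differ slightly from the accuracy computed over all examples at once. A small sketch with made-up predictions:

```python
import numpy as np

# Made-up labels and predictions, split into batches of size 4 and 2
labels = [np.array([0, 1, 2, 1]), np.array([2, 0])]
preds = [np.array([0, 1, 2, 1]), np.array([1, 1])]

# Mean of per-batch accuracies (the approach used in evaluate())
batch_mean = np.mean([(p == y).mean() * 100 for p, y in zip(preds, labels)])

# Accuracy over all examples at once
overall = (np.concatenate(preds) == np.concatenate(labels)).mean() * 100

print(f"mean of batch accuracies: {batch_mean:.2f}")
print(f"overall accuracy:         {overall:.2f}")
```

With equal-sized batches the two numbers coincide; the gap only appears when the final batch is partial.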


In [14]:
set_seed(42)    # Set seed for reproducibility
bert_classifier, optimizer, scheduler = initialize_model(epochs=2)
train(bert_classifier, train_dataloader, val_dataloader, epochs=2, evaluation=True)


Start training...

 Epoch  |  Batch  |  Train Loss  |  Val Loss  |  Val Acc  |  Elapsed 
----------------------------------------------------------------------
   1    |   20    |   1.043902   |     -      |     -     |   19.43  
   1    |   40    |   0.936557   |     -      |     -     |   14.78  
   1    |   60    |   0.771304   |     -      |     -     |   14.82  
   1    |   80    |   0.706313   |     -      |     -     |   14.96  
   1    |   100   |   0.680337   |     -      |     -     |   15.10  
   1    |   120   |   0.655590   |     -      |     -     |   15.13  
   1    |   140   |   0.698620   |     -      |     -     |   15.08  
   1    |   160   |   0.574166   |     -      |     -     |   15.06  
   1    |   180   |   0.642322   |     -      |     -     |   15.03  
   1    |   200   |   0.691368   |     -      |     -     |   15.03  
   1    |   220   |   0.642492   |     -      |     -     |   15.05  
   1    |   240   |   0.666164   |     -      |     -     |   15.08  
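
The training loop clips the global gradient norm to 1.0 before each optimizer step. Conceptually, all gradients are rescaled by a common factor whenever their combined L2 norm exceeds the threshold. A NumPy sketch of that idea follows; the helper `clip_by_global_norm` is hypothetical and only approximates what `clip_grad_norm_` does.

```python
import numpy as np

def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Scale a list of gradient arrays so their joint L2 norm is at most max_norm."""
    total_norm = float(np.sqrt(sum(float((g ** 2).sum()) for g in grads)))
    # Never scale up: the factor is capped at 1
    scale = min(1.0, max_norm / (total_norm + eps))
    return [g * scale for g in grads], total_norm

# Two made-up gradient arrays whose joint norm is sqrt(3^2 + 4^2) = 5
grads = [np.array([3.0, 0.0]), np.array([0.0, 4.0])]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = float(np.sqrt(sum(float((g ** 2).sum()) for g in clipped)))

print(norm_before, round(norm_after, 4))
```

The direction of the overall gradient is preserved; only its length is reduced.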


In [49]:
import torch.nn.functional as F
import matplotlib as mpl
import matplotlib.pyplot as plt

def bert_predict(model, test_dataloader):
    """Perform a forward pass on the trained BERT model to predict probabilities
    on the test set.
    """
    # Put the model into the evaluation mode. The dropout layers are disabled during
    # the test time.
    model.eval()

    all_logits = []

    # For each batch in our test set...
    for batch in test_dataloader:
        # Load batch to GPU
        b_input_ids, b_attn_mask = tuple(t.to(device) for t in batch)[:2]

        # Compute logits
        with torch.no_grad():
            logits = model(b_input_ids, b_attn_mask)
        all_logits.append(logits)

    # Concatenate logits from each batch
    all_logits = torch.cat(all_logits, dim=0)

    # Apply softmax to calculate probabilities
    # Normalise across the class dimension (an explicit `dim` avoids the deprecation warning)
    probs = F.softmax(all_logits, dim=1).cpu().numpy()

    return probs

from sklearn.metrics import accuracy_score, roc_curve, auc

def evaluate_roc(probs, y_true):
    """
    - Print accuracy on the test set (the ROC/AUC code below is commented out)
    @params    probs (np.array): an array of predicted class labels with shape (len(y_true),)
    @params    y_true (np.array): an array of the true values with shape (len(y_true),)
    """
    # preds = probs[:, 1]
    # fpr, tpr, threshold = roc_curve(y_true, preds)
    # roc_auc = auc(fpr, tpr)
    #print(f'AUC: {roc_auc:.4f}')

    # Get accuracy over the test set
    accuracy = accuracy_score(y_true, probs)
    print(f'Accuracy: {accuracy*100:.2f}%')

    # # Plot ROC AUC
    # plt.title('Receiver Operating Characteristic')
    # plt.plot(fpr, tpr, 'b', label = 'AUC = %0.2f' % roc_auc)
    # plt.legend(loc = 'lower right')
    # plt.plot([0, 1], [0, 1],'r--')
    # plt.xlim([0, 1])
    # plt.ylim([0, 1])
    # plt.ylabel('True Positive Rate')
    # plt.xlabel('False Positive Rate')
    # plt.show()
    return accuracy
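
For a batch of logits with shape `(n_examples, n_classes)`, softmax has to normalise across the class axis so that each row sums to 1; `argmax` over the same axis then gives the predicted class. A NumPy sketch with made-up logits:

```python
import numpy as np

def softmax(logits, axis=1):
    # Subtract the row maximum for numerical stability
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

# Made-up values: 2 examples x 3 classes
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0]])

probs = softmax(logits, axis=1)
classes = probs.argmax(axis=1)

print(probs.sum(axis=1))
print(classes.tolist())
```

Since softmax is monotonic within a row, taking `argmax` of the logits directly would give the same classes; the probabilities are only needed when confidence scores matter.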

In [25]:
# Compute predicted probabilities on the test set
probs = bert_predict(bert_classifier, val_dataloader)

# #converting softmax probabilities to classification
# classification = probs.argmax(axis = 1)
# # # Evaluate the Bert classifier
# evaluate_roc(classification, y_test)


In [50]:
#converting softmax probabilities to classification
classification = probs.argmax(axis = 1)
# # Evaluate the Bert classifier
accuracy = evaluate_roc(classification, y_test)

Accuracy: 73.75%


## Saving the model and recording its production

In [29]:
project = !gcloud config get-value project
PROJECT_ID = project[0]
PROJECT_ID

'surveys-402414'

In [33]:
#saving the model 
from datetime import datetime
REGION = 'europe-west2'
EXPERIMENT = '01'
SERIES = '01'

TIMESTAMP = datetime.now().strftime("%Y%m%d%H%M%S")
BUCKET = PROJECT_ID
URI = f"gs://{BUCKET}/{SERIES}/{EXPERIMENT}"
DIR = f"temp/{EXPERIMENT}"
BLOB = f"{SERIES}/{EXPERIMENT}/models/{TIMESTAMP}/model/sentiment_classifier.pt"
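
The constants above combine into the object path used later for the upload. A quick sketch of how the pieces fit together; the bucket name and the fixed timestamp are placeholders for illustration.

```python
from datetime import datetime

SERIES = "01"
EXPERIMENT = "01"
BUCKET = "example-bucket"  # placeholder, not the notebook's bucket

# Fixed timestamp so the output is reproducible; the notebook uses datetime.now()
TIMESTAMP = datetime(2023, 10, 24, 14, 29, 40).strftime("%Y%m%d%H%M%S")

# Object path inside the bucket, and the full gs:// URI built from it
BLOB = f"{SERIES}/{EXPERIMENT}/models/{TIMESTAMP}/model/sentiment_classifier.pt"
gcs_uri = f"gs://{BUCKET}/{BLOB}"

print(TIMESTAMP)
print(gcs_uri)
```

Building the URI from `BLOB` in one place avoids retyping the timestamped path by hand in later cells.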


In [44]:
FRAMEWORK = 'pytorch'
TASK = 'classification'
MODEL_TYPE = 'bert'
EXPERIMENT_NAME = f'experiment-{SERIES}-{EXPERIMENT}-{FRAMEWORK}-{TASK}-{MODEL_TYPE}'
RUN_NAME = f'run-{TIMESTAMP}'

In [34]:
#Required packages
from google.cloud import aiplatform
from google.cloud import storage
import json

from datetime import datetime
import os

from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value

In [35]:
aiplatform.init(project=PROJECT_ID, location=REGION)

In [36]:
!rm -rf {DIR}
!mkdir -p {DIR}

## Initialising experiment

In [46]:
aiplatform.init(experiment = EXPERIMENT_NAME)

Creating Tensorboard
Create Tensorboard backing LRO: projects/240414127532/locations/europe-west2/tensorboards/5224597780214841344/operations/2605762293480292352
Tensorboard created. Resource name: projects/240414127532/locations/europe-west2/tensorboards/5224597780214841344
To use this Tensorboard in another session:
tb = aiplatform.Tensorboard('projects/240414127532/locations/europe-west2/tensorboards/5224597780214841344')

In [47]:
expRun = aiplatform.ExperimentRun.create(run_name = RUN_NAME, experiment = EXPERIMENT_NAME)

Associating projects/240414127532/locations/europe-west2/metadataStores/default/contexts/experiment-01-01-pytorch-classification-bert-run-20231024142940 to Experiment: experiment-01-01-pytorch-classification-bert


In [48]:
#log parameters to the experiment run:
expRun.log_params({'experiment': EXPERIMENT, 'series': SERIES, 'project_id': PROJECT_ID})

In [51]:
expRun.log_metrics({'test_accuracy': accuracy})

## Saving model for later use 

In [52]:
model_save_name = 'sentiment_classifier.pt'
path = model_save_name
torch.save(bert_classifier.state_dict(), path)
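
`torch.save(model.state_dict(), path)` stores only the weights, so loading them later means re-creating the model class first and then calling `load_state_dict`. A minimal round-trip sketch, using a small stand-in module (`TinyClassifier` is hypothetical) instead of `BertClassifier` to keep it fast:

```python
import os
import tempfile

import torch
import torch.nn as nn

class TinyClassifier(nn.Module):
    """Stand-in for BertClassifier: a small classifier head only, no BERT."""
    def __init__(self):
        super().__init__()
        self.classifier = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 3))

    def forward(self, x):
        return self.classifier(x)

model = TinyClassifier()
x = torch.randn(2, 8)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tiny_classifier.pt")
    torch.save(model.state_dict(), path)

    # Re-create the architecture, then load the saved weights into it
    restored = TinyClassifier()
    restored.load_state_dict(torch.load(path, map_location="cpu"))

model.eval()
restored.eval()
with torch.no_grad():
    same = torch.allclose(model(x), restored(x))
print(same)
```

`map_location="cpu"` lets weights saved on a GPU machine be loaded where no GPU is available.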

In [56]:
# Upload the model to GCS
bucket = storage.Client().bucket('sentiment_response')
blob = bucket.blob(BLOB)
blob.upload_from_filename('sentiment_classifier.pt')

In [57]:
!gsutil ls gs://sentiment_response/01/01/models/20231024142940/model

gs://sentiment_response/01/01/models/20231024142940/model/sentiment_classifier.pt


In [58]:
#logging where the model has been saved
expRun.log_params({'model.save': r'gs://sentiment_response/01/01/models/20231024142940/model/sentiment_classifier.pt'})

In [69]:
modelmatch = aiplatform.Model.list(filter = f'display_name={SERIES}_{EXPERIMENT} AND labels.series={SERIES} AND labels.experiment={EXPERIMENT}')

upload_model = True
if modelmatch:
    print("Model Already in Registry:")
    if RUN_NAME in modelmatch[0].version_aliases:
        print("This version already loaded, no action taken.")
        upload_model = False
        model = aiplatform.Model(model_name = modelmatch[0].resource_name)
    else:
        print('Loading model as new default version.')
        parent_model = modelmatch[0].resource_name

else:
    print('This is a new model, creating in model registry')
    parent_model = ''

if upload_model:
    model = aiplatform.Model.upload(
        display_name = f'{SERIES}_{EXPERIMENT}',
        model_id = f'model_{SERIES}_{EXPERIMENT}',
        parent_model =  parent_model,
        serving_container_image_uri = 'europe-docker.pkg.dev/vertex-ai/training/pytorch-gpu.1-13:latest',
        artifact_uri =r'gs://sentiment_response/01/01/models/20231024142940/model',
        is_default_version = True,
        version_aliases = [RUN_NAME],
        version_description = RUN_NAME,
        labels = {'series' : f'{SERIES}', 'experiment' : f'{EXPERIMENT}', 'experiment_name' : f'{EXPERIMENT_NAME}', 'run_name' : f'{RUN_NAME}'}        
    )

This is a new model, creating in model registry
Creating Model
Create Model backing LRO: projects/240414127532/locations/europe-west2/models/model_01_01/operations/5591648846426931200
Model created. Resource name: projects/240414127532/locations/europe-west2/models/model_01_01@1
To use this Model in another session:
model = aiplatform.Model('projects/240414127532/locations/europe-west2/models/model_01_01@1')


In [70]:
print(f'Review the model in the Vertex AI Model Registry:\nhttps://console.cloud.google.com/vertex-ai/locations/{REGION}/models/{model.name}?project={PROJECT_ID}')

Review the model in the Vertex AI Model Registry:
https://console.cloud.google.com/vertex-ai/locations/europe-west2/models/model_01_01?project=surveys-402414


In [71]:
#update model descriptions
expRun.log_params({
    'model.uri': model.uri,
    'model.display_name': model.display_name,
    'model.name': model.name,
    'model.resource_name': model.resource_name,
    'model.version_id': model.version_id,
    'model.versioned_resource_name': model.versioned_resource_name
})

In [72]:
#complete experiment run
expRun.update_state(state = aiplatform.gapic.Execution.State.COMPLETE)

In [73]:
exp = aiplatform.Experiment(experiment_name = EXPERIMENT_NAME)

In [74]:
exp.get_data_frame()

| | experiment_name | run_name | run_type | state | param.series | param.model.name | param.project_id | param.model.resource_name | param.model.save | param.model.versioned_resource_name | param.experiment | param.model.display_name | param.model.version_id | param.model.uri | metric.test_accuracy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | experiment-01-01-pytorch-classification-bert | run-20231024142940 | system.ExperimentRun | COMPLETE | 1 | model_01_01 | surveys-402414 | projects/240414127532/locations/europe-west2/m... | gs://sentiment_response/01/01/models/202310241... | projects/240414127532/locations/europe-west2/m... | 1 | 01_01 | 1 | gs://sentiment_response/01/01/models/202310241... | 0.737531 |


## Next step: maintain a current version that is managed in the Model Registry