### Overview
A company wants to minimize the manual effort invested in screening candidates who seem to be a good fit for a job description.

To do that, we need two phases, as described below:

#### Phase 1 : Rank the Candidates

The goal of phase one is to build a model that predicts how well a candidate fits a particular role. The keywords used for the search are 'Aspiring human resources' or 'seeking human resources'. The model's output indicates how fit the candidate is for the role (numeric, probability between 0 and 1).


#### Phase 2 : Rerank when a candidate is starred
The goal of phase two is to let the human-resources user of this model give priority to any candidate they think might be good for the role but whom the model ranked lower. Based on that input, the model then re-ranks the candidates, taking the human feedback into account.
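
As a rough illustration of the starring idea, the sketch below overrides the score of each starred candidate with the maximum value of 1 and re-sorts. The function name `rerank_with_stars` is hypothetical, and the scores are illustrative values taken from the BERT scores reported later in this notebook. This is only the simplest possible override, not the RankNet-based re-ranking used in Phase 2.

```python
def rerank_with_stars(scores, starred_ids):
    """Return candidate ids ordered best-first after applying stars.

    scores: dict mapping candidate id -> model score in [0, 1]
    starred_ids: ids that the reviewer wants to prioritise
    """
    adjusted = dict(scores)          # copy so the input is left untouched
    for cid in starred_ids:
        if cid in adjusted:          # ignore ids that are not in the data
            adjusted[cid] = 1.0      # a star means "treat as a perfect fit"
    return sorted(adjusted, key=lambda cid: adjusted[cid], reverse=True)


# Illustrative scores (candidate id -> fit score)
scores = {6: 0.950305, 3: 0.930828, 99: 0.837419, 4: 0.755497}

print(rerank_with_stars(scores, starred_ids=[]))   # [6, 3, 99, 4]
print(rerank_with_stars(scores, starred_ids=[4]))  # [4, 6, 3, 99]
```

A plain override like this only moves the starred candidates; a learned re-ranker can also promote candidates whose titles resemble the starred ones.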

#### Data Description:

The data comes from our sourcing efforts. We removed any field that could directly reveal personal details and gave a unique identifier for each candidate.

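Attributes:

* id : unique identifier for candidate (numeric)
* job_title : job title for candidate (text)
* location : geographical location for candidate (text)
* connection : number of connections candidate has, 500+ means over 500 (text)
* fit : how fit the candidate is for the role (numeric, probability between 0 and 1); empty in the raw data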

### Import Libraries:

In [None]:
!pip install gensim==4.2.0

In [None]:
!pip install sentence_transformers==2.2.2

In [3]:
import warnings
warnings.filterwarnings('ignore')
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sn
import numpy as np

import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.stem import WordNetLemmatizer
from sklearn.feature_extraction.text import TfidfVectorizer
from scipy.spatial.distance import cosine
from sklearn.metrics.pairwise import cosine_similarity
from sentence_transformers import SentenceTransformer


# random
import random
# pytorch
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

### Exploratory Data Analysis:

#### Load data:

In [4]:
df=pd.read_csv("potential-talents - Aspiring human resources - seeking human resources.csv").set_index('id')

In [5]:
df.head()

id,job_title,location,connection,fit
1,2019 C.T. Bauer College of Business Graduate (...,"Houston, Texas",85,
2,Native English Teacher at EPIK (English Progra...,Kanada,500+,
3,Aspiring Human Resources Professional,"Raleigh-Durham, North Carolina Area",44,
4,People Development Coordinator at Ryan,"Denton, Texas",500+,
5,Advisory Board Member at Celal Bayar University,"İzmir, Türkiye",500+,


In [6]:
df.shape

(104, 4)

### Preprocessing:
#### Data Cleaning:


In [7]:
print("\nData Types:")
print(df.dtypes)
print("\nChecking missing values in relevant columns:")
print( "Count of missing values in job title" , df['job_title'].isnull().sum())
print( "Count of missing values in location" , df['location'].isnull().sum())
print( "Count of missing values in connection" , df['connection'].isnull().sum())
print( "Count of missing values in fit" , df['fit'].isnull().sum())


Data Types:
job_title      object
location       object
connection     object
fit           float64
dtype: object

Checking missing values in relevant columns:
Count of missing values in job title 0
Count of missing values in location 0
Count of missing values in connection 0
Count of missing values in fit 104


In [9]:
df.connection.value_counts()

500+     44
85        7
61        7
44        6
1         5
2         4
57        2
7         2
4         2
390       2
39        1
49        1
18        1
50        1
268       1
48        1
40        1
64        1
52        1
19        1
5         1
155       1
174       1
349       1
82        1
71        1
16        1
409       1
212       1
103       1
455       1
9         1
415       1
Name: connection, dtype: int64

Turn connections into a numeric field for potential use:

In [10]:
df.connection=df.connection.str.replace("500+ ","500",regex=False)
df.connection=df.connection.astype(int)

In [11]:
df.dtypes

job_title      object
location       object
connection      int64
fit           float64
dtype: object

Checking how many duplicate entries we have:

In [12]:
df_dup = df.duplicated().sum()

df_dup

51

Dropping duplicates:

In [13]:
Drop_dup = df.drop_duplicates()
print("Shape of non-duplicated dataframe:", Drop_dup.shape)  

Shape of non-duplicated dataframe: (53, 4)


In [14]:
Drop_dup.head()

id,job_title,location,connection,fit
1,2019 C.T. Bauer College of Business Graduate (...,"Houston, Texas",85,
2,Native English Teacher at EPIK (English Progra...,Kanada,500,
3,Aspiring Human Resources Professional,"Raleigh-Durham, North Carolina Area",44,
4,People Development Coordinator at Ryan,"Denton, Texas",500,
5,Advisory Board Member at Celal Bayar University,"İzmir, Türkiye",500,


#### Text cleaning:

In [47]:
import nltk

nltk.download('all-corpora')

To clean the job titles, we go through the steps below:

* Replace every occurrence of HR with human resources, since the keywords are 'Aspiring human resources' and 'seeking human resources'.
* Convert the text to lower case.
* Remove special characters and punctuation.
* Apply lemmatization.
* Remove stop words.

In [16]:
cleanData = Drop_dup.copy()
cleanData['cleanJobTitle'] = ""

stopWords = set(stopwords.words('english'))
lemmatizer = WordNetLemmatizer()

for i in Drop_dup.index :
    # Fetch appropriate jobTitle
    jobTitle = Drop_dup['job_title'][i]
    tempString = ''
    # Replace HR with human resources
    tempString = jobTitle.replace("HR", " human resources ")
    # Make word all lower case
    tempString = tempString.lower()
    
    # Initialize empty wordList array for jobTitle
    wordList = []
    
    for token in tempString.split() :
        # Initialize variable
        word = ""
        # Remove all characters except letters and numbers
        temp = list([e for e in token if e.isalnum()])
        # Join letters from temp list back together
        word = "".join(temp)
        
        # Lemmatize word
        word = lemmatizer.lemmatize(word)
        
        if word not in stopWords :
            wordList.append(word)
    
    # Join tokens from wordList with spaces in between
    cleanJobTitle = " ".join(wordList)
    
    # Use .loc to avoid chained assignment, which may silently write to a copy
    cleanData.loc[i, 'cleanJobTitle'] = cleanJobTitle

In [17]:
cleanData.head()

id,job_title,location,connection,fit,cleanJobTitle
1,2019 C.T. Bauer College of Business Graduate (...,"Houston, Texas",85,,2019 ct bauer college business graduate magna ...
2,Native English Teacher at EPIK (English Progra...,Kanada,500,,native english teacher epik english program korea
3,Aspiring Human Resources Professional,"Raleigh-Durham, North Carolina Area",44,,aspiring human resource professional
4,People Development Coordinator at Ryan,"Denton, Texas",500,,people development coordinator ryan
5,Advisory Board Member at Celal Bayar University,"İzmir, Türkiye",500,,advisory board member celal bayar university


### Rank the Candidates:

We'll implement machine learning algorithms to rank the job titles listed in the dataset with respect to the key phrases:

* Aspiring human resources
* Seeking human resources


### FuzzyWuzzy:

FuzzyWuzzy offers four scorers based on the Levenshtein distance between two strings: ratio, partial ratio, token sort ratio, and token set ratio.

I believe that the token sort ratio and the token set ratio are more suitable for this dataset, where titles may order words differently or repeat words.

In [18]:
# key phrases:
phrase_1="aspiring human resources"
phrase_2="seeking human resources"
phrases_list=[phrase_1,phrase_2]

#### Token Sort Ratio:
The token sort ratio scorer splits both strings into tokens, sorts the tokens alphabetically, and then returns the Levenshtein-based similarity of the sorted strings as a percentage (0 to 100).
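
To make the idea concrete, here is a minimal standard-library sketch of a token sort ratio. It is an approximation, not FuzzyWuzzy's implementation: `simple_ratio` and `token_sort_ratio_sketch` are hypothetical names, the similarity comes from `difflib.SequenceMatcher`, and FuzzyWuzzy applies extra preprocessing that is omitted here.

```python
from difflib import SequenceMatcher


def simple_ratio(a, b):
    """Similarity of two strings as an integer percentage (0-100)."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def token_sort_ratio_sketch(a, b):
    """Sort the tokens of each string alphabetically, then compare."""
    sorted_a = " ".join(sorted(a.lower().split()))
    sorted_b = " ".join(sorted(b.lower().split()))
    return simple_ratio(sorted_a, sorted_b)


query = "aspiring human resources"
shuffled = "human resources aspiring"   # same words, different order

# The plain ratio is penalised by the word order...
print(simple_ratio(query, shuffled))
# ...while the token sort variant treats the two strings as identical.
print(token_sort_ratio_sketch(query, shuffled))  # 100
```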

In [47]:
!pip install fuzzywuzzy==0.18.0

In [20]:
from fuzzywuzzy import process, fuzz

In [21]:
#Create tuples of phrases_list , matched job title, and the score
score_sort = [(x,) + i
             for x in phrases_list 
             for i in process.extract(x, cleanData.cleanJobTitle, scorer=fuzz.token_sort_ratio)]

In [22]:
#Create dataframe from the tuples
similarity_sort = pd.DataFrame(score_sort, columns=['phrases','cleanJobTitle','score_sort','ignore'])
similarity_sort.drop('ignore',axis=1,inplace=True)
similarity_sort.head()

,phrases,cleanJobTitle,score_sort
0,aspiring human resources,aspiring human resource specialist,83
1,aspiring human resources,aspiring human resource professional,77
2,aspiring human resources,aspiring human resource professional,77
3,aspiring human resources,director human resource ey,68
4,aspiring human resources,human resource generalist schwans,63


#### Token Set Ratio:
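
The token set ratio compares the shared tokens of two strings against each string's remaining tokens, so a title that contains every query word scores highly even when it carries extra words. The sketch below approximates that idea with the standard library only. The function names and the example strings are hypothetical, and FuzzyWuzzy's own scorer may differ in preprocessing details.

```python
from difflib import SequenceMatcher


def simple_ratio(a, b):
    """Similarity of two strings as an integer percentage (0-100)."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def token_set_ratio_sketch(a, b):
    """Compare the shared tokens with each string's leftover tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    common = " ".join(sorted(tokens_a & tokens_b))
    rest_a = " ".join(sorted(tokens_a - tokens_b))
    rest_b = " ".join(sorted(tokens_b - tokens_a))
    combined_a = (common + " " + rest_a).strip()
    combined_b = (common + " " + rest_b).strip()
    # Keep the best of the three pairwise comparisons
    return max(
        simple_ratio(common, combined_a),
        simple_ratio(common, combined_b),
        simple_ratio(combined_a, combined_b),
    )


# Hypothetical strings: every query token also appears in the title
query = "aspiring human resource"
title = "aspiring human resource specialist new york"

print(token_set_ratio_sketch(query, title))  # 100
```

Because the extra words in the longer title are ignored when the query is a subset, this scorer tends to rank long, descriptive titles higher than the token sort ratio does.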

In [23]:
score_set = [(x,) + i
             for x in phrases_list 
             for i in process.extract(x, cleanData.cleanJobTitle, scorer=fuzz.token_set_ratio)]

In [24]:
similarity_set = pd.DataFrame(score_set, columns=['phrases','cleanJobTitle','score_set','ignore'])
similarity_set.drop('ignore',axis=1,inplace=True)
similarity_set.head()

,phrases,cleanJobTitle,score_set
0,aspiring human resources,aspiring human resource specialist,83
1,aspiring human resources,aspiring human resource professional,77
2,aspiring human resources,aspiring human resource professional,77
3,aspiring human resources,2019 ct bauer college business graduate magna ...,74
4,aspiring human resources,student humber college aspiring human resource...,74


### Word Embedding Techniques:
To apply these techniques, we vectorize the job titles, vectorize the query phrases, and use cosine similarity to measure how closely they match.
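
For intuition, cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below computes it on simple word-count vectors using only the standard library. The helper names and the four-word vocabulary are hypothetical, and raw counts stand in for the TF-IDF and BERT vectors used in the following sections.

```python
import math
from collections import Counter


def cosine_sim(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0                      # an all-zero vector matches nothing
    return dot / (norm_u * norm_v)


def count_vector(text, vocabulary):
    """Represent text as word counts over a fixed vocabulary."""
    counts = Counter(text.split())
    return [counts[word] for word in vocabulary]


vocabulary = ["aspiring", "human", "resource", "engineer"]
query_vec = count_vector("aspiring human resource", vocabulary)  # [1, 1, 1, 0]
match_vec = count_vector("human resource", vocabulary)           # [0, 1, 1, 0]
other_vec = count_vector("engineer", vocabulary)                 # [0, 0, 0, 1]

print(round(cosine_sim(query_vec, match_vec), 4))  # 0.8165
print(cosine_sim(query_vec, other_vec))            # 0.0
```

The second result shows why a purely lexical representation scores a title at exactly 0 when it shares no words with the query.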

#### TF-IDF cosine similarity:

In [25]:
# Fit the TF-IDF vocabulary on the cleaned job titles and vectorize them
tfv=TfidfVectorizer()
vects=tfv.fit_transform(cleanData.cleanJobTitle)
# Convert the search phrases into vectors using the same vocabulary
tf_phrase=tfv.transform(phrases_list)


In [26]:
# Calculate the TF-IDF cosine similarity to each phrase and add it to the dataframe
tfidf_scores = cosine_similarity(vects, tf_phrase)
cleanData["tfidf_sim_1"] = tfidf_scores[:, 0]
cleanData["tfidf_sim_2"] = tfidf_scores[:, 1]
# Overall score: mean similarity across the two phrases
cleanData['tfidf_sim'] = ((cleanData["tfidf_sim_1"] + cleanData["tfidf_sim_2"])/2)

The five job titles that best match the two phrases:

In [27]:
cleanData.sort_values("tfidf_sim",ascending=False).head()

id,job_title,location,connection,fit,cleanJobTitle,tfidf_sim_1,tfidf_sim_2,tfidf_sim
73,"Aspiring Human Resources Manager, seeking inte...","Houston, Texas Area",7,,aspiring human resource manager seeking intern...,0.481429,0.503455,0.492442
3,Aspiring Human Resources Professional,"Raleigh-Durham, North Carolina Area",44,,aspiring human resource professional,0.680582,0.166562,0.423572
97,Aspiring Human Resources Professional,"Kokomo, Indiana Area",71,,aspiring human resource professional,0.680582,0.166562,0.423572
27,Aspiring Human Resources Management student se...,"Houston, Texas Area",500,,aspiring human resource management student see...,0.384157,0.413559,0.398858
99,Seeking Human Resources Position,"Las Vegas, Nevada Area",48,,seeking human resource position,0.153506,0.627236,0.390371


The five job titles that match the two phrases least:

In [28]:
cleanData.sort_values("tfidf_sim",ascending=False).tail()

id,job_title,location,connection,fit,cleanJobTitle,tfidf_sim_1,tfidf_sim_2,tfidf_sim
87,Bachelor of Science in Biology from Victoria U...,"Baltimore, Maryland",40,,bachelor science biology victoria university w...,0.0,0.0,0.0
86,Information Systems Specialist and Programmer ...,"Gaithersburg, Maryland",4,,information system specialist programmer love ...,0.0,0.0,0.0
85,RRP Brand Portfolio Executive at JTI (Japan To...,Greater Philadelphia Area,500,,rrp brand portfolio executive jti japan tobacc...,0.0,0.0,0.0
80,Junior MES Engineer| Information Systems,"Myrtle Beach, South Carolina Area",52,,junior engineer information system,0.0,0.0,0.0
104,Director Of Administration at Excellence Logging,"Katy, Texas",500,,director administration excellence logging,0.0,0.0,0.0


#### BERT:


Build BERT_base model:

In [30]:
bert_model = SentenceTransformer('bert-base-nli-mean-tokens')



Convert job titles into BERT embedded vectors:

In [31]:
bert_job_title_embeddings = bert_model.encode(list(cleanData.cleanJobTitle))
bert_job_title_embeddings.shape

(53, 768)

Convert the first search phrase (`aspiring human resources`) into a BERT embedded vector:

In [32]:
bert_search_phrase_embedding = bert_model.encode(phrases_list[0])

bert_search_phrase_embedding.shape

(768,)

Calculate cosine similarity between job title and search phrase vectors:

In [33]:
bert_cosine_similarity = []
for i in range(len(cleanData)):
    cos_sim = 1 - cosine(bert_job_title_embeddings[i], bert_search_phrase_embedding)
    bert_cosine_similarity.append(cos_sim)
    
# Add the BERT cosine similarity as a column in the cleanData dataframe
cleanData['BERT_model_fitt_score'] = bert_cosine_similarity

In [34]:
cleanData[['job_title', 'cleanJobTitle','tfidf_sim', 'BERT_model_fitt_score']].sort_values(by ='BERT_model_fitt_score', ascending = False).head()

id,job_title,cleanJobTitle,tfidf_sim,BERT_model_fitt_score
6,Aspiring Human Resources Specialist,aspiring human resource specialist,0.372814,0.950305
3,Aspiring Human Resources Professional,aspiring human resource professional,0.423572,0.930828
97,Aspiring Human Resources Professional,aspiring human resource professional,0.423572,0.930828
82,Aspiring Human Resources Professional | An ene...,aspiring human resource professional energeti...,0.206308,0.8544
99,Seeking Human Resources Position,seeking human resource position,0.390371,0.837419


In [35]:
cleanData[['job_title', 'cleanJobTitle','tfidf_sim', 'BERT_model_fitt_score']].sort_values(by ='BERT_model_fitt_score', ascending = False).tail()

id,job_title,cleanJobTitle,tfidf_sim,BERT_model_fitt_score
91,Lead Official at Western Illinois University,lead official western illinois university,0.0,0.367655
85,RRP Brand Portfolio Executive at JTI (Japan To...,rrp brand portfolio executive jti japan tobacc...,0.0,0.270093
96,Student at Indiana University Kokomo - Busines...,student indiana university kokomo business ma...,0.0,0.252365
93,Admissions Representative at Community medical...,admission representative community medical cen...,0.0,0.192242
87,Bachelor of Science in Biology from Victoria U...,bachelor science biology victoria university w...,0.0,0.139689


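BERT appears to be the best-performing model: it gives the closest titles scores above 0.9 and still assigns graded, non-zero scores to unrelated titles, whereas TF-IDF scores every title that shares no token with the phrases as exactly 0.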

### Learning to Rank (LTR):

We are going to re-rank the job titles when a candidate is starred manually.

How RankNet works:

* A neural network with Linear, Dropout, and activation layers takes a job-title embedding as input and returns a score.
* Two job titles are sampled at random and each is scored separately by a forward pass through the same network.
* The difference between the two scores is passed through a sigmoid to estimate the probability that the first title should rank above the second, and a binary cross-entropy loss compares that probability with the ground-truth ordering.
* The loss is back-propagated to update the network weights.
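
The pairwise objective described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the training code of this notebook: the helper names are hypothetical, the scores are arbitrary numbers standing in for network outputs, and the labels are illustrative fit scores.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def pair_target(label_i, label_j):
    """1.0 if item i should rank above item j, 0.0 if below, 0.5 if tied."""
    if label_i > label_j:
        return 1.0
    if label_i < label_j:
        return 0.0
    return 0.5


def ranknet_loss(score_i, score_j, target):
    """Binary cross-entropy on the predicted probability that i beats j."""
    p = sigmoid(score_i - score_j)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


# Item i has the higher label, so the target is 1.0
target = pair_target(0.93, 0.41)

good = ranknet_loss(2.0, 0.5, target)  # network scores i above j
bad = ranknet_loss(0.5, 2.0, target)   # network scores i below j
print(good < bad)  # True
```

The loss is small when the score difference agrees with the ground-truth ordering and grows when the network ranks the pair the wrong way round.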


In [36]:
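# Take an explicit copy so that later column assignments do not write to a view of cleanData
Data = cleanData[['cleanJobTitle','location', 'connection','fit','BERT_model_fitt_score']].copy()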
Data.sort_values(by ='BERT_model_fitt_score', ascending = False).head()

id,cleanJobTitle,location,connection,fit,BERT_model_fitt_score
6,aspiring human resource specialist,Greater New York City Area,1,,0.950305
3,aspiring human resource professional,"Raleigh-Durham, North Carolina Area",44,,0.930828
97,aspiring human resource professional,"Kokomo, Indiana Area",71,,0.930828
82,aspiring human resource professional energeti...,"Austin, Texas Area",174,,0.8544
99,seeking human resource position,"Las Vegas, Nevada Area",48,,0.837419


In [37]:
star_candidate = input("Do you want to star any candidates? Enter 'Yes' or 'No': ")

starred = []
if star_candidate.lower() == 'yes':
    starred = [int(item) for item in input("Enter ids of candidates you want to star (separated by space) : ").split()]

In [38]:
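Data['starredScore'] = Data['BERT_model_fitt_score']

# Give every starred candidate the maximum score of 1 (ids not present in the index are ignored)
Data.loc[Data.index.isin(starred), 'starredScore'] = 1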

Data.head()

id,cleanJobTitle,location,connection,fit,BERT_model_fitt_score,starredScore
1,2019 ct bauer college business graduate magna ...,"Houston, Texas",85,,0.578465,0.578465
2,native english teacher epik english program korea,Kanada,500,,0.411404,0.411404
3,aspiring human resource professional,"Raleigh-Durham, North Carolina Area",44,,0.930828,0.930828
4,people development coordinator ryan,"Denton, Texas",500,,0.755497,0.755497
5,advisory board member celal bayar university,"İzmir, Türkiye",500,,0.475463,0.475463


In [39]:
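# Set seeds for reproducibility (pandas sampling uses NumPy's RNG; weights and Dropout use PyTorch's)
random.seed(123)
np.random.seed(123)
torch.manual_seed(123)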

# Build a Deep Learning RankNet class
class RankNet(nn.Module):
    
    def __init__(self, num_feature):
        super(RankNet, self).__init__()

        self.model = nn.Sequential(
            nn.Linear(num_feature, 512),         # Linear layer
            nn.Dropout(0.5),                     # Regularization
            nn.LeakyReLU(0.2, inplace=True),     # Activation function
            nn.Linear(512, 256),
            nn.Dropout(0.5),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(256, 1),
            nn.Sigmoid()                         # output between 0 and 1
        )
        self.output_sig = nn.Sigmoid()

    def forward(self, input_1, input_2):
        s1 = self.model(input_1)
        s2 = self.model(input_2)
        out = self.output_sig(s1-s2)
        return out
    
    def predict(self, input_):
        s = self.model(input_)
        return s



Generate data:

In [40]:
random_row_1 = Data.sample(n = 5000, replace = True)
random_row_2 = Data.sample(n = 5000, replace = True)

Get list of job titles for each data generated:

In [41]:
job_title_list_ranknet1 = list(random_row_1['cleanJobTitle'])
job_title_list_ranknet2 = list(random_row_2['cleanJobTitle'])

In [42]:
doc1 = bert_model.encode(job_title_list_ranknet1)
doc2 = bert_model.encode(job_title_list_ranknet2)
doc1 = torch.from_numpy(doc1).float()
doc2 = torch.from_numpy(doc2).float()

Generate ground truth for RankNet model:

In [43]:
y_1 = list(random_row_1['starredScore'])
y_2 = list(random_row_2['starredScore'])
#Ground truth for ranknet: output is 1 if first entry has higher value, 0 if second entry is higher, 0.5 if equal
y = torch.tensor([1.0 if y1_i>y2_i else 0.5 if y1_i==y2_i else 0.0 for y1_i, y2_i in zip(y_1, y_2)]).float()

y = y.unsqueeze(1)

In [44]:
y.shape

torch.Size([5000, 1])

Initialize an instance of RankNet model and define loss function:

In [45]:
rank_model = RankNet(num_feature = 768)     
optimizer = torch.optim.SGD(rank_model.parameters(), lr = 0.01, momentum = 0.9)        
loss_fun = torch.nn.BCELoss()             

Optimize the loss by training the RankNet model for 1,000 iterations:

In [46]:
epoch = 1000
losses = []

for i in range(epoch):
    rank_model.zero_grad()
    y_pred = rank_model(doc1, doc2)
    loss = loss_fun(y_pred,y)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
    #print(y1, y2, y, y_pred)
    
    if i % 100 == 0:
        print('Epoch{}, loss : {}'.format(i, loss.item()))

Epoch0, loss : 0.6910803914070129
Epoch100, loss : 0.5187612175941467
Epoch200, loss : 0.5051256418228149
Epoch300, loss : 0.5003107190132141
Epoch400, loss : 0.49872085452079773
Epoch500, loss : 0.4983339011669159
Epoch600, loss : 0.49863535165786743
Epoch700, loss : 0.4975854456424713
Epoch800, loss : 0.4968988299369812
Epoch900, loss : 0.4981394112110138
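
One possible explanation for the loss levelling off near 0.5 rather than approaching 0: `self.model` already ends in a Sigmoid, so each score lies in (0, 1) and the difference `s1 - s2` lies in (-1, 1). The outer sigmoid can therefore only output probabilities between roughly 0.27 and 0.73. The sketch below works out the resulting lower bounds on the binary cross-entropy. It is an analysis of the architecture as written, not a measurement from this run.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# s1 and s2 are each in (0, 1), so s1 - s2 is in (-1, 1)
p_max = sigmoid(1.0)    # upper bound on the predicted probability
p_min = sigmoid(-1.0)   # lower bound on the predicted probability

# Best achievable loss for a pair with a clear ordering (target 1 or 0)
loss_floor_ordered = -math.log(p_max)
# Best achievable loss for a tied pair (target 0.5), reached at p = 0.5
loss_floor_tied = math.log(2.0)

print(round(p_min, 3), round(p_max, 3))  # 0.269 0.731
print(round(loss_floor_ordered, 3))      # 0.313
print(round(loss_floor_tied, 3))         # 0.693
```

Under this reading, the average loss cannot fall below a mix of about 0.313 (ordered pairs) and 0.693 (tied pairs). Removing the final `nn.Sigmoid()` inside `self.model` would lift the cap on ordered pairs, although `predict` would then no longer return a value between 0 and 1.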


Make predictions for the initial dataset and store the RankNet score in the `fit` column:

In [47]:
# Switch to evaluation mode so Dropout is disabled and predictions are deterministic
rank_model.eval()

predScore = []
with torch.no_grad():
    for job in Data['cleanJobTitle']:
        embedding = bert_model.encode([job])
        embedding_tensor = torch.from_numpy(embedding).float()
        # predict() returns a 1x1 tensor; item() extracts the scalar
        pred = round(rank_model.predict(embedding_tensor).item(), 2)
        predScore.append(pred)

Data['fit'] = predScore
Data.sort_values(by ='fit', ascending = False)

id,cleanJobTitle,location,connection,fit,BERT_model_fitt_score,starredScore
74,human resource professional,Greater Boston Area,16,1.0,0.794618,0.794618
10,seeking human resource human resource generali...,Greater Philadelphia Area,500,1.0,0.779854,0.779854
97,aspiring human resource professional,"Kokomo, Indiana Area",71,1.0,0.930828,0.930828
82,aspiring human resource professional energeti...,"Austin, Texas Area",174,1.0,0.8544,0.8544
28,seeking human resource opportunity,"Chicago, Illinois",390,1.0,0.825814,0.825814
27,aspiring human resource management student see...,"Houston, Texas Area",500,1.0,0.703306,0.703306
76,aspiring human resource professional passiona...,"New York, New York",212,1.0,0.730179,0.730179
99,seeking human resource position,"Las Vegas, Nevada Area",48,1.0,0.837419,0.837419
73,aspiring human resource manager seeking intern...,"Houston, Texas Area",7,1.0,0.759009,0.759009
8,human resource senior specialist,San Francisco Bay Area,500,1.0,0.745252,0.745252


### Conclusion:

The BERT model seems to be the best choice for ranking the candidates and for checking the similarity between the query titles 'Aspiring human resources' and 'seeking human resources' and each candidate's title.

The table above shows the re-ranking based on the candidates starred by the user. Candidates with ids 3, 74, and 99 were starred for this process.

