# Description

Before training the model, we preprocessed the company descriptions to remove irrelevant information. As in the zero-shot approach, we used a dataset of company descriptions that had already been labeled with vertical categories (241 observations). We first partitioned the dataset into training and testing sets with a 60-40 split, applying stratified sampling so that each vertical category is represented in both sets. We then embedded the preprocessed training and testing sets with a sentence transformer, specifically the 'all-mpnet-base-v2' model, and fed the embeddings into a OneVsRestClassifier for training.
The OneVsRestClassifier uses a Support Vector Machine (SVM) as its estimator, with the default radial basis function (RBF) kernel and the default iteration limit.
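
As a rough sketch of the classification step described above, the snippet below fits a `OneVsRestClassifier` wrapping a default `SVC` on synthetic vectors. The random clusters only stand in for sentence embeddings; the dimensions, class count, seed, and all `demo_` names are illustrative assumptions, not values from this notebook.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-in for embeddings: 3 well-separated clusters in 8 dimensions.
n_classes, n_per_class, dim = 3, 20, 8
centers = rng.normal(scale=5.0, size=(n_classes, dim))
X_demo = np.vstack([c + rng.normal(scale=0.5, size=(n_per_class, dim)) for c in centers])
y_demo = np.repeat(np.arange(n_classes), n_per_class)

# One binary SVM is trained per class; at prediction time the class whose
# SVM returns the largest decision value is chosen.
demo_clf = OneVsRestClassifier(SVC())  # SVC defaults: kernel='rbf', max_iter=-1 (no limit)
demo_clf.fit(X_demo, y_demo)

demo_pred = demo_clf.predict(X_demo)
demo_accuracy = float((demo_pred == y_demo).mean())
print(len(demo_clf.estimators_), demo_accuracy)
```

With 41 observed verticals, the same construction would train 41 binary SVMs, one per vertical.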



# 1. Preprocessing company data

In [1]:
from google.colab import drive
drive.mount('/content/drive')
#%cd /content/drive/MyDrive/Capstone Datasets
%cd /content/drive/MyDrive/Capstone/label

Mounted at /content/drive


In [None]:
# import data
import pandas as pd
import numpy as np

path = 'related_company_after_trans.xlsx'
data = pd.read_excel(path)
data = data[["busdesc", "gind"]]
# keep only labeled rows; .copy() avoids pandas' SettingWithCopyWarning on the assignment below
data = data.dropna(subset=['gind']).copy()
data['gind'] = data['gind'].astype(int)


In [None]:
# preprocess
import re
import string

def preprocess_text(text):
    '''Make text lowercase, remove text in square brackets, remove links, remove HTML tags,
    remove punctuation and remove newlines.'''
    text = str(text).lower()
    text = re.sub(r'\[.*?\]', '', text)
    text = re.sub(r'https?://\S+|www\.\S+', '', text)
    text = re.sub(r'<.*?>+', '', text)
    text = re.sub('[%s]' % re.escape(string.punctuation), '', text)
    text = re.sub(r'\n', '', text)

    return text


data["busdesc"] = data["busdesc"].map(preprocess_text)
data.head()

Unnamed: 0,busdesc,gind
31,22 eleven 22 eleven is an ecommerce retailer b...,19
75,99bros simply insured 99bros is a digital ins...,29
109,abloh a web and mobilebased platform for stude...,39
115,abtira garden the idea behind our line is nat...,1
155,acumen acumen is changing the way the world ta...,26
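
To make the cleaning steps concrete, the following self-contained sketch repeats the same regular expressions under a hypothetical name, `clean_description`, and applies them to a made-up description. The sample string is an assumption chosen to exercise each rule.

```python
import re
import string

def clean_description(text):
    """Same steps as preprocess_text, repeated here so the example runs on its own."""
    text = str(text).lower()
    text = re.sub(r'\[.*?\]', '', text)                # drop [bracketed] text
    text = re.sub(r'https?://\S+|www\.\S+', '', text)  # drop links
    text = re.sub(r'<.*?>+', '', text)                 # drop HTML-like tags
    text = re.sub('[%s]' % re.escape(string.punctuation), '', text)  # drop punctuation
    text = re.sub(r'\n', '', text)                     # drop newlines
    return text

# Hypothetical input, not taken from the dataset.
sample = "Acme [beta] is a <b>B2B</b> platform.\nVisit https://acme.example now!"
cleaned = clean_description(sample)
print(repr(cleaned))  # 'acme  is a b2b platformvisit  now'
```

Two details are visible in the result: tokens containing digits such as `b2b` are kept, and because newlines are replaced by an empty string, words on either side of a line break are joined (`platformvisit`). Replacing newlines with a space instead would be one way to avoid the second effect, if it matters for the embeddings.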


In [None]:
my_tags = list(np.unique(data['gind']))
my_tags = [str(value) for value in my_tags]
len(my_tags)  # There is one vertical with no observations.

41

# 2. Train and test split

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(data["busdesc"], data["gind"], test_size=0.4, random_state=0, stratify=data["gind"])
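
The effect of `stratify` can be checked on a toy example. The sketch below uses hypothetical labels (four classes with 10, 5, 5 and 5 members) and counts how many items of each class land in each partition. Note that a stratified split requires at least two members per class; scikit-learn raises a `ValueError` otherwise.

```python
from collections import Counter
from sklearn.model_selection import train_test_split

# Hypothetical labels and texts, not the notebook's data.
labels = [0] * 10 + [1] * 5 + [2] * 5 + [3] * 5
texts = [f"company {i}" for i in range(len(labels))]

texts_tr, texts_te, labels_tr, labels_te = train_test_split(
    texts, labels, test_size=0.4, random_state=0, stratify=labels
)

# Per-class counts in each partition; proportions mirror the full dataset.
train_counts = Counter(labels_tr)
test_counts = Counter(labels_te)
print(sorted(train_counts.items()))
print(sorted(test_counts.items()))
```

Every class appears in both partitions in roughly a 60-40 ratio, which is what keeps small verticals from disappearing from the test set.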

# 3. Training the model and results

In [None]:
pip install -U sentence-transformers


In [None]:
from sklearn.metrics import classification_report, f1_score, confusion_matrix
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC
from sentence_transformers import SentenceTransformer



In [None]:
model = SentenceTransformer('sentence-transformers/all-mpnet-base-v2')


In [None]:
X_train_embed = model.encode(list(X_train))

X_test_embed = model.encode(list(X_test))

In [None]:
clf = OneVsRestClassifier(SVC())  # Trying other kernels did not result in significant improvement.

clf.fit(X_train_embed, Y_train)

y_pred = clf.predict(X_test_embed)

In [None]:
print(confusion_matrix(Y_test, y_pred))
print(classification_report(Y_test, y_pred, target_names=my_tags))
print("F1 score is: " + str(f1_score(Y_test, y_pred, average='micro')))

[[2 0 0 ... 0 0 0]
 [0 1 0 ... 0 0 0]
 [0 0 3 ... 0 0 0]
 ...
 [0 0 0 ... 1 0 0]
 [0 0 0 ... 0 2 0]
 [0 0 0 ... 0 0 2]]
              precision    recall  f1-score   support

           1       1.00      1.00      1.00         2
           2       0.50      0.50      0.50         2
           3       1.00      1.00      1.00         3
           4       1.00      1.00      1.00         3
           5       1.00      0.50      0.67         2
           6       0.00      0.00      0.00         3
           7       1.00      1.00      1.00         2
           8       1.00      1.00      1.00         2
           9       0.67      0.67      0.67         3
          10       0.75      1.00      0.86         3
          11       0.67      1.00      0.80         2
          12       0.50      0.33      0.40         3
          13       0.60      1.00      0.75         3
          14       1.00      1.00      1.00         2
          15       0.67      0.67      0.67         3
          16   

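
For single-label multiclass predictions such as these, the micro-averaged F1 pools true positives, false positives and false negatives over all classes, which makes it numerically equal to plain accuracy. The pure-Python sketch below illustrates this with hypothetical labels; `micro_f1` is a helper written for this example, not a library function.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for single-label multiclass data, from pooled counts."""
    classes = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for c in classes:
        tp += sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp += sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn += sum(t == c and p != c for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical labels, chosen only for illustration; class 6 is never predicted.
toy_true = [1, 1, 2, 2, 3, 3, 6, 6]
toy_pred = [1, 1, 2, 3, 3, 3, 2, 1]

toy_accuracy = sum(t == p for t, p in zip(toy_true, toy_pred)) / len(toy_true)
toy_micro_f1 = micro_f1(toy_true, toy_pred)
print(toy_accuracy, toy_micro_f1)  # 0.625 0.625
```

Per-class precision is undefined for a class that is never predicted (class 6 here), yet the pooled micro average remains well defined. If per-class differences matter, a macro average would weight each vertical equally instead.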


# Visualization

In [None]:
X_test_embed

array([[ 0.04735124, -0.01712022, -0.0369628 , ..., -0.01147545,
        -0.02979397, -0.02410926],
       [ 0.05662696,  0.01547982,  0.020043  , ..., -0.05166912,
        -0.05792082,  0.01848892],
       [ 0.02260668,  0.0430536 , -0.00371576, ...,  0.0046359 ,
         0.00924819, -0.0044073 ],
       ...,
       [ 0.01066492, -0.06471832, -0.01457305, ..., -0.02790175,
        -0.02564576, -0.03845064],
       [-0.01016736,  0.07712045, -0.00733948, ...,  0.02652265,
         0.08429354, -0.03514279],
       [ 0.04392074, -0.01552633,  0.0089786 , ..., -0.04771372,
        -0.00513229, -0.03288949]], dtype=float32)

# Repeat the process a few times to average out the performance

In [None]:
# prompt: Repeat the process a few times to average out the performance.

f1_scores = []

# Load the encoder once; it is not trained here, so it can be reused for every split.
model = SentenceTransformer('sentence-transformers/all-mpnet-base-v2')

for i in range(5):
    # A different random_state gives a different stratified 60-40 split on each pass.
    X_train, X_test, Y_train, Y_test = train_test_split(data["busdesc"], data["gind"], test_size=0.4, random_state=i, stratify=data["gind"])
    X_train_embed = model.encode(list(X_train))
    X_test_embed = model.encode(list(X_test))
    clf = OneVsRestClassifier(SVC())
    clf.fit(X_train_embed, Y_train)
    y_pred = clf.predict(X_test_embed)
    f1_scores.append(f1_score(Y_test, y_pred, average='micro'))

print("Average F1 score:", sum(f1_scores) / len(f1_scores))




Average F1 score: 0.7793814432989691
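
Because the sentence encoder is not trained in this notebook, the embeddings should not depend on how the data is split. One possible variation, sketched below with random vectors standing in for the output of `model.encode(...)`, embeds everything once and only re-splits and re-fits the classifier for each seed. It also reports the standard deviation alongside the mean. All data and variable names in the sketch are assumptions for illustration.

```python
import numpy as np
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-in for precomputed embeddings: 4 classes, 15 samples each.
y_all = np.repeat(np.arange(4), 15)
centers = rng.normal(scale=4.0, size=(4, 16))
X_all = centers[y_all] + rng.normal(scale=1.0, size=(60, 16))

scores = []
for seed in range(5):
    # Only the split and the classifier change between repetitions.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_all, y_all, test_size=0.4, random_state=seed, stratify=y_all
    )
    split_clf = OneVsRestClassifier(SVC()).fit(X_tr, y_tr)
    scores.append(f1_score(y_te, split_clf.predict(X_te), average='micro'))

mean_f1 = float(np.mean(scores))
std_f1 = float(np.std(scores))
print(f"micro-F1: {mean_f1:.3f} +/- {std_f1:.3f} over {len(scores)} splits")
```

Reporting the spread helps when comparing averages taken over different numbers of splits, since a gap between two averages may fall within the split-to-split variation.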


In [None]:
# prompt: Repeat the process a few times to average out the performance.

f1_scores = []

# Load the encoder once; it is not trained here, so it can be reused for every split.
model = SentenceTransformer('sentence-transformers/all-mpnet-base-v2')

for i in range(20):
    # A different random_state gives a different stratified 60-40 split on each pass.
    X_train, X_test, Y_train, Y_test = train_test_split(data["busdesc"], data["gind"], test_size=0.4, random_state=i, stratify=data["gind"])
    X_train_embed = model.encode(list(X_train))
    X_test_embed = model.encode(list(X_test))
    clf = OneVsRestClassifier(SVC())
    clf.fit(X_train_embed, Y_train)
    y_pred = clf.predict(X_test_embed)
    f1_scores.append(f1_score(Y_test, y_pred, average='micro'))

print("Average F1 score:", sum(f1_scores) / len(f1_scores))




Average F1 score: 0.7489690721649487


# Repeat the process with all labeled data as training data and all unlabeled data as test data, and inspect the results

In [None]:
# prompt: Repeat the process with all labeled data as training data and all unlabeled data as test data, and see what the results are

# Separate labeled and unlabeled data
path = 'related_company_after_trans.xlsx'
data = pd.read_excel(path)
data = data[["busdesc", "gind"]]

# .copy() makes labeled_data an independent frame, so the assignment below
# does not trigger pandas' SettingWithCopyWarning
labeled_data = data[data['gind'].notnull()].copy()
labeled_data['gind'] = labeled_data['gind'].astype(int)

unlabeled_data = data[data['gind'].isnull()]
unlabeled_data = unlabeled_data[unlabeled_data['busdesc'].notnull()]



In [None]:
# Train the model on all labeled data
X_train = labeled_data["busdesc"]
Y_train = labeled_data["gind"]

model = SentenceTransformer('sentence-transformers/all-mpnet-base-v2')
X_train_embed = model.encode(list(X_train))

clf = OneVsRestClassifier(SVC())
clf.fit(X_train_embed, Y_train)

# Predict on all unlabeled data
X_test = unlabeled_data["busdesc"]

# Y_test = unlabeled_data["gind"]
# There is no Y_test because these rows are unlabeled.

X_test_embed = model.encode(list(X_test))
y_pred = clf.predict(X_test_embed)


In [None]:
# Create a dictionary with the predicted values and the corresponding X_test values
data = {'X_test': X_test,'Predicted Value': y_pred}
df = pd.DataFrame(data)
vertical_name = pd.read_excel("merged_df.xlsx")
vertical_name = vertical_name[["New_class_byhand","class_name"]]
df_joined = pd.merge(df, vertical_name, left_on='Predicted Value', right_on='New_class_byhand')
df_joined[["X_test",'class_name']]
# For unlabeled data, the true labels are unknown, so the results cannot be quantified.

Unnamed: 0,X_test,class_name
0,la coñería a concept with great potential from...,Foodtech
1,frankies frankies is the first polish juice ba...,Foodtech
2,toronto container company toronto container co...,Foodtech
3,ess ventures european student startups is the ...,Edtech
4,kotokan kotokan is the maths problemsolving pl...,Edtech
5,copenhagen business school where university me...,Edtech
6,london tuition group elearning students from t...,Edtech
7,linkmaas created by students linkmaas arrives ...,Edtech
8,all ears we are trying to create a platform wh...,Edtech
9,omnimediacorp omnimediacorp specializes in sub...,HRtech
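
Although accuracy cannot be measured without true labels, the classifier's decision values can still hint at which predictions are weakly supported. The sketch below is not something this notebook does; it assumes synthetic vectors in place of embeddings and uses `OneVsRestClassifier.decision_function` to flag rows for which no per-class SVM returns a positive score. The threshold of 0 and all variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Hypothetical labeled "embeddings": 3 classes (10, 20, 30) in 6 dimensions.
class_index = np.repeat(np.arange(3), 12)
y_lab = np.array([10, 20, 30])[class_index]
centers = rng.normal(scale=4.0, size=(3, 6))
X_lab = centers[class_index] + rng.normal(scale=0.5, size=(36, 6))

# Hypothetical unlabeled points: one at a class center, one far from all classes.
X_unlab = np.vstack([centers[0], centers.mean(axis=0) + 50.0])

margin_clf = OneVsRestClassifier(SVC()).fit(X_lab, y_lab)

scores = margin_clf.decision_function(X_unlab)   # shape: (n_samples, n_classes)
top_score = scores.max(axis=1)                   # decision value of the winning class
predicted = margin_clf.classes_[scores.argmax(axis=1)]
needs_review = top_score < 0                     # no per-class SVM said "yes"
print(predicted, top_score.round(2), needs_review)
```

Rows flagged this way could be prioritized for manual inspection, which gives a rough quality check even when no ground truth exists.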




# Repeat the process with all labeled data as training data and the successful data as test data, and inspect the results

In [None]:
from google.colab import drive
drive.mount('/content/drive')
%cd /content/drive/MyDrive/Capstone/打标/successful data

import pandas as pd
df = pd.read_csv('df_79.csv')
df

Mounted at /content/drive
/content/drive/MyDrive/Capstone/打标/successful data


Unnamed: 0,CareerID,FounderID,CompanyID,JobTitle,DateRange,Start Date,End Date,Duration (years),Location,Description,Created Date,Relevant,founded_company_value,Headquarters Location,Number Of Employees,index,busdesc
0,8,1,53036,Vise President,2005–2009,2005-03-01,2009-09-01,4.5,,,2021-02-19,False,,,,0,
1,7,1,53036,Executive Director,2007–2009,2007-03-01,2009-09-01,2.5,"Moscow, Russian Federation",,2021-02-19,False,,,,1,
2,216950,1,19921,Founder,01/2016-Present,2016-01-01,,6.4,Cyprus,We are the largest global \ntravel mobility ma...,2022-05-05,True,22450000.0,European Union (EU),,2,gettransfer com gettransfer com provides trans...
3,80467,3,27310,Lead Accountant,Nov 2001–Apr 2006,2001-11-01,2006-04-01,4.4,,,2021-03-10,False,,,,0,
4,80414,3,39824,"Manager of Finance and Accounting, CIS Region",May 2006–Oct 2008,2006-05-01,2008-10-01,2.4,,,2021-03-10,False,,,,1,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
11335,238807,25746,127266,Chief Technical Officer | co-founder,10/2015-12/2017,2015-10-01,2017-12-01,2.2,"Toronto, Canada Area",,2022-06-17,True,12500000.0,"Toronto, Ontario",11-50,4,tilr tilr is solutions for skillsfirst organiz...
11336,312809,25746,152402,Product Strategist,1/2017-9/2018,2017-01-01,2018-09-01,1.7,,,2023-03-29,False,,,,5,
11337,238804,25746,127266,Chief Product Officer | co-founder,12/2017-01/2019,2017-12-01,2019-01-01,1.1,"Toronto, Canada Area",,2022-06-17,True,12500000.0,"Toronto, Ontario",11-50,6,tilr tilr is solutions for skillsfirst organiz...
11338,312807,25746,152400,Chief Executive Officer | co-founder,12/2018-3/2021,2018-12-01,2021-03-01,2.2,,Inkblot was acquired by Green Shield Holdings ...,2023-03-29,True,3808640.0,,,7,


In [None]:
drive.mount('/content/drive')
%cd /content/drive/MyDrive/Capstone/打标

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
/content/drive/MyDrive/Capstone/打标


In [None]:
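# Separate labeled and unlabeled data
path = 'related_company_after_trans.xlsx'
data = pd.read_excel(path)
data = data[["busdesc", "gind"]]

# .copy() makes labeled_data an independent frame, so the assignment below
# does not trigger pandas' SettingWithCopyWarning
labeled_data = data[data['gind'].notnull()].copy()
labeled_data['gind'] = labeled_data['gind'].astype(int)

# Here the "unlabeled" rows are the successful-data rows that have a description
unlabeled_data = df[df['busdesc'].notnull()]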



In [None]:
len(unlabeled_data)

3106

In [None]:
# Train the model on all labeled data
X_train = labeled_data["busdesc"]
Y_train = labeled_data["gind"]

model = SentenceTransformer('sentence-transformers/all-mpnet-base-v2')
X_train_embed = model.encode(list(X_train))

clf = OneVsRestClassifier(SVC())
clf.fit(X_train_embed, Y_train)

# Predict on all unlabeled data
X_test = unlabeled_data["busdesc"]

# Y_test = unlabeled_data["gind"]
# There is no Y_test because these rows are unlabeled.

X_test_embed = model.encode(list(X_test))
y_pred = clf.predict(X_test_embed)


In [None]:
predicted_data = {'busdesc': X_test,'vertical': y_pred}
predicted_data = pd.DataFrame(predicted_data)
len(predicted_data)

3106

In [None]:
# prompt: remove duplicate rows from predicted_data

predicted_data = predicted_data.drop_duplicates(subset='busdesc')


In [None]:
predicted_data

Unnamed: 0,busdesc,vertical
2,gettransfer com gettransfer com provides trans...,25
8,ementry ementry started with an idea born in t...,26
9,keker keker is a b2b marketplace for back offi...,15
11,factorin leading blockchain powered supply cha...,42
23,gourmetoriginscom gourmetoriginscom is an onli...,34
...,...,...
11280,seafood souq through the efficiencies that tec...,42
11295,🧠 paraplannerai every advisor reaches a limit ...,40
11307,zazuba inc zazuba is a centralized portal wher...,32
11310,leap motion leap motion is now ultraleap follo...,41


In [None]:
# prompt: left join df with predicted_data on busdesc, keeping only the vertical column from predicted_data
df_joined=df.merge(predicted_data, on='busdesc', how='left')
df_joined

Unnamed: 0,CareerID,FounderID,CompanyID,JobTitle,DateRange,Start Date,End Date,Duration (years),Location,Description,Created Date,Relevant,founded_company_value,Headquarters Location,Number Of Employees,index,busdesc,vertical
0,8,1,53036,Vise President,2005–2009,2005-03-01,2009-09-01,4.5,,,2021-02-19,False,,,,0,,
1,7,1,53036,Executive Director,2007–2009,2007-03-01,2009-09-01,2.5,"Moscow, Russian Federation",,2021-02-19,False,,,,1,,
2,216950,1,19921,Founder,01/2016-Present,2016-01-01,,6.4,Cyprus,We are the largest global \ntravel mobility ma...,2022-05-05,True,22450000.0,European Union (EU),,2,gettransfer com gettransfer com provides trans...,25.0
3,80467,3,27310,Lead Accountant,Nov 2001–Apr 2006,2001-11-01,2006-04-01,4.4,,,2021-03-10,False,,,,0,,
4,80414,3,39824,"Manager of Finance and Accounting, CIS Region",May 2006–Oct 2008,2006-05-01,2008-10-01,2.4,,,2021-03-10,False,,,,1,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
11335,238807,25746,127266,Chief Technical Officer | co-founder,10/2015-12/2017,2015-10-01,2017-12-01,2.2,"Toronto, Canada Area",,2022-06-17,True,12500000.0,"Toronto, Ontario",11-50,4,tilr tilr is solutions for skillsfirst organiz...,20.0
11336,312809,25746,152402,Product Strategist,1/2017-9/2018,2017-01-01,2018-09-01,1.7,,,2023-03-29,False,,,,5,,
11337,238804,25746,127266,Chief Product Officer | co-founder,12/2017-01/2019,2017-12-01,2019-01-01,1.1,"Toronto, Canada Area",,2022-06-17,True,12500000.0,"Toronto, Ontario",11-50,6,tilr tilr is solutions for skillsfirst organiz...,20.0
11338,312807,25746,152400,Chief Executive Officer | co-founder,12/2018-3/2021,2018-12-01,2021-03-01,2.2,,Inkblot was acquired by Green Shield Holdings ...,2023-03-29,True,3808640.0,,,7,,
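
The deduplication before the join matters: if the right-hand frame repeated a `busdesc`, every matching career row would be multiplied. The sketch below shows this on hypothetical miniature frames and uses the `validate` argument of `DataFrame.merge` as a guard. The frame contents and names are made up for illustration.

```python
import pandas as pd

# Hypothetical miniature versions of df and predicted_data.
careers = pd.DataFrame({
    "CareerID": [1, 2, 3, 4],
    "busdesc": ["acme makes rockets", None, "acme makes rockets", "beta sells tea"],
})
predictions = pd.DataFrame({
    "busdesc": ["acme makes rockets", "acme makes rockets", "beta sells tea"],
    "vertical": [25, 25, 34],
})

# Without deduplication, each repeated key multiplies the matching rows.
naive = careers.merge(predictions, on="busdesc", how="left")

# Deduplicate first; validate= raises MergeError if the right side still repeats a key.
unique_predictions = predictions.drop_duplicates(subset="busdesc")
joined = careers.merge(unique_predictions, on="busdesc", how="left", validate="many_to_one")

print(len(careers), len(naive), len(joined))  # 4 6 4
print(joined["vertical"].dtype)               # float64
```

The second print also explains why `vertical` shows up as `25.0` rather than `25` after the join: rows without a description receive `NaN`, which forces the integer column to a float dtype.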


In [None]:
# prompt: drop the busdesc column

df_joined = df_joined.drop(columns=['busdesc'])


In [None]:
# prompt: read in merged_df.xlsx
vertical_name = pd.read_excel("merged_df.xlsx")

In [None]:
df_joined = df_joined.merge(vertical_name[['New_class_byhand', 'class_name']],left_on='vertical', right_on='New_class_byhand', how='left')
df_joined

Unnamed: 0,CareerID,FounderID,CompanyID,JobTitle,DateRange,Start Date,End Date,Duration (years),Location,Description,Created Date,Relevant,founded_company_value,Headquarters Location,Number Of Employees,index,vertical,New_class_byhand,class_name
0,8,1,53036,Vise President,2005–2009,2005-03-01,2009-09-01,4.5,,,2021-02-19,False,,,,0,,,
1,7,1,53036,Executive Director,2007–2009,2007-03-01,2009-09-01,2.5,"Moscow, Russian Federation",,2021-02-19,False,,,,1,,,
2,216950,1,19921,Founder,01/2016-Present,2016-01-01,,6.4,Cyprus,We are the largest global \ntravel mobility ma...,2022-05-05,True,22450000.0,European Union (EU),,2,25.0,25.0,"Carsharing,Micro-mobility,Mobility tech,Ridesh..."
3,80467,3,27310,Lead Accountant,Nov 2001–Apr 2006,2001-11-01,2006-04-01,4.4,,,2021-03-10,False,,,,0,,,
4,80414,3,39824,"Manager of Finance and Accounting, CIS Region",May 2006–Oct 2008,2006-05-01,2008-10-01,2.4,,,2021-03-10,False,,,,1,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
11335,238807,25746,127266,Chief Technical Officer | co-founder,10/2015-12/2017,2015-10-01,2017-12-01,2.2,"Toronto, Canada Area",,2022-06-17,True,12500000.0,"Toronto, Ontario",11-50,4,20.0,20.0,HRtech
11336,312809,25746,152402,Product Strategist,1/2017-9/2018,2017-01-01,2018-09-01,1.7,,,2023-03-29,False,,,,5,,,
11337,238804,25746,127266,Chief Product Officer | co-founder,12/2017-01/2019,2017-12-01,2019-01-01,1.1,"Toronto, Canada Area",,2022-06-17,True,12500000.0,"Toronto, Ontario",11-50,6,20.0,20.0,HRtech
11338,312807,25746,152400,Chief Executive Officer | co-founder,12/2018-3/2021,2018-12-01,2021-03-01,2.2,,Inkblot was acquired by Green Shield Holdings ...,2023-03-29,True,3808640.0,,,7,,,


In [None]:
# prompt: drop the vertical and New_class_byhand columns

df_joined = df_joined.drop(columns=["vertical", "New_class_byhand"])


In [None]:
# prompt: save df_joined as 7.11_data.csv

df_joined.to_csv('7.11_data.csv')


In [None]:
df_joined

Unnamed: 0,CareerID,FounderID,CompanyID,JobTitle,DateRange,Start Date,End Date,Duration (years),Location,Description,Created Date,Relevant,founded_company_value,Headquarters Location,Number Of Employees,index,class_name
0,8,1,53036,Vise President,2005–2009,2005-03-01,2009-09-01,4.5,,,2021-02-19,False,,,,0,
1,7,1,53036,Executive Director,2007–2009,2007-03-01,2009-09-01,2.5,"Moscow, Russian Federation",,2021-02-19,False,,,,1,
2,216950,1,19921,Founder,01/2016-Present,2016-01-01,,6.4,Cyprus,We are the largest global \ntravel mobility ma...,2022-05-05,True,22450000.0,European Union (EU),,2,"Carsharing,Micro-mobility,Mobility tech,Ridesh..."
3,80467,3,27310,Lead Accountant,Nov 2001–Apr 2006,2001-11-01,2006-04-01,4.4,,,2021-03-10,False,,,,0,
4,80414,3,39824,"Manager of Finance and Accounting, CIS Region",May 2006–Oct 2008,2006-05-01,2008-10-01,2.4,,,2021-03-10,False,,,,1,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
11335,238807,25746,127266,Chief Technical Officer | co-founder,10/2015-12/2017,2015-10-01,2017-12-01,2.2,"Toronto, Canada Area",,2022-06-17,True,12500000.0,"Toronto, Ontario",11-50,4,HRtech
11336,312809,25746,152402,Product Strategist,1/2017-9/2018,2017-01-01,2018-09-01,1.7,,,2023-03-29,False,,,,5,
11337,238804,25746,127266,Chief Product Officer | co-founder,12/2017-01/2019,2017-12-01,2019-01-01,1.1,"Toronto, Canada Area",,2022-06-17,True,12500000.0,"Toronto, Ontario",11-50,6,HRtech
11338,312807,25746,152400,Chief Executive Officer | co-founder,12/2018-3/2021,2018-12-01,2021-03-01,2.2,,Inkblot was acquired by Green Shield Holdings ...,2023-03-29,True,3808640.0,,,7,
