<b>Authors:</b><br>
Stefan Pasch<br>
Dimitrios Petridis

This Jupyter Notebook builds the core code for the model training on the public dataset, the parameters tuning, and the predictions on the private dataset.

First, install the required libraries and packages:

In [None]:
!pip install -r requirements.txt

To start running the notebook, import all the necessary libraries, internal functions and constants:

In [4]:
import os
import pandas as pd
import numpy as np
from tqdm import tqdm
from datasets import Dataset

In [5]:
from utils.utility_functions import convert_to_list, create_input
from utils.constants import mapping

In [12]:
# Set CUDA_LAUNCH_BLOCKING equal to 1 to enables proper CUDA tracebacks in Google Colab:
os.environ['CUDA_LAUNCH_BLOCKING'] = "1"

Now, import the necessary datasets and continue with the datasets' validations and preparation:

In [7]:
# Import the three labeled datasets:
train = pd.read_json("data/public/train_refind_official.json")
dev = pd.read_json("data/public/dev_refind_official.json")
test = pd.read_json("data/public/test_refind_official.json")

In [8]:
# Import the unlabeled dataset:
private = pd.read_json("data/private/private_dataset.json")

In [9]:
# Apply conversion to list to the relevant column of each dataset:
train['spacy_ner'] = train["spacy_ner"].apply(convert_to_list)
dev['spacy_ner'] = dev["spacy_ner"].apply(convert_to_list)
test['spacy_ner'] = test["spacy_ner"].apply(convert_to_list)
private['spacy_ner'] = private["spacy_ner"].apply(convert_to_list)

In [10]:
# Confirm the conversion above:
[print(c['spacy_ner'].head(1)) for c in [train, dev, test, private]]

0    [O, O, O, ORG, ORG, ORG, O, O, O, ORG, ORG, OR...
Name: spacy_ner, dtype: object
0    [O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, ...
Name: spacy_ner, dtype: object
0    [O, O, O, O, O, O, O, O, O, O, O, O, O, O, O, ...
Name: spacy_ner, dtype: object
0    [O, O, O, O, GOV_AGY, O, O, O, O, O, O, O, O, ...
Name: spacy_ner, dtype: object


[None, None, None, None]

In [11]:
# Prepare the input datasets by concatenating the text and adding the respective NER group before each entity:

tqdm.pandas(desc="Create clean input TRAIN")
train["text"] = train.progress_apply(
    lambda row: create_input(token_list = row["token"], e1_start=row["e1_start"], e2_start = row["e2_start"], ner_list = row["spacy_ner"] ), axis=1)
train["label"] = train["relation"].map(mapping)
train = train[train['label'].notna()]
train.label = train.label.astype(int)

tqdm.pandas(desc="Create clean input DEV dataset")
dev["text"] = dev.progress_apply(
    lambda row: create_input(token_list =row["token"], e1_start=row["e1_start"], e2_start = row["e2_start"], ner_list = row["spacy_ner"] ), axis=1)
dev["label"] = dev["relation"].map(mapping)
dev = dev[dev['label'].notna()]
dev.label = dev.label.astype(int)

tqdm.pandas(desc="Create clean input TEST dataset")
test["text"] = test.progress_apply(
    lambda row: create_input(token_list =row["token"], e1_start=row["e1_start"], e2_start = row["e2_start"], ner_list = row["spacy_ner"] ), axis=1)
test["label"] = test["relation"].map(mapping)
test = test[test['label'].notna()]
test.label = test.label.astype(int)

tqdm.pandas(desc="Create clean input PRIVATE dataset")
private["text"] = private.progress_apply(
    lambda row: create_input(token_list =row["token"], e1_start=row["e1_start"], e2_start = row["e2_start"], ner_list = row["spacy_ner"] ), axis=1)
private["label"] = 0
private = private[private['label'].notna()]
private.label = private.label.astype(int)

Create clean input TRAIN: 100%|██████████| 20070/20070 [00:00<00:00, 37069.24it/s]
Create clean input DEV dataset: 100%|██████████| 4306/4306 [00:00<00:00, 35146.32it/s]
Create clean input TEST dataset: 100%|██████████| 4300/4300 [00:00<00:00, 35119.69it/s]
Create clean input PRIVATE dataset: 100%|██████████| 3069/3069 [00:00<00:00, 23253.16it/s]


In [13]:
# Confirm the input preparation above:
[print(c.text[0]+"\n") for c in [train, dev, test, private]]

 warrants to purchase ORG Lumos Networks Corp. common stock , ORG the Pamplona Entities ) .

 mortgage insurers with which they do business due to weakness in Essent Group Ltd. relative financial strength or other reasons , which could negatively affect Essent Group Ltd. level of ORG NIW and ORG Essent Group Ltd. market share .

 other changes in the financial condition or future prospects of issuers of securities that Best Hometown Bancorp , Inc. own , including ORG Best Hometown Bancorp , Inc. stock in the Federal Home Loan Bank ( FHLB ) of Chicago or ORG FHLB and .

 Prior to entering the GOV_AGY Pentagon , Mr. Tyrer served 21 years on Capitol Hill in a variety of congressional staff roles , including Chief of Staff to then - Senator PERSON William Cohen of Maine from 1989 - 1996 and campaign manager for U.S. Senator Susan Collins in her successful 1996 U.S. Senate campaign .



[None, None, None, None]

In [14]:
# Keep only the necessary columns in each dataset:
train_df = train[["text", "label"]].reset_index(drop = True)
dev_df = dev[["text", "label"]].reset_index(drop = True)
test_df = test[["text", "label"]].reset_index(drop = True)
private_df = private[["text", "label"]].reset_index(drop = True)

In [18]:
# Check the total length of each dataset:
print(len(train_df))
print(len(dev_df))
print(len(test_df))
print(len(private_df))

20070
4306
4300
3069


In [19]:
# Concatenate all the labeled datasets together to one training dataset:
all_train_df = pd.concat([train_df,
                          dev_df,
                          private_df])

In [20]:
# Check the length of the training dataset:
len(all_train_df)

27445

In [23]:
# Check the structure of the training dataset:
all_train_df

Unnamed: 0,text,label
0,warrants to purchase ORG Lumos Networks Corp....,0
1,warrants to purchase ORG Lumos Networks Corp....,0
2,turn over to Global Gold at its offices in Ry...,0
3,ts Eighteen of FelCor Lodging LP Consolidated...,0
4,the ORG WFOE will waive and release you uncon...,0
...,...,...
3064,In connection with the closing of the transac...,0
3065,The terms of the notes are the same as the re...,0
3066,In connection with the Company 's DATE 2016 O...,0
3067,"Manitowoc , through its wholly - owned subsid...",0


In [30]:
# Convert the datasets into "Trainer-able" Dataset objects:
all_train_dataset = Dataset.from_pandas(all_train_df)
private_dataset = Dataset.from_pandas(private_df)

In [None]:
# Check the structure of the converted training dataset:
all_train_dataset

Dataset({
    features: ['text', 'label', '__index_level_0__'],
    num_rows: 28676
})

In [33]:
# Check the structure of the converted private dataset:
private_dataset

Dataset({
    features: ['text', 'label'],
    num_rows: 3069
})

In [34]:
train_df

Unnamed: 0,text,label
0,warrants to purchase ORG Lumos Networks Corp....,0
1,warrants to purchase ORG Lumos Networks Corp....,0
2,turn over to Global Gold at its offices in Ry...,0
3,ts Eighteen of FelCor Lodging LP Consolidated...,0
4,the ORG WFOE will waive and release you uncon...,0
...,...,...
20065,"On April 15 , 2010 , ORG Demand Pooling , Inc...",20
20066,"On July 17 , 2017 , ORG Greater Cannabis Comp...",20
20067,"In November 2016 , ORG Xcede entered into ano...",20
20068,"On August 5 , 2015 , ORG Achaogen Inc entered...",20


In [40]:
# Check the labels distributions in the training dataset:
print(all_train_df["label"].value_counts(normalize=True))

label
0     0.516014
18    0.138349
2     0.125269
4     0.076699
20    0.028931
1     0.019821
3     0.019494
15    0.017052
16    0.012643
19    0.009619
7     0.006231
8     0.005976
10    0.005903
17    0.004044
12    0.003571
6     0.002405
9     0.002332
5     0.001749
11    0.001312
21    0.001020
13    0.000874
14    0.000692
Name: proportion, dtype: float64


In [66]:
def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

In [None]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
#tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
##example
#inputs = tokenizer(sentences, padding="max_length", truncation=True)

def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length",  max_length=200, truncation=True)

all_train_tokenized = all_train_dataset.map(tokenize_function, batched=True)
private_test_tokenized = private_test_dataset.map(tokenize_function, batched=True)

Map:   0%|          | 0/28676 [00:00<?, ? examples/s]

Map:   0%|          | 0/3069 [00:00<?, ? examples/s]

In [None]:
from transformers import AutoModelForSequenceClassification

#model = AutoModelForSequenceClassification.from_pretrained("bert-large-uncased", num_labels=3, classifier_dropout = 0.1)

#model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=22)
model = AutoModelForSequenceClassification.from_pretrained("roberta-large", num_labels=22)
#model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=22)


Some weights of the model checkpoint at roberta-large were not used when initializing RobertaForSequenceClassification: ['lm_head.dense.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.bias', 'lm_head.bias', 'lm_head.layer_norm.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at roberta-large and are newly initialized: ['classifier.dense.weight', 'classifier.out_proj.bias', 'classifier.dense.bias', 'classifier.out_proj.weight']
You should 

In [None]:
from transformers import TrainingArguments

training_args = TrainingArguments("test_trainer", weight_decay=0.01,num_train_epochs=3.0, learning_rate=1e-05,  save_steps=5000, per_device_train_batch_size =16  )

In [None]:
from transformers import Trainer

trainer = Trainer(
    model=model, args=training_args, train_dataset=all_train_tokenized, eval_dataset=private_test_tokenized
)

In [None]:
trainer.train()



Step,Training Loss
500,0.9367
1000,0.4845
1500,0.4372
2000,0.3721
2500,0.3537
3000,0.3405
3500,0.3343
4000,0.2868
4500,0.2776
5000,0.291


TrainOutput(global_step=21516, training_loss=0.20668797076668166, metrics={'train_runtime': 11920.0015, 'train_samples_per_second': 28.868, 'train_steps_per_second': 1.805, 'total_flos': 1.252775140995456e+17, 'train_loss': 0.20668797076668166, 'epoch': 12.0})

In [None]:

predict = trainer.predict(test_dataset = private_test_tokenized)

In [None]:
liste = []
for i in predict[0]:
  max_val = -99
  for x in range(0,len(predict[0][1])):
    if i[x] > max_val:
      max_val = i[x]
      prediction = x
  liste.append(prediction)



In [None]:
#val_df['prediction'] = liste
#test_df['prediction'] = liste
private_test_df['prediction'] = liste

In [None]:
private_test_df

Unnamed: 0,text,label,prediction
0,"Prior to entering the GOV_AGY Pentagon , Mr. ...",0,5
1,These actions will not result in ORG Ford Ind...,0,2
2,Interest Expense Interest expense increased f...,0,0
3,"aGvHD Phase 1 Trial In March 2021 , results w...",0,0
4,ROYALTY PHARMA PLC NOTES TO THE CONSOLIDATED ...,0,0
...,...,...,...
3064,In connection with the closing of the transac...,0,0
3065,The terms of the notes are the same as the re...,0,18
3066,In connection with the Company 's DATE 2016 O...,0,10
3067,"Manitowoc , through its wholly - owned subsid...",0,0


In [None]:
private_test_df["text"][3065]

' The terms of the notes are the same as the rest of the lender group We paid PERSON Ed Anakar , our TITLE director of operations – club division , employment compensation of $ 655,289 , $ 502,404 , and $ 550,000 during the fiscal years ended September 30 , 2021 , 2020 , and 2019 , respectively .'

In [None]:
output = private_test_df[['prediction']]

In [None]:
output

Unnamed: 0,prediction
0,5
1,2
2,0
3,0
4,0
...,...
3064,0
3065,18
3066,10
3067,0


In [None]:
output.to_csv("/content/drive/MyDrive/DataMonkeys/submission_files/private6ne12.csv", sep ='\t', header = False, index= False)

In [None]:
len(output)

3069

In [None]:
df = pd.DataFrame(predict[0])

In [None]:
test = pd.read_json("/content/drive/MyDrive/DataMonkeys/raw_data/private_dataset.json")

test['spacy_ner'] = test["spacy_ner"].apply(convert_to_list)

In [None]:
test_merge = pd.merge(test, df, left_index=True, right_index=True)
test_merge["prediction"] = liste


In [None]:
mapping_reverse = { 0: "no_relation",
            1: "ORG-DATE",
            2: "ORG-GPE",
            3: "PERS-ORG",
            4: "PERS-ORG",
            5: "PERS-GOV_AGY",
            6: "ORG-ORG",
            7: "ORG-MONEY",
            8: "ORG-GPE",
            9: "PERS-UNIV",
            10: "ORG-DATE",
            11: "PERS-UNIV",
            12: "ORG-GPE",
            13: "ORG-MONEY",
            14: "ORG-MONEY",
            15: "ORG-ORG",
            16: "ORG-ORG",
            17: "PERS-ORG",
            18: "PERS-TITLE",
            19:"ORG-MONEY" ,
            20: "ORG-ORG",
            21: "PERS-UNIV"}

In [None]:
test_merge["pred_relation"] = test_merge["prediction"].map(mapping_reverse)

In [None]:
tqdm.pandas(desc="create clean input TEST")
test_merge["text"] = test_merge.progress_apply(lambda row: create_input(token_list =row["token"], e1_start=row["e1_start"], e2_start = row["e2_start"], ner_list = row["spacy_ner"] ), axis=1)

create clean input TEST: 100%|██████████| 3069/3069 [00:00<00:00, 23176.41it/s]


In [None]:
test_merge["text"][3]

' aGvHD Phase 1 Trial In March 2021 , results were published from an TITLE Investigator Sponsored Phase 1 study conducted by Joseph Pidala , MD , PhD ( Moffitt Cancer Center ) , and PERSON Brian C. Betts , MD ( Masonic Cancer Center at the University of Minnesota ) , evaluating pacritinib , an investigational oral kinase inhibitor with specificity for JAK2 , for the prevention of acute graft - versus - host disease ( aGvHD ) .'

In [None]:
test_merge["e1_start"][3]

33

In [None]:
test_merge["e2_start"][3]

13

In [None]:
test_merge["spacy_ner"][3]

['O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'TITLE',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'PERSON',
 'PERSON',
 'PERSON',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O',
 'O']

In [None]:
import ast

def convert_to_list(value):
    try:
        return ast.literal_eval(value)
    except (SyntaxError, ValueError):
        return value

# Apply conversion function to the column
test['spacy_ner'] = test_merge["spacy_ner"].apply(convert_to_list)


In [None]:
test_merge.to_csv("/content/drive/MyDrive/DataMonkeys/output_files/output6ne12.csv", sep =';', header = True)

In [None]:
test_merge

Unnamed: 0,id,docid,relation,rel_group,token,e1_start,e1_end,e2_start,e2_end,e1_type,...,15,16,17,18,19,20,21,prediction,pred_relation,text
0,1_William Cohen_30-32_Pentagon_4-5,2021,,PERSON-GOV_AGY,"[Prior, to, entering, the, Pentagon, ,, Mr., T...",30,32,4,5,PERSON,...,-1.391628,0.073100,-0.788721,3.709137,-1.852387,1.393929,-0.681351,0,no_relation,"Prior to entering the , Pentagon , Mr. Tyrer ..."
1,267_Ford India_6-8_Sanand_25-26_6-8_25-26,2021,,ORG-GPE,"[These, actions, will, not, result, in, Ford, ...",6,8,25,26,ORG,...,0.002865,-0.935565,-1.230419,-1.355157,-0.559362,-1.131405,-1.632586,2,ORG-GPE,These actions will not result in ORG Ford Ind...
2,76_Ellie Mae_49-51_May 2020_31-33_49-51_31-33,2021,,ORG-DATE,"[Interest, Expense, Interest, expense, increas...",49,51,31,33,ORG,...,-0.158951,0.309201,-1.781407,0.100739,-0.109399,0.079705,-2.124119,0,no_relation,Interest Expense Interest expense increased f...
3,326_Brian C. Betts_33-36_Investigator_13-14,2021,,PERSON-TITLE,"[aGvHD, Phase, 1, Trial, In, March, 2021, ,, r...",33,36,13,14,PERSON,...,-0.882330,0.504513,-1.944406,1.241539,-1.188621,2.309536,-1.544500,0,no_relation,"aGvHD Phase 1 Trial In March 2021 , results w..."
4,119_Gilead_49-50_2019_39-40_49-50_39-40,2021,,ORG-DATE,"[ROYALTY, PHARMA, PLC, NOTES, TO, THE, CONSOLI...",49,50,39,40,ORG,...,0.439898,2.018108,-1.884603,0.215939,-0.588170,1.005893,-2.486440,0,no_relation,ROYALTY PHARMA PLC NOTES TO THE CONSOLIDATED ...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3064,279_Ballantyne Strong_100-102_Firefly_77-78_10...,2021,,ORG-ORG,"[In, connection, with, the, closing, of, the, ...",100,102,77,78,ORG,...,0.558981,3.373878,-1.588498,0.058236,-1.105172,3.573465,-1.682987,0,no_relation,In connection with the closing of the transac...
3065,609_Ed Anakar_17-19_director_21-22,2021,,PERSON-TITLE,"[The, terms, of, the, notes, are, the, same, a...",17,19,21,22,PERSON,...,-1.068050,0.313445,-1.586685,2.492488,-0.565019,0.358121,-2.069442,0,no_relation,The terms of the notes are the same as the re...
3066,446_Kairos_7-8_2016_6-7_7-8_6-7,2021,,ORG-DATE,"[In, connection, with, the, Company, 's, 2016,...",7,8,6,7,ORG,...,0.063080,0.768268,-1.781236,0.236747,-0.160180,0.288439,-1.996445,0,no_relation,In connection with the Company 's DATE 2016 O...
3067,1069_Manitowoc_44-45_Potain_53-54_44-45_53-54,2021,,ORG-ORG,"[Manitowoc, ,, through, its, wholly, -, owned,...",44,45,53,54,ORG,...,1.930038,0.428860,-1.664649,-0.546335,-0.924684,0.779358,-2.595990,0,no_relation,"Manitowoc , through its wholly - owned subsid..."


In [None]:
from sklearn.metrics import classification_report,confusion_matrix
print("REPORT:")
print(classification_report(test_df["label"],test_df["prediction"]))
print("")

REPORT:
              precision    recall  f1-score   support

           0       0.80      0.74      0.77      1953
           1       0.67      0.90      0.77        96
           2       0.88      0.85      0.86       605
           3       0.11      0.07      0.09        95
           4       0.53      0.96      0.68       374
           5       0.00      0.00      0.00         8
           6       0.00      0.00      0.00        12
           7       0.00      0.00      0.00        31
           8       0.78      0.24      0.37        29
           9       0.00      0.00      0.00        12
          10       0.00      0.00      0.00        24
          11       0.00      0.00      0.00         7
          12       0.00      0.00      0.00        17
          13       0.00      0.00      0.00         5
          14       0.00      0.00      0.00         4
          15       0.43      0.66      0.52        83
          16       0.00      0.00      0.00        61
          17       

  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


In [None]:
confusion_matrix = pd.crosstab(test_df["label"], test_df["prediction"])
print(confusion_matrix)

prediction    0   1    2   3    4   5   6   7   8   9   10  11  12  15  16  \
label                                                                        
0           1446  24   34  30  138   0   8  10   2   0  22   0   1  52  59   
1              9  87    0   0    0   0   0   0   0   0   0   0   0   0   0   
2             85   0  491   0    0   0   0   0  13   0   0   0  15   0   0   
3              3   0    0  11   73   0   0   0   0   0   0   0   0   0   0   
4              1   0    0  15  355   0   0   0   0   0   0   0   0   0   0   
5              4   0    0   0    3   1   0   0   0   0   0   0   0   0   0   
6              4   0    0   0    0   0   2   0   0   0   0   0   0   1   5   
7              3   0    0   0    0   0   0  28   0   0   0   0   0   0   0   
8              0   0    6   0    0   0   0   0  23   0   0   0   0   0   0   
9              1   0    0   0    0   0   0   0   0   9   0   0   0   0   0   
10            11   0    0   0    0   0   0   0   0   0  13   0  

In [None]:
test_df.to_excel("/content/drive/MyDrive/..")

In [None]:
results_pair = []
for i in range(0,len(test_df)-1):
  if test_df["label"][i] == liste[i]:
    results_pair.append(1)
  else:
    results_pair.append(0)

In [None]:
results_pair

In [None]:
statistics.mean(results_pair)

0.6264455087022545

In [None]:
len(liste)

8562

In [None]:
from transformers import TrainingArguments

training_args = TrainingArguments("test_trainer", evaluation_strategy="epoch")

PyTorch: setting up devices
The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).


In [None]:
model.save_pretrained("/content/drive/MyDrive/..")

Configuration saved in /content/drive/MyDrive/seb_stef/models/roberta_large_hierarchy/config.json
Model weights saved in /content/drive/MyDrive/seb_stef/models/roberta_large_hierarchy/pytorch_model.bin


In [None]:
tokenizer.save_pretrained("/content/drive/MyDrive//..")

tokenizer config file saved in /content/drive/MyDrive/seb_stef/models/roberta_large_hierarchy/tokenizer_config.json
Special tokens file saved in /content/drive/MyDrive/seb_stef/models/roberta_large_hierarchy/special_tokens_map.json


('/content/drive/MyDrive/seb_stef/models/roberta_large_hierarchy/tokenizer_config.json',
 '/content/drive/MyDrive/seb_stef/models/roberta_large_hierarchy/special_tokens_map.json',
 '/content/drive/MyDrive/seb_stef/models/roberta_large_hierarchy/vocab.json',
 '/content/drive/MyDrive/seb_stef/models/roberta_large_hierarchy/merges.txt',
 '/content/drive/MyDrive/seb_stef/models/roberta_large_hierarchy/added_tokens.json',
 '/content/drive/MyDrive/seb_stef/models/roberta_large_hierarchy/tokenizer.json')

In [None]:
tokenizer = AutoTokenizer.from_pretrained("/content/drive/MyDrive/..")

loading file vocab.json
loading file merges.txt
loading file tokenizer.json
loading file added_tokens.json
loading file special_tokens_map.json
loading file tokenizer_config.json


In [None]:
model = AutoModelForSequenceClassification.from_pretrained("/content/drive/MyDrive/..", num_labels=4)

loading configuration file /content/drive/MyDrive/seb_stef/models/RoBERTa_large_best_fit/config.json
Model config RobertaConfig {
  "_name_or_path": "/content/drive/MyDrive/seb_stef/models/RoBERTa_large_best_fit/",
  "architectures": [
    "RobertaForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "classifier_dropout": null,
  "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "id2label": {
    "0": "clan",
    "1": "adhocracy",
    "2": "market",
    "3": "hierarchy"
  },
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "label2id": {
    "adhocracy": 1,
    "clan": 0,
    "hierarchy": 3,
    "market": 2
  },
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
  "problem_type": "single_label_classification",
  "torch_dtype": "float32",