# Getting started with DeepMatcher

Note: you can run **[this notebook live in Google Colab](https://colab.research.google.com/github/anhaidgroup/deepmatcher/blob/master/examples/getting_started.ipynb)** and use free GPUs provided by Google.

This tutorial describes how to perform entity matching using deep neural networks. Specifically, we will see how to match pairs of tuples (also called data records or table rows) to determine whether they refer to the same real-world entity. To do so, we need labeled examples as input, i.e., tuple pairs that have been annotated as matches or non-matches. These examples are used to train our neural network with supervised learning. At the end of this tutorial, you will have a trained neural network that you can apply to unlabeled tuple pairs to make predictions.

In [1]:
'''from google.colab import drive
drive.mount('/content/drive')'''

As an overview, here are the steps to use `deepmatcher` that we will go through in this tutorial (a setup step, numbered 0, followed by four main steps):

<ol start="0">
  <li>Setup</li>
  <li>Process labeled data</li>
  <li>Define neural network model</li>
  <li>Train model</li>
  <li>Apply model to new data</li>
</ol>

Let's begin!

## Step 0. Setup

If you are running this notebook inside Colab, you will first need to install necessary packages by running the code below:

In [3]:
try:
    import deepmatcher
except ImportError:
    !pip install -qqq deepmatcher

In [4]:
# Work around the legacy error
!pip install torchtext==0.10.0

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


Now let's import `deepmatcher` which will do all the heavy lifting to build and train neural network models for entity matching. 

In [5]:
import pandas as pd
import deepmatcher as dm

We recommend having a GPU available for the training in Step 3. If a GPU is not available, we will use all available CPU cores. You can run the following command to determine whether a GPU is available and will be used for training:

In [6]:
import torch
torch.cuda.is_available()

True

### Download sample data for entity matching

Now let's get some sample data to play with in this tutorial. We will need three sets of labeled data and one set of unlabeled data:

1. **Training Data:** This is used for training our neural network model.
2. **Validation Data:** This is used for determining the configuration (i.e., hyperparameters) of our model in such a way that the model does not overfit to the training set.
3. **Test Data:** This is used to estimate the performance of our trained model on unlabeled data.
4. **Unlabeled Data:** The trained model is applied on this data to obtain predictions, which can then be used for downstream tasks in practical application scenarios.
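
If you start from a single labeled file instead of pre-split data, the three labeled sets can be produced with a random split. The following is a minimal sketch using only the standard library; the function name `split_labeled_pairs` and the 3:1:1 ratio are assumptions made for illustration, not something prescribed by `deepmatcher`.

```python
import random


def split_labeled_pairs(rows, ratios=(3, 1, 1), seed=0):
    """Shuffle labeled rows and split them into train/validation/test lists."""
    rows = list(rows)
    # A fixed seed keeps the split reproducible across runs.
    random.Random(seed).shuffle(rows)
    total = sum(ratios)
    n_train = len(rows) * ratios[0] // total
    n_valid = len(rows) * ratios[1] // total
    train = rows[:n_train]
    valid = rows[n_train:n_train + n_valid]
    # The test set receives whatever is left over.
    test = rows[n_train + n_valid:]
    return train, valid, test


# Hypothetical labeled pairs, for illustration only.
pairs = [{"id": i, "label": i % 2} for i in range(10)]
train_rows, valid_rows, test_rows = split_labeled_pairs(pairs)
print(len(train_rows), len(valid_rows), len(test_rows))
```

Every row ends up in exactly one of the three sets, so the test set stays disjoint from the data used for training and model selection.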

We download these four data sets to the `sample_data` directory:

In [7]:
#!mkdir -p /content/drive/MyDrive/IC/unirTeste
#!wget -qnc -P /content/drive/MyDrive/IC/unirTeste/ http://pages.cs.wisc.edu/~anhai/data1/deepmatcher_data/Structured.zip

In [8]:
#!unzip -q /content/drive/MyDrive/IC/unirTeste/Structured.zip -d /content/drive/MyDrive/IC/unirTeste

In [9]:
#!ls /content/drive/MyDrive/IC/unirTeste/Structured/DBLP-GoogleScholar

In [10]:
'''
def merge_data(labeled, table_a, table_b, output):
  merged_csv = pd.read_csv(labeled).rename(columns={'ltable_id': 'left_id', 'rtable_id': 'right_id'})
  table_a_csv = pd.read_csv(table_a)
  table_a_csv = table_a_csv.rename(columns={col: 'left_' + col for col in table_a_csv.columns})
  table_b_csv = pd.read_csv(table_b)
  table_b_csv = table_b_csv.rename(columns={col: 'right_' + col for col in table_b_csv.columns})
  merged_csv = pd.merge(merged_csv, table_a_csv, on='left_id')
  merged_csv = pd.merge(merged_csv, table_b_csv, on='right_id')
  merged_csv['id'] = merged_csv[['left_id', 'right_id']].apply(lambda row: '_'.join([str(c) for c in row]), axis=1)
  del merged_csv['left_id']
  del merged_csv['right_id']
  merged_csv.to_csv(output, index=False)
'''

In [11]:
'''
merge_data(
    '/content/drive/MyDrive/IC/unirTeste/Structured/DBLP-GoogleScholar/train.csv', 
    '/content/drive/MyDrive/IC/unirTeste/Structured/DBLP-GoogleScholar/tableA.csv', 
    '/content/drive/MyDrive/IC/unirTeste/Structured/DBLP-GoogleScholar/tableB.csv', 
    '/content/drive/MyDrive/IC/unirTeste/Structured/DBLP-GoogleScholar/joined_train.csv')
merge_data(
    '/content/drive/MyDrive/IC/unirTeste/Structured/DBLP-GoogleScholar/valid.csv', 
    '/content/drive/MyDrive/IC/unirTeste/Structured/DBLP-GoogleScholar/tableA.csv', 
    '/content/drive/MyDrive/IC/unirTeste/Structured/DBLP-GoogleScholar/tableB.csv', 
    '/content/drive/MyDrive/IC/unirTeste/Structured/DBLP-GoogleScholar/joined_valid.csv')
merge_data(
    '/content/drive/MyDrive/IC/unirTeste/Structured/DBLP-GoogleScholar/test.csv', 
    '/content/drive/MyDrive/IC/unirTeste/Structured/DBLP-GoogleScholar/tableA.csv', 
    '/content/drive/MyDrive/IC/unirTeste/Structured/DBLP-GoogleScholar/tableB.csv', 
    '/content/drive/MyDrive/IC/unirTeste/Structured/DBLP-GoogleScholar/joined_test.csv')
'''

To get an idea of what our data looks like, let's take a peek at the training dataset:

In [12]:
#pd.read_csv('/content/drive/MyDrive/IC/ICReteste/Deepmatcher/TesteErros/TesteErrosLabel/Erros6_10/Datasets/Structured/DBLP-GoogleScholar/joined_train.csv').head()

## Step 1. Process labeled data

Before we can use our data for training, `deepmatcher` needs to first load and process it in order to prepare it for neural network training. Currently `deepmatcher` only supports processing CSV files. Each CSV file is assumed to have the following kinds of columns:

* **"Left" attributes (required):** Our goal is to match tuple pairs. "Left" attributes are columns that correspond to the "left" tuple or the first tuple in the tuple pair. These column names are expected to be prefixed with "left_" by default.
* **"Right" attributes (required):** "Right" attributes are columns that correspond to the "right" tuple or the second tuple in the tuple pair. These column names are expected to be prefixed with "right_" by default.
* **Label column (required for train, validation, test):** Column containing the labels (match or non-match) for each tuple pair. Expected to be named "label" by default.
* **ID column (required):** Column containing a unique ID for each tuple pair. This is for evaluation convenience. Expected to be named "id" by default.
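
Before handing a CSV file to `deepmatcher`, it can help to check its header against the column kinds listed above. The sketch below is only an illustration: the helper `check_columns` is hypothetical (not part of `deepmatcher`), and it additionally assumes that every "left" attribute should have a "right" counterpart with the same name.

```python
import csv
import io


def check_columns(header, left_prefix="left_", right_prefix="right_",
                  label_col="label", id_col="id", require_label=True):
    """Return a list of problems found in a CSV header (empty if it looks fine)."""
    problems = []
    left = {c[len(left_prefix):] for c in header if c.startswith(left_prefix)}
    right = {c[len(right_prefix):] for c in header if c.startswith(right_prefix)}
    if not left:
        problems.append("no left attributes")
    if not right:
        problems.append("no right attributes")
    # Assumption: each left attribute has a matching right attribute.
    if left != right:
        problems.append("left and right attributes differ")
    if id_col not in header:
        problems.append("missing id column")
    if require_label and label_col not in header:
        problems.append("missing label column")
    return problems


# A tiny in-memory CSV standing in for a real file.
sample = ("id,label,left_title,left_year,right_title,right_year\n"
          "0_1,1,deep matching,2018,deep matching,2018\n")
header = next(csv.reader(io.StringIO(sample)))
print(check_columns(header))                               # []
print(check_columns(["id", "left_title", "right_title"]))  # ['missing label column']
```

For unlabeled data, pass `require_label=False`, since the label column is only required for the train, validation and test files.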

More details on what data processing involves and ways to customize it are described in **[this notebook](https://nbviewer.jupyter.org/github/anhaidgroup/deepmatcher/blob/master/examples/data_processing.ipynb)**. 

### Processing labeled data
In order to process our train, validation and test CSV files we call `dm.data.process` in the following code snippet which will load and process the CSV files and return three processed `MatchingDataset` objects respectively. These dataset objects will later be used for training and evaluation. The basic parameters to `dm.data.process` are as follows:

* **path (required):** The path where all data is stored. This includes train, validation and test. `deepmatcher` may create new files in this directory to store information about these data sets. This allows subsequent `dm.data.process` calls to be much faster.
* **train (required):** File name of training data in `path` directory.
* **validation (required):** File name of validation data in `path` directory.
* **test (optional):** File name of test data in `path` directory.
* **ignore_columns (optional):** Any columns in the CSV files that you may want to ignore for the purposes of training. These should be included here.

Note that the train, validation and test CSVs must all share the same schema, i.e., they should have the same columns. Processing data involves several steps and can take several minutes to complete, especially if this is the first time you are running the `deepmatcher` package.
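
The shared-schema requirement can be checked up front by comparing the headers of the three files. This is a minimal sketch under our own assumptions: the helper `schema_differences` is hypothetical, it compares column names as sets (ignoring column order), and the headers are given inline rather than read from disk.

```python
def schema_differences(headers):
    """Compare every header to the first one.

    `headers` maps a data set name to its list of column names.
    Returns {name: (missing_columns, extra_columns)} for each mismatch.
    """
    names = list(headers)
    reference = set(headers[names[0]])
    diffs = {}
    for name in names[1:]:
        cols = set(headers[name])
        missing = sorted(reference - cols)
        extra = sorted(cols - reference)
        if missing or extra:
            diffs[name] = (missing, extra)
    return diffs


# Hypothetical headers: the test file is missing its label column.
headers = {
    "train": ["id", "label", "left_title", "right_title"],
    "validation": ["id", "label", "left_title", "right_title"],
    "test": ["id", "left_title", "right_title"],
}
print(schema_differences(headers))  # {'test': (['label'], [])}
```

An empty result means all files have the same set of columns as the training file.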

NOTE: If you are running this in Colab, you may get a message saying 'Memory usage is close to the limit.' You can safely ignore it for now. We are working on reducing the memory footprint.

In [11]:
!wget https://dl.fbaipublicfiles.com/fasttext/vectors-wiki/wiki.en.zip --directory-prefix=/root/.vector_cache
!unzip /root/.vector_cache/wiki.en.zip -d /root/.vector_cache/
!rm /root/.vector_cache/wiki.en.vec

--2022-08-09 20:56:29--  https://dl.fbaipublicfiles.com/fasttext/vectors-wiki/wiki.en.zip
Resolving dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)... 104.22.74.142, 172.67.9.4, 104.22.75.142, ...
Connecting to dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)|104.22.74.142|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10356881291 (9.6G) [application/zip]
Saving to: ‘/root/.vector_cache/wiki.en.zip’


2022-08-09 21:00:05 (42.0 MB/s) - Read error at byte 9466635303/10356881291 (Connection reset by peer). Retrying.

--2022-08-09 21:00:06--  (try: 2)  https://dl.fbaipublicfiles.com/fasttext/vectors-wiki/wiki.en.zip
Connecting to dl.fbaipublicfiles.com (dl.fbaipublicfiles.com)|104.22.74.142|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 10356881291 (9.6G), 890245988 (849M) remaining [application/zip]
Saving to: ‘/root/.vector_cache/wiki.en.zip’

wiki.en.zip         100%[++++++++++++++++++=>]   9.65G  42.7MB/s    in 22s     


In [12]:
'''train, validation, test = dm.data.process(
    path='/content/drive/MyDrive/IC/ICReteste/Deepmatcher/TesteErros/TesteErrosLabel/Erros6_10/Datasets/Dirty/DBLP-GoogleScholar/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test_error.csv')'''

#### Peeking at processed data
Let's take a look at what the processed data looks like. To do this, we get the raw `pandas` table corresponding to the processed training dataset object.

In [13]:
'''train_table = train.get_raw_table()
train_table.head()'''


The processed attribute values have been tokenized and lowercased, so they may not look exactly the same as the input training data. These modifications help the neural network generalize better, i.e., perform better on data it was not trained on.
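
To make "tokenized and lowercased" concrete, here is a rough sketch of that kind of preprocessing. It is an assumption for illustration only: it splits on whitespace, whereas the tokenizer `deepmatcher` actually uses may treat punctuation and other characters differently.

```python
def preprocess(value):
    """Lowercase an attribute value and split it into whitespace-separated tokens."""
    return value.lower().split()


print(preprocess("Efficient Query Processing in  DBLP"))
# ['efficient', 'query', 'processing', 'in', 'dblp']
```

After this step, "DBLP" and "dblp" map to the same token, which is one reason the processed values differ from the raw input.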

## Step 2. Define neural network model

In this step you tell `deepmatcher` what kind of neural network you would like to use for entity matching. The easiest way to do this is to use one of the several kinds of neural network models that come built in with `deepmatcher`. To use a built-in network, construct a `dm.MatchingModel` as follows:

`model = dm.MatchingModel(attr_summarizer='<TYPE>')`

where `<TYPE>` is one of `sif`, `rnn`, `attention` or `hybrid`. If you are not familiar with what these mean, we strongly recommend taking a look at either **[slides from our talk on deepmatcher](http://bit.do/deepmatcher-talk)** for a high-level overview, or **[our paper](http://pages.cs.wisc.edu/~anhai/papers1/deepmatcher-sigmod18.pdf)** for a more detailed explanation. Here we briefly describe the intuition behind these four model types:
* **sif:** This model considers the **words** present in each attribute value pair to determine a match or non-match. It does not take word order into account.
* **rnn:** This model considers the **sequences of words** present in each attribute value pair to determine a match or non-match.
* **attention:** This model considers the **alignment of words** present in each attribute value pair to determine a match or non-match. It does not take word order into account.
* **hybrid:** This model considers the **alignment of sequences of words** present in each attribute value pair to determine a match or non-match. This is the default.
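
To illustrate what "does not take word order into account" means in the descriptions above, the toy sketch below compares two attribute values as unordered bags of words and as ordered sequences. It is not how `deepmatcher` computes its attribute summaries; the helper name `bag_of_words` and the example values are ours.

```python
from collections import Counter


def bag_of_words(tokens):
    """Summarize tokens by their counts, discarding their order."""
    return Counter(tokens)


a = "new york times".split()
b = "times new york".split()

# Order-insensitive view: the two values look identical.
print(bag_of_words(a) == bag_of_words(b))  # True
# Order-sensitive view: the two sequences differ.
print(a == b)                              # False
```

A model that only sees the first kind of summary cannot tell these two values apart, while a sequence-aware model can.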

`deepmatcher` is highly customizable and allows you to tune almost every aspect of the neural network model for your application scenario. **[This tutorial](https://nbviewer.jupyter.org/github/anhaidgroup/deepmatcher/blob/master/examples/matching_models.ipynb)** discusses the structure of `MatchingModel`s and how they can be customized.

For this tutorial, let's create an `attention` model for entity matching:

In [14]:
'''model = dm.MatchingModel(attr_summarizer='attention')'''

## Step 3. Train model

Next, we train the defined neural network model using the processed training and validation data. To do so, we call the `run_train` method which takes the following basic parameters:

* **train:** The processed training dataset object (of type `MatchingDataset`).
* **validation:** The processed validation dataset object (of type `MatchingDataset`).
* **epochs:** Number of times to go over the entire `train` data for training the model.
* **batch_size:** Number of labeled examples (tuple pairs) to use for each training step. This value may be increased if you have a lot of training data and would like to speed up training. The optimal value is dataset dependent.
* **best_save_path:** Path to save the best model.
* **pos_neg_ratio:** The ratio of the weight of positive examples (matches) to weight of negative examples (non-matches). This value should be increased if you have fewer matches than non-matches in your data. The optimal value is dataset dependent.
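
Two quick calculations can help when choosing `batch_size` and `pos_neg_ratio`. The sketch below is a heuristic under our own assumptions: the helper names are hypothetical, and using the observed number of non-matches per match as a starting point for `pos_neg_ratio` is a rule of thumb, not a recommendation from `deepmatcher`.

```python
import math
from collections import Counter


def steps_per_epoch(num_examples, batch_size):
    """Number of training steps needed to see every example once."""
    return math.ceil(num_examples / batch_size)


def negatives_per_positive(labels):
    """How many non-matches (label 0) there are for each match (label 1)."""
    counts = Counter(labels)
    if counts[1] == 0:
        raise ValueError("no positive examples in the labels")
    return counts[0] / counts[1]


# Hypothetical label column: 20 matches and 60 non-matches.
labels = [1] * 20 + [0] * 60
print(steps_per_epoch(len(labels), 32))  # 3
print(negatives_per_positive(labels))    # 3.0
```

Since the optimal values are dataset dependent, treat these numbers as a first guess and compare validation F1 across a few settings.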

Many other aspects of the training algorithm can be customized. For details, please refer to the API documentation for `run_train`.

In [15]:
'''model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='attention_model.pth',
    pos_neg_ratio=3)'''

## New Tests

In [11]:
!git clone https://github.com/pauloh48/IC.git

fatal: destination path 'IC' already exists and is not an empty directory.


### RNN

#### STRUCTURED

##### Amazon-Google

In [18]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/Amazon-Google/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')


Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Structured/Amazon-Google/joined_train.csv"
0% [############################# ] 100% | ETA: 00:00:00
Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Structured/Amazon-Google/joined_valid.csv"
0% [############################# ] 100% | ETA: 00:00:00
Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Structured/Amazon-Google/joined_test.csv"

Building vocabulary
0% [#######] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00

Computing principal components
0% [#######] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01


In [19]:
model = dm.MatchingModel(attr_summarizer='rnn')

In [20]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='rnn_model.pth',
    pos_neg_ratio=3)



* Number of trainable parameters: 1762802
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 1 || Run Time:    5.9 | Load Time:    4.9 || F1:  17.54 | Prec:  37.56 | Rec:  11.44 || Ex/s: 636.37

===>  EVAL Epoch 1


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 1 || Run Time:    0.8 | Load Time:    1.5 || F1:  36.05 | Prec:  30.32 | Rec:  44.44 || Ex/s: 971.00

* Best F1: tensor(36.0485, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 2 || Run Time:    5.5 | Load Time:    4.9 || F1:  54.39 | Prec:  47.88 | Rec:  62.95 || Ex/s: 663.22

===>  EVAL Epoch 2


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 2 || Run Time:    0.8 | Load Time:    1.5 || F1:  44.17 | Prec:  37.24 | Rec:  54.27 || Ex/s: 980.92

* Best F1: tensor(44.1739, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 3 || Run Time:    5.5 | Load Time:    4.9 || F1:  71.19 | Prec:  61.79 | Rec:  83.98 || Ex/s: 664.22

===>  EVAL Epoch 3


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 3 || Run Time:    0.8 | Load Time:    1.6 || F1:  44.58 | Prec:  43.50 | Rec:  45.73 || Ex/s: 955.19

* Best F1: tensor(44.5833, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 4 || Run Time:    5.9 | Load Time:    5.1 || F1:  82.44 | Prec:  75.36 | Rec:  90.99 || Ex/s: 626.15

===>  EVAL Epoch 4


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 4 || Run Time:    0.8 | Load Time:    1.5 || F1:  41.96 | Prec:  46.15 | Rec:  38.46 || Ex/s: 968.29

---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 5 || Run Time:    5.5 | Load Time:    4.8 || F1:  88.44 | Prec:  83.40 | Rec:  94.13 || Ex/s: 669.54

===>  EVAL Epoch 5


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 5 || Run Time:    0.8 | Load Time:    1.5 || F1:  44.49 | Prec:  45.91 | Rec:  43.16 || Ex/s: 966.96

---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 6 || Run Time:    5.5 | Load Time:    4.8 || F1:  92.54 | Prec:  88.71 | Rec:  96.71 || Ex/s: 667.36

===>  EVAL Epoch 6


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 6 || Run Time:    0.8 | Load Time:    1.5 || F1:  44.10 | Prec:  45.09 | Rec:  43.16 || Ex/s: 970.67

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 7 || Run Time:    5.5 | Load Time:    4.8 || F1:  95.74 | Prec:  93.58 | Rec:  98.00 || Ex/s: 666.17

===>  EVAL Epoch 7


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 7 || Run Time:    0.8 | Load Time:    1.5 || F1:  39.81 | Prec:  44.04 | Rec:  36.32 || Ex/s: 976.62

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 8 || Run Time:    5.5 | Load Time:    4.8 || F1:  97.32 | Prec:  96.09 | Rec:  98.57 || Ex/s: 663.67

===>  EVAL Epoch 8


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 8 || Run Time:    0.8 | Load Time:    1.5 || F1:  40.28 | Prec:  45.21 | Rec:  36.32 || Ex/s: 967.09

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 9 || Run Time:    5.5 | Load Time:    4.9 || F1:  98.72 | Prec:  98.16 | Rec:  99.28 || Ex/s: 660.74

===>  EVAL Epoch 9


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 9 || Run Time:    0.8 | Load Time:    1.5 || F1:  39.60 | Prec:  47.06 | Rec:  34.19 || Ex/s: 964.52

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 10 || Run Time:    5.5 | Load Time:    4.9 || F1:  98.65 | Prec:  98.02 | Rec:  99.28 || Ex/s: 662.16

===>  EVAL Epoch 10


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 10 || Run Time:    0.8 | Load Time:    1.5 || F1:  40.49 | Prec:  47.16 | Rec:  35.47 || Ex/s: 974.27

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 11 || Run Time:    5.5 | Load Time:    4.9 || F1:  99.00 | Prec:  98.58 | Rec:  99.43 || Ex/s: 665.55

===>  EVAL Epoch 11


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 11 || Run Time:    0.8 | Load Time:    1.5 || F1:  41.46 | Prec:  48.30 | Rec:  36.32 || Ex/s: 970.38

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 12 || Run Time:    5.8 | Load Time:    5.1 || F1:  99.22 | Prec:  98.86 | Rec:  99.57 || Ex/s: 632.15

===>  EVAL Epoch 12


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 12 || Run Time:    0.8 | Load Time:    1.5 || F1:  40.69 | Prec:  47.70 | Rec:  35.47 || Ex/s: 976.36

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 13 || Run Time:    5.5 | Load Time:    4.9 || F1:  99.22 | Prec:  98.86 | Rec:  99.57 || Ex/s: 666.40

===>  EVAL Epoch 13


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 13 || Run Time:    0.9 | Load Time:    1.5 || F1:  40.39 | Prec:  47.67 | Rec:  35.04 || Ex/s: 970.71

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 14 || Run Time:    5.5 | Load Time:    4.8 || F1:  99.29 | Prec:  99.00 | Rec:  99.57 || Ex/s: 667.24

===>  EVAL Epoch 14


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 14 || Run Time:    0.8 | Load Time:    1.5 || F1:  40.00 | Prec:  47.37 | Rec:  34.62 || Ex/s: 968.84

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 15 || Run Time:    5.5 | Load Time:    4.9 || F1:  99.36 | Prec:  99.01 | Rec:  99.71 || Ex/s: 666.13

===>  EVAL Epoch 15


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 15 || Run Time:    0.8 | Load Time:    1.5 || F1:  39.80 | Prec:  47.62 | Rec:  34.19 || Ex/s: 962.15

---------------------

Loading best model...
Training done.


tensor(44.5833, device='cuda:0')

In [21]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 3


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 3 || Run Time:    0.8 | Load Time:    1.5 || F1:  41.08 | Prec:  39.92 | Rec:  42.31 || Ex/s: 961.33



tensor(41.0788, device='cuda:0')
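
In the Amazon-Google run above, training F1 keeps rising (from 17.54 in epoch 1 to 99.36 in epoch 15) while validation F1 peaks at 44.58 in epoch 3, and the test evaluation is accordingly labeled as Epoch 3. A small helper can pull these numbers out of the log text to make such gaps easy to spot. The sketch below is an illustration only: the helper name `parse_f1` is hypothetical, and the regular expressions are written against the log lines shown in this notebook, so they may need adjusting for other versions of the output.

```python
import re

# Excerpt of the log above (epochs 1 to 4, progress bars omitted).
LOG = """\
===>  TRAIN Epoch 1
Finished Epoch 1 || Run Time:    5.9 | Load Time:    4.9 || F1:  17.54 | Prec:  37.56 | Rec:  11.44 || Ex/s: 636.37
===>  EVAL Epoch 1
Finished Epoch 1 || Run Time:    0.8 | Load Time:    1.5 || F1:  36.05 | Prec:  30.32 | Rec:  44.44 || Ex/s: 971.00
===>  TRAIN Epoch 2
Finished Epoch 2 || Run Time:    5.5 | Load Time:    4.9 || F1:  54.39 | Prec:  47.88 | Rec:  62.95 || Ex/s: 663.22
===>  EVAL Epoch 2
Finished Epoch 2 || Run Time:    0.8 | Load Time:    1.5 || F1:  44.17 | Prec:  37.24 | Rec:  54.27 || Ex/s: 980.92
===>  TRAIN Epoch 3
Finished Epoch 3 || Run Time:    5.5 | Load Time:    4.9 || F1:  71.19 | Prec:  61.79 | Rec:  83.98 || Ex/s: 664.22
===>  EVAL Epoch 3
Finished Epoch 3 || Run Time:    0.8 | Load Time:    1.6 || F1:  44.58 | Prec:  43.50 | Rec:  45.73 || Ex/s: 955.19
===>  TRAIN Epoch 4
Finished Epoch 4 || Run Time:    5.9 | Load Time:    5.1 || F1:  82.44 | Prec:  75.36 | Rec:  90.99 || Ex/s: 626.15
===>  EVAL Epoch 4
Finished Epoch 4 || Run Time:    0.8 | Load Time:    1.5 || F1:  41.96 | Prec:  46.15 | Rec:  38.46 || Ex/s: 968.29
"""

PHASE_RE = re.compile(r"===>\s+(TRAIN|EVAL) Epoch (\d+)")
F1_RE = re.compile(r"Finished Epoch (\d+) \|\|.*?F1:\s*([\d.]+)")


def parse_f1(log_text):
    """Return {"TRAIN": {epoch: f1}, "EVAL": {epoch: f1}} parsed from a log."""
    scores = {"TRAIN": {}, "EVAL": {}}
    phase = None
    for line in log_text.splitlines():
        phase_match = PHASE_RE.search(line)
        if phase_match:
            # Remember whether the next "Finished" line is train or eval.
            phase = phase_match.group(1)
            continue
        f1_match = F1_RE.search(line)
        if f1_match and phase:
            scores[phase][int(f1_match.group(1))] = float(f1_match.group(2))
    return scores


scores = parse_f1(LOG)
best_epoch = max(scores["EVAL"], key=scores["EVAL"].get)
print(best_epoch, scores["EVAL"][best_epoch], scores["TRAIN"][best_epoch])
```

For this excerpt the best validation epoch is 3, where training F1 (71.19) is already well above validation F1 (44.58); the gap widens in later epochs, which suggests overfitting to the training set.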

##### Beer

In [22]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/Beer/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')


Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Structured/Beer/joined_train.csv"
0% [############################# ] 100% | ETA: 00:00:00
Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Structured/Beer/joined_valid.csv"
0% [############################# ] 100% | ETA: 00:00:00
Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Structured/Beer/joined_test.csv"
0% [############################# ] 100% | ETA: 00:00:00
Building vocabulary
0% [#] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00

Computing principal components
0% [#] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


In [23]:
model = dm.MatchingModel(attr_summarizer='rnn')

In [24]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='rnn_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 2169602
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 1 || Run Time:    0.3 | Load Time:    0.3 || F1:   0.00 | Prec:   0.00 | Rec:   0.00 || Ex/s: 418.57

===>  EVAL Epoch 1
Finished Epoch 1 || Run Time:    0.0 | Load Time:    0.1 || F1:   0.00 | Prec:   0.00 | Rec:   0.00 || Ex/s: 568.93

* Best F1: tensor(0., device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 2 || Run Time:    0.3 | Load Time:    0.3 || F1:  26.09 | Prec: 100.00 | Rec:  15.00 || Ex/s: 434.87

===>  EVAL Epoch 2
Finished Epoch 2 || Run Time:    0.1 | Load Time:    0.1 || F1:  13.33 | Prec: 100.00 | Rec:   7.14 || Ex/s: 567.43

* Best F1: tensor(13.3333, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 3 || Run Time:    0.3 | Load Time:    0.3 || F1:  86.75 | Prec:  83.72 | Rec:  90.00 || Ex/s: 448.04

===>  EVAL Epoch 3
Finished Epoch 3 || Run Time:    0.0 | Load Time:    0.1 || F1:  53.85 | Prec:  58.33 | Rec:  50.00 || Ex/s: 608.23

* Best F1: tensor(53.8462, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 4 || Run Time:    0.3 | Load Time:    0.3 || F1:  89.66 | Prec:  82.98 | Rec:  97.50 || Ex/s: 443.14

===>  EVAL Epoch 4
Finished Epoch 4 || Run Time:    0.0 | Load Time:    0.1 || F1:  58.06 | Prec:  52.94 | Rec:  64.29 || Ex/s: 614.56

* Best F1: tensor(58.0645, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 5 || Run Time:    0.3 | Load Time:    0.3 || F1:  92.86 | Prec:  88.64 | Rec:  97.50 || Ex/s: 437.76

===>  EVAL Epoch 5
Finished Epoch 5 || Run Time:    0.0 | Load Time:    0.1 || F1:  56.25 | Prec:  50.00 | Rec:  64.29 || Ex/s: 602.55

---------------------

===>  TRAIN Epoch 6


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 6 || Run Time:    0.3 | Load Time:    0.3 || F1:  96.39 | Prec:  93.02 | Rec: 100.00 || Ex/s: 443.20

===>  EVAL Epoch 6
Finished Epoch 6 || Run Time:    0.0 | Load Time:    0.1 || F1:  57.14 | Prec:  47.62 | Rec:  71.43 || Ex/s: 620.97

---------------------

===>  TRAIN Epoch 7


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 7 || Run Time:    0.3 | Load Time:    0.3 || F1:  97.56 | Prec:  95.24 | Rec: 100.00 || Ex/s: 437.27

===>  EVAL Epoch 7
Finished Epoch 7 || Run Time:    0.0 | Load Time:    0.1 || F1:  57.89 | Prec:  45.83 | Rec:  78.57 || Ex/s: 628.95

---------------------

===>  TRAIN Epoch 8


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 8 || Run Time:    0.3 | Load Time:    0.3 || F1:  98.77 | Prec:  97.56 | Rec: 100.00 || Ex/s: 443.42

===>  EVAL Epoch 8
Finished Epoch 8 || Run Time:    0.0 | Load Time:    0.1 || F1:  57.14 | Prec:  42.86 | Rec:  85.71 || Ex/s: 617.32

---------------------

===>  TRAIN Epoch 9


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 9 || Run Time:    0.3 | Load Time:    0.3 || F1: 100.00 | Prec: 100.00 | Rec: 100.00 || Ex/s: 441.89

===>  EVAL Epoch 9
Finished Epoch 9 || Run Time:    0.0 | Load Time:    0.1 || F1:  57.14 | Prec:  42.86 | Rec:  85.71 || Ex/s: 633.16

---------------------

===>  TRAIN Epoch 10


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 10 || Run Time:    0.3 | Load Time:    0.3 || F1: 100.00 | Prec: 100.00 | Rec: 100.00 || Ex/s: 440.01

===>  EVAL Epoch 10
Finished Epoch 10 || Run Time:    0.0 | Load Time:    0.1 || F1:  55.81 | Prec:  41.38 | Rec:  85.71 || Ex/s: 629.49

---------------------

===>  TRAIN Epoch 11


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 11 || Run Time:    0.3 | Load Time:    0.3 || F1: 100.00 | Prec: 100.00 | Rec: 100.00 || Ex/s: 437.69

===>  EVAL Epoch 11
Finished Epoch 11 || Run Time:    0.0 | Load Time:    0.1 || F1:  55.81 | Prec:  41.38 | Rec:  85.71 || Ex/s: 631.55

---------------------

===>  TRAIN Epoch 12


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 12 || Run Time:    0.3 | Load Time:    0.3 || F1: 100.00 | Prec: 100.00 | Rec: 100.00 || Ex/s: 441.68

===>  EVAL Epoch 12
Finished Epoch 12 || Run Time:    0.0 | Load Time:    0.1 || F1:  52.17 | Prec:  37.50 | Rec:  85.71 || Ex/s: 617.40

---------------------

===>  TRAIN Epoch 13


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 13 || Run Time:    0.3 | Load Time:    0.3 || F1: 100.00 | Prec: 100.00 | Rec: 100.00 || Ex/s: 445.34

===>  EVAL Epoch 13
Finished Epoch 13 || Run Time:    0.0 | Load Time:    0.1 || F1:  50.00 | Prec:  35.29 | Rec:  85.71 || Ex/s: 624.40

---------------------

===>  TRAIN Epoch 14


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 14 || Run Time:    0.3 | Load Time:    0.3 || F1: 100.00 | Prec: 100.00 | Rec: 100.00 || Ex/s: 433.05

===>  EVAL Epoch 14
Finished Epoch 14 || Run Time:    0.0 | Load Time:    0.1 || F1:  50.00 | Prec:  35.29 | Rec:  85.71 || Ex/s: 624.35

---------------------

===>  TRAIN Epoch 15


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 15 || Run Time:    0.3 | Load Time:    0.3 || F1: 100.00 | Prec: 100.00 | Rec: 100.00 || Ex/s: 437.04

===>  EVAL Epoch 15
Finished Epoch 15 || Run Time:    0.0 | Load Time:    0.1 || F1:  50.00 | Prec:  35.29 | Rec:  85.71 || Ex/s: 621.71

---------------------

Loading best model...
Training done.


tensor(58.0645, device='cuda:0')

In [25]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 4




Finished Epoch 4 || Run Time:    0.0 | Load Time:    0.1 || F1:  35.29 | Prec:  30.00 | Rec:  42.86 || Ex/s: 610.95



tensor(35.2941, device='cuda:0')
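
F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). As a quick sanity check, the sketch below recomputes F1 from the rounded precision and recall printed for the two test evaluations above. Because the inputs are already rounded to two decimals, a small difference in the last digit is expected. The function name `f1_score` is ours.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        # Matches the logged 0.00 when there are no true positives.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as printed in the test logs above (in percent).
amazon_google_f1 = f1_score(39.92, 42.31)  # logged F1: 41.08
beer_f1 = f1_score(30.00, 42.86)           # logged F1: 35.29
print(amazon_google_f1, beer_f1)
```

Both recomputed values agree with the logged F1 to within 0.01, which is consistent with the log reporting the standard F1 measure.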

##### DBLP-ACM

In [11]:
# Load and preprocess the DBLP-ACM train/validation/test splits
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/DBLP-ACM/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')


Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Structured/DBLP-ACM/joined_train.csv"
0% [############################# ] 100% | ETA: 00:00:00
Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Structured/DBLP-ACM/joined_valid.csv"
0% [############################# ] 100% | ETA: 00:00:00
Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Structured/DBLP-ACM/joined_test.csv"

Building vocabulary
0% [########] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00

Computing principal components
0% [########] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


In [12]:
# Build a matching model that summarizes each attribute with an RNN
model = dm.MatchingModel(attr_summarizer='rnn')

In [13]:
# Train for 15 epochs; whenever validation F1 improves, the model is saved to best_save_path
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='rnn_model.pth',
    pos_neg_ratio=3)



* Number of trainable parameters: 2169602
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:22


Finished Epoch 1 || Run Time:    9.5 | Load Time:   12.9 || F1:  89.87 | Prec:  84.01 | Rec:  96.62 || Ex/s: 330.97

===>  EVAL Epoch 1


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 1 || Run Time:    1.3 | Load Time:    3.9 || F1:  97.37 | Prec:  99.07 | Rec:  95.72 || Ex/s: 476.90

* Best F1: tensor(97.3654, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 2 || Run Time:    9.1 | Load Time:   12.4 || F1:  97.78 | Prec:  96.42 | Rec:  99.17 || Ex/s: 345.02

===>  EVAL Epoch 2


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 2 || Run Time:    1.3 | Load Time:    3.8 || F1:  98.43 | Prec:  98.21 | Rec:  98.65 || Ex/s: 480.42

* Best F1: tensor(98.4270, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 3 || Run Time:    8.9 | Load Time:   12.2 || F1:  98.92 | Prec:  98.22 | Rec:  99.62 || Ex/s: 351.71

===>  EVAL Epoch 3


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 3 || Run Time:    1.3 | Load Time:    3.8 || F1:  98.10 | Prec:  97.34 | Rec:  98.87 || Ex/s: 479.32

---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 4 || Run Time:    8.9 | Load Time:   12.3 || F1:  99.48 | Prec:  99.25 | Rec:  99.70 || Ex/s: 350.48

===>  EVAL Epoch 4


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 4 || Run Time:    1.3 | Load Time:    3.8 || F1:  97.67 | Prec:  96.28 | Rec:  99.10 || Ex/s: 479.47

---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 5 || Run Time:    9.2 | Load Time:   12.6 || F1:  99.81 | Prec:  99.77 | Rec:  99.85 || Ex/s: 341.36

===>  EVAL Epoch 5


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 5 || Run Time:    1.3 | Load Time:    3.8 || F1:  97.89 | Prec:  96.50 | Rec:  99.32 || Ex/s: 478.86

---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 6 || Run Time:    8.8 | Load Time:   12.2 || F1:  99.78 | Prec:  99.63 | Rec:  99.92 || Ex/s: 352.27

===>  EVAL Epoch 6


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 6 || Run Time:    1.3 | Load Time:    3.8 || F1:  97.78 | Prec:  96.29 | Rec:  99.32 || Ex/s: 479.46

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 7 || Run Time:    8.8 | Load Time:   12.3 || F1:  99.85 | Prec:  99.78 | Rec:  99.92 || Ex/s: 351.22

===>  EVAL Epoch 7


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 7 || Run Time:    1.3 | Load Time:    3.9 || F1:  97.57 | Prec:  95.87 | Rec:  99.32 || Ex/s: 475.08

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 8 || Run Time:    8.9 | Load Time:   12.3 || F1:  99.89 | Prec:  99.85 | Rec:  99.92 || Ex/s: 350.00

===>  EVAL Epoch 8


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 8 || Run Time:    1.3 | Load Time:    3.8 || F1:  97.46 | Prec:  95.66 | Rec:  99.32 || Ex/s: 480.94

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 9 || Run Time:    9.1 | Load Time:   12.5 || F1:  99.89 | Prec:  99.85 | Rec:  99.92 || Ex/s: 343.94

===>  EVAL Epoch 9


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 9 || Run Time:    1.3 | Load Time:    3.8 || F1:  97.46 | Prec:  95.66 | Rec:  99.32 || Ex/s: 479.70

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 10 || Run Time:    8.9 | Load Time:   12.3 || F1:  99.89 | Prec:  99.78 | Rec: 100.00 || Ex/s: 350.34

===>  EVAL Epoch 10


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 10 || Run Time:    1.3 | Load Time:    3.9 || F1:  97.67 | Prec:  96.08 | Rec:  99.32 || Ex/s: 477.43

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 11 || Run Time:    8.9 | Load Time:   12.2 || F1:  99.89 | Prec:  99.78 | Rec: 100.00 || Ex/s: 351.37

===>  EVAL Epoch 11


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 11 || Run Time:    1.3 | Load Time:    3.8 || F1:  97.46 | Prec:  95.66 | Rec:  99.32 || Ex/s: 480.64

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 12 || Run Time:    8.9 | Load Time:   12.2 || F1:  99.92 | Prec:  99.85 | Rec: 100.00 || Ex/s: 351.43

===>  EVAL Epoch 12


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 12 || Run Time:    1.3 | Load Time:    3.8 || F1:  97.78 | Prec:  96.29 | Rec:  99.32 || Ex/s: 477.80

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 13 || Run Time:    9.1 | Load Time:   12.5 || F1:  99.92 | Prec:  99.85 | Rec: 100.00 || Ex/s: 343.54

===>  EVAL Epoch 13


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 13 || Run Time:    1.3 | Load Time:    3.9 || F1:  97.78 | Prec:  96.29 | Rec:  99.32 || Ex/s: 474.60

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 14 || Run Time:    8.9 | Load Time:   12.2 || F1:  99.92 | Prec:  99.85 | Rec: 100.00 || Ex/s: 351.10

===>  EVAL Epoch 14


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 14 || Run Time:    1.3 | Load Time:    3.9 || F1:  97.78 | Prec:  96.29 | Rec:  99.32 || Ex/s: 477.13

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 15 || Run Time:    8.9 | Load Time:   12.3 || F1:  99.92 | Prec:  99.85 | Rec: 100.00 || Ex/s: 351.37

===>  EVAL Epoch 15


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 15 || Run Time:    1.3 | Load Time:    3.8 || F1:  97.78 | Prec:  96.29 | Rec:  99.32 || Ex/s: 479.57

---------------------

Loading best model...
Training done.


tensor(98.4270, device='cuda:0')

In [14]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 2


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 2 || Run Time:    1.3 | Load Time:    3.8 || F1:  97.10 | Prec:  96.24 | Rec:  97.97 || Ex/s: 478.70



tensor(97.0982, device='cuda:0')
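
Reading the best epoch out of a long training log by eye is error-prone. The sketch below parses log text in the format shown above and reports the evaluation epoch with the highest F1. It assumes the log has been captured as a string; the names `parse_log` and `best_eval_epoch` are hypothetical helpers, not part of `deepmatcher`, and the sample text is copied from the first three DBLP-ACM epochs.

```python
import re

# "===>  TRAIN Epoch 3" or "===>  EVAL Epoch 3" marks which phase follows.
PHASE_RE = re.compile(r"===>\s+(TRAIN|EVAL)\s+Epoch\s+(\d+)")
# "Finished Epoch 3 || ... || F1: 98.10 | Prec: 97.34 | Rec: 98.87 || ..."
DONE_RE = re.compile(
    r"Finished Epoch\s+(\d+)\s+\|\|.*?F1:\s*([\d.]+)\s*\|"
    r"\s*Prec:\s*([\d.]+)\s*\|\s*Rec:\s*([\d.]+)"
)


def parse_log(text):
    """Return one dict per 'Finished Epoch' line, tagged with its phase."""
    rows, phase = [], None
    for line in text.splitlines():
        header = PHASE_RE.search(line)
        if header:
            phase = header.group(1).lower()
            continue
        done = DONE_RE.search(line)
        if done and phase is not None:
            rows.append({
                "phase": phase,
                "epoch": int(done.group(1)),
                "f1": float(done.group(2)),
                "prec": float(done.group(3)),
                "rec": float(done.group(4)),
            })
    return rows


def best_eval_epoch(rows):
    """Return the EVAL row with the highest F1 (first one on ties), or None."""
    evals = [r for r in rows if r["phase"] == "eval"]
    return max(evals, key=lambda r: r["f1"]) if evals else None


SAMPLE_LOG = """
===>  TRAIN Epoch 1
Finished Epoch 1 || Run Time:    9.5 | Load Time:   12.9 || F1:  89.87 | Prec:  84.01 | Rec:  96.62 || Ex/s: 330.97
===>  EVAL Epoch 1
Finished Epoch 1 || Run Time:    1.3 | Load Time:    3.9 || F1:  97.37 | Prec:  99.07 | Rec:  95.72 || Ex/s: 476.90
===>  TRAIN Epoch 2
Finished Epoch 2 || Run Time:    9.1 | Load Time:   12.4 || F1:  97.78 | Prec:  96.42 | Rec:  99.17 || Ex/s: 345.02
===>  EVAL Epoch 2
Finished Epoch 2 || Run Time:    1.3 | Load Time:    3.8 || F1:  98.43 | Prec:  98.21 | Rec:  98.65 || Ex/s: 480.42
===>  TRAIN Epoch 3
Finished Epoch 3 || Run Time:    8.9 | Load Time:   12.2 || F1:  98.92 | Prec:  98.22 | Rec:  99.62 || Ex/s: 351.71
===>  EVAL Epoch 3
Finished Epoch 3 || Run Time:    1.3 | Load Time:    3.8 || F1:  98.10 | Prec:  97.34 | Rec:  98.87 || Ex/s: 479.32
"""

rows = parse_log(SAMPLE_LOG)
best = best_eval_epoch(rows)
print(best["epoch"], best["f1"])
```

On this sample the helper selects epoch 2, which agrees with the `EVAL Epoch 2` label printed by `model.run_eval(test)` after the best model is reloaded.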

##### DBLP-GoogleScholar

In [12]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/DBLP-GoogleScholar/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [13]:
model = dm.MatchingModel(attr_summarizer='rnn')

In [14]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='rnn_model.pth',
    pos_neg_ratio=3)



* Number of trainable parameters: 2169602
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:43


Finished Epoch 1 || Run Time:   19.2 | Load Time:   25.1 || F1:  77.41 | Prec:  71.10 | Rec:  84.94 || Ex/s: 388.19

===>  EVAL Epoch 1


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 1 || Run Time:    2.9 | Load Time:    7.7 || F1:  88.80 | Prec:  87.98 | Rec:  89.63 || Ex/s: 542.69

* Best F1: tensor(88.7963, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:44


Finished Epoch 2 || Run Time:   19.5 | Load Time:   25.4 || F1:  91.71 | Prec:  87.60 | Rec:  96.23 || Ex/s: 383.26

===>  EVAL Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 2 || Run Time:    2.9 | Load Time:    7.7 || F1:  89.15 | Prec:  91.20 | Rec:  87.20 || Ex/s: 544.60

* Best F1: tensor(89.1543, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:43


Finished Epoch 3 || Run Time:   19.1 | Load Time:   25.0 || F1:  95.96 | Prec:  93.91 | Rec:  98.10 || Ex/s: 390.91

===>  EVAL Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 3 || Run Time:    2.9 | Load Time:    7.7 || F1:  89.28 | Prec:  88.66 | Rec:  89.91 || Ex/s: 539.97

* Best F1: tensor(89.2807, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:44


Finished Epoch 4 || Run Time:   19.3 | Load Time:   25.3 || F1:  98.23 | Prec:  97.25 | Rec:  99.22 || Ex/s: 385.64

===>  EVAL Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 4 || Run Time:    2.9 | Load Time:    7.7 || F1:  89.92 | Prec:  90.61 | Rec:  89.25 || Ex/s: 542.15

* Best F1: tensor(89.9247, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:43


Finished Epoch 5 || Run Time:   19.2 | Load Time:   25.1 || F1:  99.16 | Prec:  98.85 | Rec:  99.47 || Ex/s: 388.24

===>  EVAL Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 5 || Run Time:    2.8 | Load Time:    7.7 || F1:  89.63 | Prec:  88.44 | Rec:  90.84 || Ex/s: 545.91

---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:44


Finished Epoch 6 || Run Time:   19.4 | Load Time:   25.2 || F1:  99.50 | Prec:  99.35 | Rec:  99.66 || Ex/s: 386.59

===>  EVAL Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 6 || Run Time:    3.0 | Load Time:    7.8 || F1:  89.68 | Prec:  90.88 | Rec:  88.50 || Ex/s: 527.25

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:43


Finished Epoch 7 || Run Time:   19.4 | Load Time:   24.9 || F1:  99.66 | Prec:  99.53 | Rec:  99.78 || Ex/s: 388.83

===>  EVAL Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 7 || Run Time:    3.0 | Load Time:    8.0 || F1:  89.92 | Prec:  89.83 | Rec:  90.00 || Ex/s: 522.98

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:43


Finished Epoch 8 || Run Time:   19.0 | Load Time:   25.0 || F1:  99.80 | Prec:  99.72 | Rec:  99.88 || Ex/s: 391.92

===>  EVAL Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 8 || Run Time:    2.9 | Load Time:    7.7 || F1:  89.32 | Prec:  91.15 | Rec:  87.57 || Ex/s: 541.05

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:44


Finished Epoch 9 || Run Time:   19.3 | Load Time:   25.3 || F1:  99.88 | Prec:  99.88 | Rec:  99.88 || Ex/s: 386.27

===>  EVAL Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 9 || Run Time:    2.9 | Load Time:    7.7 || F1:  90.27 | Prec:  88.72 | Rec:  91.87 || Ex/s: 538.11

* Best F1: tensor(90.2663, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:44


Finished Epoch 10 || Run Time:   19.5 | Load Time:   25.1 || F1:  99.86 | Prec:  99.84 | Rec:  99.88 || Ex/s: 386.26

===>  EVAL Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 10 || Run Time:    3.0 | Load Time:    7.7 || F1:  90.21 | Prec:  88.35 | Rec:  92.15 || Ex/s: 533.56

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:44


Finished Epoch 11 || Run Time:   19.7 | Load Time:   25.3 || F1:  99.89 | Prec:  99.91 | Rec:  99.88 || Ex/s: 382.22

===>  EVAL Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 11 || Run Time:    3.0 | Load Time:    7.8 || F1:  90.04 | Prec:  88.46 | Rec:  91.68 || Ex/s: 534.53

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:44


Finished Epoch 12 || Run Time:   19.5 | Load Time:   25.1 || F1:  99.92 | Prec:  99.97 | Rec:  99.88 || Ex/s: 386.12

===>  EVAL Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 12 || Run Time:    2.9 | Load Time:    7.7 || F1:  90.15 | Prec:  89.65 | Rec:  90.65 || Ex/s: 543.59

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:44


Finished Epoch 13 || Run Time:   19.5 | Load Time:   25.3 || F1:  99.94 | Prec:  99.97 | Rec:  99.91 || Ex/s: 383.97

===>  EVAL Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 13 || Run Time:    3.0 | Load Time:    7.8 || F1:  90.26 | Prec:  89.59 | Rec:  90.93 || Ex/s: 530.92

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:44


Finished Epoch 14 || Run Time:   19.7 | Load Time:   25.1 || F1:  99.94 | Prec:  99.97 | Rec:  99.91 || Ex/s: 384.36

===>  EVAL Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 14 || Run Time:    3.0 | Load Time:    7.7 || F1:  90.25 | Prec:  89.67 | Rec:  90.84 || Ex/s: 533.91

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:44


Finished Epoch 15 || Run Time:   19.8 | Load Time:   25.4 || F1:  99.94 | Prec:  99.97 | Rec:  99.91 || Ex/s: 380.76

===>  EVAL Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 15 || Run Time:    3.0 | Load Time:    7.8 || F1:  90.21 | Prec:  89.59 | Rec:  90.84 || Ex/s: 534.73

---------------------

Loading best model...
Training done.


tensor(90.2663, device='cuda:0')

In [15]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 9 || Run Time:    3.0 | Load Time:    7.8 || F1:  89.57 | Prec:  88.08 | Rec:  91.12 || Ex/s: 533.35



tensor(89.5728, device='cuda:0')
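
Each dataset in this notebook repeats the same three cells, and every run saves to the same `rnn_model.pth`, so a later run overwrites the checkpoint of an earlier one. The sketch below shows one way to loop over the datasets with a separate checkpoint per dataset. The `deepmatcher` calls and their arguments are the ones used in the cells above; the helper names `plan_runs` and `run_benchmark` and the `rnn_model_<dataset>.pth` naming scheme are assumptions introduced for illustration.

```python
import posixpath

# Base directory and dataset folders as they appear in the cells of this notebook.
BASE_DIR = '/content/IC/datasesErros/BDirty_50_3_5'
DATASETS = (
    ('Structured', 'DBLP-ACM'),
    ('Structured', 'DBLP-GoogleScholar'),
    ('Structured', 'Walmart-Amazon'),
    ('Textual', 'Abt-Buy'),
)


def plan_runs(base_dir=BASE_DIR, datasets=DATASETS):
    """Build the data path and a per-dataset checkpoint name for each run."""
    return [
        {
            "name": name,
            "path": posixpath.join(base_dir, group, name) + "/",
            # Hypothetical naming scheme; the notebook itself uses 'rnn_model.pth'.
            "checkpoint": f"rnn_model_{name}.pth",
        }
        for group, name in datasets
    ]


def run_benchmark(runs):
    """Train and evaluate one RNN model per dataset; returns test F1 by name."""
    import deepmatcher as dm  # imported here so plan_runs works without deepmatcher

    results = {}
    for run in runs:
        train, validation, test = dm.data.process(
            path=run["path"],
            train='joined_train.csv',
            validation='joined_valid.csv',
            test='joined_test.csv')
        model = dm.MatchingModel(attr_summarizer='rnn')
        model.run_train(
            train,
            validation,
            epochs=15,
            batch_size=32,
            best_save_path=run["checkpoint"],
            pos_neg_ratio=3)
        # run_eval returns a one-element tensor, as in the outputs above.
        results[run["name"]] = float(model.run_eval(test))
    return results


runs = plan_runs()
for run in runs:
    print(run["name"], run["path"], run["checkpoint"])
```

Calling `run_benchmark(runs)` would then perform the same training and evaluation as the individual cells, one dataset at a time.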

##### Walmart-Amazon

In [11]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/Walmart-Amazon/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [12]:
model = dm.MatchingModel(attr_summarizer='rnn')

In [13]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='rnn_model.pth',
    pos_neg_ratio=3)



* Number of trainable parameters: 2576402
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 1 || Run Time:    7.9 | Load Time:    6.9 || F1:  26.27 | Prec:  57.65 | Rec:  17.01 || Ex/s: 415.86

===>  EVAL Epoch 1


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 1 || Run Time:    1.2 | Load Time:    2.2 || F1:  52.36 | Prec:  48.05 | Rec:  57.51 || Ex/s: 602.51

* Best F1: tensor(52.3585, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 2 || Run Time:    7.8 | Load Time:    6.9 || F1:  67.20 | Prec:  68.47 | Rec:  65.97 || Ex/s: 417.83

===>  EVAL Epoch 2


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 2 || Run Time:    1.2 | Load Time:    2.1 || F1:  57.77 | Prec:  50.00 | Rec:  68.39 || Ex/s: 610.85

* Best F1: tensor(57.7681, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 3 || Run Time:    7.8 | Load Time:    6.9 || F1:  85.53 | Prec:  80.03 | Rec:  91.84 || Ex/s: 420.17

===>  EVAL Epoch 3


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 3 || Run Time:    1.2 | Load Time:    2.2 || F1:  59.62 | Prec:  54.51 | Rec:  65.80 || Ex/s: 604.11

* Best F1: tensor(59.6244, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 4 || Run Time:    7.9 | Load Time:    6.8 || F1:  94.91 | Prec:  92.86 | Rec:  97.05 || Ex/s: 415.50

===>  EVAL Epoch 4


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 4 || Run Time:    1.2 | Load Time:    2.1 || F1:  58.57 | Prec:  54.19 | Rec:  63.73 || Ex/s: 605.72

---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 5 || Run Time:    8.1 | Load Time:    7.1 || F1:  97.59 | Prec:  96.92 | Rec:  98.26 || Ex/s: 403.50

===>  EVAL Epoch 5


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 5 || Run Time:    1.2 | Load Time:    2.2 || F1:  61.58 | Prec:  67.70 | Rec:  56.48 || Ex/s: 592.26

* Best F1: tensor(61.5819, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 6 || Run Time:    8.2 | Load Time:    7.1 || F1:  99.48 | Prec:  99.31 | Rec:  99.65 || Ex/s: 400.38

===>  EVAL Epoch 6


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 6 || Run Time:    1.2 | Load Time:    2.2 || F1:  61.22 | Prec:  70.00 | Rec:  54.40 || Ex/s: 594.78

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 7 || Run Time:    7.9 | Load Time:    6.9 || F1:  99.74 | Prec:  99.48 | Rec: 100.00 || Ex/s: 414.94

===>  EVAL Epoch 7


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 7 || Run Time:    1.2 | Load Time:    2.2 || F1:  61.21 | Prec:  73.72 | Rec:  52.33 || Ex/s: 595.80

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 8 || Run Time:    8.0 | Load Time:    6.8 || F1:  99.83 | Prec:  99.65 | Rec: 100.00 || Ex/s: 414.97

===>  EVAL Epoch 8


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 8 || Run Time:    1.2 | Load Time:    2.2 || F1:  62.54 | Prec:  77.69 | Rec:  52.33 || Ex/s: 602.87

* Best F1: tensor(62.5387, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 9 || Run Time:    7.9 | Load Time:    6.9 || F1:  99.91 | Prec:  99.83 | Rec: 100.00 || Ex/s: 415.33

===>  EVAL Epoch 9


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 9 || Run Time:    1.2 | Load Time:    2.2 || F1:  62.70 | Prec:  79.37 | Rec:  51.81 || Ex/s: 597.89

* Best F1: tensor(62.6959, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 10 || Run Time:    7.9 | Load Time:    6.9 || F1:  99.91 | Prec:  99.83 | Rec: 100.00 || Ex/s: 414.60

===>  EVAL Epoch 10


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 10 || Run Time:    1.5 | Load Time:    2.5 || F1:  62.66 | Prec:  80.49 | Rec:  51.30 || Ex/s: 513.72

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 11 || Run Time:    8.1 | Load Time:    7.0 || F1:  99.91 | Prec:  99.83 | Rec: 100.00 || Ex/s: 405.32

===>  EVAL Epoch 11


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 11 || Run Time:    1.3 | Load Time:    2.2 || F1:  62.86 | Prec:  81.15 | Rec:  51.30 || Ex/s: 587.93

* Best F1: tensor(62.8571, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 12 || Run Time:    8.1 | Load Time:    6.9 || F1: 100.00 | Prec: 100.00 | Rec: 100.00 || Ex/s: 407.57

===>  EVAL Epoch 12


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 12 || Run Time:    1.2 | Load Time:    2.2 || F1:  62.86 | Prec:  81.15 | Rec:  51.30 || Ex/s: 599.90

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 13 || Run Time:    8.0 | Load Time:    6.9 || F1: 100.00 | Prec: 100.00 | Rec: 100.00 || Ex/s: 413.38

===>  EVAL Epoch 13


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 13 || Run Time:    1.3 | Load Time:    2.2 || F1:  62.86 | Prec:  81.15 | Rec:  51.30 || Ex/s: 591.70

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 14 || Run Time:    8.0 | Load Time:    6.9 || F1: 100.00 | Prec: 100.00 | Rec: 100.00 || Ex/s: 411.22

===>  EVAL Epoch 14


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 14 || Run Time:    1.3 | Load Time:    2.2 || F1:  62.46 | Prec:  79.84 | Rec:  51.30 || Ex/s: 595.80

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 15 || Run Time:    8.0 | Load Time:    6.9 || F1: 100.00 | Prec: 100.00 | Rec: 100.00 || Ex/s: 411.17

===>  EVAL Epoch 15


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 15 || Run Time:    1.2 | Load Time:    2.2 || F1:  62.46 | Prec:  79.84 | Rec:  51.30 || Ex/s: 603.34

---------------------

Loading best model...
Training done.


tensor(62.8571, device='cuda:0')

In [14]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 11


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 11 || Run Time:    1.4 | Load Time:    2.5 || F1:  64.80 | Prec:  81.25 | Rec:  53.89 || Ex/s: 523.96



tensor(64.7975, device='cuda:0')
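
In the Walmart-Amazon run above, training F1 reaches 100 while validation F1 levels off near 62 to 63, so the later epochs add training time without improving the saved model. The sketch below illustrates a patience-based stopping rule applied to the validation F1 values from that log. It is a standalone illustration: `early_stop_epoch` and its `patience` parameter are hypothetical, and the `run_train` call in this notebook always runs all 15 epochs.

```python
def early_stop_epoch(val_f1, patience=3):
    """Return (stop_epoch, best_epoch) for a list of per-epoch validation F1.

    Training would stop at the first epoch that is `patience` epochs past the
    last strict improvement; if that never happens, it runs to the end.
    """
    best, best_epoch = float("-inf"), 0
    for epoch, score in enumerate(val_f1, start=1):
        if score > best:
            best, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_f1), best_epoch


# Validation F1 per epoch, copied from the Walmart-Amazon EVAL lines above.
walmart_amazon_val_f1 = [
    52.36, 57.77, 59.62, 58.57, 61.58, 61.22, 61.21, 62.54,
    62.70, 62.66, 62.86, 62.86, 62.86, 62.46, 62.46,
]

stop_epoch, best_epoch = early_stop_epoch(walmart_amazon_val_f1, patience=3)
print(stop_epoch, best_epoch)
```

With a patience of 3, this rule would have stopped after epoch 14 and kept the epoch 11 model, the same epoch that `run_train` reloaded as the best model.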

#### TEXTUAL

##### Abt-Buy

In [11]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Textual/Abt-Buy/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')


Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Textual/Abt-Buy/joined_train.csv"
0% [############################# ] 100% | ETA: 00:00:00
Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Textual/Abt-Buy/joined_valid.csv"
0% [############################# ] 100% | ETA: 00:00:00
Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Textual/Abt-Buy/joined_test.csv"

Building vocabulary
0% [######] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00

Computing principal components
0% [######] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


In [12]:
model = dm.MatchingModel(attr_summarizer='rnn')

In [13]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='rnn_model.pth',
    pos_neg_ratio=3)



* Number of trainable parameters: 1762802
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 1 || Run Time:    6.7 | Load Time:   11.1 || F1:  20.31 | Prec:  31.72 | Rec:  14.94 || Ex/s: 322.99

===>  EVAL Epoch 1


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 1 || Run Time:    0.9 | Load Time:    3.3 || F1:   4.55 | Prec:  35.71 | Rec:   2.43 || Ex/s: 457.14

* Best F1: tensor(4.5455, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 2 || Run Time:    6.5 | Load Time:   11.2 || F1:  43.28 | Prec:  43.71 | Rec:  42.86 || Ex/s: 324.05

===>  EVAL Epoch 2


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 2 || Run Time:    0.9 | Load Time:    3.3 || F1:  17.32 | Prec:  45.83 | Rec:  10.68 || Ex/s: 454.65

* Best F1: tensor(17.3228, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 3 || Run Time:    6.8 | Load Time:   11.4 || F1:  64.47 | Prec:  55.90 | Rec:  76.14 || Ex/s: 315.10

===>  EVAL Epoch 3


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 3 || Run Time:    0.9 | Load Time:    3.3 || F1:  22.38 | Prec:  40.00 | Rec:  15.53 || Ex/s: 456.97

* Best F1: tensor(22.3776, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 4 || Run Time:    6.6 | Load Time:   11.3 || F1:  77.37 | Prec:  69.75 | Rec:  86.85 || Ex/s: 320.03

===>  EVAL Epoch 4


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 4 || Run Time:    0.9 | Load Time:    3.3 || F1:  23.24 | Prec:  42.31 | Rec:  16.02 || Ex/s: 459.25

* Best F1: tensor(23.2394, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 5 || Run Time:    6.5 | Load Time:   11.0 || F1:  86.30 | Prec:  80.85 | Rec:  92.53 || Ex/s: 327.63

===>  EVAL Epoch 5


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 5 || Run Time:    0.9 | Load Time:    3.3 || F1:  31.28 | Prec:  33.15 | Rec:  29.61 || Ex/s: 458.72

* Best F1: tensor(31.2821, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 6 || Run Time:    6.5 | Load Time:   11.0 || F1:  91.29 | Prec:  86.93 | Rec:  96.10 || Ex/s: 327.97

===>  EVAL Epoch 6


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 6 || Run Time:    0.9 | Load Time:    3.3 || F1:  30.53 | Prec:  33.33 | Rec:  28.16 || Ex/s: 460.34

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 7 || Run Time:    6.7 | Load Time:   11.4 || F1:  94.41 | Prec:  91.60 | Rec:  97.40 || Ex/s: 316.52

===>  EVAL Epoch 7


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 7 || Run Time:    0.9 | Load Time:    3.3 || F1:  29.09 | Prec:  31.28 | Rec:  27.18 || Ex/s: 449.29

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 8 || Run Time:    6.7 | Load Time:   11.1 || F1:  97.44 | Prec:  95.91 | Rec:  99.03 || Ex/s: 322.71

===>  EVAL Epoch 8


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 8 || Run Time:    1.0 | Load Time:    3.3 || F1:  25.08 | Prec:  33.88 | Rec:  19.90 || Ex/s: 448.77

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 9 || Run Time:    6.7 | Load Time:   11.1 || F1:  98.79 | Prec:  97.93 | Rec:  99.68 || Ex/s: 322.21

===>  EVAL Epoch 9


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 9 || Run Time:    1.0 | Load Time:    3.3 || F1:  25.39 | Prec:  35.04 | Rec:  19.90 || Ex/s: 451.59

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 10 || Run Time:    6.7 | Load Time:   11.1 || F1:  99.03 | Prec:  98.40 | Rec:  99.68 || Ex/s: 322.64

===>  EVAL Epoch 10


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 10 || Run Time:    1.0 | Load Time:    3.3 || F1:  24.84 | Prec:  36.11 | Rec:  18.93 || Ex/s: 451.14

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 11 || Run Time:    6.7 | Load Time:   11.2 || F1:  99.51 | Prec:  99.19 | Rec:  99.84 || Ex/s: 320.88

===>  EVAL Epoch 11


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 11 || Run Time:    1.0 | Load Time:    3.6 || F1:  24.60 | Prec:  36.89 | Rec:  18.45 || Ex/s: 410.75

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 12 || Run Time:    6.7 | Load Time:   11.1 || F1:  99.60 | Prec:  99.35 | Rec:  99.84 || Ex/s: 321.84

===>  EVAL Epoch 12


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 12 || Run Time:    0.9 | Load Time:    3.3 || F1:  25.32 | Prec:  38.24 | Rec:  18.93 || Ex/s: 452.46

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 13 || Run Time:    6.7 | Load Time:   11.1 || F1:  99.51 | Prec:  99.19 | Rec:  99.84 || Ex/s: 323.19

===>  EVAL Epoch 13


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 13 || Run Time:    0.9 | Load Time:    3.3 || F1:  25.97 | Prec:  39.22 | Rec:  19.42 || Ex/s: 454.13

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 14 || Run Time:    6.7 | Load Time:   11.1 || F1:  99.60 | Prec:  99.35 | Rec:  99.84 || Ex/s: 323.46

===>  EVAL Epoch 14


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 14 || Run Time:    0.9 | Load Time:    3.3 || F1:  25.81 | Prec:  38.46 | Rec:  19.42 || Ex/s: 450.34

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 15 || Run Time:    6.7 | Load Time:   11.2 || F1:  99.68 | Prec:  99.35 | Rec: 100.00 || Ex/s: 320.43

===>  EVAL Epoch 15


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 15 || Run Time:    0.9 | Load Time:    3.3 || F1:  25.89 | Prec:  38.83 | Rec:  19.42 || Ex/s: 452.06

---------------------

Loading best model...
Training done.


tensor(31.2821, device='cuda:0')

In [14]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 5


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 5 || Run Time:    1.0 | Load Time:    3.3 || F1:  30.85 | Prec:  34.12 | Rec:  28.16 || Ex/s: 447.36



tensor(30.8511, device='cuda:0')

#### DIRTY

##### DBLP-ACM

In [11]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Dirty/DBLP-ACM/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')


Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Dirty/DBLP-ACM/joined_train.csv"
0% [############################# ] 100% | ETA: 00:00:00
Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Dirty/DBLP-ACM/joined_valid.csv"
0% [############################# ] 100% | ETA: 00:00:00
Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Dirty/DBLP-ACM/joined_test.csv"

Building vocabulary
0% [########] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00

Computing principal components
0% [########] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


In [12]:
model = dm.MatchingModel(attr_summarizer='rnn')

In [13]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='rnn_model.pth',
    pos_neg_ratio=3)



* Number of trainable parameters: 2169602
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:25


Finished Epoch 1 || Run Time:   10.3 | Load Time:   15.6 || F1:  68.89 | Prec:  61.00 | Rec:  79.13 || Ex/s: 285.55

===>  EVAL Epoch 1


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 1 || Run Time:    1.5 | Load Time:    4.7 || F1:  84.08 | Prec:  78.14 | Rec:  90.99 || Ex/s: 399.63

* Best F1: tensor(84.0791, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:24


Finished Epoch 2 || Run Time:    9.9 | Load Time:   15.1 || F1:  89.26 | Prec:  84.03 | Rec:  95.20 || Ex/s: 295.98

===>  EVAL Epoch 2


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 2 || Run Time:    1.5 | Load Time:    4.7 || F1:  85.89 | Prec:  80.12 | Rec:  92.57 || Ex/s: 399.41

* Best F1: tensor(85.8934, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:24


Finished Epoch 3 || Run Time:    9.8 | Load Time:   15.0 || F1:  95.36 | Prec:  93.63 | Rec:  97.15 || Ex/s: 298.29

===>  EVAL Epoch 3


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 3 || Run Time:    1.5 | Load Time:    4.7 || F1:  88.54 | Prec:  87.47 | Rec:  89.64 || Ex/s: 400.08

* Best F1: tensor(88.5428, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:25


Finished Epoch 4 || Run Time:   10.0 | Load Time:   15.3 || F1:  97.73 | Prec:  97.04 | Rec:  98.42 || Ex/s: 292.86

===>  EVAL Epoch 4


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 4 || Run Time:    1.5 | Load Time:    4.7 || F1:  89.18 | Prec:  85.83 | Rec:  92.79 || Ex/s: 396.99

* Best F1: tensor(89.1775, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:24


Finished Epoch 5 || Run Time:    9.8 | Load Time:   15.1 || F1:  99.10 | Prec:  98.95 | Rec:  99.25 || Ex/s: 297.83

===>  EVAL Epoch 5


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 5 || Run Time:    1.5 | Load Time:    4.7 || F1:  89.22 | Prec:  84.79 | Rec:  94.14 || Ex/s: 400.81

* Best F1: tensor(89.2209, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:24


Finished Epoch 6 || Run Time:    9.8 | Load Time:   15.1 || F1:  99.36 | Prec:  99.32 | Rec:  99.40 || Ex/s: 297.93

===>  EVAL Epoch 6


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 6 || Run Time:    1.5 | Load Time:    4.8 || F1:  89.55 | Prec:  85.02 | Rec:  94.59 || Ex/s: 396.55

* Best F1: tensor(89.5522, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:25


Finished Epoch 7 || Run Time:   10.0 | Load Time:   15.3 || F1:  99.51 | Prec:  99.47 | Rec:  99.55 || Ex/s: 292.91

===>  EVAL Epoch 7


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 7 || Run Time:    1.5 | Load Time:    4.7 || F1:  89.34 | Prec:  84.82 | Rec:  94.37 || Ex/s: 399.90

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:24


Finished Epoch 8 || Run Time:    9.8 | Load Time:   15.0 || F1:  99.74 | Prec:  99.77 | Rec:  99.70 || Ex/s: 299.35

===>  EVAL Epoch 8


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 8 || Run Time:    1.4 | Load Time:    4.6 || F1:  89.44 | Prec:  85.74 | Rec:  93.47 || Ex/s: 408.06

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:24


Finished Epoch 9 || Run Time:    9.5 | Load Time:   14.9 || F1:  99.89 | Prec:  99.92 | Rec:  99.85 || Ex/s: 304.66

===>  EVAL Epoch 9


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 9 || Run Time:    1.4 | Load Time:    4.7 || F1:  89.39 | Prec:  86.04 | Rec:  93.02 || Ex/s: 407.56

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:24


Finished Epoch 10 || Run Time:    9.8 | Load Time:   15.1 || F1:  99.92 | Prec: 100.00 | Rec:  99.85 || Ex/s: 297.56

===>  EVAL Epoch 10


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 10 || Run Time:    1.5 | Load Time:    4.7 || F1:  89.94 | Prec:  88.29 | Rec:  91.67 || Ex/s: 402.56

* Best F1: tensor(89.9447, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:24


Finished Epoch 11 || Run Time:    9.8 | Load Time:   15.0 || F1:  99.92 | Prec:  99.92 | Rec:  99.92 || Ex/s: 298.64

===>  EVAL Epoch 11


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 11 || Run Time:    1.5 | Load Time:    4.7 || F1:  89.80 | Prec:  88.43 | Rec:  91.22 || Ex/s: 400.37

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:24


Finished Epoch 12 || Run Time:    9.8 | Load Time:   15.1 || F1:  99.92 | Prec:  99.92 | Rec:  99.92 || Ex/s: 297.69

===>  EVAL Epoch 12


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 12 || Run Time:    1.5 | Load Time:    4.7 || F1:  89.92 | Prec:  88.45 | Rec:  91.44 || Ex/s: 400.18

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:24


Finished Epoch 13 || Run Time:    9.9 | Load Time:   15.2 || F1:  99.96 | Prec: 100.00 | Rec:  99.92 || Ex/s: 295.04

===>  EVAL Epoch 13


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 13 || Run Time:    1.5 | Load Time:    4.7 || F1:  90.04 | Prec:  88.48 | Rec:  91.67 || Ex/s: 401.21

* Best F1: tensor(90.0443, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:24


Finished Epoch 14 || Run Time:    9.8 | Load Time:   15.0 || F1: 100.00 | Prec: 100.00 | Rec: 100.00 || Ex/s: 299.19

===>  EVAL Epoch 14


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 14 || Run Time:    1.5 | Load Time:    4.7 || F1:  90.04 | Prec:  88.48 | Rec:  91.67 || Ex/s: 399.40

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:24


Finished Epoch 15 || Run Time:    9.8 | Load Time:   15.1 || F1: 100.00 | Prec: 100.00 | Rec: 100.00 || Ex/s: 297.87

===>  EVAL Epoch 15


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 15 || Run Time:    1.5 | Load Time:    4.7 || F1:  90.14 | Prec:  88.67 | Rec:  91.67 || Ex/s: 399.18

* Best F1: tensor(90.1440, device='cuda:0')
Saving best model...
Done.
---------------------

Loading best model...
Training done.


tensor(90.1440, device='cuda:0')

In [14]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 15


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 15 || Run Time:    1.5 | Load Time:    4.6 || F1:  90.40 | Prec:  89.60 | Rec:  91.22 || Ex/s: 405.61



tensor(90.4018, device='cuda:0')
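
As a quick sanity check on the metrics in these logs, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded precision and recall printed for the DBLP-ACM test set. The helper name `f1_from_precision_recall` is ours and is not part of `deepmatcher`. Because the inputs are already rounded, the result can differ from the logged tensor value beyond the second decimal.

```python
def f1_from_precision_recall(precision, recall):
    """Return the harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        # Convention assumed here: F1 is 0 when both precision and recall are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the DBLP-ACM test log line above (percentages).
test_f1 = f1_from_precision_recall(89.60, 91.22)
print(f"F1: {test_f1:.2f}")
```

To two decimals this agrees with the `F1:  90.40` reported by `model.run_eval(test)`.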

##### DBLP-GoogleScholar

In [11]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Dirty/DBLP-GoogleScholar/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [12]:
model = dm.MatchingModel(attr_summarizer='rnn')

In [13]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='rnn_model.pth',
    pos_neg_ratio=3)



* Number of trainable parameters: 2169602
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:50


Finished Epoch 1 || Run Time:   20.6 | Load Time:   29.9 || F1:  64.28 | Prec:  57.30 | Rec:  73.21 || Ex/s: 341.04

===>  EVAL Epoch 1


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 1 || Run Time:    3.2 | Load Time:    9.5 || F1:  76.53 | Prec:  70.00 | Rec:  84.39 || Ex/s: 449.60

* Best F1: tensor(76.5254, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:50


Finished Epoch 2 || Run Time:   20.8 | Load Time:   30.2 || F1:  83.20 | Prec:  76.49 | Rec:  91.21 || Ex/s: 337.63

===>  EVAL Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 2 || Run Time:    3.1 | Load Time:    9.3 || F1:  79.28 | Prec:  71.80 | Rec:  88.50 || Ex/s: 464.32

* Best F1: tensor(79.2800, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:50


Finished Epoch 3 || Run Time:   20.8 | Load Time:   30.2 || F1:  91.89 | Prec:  88.14 | Rec:  95.98 || Ex/s: 337.64

===>  EVAL Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 3 || Run Time:    3.0 | Load Time:    9.2 || F1:  80.48 | Prec:  72.67 | Rec:  90.19 || Ex/s: 471.33

* Best F1: tensor(80.4837, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:50


Finished Epoch 4 || Run Time:   20.6 | Load Time:   30.0 || F1:  96.27 | Prec:  94.75 | Rec:  97.85 || Ex/s: 340.44

===>  EVAL Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 4 || Run Time:    3.1 | Load Time:    9.2 || F1:  82.71 | Prec:  81.00 | Rec:  84.49 || Ex/s: 468.35

* Best F1: tensor(82.7081, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:49


Finished Epoch 5 || Run Time:   20.4 | Load Time:   29.8 || F1:  98.23 | Prec:  97.66 | Rec:  98.82 || Ex/s: 343.22

===>  EVAL Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 5 || Run Time:    3.0 | Load Time:    9.2 || F1:  82.05 | Prec:  84.32 | Rec:  79.91 || Ex/s: 470.50

---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:50


Finished Epoch 6 || Run Time:   20.6 | Load Time:   30.0 || F1:  98.90 | Prec:  98.61 | Rec:  99.19 || Ex/s: 340.12

===>  EVAL Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 6 || Run Time:    3.0 | Load Time:    9.2 || F1:  82.21 | Prec:  85.17 | Rec:  79.44 || Ex/s: 468.97

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:50


Finished Epoch 7 || Run Time:   20.6 | Load Time:   30.1 || F1:  99.56 | Prec:  99.66 | Rec:  99.47 || Ex/s: 339.51

===>  EVAL Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 7 || Run Time:    3.0 | Load Time:    9.2 || F1:  82.67 | Prec:  83.10 | Rec:  82.24 || Ex/s: 469.74

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:49


Finished Epoch 8 || Run Time:   20.3 | Load Time:   29.8 || F1:  99.64 | Prec:  99.69 | Rec:  99.59 || Ex/s: 343.98

===>  EVAL Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 8 || Run Time:    3.0 | Load Time:    9.1 || F1:  82.88 | Prec:  82.50 | Rec:  83.27 || Ex/s: 472.96

* Best F1: tensor(82.8837, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:50


Finished Epoch 9 || Run Time:   20.7 | Load Time:   30.1 || F1:  99.69 | Prec:  99.72 | Rec:  99.66 || Ex/s: 339.25

===>  EVAL Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 9 || Run Time:    3.1 | Load Time:    9.2 || F1:  82.80 | Prec:  82.15 | Rec:  83.46 || Ex/s: 467.25

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:50


Finished Epoch 10 || Run Time:   20.6 | Load Time:   30.0 || F1:  99.73 | Prec:  99.78 | Rec:  99.69 || Ex/s: 340.17

===>  EVAL Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 10 || Run Time:    3.1 | Load Time:    9.2 || F1:  83.02 | Prec:  82.41 | Rec:  83.64 || Ex/s: 468.79

* Best F1: tensor(83.0241, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:49


Finished Epoch 11 || Run Time:   20.5 | Load Time:   29.7 || F1:  99.78 | Prec:  99.81 | Rec:  99.75 || Ex/s: 342.82

===>  EVAL Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 11 || Run Time:    3.2 | Load Time:    9.3 || F1:  83.15 | Prec:  84.47 | Rec:  81.87 || Ex/s: 460.80

* Best F1: tensor(83.1514, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:50


Finished Epoch 12 || Run Time:   21.0 | Load Time:   30.1 || F1:  99.81 | Prec:  99.84 | Rec:  99.78 || Ex/s: 337.10

===>  EVAL Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 12 || Run Time:    3.2 | Load Time:    9.2 || F1:  82.58 | Prec:  82.74 | Rec:  82.43 || Ex/s: 464.62

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:50


Finished Epoch 13 || Run Time:   20.5 | Load Time:   30.0 || F1:  99.84 | Prec:  99.88 | Rec:  99.81 || Ex/s: 341.01

===>  EVAL Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 13 || Run Time:    3.0 | Load Time:    9.1 || F1:  82.35 | Prec:  82.74 | Rec:  81.96 || Ex/s: 471.65

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:52


Finished Epoch 14 || Run Time:   21.3 | Load Time:   31.2 || F1:  99.86 | Prec:  99.91 | Rec:  99.81 || Ex/s: 328.32

===>  EVAL Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 14 || Run Time:    3.1 | Load Time:    9.2 || F1:  82.34 | Prec:  82.54 | Rec:  82.15 || Ex/s: 467.07

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:50


Finished Epoch 15 || Run Time:   20.5 | Load Time:   30.4 || F1:  99.92 | Prec:  99.97 | Rec:  99.88 || Ex/s: 338.23

===>  EVAL Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 15 || Run Time:    3.0 | Load Time:    9.1 || F1:  82.48 | Prec:  82.25 | Rec:  82.71 || Ex/s: 472.37

---------------------

Loading best model...
Training done.


tensor(83.1514, device='cuda:0')

In [14]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 11 || Run Time:    3.1 | Load Time:    9.2 || F1:  81.83 | Prec:  84.26 | Rec:  79.53 || Ex/s: 466.45



tensor(81.8269, device='cuda:0')
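
The per-epoch summary lines follow a fixed layout, so they can be collected into a table instead of being read by eye. The sketch below is a minimal parser, assuming the line format stays exactly as printed in the logs above. The names `EPOCH_LINE` and `parse_epoch_line` are hypothetical helpers, not part of `deepmatcher`.

```python
import re

# Pattern for the "Finished Epoch" summary lines printed during training and evaluation.
EPOCH_LINE = re.compile(
    r"Finished Epoch (?P<epoch>\d+) \|\| "
    r"Run Time:\s*(?P<run_time>[\d.]+) \| Load Time:\s*(?P<load_time>[\d.]+) \|\| "
    r"F1:\s*(?P<f1>[\d.]+) \| Prec:\s*(?P<prec>[\d.]+) \| Rec:\s*(?P<rec>[\d.]+) \|\| "
    r"Ex/s:\s*(?P<ex_per_s>[\d.]+)"
)


def parse_epoch_line(line):
    """Parse one 'Finished Epoch' line into a dict, or return None if it does not match."""
    match = EPOCH_LINE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    parsed = {name: float(value) for name, value in fields.items()}
    parsed["epoch"] = int(fields["epoch"])  # keep the epoch as an integer
    return parsed


# Line copied from the DBLP-GoogleScholar test evaluation above.
sample = ("Finished Epoch 11 || Run Time:    3.1 | Load Time:    9.2 || "
          "F1:  81.83 | Prec:  84.26 | Rec:  79.53 || Ex/s: 466.45")
print(parse_epoch_line(sample))
```

Applying `parse_epoch_line` to every line of a saved log and dropping the `None` results gives one record per epoch, which can then be loaded into a `pandas` DataFrame for comparison across datasets.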

### SIF

#### STRUCTURED

##### Amazon-Google

In [11]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/Amazon-Google/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')


Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Structured/Amazon-Google/joined_train.csv"
0% [############################# ] 100% | ETA: 00:00:00
Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Structured/Amazon-Google/joined_valid.csv"
0% [############################# ] 100% | ETA: 00:00:00
Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Structured/Amazon-Google/joined_test.csv"

Building vocabulary
0% [#######] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00

Computing principal components
0% [#######] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


In [12]:
model = dm.MatchingModel(attr_summarizer='sif')

In [13]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='sif_model.pth',
    pos_neg_ratio=3)



* Number of trainable parameters: 542402
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 1 || Run Time:    3.7 | Load Time:    5.2 || F1:   4.22 | Prec:  16.04 | Rec:   2.43 || Ex/s: 766.73

===>  EVAL Epoch 1


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 1 || Run Time:    0.6 | Load Time:    1.5 || F1:  13.19 | Prec:  46.15 | Rec:   7.69 || Ex/s: 1105.98

* Best F1: tensor(13.1868, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 2 || Run Time:    2.9 | Load Time:    4.5 || F1:  28.26 | Prec:  46.41 | Rec:  20.31 || Ex/s: 931.78

===>  EVAL Epoch 2


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01


Finished Epoch 2 || Run Time:    0.6 | Load Time:    1.4 || F1:  29.76 | Prec:  30.49 | Rec:  29.06 || Ex/s: 1146.39

* Best F1: tensor(29.7593, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 3 || Run Time:    3.0 | Load Time:    4.4 || F1:  48.42 | Prec:  48.49 | Rec:  48.35 || Ex/s: 928.44

===>  EVAL Epoch 3


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01


Finished Epoch 3 || Run Time:    0.6 | Load Time:    1.4 || F1:  31.15 | Prec:  28.32 | Rec:  34.62 || Ex/s: 1144.87

* Best F1: tensor(31.1538, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 4 || Run Time:    2.9 | Load Time:    4.4 || F1:  63.81 | Prec:  57.77 | Rec:  71.24 || Ex/s: 933.01

===>  EVAL Epoch 4


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01


Finished Epoch 4 || Run Time:    0.6 | Load Time:    1.4 || F1:  32.34 | Prec:  30.34 | Rec:  34.62 || Ex/s: 1138.11

* Best F1: tensor(32.3353, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 5 || Run Time:    2.9 | Load Time:    4.4 || F1:  74.34 | Prec:  67.25 | Rec:  83.12 || Ex/s: 932.80

===>  EVAL Epoch 5


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01


Finished Epoch 5 || Run Time:    0.6 | Load Time:    1.4 || F1:  32.71 | Prec:  29.19 | Rec:  37.18 || Ex/s: 1130.13

* Best F1: tensor(32.7068, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 6 || Run Time:    2.9 | Load Time:    4.5 || F1:  80.86 | Prec:  74.19 | Rec:  88.84 || Ex/s: 929.72

===>  EVAL Epoch 6


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01


Finished Epoch 6 || Run Time:    0.6 | Load Time:    1.4 || F1:  31.95 | Prec:  28.52 | Rec:  36.32 || Ex/s: 1137.48

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 7 || Run Time:    3.3 | Load Time:    5.0 || F1:  84.41 | Prec:  78.40 | Rec:  91.42 || Ex/s: 829.08

===>  EVAL Epoch 7


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01


Finished Epoch 7 || Run Time:    0.6 | Load Time:    1.4 || F1:  29.30 | Prec:  26.98 | Rec:  32.05 || Ex/s: 1128.97

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 8 || Run Time:    3.0 | Load Time:    4.5 || F1:  87.31 | Prec:  81.51 | Rec:  93.99 || Ex/s: 922.99

===>  EVAL Epoch 8


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01


Finished Epoch 8 || Run Time:    0.6 | Load Time:    1.4 || F1:  28.99 | Prec:  28.51 | Rec:  29.49 || Ex/s: 1150.33

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 9 || Run Time:    3.0 | Load Time:    4.4 || F1:  90.07 | Prec:  85.29 | Rec:  95.42 || Ex/s: 927.13

===>  EVAL Epoch 9


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01


Finished Epoch 9 || Run Time:    0.6 | Load Time:    1.4 || F1:  28.21 | Prec:  28.21 | Rec:  28.21 || Ex/s: 1137.46

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 10 || Run Time:    2.9 | Load Time:    4.5 || F1:  91.80 | Prec:  87.84 | Rec:  96.14 || Ex/s: 930.78

===>  EVAL Epoch 10


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01


Finished Epoch 10 || Run Time:    0.6 | Load Time:    1.4 || F1:  28.81 | Prec:  27.78 | Rec:  29.91 || Ex/s: 1152.05

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 11 || Run Time:    3.1 | Load Time:    4.7 || F1:  93.14 | Prec:  89.46 | Rec:  97.14 || Ex/s: 882.89

===>  EVAL Epoch 11


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01


Finished Epoch 11 || Run Time:    0.6 | Load Time:    1.4 || F1:  28.93 | Prec:  28.00 | Rec:  29.91 || Ex/s: 1147.99

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 12 || Run Time:    2.9 | Load Time:    4.4 || F1:  93.82 | Prec:  90.22 | Rec:  97.71 || Ex/s: 933.25

===>  EVAL Epoch 12


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01


Finished Epoch 12 || Run Time:    0.6 | Load Time:    1.4 || F1:  26.98 | Prec:  27.04 | Rec:  26.92 || Ex/s: 1149.80

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 13 || Run Time:    2.9 | Load Time:    4.4 || F1:  94.34 | Prec:  91.08 | Rec:  97.85 || Ex/s: 933.20

===>  EVAL Epoch 13


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01


Finished Epoch 13 || Run Time:    0.6 | Load Time:    1.4 || F1:  27.56 | Prec:  28.70 | Rec:  26.50 || Ex/s: 1144.78

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 14 || Run Time:    3.0 | Load Time:    4.5 || F1:  94.61 | Prec:  91.57 | Rec:  97.85 || Ex/s: 924.39

===>  EVAL Epoch 14


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01


Finished Epoch 14 || Run Time:    0.6 | Load Time:    1.4 || F1:  27.48 | Prec:  29.05 | Rec:  26.07 || Ex/s: 1152.01

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 15 || Run Time:    2.9 | Load Time:    4.5 || F1:  94.74 | Prec:  91.70 | Rec:  98.00 || Ex/s: 928.39

===>  EVAL Epoch 15


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01


Finished Epoch 15 || Run Time:    0.6 | Load Time:    1.4 || F1:  27.36 | Prec:  30.53 | Rec:  24.79 || Ex/s: 1136.17

---------------------

Loading best model...
Training done.


tensor(32.7068, device='cuda:0')

In [14]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 5


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:01


Finished Epoch 5 || Run Time:    0.6 | Load Time:    1.4 || F1:  36.50 | Prec:  32.88 | Rec:  41.03 || Ex/s: 1119.99



tensor(36.5019, device='cuda:0')
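
In the run above, training F1 keeps rising while validation F1 peaks early, and the test evaluation is labelled with the epoch of the best validation score. The sketch below reproduces that selection from the logged validation values. It illustrates the idea only and is not `deepmatcher`'s implementation: `best_epoch` is a hypothetical helper, and the tie-breaking rule (earliest epoch wins) is an assumption based on the logs, where a new best is only reported on a strict improvement.

```python
# Validation F1 per epoch, copied from the EVAL lines of the Amazon-Google SIF run above.
validation_f1 = [13.19, 29.76, 31.15, 32.34, 32.71, 31.95, 29.30, 28.99,
                 28.21, 28.81, 28.93, 26.98, 27.56, 27.48, 27.36]
# Training F1 at the final epoch, copied from the last TRAIN line.
final_train_f1 = 94.74


def best_epoch(scores):
    """Return (1-based epoch, score) of the highest score; the earliest epoch wins ties."""
    index = max(range(len(scores)), key=lambda i: scores[i])
    return index + 1, scores[index]


epoch, score = best_epoch(validation_f1)
gap = final_train_f1 - validation_f1[-1]
print(f"best validation epoch: {epoch} (F1 {score:.2f})")
print(f"train/validation F1 gap at the last epoch: {gap:.2f}")
```

The selected epoch is 5, which matches the `EVAL Epoch 5` header printed by `model.run_eval(test)`. The large gap between training and validation F1 at the last epoch suggests the model is overfitting this dataset, so the later epochs add little.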

##### Beer

In [15]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/Beer/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')


Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Structured/Beer/joined_train.csv"
0% [############################# ] 100% | ETA: 00:00:00
Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Structured/Beer/joined_valid.csv"
0% [############################# ] 100% | ETA: 00:00:00
Reading and processing data from "/content/IC/datasesErros/BDirty_50_3_5/Structured/Beer/joined_test.csv"
0% [############################# ] 100% | ETA: 00:00:00
Building vocabulary
0% [#] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00

Computing principal components
0% [#] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


In [16]:
model = dm.MatchingModel(attr_summarizer='sif')

In [17]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='sif_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 542402
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 1 || Run Time:    0.2 | Load Time:    0.3 || F1:  23.68 | Prec:  25.00 | Rec:  22.50 || Ex/s: 575.75

===>  EVAL Epoch 1
Finished Epoch 1 || Run Time:    0.0 | Load Time:    0.1 || F1:   0.00 | Prec:   0.00 | Rec:   0.00 || Ex/s: 710.38

* Best F1: tensor(0., device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 2 || Run Time:    0.2 | Load Time:    0.3 || F1:   0.00 | Prec:   0.00 | Rec:   0.00 || Ex/s: 584.38

===>  EVAL Epoch 2
Finished Epoch 2 || Run Time:    0.0 | Load Time:    0.1 || F1:   0.00 | Prec:   0.00 | Rec:   0.00 || Ex/s: 715.71

---------------------

===>  TRAIN Epoch 3


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 3 || Run Time:    0.2 | Load Time:    0.3 || F1:   4.88 | Prec: 100.00 | Rec:   2.50 || Ex/s: 598.77

===>  EVAL Epoch 3
Finished Epoch 3 || Run Time:    0.0 | Load Time:    0.1 || F1:   0.00 | Prec:   0.00 | Rec:   0.00 || Ex/s: 735.51

---------------------

===>  TRAIN Epoch 4


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 4 || Run Time:    0.2 | Load Time:    0.3 || F1:   9.52 | Prec: 100.00 | Rec:   5.00 || Ex/s: 575.33

===>  EVAL Epoch 4
Finished Epoch 4 || Run Time:    0.0 | Load Time:    0.1 || F1:  25.00 | Prec: 100.00 | Rec:  14.29 || Ex/s: 721.71

* Best F1: tensor(25., device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 5 || Run Time:    0.2 | Load Time:    0.3 || F1:  13.64 | Prec:  75.00 | Rec:   7.50 || Ex/s: 543.80

===>  EVAL Epoch 5
Finished Epoch 5 || Run Time:    0.0 | Load Time:    0.1 || F1:  33.33 | Prec:  75.00 | Rec:  21.43 || Ex/s: 529.97

* Best F1: tensor(33.3333, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 6


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 6 || Run Time:    0.2 | Load Time:    0.4 || F1:  32.65 | Prec:  88.89 | Rec:  20.00 || Ex/s: 456.18

===>  EVAL Epoch 6
Finished Epoch 6 || Run Time:    0.0 | Load Time:    0.1 || F1:  33.33 | Prec:  75.00 | Rec:  21.43 || Ex/s: 560.98

---------------------

===>  TRAIN Epoch 7


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 7 || Run Time:    0.2 | Load Time:    0.4 || F1:  32.65 | Prec:  88.89 | Rec:  20.00 || Ex/s: 460.70

===>  EVAL Epoch 7
Finished Epoch 7 || Run Time:    0.0 | Load Time:    0.1 || F1:  33.33 | Prec:  75.00 | Rec:  21.43 || Ex/s: 516.33

---------------------

===>  TRAIN Epoch 8


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 8 || Run Time:    0.2 | Load Time:    0.4 || F1:  42.31 | Prec:  91.67 | Rec:  27.50 || Ex/s: 455.98

===>  EVAL Epoch 8
Finished Epoch 8 || Run Time:    0.0 | Load Time:    0.1 || F1:  33.33 | Prec:  75.00 | Rec:  21.43 || Ex/s: 533.62

---------------------

===>  TRAIN Epoch 9


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 9 || Run Time:    0.2 | Load Time:    0.3 || F1:  50.91 | Prec:  93.33 | Rec:  35.00 || Ex/s: 557.71

===>  EVAL Epoch 9
Finished Epoch 9 || Run Time:    0.0 | Load Time:    0.1 || F1:  40.00 | Prec:  66.67 | Rec:  28.57 || Ex/s: 710.39

* Best F1: tensor(40., device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 10


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 10 || Run Time:    0.2 | Load Time:    0.3 || F1:  61.02 | Prec:  94.74 | Rec:  45.00 || Ex/s: 610.33

===>  EVAL Epoch 10
Finished Epoch 10 || Run Time:    0.0 | Load Time:    0.1 || F1:  40.00 | Prec:  66.67 | Rec:  28.57 || Ex/s: 704.89

---------------------

===>  TRAIN Epoch 11


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 11 || Run Time:    0.1 | Load Time:    0.3 || F1:  63.33 | Prec:  95.00 | Rec:  47.50 || Ex/s: 606.64

===>  EVAL Epoch 11
Finished Epoch 11 || Run Time:    0.0 | Load Time:    0.1 || F1:  40.00 | Prec:  66.67 | Rec:  28.57 || Ex/s: 705.63

---------------------

===>  TRAIN Epoch 12


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 12 || Run Time:    0.2 | Load Time:    0.3 || F1:  63.33 | Prec:  95.00 | Rec:  47.50 || Ex/s: 614.13

===>  EVAL Epoch 12
Finished Epoch 12 || Run Time:    0.0 | Load Time:    0.1 || F1:  40.00 | Prec:  66.67 | Rec:  28.57 || Ex/s: 711.42

---------------------

===>  TRAIN Epoch 13


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 13 || Run Time:    0.1 | Load Time:    0.3 || F1:  63.33 | Prec:  95.00 | Rec:  47.50 || Ex/s: 614.02

===>  EVAL Epoch 13
Finished Epoch 13 || Run Time:    0.0 | Load Time:    0.1 || F1:  40.00 | Prec:  66.67 | Rec:  28.57 || Ex/s: 717.86

---------------------

===>  TRAIN Epoch 14


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 14 || Run Time:    0.2 | Load Time:    0.3 || F1:  63.33 | Prec:  95.00 | Rec:  47.50 || Ex/s: 612.75

===>  EVAL Epoch 14
Finished Epoch 14 || Run Time:    0.0 | Load Time:    0.1 || F1:  40.00 | Prec:  66.67 | Rec:  28.57 || Ex/s: 699.31

---------------------

===>  TRAIN Epoch 15


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 15 || Run Time:    0.2 | Load Time:    0.3 || F1:  65.57 | Prec:  95.24 | Rec:  50.00 || Ex/s: 597.27

===>  EVAL Epoch 15
Finished Epoch 15 || Run Time:    0.0 | Load Time:    0.1 || F1:  38.10 | Prec:  57.14 | Rec:  28.57 || Ex/s: 719.30

---------------------

Loading best model...
Training done.


tensor(40., device='cuda:0')

In [18]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 9




Finished Epoch 9 || Run Time:    0.0 | Load Time:    0.1 || F1:  21.05 | Prec:  40.00 | Rec:  14.29 || Ex/s: 686.27



tensor(21.0526, device='cuda:0')
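
The F1 score reported in each log line is the harmonic mean of the precision and recall printed next to it. As a quick sanity check, the sketch below recomputes F1 from the test-set precision and recall logged above (Prec 40.00, Rec 14.29). `f1_from_pr` is a hypothetical helper name used only for illustration; it is not part of `deepmatcher`.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        # Matches the log convention of reporting F1 = 0.00 when nothing is predicted correctly.
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall copied from the test-set log line above.
test_f1 = f1_from_pr(40.00, 14.29)
print(round(test_f1, 2))
```

The result should land very close to the logged 21.05; any small difference comes from the recall being rounded to two decimals in the log.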

##### DBLP-ACM

In [13]:
# Load and preprocess the DBLP-ACM train/validation/test splits
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/DBLP-ACM/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [14]:
# Build a matching model that uses the SIF attribute summarizer
model = dm.MatchingModel(attr_summarizer='sif')

In [15]:
# Train for 15 epochs; the model with the best validation F1 is saved to best_save_path
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='sif_model.pth',
    pos_neg_ratio=3)



* Number of trainable parameters: 542402
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 1 || Run Time:    4.1 | Load Time:   11.9 || F1:  72.47 | Prec:  70.11 | Rec:  75.00 || Ex/s: 465.40

===>  EVAL Epoch 1


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 1 || Run Time:    0.8 | Load Time:    3.6 || F1:  84.38 | Prec:  73.10 | Rec:  99.77 || Ex/s: 566.10

* Best F1: tensor(84.3810, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 2 || Run Time:    3.9 | Load Time:   11.3 || F1:  92.55 | Prec:  86.99 | Rec:  98.87 || Ex/s: 485.85

===>  EVAL Epoch 2


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 2 || Run Time:    0.8 | Load Time:    3.6 || F1:  93.16 | Prec:  88.62 | Rec:  98.20 || Ex/s: 563.08

* Best F1: tensor(93.1624, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 3 || Run Time:    4.9 | Load Time:   13.5 || F1:  96.18 | Prec:  93.23 | Rec:  99.32 || Ex/s: 402.16

===>  EVAL Epoch 3


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 3 || Run Time:    0.8 | Load Time:    3.6 || F1:  93.93 | Prec:  90.59 | Rec:  97.52 || Ex/s: 559.88

* Best F1: tensor(93.9262, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 4 || Run Time:    3.9 | Load Time:   11.3 || F1:  98.08 | Prec:  96.44 | Rec:  99.77 || Ex/s: 490.03

===>  EVAL Epoch 4


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 4 || Run Time:    0.8 | Load Time:    3.6 || F1:  94.61 | Prec:  92.47 | Rec:  96.85 || Ex/s: 564.65

* Best F1: tensor(94.6095, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 5 || Run Time:    4.0 | Load Time:   11.7 || F1:  98.81 | Prec:  97.79 | Rec:  99.85 || Ex/s: 472.00

===>  EVAL Epoch 5


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 5 || Run Time:    0.8 | Load Time:    3.6 || F1:  93.95 | Prec:  90.25 | Rec:  97.97 || Ex/s: 564.09

---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 6 || Run Time:    4.0 | Load Time:   11.5 || F1:  99.22 | Prec:  98.52 | Rec:  99.92 || Ex/s: 480.44

===>  EVAL Epoch 6


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 6 || Run Time:    0.8 | Load Time:    3.6 || F1:  94.07 | Prec:  90.27 | Rec:  98.20 || Ex/s: 564.90

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 7 || Run Time:    3.8 | Load Time:   11.3 || F1:  99.33 | Prec:  98.81 | Rec:  99.85 || Ex/s: 488.74

===>  EVAL Epoch 7


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 7 || Run Time:    0.8 | Load Time:    3.6 || F1:  94.17 | Prec:  90.46 | Rec:  98.20 || Ex/s: 566.01

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 8 || Run Time:    4.0 | Load Time:   11.7 || F1:  99.63 | Prec:  99.25 | Rec: 100.00 || Ex/s: 473.14

===>  EVAL Epoch 8


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 8 || Run Time:    0.8 | Load Time:    3.6 || F1:  94.16 | Prec:  90.62 | Rec:  97.97 || Ex/s: 563.98

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 9 || Run Time:    3.8 | Load Time:   11.3 || F1:  99.70 | Prec:  99.40 | Rec: 100.00 || Ex/s: 488.19

===>  EVAL Epoch 9


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 9 || Run Time:    0.8 | Load Time:    3.6 || F1:  94.35 | Prec:  91.18 | Rec:  97.75 || Ex/s: 567.33

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 10 || Run Time:    3.8 | Load Time:   11.3 || F1:  99.70 | Prec:  99.40 | Rec: 100.00 || Ex/s: 489.40

===>  EVAL Epoch 10


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 10 || Run Time:    0.8 | Load Time:    3.5 || F1:  94.86 | Prec:  92.14 | Rec:  97.75 || Ex/s: 568.91

* Best F1: tensor(94.8634, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 11 || Run Time:    3.9 | Load Time:   11.3 || F1:  99.70 | Prec:  99.40 | Rec: 100.00 || Ex/s: 488.79

===>  EVAL Epoch 11


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 11 || Run Time:    0.8 | Load Time:    3.6 || F1:  94.86 | Prec:  92.14 | Rec:  97.75 || Ex/s: 565.73

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 12 || Run Time:    4.0 | Load Time:   11.7 || F1:  99.78 | Prec:  99.55 | Rec: 100.00 || Ex/s: 474.33

===>  EVAL Epoch 12


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 12 || Run Time:    0.8 | Load Time:    3.6 || F1:  94.97 | Prec:  92.34 | Rec:  97.75 || Ex/s: 560.92

* Best F1: tensor(94.9672, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 13 || Run Time:    3.9 | Load Time:   11.4 || F1:  99.81 | Prec:  99.63 | Rec: 100.00 || Ex/s: 486.14

===>  EVAL Epoch 13


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 13 || Run Time:    0.8 | Load Time:    3.5 || F1:  95.07 | Prec:  92.54 | Rec:  97.75 || Ex/s: 566.78

* Best F1: tensor(95.0712, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 14 || Run Time:    3.9 | Load Time:   11.3 || F1:  99.81 | Prec:  99.63 | Rec: 100.00 || Ex/s: 488.48

===>  EVAL Epoch 14


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 14 || Run Time:    0.8 | Load Time:    3.6 || F1:  94.97 | Prec:  92.34 | Rec:  97.75 || Ex/s: 564.48

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 15 || Run Time:    3.9 | Load Time:   11.3 || F1:  99.81 | Prec:  99.63 | Rec: 100.00 || Ex/s: 489.29

===>  EVAL Epoch 15


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 15 || Run Time:    0.8 | Load Time:    3.5 || F1:  94.97 | Prec:  92.34 | Rec:  97.75 || Ex/s: 565.72

---------------------

Loading best model...
Training done.


tensor(95.0712, device='cuda:0')

In [16]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 13


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 13 || Run Time:    0.9 | Load Time:    3.8 || F1:  93.94 | Prec:  92.01 | Rec:  95.95 || Ex/s: 521.11



tensor(93.9360, device='cuda:0')
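
The per-epoch summary lines in these logs follow a regular shape, so they can be turned into numbers for plotting or for comparing runs. The following is a minimal sketch that assumes the line format stays exactly as printed above; `EPOCH_LINE` and `parse_epoch_line` are hypothetical names introduced here, not `deepmatcher` functions.

```python
import re

# Pattern for lines such as:
# "Finished Epoch 13 || Run Time:    0.9 | Load Time:    3.8 || F1:  93.94 | ..."
EPOCH_LINE = re.compile(
    r"Finished Epoch (?P<epoch>\d+) \|\| "
    r"Run Time:\s*(?P<run_time>[\d.]+) \| "
    r"Load Time:\s*(?P<load_time>[\d.]+) \|\| "
    r"F1:\s*(?P<f1>[\d.]+) \| "
    r"Prec:\s*(?P<prec>[\d.]+) \| "
    r"Rec:\s*(?P<rec>[\d.]+) \|\| "
    r"Ex/s:\s*(?P<ex_per_s>[\d.]+)"
)

def parse_epoch_line(line):
    """Return the metrics of one summary line as a dict, or None if the line does not match."""
    match = EPOCH_LINE.search(line)
    if match is None:
        return None
    metrics = {key: float(value) for key, value in match.groupdict().items()}
    metrics["epoch"] = int(metrics["epoch"])
    return metrics

# Test-set summary line copied from the output above.
sample = ("Finished Epoch 13 || Run Time:    0.9 | Load Time:    3.8 || "
          "F1:  93.94 | Prec:  92.01 | Rec:  95.95 || Ex/s: 521.11")
parsed = parse_epoch_line(sample)
print(parsed)
```

Lines that are not epoch summaries, such as progress bars or `Saving best model...`, return `None` and can simply be skipped.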

##### DBLP-GoogleScholar

In [17]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/DBLP-GoogleScholar/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [18]:
model = dm.MatchingModel(attr_summarizer='sif')

In [19]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='sif_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 542402
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:33


Finished Epoch 1 || Run Time:    9.4 | Load Time:   24.8 || F1:  57.46 | Prec:  52.85 | Rec:  62.96 || Ex/s: 502.64

===>  EVAL Epoch 1


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 1 || Run Time:    1.9 | Load Time:    7.2 || F1:  70.98 | Prec:  63.21 | Rec:  80.93 || Ex/s: 633.13

* Best F1: tensor(70.9836, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:31


Finished Epoch 2 || Run Time:    8.7 | Load Time:   23.2 || F1:  79.64 | Prec:  71.95 | Rec:  89.18 || Ex/s: 539.20

===>  EVAL Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 2 || Run Time:    1.9 | Load Time:    7.3 || F1:  77.04 | Prec:  76.33 | Rec:  77.76 || Ex/s: 626.16

* Best F1: tensor(77.0370, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:32


Finished Epoch 3 || Run Time:    9.0 | Load Time:   23.6 || F1:  88.69 | Prec:  83.35 | Rec:  94.76 || Ex/s: 528.54

===>  EVAL Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 3 || Run Time:    1.9 | Load Time:    7.2 || F1:  76.62 | Prec:  80.18 | Rec:  73.36 || Ex/s: 634.93

---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:31


Finished Epoch 4 || Run Time:    8.7 | Load Time:   23.2 || F1:  93.78 | Prec:  90.75 | Rec:  97.01 || Ex/s: 539.29

===>  EVAL Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 4 || Run Time:    1.9 | Load Time:    7.2 || F1:  77.44 | Prec:  79.78 | Rec:  75.23 || Ex/s: 636.84

* Best F1: tensor(77.4411, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:32


Finished Epoch 5 || Run Time:    8.9 | Load Time:   23.6 || F1:  96.37 | Prec:  94.76 | Rec:  98.04 || Ex/s: 530.58

===>  EVAL Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 5 || Run Time:    1.9 | Load Time:    7.2 || F1:  77.94 | Prec:  78.12 | Rec:  77.76 || Ex/s: 635.92

* Best F1: tensor(77.9391, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:31


Finished Epoch 6 || Run Time:    8.7 | Load Time:   23.2 || F1:  97.85 | Prec:  97.11 | Rec:  98.60 || Ex/s: 539.70

===>  EVAL Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 6 || Run Time:    1.9 | Load Time:    7.2 || F1:  78.22 | Prec:  79.08 | Rec:  77.38 || Ex/s: 634.32

* Best F1: tensor(78.2239, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:31


Finished Epoch 7 || Run Time:    8.7 | Load Time:   23.2 || F1:  98.48 | Prec:  98.05 | Rec:  98.91 || Ex/s: 538.65

===>  EVAL Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 7 || Run Time:    2.0 | Load Time:    7.5 || F1:  78.20 | Prec:  77.80 | Rec:  78.60 || Ex/s: 602.25

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:31


Finished Epoch 8 || Run Time:    8.7 | Load Time:   23.1 || F1:  98.97 | Prec:  98.79 | Rec:  99.16 || Ex/s: 540.90

===>  EVAL Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 8 || Run Time:    1.9 | Load Time:    7.2 || F1:  78.49 | Prec:  81.49 | Rec:  75.70 || Ex/s: 632.06

* Best F1: tensor(78.4884, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:31


Finished Epoch 9 || Run Time:    8.7 | Load Time:   23.2 || F1:  99.24 | Prec:  99.19 | Rec:  99.28 || Ex/s: 538.92

===>  EVAL Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 9 || Run Time:    1.9 | Load Time:    7.2 || F1:  78.65 | Prec:  82.40 | Rec:  75.23 || Ex/s: 627.88

* Best F1: tensor(78.6517, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:32


Finished Epoch 10 || Run Time:    8.9 | Load Time:   23.6 || F1:  99.28 | Prec:  99.28 | Rec:  99.28 || Ex/s: 530.74

===>  EVAL Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 10 || Run Time:    1.9 | Load Time:    7.2 || F1:  78.56 | Prec:  82.08 | Rec:  75.33 || Ex/s: 631.63

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:31


Finished Epoch 11 || Run Time:    8.8 | Load Time:   23.2 || F1:  99.42 | Prec:  99.44 | Rec:  99.41 || Ex/s: 538.60

===>  EVAL Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 11 || Run Time:    1.9 | Load Time:    7.2 || F1:  78.41 | Prec:  81.33 | Rec:  75.70 || Ex/s: 633.91

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:32


Finished Epoch 12 || Run Time:    9.2 | Load Time:   23.8 || F1:  99.45 | Prec:  99.47 | Rec:  99.44 || Ex/s: 522.12

===>  EVAL Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 12 || Run Time:    2.0 | Load Time:    7.3 || F1:  78.37 | Prec:  80.69 | Rec:  76.17 || Ex/s: 620.80

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:31


Finished Epoch 13 || Run Time:    8.8 | Load Time:   23.3 || F1:  99.53 | Prec:  99.59 | Rec:  99.47 || Ex/s: 537.02

===>  EVAL Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 13 || Run Time:    1.9 | Load Time:    7.2 || F1:  78.09 | Prec:  78.91 | Rec:  77.29 || Ex/s: 634.93

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:32


Finished Epoch 14 || Run Time:    9.0 | Load Time:   23.7 || F1:  99.55 | Prec:  99.59 | Rec:  99.50 || Ex/s: 528.14

===>  EVAL Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 14 || Run Time:    2.0 | Load Time:    7.3 || F1:  78.04 | Prec:  77.86 | Rec:  78.22 || Ex/s: 620.10

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:31


Finished Epoch 15 || Run Time:    8.8 | Load Time:   23.3 || F1:  99.56 | Prec:  99.63 | Rec:  99.50 || Ex/s: 536.32

===>  EVAL Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 15 || Run Time:    1.9 | Load Time:    7.2 || F1:  78.08 | Prec:  79.29 | Rec:  76.92 || Ex/s: 630.99

---------------------

Loading best model...
Training done.


tensor(78.6517, device='cuda:0')

In [20]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 9 || Run Time:    1.9 | Load Time:    7.3 || F1:  77.26 | Prec:  79.92 | Rec:  74.77 || Ex/s: 624.48



tensor(77.2574, device='cuda:0')
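
The cells for each dataset in this section differ only in the folder passed to `dm.data.process`, so the three steps can be wrapped in one function. The sketch below is one possible way to do that, not the notebook's own method. `BASE_DIR`, `dataset_dir`, and `run_dataset` are hypothetical names, and the per-dataset checkpoint file name is an assumption (the cells above reuse `'sif_model.pth'`, so each run overwrites the previous checkpoint). The `deepmatcher` calls and their arguments are the ones already used in the cells above, and the return values are assumed to be the F1 tensors those cells display.

```python
# Root folder taken from the paths used in the cells above.
BASE_DIR = '/content/IC/datasesErros/BDirty_50_3_5'

def dataset_dir(kind, name, base_dir=BASE_DIR):
    """Build the dataset folder path, e.g. kind='Structured', name='DBLP-ACM'."""
    return '{}/{}/{}/'.format(base_dir, kind, name)

def run_dataset(kind, name, epochs=15, batch_size=32, pos_neg_ratio=3):
    """Process, train, and evaluate one dataset; returns (best validation F1, test F1)."""
    # Imported lazily so the helpers above can be used without deepmatcher installed.
    import deepmatcher as dm

    train, validation, test = dm.data.process(
        path=dataset_dir(kind, name),
        train='joined_train.csv',
        validation='joined_valid.csv',
        test='joined_test.csv')
    model = dm.MatchingModel(attr_summarizer='sif')
    best_validation_f1 = model.run_train(
        train,
        validation,
        epochs=epochs,
        batch_size=batch_size,
        # Assumption: one checkpoint per dataset instead of a shared 'sif_model.pth'.
        best_save_path='sif_model_{}.pth'.format(name),
        pos_neg_ratio=pos_neg_ratio)
    test_f1 = model.run_eval(test)
    return best_validation_f1, test_f1

print(dataset_dir('Structured', 'DBLP-ACM'))
```

With this in place, a call such as `run_dataset('Structured', 'Walmart-Amazon')` would replace the three separate cells for that dataset.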

##### Walmart-Amazon

In [21]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/Walmart-Amazon/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [22]:
model = dm.MatchingModel(attr_summarizer='sif')

In [23]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='sif_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 542402
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 1 || Run Time:    4.1 | Load Time:    6.8 || F1:  18.03 | Prec:  51.22 | Rec:  10.94 || Ex/s: 564.63

===>  EVAL Epoch 1


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 1 || Run Time:    0.8 | Load Time:    2.0 || F1:  44.02 | Prec:  40.89 | Rec:  47.67 || Ex/s: 721.32

* Best F1: tensor(44.0191, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 2 || Run Time:    3.9 | Load Time:    6.5 || F1:  50.23 | Prec:  53.88 | Rec:  47.05 || Ex/s: 593.01

===>  EVAL Epoch 2


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 2 || Run Time:    0.8 | Load Time:    2.0 || F1:  48.80 | Prec:  42.11 | Rec:  58.03 || Ex/s: 706.92

* Best F1: tensor(48.8017, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 3 || Run Time:    3.9 | Load Time:    6.5 || F1:  66.31 | Prec:  67.75 | Rec:  64.93 || Ex/s: 592.18

===>  EVAL Epoch 3


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 3 || Run Time:    0.8 | Load Time:    2.0 || F1:  50.35 | Prec:  45.76 | Rec:  55.96 || Ex/s: 715.91

* Best F1: tensor(50.3497, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 4 || Run Time:    3.9 | Load Time:    6.5 || F1:  78.69 | Prec:  78.22 | Rec:  79.17 || Ex/s: 590.76

===>  EVAL Epoch 4


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 4 || Run Time:    0.8 | Load Time:    2.0 || F1:  50.12 | Prec:  46.85 | Rec:  53.89 || Ex/s: 714.58

---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 5 || Run Time:    3.9 | Load Time:    6.5 || F1:  88.08 | Prec:  87.63 | Rec:  88.54 || Ex/s: 592.98

===>  EVAL Epoch 5


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 5 || Run Time:    0.8 | Load Time:    2.0 || F1:  50.78 | Prec:  50.78 | Rec:  50.78 || Ex/s: 716.12

* Best F1: tensor(50.7772, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 6 || Run Time:    3.9 | Load Time:    6.5 || F1:  93.68 | Prec:  93.44 | Rec:  93.92 || Ex/s: 590.77

===>  EVAL Epoch 6


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 6 || Run Time:    0.9 | Load Time:    2.0 || F1:  51.48 | Prec:  60.00 | Rec:  45.08 || Ex/s: 710.34

* Best F1: tensor(51.4793, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 7 || Run Time:    4.1 | Load Time:    6.8 || F1:  96.08 | Prec:  96.34 | Rec:  95.83 || Ex/s: 562.79

===>  EVAL Epoch 7


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 7 || Run Time:    0.8 | Load Time:    2.0 || F1:  51.67 | Prec:  62.50 | Rec:  44.04 || Ex/s: 710.04

* Best F1: tensor(51.6717, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 8 || Run Time:    3.9 | Load Time:    6.5 || F1:  97.66 | Prec:  97.41 | Rec:  97.92 || Ex/s: 589.68

===>  EVAL Epoch 8


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 8 || Run Time:    0.8 | Load Time:    2.0 || F1:  51.05 | Prec:  60.71 | Rec:  44.04 || Ex/s: 721.01

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 9 || Run Time:    3.8 | Load Time:    6.4 || F1:  98.44 | Prec:  98.27 | Rec:  98.61 || Ex/s: 600.12

===>  EVAL Epoch 9


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 9 || Run Time:    0.8 | Load Time:    2.0 || F1:  51.81 | Prec:  61.87 | Rec:  44.56 || Ex/s: 724.37

* Best F1: tensor(51.8072, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 10 || Run Time:    3.8 | Load Time:    6.4 || F1:  98.53 | Prec:  98.44 | Rec:  98.61 || Ex/s: 601.72

===>  EVAL Epoch 10


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 10 || Run Time:    0.8 | Load Time:    2.0 || F1:  51.96 | Prec:  62.32 | Rec:  44.56 || Ex/s: 720.32

* Best F1: tensor(51.9637, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 11 || Run Time:    3.8 | Load Time:    6.4 || F1:  98.79 | Prec:  98.62 | Rec:  98.96 || Ex/s: 601.49

===>  EVAL Epoch 11


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 11 || Run Time:    0.8 | Load Time:    2.0 || F1:  51.96 | Prec:  62.32 | Rec:  44.56 || Ex/s: 723.47

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 12 || Run Time:    3.8 | Load Time:    6.4 || F1:  98.96 | Prec:  98.96 | Rec:  98.96 || Ex/s: 601.57

===>  EVAL Epoch 12


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 12 || Run Time:    0.8 | Load Time:    2.0 || F1:  52.12 | Prec:  62.77 | Rec:  44.56 || Ex/s: 718.63

* Best F1: tensor(52.1212, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 13 || Run Time:    3.8 | Load Time:    6.5 || F1:  99.04 | Prec:  99.13 | Rec:  98.96 || Ex/s: 596.77

===>  EVAL Epoch 13


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 13 || Run Time:    0.8 | Load Time:    2.0 || F1:  51.96 | Prec:  62.32 | Rec:  44.56 || Ex/s: 723.84

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 14 || Run Time:    4.0 | Load Time:    6.7 || F1:  99.04 | Prec:  99.13 | Rec:  98.96 || Ex/s: 572.29

===>  EVAL Epoch 14


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 14 || Run Time:    0.8 | Load Time:    2.0 || F1:  51.81 | Prec:  61.87 | Rec:  44.56 || Ex/s: 724.24

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 15 || Run Time:    3.8 | Load Time:    6.4 || F1:  99.04 | Prec:  99.13 | Rec:  98.96 || Ex/s: 599.33

===>  EVAL Epoch 15


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 15 || Run Time:    0.8 | Load Time:    2.0 || F1:  51.65 | Prec:  61.43 | Rec:  44.56 || Ex/s: 716.56

---------------------

Loading best model...
Training done.


tensor(52.1212, device='cuda:0')

In [24]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 12


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:02


Finished Epoch 12 || Run Time:    0.8 | Load Time:    2.0 || F1:  55.90 | Prec:  69.77 | Rec:  46.63 || Ex/s: 710.30



tensor(55.9006, device='cuda:0')
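
In the three structured runs above, training F1 climbs to around 99 while validation F1 plateaus much earlier. A rough way to compare how far apart the two end up is the train-validation gap at the final epoch. The sketch below uses the epoch 15 values copied from the logs above; `final_epoch_f1` and `f1_gap` are hypothetical names for illustration.

```python
# Epoch 15 F1 scores (in percent) copied from the training logs above.
final_epoch_f1 = {
    'DBLP-ACM': {'train': 99.81, 'validation': 94.97},
    'DBLP-GoogleScholar': {'train': 99.56, 'validation': 78.08},
    'Walmart-Amazon': {'train': 99.04, 'validation': 51.65},
}

def f1_gap(scores):
    """Difference between training and validation F1, rounded to two decimals."""
    return round(scores['train'] - scores['validation'], 2)

gaps = {name: f1_gap(scores) for name, scores in final_epoch_f1.items()}

# Print datasets from the smallest to the largest gap.
for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print('{:<20} train-validation F1 gap: {:5.2f}'.format(name, gap))
```

A large gap is only a hint of overfitting, not a diagnosis; the test-set F1 reported by `model.run_eval(test)` remains the number to compare across datasets.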

#### TEXTUAL

##### Abt-Buy

In [25]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Textual/Abt-Buy/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [26]:
model = dm.MatchingModel(attr_summarizer='sif')

In [27]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='sif_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 542402
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:13


Finished Epoch 1 || Run Time:    2.7 | Load Time:   10.4 || F1:   0.32 | Prec:   9.09 | Rec:   0.16 || Ex/s: 440.35

===>  EVAL Epoch 1


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 1 || Run Time:    0.5 | Load Time:    3.0 || F1:   0.00 | Prec:   0.00 | Rec:   0.00 || Ex/s: 536.85

* Best F1: tensor(0., device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 2 || Run Time:    2.7 | Load Time:   10.3 || F1:  15.05 | Prec:  43.75 | Rec:   9.09 || Ex/s: 441.86

===>  EVAL Epoch 2


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 2 || Run Time:    0.5 | Load Time:    3.0 || F1:  15.54 | Prec:  25.56 | Rec:  11.17 || Ex/s: 536.76

* Best F1: tensor(15.5405, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:13


Finished Epoch 3 || Run Time:    2.6 | Load Time:   10.4 || F1:  38.22 | Prec:  43.24 | Rec:  34.25 || Ex/s: 441.19

===>  EVAL Epoch 3


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 3 || Run Time:    0.5 | Load Time:    3.2 || F1:  22.01 | Prec:  21.70 | Rec:  22.33 || Ex/s: 515.79

* Best F1: tensor(22.0096, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 4 || Run Time:    2.5 | Load Time:   10.2 || F1:  58.92 | Prec:  54.15 | Rec:  64.61 || Ex/s: 451.28

===>  EVAL Epoch 4


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 4 || Run Time:    0.5 | Load Time:    3.0 || F1:  24.39 | Prec:  20.98 | Rec:  29.13 || Ex/s: 541.60

* Best F1: tensor(24.3902, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 5 || Run Time:    2.5 | Load Time:   10.2 || F1:  71.56 | Prec:  63.79 | Rec:  81.49 || Ex/s: 450.86

===>  EVAL Epoch 5


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 5 || Run Time:    0.5 | Load Time:    3.0 || F1:  22.17 | Prec:  22.01 | Rec:  22.33 || Ex/s: 543.33

---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 6 || Run Time:    2.5 | Load Time:   10.2 || F1:  80.32 | Prec:  72.04 | Rec:  90.75 || Ex/s: 449.87

===>  EVAL Epoch 6


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 6 || Run Time:    0.5 | Load Time:    3.0 || F1:  17.78 | Prec:  20.78 | Rec:  15.53 || Ex/s: 536.02

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 7 || Run Time:    2.6 | Load Time:   10.3 || F1:  85.17 | Prec:  77.75 | Rec:  94.16 || Ex/s: 446.10

===>  EVAL Epoch 7


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 7 || Run Time:    0.5 | Load Time:    3.0 || F1:  17.21 | Prec:  22.14 | Rec:  14.08 || Ex/s: 542.99

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 8 || Run Time:    2.6 | Load Time:   10.2 || F1:  88.40 | Prec:  82.44 | Rec:  95.29 || Ex/s: 448.88

===>  EVAL Epoch 8


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 8 || Run Time:    0.5 | Load Time:    3.0 || F1:  17.44 | Prec:  21.74 | Rec:  14.56 || Ex/s: 538.15

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:13


Finished Epoch 9 || Run Time:    2.7 | Load Time:   10.6 || F1:  91.74 | Prec:  86.71 | Rec:  97.40 || Ex/s: 431.48

===>  EVAL Epoch 9


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 9 || Run Time:    0.5 | Load Time:    3.0 || F1:  17.82 | Prec:  21.83 | Rec:  15.05 || Ex/s: 536.20

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 10 || Run Time:    2.6 | Load Time:   10.3 || F1:  93.34 | Prec:  89.20 | Rec:  97.89 || Ex/s: 448.27

===>  EVAL Epoch 10


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 10 || Run Time:    0.5 | Load Time:    3.0 || F1:  19.84 | Prec:  22.16 | Rec:  17.96 || Ex/s: 543.36

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 11 || Run Time:    2.6 | Load Time:   10.3 || F1:  94.46 | Prec:  90.98 | Rec:  98.21 || Ex/s: 445.59

===>  EVAL Epoch 11


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 11 || Run Time:    0.5 | Load Time:    3.1 || F1:  19.17 | Prec:  20.56 | Rec:  17.96 || Ex/s: 535.66

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 12 || Run Time:    2.6 | Load Time:   10.3 || F1:  95.60 | Prec:  92.55 | Rec:  98.86 || Ex/s: 447.41

===>  EVAL Epoch 12


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 12 || Run Time:    0.5 | Load Time:    3.0 || F1:  19.63 | Prec:  21.64 | Rec:  17.96 || Ex/s: 542.21

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 13 || Run Time:    2.5 | Load Time:   10.2 || F1:  96.29 | Prec:  93.70 | Rec:  99.03 || Ex/s: 449.56

===>  EVAL Epoch 13


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 13 || Run Time:    0.5 | Load Time:    3.0 || F1:  19.13 | Prec:  21.88 | Rec:  16.99 || Ex/s: 546.37

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:13


Finished Epoch 14 || Run Time:    2.6 | Load Time:   10.6 || F1:  96.91 | Prec:  94.59 | Rec:  99.35 || Ex/s: 434.55

===>  EVAL Epoch 14


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 14 || Run Time:    0.5 | Load Time:    3.0 || F1:  19.32 | Prec:  23.29 | Rec:  16.50 || Ex/s: 539.95

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:12


Finished Epoch 15 || Run Time:    2.5 | Load Time:   10.2 || F1:  97.07 | Prec:  94.88 | Rec:  99.35 || Ex/s: 452.22

===>  EVAL Epoch 15


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 15 || Run Time:    0.5 | Load Time:    3.0 || F1:  19.19 | Prec:  23.91 | Rec:  16.02 || Ex/s: 539.08

---------------------

Loading best model...
Training done.


tensor(24.3902, device='cuda:0')

In [28]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 4


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 4 || Run Time:    0.5 | Load Time:    3.0 || F1:  19.92 | Prec:  17.13 | Rec:  23.79 || Ex/s: 539.15



tensor(19.9187, device='cuda:0')
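
The F1 value returned by `run_eval` is the harmonic mean of the precision and recall printed on the same log line. As a sanity check, the minimal sketch below recomputes it in plain Python from the logged percentages. The helper name `f1_from_precision_recall` is hypothetical and is not part of `deepmatcher`.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when precision and recall are both 0
    return 2 * precision * recall / (precision + recall)

# Precision and recall copied from the test-set log line above (percentages).
test_f1 = f1_from_precision_recall(17.13, 23.79)
print(round(test_f1, 2))  # prints 19.92
```

The result agrees with the logged F1 to two decimals. Small differences in later digits are expected because the log prints precision and recall already rounded.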

#### DIRTY

##### DBLP-ACM

In [29]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Dirty/DBLP-ACM/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [30]:
model = dm.MatchingModel(attr_summarizer='sif')

In [31]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='sif_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 542402
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 1 || Run Time:    3.9 | Load Time:   13.8 || F1:  38.03 | Prec:  52.30 | Rec:  29.88 || Ex/s: 418.07

===>  EVAL Epoch 1


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 1 || Run Time:    0.8 | Load Time:    4.3 || F1:  61.41 | Prec:  57.59 | Rec:  65.77 || Ex/s: 481.76

* Best F1: tensor(61.4090, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 2 || Run Time:    4.0 | Load Time:   14.2 || F1:  68.77 | Prec:  58.10 | Rec:  84.23 || Ex/s: 408.24

===>  EVAL Epoch 2


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 2 || Run Time:    0.8 | Load Time:    4.3 || F1:  62.60 | Prec:  51.07 | Rec:  80.86 || Ex/s: 484.59

* Best F1: tensor(62.5981, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 3 || Run Time:    3.9 | Load Time:   13.8 || F1:  79.39 | Prec:  70.78 | Rec:  90.39 || Ex/s: 418.22

===>  EVAL Epoch 3


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 3 || Run Time:    0.8 | Load Time:    4.3 || F1:  65.53 | Prec:  58.89 | Rec:  73.87 || Ex/s: 480.13

* Best F1: tensor(65.5345, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 4 || Run Time:    3.9 | Load Time:   13.8 || F1:  86.79 | Prec:  80.11 | Rec:  94.67 || Ex/s: 419.75

===>  EVAL Epoch 4


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 4 || Run Time:    0.8 | Load Time:    4.3 || F1:  64.50 | Prec:  53.85 | Rec:  80.41 || Ex/s: 482.64

---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 5 || Run Time:    3.9 | Load Time:   13.8 || F1:  91.42 | Prec:  86.58 | Rec:  96.85 || Ex/s: 418.36

===>  EVAL Epoch 5


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 5 || Run Time:    0.8 | Load Time:    4.3 || F1:  64.01 | Prec:  52.69 | Rec:  81.53 || Ex/s: 482.60

---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 6 || Run Time:    4.0 | Load Time:   14.2 || F1:  93.74 | Prec:  90.04 | Rec:  97.75 || Ex/s: 406.72

===>  EVAL Epoch 6


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 6 || Run Time:    0.8 | Load Time:    4.3 || F1:  65.01 | Prec:  54.98 | Rec:  79.50 || Ex/s: 479.83

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 7 || Run Time:    3.9 | Load Time:   13.9 || F1:  95.84 | Prec:  93.25 | Rec:  98.57 || Ex/s: 416.76

===>  EVAL Epoch 7


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 7 || Run Time:    0.8 | Load Time:    4.3 || F1:  65.16 | Prec:  56.10 | Rec:  77.70 || Ex/s: 482.84

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 8 || Run Time:    3.9 | Load Time:   13.8 || F1:  96.69 | Prec:  94.68 | Rec:  98.80 || Ex/s: 419.34

===>  EVAL Epoch 8


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 8 || Run Time:    0.8 | Load Time:    4.3 || F1:  66.80 | Prec:  60.81 | Rec:  74.10 || Ex/s: 481.95

* Best F1: tensor(66.8020, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 9 || Run Time:    3.9 | Load Time:   13.8 || F1:  97.23 | Prec:  95.71 | Rec:  98.80 || Ex/s: 419.80

===>  EVAL Epoch 9


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 9 || Run Time:    0.8 | Load Time:    4.3 || F1:  67.09 | Prec:  63.64 | Rec:  70.95 || Ex/s: 477.94

* Best F1: tensor(67.0927, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 10 || Run Time:    4.1 | Load Time:   14.2 || F1:  97.81 | Prec:  96.56 | Rec:  99.10 || Ex/s: 406.05

===>  EVAL Epoch 10


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 10 || Run Time:    0.8 | Load Time:    4.3 || F1:  66.67 | Prec:  65.58 | Rec:  67.79 || Ex/s: 479.02

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 11 || Run Time:    3.8 | Load Time:   13.8 || F1:  98.00 | Prec:  96.92 | Rec:  99.10 || Ex/s: 420.00

===>  EVAL Epoch 11


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 11 || Run Time:    0.8 | Load Time:    4.3 || F1:  66.59 | Prec:  66.08 | Rec:  67.12 || Ex/s: 484.72

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 12 || Run Time:    3.9 | Load Time:   13.8 || F1:  98.33 | Prec:  97.35 | Rec:  99.32 || Ex/s: 419.76

===>  EVAL Epoch 12


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 12 || Run Time:    0.8 | Load Time:    4.3 || F1:  66.44 | Prec:  66.67 | Rec:  66.22 || Ex/s: 484.14

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 13 || Run Time:    3.8 | Load Time:   13.8 || F1:  98.44 | Prec:  97.57 | Rec:  99.32 || Ex/s: 421.52

===>  EVAL Epoch 13


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 13 || Run Time:    0.8 | Load Time:    4.3 || F1:  66.44 | Prec:  67.13 | Rec:  65.77 || Ex/s: 480.22

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 14 || Run Time:    4.1 | Load Time:   14.2 || F1:  98.44 | Prec:  97.57 | Rec:  99.32 || Ex/s: 405.38

===>  EVAL Epoch 14


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 14 || Run Time:    0.8 | Load Time:    4.3 || F1:  66.82 | Prec:  67.91 | Rec:  65.77 || Ex/s: 482.00

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:17


Finished Epoch 15 || Run Time:    3.9 | Load Time:   13.8 || F1:  98.51 | Prec:  97.71 | Rec:  99.32 || Ex/s: 420.45

===>  EVAL Epoch 15


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 15 || Run Time:    0.8 | Load Time:    4.3 || F1:  66.90 | Prec:  68.56 | Rec:  65.32 || Ex/s: 484.46

---------------------

Loading best model...
Training done.


tensor(67.0927, device='cuda:0')

In [32]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 9


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 9 || Run Time:    0.8 | Load Time:    4.3 || F1:  70.53 | Prec:  66.21 | Rec:  75.45 || Ex/s: 486.86



tensor(70.5263, device='cuda:0')
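
With 15 epochs per run, picking the best validation epoch by eye is error-prone. The sketch below pulls the metrics out of the `Finished Epoch` summary lines with a regular expression. The names `EPOCH_LINE` and `parse_epoch_lines` are hypothetical, and the pattern assumes the log layout shown in this notebook. The sample contains only validation (EVAL) lines copied from the run above; a full log interleaves TRAIN and EVAL lines, so those would need to be separated first.

```python
import re

# Matches the per-epoch summary lines printed during training and evaluation.
EPOCH_LINE = re.compile(
    r"Finished Epoch (\d+) \|\|.*?F1:\s*([\d.]+) \| Prec:\s*([\d.]+) \| Rec:\s*([\d.]+)"
)

def parse_epoch_lines(log_text):
    """Return a list of (epoch, f1, precision, recall) tuples found in log_text."""
    rows = []
    for match in EPOCH_LINE.finditer(log_text):
        epoch, f1, prec, rec = match.groups()
        rows.append((int(epoch), float(f1), float(prec), float(rec)))
    return rows

# Validation lines for epochs 8 to 10 of the Dirty DBLP-ACM run above.
sample_log = """
Finished Epoch 8 || Run Time:    0.8 | Load Time:    4.3 || F1:  66.80 | Prec:  60.81 | Rec:  74.10 || Ex/s: 481.95
Finished Epoch 9 || Run Time:    0.8 | Load Time:    4.3 || F1:  67.09 | Prec:  63.64 | Rec:  70.95 || Ex/s: 477.94
Finished Epoch 10 || Run Time:    0.8 | Load Time:    4.3 || F1:  66.67 | Prec:  65.58 | Rec:  67.79 || Ex/s: 479.02
"""

rows = parse_epoch_lines(sample_log)
best = max(rows, key=lambda row: row[1])  # row[1] is the F1 column
print(best)  # prints (9, 67.09, 63.64, 70.95)
```

Epoch 9 is also the checkpoint that `run_eval` reports loading for the test set in the output above.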

##### DBLP-GoogleScholar

In [33]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Dirty/DBLP-GoogleScholar/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [34]:
model = dm.MatchingModel(attr_summarizer='sif')

In [35]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='sif_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 542402
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:36


Finished Epoch 1 || Run Time:    8.9 | Load Time:   27.9 || F1:  45.19 | Prec:  43.44 | Rec:  47.08 || Ex/s: 467.13

===>  EVAL Epoch 1


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 1 || Run Time:    1.9 | Load Time:    8.5 || F1:  60.63 | Prec:  57.50 | Rec:  64.11 || Ex/s: 552.12

* Best F1: tensor(60.6275, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:35


Finished Epoch 2 || Run Time:    8.7 | Load Time:   27.5 || F1:  68.99 | Prec:  59.62 | Rec:  81.85 || Ex/s: 476.25

===>  EVAL Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 2 || Run Time:    1.9 | Load Time:    8.6 || F1:  64.91 | Prec:  61.09 | Rec:  69.25 || Ex/s: 547.21

* Best F1: tensor(64.9146, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:36


Finished Epoch 3 || Run Time:    8.8 | Load Time:   27.9 || F1:  80.16 | Prec:  72.61 | Rec:  89.46 || Ex/s: 468.99

===>  EVAL Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 3 || Run Time:    1.9 | Load Time:    8.5 || F1:  66.19 | Prec:  66.89 | Rec:  65.51 || Ex/s: 555.70

* Best F1: tensor(66.1945, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:35


Finished Epoch 4 || Run Time:    8.8 | Load Time:   27.5 || F1:  87.88 | Prec:  82.55 | Rec:  93.95 || Ex/s: 474.34

===>  EVAL Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 4 || Run Time:    1.9 | Load Time:    8.6 || F1:  67.11 | Prec:  67.68 | Rec:  66.54 || Ex/s: 547.25

* Best F1: tensor(67.1065, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:36


Finished Epoch 5 || Run Time:    9.0 | Load Time:   27.8 || F1:  92.38 | Prec:  88.97 | Rec:  96.07 || Ex/s: 467.96

===>  EVAL Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 5 || Run Time:    1.9 | Load Time:    8.5 || F1:  67.37 | Prec:  61.69 | Rec:  74.21 || Ex/s: 556.30

* Best F1: tensor(67.3738, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:35


Finished Epoch 6 || Run Time:    8.8 | Load Time:   27.5 || F1:  95.05 | Prec:  92.96 | Rec:  97.22 || Ex/s: 474.79

===>  EVAL Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 6 || Run Time:    1.9 | Load Time:    8.5 || F1:  67.37 | Prec:  69.34 | Rec:  65.51 || Ex/s: 552.96

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:36


Finished Epoch 7 || Run Time:    8.9 | Load Time:   27.8 || F1:  96.78 | Prec:  95.78 | Rec:  97.79 || Ex/s: 469.59

===>  EVAL Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 7 || Run Time:    1.9 | Load Time:    8.5 || F1:  66.40 | Prec:  72.28 | Rec:  61.40 || Ex/s: 554.39

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:35


Finished Epoch 8 || Run Time:    8.7 | Load Time:   27.5 || F1:  97.62 | Prec:  96.95 | Rec:  98.29 || Ex/s: 475.45

===>  EVAL Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 8 || Run Time:    1.9 | Load Time:    8.5 || F1:  65.99 | Prec:  72.22 | Rec:  60.75 || Ex/s: 555.02

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:36


Finished Epoch 9 || Run Time:    8.9 | Load Time:   27.9 || F1:  98.31 | Prec:  98.02 | Rec:  98.60 || Ex/s: 468.28

===>  EVAL Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 9 || Run Time:    1.9 | Load Time:    8.5 || F1:  66.50 | Prec:  71.51 | Rec:  62.15 || Ex/s: 554.16

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:35


Finished Epoch 10 || Run Time:    8.7 | Load Time:   27.5 || F1:  98.55 | Prec:  98.44 | Rec:  98.66 || Ex/s: 476.24

===>  EVAL Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 10 || Run Time:    1.9 | Load Time:    8.5 || F1:  66.33 | Prec:  71.49 | Rec:  61.87 || Ex/s: 554.19

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:36


Finished Epoch 11 || Run Time:    8.9 | Load Time:   27.9 || F1:  98.75 | Prec:  98.81 | Rec:  98.69 || Ex/s: 468.20

===>  EVAL Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 11 || Run Time:    1.9 | Load Time:    8.5 || F1:  66.73 | Prec:  70.71 | Rec:  63.18 || Ex/s: 553.69

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:35


Finished Epoch 12 || Run Time:    8.7 | Load Time:   27.4 || F1:  98.85 | Prec:  98.94 | Rec:  98.75 || Ex/s: 476.75

===>  EVAL Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 12 || Run Time:    2.0 | Load Time:    8.8 || F1:  66.90 | Prec:  70.38 | Rec:  63.74 || Ex/s: 529.57

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:36


Finished Epoch 13 || Run Time:    8.8 | Load Time:   27.5 || F1:  98.91 | Prec:  99.06 | Rec:  98.75 || Ex/s: 473.96

===>  EVAL Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 13 || Run Time:    1.9 | Load Time:    8.5 || F1:  66.80 | Prec:  69.72 | Rec:  64.11 || Ex/s: 554.83

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:35


Finished Epoch 14 || Run Time:    8.7 | Load Time:   27.4 || F1:  98.97 | Prec:  99.12 | Rec:  98.82 || Ex/s: 476.28

===>  EVAL Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 14 || Run Time:    2.0 | Load Time:    8.9 || F1:  66.73 | Prec:  69.57 | Rec:  64.11 || Ex/s: 528.64

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:36


Finished Epoch 15 || Run Time:    8.8 | Load Time:   27.5 || F1:  99.03 | Prec:  99.22 | Rec:  98.85 || Ex/s: 473.53

===>  EVAL Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 15 || Run Time:    1.9 | Load Time:    8.5 || F1:  66.63 | Prec:  69.47 | Rec:  64.02 || Ex/s: 553.17

---------------------

Loading best model...
Training done.


tensor(67.3738, device='cuda:0')

In [36]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 5 || Run Time:    1.9 | Load Time:    8.6 || F1:  65.92 | Prec:  61.42 | Rec:  71.12 || Ex/s: 549.03



tensor(65.9160, device='cuda:0')
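
In the run above, validation F1 stops improving after epoch 5 while training F1 keeps climbing toward 99, so the remaining epochs add training time without producing a better checkpoint. The sketch below shows, in plain Python, where a simple patience-based early-stopping rule would have halted. This is an illustration only: the function `early_stopping_epoch` and the `patience=3` setting are assumptions, not something the `run_train` call in this notebook does.

```python
def early_stopping_epoch(val_f1, patience):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops once `patience` consecutive epochs fail to beat the best
    validation F1 seen so far. If that never happens, stop_epoch is the
    last epoch in the list.
    """
    best_f1 = float("-inf")
    best_epoch = 0
    for epoch, f1 in enumerate(val_f1, start=1):
        if f1 > best_f1:
            best_f1, best_epoch = f1, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_f1), best_epoch

# Validation F1 per epoch, copied from the Dirty DBLP-GoogleScholar log above.
val_f1 = [60.63, 64.91, 66.19, 67.11, 67.37, 67.37, 66.40, 65.99,
          66.50, 66.33, 66.73, 66.90, 66.80, 66.73, 66.63]

stop_epoch, best_epoch = early_stopping_epoch(val_f1, patience=3)
print(stop_epoch, best_epoch)  # prints 8 5
```

Under these assumptions training would end after epoch 8 with the epoch 5 checkpoint as the best model, which is the same checkpoint the full 15-epoch run loads for the test-set evaluation.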

### ATTENTION

#### STRUCTURED

##### Amazon-Google

In [37]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/Amazon-Google/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [38]:
model = dm.MatchingModel(attr_summarizer='attention')

In [39]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='attention_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 3429602
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 1 || Run Time:   11.3 | Load Time:    4.6 || F1:  24.19 | Prec:  34.11 | Rec:  18.74 || Ex/s: 433.07

===>  EVAL Epoch 1


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 1 || Run Time:    2.1 | Load Time:    1.4 || F1:  37.22 | Prec:  28.81 | Rec:  52.56 || Ex/s: 648.21

* Best F1: tensor(37.2163, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 2 || Run Time:   10.9 | Load Time:    4.5 || F1:  48.17 | Prec:  44.39 | Rec:  52.65 || Ex/s: 446.30

===>  EVAL Epoch 2


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 2 || Run Time:    2.1 | Load Time:    1.4 || F1:  40.60 | Prec:  31.32 | Rec:  57.69 || Ex/s: 656.30

* Best F1: tensor(40.6015, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 3 || Run Time:   10.9 | Load Time:    4.5 || F1:  56.73 | Prec:  50.45 | Rec:  64.81 || Ex/s: 448.98

===>  EVAL Epoch 3


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 3 || Run Time:    2.1 | Load Time:    1.4 || F1:  43.82 | Prec:  33.64 | Rec:  62.82 || Ex/s: 661.69

* Best F1: tensor(43.8152, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 4 || Run Time:   10.8 | Load Time:    4.5 || F1:  62.37 | Prec:  54.62 | Rec:  72.68 || Ex/s: 449.87

===>  EVAL Epoch 4


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 4 || Run Time:    2.1 | Load Time:    1.4 || F1:  42.92 | Prec:  43.10 | Rec:  42.74 || Ex/s: 659.94

---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 5 || Run Time:   10.8 | Load Time:    4.5 || F1:  69.34 | Prec:  60.95 | Rec:  80.40 || Ex/s: 449.92

===>  EVAL Epoch 5


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 5 || Run Time:    2.1 | Load Time:    1.4 || F1:  44.53 | Prec:  41.64 | Rec:  47.86 || Ex/s: 661.71

* Best F1: tensor(44.5328, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 6 || Run Time:   11.3 | Load Time:    4.6 || F1:  74.91 | Prec:  66.97 | Rec:  84.98 || Ex/s: 431.45

===>  EVAL Epoch 6


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 6 || Run Time:    2.0 | Load Time:    1.4 || F1:  45.68 | Prec:  39.44 | Rec:  54.27 || Ex/s: 665.20

* Best F1: tensor(45.6835, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 7 || Run Time:   10.9 | Load Time:    4.4 || F1:  79.15 | Prec:  71.74 | Rec:  88.27 || Ex/s: 448.36

===>  EVAL Epoch 7


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 7 || Run Time:    2.1 | Load Time:    1.4 || F1:  44.71 | Prec:  37.22 | Rec:  55.98 || Ex/s: 662.48

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 8 || Run Time:   10.8 | Load Time:    4.5 || F1:  81.13 | Prec:  73.77 | Rec:  90.13 || Ex/s: 450.43

===>  EVAL Epoch 8


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 8 || Run Time:    2.1 | Load Time:    1.4 || F1:  45.63 | Prec:  45.53 | Rec:  45.73 || Ex/s: 657.78

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 9 || Run Time:   10.8 | Load Time:    4.5 || F1:  84.61 | Prec:  78.02 | Rec:  92.42 || Ex/s: 451.17

===>  EVAL Epoch 9


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 9 || Run Time:    2.1 | Load Time:    1.4 || F1:  41.78 | Prec:  46.35 | Rec:  38.03 || Ex/s: 659.05

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 10 || Run Time:   10.9 | Load Time:    4.5 || F1:  87.56 | Prec:  82.61 | Rec:  93.13 || Ex/s: 447.12

===>  EVAL Epoch 10


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 10 || Run Time:    2.3 | Load Time:    1.6 || F1:  45.47 | Prec:  47.03 | Rec:  44.02 || Ex/s: 581.47

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 11 || Run Time:   10.9 | Load Time:    4.5 || F1:  88.89 | Prec:  84.43 | Rec:  93.85 || Ex/s: 448.21

===>  EVAL Epoch 11


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 11 || Run Time:    2.0 | Load Time:    1.4 || F1:  43.02 | Prec:  46.31 | Rec:  40.17 || Ex/s: 666.78

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 12 || Run Time:   10.9 | Load Time:    4.5 || F1:  90.20 | Prec:  86.58 | Rec:  94.13 || Ex/s: 448.42

===>  EVAL Epoch 12


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 12 || Run Time:    2.1 | Load Time:    1.4 || F1:  43.82 | Prec:  48.21 | Rec:  40.17 || Ex/s: 659.06

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 13 || Run Time:   10.9 | Load Time:    4.5 || F1:  90.86 | Prec:  87.43 | Rec:  94.56 || Ex/s: 448.81

===>  EVAL Epoch 13


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 13 || Run Time:    2.1 | Load Time:    1.4 || F1:  43.03 | Prec:  48.15 | Rec:  38.89 || Ex/s: 661.29

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 14 || Run Time:   10.9 | Load Time:    4.5 || F1:  91.83 | Prec:  88.99 | Rec:  94.85 || Ex/s: 448.14

===>  EVAL Epoch 14


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 14 || Run Time:    2.1 | Load Time:    1.4 || F1:  42.69 | Prec:  46.70 | Rec:  39.32 || Ex/s: 654.27

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 15 || Run Time:   11.2 | Load Time:    4.6 || F1:  92.54 | Prec:  90.22 | Rec:  94.99 || Ex/s: 434.26

===>  EVAL Epoch 15


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 15 || Run Time:    2.1 | Load Time:    1.4 || F1:  42.49 | Prec:  46.23 | Rec:  39.32 || Ex/s: 658.16

---------------------

Loading best model...
Training done.


tensor(45.6835, device='cuda:0')

In [40]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 6


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:03


Finished Epoch 6 || Run Time:    2.1 | Load Time:    1.4 || F1:  46.44 | Prec:  40.58 | Rec:  54.27 || Ex/s: 651.33



tensor(46.4351, device='cuda:0')

##### Beer

In [41]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/Beer/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [42]:
model = dm.MatchingModel(attr_summarizer='attention')

In [43]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='attention_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 4332002
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 1 || Run Time:    0.6 | Load Time:    0.3 || F1:   8.33 | Prec:   9.38 | Rec:   7.50 || Ex/s: 295.04

===>  EVAL Epoch 1
Finished Epoch 1 || Run Time:    0.1 | Load Time:    0.1 || F1:  27.03 | Prec:  21.74 | Rec:  35.71 || Ex/s: 418.30

* Best F1: tensor(27.0270, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 2 || Run Time:    0.6 | Load Time:    0.3 || F1:  52.27 | Prec:  47.92 | Rec:  57.50 || Ex/s: 301.04

===>  EVAL Epoch 2
Finished Epoch 2 || Run Time:    0.1 | Load Time:    0.1 || F1:   0.00 | Prec:   0.00 | Rec:   0.00 || Ex/s: 434.82

---------------------

===>  TRAIN Epoch 3


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 3 || Run Time:    0.6 | Load Time:    0.3 || F1:  64.71 | Prec:  78.57 | Rec:  55.00 || Ex/s: 303.57

===>  EVAL Epoch 3
Finished Epoch 3 || Run Time:    0.4 | Load Time:    0.1 || F1:  41.18 | Prec:  35.00 | Rec:  50.00 || Ex/s: 180.02

* Best F1: tensor(41.1765, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 4 || Run Time:    0.6 | Load Time:    0.3 || F1:  66.67 | Prec:  60.00 | Rec:  75.00 || Ex/s: 297.08

===>  EVAL Epoch 4
Finished Epoch 4 || Run Time:    0.1 | Load Time:    0.1 || F1:  33.33 | Prec:  31.25 | Rec:  35.71 || Ex/s: 439.64

---------------------

===>  TRAIN Epoch 5


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 5 || Run Time:    0.6 | Load Time:    0.3 || F1:  72.09 | Prec:  67.39 | Rec:  77.50 || Ex/s: 308.37

===>  EVAL Epoch 5
Finished Epoch 5 || Run Time:    0.1 | Load Time:    0.1 || F1:  47.62 | Prec:  35.71 | Rec:  71.43 || Ex/s: 432.53

* Best F1: tensor(47.6190, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 6


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 6 || Run Time:    0.6 | Load Time:    0.3 || F1:  76.09 | Prec:  67.31 | Rec:  87.50 || Ex/s: 302.56

===>  EVAL Epoch 6
Finished Epoch 6 || Run Time:    0.1 | Load Time:    0.1 || F1:  52.63 | Prec:  41.67 | Rec:  71.43 || Ex/s: 439.72

* Best F1: tensor(52.6316, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 7


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 7 || Run Time:    0.6 | Load Time:    0.3 || F1:  77.89 | Prec:  67.27 | Rec:  92.50 || Ex/s: 305.44

===>  EVAL Epoch 7
Finished Epoch 7 || Run Time:    0.1 | Load Time:    0.1 || F1:  52.38 | Prec:  39.29 | Rec:  78.57 || Ex/s: 449.79

---------------------

===>  TRAIN Epoch 8


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 8 || Run Time:    0.6 | Load Time:    0.3 || F1:  81.32 | Prec:  72.55 | Rec:  92.50 || Ex/s: 303.66

===>  EVAL Epoch 8
Finished Epoch 8 || Run Time:    0.1 | Load Time:    0.1 || F1:  60.00 | Prec:  46.15 | Rec:  85.71 || Ex/s: 445.59

* Best F1: tensor(60., device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 9


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 9 || Run Time:    0.6 | Load Time:    0.3 || F1:  82.61 | Prec:  73.08 | Rec:  95.00 || Ex/s: 309.66

===>  EVAL Epoch 9
Finished Epoch 9 || Run Time:    0.1 | Load Time:    0.1 || F1:  60.00 | Prec:  46.15 | Rec:  85.71 || Ex/s: 440.12

---------------------

===>  TRAIN Epoch 10


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 10 || Run Time:    0.6 | Load Time:    0.3 || F1:  84.78 | Prec:  75.00 | Rec:  97.50 || Ex/s: 311.88

===>  EVAL Epoch 10
Finished Epoch 10 || Run Time:    0.1 | Load Time:    0.1 || F1:  61.54 | Prec:  48.00 | Rec:  85.71 || Ex/s: 411.52

* Best F1: tensor(61.5385, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 11


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 11 || Run Time:    0.6 | Load Time:    0.3 || F1:  84.78 | Prec:  75.00 | Rec:  97.50 || Ex/s: 309.53

===>  EVAL Epoch 11
Finished Epoch 11 || Run Time:    0.1 | Load Time:    0.1 || F1:  63.16 | Prec:  50.00 | Rec:  85.71 || Ex/s: 417.20

* Best F1: tensor(63.1579, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 12


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 12 || Run Time:    0.6 | Load Time:    0.3 || F1:  84.78 | Prec:  75.00 | Rec:  97.50 || Ex/s: 304.53

===>  EVAL Epoch 12
Finished Epoch 12 || Run Time:    0.1 | Load Time:    0.1 || F1:  63.16 | Prec:  50.00 | Rec:  85.71 || Ex/s: 429.67

---------------------

===>  TRAIN Epoch 13


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 13 || Run Time:    0.6 | Load Time:    0.3 || F1:  86.67 | Prec:  78.00 | Rec:  97.50 || Ex/s: 305.65

===>  EVAL Epoch 13
Finished Epoch 13 || Run Time:    0.1 | Load Time:    0.1 || F1:  63.16 | Prec:  50.00 | Rec:  85.71 || Ex/s: 452.92

---------------------

===>  TRAIN Epoch 14


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 14 || Run Time:    0.6 | Load Time:    0.3 || F1:  86.67 | Prec:  78.00 | Rec:  97.50 || Ex/s: 306.26

===>  EVAL Epoch 14
Finished Epoch 14 || Run Time:    0.1 | Load Time:    0.1 || F1:  63.16 | Prec:  50.00 | Rec:  85.71 || Ex/s: 428.85

---------------------

===>  TRAIN Epoch 15


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 15 || Run Time:    0.6 | Load Time:    0.3 || F1:  87.91 | Prec:  78.43 | Rec: 100.00 || Ex/s: 304.39

===>  EVAL Epoch 15
Finished Epoch 15 || Run Time:    0.1 | Load Time:    0.1 || F1:  63.16 | Prec:  50.00 | Rec:  85.71 || Ex/s: 444.35

---------------------

Loading best model...
Training done.


tensor(63.1579, device='cuda:0')

In [44]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 11




Finished Epoch 11 || Run Time:    0.1 | Load Time:    0.1 || F1:  54.05 | Prec:  43.48 | Rec:  71.43 || Ex/s: 421.74



tensor(54.0541, device='cuda:0')
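
The F1 value in each log line is the harmonic mean of the precision and recall printed next to it. As a quick sanity check, the sketch below recomputes it from the rounded precision and recall reported for the Beer test set. `f1_score` is a hypothetical helper, not part of `deepmatcher`, and because the inputs are already rounded to two decimals the result can differ from the logged value in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale, e.g. percent)."""
    if precision + recall == 0:
        # Matches log lines such as "F1: 0.00 | Prec: 0.00 | Rec: 0.00"
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall copied from the Beer test-set log line above
beer_test_f1 = f1_score(43.48, 71.43)
print(f"Recomputed F1: {beer_test_f1:.2f}")
```

The same check works for any other "Finished Epoch" line in this notebook.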

##### DBLP-ACM

In [45]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/DBLP-ACM/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [46]:
model = dm.MatchingModel(attr_summarizer='attention')

In [47]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='attention_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 4332002
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:27


Finished Epoch 1 || Run Time:   16.1 | Load Time:   11.3 || F1:  82.08 | Prec:  74.21 | Rec:  91.82 || Ex/s: 270.40

===>  EVAL Epoch 1


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 1 || Run Time:    3.1 | Load Time:    3.6 || F1:  91.36 | Prec:  84.09 | Rec: 100.00 || Ex/s: 370.37

* Best F1: tensor(91.3580, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:27


Finished Epoch 2 || Run Time:   16.5 | Load Time:   11.5 || F1:  95.92 | Prec:  93.08 | Rec:  98.95 || Ex/s: 264.51

===>  EVAL Epoch 2


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 2 || Run Time:    3.2 | Load Time:    3.5 || F1:  94.77 | Prec:  90.06 | Rec: 100.00 || Ex/s: 369.70

* Best F1: tensor(94.7705, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:26


Finished Epoch 3 || Run Time:   15.9 | Load Time:   11.2 || F1:  96.52 | Prec:  94.15 | Rec:  99.02 || Ex/s: 273.32

===>  EVAL Epoch 3


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 3 || Run Time:    3.1 | Load Time:    3.6 || F1:  96.21 | Prec:  92.69 | Rec: 100.00 || Ex/s: 368.07

* Best F1: tensor(96.2080, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:27


Finished Epoch 4 || Run Time:   16.3 | Load Time:   11.5 || F1:  97.67 | Prec:  96.08 | Rec:  99.32 || Ex/s: 267.28

===>  EVAL Epoch 4


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 4 || Run Time:    3.1 | Load Time:    3.6 || F1:  97.80 | Prec:  95.69 | Rec: 100.00 || Ex/s: 372.26

* Best F1: tensor(97.7973, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:26


Finished Epoch 5 || Run Time:   16.0 | Load Time:   11.2 || F1:  98.37 | Prec:  97.14 | Rec:  99.62 || Ex/s: 272.76

===>  EVAL Epoch 5


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 5 || Run Time:    3.1 | Load Time:    3.6 || F1:  98.11 | Prec:  96.72 | Rec:  99.55 || Ex/s: 366.96

* Best F1: tensor(98.1132, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:27


Finished Epoch 6 || Run Time:   16.0 | Load Time:   11.3 || F1:  98.66 | Prec:  97.72 | Rec:  99.62 || Ex/s: 271.76

===>  EVAL Epoch 6


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 6 || Run Time:    3.1 | Load Time:    3.5 || F1:  97.87 | Prec:  97.54 | Rec:  98.20 || Ex/s: 370.72

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:27


Finished Epoch 7 || Run Time:   16.3 | Load Time:   11.4 || F1:  99.22 | Prec:  98.74 | Rec:  99.70 || Ex/s: 267.56

===>  EVAL Epoch 7


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 7 || Run Time:    3.1 | Load Time:    3.6 || F1:  97.14 | Prec:  98.61 | Rec:  95.72 || Ex/s: 370.75

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:26


Finished Epoch 8 || Run Time:   16.0 | Load Time:   11.3 || F1:  99.29 | Prec:  98.81 | Rec:  99.77 || Ex/s: 272.45

===>  EVAL Epoch 8


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 8 || Run Time:    3.1 | Load Time:    3.6 || F1:  97.03 | Prec:  98.60 | Rec:  95.50 || Ex/s: 366.94

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:27


Finished Epoch 9 || Run Time:   16.2 | Load Time:   11.4 || F1:  99.03 | Prec:  98.30 | Rec:  99.77 || Ex/s: 267.86

===>  EVAL Epoch 9


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 9 || Run Time:    3.1 | Load Time:    3.6 || F1:  97.40 | Prec:  97.95 | Rec:  96.85 || Ex/s: 370.67

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:26


Finished Epoch 10 || Run Time:   15.9 | Load Time:   11.2 || F1:  99.59 | Prec:  99.33 | Rec:  99.85 || Ex/s: 273.17

===>  EVAL Epoch 10


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 10 || Run Time:    3.1 | Load Time:    3.6 || F1:  98.10 | Prec:  97.55 | Rec:  98.65 || Ex/s: 366.66

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:26


Finished Epoch 11 || Run Time:   16.0 | Load Time:   11.3 || F1:  99.59 | Prec:  99.33 | Rec:  99.85 || Ex/s: 272.23

===>  EVAL Epoch 11


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 11 || Run Time:    3.1 | Load Time:    3.6 || F1:  98.10 | Prec:  97.34 | Rec:  98.87 || Ex/s: 370.26

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:27


Finished Epoch 12 || Run Time:   16.3 | Load Time:   11.4 || F1:  99.59 | Prec:  99.33 | Rec:  99.85 || Ex/s: 267.45

===>  EVAL Epoch 12


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 12 || Run Time:    3.1 | Load Time:    3.6 || F1:  97.78 | Prec:  96.49 | Rec:  99.10 || Ex/s: 371.26

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:27


Finished Epoch 13 || Run Time:   16.0 | Load Time:   11.3 || F1:  99.63 | Prec:  99.40 | Rec:  99.85 || Ex/s: 271.82

===>  EVAL Epoch 13


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 13 || Run Time:    3.1 | Load Time:    3.6 || F1:  97.78 | Prec:  96.49 | Rec:  99.10 || Ex/s: 370.77

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:26


Finished Epoch 14 || Run Time:   16.0 | Load Time:   11.2 || F1:  99.66 | Prec:  99.48 | Rec:  99.85 || Ex/s: 272.39

===>  EVAL Epoch 14


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 14 || Run Time:    3.4 | Load Time:    3.8 || F1:  97.66 | Prec:  96.48 | Rec:  98.87 || Ex/s: 344.56

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:26


Finished Epoch 15 || Run Time:   15.9 | Load Time:   11.2 || F1:  99.66 | Prec:  99.48 | Rec:  99.85 || Ex/s: 272.99

===>  EVAL Epoch 15


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 15 || Run Time:    3.1 | Load Time:    3.6 || F1:  97.55 | Prec:  96.48 | Rec:  98.65 || Ex/s: 369.93

---------------------

Loading best model...
Training done.


tensor(98.1132, device='cuda:0')

In [48]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 5


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 5 || Run Time:    3.1 | Load Time:    3.5 || F1:  97.23 | Prec:  95.84 | Rec:  98.65 || Ex/s: 370.58



tensor(97.2253, device='cuda:0')
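
When comparing many runs, it can be handy to pull the metrics out of the printed log lines rather than reading them by eye. The following is a minimal sketch that assumes the "Finished Epoch" line format shown in the output above. `parse_epoch_line` and its regular expression are illustrative helpers, not part of `deepmatcher`.

```python
import re

# Pattern for lines such as:
# "Finished Epoch 5 || Run Time:    3.1 | Load Time:    3.5 || F1:  97.23 | Prec:  95.84 | Rec:  98.65 || Ex/s: 370.58"
EPOCH_LINE = re.compile(
    r"Finished Epoch\s+(?P<epoch>\d+)\s*\|\|\s*"
    r"Run Time:\s*(?P<run_time>[\d.]+)\s*\|\s*"
    r"Load Time:\s*(?P<load_time>[\d.]+)\s*\|\|\s*"
    r"F1:\s*(?P<f1>[\d.]+)\s*\|\s*"
    r"Prec:\s*(?P<prec>[\d.]+)\s*\|\s*"
    r"Rec:\s*(?P<rec>[\d.]+)\s*\|\|\s*"
    r"Ex/s:\s*(?P<ex_per_s>[\d.]+)"
)


def parse_epoch_line(line: str):
    """Return the metrics of one 'Finished Epoch' log line as a dict, or None if it does not match."""
    match = EPOCH_LINE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    result = {"epoch": int(fields.pop("epoch"))}
    # All remaining fields are decimal numbers
    result.update({name: float(value) for name, value in fields.items()})
    return result


# Log line copied from the DBLP-ACM test evaluation above
sample = ("Finished Epoch 5 || Run Time:    3.1 | Load Time:    3.5 || "
          "F1:  97.23 | Prec:  95.84 | Rec:  98.65 || Ex/s: 370.58")
print(parse_epoch_line(sample))
```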

##### DBLP-GoogleScholar

In [49]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/DBLP-GoogleScholar/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [50]:
model = dm.MatchingModel(attr_summarizer='attention')

In [51]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='attention_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 4332002
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:59


Finished Epoch 1 || Run Time:   36.2 | Load Time:   23.4 || F1:  76.96 | Prec:  69.18 | Rec:  86.72 || Ex/s: 288.81

===>  EVAL Epoch 1


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 1 || Run Time:    7.0 | Load Time:    7.2 || F1:  86.71 | Prec:  82.70 | Rec:  91.12 || Ex/s: 404.85

* Best F1: tensor(86.7052, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:58


Finished Epoch 2 || Run Time:   36.0 | Load Time:   23.4 || F1:  87.72 | Prec:  81.49 | Rec:  94.98 || Ex/s: 289.97

===>  EVAL Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 2 || Run Time:    7.0 | Load Time:    7.2 || F1:  88.49 | Prec:  84.95 | Rec:  92.34 || Ex/s: 404.61

* Best F1: tensor(88.4908, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:58


Finished Epoch 3 || Run Time:   36.1 | Load Time:   23.4 || F1:  91.39 | Prec:  86.58 | Rec:  96.76 || Ex/s: 289.63

===>  EVAL Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 3 || Run Time:    7.0 | Load Time:    7.2 || F1:  88.19 | Prec:  87.07 | Rec:  89.35 || Ex/s: 404.09

---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:58


Finished Epoch 4 || Run Time:   35.8 | Load Time:   23.2 || F1:  93.46 | Prec:  89.70 | Rec:  97.54 || Ex/s: 292.28

===>  EVAL Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 4 || Run Time:    7.2 | Load Time:    7.4 || F1:  89.21 | Prec:  88.43 | Rec:  90.00 || Ex/s: 393.36

* Best F1: tensor(89.2080, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:58


Finished Epoch 5 || Run Time:   35.9 | Load Time:   23.2 || F1:  95.16 | Prec:  92.32 | Rec:  98.19 || Ex/s: 291.45

===>  EVAL Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 5 || Run Time:    7.0 | Load Time:    7.2 || F1:  88.72 | Prec:  88.85 | Rec:  88.60 || Ex/s: 404.29

---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:59


Finished Epoch 6 || Run Time:   36.2 | Load Time:   23.4 || F1:  96.37 | Prec:  94.27 | Rec:  98.57 || Ex/s: 288.92

===>  EVAL Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 6 || Run Time:    7.0 | Load Time:    7.2 || F1:  89.26 | Prec:  87.31 | Rec:  91.31 || Ex/s: 404.11

* Best F1: tensor(89.2645, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:58


Finished Epoch 7 || Run Time:   36.1 | Load Time:   23.4 || F1:  97.15 | Prec:  95.57 | Rec:  98.78 || Ex/s: 289.53

===>  EVAL Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 7 || Run Time:    7.0 | Load Time:    7.2 || F1:  89.26 | Prec:  89.18 | Rec:  89.35 || Ex/s: 403.24

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:58


Finished Epoch 8 || Run Time:   36.1 | Load Time:   23.3 || F1:  97.86 | Prec:  96.77 | Rec:  98.97 || Ex/s: 289.90

===>  EVAL Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 8 || Run Time:    7.1 | Load Time:    7.2 || F1:  88.82 | Prec:  88.57 | Rec:  89.07 || Ex/s: 402.55

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:59


Finished Epoch 9 || Run Time:   36.2 | Load Time:   23.4 || F1:  98.54 | Prec:  97.88 | Rec:  99.22 || Ex/s: 288.76

===>  EVAL Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 9 || Run Time:    7.0 | Load Time:    7.2 || F1:  87.94 | Prec:  89.78 | Rec:  86.17 || Ex/s: 402.09

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:59


Finished Epoch 10 || Run Time:   36.5 | Load Time:   23.5 || F1:  98.90 | Prec:  98.42 | Rec:  99.38 || Ex/s: 287.08

===>  EVAL Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 10 || Run Time:    7.0 | Load Time:    7.2 || F1:  87.76 | Prec:  90.56 | Rec:  85.14 || Ex/s: 402.64

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:58


Finished Epoch 11 || Run Time:   35.9 | Load Time:   23.2 || F1:  99.15 | Prec:  98.79 | Rec:  99.50 || Ex/s: 291.43

===>  EVAL Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 11 || Run Time:    7.3 | Load Time:    7.4 || F1:  88.10 | Prec:  90.53 | Rec:  85.79 || Ex/s: 391.33

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:58


Finished Epoch 12 || Run Time:   35.8 | Load Time:   23.2 || F1:  99.24 | Prec:  98.98 | Rec:  99.50 || Ex/s: 292.13

===>  EVAL Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 12 || Run Time:    7.0 | Load Time:    7.2 || F1:  88.01 | Prec:  90.03 | Rec:  86.07 || Ex/s: 405.68

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:59


Finished Epoch 13 || Run Time:   36.2 | Load Time:   23.4 || F1:  99.36 | Prec:  99.22 | Rec:  99.50 || Ex/s: 288.79

===>  EVAL Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 13 || Run Time:    7.0 | Load Time:    7.2 || F1:  88.07 | Prec:  90.36 | Rec:  85.89 || Ex/s: 405.39

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:59


Finished Epoch 14 || Run Time:   36.1 | Load Time:   23.4 || F1:  99.45 | Prec:  99.41 | Rec:  99.50 || Ex/s: 289.29

===>  EVAL Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 14 || Run Time:    7.0 | Load Time:    7.2 || F1:  88.29 | Prec:  90.73 | Rec:  85.98 || Ex/s: 405.73

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:59


Finished Epoch 15 || Run Time:   36.2 | Load Time:   23.4 || F1:  99.45 | Prec:  99.41 | Rec:  99.50 || Ex/s: 289.08

===>  EVAL Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 15 || Run Time:    7.0 | Load Time:    7.2 || F1:  88.37 | Prec:  90.18 | Rec:  86.64 || Ex/s: 404.41

---------------------

Loading best model...
Training done.


tensor(89.2645, device='cuda:0')

In [52]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:14


Finished Epoch 6 || Run Time:    7.1 | Load Time:    7.2 || F1:  89.15 | Prec:  87.01 | Rec:  91.40 || Ex/s: 399.95



tensor(89.1522, device='cuda:0')

##### Walmart-Amazon

In [53]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/Walmart-Amazon/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [54]:
model = dm.MatchingModel(attr_summarizer='attention')

In [55]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='attention_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 5234402
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 1 || Run Time:   15.6 | Load Time:    6.5 || F1:   8.45 | Prec:  22.39 | Rec:   5.21 || Ex/s: 278.68

===>  EVAL Epoch 1


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 1 || Run Time:    3.0 | Load Time:    2.0 || F1:  23.48 | Prec:  72.97 | Rec:  13.99 || Ex/s: 405.68

* Best F1: tensor(23.4783, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 2 || Run Time:   15.6 | Load Time:    6.4 || F1:  50.47 | Prec:  49.26 | Rec:  51.74 || Ex/s: 278.75

===>  EVAL Epoch 2


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 2 || Run Time:    3.0 | Load Time:    2.0 || F1:  53.52 | Prec:  48.93 | Rec:  59.07 || Ex/s: 405.89

* Best F1: tensor(53.5211, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 3 || Run Time:   15.4 | Load Time:    6.4 || F1:  62.32 | Prec:  55.85 | Rec:  70.49 || Ex/s: 280.84

===>  EVAL Epoch 3


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 3 || Run Time:    3.3 | Load Time:    2.2 || F1:  55.10 | Prec:  54.27 | Rec:  55.96 || Ex/s: 372.78

* Best F1: tensor(55.1020, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 4 || Run Time:   15.6 | Load Time:    6.4 || F1:  68.25 | Prec:  61.13 | Rec:  77.26 || Ex/s: 278.72

===>  EVAL Epoch 4


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 4 || Run Time:    3.0 | Load Time:    2.0 || F1:  52.68 | Prec:  49.77 | Rec:  55.96 || Ex/s: 407.04

---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 5 || Run Time:   15.6 | Load Time:    6.4 || F1:  74.77 | Prec:  67.80 | Rec:  83.33 || Ex/s: 279.17

===>  EVAL Epoch 5


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 5 || Run Time:    3.0 | Load Time:    2.0 || F1:  49.01 | Prec:  42.69 | Rec:  57.51 || Ex/s: 407.00

---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 6 || Run Time:   15.6 | Load Time:    6.4 || F1:  79.61 | Prec:  74.03 | Rec:  86.11 || Ex/s: 279.48

===>  EVAL Epoch 6


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 6 || Run Time:    3.0 | Load Time:    2.0 || F1:  48.13 | Prec:  43.83 | Rec:  53.37 || Ex/s: 408.78

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:22


Finished Epoch 7 || Run Time:   15.9 | Load Time:    6.6 || F1:  85.88 | Prec:  81.05 | Rec:  91.32 || Ex/s: 274.18

===>  EVAL Epoch 7


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 7 || Run Time:    3.0 | Load Time:    2.0 || F1:  48.21 | Prec:  47.72 | Rec:  48.70 || Ex/s: 404.58

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 8 || Run Time:   15.7 | Load Time:    6.5 || F1:  89.22 | Prec:  85.40 | Rec:  93.40 || Ex/s: 277.32

===>  EVAL Epoch 8


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 8 || Run Time:    3.0 | Load Time:    2.0 || F1:  48.26 | Prec:  50.00 | Rec:  46.63 || Ex/s: 404.64

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 9 || Run Time:   15.5 | Load Time:    6.4 || F1:  92.01 | Prec:  89.23 | Rec:  94.97 || Ex/s: 279.86

===>  EVAL Epoch 9


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 9 || Run Time:    3.0 | Load Time:    2.0 || F1:  46.97 | Prec:  45.81 | Rec:  48.19 || Ex/s: 406.10

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:22


Finished Epoch 10 || Run Time:   15.9 | Load Time:    6.5 || F1:  92.93 | Prec:  90.20 | Rec:  95.83 || Ex/s: 273.80

===>  EVAL Epoch 10


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 10 || Run Time:    3.0 | Load Time:    2.0 || F1:  47.85 | Prec:  44.44 | Rec:  51.81 || Ex/s: 408.31

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 11 || Run Time:   15.6 | Load Time:    6.4 || F1:  94.08 | Prec:  91.75 | Rec:  96.53 || Ex/s: 279.39

===>  EVAL Epoch 11


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 11 || Run Time:    3.0 | Load Time:    2.0 || F1:  48.73 | Prec:  47.76 | Rec:  49.74 || Ex/s: 406.39

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 12 || Run Time:   15.5 | Load Time:    6.4 || F1:  94.73 | Prec:  92.83 | Rec:  96.70 || Ex/s: 279.89

===>  EVAL Epoch 12


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 12 || Run Time:    3.0 | Load Time:    2.0 || F1:  48.83 | Prec:  48.96 | Rec:  48.70 || Ex/s: 404.19

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 13 || Run Time:   15.6 | Load Time:    6.4 || F1:  95.38 | Prec:  93.94 | Rec:  96.88 || Ex/s: 278.89

===>  EVAL Epoch 13


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 13 || Run Time:    3.3 | Load Time:    2.2 || F1:  49.09 | Prec:  49.47 | Rec:  48.70 || Ex/s: 377.17

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 14 || Run Time:   15.6 | Load Time:    6.4 || F1:  96.05 | Prec:  95.07 | Rec:  97.05 || Ex/s: 278.78

===>  EVAL Epoch 14


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 14 || Run Time:    3.1 | Load Time:    2.0 || F1:  48.70 | Prec:  48.70 | Rec:  48.70 || Ex/s: 403.98

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 15 || Run Time:   15.5 | Load Time:    6.4 || F1:  96.46 | Prec:  95.88 | Rec:  97.05 || Ex/s: 280.94

===>  EVAL Epoch 15


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 15 || Run Time:    3.0 | Load Time:    2.0 || F1:  48.83 | Prec:  48.96 | Rec:  48.70 || Ex/s: 405.35

---------------------

Loading best model...
Training done.


tensor(55.1020, device='cuda:0')

In [56]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 3


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 3 || Run Time:    3.1 | Load Time:    2.0 || F1:  54.70 | Prec:  58.58 | Rec:  51.30 || Ex/s: 400.13



tensor(54.6961, device='cuda:0')
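
The logs above suggest that a checkpoint is saved only when the validation F1 strictly improves, and that the final test evaluation uses that checkpoint (here, epoch 3). The sketch below reproduces that selection rule on the Walmart-Amazon validation F1 values copied from the log. `best_epoch` is a hypothetical helper, and this is not claimed to be how `deepmatcher` implements checkpointing internally.

```python
def best_epoch(val_f1_by_epoch):
    """Return (epoch, f1) of the first epoch that reaches the highest validation F1.

    Epochs are numbered from 1. A later epoch replaces the current best
    only if its F1 is strictly higher, so ties keep the earlier epoch.
    """
    best_ep, best_f1 = None, float("-inf")
    for epoch, f1 in enumerate(val_f1_by_epoch, start=1):
        if f1 > best_f1:
            best_ep, best_f1 = epoch, f1
    return best_ep, best_f1


# Validation F1 per epoch, copied from the Walmart-Amazon training log above
walmart_amazon_val_f1 = [
    23.48, 53.52, 55.10, 52.68, 49.01,
    48.13, 48.21, 48.26, 46.97, 47.85,
    48.73, 48.83, 49.09, 48.70, 48.83,
]
print(best_epoch(walmart_amazon_val_f1))  # (3, 55.1)
```

The gap between the final training F1 (96.46) and the validation F1 of the later epochs (around 48 to 49) is why selecting the checkpoint by validation score, rather than taking the last epoch, matters for this dataset.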

#### TEXTUAL

##### Abt-Buy

In [57]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Textual/Abt-Buy/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [58]:
model = dm.MatchingModel(attr_summarizer='attention')

In [59]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='attention_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 3429602
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 1 || Run Time:   10.8 | Load Time:   10.1 || F1:  15.68 | Prec:  22.56 | Rec:  12.01 || Ex/s: 275.01

===>  EVAL Epoch 1


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 1 || Run Time:    2.0 | Load Time:    3.0 || F1:  30.39 | Prec:  28.51 | Rec:  32.52 || Ex/s: 384.59

* Best F1: tensor(30.3855, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 2 || Run Time:   10.9 | Load Time:   10.1 || F1:  29.19 | Prec:  29.01 | Rec:  29.38 || Ex/s: 273.48

===>  EVAL Epoch 2


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 2 || Run Time:    2.0 | Load Time:    3.0 || F1:  29.55 | Prec:  26.44 | Rec:  33.50 || Ex/s: 382.36

---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 3 || Run Time:   11.0 | Load Time:   10.3 || F1:  40.21 | Prec:  36.68 | Rec:  44.48 || Ex/s: 269.59

===>  EVAL Epoch 3


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 3 || Run Time:    2.0 | Load Time:    3.0 || F1:  31.49 | Prec:  28.03 | Rec:  35.92 || Ex/s: 382.08

* Best F1: tensor(31.4894, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 4 || Run Time:   10.8 | Load Time:   10.1 || F1:  46.32 | Prec:  41.38 | Rec:  52.60 || Ex/s: 275.02

===>  EVAL Epoch 4


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 4 || Run Time:    2.0 | Load Time:    3.0 || F1:  34.02 | Prec:  29.43 | Rec:  40.29 || Ex/s: 380.57

* Best F1: tensor(34.0164, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 5 || Run Time:   10.9 | Load Time:   10.1 || F1:  52.86 | Prec:  46.33 | Rec:  61.53 || Ex/s: 274.19

===>  EVAL Epoch 5


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 5 || Run Time:    2.0 | Load Time:    3.0 || F1:  35.06 | Prec:  28.27 | Rec:  46.12 || Ex/s: 385.65

* Best F1: tensor(35.0554, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 6 || Run Time:   10.7 | Load Time:   10.1 || F1:  59.62 | Prec:  51.16 | Rec:  71.43 || Ex/s: 276.55

===>  EVAL Epoch 6


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 6 || Run Time:    2.1 | Load Time:    3.2 || F1:  33.86 | Prec:  26.59 | Rec:  46.60 || Ex/s: 364.74

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 7 || Run Time:   10.8 | Load Time:   10.2 || F1:  63.05 | Prec:  54.64 | Rec:  74.51 || Ex/s: 273.95

===>  EVAL Epoch 7


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 7 || Run Time:    2.0 | Load Time:    3.0 || F1:  34.89 | Prec:  27.08 | Rec:  49.03 || Ex/s: 384.26

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 8 || Run Time:   10.8 | Load Time:   10.1 || F1:  66.44 | Prec:  58.08 | Rec:  77.60 || Ex/s: 274.07

===>  EVAL Epoch 8


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 8 || Run Time:    2.0 | Load Time:    3.0 || F1:  34.92 | Prec:  29.53 | Rec:  42.72 || Ex/s: 384.22

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 9 || Run Time:   10.7 | Load Time:   10.0 || F1:  69.09 | Prec:  61.66 | Rec:  78.57 || Ex/s: 276.94

===>  EVAL Epoch 9


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 9 || Run Time:    2.0 | Load Time:    3.0 || F1:  32.56 | Prec:  31.25 | Rec:  33.98 || Ex/s: 384.19

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 10 || Run Time:   11.0 | Load Time:   10.3 || F1:  71.16 | Prec:  63.75 | Rec:  80.52 || Ex/s: 269.25

===>  EVAL Epoch 10


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 10 || Run Time:    1.9 | Load Time:    3.0 || F1:  30.29 | Prec:  32.77 | Rec:  28.16 || Ex/s: 387.88

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 11 || Run Time:   10.8 | Load Time:   10.1 || F1:  72.70 | Prec:  65.62 | Rec:  81.49 || Ex/s: 275.10

===>  EVAL Epoch 11


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 11 || Run Time:    2.0 | Load Time:    3.0 || F1:  29.20 | Prec:  33.76 | Rec:  25.73 || Ex/s: 383.51

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 12 || Run Time:   10.8 | Load Time:   10.1 || F1:  74.87 | Prec:  68.22 | Rec:  82.95 || Ex/s: 275.07

===>  EVAL Epoch 12


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 12 || Run Time:    2.0 | Load Time:    3.0 || F1:  29.71 | Prec:  32.75 | Rec:  27.18 || Ex/s: 385.62

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 13 || Run Time:   10.7 | Load Time:   10.1 || F1:  76.45 | Prec:  69.75 | Rec:  84.58 || Ex/s: 276.66

===>  EVAL Epoch 13


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 13 || Run Time:    2.1 | Load Time:    3.3 || F1:  29.02 | Prec:  31.11 | Rec:  27.18 || Ex/s: 354.27

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 14 || Run Time:   10.8 | Load Time:   10.1 || F1:  78.42 | Prec:  72.61 | Rec:  85.23 || Ex/s: 274.34

===>  EVAL Epoch 14


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 14 || Run Time:    2.0 | Load Time:    3.0 || F1:  28.00 | Prec:  28.87 | Rec:  27.18 || Ex/s: 384.10

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 15 || Run Time:   10.7 | Load Time:   10.0 || F1:  79.31 | Prec:  73.91 | Rec:  85.55 || Ex/s: 277.12

===>  EVAL Epoch 15


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 15 || Run Time:    2.0 | Load Time:    3.0 || F1:  28.99 | Prec:  28.85 | Rec:  29.13 || Ex/s: 383.12

---------------------

Loading best model...
Training done.


tensor(35.0554, device='cuda:0')

In [60]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 5


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 5 || Run Time:    2.1 | Load Time:    3.0 || F1:  32.80 | Prec:  25.92 | Rec:  44.66 || Ex/s: 377.84



tensor(32.7986, device='cuda:0')
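
As a sanity check on the metrics printed above, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded `Prec` and `Rec` values in the Abt-Buy test log line. The helper name `f1_from_pr` is our own and is not part of `deepmatcher`. Because the inputs are rounded to two decimals, the result can differ slightly from the tensor value.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        # Matches the convention in the logs, where F1 is reported as 0.00
        # when both precision and recall are 0.00.
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Rounded values copied from the Abt-Buy test log line.
abt_buy_f1 = f1_from_pr(25.92, 44.66)
print(round(abt_buy_f1, 2))
```

The same check applies to any `Finished Epoch` line in this notebook.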

#### DIRTY

##### DBLP-ACM

In [61]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Dirty/DBLP-ACM/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [62]:
model = dm.MatchingModel(attr_summarizer='attention')

In [63]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='attention_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 4332002
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:29


Finished Epoch 1 || Run Time:   16.4 | Load Time:   13.5 || F1:  59.02 | Prec:  51.07 | Rec:  69.89 || Ex/s: 247.38

===>  EVAL Epoch 1


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 1 || Run Time:    3.2 | Load Time:    4.3 || F1:  80.51 | Prec:  76.73 | Rec:  84.68 || Ex/s: 329.43

* Best F1: tensor(80.5139, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:30


Finished Epoch 2 || Run Time:   16.8 | Load Time:   13.8 || F1:  81.23 | Prec:  74.69 | Rec:  89.04 || Ex/s: 242.03

===>  EVAL Epoch 2


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 2 || Run Time:    3.2 | Load Time:    4.3 || F1:  82.53 | Prec:  78.73 | Rec:  86.71 || Ex/s: 328.74

* Best F1: tensor(82.5295, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:29


Finished Epoch 3 || Run Time:   16.4 | Load Time:   13.5 || F1:  84.92 | Prec:  79.04 | Rec:  91.74 || Ex/s: 247.96

===>  EVAL Epoch 3


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 3 || Run Time:    3.2 | Load Time:    4.3 || F1:  81.27 | Prec:  71.11 | Rec:  94.82 || Ex/s: 327.74

---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:30


Finished Epoch 4 || Run Time:   16.9 | Load Time:   13.8 || F1:  87.55 | Prec:  82.75 | Rec:  92.94 || Ex/s: 241.98

===>  EVAL Epoch 4


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 4 || Run Time:    3.2 | Load Time:    4.3 || F1:  70.27 | Prec:  54.79 | Rec:  97.97 || Ex/s: 328.14

---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:29


Finished Epoch 5 || Run Time:   16.4 | Load Time:   13.5 || F1:  90.27 | Prec:  85.90 | Rec:  95.12 || Ex/s: 247.94

===>  EVAL Epoch 5


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 5 || Run Time:    3.2 | Load Time:    4.3 || F1:  69.70 | Prec:  53.95 | Rec:  98.42 || Ex/s: 329.95

---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:29


Finished Epoch 6 || Run Time:   16.6 | Load Time:   13.6 || F1:  92.27 | Prec:  88.49 | Rec:  96.40 || Ex/s: 246.14

===>  EVAL Epoch 6


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 6 || Run Time:    3.2 | Load Time:    4.3 || F1:  72.99 | Prec:  58.16 | Rec:  97.97 || Ex/s: 328.96

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:30


Finished Epoch 7 || Run Time:   16.8 | Load Time:   13.8 || F1:  93.71 | Prec:  90.38 | Rec:  97.30 || Ex/s: 242.87

===>  EVAL Epoch 7


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 7 || Run Time:    3.2 | Load Time:    4.3 || F1:  75.99 | Prec:  62.34 | Rec:  97.30 || Ex/s: 328.65

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:29


Finished Epoch 8 || Run Time:   16.5 | Load Time:   13.4 || F1:  94.21 | Prec:  90.98 | Rec:  97.67 || Ex/s: 247.71

===>  EVAL Epoch 8


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 8 || Run Time:    3.2 | Load Time:    4.3 || F1:  81.43 | Prec:  71.36 | Rec:  94.82 || Ex/s: 329.30

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:30


Finished Epoch 9 || Run Time:   16.6 | Load Time:   13.7 || F1:  94.76 | Prec:  91.95 | Rec:  97.75 || Ex/s: 244.63

===>  EVAL Epoch 9


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 9 || Run Time:    3.2 | Load Time:    4.3 || F1:  84.20 | Prec:  76.91 | Rec:  93.02 || Ex/s: 329.96

* Best F1: tensor(84.1998, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:29


Finished Epoch 10 || Run Time:   16.4 | Load Time:   13.4 || F1:  95.07 | Prec:  92.60 | Rec:  97.67 || Ex/s: 248.32

===>  EVAL Epoch 10


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 10 || Run Time:    3.2 | Load Time:    4.3 || F1:  86.20 | Prec:  80.99 | Rec:  92.12 || Ex/s: 327.32

* Best F1: tensor(86.1960, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:29


Finished Epoch 11 || Run Time:   16.7 | Load Time:   13.7 || F1:  95.85 | Prec:  93.82 | Rec:  97.97 || Ex/s: 244.58

===>  EVAL Epoch 11


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 11 || Run Time:    3.3 | Load Time:    4.4 || F1:  86.35 | Prec:  85.78 | Rec:  86.94 || Ex/s: 321.32

* Best F1: tensor(86.3535, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:29


Finished Epoch 12 || Run Time:   16.3 | Load Time:   13.4 || F1:  96.63 | Prec:  95.26 | Rec:  98.05 || Ex/s: 249.49

===>  EVAL Epoch 12


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 12 || Run Time:    3.2 | Load Time:    4.3 || F1:  86.39 | Prec:  86.29 | Rec:  86.49 || Ex/s: 328.10

* Best F1: tensor(86.3892, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:29


Finished Epoch 13 || Run Time:   16.4 | Load Time:   13.6 || F1:  96.92 | Prec:  95.75 | Rec:  98.12 || Ex/s: 247.14

===>  EVAL Epoch 13


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 13 || Run Time:    3.2 | Load Time:    4.3 || F1:  86.61 | Prec:  86.52 | Rec:  86.71 || Ex/s: 328.92

* Best F1: tensor(86.6142, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:30


Finished Epoch 14 || Run Time:   16.8 | Load Time:   13.7 || F1:  96.99 | Prec:  95.96 | Rec:  98.05 || Ex/s: 243.49

===>  EVAL Epoch 14


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 14 || Run Time:    3.2 | Load Time:    4.3 || F1:  87.05 | Prec:  87.84 | Rec:  86.26 || Ex/s: 327.33

* Best F1: tensor(87.0455, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:29


Finished Epoch 15 || Run Time:   16.5 | Load Time:   13.6 || F1:  97.22 | Prec:  96.18 | Rec:  98.27 || Ex/s: 246.91

===>  EVAL Epoch 15


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 15 || Run Time:    3.2 | Load Time:    4.3 || F1:  86.86 | Prec:  88.17 | Rec:  85.59 || Ex/s: 328.37

---------------------

Loading best model...
Training done.


tensor(87.0455, device='cuda:0')

In [64]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 14


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 14 || Run Time:    3.3 | Load Time:    4.3 || F1:  90.26 | Prec:  89.76 | Rec:  90.77 || Ex/s: 326.84



tensor(90.2576, device='cuda:0')

##### DBLP-GoogleScholar

In [65]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Dirty/DBLP-GoogleScholar/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [66]:
model = dm.MatchingModel(attr_summarizer='attention')

In [67]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='attention_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 4332002
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:06


Finished Epoch 1 || Run Time:   38.9 | Load Time:   28.4 || F1:  66.22 | Prec:  57.99 | Rec:  77.17 || Ex/s: 255.70

===>  EVAL Epoch 1


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 1 || Run Time:    7.2 | Load Time:    8.5 || F1:  76.25 | Prec:  67.14 | Rec:  88.22 || Ex/s: 366.82

* Best F1: tensor(76.2520, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:03


Finished Epoch 2 || Run Time:   37.0 | Load Time:   27.3 || F1:  80.34 | Prec:  73.12 | Rec:  89.15 || Ex/s: 267.74

===>  EVAL Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 2 || Run Time:    7.2 | Load Time:    8.5 || F1:  81.66 | Prec:  80.73 | Rec:  82.62 || Ex/s: 366.69

* Best F1: tensor(81.6628, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:03


Finished Epoch 3 || Run Time:   36.9 | Load Time:   27.3 || F1:  81.97 | Prec:  75.05 | Rec:  90.30 || Ex/s: 268.39

===>  EVAL Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 3 || Run Time:    7.2 | Load Time:    8.5 || F1:  79.16 | Prec:  83.06 | Rec:  75.61 || Ex/s: 366.61

---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:03


Finished Epoch 4 || Run Time:   37.0 | Load Time:   27.4 || F1:  81.19 | Prec:  73.75 | Rec:  90.30 || Ex/s: 267.47

===>  EVAL Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 4 || Run Time:    7.2 | Load Time:    8.5 || F1:  79.32 | Prec:  73.97 | Rec:  85.51 || Ex/s: 366.62

---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:03


Finished Epoch 5 || Run Time:   37.1 | Load Time:   27.3 || F1:  83.33 | Prec:  76.44 | Rec:  91.58 || Ex/s: 267.40

===>  EVAL Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:16


Finished Epoch 5 || Run Time:    7.5 | Load Time:    8.7 || F1:  78.72 | Prec:  71.80 | Rec:  87.10 || Ex/s: 355.53

---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:04


Finished Epoch 6 || Run Time:   37.6 | Load Time:   27.6 || F1:  85.22 | Prec:  78.73 | Rec:  92.89 || Ex/s: 263.96

===>  EVAL Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 6 || Run Time:    7.3 | Load Time:    8.5 || F1:  80.75 | Prec:  74.45 | Rec:  88.22 || Ex/s: 363.88

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:04


Finished Epoch 7 || Run Time:   37.2 | Load Time:   27.4 || F1:  87.17 | Prec:  81.61 | Rec:  93.55 || Ex/s: 266.55

===>  EVAL Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 7 || Run Time:    7.2 | Load Time:    8.5 || F1:  80.25 | Prec:  73.28 | Rec:  88.69 || Ex/s: 365.06

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:03


Finished Epoch 8 || Run Time:   37.2 | Load Time:   27.3 || F1:  88.29 | Prec:  82.91 | Rec:  94.42 || Ex/s: 266.85

===>  EVAL Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 8 || Run Time:    7.2 | Load Time:    8.5 || F1:  81.24 | Prec:  75.42 | Rec:  88.04 || Ex/s: 365.00

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:03


Finished Epoch 9 || Run Time:   36.9 | Load Time:   27.1 || F1:  89.55 | Prec:  84.71 | Rec:  94.98 || Ex/s: 269.14

===>  EVAL Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:16


Finished Epoch 9 || Run Time:    7.4 | Load Time:    8.7 || F1:  80.87 | Prec:  75.90 | Rec:  86.54 || Ex/s: 355.65

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:03


Finished Epoch 10 || Run Time:   37.0 | Load Time:   27.2 || F1:  90.42 | Prec:  85.97 | Rec:  95.35 || Ex/s: 268.31

===>  EVAL Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 10 || Run Time:    7.2 | Load Time:    8.5 || F1:  81.00 | Prec:  76.19 | Rec:  86.45 || Ex/s: 365.75

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:04


Finished Epoch 11 || Run Time:   37.4 | Load Time:   27.4 || F1:  91.92 | Prec:  88.22 | Rec:  95.95 || Ex/s: 266.07

===>  EVAL Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 11 || Run Time:    7.3 | Load Time:    8.5 || F1:  81.09 | Prec:  76.73 | Rec:  85.98 || Ex/s: 362.77

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:04


Finished Epoch 12 || Run Time:   37.5 | Load Time:   27.5 || F1:  92.75 | Prec:  89.43 | Rec:  96.32 || Ex/s: 265.08

===>  EVAL Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 12 || Run Time:    7.2 | Load Time:    8.5 || F1:  81.39 | Prec:  77.56 | Rec:  85.61 || Ex/s: 365.40

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:04


Finished Epoch 13 || Run Time:   37.4 | Load Time:   27.4 || F1:  93.63 | Prec:  90.92 | Rec:  96.51 || Ex/s: 265.58

===>  EVAL Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 13 || Run Time:    7.2 | Load Time:    8.5 || F1:  81.32 | Prec:  77.75 | Rec:  85.23 || Ex/s: 365.76

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:04


Finished Epoch 14 || Run Time:   37.5 | Load Time:   27.4 || F1:  94.40 | Prec:  91.96 | Rec:  96.98 || Ex/s: 265.60

===>  EVAL Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 14 || Run Time:    7.3 | Load Time:    8.5 || F1:  81.48 | Prec:  78.68 | Rec:  84.49 || Ex/s: 361.93

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:04


Finished Epoch 15 || Run Time:   37.8 | Load Time:   27.5 || F1:  94.96 | Prec:  92.75 | Rec:  97.29 || Ex/s: 263.57

===>  EVAL Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:15


Finished Epoch 15 || Run Time:    7.3 | Load Time:    8.5 || F1:  81.58 | Prec:  79.28 | Rec:  84.02 || Ex/s: 361.50

---------------------

Loading best model...
Training done.


tensor(81.6628, device='cuda:0')

In [68]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:16


Finished Epoch 2 || Run Time:    7.6 | Load Time:    8.8 || F1:  81.79 | Prec:  81.72 | Rec:  81.87 || Ex/s: 349.61



tensor(81.7927, device='cuda:0')
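
The per-epoch logs are long, so it can help to extract the validation metrics programmatically. The following is a minimal sketch. It assumes the log text has been captured as a string (for example by copying the cell output), and it only targets the `Finished Epoch` line format shown in this notebook. `parse_epoch_metrics` and `best_epoch` are hypothetical helper names, not `deepmatcher` functions.

```python
import re

# Matches lines such as:
# Finished Epoch 2 || Run Time: 7.2 | Load Time: 8.5 || F1: 81.66 | Prec: 80.73 | Rec: 82.62 || Ex/s: 366.69
EPOCH_LINE = re.compile(
    r"Finished Epoch\s+(\d+)\s+\|\|.*?"
    r"F1:\s*([\d.]+)\s*\|\s*Prec:\s*([\d.]+)\s*\|\s*Rec:\s*([\d.]+)"
)

def parse_epoch_metrics(log_text):
    """Return one dict per 'Finished Epoch' line found in log_text."""
    rows = []
    for match in EPOCH_LINE.finditer(log_text):
        epoch, f1, prec, rec = match.groups()
        rows.append({
            "epoch": int(epoch),
            "f1": float(f1),
            "prec": float(prec),
            "rec": float(rec),
        })
    return rows

def best_epoch(rows):
    """Return the row with the highest F1 (the first one on ties)."""
    return max(rows, key=lambda row: row["f1"])

# EVAL lines for epochs 1-3 of the DBLP-GoogleScholar run above.
# TRAIN lines use the same format, so filter them out before parsing
# if only validation metrics are wanted.
sample_log = """
Finished Epoch 1 || Run Time:    7.2 | Load Time:    8.5 || F1:  76.25 | Prec:  67.14 | Rec:  88.22 || Ex/s: 366.82
Finished Epoch 2 || Run Time:    7.2 | Load Time:    8.5 || F1:  81.66 | Prec:  80.73 | Rec:  82.62 || Ex/s: 366.69
Finished Epoch 3 || Run Time:    7.2 | Load Time:    8.5 || F1:  79.16 | Prec:  83.06 | Rec:  75.61 || Ex/s: 366.61
"""

rows = parse_epoch_metrics(sample_log)
best = best_epoch(rows)
print(best["epoch"], best["f1"])
```

On these three lines the best validation F1 is at epoch 2, which agrees with the `EVAL Epoch 2` header printed by `run_eval` for this dataset.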

### HYBRID

#### STRUCTURED

##### Amazon-Google

In [69]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/Amazon-Google/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [70]:
model = dm.MatchingModel(attr_summarizer='hybrid')

In [71]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='hybrid_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 7133105
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:23


Finished Epoch 1 || Run Time:   18.7 | Load Time:    4.9 || F1:  23.18 | Prec:  32.47 | Rec:  18.03 || Ex/s: 291.03

===>  EVAL Epoch 1


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 1 || Run Time:    3.2 | Load Time:    1.5 || F1:  36.21 | Prec:  29.97 | Rec:  45.73 || Ex/s: 489.04

* Best F1: tensor(36.2098, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:23


Finished Epoch 2 || Run Time:   18.6 | Load Time:    4.8 || F1:  43.91 | Prec:  40.31 | Rec:  48.21 || Ex/s: 294.08

===>  EVAL Epoch 2


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 2 || Run Time:    3.2 | Load Time:    1.5 || F1:  39.56 | Prec:  29.06 | Rec:  61.97 || Ex/s: 483.76

* Best F1: tensor(39.5634, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:23


Finished Epoch 3 || Run Time:   18.7 | Load Time:    4.8 || F1:  55.01 | Prec:  49.37 | Rec:  62.09 || Ex/s: 292.17

===>  EVAL Epoch 3


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 3 || Run Time:    3.2 | Load Time:    1.5 || F1:  45.42 | Prec:  38.62 | Rec:  55.13 || Ex/s: 482.48

* Best F1: tensor(45.4225, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:23


Finished Epoch 4 || Run Time:   18.7 | Load Time:    4.8 || F1:  64.59 | Prec:  58.04 | Rec:  72.82 || Ex/s: 292.73

===>  EVAL Epoch 4


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:05


Finished Epoch 4 || Run Time:    3.5 | Load Time:    1.7 || F1:  47.93 | Prec:  52.00 | Rec:  44.44 || Ex/s: 441.35

* Best F1: tensor(47.9263, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:23


Finished Epoch 5 || Run Time:   18.9 | Load Time:    4.8 || F1:  72.12 | Prec:  65.57 | Rec:  80.11 || Ex/s: 289.95

===>  EVAL Epoch 5


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 5 || Run Time:    3.2 | Load Time:    1.5 || F1:  49.79 | Prec:  48.39 | Rec:  51.28 || Ex/s: 486.31

* Best F1: tensor(49.7925, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:23


Finished Epoch 6 || Run Time:   18.6 | Load Time:    4.8 || F1:  78.88 | Prec:  72.65 | Rec:  86.27 || Ex/s: 293.68

===>  EVAL Epoch 6


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 6 || Run Time:    3.2 | Load Time:    1.5 || F1:  47.65 | Prec:  42.47 | Rec:  54.27 || Ex/s: 488.79

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:23


Finished Epoch 7 || Run Time:   18.5 | Load Time:    4.8 || F1:  84.94 | Prec:  79.21 | Rec:  91.56 || Ex/s: 295.23

===>  EVAL Epoch 7


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 7 || Run Time:    3.2 | Load Time:    1.5 || F1:  47.81 | Prec:  44.78 | Rec:  51.28 || Ex/s: 487.77

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:23


Finished Epoch 8 || Run Time:   19.0 | Load Time:    4.9 || F1:  88.38 | Prec:  83.29 | Rec:  94.13 || Ex/s: 287.72

===>  EVAL Epoch 8


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 8 || Run Time:    3.2 | Load Time:    1.5 || F1:  45.36 | Prec:  43.82 | Rec:  47.01 || Ex/s: 486.43

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:23


Finished Epoch 9 || Run Time:   18.4 | Load Time:    4.8 || F1:  92.07 | Prec:  88.20 | Rec:  96.28 || Ex/s: 296.65

===>  EVAL Epoch 9


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 9 || Run Time:    3.1 | Load Time:    1.5 || F1:  45.34 | Prec:  43.08 | Rec:  47.86 || Ex/s: 493.04

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:23


Finished Epoch 10 || Run Time:   18.3 | Load Time:    4.8 || F1:  94.81 | Prec:  91.82 | Rec:  98.00 || Ex/s: 297.52

===>  EVAL Epoch 10


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 10 || Run Time:    3.1 | Load Time:    1.5 || F1:  46.36 | Prec:  47.95 | Rec:  44.87 || Ex/s: 490.74

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:23


Finished Epoch 11 || Run Time:   18.9 | Load Time:    4.9 || F1:  96.42 | Prec:  94.63 | Rec:  98.28 || Ex/s: 288.77

===>  EVAL Epoch 11


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 11 || Run Time:    3.1 | Load Time:    1.5 || F1:  45.81 | Prec:  47.27 | Rec:  44.44 || Ex/s: 491.42

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:22


Finished Epoch 12 || Run Time:   18.3 | Load Time:    4.7 || F1:  97.46 | Prec:  96.11 | Rec:  98.86 || Ex/s: 298.72

===>  EVAL Epoch 12


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 12 || Run Time:    3.2 | Load Time:    1.5 || F1:  46.08 | Prec:  50.00 | Rec:  42.74 || Ex/s: 490.43

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:23


Finished Epoch 13 || Run Time:   18.3 | Load Time:    4.7 || F1:  97.88 | Prec:  96.91 | Rec:  98.86 || Ex/s: 298.05

===>  EVAL Epoch 13


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 13 || Run Time:    3.2 | Load Time:    1.5 || F1:  44.08 | Prec:  49.47 | Rec:  39.74 || Ex/s: 490.33

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:23


Finished Epoch 14 || Run Time:   18.9 | Load Time:    4.8 || F1:  98.22 | Prec:  97.60 | Rec:  98.86 || Ex/s: 289.59

===>  EVAL Epoch 14


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 14 || Run Time:    3.1 | Load Time:    1.5 || F1:  44.39 | Prec:  48.97 | Rec:  40.60 || Ex/s: 490.33

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:23


Finished Epoch 15 || Run Time:   18.6 | Load Time:    4.8 || F1:  98.30 | Prec:  97.60 | Rec:  99.00 || Ex/s: 294.14

===>  EVAL Epoch 15


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 15 || Run Time:    3.2 | Load Time:    1.5 || F1:  43.93 | Prec:  48.45 | Rec:  40.17 || Ex/s: 483.77

---------------------

Loading best model...
Training done.


tensor(49.7925, device='cuda:0')

In [72]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 5


0% [██████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:04


Finished Epoch 5 || Run Time:    3.2 | Load Time:    1.5 || F1:  48.16 | Prec:  46.09 | Rec:  50.43 || Ex/s: 482.85



tensor(48.1633, device='cuda:0')
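
The cells in this section repeat the same process, train, and evaluate pattern for each dataset, and every run with the same summarizer writes to the same checkpoint file, so each run overwrites the previous checkpoint. One possible way to reduce the repetition is sketched below. The helpers `build_runs` and `run_all` and the per-dataset checkpoint naming are assumptions for illustration; only the `deepmatcher` calls and arguments already used in this notebook are relied on.

```python
import posixpath

# Root directory used by the cells in this notebook.
ROOT = '/content/IC/datasesErros/BDirty_50_3_5'

def build_runs(root, datasets, summarizer):
    """Return one config dict per (category, dataset name) pair."""
    runs = []
    for category, name in datasets:
        runs.append({
            'path': posixpath.join(root, category, name) + '/',
            'summarizer': summarizer,
            # Hypothetical naming scheme: one checkpoint per model and dataset.
            'save_path': f'{summarizer}_{name}.pth',
        })
    return runs

def run_all(runs, epochs=15, batch_size=32, pos_neg_ratio=3):
    """Train and evaluate one model per run; returns test F1 keyed by path."""
    import deepmatcher as dm  # imported lazily so build_runs works without it
    results = {}
    for run in runs:
        train, validation, test = dm.data.process(
            path=run['path'],
            train='joined_train.csv',
            validation='joined_valid.csv',
            test='joined_test.csv')
        model = dm.MatchingModel(attr_summarizer=run['summarizer'])
        model.run_train(
            train,
            validation,
            epochs=epochs,
            batch_size=batch_size,
            best_save_path=run['save_path'],
            pos_neg_ratio=pos_neg_ratio)
        results[run['path']] = model.run_eval(test)
    return results

hybrid_runs = build_runs(
    ROOT,
    [('Structured', 'Amazon-Google'), ('Structured', 'Beer')],
    'hybrid')
```

Calling `run_all(hybrid_runs)` in a Colab session with the data in place would then train both hybrid models in sequence; it is not invoked here because it needs the datasets and a working `deepmatcher` install.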

##### Beer

In [73]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/Beer/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [74]:
model = dm.MatchingModel(attr_summarizer='hybrid')

In [75]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='hybrid_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 9210006
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 1 || Run Time:    1.1 | Load Time:    0.3 || F1:   8.33 | Prec:   9.38 | Rec:   7.50 || Ex/s: 194.89

===>  EVAL Epoch 1
Finished Epoch 1 || Run Time:    0.2 | Load Time:    0.1 || F1:   0.00 | Prec:   0.00 | Rec:   0.00 || Ex/s: 328.24

* Best F1: tensor(0., device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 2 || Run Time:    1.0 | Load Time:    0.3 || F1:  48.15 | Prec:  92.86 | Rec:  32.50 || Ex/s: 199.94

===>  EVAL Epoch 2
Finished Epoch 2 || Run Time:    0.2 | Load Time:    0.1 || F1:   0.00 | Prec:   0.00 | Rec:   0.00 || Ex/s: 340.50

---------------------

===>  TRAIN Epoch 3


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 3 || Run Time:    1.0 | Load Time:    0.3 || F1:  69.33 | Prec:  74.29 | Rec:  65.00 || Ex/s: 204.39

===>  EVAL Epoch 3
Finished Epoch 3 || Run Time:    0.2 | Load Time:    0.1 || F1:  52.17 | Prec:  37.50 | Rec:  85.71 || Ex/s: 318.66

* Best F1: tensor(52.1739, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 4 || Run Time:    1.1 | Load Time:    0.3 || F1:  72.16 | Prec:  61.40 | Rec:  87.50 || Ex/s: 196.95

===>  EVAL Epoch 4
Finished Epoch 4 || Run Time:    0.2 | Load Time:    0.1 || F1:  62.50 | Prec:  55.56 | Rec:  71.43 || Ex/s: 327.58

* Best F1: tensor(62.5000, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 5 || Run Time:    1.1 | Load Time:    0.3 || F1:  77.08 | Prec:  66.07 | Rec:  92.50 || Ex/s: 197.70

===>  EVAL Epoch 5
Finished Epoch 5 || Run Time:    0.2 | Load Time:    0.1 || F1:  64.52 | Prec:  58.82 | Rec:  71.43 || Ex/s: 338.48

* Best F1: tensor(64.5161, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 6


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 6 || Run Time:    1.0 | Load Time:    0.3 || F1:  91.57 | Prec:  88.37 | Rec:  95.00 || Ex/s: 203.34

===>  EVAL Epoch 6
Finished Epoch 6 || Run Time:    0.2 | Load Time:    0.1 || F1:  64.86 | Prec:  52.17 | Rec:  85.71 || Ex/s: 345.02

* Best F1: tensor(64.8649, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 7


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 7 || Run Time:    1.0 | Load Time:    0.3 || F1:  90.48 | Prec:  86.36 | Rec:  95.00 || Ex/s: 201.27

===>  EVAL Epoch 7
Finished Epoch 7 || Run Time:    0.2 | Load Time:    0.1 || F1:  75.00 | Prec:  66.67 | Rec:  85.71 || Ex/s: 334.81

* Best F1: tensor(75., device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 8


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 8 || Run Time:    1.0 | Load Time:    0.3 || F1:  96.30 | Prec:  95.12 | Rec:  97.50 || Ex/s: 201.84

===>  EVAL Epoch 8
Finished Epoch 8 || Run Time:    0.2 | Load Time:    0.1 || F1:  74.29 | Prec:  61.90 | Rec:  92.86 || Ex/s: 349.21

---------------------

===>  TRAIN Epoch 9


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 9 || Run Time:    1.0 | Load Time:    0.3 || F1:  96.30 | Prec:  95.12 | Rec:  97.50 || Ex/s: 202.96

===>  EVAL Epoch 9
Finished Epoch 9 || Run Time:    0.2 | Load Time:    0.1 || F1:  78.79 | Prec:  68.42 | Rec:  92.86 || Ex/s: 334.09

* Best F1: tensor(78.7879, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 10


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 10 || Run Time:    1.0 | Load Time:    0.3 || F1:  98.77 | Prec:  97.56 | Rec: 100.00 || Ex/s: 198.65

===>  EVAL Epoch 10
Finished Epoch 10 || Run Time:    0.2 | Load Time:    0.1 || F1:  74.29 | Prec:  61.90 | Rec:  92.86 || Ex/s: 338.98

---------------------

===>  TRAIN Epoch 11


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 11 || Run Time:    1.0 | Load Time:    0.3 || F1:  98.77 | Prec:  97.56 | Rec: 100.00 || Ex/s: 203.81

===>  EVAL Epoch 11
Finished Epoch 11 || Run Time:    0.2 | Load Time:    0.1 || F1:  81.25 | Prec:  72.22 | Rec:  92.86 || Ex/s: 337.56

* Best F1: tensor(81.2500, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 12


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 12 || Run Time:    1.0 | Load Time:    0.3 || F1:  98.77 | Prec:  97.56 | Rec: 100.00 || Ex/s: 202.64

===>  EVAL Epoch 12
Finished Epoch 12 || Run Time:    0.2 | Load Time:    0.1 || F1:  78.79 | Prec:  68.42 | Rec:  92.86 || Ex/s: 344.09

---------------------

===>  TRAIN Epoch 13


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 13 || Run Time:    1.0 | Load Time:    0.3 || F1:  98.77 | Prec:  97.56 | Rec: 100.00 || Ex/s: 206.10

===>  EVAL Epoch 13
Finished Epoch 13 || Run Time:    0.2 | Load Time:    0.1 || F1:  78.79 | Prec:  68.42 | Rec:  92.86 || Ex/s: 331.07

---------------------

===>  TRAIN Epoch 14


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 14 || Run Time:    1.0 | Load Time:    0.3 || F1:  98.77 | Prec:  97.56 | Rec: 100.00 || Ex/s: 202.17

===>  EVAL Epoch 14
Finished Epoch 14 || Run Time:    0.2 | Load Time:    0.1 || F1:  78.79 | Prec:  68.42 | Rec:  92.86 || Ex/s: 339.45

---------------------

===>  TRAIN Epoch 15


0% [█] 100% | ETA: 00:00:00
Total time elapsed: 00:00:00


Finished Epoch 15 || Run Time:    1.0 | Load Time:    0.3 || F1:  98.77 | Prec:  97.56 | Rec: 100.00 || Ex/s: 202.03

===>  EVAL Epoch 15
Finished Epoch 15 || Run Time:    0.2 | Load Time:    0.1 || F1:  78.79 | Prec:  68.42 | Rec:  92.86 || Ex/s: 330.66

---------------------

Loading best model...
Training done.


tensor(81.2500, device='cuda:0')

In [76]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 11




Finished Epoch 11 || Run Time:    0.2 | Load Time:    0.1 || F1:  64.52 | Prec:  58.82 | Rec:  71.43 || Ex/s: 337.48



tensor(64.5161, device='cuda:0')
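
The F1 value in each `Finished Epoch` line is the harmonic mean of the precision and recall printed next to it. As a quick sanity check, the sketch below recomputes it from the rounded precision and recall logged for the Beer test set above. The helper name `f1_from_precision_recall` is hypothetical and not part of `deepmatcher`. Because the inputs are already rounded to two decimals, the result should agree with the logged value only to within rounding.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        # Convention assumed here: report 0 when both are 0,
        # as in the 0.00 rows logged for the first Beer eval epochs.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values copied from the Beer test log line above.
beer_test_f1 = f1_from_precision_recall(58.82, 71.43)

# Should land close to the logged F1 of 64.52.
print(f"{beer_test_f1:.2f}")
```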

##### DBLP-ACM

In [77]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/DBLP-ACM/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [78]:
model = dm.MatchingModel(attr_summarizer='hybrid')

In [79]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='hybrid_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 9210006
===>  TRAIN Epoch 1


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:39


Finished Epoch 1 || Run Time:   28.3 | Load Time:   11.7 || F1:  84.49 | Prec:  77.09 | Rec:  93.47 || Ex/s: 185.28

===>  EVAL Epoch 1


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 1 || Run Time:    4.9 | Load Time:    3.8 || F1:  94.74 | Prec:  90.55 | Rec:  99.32 || Ex/s: 284.46

* Best F1: tensor(94.7368, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:38


Finished Epoch 2 || Run Time:   27.6 | Load Time:   11.6 || F1:  96.78 | Prec:  94.43 | Rec:  99.25 || Ex/s: 189.57

===>  EVAL Epoch 2


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 2 || Run Time:    4.9 | Load Time:    3.8 || F1:  94.94 | Prec:  90.93 | Rec:  99.32 || Ex/s: 284.72

* Best F1: tensor(94.9408, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:39


Finished Epoch 3 || Run Time:   28.2 | Load Time:   11.7 || F1:  98.07 | Prec:  96.85 | Rec:  99.32 || Ex/s: 185.69

===>  EVAL Epoch 3


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 3 || Run Time:    4.9 | Load Time:    3.8 || F1:  98.10 | Prec:  97.13 | Rec:  99.10 || Ex/s: 282.74

* Best F1: tensor(98.1048, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:38


Finished Epoch 4 || Run Time:   27.7 | Load Time:   11.5 || F1:  98.85 | Prec:  98.08 | Rec:  99.62 || Ex/s: 189.25

===>  EVAL Epoch 4


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 4 || Run Time:    4.9 | Load Time:    3.9 || F1:  98.32 | Prec:  97.77 | Rec:  98.87 || Ex/s: 282.82

* Best F1: tensor(98.3203, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:39


Finished Epoch 5 || Run Time:   28.1 | Load Time:   11.7 || F1:  99.18 | Prec:  98.74 | Rec:  99.62 || Ex/s: 186.15

===>  EVAL Epoch 5


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 5 || Run Time:    4.9 | Load Time:    3.8 || F1:  98.10 | Prec:  97.34 | Rec:  98.87 || Ex/s: 285.43

---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:38


Finished Epoch 6 || Run Time:   27.7 | Load Time:   11.5 || F1:  99.51 | Prec:  99.18 | Rec:  99.85 || Ex/s: 188.88

===>  EVAL Epoch 6


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 6 || Run Time:    4.9 | Load Time:    3.8 || F1:  98.11 | Prec:  96.92 | Rec:  99.32 || Ex/s: 283.22

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:39


Finished Epoch 7 || Run Time:   28.1 | Load Time:   11.7 || F1:  99.59 | Prec:  99.33 | Rec:  99.85 || Ex/s: 186.59

===>  EVAL Epoch 7


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 7 || Run Time:    4.9 | Load Time:    3.8 || F1:  97.36 | Prec:  95.26 | Rec:  99.55 || Ex/s: 282.07

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:38


Finished Epoch 8 || Run Time:   27.9 | Load Time:   11.5 || F1:  99.66 | Prec:  99.48 | Rec:  99.85 || Ex/s: 188.34

===>  EVAL Epoch 8


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 8 || Run Time:    4.9 | Load Time:    3.8 || F1:  98.00 | Prec:  96.71 | Rec:  99.32 || Ex/s: 283.34

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:39


Finished Epoch 9 || Run Time:   28.2 | Load Time:   11.6 || F1:  99.78 | Prec:  99.63 | Rec:  99.92 || Ex/s: 186.27

===>  EVAL Epoch 9


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 9 || Run Time:    5.0 | Load Time:    3.9 || F1:  97.97 | Prec:  97.97 | Rec:  97.97 || Ex/s: 278.11

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:38


Finished Epoch 10 || Run Time:   27.7 | Load Time:   11.5 || F1:  99.89 | Prec:  99.78 | Rec: 100.00 || Ex/s: 189.11

===>  EVAL Epoch 10


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 10 || Run Time:    4.9 | Load Time:    3.8 || F1:  97.85 | Prec:  98.19 | Rec:  97.52 || Ex/s: 284.56

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:39


Finished Epoch 11 || Run Time:   28.4 | Load Time:   11.7 || F1:  99.89 | Prec:  99.78 | Rec: 100.00 || Ex/s: 185.19

===>  EVAL Epoch 11


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 11 || Run Time:    4.9 | Load Time:    3.8 || F1:  97.85 | Prec:  98.19 | Rec:  97.52 || Ex/s: 284.55

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:38


Finished Epoch 12 || Run Time:   27.7 | Load Time:   11.5 || F1:  99.89 | Prec:  99.78 | Rec: 100.00 || Ex/s: 189.02

===>  EVAL Epoch 12


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 12 || Run Time:    4.9 | Load Time:    3.8 || F1:  97.85 | Prec:  98.19 | Rec:  97.52 || Ex/s: 281.64

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:39


Finished Epoch 13 || Run Time:   28.2 | Load Time:   11.7 || F1:  99.89 | Prec:  99.78 | Rec: 100.00 || Ex/s: 185.99

===>  EVAL Epoch 13


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 13 || Run Time:    4.9 | Load Time:    3.8 || F1:  97.74 | Prec:  97.96 | Rec:  97.52 || Ex/s: 283.92

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:38


Finished Epoch 14 || Run Time:   27.7 | Load Time:   11.5 || F1:  99.89 | Prec:  99.78 | Rec: 100.00 || Ex/s: 189.20

===>  EVAL Epoch 14


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 14 || Run Time:    4.9 | Load Time:    3.8 || F1:  97.74 | Prec:  97.96 | Rec:  97.52 || Ex/s: 282.47

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:39


Finished Epoch 15 || Run Time:   28.2 | Load Time:   11.7 || F1:  99.89 | Prec:  99.78 | Rec: 100.00 || Ex/s: 185.86

===>  EVAL Epoch 15


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 15 || Run Time:    4.9 | Load Time:    3.8 || F1:  97.74 | Prec:  97.96 | Rec:  97.52 || Ex/s: 284.20

---------------------

Loading best model...
Training done.


tensor(98.3203, device='cuda:0')

In [80]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 4


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:08


Finished Epoch 4 || Run Time:    5.1 | Load Time:    3.8 || F1:  97.99 | Prec:  97.12 | Rec:  98.87 || Ex/s: 277.81



tensor(97.9911, device='cuda:0')

##### DBLP-GoogleScholar

In [81]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/DBLP-GoogleScholar/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [82]:
model = dm.MatchingModel(attr_summarizer='hybrid')

In [83]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='hybrid_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 9210006
===>  TRAIN Epoch 1


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:26


Finished Epoch 1 || Run Time:   63.6 | Load Time:   23.9 || F1:  79.09 | Prec:  71.70 | Rec:  88.18 || Ex/s: 196.74

===>  EVAL Epoch 1


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 1 || Run Time:   11.1 | Load Time:    7.8 || F1:  89.60 | Prec:  89.40 | Rec:  89.81 || Ex/s: 304.57

* Best F1: tensor(89.6037, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:27


Finished Epoch 2 || Run Time:   63.8 | Load Time:   23.9 || F1:  91.06 | Prec:  86.27 | Rec:  96.41 || Ex/s: 196.28

===>  EVAL Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 2 || Run Time:   11.1 | Load Time:    7.7 || F1:  90.19 | Prec:  85.51 | Rec:  95.42 || Ex/s: 304.71

* Best F1: tensor(90.1943, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:26


Finished Epoch 3 || Run Time:   63.5 | Load Time:   23.8 || F1:  94.52 | Prec:  91.36 | Rec:  97.91 || Ex/s: 197.31

===>  EVAL Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 3 || Run Time:   11.2 | Load Time:    7.8 || F1:  90.98 | Prec:  91.02 | Rec:  90.93 || Ex/s: 303.11

* Best F1: tensor(90.9771, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:27


Finished Epoch 4 || Run Time:   63.8 | Load Time:   23.9 || F1:  96.19 | Prec:  93.96 | Rec:  98.53 || Ex/s: 196.24

===>  EVAL Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 4 || Run Time:   11.1 | Load Time:    7.7 || F1:  91.90 | Prec:  92.03 | Rec:  91.78 || Ex/s: 304.98

* Best F1: tensor(91.9045, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:26


Finished Epoch 5 || Run Time:   63.0 | Load Time:   23.8 || F1:  97.32 | Prec:  95.77 | Rec:  98.91 || Ex/s: 198.36

===>  EVAL Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:19


Finished Epoch 5 || Run Time:   11.3 | Load Time:    7.9 || F1:  92.03 | Prec:  90.29 | Rec:  93.83 || Ex/s: 298.96

* Best F1: tensor(92.0257, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:26


Finished Epoch 6 || Run Time:   63.3 | Load Time:   23.8 || F1:  97.93 | Prec:  96.97 | Rec:  98.91 || Ex/s: 197.69

===>  EVAL Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 6 || Run Time:   11.1 | Load Time:    7.7 || F1:  90.33 | Prec:  85.30 | Rec:  95.98 || Ex/s: 305.25

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:26


Finished Epoch 7 || Run Time:   63.3 | Load Time:   23.8 || F1:  98.65 | Prec:  98.00 | Rec:  99.31 || Ex/s: 197.62

===>  EVAL Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:19


Finished Epoch 7 || Run Time:   11.3 | Load Time:    7.7 || F1:  91.60 | Prec:  88.23 | Rec:  95.23 || Ex/s: 301.03

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:26


Finished Epoch 8 || Run Time:   63.3 | Load Time:   23.9 || F1:  98.98 | Prec:  98.52 | Rec:  99.44 || Ex/s: 197.52

===>  EVAL Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 8 || Run Time:   11.1 | Load Time:    7.7 || F1:  92.21 | Prec:  91.53 | Rec:  92.90 || Ex/s: 305.33

* Best F1: tensor(92.2078, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:26


Finished Epoch 9 || Run Time:   63.2 | Load Time:   23.8 || F1:  99.32 | Prec:  99.07 | Rec:  99.56 || Ex/s: 198.03

===>  EVAL Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 9 || Run Time:   11.1 | Load Time:    7.7 || F1:  92.12 | Prec:  91.36 | Rec:  92.90 || Ex/s: 304.33

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:26


Finished Epoch 10 || Run Time:   63.2 | Load Time:   23.9 || F1:  99.41 | Prec:  99.19 | Rec:  99.63 || Ex/s: 197.83

===>  EVAL Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 10 || Run Time:   11.2 | Load Time:    7.8 || F1:  92.16 | Prec:  91.99 | Rec:  92.34 || Ex/s: 302.84

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:26


Finished Epoch 11 || Run Time:   63.2 | Load Time:   23.8 || F1:  99.49 | Prec:  99.29 | Rec:  99.69 || Ex/s: 197.86

===>  EVAL Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 11 || Run Time:   11.1 | Load Time:    7.7 || F1:  92.27 | Prec:  91.93 | Rec:  92.62 || Ex/s: 305.39

* Best F1: tensor(92.2719, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:26


Finished Epoch 12 || Run Time:   63.2 | Load Time:   23.8 || F1:  99.56 | Prec:  99.35 | Rec:  99.78 || Ex/s: 197.88

===>  EVAL Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 12 || Run Time:   11.0 | Load Time:    7.8 || F1:  92.21 | Prec:  92.00 | Rec:  92.43 || Ex/s: 305.50

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:26


Finished Epoch 13 || Run Time:   63.2 | Load Time:   23.8 || F1:  99.64 | Prec:  99.50 | Rec:  99.78 || Ex/s: 197.98

===>  EVAL Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:18


Finished Epoch 13 || Run Time:   11.2 | Load Time:    7.8 || F1:  92.00 | Prec:  92.61 | Rec:  91.40 || Ex/s: 303.44

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:26


Finished Epoch 14 || Run Time:   63.2 | Load Time:   23.8 || F1:  99.69 | Prec:  99.53 | Rec:  99.84 || Ex/s: 197.84

===>  EVAL Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:19


Finished Epoch 14 || Run Time:   11.3 | Load Time:    7.9 || F1:  91.88 | Prec:  92.84 | Rec:  90.93 || Ex/s: 298.27

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:25


Finished Epoch 15 || Run Time:   62.8 | Load Time:   23.6 || F1:  99.74 | Prec:  99.60 | Rec:  99.88 || Ex/s: 199.30

===>  EVAL Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:19


Finished Epoch 15 || Run Time:   11.6 | Load Time:    7.9 || F1:  91.87 | Prec:  92.93 | Rec:  90.84 || Ex/s: 294.37

---------------------

Loading best model...
Training done.


tensor(92.2719, device='cuda:0')

In [84]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:19


Finished Epoch 11 || Run Time:   11.3 | Load Time:    7.9 || F1:  91.28 | Prec:  91.07 | Rec:  91.50 || Ex/s: 299.41



tensor(91.2821, device='cuda:0')
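
The logs above can also be read programmatically, for example to confirm which epoch had the best validation F1 and how far training F1 had pulled ahead of it by then. The following is a minimal sketch, assuming the log format shown in this notebook. The names `parse_f1_by_phase`, `PHASE_RE`, and `F1_RE` are hypothetical helpers, not part of `deepmatcher`, and the excerpt is copied from epochs 10 to 12 of the DBLP-GoogleScholar run above.

```python
import re

# Excerpt copied from the DBLP-GoogleScholar training log above (epochs 10-12).
LOG_EXCERPT = """
===>  TRAIN Epoch 10
Finished Epoch 10 || Run Time:   63.2 | Load Time:   23.9 || F1:  99.41 | Prec:  99.19 | Rec:  99.63 || Ex/s: 197.83
===>  EVAL Epoch 10
Finished Epoch 10 || Run Time:   11.2 | Load Time:    7.8 || F1:  92.16 | Prec:  91.99 | Rec:  92.34 || Ex/s: 302.84
===>  TRAIN Epoch 11
Finished Epoch 11 || Run Time:   63.2 | Load Time:   23.8 || F1:  99.49 | Prec:  99.29 | Rec:  99.69 || Ex/s: 197.86
===>  EVAL Epoch 11
Finished Epoch 11 || Run Time:   11.1 | Load Time:    7.7 || F1:  92.27 | Prec:  91.93 | Rec:  92.62 || Ex/s: 305.39
===>  TRAIN Epoch 12
Finished Epoch 12 || Run Time:   63.2 | Load Time:   23.8 || F1:  99.56 | Prec:  99.35 | Rec:  99.78 || Ex/s: 197.88
===>  EVAL Epoch 12
Finished Epoch 12 || Run Time:   11.0 | Load Time:    7.8 || F1:  92.21 | Prec:  92.00 | Rec:  92.43 || Ex/s: 305.50
"""

PHASE_RE = re.compile(r"^===>\s+(TRAIN|EVAL) Epoch (\d+)")
F1_RE = re.compile(r"^Finished Epoch (\d+) \|\|.*\bF1:\s*([\d.]+)")


def parse_f1_by_phase(log_text):
    """Return {'TRAIN': {epoch: f1}, 'EVAL': {epoch: f1}} from a training log."""
    scores = {"TRAIN": {}, "EVAL": {}}
    phase = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        phase_match = PHASE_RE.match(line)
        if phase_match:
            # Remember whether the next "Finished Epoch" line is train or eval.
            phase = phase_match.group(1)
            continue
        f1_match = F1_RE.match(line)
        if f1_match and phase is not None:
            scores[phase][int(f1_match.group(1))] = float(f1_match.group(2))
    return scores


scores = parse_f1_by_phase(LOG_EXCERPT)
best_epoch = max(scores["EVAL"], key=scores["EVAL"].get)
train_eval_gap = scores["TRAIN"][best_epoch] - scores["EVAL"][best_epoch]
print(best_epoch, scores["EVAL"][best_epoch], round(train_eval_gap, 2))
```

Note that the test-set call to `run_eval` prints the same `EVAL Epoch N` header as validation, so this sketch should only be fed the output of `run_train`; otherwise the test line would overwrite the validation entry for that epoch.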

##### Walmart-Amazon

In [85]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Structured/Walmart-Amazon/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [86]:
model = dm.MatchingModel(attr_summarizer='hybrid')

In [87]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='hybrid_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 11286907
===>  TRAIN Epoch 1


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:33


Finished Epoch 1 || Run Time:   26.8 | Load Time:    6.8 || F1:  34.01 | Prec:  53.96 | Rec:  24.83 || Ex/s: 183.20

===>  EVAL Epoch 1


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 1 || Run Time:    4.7 | Load Time:    2.2 || F1:  55.19 | Prec:  64.58 | Rec:  48.19 || Ex/s: 297.32

* Best F1: tensor(55.1929, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:33


Finished Epoch 2 || Run Time:   27.2 | Load Time:    6.9 || F1:  58.89 | Prec:  64.27 | Rec:  54.34 || Ex/s: 180.23

===>  EVAL Epoch 2


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 2 || Run Time:    4.7 | Load Time:    2.2 || F1:  58.36 | Prec:  59.78 | Rec:  56.99 || Ex/s: 297.06

* Best F1: tensor(58.3554, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:33


Finished Epoch 3 || Run Time:   26.7 | Load Time:    6.7 || F1:  69.07 | Prec:  66.77 | Rec:  71.53 || Ex/s: 183.55

===>  EVAL Epoch 3


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 3 || Run Time:    4.7 | Load Time:    2.2 || F1:  54.04 | Prec:  45.85 | Rec:  65.80 || Ex/s: 297.96

---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:33


Finished Epoch 4 || Run Time:   27.2 | Load Time:    6.8 || F1:  77.15 | Prec:  73.02 | Rec:  81.77 || Ex/s: 181.08

===>  EVAL Epoch 4


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:07


Finished Epoch 4 || Run Time:    4.9 | Load Time:    2.3 || F1:  58.43 | Prec:  53.95 | Rec:  63.73 || Ex/s: 282.98

* Best F1: tensor(58.4323, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:32


Finished Epoch 5 || Run Time:   26.5 | Load Time:    6.7 || F1:  87.28 | Prec:  83.73 | Rec:  91.15 || Ex/s: 185.07

===>  EVAL Epoch 5


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 5 || Run Time:    4.6 | Load Time:    2.2 || F1:  60.29 | Prec:  68.42 | Rec:  53.89 || Ex/s: 301.05

* Best F1: tensor(60.2899, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:33


Finished Epoch 6 || Run Time:   26.8 | Load Time:    6.7 || F1:  91.58 | Prec:  88.89 | Rec:  94.44 || Ex/s: 183.26

===>  EVAL Epoch 6


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 6 || Run Time:    4.6 | Load Time:    2.2 || F1:  61.14 | Prec:  68.15 | Rec:  55.44 || Ex/s: 301.35

* Best F1: tensor(61.1429, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:33


Finished Epoch 7 || Run Time:   27.0 | Load Time:    6.8 || F1:  94.82 | Prec:  92.85 | Rec:  96.88 || Ex/s: 181.73

===>  EVAL Epoch 7


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 7 || Run Time:    4.6 | Load Time:    2.2 || F1:  60.97 | Prec:  67.72 | Rec:  55.44 || Ex/s: 301.23

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:33


Finished Epoch 8 || Run Time:   26.8 | Load Time:    6.7 || F1:  96.83 | Prec:  95.60 | Rec:  98.09 || Ex/s: 183.43

===>  EVAL Epoch 8


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 8 || Run Time:    4.8 | Load Time:    2.2 || F1:  60.17 | Prec:  65.06 | Rec:  55.96 || Ex/s: 294.31

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:33


Finished Epoch 9 || Run Time:   27.3 | Load Time:    6.8 || F1:  97.67 | Prec:  96.92 | Rec:  98.44 || Ex/s: 180.15

===>  EVAL Epoch 9


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 9 || Run Time:    4.7 | Load Time:    2.2 || F1:  62.03 | Prec:  64.09 | Rec:  60.10 || Ex/s: 298.71

* Best F1: tensor(62.0321, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:33


Finished Epoch 10 || Run Time:   27.0 | Load Time:    6.8 || F1:  98.19 | Prec:  97.60 | Rec:  98.78 || Ex/s: 182.10

===>  EVAL Epoch 10


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 10 || Run Time:    4.7 | Load Time:    2.2 || F1:  61.62 | Prec:  62.11 | Rec:  61.14 || Ex/s: 296.82

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:33


Finished Epoch 11 || Run Time:   26.7 | Load Time:    6.7 || F1:  98.70 | Prec:  98.28 | Rec:  99.13 || Ex/s: 183.68

===>  EVAL Epoch 11


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 11 || Run Time:    4.7 | Load Time:    2.2 || F1:  60.26 | Prec:  60.42 | Rec:  60.10 || Ex/s: 299.74

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:33


Finished Epoch 12 || Run Time:   27.1 | Load Time:    6.8 || F1:  98.70 | Prec:  98.28 | Rec:  99.13 || Ex/s: 181.06

===>  EVAL Epoch 12


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 12 || Run Time:    4.8 | Load Time:    2.2 || F1:  59.84 | Prec:  60.64 | Rec:  59.07 || Ex/s: 295.33

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:32


Finished Epoch 13 || Run Time:   26.5 | Load Time:    6.7 || F1:  99.13 | Prec:  98.96 | Rec:  99.31 || Ex/s: 184.75

===>  EVAL Epoch 13


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 13 || Run Time:    4.7 | Load Time:    2.1 || F1:  60.16 | Prec:  61.29 | Rec:  59.07 || Ex/s: 300.33

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:33


Finished Epoch 14 || Run Time:   27.0 | Load Time:    6.8 || F1:  99.31 | Prec:  99.31 | Rec:  99.31 || Ex/s: 181.58

===>  EVAL Epoch 14


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 14 || Run Time:    4.6 | Load Time:    2.2 || F1:  60.32 | Prec:  61.62 | Rec:  59.07 || Ex/s: 301.14

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:32


Finished Epoch 15 || Run Time:   26.6 | Load Time:    6.7 || F1:  99.31 | Prec:  99.13 | Rec:  99.48 || Ex/s: 184.34

===>  EVAL Epoch 15


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 15 || Run Time:    4.7 | Load Time:    2.1 || F1:  60.22 | Prec:  62.57 | Rec:  58.03 || Ex/s: 301.36

---------------------

Loading best model...
Training done.


tensor(62.0321, device='cuda:0')

In [88]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 9


0% [█████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 9 || Run Time:    4.7 | Load Time:    2.2 || F1:  59.57 | Prec:  61.20 | Rec:  58.03 || Ex/s: 296.00



tensor(59.5745, device='cuda:0')
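
Each dataset above repeats the same three cells: process the data, build a hybrid model, then train and evaluate. The sketch below shows one way these steps could be folded into a single helper. The function name `train_and_evaluate` is hypothetical. It takes the imported `deepmatcher` module as the `dm` argument and uses only the calls already shown in this notebook. It assumes, based on the cell outputs above, that `run_train` returns the best validation F1 and `run_eval` returns the test F1.

```python
def train_and_evaluate(dm, dataset_dir, epochs=15, batch_size=32,
                       pos_neg_ratio=3, best_save_path='hybrid_model.pth'):
    """Run the process / train / evaluate cells for one dataset directory.

    `dm` is the imported deepmatcher module (import deepmatcher as dm).
    Returns (best validation F1, test F1) as plain floats.
    """
    # Same file names as in the cells above.
    train, validation, test = dm.data.process(
        path=dataset_dir,
        train='joined_train.csv',
        validation='joined_valid.csv',
        test='joined_test.csv')

    model = dm.MatchingModel(attr_summarizer='hybrid')

    # Assumption: run_train returns the best validation F1, as the
    # tensor printed after "Training done." suggests.
    best_validation_f1 = model.run_train(
        train,
        validation,
        epochs=epochs,
        batch_size=batch_size,
        best_save_path=best_save_path,
        pos_neg_ratio=pos_neg_ratio)

    # Compute F1 on test set with the best model, which run_train reloads.
    test_f1 = model.run_eval(test)
    return float(best_validation_f1), float(test_f1)


# Example usage (assumes the directory layout used in the cells above):
# import deepmatcher as dm
# base = '/content/IC/datasesErros/BDirty_50_3_5/Structured/'
# for name in ['Beer', 'DBLP-ACM', 'DBLP-GoogleScholar', 'Walmart-Amazon']:
#     print(name, train_and_evaluate(dm, base + name + '/'))
```

Since every run in this notebook saves to the same `hybrid_model.pth`, passing a different `best_save_path` per dataset would keep each best model on disk instead of overwriting it.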

#### TEXTUAL

##### Abt-Buy

In [89]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Textual/Abt-Buy/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [90]:
model = dm.MatchingModel(attr_summarizer='hybrid')

In [91]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='hybrid_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 7133105
===>  TRAIN Epoch 1


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:28


Finished Epoch 1 || Run Time:   18.1 | Load Time:   10.3 || F1:  15.73 | Prec:  25.55 | Rec:  11.36 || Ex/s: 201.93

===>  EVAL Epoch 1


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 1 || Run Time:    3.1 | Load Time:    3.2 || F1:  33.72 | Prec:  25.70 | Rec:  49.03 || Ex/s: 305.32

* Best F1: tensor(33.7229, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:28


Finished Epoch 2 || Run Time:   18.4 | Load Time:   10.4 || F1:  37.83 | Prec:  37.68 | Rec:  37.99 || Ex/s: 199.71

===>  EVAL Epoch 2


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 2 || Run Time:    3.1 | Load Time:    3.2 || F1:  48.50 | Prec:  39.57 | Rec:  62.62 || Ex/s: 305.62

* Best F1: tensor(48.4962, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:28


Finished Epoch 3 || Run Time:   18.1 | Load Time:   10.3 || F1:  60.25 | Prec:  59.03 | Rec:  61.53 || Ex/s: 202.48

===>  EVAL Epoch 3


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 3 || Run Time:    3.1 | Load Time:    3.2 || F1:  51.89 | Prec:  53.93 | Rec:  50.00 || Ex/s: 305.25

* Best F1: tensor(51.8892, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:28


Finished Epoch 4 || Run Time:   18.0 | Load Time:   10.3 || F1:  68.38 | Prec:  65.72 | Rec:  71.27 || Ex/s: 203.33

===>  EVAL Epoch 4


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 4 || Run Time:    3.3 | Load Time:    3.4 || F1:  54.15 | Prec:  54.41 | Rec:  53.88 || Ex/s: 287.16

* Best F1: tensor(54.1463, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:28


Finished Epoch 5 || Run Time:   17.9 | Load Time:   10.2 || F1:  75.71 | Prec:  73.18 | Rec:  78.41 || Ex/s: 204.07

===>  EVAL Epoch 5


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 5 || Run Time:    3.1 | Load Time:    3.2 || F1:  45.94 | Prec:  35.66 | Rec:  64.56 || Ex/s: 304.92

---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:28


Finished Epoch 6 || Run Time:   18.0 | Load Time:   10.3 || F1:  81.57 | Prec:  77.68 | Rec:  85.88 || Ex/s: 203.37

===>  EVAL Epoch 6


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 6 || Run Time:    3.1 | Load Time:    3.2 || F1:  54.04 | Prec:  51.54 | Rec:  56.80 || Ex/s: 304.41

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:28


Finished Epoch 7 || Run Time:   18.0 | Load Time:   10.2 || F1:  85.17 | Prec:  80.11 | Rec:  90.91 || Ex/s: 203.59

===>  EVAL Epoch 7


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 7 || Run Time:    3.3 | Load Time:    3.5 || F1:  53.71 | Prec:  56.76 | Rec:  50.97 || Ex/s: 281.98

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:28


Finished Epoch 8 || Run Time:   18.0 | Load Time:   10.2 || F1:  88.69 | Prec:  83.82 | Rec:  94.16 || Ex/s: 203.34

===>  EVAL Epoch 8


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 8 || Run Time:    3.0 | Load Time:    3.2 || F1:  54.77 | Prec:  56.77 | Rec:  52.91 || Ex/s: 308.87

* Best F1: tensor(54.7739, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:28


Finished Epoch 9 || Run Time:   17.9 | Load Time:   10.2 || F1:  93.03 | Prec:  89.86 | Rec:  96.43 || Ex/s: 204.20

===>  EVAL Epoch 9


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 9 || Run Time:    3.1 | Load Time:    3.2 || F1:  52.74 | Prec:  54.08 | Rec:  51.46 || Ex/s: 305.59

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:28


Finished Epoch 10 || Run Time:   18.4 | Load Time:   10.4 || F1:  94.85 | Prec:  92.71 | Rec:  97.08 || Ex/s: 199.17

===>  EVAL Epoch 10


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 10 || Run Time:    3.1 | Load Time:    3.2 || F1:  53.40 | Prec:  53.40 | Rec:  53.40 || Ex/s: 304.93

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:28


Finished Epoch 11 || Run Time:   18.0 | Load Time:   10.3 || F1:  96.03 | Prec:  94.08 | Rec:  98.05 || Ex/s: 203.49

===>  EVAL Epoch 11


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 11 || Run Time:    3.1 | Load Time:    3.2 || F1:  54.36 | Prec:  57.61 | Rec:  51.46 || Ex/s: 306.22

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:28


Finished Epoch 12 || Run Time:   18.1 | Load Time:   10.2 || F1:  96.72 | Prec:  95.42 | Rec:  98.05 || Ex/s: 202.55

===>  EVAL Epoch 12


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 12 || Run Time:    3.1 | Load Time:    3.2 || F1:  53.47 | Prec:  54.55 | Rec:  52.43 || Ex/s: 301.36

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:29


Finished Epoch 13 || Run Time:   18.6 | Load Time:   10.5 || F1:  97.20 | Prec:  95.89 | Rec:  98.54 || Ex/s: 197.16

===>  EVAL Epoch 13


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 13 || Run Time:    3.1 | Load Time:    3.2 || F1:  54.18 | Prec:  56.61 | Rec:  51.94 || Ex/s: 303.77

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:28


Finished Epoch 14 || Run Time:   18.0 | Load Time:   10.2 || F1:  97.82 | Prec:  97.12 | Rec:  98.54 || Ex/s: 203.82

===>  EVAL Epoch 14


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 14 || Run Time:    3.1 | Load Time:    3.2 || F1:  54.45 | Prec:  57.22 | Rec:  51.94 || Ex/s: 305.35

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:28


Finished Epoch 15 || Run Time:   18.0 | Load Time:   10.3 || F1:  97.99 | Prec:  97.28 | Rec:  98.70 || Ex/s: 203.27

===>  EVAL Epoch 15


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 15 || Run Time:    3.1 | Load Time:    3.2 || F1:  54.64 | Prec:  58.24 | Rec:  51.46 || Ex/s: 305.55

---------------------

Loading best model...
Training done.


tensor(54.7739, device='cuda:0')

In [92]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 8


0% [████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:06


Finished Epoch 8 || Run Time:    3.4 | Load Time:    3.3 || F1:  53.20 | Prec:  54.00 | Rec:  52.43 || Ex/s: 285.82



tensor(53.2020, device='cuda:0')

#### DIRTY

##### DBLP-ACM

In [93]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Dirty/DBLP-ACM/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [94]:
model = dm.MatchingModel(attr_summarizer='hybrid')

In [95]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='hybrid_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 9210006
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:42


Finished Epoch 1 || Run Time:   29.1 | Load Time:   14.1 || F1:  59.98 | Prec:  50.94 | Rec:  72.90 || Ex/s: 171.31

===>  EVAL Epoch 1


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 1 || Run Time:    5.2 | Load Time:    4.7 || F1:  69.56 | Prec:  54.63 | Rec:  95.72 || Ex/s: 251.11

* Best F1: tensor(69.5581, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:42


Finished Epoch 2 || Run Time:   28.6 | Load Time:   13.9 || F1:  83.29 | Prec:  76.05 | Rec:  92.04 || Ex/s: 174.20

===>  EVAL Epoch 2


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 2 || Run Time:    5.2 | Load Time:    4.6 || F1:  84.58 | Prec:  75.35 | Rec:  96.40 || Ex/s: 253.09

* Best F1: tensor(84.5850, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:42


Finished Epoch 3 || Run Time:   29.3 | Load Time:   14.1 || F1:  88.96 | Prec:  83.90 | Rec:  94.67 || Ex/s: 170.74

===>  EVAL Epoch 3


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 3 || Run Time:    5.1 | Load Time:    4.6 || F1:  88.70 | Prec:  83.50 | Rec:  94.59 || Ex/s: 254.05

* Best F1: tensor(88.7012, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:42


Finished Epoch 4 || Run Time:   28.8 | Load Time:   13.9 || F1:  93.52 | Prec:  90.34 | Rec:  96.92 || Ex/s: 173.67

===>  EVAL Epoch 4


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 4 || Run Time:    5.5 | Load Time:    4.9 || F1:  88.43 | Prec:  81.05 | Rec:  97.30 || Ex/s: 239.10

---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:42


Finished Epoch 5 || Run Time:   28.7 | Load Time:   13.9 || F1:  96.25 | Prec:  94.37 | Rec:  98.20 || Ex/s: 174.08

===>  EVAL Epoch 5


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 5 || Run Time:    5.2 | Load Time:    4.6 || F1:  90.75 | Prec:  85.92 | Rec:  96.17 || Ex/s: 252.67

* Best F1: tensor(90.7545, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:42


Finished Epoch 6 || Run Time:   28.8 | Load Time:   14.0 || F1:  97.59 | Prec:  96.27 | Rec:  98.95 || Ex/s: 173.39

===>  EVAL Epoch 6


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 6 || Run Time:    5.4 | Load Time:    4.8 || F1:  89.23 | Prec:  82.57 | Rec:  97.07 || Ex/s: 242.47

---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:42


Finished Epoch 7 || Run Time:   28.7 | Load Time:   13.9 || F1:  97.89 | Prec:  96.50 | Rec:  99.32 || Ex/s: 173.76

===>  EVAL Epoch 7


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 7 || Run Time:    5.2 | Load Time:    4.6 || F1:  90.23 | Prec:  88.01 | Rec:  92.57 || Ex/s: 252.46

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:42


Finished Epoch 8 || Run Time:   29.1 | Load Time:   13.9 || F1:  98.36 | Prec:  97.49 | Rec:  99.25 || Ex/s: 172.49

===>  EVAL Epoch 8


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:10


Finished Epoch 8 || Run Time:    5.5 | Load Time:    4.9 || F1:  89.94 | Prec:  88.29 | Rec:  91.67 || Ex/s: 237.35

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:42


Finished Epoch 9 || Run Time:   29.1 | Load Time:   14.0 || F1:  99.07 | Prec:  98.66 | Rec:  99.47 || Ex/s: 171.92

===>  EVAL Epoch 9


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 9 || Run Time:    5.2 | Load Time:    4.6 || F1:  90.28 | Prec:  89.58 | Rec:  90.99 || Ex/s: 251.70

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:42


Finished Epoch 10 || Run Time:   29.2 | Load Time:   14.1 || F1:  99.48 | Prec:  99.25 | Rec:  99.70 || Ex/s: 171.19

===>  EVAL Epoch 10


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 10 || Run Time:    5.3 | Load Time:    4.6 || F1:  90.57 | Prec:  89.28 | Rec:  91.89 || Ex/s: 249.08

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:42


Finished Epoch 11 || Run Time:   29.2 | Load Time:   14.0 || F1:  99.51 | Prec:  99.25 | Rec:  99.77 || Ex/s: 171.62

===>  EVAL Epoch 11


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 11 || Run Time:    5.3 | Load Time:    4.7 || F1:  90.62 | Prec:  89.82 | Rec:  91.44 || Ex/s: 247.45

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:43


Finished Epoch 12 || Run Time:   30.1 | Load Time:   14.3 || F1:  99.55 | Prec:  99.33 | Rec:  99.77 || Ex/s: 167.11

===>  EVAL Epoch 12


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 12 || Run Time:    5.3 | Load Time:    4.6 || F1:  90.50 | Prec:  89.80 | Rec:  91.22 || Ex/s: 249.01

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:42


Finished Epoch 13 || Run Time:   29.0 | Load Time:   13.9 || F1:  99.63 | Prec:  99.40 | Rec:  99.85 || Ex/s: 172.68

===>  EVAL Epoch 13


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 13 || Run Time:    5.2 | Load Time:    4.7 || F1:  90.99 | Prec:  89.89 | Rec:  92.12 || Ex/s: 251.71

* Best F1: tensor(90.9900, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:42


Finished Epoch 14 || Run Time:   29.1 | Load Time:   14.1 || F1:  99.63 | Prec:  99.40 | Rec:  99.85 || Ex/s: 171.68

===>  EVAL Epoch 14


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 14 || Run Time:    5.1 | Load Time:    4.6 || F1:  90.38 | Prec:  89.78 | Rec:  90.99 || Ex/s: 253.95

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:42


Finished Epoch 15 || Run Time:   28.9 | Load Time:   13.9 || F1:  99.63 | Prec:  99.40 | Rec:  99.85 || Ex/s: 173.28

===>  EVAL Epoch 15


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 15 || Run Time:    5.1 | Load Time:    4.6 || F1:  90.01 | Prec:  89.71 | Rec:  90.32 || Ex/s: 252.71

---------------------

Loading best model...
Training done.


tensor(90.9900, device='cuda:0')

In [96]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 13


0% [███████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:09


Finished Epoch 13 || Run Time:    5.4 | Load Time:    4.6 || F1:  89.29 | Prec:  89.39 | Rec:  89.19 || Ex/s: 247.23



tensor(89.2897, device='cuda:0')

##### DBLP-GoogleScholar

In [97]:
train, validation, test = dm.data.process(
    path='/content/IC/datasesErros/BDirty_50_3_5/Dirty/DBLP-GoogleScholar/',
    train='joined_train.csv',
    validation='joined_valid.csv',
    test='joined_test.csv')

In [98]:
model = dm.MatchingModel(attr_summarizer='hybrid')

In [99]:
model.run_train(
    train,
    validation,
    epochs=15,
    batch_size=32,
    best_save_path='hybrid_model.pth',
    pos_neg_ratio=3)

* Number of trainable parameters: 9210006
===>  TRAIN Epoch 1


  "reduction: 'mean' divides the total loss by both the batch size and the support size."
0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:31


Finished Epoch 1 || Run Time:   64.7 | Load Time:   28.0 || F1:  70.53 | Prec:  62.50 | Rec:  80.92 || Ex/s: 185.85

===>  EVAL Epoch 1


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 1 || Run Time:   11.7 | Load Time:    9.1 || F1:  83.70 | Prec:  78.43 | Rec:  89.72 || Ex/s: 275.60

* Best F1: tensor(83.6966, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:31


Finished Epoch 2 || Run Time:   64.5 | Load Time:   27.8 || F1:  86.80 | Prec:  80.83 | Rec:  93.73 || Ex/s: 186.57

===>  EVAL Epoch 2


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 2 || Run Time:   11.9 | Load Time:    9.3 || F1:  86.08 | Prec:  80.75 | Rec:  92.15 || Ex/s: 269.94

* Best F1: tensor(86.0760, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:32


Finished Epoch 3 || Run Time:   65.6 | Load Time:   28.0 || F1:  90.96 | Prec:  86.56 | Rec:  95.82 || Ex/s: 183.96

===>  EVAL Epoch 3


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 3 || Run Time:   11.7 | Load Time:    9.1 || F1:  86.23 | Prec:  79.90 | Rec:  93.64 || Ex/s: 275.82

* Best F1: tensor(86.2306, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:32


Finished Epoch 4 || Run Time:   64.8 | Load Time:   28.0 || F1:  93.99 | Prec:  91.10 | Rec:  97.07 || Ex/s: 185.52

===>  EVAL Epoch 4


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 4 || Run Time:   11.7 | Load Time:    9.1 || F1:  87.95 | Prec:  83.88 | Rec:  92.43 || Ex/s: 275.13

* Best F1: tensor(87.9502, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:33


Finished Epoch 5 || Run Time:   65.9 | Load Time:   28.2 || F1:  95.54 | Prec:  93.39 | Rec:  97.79 || Ex/s: 182.91

===>  EVAL Epoch 5


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 5 || Run Time:   11.6 | Load Time:    9.1 || F1:  87.68 | Prec:  83.10 | Rec:  92.80 || Ex/s: 276.61

---------------------

===>  TRAIN Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:31


Finished Epoch 6 || Run Time:   64.8 | Load Time:   27.9 || F1:  97.19 | Prec:  95.68 | Rec:  98.75 || Ex/s: 185.65

===>  EVAL Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 6 || Run Time:   11.7 | Load Time:    9.1 || F1:  89.37 | Prec:  88.76 | Rec:  90.00 || Ex/s: 275.11

* Best F1: tensor(89.3736, device='cuda:0')
Saving best model...
Done.
---------------------

===>  TRAIN Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:32


Finished Epoch 7 || Run Time:   65.2 | Load Time:   27.9 || F1:  97.89 | Prec:  96.77 | Rec:  99.03 || Ex/s: 185.00

===>  EVAL Epoch 7


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 7 || Run Time:   11.6 | Load Time:    9.2 || F1:  88.89 | Prec:  88.81 | Rec:  88.97 || Ex/s: 276.11

---------------------

===>  TRAIN Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:31


Finished Epoch 8 || Run Time:   64.7 | Load Time:   28.0 || F1:  98.90 | Prec:  98.34 | Rec:  99.47 || Ex/s: 185.80

===>  EVAL Epoch 8


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 8 || Run Time:   11.6 | Load Time:    9.1 || F1:  88.44 | Prec:  88.57 | Rec:  88.32 || Ex/s: 276.61

---------------------

===>  TRAIN Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:32


Finished Epoch 9 || Run Time:   65.3 | Load Time:   28.0 || F1:  99.13 | Prec:  98.76 | Rec:  99.50 || Ex/s: 184.53

===>  EVAL Epoch 9


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 9 || Run Time:   11.7 | Load Time:    9.1 || F1:  88.75 | Prec:  85.59 | Rec:  92.15 || Ex/s: 275.83

---------------------

===>  TRAIN Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:32


Finished Epoch 10 || Run Time:   65.0 | Load Time:   28.0 || F1:  99.35 | Prec:  99.16 | Rec:  99.53 || Ex/s: 185.15

===>  EVAL Epoch 10


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 10 || Run Time:   11.7 | Load Time:    9.1 || F1:  89.09 | Prec:  87.41 | Rec:  90.84 || Ex/s: 275.13

---------------------

===>  TRAIN Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:32


Finished Epoch 11 || Run Time:   65.0 | Load Time:   28.0 || F1:  99.42 | Prec:  99.28 | Rec:  99.56 || Ex/s: 185.30

===>  EVAL Epoch 11


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 11 || Run Time:   11.9 | Load Time:    9.4 || F1:  88.96 | Prec:  87.59 | Rec:  90.37 || Ex/s: 269.61

---------------------

===>  TRAIN Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:31


Finished Epoch 12 || Run Time:   64.9 | Load Time:   27.7 || F1:  99.49 | Prec:  99.35 | Rec:  99.63 || Ex/s: 186.02

===>  EVAL Epoch 12


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 12 || Run Time:   12.3 | Load Time:    9.3 || F1:  89.01 | Prec:  88.32 | Rec:  89.72 || Ex/s: 265.22

---------------------

===>  TRAIN Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:32


Finished Epoch 13 || Run Time:   65.4 | Load Time:   27.9 || F1:  99.60 | Prec:  99.50 | Rec:  99.69 || Ex/s: 184.46

===>  EVAL Epoch 13


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 13 || Run Time:   11.7 | Load Time:    9.1 || F1:  89.13 | Prec:  88.27 | Rec:  90.00 || Ex/s: 275.50

---------------------

===>  TRAIN Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:31


Finished Epoch 14 || Run Time:   64.9 | Load Time:   27.9 || F1:  99.69 | Prec:  99.66 | Rec:  99.72 || Ex/s: 185.60

===>  EVAL Epoch 14


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 14 || Run Time:   11.7 | Load Time:    9.2 || F1:  88.95 | Prec:  88.38 | Rec:  89.53 || Ex/s: 275.11

---------------------

===>  TRAIN Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:01:32


Finished Epoch 15 || Run Time:   65.2 | Load Time:   27.8 || F1:  99.72 | Prec:  99.69 | Rec:  99.75 || Ex/s: 185.11

===>  EVAL Epoch 15


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:20


Finished Epoch 15 || Run Time:   11.6 | Load Time:    9.1 || F1:  88.50 | Prec:  88.50 | Rec:  88.50 || Ex/s: 276.33

---------------------

Loading best model...
Training done.


tensor(89.3736, device='cuda:0')

In [100]:
# Compute F1 on test set
model.run_eval(test)

===>  EVAL Epoch 6


0% [██████████████████████████████] 100% | ETA: 00:00:00
Total time elapsed: 00:00:21


Finished Epoch 6 || Run Time:   12.0 | Load Time:    9.3 || F1:  90.38 | Prec:  89.03 | Rec:  91.78 || Ex/s: 269.83



tensor(90.3820, device='cuda:0')
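
The per-epoch summary lines in the logs above follow a fixed layout, so they can be collected into a table instead of being read by eye. The snippet below is a minimal sketch, not part of `deepmatcher`: the helper name `parse_epoch_metrics` and the regular expression are our own assumptions about the log format shown above, and the sample text is copied from three validation lines of the DBLP-GoogleScholar run.

```python
import re

# Matches summary lines such as:
# "Finished Epoch 6 || Run Time: ... || F1:  89.37 | Prec:  88.76 | Rec:  90.00 || Ex/s: 275.11"
# (pattern is an assumption based on the log output shown in this notebook)
EPOCH_LINE = re.compile(
    r"Finished Epoch\s+(\d+)\s+\|\|.*?"
    r"F1:\s*([\d.]+)\s*\|\s*Prec:\s*([\d.]+)\s*\|\s*Rec:\s*([\d.]+)"
)


def parse_epoch_metrics(log_text):
    """Return one dict (epoch, f1, precision, recall) per summary line found."""
    rows = []
    for match in EPOCH_LINE.finditer(log_text):
        epoch, f1, prec, rec = match.groups()
        rows.append({
            "epoch": int(epoch),
            "f1": float(f1),
            "precision": float(prec),
            "recall": float(rec),
        })
    return rows


# Three validation summary lines copied from the log above.
sample_log = """
Finished Epoch 5 || Run Time:   11.6 | Load Time:    9.1 || F1:  87.68 | Prec:  83.10 | Rec:  92.80 || Ex/s: 276.61
Finished Epoch 6 || Run Time:   11.7 | Load Time:    9.1 || F1:  89.37 | Prec:  88.76 | Rec:  90.00 || Ex/s: 275.11
Finished Epoch 7 || Run Time:   11.6 | Load Time:    9.2 || F1:  88.89 | Prec:  88.81 | Rec:  88.97 || Ex/s: 276.11
"""

metrics = parse_epoch_metrics(sample_log)
# Pick the epoch with the highest F1 among the parsed lines.
best = max(metrics, key=lambda row: row["f1"])
print(best)
```

Note that the training and validation summaries share the same layout, so when parsing a full log you would need to track the preceding `TRAIN` or `EVAL` header to tell them apart.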

## Step 4. Apply model to new data

### Evaluating on test data
Now that we have a trained model for entity matching, we can evaluate it on the test data (here, by computing its F1 score) to estimate how well it will perform on unlabeled data.

In [None]:
# Compute F1 on test set
#model.run_eval(test)

### Evaluating on unlabeled data

We finally apply the trained model to unlabeled data to get predictions. To do this, we need to first process the unlabeled data.

#### Processing unlabeled data

To process unlabeled data, we use `dm.data.process_unlabeled`, as shown in the code snippet below. The basic parameters for this call are as follows:

* **path (required):** The full path to the unlabeled data file (not just the directory).
* **trained_model (required):** The trained model. The model is aware of the configuration of the training data on which it was trained, and so `deepmatcher` reuses the same configuration for the unlabeled data.
* **ignore_columns (optional):** Any columns in the unlabeled CSV file that you may want to ignore for the purposes of evaluation. If not specified, the columns that were ignored while processing the training set will also be ignored while processing the unlabeled data.

Note that the unlabeled CSV file must have the same schema as the train, validation and test CSVs.

In [None]:
'''
unlabeled = dm.data.process_unlabeled(
    path='sample_data/itunes-amazon/unlabeled.csv',
    trained_model=model)
'''
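
Because the unlabeled file has to follow the same schema as the labeled files, it can help to compare column names before processing. The following is a rough sketch using plain `pandas`, not a `deepmatcher` feature. The helper `missing_columns`, the toy column names, and the assumption that the label column is called `label` and may be absent from the unlabeled file are all ours for illustration.

```python
import pandas as pd


def missing_columns(labeled_df, unlabeled_df, label_column="label"):
    """Return labeled-data columns (other than the label) absent from the unlabeled data."""
    expected = [c for c in labeled_df.columns if c != label_column]
    return [c for c in expected if c not in unlabeled_df.columns]


# Hypothetical toy tables standing in for the results of pd.read_csv(...).
train_df = pd.DataFrame({
    "id": [0],
    "label": [1],
    "left_title": ["a"],
    "right_title": ["a"],
})
unlabeled_df = pd.DataFrame({
    "id": [0],
    "left_title": ["b"],
})

# Lists every attribute the unlabeled table would need to add.
print(missing_columns(train_df, unlabeled_df))
```

An empty list means every non-label column of the labeled data is also present in the unlabeled data.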

#### Obtaining predictions

Next, we call the `run_prediction` method, which takes a processed data set object and returns a `pandas` dataframe containing tuple pair IDs (`id` column) and the corresponding match score predictions (`match_score` column). Each match score lies in [0, 1], and a score above 0.5 indicates a match prediction.

In [None]:
'''
predictions = model.run_prediction(unlabeled)
predictions.head()
'''
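
To turn match scores into explicit match / non-match decisions, we can apply the 0.5 cutoff described above. This is a small sketch on a hand-made dataframe: the scores are invented for illustration, and the `match_prediction` column name is our own choice rather than something `deepmatcher` produces.

```python
import pandas as pd

# Hypothetical scores shaped like the output described above
# (an `id` column and a `match_score` column).
predictions = pd.DataFrame({
    "id": [0, 1, 2, 3],
    "match_score": [0.93, 0.12, 0.50, 0.71],
})

# A score strictly above 0.5 counts as a match (1); anything else is a non-match (0).
predictions["match_prediction"] = (predictions["match_score"] > 0.5).astype(int)

print(predictions)
```

Raising the cutoff generally trades recall for precision, so a different threshold may suit applications where false matches are costly.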

You may optionally set the `output_attributes` parameter to also include all attributes present in the original input table. As mentioned earlier, the processed attribute values will likely look a bit different from the attribute values in the input CSV files due to modifications such as tokenization and lowercasing.

In [None]:
#predictions = model.run_prediction(unlabeled, output_attributes=True)
#predictions.head()

You can then save these predictions to CSV and use them for downstream tasks.

In [None]:
#predictions.to_csv('sample_data/itunes-amazon/unlabeled_predictions.csv')

#### Getting predictions on labeled data

You can also get predictions for labeled data such as the validation data. To do so, simply call the `run_prediction` method, passing the validation data as the argument.

In [None]:
#valid_predictions = model.run_prediction(validation, output_attributes=True)
#valid_predictions.head()
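
With predictions on labeled data in hand, precision, recall and F1 can be recomputed directly from the scores, which is useful for trying thresholds other than 0.5. The sketch below assumes the dataframe carries the ground-truth label in a column named `label` next to `match_score`. That column name, the helper `precision_recall_f1`, and the toy values are assumptions for illustration, not confirmed `deepmatcher` output.

```python
import pandas as pd


def precision_recall_f1(df, score_column="match_score", label_column="label", threshold=0.5):
    """Compute precision, recall and F1 for the match class at the given threshold."""
    predicted = df[score_column] > threshold
    actual = df[label_column] == 1
    tp = int((predicted & actual).sum())   # predicted match, truly a match
    fp = int((predicted & ~actual).sum())  # predicted match, truly a non-match
    fn = int((~predicted & actual).sum())  # predicted non-match, truly a match
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical labeled predictions (values invented for illustration).
valid_predictions = pd.DataFrame({
    "match_score": [0.9, 0.8, 0.3, 0.6, 0.1],
    "label": [1, 1, 1, 0, 0],
})

precision, recall, f1 = precision_recall_f1(valid_predictions)
print(f"Prec: {precision:.2%} | Rec: {recall:.2%} | F1: {f1:.2%}")
```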