# Weekly Meeting 11

* Conditioning on "perfect information".
* Considering the negative candidates in the generative model.

## Conditioning on "perfect information"

In all the experiments where I have used information from the graph, the performance is very similar to the baseline without graph information. I tried to think about potential causes:

1) The information in the Wikipedia graphs is not useful for the task (in spite of the analysis we made in Weekly Meeting 5).

2) The graph information is redundant in our experimental setup:
  * Already contained in the BART parameters (the pre-training data contains Wikipedia).
  * Already contained in the background/context of the aggregatable instances (extracted from Wikipedia)
    
3) The heuristics we used to extract information from the graphs ($k$ lowest common ancestors and intersections at $k$ hops) are not well suited for the task, or we aren't adding the information to the model correctly (as inputs with different formats and as targets with different labels).


What happens if we put "perfect information" (gold aggregations) in the input? This is a sanity check: if the model has perfect information in the input, it should make no mistakes, since it only needs to copy from (or focus on) that information:

$BCE\textrm{<sep>}G_1, G_2, ..., G_M\textrm{<sep>} \rightarrow G_i | BCE$

| Set | MAP | R@10 | MRR |
| --- | --- | --- | --- |
| Train+Test | 99.83 | 100.00 | 100.00 |
| Train | 64.97 | 84.88 | 82.51 |
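
The gold-in-the-input format ($BCE\textrm{<sep>}G_1, ..., G_M\textrm{<sep>} \rightarrow G_i$) can be sketched roughly as below. This is only an illustration: `build_examples`, the literal `"<sep>"` string and the toy inputs are assumptions, not the actual preprocessing code.

```python
from typing import List, Tuple

SEP = "<sep>"  # placeholder separator, written as in the notes (assumption)


def build_examples(bce: str, gold: List[str]) -> List[Tuple[str, str]]:
    """Build one (source, target) pair per gold aggregation.

    The source is the original input followed by all gold aggregations,
    and each gold aggregation is used once as the target.
    """
    source = f"{bce}{SEP}{', '.join(gold)}{SEP}"
    return [(source, g) for g in gold]


examples = build_examples("some input text", ["football coaches", "nfl head coaches"])
for source, target in examples:
    print(source, "->", target)
```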

## Negative candidates

### Generative finetuning of a pretrained discriminative model


**Experiment 1**: First finetune BART in the discriminative setup and then finetune again in the generative setup. I think that, without restrictions on the parameters, the second finetuning could move the parameters far away from those learned in the first finetuning, forgetting the discriminative information. **Question**: is $p(c \in \{0, 1\}|x_1^T;y_1^T)$ useful for $p(y_1^T | x_1^T)$?

**Experiment 2**: First finetune BART in the discriminative setup, initialize the weights for the generative setup and freeze the encoder weights (the decoder weights are freely trainable).

**Experiment 3**: First finetune BART in the discriminative setup, initialize the weights for the generative setup, and gradually unfreeze encoder and decoder layers simultaneously (2 layers are unfrozen at each epoch; 12 layers and max-epoch=6).

**Experiment 4**: In the previous experiment, the results are very low until the last epoch, when all the layers are unfrozen (29.27 MAP in the 5th epoch and 75.12 MAP in the last epoch). So I trained for 3 more epochs to see how much the results improve.
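
The gradual unfreezing schedule of Experiment 3 (2 layers per epoch, 12 layers, 6 epochs) can be sketched as follows. The unfreezing order is an assumption: the sketch starts from the top layers, which is a common choice, but the notes do not say which order was used. `unfrozen_layers` is a hypothetical helper, and the same indices are assumed to apply to both the encoder and the decoder stacks.

```python
from typing import List


def unfrozen_layers(epoch: int, num_layers: int = 12, per_epoch: int = 2) -> List[int]:
    """Indices of the trainable layers at a given 1-based epoch.

    Assumption: layers are unfrozen from the top (highest index) down.
    """
    n = min(num_layers, per_epoch * epoch)
    return list(range(num_layers - n, num_layers))


for epoch in range(1, 7):
    # In PyTorch, the remaining layers would be frozen by setting
    # requires_grad = False on their parameters.
    print(epoch, unfrozen_layers(epoch))
```

Under this schedule the bottom two layers only become trainable in epoch 6, which matches the observation that all layers are unfrozen only in the last epoch.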

<br><br>

| Experiment | MAP | R@10 | MRR |
| --- | --- | --- | --- |
| Baseline | 83.07 | 93.02 | 93.90 |
| Exp-1 | 83.70 | 92.76 | 94.19 |
| Exp-2 | 22.75 | 49.60 | 26.06 |
| Exp-3 | 75.12 | 89.83 | 89.44 |
| Exp-4 | 82.61 | 92.57 | 94.34 |

### Without a pretrained discriminative model

Three ways of incorporating negative candidates in the generative finetuning:

* **Experiment 1**: Use negation to optimize the same generative objective as in all the previous experiments ($p_\theta(y_1^T | x_1^T)$) for positive and negative aggregations.<br><br>

     * *Train*: $x_1^T \rightarrow \left\{\begin{matrix}
\textrm{are}\ C_i\ \textrm{<sep>}\ x_1^T & \textrm{G}(C_i)=1 \\
\textrm{are not}\ C_i\ \textrm{<sep>}\ x_1^T & \textrm{G}(C_i)=0 
\end{matrix}\right.$
     <br><br>
     * *Rank*: rank the candidates by $p_\theta(\textrm{are}\ C_i\ \textrm{<sep>}\ x_1^T | x_1^T)$ ("rank the candidates by their probability of being the correct aggregation")
     <br><br>
     * *Generate*: positive aggregations $x_1^T \rightarrow \textrm{are} $ ~~not~~ $\_\_\_ \textrm{<sep>}$,     negative aggregations $x_1^T \rightarrow \textrm{are not}\ \_\_\_ \textrm{<sep>}$

<br><br>
* **Experiment 2**: Jointly optimize a discriminative and a generative loss.
<br><br>

  <img src="summary_all_models.jpg">
  <br><br>
  
     * *Train*: $\mathcal{L} = \textrm{G}(C_i)\mathcal{L_G}\ +\ \mathcal{L_D}$
     * *Rank*: same as the discriminative model.
     * *Generate*: same as the generative model.
<br><br>
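
A minimal sketch of the joint objective $\mathcal{L} = \textrm{G}(C_i)\mathcal{L_G} + \mathcal{L_D}$ from Experiment 2. The concrete forms of the two terms are assumptions, since the notes do not spell them out: $\mathcal{L_G}$ is taken to be the token-level negative log-likelihood of the target and $\mathcal{L_D}$ a binary cross-entropy on the classifier probability. All function names are hypothetical.

```python
import math
from typing import Sequence


def generative_loss(token_probs: Sequence[float]) -> float:
    """Negative log-likelihood of the target tokens (assumed form of L_G)."""
    return -sum(math.log(p) for p in token_probs)


def discriminative_loss(p_positive: float, label: int) -> float:
    """Binary cross-entropy for a single candidate (assumed form of L_D)."""
    return -math.log(p_positive if label == 1 else 1.0 - p_positive)


def joint_loss(token_probs: Sequence[float], p_positive: float, label: int) -> float:
    """L = G(C_i) * L_G + L_D, where label plays the role of G(C_i)."""
    return label * generative_loss(token_probs) + discriminative_loss(p_positive, label)


# A negative candidate (label 0) contributes only the discriminative term.
print(joint_loss([0.5, 0.5], p_positive=0.8, label=1))
print(joint_loss([0.5, 0.5], p_positive=0.8, label=0))
```

The gating by $\textrm{G}(C_i)$ means the decoder is never trained to generate a negative candidate; negatives only shape the discriminative term.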


| Experiment | MAP | R@10 | MRR |
| --- | --- | --- | --- |
| Baseline | 83.07 | 93.02 | 93.90 |
| Exp-1 | 67.80 | 83.45 | 85.52 |
| Exp-2 | |   |  |


In Experiment 1, I also tried formatting the target in a different way: replacing "are not" with a "§" symbol and removing "are". The results improve (76.50 MAP vs 67.80 MAP) but are still lower than the baseline.
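
The two target formats of Experiment 1 can be sketched as below. This is one reading of the description, not the actual code: `format_target` and the `style` names are hypothetical, the literal `"<sep>"` is a placeholder, and the part of the target that follows `<sep>` in the training formula is left out for brevity.

```python
SEP = "<sep>"  # placeholder separator, written as in the notes (assumption)


def format_target(candidate: str, is_gold: bool, style: str = "negation") -> str:
    """Format the prefix of the generation target for one candidate.

    style="negation": "are C" for gold candidates, "are not C" otherwise.
    style="symbol":   "C" for gold candidates, "§ C" otherwise (assumed reading).
    """
    if style == "negation":
        prefix = "are" if is_gold else "are not"
    elif style == "symbol":
        prefix = "" if is_gold else "§"
    else:
        raise ValueError(f"unknown style: {style}")
    return " ".join(part for part in (prefix, candidate, SEP) if part)


print(format_target("football coaches", True))
print(format_target("politicians", False))
print(format_target("politicians", False, style="symbol"))
```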

Some cases from Experiment 1:

(Lovie Smith, Tony Dungy)

**Context**: Lovie Lee Smith is an American football coach. He is the head football coach at the University of Illinois. He was previously the head coach of the Chicago Bears of the National Football League from 2004 to 2012, and the NFL's Tampa Bay Buccaneers from 2014 to 2015. Smith has been to the Super Bowl twice, as the defensive coordinator for the St. Louis Rams and as the head coach for the Bears in 2006. Anthony Kevin Dungy is a former professional American football player and coach in the National Football League . Dungy was head coach of the Tampa Bay Buccaneers from 1996 to 2001, and head coach of the Indianapolis Colts from 2002 to 2008. Dungy and Smith Are Proving Nice Coaches Can Finish First: I ca n't believe I 'm treated like a 12-year-old when I 'm 31,'' he told New York magazine in December. Smith and Dungy have a style that seems to fit the year-round nature of their sport. Lovie Smith, Tony Dungy
<br>**Target**: are football coaches  ---> PROB= 0.787

**Context**: Lovie Lee Smith is an American football coach. He is the head football coach at the University of Illinois. He was previously the head coach of the Chicago Bears of the National Football League from 2004 to 2012, and the NFL's Tampa Bay Buccaneers from 2014 to 2015. Smith has been to the Super Bowl twice, as the defensive coordinator for the St. Louis Rams and as the head coach for the Bears in 2006. Anthony Kevin Dungy is a former professional American football player and coach in the National Football League . Dungy was head coach of the Tampa Bay Buccaneers from 1996 to 2001, and head coach of the Indianapolis Colts from 2002 to 2008. Dungy and Smith Are Proving Nice Coaches Can Finish First: I ca n't believe I 'm treated like a 12-year-old when I 'm 31,'' he told New York magazine in December. Smith and Dungy have a style that seems to fit the year-round nature of their sport. Lovie Smith, Tony Dungy
<br>**Target**: are former major league baseball players  ---> PROB= 0.779

**Context**: Lovie Lee Smith is an American football coach. He is the head football coach at the University of Illinois. He was previously the head coach of the Chicago Bears of the National Football League from 2004 to 2012, and the NFL's Tampa Bay Buccaneers from 2014 to 2015. Smith has been to the Super Bowl twice, as the defensive coordinator for the St. Louis Rams and as the head coach for the Bears in 2006. Anthony Kevin Dungy is a former professional American football player and coach in the National Football League . Dungy was head coach of the Tampa Bay Buccaneers from 1996 to 2001, and head coach of the Indianapolis Colts from 2002 to 2008. Dungy and Smith Are Proving Nice Coaches Can Finish First: I ca n't believe I 'm treated like a 12-year-old when I 'm 31,'' he told New York magazine in December. Smith and Dungy have a style that seems to fit the year-round nature of their sport. Lovie Smith, Tony Dungy
<br>**Target**: are politicians  ---> PROB= 0.790 (highest one)

**Context**: Lovie Lee Smith is an American football coach. He is the head football coach at the University of Illinois. He was previously the head coach of the Chicago Bears of the National Football League from 2004 to 2012, and the NFL's Tampa Bay Buccaneers from 2014 to 2015. Smith has been to the Super Bowl twice, as the defensive coordinator for the St. Louis Rams and as the head coach for the Bears in 2006. Anthony Kevin Dungy is a former professional American football player and coach in the National Football League . Dungy was head coach of the Tampa Bay Buccaneers from 1996 to 2001, and head coach of the Indianapolis Colts from 2002 to 2008. Dungy and Smith Are Proving Nice Coaches Can Finish First: I ca n't believe I 'm treated like a 12-year-old when I 'm 31,'' he told New York magazine in December. Smith and Dungy have a style that seems to fit the year-round nature of their sport. Lovie Smith, Tony Dungy
<br>**Target**: are japanese politicians  ---> PROB= 0.782

___

I went back to the error analysis of the generative system and identified one thing that could be improved.
In the rankings, longer candidates typically have lower probability (because it is computed as the product of the per-token probabilities), so candidate length may be hurting the ranking of some candidates. These are some of the 30 most confused aggregations:

* interested in the washington redskins [22/1] (in the context: washington redskins, ...) (MAP=19.20)
* participants in a criminal case [11/1]  (in the context: murdered, convicted, victims, killer, ...) (MAP=12.90)
* those associated with the new york yankees [16/1] (in the context: new york yankees, ...) (MAP=72.20)
* men with ties to american politics [16/1] (in the context: politician, senator, united states, ...) (MAP=72.20)
* republican former political appointees
* involved in a potentially criminal scheme
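
The length effect can be illustrated with a toy example. The per-token probabilities below are made up for illustration. Length normalization (averaging the log-probabilities over tokens) is shown only as one common heuristic for reducing the bias; it is not something that has been tried in these experiments.

```python
import math
from typing import Sequence


def sequence_log_prob(token_probs: Sequence[float]) -> float:
    """Log of the product of the per-token probabilities."""
    return sum(math.log(p) for p in token_probs)


def length_normalized_log_prob(token_probs: Sequence[float]) -> float:
    """Average log-probability per token (a possible mitigation, untested here)."""
    return sequence_log_prob(token_probs) / len(token_probs)


# Toy candidates: every token is equally likely, only the length differs.
short_candidate = [0.9] * 3
long_candidate = [0.9] * 6

# The raw score penalizes the longer candidate ...
print(sequence_log_prob(short_candidate), sequence_log_prob(long_candidate))
# ... while the per-token average treats both the same.
print(length_normalized_log_prob(short_candidate), length_normalized_log_prob(long_candidate))
```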