In [1]:
%run Latex_macros.ipynb

<IPython.core.display.Latex object>

# Steps to a Universal Model for NLP

The success of the "Unsupervised Pre-Trained Model + Supervised Fine-Tuning" paradigm in solving new tasks
- suggests that Large Language Models learn a "universal", task-independent representation of language

Still: the input transformations are, at the least, an inconvenience.

We will create a "Universal API"
- so that all tasks can be expressed as instances of the Language Modelling task
- eliminating the need for Input Transformations


We will then address an even more ambitious question:
- Can we use a pre-trained LM
- To solve a new task
- **without** Fine-Tuning
    - the first time the LM sees examples of the new task is at *inference time*
    - no further training on examples from the new task


# A Universal API: Text to text

Each task has its own, very specific API (input and output format)
- so you have to translate the task-specific format into the format of the Universal Task
- we use *Text to text* as the universal API
    - transform your task into a "predict the next" task
        - create a "prompt" (context) that describes and encodes your task
        - the Language Model completion of the prompt is the "solution" to your task
 

For example:
- Consider a Pre-Trained model that performs text completion (predict the next)
- Turn your task into a text completion problem


<center>Task: Unscramble the letters</center>

|  |  |
| :- | :- |
| Context: | Please unscramble the letters in the word and write that word |
|          | skicts = |
| Target completion: | sticks |

- The "Unscramble the letters" task is encoded as "predict the next" word following the "=" sign

<center>Task: English to French</center>

|  |  |
| :- | :- |
| Context: | English: Please unscramble the letters in the word and write that word |
|          | French: |
| Target completion: | Veuillez déchiffrer les lettres du mot et écrire ce mot |

- The translation task is encoded as "predict the next" words following the "French:" prompt
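
The two encodings above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and `complete` stands for any callable that maps a context string to generated text (a hypothetical stand-in for a pre-trained LM, not a real library call).

```python
def unscramble_prompt(scrambled):
    """Encode the 'Unscramble the letters' task as a text-completion context."""
    instruction = "Please unscramble the letters in the word and write that word"
    # The context ends with "=", so the completion is the unscrambled word.
    return f"{instruction}\n{scrambled} ="


def translation_prompt(english):
    """Encode English-to-French translation as a text-completion context."""
    # The context ends with "French:", so the completion is the translation.
    return f"English: {english}\nFrench:"


def solve(prompt, complete):
    """Treat the LM's completion of the prompt as the solution to the task.

    `complete` is an assumption: any function from context text to generated text.
    """
    return complete(prompt).strip()


if __name__ == "__main__":
    print(unscramble_prompt("skicts"))
    print(translation_prompt("Please unscramble the letters in the word and write that word"))
```

Both tasks go through the same `solve` function: only the text of the prompt changes, not the model or its interface.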

Sometimes the task encodings are not completely obvious (see [GPT Section 3.3](https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf#page=4))
- Task: Are two sentences similar ?
    - Issue
        - There is no natural ordering of the two sentences
        - So concatenating the two (with a delimiter) in a single fixed order imposes an arbitrary ordering
    - Solution
        - Obtain two representations of the sentence pair, one for each ordering
        - Add them together element-wise
        - Feed the sum into the Classifier
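
The solution above can be sketched as follows. Everything here is a toy under stated assumptions: `encode` is a made-up, order-sensitive stand-in for the pre-trained model's final representation, the `"$"` delimiter is an arbitrary choice, and the classifier weights are hypothetical rather than learned.

```python
import math

DELIM = "$"  # assumed delimiter token separating the two sentences


def encode(tokens, dim=4):
    """Toy order-sensitive encoder (a stand-in for the pre-trained model)."""
    vec = [0.0] * dim
    for position, token in enumerate(tokens):
        # The slot depends on position, so reordering the input changes the result.
        vec[position % dim] += len(token)
    return vec


def pair_representation(sent_a, sent_b):
    """Encode both orderings of the pair and add them element-wise."""
    ab = encode(sent_a + [DELIM] + sent_b)
    ba = encode(sent_b + [DELIM] + sent_a)
    return [x + y for x, y in zip(ab, ba)]


def similarity_score(sent_a, sent_b, weights=(0.1, -0.05, 0.02, 0.01), bias=0.0):
    """Feed the summed representation into a linear classifier with a sigmoid.

    The default weights are hypothetical; in practice they would be learned.
    """
    features = pair_representation(sent_a, sent_b)
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    a = ["the", "cat", "sat"]
    b = ["a", "feline", "was", "seated"]
    print(similarity_score(a, b) == similarity_score(b, a))
```

Because addition is commutative, the summed representation is the same for either argument order, so the classifier cannot be influenced by which sentence happens to come first.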
        

# Beyond Transfer Learning: a true Universal Model

The Universal API obviates the need 
- to transform the inputs
- to use a task-specific head to "reshape" the output

We have also turned Language Modeling into a Universal task.
- the last word in each example signals "end of features"
- the LM is expected to generate the text that follows: the predicted target

Do we even need Supervised Fine-Tuning ?

One can imagine Supervised Fine-Tuning on a new task using this API and example format.
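
As a rough sketch of that example format: a labeled example becomes a single string in which the context ends with the "end of features" marker and the target follows it. The helper names below are hypothetical, and joining context and target with a single space is an assumption made for this illustration.

```python
def to_training_text(context, target):
    """Join a context (ending in the end-of-features marker) and its target."""
    return f"{context} {target}"


def split_at_marker(text, marker):
    """Recover (context, target) from a completed example.

    Splits at the last occurrence of the marker: everything up to and including
    the marker is the context, everything after it is the target.
    """
    head, sep, tail = text.rpartition(marker)
    if not sep:
        raise ValueError(f"marker {marker!r} not found in example")
    return head + sep, tail.strip()


if __name__ == "__main__":
    example = to_training_text("skicts =", "sticks")
    print(example)
    print(split_at_marker(example, "="))
```

The same two helpers serve every task; only the marker (`"="`, `"French:"`, ...) differs, which is why no task-specific output head is needed.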

But is Supervised Fine-Tuning strictly necessary ?

We will address this topic in the module on Zero Shot Learning.

In [1]:
print("Done")

Done
