# LLM crash course

Almost every task in NLP falls under two big headings:
- Document annotation (classification + regression problems)
- Information extraction

And almost all _models_ can be described as "text-to-X" models, for some choice of X:
- Text-to-numbers: all classification and regression models, and many information extraction models.
- Text-to-text (aka "sequence to sequence"): all text generation models like OpenAssistant, Mistral, ChatGPT, etc.
- Text-to-image and text-to-video: models like Stable Diffusion, Midjourney, etc.  We won't be talking about these today.

## Document Annotation

Document annotation problems are ultimately just classification and regression problems.  The only difference is that the input is text, rather than tabular or numeric data.

Given a piece of text, apply some label or set of labels to it.  E.g.:
- For a product review, is this user likely to buy from you again?  (Yes/no; binary classification)
- For a student essay, what writing skills are they demonstrating?  (Multi-output classification)
- For a support ticket, what is the problem the user needs help with?  (Multi-class classification)
- From a description of a vehicle, estimate the annual maintenance cost.  (Regression)
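
The four task types above differ mainly in the shape of the label a model is trained to predict. Here is a minimal sketch of how each label might be encoded as numbers; the class names, skill names, and values are made up for illustration.

```python
def one_hot(label, classes):
    """Encode a single label as a vector with exactly one 1."""
    return [1 if c == label else 0 for c in classes]


def multi_hot(labels, classes):
    """Encode a set of labels as a vector with a 1 for each label present."""
    return [1 if c in labels else 0 for c in classes]


# Hypothetical label sets, invented for this example.
TICKET_CLASSES = ["billing", "login", "shipping"]
WRITING_SKILLS = ["thesis", "evidence", "transitions"]

# Binary classification: will this reviewer buy again?
binary_target = 1

# Multi-class classification: exactly one problem per support ticket.
multi_class_target = one_hot("login", TICKET_CLASSES)

# Multi-output classification: any number of skills per essay.
multi_output_target = multi_hot({"thesis", "transitions"}, WRITING_SKILLS)

# Regression: estimated annual maintenance cost (arbitrary example value).
regression_target = 1250.0
```

Multi-class targets contain exactly one 1, while multi-output targets can contain any number of them.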

## Information Extraction

Information extraction is used to retrieve specific pieces of information from a piece of text.  E.g.:
- What general topic is this text about?  (Topic modeling)
- What's the general gist of this piece of text?  (Summarization)
- What does this text express positive/negative feelings about?  (Aspect-based sentiment analysis)

## Text-to-Numbers

These models take text in and output numeric results:
- A 1 or 0 to identify the presence, or absence, of some feature.
- A number between 0 and 1 to indicate a probability.
- A vector of numbers to represent anything you can represent in a vector.

What the output numbers "mean" varies by task.  1/0 outputs might be flagging potential fraud, or toxic language, or the successful use of some writing technique.
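
To make the three kinds of output concrete, here is a toy sketch that produces all of them for one input: a vector (word counts), a probability (via the logistic function), and a 1/0 flag (via a threshold). The vocabulary, weights, bias, and the "complaint" task are all assumptions invented for this example; a real model would learn its weights from labeled data.

```python
import math

# Hypothetical vocabulary and hand-picked weights for a toy "complaint" flag.
VOCAB = ["refund", "broken", "love", "great"]
WEIGHTS = [1.5, 2.0, -1.0, -1.5]
BIAS = -0.5


def vectorize(text):
    """Text -> vector: count how often each vocabulary word appears."""
    words = text.lower().split()
    return [words.count(w) for w in VOCAB]


def predict_proba(text):
    """Text -> number between 0 and 1, using a logistic function."""
    score = BIAS + sum(w * x for w, x in zip(WEIGHTS, vectorize(text)))
    return 1 / (1 + math.exp(-score))


def predict_flag(text, threshold=0.5):
    """Text -> 1 or 0, by thresholding the probability."""
    return 1 if predict_proba(text) >= threshold else 0


vector = vectorize("broken item want refund")
probability = predict_proba("broken item want refund")
flag = predict_flag("broken item want refund")
```

The same pipeline serves any of the tasks mentioned above; only the training data and the interpretation of the output change.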

## Text-to-Text

Text goes in, text comes out.  For these models to be useful, the output text should be somehow conditioned on the input text.  E.g.:
- Put a long document in, and produce a summary of it.
- Put some text in, and guess the next word.  (This is how nearly all current text generation models work: they are trained on next-word prediction.)
    - A little secret: these are really text-to-numbers models.  The output is a vector of probabilities, one per possible next word, and the model samples from this probability distribution to generate new text.
- Put some text in, and translate it into another language.
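
The next-word-prediction bullet above can be sketched in a few lines. This is only an illustration: the three-word vocabulary and the scores are made up, and the softmax step reflects how models typically turn raw scores into probabilities rather than anything specific to one model.

```python
import math
import random


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_next_word(vocab, scores, rng):
    """Pick one word at random, weighted by its probability."""
    probs = softmax(scores)
    return rng.choices(vocab, weights=probs, k=1)[0]


# Hypothetical model output for the prompt "The cat sat on the".
vocab = ["mat", "sofa", "moon"]
scores = [2.0, 1.0, 0.1]

rng = random.Random(0)  # seeded so repeated runs give the same sample
probs = softmax(scores)
next_word = sample_next_word(vocab, scores, rng)
```

Generating longer text repeats this step: append the sampled word to the input, run the model again, and sample the next word.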

## Text-to-Image and Text-to-Video

Put some text in, and get a picture or video or something similar out.  We won't be discussing these models today.