# One-shot

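“One-shot” can mean different things depending on context: in NLP it is sometimes used loosely for one-hot encoding, while in machine learning more broadly it refers to one-shot learning, which is often built on embeddings.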

## One-Shot in NLP (One-Hot Encoding)

In NLP, a “one-shot vector” is typically a loose name for one-hot encoding, a method of representing categorical data (such as words in a vocabulary) as binary vectors in which exactly one element is 1 and all others are 0.

Example:

If your vocabulary consists of three words: ["cat", "dog", "fish"], then one-hot encoding represents them as:
- “cat” → [1, 0, 0]
- “dog” → [0, 1, 0]
- “fish” → [0, 0, 1]
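The mapping above can be sketched in a few lines of Python. This is a minimal illustration, not a library API: the helper name `one_hot` is an assumption introduced here, and it simply uses each word's index in the vocabulary list as the position of the 1.

```python
def one_hot(word, vocabulary):
    """Return a binary vector with a single 1 at the word's index in the vocabulary."""
    if word not in vocabulary:
        raise ValueError(f"{word!r} is not in the vocabulary")
    index = vocabulary.index(word)
    return [1 if i == index else 0 for i in range(len(vocabulary))]


vocabulary = ["cat", "dog", "fish"]

# Encode every word in the vocabulary.
encoded = {word: one_hot(word, vocabulary) for word in vocabulary}

for word, vector in encoded.items():
    print(word, vector)
```

Note that the vector length always equals the vocabulary size, so adding a word changes the length of every vector.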

Each word gets its own dimension, so the vector length equals the vocabulary size. This approach has limitations:
- Sparse representation (every element except one is 0).
- No notion of similarity (e.g., “dog” and “wolf” are exactly as far apart as “dog” and “table”).
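The lack of similarity information can be checked directly. The sketch below assumes a hypothetical three-word vocabulary that mirrors the example in the text and measures the Euclidean distance between every pair of one-hot vectors; each pair comes out the same distance apart (the square root of 2), regardless of meaning.

```python
import math
from itertools import combinations

# Hypothetical vocabulary chosen to mirror the example in the text.
vocabulary = ["dog", "wolf", "table"]

# One-hot vector for each word: 1.0 at the word's index, 0.0 elsewhere.
vectors = {
    word: [1.0 if i == index else 0.0 for i in range(len(vocabulary))]
    for index, word in enumerate(vocabulary)
}

# Euclidean distance between every pair of distinct words.
distances = {
    (left, right): math.dist(vectors[left], vectors[right])
    for left, right in combinations(vocabulary, 2)
}

for pair, distance in distances.items():
    print(pair, round(distance, 4))
```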

That’s why dense word embeddings (such as Word2Vec, GloVe, or the contextual embeddings produced by BERT) are often preferred over one-hot encoding.

## One-Shot in AI Embeddings

In machine learning, “one-shot” has a different meaning: it describes learning from a single example, a special case of few-shot learning.

### One-Shot Learning

One-shot learning refers to a model’s ability to recognize or classify new data points with just one example per class. This is common in:
- Facial recognition: A system learns your face from just one photo and can recognize you later.
- Image classification: Identifying a new object category from a single labeled example.
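One common way to realize this is to compare a new input against the single stored example of each class in an embedding space. The sketch below assumes that some pretrained model has already turned each input into a vector; the function name `classify_one_shot`, the labels, and all the numbers are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify_one_shot(query_embedding, support_set):
    """Return the label whose single support embedding is most similar to the query."""
    return max(
        support_set,
        key=lambda label: cosine_similarity(query_embedding, support_set[label]),
    )


# Hypothetical embeddings: exactly one stored example per class.
support_set = {
    "alice": [0.9, 0.1, 0.2],
    "bob": [0.1, 0.8, 0.3],
}

# Hypothetical embedding of a new, unlabeled input.
query = [0.85, 0.15, 0.25]

predicted = classify_one_shot(query, support_set)
print(predicted)
```

Adding a new class only requires storing one more embedding in `support_set`; no retraining is involved in this sketch.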

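One-shot learning often relies on embedding-based methods, such as:
- Siamese networks: Two weight-sharing networks embed a pair of inputs, and the distance between the two embeddings indicates whether the inputs belong to the same class.
- Triplet loss: A training objective that pulls embeddings of similar items closer together and pushes embeddings of dissimilar items farther apart.
- Transformer models: Large pretrained language models can handle few-shot and one-shot tasks in NLP from examples supplied in the prompt.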
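To make the triplet loss concrete, here is a minimal sketch of one common formulation: the loss is zero once the anchor is closer to the positive than to the negative by at least a margin. The Euclidean distance, the default margin of 1.0, and the example vectors are assumptions made for illustration; some formulations use squared distances instead.

```python
import math


def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin) with Euclidean d."""
    distance_positive = math.dist(anchor, positive)
    distance_negative = math.dist(anchor, negative)
    return max(0.0, distance_positive - distance_negative + margin)


anchor = [0.0, 0.0]
positive = [0.0, 1.0]        # same class as the anchor
far_negative = [3.0, 4.0]    # different class, already far away
near_negative = [0.0, 1.5]   # different class, too close to the anchor

# The far negative satisfies the margin, so it contributes no loss.
satisfied = triplet_loss(anchor, positive, far_negative)

# The near negative violates the margin, so the loss is positive.
violated = triplet_loss(anchor, positive, near_negative)

print(satisfied, violated)
```

During training, a positive loss produces gradients that move the positive toward the anchor and the negative away from it.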

### Embeddings and One-Shot Learning

Embeddings convert data (like words, sentences, or images) into dense vector representations in a continuous space, where similar data points are closer together.

For example, a word embedding model (like Word2Vec or BERT) might represent (values are illustrative):
- “king” as [0.5, 0.1, -0.3, …]
- “queen” as [0.49, 0.12, -0.29, …]

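Unlike one-hot encoding, these embeddings capture semantic relationships: related words such as “king” and “queen” end up with nearby vectors. This property is what makes one-shot learning workable, because a new input can be classified by comparing its embedding with the single stored example of each class.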
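The difference can be illustrated with cosine similarity. The sketch below uses only the three components shown in the text for “king” and “queen” (the remaining components are elided there and are not filled in here), and compares the result with one-hot vectors from a hypothetical two-word vocabulary.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Only the components shown in the text; the rest are elided in the source.
king = [0.5, 0.1, -0.3]
queen = [0.49, 0.12, -0.29]

# One-hot vectors for the same two words (hypothetical two-word vocabulary).
king_one_hot = [1, 0]
queen_one_hot = [0, 1]

embedding_similarity = cosine_similarity(king, queen)
one_hot_similarity = cosine_similarity(king_one_hot, queen_one_hot)

print(round(embedding_similarity, 4), one_hot_similarity)
```

With these illustrative values the dense vectors point in almost the same direction, while the one-hot vectors are orthogonal and carry no similarity signal.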