These are my personal notes taken while following the Udacity Generative AI Nanodegree.
The Nanodegree has 4 modules:
- Generative AI Fundamentals.
- Large Language Models (LLMs) & Text Generation.
- Computer Vision and Generative AI.
- Building Generative AI Solutions.
This folder & guide refer to the first module: Generative AI Fundamentals.
Mikel Sagardia, 2024. No guarantees.
Overview of Contents:
- Udacity Generative AI Nanodegree: Generative AI Fundamentals
Lesson objectives:
- Identify industry applications, trends, and opportunities of Generative AI
- Contextualize Generative AI within the broader history and landscape of machine learning and artificial intelligence
- Describe the general process that popular Generative AI models use to generate outputs
Instructor: Brian Cruz.
Examples of Generative AI:
- Text generation; e.g., ChatGPT
- Image generation; e.g., DALL-E
- Code generation; e.g., GitHub Copilot
- Audio generation: music and speech; e.g., Meta's AudioCraft
In general, Generative AI has made it much faster to produce content that previously required far more time and effort. As a result, people have become more productive; however, we should use it responsibly to avoid destroying jobs, among other risks.
- Creative content generation
  - Artwork synthesis: visual art pieces
- Music composition: original musical pieces
- Literary creation: written content
- Product development
- Design optimization: refine designs
- Rapid prototyping: concepts, visualization
- Material exploration: predict and explore new materials
- Scientific research
- Experiment simulation: physical testing less required
- Data analysis and prediction
- Molecular discovery: drug discovery
- Data augmentation
  - Image enhancement: new image variations
- Text augmentation: diverse new texts
- Synthetic data creation: new datasets from scratch
- Personalization
- Content recommendation based on preferences and behavior
- Bespoke product creation: tailored to individual specs
- Experience customization: suit individual user preferences
LLMs are able to create sentences that sound like they were written by humans, but they can struggle with questions that involve basic logic. This is because LLMs are primarily trained to fill in missing words in sentences, using the large corpora of text they are trained on.
Also, LLMs often avoid saying a simple "I don't know"; instead, they tend to hallucinate a made-up answer. That is because the principle they work on is precisely that kind of hallucination: predicting a plausible next word given the previous context, whether or not it is factually correct.
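To make the "fill in the next word" idea concrete, here is a toy sketch (my own illustration, not a real LLM): a bigram model that predicts the next word purely from co-occurrence counts in a tiny corpus. Real LLMs use neural networks and much longer contexts, but the training objective is the same in spirit.

```python
from collections import Counter, defaultdict

# Tiny toy corpus; a real LLM is trained on billions of words.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count which word follows each word (bigram statistics).
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent continuation seen after `word`."""
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # -> "cat" (seen twice, vs. once each for "mat"/"fish")
```

Note how the model always outputs *something* plausible for any word it has seen; it has no notion of "I don't know", which mirrors the hallucination behavior described above.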
- DeepMind: Millions of new materials discovered with deep learning
- Audi: Reinventing the wheel? “FelGAN” inspires new rim designs with AI
- Paper: May the force of text data analysis be with you: Unleashing the power of generative AI for social psychology research
- Udacity Course on Small Datasets and Synthetic Data
Video: AI And Machine Learning Timeline
Video: How Generative AI Models Are Trained
Generative AI models are trained to learn an internal representation of a vast dataset. Then, after training, they can sample in the learned distribution to generate new but convincing data (images, text, etc.).
There are many ways to train generative AI models; we focus on two:
- LLMs: given a sequence of words (the context), predict the next one; during training, we reward the correct word and penalize the rest.
- Image generation models (e.g., diffusion models): they build on techniques from Variational Autoencoders; images are encoded into a latent space and then decoded back into reconstructed images. Bad reconstructions are penalized, good ones rewarded. After training, we use only the decoder part to generate new images by feeding it a latent vector.
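The first training signal above (reward the correct word, penalize the rest) is typically implemented as a cross-entropy loss over the vocabulary. The sketch below is my own minimal illustration with a made-up four-word vocabulary: the loss is low when the model assigns high probability to the correct next word and high otherwise.

```python
import numpy as np

vocab = ["cat", "mat", "fish", "sat"]  # toy vocabulary for illustration

def cross_entropy(logits, target_index):
    """Cross-entropy loss: -log(probability assigned to the correct word)."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                 # softmax over the vocabulary
    return -np.log(probs[target_index])

# Suppose the correct next word is "cat".
confident = np.array([5.0, 0.0, 0.0, 0.0])  # model strongly predicts "cat"
unsure    = np.array([1.0, 1.0, 1.0, 1.0])  # uniform guess over all words

loss_good = cross_entropy(confident, vocab.index("cat"))
loss_bad  = cross_entropy(unsure, vocab.index("cat"))
assert loss_good < loss_bad  # confident correct prediction => smaller loss
```

Minimizing this loss over a huge corpus is what "rewarding the correct word and penalizing the rest" amounts to in practice.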
TBD.
🚧
TBD.
🚧
TBD.
🚧