**Multimodal Generative AI**
- GenAI does not just classify or forecast; it creates new content based on patterns learned from training data.
    - Conversational (ChatGPT, Perplexity)
    - Coding (Copilot)
    - Images (DALL-E)

- GenAI is a type of deep learning (a subset of ML)
    - ML algorithms based on deep neural networks that produce generative output
    - Loosely inspired by the human brain, not a simulation of it

- GenAI is between Narrow AI (AI tool for a specific task) and AGI
    - GenAI is built on large foundation models that can adapt to various tasks; it has not reached human-level intelligence yet

- LM and LLM
    - LMs are AI models that generate new textual content by predicting the next word (token)
    - LLMs are LMs with billions of parameters, trained on very large text datasets, which has enabled advanced capabilities
    - MLLMs (multimodal LLMs) are LLMs which can process and produce media in different modes; a step towards AGI
    - These are having massive effects on a number of industries
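
To make "predicting the next word" concrete, here is a minimal sketch of a bigram language model, a far simpler technique than an LLM. The corpus and the function names `train_bigram` and `predict_next` are made up for illustration; real LLMs use neural networks over tokens rather than word counts.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count how often each word follows each other word."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    candidates = follows.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

# Toy corpus (an assumption for this example, not real training data).
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" twice, more than any other word
```

The same idea scales up: an LLM also outputs a likely next token given what came before, but it conditions on a long context instead of a single word.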

*Game Theory*
- Systems learn best when there is some form of competition and resolution
- Generative Adversarial Network (GAN); a generator competes against a discriminator - the generator wants to trick the discriminator into labeling generated data as real data
- Process:
    - Gathering large amounts of data in desired modes
    - Training to iteratively teach the model patterns in the data
    - Fine-tuning adapts the model to a specific task or domain
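
The generator-versus-discriminator competition can be sketched through the two loss functions commonly used to train a GAN. This is a rough illustration, not a full training loop: the probabilities below are made-up discriminator outputs, and the function names are hypothetical.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: D should output ~1 on real data and ~0 on generated data.

    d_real / d_fake are D's probabilities (strictly between 0 and 1) that each
    real / generated sample is real.
    """
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating generator loss: G is rewarded when D rates its samples as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Made-up discriminator outputs for illustration.
d_real = [0.9, 0.8]
caught = [0.1, 0.2]   # D correctly rates generated samples as fake
fooled = [0.6, 0.7]   # D is tricked into rating generated samples as real

print(generator_loss(caught) > generator_loss(fooled))                          # True
print(discriminator_loss(d_real, fooled) > discriminator_loss(d_real, caught))  # True
```

When the generator fools the discriminator, the generator's loss falls and the discriminator's loss rises, which is the competition the notes describe.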

*Chatbots*
- Early chatbots were not flexible; they had a set of specific responses to specific prompts
- Pattern recognition from labeled text (a form of ML) allows adaptive chatbots which can analyze inputs
- Neural networks and NLP allow chatbots to increase input understanding and generate more dynamic outputs
    - Contextual awareness
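
The inflexibility of early chatbots is easy to see in code. Below is a minimal sketch of a rule-based bot; the prompts and responses are invented for the example.

```python
# Fixed prompt -> response table (made-up entries for illustration).
RESPONSES = {
    "hello": "Hi! How can I help?",
    "what are your hours?": "We are open 9am to 5pm.",
}

def rule_based_reply(prompt):
    """Return the canned answer for an exact (case-insensitive) match, else a fallback."""
    return RESPONSES.get(prompt.strip().lower(), "Sorry, I don't understand.")

print(rule_based_reply("Hello"))              # matches a stored prompt
print(rule_based_reply("When do you open?"))  # same intent, different wording: falls through
```

Any rephrasing that is not in the table fails, which is the gap that ML-based and neural chatbots close.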

*Transformers*
- A transformer model (e.g. GPT) splits text into tokens (tokenization), then uses self-attention to weight the tokens and pay more attention to the most relevant inputs
    - GPT can down-weight unimportant tokens regardless of where they appear in the input
    - This enables context
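
A minimal sketch of the weighting step (scaled dot-product attention) in plain Python follows. The vectors are toy numbers chosen for illustration; a real transformer learns its query, key, and value vectors and runs many attention heads in parallel.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Weight each value by how well its key matches the query."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]
    return weights, output

# Toy vectors (assumptions for this example): three tokens, two dimensions each.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [1.0, 0.0]

weights, output = attention(query, keys, values)
print(weights)  # tokens 1 and 3 match the query and get more weight than token 2
```

The weights depend on how well each key matches the query, not on where the token sits in the list, which is why relevant tokens can be picked out regardless of order.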

*GPU*
- GenAI depends on GPU power; GPUs are better than CPUs at parallel mathematical operations on vectors and matrices
    - Efficient GPUs allow running machine learning on smaller devices
    - Open-source libraries which collaborate with organizations

*GenOS*
- Generative Open-Source Index which tracks gen AI projects and ranks them based on various factors
- GPUs provide the physical infrastructure, GenOS provides the organization and delivery model

*Training*
- Supervised learning: Providing patterns from a labeled dataset
- Unsupervised (self-supervised) learning: No human labels; the model predicts words in raw text and compares them to the actual words to adjust weights in the neural network
- Often a cost-effective combination is used; a model is pre-trained on large unlabeled datasets, then fine-tuned with supervised learning
- Reinforcement Learning from Human Feedback (RLHF)
    - Human preferences guide the model's responses to align the model with our values and needs
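
The difference between the two data setups can be sketched as follows. In the supervised case a person writes each label; in the word-prediction case the "correct output" is simply the next word of the text itself. The sample data and the helper `next_word_pairs` are hypothetical.

```python
def next_word_pairs(text, context_size=2):
    """Build (context, target) training pairs from raw text; no human labeling needed."""
    words = text.split()
    pairs = []
    for i in range(context_size, len(words)):
        context = tuple(words[i - context_size:i])
        pairs.append((context, words[i]))
    return pairs

# Supervised: every example carries a human-provided label (made-up data).
labeled = [("great movie", "positive"), ("boring plot", "negative")]

# Word prediction: targets come straight from the text.
pairs = next_word_pairs("the cat sat on the mat")
for context, target in pairs:
    print(context, "->", target)
```

Because the targets are free, this kind of pre-training can use far more text than any hand-labeled dataset.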

*Cross-Modal Embeddings*
- Example; Connecting text to images; 'dog' points to a picture of a dog
    - Allows image retrieval, captioning, etc.
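
A rough sketch of image retrieval with a shared embedding space is shown below. The three-dimensional vectors and file names are invented for illustration; a real cross-modal model would produce much longer vectors from trained text and image encoders.

```python
import math

def cosine(a, b):
    """Cosine similarity: near 1.0 means the vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: text and images live in the same vector space.
IMAGE_EMBEDDINGS = {
    "dog.jpg": [0.9, 0.1, 0.0],
    "cat.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}
TEXT_EMBEDDINGS = {
    "dog": [0.8, 0.2, 0.1],
    "car": [0.1, 0.0, 0.9],
}

def retrieve_image(text):
    """Return the image whose embedding is closest to the text embedding."""
    query = TEXT_EMBEDDINGS[text]
    return max(IMAGE_EMBEDDINGS, key=lambda name: cosine(query, IMAGE_EMBEDDINGS[name]))

print(retrieve_image("dog"))  # dog.jpg
print(retrieve_image("car"))  # car.jpg
```

Captioning runs the same comparison in the other direction: start from an image embedding and look for the closest text.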

*Use of LLMs*
- Much more flexible than rule-based models
- Can handle and understand a variety of language and unstructured input - more generalized to global dialects - ignores errors and emojis, etc.
    - Flexibility
    - Generalization
    - Learning method (unsupervised pre-training vs time-consuming hand-written rules)
    - Robustness to Noise

**Prompting** 
- Users use prompt engineering to leverage RLHF to align LLM outputs with their needs
- Analogous to 'driving' the LLM - requires that the engine (GPU) and steering system (RLHF) work, but the prompt does the final fine-tuning of the generated output
- The quality of the prompt is very important and can massively change the output 
- Art and science of providing quality prompts to the LLM to encourage helpful behavior
- ChatGPT has been fine-tuned with RLHF; Humans have guided the model by rewarding helpful responses and discouraging unhelpful ones

*Strategy*
- Chain of Thought; encouraging a conclusion by asking the LLM to generate step-by-step reasoning
- Few-Shot Input; providing examples for the model to base output on 
- Instructional; given roles or instructions, do x

*Operations*
- Iterate input based on output
- Provide increasingly specific and helpful inputs
    - Format
    - Scope
    - Style
    - Ask for explicit uncertainty
    - Ask for step-by-step
- Allow the model to be wrong and to declare that
- Split task into subtasks
- Verify all information given
- Reduce hallucinations
    - Include instructions to not hallucinate
    - Restrict output
    - Add chain of thought style instructions
    - Repeat instructions multiple times
    - Position most important instructions towards the end (recency effect)
- Spotting hallucinations
    - Too specific or factual without a reference/source
    - Fake sources/references
    - Contradictions and inconsistencies
    - Overconfidence when uncertain
    - Lack of common sense 
- Math
    - LLMs don't have a built-in calculator or logic engine; they imitate math patterns learned from training data
    - Higher complexity means more potential for errors
    - LLMs can't internally verify their answers

**Future**
- Combining rules with genAI
- Increased fine-tuning per industry
- Regulation and ethics
- Open collaboration and shared data, benchmarks, and partnerships

**Git and GitHub**
- Git is a distributed version control system for tracking file changes
    - Tracks who made changes, when they were made, and what they were
    - Best for text files (scripts, etc.)
    - Accessed through terminal; git command
    - GitHub is a website which receives git pushes
- In VSCode; Source control tab on the left connects to GitHub
- Nomenclature 
    - Repository; Where code lives
    - Commit; A saved snapshot of code
    - Branch; Independent line of development
    - Merge; Bringing changes from a branch into another
    - Pull Request (PR); Proposing changes for review
    - Clone; Downloading copy of repository to local machine
    - Push; Sending local changes to GitHub (syncing)
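
As a rough map from the nomenclature to the terminal, the sketch below lists the `git` command behind each term and a helper that runs one with Python's `subprocess`. The repository URL, branch name, file name, and commit message are all hypothetical, and nothing is executed unless `run` is called.

```python
import subprocess

# Hypothetical repository URL, used only for illustration.
REPO_URL = "https://github.com/example-user/example-repo.git"

# Term -> command, in the order a typical change would use them.
GIT_COMMANDS = {
    "clone": ["git", "clone", REPO_URL],              # download a copy of the repository
    "branch": ["git", "checkout", "-b", "my-feature"],  # create and switch to a new branch
    "stage": ["git", "add", "notes.md"],              # choose which changes go into the commit
    "commit": ["git", "commit", "-m", "Add notes"],   # save a snapshot
    "push": ["git", "push", "-u", "origin", "my-feature"],  # send the branch to GitHub
    "merge": ["git", "merge", "my-feature"],          # run from the branch receiving the changes
}

def run(term, cwd=None):
    """Run the git command for `term` in directory `cwd`; raises if git reports an error."""
    return subprocess.run(GIT_COMMANDS[term], cwd=cwd, check=True)

for term, command in GIT_COMMANDS.items():
    print(f"{term:>7}: {' '.join(command)}")
```

A pull request has no entry because it is a GitHub feature rather than a core `git` command; it is opened on the GitHub website after the branch has been pushed.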