# PRML Week 2

# Generative AI
- GenAI is a subset of artificial intelligence that focuses on creating models and algorithms capable of generating new and original data that resembles a given training dataset.
- Unlike AI systems that rely on predefined rules, or models trained only to make predictions or decisions, generative AI models create new content, often in the form of images, text, music, or other creative output.
- There are several popular techniques and architectures used in generative AI:

### 1. Generative Adversarial Networks (GANs):
- GANs consist of two neural networks: a generator and a discriminator. The generator produces fake data, while the discriminator tries to distinguish real data from fake. Both networks improve through competition, so the generator produces increasingly realistic data.
### 2. Variational Autoencoders (VAEs): 
- VAEs are a type of neural network that can learn to represent input data in a compact, latent space. This latent space can then be sampled to generate new data points, which are similar to the original data but slightly different.
### 3. Recurrent Neural Networks (RNNs) and Transformers:
- These are sequence models that process sequential data, such as natural language or time series. They can be used for tasks like text generation and music composition.
### Progress and Ethical Concerns
- Generative AI has shown remarkable progress in recent years, leading to impressive results in various creative tasks. However, it also raises ethical concerns about the potential misuse of generated content, such as deepfakes or misinformation. Researchers and developers are actively exploring ways to improve the controllability, interpretability, and ethical use of generative AI systems.

## Model Types: Generative vs. Discriminative

### Generative Models
- **Purpose**: These models learn the joint probability distribution \(P(X, Y)\) of the input features \(X\) and the output labels \(Y\).
- **Usage**: They can generate new instances of data by sampling from the learned distribution.
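- **Examples**: Naive Bayes, Hidden Markov Models (HMMs), and Generative Adversarial Networks (GANs). Note that GANs model the data distribution \(P(X)\) implicitly rather than an explicit joint \(P(X, Y)\).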
- **Process**: They first model how the data is generated by learning \(P(X|Y)\) and \(P(Y)\). To make predictions, they use Bayes' theorem to compute \(P(Y|X) = P(X|Y)\,P(Y) / P(X)\).
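
As a worked illustration of the process above, the sketch below applies Bayes' theorem to a tiny two-class problem. The class names, the single binary feature, and all probability values are made up for illustration and do not come from any dataset.

```python
# Hypothetical numbers: two classes and one binary feature X
# (for example, "the message contains the word 'offer'").
prior = {"spam": 0.3, "ham": 0.7}        # P(Y)
likelihood = {"spam": 0.8, "ham": 0.1}   # P(X = 1 | Y)

# Evidence P(X = 1), obtained by summing P(X = 1 | Y) * P(Y) over the classes
evidence = sum(likelihood[c] * prior[c] for c in prior)

# Posterior P(Y | X = 1) via Bayes' theorem
posterior = {c: likelihood[c] * prior[c] / evidence for c in prior}

for c, p in posterior.items():
    print(f"P({c} | X=1) = {p:.3f}")
```

With these assumed values, the posterior for the "spam" class is about 0.774, even though its prior is only 0.3, because the feature is much more likely under that class.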

### Discriminative Models
- **Purpose**: These models directly learn the conditional probability \(P(Y|X)\) or make decisions based on a function \(f(X)\) that maps inputs \(X\) to labels \(Y\).
- **Usage**: They are primarily used for classification and regression tasks.
- **Examples**: Logistic Regression, Support Vector Machines (SVM), and Neural Networks.
- **Process**: They focus on finding the boundary that best separates different classes by learning \(P(Y|X)\) directly from the data.

### Key Differences
- **Generative**: Models how data is generated; can create new data samples.
- **Discriminative**: Focuses on the decision boundary; typically better for classification tasks.
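
A minimal sketch of this difference, assuming synthetic one-dimensional data with Gaussian class-conditional densities (the data, the random seed, and all variable names are illustrative assumptions). The generative part estimates \(P(Y)\) and \(P(X|Y)\) by hand and can then sample new points; the discriminative part uses scikit-learn's `LogisticRegression`, which only models \(P(Y|X)\).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic data (assumption): class 0 centered at -2, class 1 centered at +2
X = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(2.0, 1.0, 200)])
y = np.concatenate([np.zeros(200, dtype=int), np.ones(200, dtype=int)])

# Generative: estimate P(Y) and a Gaussian P(X | Y) for each class
classes = np.unique(y)
priors = np.array([np.mean(y == c) for c in classes])
means = np.array([X[y == c].mean() for c in classes])
stds = np.array([X[y == c].std() for c in classes])

def generative_posterior(x):
    """P(Y | X = x) from Bayes' theorem with Gaussian class-conditionals."""
    dens = np.exp(-0.5 * ((x - means) / stds) ** 2) / (stds * np.sqrt(2 * np.pi))
    joint = dens * priors
    return joint / joint.sum()

# Because P(X | Y) was modeled, new samples can be drawn for a chosen class
new_samples_class1 = rng.normal(means[1], stds[1], size=5)

# Discriminative: learn P(Y | X) directly; there is no model of X to sample from
clf = LogisticRegression().fit(X.reshape(-1, 1), y)

x_query = 0.5
print("Generative     P(Y | x):", generative_posterior(x_query))
print("Discriminative P(Y | x):", clf.predict_proba([[x_query]])[0])
print("New class-1 samples:", new_samples_class1)
```

Both models give a posterior over the classes for `x_query`, but only the generative one has the ingredients needed to produce new data points.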

### Analogy
- **Generative**: Imagine learning how to draw detailed pictures of different animals (understanding their full structure).
- **Discriminative**: Imagine learning how to distinguish between different animals based on their features (just knowing the boundaries).


# Discussion

- **What is AI? Is AI taking over the world? Do we need to be fearful?**  
AI is the simulation of human intelligence by machines. It isn't taking over the world, but ethical concerns exist.

- **Assume you are leading an AI project? The system you develop will need to be ethical. What do you understand by AI ethics? How will you ensure your system complies with ethics requirements?**  
AI ethics involves ensuring fairness, transparency, and accountability. Ensuring compliance includes bias mitigation, data privacy, and continuous ethical reviews.

- **There has been a major resurgence of AI since 2011 and it is being embedded in most walks of life. Discuss the reasons for AI ‘winters’ and the new directions that make AI as the major leap for the future.**  
AI winters occurred due to unmet expectations and funding cuts. Recent advances in computing power and algorithms drive AI's resurgence.

- **What is generative AI (e.g., ChatGPT)? What is it doing? Why does it work? Is it intelligent? Why does it hallucinate and say wrong things, give false conclusions?**  
Generative AI creates content such as text using deep learning. It is not truly intelligent, and it can hallucinate because its output is generated probabilistically rather than verified against facts.

- **Bard is also a generative AI system, developed by Google. Explore it.**  
Ask both Bard and ChatGPT to explain PRML (Pattern Recognition and Machine Learning) concepts, then compare their responses for clarity, depth, and accuracy.

- **Pattern recognition (PR) is a major driver for the new AI. How does PR define AI? Identify 4-6 application problems of AI that realize PR.**  
PR involves identifying patterns in data. Applications include image recognition, speech recognition, fraud detection, medical diagnosis, and autonomous driving.

- **Differentiate between PR and machine learning? Discuss the PR/ML cycle of modeling & problem solving.**  
PR focuses on detecting patterns, while ML focuses on learning from data. The cycle includes data collection, model training, evaluation, and deployment.

- **Differentiate between regression, classification and clustering.**  
Regression predicts continuous values, classification assigns labels, and clustering groups similar data points without labels.

- **What is a ‘deep fake’? Give some examples. Explain why it is difficult to detect them computationally.**  
Deep fakes are realistic fake media created using AI. Examples include fake videos of celebrities. Detection is hard because the generating models are optimized to produce output that closely resembles real media.
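
The distinction between regression, classification, and clustering drawn in the discussion above can be sketched with scikit-learn on a toy dataset. The data values and variable names below are assumptions chosen only to keep the example small.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.cluster import KMeans

# Toy one-feature data (assumption): two well-separated groups of points
X = np.array([[1.0], [1.5], [2.0], [8.0], [8.5], [9.0]])

# Regression: predict a continuous target (here y = 2x)
y_continuous = np.array([2.0, 3.0, 4.0, 16.0, 17.0, 18.0])
reg = LinearRegression().fit(X, y_continuous)
print("Regression prediction for x=5:", reg.predict([[5.0]])[0])

# Classification: predict a discrete label from labelled examples
y_labels = np.array([0, 0, 0, 1, 1, 1])
clf = LogisticRegression().fit(X, y_labels)
print("Classification prediction for x=7:", clf.predict([[7.0]])[0])

# Clustering: group the points without using any labels
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("Cluster assignments:", km.labels_)
```

The cluster numbers returned by `KMeans` are arbitrary identifiers, so only the grouping matters, not which group is called 0 or 1.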

# Exercise 1: Fitting an n-degree polynomial to data
- Here's an example of fitting an n-degree polynomial to a dataset using the scikit-learn library in Python:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Dataset
X = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 6, 8, 10])

# Reshape the feature array
X = X.reshape(-1, 1)

# Define the degree of the polynomial
degree = 3

# Create polynomial features
poly_features = PolynomialFeatures(degree=degree)
X_poly = poly_features.fit_transform(X)

# Create and train the polynomial regression model
model = LinearRegression()
model.fit(X_poly, y)

# Generate data for plotting
X_plot = np.linspace(0, 6, 100).reshape(-1, 1)
X_plot_poly = poly_features.transform(X_plot)
y_plot = model.predict(X_plot_poly)

# Plot the original data and the fitted polynomial curve
plt.scatter(X, y, label='Original Data')
plt.plot(X_plot, y_plot, color='red', label='Polynomial Fit')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Polynomial Fit of Degree {}'.format(degree))
plt.legend()
plt.show()
```
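
Since the dataset in this exercise lies exactly on the line y = 2x, one way to check the fit is to inspect the learned coefficients for several degrees. The sketch below is an illustrative addition rather than part of the original exercise. It uses `make_pipeline` to chain the feature expansion and the regression, and sets `include_bias=False` so that the intercept is handled only by `LinearRegression`.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Same dataset as in the exercise
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([2, 4, 6, 8, 10])

coefs = {}
for degree in (1, 2, 3):
    # Expand X into [x, x^2, ..., x^degree], then fit ordinary least squares
    pipe = make_pipeline(
        PolynomialFeatures(degree=degree, include_bias=False),
        LinearRegression(),
    )
    pipe.fit(X, y)
    coefs[degree] = pipe.named_steps["linearregression"].coef_
    print(f"degree={degree}: coefficients={np.round(coefs[degree], 6)}")
```

For exactly linear data, the coefficient of the linear term should come out close to 2 and the coefficients of the squared and cubic terms close to zero, so the higher-degree fits collapse to the same straight line.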


# Exercise 2: Fitting a linear regression model to house-price data
- Here's an example of Python code that builds a linear regression model using the scikit-learn library and trains it on the given dataset:


```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Dataset
X = np.array([[1200, 2, 1, 1995],
              [1500, 3, 2, 2002],
              [1800, 3, 2, 1985],
              [1350, 2, 1, 1998],
              [2000, 4, 3, 2010]])

y = np.array([250, 320, 280, 300, 450])

# Create and train the model
model = LinearRegression()
model.fit(X, y)

# Predict house prices
new_data = np.array([[1650, 3, 2, 2005],
                     [1400, 2, 1, 2000]])

predicted_prices = model.predict(new_data)
print("Predicted prices:", predicted_prices)
```

```
Predicted prices: [375.83867702 325.62942963]
```
