# Unsupervised Learning, Clustering, Anomaly Detection, and Time Series

## Assignment Answers

### **Question 1: What is Dimensionality Reduction? Why is it important in machine learning?**
**Answer:**
Dimensionality reduction is the process of reducing the number of input variables (features) in a dataset while preserving as much meaningful information as possible. It is important because it lowers computational cost, filters out noise and redundant features, can improve model performance by reducing overfitting in high-dimensional spaces (the curse of dimensionality), and makes data easier to visualize.

### **Question 2: Name and briefly describe three common dimensionality reduction techniques.**
**Answer:**
1. **PCA (Principal Component Analysis)** – Linear technique that projects data onto orthogonal components ordered by the amount of variance they capture.
2. **t-SNE (t-distributed Stochastic Neighbor Embedding)** – Nonlinear technique that preserves local neighborhood structure, used mainly to visualize high-dimensional data in 2D or 3D.
3. **LDA (Linear Discriminant Analysis)** – Supervised technique that projects data onto axes that maximize the separation between classes.
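As an illustration of the first technique, the following is a minimal sketch of PCA with scikit-learn. The Iris dataset, the standardization step, and the choice of two components are assumptions made for the example, not part of the original answer.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Load a small built-in dataset: 150 samples, 4 features (illustrative choice)
X = load_iris().data

# Standardize so every feature contributes on the same scale
X_scaled = StandardScaler().fit_transform(X)

# Project onto the two orthogonal directions of highest variance
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

print("Reduced shape:", X_2d.shape)
print("Variance explained per component:", pca.explained_variance_ratio_)
```

The `explained_variance_ratio_` attribute gives a quick check of how much information the reduced representation retains.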

### **Question 3: What is clustering in unsupervised learning? Mention three popular clustering algorithms.**
**Answer:**
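Clustering is an unsupervised learning technique that groups data points so that points in the same cluster are more similar to each other than to points in other clusters, without using labels.
Popular algorithms:
- **K-Means Clustering** – Partitions data into k clusters by assigning each point to its nearest centroid.
- **DBSCAN** – Density-based method that finds clusters of arbitrary shape and marks points in sparse regions as noise.
- **Hierarchical Clustering** – Builds a tree of nested clusters by repeatedly merging or splitting groups.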
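The sketch below runs DBSCAN and agglomerative (hierarchical) clustering from scikit-learn on synthetic data. The dataset and the `eps` and `min_samples` values are assumptions picked for illustration; DBSCAN parameters in particular usually need tuning for each dataset.

```python
from sklearn.cluster import DBSCAN, AgglomerativeClustering
from sklearn.datasets import make_blobs

# Synthetic data: 200 points around 3 centers (illustrative choice)
X, _ = make_blobs(n_samples=200, centers=3, random_state=42)

# DBSCAN needs no cluster count; points in sparse regions get the label -1 (noise).
# eps and min_samples are assumed values, not recommendations.
dbscan_labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(X)

# Agglomerative clustering merges clusters bottom-up until 3 remain
agg_labels = AgglomerativeClustering(n_clusters=3).fit_predict(X)

n_dbscan_clusters = len(set(dbscan_labels) - {-1})
n_noise = int((dbscan_labels == -1).sum())
print("DBSCAN clusters:", n_dbscan_clusters, "| noise points:", n_noise)
print("Agglomerative clusters:", len(set(agg_labels)))
```

Unlike K-Means and agglomerative clustering, DBSCAN decides the number of clusters itself, which is why its result depends strongly on the density parameters.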

### **Question 4: Explain the concept of anomaly detection and its significance.**
**Answer:**
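Anomaly detection identifies rare observations or patterns that deviate significantly from expected behavior. It is significant because such deviations often signal critical events, for example fraud in financial transactions, faults in monitored systems, disease in medical data, or intrusions in security applications.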

### **Question 5: List and briefly describe three types of anomaly detection techniques.**
**Answer:**
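1. **Statistical Methods** – Flag points that are improbable under an assumed probability distribution, for example values with a large z-score.
2. **Machine Learning Methods** – Use models such as Isolation Forest or LOF (Local Outlier Factor) to score how isolated or sparsely surrounded each point is.
3. **Deep Learning Methods** – Autoencoders or LSTMs learn a representation of normal data and flag inputs with a high reconstruction or prediction error.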
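A minimal sketch of the statistical approach follows, using a z-score rule with NumPy. The synthetic data, the injected outliers, and the threshold of 3 standard deviations are assumptions made for the example.

```python
import numpy as np

# Synthetic "normal" readings around 50 with standard deviation 5 (assumed)
rng = np.random.default_rng(0)
data = rng.normal(loc=50, scale=5, size=500)

# Inject three obvious outliers at known positions
outlier_idx = [10, 200, 400]
data[outlier_idx] = [120, -30, 110]

# z-score: distance from the mean measured in standard deviations
z_scores = (data - data.mean()) / data.std()

# Flag points more than 3 standard deviations from the mean (common rule of thumb)
threshold = 3
anomalies = np.where(np.abs(z_scores) > threshold)[0]
print("Flagged indices:", anomalies)
```

Because the mean and standard deviation are themselves affected by outliers, a robust variant based on the median is often preferred when anomalies are large or frequent.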

### **Question 6: What is time series analysis? Mention two key components of time series data.**
**Answer:**
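Time series analysis studies data points recorded in time order to identify patterns and forecast future values.
Two key components:
- **Trend** – The long-term upward or downward movement of the series.
- **Seasonality** – A pattern that repeats at a fixed, known period, such as daily, weekly, or yearly.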
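The two components can be separated with a simple additive decomposition. The sketch below builds a synthetic monthly series and estimates the trend with a centered moving average and the seasonal pattern by averaging the detrended values per month. The series itself, the additive model, and the 12-month period are assumptions made for illustration.

```python
import numpy as np

# Synthetic monthly series over 10 years: linear trend + yearly cycle + noise
rng = np.random.default_rng(0)
n_months = 120
t = np.arange(n_months)
true_seasonal = 10 * np.sin(2 * np.pi * t / 12)
series = 0.5 * t + true_seasonal + rng.normal(0, 1, n_months)

# Centered moving average for an even period (2x12 MA):
# 13 weights, with half weight on the two end points
kernel = np.r_[0.5, np.ones(11), 0.5] / 12
trend_est = np.convolve(series, kernel, mode="valid")  # aligned with t[6:-6]

# Remove the trend, then average the remainder for each month of the year
detrended = series[6:-6] - trend_est
months = t[6:-6] % 12
seasonal_est = np.array([detrended[months == m].mean() for m in range(12)])

print("Estimated seasonal pattern:", np.round(seasonal_est, 2))
```

The moving average spans exactly one full period, so the seasonal swings cancel inside each window and only the trend remains.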

### **Question 7: Describe the difference between seasonality and cyclic behavior in time series.**
**Answer:**
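- **Seasonality** refers to patterns that repeat at a fixed, known period tied to the calendar or clock, such as higher retail sales every December.
- **Cyclic behavior** refers to rises and falls with no fixed period, typically longer-term fluctuations driven by external factors such as economic conditions.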

### **Question 8: Python code to perform K-means clustering on a sample dataset**

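```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

# Generate a sample dataset: 200 points around 3 centers
X, _ = make_blobs(n_samples=200, centers=3, random_state=42)

# Fit K-Means; fixing n_init and random_state makes the result reproducible
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

# Plot the points colored by cluster and mark the learned centroids
plt.scatter(X[:, 0], X[:, 1], c=labels)
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], c="red", marker="x")
plt.title("K-Means Clustering Result")
plt.show()
```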

### **Question 9: What is inheritance in OOP? Provide a simple example in Python.**
**Answer:**
Inheritance allows a class (the child or subclass) to acquire the attributes and methods of another class (the parent or base class) and, where needed, override them.

**Example:**

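```python
class Animal:
    """Base class providing a default speak() method."""

    def speak(self):
        return "Animal speaks"


class Dog(Animal):
    """Subclass that inherits from Animal and overrides speak()."""

    def speak(self):
        return "Dog barks"


dog = Dog()
print(dog.speak())  # Dog barks
print(isinstance(dog, Animal))  # True
```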

### **Question 10: How can time series analysis be used for anomaly detection?**
**Answer:**
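Time series analysis detects anomalies by modeling the expected temporal pattern (trend, seasonality, or a forecast) and flagging observations that deviate from it by more than a chosen threshold. Models such as ARIMA or LSTM provide forecasts, and a moving average provides a smoothed baseline; in each case a large residual between the observed and expected value marks a potential anomaly.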
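A minimal sketch of the moving-average approach follows. The synthetic series, the injected spike, the window of 11 points, and the threshold of 3 standard deviations are assumptions made for illustration. The window is centered, so this version suits offline analysis rather than real-time monitoring.

```python
import numpy as np

# Synthetic series: slow trend + smooth cycle + noise, with one injected spike
rng = np.random.default_rng(0)
n = 300
t = np.arange(n)
series = 20 + 0.02 * t + 3 * np.sin(2 * np.pi * t / 100) + rng.normal(0, 0.5, n)
series[150] += 10  # injected anomaly

# Centered moving average as the "expected" value at each time step
window = 11
half = window // 2
moving_avg = np.convolve(series, np.ones(window) / window, mode="valid")

# Residual = observed - expected, aligned with series[half:-half]
residual = series[half:-half] - moving_avg
z = (residual - residual.mean()) / residual.std()

# Flag residuals more than 3 standard deviations from the mean residual
anomalies = np.where(np.abs(z) > 3)[0] + half
print("Anomalous time steps:", anomalies)
```

The same residual-and-threshold logic applies when the baseline comes from a forecasting model instead of a moving average.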