A collection of advanced AI/ML notebooks covering data mining, neural network theory, and image processing, with an emphasis on machine learning fundamentals and advanced techniques.
| # | Project | Category | Notebook | Focus |
|---|---|---|---|---|
| 1 | Data Mining Project | Data Science | 01_data_mining_project.ipynb | Clustering, Association Rules |
| 2 | ANN Loss Functions | Deep Learning Theory | 02_ann_loss_functions.ipynb | Softmax, Sigmoid, Cross-Entropy |
| 3 | Image Processing | Computer Vision | 03_comprehensive_image_processing.ipynb | Complete CV Pipeline |
| 4 | ML Comprehensive Exam | Machine Learning | 04_ml_comprehensive_exam.ipynb | End-to-End ML Tasks |
- scikit-learn - Clustering, classification
- Pandas - Data manipulation
- Association rule mining - Market basket analysis
- TensorFlow/Keras - Neural networks
- Loss functions - Optimization theory
- Activation functions - Softmax, Sigmoid, ReLU
- OpenCV - Computer vision
- PIL/Pillow - Image manipulation
- NumPy - Array operations
git clone https://github.com/uzi-gpu/data-mining-advanced.git
cd data-mining-advanced
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
jupyter notebook
File: 01_data_mining_project.ipynb
Objective: Apply data mining techniques to discover patterns in data
Techniques:
- Clustering: K-Means, Hierarchical
- Classification: Decision Trees, Random Forest
- Association Rules: Apriori algorithm
- Pattern Discovery: Frequent itemsets
Applications:
- Customer segmentation
- Market basket analysis
- Anomaly detection
- Recommendation systems
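As a rough illustration of the frequent-itemset and association-rule ideas listed above, here is a minimal sketch using only the Python standard library. It is a simplified Apriori-style level-wise search, not the notebook's implementation; the function names `frequent_itemsets` and `confidence`, the `max_size` parameter, and the toy basket data are all hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets with support >= min_support.

    Simplified Apriori-style search: size-k candidates are built only from
    items that still appear in some frequent (k-1)-itemset.
    """
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    result = {}
    # Level 1 candidates: every distinct item.
    items = sorted({item for basket in baskets for item in basket})
    for size in range(1, max_size + 1):
        level = {}
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            # Support = fraction of baskets that contain the whole candidate.
            support = sum(cand <= basket for basket in baskets) / n
            if support >= min_support:
                level[cand] = support
        if not level:
            break
        result.update(level)
        # Prune: items absent from this level cannot occur in larger frequent sets.
        items = sorted({item for itemset in level for item in itemset})
    return result


def confidence(itemsets, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    both = frozenset(antecedent) | frozenset(consequent)
    return itemsets[both] / itemsets[frozenset(antecedent)]


# Toy market-basket data (made up for illustration).
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
itemsets = frequent_itemsets(transactions, min_support=0.5)
print(confidence(itemsets, {"butter"}, {"bread"}))  # 1.0
```

A full Apriori implementation would also check that every (k-1)-subset of a candidate is frequent before counting it; the item-level pruning here is a lighter variant that still returns the same frequent itemsets.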
File: 02_ann_loss_functions.ipynb
Objective: Deep dive into neural network loss functions and optimization
Loss Functions Covered:
1. Binary Cross-Entropy:
   `BCE = -[y*log(ŷ) + (1-y)*log(1-ŷ)]`
   - Use case: Binary classification
   - Range: [0, ∞)
2. Categorical Cross-Entropy:
   `CCE = -Σ(y_i * log(ŷ_i))`
   - Use case: Multi-class classification
   - Requires: One-hot encoded labels
3. Mean Squared Error (MSE):
   `MSE = (1/n) * Σ(y - ŷ)²`
   - Use case: Regression
   - Sensitive to outliers
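The three formulas above translate almost directly into code. The following is a minimal standard-library sketch, not the notebook's code. Two things in it are assumptions rather than part of the formulas: the losses are averaged over the batch, and predictions are clipped by a small constant `EPS` so that `log` never receives zero.

```python
import math

EPS = 1e-12  # assumed clipping constant; keeps log() finite


def binary_cross_entropy(y_true, y_pred):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over the batch."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, EPS), 1.0 - EPS)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def categorical_cross_entropy(y_true, y_pred):
    """Mean of -sum_i y_i*log(p_i); y_true rows are one-hot encoded."""
    total = 0.0
    for row_true, row_pred in zip(y_true, y_pred):
        total += -sum(
            t * math.log(max(p, EPS)) for t, p in zip(row_true, row_pred)
        )
    return total / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Mean of squared differences; large errors dominate the result."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


print(binary_cross_entropy([1, 0], [0.9, 0.2]))
print(categorical_cross_entropy([[0, 1, 0]], [[0.2, 0.5, 0.3]]))
print(mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

The last call shows the outlier sensitivity noted above: a single prediction that is off by 2 contributes the entire loss because the error is squared.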
Activation Functions:
- Sigmoid: σ(x) = 1/(1+e^(-x))
- Softmax: e^(x_i) / Σe^(x_j)
- ReLU: max(0, x)
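A minimal standard-library sketch of the three activation functions above. The numerical-stability tricks (branching on the sign in `sigmoid`, subtracting the maximum in `softmax`) are common practice rather than something stated in this README, and they leave the mathematical result unchanged.

```python
import math


def sigmoid(x):
    """1 / (1 + e^(-x)), written so exp() never sees a large positive argument."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def softmax(xs):
    """e^(x_i) / sum_j e^(x_j); subtracting max(xs) avoids overflow."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def relu(x):
    """max(0, x)."""
    return max(0.0, x)


print(sigmoid(0.0))            # 0.5
print(softmax([1.0, 2.0, 3.0]))
print(relu(-2.0), relu(3.0))   # 0.0 3.0
```

Sigmoid pairs naturally with binary cross-entropy and softmax with categorical cross-entropy, since each produces the probabilities the corresponding loss expects.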
File: 03_comprehensive_image_processing.ipynb
Objective: Complete image processing pipeline from basics to advanced
Topics Covered:
Fundamentals:
- Image loading and display
- Color space conversions
- Image resizing and cropping
Filtering:
- Gaussian blur
- Median filtering
- Bilateral filter
- Sharpening
Edge Detection:
- Canny edge detector
- Sobel operator
- Laplacian
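To show what a gradient-based edge detector computes, here is a minimal pure-Python sketch of the Sobel operator on a grayscale image stored as a list of lists. It is an illustration under simplifying assumptions, not the notebook's code: border pixels are left at zero, and the helper names `apply_kernel` and `sobel_magnitude` are hypothetical. A library such as OpenCV would normally be used for real images.

```python
# Standard 3x3 Sobel kernels for horizontal and vertical intensity changes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_kernel(image, kernel, row, col):
    """Correlate a 3x3 kernel with the neighbourhood centred on (row, col)."""
    return sum(
        kernel[i][j] * image[row + i - 1][col + j - 1]
        for i in range(3)
        for j in range(3)
    )


def sobel_magnitude(image):
    """Gradient magnitude sqrt(gx^2 + gy^2); border pixels stay 0."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = apply_kernel(image, SOBEL_X, r, c)
            gy = apply_kernel(image, SOBEL_Y, r, c)
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out


# A 4x4 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 10, 10] for _ in range(4)]
edges = sobel_magnitude(image)
print(edges[1][1], edges[1][2])  # 40.0 40.0
```

The response is large on both sides of the step and zero in flat regions, which is why a threshold on the magnitude yields an edge map.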
Morphological Operations:
- Erosion and dilation
- Opening and closing
- Morphological gradient
Advanced:
- Histogram equalization
- Image transforms (FFT)
- Feature detection (corners, blobs)
- Image segmentation
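Histogram equalization, the first item in the advanced list above, can be sketched in a few lines. This is a pure-Python illustration that assumes an 8-bit grayscale image stored as a list of lists of integers; the function name `equalize_histogram` is hypothetical and this is not the notebook's code. It uses the usual mapping from the cumulative distribution function (CDF): each intensity `v` becomes `round((cdf(v) - cdf_min) * (levels - 1) / (total - cdf_min))`.

```python
def equalize_histogram(image, levels=256):
    """Spread pixel intensities so the cumulative histogram is roughly linear."""
    pixels = [p for row in image for p in row]
    total = len(pixels)

    # Histogram: how many pixels have each intensity.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution: number of pixels with intensity <= v.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is nothing to spread out.
        return [row[:] for row in image]

    # Lookup table mapping each old intensity to its equalized value.
    lut = [
        max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min)))
        for c in cdf
    ]
    return [[lut[p] for p in row] for row in image]


print(equalize_histogram([[50, 100], [150, 200]]))
# [[0, 85], [170, 255]]
```

In the example, four intensities clustered in the middle of the range are stretched across the full 0 to 255 range, which is the contrast gain equalization is meant to provide.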
File: 04_ml_comprehensive_exam.ipynb
Objective: Demonstrate comprehensive ML knowledge
Skills Demonstrated:
- Data preprocessing
- Model selection
- Hyperparameter tuning
- Cross-validation
- Performance evaluation
- Feature engineering
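Cross-validation, listed above, rests on one simple idea: split the sample indices into k folds and let each fold serve once as the held-out test set. A minimal standard-library sketch follows, assuming contiguous, unshuffled folds; the function name `k_fold_indices` is hypothetical, and in practice a library splitter (for example from scikit-learn) would normally be used.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    The first n_samples % k folds get one extra sample so that every
    sample lands in exactly one test fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


for train, test in k_fold_indices(10, 3):
    print(len(train), test)
```

A model would be fitted on `train` and scored on `test` in each iteration, and the k scores averaged to estimate generalization performance.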
- Unsupervised Learning - Clustering without labels
- Association Rules - Mining relationships
- Pattern Discovery - Finding hidden insights
- Dimensionality Reduction - PCA, t-SNE
- Loss Functions - Optimization objectives
- Backpropagation - Gradient computation
- Activation Functions - Non-linearity
- Optimization - SGD, Adam, RMSprop
- Spatial Domain - Direct pixel manipulation
- Frequency Domain - FFT transformations
- Feature Extraction - Corners, edges, textures
- Image Enhancement - Filters, equalization
- Model Evaluation - Accuracy, precision, recall
- Cross-Validation - K-fold validation
- Ensemble Methods - Bagging, boosting
- Feature Selection - Important variable identification
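The evaluation metrics named above (accuracy, precision, recall) follow directly from counting true positives, false positives, and false negatives. Here is a minimal standard-library sketch for the binary case; the function name `classification_metrics`, the `positive` parameter, and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision and recall for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    return {
        "accuracy": correct / len(pairs),
        # Of everything predicted positive, how much really was positive?
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Of everything actually positive, how much did the model find?
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
```

Here the model makes one false positive and one false negative, so precision and recall are both 2/3 while accuracy is 4/6.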
This repository demonstrates:
- Data Mining Expertise
  - Clustering algorithms
  - Association rule mining
  - Pattern discovery
  - Practical applications
- Deep Learning Theory
  - Loss function mathematics
  - Optimization principles
  - Activation function analysis
  - Training dynamics
- Image Processing
  - Complete CV pipeline
  - Filter design and application
  - Feature extraction
  - Advanced techniques
- ML Proficiency
  - End-to-end pipelines
  - Model evaluation
  - Best practices
  - Production readiness
Uzair Mubasher - BSAI Graduate
MIT License - see LICENSE
⭐ Star this repository if you found it helpful!