night-fury-me/machine-learning

Artificial Intelligence / Machine Learning / Deep Learning Topic List

Comprehensive list of Machine Learning, Deep Learning, and AI Topics

2. Fundamentals of AI

  • Search Algorithms
    • Depth-First Search (DFS)
    • Breadth-First Search (BFS)
    • Uniform Cost Search (UCS)
    • Iterative Deepening Depth-First Search (IDDFS)
    • Bidirectional Search
    • A* Search Algorithm
    • Greedy Best-First Search
    • Beam Search
    • Hill Climbing
    • Simulated Annealing
    • Genetic Algorithms
  • Heuristic Search
    • Admissible Heuristics
    • Consistent Heuristics
    • Informed vs. Uninformed Search
    • Manhattan Distance Heuristic
    • Euclidean Distance Heuristic
    • Pattern Database Heuristic
    • Null Heuristic
  • Adversarial Search (Minimax and Alpha-Beta)
    • Minimax Algorithm
    • Alpha-Beta Pruning
    • Negamax Algorithm
    • Principal Variation Search (PVS)
    • Aspiration Windows
  • Constraint Satisfaction Problems (CSP)
    • Backtracking
    • Forward Checking
    • Arc Consistency
    • Constraint Propagation
    • Min-Conflicts Algorithm
  • Game Theory
    • Nash Equilibrium
    • Zero-Sum Games
    • Non-Zero-Sum Games
    • Extensive-Form Games
    • Normal-Form Games
  • Knowledge Representation
    • Propositional Logic
    • First-Order Logic
    • Predicate Logic
    • Semantic Networks
    • Frames
    • Ontologies
  • Inference and Reasoning
    • Deductive Reasoning
    • Inductive Reasoning
    • Abductive Reasoning
    • Forward Chaining
    • Backward Chaining
    • Resolution
  • Planning and Decision Making
    • State-Space Representation
    • Action-Space Representation
    • STRIPS Representation
    • PDDL (Planning Domain Definition Language)
    • Markov Decision Processes (MDP)
    • Value Iteration
    • Policy Iteration

3. Machine Learning

  • Supervised Learning
    • Linear Regression
    • Logistic Regression
    • Decision Trees
    • Random Forest
    • Support Vector Machines (SVM)
    • K-Nearest Neighbors (KNN)
    • Gradient Boosting Machines
    • AdaBoost
    • XGBoost
    • LightGBM
    • CatBoost
  • Unsupervised Learning
    • K-Means Clustering
    • Hierarchical Clustering
    • DBSCAN
    • Principal Component Analysis (PCA)
    • Independent Component Analysis (ICA)
    • t-Distributed Stochastic Neighbor Embedding (t-SNE)
    • Autoencoders
    • Gaussian Mixture Models (GMM)
  • Semi-Supervised Learning
    • Self-Training
    • Co-Training
    • Multi-View Learning

4. Deep Learning

  • Feedforward Neural Networks (FNN)
  • Convolutional Neural Networks (CNN)
  • Recurrent Neural Networks (RNN)
  • Long Short-Term Memory (LSTM)
  • Gated Recurrent Unit (GRU)
  • Generative Adversarial Networks (GAN)
  • Variational Autoencoders (VAE)
  • Transformer Networks
  • Attention Mechanisms
  • Self-Supervised Learning Models

5. Reinforcement Learning

  • Q-Learning
  • Deep Q-Network (DQN)
  • SARSA
  • Policy Gradient Methods
  • Actor-Critic Methods
  • Proximal Policy Optimization (PPO)
  • Asynchronous Advantage Actor-Critic (A3C)
  • Soft Actor-Critic (SAC)
  • Trust Region Policy Optimization (TRPO)

6. Model Evaluation and Optimization Techniques

  • Cost Functions
    • Mean Squared Error (MSE)
    • Cross-Entropy Loss
    • Hinge Loss
    • Huber Loss
  • Optimizers
    • Gradient Descent
    • Stochastic Gradient Descent (SGD)
    • Adam Optimizer
    • RMSProp
    • AdaGrad
    • AdaDelta
  • Evaluation Metrics
    • Accuracy
    • Precision
    • Recall
    • F1-Score
    • ROC-AUC
    • Confusion Matrix
  • Model Validation Techniques
    • Cross-Validation
    • Bootstrapping
    • Train/Test Split
  • Hyperparameter Tuning
    • Grid Search
    • Random Search
    • Bayesian Optimization
  • Neural Architecture Search (NAS)
  • AutoML
  • Early Stopping
  • Regularization Techniques
    • L1 and L2 Regularization
    • Dropout
    • Batch Normalization
  • Model Quantization
    • Post-Training Quantization
      • Static Quantization
      • Dynamic Quantization
      • Quantization Types
      • Benefits and Trade-offs
      • Implementation Tools
    • Quantization Aware Training
      • Simulated Quantization
      • Fake Quantization
      • Quantization Noise Handling
      • Mixed Precision Training
      • Hardware Considerations
      • Framework Support
  • Model Pruning
    • Pruning Strategies
      • Magnitude-Based Pruning
      • Structured Pruning
      • Unstructured Pruning
      • Sensitivity-Based Pruning
    • Pruning Methods
      • Iterative Pruning and Retraining
      • One-Shot Pruning
      • Gradual Pruning
    • Pruning Frameworks
      • TensorFlow Model Optimization Toolkit
      • PyTorch Pruning Utilities (torch.nn.utils.prune)
      • ONNX Runtime Pruning
  • Knowledge Distillation
    • Teacher-Student Framework
      • Teacher Model
      • Student Model
    • Distillation Methods
      • Logit Matching
      • Soft Targets
      • Intermediate Layer Matching
      • Self-Distillation
    • Applications
      • Model Compression
      • Transfer Learning
      • Ensemble Distillation
    • Frameworks and Tools
      • Distiller (PyTorch)
      • TensorFlow Model Optimization Toolkit
  • Model Compression Techniques
    • Quantization
      • Bit-Width Reduction
      • Quantization Aware Training
      • Post-Training Quantization
    • Pruning
      • Weight Pruning
      • Neuron Pruning
      • Channel Pruning
    • Low-Rank Factorization
      • Matrix Decomposition
      • Tensor Decomposition
    • Knowledge Distillation
      • Teacher-Student Training
    • Weight Sharing
      • Hashing Methods
      • Vector Quantization
    • Model Sparsity
      • Sparse Representations
      • Sparse Training

7. Dataset Preparation Tools and Techniques

  • Data Collection and Labeling
    • Crowdsourcing
    • Synthetic Data Generation
      • Generative Models
        • Generative Adversarial Networks (GANs)
        • Variational Autoencoders (VAEs)
        • Autoregressive Models
      • Simulation Techniques
        • Physics-Based Simulation
        • Procedural Generation
        • Agent-Based Modeling
      • Data Transformation
        • Domain Adaptation
        • Style Transfer
        • Transfer Learning
      • Data Fusion
        • Multi-Modal Data Integration
        • Sensor Fusion Techniques
        • Feature Concatenation and Combination
    • Data Augmentation
      • Geometric Transformations
        • Rotation
        • Translation
        • Scaling
        • Shearing
      • Color and Intensity Adjustments
        • Brightness Adjustment
        • Contrast Adjustment
        • Color Shifting
        • Histogram Equalization
      • Spatial Transformations
        • Cropping
        • Padding
        • Random Erasing
      • Noise Injection
        • Gaussian Noise
        • Salt and Pepper Noise
        • Speckle Noise
      • Mixup and CutMix
        • Mixup
        • CutMix
      • Domain-Specific Techniques
        • Text Augmentation
        • Image Style Transfer
        • Audio Augmentation
  • Data Cleaning
    • Handling Missing Values
    • Outlier Detection and Treatment
    • Data Normalization and Standardization
  • Feature Engineering
    • Feature Extraction
    • Feature Selection
    • Dimensionality Reduction
      • Principal Component Analysis (PCA)
      • t-SNE
      • Linear Discriminant Analysis (LDA)
      • Kernel PCA
      • Non-negative Matrix Factorization (NMF)
      • Singular Value Decomposition (SVD)
  • Data Storage and Management
    • Data Warehouses
      • Relational Data Warehouses
        • Schema Design
        • Query Optimization
        • Indexing Strategies
        • Partitioning and Sharding
        • Tools: Amazon Redshift, Google BigQuery, Snowflake
      • NoSQL Data Warehouses
        • Document Stores (e.g., MongoDB)
        • Key-Value Stores (e.g., Redis)
        • Wide-Column Stores (e.g., Cassandra)
        • Graph Databases (e.g., Neo4j)
        • Tools: Apache Cassandra, Amazon DynamoDB, Apache HBase
    • Data Lakes
      • Architecture and Components
      • Data Ingestion Techniques
      • Data Governance and Security
      • Querying and Analysis
      • Tools: Apache Hadoop, Apache Spark, Amazon S3, Azure Data Lake Storage
    • Feature Stores
      • Feature Engineering and Versioning
      • Storage and Retrieval Mechanisms
      • Serving Layers
      • Integration with ML Pipelines
      • Real-time Feature Serving
      • Tools: Feast, Hopsworks, Tecton, Michelangelo Feature Store (Uber)

8. Error Analysis and Data Lifecycle in Production

  • Data and Concept Change in Production
    • Data Drift Detection
    • Concept Drift Detection
    • Model Drift Detection
    • Adapting to Data and Concept Changes
  • Data Validation
    • Data Quality Assessment
    • Schema Validation
    • Anomaly Detection
    • Data Consistency Checks
  • Feature Engineering Techniques
    • Feature Scaling
    • Feature Transformation
    • Feature Selection
    • Feature Importance Analysis
  • Pre-processing Data at Scale
    • Data Cleaning Pipelines
    • Data Imputation Strategies
    • Outlier Detection and Handling
    • Handling Missing Values
  • Data Journey Over a Production System’s Lifecycle
    • Data Collection
    • Data Storage
    • Data Pre-processing
    • Model Training
    • Model Deployment
  • Schema Development and Management
    • Schema Evolution Strategies
    • Schema Versioning
    • Data Schema Governance
  • Responsible Data: Security, Privacy, and Fairness
    • Data Encryption
    • Access Control
    • Bias Detection and Mitigation
    • Anonymization Techniques
      • Generalization
      • Suppression
      • Perturbation
      • Substitution
    • Pseudonymization Techniques
      • Tokenization
      • Hashing
      • Encryption
      • Masking
    • Privacy-Preserving Algorithms
      • Differential Privacy
      • Homomorphic Encryption
      • Secure Multi-Party Computation (SMPC)
    • Legal and Ethical Considerations
      • GDPR Compliance
      • HIPAA Compliance
      • Data Protection Regulations
    • Effectiveness and Risks
      • Risk of Re-identification
      • Trade-offs between Utility and Privacy
      • Evaluating Anonymization and Pseudonymization Methods
    • Implementation Strategies
      • Data Masking Tools and Libraries
      • Policy-Based Anonymization
      • Data Anonymization Pipelines
    • Anonymization in Different Domains
      • Healthcare Data Anonymization
      • Financial Data Anonymization
      • Social Media Data Anonymization
    • Auditing and Verification
      • Anonymization Impact Assessments
      • Anonymization Audits
      • Verification Techniques for Anonymized Data
  • Continuous Evaluation and Monitoring
    • Model Performance Monitoring
    • Data Drift Monitoring
    • Concept Drift Monitoring
    • Model Fairness Monitoring

9. Machine Learning Project Lifecycle and Deployment

  • Model Deployment
    • Model Serving Infrastructure
    • Deployment Options
    • Model Serving Patterns and Architecture
      • Microservices Architecture
      • Serverless Architecture
      • Model-as-a-Service (MaaS)
    • Scaling Infrastructure
      • Horizontal Scaling
      • Vertical Scaling
      • Cloud Infrastructure
  • Improving Prediction Latency and Reducing Resource Costs
    • Batch Inference
    • Online Inference
    • Auto Scaling
    • Kubernetes and Kubeflow
  • MLOps
    • Experiment Tracking
      • Experiment Logging
      • Hyperparameter Tracking
      • Experiment Reproducibility
      • Experiment Visualization
      • Tools and Technologies: TensorBoard, MLflow, Neptune, Weights & Biases (wandb)
    • Continuous Delivery
      • Automated Model Deployment
      • Continuous Integration Pipelines
      • Model Versioning in Continuous Delivery
      • Automated Testing for ML Models
      • Tools and Technologies: Kubeflow, ArgoCD, Jenkins, GitLab CI/CD
    • Model Versioning
      • Version Control Systems for ML Models
      • Model Versioning Best Practices
      • Model Registry
      • Model Lineage Tracking
      • Tools and Technologies: DVC (Data Version Control), MLflow Model Registry, ModelDB, Verta
    • ML Model Management
      • Model Catalogs
      • Model Governance
      • Model Serving Infrastructure
      • Model Lifecycle Management
      • Tools and Technologies: Seldon Core, TFX (TensorFlow Extended), ModelDB, MLflow Model Registry
    • Workflow Automation
      • Workflow Orchestration
      • Pipeline Automation
      • Task Scheduling
      • Automated Data Pipelines
      • Tools and Technologies: Airflow, Prefect, Luigi, Apache Beam
    • Monitoring and Logging
      • Model Performance Monitoring
      • Data Drift Detection
      • Model Health Monitoring
      • Log Aggregation
      • Tools and Technologies: Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana), DataDog
    • Observability in ML
      • Model Explainability
      • Interpretability Metrics
      • Feature Importance Tracking
      • Model Debugging Tools
      • Tools and Technologies: SHAP (SHapley Additive exPlanations), LIME (Local Interpretable Model-agnostic Explanations), Alibi, Captum
    • Addressing Model Decay
      • Model Retraining Strategies
      • Scheduled Retraining
      • Data Quality Monitoring
      • Model Performance Degradation Detection
      • Tools and Technologies: MLflow Model Registry with Scheduled Retraining, TFX (TensorFlow Extended) with Continuous Training, Monitoring and Alerting Systems for Data Drift Detection, Model Retraining Orchestration with Airflow or Kubeflow Pipelines

10. Miscellaneous

  • Model Explainability Techniques
    • Feature Importance Methods
      • Model-Agnostic Methods
        • LIME (Local Interpretable Model-agnostic Explanations)
        • SHAP (SHapley Additive exPlanations)
        • Permutation Feature Importance
      • Model-Specific Methods
        • Decision Trees
        • Rule-Based Models
        • Linear Models
    • Interpretability Visualization Tools
      • Partial Dependence Plots (PDP)
      • Individual Conditional Expectation (ICE) Plots
      • Accumulated Local Effects (ALE) Plots
      • Decision Tree Visualization
    • Explanation Generation Methods
      • Textual Explanations
      • Visual Explanations
      • Interactive Explanations
    • Model Debugging Techniques
      • Error Analysis
      • Sensitivity Analysis
      • Gradient-Based Techniques
    • Interpretability Metrics
      • Quantitative Metrics for Interpretability
      • Comparison Metrics for Interpretability Techniques
      • Domain-Specific Interpretability Metrics
  • Federated Learning
    • Federated Learning Basics
      • Introduction to Federated Learning
      • Federated Averaging Algorithm
      • Secure Aggregation Techniques
    • Privacy-Preserving Techniques
      • Differential Privacy in Federated Learning
      • Federated Learning with Homomorphic Encryption
      • Federated Learning with Secure Multi-Party Computation (SMPC)
    • Federated Learning Architectures
      • Client-Server Architecture
      • Peer-to-Peer (P2P) Architecture
      • Hybrid Architectures
    • Optimization Techniques
      • Communication-Efficient Algorithms
      • Model Compression for Federated Learning
      • Federated Learning with Gradient Compression
    • Scalability and Efficiency
      • Scalability Challenges in Federated Learning
      • Resource Efficiency in Federated Learning
      • Federated Learning with Heterogeneous Data Sources
  • Distributed Learning
    • Distributed Learning Frameworks
      • TensorFlow Distributed
      • PyTorch Distributed
      • Horovod
      • Apache Spark MLlib
    • Distributed Optimization Algorithms
      • Stochastic Gradient Descent (SGD)
      • Federated Averaging
      • ADMM (Alternating Direction Method of Multipliers)
      • HogWild!
    • Communication Protocols
      • Parameter Server Architectures
      • Ring AllReduce
      • Gradient Centralization
      • Asynchronous Communication
    • Fault Tolerance and Robustness
      • Handling Node Failures
      • Recovery Strategies
      • Robust Distributed Learning Algorithms
    • Scaling to Large Datasets
      • Data Parallelism Techniques
      • Model Parallelism Techniques
      • Pipeline Parallelism Techniques
      • Sharding and Partitioning Strategies
  • Evolutionary Reinforcement Learning (ERL)
    • Genetic Algorithms in Reinforcement Learning
    • Evolution Strategies
    • Neuroevolution
    • Population-Based Methods
    • Competitive Coevolution
  • High-Performance Modeling
    • Efficient Model Architectures
    • Model Compression Techniques
    • Low-Precision Computing
    • Hardware Acceleration (e.g., GPUs, TPUs)
    • Model Parallelism and Distributed Training
  • Model Performance Analysis
    • Evaluation Metrics for Model Performance
    • Performance Profiling Tools
    • Model Benchmarking
    • Performance Optimization Strategies
    • Scalability Analysis
  • Sensitivity Analysis and Adversarial Attacks
    • Sensitivity Analysis Techniques
    • Adversarial Attack Methods
    • Robustness Verification
    • Adversarial Defense Strategies
    • Transferability Analysis
  • High-Performance Ingestion
    • Data Ingestion Pipelines
    • Real-Time Data Ingestion
    • Stream Processing Frameworks
    • Data Preprocessing for High-Volume Data
    • Scalable Data Storage Solutions
  • Training Large Models
    • Model Parallelism Techniques
    • Data Parallelism Strategies
    • Distributed Training Architectures
    • Gradient Accumulation Techniques
    • Memory Optimization for Large Models
  • Distributed Training
    • Distributed Training Frameworks
    • Communication Protocols
    • Synchronous vs. Asynchronous Training
    • Fault Tolerance Mechanisms
    • Scalability Challenges and Solutions
  • Teacher and Student Networks
    • Knowledge Distillation Methods
    • Model Compression Techniques
    • Transfer Learning Strategies
    • Curriculum Learning Approaches
    • Self-Distillation Techniques
  • Knowledge Distillation Techniques
    • Logit Matching
    • Soft Targets
    • Intermediate Layer Matching
    • Self-Distillation
    • Ensemble Distillation
  • Adversarial Training
    • Adversarial Training Methods
    • Adversarial Robustness Regularization
    • Label Smoothing Techniques
    • Defense-GANs
    • Adversarial Training in GANs
  • Residual Analysis
    • Residual-Based Model Interpretation
    • Residuals in Regression Analysis
    • Residual Analysis for Outlier Detection
    • Diagnostic Plots for Residual Analysis
    • Model Improvement Using Residual Analysis
  • Carbon Footprinting and Environmental Impact
    • Environmental Impact Assessment of ML/DL Training
    • Carbon Emission Estimation in Training Workflows
    • Energy Efficiency Optimization Techniques
    • Green Computing Strategies
    • Sustainable AI Frameworks and Practices

About

Machine Learning Algorithm implementations.
