# Machine Learning | Statistical Learning | Artificial Intelligence

- **The LION Way - Machine Learning *plus* Intelligent Optimization**
    1. Introduction
        - Lazy learning: nearest neighbors
        - Learning requires a method
    2. Supervised learning
        - Linear models
        - mastering generalized linear least-squares
        - Rules, decision trees, and forests
        - ranking and selecting features
        - specific nonlinear models
        - neural networks, shallow and deep
        - statistical learning theory and support vector machines (SVM)
        - democracy in machine learning
    3. Unsupervised learning and clustering
        - top-down clustering: K-means
        - bottom-up (agglomerative) clustering
        - self-organizing maps
        - dimensionality reduction by linear transformations (projections)
        - visualizing graphs and networks by nonlinear maps
        - semi-supervised learning
    4. Optimization: the source of power
        - automated improvements by local steps
        - local search and reactive search optimization (RSO)
        - continuous and cooperative reactive search optimization (CoRSO)
        - Multi-Objective Reactive Search Optimization (MORSO)
    5. Selected Applications
        - Text and web mining
        - Collaborative filtering and recommendation
- **Soft Computing - Techniques and its Applications in Electrical Engineering**
    1. Introduction to Soft Computing
        - fuzzy logic
        - artificial neural networks
        - introduction to evolutionary algorithms
        - hybrid intelligent systems
    2. Life History of Brain
    3. Artificial Neural Network and Supervised Learning
        - comparison of neural techniques and artificial intelligence
        - artificial neuron structure
        - adaline
        - ANN learning
        - back-propagation learning
        - properties of neural networks
        - limitations in the use of neural networks
    4. Factors Affecting the Performance of Artificial Neural Network Models
        - network complexity
            - neuron complexity
            - number of layers
            - number of neurons in each layer
            - type and number of interconnecting weights
        - problem complexity
            - range of normalization of training data
            - type of functional mapping
            - sequence of presentation of training data
            - repetition of data in the training set
            - permissible noise in data
        - learning complexity
            - training algorithms of ANN
            - selection of error functions
            - mode of error calculation
    5. Development of Generalized Neuron and its validation
        - existing neuron model
        - development of a generalized neuron (GN) model
        - advantages of GN
        - learning algorithms of a summation type generalized neuron
        - benchmark testing of generalized neuron model
    6. Applications of Generalized Neuron Models
    7. Introduction to Fuzzy Set Theoretic Approach
        - uncertainty and information
        - types of uncertainty
        - introduction of fuzzy logic
        - historical development of fuzzy logic
        - difference between precision and significance
        - fuzzy set
        - operations of fuzzy sets
        - characteristics of fuzzy sets
        - properties of fuzzy sets
        - fuzzy cartesian product
        - various shapes of fuzzy membership functions
        - methods of defining membership functions
        - fuzzy compositional operators
    8. Applications of Fuzzy Rule Based System
    9. Genetic Algorithms
        - history of genetics
        - selection
        - crossover
        - mutation
        - survival of fittest
        - population size
        - evaluation of fitness function
        - effect of crossover probability on GA performance
        - effect of mutation probability on GA performance
    10. Applications of Genetic Algorithms to Load Forecasting Problem
    11. Synergism of Genetic Algorithms and Fuzzy Systems for Power System Applications
    12. Integration of Neural Networks and Fuzzy Systems
    13. ANN-GA-Fuzzy Synergism and its Applications
- **Machine Learning for Computer Vision**
    1. **Throwing down the visual intelligence gauntlet**
        - artificial intelligence: are we there yet?
            - a compass for the uncharted journey toward intelligence
        - the neuromorphic approach to visual intelligence
            - anatomy and physiology of the primary visual cortex (V1)
            - Hubel-Wiesel Models: Successive Tuning and Pooling
            - Consistency with Experimental results and Multiple levels
            - From neuroscience models to engineering applications
        - what's next in the quest for visual intelligence?
            - going beyond "what is where"
            - from perception to abstraction
            - a loose hierarchy of visual tasks
            - it's time to try again - the MIT Intelligence initiative
    2. actionable information in vision
        - Preliminaries
            - notation and conventions
            - visibility and quantization
            - invariant and sufficient statistics
        - Placing the Ecological approach to visual perception onto computational grounds
            - actionable information
            - invertible and non-invertible nuisances
            - the actionable information gap
            - information pickup
        - representational structures
            - computing actionable information
            - computing actionable information gap
        - empirical consequences of the definitions
            - exploration via information pickup
    3. learning binary hash codes for large-scale image search
        - search algorithms for binary codes
        - supervised methods for learning binary projections
        - unsupervised methods for defining binary projections
    4. Bayesian painting by number: flexible priors for colour-invariant object recognition
        - the colour-invariant admixture models
        - using stels for supervised learning tasks
    5. real-time human pose recognition in parts from single depth images
    6. scale-invariant vote-based 3D recognition and registration from point clouds
    7. Multiple classifier boosting and tree-structured classifiers
    8. Simultaneous detection and tracking with multiple cameras
    9. Applications of computer vision to vehicles: an extreme test
- **Data Fusion for sensory information processing systems**
    1. Introduction: the role of data fusion in sensory systems
        - information acquisition: inverting the world-image mapping
        - the need for constraints
        - determination and embedding of constraints
        - the need for data fusion
    2. Bayesian Sensory Information Processing
        - bayes rule
        - the image formation model
        - the priors
        - bayesian estimators for $\vec{f}$
        - bayesian detection and extraction systems
        - the bayesian controversy
    3. Information processing using energy function minimization
        - markov random fields
        - energy functions with matching elements
        - statistical mechanics and mean field theory
        - the form of the smoothness constraint
        - alternative forms of constraint
    4. weakly vs. strongly coupled data fusion: a classification of fusional methods
        - a classification of fusional methods
        - weakly coupled data fusion
        - strongly coupled data fusion algorithms
        - bayesian implementation of data fusion
        - examples of weakly coupled fusion in the vision literature
        - examples of strongly coupled fusion in the vision literature
    5. data fusion applied to feature based stereo algorithms
        - the bayesian approach to stereo vision
        - statistical mechanics and mean field theory
        - comparison with other theories
        - comparisons with psychophysical data
    6. Fusing Binocular and monocular depth cues
        - strong fusion - stereo with monocular cues
        - previous attempts at strong coupling for stereo
    7. data fusion in shape from shading algorithms
        - an algebraic approach to fusing specular and lambertian reflectance data
        - a class III weakly coupled fusion implementation
        - a strongly coupled approach to polychromatic shape from shading
        - fusion of image formation models
    8. temporal aspects of data fusion
        - temporal coherence edge detector
        - a strongly coupled temporal coherence edge detector
        - temporal sampling
        - active determination of constraints
    9. Towards a constraint based theory of sensory data fusion
- **Handbook of Geometric Computing**
    1. Neuroscience
        - spatiotemporal dynamics of visual perception across neural maps and pathways
        - symmetry, features and information
    2. Neural Networks
        - geometric approach to multilayer perceptrons
        - a lattice algebraic approach to neural computation
        - eigenproblems in pattern recognition
    3. Image Processing
        - geometric framework for image processing
        - geometric filters, diffusion flows, and kernels in image processing
        - chaos-based image encryption
    4. Computer Vision
        - one-dimensional retinae vision
        - three-dimensional geometric computer vision
        - Dynamic $\mathcal{P}^n$ to $\mathcal{P}^n$ Alignment
        - detecting independent 3D movement
    5. Perception and Action
        - robot perception and action using conformal geometric algebra
    6. Uncertainty in Geometric Computations
        - uncertainty modeling and geometric inference
        - uncertainty and projective geometry
        - the tensor voting framework
    7. Computer Graphics and Visualization
        - methods for nonrigid image registration
        - the design of implicit functions for computer graphics
    8. Geometry and Robotics
        - grassmann-cayley algebra and robotics applications
        - clifford algebra and robot dynamics
        - geometric methods for multirobot optimal motion planning
    9. Reaching and Motion Planning
        - the computation of reachable surfaces for a specified set of spatial displacements
        - planning collision-free paths using probabilistic roadmaps
- **Neural Networks and Statistical Learning**
    1. Introduction
    2. Fundamentals of Machine Learning
        - learning methods
        - learning and generalization
            - generalization error
            - generalization by stopping criterion
            - generalization by regularization
            - fault tolerance and generalization
            - sparsity versus stability
        - model selection
        - bias and variance
        - robust learning
        - neural network processors
        - criterion functions
        - computational learning theory
            - vapnik-chervonenkis dimension
            - empirical risk-minimization principle
            - probably approximately correct learning
        - no-free-lunch theorem
        - neural networks as universal machines
            - boolean function approximation
            - linear separability and nonlinear separability
            - continuous function approximation
            - winner-takes-all
        - compressed sensing and sparse approximation
    3. Perceptrons
        - one-neuron perceptron
        - single-layer perceptron
        - perceptron learning algorithm
        - least-mean squares (LMS) algorithm
        - P-delta rule
    4. Multilayer Perceptrons: Architecture and Error Backpropagation
        - universal approximation 
        - backpropagation learning algorithm
        - incremental learning versus batch learning
        - activation functions for the output layer
        - optimizing network structure
            - network pruning using sensitivity analysis
            - network pruning using regularization
            - network growing
        - speeding up learning process
            - eliminating premature saturation
            - adapting learning parameters
            - initializing weights
            - adapting activation function
        - some improved back propagation algorithms
            - BP with global descent
            - robust BP algorithms
        - resilient propagation (RProp)
    5. Multilayer Perceptrons: other learning techniques
        - introduction to second-order learning methods
        - Newton's methods
            - gauss-newton method
            - levenberg-Marquardt method
        - quasi-newton methods
            - BFGS method
            - one-step secant method
        - conjugate-gradient methods
        - extended kalman filtering methods
        - recursive least squares
        - natural-gradient descent method
        - escaping local minima
        - complex-valued MLPs and their learning
    6. Hopfield networks, simulated annealing and chaotic neural networks
        - hopfield model
        - continuous-time hopfield network
        - simulated annealing
        - hopfield networks for optimization
            - combinatorial optimization problems
            - escaping local minima for combinatorial optimization problems
        - chaos and chaotic neural networks
            - chaos, bifurcation, and fractals
            - chaotic neural networks
        - multistate hopfield networks
        - cellular neural networks
    7. associative memory networks
        - hopfield model: storage and retrieval
            - generalized hebbian rule
            - pseudoinverse rule
            - perceptron-type learning rule
            - retrieval stage
        - storage capability of the hopfield model
        - increasing storage capacity
        - multistate hopfield networks for associative memory
        - multilayer perceptrons as associative memories
        - hamming network
        - bidirectional associative memories
        - cohen-grossberg model
        - cellular networks
    8. Clustering I: Basic Clustering models and algorithms
        - introduction
            - vector quantization
            - competitive learning
        - self-organizing maps
            - kohonen network
            - basic self-organizing maps
        - learning vector quantization
        - nearest neighbor algorithms
        - neural gas
        - ART networks
        - C-means clustering
        - subtractive clustering
        - fuzzy clustering
    9. Clustering II: Topics in Clustering
        - the underutilization problem
            - competitive learning with conscience
            - rival penalized competitive learning
            - soft competitive learning
        - robust clustering
            - possibilistic C-means
            - a unified framework for robust clustering
        - supervised clustering
        - clustering using non-euclidean distance measures
        - partitional, hierarchical, and density-based clustering
        - hierarchical clustering
        - constructive clustering techniques
        - cluster validity
        - projected clustering
        - spectral clustering
        - Coclustering
        - Handling Qualitative data
    10. Radial Basis Function Networks
        - radial basis functions
        - learning rbf centers
        - learning the weights
        - RBF network learning using orthogonal least-squares
        - supervised learning of all parameters
        - various learning methods
        - normalized RBF networks
        - optimizing network structure
        - complex RBF networks
        - comparison of RBF networks and MLPs
    11. Recurrent Neural Networks
        - fully connected recurrent networks
        - time-delay neural networks
        - backpropagation for temporal learning
        - RBF networks for modeling dynamic systems
        - some recurrent models
        - reservoir computing
    12. Principal Component Analysis
        - Introduction
            - Hebbian learning rule
            - Oja's Learning rule
        - PCA: conception and model
        - Hebbian Rule-Based PCA
            - subspace learning algorithms
            - generalized hebbian algorithm
        - least mean squared error-based PCA
        - anti-Hebbian Rule-Based PCA
        - Nonlinear PCA
        - Minor Component Analysis
        - Constrained PCA
        - Localized PCA, incremental PCA, and supervised PCA
        - complex-valued PCA
        - two-dimensional PCA
        - generalized eigenvalue decomposition
        - singular value decomposition
        - canonical correlation analysis
    13. Nonnegative Matrix Factorization
    14. Independent Component analysis
    15. discriminant analysis
    16. support vector machines
        - SVM Model
        - Solving the Quadratic Programming Problem
            - chunking 
            - decomposition
            - convergence of decomposition methods
        - least-squares SVMs
        - SVM training methods
        - pruning SVMs
        - multiclass SVMs
        - support vector regression
        - Support vector clustering
        - distributed and parallel SVMs
        - SVMs for active, transductive and semi-supervised learning
        - probabilistic approach to SVM
    17. other kernel methods
        - kernel PCA
        - kernel LDA
        - kernel clustering
        - kernel autoassociators, kernel CCA and Kernel ICA
    18. reinforcement learning
        - learning through awards
        - actor-critic model
        - model-free and model-based reinforcement learning
        - temporal-difference learning
        - Q-learning
        - learning automata
    19. probabilistic and bayesian networks
        - bayesian network model
        - learning bayesian networks
            - learning the structure
            - learning the parameters
            - constraint-handling
        - bayesian network inference
            - belief propagation
            - factor graphs and the belief propagation algorithm
        - sampling (monte carlo) methods
            - gibbs sampling
        - variational bayesian methods
        - hidden markov models
        - dynamic bayesian networks
        - expectation-maximization algorithm
        - mixture models
        - bayesian approach to neural network learning
        - boltzmann machines
            - boltzmann learning algorithm
            - mean-field-theory machine
            - stochastic hopfield networks
        - training deep networks
    20. combining multiple learners: data fusion and ensemble learning
        - boosting
        - bagging
        - random forests
        - topics in ensemble learning
        - solving multiclass classification
        - dempster-shafer theory of evidence
    21. introduction to fuzzy sets and logic
    22. neurofuzzy systems
        - rule extraction from trained neural networks
        - extracting rules from numerical data
        - synergy of fuzzy logic and neural networks
    23. Neural circuits and parallel implementation
    24. pattern recognition for biometrics and bioinformatics
    25. data mining
- **Pattern Classification**
    1. Introduction
        - Machine perception
        - pattern recognition systems
            - sensing 
            - segmentation and grouping
            - feature extraction
            - classification
            - post processing
        - the design cycle
            - data collection
            - feature choice
            - model choice
            - training
            - evaluation
            - computational complexity
        - learning and adaptation
            - supervised learning
            - unsupervised learning
            - reinforcement learning
    2. Bayesian Decision Theory
        - Bayesian decision theory - continuous features
            - two-category classification
        - minimum-error-rate classification
            - minimax criterion
            - neyman-pearson criterion
        - classifiers, discriminant functions and decision surfaces
            - the multicategory case
            - the two-category case
        - the normal density
            - univariate density
            - multivariate density
        - discriminant functions for the normal density
        - error probabilities and integrals
        - error bounds for normal densities
        - bayes decision theory - discrete features
        - missing and noisy features
        - bayesian belief networks
        - compound bayesian decision theory and context
    3. Maximum-Likelihood and Bayesian Parameter Estimation
        - maximum-likelihood estimation
        - bayesian estimation
        - bayesian parameter estimation: Gaussian Case
        - Bayesian parameter estimation: General theory
        - sufficient statistics
        - problems of dimensionality
        - component analysis and discriminants
        - expectation-maximization (EM)
        - hidden markov models
    4. Nonparametric techniques
        - density estimation
        - parzen windows
        - $k_n$-Nearest-neighbor estimation
        - the nearest neighbor rule
        - metrics and nearest-neighbor classification
        - fuzzy classification
        - reduced coulomb energy networks
        - approximations by series expansions
    5. Linear Discriminant Functions
        - linear discriminant functions and decision surfaces
        - generalized linear discriminant functions
        - the two-category linearly separable case
        - minimizing the perceptron criterion functions
        - relaxation procedures
        - nonseparable behavior
        - minimum squared-error procedures
        - the ho-kashyap procedures
        - linear programming algorithms
        - support vector machines
        - multicategory generalizations
    6. Multilayer neural networks
        - Feedforward operation and classification
        - backpropagation algorithm
        - error surfaces
        - backpropagation as feature mapping
        - backpropagation, bayes theory and probability 
        - related statistical techniques
        - practical techniques for improving backpropagation
            - activation function
            - parameters for the sigmoid
            - scaling input
            - target values
            - training with noise
            - manufacturing data
            - number of hidden units
            - initializing weights
            - learning rates
            - momentum 
            - weight decay
            - hints
            - on-line, stochastic or batch training
            - stopped training
            - number of hidden layers
            - criterion function
        - second-order methods
            - Hessian Matrix
            - Newton's method
            - Quickprop
            - conjugate gradient descent
        - additional networks and training methods
            - radial basis function networks (RBFs)
            - special bases
            - matched filters
            - convolutional networks
            - recurrent networks
            - cascade-correlation
        - regularization, complexity adjustment and pruning
    7. Stochastic methods
        - stochastic search
            - simulated annealing
            - the boltzmann factor
            - deterministic simulated annealing
        - boltzmann learning
            - stochastic boltzmann learning of visible states
            - missing features and category constraints
            - deterministic boltzmann learning
            - initialization and setting parameters
        - boltzmann networks and graphical models
        - evolutionary methods
            - genetic algorithms
            - further heuristics
            - why do they work?
        - genetic programming
    8. Nonmetric methods
        - Decision trees
        - CART
        - Other tree methods
        - Recognition with strings
        - grammatical methods
        - grammatical inference
        - rule-based methods
    9. Algorithm-Independent Machine Learning
        - lack of inherent superiority of any classifier
        - bias and variance
        - resampling for estimating statistics
        - resampling for classifier design
        - estimating and comparing classifiers
        - combining classifiers
    10. Unsupervised Learning and Clustering
        - mixture densities and identifiability
        - maximum-likelihood estimates
        - application to normal mixtures
        - unsupervised bayesian learning
        - data description and clustering
        - criterion functions for clustering
        - iterative optimization
        - hierarchical clustering
        - the problem of validity
        - on-line clustering
        - graph-theoretic methods
        - component analysis
        - low-dimensional representation and multidimensional scaling (MDS)
    11. Mathematical Foundations
        - Linear Algebra
            - notation and preliminaries
            - inner product
            - outer product
            - derivatives of matrices
            - determinant and trace
            - matrix inversion
            - eigenvectors and eigenvalues
        - Lagrange optimization
        - Probability theory
            - discrete random variables
            - expected values
            - pairs of discrete random variables
            - statistical independence
            - expected values of functions of two variables
            - conditional probability
            - the law of total probability and Bayes' rule
            - vector random variables
            - expectations, mean vectors and covariance matrices
            - continuous random variables
            - distributions of sums of independent random variables
            - normal distribution
        - Gaussian derivatives and integrals
        - hypothesis testing
        - information theory
        - computational complexity
- **Computational intelligence - an introduction**
    1. Introduction
        - computational intelligence paradigms
            - artificial neural networks
            - evolutionary computation
            - swarm intelligence
            - artificial immune systems
            - fuzzy systems
    2. Artificial Neural Networks
        - The Artificial Neuron
            - calculating the net input signal 
            - activation functions
            - artificial neuron geometry
            - artificial neuron learning
                - augmented vectors
                - gradient descent learning rule
                - widrow-hoff learning rule
                - generalized delta learning rule
                - error-correction learning rule
        - Supervised Learning Neural Networks
            - neural network types
                - feedforward neural networks
                - functional link neural networks
                - product unit neural networks
                - simple recurrent neural networks
                - time-delay neural networks
                - cascade networks
            - supervised learning rule
                - the supervised learning problem
                - gradient descent optimization
                - scaled conjugate gradient
                - leapFrog optimization
                - particle swarm optimization
            - functioning of hidden units
            - ensemble neural networks
        - Unsupervised Learning Neural Networks
            - Hebbian learning rule
            - principal component learning rule
            - learning vector quantizer-I
            - self-organizing feature maps
                - stochastic training rule
                - batch map
                - growing SOM
                - improving convergence speed
                - clustering and visualization
        - Radial Basis Function Networks
            - learning vector quantizer-II
            - radial basis function neural networks
                - radial basis function network architecture
                - radial basis functions
                - training algorithms
                - radial basis function network variations
        - Reinforcement Learning
            - learning through awards
            - model-free reinforcement learning model
                - temporal difference learning
                - Q-learning
            - Neural networks and reinforcement learning
                - RPROP
                - Gradient descent reinforcement learning
                - connectionist Q-learning
        - Performance Issues (supervised Learning)
            - performance measures
                - accuracy
                - complexity
                - convergence
            - analysis of performance
            - performance factors
                - data preparation
                - weight initialization
                - learning rate and momentum
                - optimization method
                - architecture selection
                - adaptive activation functions
                - active learning
    3. Evolutionary Computation
        - introduction to evolutionary computation
            - generic evolutionary algorithm
            - representation - the chromosome
            - initial population
            - fitness function
            - selection
                - selective pressure
                - random selection
                - proportional selection
                - tournament selection
                - rank-based selection
                - boltzmann selection
            - reproduction operators
            - stopping conditions
            - evolutionary computation versus classical optimization
        - Genetic algorithms
            - canonical genetic algorithm
            - crossover
            - mutation
            - control parameters
            - genetic algorithm variants
                - generation gap methods
                - messy genetic algorithms
                - interactive evolution
                - island genetic algorithms
            - advanced topics
                - niching genetic algorithms
                - constraint handling
                - multi-objective optimization
                - dynamic environments
        - Genetic Programming
            - tree-based representation
            - initial population
            - fitness function
            - crossover operators
            - mutation operators
            - building block genetic programming
        - Evolutionary programming
            - basic evolutionary programming
            - evolutionary programming operators
                - mutation operators
                - selection operators
            - strategy parameters
                - static strategy parameters
                - dynamic strategies 
                - self-adaptation
            - evolutionary programming implementations
                - classical evolutionary programming
                - fast evolutionary programming
                - exponential evolutionary programming
                - accelerated evolutionary programming
                - momentum evolutionary programming
                - evolutionary programming with local search
                - evolutionary programming with extinction 
                - hybrid with particle swarm optimization
            - advanced topics
                - constraint handling approaches
                - multi-objective optimization and niching
                - dynamic environments
            - Applications
                - finite-state machines
                - function optimization
                - training neural networks
        - evolution strategies
            - (1+1)-ES
            - Generic evolution strategy algorithm
            - strategy parameters and self-adaptation
            - evolutionary strategy operators
                - selection operators
                - crossover operators
                - mutation operators
            - evolution strategy variants
                - polar evolution strategies
                - evolution strategies with directed variation
                - incremental evolution strategies
                - surrogate evolution strategy
            - advanced topics
                - constraint handling approaches
                - multi-objective optimization
                - dynamic and noisy environments
                - niching
        - differential evolution
            - basic differential evolution
                - difference vectors
                - mutation
                - crossover
                - selection
                - general differential evolution algorithm
                - control parameters
                - geometrical illustration
            - DE/x/y/z
            - Variations to basic differential evolution
                - hybrid differential evolution strategies
                - population-based differential evolution
                - self-adaptive differential evolution
            - differential evolution for discrete-valued problems
        - cultural algorithms
            - culture and artificial culture
            - basic cultural algorithm
            - belief space
            - fuzzy cultural algorithms
        - coevolution
            - coevolution types
            - competitive coevolution
                - competitive fitness
                - generic competitive coevolutionary algorithm
            - cooperative coevolution
    4. Computational Swarm Intelligence
        - particle swarm optimization
            - global best PSO
            - local best PSO
            - velocity components
            - architecture selection
        - ant algorithms
            - foraging behavior of ants
            - stigmergy and artificial pheromone
            - simple ant colony optimization
            - ant system
            - ant colony systems
            - division of labor
            - task allocation based on response thresholds
            - adaptive task allocation and specialization
    5. Artificial Immune Systems
        - natural immune system
        - Artificial immune models
    6. Fuzzy systems
        - fuzzy sets
        - fuzzy logic and reasoning
        - fuzzy controllers
        - rough sets
        
- **The Elements of Statistical Learning**
    1. Introduction
    2. Overview of Supervised Learning
        - variable types and terminology
        - two simple approaches to prediction: least squares and nearest neighbors
        - statistical decision theory 
        - local methods in high dimensions
        - statistical models, supervised learning and function approximation 
        - structured regression models
        - classes of restricted estimators
        - model selection and the bias-variance tradeoff
    3. Linear Methods for Regression
        - linear regression models and least squares
        - subset selection
        - shrinkage methods
        - methods using derived input directions
    4. Linear Methods for Classification
        - linear regression of an indicator matrix
        - linear discriminant analysis
        - logistic regression
        - separating hyperplanes
    5. Basis Expansions and Regularization
        - piecewise polynomials and splines
        - filtering and feature extraction
        - smoothing splines
        - automatic selection of the smoothing parameters
        - nonparametric logistic regression
        - multidimensional splines
        - regularization and reproducing kernel hilbert spaces
        - wavelet smoothing
    6. Kernel Smoothing Methods
        - one-dimensional kernel smoothers
        - selecting the width of the kernel 
        - local regression in $\mathbb{R}^p$
        - structured local regression models in $\mathbb{R}^p$
        - local likelihood and other models
        - kernel density estimation and classification
        - radial basis functions and kernels
        - mixture models for density estimation and classification
    7. Model Assessment and Selection
        - bias, variance and model complexity
        - the bias-variance decomposition
        - optimism of the training error rate
        - estimates of in-sample prediction error
        - the effective number of parameters
        - the bayesian approach and BIC
        - minimum description length
        - Vapnik-Chervonenkis dimension
        - cross-validation
        - bootstrap methods
    8. Model Inference and Averaging
        - the bootstrap and maximum likelihood methods
        - bayesian methods
        - relationship between the bootstrap and bayesian inference
        - the EM algorithm
        - MCMC for sampling from the posterior
        - Bagging
        - Model averaging and stacking
        - stochastic search: bumping
    9. Additive Models, Trees, and Related Methods
        - generalized additive models
        - tree-based methods
        - PRIM: bump hunting
        - MARS: multivariate adaptive regression splines
        - hierarchical mixture of experts
        - missing data
    10. Boosting and Additive Trees
        - boosting methods
        - boosting fits an additive model
        - forward stagewise additive modeling
        - exponential loss and AdaBoost
        - why exponential loss?
        - loss functions and robustness
        - "Off-the-shelf" procedures for data mining
        - boosting trees
        - numerical optimization via gradient boosting
        - right-sized trees for boosting
        - regularization
        - interpretation
    11. Neural Networks
        - projection pursuit regression
        - fitting neural networks
        - some issues in training neural networks
            - starting values
            - overfitting
            - scaling of the inputs
            - number of hidden units and layers
            - multiple minima
    12. Support Vector Machines and Flexible Discriminants
        - the support vector classifier
        - support vector machines and kernels
            - computing the SVM for classification
            - the SVM as a penalization method
            - function estimation and reproducing kernels
            - SVMs and the curse of dimensionality
            - a path algorithm for the SVM classifier
            - support vector machines for regression
            - regression and kernels
        - Generalizing linear discriminant analysis
        - flexible discriminant analysis
        - penalized discriminant analysis
        - mixture discriminant analysis
    13. Prototype Methods and Nearest-Neighbors
        - prototype methods
            - K-means clustering
            - learning vector quantization
            - gaussian mixtures
        - k-Nearest-Neighbors classifiers
        - adaptive nearest-neighbor methods
    14. Unsupervised Learning
        - association rules
        - cluster analysis
            - proximity matrices
            - dissimilarities based on attributes
            - object dissimilarity
            - clustering algorithms
            - combinatorial algorithms
            - K-means
            - gaussian mixtures as soft K-means clustering
            - vector quantization
            - k-medoids
            - hierarchical clustering
        - self-organizing maps
        - principal components, curves and surfaces
            - principal components
            - principal curves and surfaces
            - spectral clustering
            - kernel principal components
            - sparse principal components
        - non-negative matrix factorization
        - independent component analysis and exploratory projection pursuit
        - multidimensional scaling
        - nonlinear dimension reduction and local multidimensional scaling
        - the Google PageRank Algorithm
    15. Random Forests
        - definition of random forests
        - details of random forests
            - out of bag samples
            - variable importance
            - proximity plots
            - random forests and overfitting
        - analysis of random forests
            - variance and the de-correlation effect
            - bias
            - adaptive nearest neighbors
    16. Ensemble Learning
        - boosting and regularization paths
            - penalized regression
            - the "bet on sparsity" principle
            - regularization paths, over-fitting and margins
        - learning ensembles
    17. Undirected Graphical Models
        - markov graphs and their properties
        - undirected graphical models for continuous variables
        - undirected graphical models for discrete variables
    18. High-dimensional problems: $p \gg N$
        - diagonal linear discriminant analysis and nearest shrunken centroids
        - linear classifiers with quadratic regularization
            - regularized discriminant analysis
            - logistic regression with quadratic regularization
            - the support vector classifier
            - feature selection
            - computational shortcuts when $p \gg N$
        - linear classifiers with $L_1$ regularization
        - classification when features are unavailable
        - high-dimensional regression: supervised principal components
        - feature assessment and the multiple-testing problem
- **Neural networks and learning machines** (need to expand TOC for good meaty details)
    1. Introduction
    2. Rosenblatt's Perceptron
    3. Model Building through regression
    4. The Least-Mean-Square Algorithm
    5. Multilayer perceptrons
    6. kernel methods and radial-basis function networks
    7. support vector machines
    8. regularization theory
    9. principal-components analysis
    10. self-organizing maps
    11. information-theoretic learning models
    12. stochastic methods rooted in statistical mechanics
    13. dynamic programming
    14. neurodynamics
    15. Bayesian filtering for state estimation of dynamic systems
    16. dynamically driven recurrent networks
- **Computational Intelligence - A methodological introduction**
    1. Neural networks
        - motivation and biological background
        - threshold logic units (geometric interpretation, limitations, training parameters)
        - general neural networks (structure, operation, training)
        - multi-layer perceptrons (back prop, gradient descent)
        - radial basis function networks
        - self-organizing maps
        - hopfield networks
        - recurrent networks (representing diff eqns, vectorial NNs, error BP in time)
        - mathematical remarks
    2. Evolutionary Algorithms
        - introduction to evolutionary algorithms 
        - elements of evolutionary algorithms (fitness, selection, genetic operators)
        - fundamental evolutionary algorithms (GA, GP, evolutionary strategies)
        - Special applications and techniques
    3. Fuzzy systems
        - fuzzy sets and fuzzy logic
        - the extension principle
        - fuzzy relations
        - similarity relations
        - fuzzy control
        - fuzzy clustering
    4. Bayes Networks
        - introduction to bayes networks
        - elements of probability and graph theory
        - decompositions
        - evidence propagation
        - learning graphical models
- **Foundations of Machine Learning**
    1. The PAC learning framework
    2. Rademacher Complexity and VC-dimension
    3. Support vector machines
    4. kernel methods
    5. boosting
    6. on-line learning
    7. multi-class classification
    8. ranking
    9. regression
    10. algorithmic stability
    11. dimensionality reduction
    12. learning automata and languages
    13. reinforcement learning
    14. a linear algebra review
    15. convex optimization
    16. probability review
    17. concentration inequalities
- **Computational statistics Handbook with MATLAB**
    1. Probability concepts
        - axioms of probability
        - conditional probability
        - independence
        - expectation (mean, variance, skewness, kurtosis)
        - common distributions
    2. sampling concepts
        - sample mean, sample variance, sample moments, covariance
        - sampling distributions
        - parameter estimation (bias, MSE, SE, MLE, method of moments)
        - empirical distribution function 
    3. generating random variables
        - generating random variables
        - generating continuous random variables
        - generating discrete random variables
    4. exploratory data analysis
        - exploring univariate data
        - exploring bivariate and trivariate data
        - exploring multi-dimensional data
    5. monte carlo methods for inferential statistics
        - classical inferential statistics
        - monte carlo methods for inferential statistics
        - bootstrap methods
    6. data partitioning
        - cross-validation
        - jackknife
        - better bootstrap confidence intervals
        - jackknife-after-bootstrap
    7. probability density estimation
        - histograms
        - kernel density estimation
        - finite mixtures
        - generating random variables
    8. statistical pattern recognition
        - bayes decision theory
        - evaluating the classifier
        - classification trees
        - clustering
    9. nonparametric regression
        - smoothing
        - kernel methods
        - regression trees
    10. markov chain monte carlo methods
        - metropolis-hastings algorithms
        - the Gibbs sampler
        - convergence monitoring
    11. spatial statistics
        - visualizing spatial point processes
        - exploring first-order and second-order properties
        - modeling spatial point processes
        - simulating spatial point processes
- **Machine Learning - a probabilistic perspective**
    1. Introduction
        - machine learning: what and why?
        - supervised learning (classification/regression)
        - unsupervised learning (discovering clusters/latent factors/graph structure & matrix completion)
        - some basic concepts in machine learning
            - parametric v. non-parametric models
            - simple non-parametric classifier: K-nearest neighbors
            - the curse of dimensionality
            - parametric models for classification and regression
            - linear regression
            - logistic regression
            - overfitting
            - model selection
            - no free lunch theorem
    2. Probability
        - a brief review of probability theory
        - some common discrete distributions
        - some common continuous distributions
        - joint probability distributions
        - transformations of random variables
        - monte carlo approximation
        - information theory
    3. Generative models for discrete data
        - bayesian concept learning
            - likelihood, prior, posterior, posterior predictive distribution
        - the beta-binomial model
        - the Dirichlet-multinomial model
        - Naive Bayes classifiers
    4. Gaussian models
        - Gaussian discriminant analysis
        - inference in jointly Gaussian distributions
        - linear gaussian systems
        - digression: the wishart distribution
        - inferring the parameters of an MVN
    5. Bayesian statistics
        - summarizing posterior distributions
        - bayesian model selection
        - priors
        - hierarchical Bayes
        - Empirical Bayes
        - Bayesian decision theory
    6. Frequentist statistics
        - sampling distribution of an estimator
        - frequentist decision theory 
        - desirable properties of estimators
        - empirical risk minimization 
        - pathologies of frequentist statistics
    7. Linear regression
        - model specification
        - maximum likelihood estimation (least squares)
        - robust linear regression
        - ridge regression
        - Bayesian linear regression
    8. Logistic regression
        - model specification
        - model fitting
        - Bayesian logistic regression
        - Online learning and stochastic optimization
        - generative vs discriminative classifiers
    9. Generalized linear models and the exponential family
        - the exponential family
        - Generalized linear models (GLMs)
        - Probit regression
        - multi-task learning
        - generalized linear mixed models
        - learning to rank
    10. Directed graphical models (Bayes nets)
        - inference
        - learning
        - conditional independence properties of DGMs
        - influence (decision) diagrams
    11. Mixture models and the EM algorithm
        - latent variable models
        - mixture models
        - parameter estimation for mixture models
        - the EM algorithm
        - model selection for latent variable models
        - fitting models with missing data
    12. Latent linear models
        - factor analysis
        - principal component analysis (PCA)
        - choosing the number of latent dimensions
        - PCA for categorical data
        - PCA for paired and multi-view data
        - Independent component analysis (ICA)
    13. Sparse linear models
        - Bayesian variable selection
        - $\ell_1$ regularization: basics
        - $\ell_1$ regularization: algorithms
        - $\ell_1$ regularization: extensions
        - Non-convex regularizers
        - Automatic relevance determination (ARD)/sparse Bayesian learning (SBL)
        - Sparse coding
    14. Kernels
        - Kernel functions
        - Using kernels inside GLMs
        - The kernel trick
        - Support vector machines (SVMs)
        - Comparison of discriminative kernel methods
        - Kernels for building generative models
    15. Gaussian processes
        - GPs for regression
        - GPs meet GLMs
        - Connection with other methods
        - GP latent variable model
        - Approximation methods for large datasets
    16. Adaptive basis function models
        - Classification and regression trees (CART)
        - Generalized additive models
        - Boosting
        - Feedforward neural networks (multilayer perceptrons)
        - ensemble learning
        - experimental comparison
        - interpreting black-box models
    17. Markov and hidden Markov models
        - Markov models
        - Hidden Markov models
        - Inference in HMMs
        - Learning for HMMs
        - Generalizations of HMMs
    18. State space models
        - Applications of SSMs
        - Inference in LG-SSM
        - Learning for LG-SSM
        - Approximate online inference for non-linear, non-Gaussian SSMs
        - Hybrid discrete/continuous SSMs
    19. Undirected graphical models (Markov random fields)
        - conditional independence properties of UGMs
        - parameterization of MRFs
        - Examples of MRFs (ising, hopfield, potts, gaussian, markov logic)
        - Learning
        - Conditional random fields (CRFs)
        - Structural SVMs
    20. Exact inference for graphical models
        - Belief propagation for trees
        - the variable elimination algorithm
        - the junction tree algorithm
        - computational intractability of exact inference in the worst case
    21. Variational inference
        - variational inference
        - the mean field method
        - structured mean field
        - variational Bayes
        - variational Bayes EM
        - variational message passing and VIBES
        - local variational bounds
    22. More variational inference
        - loopy belief propagation: algorithmic issues
        - loopy belief propagation: theoretical issues
        - extensions of belief propagation
        - expectation propagation
        - MAP state estimation
    23. Monte Carlo inference
        - sampling from standard distributions
        - rejection sampling
        - importance sampling
        - particle filtering
        - Rao-Blackwellised particle filtering (RBPF)
    24. Markov Chain Monte Carlo (MCMC) inference
        - Gibbs sampling
        - Metropolis Hastings algorithm
        - Speed and accuracy of MCMC
        - Auxiliary variable MCMC
        - Annealing methods
        - approximating the marginal likelihood
    25. Clustering
        - Dirichlet process mixture models
        - affinity propagation
        - Spectral clustering
        - hierarchical clustering
        - clustering datapoints and features
    26. Graphical model structure learning
        - structure learning for knowledge discovery
        - learning tree structures
        - learning DAG structures
        - learning DAG structures with latent variables
        - Learning causal DAGs
        - learning undirected Gaussian graphical models
        - Learning undirected discrete graphical models
    27. Latent variable models for discrete data
        - distributed state LVMs for discrete data
        - latent Dirichlet allocation (LDA)
        - Extensions of LDA
        - LVMs for graph-structured data
        - LVMs for relational data
        - Restricted Boltzmann machines (RBMs)
    28. Deep learning
        - deep generative models
            - deep directed networks
            - deep boltzmann machines
            - deep belief networks
            - greedy layer-wise learning of DBNs
        - deep neural networks
            - deep multi-layer perceptrons
            - deep auto-encoders
            - stacked denoising auto-encoders
        - applications of deep networks
- **Bio-inspired Artificial Intelligence - theories, methods, and technologies**
    1. Evolutionary systems
        1. pillars of evolutionary theory
        2. the genotype
        3. artificial evolution
        4. genetic representations
        5. initial population
        6. fitness functions
        7. selection and reproduction
        8. genetic operators
        9. evolutionary measures
        10. types of evolutionary algorithms
        11. schema theory
        12. human-competitive evolution
        13. evolutionary electronics
        14. lessons from evolutionary electronics
        15. the role of abstraction
        16. analog and digital circuits
        17. extrinsic and intrinsic evolution
        18. digital design
        19. evolutionary digital design
        20. analog design
        21. evolutionary analog design
        22. multiple objectives and constraints
        23. design verification
    2. Cellular Systems
        1. The Basic Ingredients
        2. Cellular Automata
        3. Modeling with cellular systems
        4. some classic cellular automata
        5. other cellular systems
        6. computation
        7. artificial life
        8. complex systems
        9. analysis and synthesis of cellular systems
    3. Neural Systems
        1. biological nervous systems
        2. artificial neural networks
        3. neuron models
        4. architecture
        5. signal encoding
        6. synaptic plasticity
        7. unsupervised learning
        8. supervised learning
        9. reinforcement learning
        10. evolution of neural networks
        11. neural hardware
        12. hybrid neural systems
    4. Developmental Systems
        1. potential advantages of a developmental representation
        2. rewriting systems
        3. synthesis of developmental systems
        4. evolution and development
        5. defining artificial evolutionary developmental systems
        6. evolutionary rewriting systems
        7. evolutionary developmental programs
        8. evolutionary developmental processes
    5. Immune Systems
        1. How biological immune systems work
        2. the constituents of biological immune systems
        3. lessons for artificial immune systems
        4. algorithms
        5. shape space
        6. negative selection algorithm
        7. clonal selection algorithm
    6. Behavioral Systems
        1. behavior in cognitive science
        2. behavior in artificial intelligence
        3. behavior-based robotics
        4. biological inspiration for robots
        5. robots as biological models
        6. robot learning
        7. evolution of behavioral systems
        8. evolution and learning in behavioral systems
        9. evolution and neural development in behavioral systems
        10. coevolution of body and control
        11. toward self-reproduction
        12. simulation and reality
    7. Collective Systems
        1. biological self-organization
        2. particle swarm optimization
        3. ant colony optimization
        4. swarm robotics
        5. coevolutionary dynamics: biological models
        6. artificial evolution of competing systems
        7. artificial evolution of cooperation
- **Learning with Kernels**
    1. A tutorial introduction
        - data representation and similarity
        - a simple pattern recognition algorithm
        - some insights from statistical learning theory
        - hyperplane classifiers
        - support vector classification
        - support vector regression
        - kernel principal component analysis
    2. Concepts and Tools
        1. Kernels
            - product features
            - the representation of similarities in linear spaces
            - examples and properties of kernels
            - the representation of dissimilarities in linear spaces
        2. Risk and Loss Functions
            - loss functions
            - test error and expected risk
            - a statistical perspective
            - robust estimators
        3. Regularization
            - the regularized risk functional 
            - the representer theorem
            - regularization operators
            - translation invariant kernels
            - translation invariant kernels in higher dimensions
            - dot product kernels
            - multi-output regularization
            - semiparametric regularization
            - coefficient based regularization
        4. Elements of statistical Learning theory
            - the law of large numbers
            - when learning does work: the question of consistency
            - uniform convergence and consistency
            - how to derive a VC bound
            - a model selection example
        5. Optimization
            - convex optimization
            - unconstrained problems
            - constrained problems
            - interior point methods
            - maximum search problems
    3. Support Vector Machines
        1. Pattern Recognition
            - separating hyperplanes
            - the role of the margin
            - optimal margin hyperplanes
            - nonlinear support vector classifiers
            - soft margin hyperplanes
            - multi-class classification
        2. Single-Class Problems: Quantile Estimation and Novelty Detection
        3. Regression Estimation
            - linear regression with insensitive loss function
            - dual problems
            - $\nu$-SV Regression
            - Convex Combinations and $\ell_1$-Norms
            - Parametric insensitivity models
        4. Implementation
            - tricks of the trade
            - sparse greedy matrix approximation
            - interior point algorithms
            - subset selection methods
            - sequential minimal optimization
            - iterative methods
        5. Incorporating Invariances
            - prior knowledge
            - transformation invariance
            - the virtual SV method
            - constructing invariance kernels
            - the jittered SV method
        6. Learning theory revisited
            - concentration of measure inequalities
            - leave-one-out estimates
            - PAC-Bayesian bounds
            - operator-theoretic methods in learning theory
    4. Kernel Methods
        1. Designing Kernels
            - tricks for constructing kernels
            - string kernels
            - locality-improved kernels
            - natural kernels
        2. Kernel Feature extraction
        3. Kernel Fisher Discriminant
        4. Bayesian Kernel Methods
        5. Regularized Principal manifolds
        6. pre-images and reduced set methods
    5. Mathematical Prerequisites
        1. Probability
        2. Linear Algebra
        3. Functional Analysis
- **The Computer Revolution in Philosophy - Philosophy, Science and Models of Mind**
    1. Introduction and Overview
    2. Methodological Preliminaries
        1. What are the aims of science?
        2. Science and Philosophy
        3. What is conceptual analysis?
        4. Are computers really relevant?
    3. Mechanisms
        1. Sketch of an intelligent mechanism
        2. Intuition and analogical reasoning
        3. on learning about numbers: some problems and speculations
        4. Perception as a computational process
        5. Conclusion: AI and philosophical problems
- **Optimization for Machine Learning**
    1. Introduction: Optimization and Machine Learning
    2. Convex Optimization with Sparsity-Inducing Norms
    3. Interior-Point Methods for Large-Scale Cone Programming
    4. Incremental Gradient, subgradient and proximal methods for convex optimization: a survey
    5. First-order methods for nonsmooth convex Large-scale Optimization, I: General Purpose methods
    6. First-order methods for nonsmooth convex Large-scale Optimization, II: Utilizing Problem's Structure
    7. Cutting-Plane methods in Machine learning
    8. Introduction to Dual decomposition for inference
    9. Augmented Lagrangian methods for learning, selecting and combining features
    10. the convex optimization approach to regret minimization
    11. projected newton-type methods in machine learning
    12. interior-point methods in machine learning
    13. The tradeoffs of large-scale learning
    14. Robust optimization in machine learning
    15. Improving First and second-order methods by modeling uncertainty
    16. Bandit view on noisy optimization
    17. Optimization methods for Sparse inverse covariance selection
    18. a pathwise algorithm for covariance selection
- **Computational Intelligence Paradigms for MATLAB**
    1. Computational intelligence
        - primary classes of problems for CI techniques
            - optimization
            - NP-complete problems
        - feed forward neural networks
        - fuzzy systems
        - evolutionary computing
            - genetic algorithms
            - genetic programming
            - evolutionary programming
            - evolutionary strategies
        - swarm intelligence
        - other paradigms
            - granular computing
            - chaos theory
            - artificial immune systems
        - hybrid approaches
    2. Artificial Neural Networks with MATLAB
        - implementation
        - operations
        - training
        - teaching
        - learning rates
        - learning laws
    3. ANNs - Architectures and algorithms
        - single-layer
        - multi-layer
        - perceptron
        - feedforward back-propagation network
        - delta bar delta network
        - directed random search network
    4. Classification and Association neural networks
        - learning vector quantization
        - counter-propagation network
        - probabilistic neural network
        - data association networks
            - hopfield network
            - boltzmann machine
            - hamming network
            - bi-directional associative memory
        - data conceptualization networks
            - adaptive resonance network
            - ART algorithm
            - self-organizing map
    5. Matlab programs to implement neural networks
    6. MATLAB-based fuzzy systems
    7. Fuzzy inference and expert systems
    8. MATLAB illustration on fuzzy systems
    9. Neuro-fuzzy modeling using MATLAB
    10. Evolutionary Computation paradigms
        - evolutionary algorithms parameters
        - solution representation
        - fitness function
        - initialization of population size
        - selection mechanisms
        - crossover technique
        - mutation operator
        - reproduction operator
        - evolutionary programming
        - evolutionary strategies
            - solution representation
            - mutation
            - recombination
            - population assessment
            - convergence criteria
    11. Evolutionary algorithms implemented using MATLAB
    12. MATLAB-based genetic algorithm
        - encoding and optimization problems
        - historical overview of genetic algorithm
        - description
        - solution representation of genetic algorithms
        - parameters of GA
        - schema theorem and background
        - crossover operators and schemata
        - genotype and fitness
        - advanced operators in GA
            - inversion and reordering
            - epistasis
            - deception
            - mutation and naive evolution
            - niche and speciation 
            - restricted mating
            - diploidy and dominance
        - GA versus traditional search and optimization methods
            - neural nets
            - random search
            - gradient search
            - iterated search
            - simulated annealing
    13. Genetic Programming
        - growth of genetic programming
        - LISP programming language
        - functionality of genetic programming
            - generation of an individual and population
            - creating a random population
            - fitness test
            - functions and terminals
            - the genetic operations
            - crossover operation
            - mutation operation
        - genetic programming in machine learning
        - elementary steps in GP
            - the terminal set
            - the function set
            - the fitness function
            - the algorithm control parameters
            - the termination criterion
    14. MATLAB-based swarm intelligence
        - biological background
        - swarm robots
        - stability of swarms
        - swarm intelligence
        - particle swarm optimization (PSO)
        - ant colony optimization
- **Machine learning for audio, image and video analysis**
    1. introduction
    2. From Perception to computation
        1. Audio acquisition, representation and storage
            - Sound physics, production and perception
            - audio acquisition
            - audio encoding and storage formats
            - Time-domain audio processing
        2. Image and video acquisition, representation and storage
            - human eye physiology
            - image acquisition devices
            - color representation 
            - image formats
            - video principles
            - MPEG standard
    3. Machine Learning
        1. Machine learning
            - taxonomy of machine learning
                - rote learning
                - learning from instruction
                - learning by analogy
            - learning from examples
                - supervised learning
                - reinforcement learning
                - unsupervised learning
        2. Bayesian theory of decision
            - bayes decision rule
            - bayes classifier
            - loss function
            - zero-one loss function
            - discriminant functions
            - Gaussian density
                - univariate gaussian density
                - multivariate gaussian density
                - whitening transformation
            - discriminant functions for gaussian likelihood
                - features are statistically independent
                - covariance matrix is the same for all classes
                - covariance matrix is not the same for all classes
            - receiver operating curves
        3. clustering methods
            - expectation and maximization algorithm
            - basic notions and terminology
                - codebooks and codevectors
                - quantization error minimization
                - entropy maximization
                - vector quantization
            - k-means
            - self-organizing maps
            - neural gas topology representing network
            - general topographic mapping
            - fuzzy clustering algorithms
        4. foundation of statistical learning and model selection
            - bias-variance dilemma
                - bias-variance dilemma for regression
                - bias-variance decomposition for classification
            - model complexity
            - VC dimension and structural risk minimization
            - statistical learning/Vapnik-chervonenkis theory
            - AIC and BIC criteria
            - minimum description length approach 
        5. supervised neural networks and ensemble methods
            - ANNs and neural computation
            - ANs
            - connections and network architectures
            - single-layer networks
                - linear discriminant functions and single-layer networks
                - linear discriminant and the logistic sigmoid
                - generalized linear discriminants and perceptron
            - multilayer networks/perceptron
            - multilayer networks training
                - error back-propagation for feed-forward networks
                - parameter update: the error surface
                - parameter update: the gradient descent
            - learning vector quantization
            - ensemble methods
        6. kernel methods
            - Lagrange method and Kuhn-Tucker theorem
            - support vector machines for classification
                - optimal hyperplane algorithm
                - support vector machine construction
                - algorithmic approaches to solving quadratic programming
                - sequential minimal optimization
                - SVM and regularization methods
            - multiclass support vector machines
                - one-versus-rest method
                - one-versus-one method
            - Support vector machines for regression
                - regression with quadratic $\epsilon$-insensitive loss
                - kernel ridge regression
                - regression with linear $\epsilon$-insensitive loss
            - Gaussian processes
            - Kernel Fisher discriminant
            - Kernel PCA
            - One-Class SVM
            - Kernel Clustering Methods
        7. markovian models for sequential data
            - Hidden markov models
            - the three problems
            - the likelihood problem and the trellis
            - the decoding problem
            - the learning problem
            - HMM variants
            - N-gram models and statistical language modeling
            - discounting and smoothing methods for N-gram models
            - building a language model with N-grams
        8. feature extraction methods and manifold learning methods
            - the curse of dimensionality
            - data dimensionality
            - principal component analysis
            - independent component analysis
            - multidimensional scaling methods
            - manifold learning
     4. Applications
        1. Speech and handwriting recognition
        2. Automatic Face Recognition
        3. Video segmentation and keyframe extraction
     5. Appendices
         1. Statistics
             - Fundamentals
                 - probability and relative frequency 
                 - the sample space
                 - the addition law
                 - conditional probability
                 - statistical independence
             - random variables
                 - fundamentals
                 - mathematical expectation
                 - variance and covariance
         2. Signal processing
             - the complex numbers
             - the z-transform
                 - z-transform properties
                 - the fourier transform
                 - the discrete fourier transform
             - the discrete cosine transform
         3. Matrix algebra
             - fundamentals
             - determinants
             - eigenvalues and eigenvectors
         4. mathematical foundation of kernel methods
             - scalar products, norms and metrics
             - positive definite kernels and matrices
             - conditionally positive definite kernels and matrices
             - negative definite kernels and matrices
             - relations between positive and negative definite kernels
             - metric computation by mercer kernels
             - hilbert space representation of positive definite kernels
- **Algebraic Geometry and Statistical Learning Theory**
    1. Introduction
        - basic concepts
        - statistical models and learning machines
        - statistical estimation methods
        - four main formulas
    2. Singularity theory
        - polynomials and analytic functions
        - algebraic set and analytic set
        - singularity
        - resolution of singularities
        - normal crossing singularities
        - manifold
    3. Algebraic geometry
        - ring and ideal
        - real algebraic set
        - singularities and dimension
        - real projective space
        - blow-up 
    4. Zeta function and singular integral
        - schwartz distribution
        - state density function
        - mellin transform
        - evaluation of singular integral
        - asymptotic expansion and b-function
    5. empirical processes
        - convergence in law
        - function-valued analytic function
        - empirical processes
        - fluctuation of gaussian processes
    6. singular learning theory
        - standard form of likelihood ratio function
        - evidence and stochastic complexity
        - bayes and gibbs estimation
        - maximum likelihood and *a posteriori*
    7. singular learning machines
        - learning coefficient
        - three-layered neural networks
        - mixture models
        - bayesian network
        - hidden markov model
        - singular learning process
        - bias and variance
        - non-analytic learning machines
    8. singular statistics
        - universal optimal learning
        - generalized bayes information criterion
        - widely applicable information criteria
        - singular hypothesis test
        - realization of *a posteriori* distribution
        - from regular to singular
- **Bayesian reasoning and machine learning**
    - I. Inference in probabilistic models
        1. Probabilistic learning
            - probability refresher
            - probabilistic reasoning
            - prior, likelihood and posterior
        2. Basic graph concepts
            - graphs
            - numerically encoding graphs
                - edge list
                - adjacency matrix
                - clique matrix
        3. Belief networks
            - the benefits of structure
            - uncertain and unreliable evidence
            - belief networks
            - causality
        4. graphical models
            - graphical models
            - markov networks
            - chain graphical models
            - factor graphs
            - expressiveness of graphical models
        5. efficient inference in trees
            - marginal inference
            - other forms of inference
            - inference in multiply connected graphs
            - message passing for continuous distribution
        6. the junction tree algorithm
            - clustering variables
            - clique graphs
            - junction trees
            - constructing a junction tree for singly connected distributions
            - junction trees for multiply connected distributions
            - the junction tree algorithm
            - finding the most likely state
            - reabsorption: converting a junction tree to a directed network
            - the need for approximations
        7. making decisions
            - expected utility
            - decision trees
            - extending bayesian networks for decisions
            - solving influence diagrams
            - markov decision processes
            - temporally unbounded MDPs
            - variational inference and planning
            - financial matters
    - II. Learning in probabilistic models
        1. statistics for machine learning
            - representing data
            - distributions
            - classical distributions
            - multivariate gaussian
            - exponential family
            - learning distributions
            - properties of maximum likelihood
            - learning a gaussian 
        2. learning as inference
            - learning as inference
            - Bayesian methods and ML-II
            - maximum likelihood training of belief networks
            - Bayesian belief network training
            - structure learning
            - maximum likelihood for undirected models
        3. naive bayes
            - naive Bayes and conditional independence
            - estimation using maximum likelihood
            - Bayesian naive Bayes
            - tree augmented naive Bayes
        4. learning with hidden variables
            - Hidden variables and missing data
            - Expectation maximization
            - Extensions of EM
            - A failure case for EM
            - Variational Bayes
            - Optimising the likelihood by gradient methods
        5. Bayesian model selection
            - comparing models the Bayesian way
            - illustrations: coin tossing
            - occam's razor and Bayesian complexity penalisation
            - a continuous example: curve fitting
            - approximating the model likelihood
            - Bayesian hypothesis testing for outcome analysis
    - III. Machine learning
        1. Machine learning concepts
            - styles of learning
                - supervised
                - unsupervised
                - anomaly detection
                - online (sequential) learning
                - interacting with the environment
                - semi-supervised learning
            - supervised learning
            - Bayes versus empirical decisions
        2. Nearest neighbour classification
            - do as your neighbor does
            - K-nearest neighbors
            - a probabilistic interpretation of nearest neighbors
        3. Unsupervised linear dimension reduction
            - high-dimensional spaces - low-dimensional manifolds
            - principal components analysis
            - high-dimensional data
            - latent semantic analysis
            - PCA with missing data
            - matrix decomposition methods
            - Kernel PCA
            - canonical correlation analysis
        4. Supervised linear dimension reduction
            - supervised linear projections
            - fisher's linear discriminant
            - canonical variates
        5. Linear models
            - introduction: fitting a straight line
            - linear parameter models for regression
            - the dual representation and kernels
            - linear parameter models for classification
            - support vector machines
            - soft zero-one loss for outlier robustness
        6. Bayesian linear models
            - regression with additive Gaussian noise
            - Classification
        7. Gaussian processes
            - non-parametric prediction
            - Gaussian process prediction
            - covariance functions
            - analysis of covariance functions
            - gaussian processes for classification
        8. Mixture models
            - density estimation using mixtures
            - expectation maximization for mixture models
            - the gaussian mixture model 
            - mixture of experts
            - indicator models
            - mixed membership models
        9. Latent linear models
            - factor analysis
            - factor analysis: maximum likelihood
            - interlude: modelling faces
            - probabilistic principal components analysis
            - canonical correlation analysis and factor analysis
            - independent component analysis
        10. Latent ability models
            - The Rasch models
            - competition models
    - IV. Dynamical models
        1. Discrete-state Markov models
            - markov models
            - hidden markov models
            - learning HMMs
            - related models
        2. Continuous-state Markov models
            - observed linear dynamical systems
            - auto-regressive models
            - latent linear dynamical models
            - inference
            - learning linear dynamical systems
            - switching auto-regressive models
        3. Switching linear dynamical systems
            - the switching LDS
            - Gaussian sum filtering
            - Gaussian sum smoothing
            - reset models
        4. Distributed computation
            - stochastic Hopfield networks
            - learning sequences
            - tractable continuous latent variable models
            - neural models
    - V. Approximate inference
        1. sampling
            - ancestral sampling
            - Gibbs sampling
            - Markov chain Monte Carlo (MCMC)
            - auxiliary variable methods
            - importance sampling
        2. deterministic approximate inference
            - the Laplace approximation
            - Properties of Kullback-Leibler variational inference
            - Variational bounding using KL(q|p)
            - local and KL variational approximations
            - mutual information maximization: a KL variational approach
            - loopy belief propagation
            - expectation propagation
            - MAP for Markov networks
    - A. Background mathematics
        1. linear algebra
        2. multivariate calculus
        3. inequalities
        4. optimization
        5. multivariate optimization
        6. constrained optimization using Lagrange multipliers
- **Growing Adaptive Machines - combining development and learning in ANNs**
    1. Artificial Neurogenesis: an introduction and selective review
    2. A brief introduction to Probabilistic Machine Learning and its relation to Neuroscience
    3. **Evolving Culture versus local minima (Y bengio)**
    4. Learning Sparse features with an auto-associator
    5. HyperNEAT: the first five years
    6. Using the genetic regulatory evolving ANs (GReaNs) Platform for Signal Processing, Animat Control, and Artificial Multicellular Development
    7. Constructing complex systems via activity-driven unsupervised Hebbian self-organization
    8. Neuro-centric and holocentric approaches to the evolution of developmental neural networks
    9. Artificial evolution of plastic neural networks: a few key concepts

- **Machine Learning** (Mitchell)
    1. Introduction
        - well-posed learning problems
        - designing a learning system
        - perspectives and issues in machine learning
    2. Concept Learning and the general-to-specific ordering
        - a concept learning task
        - concept learning as search
        - Find-S: Finding a maximally specific hypothesis
        - Version spaces and the candidate-elimination algorithm
        - remarks on version spaces and candidate-elimination
        - inductive bias
    3. Decision tree learning
        - decision tree representation
        - appropriate problems for decision tree learning
        - the basic decision tree learning algorithm
        - hypothesis space search in decision tree learning
        - inductive bias in decision tree learning
        - issues in decision tree learning
    4. Artificial Neural networks
        - neural network representations
        - appropriate problems for neural network learning
        - perceptrons
        - multilayer networks and the backpropagation algorithm
        - remarks on the backpropagation algorithm
        - alternative error functions
        - alternative error minimization procedures
        - recurrent networks
        - dynamically modifying network structure
    5. Evaluating hypotheses
        - estimating hypothesis accuracy
        - basics of sampling theory
        - error estimation and estimating binomial proportions
        - the binomial distribution
        - mean and variance
        - estimators, bias and variance
        - confidence intervals
        - two-sided and one-sided bounds
        - general approach for deriving confidence intervals
        - difference in error of two hypotheses
        - comparing learning algorithms
    6. Bayesian learning
        - bayes theorem
        - bayes theorem and concept learning
        - maximum likelihood and least-squared error hypotheses
        - maximum likelihood hypotheses for predicting probabilities
        - minimum description length principle
        - bayes optimal classifier
        - Gibbs algorithm
        - Naive Bayes classifier
        - Bayesian belief networks
           - conditional independence
           - representation
           - inference
           - learning bayesian belief networks
           - gradient ascent training of bayesian networks
           - learning the structure of bayesian networks
        - The EM Algorithm
    7. Computational learning theory
        - probably learning an approximately correct hypothesis
        - sample complexity for finite hypothesis spaces
        - sample complexity for infinite hypothesis spaces
        - the mistake bound model of learning
    8. Instance-based learning
        - k-Nearest neighbor learning
        - locally weighted regression
        - radial basis functions
        - case-based reasoning
        - remarks on lazy and eager learning
    9. genetic algorithms
        - representing hypotheses
        - genetic operators
        - fitness function and selection
        - hypothesis space search
        - population evolution and the schema theorem
        - genetic programming
        - models of evolution and learning
        - parallelizing genetic algorithms
    10. Learning sets of rules
        - sequential covering algorithms
        - learning rule sets: summary
        - learning first-order rules
        - learning sets of first-order rules: FOIL
        - induction as inverted deduction
        - inverting resolution
    11. Analytical learning
        - learning with perfect domain theories
        - remarks on explanation-based learning
        - explanation-based learning of search control knowledge
    12. combining inductive and analytical learning
        - inductive-analytical approaches to learning
        - using prior knowledge to initialize the hypothesis
        - using prior knowledge to alter the search objective
        - using prior knowledge to augment search operators
    13. reinforcement learning
        - the learning task
        - Q learning
        - the Q function
        - nondeterministic rewards and actions
        - temporal difference learning
        - generalizing from examples
        - relationship to dynamic programming
- **Autonomous learning systems - from data streams to knowledge in real-time**
    1. Introduction
        - autonomous systems
        - the role of machine learning in autonomous systems
        - system identification - an abstract model of the real world
            - system structure identification
            - parameter identification
            - novelty detection, outliers and the link to structure innovation
        - online versus offline identification
        - adaptive and evolving systems
        - evolving or evolutionary systems
        - supervised versus unsupervised learning
    2. Fundamentals
        1. Fundamentals of probability theory
            - randomness and determinism
            - frequentistic versus belief-based approach
            - probability densities and moments
            - density estimation - kernel-based approach
            - recursive density estimation (RDE)
            - detecting novelties/anomalies/outliers using RDE
        2. Fundamentals of machine learning and pattern recognition
            - preprocessing
                - normalization and standardization
                - orthogonalization of inputs/features - rPCA method
            - clustering
                - proximity measures and clusters shape
                - offline methods
                - evolving clustering methods
            - classification
        3. Fundamentals of fuzzy systems theory
            - fuzzy sets
            - fuzzy systems, fuzzy rules
            - fuzzy systems with nonparametric antecedent (AnYa)
            - FRB (offline) classifiers
            - neurofuzzy systems
            - state space perspective
    3. Methodology of autonomous learning systems
        1. evolving system structure from streaming data
            - defining system structure based on *prior* knowledge
            - data space partitioning
            - normalization and standardization of streaming data in an evolving environment
            - autonomous monitoring of the structure quality
            - short- and long-term focal points and submodels
            - simplification and interpretability
        2. autonomous learning parameters of the local submodels
            - learning parameters of local submodels
            - global versus local learning
            - evolving systems structure recursively
            - learning modes
            - robustness to outliers in autonomous learning
        3. autonomous predictors, estimators, filters, inferential sensors
            - predictors, estimators, filters - problem formulation
            - nonlinear regression
            - time series
            - autonomous learning sensors
        4. autonomous learning classifiers
            - classifying data streams
            - why adapt the classifier structure?
            - architecture of autonomous classifiers of the family AutoClassify
            - learning AutoClassify from streaming data
        5. autonomous learning controllers
            - indirect adaptive control scheme
            - evolving inverse plant model from online streaming data
            - evolving fuzzy controller structure from online streaming data
        6. collaborative autonomous learning systems
            - distributed intelligence
            - autonomous collaborative learning
            - collaborative autonomous clustering
            - collaborative autonomous predictors, estimators, filters and Autosense by a team of ALSs
            - superposition of local submodels
    4. Applications of ALS
        1. autonomous learning sensors for chemical and petrochemical industries
        2. autonomous learning systems in mobile robotics
            - the mobile robot pioneer 3DX
            - autonomous classifier for landmark recognition
            - autonomous leader follower
        3. autonomous novelty detection and object tracking in video streams
            - problem definition
            - background subtraction and KDE for detecting visual novelties
            - detecting visual novelties with the RDE method
            - object identification in image frames using RDE
            - real-time tracking in video streams using ALS
        4. modelling evolving user behavior with ALS
            - user behavior as an evolving phenomenon
            - designing the user behavior profile
    5. Appendices
        1. mathematical foundations
            - probability distributions
            - basic matrix properties
        2. pseudocode of the basic algorithms
- **Reinforcement learning - an introduction**
     - I. The problem
         1. introduction
             - elements of reinforcement learning
         2. evaluative feedback
             - an n-armed Bandit problem
             - action-value methods
             - softmax action selection
             - evaluation versus instruction
             - incremental implementation
             - tracking a nonstationary problem
             - optimistic initial values
             - reinforcement comparison
             - pursuit methods
             - associative search
         3. the reinforcement learning problem
             - the agent-environment interface
             - goals and rewards
             - returns
             - a unified notation for episodic and continual tasks
             - the markov property
             - markov decision processes
             - value functions
             - optimal value functions
             - optimality and approximation
     - II. Elementary methods
         1. dynamic programming
             - policy evaluation
             - policy improvement
             - policy iteration
             - value iteration
             - asynchronous dynamic programming
             - generalized policy iteration
             - efficiency of DP
         2. Monte carlo methods
             - monte carlo policy evaluation
             - monte carlo estimation of action values
             - monte carlo control
             - on-policy monte carlo control
             - evaluating one policy while following another
             - off-policy monte carlo control
             - incremental implementation 
         3. temporal difference learning
             - TD prediction
             - optimality of TD(0)
             - sarsa: on-policy TD control
             - Q-learning: off-policy TD control
             - actor-critic methods
             - R-learning for undiscounted continual tasks
             - games, afterstates, and other special cases
     - III. a unified view
         1. eligibility traces
         2. generalization and function approximation
             - value prediction with function approximation
             - gradient-descent methods
             - linear methods
                 - coarse coding
                 - tile coding
                 - radial basis functions
                 - kanerva coding
             - control with function approximation 
             - off-policy bootstrapping
             - should we bootstrap?
         3. planning and learning
             - models and planning
             - integrating planning, acting and learning
             - when the model is wrong
             - prioritized sweeping
             - full vs. sample backups
             - trajectory sampling
             - heuristic search
         4. dimensions
         5. case studies
- **Simulated Evolution and learning**
    1. evolutionary learning
        1. modelling behavior cycles for life-long learning in motivated agents
        2. breaking the synaptic dogma: evolving a neuro-inspired developmental network
        3. a new approach to adapting control parameters in differential evolution algorithm
        4. a novel genetic algorithm with orthogonal prediction for global numerical optimization
        5. phylogeny inference using a multi-objective evolutionary algorithm with indirect representation
        6. evolved look-up tables for simulated DNA controlled robots
        7. multi-objective improvement of software using co-evolution and smart seeding
        8. policy evolution with grammatical evolution
        9. a PSO based AdaBoost approach to object detection
        10. adaptive non-uniform distribution of quantum particles in mQSO
        11. genetically evolved fuzzy rule-based classifiers and application to automotive classification
        12. improving XCS performance by distribution
        13. evolving an ensemble of neural networks using artificial immune systems
        14. improving the performance and scalability of differential evolution
        15. a fuzzy-GA decision support system for enhancing postponement strategies in supply chain management
    2. evolutionary optimization
        1. solving the delay-constrained capacitated minimum spanning tree problem using a dandelion-encoded evolutionary algorithm
        2. generalized extremal optimization for solving multiprocessor task scheduling problem
        3. improving NSGA-II algorithm based on minimum spanning tree
        4. an island based hybrid evolutionary algorithm for optimization
        5. a particle swarm optimization based algorithm for fuzzy bilevel decision making with objective-shared followers
        6. reference point-based particle swarm optimization using a steady-state approach
        7. genetic algorithm based methods for identification of health risk factors aimed at preventing metabolic syndrome
        8. extremal optimization and bin packing
        9. extremal optimization with a penalty approach for the multidimensional knapsack problem
        10. a generator for multimodal test functions with multiple global optima
        11. choosing leaders for multi-objective PSO algorithms using differential evolution
        12. comparison between genetic algorithm and genetic programming performance for photomosaic generation
        13. parameter tuning of real-valued crossover operators for statistics preservation
        14. hybrid particle swarm optimization based on thermodynamic
        15. multiagent evolutionary algorithm for T-coloring problem
        16. non-photorealistic rendering using genetic programming
        17. use of local ranking in cellular genetic algorithms with two neighborhood structures
        18. information theoretic classification problems for metaheuristics
        19. task decomposition for optimization problem solving
        20. discussion of search strategy for multi-objective genetic algorithm with consideration of accuracy and broadness of pareto optimal solution
        21. discussion of offspring generation method for interactive genetic algorithms with consideration of multimodal preference
        22. solving very difficult japanese puzzles with a hybrid evolutionary-logic algorithm
        23. joint multicast routing and channel assignment in multiradio multichannel wireless mesh networks using simulated annealing
        24. general game playing ants
        25. a generalized approach to construct benchmark problems for dynamic optimization
        26. a study on the performance of substitute distance based approaches for evolutionary many objective optimization
        27. performance evaluation of an adaptive ant colony optimization applied to single machine scheduling
        28. robust optimization by $\Sigma$-ranking on high dimensional objective spaces
        29. an evolutionary method for natural language to SQL translation
        30. attributes of dynamic combinatorial optimization
        31. a weighted local sharing technique for multimodal optimization
    3. Hybrid learning
        1. hybrid genetic programming for optimal approximation of high order and sparse linear systems
        2. genetic vector quantizer design on reconfigurable hardware
        3. pattern learning and decision making in a photovoltaic system
        4. using numerical simplification to control bloat in genetic programming
        5. horn query learning with multiple refinement
        6. evolving digital circuits in an industry standard hardware description language
        7. parameterized indexed FOR-loops in genetic programming and regular binary pattern strings
        8. hierarchical fuzzy control for the inverted pendulum over the set of initial conditions
        9. genetic programming for feature ranking in classification problems
        10. time series prediction with evolved, composite echo state networks
    4. adaptive systems
        1. genetic synthesis of software architecture
        2. dual phase evolution and self-organization in networks
        3. heterogeneous payoffs and social diversity in the spatial prisoner's dilemma game
    5. theoretical issues in evolutionary computation
        1. crossover can be constructive when computing unique input output sequences
    6. real-world applications of evolutionary computation techniques
        1. power electronic circuits design: a particle swarm optimization approach
        2. computational intelligence in radio astronomy: using computational techniques to tune geodesy models
        3. an efficient hybrid algorithm for optimization of discrete structures
        4. evolutionary multi-objective optimization for biped walking
        5. a method for assigning men and women with good affinity to matchmaking parties through interactive evolutionary computation
- **Compressed sensing and sparse filtering**
    1. introduction to compressed sensing and sparse filtering
    2. the geometry of compressed sensing
    3. sparse signal recovery with exponential-family noise
    4. nuclear norm optimization and its application to observation model specification
    5. nonnegative tensor decomposition
    6. sub-nyquist sampling and compressed sensing in cognitive radio networks
    7. sparse nonlinear MIMO filtering and identification
    8. optimization viewpoint on kalman smoothing with applications to robust and sparse estimation
    9. compressive system identification
    10. distributed approximation and tracking using selective gossip
    11. recursive reconstruction of sparse signal sequences
    12. estimation of time-varying sparse signals in sensor networks
    13. sparsity and compressed sensing in mono-static and multi-static radar imaging
    14. structured sparse Bayesian modelling for audio restoration
    15. sparse representations for speech recognition
- **Compressed sensing - theory and applications**
    1. introduction to compressed sensing
    2. second-generation sparse modeling: structured and collaborative signal analysis
    3. Xampling: compressed sensing of analog signals
    4. sampling at the rate of innovation: theory and applications
    5. introduction to the non-asymptotic analysis of random matrices
    6. adaptive sensing for sparse recovery
    7. fundamental thresholds in compressed sensing: a high-dimensional geometry approach
    8. greedy algorithms for compressed sensing
    9. graphical models concepts in compressed sensing
    10. finding needles in compressed haystacks
    11. data separation by sparse representations
    12. face recognition by sparse representation
- **Sparse and redundant representations - from theory to applications in signal and image processing**
    - I. sparse and redundant representations - theoretical and numerical foundations
        1. Prologue
            - underdetermined linear systems
            - regularization
            - the temptation of convexity
            - a closer look at $\ell_1$ minimization
            - conversion of ($P_1$) to linear programming
            - promoting sparse solutions
            - the $\ell_0$-Norm and implications
            - the ($P_0$) problem - our main interest
            - the signal processing perspective
        2. Uniqueness and uncertainty
            - treating the two-ortho case
                - an uncertainty principle
                - uncertainty of redundant solutions
                - from uncertainty to uniqueness
            - uniqueness analysis for the general case
            - constructing grassmannian matrices
        3. pursuit algorithms - practice
            - greedy algorithms
            - convex relaxation techniques
        4. pursuit algorithms - guarantees
            - back to the two-ortho case
            - the general case
            - the role of the sign-pattern
            - tropp's exact recovery condition
        5. from exact to approximate solutions
            - general motivation
            - stability of the Sparsest solution
            - Pursuit algorithms
            - the unitary case
            - performance of pursuit algorithms
        6. iterative-shrinkage algorithms
            - background
            - the unitary case - a source of inspiration
            - developing iterative-shrinkage algorithms
            - acceleration using line-search and SESOP
            - iterative-shrinkage algorithms: tests
        7. Towards average performance analysis
            - a glimpse into probabilistic analysis
            - average performance of thresholding
        8. the dantzig-selector algorithm
            - dantzig-selector versus basis-pursuit
            - the unitary case
            - revisiting the restricted isometry machinery
            - dantzig-selector performance guaranty
            - dantzig-selector in practice
    - II. from theory to practice - signal and image processing applications
        1. sparsity-seeking methods in signal processing
            - priors and transforms for signals
            - the sparse-land model
            - geometric interpretation of sparse-land
            - processing of sparsely-generated signals
            - analysis versus synthesis signal modeling
        2. image deblurring - a case study
        3. MAP versus MMSE estimation
        4. the quest for a dictionary
            - choosing versus learning
            - dictionary-learning algorithms
            - training structured dictionaries
        5. Image compression - facial images
        6. image denoising
        7. other applications
            - image inpainting and impulsive noise removal
            - image scale-up
- **from animals to robots and back - reflections on hard problems in the study of cognition**
    1. Bringing together different pieces to better understand whole minds
    2. Aaron Sloman: a bright tile in AI's mosaic
    3. losing control within the H-Cogaff architecture
    4. acting on the world: understanding how agents use information to guide their action
    5. a proof and some representations
    6. what does it mean to have an architecture?
    7. virtual machines: nonreductionist bridges between the functional and the physical
    8. building for the future: architectures for the next generation of intelligent robots
    9. what vision can, can't and should do
    10. the rocky road from Hume to Kant: correlations and theories in robots and animals
    11. combining planning and action, lessons from robots and the natural world
    12. developing expertise with objective knowledge: motive generators and productive practice
    13. from cognitive science to data mining: the first intelligence amplifier
    14. modelling user linguistic communicative competences for individual and collaborative learning
    15. loop-closing semantics
- **A concise introduction to multiagent systems and distributed artificial intelligence**
    1. Introduction
    2. Rational agents
        - agents as rational decision makers
        - observable worlds and the markov property
        - stochastic transitions and utilities
    3. Strategic Games
        - game theory
        - strategic games
        - iterated elimination of dominated actions
        - nash equilibrium
    4. Coordination
        - coordination games
        - social conventions
        - roles
        - coordination graphs
    5. Partial observability
        - thinking interactively
        - common knowledge
        - partial observability and actions
    6. Mechanism design
        - self-interested agents
        - the mechanism design problem
        - the revelation principle
        - the vickrey-clarke-groves mechanism
    7. Learning
        - reinforcement learning
        - markov decision processes
        - markov games
        - the problem of exploration
- **creating brain-like intelligence: from basic principles to complex intelligent systems**
    1. creating brain-like intelligence
    2. **from complex networks to intelligent systems (sporns)**
    3. stochastic dynamics in the brain and probabilistic decision-making
    4. formal tools for the analysis of brain-like structures and dynamics
    5. morphological computation - connecting brain, body and environment
    6. trying to grasp a sketch of a brain for grasping
    7. learning actions through imitation and exploration: towards humanoid robots that learn from actions
    8. towards learning by interaction
    9. planning and moving in dynamic environments: a statistical machine learning approach
    10. towards cognitive robotics
    11. approaches and challenges for cognitive vision systems
    12. some requirements for human-like robots: why the recent over-emphasis on embodiment has held up progress (a. sloman)
    13. co-evolution of rewards and meta-parameters in embodied evolution
    14. active vision for goal-oriented humanoid robot walking
    15. cognitive adequacy in brain-like intelligence
    16. basal ganglia models for autonomous behavior learning
- **from animals to animats 13**
    1. Animat Approach and methodology
        - a role for sleep in artificial cognition through deferred restructuring of experience in autonomous machines
        - time in consciousness, memory and human-robot interaction
        - non-representational sensorimotor knowledge
    2. Perception and motor control
        - self-exploration of the stumpy robot with predictive information maximization
        - detecting the vibration in the artificial web inspired by the spider
        - modelling reaction times in non-linear classification tasks
        - multiple decoupled CPGs with local sensory feedback for adaptive locomotion behaviors of bio-inspired walking robots
        - the role of a cerebellum-driven perceptual prediction within a robot postural task
        - biomimetic agent based modelling using male frog calling behavior as a case study
    3. Navigation and internal world models
        - snapshot homing navigation based on edge features
        - ground-nesting insects could use visual tracking for monitoring nest position during early flight
        - adaptive landmark-based navigation system using learning techniques
        - robustness study of a multimodal compass inspired from HD-cells and dynamic neural fields
    4. learning and adaptation
        - developmental dynamics of RNNPB: new insight about infant action developments
        - simulating the emergence of early physical and social interactions: a developmental route through low level visuomotor learning
        - intrinsically motivated decision making for situated, goal-driven agents
        - an anti-Hebbian learning rule to represent drive motivations for reinforcement learning
        - **unsupervised learning for sensory primitives from optical flow fields**
        - reinforcement driven shaping of sequence learning in neural dynamics
        - **rapid humanoid motion learning through coordinated, parallel evolution**
    5. Evolution
        - programmable self-assembly with chained soft cells: an algorithm to fold into 2-D shapes
        - voxel robot: a pneumatic robot with deformable morphology
        - task-driven evolution of modular self-reconfigurable robots
        - a bacterial-based algorithm to simulate complex adaptive systems
        - **online evolution of deep convolutional network for vision-based reinforcement learning** (schmidhuber)
    6. Collective and social behavior
        - a swarm robotics approach to task allocation under soft deadlines and negligible switching costs
        - supervised robot groups with reconfigurable formation: theory and simulations
        - coupling learning capability and local rules for the improvement of the objects' aggregation task by a cognitive multi-robot system
        - honeybee-inspired quality monitoring of routing paths in mobile ad hoc networks
        - human inspiration and comparison for monitoring strategies in a robotic convoy task
        - animal social behavior: a visual analysis
        - crowd emotion detection using dynamic probabilistic models
- **Artificial Intelligence - A new synthesis**
    1. Reactive Machines
        1. Stimulus-response agents
            - perception and action
                - perception 
                - action
                - Boolean algebra
                - classes and forms of Boolean functions
            - representing and implementing action functions
                - production systems
                - networks
                - the subsumption architecture
        2. Neural networks
            - Training single TLUs
                - TLU geometry
                - augmented vectors
                - gradient descent methods
                - the widrow-hoff procedure
                - the generalized delta procedure
                - the error-correction procedure
            - Neural networks
                - motivation
                - notation
                - the backpropagation method
                - computing weight changes in the final layer
                - computing changes to the weights in intermediate layers
            - generalization, accuracy and overfitting
        3. Machine evolution
            - evolutionary computation
            - genetic programming
                - program representation in GP
                - the GP process
                - evolving a wall-following robot
        4. State machines
            - representing the environment by feature vectors
            - elman networks
            - iconic representations
            - blackboard systems
        5. robot vision
            - steering an automobile
            - two stages of robot vision
            - image processing
                - averaging
                - edge enhancement
                - combining edge enhancement with averaging
                - region finding
                - using image attributes other than intensity
            - scene analysis
                - interpreting lines and curves in the image
                - model-based vision
            - stereo vision and depth information
    2. Search in state spaces
        1. agents that plan
            - memory versus computation
            - state-space graph
            - searching explicit state spaces
            - feature-based state spaces
            - graph notation
        2. uninformed search
            - formulating the state space
            - components of implicit state-space graphs
            - breadth-first search
            - depth-first or backtracking search
            - iterative deepening
        3. heuristic search
            - using evaluation functions
            - a general graph-searching algorithm (algorithm A\*)
            - heuristic functions and search efficiency
        4. planning, acting and learning
            - the sense/plan/act cycle
            - approximate search
                - island-driven search
                - hierarchical search
                - limited-horizon search
                - cycles
                - building reactive procedures
            - learning heuristic functions
                - explicit graphs
                - implicit graphs
            - rewards instead of goals
        5. alternative search formulations and applications
            - assignment problems
            - constructive methods
            - heuristic repair
            - function optimization
        6. adversarial search 
            - two-agent games
            - the minimax procedure
            - the alpha-beta procedure
            - the search efficiency of the alpha-beta procedure
            - other important matters
            - games of chance
            - learning evaluation functions
    3. Knowledge Representation and Reasoning
        1. the propositional calculus
            - using constraints on feature vectors
            - the language
            - rules of inference
            - definition of proof
            - semantics
            - soundness and completeness
            - the PSAT problem
        2. resolution in the propositional calculus
            - a new rule of inference: resolution
            - converting arbitrary wffs to conjunctions of clauses
            - resolution refutations
            - resolution refutation search strategies
            - horn clauses
        3. the predicate calculus
            - motivation
            - the language and its syntax
            - semantics
            - quantification
            - semantics of quantifiers
            - predicate calculus as a language for representing knowledge
        4. resolution in the predicate calculus
            - unification
            - predicate-calculus resolution
            - completeness and soundness
            - converting arbitrary wffs to clause form
            - using resolution to prove theorems
            - answer extraction
            - the equality predicate
        5. knowledge-based systems
            - confronting the real world
            - reasoning using horn clauses
            - maintenance in dynamic knowledge bases
            - rule-based expert systems
            - rule learning
        6. representing commonsense knowledge
            - the commonsense world
            - time
            - knowledge representation by networks
        7. reasoning with uncertain information
            - review of probability theory
            - probabilistic inference
            - bayes networks
            - patterns of inference in bayes networks
            - uncertain evidence
            - D-separation
            - probabilistic inference in polytrees
        8. learning and acting with bayes nets
            - learning Bayes nets
            - probabilistic inference and action
    4. Planning Methods based on logic 
        1. the situation calculus
        2. planning
    5. Communication and integration 
        1. multiple agents
        2. communication among agents
        3. agent architectures
            - three-level architectures
            - goal arbitration
            - the triple-tower architecture
            - bootstrapping
- **intelligent autonomous robotics - a robot soccer case study**
    1. Introduction
    2. The class
    3. initial behaviors
    4. Vision
        - camera settings
        - color segmentation
        - region building and merging
        - object recognition with bounding boxes
        - position and bearing of objects
        - visual opponent modeling
    5. Movement 
        - Walking
            - basics
            - forward kinematics
            - inverse kinematics
            - general walking structure
            - omnidirectional control
            - tilting the body forward
            - tuning the parameters
            - odometry calibration
        - General movement
            - movement module
            - movement interface
            - high-level control
        - learning movement tasks
            - forward gait
            - ball acquisition
    6. Fall detection
    7. kicking
        - creating the critical action
        - integrating the critical action in the walk
    8. localization
        - background
            - basic monte carlo localization
            - mcl for vision-based legged robots
        - enhancements to the basic approach
            - landmark histories
            - distance-based updates
            - extended motion model
        - experimental setup and results
    9. Communication
        - initial robot-to-robot communication
        - message types
        - knowing which robots are communicating
        - determining when a teammate is "dead"
    10. General architecture
    11. Global map
        - maintaining location data
        - information from teammates
        - providing a high-level interface
    12. behaviors
        - goal scoring
            - initial solution
            - incorporating localization
            - a finite state machine
    13. Coordination
        - dibs
            - relevant data
            - thrashing 
            - stabilization
            - taking the average
            - aging
            - calling the ball
            - support distance
        - final strategy
            - roles
            - supporter behavior
            - defender behavior
            - dynamic role assignment
    14. simulator
    15. UT assist
- **artificial intelligence - a modern approach**
    1. introduction
    2. intelligent agents
    3. solving problems by searching
    4. beyond classical search 
    5. adversarial search
    6. logical agents
    7. constraint satisfaction problems
    8. first-order logic
    9. inference in first-order logic
    10. classical planning
    11. planning and acting in the real world
    12. knowledge representation
    13. quantifying uncertainty
    14. probabilistic reasoning
    15. probabilistic reasoning over time
    16. making simple decisions
    17. making complex decisions
    18. learning from examples
    19. knowledge in learning
    20. learning probabilistic models
    21. natural language processing
    22. reinforcement learning
    23. natural language for communication
    24. perception
    25. robotics
    26. mathematical background
    27. notes on languages and algorithms
- **understanding the artificial - on the future shape of artificial intelligence**
    1. introduction: artificial intelligence: its future and its cultural roots
    2. the cognitive dimension in the processing of natural language
    3. making a mind versus modelling the brain: artificial intelligence back at the branchpoint
    4. alternative intelligence
    5. artificial intelligence as a dialectic of science and technology
    6. biological and artificial intelligence
    7. computers, musical notation and the externalization of knowledge: towards a comparative study in the history of information technology
    8. cognitive science and the computer metaphor
    9. intelligent behavior in machines
    10. conclusions: the dissymmetry of mind and the role of the artificial 
    11. appendix: one hundred definitions of AI
    12. Appendix B: an attempt at getting a basis for a rational definition of the artificial
- **Computer and the brain** (von neumann)
    1. The computer
        - the analog procedure
            - the conventional basic operations
            - unusual basic operations
        - the digital procedure
            - markers, their combinations and embodiments
            - digital machine types and their basic components
            - parallel and serial schemes
            - the conventional basic operations
        - logical control
            - plugged control
            - logical tape control
            - the principle of only one organ for each basic operation
            - the consequent need for a special memory organ
            - control by "control sequence" points
            - memory-stored control
            - modus operandi of the memory-stored control
            - mixed forms of control
        - mixed numerical procedures
            - mixed representations of numbers. machines built on this basis
        - precision
            - reasons for the high (digital) precision requirements
        - characteristics of modern analog machines
        - characteristics of modern digital machines
            - active components; questions of speed
            - number of active components required
            - memory organs. access times and memory capacities
            - memory registers built from active organs
            - the hierarchic principle for memory organs
            - memory components; question of access
            - complexities on the concept of access time
            - the principle of direct addressing
    2. the brain
        - simplified description of the function of the neuron
        - the nature of the nerve impulse
            - the process of stimulation
            - the mechanism of stimulating pulses by pulses; its digital character
            - time characteristics of nerve responses, fatigue and recovery
            - size of a neuron. comparisons with artificial components
            - energy dissipation
        - stimulation criteria
            - the simplest - elementary logical
            - more complicated stimulation criteria
            - the threshold
            - the summation time
            - stimulation criteria for receptors
        - the problem of memory within the nervous system
            - principles for estimating the capacity of the memory in the nervous system
            - memory capacity estimated with these stipulations
            - various possible physical embodiments of the memory
            - analogies with artificial computing machines
            - the underlying componentry of the memory need not be the same as that of the basic active organs
        - digital and analog parts in the nervous system
            - role of the genetic mechanism in the above context
        - codes and their role in the control of the functioning of a machine
            - the concept of a complete code
            - the concept of a short code
            - the function of a short code
        - the logical structure of the nervous system
            - importance of the numerical procedures
            - interaction of numerical procedures with logic
            - reasons for expecting high precision requirements
        - nature of the system of notations employed: not digital but statistical
            - arithmetical deterioration. Roles of arithmetical and logical depths
            - arithmetical precision or logical reliability, alternatives
            - other statistical traits of the message systems that could be used
        - the language of the brain not the language of mathematics
- **Handbook on Neural information processing**
    1. Deep Learning of Representations (y bengio)
        - a review and recent trends
            - greedy layerwise pre-training
            - undirected graphical models and boltzmann machines
            - the restricted boltzmann machine
            - the zoo: auto-encoders, sparse coding, predictive sparse decomposition, denoising auto-encoders, score matching and more
        - convolutional architectures
            - local receptive fields and weight sharing
            - feature pooling
        - learning invariant feature sets
            - dealing with factors of variation: invariant features
            - invariance via sparsity
            - teasing apart explanatory factors via slow features analysis
            - learning to pool features
            - beyond learning invariant features
        - disentangling factors of variation
        - on the importance of top-down connections
    2. Recurrent Neural networks
        - architecture
            - connectionist network topologies
            - specific architectures
        - memory
            - delayed activations as memory
            - short-term memory and generic predictor
            - types of memory kernels
        - learning
            - recurrent back-propagation: learning with fixed points
            - back-propagation through time: learning with non-fixed points
            - long-term dependencies
        - modeling
            - finite state automata
            - beyond finite state automata
            - applications
                - natural language processing
                - identification and control of dynamical systems
    3. Supervised neural network models for processing graphs
        - graphs
        - neural models for graph processing
            - the graph neural network model
            - processing DAGs with recursive neural networks
        - supervised learning for graph neural networks
            - learning objective
            - learning procedure for GNNs
            - learning procedure for recursive neural networks
    4. topics in cellular neural networks
        - the CNN concept
            - the architecture
            - mathematical description
            - other tasks CNN's can accomplish - the CNN universal machine
        - a particular architecture
            - the architecture and the equations
            - the decoupling technique
            - particular cases
            - implementation issues
            - a "toy" application: 1D "edge" detection
        - two-grid coupled CNN's
            - the architecture and the equations
            - the decoupling technique
            - boundary conditions (BC's) and their influence on pattern formation
            - dispersion curve
            - turing pattern formation mechanism
            - boundary conditions in 2D CNN's
    5. approximating multivariable functions by feedforward neural nets
        - dictionaries and variable-basis approximation
        - the universal approximation property
        - quadratic rates of approximation
        - geometric rates of approximation
        - approximation of balls in variational norms
        - best approximation and non-continuity of approximation
        - tractability of approximation
            - a shift in point-of-view: complexity and dimension
            - measure worst-case error in approximation
            - Gaussian RBF network tractability
            - perceptron network tractability
    6. Bochner integrals and neural networks
        - variational norms and completeness
        - bochner integrals
        - spaces of bochner integrable functions
        - main theorem
        - an example involving the bessel potential
        - application: a Gamma function inequality
        - tensor-product interpretation
        - pointwise-integrals vs. bochner integrals
    7. semi-supervised learning
        - self-training
        - SSL with generative models
        - Semi-supervised SVMs (S3VMs)
        - semi-supervised learning with graphs
        - semi-supervised learning with committees (SSLC)
        - combination with active learning
    8. Statistical relational learning
        - relational learning versus attribute-value learning
            - attribute-value learning
            - relational learning
            - mapping relational data to attribute-value data
        - relational learning: tasks and formalisms
            - inductive logic programming
            - learning from graphs
            - multi-relational data mining
        - neural network based approaches to relational learning
            - CIL$^2$P
            - relational neural networks
            - graph neural networks
        - statistical relational learning
            - structuring graphical models
            - approaches in the relational database setting
            - approaches in the logical setting
        - general remarks and challenges
            - understanding commonalities and differences
            - parameter learning and structure learning
            - scalability
    9. Kernel methods for structured data
        - mathematical foundations
            - kernels
            - supervised learning with kernels
        - kernel machines for structured input
            - SVM for binary classification
            - SVM for regression
            - smallest enclosing hypersphere
            - kernel principal component analysis
        - kernels on structured data
            - basic kernels
            - kernel combination
            - kernels on discrete structures
            - kernels from generative models
            - kernels on logical representations
        - learning kernels
            - learning kernel combinations
            - learning logical kernels
        - supervised kernel machines for structured output
    10. multiple classifier systems: theory, applications and tools
        - MCS theory
            - MCS architectures
            - combining rules
            - strategies for constructing a classifier ensemble
        - applications
            - remote-sensing data analysis
            - document analysis
            - biometrics
            - figure and ground
            - medical diagnosis support
            - chemistry and biology
            - time series Prediction/Analysis
            - image and video analysis
            - computer and network security
    11. self organization and model learning: algorithms and applications
        - snap-drift neural network
        - snap-drift self-organizing map
    12. Bayesian networks, introduction and practical applications
    13. relevance feedback in content-based image retrieval: a survey
        - content-based image retrieval 
            - low-level feature extraction
            - similarity measure
            - classification methods
        - short-term learning RF
        - long-term learning RF
            - latent semantic indexing-based techniques
            - correlation-based approaches
            - clustering-based algorithms
            - feature representation-based methods
            - similarity measure modification-based approaches
    14. learning structural representations of text documents in large document collections
        - representation of unstructured or semi-structured text documents
        - general framework for processing graph structured data
        - self-organizing maps for structures
        - graph neural networks
        - clustering of the wikipedia data set
    15. neural networks in bioinformatics
        - analyzing DNA sequences
        - peptide sequence analysis
        - diagnostic predictions
- **Neural Networks and Computing: learning algorithms and applications**
    1. introduction
        - Neuron model
        - historical remarks
        - network architecture
            - supervised neural networks
                - McCulloch and Pitts model
                - the perceptron model
                - multi-layer feedforward network
                - recurrent networks
            - unsupervised neural networks
        - modeling and learning mechanism
            - determination of parameters
            - gradient descent searching method
    2. Learning performance and enhancement
        - fundamentals of gradient descent optimization
        - conventional backpropagation algorithm
        - convergence enhancement
            - extended backpropagation algorithm
            - least squares based training algorithm
            - extended least squares based algorithm
        - initialization consideration
            - weight initialization algorithm I-III
        - global learning algorithms
            - simulated annealing algorithm
            - Alopex algorithm
            - reactive Tabu search
            - the NOVEL algorithm
            - the heuristic hybrid global learning algorithm
        - concluding remarks
            - fast learning algorithms
            - weight initialization methods
            - global learning algorithms
    3. Generalization and performance enhancement
        - Cost function and performance surface
            - maximum likelihood estimation
            - the least-square cost function
        - Higher-order statistic generalization
            - definitions and properties of higher-order statistics
            - the higher-order cumulants based cost function
            - property of the higher-order cumulant cost function
            - learning and generalization performance
                - EX 1: Henon Attractor
                - EX 2: Sunspot time-series
        - regularization for generalization enhancement
            - adaptive regularization parameter selection (ARPS) methods
                - stalling identification method
                - $\lambda$ selection schemes
            - synthetic function mapping
        - concluding remarks
            - objective function selection
            - regularization selection
    4. basis function networks for classification
        - Linear separation and perceptrons
        - basis function model for parametric smoothing
        - radial basis function network
            - RBF networks architecture
            - universal approximation
            - initialization and clustering
            - learning algorithms
                - linear weights optimization
                - gradient descent optimization
                - hybrid of least squares and penalized optimization
            - regularization networks
        - advanced radial basis function networks
            - support vector machine
            - wavelet network
            - fuzzy RBF controllers
            - probabilistic neural networks
    5. self-organizing maps
        - learning algorithm
        - growing SOMs
            - cell splitting grid
            - growing hierarchical self-organizing quadtree map
        - probabilistic SOMs
            - cellular probabilistic SOM
            - probabilistic regularized SOM
        - clustering of SOM
        - Multi-layer SOM for tree-structured data
            - SOM input representation
            - MLSOM training
            - MLSOM visualization and classification
    6. Classification and feature selection
        - support vector machines (SVM)
        - cost function
            - MSE and MCE cost functions
            - hybrid MCE-MSE cost function
            - implementing MCE-MSE
        - feature selection
            - information theory
                - mutual information
                - probability density function (PDF) estimation
            - MI based forward feature selection
    7. Engineering applications
        - electrical load forecasting
        - content-based image retrieval using SOM
        - feature selection for cDNA microarray
- **Cellular neural networks and visual computing: foundations and applications**
    1. Introduction
    2. Notation, definition, and mathematical foundation
        - basic notation and definitions
        - mathematical foundations
    3. characteristics and analysis of simple CNN templates
    4. Simulation of the CNN dynamics
        - integration of the standard CNN differential equation
        - image input
        - software simulation
        - digital hardware accelerators
        - analog CNN implementations
        - scaling the signals
        - discrete-time CNN (DTCNN)
    5. Binary CNN characterization via boolean functions
    6. uncoupled CNNs: unified theory and applications
        - the complete stability phenomena
        - explicit CNN output formula
        - proof of completely stable CNN theorem
        - the primary CNN mosaic
        - explicit formula for transient waveform and settling time
        - which local boolean functions are realizable by uncoupled CNNs?
        - geometrical interpretations
        - how to design uncoupled CNNs with prescribed Boolean functions
        - how to realize non-separable local boolean functions?
    7. introduction to the CNN universal machine
        - global clock and global wire
        - set inclusion
        - translation of sets and binary images
        - opening and closing and implementing any morphological operator
        - implementing any prescribed boolean transition function by not more than 256 templates
        - minimizing the number of templates when implementing any possible boolean transition function
        - analog-to-digital array converter
    8. Back to basics: nonlinear dynamics and complete stability
        - a glimpse of things to come
        - an oscillatory CNN with only two cells
        - a chaotic CNN with only two cells and one sinusoidal input
        - symmetrical **A** template implies complete stability
        - positive and sign-symmetric **A** template implies complete stability
        - positive and cell-linking **A** template implies complete stability
        - stability of some sign-antisymmetric CNNs
    9. The CNN universal machine (CNN-UM)
        - the architecture
        - a simple example in more detail
        - a very simple example on the circuit level
        - language compiler, operating system
    10. Template design tools
        - various design techniques
        - binary representation, linear separability, and simple decomposition
        - template optimization
        - template decomposition techniques
    11. CNNs for linear image processing
        - linear image processing with **B** templates is equivalent to spatial convolution with FIR kernels
        - spatial frequency characterization
        - a primer on properties and applications of discrete-space Fourier transform (DSFT)
        - linear image processing with **A** and **B** templates is equivalent to spatial convolution with IIR kernels
    12. coupled CNN with linear synaptic weights
        - active and inactive cells, dynamic local rules
        - binary activation pattern and template format
        - a simple propagating type example with B/W symmetrical rule
        - the connectivity problem
    13. uncoupled standard CNNs with nonlinear synaptic weights
        - dynamic equations and DP plot
    14. standard CNNs with delayed synaptic weights and motion analysis
        - dynamic equations
        - **motion analysis - discrete time and continuous time image acquisition**
    15. video microprocessors - analog and digital VLSI implementation of the CNN universal machine
        - the analog CNN core
        - analogic CNN-UM cell
        - emulated digital implementation
        - the visual microprocessor and its computational infrastructure
        - computing power comparison
    16. CNN models in the visual pathway and the "bionic eye"
        - receptive field organization, synaptic weights, and cloning template
        - some prototype elementary functions and CNN models of the visual pathway
        - **a simple qualitative "engineering" model of a vertebrate retina**
        - the "bionic eye" implemented on a CNN universal machine
- **An information-theoretic approach to neural computing**
    1. Introduction
    2. Preliminaries of Information theory and neural networks
        - Elements of information theory
            - entropy and information
            - joint entropy and conditional entropy
            - kullback-leibler entropy
            - mutual information
            - differential entropy, relative entropy, and mutual information
            - chain rules
            - fundamental information theory inequalities
            - coding theory
        - elements of the theory of neural networks
            - neural network modeling
            - neural architectures
            - learning paradigms
            - feedforward networks: backpropagation
            - stochastic recurrent networks: boltzmann machine
            - unsupervised competitive learning
            - biological learning rules
    3. I - Unsupervised Learning
        1. Linear Feature Extraction: infomax principle
            - principal component analysis: statistical approach
                - PCA and diagonalization of the covariance matrix
                - PCA and optimal reconstruction
                - neural network algorithms and PCA
            - information theoretic approach: infomax
                - minimization of information loss principle and infomax principle
                - upper bound of information loss
                - information capacity as a Lyapunov function of the general stochastic approximation
        2. independent component analysis: general formulation and linear case
            - ICA-definition
            - General criteria for ICA
                - cumulant expansion based criterion for ICA
                - mutual information as criterion for ICA
            - linear ICA
            - Gaussian input distribution and linear ICA
                - networks with anti-symmetric lateral connections
                - networks with symmetric lateral connections
                - examples of learning with symmetric and anti-symmetric networks
            - Learning in gaussian ICA with rotation matrices: PCA
                - relationship between PCA and ICA in gaussian input case
                - linear gaussian ICA and the output dimension reductions
            - linear ICA in arbitrary input distribution
                - some properties of cumulants at the output of a linear transformation
                - the edgeworth expansion criteria and theorem 4.6.2
                - algorithms for output factorization in the non-gaussian case
                - experimental results of linear ICA algorithms in the non-gaussian case
        3. Nonlinear feature extraction: boolean stochastic networks
            - Infomax principle for boltzmann machines
                - learning model
                - examples of infomax principle in boltzmann machine
            - redundancy minimization and infomax for the boltzmann machine
                - learning model
                - numerical complexity of the learning rule
                - factorial learning experiments
                - **receptive fields formation from a retina**
        4. nonlinear feature extraction: deterministic neural networks
            - redundancy reduction by triangular volume conserving architectures
                - networks with linear, sigmoidal and higher order activation functions
                - simulation and results
            - unsupervised modeling of chaotic time series
                - dynamical systems modeling
            - redundancy reduction by general symplectic architectures
                - general entropy preserving nonlinear maps
                - optimizing a parameterized symplectic map
                - density estimation and novelty
            - example **theory of early vision**
                - theoretical background
                - retina model
    4. II - Supervised learning
        1. Supervised learning and statistical estimation
            - statistical parameter estimation - basic definitions
                - cramer-rao inequality for unbiased estimators
            - maximum likelihood estimators
                - maximum likelihood and the information measure
            - maximum a posteriori estimations
            - extensions of MLE to include model selection
                - akaike's information theoretic criterion (AIC)
                - minimal description length and stochastic complexity
            - generalization and learning on the same data set
        2. statistical physics theory of supervised learning and generalization
            - statistical mechanics theory of supervised learning
                - maximum entropy principle
                - probability inference with an ensemble of networks
                - information gain and complexity analysis
            - learning with higher order neural networks
                - partition function evaluation
                - information gain in polynomial networks
                - numerical experiments
            - learning with general feedforward neural networks
                - partition function approximation 
                - numerical experiments
            - statistical theory of unsupervised and supervised factorial learning
                - statistical theory of unsupervised factorial learning
                - duality between unsupervised and maximum likelihood based supervised learning
        3. composite networks
            - cooperation and specialization in composite networks
            - composite models as gaussian mixtures
        4. information theory based regularizing methods
            - theoretical framework
                - network complexity regulation
                - network architecture and learning paradigm
                - applications of the mutual information based penalty term
            - regularization in stochastic potts neural network
- **Neural network design**
    1. Introduction
        - objectives
        - history
        - biological inspiration
    2. Neuron model and network architectures
        - objectives
        - theory and examples
            - notation
            - neuron model
                - single-input neuron
                - transfer functions
                - multiple-input neuron
            - network architectures
                - a layer of neurons
                - multiple layers of neurons
                - recurrent networks
    3. an illustrative example
        - objectives
        - theory and examples
            - problem statement
            - perceptron
                - two-input case
                - pattern recognition example
            - hamming network
                - feedforward layer
                - recurrent layer
            - hopfield network
    4. perceptron learning rule
        - objectives
        - theory and examples
            - learning rules
            - perceptron architecture
                - single-neuron perceptron
                - multiple-neuron perceptron
            - perceptron learning rule
                - test problem
                - constructing learning rules
                - unified learning rule
                - training multiple-neuron perceptrons
            - proof of convergence
    5. signal and weight vector spaces
        - objectives
        - theory and examples
            - linear vector spaces
            - linear independence
            - spanning a space
            - inner product
            - norm
            - orthogonality
                - gram-schmidt orthogonalization
            - vector expansions
                - reciprocal basis vectors
    6. linear transformations for neural networks
        - objectives
        - theory and examples
            - linear transformations
            - matrix representations
            - change of basis
            - eigenvalues and eigenvectors
                - diagonalization
    7. supervised hebbian learning
        - objectives
        - theory and examples
            - linear associator
            - the Hebb rule
            - pseudoinverse rule
            - application
            - variations of Hebbian learning
    8. performance surfaces and optimum points
        - objectives
        - theory and examples
            - taylor series
                - vector case
            - directional derivatives
            - minima
            - necessary conditions for optimality
                - first-order conditions
                - second-order conditions
            - quadratic functions
                - eigensystem of the Hessian
    9. performance optimization
        - objectives
        - theory and examples
            - steepest descent
                - stable learning rates
                - minimizing along a line
            - newton's method
            - conjugate gradient
    10. widrow-hoff learning
        - objectives
        - theory and examples
            - ADALINE network
            - mean square error
            - LMS algorithm
            - analysis of convergence
            - adaptive filtering
                - adaptive noise cancellation
                - echo cancellation
    11. backpropagation
        - objectives
        - theory and examples
            - multilayer perceptrons
                - pattern classification
                - function approximation
            - the backpropagation algorithm
                - performance index
                - chain rule
                - backpropagating the sensitivities
            - using backpropagation
                - choice of network architecture
                - convergence
                - generalization
    12. variations on backpropagation
        - drawbacks of backpropagation
            - performance surface example
            - convergence example
        - heuristic modifications of backpropagation
            - momentum
            - variable learning rate
        - numerical optimization techniques
            - conjugate gradient
            - levenberg-marquardt algorithm
    13. associative learning
        - simple associative network
        - unsupervised Hebb rule
            - hebb rule with decay
        - simple recognition network
        - instar rule
            - Kohonen rule
        - simple recall network
        - outstar rule
    14. competitive networks
        - Hamming network
        - competitive layer
            - competitive learning
            - problems with competitive layers
        - competitive layers in biology
        - self-organizing feature maps
            - improving feature maps
        - learning vector quantization
            - LVQ learning
            - improving LVQ networks (LVQ2)
    15. grossberg network
        - biological motivation: vision
            - illusions
            - vision normalization
        - basic nonlinear model
        - two-layer competitive network
            - choice of transfer function
            - learning law
        - relation to kohonen law
    16. adaptive resonance theory
        - overview of adaptive resonance
        - layer - steady state analysis
        - orienting subsystem
        - learning law: L1-L2
            - subset/superset dilemma
            - learning law
        - ART1 Algorithm summary
            - initialization 
            - algorithm
    17. stability
        - recurrent networks
        - stability concepts
        - lyapunov stability theorem
        - pendulum example
        - LaSalle's invariance theorem
    18. Hopfield network
        - hopfield model
        - Lyapunov function
            - invariant sets
            - hopfield attractors
        - effect of gain
        - hopfield design
            - content-addressable memory
            - Hebb rule
            - Lyapunov surface
    19. Epilogue
        - feedforward and related networks
        - competitive networks
        - dynamic associative memory networks
        - classical foundations of neural networks
- **Introduction to the theory of neural computation**
    1. introduction
    2. the hopfield model
        - the associative memory problem
        - statistical mechanics of magnetic systems
        - stochastic networks
        - capacity of the stochastic network
    3. extensions of the hopfield model
        - variations on the hopfield model
        - correlated patterns
        - continuous-valued units
        - hardware implementations
        - temporal sequences of patterns
    4. optimization problems
        - the weighted matching problem
        - the travelling salesman problem
        - graph bipartitioning
        - optimization problems in image processing
    5. simple perceptrons
        - feed-forward networks
        - threshold units
        - proof of convergence of the perceptron learning rule
        - linear units
        - nonlinear units
        - stochastic units
        - capacity of the simple perceptron
    6. multi-layer networks
        - back-propagation
        - variations on back-propagation
        - performance of multi-layer feed-forward networks
        - a theoretical framework for generalization
        - optimal network architectures
    7. recurrent networks
        - boltzmann machines
        - recurrent back-propagation
        - learning time sequences
        - reinforcement learning
    8. unsupervised Hebbian learning
        - unsupervised learning
        - one linear unit 
        - principal component analysis
        - self-organizing feature extraction
    9. unsupervised competitive learning
        - simple competitive learning
        - adaptive resonance theory
        - feature mapping
        - theory of feature mapping
        - the travelling salesman problem
        - hybrid learning schemes
    10. formal statistical mechanics of neural networks
        - the hopfield model
        - gardner theory of the connections
    11. Statistical mechanics
        - the boltzmann-gibbs distribution
        - free energy and entropy
        - stochastic dynamics
- **Neural networks - a systematic introduction**
    1. The biological paradigm
        - Neural computation
            - Natural and artificial neural networks
            - Models of computation
            - Elements of a computing model
        - Networks of neurons
            - Structure of the neurons
            - Transmission of information
            - Information processing at the neurons and synapses
            - Storage of information - Learning
            - The neuron - a self-organizing system
        - Artificial neural networks
            - Networks of primitive functions
            - Approximation of functions
            - Caveat
        - Historical and bibliographical remarks
    2. Threshold logic
        - Networks of functions
            - Feed-forward and recurrent networks
            - The computing units
        - Synthesis of Boolean functions
            - Conjunction, disjunction, negation
            - Geometric interpretation
            - Constructive synthesis
        - Equivalent networks
            - Weighted and unweighted networks
            - Absolute and relative inhibition
            - Binary signals and pulse coding
        - Recurrent networks
            - Stored state networks
            - Finite automata
            - Finite automata and recurrent networks
            - A first classification of neural networks
        - Harmonic analysis of logical functions
            - General expression
            - The Hadamard-Walsh transform
            - Applications of threshold logic
        - Historical and bibliographical remarks
    3. Weighted Networks - The Perceptron
        - Perceptrons and parallel processing
            - Perceptrons as weighted threshold elements
            - Computational limits of the perceptron model
        - Implementation of logical functions
            - Geometric interpretation
            - The XOR problem
        - Linearly separable functions
            - Linear separability
            - Duality of input space and weight space
            - The error function in weight space
            - General decision curves
        - Applications and biological analogy
            - Edge detection with perceptrons
            - The structure of the retina
            - Pyramidal networks and the neocognitron
            - The silicon retina
        - Historical and bibliographical remarks
    4. Perceptron learning
        - Learning algorithms for neural networks
            - Classes of learning algorithms
            - Vector notation
            - Absolute linear separability
            - The error surface and the search method
        - Algorithmic learning
            - Geometric visualization
            - Convergence of the algorithm
            - Accelerating convergence
            - The pocket algorithm
            - Complexity of perceptron learning
        - Linear programming
            - Inner points of polytopes
            - Linear separability as linear optimization
            - Karmarkar's Algorithm
        - Historical and bibliographical remarks
    5. Unsupervised learning and clustering algorithms
        - Competitive learning
            - Generalization of the perceptron problem
            - Unsupervised learning through competition
        - Convergence analysis
            - The one-dimensional case - Energy function
            - Multidimensional case - The classical methods
            - Unsupervised learning as minimization problem
            - Stability of the solutions
        - Principal component analysis
            - Unsupervised reinforcement learning
            - Convergence of the learning algorithm
            - Multiple principal components
        - Examples
            - Pattern recognition
            - Image compression
        - Historical and bibliographical remarks
     6. One and two layered networks (PDF)
        - Structure and geometric visualization
            - Network architecture
            - The XOR problem revisited
            - Geometric visualization
        - Counting regions in input and weight space
            - Weight space regions for the XOR problem
            - Bipolar vectors
            - Projection of the solution regions
            - Geometric interpretation
        - Regions for two layered networks
            - Regions in weight space for the XOR problem
            - Number of regions in general
            - Consequences
            - The Vapnik-Chervonenkis dimension
            - The problem of local minima
        - Historical and bibliographical remarks
     7. The backpropagation algorithm (PDF)
        - Learning as gradient descent
            - Differentiable activation functions
            - Regions in input space
            - Local minima of the error function
        - General feed-forward networks
            - The learning problem
            - Derivatives of network functions
            - Steps of the backpropagation algorithm
            - Learning with Backpropagation
        - The case of layered networks
            - Extended network
            - Steps of the algorithm
            - Backpropagation in matrix form
            - The locality of backpropagation
            - An Example
        - Recurrent networks
            - Backpropagation through time
            - Hidden Markov Models
            - Variational problems
        - Historical and bibliographical remarks
     8. Fast learning algorithms (PDF)
        - Introduction - Classical backpropagation
            - Backpropagation with momentum
            - The fractal geometry of backpropagation
        - Some simple improvements to backpropagation
            - Initial weight selection
            - Clipped derivatives and offset term
            - Reducing the number of floating-point operations
            - Data decorrelation
        - Adaptive step algorithms
            - Silva and Almeida's algorithm
            - Delta-bar-delta
            - RPROP
            - The Dynamic Adaption Algorithm
        - Second-order algorithms
            - Quickprop
            - Second-order backpropagation
        - Relaxation methods
            - Weight and node perturbation
            - Symmetric and asymmetric relaxation
            - A final thought on taxonomy
        - Historical and bibliographical remarks
     9. Statistics and Neural Networks (PDF)
        - Linear and nonlinear regression
            - The problem of good generalization
            - Linear regression
            - Nonlinear units
            - Computing the prediction error
            - The jackknife and cross-validation
            - Committees of networks
        - Multiple regression
            - Visualization of the solution regions
            - Linear equations and the pseudoinverse
            - The hidden layer
            - Computation of the pseudoinverse
        - Classification networks
            - An application: NETtalk
            - The Bayes property of classifier networks
            - Connectionist speech recognition
            - Autoregressive models for time series analysis
        - Historical and bibliographical remarks
     10. The complexity of learning (PDF)
        - Network functions
            - Learning algorithms for multilayer networks
            - Hilbert's problem and computability
            - Kolmogorov's theorem
        - Function approximation
            - The one-dimensional case
            - The multidimensional case
        - Complexity of learning problems
            - Complexity classes
            - NP-complete learning problems
            - Complexity of learning with AND-OR networks
            - Simplifications of the network architecture
            - Learning with hints
        - Historical and bibliographical remarks
     11. Fuzzy Logic (PDF)
        - Fuzzy sets and fuzzy logic
            - Imprecise data and imprecise rules
            - The fuzzy set concept
            - Geometric representation of fuzzy sets
            - Set theory, logic operators and geometry
            - Families of fuzzy operators
        - Fuzzy inferences
            - Inferences from imprecise data
            - Fuzzy numbers and inverse operation
        - Control with fuzzy logic
            - Fuzzy controllers
            - Fuzzy networks
            - Function approximation with fuzzy methods
            - The eye as a fuzzy system - color vision
        - Historical and bibliographical remarks
     12. Associative Networks (PDF)
        - Associative pattern recognition
            - Recurrent networks and types of associative memories
            - Structure of an associative memory
            - The eigenvector automaton
        - Associative learning
            - Hebbian Learning - The correlation matrix
            - Geometric interpretation of Hebbian learning
            - Networks as dynamical systems - Some experiments
            - Another visualization
        - The capacity problem
        - The pseudoinverse
            - Definition and properties of the pseudoinverse
            - Orthogonal projections
            - Holographic memories
            - Translation invariant pattern recognition
        - Historical and bibliographical remarks
     13. The Hopfield Model (PDF)
        - Synchronous and asynchronous networks
            - Recursive networks with stochastic dynamics
            - The bidirectional associative memory
            - The energy function
        - Definition of Hopfield networks
            - Asynchronous networks
            - Examples of the model
            - Isomorphism between the Hopfield and Ising models
        - Convergence to stable states
            - Dynamics of Hopfield networks
            - Convergence proof
            - Hebbian learning
        - Equivalence of Hopfield and perceptron learning
            - Perceptron learning in Hopfield networks
            - Complexity of learning in Hopfield models
        - Parallel combinatorics
            - NP-complete problems and massive parallelism
            - The multiflop problem
            - The eight rooks problem
            - The eight queens problem
            - The traveling salesman
            - The limits of Hopfield networks
        - Implementation of Hopfield networks
            - Electrical implementation
            - Optical implementation
        - Historical and bibliographical remarks
     14. Stochastic networks (PDF)
        - Variations of the Hopfield model
            - The continuous model
        - Stochastic systems
            - Simulated annealing
            - Stochastic neural networks
            - Markov chains
            - The Boltzmann distribution
            - Physical meaning of the Boltzmann distribution
        - Learning algorithms and applications
            - Boltzmann learning
            - Combinatorial optimization
        - Historical and bibliographical remarks
     15. Kohonen networks (PDF)
        - Self-organization
            - Charting input space
            - Topology preserving maps in the brain
        - Kohonen's model
            - Learning algorithm
            - Mapping low dimensional spaces with high-dimensional grids
        - Analysis of convergence
            - Potential function - the one-dimensional case
            - The two-dimensional case
            - Effect of a unit's neighborhood
            - Metastable states
            - What dimension for Kohonen networks?
        - Applications
            - Approximation of functions
            - Inverse kinematics
        - Historical and bibliographical remarks
     16. Modular Neural Networks (PDF)
        - Constructive algorithms for modular networks
            - Cascade correlation
            - Optimal modules and mixtures of experts
        - Hybrid networks
            - The ART architectures
            - Maximum entropy
            - Counterpropagation networks
            - Spline networks
            - Radial basis functions
        - Historical and bibliographical remarks
     17. Genetic Algorithms (PDF)
        - Coding and operators
            - Optimization problems
            - Methods of stochastic optimization
            - Genetic coding
            - Information exchange with genetic operators
        - Properties of genetic algorithms
            - Convergence analysis
            - Deceptive problems
            - Genetic drift
            - Gradient methods versus genetic algorithms
        - Neural networks and genetic algorithms
            - The problem of symmetries
            - A numerical experiment
            - Other applications of GAs
        - Historical and bibliographical remarks
     18. Hardware for neural networks (PDF)
        - Taxonomy of neural hardware
            - Performance requirements
            - Types of neurocomputers
        - Analog neural networks
            - Coding
            - VLSI transistor circuits
            - Transistors with stored charge
            - CCD components
        - Digital networks
            - Numerical representation of weights and signals
            - Vector and signal processors
            - Systolic arrays
            - One-dimensional structures
        - Innovative computer architectures
            - VLSI microprocessors for neural networks
            - Optical computers
            - Pulse coded networks
        - Historical and bibliographical remarks
- **Handbook of Neural Network Signal Processing**
    1. Introduction to neural networks for signal processing
    2. signal processing using the multilayer perceptron
    3. Radial basis functions
    4. an introduction to kernel-based learning algorithms
    5. committee machines
    6. dynamical neural networks and optimal signal processing
    7. blind signal separation and blind deconvolution
    8. neural networks and principal component analysis
    9. applications of artificial neural networks to time series prediction
    10. applications of ANNs to speech processing
    11. learning and adaptive characterization of visual contents in image retrieval systems
    12. applications of neural networks to image processing
    13. hierarchical fuzzy neural networks for pattern classification

