# Deep learning

Yann LeCun, Yoshua Bengio, Geoffrey Hinton (2015). *Nature* 521, 436–444.

## Key points

- Conventional ML: careful feature engineering + domain knowledge
- DL: learn features + representations from the data itself
  - Conventional ML needs a good hand-designed feature extractor to generalize well, while DL automatically learns representations that are sensitive to relevant details and invariant to irrelevant variations
- Advent of DL: unsupervised pre-training + fine-tuning with backprop generalizes well even with smaller labeled datasets (see the first sketch after this list)
- CNNs: convolutional and pooling layers + hierarchy (see the conv/pool sketch after this list)
  - Convolution: detect local conjunctions of features from the previous layer
  - Pooling: merge semantically similar features into one
  - Natural signals are hierarchies (edge → motif → part → object)
  - Pooling makes the representation invariant to small shifts and distortions in position/appearance
- Deep nets make use of distributed representations (see the counting sketch after this list)
  - Enable generalization by combining learned features in new ways
  - Composing many layers of representations compounds the same advantage
- RNNs: process sequential inputs, maintaining a state vector that summarizes the history (see the RNN/LSTM sketch after this list)
  - Suffer from vanishing/exploding gradients when trained with backprop through time
  - For even longer-term memory: LSTM (a memory cell guarded by input and forget gates), Neural Turing Machine (explicit tape-like memory)
- Future of DL: unsupervised learning, and combining representation learning with complex reasoning
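
A minimal numpy sketch of the pre-train-then-fine-tune recipe on toy random data. The tied-weight autoencoder, the shapes, and the learning rates are illustrative assumptions for the sketch, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 20))            # toy "unlabeled" data
y = (X[:, 0] > 0).astype(float)           # toy labels, used only in phase 2

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Phase 1: unsupervised pre-training.
# Learn W by reconstructing X with a one-hidden-layer autoencoder
# (tied encoder/decoder weights, for brevity).
W = rng.normal(scale=0.1, size=(20, 8))
for _ in range(200):
    H = sigmoid(X @ W)                    # encode
    err = H @ W.T - X                     # reconstruction error
    # gradient of 0.5 * ||err||^2 w.r.t. the tied weights W
    dW = err.T @ H + X.T @ ((err @ W) * H * (1 - H))
    W -= 1e-3 * dW / len(X)

# Phase 2: supervised fine-tuning with backprop.
# Add a logistic head v and update both v and the pre-trained W.
v = rng.normal(scale=0.1, size=8)
for _ in range(200):
    H = sigmoid(X @ W)
    p = sigmoid(H @ v)
    g = (p - y) / len(X)                  # grad of logistic loss at the output
    W -= 0.5 * X.T @ (np.outer(g, v) * H * (1 - H))
    v -= 0.5 * H.T @ g

print("train accuracy:", ((p > 0.5) == (y > 0.5)).mean())
```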
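
A conv/pool sketch in plain numpy: a toy edge-detecting kernel plays the role of a learned filter, and max-pooling coarsens positions while keeping the detected features. The kernel, sizes, and use of max-pooling are illustrative choices:

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Detect local conjunctions of features: slide a small kernel
    over the image and record its response at every position."""
    H, W = img.shape
    kh, kw = kernel.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fm, size=2):
    """Merge semantically similar features: keep only the strongest
    response in each size x size window, giving tolerance to small shifts."""
    H, W = fm.shape
    H, W = H - H % size, W - W % size
    return fm[:H, :W].reshape(H // size, size, W // size, size).max(axis=(1, 3))

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 8))
edge_kernel = np.array([[1.0, -1.0]])                # toy vertical-edge detector
fm = np.maximum(conv2d_valid(img, edge_kernel), 0)   # convolution + ReLU
pooled = max_pool(fm)
print(pooled.shape)                                  # (4, 3): same features, coarser positions
```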
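
A counting sketch of the distributed-representation advantage. The 2^n-versus-n contrast is the standard argument; the code itself is just an illustration:

```python
import numpy as np

# Local (one-hot) coding dedicates one unit per concept, so n units
# can represent only n distinct things. A distributed code reuses the
# same units in combination, so n binary features can distinguish
# 2**n patterns, and new combinations of familiar features generalize.
n = 10
local = np.eye(n)                                  # 10 units -> 10 concepts
distributed = np.array([[int(b) for b in format(k, f"0{n}b")]
                        for k in range(2 ** n)])   # 10 units -> 1024 concepts
print(local.shape, distributed.shape)              # (10, 10) (1024, 10)
```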
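
An RNN/LSTM sketch in numpy showing both recurrences on random inputs. Weight scales and dimensions are arbitrary illustrative choices; the LSTM here is the standard cell, which adds an output gate to the input and forget gates mentioned above:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

d_in, d_h = 4, 8
xs = rng.normal(size=(10, d_in))          # a length-10 input sequence

# Plain RNN: one state vector h summarizes the whole history.
Wx = rng.normal(scale=0.3, size=(d_in, d_h))
Wh = rng.normal(scale=0.3, size=(d_h, d_h))
h = np.zeros(d_h)
for x in xs:
    # Backprop through time multiplies repeatedly by Wh, which is
    # exactly what makes gradients vanish or explode over long spans.
    h = np.tanh(x @ Wx + h @ Wh)

# LSTM cell: a memory cell c guarded by gates. The forget gate decides
# how much history to keep, letting gradients survive long spans.
W = rng.normal(scale=0.3, size=(d_in + d_h, 4 * d_h))  # fused gate weights
h, c = np.zeros(d_h), np.zeros(d_h)
for x in xs:
    z = np.concatenate([x, h]) @ W
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)       # input/forget/output gates
    c = f * c + i * np.tanh(g)            # gated update of the memory cell
    h = o * np.tanh(c)                    # gated readout of the cell
```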