
LIE: Learning Is Experimenting

Can we learn deep learning in depth just by experimenting with working examples?

transfer learning based on VGG16

  • learn the TensorFlow source code through a VGG16 example
    • minimal comments on the source code
  • notebooks showing the process, following fast.ai examples
    • how image data flows through each filter or layer
    • how the weights update as each sample runs through
    • the effects of small vs. large learning rates
  • build VGG16 with core tf.contrib.keras (see the sketch below)
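
A minimal sketch of that last bullet, assuming the Keras applications API (the repo targets tf.contrib.keras; modern TensorFlow exposes the same call under tf.keras):

```python
# Minimal sketch: build pretrained VGG16 via the Keras applications API.
from tensorflow.keras.applications import VGG16

# include_top=True keeps the three dense layers and the 1000-class softmax
model = VGG16(weights="imagenet", include_top=True)
model.summary()  # lists every conv block, pooling, and dense layer
```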

Understanding the nature of transfer learning

  • what makes the ImageNet dataset unique for this task
    • one central object in each image
    • the rest is background or environment
  • the nature of the problem:
    • inputs: lots of images, each containing a central object
    • output: one object name out of 1000 object names (see the sketch below)
  • VGG16 uses many layers of filters to screen the same image for different features (at different depths and focuses)
  • the weights and layers capture the correlation between inputs and outputs
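
To make that input/output contract concrete, here is a hedged sketch of a single forward pass, assuming the standard Keras VGG16 helpers and a hypothetical image file cat.jpg:

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import (VGG16, decode_predictions,
                                                 preprocess_input)
from tensorflow.keras.preprocessing import image

model = VGG16(weights="imagenet")
img = image.load_img("cat.jpg", target_size=(224, 224))        # hypothetical file
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
preds = model.predict(x)                                       # shape (1, 1000)
print(decode_predictions(preds, top=3)[0])                     # best 3 of 1000 names
```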

how to use VGG16 for transfer learning

borrow from VGG16

when you don't have a large image dataset or huge GPU computing power

  • prepare data
  • build model
  • load weights

make it work for you

when you have a small dataset of your own (correctly labeled for training)

  • finetune (sketched below)
  • train
  • predict
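
A hedged sketch of the finetune step: freeze VGG16's pretrained layers and attach a new head for your own classes (the 2-class head and layer sizes are assumptions, not the repo's exact code):

```python
# Freeze the pretrained conv layers, replace the classifier for a new task.
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
for layer in base.layers:
    layer.trainable = False                  # keep the expensive conv weights fixed

x = Flatten()(base.output)
out = Dense(2, activation="softmax")(x)      # new last layer for the new task
model = Model(base.input, out)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_batches, ...) then model.predict(test_batches)
```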

interactive visualization of CNN models

what if there is no readily labeled dataset, only unlabeled images and videos?

  • find useful weights or layers (e.g., weights or layers that detect a vertically flying object)
  • build a model from those specific weights and layers (see the sketch below)
  • use that model to filter out interesting images
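
One way to build such a model, sketched with the Keras functional API; the layer name "block3_conv3" is just an example taken from VGG16's summary:

```python
# Expose an intermediate layer's activations as a reusable feature detector.
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model

vgg = VGG16(weights="imagenet")
feature_model = Model(inputs=vgg.input,
                      outputs=vgg.get_layer("block3_conv3").output)
# feature_model.predict(images) yields mid-level feature maps that could be
# thresholded to filter interesting frames out of unlabeled images or video.
```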

workflow

prepare data

  • how to prepare images into train, valid, test, and sample folders: notebook

  • convert a train|valid|test|sample folder into batch-iterator objects (see the sketch below): source

  • convert images in folders to batches, then to one large array: source
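
A sketch of the batch-iterator step, assuming a dogscats-style layout with one subfolder per class under data/train and data/valid (paths are placeholders):

```python
# Turn class-per-subfolder directories into batch iterators.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

gen = ImageDataGenerator()
train_batches = gen.flow_from_directory("data/train", target_size=(224, 224),
                                        batch_size=64, class_mode="categorical")
valid_batches = gen.flow_from_directory("data/valid", target_size=(224, 224),
                                        batch_size=64, class_mode="categorical")
# next(train_batches) yields one (images, one_hot_labels) batch at a time
```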


build model

  • how to create a vgg16 model instance from tf.contrib.keras: source

Fine-tuning the vgg16 model

  • fine_tune1: just replace the last layer: source

  • fine_tune2: replace the last layer and add an additional dense layer before it (see the sketch below): source
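
A hedged sketch of fine_tune2 with the Keras functional API: take the activations feeding the old 1000-way softmax, insert an extra dense layer, then attach a fresh softmax (the 2-class head is an assumption):

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model

vgg = VGG16(weights="imagenet")
x = vgg.layers[-2].output                    # output of the fc2 layer
x = Dense(4096, activation="relu")(x)        # the additional dense layer
out = Dense(2, activation="softmax")(x)      # the replaced last layer
model = Model(vgg.input, out)
```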


train the fine-tuned model

  • train or fit the vgg16 model: source

  • save/load model+weights, save/load the model alone, save/load the weights alone, load an old model and make a new one, load weights into a new model: source

  • load, train again, and save the vgg16 model: source

  • save large arrays with maximum memory efficiency (see the sketch after this list): source

    • for the same large array, a bcolz file is about 4x smaller than one saved by numpy, pickle, torch, or kur.idx
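
A sketch of the bcolz save/load helpers popularized by the fast.ai notebooks:

```python
import bcolz

def save_array(fname, arr):
    c = bcolz.carray(arr, rootdir=fname, mode="w")   # compressed on-disk array
    c.flush()

def load_array(fname):
    return bcolz.open(fname)[:]                      # slice back into a numpy array
```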

predict or test models

  • load the vgg16 model, predict on test batches, and save the predictions: source

  • np.clip, log loss, ids-pred array (see the sketch after this list): source

    • why and how to clip predictions
    • extract image ids and column-bind them with the predictions
    • how log loss (cross-entropy) behaves as predictions vary

  • build a model with Sequential: source

  • vgg16 model: decode predictions for the 1000 classes: source

  • how to process an image dataset for vgg16: source
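
A numpy sketch of the clipping bullet: log(0) is -inf, so one confidently wrong 0/1 prediction blows up the log-loss score, and clipping bounds the penalty:

```python
import numpy as np

def log_loss(y_true, y_pred, eps=1e-15):
    y_pred = np.clip(y_pred, eps, 1 - eps)   # avoid log(0) = -inf
    return -np.mean(y_true * np.log(y_pred) +
                    (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1.0, 0.0, 1.0])
overconfident = np.array([1.0, 0.0, 0.0])    # one confidently wrong answer
print(log_loss(y_true, overconfident))                        # huge (~11.5)
print(log_loss(y_true, np.clip(overconfident, 0.05, 0.95)))   # bounded (~1.03)
```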


  1. how to call a keras function manually to test it: forum

  2. (todo) how to plot (see the sketch after this list): source

    • a few correct labels at random
    • a few incorrect labels at random
    • the most correct labels of each class (i.e., correct labels with the highest probability)
    • the most incorrect labels of each class (i.e., incorrect labels with the highest probability)
    • the most uncertain labels (i.e., those with probability closest to 0.5)
    • confusion matrix
  3. (kr1.2) how to finetune vgg16 with a new set of classes: fast.ai source
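
A sketch of the bookkeeping behind those plots, assuming binary probabilities `probs` and ground-truth `labels`; confusion_matrix comes from scikit-learn:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

probs = np.random.rand(100)                    # stand-in for model probabilities
labels = (np.random.rand(100) > 0.5).astype(int)
preds = (probs > 0.5).astype(int)

correct = np.where(preds == labels)[0]         # indices of correct predictions
incorrect = np.where(preds != labels)[0]       # indices of incorrect predictions
most_uncertain = np.argsort(np.abs(probs - 0.5))[:5]   # closest to 0.5
print(confusion_matrix(labels, preds))
```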

fast.ai Lesson notes

Lesson 1: vgg16 on dogscats

  1. the most noble goal of the fast.ai course: fast.ai.wiki

  2. why VGG16 preprocesses images the way it does: fast.ai.forum

    • the channel means and the RGB channel order are fixed in advance
  3. how to apply VGG to the catsdogs problem: fast.ai

  4. why train in batches: fast.ai

  5. (kr) how to switch from the theano backend to the tf backend, and how to switch between cpu and gpu in theano: fast.ai

  6. why study state-of-the-art models like VGG: fast.ai

  7. 7 steps to recognize basic images using vgg16: fast.ai

  8. how to see the limits and biases of a pretrained model like vgg16: fast.ai

  9. what it means to finetune a pretrained model like vgg16: fast.ai

  10. why create sample/train/valid/test folders from the full dataset: fast.ai

  11. how to count the number of files in a folder: ls folder/ | wc -l

  12. how to unzip a zip file: unzip -q data.zip

  13. how to check the python version: python --version (which python only shows which interpreter is on the path)

  14. how to organize the dataset before training (see the sketch after this list): source

    • train, test, sample folders
    • sample: train, valid, test subfolders
    • experiment with code on the sample dataset
  15. how to check the first few files of a folder: ls folder/ | head

  16. how to submit to kaggle: fast.ai

  17. how to use the kaggle cli to download and submit: source

Lesson 2: vgg16 on dogscats

  1. make a todo list before working on any project or problem

  2. how to prepare data after downloading it from kaggle: fast.ai.wiki

  3. why save weights separately rather than only saving the model? (see the sketch after this list)

    • Answer: by saving the model, loading an empty model, or saving and loading only the weights, we can load weights into the same or even a different model: keras doc
  4. how to submit to kaggle using keras.predict_generator and FileLink: wiki

  5. why clip the final predictions for a better log-loss score: wiki

  6. what CNNs learn: fast.ai video

    • beautifully explained
    • it seems very useful to view all layers and weights of vgg16
      • (todo) check the forum for similar questions
      • (todo) do it myself with my own prewritten functions
  7. how does deep learning work in excel? fast.ai video

    • (todo) display in excel how the input layer and weights layer create the output | activation layer
    • (todo) display how to do Xavier initialization on weights
  8. how to visualize SGD gradually optimizing weights to fit the correct line: fast.ai video

    • (todo) rewrite as numpy code
    • (todo) keras on a linear model with SGD
  9. how to efficiently save and load large arrays; save, load, and plot from image files in folders; do one-hot encoding: fast.ai video

    • (todo) use bcolz to efficiently save and load the predictions array
    • (todo) get batches of images loaded from folders into proper arrays
    • (todo) save, load, and plot those arrays as images
    • (todo) do one-hot encoding
  10. how to use a single dense network with the vgg16 model's power: fast.ai video

    • (todo) use keras Sequential, Dense, and vgg16's outputs
  11. the power of the activation layer, the non-linear function: fast.ai video

    • input dot weights == a linear model
    • adding an activation function == a non-linear model
    • activation functions are what make deep learning powerful
  12. (todo) how to visualize a model as a dot graph: forum

    • source
      • not very useful; model.summary() seemingly does a better job
  13. challenges:

    • (todo) dissect every line of vgg16()
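
A sketch of item 3's answer with the standard Keras API, on a toy model:

```python
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential, load_model

model = Sequential([Dense(10, activation="relu", input_shape=(4,)),
                    Dense(2, activation="softmax")])

model.save("model.h5")                  # architecture + weights + optimizer state
restored = load_model("model.h5")       # no need to rebuild the architecture

model.save_weights("weights.h5")        # weights only: smaller, more flexible
restored.load_weights("weights.h5")     # works for the same or a compatible model
```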

Lesson 3: fast.ai part 1

  1. how to experiment with the notebook: video

    • write out the source code
    • experiment for 30 minutes, then check questions and answers on the wiki and forum
    • then ask questions on the forum
  2. the deep viz toolbox, with its usage better explained: video

    • (todo) install and try it as the video does
    • (todo) can this apply to RNNs and other neural nets?
    • (todo later) there are vis tools for RNNs too
  3. understand CNNs with a spreadsheet and a notebook: video

    • (todo) spreadsheet: compare Dense with Conv in creating an activation layer
    • (todo) notebook: explain Conv, Pad, and Pool step by step
    • (todo) convert the notebook to a spreadsheet: a spreadsheet displays things more intuitively
  4. review VGG: video

    • (todo) demo the todo tasks above with the vgg16 model as the example
  5. maxpooling, padding: video

    • (todo) spreadsheet and notebook: maxpooling loses pixels; as layers go deeper, the filters' view shrinks further, yet the activation layers become recognizable as higher-level concepts
    • how to make sense of it all
  6. softmax with a spreadsheet: video

    • (todo)
  7. review the SGD notebook (see the sketch after this list): video

    • (todo) intuition: how SGD and derivatives update the weight and bias toward the target's optimal values
    • (todo) convert the notebook to a spreadsheet?
  8. how to decide on filter size and num_filters: video

    • practically, filter size is (3,3)
    • num_filters: no settled answer yet
    • how to deal with much larger images:
      • no settled answer yet: perhaps attention with an LSTM, to mimic how an actual eye works?
  9. how to retrain vgg16 with more layers trainable: video

    • (todo) how many layers to make trainable?
      • intuition: know which layer does what through weight visualization
      • experiment to see what works better
      • vgg16: retrain 1 dense layer for dogscats but 3 dense layers for the statefarm dataset, since CNN layers are oriented toward positional invariance
  10. understand underfitting and overfitting: wiki

    • underfitting: e.g., using a linear model to do vgg16's job; with far too few parameters for a complex problem, both training and validation errors stay high
    • overfitting: using too many parameters for a relatively simple problem; training loss gets very low, but val_loss stays higher
  11. what a dropout layer does: video and wiki

    • dropping out 50% of neurons keeps the model from overfitting even when it has a huge number of neurons
    • in the vgg16 dogscats case, 50% was too much and made the model underfit, so lowering dropout from 50% toward 10% may avoid underfitting
    • dropout is like ensembling: it effectively creates many smaller, different models
  12. split vgg16 into 2 models: video

    • (todo) split off a cnn model
      • don't change anything in the cnn model, as its weights are expensive to train
    • (todo) split off a dense model: change the dense model as you like; it is cheap to train
  13. how to do data augmentation: video

    • helps reduce overfitting
    • how to rotate or otherwise augment your dataset
    • (todo) make it your own
  14. how to do batch normalization: video

    • why normalize inputs:
      • if inputs have different scales, training is harder and the loss can blow up, so ALWAYS normalize your inputs
    • why do batch normalization:
      • about 10x faster training
      • reduces overfitting
    • what batch normalization is:
      • normalize not the inputs but the intermediate layers
      • apply two trainable parameters to each layer: a learned std and mean
    • (todo) cnn inputs + dense model + batch normalization, then train this new model
  15. (todo) end-to-end model-building process with mnist: video

    • load the mnist dataset
    • one-hot encode the labels
    • normalize the inputs
    • single dense model, then a 1-hidden-layer dense model, on mnist
    • vgg-style simple cnn model video
    • make sure the model is capable of some overfitting, then work to reduce overfitting
    • data augmentation, batch normalization on every layer (do understand the source), dropout
    • ensembling
  16. challenge: statefarm solutions
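
A numpy sketch of the SGD notebook's idea (item 7): gradient descent, full-batch here for brevity, gradually recovers the line y = 3x + 8 from noisy fake data:

```python
import numpy as np

x = np.random.rand(100)
y = 3.0 * x + 8.0 + 0.1 * np.random.randn(100)   # true a=3, b=8, plus noise

a, b, lr = 0.0, 0.0, 0.1
for _ in range(1000):
    err = a * x + b - y
    a -= lr * (2 * err * x).mean()               # d(mse)/da
    b -= lr * (2 * err).mean()                   # d(mse)/db
print(a, b)                                      # approaches 3 and 8
```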


Stanford Tensorflow for Deep Learning Research

  1. (tf) how to use math ops: source

  2. how to create histogram plot: source

  3. (tf) how to create random dataset: source

  4. (tf) how to make a sequence with tf.linspace and tf.range (see the sketch after this list): source

    • np.linspace and range can be iterated over
    • tf.linspace and tf.range cannot: they are tensors, not Python sequences
  5. (tf) how to use tf.ones, tf.ones_like, tf.zeros, tf.zeros_like, tf.fill: source

  6. (tf) how to use tf.constant and tricks: source

  7. (tf) how to use sess, graph, to display ops in tensorboard: source
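
A TF 1.x-era sketch of item 4, matching the course's era; the sequences are tensors, so they must be evaluated in a session rather than iterated like np.linspace:

```python
import tensorflow as tf

lin = tf.linspace(0.0, 10.0, 5)    # 5 evenly spaced floats from 0 to 10
rng = tf.range(0, 10, 2)           # like Python's range, but a tensor

with tf.Session() as sess:         # tf.Session exists in TF 1.x graph mode
    print(sess.run(lin))           # [ 0.   2.5  5.   7.5 10. ]
    print(sess.run(rng))           # [0 2 4 6 8]
```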


matplotlib

  1. how to do subplots without space in between (see the sketch after this list): source

  2. how to quick-plot stock csv (use full csv path, plot close and volume right away): source

  3. how to plot a stock csv with dates (csv to vec object, date formatter): source

  4. how to reverse a list or numpy array?

    • list(reversed(your_list))
    • your_array[::-1]
  5. how to gridsubplot (stock chart like subplots): source

  6. how to subplot (4 equal subplots, 1 large + 3 small subplots): source

  7. how to plot images (array shape for image, interpolation, cmap, origin studied): source

  8. how to plot contour map (not studied): source

  9. how to plot bars (set facecolor, edgecolor, text loc for each bar, ticks, xlim): source

  10. how to do scatter plot (set size, color, alpha, xlim, ignore ticks): source

  11. how to set x,y ticks labels fontsize, color, alpha: source

  12. how to add annotation or text: source

  13. how to add labels to lines and change labels when setting legend: source

  14. how to reposition x,y axis anywhere with plt.gca(), ax.spines, ax.xaxis: source

  15. how to set line params, xlim, xticks: source

  16. how to plot subplots of 4 activation lines: source
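
A sketch of item 1, gap-free subplots via subplots_adjust; shared axes keep the tick labels from colliding where the panels touch:

```python
import matplotlib.pyplot as plt
import numpy as np

fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)
fig.subplots_adjust(wspace=0, hspace=0)      # remove gaps between panels
for ax in axes.flat:
    ax.plot(np.random.randn(50).cumsum())
plt.show()
```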


Pytorch

  1. torch activations: sources

  2. torch Variables: source

  3. torch.tensor vs numpy (see the sketch after this list): source

  4. read stock csv:

  5. train cnn net: source

  6. train rnn net: source

  7. build rnn net: source

  8. build cnn net: source

  9. prepare_mnist: source
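
A sketch of item 3, torch <-> numpy conversion; on CPU the two share memory:

```python
import numpy as np
import torch

a = np.arange(6).reshape(2, 3)
t = torch.from_numpy(a)            # numpy -> tensor, zero-copy on CPU
b = t.numpy()                      # tensor -> numpy, also zero-copy
t[0, 0] = 99                       # mutation is visible through every view
print(a[0, 0], b[0, 0])            # 99 99
```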



Numpy

  1. 100 numpy exercises: source

  2. pandas exercises: source

  3. how to save and load numpy arrays with numpy itself (see the sketch below): doc example
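
A sketch of numpy's own persistence helpers:

```python
import numpy as np

arr = np.random.rand(3, 4)
np.save("arr.npy", arr)                     # one array, binary .npy file
loaded = np.load("arr.npy")

np.savez_compressed("arrs.npz", a=arr)      # several arrays, compressed .npz
back = np.load("arrs.npz")["a"]
```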

Visualize convolution

  1. regression on a fake 1d dataset:

  2. classification on a fake 1d dataset:

  3. cnn on mnist:

  4. rnn on mnist: need tensorboard to see the model structure
