# Chapter 01 Your Deep Learning Journey

## Objective

To strengthen my learning practice by taking notes, solving problems, and answering the questions in the book.

## Questionnaire

1. Do you need these for deep learning?
 * Lots of math T/F
 * Lots of data T/F
 * Lots of expensive computers T/F
 * A PhD T/F

  ***Answer***: None of these is required. Deep learning does not need lots of math, lots of data, expensive computers, or a PhD.

  \[book\]
  * High school math is sufficient
  * We've seen record-breaking results with <50 items of data
  * You can get what you need for state-of-the-art work for free
  * A PhD is not needed as you learn and practice deep learning to solve real problems.

2. Name five areas where deep learning is now the best tool in the world.

  ***Answer***: 
  * Natural Language Processing (such as classification, summarization, translation, etc.)
  * Medical / bioscience (protein discovery)
  * Computer Vision (such as image classification, object detection and semantic segmentation, etc.)
  * Time series forecasting (with global model)
  * Robotics

  \[book\]
  * Natural language processing (NLP): answering questions; speech recognition; summarizing documents; classifying documents; finding names, dates, etc. in documents
  * Computer Vision (CV): satellite and drone imagery interpretation, face recognition, image captioning, reading traffic signs, locating pedestrians and vehicles in autonomous vehicles
  * Medicine: finding anomalies in radiology images, including CT, MRI and X-ray images; counting features in pathology slides; measuring features in ultrasounds; diagnosing diabetic retinopathy
  * Biology: folding proteins; classifying proteins; many genomics tasks, such as tumor-normal sequencing and classifying clinically actionable genetic mutations; cell classification; analyzing protein/protein interactions
  * Image generation: colorizing images, increasing image resolution, removing noise from images, converting images to art in the style of famous artists
  * Recommendation systems: web search, product recommendations, home page layout
  * Playing games
  * Robotics


3. What was the name of the first device that was based on the principle of the artificial neuron?

  ***Answer***:

  \[book\]
  * The Mark I Perceptron
  * Rosenblatt wrote about this work: "We are now about to witness the birth of such a machine - a machine capable of perceiving, recognizing and identifying its surroundings without any human training or control." The perceptron was built and was able to successfully recognize simple shapes.

4. Based on the book of the same name, what are the requirements for parallel distributed processing (PDP)?

  ***Answer***:

  \[book\]
  * A set of processing units
  * A state of activation
  * An output function for each unit
  * A pattern of connectivity among units
  * A propagation rule for propagating patterns of activities through the network of connectivities
  * An activation rule for combining the inputs impinging on a unit with the current state of that unit to produce an output for the unit
  * A learning rule whereby patterns of connectivity are modified by experience
  * An environment within which the system must operate


5. What were the two theoretical misunderstandings that held back the field of neural networks?

  ***Answer***:

  \[book\]
  * In *Perceptrons*, Marvin Minsky and Seymour Papert showed that a single layer of these devices was unable to learn some simple but critical mathematical functions (such as XOR). In the same book, they also showed that using multiple layers of the devices would allow these limitations to be addressed. Unfortunately, only the first of these insights was widely recognized, and as a result the global academic community nearly entirely gave up on neural networks for the next two decades.
  * In theory, adding just one extra layer of neurons was enough to allow any mathematical function to be approximated with these neural networks, but in practice such networks were often too big and too slow to be useful. 
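
A minimal sketch of the first point, assuming a single unit with a step activation. The helper names (`step`, `unit`, `solves_xor`) and the weight grid are illustrative and not from the book. A brute-force search over the grid finds no single unit that reproduces XOR, while a hand-wired two-layer arrangement of the same units does:

```python
from itertools import product

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def step(z):
    """Threshold activation: fire (1) when the weighted sum is positive."""
    return 1 if z > 0 else 0

def unit(x1, x2, w1, w2, b):
    """One artificial neuron: weighted sum of two inputs followed by a threshold."""
    return step(w1 * x1 + w2 * x2 + b)

def solves_xor(predict):
    """True if predict(x1, x2) matches XOR on all four input pairs."""
    return all(predict(x1, x2) == y for (x1, x2), y in XOR.items())

# Single layer: try every (w1, w2, b) on a coarse grid from -2 to 2 (assumed range).
grid = [i / 2 for i in range(-4, 5)]
single_layer_hits = [
    (w1, w2, b)
    for w1, w2, b in product(grid, repeat=3)
    if solves_xor(lambda x1, x2: unit(x1, x2, w1, w2, b))
]

def two_layer(x1, x2):
    """Hand-picked weights: XOR = (x1 OR x2) AND NOT (x1 AND x2)."""
    h_or = unit(x1, x2, 1, 1, -0.5)    # fires if at least one input is 1
    h_and = unit(x1, x2, 1, 1, -1.5)   # fires only if both inputs are 1
    return unit(h_or, h_and, 1, -1, -0.5)

print("single units that solve XOR:", len(single_layer_hits))
print("two-layer network solves XOR:", solves_xor(two_layer))
```

The grid search is only a demonstration, not a proof; the underlying reason is that XOR is not linearly separable, so no choice of weights for one threshold unit can work.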

6. What is a GPU?

  ***Answer***:
  * GPU stands for Graphics Processing Unit, a processor that is good at parallel computing.

  \[book\]
  * Also known as a graphics card: a special kind of processor in your computer that can handle thousands of single tasks at the same time, especially designed for displaying 3D environments on a computer for playing games. These same basic tasks are very similar to what neural networks do, such that GPUs can run neural networks hundreds of times faster than regular CPUs.

7. Open a notebook and execute a cell containing: 1+1. What happens?

  ***Answer***:
  * The cell evaluates the expression 1+1 and displays 2 as its output.

  \[book\]

  * `In [1]: 1+1` gives the output `2`.

8. Follow through each cell of the stripped version of the notebook for this chapter. Before executing each cell, guess what will happen.

  ***Answer***:
  * The cells contain model-training code for CV, NLP, tabular, and recommendation problems.
  * Each cell shows the training progress and the inference result, so I can guess the output before running it.

  \[book\]

9. Complete the Jupyter Notebook online appendix (https://oreil.ly/9uPZe).

  ***Answer***:
  * I am familiar with the appendix content. Will run through it.

  \[book\]

10. Why is it hard to use a traditional computer program to recognize images in a photo?

  ***Answer***:
  * Image data is unstructured, and it is very hard to define explicit rules that identify features in an image.

  \[book\]
  * It is very tricky to recognize objects in a photo: we don't know what steps we take to recognize an object in a picture, because it all happens in our brain without us being consciously aware of it.

11. What did Samuel mean by "weight assignment"?
  
  ***Answer***:
  * A "weight assignment" is the set of values chosen for the coefficients / parameters of the function the neural network approximates; training optimizes these values.

  \[book\]
  * Weights are just variables, and a weight assignment is a particular choice of values for those variables. The program's inputs are values that it processes in order to produce its results - for instance, taking image pixels as inputs, and returning the classification "dog" as a result. The program's weight assignments are other values that define how the program will operate.
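
A minimal sketch of that idea, assuming a toy "program" that thresholds a weighted sum. The function name `classify`, the pixel values, and both weight lists are hypothetical. The inputs stay the same; only the weight assignment changes, and with it the program's behaviour:

```python
def classify(pixels, weights):
    """Toy program: a weighted sum of the inputs, thresholded at zero."""
    score = sum(p * w for p, w in zip(pixels, weights))
    return "dog" if score > 0 else "cat"

pixels = [0.9, 0.1, 0.4]            # the program's inputs (made-up values)

assignment_a = [1.0, -2.0, 0.5]     # one weight assignment
assignment_b = [-1.0, 2.0, -0.5]    # a different weight assignment

print(classify(pixels, assignment_a))
print(classify(pixels, assignment_b))
```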


12. What term do we normally use in deep learning for what Samuel called "weights"?
  
  ***Answer***:
  * model parameters

  \[book\]


13. Draw a picture that summarizes Samuel's view of a machine learning model.
  
  ***Answer***:
  ```
    inputs  -- |       |  
               | model | -- inference             -- |      |
    weights -- |       |                             | loss |  --+ 
       ^                    label / target value  -- |      |    |
       |                                                         |
       ----------------------------------------------------------+
   ```
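
The loop in the picture can be sketched in a few lines, assuming a one-weight linear model, mean squared error as the performance measure, and gradient descent as the mechanism that alters the weight. All three choices, the data, and the learning rate are illustrative assumptions, not Samuel's or the book's code:

```python
def model(x, w):
    """The model: turns an input and the current weight into a result."""
    return w * x

def loss(w, data):
    """Performance measure: mean squared error between results and labels."""
    return sum((model(x, w) - y) ** 2 for x, y in data) / len(data)

def update(w, data, lr=0.05):
    """Mechanism for altering the weight: one gradient descent step on the loss."""
    grad = sum(2 * (model(x, w) - y) * x for x, y in data) / len(data)
    return w - lr * grad

data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]   # made-up labels that follow y = 3x
w = 0.0                                        # initial weight assignment
history = []
for _ in range(50):
    history.append(loss(w, data))              # test the current weight assignment
    w = update(w, data)                        # feed the result back into the weights

print(f"learned weight: {w:.4f}")
print(f"loss went from {history[0]:.2f} to {history[-1]:.6f}")
```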

  \[book\]


14. Why is it hard to understand why a deep learning model makes a particular prediction?
  
  ***Answer***:
  * A deep learning model is a multi-layer neural network; the weights learned during training to reach the target performance form a complex function that is hard for a human to interpret.

  \[book\]
  * It's not at all obvious what the model might look like for an image recognition program, or for understanding text, or for many other interesting problems we might imagine.

15. What is the name of the theorem that shows that a neural network can solve any mathematical problem to any level of accuracy?
  
  ***Answer***:
  * universal approximation theorem. 

  \[book\]
  * A mathematical proof called the universal approximation theorem shows that this function can solve any problem to any level of accuracy, in theory. The fact that neural networks are so flexible means that, in practice, they are often a suitable kind of model, and you can focus your effort on the process of training them - that is, of finding good weight assignments. 
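
A small illustration of the "any level of accuracy" claim, not a proof. It assumes a one-dimensional target function on [0, 1] and a single hidden layer of ReLU units whose weights are set by hand rather than trained. The helper names (`build_network`, `max_error`) and the choice of target are assumptions made for this sketch. Adding hidden units shrinks the worst-case error:

```python
import math

def relu(z):
    return max(0.0, z)

def build_network(f, n_hidden):
    """One hidden layer of ReLU units that interpolates f at evenly spaced knots on [0, 1]."""
    knots = [k / n_hidden for k in range(n_hidden + 1)]
    slopes = [(f(b) - f(a)) / (b - a) for a, b in zip(knots, knots[1:])]
    # The output weight of each unit is the change in slope at its knot.
    out_weights = [slopes[0]] + [s1 - s0 for s0, s1 in zip(slopes, slopes[1:])]
    bias = f(0.0)

    def network(x):
        return bias + sum(c * relu(x - t) for c, t in zip(out_weights, knots))

    return network

def max_error(f, network, samples=1001):
    """Largest absolute gap between f and the network on an even grid over [0, 1]."""
    xs = [i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - network(x)) for x in xs)

def target(x):
    return math.sin(2 * math.pi * x)

err_4 = max_error(target, build_network(target, 4))
err_64 = max_error(target, build_network(target, 64))
print(f"4 hidden units:  max error {err_4:.4f}")
print(f"64 hidden units: max error {err_64:.4f}")
```

In practice the weights are found by training rather than constructed like this, which is why the book shifts the focus to finding good weight assignments.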


16. What do you need in order to train a model?
  
  ***Answer***:
  * For supervised ML, we need labelled data.
  * When training a DL model, we need to define the NN architecture; then, during training, we need an optimization process that adjusts the weight assignment so that model performance improves.

  \[book\]
  * data to train a model
  * a model can learn to operate on only the patterns seen in the input data used to train it.
  * this learning approach creates only predictions, not recommended actions
  * It's not enough to just have examples of input data; we need labels for that data too (e.g., pictures of dogs and cats aren't enough to train a model; we need a label for each one, saying which ones are dogs and which are cats).


17. How could a feedback loop impact the rollout of a predictive policing model?
  
  ***Answer***:
  * A predictive policing model may be intended to predict crime, but the way its predictions are used can bias the data and lead to even more policing being predicted for similar data.
    * The model learns a representation of historical data and predicts where to police.
    * However, if the outcomes of those predictions are accumulated and reused as training data, the model ends up directing policing toward whatever dominates the historical data.
  
  \[book\]
  * A model that interacts with its environment can create feedback loops:
    * A predictive policing model is created based on where arrests have been made in the past. In practice, this is not actually predicting crime, but rather predicting arrests, and is therefore partially simply reflecting biases in the existing policing process.
    * Law enforcement officers then might use that model to decide where to focus their policing activity, resulting in increased arrests in those areas.
    * Data on these additional arrests would then be fed back in to retrain future versions of the model.
    
  * this is a positive feedback loop: the more the model is used, the more biased the data becomes, making the model even more biased, and so forth.
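
The loop above can be sketched as a toy simulation. Everything in it is a hypothetical assumption: two districts with the same true crime rate, a slightly uneven arrest history, and a "model" that sends most patrols to whichever district has more recorded arrests. Even so, the recorded data drifts further toward the district that started ahead:

```python
def simulate(rounds, arrests, true_crime_rate=0.5, patrols=100, hotspot_share=0.8):
    """Each round the 'model' flags the district with the most recorded arrests as the hot spot."""
    arrests = list(arrests)
    shares = []
    for _ in range(rounds):
        hot = 0 if arrests[0] >= arrests[1] else 1
        allocation = [patrols * (1 - hotspot_share)] * 2
        allocation[hot] = patrols * hotspot_share
        # Same true crime rate everywhere: new arrests depend only on patrol presence.
        for d in range(2):
            arrests[d] += allocation[d] * true_crime_rate
        # Share of all recorded arrests that belong to district 0.
        shares.append(arrests[0] / sum(arrests))
    return shares

shares = simulate(rounds=10, arrests=[55, 45])
print("district 0 share of arrests, initially: 0.55")
print("after each retraining round:", [round(s, 3) for s in shares])
```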


18. Do we always have to use 224*224-pixel images with the cat recognition model?
  
  ***Answer***:
  * Not necessarily; using 224*224-pixel images is for simplicity (the given pictures may have different shapes).

  \[book\]
  * This is the standard size for historical reasons (old pretrained models require this size exactly), but you can pass pretty much anything. If you increase the size, you'll often get a model with better results (since it will be able to focus on more details), but at the price of speed and memory consumption; the opposite is true if you decrease the size.
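
A rough back-of-the-envelope sketch of the size trade-off, assuming square RGB images and using the number of input values per image as a proxy for cost. The helper `values_per_image` is hypothetical, and real speed and memory use also depend on the architecture and batch size:

```python
def values_per_image(size, channels=3):
    """Number of input values in one square image with the given side length."""
    return size * size * channels

baseline = values_per_image(224)
for size in (112, 224, 448):
    n = values_per_image(size)
    print(f"{size}x{size}: {n:,} values ({n / baseline:.2f}x the 224 baseline)")
```

Doubling the side length quadruples the number of values the model has to process, and halving it cuts that number to a quarter.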


19. What is the difference between classification and regression?
  
  ***Answer***:
  * Classification is used to predict discrete values or categories. In a nutshell, it assigns inputs to categories.
  * regression is used to predict continuous numbers.

  \[book\]
  * A classification model is one that attempts to predict a class, or category. That is, it's predicting from a number of discrete possibilities, such as 'dog' or 'cat'.
  * a regression model is one that attempts to predict one or more numeric quantities, such as temperature or a location. 
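
The difference shows up in the type of the output. A minimal sketch, assuming made-up scores, features, and weights; `predict_class` and `predict_value` are hypothetical helpers, not fastai functions:

```python
def predict_class(scores, classes):
    """Classification: pick one of a fixed set of discrete categories."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return classes[best]

def predict_value(features, weights, bias):
    """Regression: produce a numeric quantity."""
    return sum(f * w for f, w in zip(features, weights)) + bias

label = predict_class([0.2, 1.7], ["cat", "dog"])
temperature = predict_value([3.0, 1.0], [2.5, -1.0], 10.0)

print(label)        # a category
print(temperature)  # a number
```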


20. What is a validation set? What is a test set? Why do we need them?
  
  ***Answer***:
  * Validation and test sets are used to check model performance on data the model has not seen during training; overall, their purpose is to detect and avoid overfitting.
  * The validation set (development set) is used during the training process: instead of evaluating model performance on the training data, we evaluate it on the validation set.
  * The test set is used after training is complete.
  
  \[book\]
  * During model training, measure model performance only on the validation set. When the accuracy on the validation set starts to get worse, the model is overfitting.
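
A minimal sketch of why held-out data matters, assuming a made-up dataset and a deliberately bad "model" that only memorizes its training examples. The split here takes every fifth item so the result is reproducible; in practice the split is usually random:

```python
data = [(x, x % 2) for x in range(100)]   # hypothetical dataset: the label is the parity of x

# Hold out every fifth item for validation; the model never sees these during training.
valid = [item for i, item in enumerate(data) if i % 5 == 0]
train = [item for i, item in enumerate(data) if i % 5 != 0]

# A "model" that overfits completely: it memorizes the training examples.
memory = dict(train)

def predict(x):
    return memory.get(x, 0)   # unseen input: fall back to a fixed guess

def accuracy(items):
    return sum(predict(x) == y for x, y in items) / len(items)

train_acc = accuracy(train)
valid_acc = accuracy(valid)
print(f"training accuracy:   {train_acc:.2f}")
print(f"validation accuracy: {valid_acc:.2f}")
```

Measured on the training data alone, the memorizing model looks perfect; only the validation set reveals that it has learned nothing that generalizes.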


21. What will fastai do if you don't provide a validation set?

  ***Answer***
  
  \[book\]

22. Can we always use a random sample for a validation set? Why or why not?

  ***Answer***
  
  \[book\]

23. What is overfitting? Provide an example.

  ***Answer***
  
  \[book\]

24. What is a metric? How does it differ from loss?

  ***Answer***
  
  \[book\]

25. How can pretrained models help?

  ***Answer***
  
  \[book\]

25. 

  ***Answer***
  
  \[book\]

## Notes

* We strongly believe that the focus in learning needs to be on understanding the underlying techniques and how to apply them in practice, and how to quickly build expertise in new tools and techniques as they are released.