Using Keras with R talk
========================================================
author: Anton Antonov
date: 2018-06-02
autosize: true
## [Orlando Machine Learning and Data Science meetup](https://www.meetup.com/Orlando-MLDS)
### [Deep Learning series (session 2)](https://www.meetup.com/Orlando-MLDS/events/250086544/)
Very short introduction
========================================================
Talking about the TensorFlow / Keras / R combination:
```{r, eval=FALSE}
library(keras)

model <- keras_model_sequential()
model %>%
  layer_dense(units = 256, activation = 'relu', input_shape = c(784)) %>%
  layer_dropout(rate = 0.4) %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dropout(rate = 0.3) %>%
  layer_dense(units = 10, activation = 'softmax')

summary(model)
```
Detailed introduction 1
========================================================
## Goals (messages to convey)
- Understanding deep learning by comparison
- Taking a system analysis approach
- Analogy with [a man made Machine Learning algorithm](https://mathematicaforprediction.wordpress.com/2013/08/26/classification-of-handwritten-digits/)
- Deep learning libraries:
  - TensorFlow, Keras, MXNet.
- Making neural networks is no longer so much like assembling [Rube Goldberg machines](https://en.wikipedia.org/wiki/Rube_Goldberg_machine);
  it is more like building with a Lego set or a Soma cube.
Detailed introduction 2
========================================================
## Keras in R
- Classification with the [MNIST data set](http://yann.lecun.com/exdb/mnist/)
- Classification of IMDB reviews
- Some questions / explorations to consider
## Other
- The Trojan horse ([MXNet](https://mxnet.incubator.apache.org), [Mathematica](https://www.wolfram.com))
- [Powered By](https://mxnet.incubator.apache.org/community/powered_by.html)
Links
========================================================
- The book ["Deep Learning with R"](https://www.manning.com/books/deep-learning-with-r)
  - The first three chapters are free (and well worth reading on their own):
    \[[1st](https://manning-content.s3.amazonaws.com/download/6/3bdf613-e2f6-48fa-8710-b3bd0b7979e6/SampleCh01.pdf)\],
    \[[2nd](https://manning-content.s3.amazonaws.com/download/4/481437b-2746-4ab1-94a7-c25eab8fae44/SampleCh02.pdf)\],
    \[[3rd](https://manning-content.s3.amazonaws.com/download/9/9a3b0d8-e651-4239-8c4f-94267be64fee/SampleCh03.pdf)\]
- [The book's Rmd notebooks](https://github.com/jjallaire/deep-learning-with-r-notebooks) are on GitHub.
- [RStudio's Keras page](https://keras.rstudio.com)
- [another one](https://tensorflow.rstudio.com/keras/)
Who am I?
========================================================
- MSc in Mathematics (Abstract Algebra).
- MSc in Computer Science (Databases).
- PhD in Applied Mathematics (Large Scale Air Pollution Simulations).
- Former Kernel Developer of Mathematica (7 years).
- Currently branding as a "Senior Data Scientist."
- 10+ years of experience applying machine learning algorithms in commercial settings.
  - A large part of it building recommendation systems and doing related data analysis.
  - Currently working in healthcare.
Audience questions
========================================================
- How many use R?
- How many use Python?
- How many are data scientists?
- How many are engineers?
- How many are students?
How does Keras address Deep Learning's most important feature?
========================================================
- The principle: "Trying to see without looking."
- No special feature engineering required.
- The development speed-up of using Keras, in general and in R.
- The Paris Gun pattern.
Analogy: a classifier based on matrix factorization 1
========================================================
**1.** [Training phase](https://mathematicaforprediction.wordpress.com/2013/08/26/classification-of-handwritten-digits/)
1.1. Rasterize each training image into a 16 x 16 array of pixels.
1.2. Linearize each raster image: align its rows into a one-dimensional array.
In other words, map each raster image into the vector space R^256.
We call these one-dimensional arrays raster vectors.
1.3. From each set of images corresponding to a digit, make a matrix with 256 columns out of the corresponding raster vectors.
1.4. Using the matrices from step 1.3, apply thin SVD to derive orthogonal bases that describe the image data for each digit.
Analogy: a classifier based on matrix factorization 2
========================================================
**2.** [Recognition phase](https://mathematicaforprediction.wordpress.com/2013/08/26/classification-of-handwritten-digits/)
2.1. Given an image of an unknown digit, derive its raster vector R.
2.2. Find the residuals of the approximations of R with each of the bases found in step 1.4.
2.3. The digit with the minimal residual is the recognition result.
- See [more](https://mathematicaforprediction.wordpress.com/?s=NNMF).
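The two phases above can be sketched in base R. This is an illustrative toy, not the linked blog post's code: synthetic cluster data stands in for the 16 x 16 digit rasters, and the class names `a` and `b` are placeholders.

```{r}
# Toy sketch of the SVD classifier: synthetic data in place of digit rasters.
set.seed(1)
d <- 256; k <- 10                        # raster vector dimension, basis size

# Fake training data: two "classes", each clustered around its own center.
make_class <- function(center, n = 50) {
  t(replicate(n, center + rnorm(d, sd = 0.3)))   # rows are raster vectors
}
centers <- list(a = rnorm(d), b = rnorm(d))
train   <- lapply(centers, make_class)

# Step 1.4: thin SVD per class; keep the top-k right singular vectors.
bases <- lapply(train, function(m) svd(m, nu = 0, nv = k)$v)

# Steps 2.2-2.3: residual of a raster vector w.r.t. each basis; pick the minimum.
classify <- function(r) {
  res <- sapply(bases, function(V) sqrt(sum((r - V %*% crossprod(V, r))^2)))
  names(which.min(res))
}

classify(centers$a)  # classifies as "a"
```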
Neural network construction in general
========================================================
- See this diagram.
- Steps:
- Prepare the data.
- Chain layers.
- Pick an optimizer.
- Train and evaluate.
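The four steps above can be sketched with the R keras API (a hedged outline, not evaluated here; it assumes the keras package and a TensorFlow backend are installed):

```{r, eval=FALSE}
library(keras)

# Prepare the data: flatten the 28 x 28 images and one-hot encode the labels.
mnist   <- dataset_mnist()
x_train <- array_reshape(mnist$train$x, c(nrow(mnist$train$x), 784)) / 255
y_train <- to_categorical(mnist$train$y, 10)

# Chain layers.
model <- keras_model_sequential() %>%
  layer_dense(units = 256, activation = 'relu', input_shape = c(784)) %>%
  layer_dense(units = 10, activation = 'softmax')

# Pick an optimizer (together with a loss and metrics).
model %>% compile(
  optimizer = "rmsprop",
  loss = "categorical_crossentropy",
  metrics = c("accuracy")
)

# Train and evaluate.
model %>% fit(x_train, y_train,
              epochs = 5, batch_size = 128, validation_split = 0.2)
```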
Neural network layers primer
========================================================
- Is this something the audience wants to see/hear?
- A separate presentation, or referenced along the way in the code runs?
- A sub-presentation done in Mathematica (~15 min.)
- See the functionality breakdowns:
- RStudio: [Keras reference](https://keras.rstudio.com/reference/index.html);
- Mathematica: ["Neural Networks guide"](http://reference.wolfram.com/language/guide/NeuralNetworks.html).
The code runs 1
========================================================
- First run with a basic, non-trivial example (over MNIST).
- The breakdown:
- binary classification;
- multi-label classification;
- regression.
The code runs 2
========================================================
- The specific topics:
- encoders and decoders;
- dealing with over-fitting;
- categorical classification;
- vector classification.
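As a tiny illustration of what an encoder does before any of those runs, here is one-hot (categorical) encoding written out in base R (a sketch only; the keras package has its own encoder functions for this):

```{r}
# One-hot encode integer class labels (0-based, as in MNIST) into a matrix.
one_hot <- function(labels, num_classes) {
  m <- matrix(0, nrow = length(labels), ncol = num_classes)
  m[cbind(seq_along(labels), labels + 1)] <- 1   # one 1 per row
  m
}

one_hot(c(0, 2, 1), 3)  # 3 x 3 matrix with a single 1 in each row
```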
Some questions to consider in more detail 1
========================================================
- Can we change the metrics function?
- Can we do out-of-core training?
- [Or, how do we do batch training?](https://mathematica.stackexchange.com/a/174150/34008)
- How do we deal with over-fitting?
- Can we visualize the layers?
- Are there repositories we can use to download already made nets?
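On the last question, one hedged answer sketch: the R keras package ships wrappers for several pre-trained ImageNet models (not evaluated here; the weights are downloaded on first use):

```{r, eval=FALSE}
library(keras)

# Load a pre-trained VGG16 without its classification head,
# ready to be reused as a feature extractor.
base <- application_vgg16(weights = "imagenet", include_top = FALSE)
summary(base)
```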
Some questions to consider in more detail 2
========================================================
- How easy is it to add a custom classifier to an already made, pre-trained net?
- Where can we find explanations and/or guidelines for which type of layer to use under what conditions?
- How is the data “uplifted” into the space of a net?
  - Encoders.
  - And, of course, what are the decoders?
Some guidelines 1
========================================================
- Most likely we will not be making neural networks from scratch.
- Two important skills to acquire first:
  - Knowing how to utilize different encoders (over different data).
  - Knowing the basic neural networks and how to obtain them.
    - Copy & paste, or from dedicated repositories.
- "Next wave" skills
- Knowing how to do batch training and out-of-core training.
- Knowing how to deal with over-fitting.
- Knowing how to do network surgery.
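For the over-fitting item, a hedged sketch of two common remedies in the R keras API: a dropout layer and early stopping (not evaluated here; `x_train` and `y_train` are placeholder data):

```{r, eval=FALSE}
library(keras)

model <- keras_model_sequential() %>%
  layer_dense(units = 128, activation = 'relu', input_shape = c(784)) %>%
  layer_dropout(rate = 0.4) %>%                  # dropout regularization
  layer_dense(units = 10, activation = 'softmax')

model %>% compile(optimizer = "rmsprop",
                  loss = "categorical_crossentropy",
                  metrics = c("accuracy"))

# Stop training once the validation loss stops improving.
model %>% fit(x_train, y_train,
              validation_split = 0.2, epochs = 50,
              callbacks = list(callback_early_stopping(monitor = "val_loss",
                                                       patience = 3)))
```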
Some guidelines 2
========================================================
- Given a problem:
  - Is it simple to apply neural networks to it?
  - Do we have enough data, of sufficient quality, to apply neural networks?
  - What results do we get with alternative methods, like random forests, nearest neighbors, etc.?
Future plans
========================================================
- Conversational agent for building neural networks.