#' Just your regular densely-connected NN layer.
#'
#' @description
#' `Dense` implements the operation:
#' `output = activation(dot(input, kernel) + bias)`
#' where `activation` is the element-wise activation function
#' passed as the `activation` argument, `kernel` is a weights matrix
#' created by the layer, and `bias` is a bias vector created by the layer
#' (only applicable if `use_bias` is `TRUE`).
#'
#' # Note
#' If the input to the layer has a rank greater than 2, `Dense`
#' computes the dot product between the `inputs` and the `kernel` along the
#' last axis of the `inputs` and axis 0 of the `kernel` (using `tf.tensordot`).
#' For example, if input has dimensions `(batch_size, d0, d1)`, then we create
#' a `kernel` with shape `(d1, units)`, and the `kernel` operates along axis 2
#' of the `input`, on every sub-tensor of shape `(1, 1, d1)` (there are
#' `batch_size * d0` such sub-tensors). The output in this case will have
#' shape `(batch_size, d0, units)`.
#'
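#' A minimal sketch of this higher-rank case (assuming a configured
#' backend; shapes shown as comments):
#'
#' ```{r}
#' x <- random_normal(c(8, 10, 16))  # (batch_size, d0, d1)
#' y <- x |> layer_dense(units = 4)
#' op_shape(y)                       # (8, 10, 4)
#' ```
#'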
#' # Input Shape
#' N-D tensor with shape: `(batch_size, ..., input_dim)`.
#' The most common situation would be
#' a 2D input with shape `(batch_size, input_dim)`.
#'
#' # Output Shape
#' N-D tensor with shape: `(batch_size, ..., units)`.
#' For instance, for a 2D input with shape `(batch_size, input_dim)`,
#' the output would have shape `(batch_size, units)`.
#'
#' # Methods
#' - ```r
#' enable_lora(
#' rank,
#' a_initializer = 'he_uniform',
#' b_initializer = 'zeros'
#' )
#' ```
#'
#' - ```r
#' quantize(mode, type_check = TRUE)
#' ```
#'
#' # Readonly properties:
#'
#' - `kernel`
#'
#' @param units
#' Positive integer, dimensionality of the output space.
#'
#' @param activation
#' Activation function to use.
#' If you don't specify anything, no activation is applied
#' (i.e. "linear" activation: `a(x) = x`).
#'
#' @param use_bias
#' Boolean, whether the layer uses a bias vector.
#'
#' @param kernel_initializer
#' Initializer for the `kernel` weights matrix.
#'
#' @param bias_initializer
#' Initializer for the bias vector.
#'
#' @param kernel_regularizer
#' Regularizer function applied to
#' the `kernel` weights matrix.
#'
#' @param bias_regularizer
#' Regularizer function applied to the bias vector.
#'
#' @param activity_regularizer
#' Regularizer function applied to
#' the output of the layer (its "activation").
#'
#' @param kernel_constraint
#' Constraint function applied to
#' the `kernel` weights matrix.
#'
#' @param bias_constraint
#' Constraint function applied to the bias vector.
#'
#' @param lora_rank
#' Optional integer. If set, the layer's forward pass
#' will implement LoRA (Low-Rank Adaptation)
#' with the provided rank. LoRA sets the layer's kernel
#' to non-trainable and replaces it with a delta over the
#' original kernel, obtained via multiplying two lower-rank
#' trainable matrices. This can be useful to reduce the
#' computation cost of fine-tuning large dense layers.
#' You can also enable LoRA on an existing
#' `Dense` layer by calling `layer$enable_lora(rank)`.
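#'
#' A minimal sketch of enabling LoRA on an already-built layer (the
#' behavior described above; the added weight details are upstream
#' implementation specifics):
#'
#' ```{r}
#' layer <- layer_dense(units = 16)
#' layer$build(shape(NA, 32))  # build so the kernel exists
#' layer$enable_lora(2L)       # kernel is frozen; low-rank deltas become trainable
#' ```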
#'
#' @param object
#' Object to compose the layer with. A tensor, array, or sequential model.
#'
#' @param ...
#' For forward/backward compatibility.
#'
#' @returns The return value depends on the value provided for the first argument.
#' If `object` is:
#' - a `keras_model_sequential()`, then the layer is added to the sequential model
#' (which is modified in place). To enable piping, the sequential model is also
#' returned, invisibly.
#' - a `keras_input()`, then the output tensor from calling `layer(input)` is returned.
#' - `NULL` or missing, then a `Layer` instance is returned.
#' @export
#' @family core layers
#' @family layers
#' @seealso
#' + <https://keras.io/api/layers/core_layers/dense#dense-class>
# + <https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense>
#' @tether keras.layers.Dense
layer_dense <-
function (object, units, activation = NULL, use_bias = TRUE,
          kernel_initializer = "glorot_uniform", bias_initializer = "zeros",
          kernel_regularizer = NULL, bias_regularizer = NULL, activity_regularizer = NULL,
          kernel_constraint = NULL, bias_constraint = NULL, lora_rank = NULL,
          ...)
{
  args <- capture_args(list(units = as_integer, lora_rank = as_integer,
                            input_shape = normalize_shape, batch_size = as_integer,
                            batch_input_shape = normalize_shape), ignore = "object")
  create_layer(keras$layers$Dense, object, args)
}

#' A layer that uses `einsum` as the backing computation.
#'
#' @description
#' This layer can perform einsum calculations of arbitrary dimensionality.
#'
#' # Examples
#' **Biased dense layer with einsums**
#'
#' This example shows how to instantiate a standard Keras dense layer using
#' einsum operations. This example is equivalent to
#' `layer_dense(64, use_bias = TRUE)`.
#'
#' ```{r}
#' input <- layer_input(shape = c(32))
#' output <- input |>
#' layer_einsum_dense("ab,bc->ac",
#' output_shape = 64,
#' bias_axes = "c")
#' output # shape(NA, 64)
#' ```
#'
#' **Applying a dense layer to a sequence**
#'
#' This example shows how to instantiate a layer that applies the same dense
#' operation to every element in a sequence. Here, the `output_shape` has two
#' values (since there are two non-batch dimensions in the output); the first
#' dimension in the `output_shape` is `NA`, because the sequence dimension
#' `b` has an unknown shape.
#'
#' ```{r}
#' input <- layer_input(shape = c(32, 128))
#' output <- input |>
#' layer_einsum_dense("abc,cd->abd",
#' output_shape = c(NA, 64),
#' bias_axes = "d")
#' output # shape(NA, 32, 64)
#' ```
#'
#' **Applying a dense layer to a sequence using ellipses**
#'
#' This example shows how to instantiate a layer that applies the same dense
#' operation to every element in a sequence, but uses the ellipsis notation
#' instead of specifying the batch and sequence dimensions.
#'
#' Because we are using ellipsis notation and have specified only one axis, the
#' `output_shape` arg is a single value. When instantiated in this way, the
#' layer can handle any number of sequence dimensions, including the case
#' where no sequence dimension exists.
#'
#' ```{r}
#' input <- layer_input(shape = c(32, 128))
#' output <- input |>
#' layer_einsum_dense("...x,xy->...y",
#' output_shape = 64,
#' bias_axes = "y")
#'
#' output # shape(NA, 32, 64)
#' ```
#'
#' # Methods
#' - ```r
#' enable_lora(
#' rank,
#' a_initializer = 'he_uniform',
#' b_initializer = 'zeros'
#' )
#' ```
#'
#' - ```r
#' quantize(mode, type_check = TRUE)
#' ```
#'
#' # Readonly properties:
#'
#' - `kernel`
#'
#' @param equation
#' An equation describing the einsum to perform.
#' This equation must be a valid einsum string of the form
#' `ab,bc->ac`, `...ab,bc->...ac`, or
#' `ab...,bc->ac...` where 'ab', 'bc', and 'ac' can be any valid einsum
#' axis expression sequence.
#'
#' @param output_shape
#' The expected shape of the output tensor
#' (excluding the batch dimension and any dimensions
#' represented by ellipses). You can specify `NA` or `NULL` for any dimension
#' that is unknown or can be inferred from the input shape.
#'
#' @param activation
#' Activation function to use. If you don't specify anything,
#' no activation is applied
#' (that is, a "linear" activation: `a(x) = x`).
#'
#' @param bias_axes
#' A string containing the output dimension(s)
#' to apply a bias to. Each character in the `bias_axes` string
#' should correspond to a character in the output portion
#' of the `equation` string.
#'
#' @param kernel_initializer
#' Initializer for the `kernel` weights matrix.
#'
#' @param bias_initializer
#' Initializer for the bias vector.
#'
#' @param kernel_regularizer
#' Regularizer function applied to the `kernel` weights
#' matrix.
#'
#' @param bias_regularizer
#' Regularizer function applied to the bias vector.
#'
#' @param kernel_constraint
#' Constraint function applied to the `kernel` weights
#' matrix.
#'
#' @param bias_constraint
#' Constraint function applied to the bias vector.
#'
#' @param lora_rank
#' Optional integer. If set, the layer's forward pass
#' will implement LoRA (Low-Rank Adaptation)
#' with the provided rank. LoRA sets the layer's kernel
#' to non-trainable and replaces it with a delta over the
#' original kernel, obtained via multiplying two lower-rank
#' trainable matrices
#' (the factorization happens on the last dimension).
#' This can be useful to reduce the
#' computation cost of fine-tuning large dense layers.
#' You can also enable LoRA on an existing
#' `EinsumDense` layer by calling `layer$enable_lora(rank)`.
#'
#' @param ...
#' Base layer keyword arguments, such as `name` and `dtype`.
#'
#' @param object
#' Object to compose the layer with. A tensor, array, or sequential model.
#'
#' @inherit layer_dense return
#' @export
#' @family core layers
#' @family layers
# @seealso
# + <https://www.tensorflow.org/api_docs/python/tf/keras/layers/EinsumDense>
#'
#' @tether keras.layers.EinsumDense
layer_einsum_dense <-
function (object, equation, output_shape, activation = NULL,
          bias_axes = NULL, kernel_initializer = "glorot_uniform",
          bias_initializer = "zeros", kernel_regularizer = NULL, bias_regularizer = NULL,
          kernel_constraint = NULL, bias_constraint = NULL, lora_rank = NULL,
          ...)
{
  args <- capture_args(list(lora_rank = as_integer, input_shape = normalize_shape,
                            batch_size = as_integer, batch_input_shape = normalize_shape,
                            output_shape = normalize_shape), ignore = "object")
  create_layer(keras$layers$EinsumDense, object, args)
}

#' Turns positive integers (indexes) into dense vectors of fixed size.
#'
#' @description
#' e.g. `rbind(4L, 20L)` \eqn{\rightarrow}{->} `rbind(c(0.25, 0.1), c(0.6, -0.2))`
#'
#' This layer can only be used on positive integer inputs of a fixed range.
#'
#' # Example
#'
#' ```{r}
#' model <- keras_model_sequential() |>
#' layer_embedding(1000, 64)
#'
#' # The model will take as input an integer matrix of size (batch, input_length),
#' # and the largest integer (i.e. word index) in the input
#' # should be no larger than 999 (vocabulary size).
#' # Once the model is called on (32, 10) input below, model$output_shape
#' # is (NA, 10, 64), where `NA` is the batch dimension.
#'
#' input_array <- random_integer(shape = c(32, 10), minval = 0, maxval = 1000)
#' model |> compile('rmsprop', 'mse')
#' output_array <- model |> predict(input_array, verbose = 0)
#' dim(output_array) # (32, 10, 64)
#' ```
#'
#' # Input Shape
#' 2D tensor with shape: `(batch_size, input_length)`.
#'
#' # Output Shape
#' 3D tensor with shape: `(batch_size, input_length, output_dim)`.
#'
#' # Methods
#' - ```r
#' enable_lora(
#' rank,
#' a_initializer = 'he_uniform',
#' b_initializer = 'zeros'
#' )
#' ```
#'
#' - ```r
#' quantize(mode, type_check = TRUE)
#' ```
#'
#' - ```r
#' quantized_build(input_shape, mode)
#' ```
#'
#' - ```r
#' quantized_call(...)
#' ```
#'
#' # Readonly properties:
#'
#' - `embeddings`
#'
#' @param input_dim
#' Integer. Size of the vocabulary,
#' i.e. maximum integer index + 1.
#'
#' @param output_dim
#' Integer. Dimension of the dense embedding.
#'
#' @param embeddings_initializer
#' Initializer for the `embeddings`
#' matrix (see `keras3::initializer_*`).
#'
#' @param embeddings_regularizer
#' Regularizer function applied to
#' the `embeddings` matrix (see `keras3::regularizer_*`).
#'
#' @param embeddings_constraint
#' Constraint function applied to
#' the `embeddings` matrix (see `keras3::constraint_*`).
#'
#' @param mask_zero
#' Boolean, whether or not the input value 0 is a special
#' "padding" value that should be masked out.
#' This is useful when using recurrent layers which
#' may take variable length input. If this is `TRUE`,
#' then all subsequent layers in the model need
#' to support masking or an exception will be raised.
#' If `mask_zero` is set to `TRUE`, as a consequence,
#' index 0 cannot be used in the vocabulary (`input_dim` should
#' equal size of vocabulary + 1).
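#'
#' A small sketch of the resulting mask (calling `compute_mask()`
#' directly here is purely illustrative):
#'
#' ```{r}
#' embed <- layer_embedding(input_dim = 1001, output_dim = 8, mask_zero = TRUE)
#' x <- op_array(rbind(c(5L, 9L, 0L, 0L)), "int32")
#' embed$compute_mask(x)  # FALSE at the trailing padded positions
#' ```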
#'
#' @param weights
#' Optional floating-point matrix of size
#' `(input_dim, output_dim)`. The initial embeddings values
#' to use.
#'
#' @param lora_rank
#' Optional integer. If set, the layer's forward pass
#' will implement LoRA (Low-Rank Adaptation)
#' with the provided rank. LoRA sets the layer's embeddings
#' matrix to non-trainable and replaces it with a delta over the
#' original matrix, obtained via multiplying two lower-rank
#' trainable matrices. This can be useful to reduce the
#' computation cost of fine-tuning large embedding layers.
#' You can also enable LoRA on an existing
#' `Embedding` layer instance by calling `layer$enable_lora(rank)`.
#'
#' @param object
#' Object to compose the layer with. A tensor, array, or sequential model.
#'
#' @param ...
#' For forward/backward compatibility.
#'
#' @inherit layer_dense return
#' @export
#' @family core layers
#' @family layers
#' @seealso
#' + <https://keras.io/api/layers/core_layers/embedding#embedding-class>
# + <https://www.tensorflow.org/api_docs/python/tf/keras/layers/Embedding>
#' @tether keras.layers.Embedding
layer_embedding <-
function (object, input_dim, output_dim, embeddings_initializer = "uniform",
          embeddings_regularizer = NULL, embeddings_constraint = NULL,
          mask_zero = FALSE, weights = NULL, lora_rank = NULL, ...)
{
  args <- capture_args(list(input_dim = as_integer, output_dim = as_integer,
                            input_shape = normalize_shape, batch_size = as_integer,
                            batch_input_shape = normalize_shape, input_length = as_integer),
                       ignore = "object")
  create_layer(keras$layers$Embedding, object, args)
}

#' Identity layer.
#'
#' @description
#' This layer should be used as a placeholder when no operation is to be
#' performed. The layer just returns its `inputs` argument as output.
#'
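#' # Example
#' A minimal sketch (assuming a configured backend):
#'
#' ```{r}
#' x <- op_ones(c(2, 3))
#' x |> layer_identity() |> op_shape()  # unchanged: (2, 3)
#' ```
#'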
#' @param object
#' Object to compose the layer with. A tensor, array, or sequential model.
#'
#' @param ...
#' For forward/backward compatibility.
#'
#' @inherit layer_dense return
#' @export
#' @family core layers
#' @family layers
# @seealso
# + <https://www.tensorflow.org/api_docs/python/tf/keras/layers/Identity>
#' @tether keras.layers.Identity
layer_identity <-
function (object, ...)
{
  args <- capture_args(list(input_shape = normalize_shape,
                            batch_size = as_integer, batch_input_shape = normalize_shape),
                       ignore = "object")
  create_layer(keras$layers$Identity, object, args)
}

#' Wraps arbitrary expressions as a `Layer` object.
#'
#' @description
#' The `layer_lambda()` layer exists so that arbitrary expressions can be used
#' as a `Layer` when constructing Sequential
#' and Functional API models. `Lambda` layers are best suited for simple
#' operations or quick experimentation. For more advanced use cases,
#' prefer writing new subclasses of `Layer` using [`new_layer_class()`].
#'
#'
#' # Examples
#' ```{r}
#' # add a x -> x^2 layer
#' model <- keras_model_sequential()
#' model |> layer_lambda(\(x) x^2)
#' ```
#'
#' @param f
#' The function to be evaluated. Takes input tensor as first
#' argument.
#'
#' @param output_shape
#' Expected output shape from function. This argument
#' can usually be inferred if not explicitly provided.
#' Can be a list or a function. If a list, it specifies only the
#' shape from the first non-batch dimension onward; the sample (batch)
#' dimension is assumed to be either the same as the input's:
#' `output_shape = c(input_shape[1], output_shape)`, or, when the
#' input's batch dimension is `NULL`, also `NULL`:
#' `output_shape = c(NA, output_shape)`.
#' If a function, it specifies the
#' entire shape as a function of the input shape:
#' `output_shape = f(input_shape)`.
#'
#' @param mask
#' Either `NULL` (indicating no masking) or a callable with the same
#' signature as the `compute_mask` layer method, or a tensor
#' that will be returned as output mask regardless
#' of what the input is.
#'
#' @param arguments
#' Optional named list of arguments to be passed to the
#' function.
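#'
#' For instance (a sketch; `scale` is an illustrative argument name):
#'
#' ```{r}
#' scale_by <- layer_lambda(f = \(x, scale) x * scale,
#'                          arguments = list(scale = 2))
#' ```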
#'
#' @param object
#' Object to compose the layer with. A tensor, array, or sequential model.
#'
#' @param ...
#' For forward/backward compatibility.
#'
#' @inherit layer_dense return
#' @export
#' @family core layers
#' @family layers
#' @seealso
#' + <https://keras.io/api/layers/core_layers/lambda#lambda-class>
# + <https://www.tensorflow.org/api_docs/python/tf/keras/layers/Lambda>
#'
#' @tether keras.layers.Lambda
layer_lambda <-
function (object, f, output_shape = NULL, mask = NULL, arguments = NULL,
          ...)
{
  args <- capture_args(list(input_shape = normalize_shape,
                            batch_size = as_integer, batch_input_shape = normalize_shape,
                            output_shape = normalize_shape), ignore = "object")
  names(args)[match("f", names(args))] <- "function"
  create_layer(keras$layers$Lambda, object, args)
}

#' Masks a sequence by using a mask value to skip timesteps.
#'
#' @description
#' For each timestep in the input tensor (the second dimension in the tensor),
#' if all values in the input tensor at that timestep
#' are equal to `mask_value`, then the timestep will be masked (skipped)
#' in all downstream layers (as long as they support masking).
#'
#' If any downstream layer does not support masking yet receives such
#' an input mask, an exception will be raised.
#'
#' # Examples
#' Consider an array `x` of shape `c(samples, timesteps, features)`,
#' to be fed to an LSTM layer. You want to mask timesteps #3 and #5 because
#' you lack data for them. You can:
#'
#' - Set `x[, 3, ] <- 0.` and `x[, 5, ] <- 0.`
#' - Insert a `layer_masking()` layer with `mask_value = 0.` before the LSTM layer:
#'
#' ```{r}
#' c(samples, timesteps, features) %<-% c(32, 10, 8)
#' inputs <- c(samples, timesteps, features) %>% { array(runif(prod(.)), dim = .) }
#' inputs[, 3, ] <- 0
#' inputs[, 5, ] <- 0
#'
#' model <- keras_model_sequential() %>%
#' layer_masking(mask_value = 0) %>%
#' layer_lstm(32)
#'
#' output <- model(inputs)
#' # Timesteps 3 and 5 will be skipped in the LSTM calculation.
#' ```
#'
#' # Note
#' In the Keras masking convention, a masked timestep is denoted by
#' a mask value of `FALSE`, while a non-masked (i.e. usable) timestep
#' is denoted by a mask value of `TRUE`.
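#'
#' A sketch of inspecting that mask (calling `compute_mask()` directly
#' here is purely illustrative):
#'
#' ```{r}
#' masking <- layer_masking(mask_value = 0)
#' x <- array(c(0, 1, 0, 2), dim = c(1, 2, 2))  # timestep 1 is all zeros
#' masking$compute_mask(x)  # c(FALSE, TRUE): first timestep masked
#' ```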
#'
#' @param object
#' Object to compose the layer with. A tensor, array, or sequential model.
#'
#' @param ...
#' For forward/backward compatibility.
#'
#' @param mask_value
#' The value to compare inputs against; a timestep is masked when all of
#' its features equal `mask_value` (see description).
#'
#' @inherit layer_dense return
#' @export
#' @family core layers
#' @family layers
#' @seealso
#' + <https://keras.io/api/layers/core_layers/masking#masking-class>
# + <https://www.tensorflow.org/api_docs/python/tf/keras/layers/Masking>
#' @tether keras.layers.Masking
layer_masking <-
function (object, mask_value = 0, ...)
{
  args <- capture_args(list(input_shape = normalize_shape,
                            batch_size = as_integer, batch_input_shape = normalize_shape),
                       ignore = "object")
  create_layer(keras$layers$Masking, object, args)
}