
Commit

Merge pull request #1840 from d2l-ai/master
Release v0.17.0
astonzhang committed Jul 25, 2021
2 parents b9483ae + 7684ce7 commit 5462733
Showing 69 changed files with 18,898 additions and 673 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -34,6 +34,8 @@ Our goal is to offer a resource that could

## Cool Papers Using D2L

1. [**Descending through a Crowded Valley--Benchmarking Deep Learning Optimizers**](https://arxiv.org/pdf/2007.01547.pdf). R. Schmidt, F. Schneider, P. Hennig. *International Conference on Machine Learning, 2021*

1. [**Universal Average-Case Optimality of Polyak Momentum**](https://arxiv.org/pdf/2002.04664.pdf). D. Scieur, F. Pedregosa. *International Conference on Machine Learning, 2020*

1. [**2D Digital Image Correlation and Region-Based Convolutional Neural Network in Monitoring and Evaluation of Surface Cracks in Concrete Structural Elements**](https://www.mdpi.com/1996-1944/13/16/3527/pdf). M. Słoński, M. Tekieli. *Materials, 2020*
@@ -42,11 +44,9 @@ Our goal is to offer a resource that could

1. [**Detecting Human Driver Inattentive and Aggressive Driving Behavior Using Deep Learning: Recent Advances, Requirements and Open Challenges**](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9107077). M. Alkinani, W. Khan, Q. Arshad. *IEEE Access, 2020*

1. [**Diagnosing Parkinson by Using Deep Autoencoder Neural Network**](https://link.springer.com/chapter/10.1007/978-981-15-6325-6_5). U. Kose, O. Deperlioglu, J. Alzubi, B. Patrut. *Deep Learning for Medical Decision Support Systems, 2020*

<details><summary>more</summary>

1. [**Descending through a Crowded Valley--Benchmarking Deep Learning Optimizers**](https://arxiv.org/pdf/2007.01547.pdf). R. Schmidt, F. Schneider, P. Hennig.
1. [**Diagnosing Parkinson by Using Deep Autoencoder Neural Network**](https://link.springer.com/chapter/10.1007/978-981-15-6325-6_5). U. Kose, O. Deperlioglu, J. Alzubi, B. Patrut. *Deep Learning for Medical Decision Support Systems, 2020*

1. [**Deep Learning Architectures for Medical Diagnosis**](https://link.springer.com/chapter/10.1007/978-981-15-6325-6_2). U. Kose, O. Deperlioglu, J. Alzubi, B. Patrut. *Deep Learning for Medical Decision Support Systems, 2020*

1 change: 1 addition & 0 deletions chapter_attention-mechanisms/bahdanau-attention.md
@@ -431,6 +431,7 @@ d2l.show_heatmaps(attention_weights[:, :, :, :len(engs[-1].split()) + 1],
* When predicting a token, if not all the input tokens are relevant, the RNN encoder-decoder with Bahdanau attention selectively aggregates different parts of the input sequence. This is achieved by treating the context variable as an output of additive attention pooling.
* In the RNN encoder-decoder, Bahdanau attention treats the decoder hidden state at the previous time step as the query, and the encoder hidden states at all the time steps as both the keys and values.
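
To make these query/key/value roles concrete, here is a minimal standalone sketch of additive attention pooling in PyTorch (one of the frameworks this notebook targets). The names and shapes (`num_hiddens`, `enc_outputs`, `dec_state`) are illustrative assumptions, not the book's `d2l` implementation.

```python
import torch
from torch import nn

# Additive attention sketch: the previous decoder hidden state is the query;
# the encoder hidden states at all time steps act as both keys and values.
num_hiddens, batch_size, num_steps = 8, 2, 5
W_q = nn.Linear(num_hiddens, num_hiddens, bias=False)
W_k = nn.Linear(num_hiddens, num_hiddens, bias=False)
w_v = nn.Linear(num_hiddens, 1, bias=False)

enc_outputs = torch.randn(batch_size, num_steps, num_hiddens)  # keys = values
dec_state = torch.randn(batch_size, 1, num_hiddens)            # query

# Additive scoring w_v^T tanh(W_q q + W_k k): one score per encoder time step
scores = w_v(torch.tanh(W_q(dec_state) + W_k(enc_outputs))).squeeze(-1)
attention_weights = torch.softmax(scores, dim=-1)              # (batch, steps)
# Weighted average of the values gives the context variable for this step
context = torch.bmm(attention_weights.unsqueeze(1), enc_outputs)
print(context.shape)  # torch.Size([2, 1, 8])
```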


## Exercises

1. Replace GRU with LSTM in the experiment.
2 changes: 1 addition & 1 deletion chapter_computer-vision/transposed-conv.md
@@ -308,7 +308,7 @@ Therefore,
the transposed convolutional layer
can just exchange the forward propagation function
and the backpropagation function of the convolutional layer:
its forward propagation
and backpropagation functions
multiply their input vector with
$\mathbf{W}^\top$ and $\mathbf{W}$, respectively.
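
As a small numerical sanity check of this claim, the following standalone PyTorch sketch (my own illustration, not the chapter's code cell) builds the matrix $\mathbf{W}$ that implements a $2\times 2$ cross-correlation on a $3\times 3$ input and verifies that a transposed convolution with the same kernel multiplies its flattened input by $\mathbf{W}^\top$.

```python
import torch
from torch import nn

# Build W so that (W @ x.flatten()) equals the 2x2 cross-correlation output
# of a 3x3 input x with kernel K.
K = torch.tensor([[0.0, 1.0], [2.0, 3.0]])
W = torch.zeros(4, 9)
for i in range(2):
    for j in range(2):
        window = torch.zeros(3, 3)
        window[i:i + 2, j:j + 2] = K
        W[i * 2 + j] = window.reshape(-1)

# Forward pass of a transposed convolution with the same kernel: W^T times
# the flattened 2x2 input, reshaped to 3x3.
Y = torch.arange(4.0).reshape(1, 1, 2, 2)
tconv = nn.ConvTranspose2d(1, 1, kernel_size=2, bias=False)
tconv.weight.data = K.reshape(1, 1, 2, 2)
print(torch.allclose(tconv(Y).reshape(-1), W.T @ Y.reshape(-1)))  # True
```
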
1 change: 1 addition & 0 deletions chapter_convolutional-neural-networks/channels.md
@@ -102,6 +102,7 @@ corr2d_multi_in(X, K)
```

## Multiple Output Channels
:label:`subsec_multi-output-channels`

Regardless of the number of input channels,
so far we always ended up with one output channel.
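
As a preview of how multiple output channels are usually computed, here is a short sketch that reuses the `corr2d_multi_in` function from the cell above: each output channel gets its own multi-input kernel, and the per-channel results are stacked. The function name `corr2d_multi_in_out` and the use of `torch.stack` are my assumptions for illustration; the section's own code is elided in this diff and may differ.

```python
import torch

def corr2d_multi_in_out(X, K):
    # K has shape (output channels, input channels, kernel h, kernel w).
    # Compute a multi-input cross-correlation per output channel and stack
    # the results along a new 0th dimension.
    # Assumes corr2d_multi_in from the preceding cell is in scope.
    return torch.stack([corr2d_multi_in(X, k) for k in K], 0)

# Illustrative usage with the X and K defined in the preceding cells:
# K3 = torch.stack((K, K + 1, K + 2), 0)   # shape (3, 2, 2, 2)
# corr2d_multi_in_out(X, K3).shape         # -> torch.Size([3, 2, 2])
```
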
18 changes: 17 additions & 1 deletion chapter_deep-learning-computation/model-construction.md
@@ -206,13 +206,29 @@ Before we implement our own custom block,
we briefly summarize the basic functionality
that each block must provide:

:begin_tab:`mxnet, tensorflow`

1. Ingest input data as arguments to its forward propagation function.
1. Generate an output by having the forward propagation function return a value. Note that the output may have a different shape from the input. For example, the first fully-connected layer in our model above ingests an input of arbitrary dimension but returns an output of dimension 256.
1. Calculate the gradient of its output with respect to its input, which can be accessed via its backpropagation function. Typically this happens automatically.
1. Store and provide access to those parameters necessary
to execute the forward propagation computation.
1. Initialize model parameters as needed.

:end_tab:

:begin_tab:`pytorch`

1. Ingest input data as arguments to its forward propagation function.
1. Generate an output by having the forward propagation function return a value. Note that the output may have a different shape from the input. For example, the first fully-connected layer in our model above ingests an input of arbitrary dimension but returns an output of dimension 256.
1. Generate an output by having the forward propagation function return a value. Note that the output may have a different shape from the input. For example, the first fully-connected layer in our model above ingests an input of dimension 20 but returns an output of dimension 256.
1. Calculate the gradient of its output with respect to its input, which can be accessed via its backpropagation function. Typically this happens automatically.
1. Store and provide access to those parameters necessary
to execute the forward propagation computation.
1. Initialize model parameters as needed.

:end_tab:
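
The list above can be sketched in a few lines of PyTorch (my own minimal illustration; the section's full snippet is elided further down in this diff): parameters are created and initialized in the constructor, the forward propagation function ingests a 20-dimensional input and returns a 10-dimensional output via a 256-unit hidden layer, and gradients are provided automatically by autograd.

```python
import torch
from torch import nn
from torch.nn import functional as F

class ToyMLP(nn.Module):
    def __init__(self):
        super().__init__()
        # Parameters are stored by the block and initialized here.
        self.hidden = nn.Linear(20, 256)
        self.out = nn.Linear(256, 10)

    def forward(self, X):
        # Ingest the input and return the output; backpropagation through
        # this computation is handled automatically by autograd.
        return self.out(F.relu(self.hidden(X)))

X = torch.rand(2, 20)
print(ToyMLP()(X).shape)  # torch.Size([2, 10])
```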


In the following snippet,
we code up a block from scratch
corresponding to an MLP
28 changes: 16 additions & 12 deletions chapter_natural-language-processing-applications/index.md
@@ -1,30 +1,34 @@
# Natural Language Processing: Applications
:label:`chap_nlp_app`

We have seen how to represent text tokens and train their representations in :numref:`chap_nlp_pretrain`.
We have seen how to represent tokens in text sequences and train their representations in :numref:`chap_nlp_pretrain`.
Such pretrained text representations can be fed to various models for different downstream natural language processing tasks.

This book does not intend to cover natural language processing applications in a comprehensive manner.
Our focus is on *how to apply (deep) representation learning of languages to addressing natural language processing problems*.
Nonetheless, we have already discussed several natural language processing applications without pretraining in earlier chapters,
In fact,
earlier chapters have already discussed some natural language processing applications
*without pretraining*,
just for explaining deep learning architectures.
For instance, in :numref:`chap_rnn`,
we have relied on RNNs to design language models to generate novella-like text.
In :numref:`chap_modern_rnn` and :numref:`chap_attention`,
we have also designed models based on RNNs and attention mechanisms
for machine translation.
we have also designed models based on RNNs and attention mechanisms for machine translation.

However, this book does not intend to cover all such applications in a comprehensive manner.
Instead,
our focus is on *how to apply (deep) representation learning of languages to addressing natural language processing problems*.
Given pretrained text representations,
in this chapter, we will consider two more downstream natural language processing tasks:
sentiment analysis and natural language inference.
These are popular and representative natural language processing applications:
the former analyzes single text and the latter analyzes relationships of text pairs.
this chapter will explore two
popular and representative
downstream natural language processing tasks:
sentiment analysis and natural language inference,
which analyze single text and relationships of text pairs, respectively.

![Pretrained text representations can be fed to various deep learning architectures for different downstream natural language processing applications. This chapter focuses on how to design models for different downstream natural language processing applications.](../img/nlp-map-app.svg)
:label:`fig_nlp-map-app`

As depicted in :numref:`fig_nlp-map-app`,
this chapter focuses on describing the basic ideas of designing natural language processing models using different types of deep learning architectures, such as MLPs, CNNs, RNNs, and attention.
Though it is possible to combine any pretrained text representations with any architecture for either downstream natural language processing task in :numref:`fig_nlp-map-app`,
Though it is possible to combine any pretrained text representations with any architecture for either application in :numref:`fig_nlp-map-app`,
we select a few representative combinations.
Specifically, we will explore popular architectures based on RNNs and CNNs for sentiment analysis.
For natural language inference, we choose attention and MLPs to demonstrate how to analyze text pairs.
@@ -33,7 +37,7 @@ for a wide range of natural language processing applications,
such as on a sequence level (single text classification and text pair classification)
and a token level (text tagging and question answering).
As a concrete empirical case,
we will fine-tune BERT for natural language processing.
we will fine-tune BERT for natural language inference.

As we have introduced in :numref:`sec_bert`,
BERT requires minimal architecture changes
@@ -48,7 +48,7 @@ To study this problem, we will begin by investigating a popular natural language

## The Stanford Natural Language Inference (SNLI) Dataset

Stanford Natural Language Inference (SNLI) Corpus is a collection of over $500,000$ labeled English sentence pairs :cite:`Bowman.Angeli.Potts.ea.2015`.
Stanford Natural Language Inference (SNLI) Corpus is a collection of over 500,000 labeled English sentence pairs :cite:`Bowman.Angeli.Potts.ea.2015`.
We download and store the extracted SNLI dataset in the path `../data/snli_1.0`.

```{.python .input}
@@ -110,7 +110,7 @@ def read_snli(data_dir, is_train):
return premises, hypotheses, labels
```

Now let us print the first $3$ pairs of premise and hypothesis, as well as their labels ("0", "1", and "2" correspond to "entailment", "contradiction", and "neutral", respectively).
Now let us print the first 3 pairs of premise and hypothesis, as well as their labels ("0", "1", and "2" correspond to "entailment", "contradiction", and "neutral", respectively).

```{.python .input}
#@tab all
@@ -121,8 +121,8 @@ for x0, x1, y in zip(train_data[0][:3], train_data[1][:3], train_data[2][:3]):
print('label:', y)
```

The training set has about $550,000$ pairs,
and the testing set has about $10,000$ pairs.
The training set has about 550,000 pairs,
and the testing set has about 10,000 pairs.
The following shows that
the three labels "entailment", "contradiction", and "neutral" are balanced in
both the training set and the testing set.
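
The code cell that performs this check is elided in this diff; a minimal sketch of such a count might look like the following, reusing `read_snli` and assuming `data_dir` points at the extracted corpus (both names come from the cells above; the actual book cell may differ).

```python
from collections import Counter

# Count label frequencies in the training and test splits; read_snli and
# data_dir are assumed to be defined as in the preceding cells.
for split, is_train in (('train', True), ('test', False)):
    labels = read_snli(data_dir, is_train)[2]
    print(split, sorted(Counter(labels).items()))
```
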
@@ -246,7 +246,7 @@ def load_data_snli(batch_size, num_steps=50):
return train_iter, test_iter, train_set.vocab
```

Here we set the batch size to $128$ and sequence length to $50$,
Here we set the batch size to 128 and sequence length to 50,
and invoke the `load_data_snli` function to get the data iterators and vocabulary.
Then we print the vocabulary size.

@@ -258,7 +258,7 @@ len(vocab)

Now we print the shape of the first minibatch.
Contrary to sentiment analysis,
we have $2$ inputs `X[0]` and `X[1]` representing pairs of premises and hypotheses.
we have two inputs `X[0]` and `X[1]` representing pairs of premises and hypotheses.

```{.python .input}
#@tab all
