finish basic_text_classification translation #98

Merged
6 commits merged on Oct 24, 2018

TobiasLee
Contributor

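resolve: #96
I don't know how to change the Colab Notebook link in the .md file... @leviding, could you check it?
Also, I edited the .ipynb directly in Jupyter Notebook and I'm not sure whether that causes any problems. Let me know if anything is wrong. Thanks!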

@leviding
Member

leviding commented Oct 9, 2018

@TobiasLee This works.

@leviding
Member

leviding commented Oct 9, 2018

For proofreading, you can use the split view to diff the content, or proofread in Jupyter Notebook instead.

[image]

@rocheers

@leviding I'll take the proofreading.

@leviding
Member

@rocheers 👍

@rocheers left a comment

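Great translation! There are a few formatting issues, and some of the original English text is still in there. Please take a look.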

"\n",
"在这个任务中,我们将把电影评论分为**积极**和**消极**两种,即是一个**二分类**任务,这是一个非常重要并且已经被广泛应用的机器学习问题。\n",
"\n",
"我们将使用 [IMDB 数据集](https://www.tensorflow.org/api_docs/python/tf/keras/datasets/imdb),其中包括了 50000 条来自 [Internet Movie Database](https://www.imdb.com/) 的电影评论。这些评论被等分成两份分别用于训练和测试,并且,训练集和测试集的样本是**平衡**的,也就是说,积极和消极的评论数目相同。\n",

Add a comma between the two clauses: 「这些评论被等分成两份分别用于训练和测试」=>「这些评论被等分成两份,分别用于训练和测试」

"\n",
"我们将使用 [IMDB 数据集](https://www.tensorflow.org/api_docs/python/tf/keras/datasets/imdb),其中包括了 50000 条来自 [Internet Movie Database](https://www.imdb.com/) 的电影评论。这些评论被等分成两份分别用于训练和测试,并且,训练集和测试集的样本是**平衡**的,也就是说,积极和消极的评论数目相同。\n",
"\n",
"接下来的代码中,我们会使用一个用于创建和训练 TensorFlow 模型的高级 API —— [tf.keras](https://www.tensorflow.org/guide/keras)。如果你希望查看进阶版的文本分类教程,请查看 [MLCC Text Classification Guide](https://developers.google.com/machine-learning/guides/text-classification/)。"

Mention tf.keras explicitly: 「如果你希望查看进阶版的文本分类教程」=>「如果你希望查看 tf.keras 进阶版的文本分类教程」

"## 下载 IMDB 数据集\n",
"\n",
"\n",
"IMDB 数据集随 TensorFlow 附带,并且已经被预处理过:单词序列已经被转换成证书序列,并且每个整数对应字典中特定的一个单词。\n",

Typo, 证书 ("certificate") should be 整数 ("integer"): 「单词序列已经被转换成证书序列」=>「单词序列已经被转换成整数序列」

"source": [
"## 探索数据\n",
"\n",
"让我们先来看看数据的格式。数据集已经被预处理过了,其中:每个电影评论样本(一连串的单词)由一个整数数组代表,每个评论的标签是一个 0 或者 1 的整数,其中 0 代表消极的评论,1 代表积极的评论。"

State that each integer represents one word: 「其中:每个电影评论样本(一连串的单词)由一个整数数组代表,」=>「其中:每个电影评论样本(一连串的单词)由一个整数数组代表,其中每个整数表示一个单词。」
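
For readers unfamiliar with the encoding this sentence describes, here is a minimal sketch of how a review becomes an array of integers. The vocabulary below is a made-up toy example, not the real IMDB word index, and `encode`/`decode` are hypothetical helper names used only for illustration.

```python
# Hypothetical toy vocabulary (assumption): the real IMDB word index is far larger.
word_index = {"<UNK>": 0, "this": 1, "movie": 2, "was": 3, "great": 4}
reverse_index = {i: w for w, i in word_index.items()}


def encode(text):
    """Map each whitespace-separated word to its integer id (unknown words -> <UNK>)."""
    return [word_index.get(w, word_index["<UNK>"]) for w in text.lower().split()]


def decode(ids):
    """Map integer ids back to words."""
    return " ".join(reverse_index.get(i, "<UNK>") for i in ids)


review = encode("This movie was great")  # one sample: an array of integers
label = 1                                # 1 = positive, 0 = negative
```

Each element of `review` stands for exactly one word, which is the point the suggested wording makes explicit.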

"source": [
"## 准备数据\n",
"\n",
"The reviews—the arrays of integers—must be converted to tensors before fed into the neural network. This conversion can be done a couple of ways:\n",

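This English passage has already been translated, but the original text wasn't removed. Is there a particular reason for keeping it?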

"source": [
"### 隐藏单元\n",
"\n",
"The above model has two intermediate or \"hidden\" layers, between the input and output. The number of outputs (units, nodes, or neurons) is the dimension of the representational space for the layer. In other words, the amount of freedom the network is allowed when learning an internal representation.\n",

Same as the passage above: the Chinese translation was added, but the English wasn't removed.

"\n",
"The above model has two intermediate or \"hidden\" layers, between the input and output. The number of outputs (units, nodes, or neurons) is the dimension of the representational space for the layer. In other words, the amount of freedom the network is allowed when learning an internal representation.\n",
"\n",
"上面的模型在输入和输出之间有两层隐藏层。输出向量的维度(单位,节点或神经元)是网络层的表示空间的维度。 换句话说,是网络在学习内部表示时所具有的自由度。\n",

Keep the original's "intermediate or 'hidden' layers" wording: 「上面的模型在输入和输出之间有两层隐藏层」=>「上面的模型在输入和输出之间有两个中间层,或者叫“隐藏”层」

"上面的模型在输入和输出之间有两层隐藏层。输出向量的维度(单位,节点或神经元)是网络层的表示空间的维度。 换句话说,是网络在学习内部表示时所具有的自由度。\n",
"\n",
"\n",
"If a model has more hidden units (a higher-dimensional representation space), and/or more layers, then the network can learn more complex representations. However, it makes the network more computationally expensive and may lead to learning unwanted patterns—patterns that improve performance on training data but not on the test data. This is called *overfitting*, and we'll explore it later.\n",

The original English text wasn't removed.
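
Leftover English paragraphs like the ones flagged above could also be caught mechanically. The following is a rough sketch, not part of this PR: it assumes the notebook has already been loaded as a dict (for example with `json.load`), and the function names, the 0.9 ASCII threshold, and the 40-character minimum are all arbitrary assumptions. It is only a heuristic and will miss short lines or lines that mix English and Chinese.

```python
def ascii_ratio(text):
    """Fraction of non-whitespace characters that are ASCII."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(c.isascii() for c in chars) / len(chars)


def find_untranslated(notebook, threshold=0.9, min_length=40):
    """Return (cell_index, line) pairs for markdown lines that look like untranslated English."""
    hits = []
    for i, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "markdown":
            continue  # code cells are expected to be ASCII
        source = cell.get("source", [])
        if isinstance(source, str):
            source = source.splitlines()
        for line in source:
            stripped = line.strip()
            if len(stripped) >= min_length and ascii_ratio(stripped) >= threshold:
                hits.append((i, stripped))
    return hits
```

Running `find_untranslated` on the translated notebook would list candidate lines for a human to confirm; it does not replace proofreading.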

"source": [
"## 评估模型\n",
"\n",
"让我们看看模型最终表现的怎么样,我们将得到两个指标:loss(代表模型的错误,越低越好)以及准确率。"

Capitalize "Loss" and say that a lower value is better: 「loss(代表模型的错误,越低越好)以及准确率。」=>「Loss(代表模型的错误,值越低越好)以及准确率。」

@rocheers

@leviding @TobiasLee Proofreading is done.

@leviding
Member

@TobiasLee You can make the revisions now.

@leviding added the enhancement label Oct 20, 2018
Thanks for the careful proofreading!
@TobiasLee
Contributor Author

@leviding Revisions are done.

@leviding
Member

@TobiasLee Please check: the .ipynb file renders differently from the English original in the preview: https://github.com/xitu/tensorflow-docs/blob/v1.10/tutorials/keras/basic_text_classification.ipynb

Please find out what the problem is.

@leviding added the help wanted label and removed the enhancement label Oct 23, 2018
@TobiasLee
Contributor Author

@leviding Could you take another look? It seems one cell had the wrong type before.

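@leviding
Member

leviding commented Oct 23, 2018

@TobiasLee It's still different. Compare the two side by side and it should be obvious, so I don't need to attach screenshots, do I? The numbers in the leading In [ ]: prompts are not shown, and there is an extra code block at the end of the document.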

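@TobiasLee
Contributor Author

@leviding The prompt numbers are not shown because I cleared the execution history, so the notebook is in an unexecuted state. Most Jupyter Notebooks published online look like this, so I suggest leaving it as is. The empty line at the very end has been removed.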
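
For reference, the unexecuted state described here corresponds to code cells whose `execution_count` is null and whose `outputs` list is empty in the notebook JSON (nbformat 4). A minimal sketch of producing that state and dropping trailing empty cells with only the standard library follows; the helper names are hypothetical and this is not necessarily how the notebook in this PR was cleaned.

```python
import json


def clear_outputs(notebook):
    """Reset every code cell to the unexecuted state, shown as `In [ ]:`."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def strip_trailing_empty_cells(notebook):
    """Remove cells at the end of the notebook whose source is empty or whitespace."""
    cells = notebook.get("cells", [])
    while cells and not "".join(cells[-1].get("source", [])).strip():
        cells.pop()
    return notebook


def clean_file(path):
    """Apply both steps to an .ipynb file in place (hypothetical convenience wrapper)."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    strip_trailing_empty_cells(clear_outputs(notebook))
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps the translated Chinese text readable in diffs
        json.dump(notebook, f, ensure_ascii=False, indent=1)
        f.write("\n")
```

Keeping the file in this state also keeps diffs small, since execution counts and outputs change on every run.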

@leviding
Member

@TobiasLee OK, thanks for the hard work!

@leviding merged commit e223b9d into xitu:zh-hans Oct 24, 2018
@leviding
Member

Thanks for the hard work, everyone 👍

@leviding added the 翻译完成 (translation complete) label and removed the help wanted label Oct 24, 2018
Development

Successfully merging this pull request may close these issues.

install/gpu.md, basic_text_classification.ipynb and basic_text_classification.md
3 participants