
Paper-list-for-Deep-learning-in-NLP

This repository lists the reference papers that appear at the end of each chapter of Deep Learning in Natural Language Processing (Li Deng & Yang Liu, Editors), for ease of study.

Updates

2019-03-10: Compiled Chapter 1

Chapter 1: A Joint Introduction to Natural Language Processing and to Deep Learning

Abdel-Hamid, O., Mohamed, A., Jiang, H., Deng, L., Penn, G., & Yu, D. (2014). Convolutional neural networks for speech recognition. IEEE/ACM Trans. on Audio, Speech and Language Processing.

Amodei, D., Ng, A., et al. (2016). Deep speech 2: End-to-end speech recognition in English and Mandarin. In Proceedings of ICML.

Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In Proceedings of ICLR.

Baker, J., et al. (2009a). Research developments and directions in speech recognition and understanding. IEEE Signal Processing Magazine, 26(4).

Baker, J., et al. (2009b). Updated MINDS report on speech recognition and understanding. IEEE Signal Processing Magazine, 26(4).

Baum, L., & Petrie, T. (1966). Statistical inference for probabilistic functions of finite state Markov chains. The Annals of Mathematical Statistics.

Bengio, Y. (2009). Learning Deep Architectures for AI. Delft: NOW Publishers.

Bengio, Y., Ducharme, R., Vincent, P., & Jauvin, C. (2001). A neural probabilistic language model. In Proceedings of NIPS.

Bishop, C. (1995). Neural Networks for Pattern Recognition. Oxford: Oxford University Press.

Bishop, C. (2006). Pattern Recognition and Machine Learning. Berlin: Springer.

Bridle, J., et al. (1998). An investigation of segmental hidden dynamic models of speech coarticulation for automatic speech recognition. Final Report for 1998 Workshop on Language Engineering, Johns Hopkins University CLSP.

Brown, P. F., Della Pietra, S. A., Della Pietra, V. J., & Mercer, R. L. (1993). The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19.

Charniak, E. (2011). The brain as a statistical inference engine—and you can too. Computational Linguistics, 37.

Chiang, D. (2007). Hierarchical phrase-based translation. Computational Linguistics.

Chomsky, N. (1957). Syntactic Structures. The Hague: Mouton.

Chorowski, J., Bahdanau, D., Serdyuk, D., Cho, K., & Bengio, Y. (2015). Attention-based models for speech recognition. In Proceedings of NIPS.

Church, K. (2007). A pendulum swung too far. Linguistic Issues in Language Technology, 2(4).

Church, K. (2014). The case for empiricism (with and without statistics). In Proceedings of Frame Semantics in NLP.

Church, K., & Mercer, R. (1993). Introduction to the special issue on computational linguistics using large corpora. Computational Linguistics, 19(1).

Collins, M. (1997). Head-driven statistical models for natural language parsing. Ph.D. thesis, University of Pennsylvania, Philadelphia.

Collins, M. (2002). Discriminative training methods for hidden Markov models: Theory and experiments with perceptron algorithms. In Proceedings of EMNLP.

Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., & Kuksa, P. (2011). Natural language processing (almost) from scratch. Journal of Machine Learning Research, 12.

Dahl, G., Yu, D., & Deng, L. (2011). Large-vocabulary continuous speech recognition with context-dependent DBN-HMMs. In Proceedings of ICASSP.

Dahl, G., Yu, D., Deng, L., & Acero, A. (2012). Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Transactions on Audio, Speech, and Language Processing, 20.

Deng, L. (1998). A dynamic, feature-based approach to the interface between phonology and phonetics for speech modeling and recognition. Speech Communication, 24(4).

Deng, L. (2014). A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Transactions on Signal and Information Processing, 3.

Deng, L. (2016). Deep learning: From speech recognition to language and multimodal processing. APSIPA Transactions on Signal and Information Processing, 5.

Deng, L. (2017). Artificial intelligence in the rising wave of deep learning—The historical path and future outlook. IEEE Signal Processing Magazine, 35.

Deng, L., & O’Shaughnessy, D. (2003). Speech Processing: A Dynamic and Optimization-Oriented Approach. New York: Marcel Dekker.

Deng, L., & Yu, D. (2007). Use of differential cepstra as acoustic features in hidden trajectory modeling for phonetic recognition. In Proceedings of ICASSP.

Deng, L., & Yu, D. (2014). Deep Learning: Methods and Applications. Delft: NOW Publishers.

Deng, L., Hinton, G., & Kingsbury, B. (2013). New types of deep neural network learning for speech recognition and related applications: An overview. In Proceedings of ICASSP.

Deng, L., Seltzer, M., Yu, D., Acero, A., Mohamed, A., & Hinton, G. (2010). Binary coding of speech spectrograms using a deep autoencoder. In Proceedings of Interspeech.

Deng, L., Yu, D., & Platt, J. (2012). Scalable stacking and learning for building deep architectures. In Proceedings of ICASSP.

Devlin, J., et al. (2015). Language models for image captioning: The quirks and what works. In Proceedings of CVPR.

Dhingra, B., Li, L., Li, X., Gao, J., Chen, Y., Ahmed, F., & Deng, L. (2017). Towards end-to-end reinforcement learning of dialogue agents for information access. In Proceedings of ACL.

Fang, H., et al. (2015). From captions to visual concepts and back. In Proceedings of CVPR.

Fei-Fei, L., & Perona, P. (2005). A Bayesian hierarchical model for learning natural scene categories. In Proceedings of CVPR.

Yang, Z., He, X., Gao, J., Deng, L., & Smola, A. (2016). Stacked attention networks for image question answering. In Proceedings of CVPR.

Finn, C., Abbeel, P., & Levine, S. (2017). Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of ICML.

Gan, Z., et al. (2017). Semantic compositional networks for visual captioning. In Proceedings of CVPR.

Gasic, M., Mrkšić, N., Rojas-Barahona, L., Su, P., Ultes, S., Vandyke, D., Wen, T., & Young, S. (2017). Dialogue manager domain adaptation using Gaussian process reinforcement learning. Computer Speech and Language, 45.

Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. Cambridge: MIT Press.

Goodfellow, I., et al. (2014). Generative adversarial networks. In Proceedings of NIPS.

Graves, A., et al. (2016). Hybrid computing using a neural network with dynamic external memory. Nature, 538.

Hashimoto, K., Xiong, C., Tsuruoka, Y., & Socher, R. (2017). A joint many-task model: Growing a neural network for multiple NLP tasks. In Proceedings of EMNLP.

He, X., & Deng, L. (2012). Maximum expected BLEU training of phrase and lexicon translation models. In Proceedings of ACL.

He, X., & Deng, L. (2013). Speech-centric information processing: An optimization-oriented approach. Proceedings of the IEEE, 101.

He, X., Deng, L., & Chou, W. (2008). Discriminative learning in sequential pattern recognition. IEEE Signal Processing Magazine, 25(5).

He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of CVPR.

Hinton, G., & Salakhutdinov, R. (2012). A better way to pre-train deep Boltzmann machines. In Proceedings of NIPS.

Hinton, G., Deng, L., Yu, D., Dahl, G., Mohamed, A.-r., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Kingsbury, B., & Sainath, T. (2012). Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Processing Magazine, 29.

Hinton, G., Osindero, S., & Teh, Y. -W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18.

Hochreiter, S., et al. (2001). Learning to learn using gradient descent. In Proceedings of International Conference on Artificial Neural Networks.

Huang, P., et al. (2013b). Learning deep structured semantic models for web search using click-through data. In Proceedings of CIKM.

Huang, J.-T., Li, J., Yu, D., Deng, L., & Gong, Y. (2013a). Cross-lingual knowledge transfer using multilingual deep neural networks with shared hidden layers. In Proceedings of ICASSP.

Jackson, P. (1998). Introduction to Expert Systems. Boston: Addison-Wesley.

Jelinek, F. (1998). Statistical Models for Speech Recognition. Cambridge: MIT Press.

Juang, F. (2016). Deep neural networks: A developmental perspective. APSIPA Transactions on Signal and Information Processing, 5.

Kaiser, L., Nachum, O., Roy, A., & Bengio, S. (2017). Learning to remember rare events. In Proceedings of ICLR.

Karpathy, A., & Fei-Fei, L. (2015). Deep visual-semantic alignments for generating image descriptions. In Proceedings of CVPR.

Koh, P., & Liang, P. (2017). Understanding black-box predictions via influence functions. In Proceedings of ICML.

Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). Imagenet classification with deep convolutional neural networks. In Proceedings of NIPS.

Lafferty, J., McCallum, A., & Pereira, F. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of ICML.

LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521.

Lee, L., Attias, H., Deng, L., & Fieguth, P. (2004). A multimodal variational approach to learning and inference in switching state space models. In Proceedings of ICASSP.

Lee, M., et al. (2016). Reasoning in vector space: An exploratory study of question answering. In Proceedings of ICLR.

Lin, H., Deng, L., Droppo, J., Yu, D., & Acero, A. (2008). Learning methods in multilingual speech recognition. In NIPS Workshop.

Liu, Y., Chen, J., & Deng, L. (2017). An unsupervised learning method exploiting sequential output statistics. arXiv:1702.07817.

Ma, J., & Deng, L. (2004). Target-directed mixture dynamic models for spontaneous speech recognition. IEEE Transactions on Speech and Audio Processing, 12(4).

Maclaurin, D., Duvenaud, D., & Adams, R. (2015). Gradient-based hyperparameter optimization through reversible learning. In Proceedings of ICML.

Manning, C. (2016). Computational linguistics and deep learning. In Computational Linguistics.

Manning, C., & Schütze, H. (1999). Foundations of Statistical Natural Language Processing. Cambridge: MIT Press.

Manning, C., & Socher, R. (2017). Lectures 17 and 18: Issues and Possible Architectures for NLP; Tackling the Limits of Deep Learning for NLP. CS224N Course: NLP with Deep Learning.

Mesnil, G., He, X., Deng, L., & Bengio, Y. (2013). Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding. In Proceedings of Interspeech.

Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Proceedings of NIPS.

Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., & Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518.

Mohamed, A., Dahl, G., & Hinton, G. (2009). Acoustic modeling using deep belief networks. In NIPS Workshop on Speech Recognition.

Murphy, K. (2012). Machine Learning: A Probabilistic Perspective. Cambridge: MIT Press.

Nguyen, T., et al. (2017). MS MARCO: A human generated machine reading comprehension dataset. arXiv:1611.09268.

Nilsson, N. (1982). Principles of Artificial Intelligence. Berlin: Springer.

Och, F. (2003). Maximum error rate training in statistical machine translation. In Proceedings of ACL.

Och, F., & Ney, H. (2002). Discriminative training and maximum entropy models for statistical machine translation. In Proceedings of ACL.

Oh, J., Chockalingam, V., Singh, S., & Lee, H. (2016). Control of memory, active perception, and action in minecraft. In Proceedings of ICML.

Palangi, H., Smolensky, P., He, X., & Deng, L. (2017). Deep learning of grammatically-interpretable representations through question-answering. arXiv:1705.08432

Parloff, R. (2016). Why deep learning is suddenly changing your life. In Fortune Magazine.

Pereira, F. (2017). A (computational) linguistic farce in three acts. In http://www.earningmyturns.org.

Picone, J., et al. (1999). Initial evaluation of hidden dynamic models on conversational speech. In Proceedings of ICASSP.

Plamondon, R., & Srihari, S. (2000). Online and off-line handwriting recognition: A comprehensive survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22.

Rabiner, L., & Juang, B.-H. (1993). Fundamentals of Speech Recognition. USA: Prentice-Hall.

Ratnaparkhi, A. (1997). A simple introduction to maximum entropy models for natural language processing. Technical report, University of Pennsylvania.

Reddy, R. (1976). Speech recognition by machine: A review. Proceedings of the IEEE, 64(4).

Rumelhart, D., Hinton, G., & Williams, R. (1986). Learning representations by back-propagating errors. Nature, 323.

Russell, S., & Stefano, E. (2017). Label-free supervision of neural networks with physics and domain knowledge. In Proceedings of AAAI.

Saon, G., et al. (2017). English conversational telephone speech recognition by humans and machines. In Proceedings of ICASSP.

Schmidhuber, J. (1987). Evolutionary principles in self-referential learning. Diploma Thesis, Institute of Informatik, Technical University Munich.

Seneff, S., et al. (1991). Development and preliminary evaluation of the MIT ATIS system. In Proceedings of HLT.

Smolensky, P., et al. (2016). Reasoning with tensor product representations. arXiv:1601.02745.

Sutskever, I., Vinyals, O., & Le, Q. (2014). Sequence to sequence learning with neural networks. In Proceedings of NIPS.

Tur, G., & Deng, L. (2011). Intent Determination and Spoken Utterance Classification; Chapter 4 in book: Spoken Language Understanding. Hoboken: Wiley.

Turing, A. (1950). Computing machinery and intelligence. Mind, 59.

Vapnik, V. (1998). Statistical Learning Theory. Hoboken: Wiley.

Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., & Manzagol, P. -A. (2010). Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. The Journal of Machine Learning Research, 11.

Vinyals, O., et al. (2016). Matching networks for one shot learning. In Proceedings of NIPS.

Viola, P., & Jones, M. (2004). Robust real-time face detection. International Journal of Computer Vision, 57.

Wang, Y. -Y., Deng, L., & Acero, A. (2011). Semantic Frame Based Spoken Language Understanding; Chapter 3 in book: Spoken Language Understanding. Hoboken: Wiley.

Wichrowska, O., et al. (2017). Learned optimizers that scale and generalize. In Proceedings of ICML.

Winston, P. (1993). Artificial Intelligence. Boston: Addison-Wesley.

Xiong, W., et al. (2016). Achieving human parity in conversational speech recognition. In Proceedings of Interspeech.

Young, S., Gasic, M., Thomson, B., & Williams, J. (2013). POMDP-based statistical spoken dialogue systems: A review. Proceedings of the IEEE, 101.

Yu, D., & Deng, L. (2015). Automatic Speech Recognition: A Deep Learning Approach. Berlin: Springer.

Yu, D., Deng, L., & Dahl, G. (2010). Roles of pre-training and fine-tuning in context-dependent DBN-HMMs for real-world speech recognition. In NIPS Workshop.

Yu, D., Deng, L., Seide, F., & Li, G. (2011). Discriminative pre-training of deep neural networks. U.S. Patent No. 9,235,799, filed in 2011, granted in 2016.

Zue, V. (1985). The use of speech knowledge in automatic speech recognition. Proceedings of the IEEE, 73.
