Note: These lists and links are non-exhaustive. Feel free to suggest additions that you think should be here!
Books & Online Courses
Natural Language Processing
- Bird, Klein and Loper, Natural Language Processing with Python
- Jurafsky and Martin, Speech and Language Processing
- Lin and Dyer, Data-Intensive Text Processing with MapReduce
- Manning and Schütze, Foundations of Statistical Natural Language Processing
- NLP Stanford Coursera course
- Manning, Raghavan, and Schütze, Introduction to Information Retrieval. Note: a free PDF/HTML version of the book is available on the book website!
- Bishop, Pattern Recognition and Machine Learning
- Murphy, Machine Learning: A Probabilistic Perspective
- Stanford Coursera course
- UW Coursera course
- Horstmann, Scala for the Impatient. Excellent for diving quickly into Scala.
- Odersky et al, Programming in Scala (2nd edition). Excellent and authoritative (Odersky is the creator of the Scala language).
- Wampler and Payne, Programming Scala.
- Pollack, Beginning Scala
- For other options, see this list of books for learning Scala.
- Chiusano and Bjarnason, Functional Programming in Scala
- Suereth, Scala in Depth
- Wyatt, Akka Concurrency
- White, Hadoop: The Definitive Guide
- My Scala tutorials for beginning programmers. Though they are oriented toward first-time programmers, they contain much that is useful for experienced programmers getting familiar with Scala.
- The Scala REPL, expressions, variables, basic types, simple functions, saving and running programs, comments.
- Tuples, Lists, methods on Lists and Strings
- Conditional execution with if-else blocks and matching
- Iterating, mapping, filtering and counting
- Regular expressions and matching with them
- Regular expression matching and substitution with the Regex API
- Maps, Sets, groupBy, Options, flatten, flatMap
- Word counting, scala.io.Source, file access, flatMap, mutable Maps
- Objects, classes, inheritance, traits, Lists with multiple related types, apply
- Scripting, compiling, main methods, return values of functions
- SBT, scalabha, packages, build systems
- Code blocks, coding style, closures, Scala documentation project.
- Variations for computing results from sequences in Scala
- Student questions about Scala, Part 1
- Student questions about Scala, Part 2
- Incorporating and using OpenNLP in Scalabha’s SBT build system
- Basic XML processing with Scala
- My Scala tutorials about APIs and particular topics
- Scala School: Resources created by Twitter for training programmers new to Scala.
- Stackoverflow Scala Tutorial: organized links to answers on Stackoverflow that cover many questions in Scala.
- SimplyScala: tutorial and online REPL for beginning programmers
- Kojo: a learning environment that includes an interactive Scala tutorial (which was adapted from SimplyScala)
Applications and Toolkits
Note: These lists are certainly not exhaustive. Suggestions for additions welcome!
Natural language processing and machine learning
- Chalk: my evolving Scalafication of OpenNLP (see below)
- Nak: a Java/Scala machine learning toolkit (forked from OpenNLP and used by Chalk)
- OpenNLP: a Java toolkit for NLP
- Breeze: a Scala toolkit for NLP and learning
- Mallet: a Java toolkit for NLP that includes many classifiers and topic modeling
- Junto: a Java/Scala toolkit for label propagation
Other useful and relevant software
- Hadoop: Open source MapReduce implementation and related tools.
- Scoobi: Scala wrapper to Hadoop.
- Spark: Scala system for distributed computation (similar to Hadoop, but better for iterative algorithms)
- Storm: Real-time, scalable, distributed processing system.
- Akka: Java/Scala system for concurrent, distributed computing.