
10.Natural language processing.Common packages for natural language processing

Ricardo Pietrobon edited this page Dec 26, 2020 · 2 revisions

Common packages for natural language processing

  1. Beautiful Soup is a Python library for pulling data out of HTML and XML files.
  2. text2vec is an R package that provides a source-agnostic streaming API, which lets researchers analyze collections of documents that are larger than available RAM.
  3. rvest is an R package that works with magrittr to make it easy to scrape information from web pages, much as Beautiful Soup does in Python.
  4. tidytext is an R package for text mining, covering word processing and sentiment analysis, using 'dplyr', 'ggplot2', and other tidy tools.
  5. stringr is a particularly handy R package for working with regular expressions, as it provides a few useful pattern matching functions.
  6. spacyr provides a convenient R wrapper around the spaCy Python package, making it easy to access the powerful functionality of spaCy in a simple format.
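
The streaming idea mentioned for text2vec (processing documents one at a time so the whole collection never has to fit in RAM) can be illustrated with a minimal Python sketch. This is not text2vec's API: the function names, the tokenizer pattern, and the two demo documents are all hypothetical, and only the Python standard library is used.

```python
import re
from collections import Counter
from typing import Iterable, Iterator, List

# Hypothetical tokenizer rule: runs of lowercase letters and apostrophes.
TOKEN_PATTERN = re.compile(r"[a-z']+")


def iter_tokens(documents: Iterable[str]) -> Iterator[List[str]]:
    """Yield the token list of one document at a time."""
    for text in documents:
        yield TOKEN_PATTERN.findall(text.lower())


def count_terms(documents: Iterable[str]) -> Counter:
    """Build term counts while holding only one document in memory."""
    counts: Counter = Counter()
    for tokens in iter_tokens(documents):
        counts.update(tokens)
    return counts


def demo_documents() -> Iterator[str]:
    """Stand-in for a large source such as files read lazily from disk."""
    yield "Text mining is fun."
    yield "Mining text at scale needs streaming."


term_counts = count_terms(demo_documents())
print(term_counts.most_common(2))
```

Because `count_terms` accepts any iterable, the same function works whether the documents come from a list, a file handle, or a database cursor, which is roughly what "source-agnostic" means here.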