10. Natural language processing: Common packages for natural language processing
Ricardo Pietrobon edited this page Dec 26, 2020
- Beautiful Soup is a Python library for pulling data out of HTML and XML files.
- text2vec is an R package that provides a source-agnostic streaming API, which allows researchers to analyze collections of documents that are larger than available RAM.
- rvest is an R package that works with magrittr to make it easy to scrape information from web pages, much as Beautiful Soup does in Python.
- tidytext supports text mining for word processing and sentiment analysis using dplyr, ggplot2, and other tidy tools.
- stringr is a particularly handy R package for working with regular expressions, as it provides a set of useful pattern-matching functions.
- spacyr provides a convenient R wrapper around the Python package spaCy, making it easy to access spaCy's powerful functionality in a simple format.
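
As a rough illustration of the "pulling data out of HTML" use case mentioned for Beautiful Soup above, here is a minimal sketch. It assumes the `beautifulsoup4` package is installed and uses Python's built-in `html.parser` backend. The HTML snippet, the `note` class, and the helper names `extract_notes` and `extract_links` are hypothetical, made up for this example only.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet, invented purely for illustration.
HTML = """
<html><body>
  <h1>Clinic notes</h1>
  <p class="note">Patient reports mild headache.</p>
  <p class="note">No fever observed.</p>
  <a href="https://example.org/more">More</a>
</body></html>
"""


def extract_notes(html):
    """Return the text of every <p class="note"> element (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    return [p.get_text(strip=True) for p in soup.find_all("p", class_="note")]


def extract_links(html):
    """Return the href of every <a> element that has one (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    return [a["href"] for a in soup.find_all("a", href=True)]


if __name__ == "__main__":
    print(extract_notes(HTML))
    print(extract_links(HTML))
```

The extracted strings could then be passed on to a text-mining step. In R, rvest plays a comparable role for the scraping part.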