Go crawler restricted to topic-relevant, curated URLs. Includes token frequency analysis and NLP n-gram detection.
Updated May 10, 2023 - Go
Minimalistic REST API for scraping Tweets via queries to Twitter's advanced search. It mainly wraps the scraper module made by n0madic in a REST API. Created for my final thesis in text mining.
PipeIt is a text transformation, conversion, cleansing and extraction tool.
Find text even if it doesn't want to be found
A news crawler in go
Extract indicators of compromise from text, including "escaped" ones.
textextract is a tiny library (87 lines of Go) that identifies where the article content is in an HTML page (as opposed to navigation, headers, footers, ads, etc.), extracts it, and returns it as a string. Like Boilerpipe, but for Go and written in Go.
A concurrent solution, built in Go, for performing keyword density analysis on large bodies of text.
Extract content from HTML by removing unwanted boilerplate text.
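Several of the projects listed here mention token frequency analysis and n-gram detection. As a rough illustration of what those terms mean, here is a minimal sketch using only the Go standard library. It is not taken from any of the repositories above; the function names `Tokenize`, `TokenFrequency`, and `NGrams` are hypothetical, and the tokenizer assumes that any rune other than a letter or digit separates tokens.

```go
package main

import (
	"strings"
	"unicode"
)

// Tokenize lowercases text and splits it on every rune that is not a
// letter or digit. This is an assumed, deliberately simple tokenizer.
func Tokenize(text string) []string {
	return strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// TokenFrequency counts how often each token occurs.
func TokenFrequency(tokens []string) map[string]int {
	freq := make(map[string]int, len(tokens))
	for _, t := range tokens {
		freq[t]++
	}
	return freq
}

// NGrams counts each run of n consecutive tokens, joined by single spaces.
// It returns an empty map when n is not positive or exceeds len(tokens).
func NGrams(tokens []string, n int) map[string]int {
	grams := make(map[string]int)
	if n <= 0 || n > len(tokens) {
		return grams
	}
	for i := 0; i+n <= len(tokens); i++ {
		grams[strings.Join(tokens[i:i+n], " ")]++
	}
	return grams
}
```

Calling `NGrams(Tokenize(text), 2)` yields bigram counts for `text`. Dividing a token's count from `TokenFrequency` by the total number of tokens gives a simple keyword density figure.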