Home

LineMiner

A tool for "distant reading" via word searches in temporalized text files, for example CSVs where every line is a textual unit with a timestamp (tweet, comment, etc.). It should work with any file that has some text and a timestamp per line. It currently detects files exported from Netvizz, YouTube Data Tools, DMI-TCAT, and Reddit Tools automatically.
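To illustrate the idea of a word search over timestamped lines, here is a minimal Python sketch. It is not LineMiner's code (the tool itself runs on PHP and JavaScript). The column names `timestamp` and `text`, the function name `count_matches_per_day`, and the sample rows are all assumptions made up for the example.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def count_matches_per_day(rows, query, time_col="timestamp", text_col="text"):
    """Count rows whose text contains `query` (case-insensitive), grouped by day."""
    counts = Counter()
    needle = query.lower()
    for row in rows:
        if needle in row[text_col].lower():
            # Reduce the full timestamp to a calendar day (YYYY-MM-DD).
            day = datetime.fromisoformat(row[time_col]).date().isoformat()
            counts[day] += 1
    return dict(counts)


# Hypothetical sample data: one textual unit with a timestamp per line.
sample = io.StringIO(
    "timestamp,text\n"
    "2020-01-01 10:00:00,I love cats\n"
    "2020-01-01 12:30:00,Cats are great\n"
    "2020-01-02 09:15:00,Dogs are fine too\n"
    "2020-01-02 18:45:00,my cat sleeps all day\n"
)
result = count_matches_per_day(csv.DictReader(sample), "cat")
print(result)
```

The sketch uses plain substring matching, so a search for "cat" also matches "cats". Whether LineMiner matches whole words or substrings is not stated here.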

Requirements

LineMiner uses basic PHP (> 5.3) for server-side processing and JavaScript for interface and visualization. An SSD is recommended when working with larger files.

Installation

Clone the repository with Git or download the files into a directory on your server or machine. Make sure that the script can read from the folders /data and /stopwords, and write to /output.
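The folder permissions can be checked quickly before the first run. The following Python sketch is an illustration only: the helper `check_folders` is hypothetical, and it tests access for the user running the script, which may differ from the user your web server runs PHP as.

```python
import os
import tempfile


def check_folders(base_dir):
    """Return a list of problems; an empty list means all folders look fine."""
    # data and stopwords must be readable, output must be writable.
    required = {"data": os.R_OK, "stopwords": os.R_OK, "output": os.W_OK}
    problems = []
    for name, mode in required.items():
        path = os.path.join(base_dir, name)
        if not os.path.isdir(path):
            problems.append(f"{name}: missing")
        elif not os.access(path, mode | os.X_OK):
            problems.append(f"{name}: insufficient permissions")
    return problems


# Demo on a temporary directory where the output folder was forgotten.
with tempfile.TemporaryDirectory() as base:
    for name in ("data", "stopwords"):
        os.mkdir(os.path.join(base, name))
    problems = check_folders(base)
print(problems)
```

In practice you would call `check_folders` with the path of your LineMiner directory instead of a temporary one.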

File types and locations

The data files to analyze should be uploaded (via FTP) to /data and need to have either a .tab/.tsv or .csv file extension. Currently, the tool automatically detects the following files:

Netvizz comment files
YouTube Data Tools comment files
DMI-TCAT exports
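As a rough illustration of how the file extension could determine parsing, the sketch below maps the accepted extensions to a field delimiter. This is an assumption about the usual convention (comma for .csv, tab for .tab and .tsv), not a description of how LineMiner actually reads files. The names `DELIMITERS` and `delimiter_for` are hypothetical.

```python
import os

# Assumed convention: comma-separated for .csv, tab-separated for .tab/.tsv.
DELIMITERS = {".csv": ",", ".tab": "\t", ".tsv": "\t"}


def delimiter_for(filename):
    """Return the assumed delimiter for a data file, or None if unsupported."""
    ext = os.path.splitext(filename)[1].lower()
    return DELIMITERS.get(ext)


for name in ("comments.csv", "tweets.TSV", "notes.txt"):
    print(name, repr(delimiter_for(name)))
```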

Additional stopword files can be added to /stopwords, using the stopwords_nameoflanguage.txt naming scheme. Stopword files should contain one word per line. There are some stopword files to be found here.
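The stopword file format described above (one word per line, named stopwords_nameoflanguage.txt) can be read as in this Python sketch. The helper names `language_of`, `load_stopwords`, and `remove_stopwords` are hypothetical, and the lowercasing and whitespace tokenization are assumptions for the example, not documented LineMiner behavior.

```python
import os
import re
import tempfile


def language_of(filename):
    """Extract the language part from a stopwords_<language>.txt file name."""
    match = re.fullmatch(r"stopwords_(.+)\.txt", os.path.basename(filename))
    return match.group(1) if match else None


def load_stopwords(path):
    """Read one stopword per line, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}


def remove_stopwords(text, stopwords):
    """Split text on whitespace and drop every token that is a stopword."""
    return [word for word in text.lower().split() if word not in stopwords]


# Demo with a small, made-up stopword file.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "stopwords_english.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("the\nand\n\nof\n")
    language = language_of(path)
    stopwords = load_stopwords(path)

tokens = remove_stopwords("The history of the web", stopwords)
print(language, sorted(stopwords), tokens)
```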
