Tweet Archives Unleashed Toolkit (twut)


An open-source toolkit for analyzing line-oriented JSON Twitter archives with Apache Spark.

Dependencies

Getting Started

Packages

Spark Shell

$ spark-shell --packages "io.archivesunleashed:twut:0.0.4"

Jars

You can download the latest release files here and include them like so:

Spark Shell

$ spark-shell --jars /path/to/twut-0.0.4-fatjar.jar

PySpark

$ pyspark --py-files /path/to/twut-0.0.4.zip

You will need the PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON environment variables set.
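
As a minimal sketch of setting those two variables before launching PySpark, the snippet below points both at the same interpreter. It assumes `python3` is the interpreter you want Spark to use; the `PYTHON_BIN` helper variable and the `/usr/bin/python3` fallback path are illustrative and not part of twut.

```shell
# Assumption: python3 is the interpreter Spark should use; adjust for your system.
# Falls back to a conventional path if python3 is not found on PATH.
PYTHON_BIN="$(command -v python3 || echo /usr/bin/python3)"

# Point the driver and the workers at the same interpreter.
export PYSPARK_PYTHON="$PYTHON_BIN"
export PYSPARK_DRIVER_PYTHON="$PYTHON_BIN"

# Show what was set.
echo "PYSPARK_PYTHON=$PYSPARK_PYTHON"
echo "PYSPARK_DRIVER_PYTHON=$PYSPARK_DRIVER_PYTHON"
```

With both variables exported, run the `pyspark --py-files` command shown above from the same shell session so that it inherits them.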

Documentation! Or, how do I use this?

Once built or downloaded, you can follow the basic set of recipes and tutorials here.

License

Licensed under the Apache License, Version 2.0.

Acknowledgments

This work is primarily supported by the Andrew W. Mellon Foundation. Other financial and in-kind support comes from the Social Sciences and Humanities Research Council, Compute Canada, the Ontario Ministry of Research, Innovation, and Science, York University Libraries, Start Smart Labs, and the Faculty of Arts and David R. Cheriton School of Computer Science at the University of Waterloo.

Any opinions, findings, and conclusions or recommendations expressed are those of the researchers and do not necessarily reflect the views of the sponsors.
