Searchable tweet archive, powered by Hibernate Search and Spring Boot
This is a personal tweet archive. Its purpose is to store your tweets, retweets, quoted tweets and favorites in a PostgreSQL database and provide full-text search with Hibernate Search.
So far, storing your tweets and retweets is implemented, as is deleting tweets if the application is configured to track your account.
There's a super simple "interface" to upload an archive generated by Twitter itself, but the search is only available as a REST endpoint. Be aware: no security has been implemented yet, so don't run this publicly if you don't want to expose your Twitter history!
There's a lengthy blog post on why and how this application was written:
How to build and run
To build this project, you'll need a valid Java 1.8 installation and either a Docker installation or a local PostgreSQL database running on localhost:5432 with a schema named tweetArchive and a user tweetArchive whose password is also tweetArchive.
To run a PostgreSQL database instance inside a Docker container, use
docker run --name tweet-archive-db-dev -e POSTGRES_USER=tweetArchive -e POSTGRES_PASSWORD=tweetArchive -p 5432:5432 -d postgres
The database connection can be configured through Spring Boot's usual configuration mechanisms (properties files, command line arguments or environment variables).
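As a minimal sketch of one such mechanism: Spring Boot maps environment variables such as SPRING_DATASOURCE_URL onto the corresponding spring.datasource.* properties. This assumes the application reads its connection from those standard properties, which this README does not confirm, and the values below are just the defaults from the Docker command above.

```shell
# Assumption: the application uses Spring Boot's standard spring.datasource.*
# properties. Adjust host, port, database name and credentials to your setup.
export SPRING_DATASOURCE_URL="jdbc:postgresql://localhost:5432/tweetArchive"
export SPRING_DATASOURCE_USERNAME="tweetArchive"
export SPRING_DATASOURCE_PASSWORD="tweetArchive"
```

If the assumption holds, an application started from the same shell should pick these values up without any change to the jar.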
Build the application
./mvnw clean verify
Register an application with Twitter and generate access tokens (optional)
Note: This is an optional step. If you just want to upload an existing Twitter archive, skip it. If you want the tweet-archive to track your new tweets and deletions, follow the instructions.
Log in to Twitter, open https://apps.twitter.com and hit "Create new app". This will be your tweet archive. Give it a reasonable name. You won't need a callback URL. The program doesn't need write permissions, so I'd remove them. Then go to "Keys and Access Tokens" and note the consumer key and secret. Take these values and run
java -jar target/tweetarchive-0.0.1-SNAPSHOT.jar --generate-tokens consumer_key,consumer_secret
Follow the instructions. You'll need to open a URL like https://api.twitter.com/oauth/authorize?oauth_token=someToken. Do this and copy the PIN you get into the shell.
This will generate a properties file containing your app's consumer key and secret as well as an access token and secret for your account.
You can now start the application with
java -jar target/tweetarchive-0.0.1-SNAPSHOT.jar
if you have a local PostgreSQL database ready.
Or build a Docker image
If you plan to run this permanently, install Docker for your platform and run:
./mvnw clean verify docker:build
This will create one Docker image based on the official Java image, containing this application and a link to a Docker container running PostgreSQL.
Run the Docker image
After the above step, run
It will start a PostgreSQL container and this app's container. The database files will be stored inside ./var/db/prod and the Lucene search index at ./var/index/prod, so that data won't vanish if you stop and restart the containers.
Use the application
I assume that you used the Docker method. If you configured your credentials, the application will track your new tweets.
Upload a Twitter archive
Open http://localhost:8980/upload and upload the file you received from Twitter. That takes a while depending on the size, but you'll get a notice eventually.
Search your tweets
The following are only examples.
All tweets containing the keyword Java:
curl -X "GET" "http://localhost:8980/search?q=java"
All tweets containing the keyword JavaOne from autumn 2015:
curl -X "GET" "http://localhost:8980/search?q=JavaOne&from=2015-10-15&to=2015-10-31"
All tweets that I sent to Vlad:
curl -X "GET" "http://127.0.0.1:8980/extendedSearch?q=reply.to:vlad_mihalcea"
The "extendedSearch" endpoint supports all Lucene queries and escapes. All books I read in 2015:
curl -X "GET" "http://127.0.0.1:8980/extendedSearch?q=%22Gelesen%20%2F%20Read:%22%20AND%20year:2015"
All tweets I sent from my iPhone:
curl -X "GET" "http://127.0.0.1:8980/extendedSearch?q=source:%22Twitter%20for%20iPhone%22"
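The hand-encoded extendedSearch URLs above get hard to read once a query contains quotes, spaces or slashes. One way around that, sketched here as a small shell helper, is to let curl do the URL encoding via its -G and --data-urlencode options. The function name extended_search and the variable TWEET_ARCHIVE_URL are made up for this example and are not part of the project.

```shell
# Hypothetical helper, not part of the project: sends a raw Lucene query to
# the extendedSearch endpoint and lets curl URL-encode the q parameter.
# TWEET_ARCHIVE_URL is an assumed variable name; its default matches the
# host and port used in the examples above.
extended_search() {
  curl -sS -G "${TWEET_ARCHIVE_URL:-http://127.0.0.1:8980}/extendedSearch" \
       --data-urlencode "q=$1"
}

# Usage, with the same queries as above written in plain Lucene syntax:
#   extended_search '"Gelesen / Read:" AND year:2015'
#   extended_search 'source:"Twitter for iPhone"'
```

Single quotes around the query keep the shell from touching the embedded double quotes, so the Lucene phrase syntax reaches curl unchanged.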
As you can see, you can do a lot with a simple archive application.