Skip to content

Latest commit

 

History

History
32 lines (19 loc) · 3.36 KB

README.org

File metadata and controls

32 lines (19 loc) · 3.36 KB

TwitBlog

Premise

Twitter has so much information hidden in it’s threads, but navigating them by topic is somewhat hard.

To solve this disconnect between value and accessibility, I created TwitBlog, an easily naviagable UI to explore the informative content generated by users.

As a brief overview of the tech, Twitblog uses the Twitter API and Python to collect and process tweet data, the MERN stack to store, query, and display the data, and Docker to rapidly deploy the program

Link here: Twitblog

Architecture

This project uses the MERN stack (MongoDB, Express, React, Node) with Python data processing to make finding and consuming Twitter threads easier. Specifically, the frontend is built with React and styled with Tailwind, with inspiration from the Twitter UI for each view (displaying all of an author’s threads, reading threads, etc.). The frontend is compiled and served during production, with Nginx acting as a reverse proxy and eventually a load balancer. The frontend communicates with a node backend running Express, which requests MongoDB for data. The app is fully Dockerized, with a container for the Nginx frontend and the Express backend. The project is rapidly iterated with an entire CI/CD pipeline to manage the images (on Github Images) and launch containers on the server (hosted by DigitalOcean).

Finally, the data consists of a user’s tweet history—collected through Twitter’s API—and is cleaned/processed with Python. I built an algorithm to construct “trees” of tweets where a user replies to themselves, and the most extended branch of this would be a thread. These threads are treated as a complete document and fed into a Spacy Pipeline to pick out essential keywords like nouns and phrases (and other meaningful information) and create summaries. This data is finally cleaned (to get rid of URLs and thread markers) and uploaded to MongoDB. Mongodb was the the best and easiest choice for a DB, since it is very easy to modify and upload the json data from the TwitterAPI, and has easy/fast text search capabilities.

Referencnes

This was amazing reference while figuring out tailwind Used this vue and elastic tutorial and this other tutorial to help figure out the MERN stack

For refactoring the app after solidifying my stack, I used this link. Also, this link help directly build the app and push into nginx

getting multicontainer to work is so hard, this post explains someting I found helpful

using nginx as a reverse proxy to get all requests, and channel to right ports useful link to setup for ports here digital ocean guide