Visualising sentiment trends from real-time social media data
sentiment-sweep.com
Important
As of Feb 2023, X no longer provides a free API. This makes it impractical to continue running this app affordably or at scale.
Furthermore, now that general-purpose AI models are faster, more available and less expensive than ever before, the static analysis approach used here no longer makes sense, given that AI sentiment analysis will yield much more accurate and insightful results.
For these reasons, the public instance of Sentiment Sweep is no longer available, and this project will no longer be maintained.
We had a good run! The Sentiment Sweep website ran for nearly a decade, saw upwards of 1 million sessions, and won 2 awards. It was a great learning experience, and a fun little project. Thank you to everybody who used, supported and visited the website!
A project to make large quantities of social media data more understandable.
The app streams live social media data and runs it through a custom sentiment analysis algorithm to determine trends, which are then visualised with a series of dynamic real-time charts.
The aim of the app is to allow trends to be found between sentiment and other factors such as geographical location, time of day, demographics, similar topics, etc.
It has a range of uses, like analysing the effectiveness of a marketing campaign, comparing competing products, viewing local trends, gauging public opinion by location, determining best time of day to advertise to certain audiences, etc.
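As a rough illustration of how sentiment can be related to another factor such as time of day, the sketch below groups scored posts by hour and averages the sentiment in each group. The record shape (`sentiment`, `createdAt`), the sample values and the function name `averageSentimentBy` are assumptions made for this example, not the app's actual data model.

```javascript
// Hypothetical sample data: each record has a sentiment score and an ISO
// timestamp. The field names are assumptions made for illustration.
const samples = [
  { sentiment: 0.6, createdAt: '2015-06-01T09:15:00Z' },
  { sentiment: 0.2, createdAt: '2015-06-01T09:45:00Z' },
  { sentiment: -0.4, createdAt: '2015-06-01T18:05:00Z' },
  { sentiment: -0.8, createdAt: '2015-06-01T18:30:00Z' },
];

// Group records by a key function and return the mean sentiment per group.
function averageSentimentBy(records, keyFn) {
  const totals = new Map();
  for (const record of records) {
    const key = keyFn(record);
    const entry = totals.get(key) || { sum: 0, count: 0 };
    entry.sum += record.sentiment;
    entry.count += 1;
    totals.set(key, entry);
  }
  const averages = {};
  for (const [key, { sum, count }] of totals) {
    averages[key] = sum / count;
  }
  return averages;
}

// Key by UTC hour of day so the result does not depend on the local time zone.
const byHour = averageSentimentBy(samples, (r) => new Date(r.createdAt).getUTCHours());
console.log(byHour);
```

Passing a different key function (for example one that returns a region name) would give the same kind of breakdown by location.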
A live demo is available at: http://sentiment-sweep.com (archived)
This project was initially developed in 2015. Some of the technologies used are a little outdated now, although the app still works well. A few of the external services that were used to provide additional context (such as HP Idol on Demand, IBM Watson, and certain GCP features) have been discontinued, meaning certain features may now be unavailable on the live instance.
See the Dev Setup docs for local dev setup.
1. Prerequisites - You will need Node.js, MongoDB and git installed on your system.
2. Get the files - `git clone https://github.com/Lissy93/twitter-sentiment-visualisation.git tsv` then `cd tsv`
3. Install dependencies - `npm i` / `yarn` will download requirements into node_modules, then automatically kick off a `bower install` for frontend libraries
4. Set Config - `yarn run config` will generate the `config\src\keys.coffee` file, which you will then need to populate with your API keys and save
5. Apply Settings - Check that you're happy with the general app config in `config/src/app-config.coffee`
6. Build Project - `yarn build` will compile the project from source, outputting files into dist ready to be published
7. Start MongoDB - `mongod` will start a MongoDB instance (run it in a separate terminal instance, see instructions: Starting a MongoDB instance)
8. Run the project - `yarn dev` will build the project and start the dev server, with live-reload and auto-testing
9. Open Browser - Navigate to the specified port to view the running app, e.g. http://localhost:8080
See the [Prod Deployment](/docs/build-environment.md) docs for more info.
Follow the instructions above, then:
1. Execute Tests - `yarn test` Ensure all tests pass and everything is working as expected
2. Build for Prod - `yarn build` Compile all source files to the dist directory
3. Start Server - `yarn start` Spin up the HTTP server to start the API and serve up the compiled files
See the Test Strategy Docs for more info.
TSV is fully unit tested, and follows a BHD pattern. Unit, integration, coverage and dependency tests can be run using `yarn test`.
Pass/ Fail Criteria
Test Type | Pass Condition |
---|---|
Functional Testing | All acceptance criteria must be met, checked and documented |
Unit Tests | 100% of unit tests must pass. It will be immediately clear when a unit test is failing |
Integration Tests | 100% pass rate after every commit |
Coverage Tests | 80% or greater |
Code Reviews | B grade/ Level 4 or higher. Ideally A grade/ Level 5 if possible. |
Dependency Checks | Mostly up-to-date dependencies except in justified circumstances. |
Testing Tool
- Framework - Mocha
  - Used in order to store, write and run the tests in a structured way
- Assertion Library - Chai
  - Provides a structure and syntax in order to actually write the test cases
- Coverage Testing - Istanbul
  - Measures the proportion of your source code that is covered by your unit tests
- Stubs, Spies and Mocking - Sinon.js
  - Mocking removes the need to call production APIs while running frontend unit tests
- Continuous Integration Testing - Travis CI
  - Ensures that all the standalone modules function correctly when put together
- Dependency Checking - David
  - Checks that each dependency is present, correct, secure and functional
- Automated Code Reviews - Code Climate
  - Scans for best practices, and fails if any part of the code could be improved upon
- Headless Browser Testing - PhantomJS
  - Runs frontend tests without the need for a GUI browser
- Testing HTTP services - SuperTest
  - Tests API endpoints and ensures routing is working correctly
TSV uses the Gulp streaming build tool to automate the prod and dev workflows. For more info, see the Build Environment docs.
The following tasks are useful for getting started:
- `gulp generate-config` - Generates correctly structured default configuration files for settings and API keys
- `gulp build` - Builds the project fully, including optimization, compilation, minification and validation
- `gulp nodemon` - Runs the application on the default port (probably 8080), with live refresh
- `gulp test` - Executes all unit and coverage tests, and generates a report containing the results
- `gulp` - Default dev task - checks the project is configured correctly, builds ALL the files, runs the server, watches for changes, recompiles relevant files and reloads browsers on change, and keeps all browsers in sync. When a test condition changes it will also re-run tests - a lot going on!
The project was developed in a modular approach, made up of several distinct components. Each is published as a fully tested, documented and MIT-licensed NPM module for easy re-use.
- sentiment-analysis - Uses the AFINN-111 approach to calculate the overall sentiment of a given sentence
- fetch-tweets - Fetches tweets from Twitter based on topic, location, timeframe or combination
- stream-tweets - Streams live Twitter data in real-time, based on location, given term, etc
- remove-words - Removes all non-key words from a given string
- place-lookup - Finds the latitude and longitude for any fuzzy place name using the Google Places API
- hp-haven-sentiment-analysis - A Node.js client library for HP Haven OnDemand Sentiment Analysis module
- haven-entity-extraction - Node.js client for HP Haven OnDemand Entity Extraction
- tweet-location - Calculates the location from geo-tagged Tweets using the Twitter Geo API
- find-region-from-location - Given a latitude and longitude calculates which region that point belongs in
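To give a feel for the word-list approach that the sentiment-analysis module is described as using, here is a minimal sketch of AFINN-style scoring: each known word carries an integer valence, and the score of a sentence is the sum over its words. The six-word lexicon, its values and the function names (`tokenize`, `scoreSentiment`) are illustrative assumptions; this is not the published module's API, nor the real AFINN-111 list.

```javascript
// Tiny hypothetical lexicon in the AFINN style (integer valence per word).
// The values are illustrative and this is not the real AFINN-111 word list.
const lexicon = new Map([
  ['good', 3], ['great', 3], ['love', 3],
  ['bad', -3], ['terrible', -3], ['hate', -3],
]);

// Split text into lowercase word tokens, dropping punctuation and digits.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z']+/g) || [];
}

// Sum the valence of every known word; unknown words contribute zero.
// The comparative score divides by the token count so that texts of
// different lengths can be compared.
function scoreSentiment(text) {
  const tokens = tokenize(text);
  const score = tokens.reduce((sum, word) => sum + (lexicon.get(word) || 0), 0);
  const comparative = tokens.length > 0 ? score / tokens.length : 0;
  return { score, comparative, tokens };
}

console.log(scoreSentiment('I love this great phone'));
console.log(scoreSentiment('Terrible battery, bad screen!'));
```

A plain sum like this ignores negation and sarcasm, which is one reason a lexicon approach is less accurate than model-based analysis.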
A set of User Stories with Acceptance Criteria and Complexity Estimates were drawn up outlining what features the finished solution should have. These were expanded upon further with wireframes in the Methodology section.
View full tech stack at: stackshare.io/Lissy93/sentiment-sweep
The backend is primarily written in Node.js, with web-sockets facilitating the real-time communication with the frontend, and a data cache stored in MongoDB. Pages are rendered isomorphically, with data visualizations written using D3.js. Social data is fetched from Twitter, compute happens locally, and a few external APIs were used to provide additional context in the form of AI. Views are written in Pug, styles in Less, scripts in CoffeeScript and everything is compiled via a Gulp script.
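The data flow described above (incoming posts, local scoring, results pushed to the frontend) can be sketched with Node's built-in `EventEmitter`. In this sketch one emitter stands in for the live Twitter stream and another for the web-socket layer; both stand-ins, the event names (`tweet`, `update`) and the placeholder `scoreText` function are assumptions for illustration, not the project's actual code.

```javascript
const { EventEmitter } = require('events');

// Stand-in for the live tweet stream; no external API is called here.
const tweetStream = new EventEmitter();

// Stand-in for the web-socket layer that pushes results to connected browsers.
const clientChannel = new EventEmitter();

// Placeholder scorer (an assumption): +1 per "good", -1 per "bad".
function scoreText(text) {
  const words = text.toLowerCase().split(/\W+/);
  const positives = words.filter((w) => w === 'good').length;
  const negatives = words.filter((w) => w === 'bad').length;
  return positives - negatives;
}

// Keep a running mean so each update carries the overall trend so far.
let count = 0;
let total = 0;
tweetStream.on('tweet', (tweet) => {
  const sentiment = scoreText(tweet.text);
  count += 1;
  total += sentiment;
  clientChannel.emit('update', {
    text: tweet.text,
    sentiment,
    runningAverage: total / count,
  });
});

// A "browser" subscribing to updates; here it simply collects them.
const received = [];
clientChannel.on('update', (update) => received.push(update));

// Simulate two incoming posts.
tweetStream.emit('tweet', { text: 'Good coffee, good morning' });
tweetStream.emit('tweet', { text: 'Bad traffic today' });
console.log(received);
```

Because each stage only talks to the next through events, the stream source or the scorer could be swapped without touching the code that delivers updates to clients.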
The project and app are still functional, however 5 years on, this would not be an ideal tech stack. There are now better technologies available that would enable greater performance, less code, easier project management and improved developer experience. If I was to re-write this project in 2022, a better tech stack would likely be Go for the backend, Svelte + Svelte Kit for the frontend and TypeScript for the code, with Pixi.js for the interactive content, styled-components for styling and Rollup for putting it all together.
A live demo of the application has been deployed to: http://sentiment-sweep.com
View Screenshots of each screen in the docs.
The first stages of the project were developed at StartHack Switzerland 2014, where it won first-place.
It was then further expanded upon, and used as part of my undergraduate thesis, where it won the Oxford BCS best Dissertation Award.
The University Project received 96%, so feel free to use it as an example - here's the Final Report in PDF format (warning - it's 300 pages!). The deck used for the technical presentation is available at: presentation.sentiment-sweep.com
- Development Documentation
- Project Information
- Project Planning
- Research
twitter-sentiment-visualisation was developed by Alicia Sykes, licensed under MIT © 2014 - 2022.
For information, see TLDR Legal > MIT
The MIT License (MIT)
Copyright (c) Alicia Sykes <alicia@omg.com>
Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation files
(the "Software"), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge,
publish, distribute, sub-license, and/or sell copies of the Software,
and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANT ABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON INFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
© Alicia Sykes 2015 - 2020
Licensed under MIT
Thanks for visiting :)