Who Wrote This Server
A simple Python-based Flask application serving a web-based tool that allows users to search for "prototypical articles" published by a news agency through the method described in "Machine Learning Techniques for Detecting Identifying Linguistic Patterns in the News Media" by A Samuel Pottinger.
This tiny web application uses an inverted index, allowing users to query a dataset for articles most like their publishing agency. For example, it can provide the most NPR-like NPR articles across all topics or the most CNN-like articles about climate change. The user can do this by providing optional search queries through a web-based UI to review coverage of a topic across many different media outlets.
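As a rough illustration of the inverted index idea described above, the sketch below maps each token to the set of article IDs containing it, then ranks the matching articles by a per-article score. Everything here is an assumption made for illustration: the `Article` record, the `build_index` and `search` helpers, the `score` field (standing in for how strongly a model associates the article with its publisher), and the sample data are hypothetical and not taken from this repository. The actual server may structure its index differently.

```python
import collections
import dataclasses


@dataclasses.dataclass(frozen=True)
class Article:
    """Hypothetical article record; not the server's actual schema."""
    article_id: int
    source: str
    title: str
    score: float  # Assumed model confidence that the article is from `source`.


def build_index(articles):
    """Maps each lowercase title token to the set of article IDs containing it."""
    index = collections.defaultdict(set)
    for article in articles:
        for token in article.title.lower().split():
            index[token].add(article.article_id)
    return index


def search(index, articles_by_id, query, source=None):
    """Returns articles matching all query tokens, highest score first."""
    tokens = query.lower().split()
    if tokens:
        # Intersect posting sets so every query token must appear.
        matching_ids = set.intersection(*(index.get(t, set()) for t in tokens))
    else:
        # An empty query matches every article.
        matching_ids = set(articles_by_id)
    results = [articles_by_id[i] for i in matching_ids]
    if source is not None:
        results = [a for a in results if a.source == source]
    return sorted(results, key=lambda a: a.score, reverse=True)


# Made-up sample data for demonstration only.
ARTICLES = [
    Article(1, "NPR", "Climate change report released", 0.91),
    Article(2, "CNN", "Climate change talks stall", 0.84),
    Article(3, "NPR", "Local jazz festival returns", 0.97),
    Article(4, "CNN", "Markets react to climate policy", 0.62),
]
ARTICLES_BY_ID = {a.article_id: a for a in ARTICLES}
INDEX = build_index(ARTICLES)
```

With this sample data, `search(INDEX, ARTICLES_BY_ID, "", source="NPR")` corresponds to "the most NPR-like NPR articles across all topics", while `search(INDEX, ARTICLES_BY_ID, "climate change")` compares coverage of one topic across outlets.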
This application requires Python 3 and pip. Other dependencies can be installed via
$ pip install -r requirements.txt
If using telemetry, provide the following env vars:
The application is deployed publicly at https://whowrotethis.com and serves a UI for users at the root URL. To run it locally, execute
$ python application.py
and navigate to the URL printed.
Automated tests are provided using the Python-standard
unittest library. Users can execute via
Please unit test and follow the Google Python Style Guide where possible.
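For contributors unfamiliar with these conventions, here is a minimal sketch of a unittest test case with a Google-style docstring. The `rank_by_score` helper is hypothetical and exists only to give the tests something to exercise; it is not a function from this repository.

```python
import unittest


def rank_by_score(scored_titles):
    """Sorts (title, score) pairs from highest to lowest score.

    Hypothetical helper for illustration; not part of this repository.

    Args:
        scored_titles: Iterable of (title, score) tuples.

    Returns:
        A list of titles ordered by descending score.
    """
    ordered = sorted(scored_titles, key=lambda pair: pair[1], reverse=True)
    return [title for title, _ in ordered]


class RankByScoreTests(unittest.TestCase):

    def test_orders_by_descending_score(self):
        result = rank_by_score([("a", 0.2), ("b", 0.9), ("c", 0.5)])
        self.assertEqual(result, ["b", "c", "a"])

    def test_empty_input(self):
        self.assertEqual(rank_by_score([]), [])
```

Keeping each test focused on one behavior, as in this sketch, tends to make failures easier to read.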
Note that this is one of a series of related projects:
- who-wrote-this-training: logic for machine learning.
- who-wrote-this-server: web application to demo the model.
- who-wrote-this-news-crawler: crawler to record RSS feeds.
who_wrote_this_data.zip, and HTML files are released under CC BY-NC 4.0. The following libraries are used:
- Flask used under the BSD license.
- itsdangerous used under the BSD license.
- Jinja2 used under the BSD license.
- MarkupSafe used under the BSD license.
- pg8000 used under the BSD license.
- Werkzeug used under the BSD license.