Assignment #1 - Automated News Writing
Out: February 13, 2017
Due: March 3, 2017 (5pm)
Overview
The goal of this assignment is to code and write a template that automatically produces stories that might contribute to ongoing coverage of fatal shootings by police officers in the US. The code and text you write will take data from the Washington Post Fatal Police Shooting database and output a written story that gives an overview, describes the specifics of a single row of data, and adds any further context. The story should be compellingly written, should be distinct for each input row while placing that row in the context of the entire dataset, and should use conditional logic to alter the story text based on data values. Each row of data should produce a variant of the story (though there will obviously be some commonalities as well). Each story should be at least 250 words long.
You'll use the Jinja Template Engine to complete the assignment.
Getting Started
You'll want to spend some time exploring the dataset. What's interesting or insightful about the dataset as a whole? Those facts and insights should be integrated into your story, but don't forget to include the specifics of the particular row of data that is generating the story. In other words, your story should combine an overview of the data with specifics about that row.
- Be sure to create at least 2 (and probably more than that) derived or aggregated data columns from the dataset that can enhance your story (e.g. averages, trends, counts etc).
- Be sure to use synonym sets (synsets) to add variability to your writing.
- Be sure to consider other general context that you find through research to introduce and contextualize the issue.
- Be sure to consider the tone and style of your writing.
- Be sure that there is some variability in the stories produced for different rows of data (wouldn't it be boring if all the stories were too similar?).
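As a rough illustration of the derived-column and synset points above, here is a minimal sketch in plain Python. It is not the skeleton notebook's code. The rows are made-up placeholders rather than real records, the field names ("state", "date", "age") are assumptions about the dataset's columns, and the helper names add_derived_columns and synonym are hypothetical.

```python
import random
from collections import Counter
from statistics import mean

# Made-up placeholder rows, NOT real records. The field names
# ("state", "date", "age") are assumptions about the dataset's columns.
ROWS = [
    {"name": "Person A", "date": "2016-01-04", "state": "TX", "age": 34},
    {"name": "Person B", "date": "2016-02-11", "state": "TX", "age": 27},
    {"name": "Person C", "date": "2016-02-20", "state": "CA", "age": None},
    {"name": "Person D", "date": "2017-01-09", "state": "CA", "age": 45},
]


def add_derived_columns(rows):
    """Return (enriched_rows, average_age) with aggregate context per row."""
    state_counts = Counter(r["state"] for r in rows)
    year_counts = Counter(r["date"][:4] for r in rows)
    ages = [r["age"] for r in rows if r["age"] is not None]
    avg_age = mean(ages) if ages else None

    enriched = []
    for r in rows:
        new = dict(r)  # copy so the input rows are left untouched
        new["state_count"] = state_counts[r["state"]]
        new["year_count"] = year_counts[r["date"][:4]]
        if r["age"] is None or avg_age is None:
            new["age_vs_avg"] = None
        else:
            new["age_vs_avg"] = r["age"] - avg_age
        enriched.append(new)
    return enriched, avg_age


# A hypothetical synonym set: each key maps to interchangeable phrasings.
SYNSETS = {
    "said": ["said", "reported", "stated"],
    "according_to": ["according to", "based on", "as recorded in"],
}


def synonym(key, rng=random):
    """Pick one phrasing for key; fall back to the key itself if unknown."""
    return rng.choice(SYNSETS.get(key, [key]))


enriched, avg_age = add_derived_columns(ROWS)
```

Each enriched row now carries its own values plus dataset-level context (state_count, year_count, age_vs_avg), which a template can branch on. Passing a seeded random.Random instance as rng makes the synonym choice reproducible while you debug.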
See the skeleton notebook for helpful code snippets that demonstrate the Jinja template engine as well as how to apply filters and conditional logic to your templates.
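For orientation, a minimal sketch of conditional logic and a custom filter in Jinja might look like the following. It assumes the jinja2 package is installed. The template wording, the article filter, the render_story helper, and the field names ("name", "age", "city", "armed") are all hypothetical and are not taken from the skeleton notebook.

```python
from jinja2 import Environment


def article(word):
    """Hypothetical filter: choose 'a' or 'an' from the first letter only."""
    return "an" if word and word[0].lower() in "aeiou" else "a"


env = Environment()
env.filters["article"] = article  # register the custom filter by name

# Field names here are assumptions about the dataset's columns.
TEMPLATE = (
    "{{ name }}, "
    "{% if age is none %}whose age was not reported{% else %}{{ age }}{% endif %}, "
    "was shot in {{ city }}. "
    "{% if armed == 'unarmed' %}Records list the person as unarmed."
    "{% elif armed %}Records indicate the person had {{ armed | article }} {{ armed }}."
    "{% else %}Records do not say whether the person was armed.{% endif %}"
)


def render_story(row):
    """Render one story fragment from a dict holding a single row's values."""
    return env.from_string(TEMPLATE).render(**row)


print(render_story({"name": "Person A", "age": 34, "city": "Austin", "armed": "knife"}))
```

The if / elif / else chain gives three distinct sentences depending on the armed value, and the age test keeps a missing age from being printed as "None".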
The skeleton file also has a link to the dataset to use for the assignment. However, your code will be tested with new data that you haven't seen before so that there's a realistic assessment of whether it can adapt and be flexible to future data.
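Because unseen rows may contain blanks or oddly typed values, one possible precaution is to normalise each row before it reaches the template. The sketch below is only one way to do this. The clean_row helper is hypothetical, and treating "age" as a numeric field is an assumption about the dataset.

```python
def clean_row(row):
    """Return a copy of row with blank or NaN values set to None and age as an int."""
    cleaned = {}
    for key, value in row.items():
        if isinstance(value, str):
            value = value.strip()
        # NaN is the only float that is not equal to itself.
        if isinstance(value, float) and value != value:
            value = None
        if value == "":
            value = None
        cleaned[key] = value

    # "age" as a numeric column is an assumption; anything unparseable becomes None.
    age = cleaned.get("age")
    try:
        cleaned["age"] = int(float(age)) if age is not None else None
    except (TypeError, ValueError):
        cleaned["age"] = None
    return cleaned


print(clean_row({"name": " Person A ", "age": "34.0", "armed": ""}))
```

With every missing value reduced to None, the template only needs one kind of check per field instead of separate tests for empty strings, NaN, and absent keys.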
Brief Report
In addition to the completed notebook, you should write a report that includes (1) a description of your writing approach and the rationale for your process and content (e.g. what does your story include and why); (2) reflections on the assignment, including, for instance, challenges, difficulties, and what you might do differently next time; and (3) a more general reflection on how you see automation in the news media evolving: Besides crime, what other stories might it be useful for? Where is it too limited? Do you think it should be adopted broadly by news organizations?
Submission
This is an individual assignment and you may NOT work in groups. All work should be your own. It is fine to use code snippets you find online, but you must clearly note this and include a comment with a URL linking to the original source.
You will be evaluated on the quality of the text generation templates and rules you produce (e.g. complexity, variability in vocabulary or structure, use of derived or aggregate data, conditional logic, compelling and interesting writing); the quality of your written report (easy to follow; explains assumptions, reasoning, and evidence; thoughtful and plausible reflections); and functionality on new data inputs (e.g. runs and outputs accurate, readable stories with no grammatical errors).
You should submit (1) your report of less than 600 words (excessively short or long write-ups will be penalized), (2) two different example stories generated by your template, and (3) your .ipynb file so that your analysis can be re-run. Submit parts 1 and 2 as a single PDF.
Mail the .pdf (filename of ASGN1_<your lastname>.pdf) of your write-up, and the .ipynb (filename of ASGN1_<your lastname>.ipynb) to Professor Diakopoulos: nad@umd.edu by the due date.