Computational and Data Journalism Unconference
Participants in the Unconference - please feel free to edit/submit a PR with a better bio!
- Martin - Lecturer, interested in data visualisation
- Glyn - Lecturer, reporter background, interested in data and stories
- Caroline - came to see what was going on at datajconf
- Eva - Works setting up sustainable data journalism in developing countries
- Marianne - Non-profit HEI-DA, data journalism training (multi-lingual data journalism)
- Barry - Quantitative trainer, career break, interested in machine learning
- Adrian - Data analyst, working with Open Data
- Tomas - Computer Scientist, interested in tools for transparent data journalism
- David - Data journalist, physics background, Wall Street trader
- Multi-lingual Data journalism
- Open-data in closed societies
- Proxy measures when data is not available
- Tools for transparent data journalism
- Pedagogy of teaching data journalism
- Is data interesting? Is there a story?
Multi-lingual Data journalism
How do you 'do' data journalism in an area where you need to cover multiple languages?
Linguistic isolation means that examples of good journalism are few/non-existent. Technical problems aren't the worst problem.
Eva has an 800-page training manual (based on Hunter's Inquiry method) - helps to redefine journalism for them. Some language versions of this training manual are available online.
Education - there is a gap that needs to be filled.
Survey data can be collected in the local language but published in English - is there translation bias?
Data Journalism Handbook - useful primer, but is more a collection of essays than a really useful training manual.
David - Balkans has a similar problem - lots of small countries with inter-connected problems.
Sharing of data and creation of networks.
- Excel deals with UTF-8 poorly
- OpenRefine has very poor language support
- Google Translate 👎 (only good for popular languages)
- Tabula is no good in Burmese
- They have activated right-to-left support
- OCR is no good for Burmese
- Digits and numeric data are much easier, and OCR works better on them
- Tesseract OCR can be useful (and supports many languages)
- TensorFlow as an OCR tool?
- Different calendars can cause issues - what year is it?!
- Also date formats differ between the US and elsewhere
- ISO date format - YYYY-MM-DD !!!
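To illustrate the date problem noted above, here is a minimal Python sketch using only the standard library. The helper name `to_iso` and the sample value are made up for illustration. It shows how the same slash-separated string gives two different dates depending on whether it is read month-first (US) or day-first, and why writing the result as YYYY-MM-DD removes the ambiguity. It does not attempt to handle non-Gregorian calendars.

```python
from datetime import datetime


def to_iso(date_text, day_first):
    """Parse a slash-separated date and return it as YYYY-MM-DD.

    day_first=True reads the string as DD/MM/YYYY,
    day_first=False reads it as MM/DD/YYYY (US style).
    Raises ValueError if the string does not fit the chosen reading.
    """
    fmt = "%d/%m/%Y" if day_first else "%m/%d/%Y"
    return datetime.strptime(date_text, fmt).date().isoformat()


raw = "03/04/2017"  # hypothetical value from a spreadsheet
us_reading = to_iso(raw, day_first=False)    # read as March 4th
intl_reading = to_iso(raw, day_first=True)   # read as 3rd of April
print(us_reading, intl_reading)
```

The point of the sketch is that the reading has to be decided explicitly, ideally from knowing where the data came from, before anything is converted or shared.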
Teaching Data Journalism
Teaching scientists: "I pushed the button and it didn't work. I'll push a different button" Teaching non-scientists: "I pushed the button and it didn't work. It's broken"
Problem solving - how to make students identify problem solving as part of journalism.
Focus on the story and the message - really difficult to get people to focus on this over the tools. Hypothesis writing as a key part of the data-journalism process - including peer-review of the hypothesis: "What are you counting?", "What are you measuring?" It's basically a research methods class.
No point learning how to use the tools if you can't figure out what the question is...
Computational Thinking as a problem solving process for teaching data-journalism
Examples of gender-bias in data journalism examples/training - IRE manual is all based around baseball
Open Data in closed societies
Data is unavailable, meetings are changed at random. Very far away from an FOI law.
Just presenting the data you do get is very confusing - need to add extra context
Platform for opening data from a closed society. Contacting journalists inside the country and asking them to write stories. Will be open to contributions, which leads to verification issues. Collaborating with others to learn about ways of accessing, releasing and sharing data. Doing a lot of manual cleaning etc.
CKAN - open source software for making open-data portals
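As a rough illustration of what a CKAN portal offers to journalists, the sketch below builds a search request for CKAN's Action API (`package_search`) and pulls dataset titles out of a response. The portal address `https://data.example.org`, the search term and the sample response are all hypothetical, and the helper names are made up for this example. No network request is made here.

```python
import json
from urllib.parse import urlencode


def package_search_url(portal_url, query, rows=5):
    """Build a CKAN Action API URL that searches datasets by keyword."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal_url.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Return dataset titles from a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        return []
    return [item.get("title", "") for item in payload["result"]["results"]]


# Hypothetical portal and a hand-written sample response, for illustration only.
url = package_search_url("https://data.example.org/", "school budgets")
sample_response = json.dumps({
    "success": True,
    "result": {"count": 1, "results": [{"title": "School budgets 2016"}]},
})
print(url)
print(dataset_titles(sample_response))
```

Fetching the URL (for example with `urllib.request.urlopen`) is left out because it depends on a live portal.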
One of the problems is visibility - letting people know that tools exist and are around
Need communities for people dealing with open-data in these areas - using a blog to start and build the community.
Kaggle - could be used to help get people to work on data sets and solve problems.
"Data for Change" - event to bring together tech and data journalists etc.
Trust is a big challenge in closed societies - there is no norm of belonging to and participating in online communities.
There is an assumption 'if you have data, democracy will happen' but of course that isn't the case
DataKind offer similar help with data and problems that need to be solved for NGOs
Data for Democracy slack group
Demand for "Data that is perfectly true"
Balkans/Eastern Europe - subject to a lot of European FOI/Data laws but are still quite closed - hide data that can reveal issues (corruption etc).
'I can make you look good/fix problems' can be a way of gaining access to data
'A little bit of data journalism can be a dangerous thing' - poor use of the little open data that does exist can result in a closing down of the data - how do you communicate uncertainty within data, and would more nuance help?
Communicating uncertainty / Transparent data journalism
We need a better way of communicating uncertainty
Surgeon Scorecard by ProPublica is a good example of this
You won't solve lack of data with a "we'll build an app for that"
Perhaps start with more innocuous topics (health, education) rather than corruption (which will scare officials off). There's always some data - so whatever story you do there will be the first data story.
People doing their own analysis and writing about their process and methodology
Journalists don't explain enough what things mean
"Data Journalism should sound serious and authoritative" - but this is unnecessary!
Does the form itself force a lack of transparency due to deadlines and formats?
Editorial decision to report on polling predictions leads to losing the uncertainty and probability issue around polling data. Are there better ways to communicate these sorts of things?
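One common way to keep the uncertainty in a polling story is to report a range rather than a single number. The sketch below is an assumption-laden illustration, not a method discussed in the session: it uses the textbook 95% margin of error for a simple random sample (normal approximation), which real polls with weighting and design effects will exceed. The candidate names and figures are invented.

```python
import math


def margin_of_error(share, sample_size, z=1.96):
    """Approximate margin of error for a proportion (simple random sample).

    z=1.96 corresponds to a 95% confidence level under the normal approximation.
    """
    return z * math.sqrt(share * (1 - share) / sample_size)


def describe(name, share, sample_size):
    """Phrase a poll result as a range instead of a single figure."""
    moe = margin_of_error(share, sample_size)
    low, high = share - moe, share + moe
    return f"{name}: {share:.0%} (likely between {low:.0%} and {high:.0%})"


# Invented figures, for illustration only.
print(describe("Candidate A", 0.48, 1000))
print(describe("Candidate B", 0.46, 1000))
```

With a sample of 1,000 the two ranges overlap heavily, which is the nuance a headline such as "A leads B" tends to lose.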
"Attention Economy" - things need to be short and snappy and un-complicated. Does this work against a more nuanced telling of data stories?
Are they answering the wrong question? "Who is going to win?" rather than "Who should you vote for?"
Descending into the "What is Proper Journalism?" question/black-hole - brb...
Activists vs. Journalists - where's the line and what's the difference? Journalism loses some of its power when it moves towards activism. Activists always have a position. Journalists should be neutral? Are they actually neutral?
Journalists won't (shouldn't) ignore the data just because it doesn't support their opinions/outlook. Activists are more likely to ignore that which doesn't fit their bias.
Climate Scientists - similar problem
(oh we've got to BREXIT) - too much uncertainty to even start to figure it out.
Is the goal 'foster critical thinking'?
"Lazy Journalism" - systemic problems within Journalism. Intellectual laziness.
Ascending from the black-hole
Improving News Quality
The Bureau Local have a real chance to help improve things
Channel 4 Fact Check and Full Fact have been working for a long time to widen fact checking
There are changes ahead (BBC local reporters)
Tools for journalists to make life easier (automatic information retrieval etc) - How to get the 'mainstream industry' to use these tools as well?
"An easy copy and paste" - but attribution is important!
Tools for Exploratory Data Analysis, archive retrieval (archives for newsrooms are very large and could be useful but aren't used). CMS Tools identifying entities and retrieving information.
What would you change to improve data journalism?
- EDUCATION! we need more education
- It's not a rigid structure - there are changes to make in newsrooms, but 'data journalism' is fluid and changes as it needs to anyway
- Find some 'non-google DNI' support!
- Be able to fund, sustain and grow the intermediaries - ODI, Bureau Local etc.
- A European 538
- Access and flow of data, tools for finding data, particularly in closed-societies
- Soul searching from data journalists (post-Trump/post-Brexit) - "Why are we doing this?"