Perspectives on Data Science for Software Engineering


This repo holds chapters under development for a forthcoming Morgan Kaufmann book.

Keywords: software engineering; data science; analytics; data mining; visualization; decision making

Editors:


At a recent Dagstuhl seminar on Software Analytics (June 2014, attended by 40 of the top researchers in the field), a repeated question was how to transfer best (or safe) practices from seasoned SE data scientists to newcomers.

To address this problem, this book was planned to present the hard-won lessons of seasoned data miners. At the Dagstuhl meeting, participants conducted brainstorming sessions to collect mantras for data mining. Subsequently, the convenors of that meeting (Menzies, Williams, Zimmermann) refined that list into the list of chapters explored in this book.

Why This Book?

There are many texts that cover the basics of data mining or advanced data mining scripting. However, there are no practitioner-level texts that reflect the insights of seasoned data scientists.

This book of wisdom aims to address that gap. Lessons learned (about SE data science) will be presented in small standalone chapters that are easy to read and understand. Newcomers to SE data science can learn tips and tricks of the trade. More experienced SE data scientists will benefit from “war stories” showing what traps to avoid.

Target Audience

This book is targeted at industrial data science workers. Each chapter will be a short (2 to 4 pages), focused discussion of one mantra of SE data science. Chapters will be written for a generalist audience (no excessive use of technical terminology) with a minimum of diagrams and references.


When            What
Apr 30, 2015    Invitations to write each chapter issued to (and accepted by) 40+ authors
June 30, 2015   Chapters submitted. Commence peer review of chapters (by other chapter authors)
July 31, 2015   First round reviews completed and sent back to authors
Sept 15, 2015   Revised chapters received, optional second peer review
Sept 30, 2015   Second round reviews completed and sent back to authors
Oct 21, 2015    Final chapters sent to Morgan Kaufmann

Instructions to Authors

The chapter title should be a mantra; i.e. some slogan reflecting best practice for data science for SE.

Chapters should use diagrams in gray scale (or no diagrams at all).

Chapters should minimize the use of references (fewer than half a dozen).

Chapters are to be written in Markdown and committed to this repo:

  • To begin that task, send your GitHub id to any of the editors. They will add you to the "collaborators" of this repo.
  • If you do not want to hassle with GitHub, then write in Markdown using one of the free Markdown editors and send the file to the editors.

Chapters should be short. For example, our sample chapter has:

  • 250 lines
  • 1500 words
  • 9000 characters
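
One way to compare a draft against those figures is a small script. The following is a minimal sketch, not part of the repo's tooling: the function names `chapter_stats` and `over_budget` are hypothetical, and treating the sample chapter's size as a rough ceiling is an assumption made here for illustration.

```python
# Hypothetical helper: the names below and the idea of using the sample
# chapter's size as a rough ceiling are assumptions, not rules of this repo.

# Size of the sample chapter, as quoted in the list above.
SAMPLE = {"lines": 250, "words": 1500, "chars": 9000}


def chapter_stats(text):
    """Count lines, whitespace-separated words, and characters in a draft."""
    return {
        "lines": len(text.splitlines()),
        "words": len(text.split()),
        "chars": len(text),
    }


def over_budget(stats, budget=SAMPLE):
    """Return the names of the measures that exceed the given budget."""
    return [name for name, limit in budget.items() if stats[name] > limit]


# Small inline demonstration on a three-line draft.
demo = "A mantra\n\nKeep it short.\n"
print(chapter_stats(demo), "over budget:", over_budget(chapter_stats(demo)))
```

To check a real draft, read the Markdown file into a string (for example with `pathlib.Path("my-chapter.md").read_text(encoding="utf-8")`, where the file name is a placeholder) and pass it to `chapter_stats`.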

Chapters should be approachable and have a take-away message. For example, our sample chapter has certain features which you can (optionally) copy:

  • A light-hearted story to start the chapter;
  • A list of specific recommendations to end the chapter (see the heading In Summary at the end of the file).

