Mining the Social Web
Last year I read an article in Nature about Paul Erdős on the occasion of his 100th birthday. Outside mathematical circles Erdős is best known for the so-called Erdős number. There are several different definitions of the Erdős number, but according to Wikipedia it is defined as the "'collaborative distance' between a person and mathematician Paul Erdős, as measured by authorship of mathematical papers". So if you co-authored a paper with Erdős, your Erdős number is 1. Your number is 2 if you co-authored a paper with someone who wrote a paper directly with Erdős, and so forth. Analyzing Erdős numbers is an application of social network theory, and ever since I read the article I have wanted to learn more about data mining applied to modern social media platforms. While searching for a book on this topic I came across Mining the Social Web, and the book's very practical approach convinced me that this was the book I wanted to read.
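To make the definition concrete, here is a minimal Python sketch of how Erdős numbers could be computed with a breadth-first search over a co-authorship graph. It is my own illustration, not code from the book; the function name erdos_numbers, the root label and the tiny list of papers are all made-up assumptions.

```python
from collections import deque
from itertools import combinations


def erdos_numbers(papers, root="Erdős"):
    """Return {author: collaborative distance to root} for reachable authors.

    `papers` is a list of author lists (hypothetical sample data, not real
    publication records).
    """
    # Build an undirected co-authorship graph: two authors are linked
    # if they appear together on at least one paper.
    graph = {}
    for authors in papers:
        for author in authors:
            graph.setdefault(author, set())
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)

    if root not in graph:
        return {}

    # Breadth-first search visits authors in order of increasing distance,
    # so the first time we reach someone we have their shortest distance.
    distances = {root: 0}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for coauthor in graph[current]:
            if coauthor not in distances:
                distances[coauthor] = distances[current] + 1
                queue.append(coauthor)
    return distances


# Made-up example: Alice wrote with Erdős, Bob with Alice, Carol with Bob.
# Dave and Eve only wrote with each other, so they have no Erdős number.
papers = [
    ["Erdős", "Alice"],
    ["Alice", "Bob"],
    ["Bob", "Carol"],
    ["Dave", "Eve"],
]
numbers = erdos_numbers(papers)
```

Authors who cannot be reached from Erdős are simply left out of the result, which matches the idea that they have no finite Erdős number.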
Virtual Machine experience
The book is accompanied by a Virtual Machine experience that sets new standards for the interaction between technical programming books and the code samples they provide. In no time you are up and running with the code samples in an IPython notebook that can also be edited and used as a basis for your own data mining experiments. I would really love to see this approach adopted by other programming books.
The reader is gently guided through the setup of VirtualBox and Vagrant, and once these two programs have been installed it is just a matter of typing "vagrant up" in a terminal window. All of the software used throughout the book is then installed and run in a virtual machine accessible through a web browser. Setting up the virtual machine might sound complicated, but it is really quite easy. I tested the procedure on both Mac and Windows and had no trouble getting the environment up and running in less than half an hour. The really cool thing is that you don't have to install and manage a lot of dependencies yourself, and you can remove everything afterwards just by deleting the virtual machine. The whole setup process is described both in the book and in videos found on the book's GitHub pages.
Some knowledge of and experience with Python is required to fully understand the code samples. If you have experience with other modern programming languages you should have no trouble understanding basic Python code, so the choice of Python should not be a barrier to reading the book.
The book does not cover social network theory in general, nor graph theory, so if you are looking for a book with a theoretical approach then this book is not for you. However, most chapters in the book end with a list of additional resources that can be used for further research.
This book is the best computer book I have read in several years. Social networks and data mining are hot topics, and reading Mining the Social Web will not only provide you with knowledge about data mining but also supply practical code examples. In addition, the book is an easy read and quite funny!
I review for the O'Reilly Reader Review Program and I want to be transparent about my reviews, so you should know that I received a free copy of this ebook in exchange for my review.
Title: Mining the Social Web, 2nd Edition: Data Mining Facebook, Twitter, LinkedIn, Google+, GitHub, and More
Author: Matthew A. Russell
Publisher: O'Reilly Media
Release Date: October 2013