Accompanying Source Code for the Haskell Data Analysis Cookbook


This is the accompanying source code for the Haskell Data Analysis Cookbook.

The latest source code is available on GitHub:


Chapter 1

The Hunt for Data identifies core approaches to reading data from various external sources such as CSV, JSON, XML, HTML, MongoDB, and SQLite.

Chapter 2

Integrity and Inspection explains the importance of cleaning data through recipes on trimming whitespace, lexing, and regular expression matching.
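As a small illustration of the kind of cleaning this chapter describes, here is a minimal sketch of whitespace trimming using only the base library. The name `trim` is chosen for this example and is not necessarily what the chapter's code uses.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | Remove leading and trailing whitespace from a string.
-- 'trim' is a hypothetical name used for illustration only.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace
```

Loaded in GHCi, `trim "  some text \n"` evaluates to `"some text"`. Interior whitespace is left untouched.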

Chapter 3

The Science of Words introduces common string manipulation algorithms including base conversions, substring matching, and computing the edit distance.
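To make the edit distance concrete, the following is a sketch of the Levenshtein distance, computed one row of the dynamic programming table at a time. It assumes unit cost for insertion, deletion, and substitution; the function name `editDistance` is hypothetical, and the chapter's own recipe may be structured differently.

```haskell
-- | Levenshtein edit distance between two strings.
-- Each fold step turns the previous table row into the next one.
-- 'editDistance' is a hypothetical name used for illustration only.
editDistance :: String -> String -> Int
editDistance xs ys = last (foldl nextRow [0 .. length ys] xs)
  where
    -- The rows are never empty; this clause only keeps the match total.
    nextRow [] _ = []
    nextRow prev@(p : ps) x = scanl step (p + 1) (zip3 ys prev ps)
      where
        -- left: cell to the left, diag: upper-left cell, up: cell above
        step left (y, diag, up) =
          minimum [left + 1, up + 1, diag + (if x == y then 0 else 1)]
```

For example, `editDistance "kitten" "sitting"` evaluates to `3`: two substitutions and one insertion.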

Chapter 4

Data Hashing covers essential hashing functions such as MD5, SHA256, GeoHashing, and perceptual hashing.

Chapter 5

A Dance with Trees establishes an understanding of the tree data structure through examples including tree traversals, balancing trees, and Huffman coding.
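A tree traversal can be sketched in a few lines. The `Tree` type below, with a `Null` constructor for the empty tree, and the function names are assumptions made for this example rather than the definitions used in the chapter.

```haskell
-- | A binary tree; 'Null' marks an empty subtree.
-- This type is a hypothetical definition used for illustration only.
data Tree a = Null | Node a (Tree a) (Tree a)

-- | Visit the root first, then the left subtree, then the right subtree.
preorder :: Tree a -> [a]
preorder Null = []
preorder (Node v l r) = v : preorder l ++ preorder r

-- | Visit the left subtree, then the root, then the right subtree.
inorder :: Tree a -> [a]
inorder Null = []
inorder (Node v l r) = inorder l ++ [v] ++ inorder r
```

For the tree `Node 2 (Node 1 Null Null) (Node 3 Null Null)`, `preorder` yields `[2,1,3]` and `inorder` yields `[1,2,3]`.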

Chapter 6

Graph Fundamentals presents rudimentary algorithms for graphical networks such as graph traversals, visualization, and maximal clique detection.

Chapter 7

Statistics and Analysis begins the investigation of important data analysis techniques encompassing regression algorithms, Bayesian networks, and neural networks.

Chapter 8

Clustering and Classification covers quintessential analysis methods, including k-means clustering, hierarchical clustering, constructing decision trees, and implementing the k-Nearest Neighbors classifier.

Chapter 9

Parallel and Concurrent Design introduces advanced topics in Haskell such as forking IO actions, mapping over lists in parallel, and benchmarking performance.

Chapter 10

Real-time Fugue incorporates streamed data interactions from Twitter, Internet Relay Chat (IRC), and sockets.

Chapter 11

Stunning Visuals comprises sundry approaches to plotting graphs including line charts, bar graphs, scatter plots, and D3.js visualizations.

Chapter 12

The Final Exporting concludes the book with an enumeration of algorithms for exporting data to CSV, JSON, HTML, MongoDB, and SQLite.

Artwork Attribution

Illustrations by Lonku.