Dear Reader,

Thank you for purchasing my book:

"Scripting Intelligence: Web 3.0 Information Gathering and Processing" (Apress, 2009)

This directory contains the source code and data for the examples in my book. Even if you didn't buy the book, hopefully you will still find the example code useful.

When I wrote the examples for this book I created an Amazon EC2 AMI with the examples installed and running (as described in Appendix A). This AMI is very out of date and I suggest that you not try to use it.

I also suggest that you skip the old Rails demo programs in Part 4 of the book for reasons that I document on the errata web page for this book.

The code in Parts 1, 2, and 3 of the book, while old, should still be relevant and useful. Almost all of the code is Ruby (with some Java Hadoop example code) and has useful utilities for Natural Language Processing (NLP), Semantic Web, accessing both relational and NoSQL type data stores, etc.

There are subdirectories for each part of my book. I did not separate the examples into directories for individual chapters because sometimes examples for different chapters in a book part share libraries and data.

Each subdirectory also contains a README.txt file. Many of the examples require other software (usually open source) to run - these dependencies, with download links, are listed in the book. The README.txt files in the book part subdirectories contain information for running the examples in the same order as the material appears in the book.

Here is a summary of the table of contents for the book:

PART 1 Text Processing: Natural Language Processing, Parsing Common Document Types, Cleaning, Segmenting, and Spell-Checking Text

PART 2 The Semantic Web: Using RDF and RDFS Data Formats, Delving Into RDF Data Stores, Performing SPARQL Queries and Understanding Reasoning, Implementing SPARQL Endpoint Web Portals

PART 3 Information Gathering and Storage: Relational Databases, Indexing and Search, Using Web Scraping to Create Semantic Relations, Strategies for Large-Scale Data Storage

PART 4 Information Publishing: Creating Web Mashups, Performing Large-Scale Data Processing, Building Information Web Portals

Best regards,
Mark Watson

## Donate on Patreon to support all of my projects

Please visit my Patreon page and sign up to donate $1/month.

