A Seriously Fun guide to Big Data Analytics in Practice
Latest commit 9f26ae8 Jun 15, 2015 @rjurney rjurney Edits for Meghan
attic organizing and attic-ing material to determine final outline Sep 17, 2014
bin organizing and attic-ing material to determine final outline Sep 16, 2014
code updating paths to match bd4c cluster Nov 14, 2014
data @ f3c0820 bump submodule data May 14, 2014
images Docker setup instructions with screenshots for boot2docker and virtua… Mar 9, 2015
supplementary organizing and attic-ing material to determine final outline Sep 16, 2014
.dexy finished reconciliation code Aug 20, 2012
.gitattributes Tuning Storm+Trident: executor queue sizing Oct 11, 2013
.gitignore
.gitmodules removed obsolete gitmodule Aug 18, 2012
.gitscribe getting book generation on lock Aug 8, 2012
Ch00-preface.asciidoc Edits for Meghan Jun 15, 2015
Ch01-hadoop_basics.asciidoc Edits for Meghan Jun 15, 2015
Ch02-map_reduce.asciidoc Edits for Meghan Jun 15, 2015
Ch03-introducing-baseball-data.asciidoc Edits from Meghan addressed. Jun 10, 2015
Ch04-introduction_to_pig.asciidoc Edits from Meghan addressed. Jun 10, 2015
Ch05-map_only_patterns.asciidoc Edits from Meghan addressed. Jun 10, 2015
Ch06-grouping_patterns.asciidoc Edits from Meghan addressed. Jun 10, 2015
Ch07-joining_patterns.asciidoc Minor edits, and converted Outro to Wrapping Up Jun 10, 2015
Ch08-ordering_patterns.asciidoc Minor edits, and converted Outro to Wrapping Up Jun 10, 2015
Ch09-uniquing_patterns.asciidoc Minor edits, and converted Outro to Wrapping Up Jun 10, 2015
Part_II-patterns.asciidoc Mid way through second review feedback, adding part I intro to first … May 19, 2015
Part_I_Intro.asciidoc Minor edits, and converted Outro to Wrapping Up Jun 10, 2015
README.md rebuild book Jan 26, 2013
Rakefile reconcile identifiers Aug 19, 2012
TODO.asciidoc census of code blocks Aug 14, 2014
XX00-outlines.asciidoc Added file: XX00-outlines.asciidoc Feb 4, 2015
XX01-intro.asciidoc Added file: XX01-intro.asciidoc Feb 4, 2015
XX01-opening.asciidoc Added file: XX01-opening.asciidoc Feb 4, 2015
XX02---part_one-basics.asciidoc Added file: XX02---part_one-basics.asciidoc Feb 4, 2015
XX03.2-extra-bits.asciidoc Added file: XX03.2-extra-bits.asciidoc Feb 4, 2015
XX03.5-advanced-mapreduce.asciidoc Added file: XX03.5-advanced-mapreduce.asciidoc Feb 4, 2015
XX05.5-advanced-material.asciidoc Added file: XX05.5-advanced-material.asciidoc Feb 4, 2015
XX10-statistics_and_sampling.asciidoc Added file: XX10-statistics_and_sampling.asciidoc Feb 4, 2015
XX11-advanced_patterns.asciidoc Added file: XX11-advanced_patterns.asciidoc Feb 4, 2015
XX12---part_three-applications.asciidoc Added file: XX12---part_three-applications.asciidoc Feb 4, 2015
XX12-event_streams.asciidoc Added file: XX12-event_streams.asciidoc Feb 4, 2015
XX13-munging.asciidoc Added file: XX13-munging.asciidoc Feb 4, 2015
XX14a-spatial-intro.asciidoc Added file: XX14a-spatial-intro.asciidoc Feb 4, 2015
XX14b-spatial-mechanics.asciidoc
XX14c-spatial-aggregations_on_regions.asciidoc Added file: XX14c-spatial-aggregations_on_regions.asciidoc Feb 4, 2015
XX14d-spatial-joins_on_regions.asciidoc Added file: XX14d-spatial-joins_on_regions.asciidoc Feb 4, 2015
XX15-text_analysis.asciidoc Added file: XX15-text_analysis.asciidoc Feb 4, 2015
XX40---part_four-practicalities.asciidoc Added file: XX40---part_four-practicalities.asciidoc Feb 4, 2015
XX41-big_data_ecosystem.asciidoc Added file: XX41-big_data_ecosystem.asciidoc Feb 4, 2015
XX42-organizing_data.asciidoc Added file: XX42-organizing_data.asciidoc Feb 4, 2015
XX43-commandline_mojo.asciidoc Added file: XX43-commandline_mojo.asciidoc Feb 4, 2015
XX46-tips_and_gotchas.asciidoc Added file: XX46-tips_and_gotchas.asciidoc Feb 4, 2015
XX50---part_five-internals_and_tuning.asciidoc Added file: XX50---part_five-internals_and_tuning.asciidoc Feb 4, 2015
XX51-java_api.asciidoc Added file: XX51-java_api.asciidoc Feb 4, 2015
XX52-advanced_pig.asciidoc Added file: XX52-advanced_pig.asciidoc Feb 4, 2015
XX53-hadoop_internals.asciidoc Added file: XX53-hadoop_internals.asciidoc Feb 4, 2015
XX53-tuning-practical_and_eager.asciidoc Added file: XX53-tuning-practical_and_eager.asciidoc Feb 4, 2015
XX53-tuning-wise_and_lazy.asciidoc Added file: XX53-tuning-wise_and_lazy.asciidoc Feb 4, 2015
XX54-tuning-brave_and_foolish.asciidoc Added file: XX54-tuning-brave_and_foolish.asciidoc Feb 4, 2015
XX54-tuning-use_method_checklist.asciidoc Added file: XX54-tuning-use_method_checklist.asciidoc Feb 4, 2015
XX55-hbase_data_modeling.asciidoc Added file: XX55-hbase_data_modeling.asciidoc Feb 4, 2015
XX80-appendixes.asciidoc Added file: XX80-appendixes.asciidoc Feb 4, 2015
XXE_and_C.md Added file: XXE_and_C.md Feb 4, 2015
XXLICENSE.asciidoc Added file: XXLICENSE.asciidoc Feb 4, 2015
big_data_for_chimps.pdf build of the book including geospatial Aug 1, 2014
book-docinfo.xml Fixed bad xml Dec 6, 2014
book.asciidoc Edits from Meghan addressed. Jun 10, 2015
cover.pdf Swapped out cover PDF. Not PNG. Will this work? Feb 23, 2015
cover.png Added cover png to add my name to the cover. Feb 23, 2015
list.html Minor edits, and converted Outro to Wrapping Up Jun 10, 2015

README.md

Big Data for Chimps: A Seriously Fun guide to Terabyte-scale data processing

This is the work-in-progress version of the upcoming O'Reilly book, Big Data for Chimps: A Seriously Fun guide to Hadoop and Terabyte-scale data processing.

Our intent is to provide the best guide for exploratory data analytics using Hadoop -- for data science in practice. We use high-level languages (Pig and Ruby) that make Hadoop a tool, not a framework, allowing re-use and rapid development. We'll cover enough Hadoop internals to save you from diving into the source code, and enough tuning advice to let you know where to drill deep.

In all cases, the focus is on maximizing your time and creativity -- on helping you uncover what question to ask and the right way to ask it.

O'Reilly has courageously agreed to release the book under a http://creativecommons.org/licenses/by-nc-sa/3.0/[CC-BY-NC-SA] license. To buy a physical copy of the book, or a Kindle (.mobi) or iOS/Nook (.epub) edition, visit the early release http://shop.oreilly.com[O'Reilly bookstore] (TODO: link to early release page). Buy it now, and you'll get frequent updates and the final version once it is available.

License

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.

Code is Apache licensed unless specifically labeled otherwise.