Code for the Data Lake Talk

Creating a Local Data Lake

Companion code and slides for my talk on whether you need Hadoop. A lot of data processing can be done in memory on a reasonably modern laptop.

For example, you can run MapReduce with PyPy.
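As a rough illustration of the idea, here is a minimal in-memory MapReduce sketch in plain Python. It is not the code from the talk: the names `map_reduce`, `word_count_mapper`, and `word_count_reducer` and the sample input are made up for this example. Because it uses only the standard library, it should run unchanged under either CPython or PyPy.

```python
from collections import defaultdict
from itertools import chain


def map_reduce(records, mapper, reducer):
    """Run a single-process MapReduce over an iterable of records.

    mapper(record) yields (key, value) pairs.
    reducer(key, values) returns one result per key.
    """
    # Shuffle phase: group every mapped value under its key, in memory.
    groups = defaultdict(list)
    for key, value in chain.from_iterable(map(mapper, records)):
        groups[key].append(value)
    # Reduce phase: collapse each group to a single result.
    return {key: reducer(key, values) for key, values in groups.items()}


def word_count_mapper(line):
    # Emit (word, 1) for every whitespace-separated word, lowercased.
    for word in line.lower().split():
        yield word, 1


def word_count_reducer(word, counts):
    # Sum the 1s emitted for this word.
    return sum(counts)


# Hypothetical sample input; a real job would read lines from a file.
lines = ["the quick brown fox", "the lazy dog", "The end"]
counts = map_reduce(lines, word_count_mapper, word_count_reducer)
print(counts["the"])
```

Everything is held in a dictionary, so this only works while the grouped data fits in RAM, which is the case the talk is about.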