koslab/ansible-pydatalab


Python Big Data Scientific Computing Kit

This Ansible script deploys a server with a collection of Python Big Data and Scientific Computing tools and libraries, preconfigured to run on a local Spark cluster.

Included packages:

Installation

  1. Set up a server or VM running CentOS 7

  2. Ensure the FQDN is configured correctly. Spark requires the host system's hostname to be resolvable; the quickest fix is to make the hostname resolve to 127.0.0.1 by adding an entry to /etc/hosts:

    127.0.0.1 localhost.localdomain localhost pydatalab.server.local pydatalab
    
  3. Create an Ansible hosts inventory (assuming the server hostname is pydatalab.server.local):

    [master]
    pydatalab.server.local
    
  4. Run the playbook with Ansible:

    ansible-playbook -i hosts playbook.yml
    
  5. Jupyter should be running at pydatalab.server.local:8888

  6. The default login for the pydatalab user is pydatalab:pydatalab
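
Steps 2 and 5 can be checked from a script. The following is a minimal sketch, not part of the playbook: the function names `resolves_to_loopback` and `port_is_open` are hypothetical helpers introduced here, and the check assumes the /etc/hosts entry from step 2 maps the hostname to an IPv4 loopback address.

```python
import socket


def resolves_to_loopback(hostname: str) -> bool:
    """Return True if hostname resolves to an IPv4 loopback address (127.x.x.x)."""
    try:
        address = socket.gethostbyname(hostname)
    except OSError:
        # Covers socket.gaierror: the name could not be resolved at all.
        return False
    return address.startswith("127.")


def port_is_open(hostname: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to hostname:port succeeds."""
    try:
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, or unresolvable hostname.
        return False
```

Run on the server itself, `resolves_to_loopback("pydatalab.server.local")` should be true before step 4, and `port_is_open("pydatalab.server.local", 8888)` should be true once the playbook has finished and Jupyter is up.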

Integration with Hortonworks Hadoop

This Ansible script detects whether it is being installed on Hortonworks Data Platform (HDP). If so, it creates a Jupyter kernel with the right environment variables set, configured to use Spark on HDP.
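
The README does not say how the detection works or which variables are set, so the following is only an illustrative sketch of the idea. It assumes HDP is recognized by the presence of the `/usr/hdp/current` directory, and that the kernel needs `SPARK_HOME` and `HADOOP_CONF_DIR`; the function name `build_kernel_spec`, the paths, and the display names are assumptions, not values taken from the playbook.

```python
import os


def build_kernel_spec(hdp_root: str = "/usr/hdp/current") -> dict:
    """Build a Jupyter kernel spec, adding Spark variables if HDP seems present.

    Assumption: a directory at hdp_root indicates an HDP installation.
    """
    env = {}
    if os.path.isdir(hdp_root):
        env = {
            # Assumed locations; adjust to the actual HDP layout.
            "SPARK_HOME": os.path.join(hdp_root, "spark-client"),
            "HADOOP_CONF_DIR": "/etc/hadoop/conf",
        }
    return {
        # Standard kernel.json fields: launch command, label, language, environment.
        "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": "PySpark (HDP)" if env else "Python",
        "language": "python",
        "env": env,
    }
```

Serializing the returned dictionary as `kernel.json` inside a kernel directory is how Jupyter would pick it up; on a machine without HDP the `env` mapping stays empty and the kernel behaves like a plain Python kernel.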

Supported platforms

This script has been tested on:

  • CentOS 7.1
  • Red Hat Enterprise Linux 7.1

About

Ansible playbook for creating a python based datalab
