Simply add a vagrant based jupyter, anaconda environment to your python project

rreben/basket4py

basket4py

basket4py is an infrastructure project

Simply add basket4py to any python project (especially a data science project), and you will be able to run jupyter notebooks from a virtual, vagrant-enabled development environment. Of course, you can also use basket4py to get an easy start with python and jupyter notebooks.

How it works

A python environment with anaconda and vagrant

This project gives you an easy start with python:

  • Develop your python scripts within jupyter notebooks
  • Use a full blown anaconda stack for data science tasks
  • Visualize your data with
    • Matplotlib
    • seaborn
  • The whole environment is setup within a virtual linux box, so your computer won't be impacted by any installation of the python environment.
  • The provisioning is done with ansible (earlier versions used chef), so the provisioning scripts can easily be customized.
  • Everything is based on virtualbox and vagrant. So the whole setup is portable from one computer to the next and works independently from your OS (i.e. it works as well with OSX as with windows)

basket4py is the python's basket

Integrated platform to edit and run your programs

Get an idea of a jupyter notebook

How to get started

Installation

If you know how to install programs on a Mac or PC, you should be able to get everything up and running. If not, ask someone to help you.

Follow these simple steps to install everything you need to start programming:

  1. Install virtual box.
  2. Install vagrant.
  3. Download the zip file from Github and extract it somewhere.
  4. Open a terminal (dos-prompt / cmd) and navigate to the folder that contains the extracted files. You should find a file named vagrantfile.
  5. Type in vagrant up. This command will prepare a "virtual computer" on your PC or Mac. Everything will be installed within this "virtual computer", so there won't be any interference with other programs on your machine.
  6. Type in vagrant provision. This command may take even longer (leave it running overnight). It will install a modern python development environment.
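
The two vagrant commands from the steps above can also be scripted. The following is only a minimal sketch and not part of basket4py: the helper names find_vagrantfile and bring_up are made up for illustration, and the sketch assumes that the vagrant command is available on your PATH.

```python
import subprocess
from pathlib import Path


def find_vagrantfile(folder):
    """Return the path of the Vagrantfile in `folder`, or None if there is none.

    The comparison ignores case, because the file may be spelled
    'Vagrantfile' or 'vagrantfile'.
    """
    for entry in Path(folder).iterdir():
        if entry.is_file() and entry.name.lower() == "vagrantfile":
            return entry
    return None


def bring_up(folder, run=subprocess.run):
    """Run `vagrant up` and then `vagrant provision` inside `folder`.

    `run` can be replaced, so the function can be tried without vagrant.
    """
    if find_vagrantfile(folder) is None:
        raise FileNotFoundError(f"no Vagrantfile found in {folder}")
    for command in (["vagrant", "up"], ["vagrant", "provision"]):
        # check=True stops the sequence as soon as one command fails.
        run(command, cwd=str(folder), check=True)
```

Calling bring_up(".") from the extracted folder would then run both steps one after the other.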

Use basket4py as a basis for Mining-the-Social-Web

Thx @zyx954 for this instruction:

  1. Go to https://github.com/rreben/basket4py
  2. Download this repo
  3. Do a vagrant up (just like you did with the other two repos, the original and the fork).
  4. Run vagrant provision to get the python stack installed.
  5. Now you should have a fully functional anaconda stack.
  6. Open a browser (e.g. safari) and type in 192.168.33.12:8888. You should see a jupyter notebook now.
  7. Type in vagrant ssh in your terminal. Now the command prompt will change. You are now logged in to your linux virtual guest machine.
  8. Use sudo -i pip install twitter at the command line from within the guest machine to add the twitter framework.
  9. Use sudo -i pip install prettytable.
  10. Now you should be able to use the code examples from the book.
  11. You can either type them in or copy the notebooks: copy the *.ipynb files (ipython notebook files) from the directory ipnb in the mining-the-social-web folder to the notebooks folder in the basket4py repo.
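
Copying the notebook files can be done by hand or with a few lines of python. This is a rough sketch: the function name copy_notebooks is hypothetical, and both folders are passed in as arguments because their exact location depends on where you cloned the repos.

```python
import shutil
from pathlib import Path


def copy_notebooks(source_dir, target_dir):
    """Copy every *.ipynb file from source_dir into target_dir.

    Returns the sorted names of the copied files.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for notebook in sorted(Path(source_dir).glob("*.ipynb")):
        # copy2 keeps the modification time of the original file.
        shutil.copy2(notebook, target / notebook.name)
        copied.append(notebook.name)
    return copied
```

Files with other extensions in the source folder are left alone, and subfolders are not searched.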

Check the installation

After the installation, open http://192.168.33.12:8888 in your web browser to start the environment. Click on the notebook and run the code blocks in the order in which they occur in the notebook.
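
If the browser shows nothing, it can help to check from the host whether anything is listening on that address at all. The snippet below is a small sketch using only the python standard library. The function name is_port_open is made up for illustration. The address and port are the ones used in this README.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Address and port of the jupyter server as used in this README.
JUPYTER_HOST = "192.168.33.12"
JUPYTER_PORT = 8888
```

is_port_open(JUPYTER_HOST, JUPYTER_PORT) only tells you that a server accepts connections. It does not check that the notebooks folder is mounted correctly.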

Stopping and resuming

  • Use vagrant status to check whether the vagrant machine is up and running.
  • Start and stop vagrant via vagrant up and vagrant halt (in most cases, do not use vagrant suspend).
  • Use vagrant destroy if you have to restart completely from scratch or need to free up the disk space.
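
If you want to check the machine state from a script, vagrant status --machine-readable prints comma separated lines of the form timestamp,target,type,data. The following sketch parses that output. The function name parse_vagrant_state is hypothetical and the sample text is illustrative, not captured from a real run.

```python
def parse_vagrant_state(machine_readable_output):
    """Map each machine name to its state.

    Expects the text printed by `vagrant status --machine-readable`,
    where each line has the form: timestamp,target,type,data
    """
    states = {}
    for line in machine_readable_output.splitlines():
        parts = line.strip().split(",", 3)
        # Only lines of type "state" carry the machine state itself.
        if len(parts) == 4 and parts[2] == "state":
            states[parts[1]] = parts[3]
    return states


# Illustrative sample, not captured from a real run.
SAMPLE = """\
1700000000,default,provider-name,virtualbox
1700000000,default,state,running
1700000000,default,state-human-short,running
"""
```

The real output can be obtained with subprocess.run(["vagrant", "status", "--machine-readable"], capture_output=True, text=True).stdout from the folder that contains the vagrantfile.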

Behind the scenes

  • Vagrant is used to install python 3, jupyter and some other tools from the anaconda ecosystem on a virtual machine.
  • Vagrant is instructed to use ansible for the provisioning.
  • The virtual machine is provided via Oracle's virtual box.
  • A web server is running on the virtual (guest) computer. This server serves the jupyter notebooks.
  • These notebooks can be accessed via port forwarding from the host computer.
  • This way all the tutorials are brought to the users browser.

Acknowledgement

This work is inspired by Matthew A. Russell's work on Mining the Social Web, where I found out about IPython (now jupyter) and how to use vagrant and chef to prepare an easy-to-deploy development environment.

  • I have since switched to ansible, thanks to @fhenri for doing the work of porting the project to ansible.
  • The project is based on the anaconda installer from @andrewrothstein

I used the following chef recipes to cook up the development environment in former versions of this repo:

  • anaconda
  • apt
  • bzip2 chef cookbook from John Bellone
  • compat chef cookbook from John Keiser
  • packagecloud
  • runit chef cookbook from Heavy Water Operations, LLC.
  • tar chef cookbook from Cramer Development, Inc.

Status

  • The vagrantfile is done, so setting up the development environment is working.
  • The anaconda stack is working

Handling errors

Problems with mounting the directories / Guest additions do not match

You might see a warning during vagrant up, telling you that the guest additions do not match the version of virtual box.

important warning

The effect might be that the directories with the jupyter notebooks are not mounted correctly. In this case you will see that jupyter is running (192.168.33.12:8888 will show a webpage), however you will not see any meaningful tutorials.

If this happens, update your virtualbox installation to the newest version. Use vagrant destroy to restart from scratch, then use vagrant up to install again (do this on a strong wifi connection). This should fix everything.

Tips for analyzing errors

In most cases, this should solve your problems. But if the message "The guest additions on this VM do not match the installed version of VirtualBox! ..." persists, you might try to issue vagrant plugin install vagrant-vbguest and restart vagrant. A persisting message might indicate further problems with the guest additions.

Use vagrant ssh to log in to your guest machine. Here you might issue ipython notebook --help to learn more about starting the jupyter service.
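
To find out whether the vagrant-vbguest plugin is already installed, you can look at the output of vagrant plugin list. The sketch below parses that output. It assumes that every plugin is printed at the start of a line as name (version, ...). The function name installed_plugins and the sample text are made up for illustration.

```python
def installed_plugins(plugin_list_output):
    """Return the plugin names found in the output of `vagrant plugin list`.

    Assumption: each plugin starts a line in the form 'name (version, ...)'.
    """
    names = []
    for line in plugin_list_output.splitlines():
        # Skip indented detail lines and plain messages without a version.
        if line[:1].isspace() or " (" not in line:
            continue
        names.append(line.split(" (", 1)[0])
    return names


# Illustrative sample, not captured from a real run.
SAMPLE = "vagrant-share (1.1.9, system)\nvagrant-vbguest (0.13.0, global)\n"
```

With this, "vagrant-vbguest" in installed_plugins(output) tells you whether the plugin still has to be installed.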

Other bugs and errors

If you are stuck with the installation, please create an issue on Github. I will try to help you then.

Twitter sent status 401 ... Timestamp out of bounds

It might happen that the guest machine is not working with the correct system time. This will lead to issues with various APIs, especially the twitter API. Just do a vagrant halt followed by vagrant up to sync the time of the guest machine again.
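
To see how far the guest clock is off, you can compare it with the Date header of any HTTP response received inside the guest. The sketch below only does the comparison. The function name clock_skew_seconds is hypothetical, and how much skew the twitter API tolerates is not covered here, so any threshold you apply is your own assumption.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_skew_seconds(http_date_header, local_now=None):
    """Return local time minus server time in seconds.

    `http_date_header` is the value of an HTTP `Date` response header,
    for example 'Tue, 15 Nov 1994 08:12:31 GMT'.
    """
    server_time = parsedate_to_datetime(http_date_header)
    if local_now is None:
        # Use an aware UTC timestamp so the subtraction is well defined.
        local_now = datetime.now(timezone.utc)
    return (local_now - server_time).total_seconds()
```

A positive result means the local clock is ahead of the server, a negative result means it is behind.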

Some tips and tricks to deal with disk space

Move away from the primary disk

On my Windows 7 machine the VirtualBox VM takes up about 1.5 to 5 GB of disk space, and vagrant uses around 750 MB. As I have an SSD as my first disk, I need to move this data to my secondary disk. To achieve this:

  1. Create a new directory on your target disk. Set the VAGRANT_HOME environment variable to point to this directory. On Windows go to explorer (right click) -> "Advanced system settings" ("Erweiterte Systemeinstellungen" on a German system) -> "Environment Variables" ("Umgebungsvariablen").
  2. Create a different new directory on your target disk for your VirtualBox files. Open the VirtualBox app. In the settings, specify this directory as the place to store the VirtualBox files.
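
To check which directory vagrant currently uses and how much space it takes, a small script can help. This is a sketch with made-up helper names (vagrant_home, directory_size_bytes). It relies on vagrant using VAGRANT_HOME when the variable is set and the folder .vagrant.d in your home directory otherwise.

```python
import os
from pathlib import Path


def vagrant_home(environ=os.environ):
    """Return the directory where vagrant keeps boxes and other global data.

    VAGRANT_HOME wins if it is set, otherwise ~/.vagrant.d is used.
    """
    custom = environ.get("VAGRANT_HOME")
    return Path(custom) if custom else Path.home() / ".vagrant.d"


def directory_size_bytes(folder):
    """Sum the sizes of all regular files below `folder`."""
    total = 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            # Symbolic links are skipped so nothing is counted twice.
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total
```

directory_size_bytes(vagrant_home()) / 1e6 then gives the used space in MB.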

Move VirtualBox to a flash-drive

On my MacBook I need to have the VirtualBox on a flash-drive. This leads, however, to some obstacles: vagrant will not be able to provision the virtual machine, because the private key used to log in to the virtual box is fully accessible. For security reasons ssh will not accept a fully accessible key file, so vagrant cannot log in to the guest machine it created. So after using vagrant up to download and install the virtual machine (takes 20 min) there might be an error with the permissions on the private-key file for the ssh connection to the virtual machine.

SSH error when trying to use a keyfile with unrestricted access privileges

In this case do the following:

  • Let us assume that /project is the folder where the vagrant file lives.
  • Go to /project/.vagrant/machines/default/virtualbox and copy the key file to a folder below your local home folder (let us assume /Users/username/certificates/), where you can change the file permissions via chmod 0600 key_file.
  • Now tell vagrant to find the file in this folder:
  • Open the vagrant file and add the last line below the first two lines, so that the block looks as follows:
override.vm.box = "precise64"
override.vm.box_url = "http://files.vagrantup.com/precise64.box"
config.ssh.private_key_path = "/Users/username/certificates/private_key"
  • Note: with version 1.8.5 of vagrant the behaviour changed a little:
    • The key file will now be named insecure_private_key
    • Vagrant will try to substitute this key file with a different secure key file, however this will lead to unrecoverable errors. So you have to add the following lines to the vagrant file (or uncomment and adapt the lines in the vagrant file).
    override.vm.box = "ubuntu/trusty64"
    config.ssh.private_key_path = "/Users/username/certificates/insecure_private_key"
    config.ssh.insert_key = false
  • Note: After using vagrant destroy you first have to deactivate config.ssh.private_key_path again in the Vagrantfile, because the next vagrant up will create a new guest virtual machine with a new and different key.
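
The copy and chmod step described above can be sketched in python as follows. The helper names install_private_key and is_private are made up for illustration. The sketch assumes a Mac or linux host, because the permission bits have no equivalent meaning on Windows.

```python
import os
import shutil
import stat
from pathlib import Path


def install_private_key(source_key, target_dir):
    """Copy the key to target_dir and restrict it to owner read/write (0600)."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / Path(source_key).name
    shutil.copyfile(source_key, target)
    os.chmod(target, 0o600)
    return target


def is_private(path):
    """Return True if neither group nor others have any access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

The path returned by install_private_key is the one to use for config.ssh.private_key_path in the vagrant file.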

Installing vagrant and virtual box on a secondary disk

  • Install virtual box. Unfortunately, the target directory is fixed (Mac).
  • Install vagrant. The target directory can be changed to point to a flash drive (Mac).
  • Install the git client for Windows. Do check the option for the git and bash command line tools (it is guarded by a warning). Otherwise vagrant ssh will not work at all (Windows). Alternatively you could use putty to log in to your guest machine. How to download the git unix tools
  • Clone the repo from GitHub.

Use some basic vagrant commands

  • Use vagrant up to download and install the guest machine (also use this to bring the virtual machine up after halt or suspend).
  • Use vagrant status to check whether the vagrant machine is up and running.
  • You might have to update the box via vagrant box update.
  • Start and stop vagrant via vagrant up and vagrant halt (in most cases, do not use vagrant suspend).
  • Use vagrant provision to start the provisioning of the machine. In our case this will start the provisioning machinery (ansible, formerly chef) to install the python environment. You can rerun this command.
  • Use vagrant destroy if you have to restart completely from scratch or need to free up the disk space.

Maintenance

  • Go to the folder that contains the vagrantfile and issue vagrant plugin install vagrant-vbguest.
  • see this blog for details.

Get in touch

  • Use Github to open tickets for support questions.
  • Follow me on Twitter @r_rbn
  • Tweet using #basket4py. Or send me a DM.
  • Forking, starring, following the github repo would be great.
