Distributed Big Data Processing pySpark workshop

Local setup

NOTE: If you are using the Jupyter instance available on the cluster, you can skip this setup. It is intended for people who want to run the workshop exercises locally.

  1. Install Anaconda: https://conda.io/docs/user-guide/install/index.html

  2. Create a conda environment with the packages from the requirements file

> conda create --name pyspark_env --file environment/requirements.txt python=3.5

When prompted to install the packages, press Enter to accept.

  3. Activate the newly created conda environment
> source activate pyspark_env
  4. Start Jupyter Notebook and open the workshop exercises
> jupyter notebook
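
To sanity-check the setup, you can run a quick smoke test in a new notebook cell. This is a minimal sketch that assumes the requirements file installs pyspark; if the workshop provides Spark differently (for example via a preconfigured kernel), adapt accordingly.

    # Verify that pyspark imports and a local SparkSession starts.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .master("local[*]")     # run Spark locally, using all available cores
             .appName("smoke-test")
             .getOrCreate())

    # Build a tiny DataFrame and show it to confirm jobs execute.
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "letter"])
    df.show()

    spark.stop()

If the two-row table prints, the local environment is ready for the exercises.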
