
vincentclaes/testing-glue-pyspark-jobs

Testing Glue PySpark jobs

This repo contains example code that shows how to test your AWS Glue PySpark jobs.

It accompanies this article on Medium: https://towardsdatascience.com/testing-glue-pyspark-jobs-4b544d62106e

testing_mocked_s3.py -> the script used in the article.

glue_job.py -> the Glue PySpark job.

test_glue_job.py -> the test for glue_job.py.

Prerequisites

  • Python 3.6.8 is installed;
  • Java JDK 8 is installed;
  • Spark 2.4.3 for Hadoop 2.7 is installed.
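
Before running the test, it can help to confirm that the prerequisites above are visible from your shell. The snippet below is a minimal sketch and not part of this repo: the function name check_prerequisites is hypothetical, and it assumes that Spark is located through the conventional SPARK_HOME environment variable. It only checks that a java executable is on the PATH and that SPARK_HOME is set; it does not verify the Java or Spark version numbers.

```python
import os
import shutil
import sys

# The prerequisites list Python 3.6.8; only major.minor is compared here.
EXPECTED_PYTHON = (3, 6)


def check_prerequisites(environ=None, which=shutil.which, version_info=None):
    """Return a dict of coarse prerequisite checks (hypothetical helper).

    The parameters exist only so the checks can be exercised with fake values.
    """
    environ = os.environ if environ is None else environ
    version_info = sys.version_info if version_info is None else version_info
    return {
        # True when the interpreter is a 3.6.x release.
        "python_3_6": tuple(version_info[:2]) == EXPECTED_PYTHON,
        # True when a `java` executable can be found on the PATH.
        "java_on_path": which("java") is not None,
        # Assumption: the Spark install is exposed through SPARK_HOME.
        "spark_home_set": bool(environ.get("SPARK_HOME")),
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

If any entry is reported as missing, fix that prerequisite first; the exact Java and Spark versions still need to be checked by hand.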

Run the test

pip install pipenv
git clone git@github.com:vincentclaes/testing-glue-pyspark-jobs.git
cd testing-glue-pyspark-jobs
pipenv install
pipenv shell
python -m unittest test_glue_job.py
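
The last command hands test_glue_job.py to Python's built-in unittest runner, which collects every unittest.TestCase subclass in the module and runs its test_* methods. As a rough illustration of that shape only, here is a minimal sketch that runs without Spark. The function add_greeting and the class TestAddGreeting are hypothetical stand-ins and are not taken from glue_job.py or test_glue_job.py.

```python
import unittest


def add_greeting(rows):
    """Hypothetical stand-in for a job's transformation step (not from the repo)."""
    return [dict(row, greeting="hello " + row["name"]) for row in rows]


class TestAddGreeting(unittest.TestCase):
    def test_adds_greeting_field(self):
        # Feed a small in-memory input and compare against the expected output.
        result = add_greeting([{"name": "glue"}])
        self.assertEqual(result, [{"name": "glue", "greeting": "hello glue"}])


def run_tests():
    """Load and run the test case programmatically, similar to `python -m unittest`."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAddGreeting)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

The same pattern of small input, transformation, and asserted output applies when the input is a Spark DataFrame instead of a list of dictionaries.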
