Spark Integration with Localstack

This project aims to enable local testing of PySpark jobs that upload and download files from S3 buckets. Localstack provides an easy way to bring up various AWS services locally using Docker. Through the settings in the spark/conf/ folder, the application knows to connect to a custom local endpoint (localhost:4572) when executing PySpark commands such as spark.read.json('s3://my-test/data').
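For reference, the same redirection can be expressed directly on a SparkSession instead of through the conf folder. The following is a minimal sketch, not the repo's actual configuration: it assumes the S3A connector (the hadoop-aws jar matching your Hadoop version must be on the classpath), dummy credentials accepted by Localstack, and path-style addressing, which Localstack requires:

```python
from pyspark.sql import SparkSession

# Hypothetical in-code equivalent of the spark/conf/ settings.
spark = (
    SparkSession.builder
    .appName("localstack-integration")
    # Point the S3A filesystem at Localstack's legacy S3 port.
    .config("spark.hadoop.fs.s3a.endpoint", "http://localhost:4572")
    # Localstack accepts any credentials; these are placeholders.
    .config("spark.hadoop.fs.s3a.access.key", "test")
    .config("spark.hadoop.fs.s3a.secret.key", "test")
    # Localstack serves buckets at the path level, not as subdomains.
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    # The local endpoint is plain HTTP.
    .config("spark.hadoop.fs.s3a.connection.ssl.enabled", "false")
    .getOrCreate()
)

df = spark.read.json("s3a://my-test/data")
```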

See the example integration test in the tests/ folder.
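The repo's actual test may differ, but a test of this kind typically looks like the following sketch. It assumes pytest and boto3 are installed (neither is listed under the dependencies below), that Localstack is already running on port 4572, and that the session carries the S3A settings shown above:

```python
# tests/test_s3_roundtrip.py -- hypothetical example, not the repo's test.
import boto3
import pytest
from pyspark.sql import SparkSession

ENDPOINT = "http://localhost:4572"  # Localstack's legacy S3 port


@pytest.fixture(scope="session")
def spark():
    session = (
        SparkSession.builder
        .appName("integration-test")
        .config("spark.hadoop.fs.s3a.endpoint", ENDPOINT)
        .config("spark.hadoop.fs.s3a.access.key", "test")
        .config("spark.hadoop.fs.s3a.secret.key", "test")
        .config("spark.hadoop.fs.s3a.path.style.access", "true")
        .config("spark.hadoop.fs.s3a.connection.ssl.enabled", "false")
        .getOrCreate()
    )
    yield session
    session.stop()


def test_write_and_read_back(spark):
    # Create the bucket in Localstack before Spark touches it.
    s3 = boto3.client(
        "s3",
        endpoint_url=ENDPOINT,
        aws_access_key_id="test",
        aws_secret_access_key="test",
    )
    s3.create_bucket(Bucket="my-test")

    # Write a small DataFrame to the local "S3" and read it back.
    df = spark.createDataFrame([("a", 1), ("b", 2)], ["key", "value"])
    df.write.mode("overwrite").json("s3a://my-test/data")

    result = spark.read.json("s3a://my-test/data")
    assert result.count() == 2
```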

The project runs with Spark version 2.4.4.

Dependencies:

pip install pyspark

Run the application:

make docker-run
