wtfzambo/delta-bug-working-example

Goal

The purpose of this repo is to help reproduce a strange behavior with Delta Tables 1.0.0.

Package versions:

python = 3.7
delta = 1.0.0
spark = 3.1.2
java = openjdk 11.0.12

Getting Started

Installing Java

You need Java installed and JAVA_HOME set. If you don't have it, I recommend using SDKMAN since it's fast and easy.

curl -s "https://get.sdkman.io" | bash
# Replace `bash` with your own shell

Follow the instructions on-screen to complete installation. Then run:

source "$HOME/.sdkman/bin/sdkman-init.sh"

Ensure that the installation completed successfully:

sdk version

Finally, install Java by running:

sdk install java 11.0.20-amzn

This should set JAVA_HOME by itself, but if it doesn't, run:

sdk use java 11.0.20-amzn

Installing dependencies and setting SPARK_HOME

mkdir delta-test-bug && cd delta-test-bug
git clone https://github.com/wtfzambo/delta-bug-working-example.git .

The project is configured to run with Python >=3.7,<3.8. If you don't have a compatible Python version, install it and set it for local use with:

pyenv install 3.7
pyenv local 3.7
poetry env use $(pyenv which python)
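To confirm that the interpreter on your PATH satisfies the >=3.7,<3.8 constraint before installing dependencies, a small check like the one below may help. This is only a sketch: `is_python_37` is a hypothetical helper name, not something shipped with this repo, and it assumes `python` resolves to the pyenv-selected interpreter.

```shell
# Hypothetical helper: succeeds only for version strings in the 3.7 series.
is_python_37() {
    case "$1" in
        3.7|3.7.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Ask the interpreter on PATH for its version (for example "3.7.12").
# "|| true" keeps the script going if no `python` command is available.
current=$(python -c 'import platform; print(platform.python_version())' 2>/dev/null) || true

if is_python_37 "$current"; then
    echo "OK: Python $current"
else
    echo "Unsupported Python: '${current:-not found}' (need >=3.7,<3.8)"
fi
```

If the check reports an unsupported version, re-run the pyenv commands above in the project directory and try again.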

Run the following commands to install the dependencies and activate the virtual environment:

poetry install --no-root
poetry shell

Then set SPARK_HOME by asking spark-shell for its spark.home setting:

sparkhome=$(echo 'sc.getConf.get("spark.home")' \
    | spark-shell \
    | grep "res0" \
    | cut -d\  -f4
) > /dev/null 2>&1
export SPARK_HOME=$sparkhome
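Before starting the notebook, it can be worth checking that both JAVA_HOME and SPARK_HOME point somewhere usable. The sketch below assumes that a valid JAVA_HOME contains `bin/java` and that a valid SPARK_HOME contains `bin/spark-submit`; `check_home` is a hypothetical helper introduced only for illustration.

```shell
# Hypothetical helper: check_home LABEL DIR RELATIVE_EXECUTABLE
# Succeeds when DIR is non-empty and DIR/RELATIVE_EXECUTABLE is executable.
check_home() {
    label=$1
    dir=$2
    exe=$3
    if [ -z "$dir" ]; then
        echo "$label is not set"
        return 1
    fi
    if [ ! -x "$dir/$exe" ]; then
        echo "$label=$dir does not contain an executable $exe"
        return 1
    fi
    echo "$label looks fine: $dir"
}

# "|| true" makes the checks report problems without aborting the shell.
check_home JAVA_HOME "${JAVA_HOME:-}" bin/java || true
check_home SPARK_HOME "${SPARK_HOME:-}" bin/spark-submit || true
```

If SPARK_HOME is reported as unset or invalid, the spark-shell command above probably printed nothing that matched `res0`; running `spark-shell` by hand should show the underlying error.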

Lastly, open the notebook server:

jupyter notebook

You should be good to go.
