Commit
IMP add a dataframe breakpoint method for testing
Frequently, we have long transformations of DataFrames:

```
df = (
    df
    # some transformations 1
    # some transformations 2
    # some transformations 3
    # some transformations 4
)
```

and stepping through them in an interactive testing session is painful; we have to break up the code like so:

```
df = (
    df
    # some transformations 1
)
breakpoint()
df = (
    df
    # some transformations 2
)
breakpoint()
df = (
    df
    # some transformations 3
)
breakpoint()
df = (
    df
    # some transformations 4
)
```

This can become cumbersome. The proposal here is to register a `breakpoint()` method on `DataFrame` objects while testing with `SparklyGlobalContextTest`, so you can just sprinkle `.breakpoint()` calls within your chain of transformations:

```
df = (
    df
    # some transformations 1
    .breakpoint()
    # some transformations 2
    .breakpoint()
    # some transformations 3
    .breakpoint()
    # some transformations 4
    .breakpoint()
)
```

You can even pass your own function, for example if you don't want to jump into a debugger but maybe just print some rows:

```
def show_df(df):
    df.show(10, False)

df = (
    df
    # some transformations 1
    .breakpoint(show_df)
    # some transformations 2
    .breakpoint(show_df)
    # some transformations 3
    .breakpoint(show_df)
    # some transformations 4
    .breakpoint(lambda df: df.show(20, False))
)
```
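For illustration, here is a minimal sketch of how such a method could be attached to a class for the duration of a test and removed afterwards. It is not the code from this commit: `FakeDataFrame`, `_breakpoint`, and `dataframe_breakpoint` are hypothetical names, and `FakeDataFrame` is a tiny stand-in so the example runs without Spark. The key point is that the method returns `self`, which lets the chain continue after each call.

```python
import contextlib


class FakeDataFrame:
    """Hypothetical stand-in for a DataFrame; holds a plain list of rows."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, predicate):
        # Return a new object, mimicking an immutable transformation.
        return FakeDataFrame(r for r in self.rows if predicate(r))


def _breakpoint(self, func=None):
    """Call func(self) if given, else drop into the debugger; return self."""
    if func is None:
        breakpoint()  # the builtin, honours PYTHONBREAKPOINT
    else:
        func(self)
    return self  # returning self keeps the method chain intact


@contextlib.contextmanager
def dataframe_breakpoint(cls):
    """Attach .breakpoint() to cls inside the block, remove it on exit."""
    cls.breakpoint = _breakpoint
    try:
        yield
    finally:
        del cls.breakpoint


seen = []  # row counts recorded at each breakpoint

with dataframe_breakpoint(FakeDataFrame):
    result = (
        FakeDataFrame([1, 2, 3, 4])
        .filter(lambda r: r > 1)
        .breakpoint(lambda df: seen.append(len(df.rows)))
        .filter(lambda r: r % 2 == 0)
        .breakpoint(lambda df: seen.append(len(df.rows)))
    )
```

Scoping the patch with a context manager (or a test's set-up and tear-down) keeps the extra method out of non-test code; whether the actual change does it this way is an assumption.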