## Test whether a window function can update a column in place

In [1]:
a=app.createDF("id:String;dt:Integer", """
    1,1;
    1,2;
    1,5;
    1,8""")

Here we want to add a column, `cnt`, that counts the records in each partition (equivalent to `rowNumber`), and we try to build it by updating `cnt` in place.

In [2]:
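from pyspark.sql import Window
# Functions used in the cells below (rowNumber was renamed row_number in Spark 1.6)
from pyspark.sql.functions import col, lag, lit, rowNumber, when
# One window per id, with rows ordered by dt
w=Window.partitionBy(col('id')).orderBy(col('dt'))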

Let's try to create the `cnt` column and use it in a single `withColumn` operation.

In [13]:
b=a.withColumn('cnt',
        when(rowNumber().over(w) == 1, lit(1)).
        otherwise(lag(col('cnt')).over(w) + 1).alias('cnt')
    )

AnalysisException: cannot resolve 'cnt' given input columns id, dt;

So you can't reference a column while defining it.

Let's try creating the column in one `withColumn` and then using it in another.

In [11]:
b=a.withColumn('cnt',
        when(rowNumber().over(w) == 1, lit(1)).alias('cnt')
    ).withColumn('cnt',
        lag(col('cnt')).over(w) + 1
    )

In [12]:
b.show()

+---+---+----+
| id| dt| cnt|
+---+---+----+
|  1|  1|null|
|  1|  2|   2|
|  1|  5|null|
|  1|  8|null|
+---+---+----+



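The result shows that `lag` operates on the input data: it reads `cnt` as it stood after the first `withColumn` (1 on the first row, null on the others), so only the second row gets a value. A window function can't update a column in place.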
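
To make the reasoning concrete, here is a minimal plain-Python sketch of what happens inside the single partition `id == 1`. It does not call Spark; the helpers `lag` and `plus_one` are hypothetical stand-ins that only mimic the semantics observed above (a one-row look-back, and null-propagating addition).

```python
# Plain-Python model of one window partition (id == 1), already ordered by dt.
# Illustration only: these helpers are hypothetical and do not call Spark.

def lag(values, offset=1):
    """Return each row's value `offset` rows earlier, or None if there is none."""
    return [values[i - offset] if i >= offset else None for i in range(len(values))]

def plus_one(value):
    """Null-propagating `+ 1`, as in SQL: null + 1 is null."""
    return None if value is None else value + 1

# cnt after the first withColumn: 1 on the first row, null elsewhere,
# because when(...) without otherwise(...) yields null.
cnt_step1 = [1, None, None, None]

# Second withColumn: lag reads cnt_step1 as a whole, not the values being produced.
cnt_step2 = [plus_one(v) for v in lag(cnt_step1)]
print(cnt_step2)  # [None, 2, None, None]

# For contrast, what a true in-place (row-by-row) update would have produced.
running = []
for i in range(len(cnt_step1)):
    running.append(1 if i == 0 else running[-1] + 1)
print(running)  # [1, 2, 3, 4]
```

The first list matches the `b.show()` output; the second is the running count we were after, which is what `rowNumber().over(w)` already gives directly.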