## Create a User-Defined Function (UDF) in PySpark to Double the Salary

You are given a dataset of employees and their salaries.
Create a UDF in PySpark that doubles each employee's salary and adds the result as a new column in the DataFrame.

In [0]:
# Sample employee rows: (id, full_name, salary)
data = [
    (1, "Rohish Zade", 2000),
    (2, "Priya Ramteke", 3000),
    (3, "Faizal Reza", 2500),
]
columns = ["id", "full_name", "salary"]

df = spark.createDataFrame(data, columns)
df.show()

+---+-------------+------+
| id|    full_name|salary|
+---+-------------+------+
|  1|  Rohish Zade|  2000|
|  2|Priya Ramteke|  3000|
|  3|  Faizal Reza|  2500|
+---+-------------+------+



In [0]:
# Define the Python function
def double_salary(salary):
    return salary * 2 if salary is not None else None
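
Because a UDF wraps an ordinary Python function, its logic can be checked without a Spark session before it is registered. The sketch below repeats `double_salary` and calls it on the sample salaries plus a `None`, which stands in for a null salary; the `results` name is only illustrative.

```python
# Same function as in the cell above, repeated so the sketch runs on its own
def double_salary(salary):
    return salary * 2 if salary is not None else None

# Plain Python call, no Spark involved; None stands in for a null salary
results = [double_salary(s) for s in (2000, 3000, 2500, None)]
print(results)
```

The explicit `None` check matters because Spark passes a null column value to a Python UDF as `None`, and `None * 2` would raise a `TypeError`.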

In [0]:
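# Wrap the Python function as a Spark UDF
from pyspark.sql.functions import udf, col
from pyspark.sql.types import LongType

# salary is inferred as bigint (LongType) from Python ints, so declare the same
# return type; a result outside the declared type's range would come back as null
double_salary_udf = udf(double_salary, LongType())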

In [0]:
# Apply the UDF to the salary column and store the result in new_salary
salary_df = df.withColumn("new_salary", double_salary_udf(col("salary")))
salary_df.show()

+---+-------------+------+----------+
| id|    full_name|salary|new_salary|
+---+-------------+------+----------+
|  1|  Rohish Zade|  2000|      4000|
|  2|Priya Ramteke|  3000|      6000|
|  3|  Faizal Reza|  2500|      5000|
+---+-------------+------+----------+

