## lit() – Add Literal or Constant to DataFrame
The PySpark SQL function lit() adds a new column to a DataFrame by assigning a literal or constant value, and it returns a Column. The related typedLit() function belongs to Spark's Scala API and is not exposed in pyspark.sql.functions.

lit() is available in PySpark by importing it from pyspark.sql.functions.

First, let’s create a DataFrame.

In [1]:
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('lit').getOrCreate()


data = [("111x23",500),("222y67",600),("333z89",400)]
columns= ["EmpId","Remuneration"]
df = spark.createDataFrame(data = data, schema = columns)
df.show(truncate=False)

+------+------------+
|EmpId |Remuneration|
+------+------------+
|111x23|500         |
|222y67|600         |
|333z89|400         |
+------+------------+



lit() creates a Column that holds a literal value. If the object passed in is already a Column, it is returned unchanged; otherwise, a new Column is created to represent the literal value. (The Scala API additionally converts a Scala Symbol into a Column, which does not apply to PySpark.)

### Simple usage of lit() function
The following example creates a new column with a constant value using lit(). The snippet adds the string literal ‘1’ to the PySpark DataFrame as a new column named lit_value1.

In [2]:
from pyspark.sql.functions import col,lit
df2 = df.select(col("EmpId"),col("Remuneration"),lit("1").alias("lit_value1"))
df2.show(truncate=False)

+------+------------+----------+
|EmpId |Remuneration|lit_value1|
+------+------------+----------+
|111x23|500         |1         |
|222y67|600         |1         |
|333z89|400         |1         |
+------+------------+----------+



Adding the same constant to every record in a DataFrame is rarely useful in practice, so let’s look at another example.

### lit() function with withColumn
The following example uses lit() with withColumn() to derive a new column whose value depends on a condition: the string literal "100" when Remuneration is between 400 and 500 (inclusive), and "200" otherwise.

In [3]:
from pyspark.sql.functions import when, lit, col
df3 = df2.withColumn("lit_value2", when((col("Remuneration") >=400) & (col("Remuneration") <= 500),lit("100")).otherwise(lit("200")))
df3.show(truncate=False)

+------+------------+----------+----------+
|EmpId |Remuneration|lit_value1|lit_value2|
+------+------------+----------+----------+
|111x23|500         |1         |100       |
|222y67|600         |1         |200       |
|333z89|400         |1         |100       |
+------+------------+----------+----------+

