#PySpark lit() – Add Literal or Constant to DataFrame

---

**The PySpark SQL function lit() adds a new column to a DataFrame by assigning a literal or constant value, and it returns a Column. typedLit() is its counterpart in Spark's Scala API; PySpark does not provide it.**

**lit() is available in PySpark by importing it from pyspark.sql.functions.**


**First, let’s create a DataFrame.**

In [0]:
from pyspark.sql.functions import lit, col, when

In [0]:
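from pyspark.sql import SparkSession

# Reuse the notebook's session if one exists; otherwise create one.
spark = SparkSession.builder.getOrCreate()

data = [("111", 50000), ("222", 60000), ("333", 40000)]
columns = ["EmpId", "Salary"]

df = spark.createDataFrame(data=data, schema=columns)
df.printSchema()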

root
 |-- EmpId: string (nullable = true)
 |-- Salary: long (nullable = true)



##lit() Function to Add Constant Column


**The PySpark lit() function adds a constant or literal value as a new column to a DataFrame.**



---


##Example 1: Simple usage of lit() function

**Let’s see how to create a new column with a constant value using the lit() Spark SQL function. The snippet below adds a new column, lit_value1, that holds the string literal "1" in every row of the PySpark DataFrame.**

In [0]:
df2 = df.select(col("EmpId"), col("Salary"), lit("1").alias("lit_value1"))
df2.printSchema()
df2.show(truncate=False)

root
 |-- EmpId: string (nullable = true)
 |-- Salary: long (nullable = true)
 |-- lit_value1: string (nullable = false)

+-----+------+----------+
|EmpId|Salary|lit_value1|
+-----+------+----------+
|111  |50000 |1         |
|222  |60000 |1         |
|333  |40000 |1         |
+-----+------+----------+



**Adding the same constant to every record in a DataFrame is rarely useful in practice, so let’s see another example.**


---


##Example 2: lit() function with withColumn


**The following example uses the PySpark lit() function with withColumn to derive a new column whose value depends on a condition.**

In [0]:
df3 = df2.withColumn("lit_value2", when((col("Salary") >= 40000) & (col("Salary") <= 50000),lit("100")).otherwise(lit("200")))
df3.show(truncate=False)

+-----+------+----------+----------+
|EmpId|Salary|lit_value1|lit_value2|
+-----+------+----------+----------+
|111  |50000 |1         |100       |
|222  |60000 |1         |200       |
|333  |40000 |1         |100       |
+-----+------+----------+----------+



##typedLit() Function – Syntax


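**The difference between lit() and typedLit() is that typedLit() can handle collection types such as arrays and maps (dictionaries). typedLit() belongs to Spark's Scala API and has no PySpark counterpart; in PySpark, collection constants are built by wrapping lit() values in functions such as array() or create_map() from pyspark.sql.functions.**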