### Filter employees whose names start with 'A' and display their details

In PySpark, the `startswith()` method filters rows where a column's string value starts with a specified prefix.
It is called on a `Column` expression, such as `col("Name")`, and returns a boolean column that `filter()` uses to select rows.

In [0]:
# sample data
data = [
    (1, 'Rohish', 'HR', 5000),
    (2, 'Smit', 'HR', 6000),
    (3, 'Faisal', 'IT', 7000),
    (4, 'Pushpak', 'IT', 9000),
    (5, 'Rishabh', 'HR', 5500),
    (6, 'Vinit', 'IT', 8000),
    (7, 'DemonSlayer69', 'IT', 90000),
    (8, 'Ajit', 'Finance', 15000)
]

columns = ["EmployeeID", "Name", "Department", "Salary"]

df = spark.createDataFrame(data, columns)
df.show()

+----------+-------------+----------+------+
|EmployeeID|         Name|Department|Salary|
+----------+-------------+----------+------+
|         1|       Rohish|        HR|  5000|
|         2|         Smit|        HR|  6000|
|         3|       Faisal|        IT|  7000|
|         4|      Pushpak|        IT|  9000|
|         5|      Rishabh|        HR|  5500|
|         6|        Vinit|        IT|  8000|
|         7|DemonSlayer69|        IT| 90000|
|         8|         Ajit|   Finance| 15000|
+----------+-------------+----------+------+



In [0]:
# Filter employees whose names start with 'A'
from pyspark.sql.functions import col

filtered_df = df.filter(col("Name").startswith("A"))
filtered_df.show()

+----------+----+----------+------+
|EmployeeID|Name|Department|Salary|
+----------+----+----------+------+
|         8|Ajit|   Finance| 15000|
+----------+----+----------+------+



`startswith()` is a concise way to filter rows by a literal prefix in PySpark.

The comparison is case-sensitive, so `startswith("A")` matches `Ajit` but would not match a name stored as `ajit`.