* ### String Manipulation Functions
  * Case Conversion - `lower`, `upper`
  * Getting Length - `length`
  * Extracting substrings - `substring`, `split`
  * Trimming - `trim`, `ltrim`, `rtrim`
  * Padding - `lpad`, `rpad`
  * Concatenating strings - `concat`, `concat_ws`
* ### Date Manipulation Functions
  * Getting current date and time - `current_date`, `current_timestamp`
  * Date Arithmetic - `date_add`, `date_sub`, `datediff`, `months_between`, `add_months`, `next_day`
  * Beginning and Ending Date or Time - `last_day`, `trunc`, `date_trunc`
  * Formatting Date - `date_format`
  * Extracting Information - `dayofyear`, `dayofmonth`, `dayofweek`, `year`, `month`
* ### Aggregate Functions
  * `count`, `countDistinct`
  * `sum`, `avg`
  * `min`, `max`
* ### Other Functions
  * `CASE` and `WHEN`
  * `CAST` for type casting
  * Functions to manage special types such as `ARRAY`, `MAP`, `STRUCT` type columns
  * Many others

In [0]:
# Reading data

orders = spark.read.csv(
    '/public/retail_db/orders',
    schema='order_id INT, order_date STRING, order_customer_id INT, order_status STRING'
)

In [0]:
# These functions live in the pyspark.sql.functions module.
# Note: the wildcard import below shadows Python built-ins such as sum, min and max.
from pyspark.sql.functions import *

In [0]:
# Use Python's built-in help() to view a function's documentation
help(concat_ws)

In [0]:
# It's important to remember that a string argument is interpreted as a column name,
# never as a literal value. Wrap literal values in lit(); where an API requires
# a Column object, wrap the column name in col().

# For example, concat() needs lit() for the literal ' abc ':
orders.select(concat('order_id', lit(' abc '), 'order_date')).show(truncate=False)

In [0]:
# .alias() is a method of Column objects, so a bare column-name string must be wrapped in col() first:
orders.select(col('order_id').alias('order_id_alias')).show()