# Built-in Functions

PySpark ships with many built-in functions, which are available in the `pyspark.sql.functions` module.

## Imports


In [None]:
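import os

import findspark

# Locate the Spark installation and add it to sys.path before importing pyspark.
findspark.init()

import pyspark
from IPython.core.display import HTML
from pyspark.sql import SparkSession, functions

# Note: max and min below shadow Python's built-ins of the same name.
from pyspark.sql.functions import (
    col,
    date_add,
    date_sub,
    lit,
    lower,
    max,
    min,
    substring,
    to_timestamp,
    upper,
)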

## SparkSession

## Display

Allow the browser to display wide DataFrames in a scrollable output area.
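The display cell itself is not shown. One commonly used approach, given here only as a sketch and assuming the notebook runs in a browser-based Jupyter front end, is to inject a small CSS rule that stops the text output of `show()` from wrapping:

```python
from IPython.display import HTML, display

# Assumption: DataFrame.show() output is rendered inside <pre> elements.
# Disabling wrapping keeps each row on one line, so the browser scrolls
# horizontally instead of breaking wide rows.
scroll_css = "<style>pre { white-space: pre !important; }</style>"
display(HTML(scroll_css))
```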

## Load the data

In [None]:
data_path = "file:///" + os.getcwd() + "/data"

file_path = data_path + "/reported-crimes.csv"

crimes_df = (
    spark.read.option("header", "true")
    .csv(file_path)
    .withColumn("Date", to_timestamp(col("Date"), "MM/dd/yyyy hh:mm:ss a"))
    .filter(col("Date") <= lit("2018-11-11"))
)

crimes_df.show(5)

## String functions

**Display the Primary Type column in lowercase and in uppercase, along with its first 4 characters**

In [None]:
crimes_df.printSchema()

In [None]:
crimes_df.select(
    lower(col("Primary Type")),
    upper(col("Primary Type")),
    substring(col("Primary Type"), 1, 4),
).show(5)

## Numeric functions

**Show the oldest date and the most recent date**

In [None]:
crimes_df.select(min(col("Date")), max(col("Date"))).show(1)

## Date

**What is 3 days earlier than the oldest date and 3 days later than the most recent date?**

In [None]:
crimes_df.select(date_sub(min(col("Date")), 3), date_add(max(col("Date")), 3)).show(1)