# Scala Basics

## Introduction

Scala combines object-oriented and functional programming in one concise, high-level language. Scala's static types help avoid bugs in complex applications, and its JVM and JavaScript runtimes let you build high-performance systems with easy access to huge ecosystems of libraries.

## Overview

- Statically typed, like Java, C, and C++
- Supports both object-oriented (OOP) and functional programming
- Scala code compiles to .class files that run on the Java Virtual Machine
- Like Java, Scala uses a curly-brace syntax
- Java libraries are easy to use from Scala

Creating variables in Scala: there are two kinds of variables.

val: creates an immutable variable (its value cannot be reassigned)

var: creates a mutable variable (its value can be reassigned)

This is what declaring a variable in Scala looks like:

In [40]:
val x = 2
var y = 3

x: Int = 2
y: Int = 3


In [41]:
x = 3

<console>: 29: error: reassignment to val

In [42]:
y = 4

y: Int = 4


So here, we can see that I was unable to reassign the value of x because it was declared with val, which makes it immutable. However, a variable declared with var, like y, is mutable, so reassigning its value is no issue.
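One detail worth sketching: val fixes which object a name refers to, not whether that object can change internally. The snippet below is a minimal illustration. The names scores and names are made up for this example and are not part of the notebook.

```scala
import scala.collection.mutable.ArrayBuffer

// val fixes the binding: scores always refers to this same buffer
val scores = ArrayBuffer(1, 2)
scores += 3                  // allowed: the buffer itself is mutable
// scores = ArrayBuffer(9)   // would not compile: reassignment to val

// var allows rebinding, even when the value is an immutable List
var names = List("a")
names = names :+ "b"         // builds a new List and rebinds names
```

A common convention is to reach for val first and switch to var only when rebinding is really needed.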

## Data Types
Scala provides the same basic data types as Java, such as:
- Boolean
- Byte
- Short
- Char
- Int
- Long
- Float
- Double
- String

However, you do not need to specify a type when declaring a variable. The Scala compiler uses type inference to determine the data type from the assigned value. This keeps declarations as short as in Python, but unlike Python the type is fixed at compile time. From the example above, we can see that x and y were not declared as Int, but when we ran it, the output showed both x and y as Int.
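As a small sketch of how inference and explicit annotations sit side by side (the variable names here are illustrative only):

```scala
val count = 2              // inferred as Int
val name = "Scala"         // inferred as String
val ratio: Double = 2      // explicit annotation; the literal 2 is widened to 2.0

// The inferred type is still checked at compile time:
// val bad: Int = "two"    // would not compile: type mismatch

// Inferred types can be confirmed by assigning to annotated values
val countAsInt: Int = count
val nameAsString: String = name
```

An annotation is optional for local values, but it can make intent clearer or request a wider type than the compiler would infer on its own.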

## For loop examples


In [43]:
for (i <- 0 to 5) println(i)

0
1
2
3
4
5
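The loop above uses to, which includes the upper bound. A short sketch of the related range forms follows; the value names are chosen only for this illustration.

```scala
// to includes the upper bound, until excludes it
val inclusive = (0 to 5).toList       // 0, 1, 2, 3, 4, 5
val exclusive = (0 until 5).toList    // 0, 1, 2, 3, 4

// by sets the step; a negative step counts down
val evens = (0 to 10 by 2).toList     // 0, 2, 4, 6, 8, 10
val countdown = (3 to 1 by -1).toList // 3, 2, 1

// Ranges plug into for loops the same way
for (i <- 3 to 1 by -1) println(i)
```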


### Example expression that iterates over a list of strings:

In [44]:
val fruits = List("apple", "banana", "lime", "orange")

val fruitLengths = for {
    f <- fruits
    if f.length > 4
} yield f.length


fruits: List[String] = List(apple, banana, lime, orange)
fruitLengths: List[Int] = List(5, 6, 6)
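A for expression with a guard and yield is shorthand for collection method calls. As a sketch, the same result can be written with withFilter and map. The names viaFor and viaMethods are introduced here only for comparison.

```scala
val fruits = List("apple", "banana", "lime", "orange")

// The for expression from the example above
val viaFor = for {
  f <- fruits
  if f.length > 4
} yield f.length

// Roughly what the compiler rewrites it to:
// the guard becomes withFilter, the yield becomes map
val viaMethods = fruits.withFilter(_.length > 4).map(_.length)
```

Both forms produce List(5, 6, 6); which one reads better is mostly a matter of taste and of how many generators and guards are involved.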


## Class Examples

In [45]:
class Person(var firstName: String, var lastName: String) {
    def printFullName() = println(s"$firstName $lastName")
}

defined class Person


In [46]:
val p = new Person("Julia", "Kern")
println(p.firstName)
p.lastName = "Manes"
p.printFullName()

Julia
Julia Manes


p: Person = Person@715233c7
p.lastName: String = Manes
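Note that printing p shows Person@715233c7 because a plain class inherits the default toString. One possible alternative, sketched here under the assumption that immutable data is acceptable, is a case class. PersonRecord and fullName are hypothetical names that do not appear in the notebook.

```scala
// A case class provides a readable toString, structural equality, and copy
case class PersonRecord(firstName: String, lastName: String) {
  def fullName: String = s"$firstName $lastName"
}

val original = PersonRecord("Julia", "Kern")    // no new keyword needed
val renamed = original.copy(lastName = "Manes") // new instance; original is unchanged

println(renamed)          // PersonRecord(Julia,Manes)
println(renamed.fullName) // Julia Manes
```

Instead of mutating a var field, copy returns a new value, which fits the preference for val described earlier.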


## Scala Methods
An example of how to create a method and how the method is called.

In [47]:
def add(a: Int, b: Int): Int = a + b

add: (a: Int, b: Int)Int


In [48]:
val add_x = add(1, 2)

add_x: Int = 3
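Two related features are worth a brief sketch: default parameter values, and function values, which can be stored in a val and passed to other methods. The names addWithDefault and addFn are illustrative and not part of the notebook.

```scala
// A method with a default value for its second parameter
def addWithDefault(a: Int, b: Int = 1): Int = a + b

// A function value: an object that can be stored and passed around
val addFn: (Int, Int) => Int = (a, b) => a + b

val incremented = List(1, 2, 3).map(n => addWithDefault(n)) // b defaults to 1
val total = List(1, 2, 3).reduce(addFn)                     // 1 + 2 + 3
```

Function values like addFn are the same kind of thing that is registered as a UDF in the Spark examples that follow.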


## Examples of Scala in Spark

In [3]:
val data = Seq((1,2,3), (4,5,6), (6,7,8), (9,19,10))
val ds = spark.createDataset(data)
ds.show()

+---+---+---+
| _1| _2| _3|
+---+---+---+
|  1|  2|  3|
|  4|  5|  6|
|  6|  7|  8|
|  9| 19| 10|
+---+---+---+



data: Seq[(Int, Int, Int)] = List((1,2,3), (4,5,6), (6,7,8), (9,19,10))
ds: org.apache.spark.sql.Dataset[(Int, Int, Int)] = [_1: int, _2: int ... 1 more field]


In [4]:
// Function value that cubes a Long
val cubed = (s: Long) => s * s * s
// Register UDF
spark.udf.register("cubed", cubed)
// Create temporary view
spark.range(1, 9).createOrReplaceTempView("udf_test")

cubed: Long => Long = $Lambda$2493/0x0000000801b01040@7734d80a


In [5]:
spark.sql("SELECT id, cubed(id) AS id_cubed FROM udf_test").show()

+---+--------+
| id|id_cubed|
+---+--------+
|  1|       1|
|  2|       8|
|  3|      27|
|  4|      64|
|  5|     125|
|  6|     216|
|  7|     343|
|  8|     512|
+---+--------+



In [6]:
import org.apache.spark.sql.functions._

import org.apache.spark.sql.functions._


In [20]:
val dataFile = "/Users/sheng/Downloads/archive/aug_test.csv"

dataFile: String = /Users/sheng/Downloads/archive/aug_test.csv


In [21]:
val dataSet = spark.read.options(Map("inferSchema"->"true","delimiter"->",","header"->"true"))
  .csv(dataFile)

dataSet: org.apache.spark.sql.DataFrame = [enrollee_id: int, city: string ... 11 more fields]


In [22]:
dataSet.printSchema()

root
 |-- enrollee_id: integer (nullable = true)
 |-- city: string (nullable = true)
 |-- city_development_index: double (nullable = true)
 |-- gender: string (nullable = true)
 |-- relevent_experience: string (nullable = true)
 |-- enrolled_university: string (nullable = true)
 |-- education_level: string (nullable = true)
 |-- major_discipline: string (nullable = true)
 |-- experience: string (nullable = true)
 |-- company_size: string (nullable = true)
 |-- company_type: string (nullable = true)
 |-- last_new_job: string (nullable = true)
 |-- training_hours: integer (nullable = true)



In [23]:
dataSet.show()

+-----------+--------+----------------------+------+--------------------+-------------------+---------------+----------------+----------+------------+--------------+------------+--------------+
|enrollee_id|    city|city_development_index|gender| relevent_experience|enrolled_university|education_level|major_discipline|experience|company_size|  company_type|last_new_job|training_hours|
+-----------+--------+----------------------+------+--------------------+-------------------+---------------+----------------+----------+------------+--------------+------------+--------------+
|      32403| city_41|    0.8270000000000001|  Male|Has relevent expe...|   Full time course|       Graduate|            STEM|         9|         <10|          null|           1|            21|
|       9858|city_103|                  0.92|Female|Has relevent expe...|      no_enrollment|       Graduate|            STEM|         5|        null|       Pvt Ltd|           1|            98|
|      31806| city_21|        

In [24]:
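// collect() brings every row to the driver; fine for a small file like this one,
// but prefer show(n) or take(n) on large datasets
dataSet.collect().foreach(println)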

[32403,city_41,0.8270000000000001,Male,Has relevent experience,Full time course,Graduate,STEM,9,<10,null,1,21]
[9858,city_103,0.92,Female,Has relevent experience,no_enrollment,Graduate,STEM,5,null,Pvt Ltd,1,98]
[31806,city_21,0.624,Male,No relevent experience,no_enrollment,High School,null,<1,null,Pvt Ltd,never,15]
[27385,city_13,0.8270000000000001,Male,Has relevent experience,no_enrollment,Masters,STEM,11,10/49,Pvt Ltd,1,39]
[27724,city_103,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,>20,10000+,Pvt Ltd,>4,72]
[217,city_23,0.899,Male,No relevent experience,Part time course,Masters,STEM,10,null,null,2,12]
[21465,city_21,0.624,null,Has relevent experience,no_enrollment,Graduate,STEM,<1,100-500,Pvt Ltd,1,11]
[27302,city_160,0.92,Female,Has relevent experience,no_enrollment,Graduate,STEM,>20,null,null,>4,81]
[12994,city_173,0.878,Male,Has relevent experience,no_enrollment,Graduate,STEM,14,null,null,4,2]
[16287,city_21,0.624,Male,Has relevent experience,Full time course,Gr

[15367,city_103,0.92,Male,No relevent experience,Full time course,Graduate,STEM,<1,null,null,1,24]
[22556,city_21,0.624,Male,Has relevent experience,Full time course,Graduate,STEM,5,10/49,Funded Startup,1,120]
[27439,city_100,0.887,Male,Has relevent experience,Part time course,Graduate,STEM,15,null,null,1,61]
[18426,city_21,0.624,Male,Has relevent experience,no_enrollment,Graduate,STEM,10,null,null,1,23]
[9789,city_160,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,>20,10000+,Pvt Ltd,>4,4]
[28172,city_114,0.9259999999999999,null,Has relevent experience,no_enrollment,Masters,STEM,>20,100-500,Pvt Ltd,>4,139]
[9184,city_136,0.897,Male,Has relevent experience,Part time course,Masters,STEM,3,10000+,Pvt Ltd,1,34]
[3587,city_16,0.91,Male,Has relevent experience,no_enrollment,Masters,Humanities,7,null,Pvt Ltd,4,21]
[1167,city_103,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,15,100-500,Pvt Ltd,4,10]
[1663,city_134,0.698,null,Has relevent experience,Part time cour

[19504,city_21,0.624,null,Has relevent experience,no_enrollment,Graduate,STEM,4,10/49,Pvt Ltd,1,18]
[16336,city_21,0.624,Male,Has relevent experience,no_enrollment,Graduate,STEM,7,10000+,Pvt Ltd,>4,272]
[6048,city_103,0.92,null,Has relevent experience,no_enrollment,Graduate,STEM,15,1000-4999,Pvt Ltd,2,29]
[6827,city_21,0.624,Male,Has relevent experience,no_enrollment,Graduate,STEM,2,100-500,Pvt Ltd,never,100]
[9268,city_160,0.92,Male,Has relevent experience,no_enrollment,null,null,>20,<10,Pvt Ltd,2,26]
[11169,city_21,0.624,Female,Has relevent experience,no_enrollment,Graduate,STEM,6,10000+,Pvt Ltd,1,13]
[31131,city_103,0.92,Female,No relevent experience,Part time course,Graduate,Other,3,5000-9999,Other,1,56]
[12361,city_21,0.624,null,No relevent experience,Full time course,Graduate,STEM,<1,<10,Early Stage Startup,null,23]
[15165,city_160,0.92,null,No relevent experience,Part time course,Graduate,STEM,>20,null,null,null,16]
[27773,city_114,0.9259999999999999,Male,Has relevent experience

[5907,city_149,0.6890000000000001,null,Has relevent experience,no_enrollment,Graduate,STEM,3,50-99,Pvt Ltd,1,35]
[1392,city_103,0.92,null,No relevent experience,no_enrollment,Primary School,null,1,null,Pvt Ltd,never,15]
[10348,city_103,0.92,Male,No relevent experience,no_enrollment,Graduate,STEM,18,50-99,Pvt Ltd,>4,28]
[7957,city_152,0.698,null,Has relevent experience,no_enrollment,Masters,STEM,15,null,Pvt Ltd,1,42]
[19058,city_103,0.92,Male,No relevent experience,no_enrollment,Masters,STEM,15,null,Pvt Ltd,1,55]
[19243,city_103,0.92,null,Has relevent experience,no_enrollment,Graduate,STEM,19,100-500,null,1,31]
[7770,city_103,0.92,null,No relevent experience,Full time course,Graduate,STEM,2,10000+,Pvt Ltd,null,25]
[15910,city_97,0.925,Male,Has relevent experience,no_enrollment,Graduate,STEM,15,1000-4999,Pvt Ltd,1,41]
[30486,city_30,0.698,null,No relevent experience,Full time course,null,null,4,50-99,Funded Startup,1,18]
[20333,city_19,0.682,Female,No relevent experience,no_enrollment,Gr

[28842,city_136,0.897,Male,No relevent experience,Full time course,Graduate,STEM,7,null,null,1,32]
[26563,city_41,0.8270000000000001,Male,Has relevent experience,no_enrollment,Graduate,STEM,9,5000-9999,Pvt Ltd,2,30]
[19256,city_103,0.92,Male,No relevent experience,Full time course,Graduate,STEM,4,null,Pvt Ltd,1,40]
[25579,city_103,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,7,1000-4999,NGO,3,35]
[27521,city_103,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,11,1000-4999,Pvt Ltd,>4,160]
[31378,city_16,0.91,Male,Has relevent experience,no_enrollment,Masters,Business Degree,>20,1000-4999,Pvt Ltd,>4,14]
[14870,city_11,0.55,Male,Has relevent experience,no_enrollment,Graduate,STEM,17,1000-4999,Pvt Ltd,1,55]
[11929,city_103,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,>20,10000+,Pvt Ltd,2,68]
[14585,city_21,0.624,null,Has relevent experience,no_enrollment,Graduate,STEM,2,10000+,Pvt Ltd,never,112]
[32487,city_21,0.624,Male,Has relevent experien

[976,city_67,0.855,Male,Has relevent experience,no_enrollment,Graduate,STEM,7,100-500,Pvt Ltd,2,57]
[30784,city_114,0.9259999999999999,Male,Has relevent experience,Full time course,High School,null,17,500-999,Pvt Ltd,>4,180]
[31557,city_100,0.887,Male,No relevent experience,no_enrollment,Graduate,Humanities,1,null,null,1,15]
[22982,city_16,0.91,Male,Has relevent experience,no_enrollment,Masters,STEM,>20,100-500,Pvt Ltd,1,50]
[29300,city_16,0.91,Male,Has relevent experience,no_enrollment,Graduate,STEM,>20,10/49,Pvt Ltd,1,33]
[5932,city_72,0.795,Male,Has relevent experience,no_enrollment,Graduate,STEM,6,5000-9999,Pvt Ltd,1,116]
[31852,city_73,0.754,Male,Has relevent experience,no_enrollment,Graduate,STEM,4,50-99,null,2,4]
[20833,city_103,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,13,10/49,Pvt Ltd,1,124]
[9501,city_71,0.884,Male,Has relevent experience,no_enrollment,Graduate,STEM,5,100-500,null,1,20]
[9707,city_103,0.92,Male,No relevent experience,Full time course,Gradu

[25802,city_103,0.92,null,Has relevent experience,no_enrollment,Masters,STEM,>20,null,null,>4,42]
[19385,city_21,0.624,Male,Has relevent experience,no_enrollment,Graduate,STEM,11,10000+,Pvt Ltd,1,36]
[27304,city_115,0.789,Male,No relevent experience,no_enrollment,Masters,STEM,20,50-99,Pvt Ltd,>4,94]
[3697,city_21,0.624,null,No relevent experience,Full time course,Graduate,STEM,15,10000+,Pvt Ltd,>4,57]
[11800,city_25,0.698,Male,Has relevent experience,no_enrollment,Masters,STEM,4,500-999,Pvt Ltd,2,101]
[180,city_103,0.92,Male,No relevent experience,no_enrollment,Graduate,Humanities,2,100-500,Funded Startup,2,35]
[7733,city_21,0.624,null,Has relevent experience,Full time course,Graduate,STEM,3,50-99,null,1,24]
[6935,city_16,0.91,Male,Has relevent experience,Full time course,Masters,STEM,>20,50-99,Pvt Ltd,2,130]
[4082,city_103,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,>20,10000+,Public Sector,>4,96]
[31959,city_104,0.924,Male,Has relevent experience,no_enrollment,Gradu

[33033,city_103,0.92,Male,Has relevent experience,no_enrollment,Graduate,Humanities,3,null,null,1,12]
[10579,city_101,0.5579999999999999,null,Has relevent experience,Full time course,Graduate,STEM,5,500-999,Pvt Ltd,null,118]
[5258,city_21,0.624,null,No relevent experience,Full time course,Graduate,STEM,4,1000-4999,Pvt Ltd,2,25]
[853,city_103,0.92,null,Has relevent experience,no_enrollment,Graduate,STEM,10,100-500,Public Sector,2,11]
[27245,city_21,0.624,Male,Has relevent experience,no_enrollment,Masters,STEM,16,1000-4999,Pvt Ltd,1,36]
[30670,city_103,0.92,Female,Has relevent experience,no_enrollment,Graduate,STEM,6,null,null,1,43]
[22497,city_26,0.698,Other,Has relevent experience,Part time course,High School,null,5,100-500,null,1,41]
[3218,city_102,0.804,Male,Has relevent experience,no_enrollment,Graduate,STEM,16,100-500,Pvt Ltd,1,33]
[464,city_114,0.9259999999999999,null,No relevent experience,Full time course,High School,null,20,null,Pvt Ltd,never,76]
[15770,city_160,0.92,Male,Has r

[6092,city_150,0.698,Male,Has relevent experience,no_enrollment,Masters,Business Degree,>20,1000-4999,Pvt Ltd,>4,56]
[17249,city_28,0.9390000000000001,Male,Has relevent experience,no_enrollment,Masters,STEM,>20,50-99,Pvt Ltd,>4,81]
[1227,city_123,0.738,Male,No relevent experience,no_enrollment,Graduate,STEM,10,null,null,never,32]
[23515,city_114,0.9259999999999999,Male,Has relevent experience,Part time course,Masters,STEM,9,100-500,Public Sector,never,27]
[16309,city_77,0.83,null,Has relevent experience,Part time course,Graduate,STEM,<1,100-500,Pvt Ltd,1,280]
[5368,city_101,0.5579999999999999,Female,Has relevent experience,null,Graduate,STEM,7,50-99,Pvt Ltd,1,15]
[27693,city_83,0.9229999999999999,Male,Has relevent experience,no_enrollment,Graduate,STEM,18,100-500,Pvt Ltd,>4,50]
[2582,city_16,0.91,null,Has relevent experience,no_enrollment,Graduate,STEM,>20,10/49,Pvt Ltd,>4,41]
[12096,city_115,0.789,null,No relevent experience,Full time course,Masters,STEM,<1,null,null,1,64]
[17526,city

[9649,city_103,0.92,Male,Has relevent experience,no_enrollment,Graduate,Business Degree,16,50-99,Pvt Ltd,1,33]
[23255,city_100,0.887,null,No relevent experience,Full time course,Graduate,STEM,5,null,null,never,54]
[24081,city_64,0.6659999999999999,Male,No relevent experience,Full time course,Graduate,Arts,2,null,null,1,94]
[23116,city_16,0.91,Male,No relevent experience,no_enrollment,Primary School,null,2,null,null,never,47]
[27338,city_21,0.624,Male,No relevent experience,Full time course,Graduate,STEM,11,null,null,1,103]
[14873,city_36,0.893,null,Has relevent experience,no_enrollment,Graduate,STEM,8,100-500,null,>4,5]
[11886,city_123,0.738,Male,No relevent experience,no_enrollment,Masters,STEM,14,null,null,>4,12]
[9207,city_114,0.9259999999999999,Male,Has relevent experience,no_enrollment,High School,null,8,<10,Pvt Ltd,never,39]
[3291,city_103,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,>20,100-500,Pvt Ltd,>4,62]
[5002,city_73,0.754,null,Has relevent experience,Part

[1274,city_75,0.9390000000000001,Male,Has relevent experience,no_enrollment,Graduate,STEM,9,1000-4999,null,never,310]
[4957,city_90,0.698,Male,Has relevent experience,no_enrollment,Graduate,STEM,12,<10,Pvt Ltd,3,20]
[7424,city_103,0.92,null,Has relevent experience,no_enrollment,Graduate,Business Degree,>20,<10,Pvt Ltd,2,39]
[12055,city_64,0.6659999999999999,Female,Has relevent experience,no_enrollment,Graduate,STEM,6,500-999,Pvt Ltd,2,21]
[9163,city_21,0.624,Male,Has relevent experience,no_enrollment,Graduate,Other,4,100-500,Pvt Ltd,1,24]
[29171,city_65,0.802,Male,Has relevent experience,no_enrollment,Masters,STEM,14,10000+,Pvt Ltd,1,11]
[16193,city_103,0.92,null,Has relevent experience,no_enrollment,Phd,STEM,>20,5000-9999,Pvt Ltd,>4,28]
[25623,city_152,0.698,null,No relevent experience,Full time course,Graduate,STEM,4,null,null,never,7]
[2575,city_158,0.7659999999999999,Male,Has relevent experience,no_enrollment,Graduate,STEM,5,50-99,Pvt Ltd,1,155]
[21353,city_21,0.624,null,No releven

[10928,city_21,0.624,Male,Has relevent experience,Full time course,Graduate,STEM,4,10000+,Pvt Ltd,1,10]
[10640,city_116,0.743,null,Has relevent experience,no_enrollment,Masters,STEM,13,100-500,NGO,1,166]
[14904,city_67,0.855,Male,Has relevent experience,Full time course,Graduate,STEM,5,50-99,Pvt Ltd,1,304]
[5160,city_75,0.9390000000000001,null,Has relevent experience,Part time course,Masters,STEM,6,10000+,Pvt Ltd,2,10]
[8239,city_21,0.624,Male,Has relevent experience,Full time course,Masters,STEM,9,5000-9999,Pvt Ltd,never,48]
[11912,city_103,0.92,Male,Has relevent experience,Part time course,Graduate,Business Degree,5,10000+,Pvt Ltd,4,17]
[23554,city_176,0.764,null,No relevent experience,no_enrollment,Primary School,null,<1,null,null,never,94]
[17221,city_21,0.624,null,No relevent experience,Full time course,Graduate,STEM,4,null,null,never,36]
[25072,city_136,0.897,Male,Has relevent experience,no_enrollment,Masters,STEM,>20,50-99,Pvt Ltd,1,116]
[19240,city_103,0.92,Male,Has relevent ex

[1206,city_160,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,13,50-99,Pvt Ltd,>4,54]
[14412,city_160,0.92,Male,Has relevent experience,Full time course,Graduate,STEM,3,10000+,Pvt Ltd,1,65]
[19987,city_136,0.897,null,Has relevent experience,no_enrollment,Masters,STEM,9,50-99,null,3,7]
[15586,city_115,0.789,Male,Has relevent experience,Part time course,Masters,STEM,11,10/49,Pvt Ltd,3,47]
[17424,city_28,0.9390000000000001,null,No relevent experience,Full time course,High School,null,9,null,null,never,80]
[19783,city_16,0.91,Male,No relevent experience,no_enrollment,Graduate,STEM,16,100-500,Public Sector,1,4]
[15763,city_97,0.925,Male,Has relevent experience,no_enrollment,Graduate,STEM,16,500-999,Pvt Ltd,1,24]
[6433,city_21,0.624,Male,No relevent experience,no_enrollment,Graduate,STEM,2,1000-4999,Pvt Ltd,1,26]
[7881,city_114,0.9259999999999999,Male,Has relevent experience,no_enrollment,Graduate,STEM,10,<10,Pvt Ltd,1,38]
[9237,city_65,0.802,null,Has relevent experience,no_en

[14477,city_21,0.624,null,Has relevent experience,no_enrollment,Graduate,STEM,3,100-500,Pvt Ltd,1,55]
[25737,city_114,0.9259999999999999,Other,No relevent experience,Full time course,High School,null,5,100-500,Public Sector,1,15]
[21032,city_160,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,3,null,null,2,58]
[30770,city_21,0.624,Male,Has relevent experience,no_enrollment,Graduate,STEM,9,50-99,Pvt Ltd,1,20]
[28733,city_71,0.884,Male,Has relevent experience,no_enrollment,Graduate,STEM,>20,<10,Pvt Ltd,3,33]
[26032,city_103,0.92,Male,No relevent experience,Full time course,Graduate,STEM,6,null,null,never,74]
[2421,city_21,0.624,null,No relevent experience,no_enrollment,Graduate,STEM,5,50-99,Pvt Ltd,1,10]
[6027,city_103,0.92,Female,No relevent experience,no_enrollment,Phd,STEM,6,100-500,Funded Startup,2,40]
[7266,city_128,0.527,null,No relevent experience,no_enrollment,Graduate,STEM,4,null,null,4,136]
[20956,city_136,0.897,Male,Has relevent experience,no_enrollment,Graduate,

[2358,city_11,0.55,Male,Has relevent experience,Full time course,Graduate,STEM,7,<10,Early Stage Startup,1,11]
[7645,city_21,0.624,Male,Has relevent experience,no_enrollment,Masters,STEM,10,10000+,Public Sector,1,30]
[15411,city_61,0.9129999999999999,Male,Has relevent experience,no_enrollment,Masters,STEM,12,10000+,Pvt Ltd,>4,27]
[32727,city_21,0.624,Male,Has relevent experience,Full time course,Graduate,STEM,11,<10,Pvt Ltd,2,38]
[32693,city_40,0.7759999999999999,Male,Has relevent experience,no_enrollment,Masters,STEM,12,1000-4999,Pvt Ltd,4,312]
[471,city_103,0.92,null,Has relevent experience,Full time course,Graduate,STEM,8,<10,Public Sector,2,32]
[16922,city_73,0.754,null,No relevent experience,no_enrollment,High School,null,2,null,null,never,78]
[31991,city_103,0.92,Male,Has relevent experience,no_enrollment,Graduate,Arts,10,50-99,Funded Startup,1,125]
[16212,city_21,0.624,Male,Has relevent experience,no_enrollment,Graduate,STEM,5,500-999,Pvt Ltd,1,152]
[22477,city_21,0.624,Male,Has

[29159,city_99,0.915,Male,Has relevent experience,no_enrollment,Graduate,STEM,>20,1000-4999,Pvt Ltd,>4,40]
[20909,city_16,0.91,Male,Has relevent experience,no_enrollment,Masters,STEM,>20,<10,Pvt Ltd,4,56]
[13140,city_73,0.754,Male,Has relevent experience,Full time course,Masters,STEM,16,1000-4999,Pvt Ltd,2,14]
[8903,city_99,0.915,null,Has relevent experience,no_enrollment,Graduate,STEM,9,1000-4999,Pvt Ltd,1,68]
[20079,city_103,0.92,Male,No relevent experience,no_enrollment,Graduate,Other,1,5000-9999,Pvt Ltd,never,29]
[24628,city_116,0.743,null,Has relevent experience,Full time course,Primary School,null,2,5000-9999,Pvt Ltd,1,101]
[5482,city_103,0.92,null,Has relevent experience,no_enrollment,Masters,STEM,>20,<10,Early Stage Startup,>4,42]
[16050,city_160,0.92,null,Has relevent experience,no_enrollment,null,null,>20,null,null,1,72]
[18581,city_101,0.5579999999999999,Male,Has relevent experience,no_enrollment,Masters,STEM,15,100-500,Pvt Ltd,4,20]
[14438,city_21,0.624,Male,Has relevent ex

[24998,city_173,0.878,Male,No relevent experience,no_enrollment,High School,null,3,null,null,1,74]
[15853,city_50,0.8959999999999999,null,No relevent experience,no_enrollment,Graduate,Other,9,10/49,Pvt Ltd,>4,21]
[14581,city_16,0.91,Male,Has relevent experience,no_enrollment,Graduate,STEM,2,<10,Early Stage Startup,1,57]
[27390,city_103,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,3,10/49,Pvt Ltd,1,58]
[21976,city_16,0.91,Male,Has relevent experience,no_enrollment,Graduate,STEM,>20,1000-4999,Public Sector,>4,34]
[28808,city_71,0.884,Male,Has relevent experience,no_enrollment,Masters,STEM,>20,null,null,>4,66]
[14285,city_173,0.878,null,No relevent experience,Full time course,Graduate,Business Degree,2,100-500,Pvt Ltd,1,156]
[7784,city_45,0.89,Male,Has relevent experience,Part time course,High School,null,10,10/49,Pvt Ltd,>4,13]
[15973,city_127,0.745,null,No relevent experience,Full time course,Graduate,STEM,3,null,null,1,53]
[24155,city_21,0.624,Male,Has relevent experie

[5266,city_102,0.804,Male,Has relevent experience,Part time course,Graduate,STEM,13,50-99,Pvt Ltd,1,12]
[27308,city_103,0.92,Female,No relevent experience,no_enrollment,Graduate,STEM,8,null,null,never,46]
[6721,city_21,0.624,null,Has relevent experience,Full time course,Masters,STEM,5,50-99,Early Stage Startup,1,52]
[667,city_103,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,>20,100-500,Pvt Ltd,>4,4]
[3027,city_103,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,>20,100-500,Pvt Ltd,>4,50]
[8735,city_103,0.92,null,No relevent experience,Full time course,Graduate,STEM,5,1000-4999,Public Sector,null,34]
[24238,city_101,0.5579999999999999,null,Has relevent experience,Full time course,Graduate,STEM,3,<10,Early Stage Startup,1,9]
[28911,city_160,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,4,100-500,Funded Startup,1,165]
[18575,city_67,0.855,Female,Has relevent experience,Full time course,Graduate,STEM,2,null,null,1,36]
[15098,city_103,0.92,null

[24800,city_160,0.92,Male,No relevent experience,no_enrollment,Graduate,Business Degree,12,null,null,2,59]
[29243,city_165,0.903,Female,No relevent experience,Part time course,Masters,STEM,<1,10/49,Funded Startup,1,270]
[22835,city_27,0.848,Male,No relevent experience,no_enrollment,Graduate,STEM,10,null,null,never,82]
[6615,city_21,0.624,Male,No relevent experience,Part time course,Graduate,STEM,2,<10,Pvt Ltd,1,68]
[32152,city_103,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,>20,500-999,Pvt Ltd,2,64]
[7281,city_103,0.92,Female,No relevent experience,no_enrollment,Masters,STEM,4,1000-4999,Pvt Ltd,1,33]
[7314,city_136,0.897,Female,No relevent experience,no_enrollment,Phd,STEM,6,null,null,1,8]
[5479,city_103,0.92,Male,No relevent experience,no_enrollment,Graduate,STEM,2,null,null,2,81]
[28427,city_160,0.92,null,No relevent experience,Part time course,Masters,Humanities,16,10000+,Public Sector,1,60]
[18166,city_30,0.698,Female,Has relevent experience,no_enrollment,Graduate

[22338,city_160,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,>20,10/49,Pvt Ltd,3,41]
[18519,city_21,0.624,null,No relevent experience,Full time course,Graduate,STEM,4,null,null,1,53]
[8752,city_103,0.92,Male,No relevent experience,Full time course,Phd,STEM,10,null,Public Sector,>4,52]
[17254,city_91,0.691,null,No relevent experience,Full time course,Primary School,null,2,null,null,1,133]
[31441,city_75,0.9390000000000001,Male,Has relevent experience,no_enrollment,Graduate,STEM,15,null,Public Sector,2,156]
[13175,city_75,0.9390000000000001,null,No relevent experience,no_enrollment,Masters,STEM,9,50-99,Pvt Ltd,1,6]
[2122,city_21,0.624,null,No relevent experience,Full time course,Graduate,STEM,5,null,Pvt Ltd,never,13]
[734,city_103,0.92,null,Has relevent experience,no_enrollment,Graduate,STEM,10,500-999,Pvt Ltd,null,11]
[10655,city_136,0.897,Female,No relevent experience,Full time course,Graduate,STEM,4,null,null,never,102]
[20900,city_71,0.884,Male,Has relevent experienc

[8216,city_21,0.624,null,Has relevent experience,no_enrollment,Masters,STEM,11,100-500,Pvt Ltd,2,214]
[5320,city_173,0.878,Male,Has relevent experience,no_enrollment,High School,null,6,100-500,Pvt Ltd,never,89]
[30466,city_10,0.895,Male,No relevent experience,Full time course,High School,null,4,10/49,Pvt Ltd,1,89]
[24026,city_74,0.579,null,Has relevent experience,no_enrollment,Graduate,STEM,6,10/49,Pvt Ltd,1,87]
[30013,city_40,0.7759999999999999,Male,Has relevent experience,no_enrollment,Masters,Other,9,100-500,Pvt Ltd,1,21]
[22342,city_99,0.915,Female,No relevent experience,Full time course,Graduate,STEM,5,10/49,Pvt Ltd,1,21]
[370,city_36,0.893,Male,Has relevent experience,no_enrollment,Graduate,STEM,>20,null,null,>4,85]
[30365,city_11,0.55,null,No relevent experience,no_enrollment,Graduate,STEM,2,null,null,1,25]
[30316,city_21,0.624,null,Has relevent experience,Full time course,Graduate,STEM,2,5000-9999,Pvt Ltd,1,57]
[28350,city_67,0.855,Male,Has relevent experience,Full time course,

[14806,city_36,0.893,Male,Has relevent experience,no_enrollment,High School,null,10,50-99,Early Stage Startup,2,53]
[22480,city_40,0.7759999999999999,Male,Has relevent experience,Full time course,Graduate,STEM,11,100-500,null,4,88]
[29259,city_136,0.897,Male,Has relevent experience,no_enrollment,Masters,STEM,>20,50-99,Pvt Ltd,>4,5]
[22746,city_103,0.92,Male,Has relevent experience,no_enrollment,High School,null,3,50-99,null,never,132]
[9852,city_103,0.92,Male,Has relevent experience,no_enrollment,Graduate,STEM,>20,null,null,3,23]
[30,city_158,0.7659999999999999,Male,Has relevent experience,Part time course,Graduate,STEM,10,500-999,Pvt Ltd,2,63]
[17854,city_114,0.9259999999999999,null,Has relevent experience,no_enrollment,Masters,STEM,15,10000+,Pvt Ltd,2,334]
[9205,city_143,0.74,null,Has relevent experience,no_enrollment,Graduate,STEM,13,null,null,>4,37]
[17779,city_21,0.624,Male,No relevent experience,Full time course,Masters,STEM,5,null,null,1,129]
[20154,city_136,0.897,Male,Has relev

[21166,city_150,0.698,null,No relevent experience,Full time course,Graduate,STEM,4,null,null,1,7]
[24684,city_103,0.92,Male,No relevent experience,no_enrollment,Masters,Business Degree,2,1000-4999,Pvt Ltd,1,29]
[32511,city_16,0.91,Male,No relevent experience,no_enrollment,Graduate,STEM,14,1000-4999,Public Sector,>4,7]
[10587,city_64,0.6659999999999999,Female,Has relevent experience,no_enrollment,Graduate,Humanities,14,null,null,1,23]
[23210,city_103,0.92,null,Has relevent experience,no_enrollment,Graduate,STEM,5,10000+,Pvt Ltd,1,39]
[8243,city_45,0.89,Male,No relevent experience,Full time course,High School,null,6,null,null,never,114]
[2046,city_149,0.6890000000000001,Male,Has relevent experience,no_enrollment,Masters,STEM,7,50-99,Early Stage Startup,1,76]
[875,city_160,0.92,null,Has relevent experience,no_enrollment,Graduate,STEM,12,<10,Pvt Ltd,1,70]
[437,city_16,0.91,Male,Has relevent experience,no_enrollment,Masters,STEM,10,10000+,Pvt Ltd,2,14]
[9208,city_21,0.624,null,No relevent e

[30903,city_103,0.92,Male,Has relevent experience,no_enrollment,Graduate,Humanities,>20,50-99,Pvt Ltd,>4,44]
[23078,city_103,0.92,null,No relevent experience,Full time course,Graduate,STEM,3,100-500,Pvt Ltd,1,57]
[3520,city_116,0.743,Male,Has relevent experience,no_enrollment,Masters,Business Degree,7,50-99,Pvt Ltd,4,112]
[23948,city_103,0.92,Male,No relevent experience,no_enrollment,null,null,3,null,null,null,13]
[20536,city_21,0.624,Male,Has relevent experience,Full time course,Graduate,STEM,9,10/49,Pvt Ltd,1,86]
[12472,city_50,0.8959999999999999,Male,Has relevent experience,no_enrollment,Graduate,STEM,6,null,null,1,29]
[12756,city_136,0.897,Male,Has relevent experience,no_enrollment,Masters,STEM,>20,100-500,Pvt Ltd,4,21]
[634,city_160,0.92,Female,No relevent experience,Part time course,High School,null,1,10000+,Pvt Ltd,1,57]
[11458,city_27,0.848,Male,Has relevent experience,no_enrollment,Graduate,STEM,<1,null,null,>4,139]
[20348,city_150,0.698,Male,Has relevent experience,no_enrollm

In [27]:
dataSet.sort(col("training_hours").desc).show(5)

+-----------+--------+----------------------+------+--------------------+-------------------+---------------+----------------+----------+------------+------------+------------+--------------+
|enrollee_id|    city|city_development_index|gender| relevent_experience|enrolled_university|education_level|major_discipline|experience|company_size|company_type|last_new_job|training_hours|
+-----------+--------+----------------------+------+--------------------+-------------------+---------------+----------------+----------+------------+------------+------------+--------------+
|      17854|city_114|    0.9259999999999999|  null|Has relevent expe...|      no_enrollment|        Masters|            STEM|        15|      10000+|     Pvt Ltd|           2|           334|
|       7443|city_173|                 0.878|  Male|Has relevent expe...|      no_enrollment|        Masters|            STEM|         8|      10000+|     Pvt Ltd|           1|           334|
|      24845|city_103|                  