# **Scala Example**

This notebook will show you how to use **Apache Spark** with **Scala** to perform a simple word count.

You can run a cell by pressing **"shift-enter"**, which will compute the current cell and advance to the next cell, or by clicking in a cell and pressing **"control-enter"**, which will compute the current cell and remain in that cell.

**This notebook covers:**
* *Part 1:* Required Libraries
* *Part 2:* Spark Context
* *Part 3:* Word Count


## Import Required Libraries

This section shows how to import the required libraries.

Extra libraries can also be imported from Maven repositories. See the comment below or the kernel [documentation](https://github.com/alexarchambault/jupyter-scala/blob/master/README.md) to learn more.

In [None]:
// If the Spark jars were not on the Worker classpath, they could be added
// directly from the Maven repository using the following code:

//classpath.add("org.apache.spark" % "spark-core_2.11" % "2.0.1")

In [None]:
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

## Spark Context

This section shows how to initialize and configure a basic SparkContext.

In [None]:
val conf = new SparkConf()
    //.setMaster("spark://localhost:7077")
    .setMaster("local[2]")
    .setAppName("Word Count Scala App")
val sc = new SparkContext(conf)
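
As a quick sanity check of the configuration above, the following sketch runs a tiny job on the context. It assumes the Spark jars are on the classpath; the names `checkConf` and `checkSc` are illustrative and not part of the original notebook. `SparkContext.getOrCreate` is used so that the cell reuses a context that is already running instead of trying to create a second one.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative names; same settings as the cell above.
val checkConf = new SparkConf()
    .setMaster("local[2]")
    .setAppName("Word Count Scala App")

// Returns the running context if there is one, otherwise creates it.
val checkSc = SparkContext.getOrCreate(checkConf)

// Show which master and application name the context is using.
println(checkSc.master)
println(checkSc.appName)

// Distribute the numbers 1 to 100 and add them up.
val total = checkSc.parallelize(1 to 100).reduce(_ + _)
println(total)  // 5050
```

If the sum comes back, the context is up and can schedule work on the two local threads requested by `local[2]`.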

## Word Count

This section implements the word count: it loads a text file, counts how often each word occurs, and lists the 15 most frequent words.

In [None]:
// load a text file
val textFile = sc.textFile("/srv/spark/LICENSE")

In [None]:
// count how many times each word appears in the file
val counts = textFile.flatMap(line => line.split(" "))
                 .map(word => (word, 1))
                 .reduceByKey(_ + _)
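
To see what the three steps above compute, here is a minimal sketch of the same idea on plain Scala collections, with no Spark required. The sample lines and the name `localCounts` are made up for illustration; `groupBy` followed by a sum stands in for `reduceByKey`, which exists only on Spark pair RDDs.

```scala
// Made-up input standing in for the lines of the text file.
val lines = Seq("to be or not to be", "to do is to be")

val localCounts: Map[String, Int] = lines
  .flatMap(line => line.split(" "))   // one element per word
  .map(word => (word, 1))             // pair each word with a count of 1
  .groupBy(_._1)                      // collect the pairs that share a word
  .map { case (word, pairs) => (word, pairs.map(_._2).sum) }  // add up the 1s

println(localCounts("to"))  // 4
println(localCounts("be"))  // 3
```

One thing to keep in mind: splitting on a single space produces empty strings wherever a line is indented or contains consecutive spaces, so the empty string may show up among the counted words. Adding `.filter(_.nonEmpty)` after the `flatMap` would be one way to drop them.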

In [None]:
// the 15 most frequent words
counts.takeOrdered(15)(Ordering[Int].reverse.on(x => x._2))
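
The ordering passed to `takeOrdered` above compares the pairs by their count, largest first. A small sketch on a local collection shows the same ordering at work; the sample pairs and the names `sample`, `byCountDesc` and `top2` are hypothetical.

```scala
// Hypothetical (word, count) pairs standing in for the `counts` RDD.
val sample = Seq(("spark", 2), ("the", 7), ("license", 4), ("of", 5))

// Compare pairs by their second element (the count), in descending order.
val byCountDesc: Ordering[(String, Int)] =
  Ordering[Int].reverse.on[(String, Int)](_._2)

// Sort with that ordering and keep the first two entries.
val top2 = sample.sorted(byCountDesc).take(2)
println(top2)  // List((the,7), (of,5))
```

On an RDD, `takeOrdered(n)` returns the first `n` elements under the given ordering as a local array, so the full data set does not have to be collected on the driver first.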

*Notebook written by [fscm](https://github.com/fscm)*