[![icons8-linkedin.gif](attachment:c9494563-7284-4c71-9fe4-40d31b4558ff.gif 'Author : Suryakant Kumar')](https://www.linkedin.com/in/suryakantkumar/)[![icons8-github.gif](attachment:ecd1af6f-8660-4379-b68f-bad3ed6d67c8.gif 'Author : Suryakant Kumar')](https://github.com/SuryakantKumar)

## File Handling in Scala

File handling lets a program store data in a file and read it back later.

Scala, together with the Java libraries it runs on, provides packages with which we can `create`, `open`, `read` and `write` files.

To write to a file in Scala we borrow `java.io._` from `Java`, because the Scala standard library has no class for writing to files.

Instead of the wildcard import, we could also import just `java.io.File` and `java.io.PrintWriter`.

### Creating a new file

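* `java.io.File` is an abstract representation of a file or directory path name; it lets the program refer to files on the file system and query their attributes.

* `File(String pathname)` converts the given string into an abstract path name and creates a new `File` instance. This alone does not create the file on disk.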
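
A short sketch of how a `File` object relates to the file on disk: constructing it only wraps a path, and the file appears once something creates it. The temporary directory and the file name `example.txt` are assumptions chosen for illustration, not part of the original notebook.

```scala
import java.io.File
import java.nio.file.Files

// Hypothetical scratch directory so the sketch does not touch real data
val tmpDir = Files.createTempDirectory("file_handling_demo").toFile

// Building the File object only stores the path; nothing exists on disk yet
val demoFile = new File(tmpDir, "example.txt")
val existedBefore = demoFile.exists()

// createNewFile() creates an empty file and returns true if it was not there already
val created = demoFile.createNewFile()
val existsAfter = demoFile.exists()

println(s"before: $existedBefore, created: $created, after: $existsAfter")
// prints "before: false, created: true, after: true"
```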

### Writing to the file

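* `java.io.PrintWriter` implements all the print methods found in `PrintStream`. `new PrintWriter(file)` creates the file if it does not exist and truncates it if it does; the parent directory must already exist.

Below is the implementation for creating a new file and writing to it: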

In [1]:
import java.io.File
import java.io.PrintWriter

// Main method
def main(args: Array[String]): Unit = {
    // Creating a File object for the target path (nothing is written yet)
    val fileObject = new File("data/file_handling.txt")

    // Passing the file reference to the PrintWriter, which creates or truncates the file
    val printWriter = new PrintWriter(fileObject)

    try {
        // Writing to the file
        printWriter.write("Hello, This is Suryakant")
    } finally {
        // Closing the PrintWriter flushes the text to disk
        printWriter.close()
    }
}

Intitializing Scala interpreter ...

Spark Web UI available at http://192.168.1.138:4043
SparkContext available as 'sc' (version = 3.3.0, master = local[*], app id = local-1670360295614)
SparkSession available as 'spark'


import java.io.File
import java.io.PrintWriter
main: (args: Array[String])Unit


In [2]:
main(Array(""))
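
If a write throws before `close()` runs, buffered text can be lost and the file handle stays open. The following is a minimal sketch of one way to guard against that with `try`/`finally` while writing several lines; the helper name `writeLines` and the temporary file are hypothetical and not part of the original notebook.

```scala
import java.io.{File, PrintWriter}

// Hypothetical helper: writes each string on its own line and always closes the writer
def writeLines(target: File, lines: Seq[String]): Unit = {
  val writer = new PrintWriter(target) // creates the file, or truncates an existing one
  try {
    lines.foreach(line => writer.println(line))
  } finally {
    writer.close() // flushes buffered output even if a write fails
  }
}

// Temporary file used only for this illustration
val demoTarget = File.createTempFile("file_handling", ".txt")
writeLines(demoTarget, Seq("Hello, This is Suryakant", "Second line"))

println(demoTarget.length() > 0) // prints "true"
```

`println` is used instead of `write` here so that each element ends with a line separator.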

Scala's standard library does not provide a class for writing files, but it does provide one for reading them: `scala.io.Source`. We read files through its companion object.

To read the contents of a file, we call `Source.fromFile()` with the file name as its argument.

### Reading a File

* `scala.io.Source` represents the source file as an iterator over its characters (`Iterator[Char]`).

* `Source.fromFile` creates a source from the input file.

* `file.next` returns the next element in the iteration and moves the iterator one step ahead.

* `file.hasNext` checks whether there is a next element available to iterate.

* `getLines` returns an iterator that goes through the file line by line.

Below is the implementation for reading each character from a file:

In [3]:
import scala.io.Source

// Main method
def main(args: Array[String]): Unit = {
    // file name
    val fname = "data/file_handling.txt"

    // creates an iterator over the characters of the source file
    val fSource = Source.fromFile(fname)
    try {
        while (fSource.hasNext) {
            println(fSource.next())
        }
    } finally {
        // closing file
        fSource.close()
    }
}

import scala.io.Source
main: (args: Array[String])Unit


In [4]:
main(Array(""))

H
e
l
l
o
,
 
T
h
i
s
 
i
s
 
S
u
r
y
a
k
a
n
t


We can use the `getLines()` method to read the file one line at a time instead of one character at a time.

Below is the implementation for reading each line from a file:

In [5]:
import scala.io.Source

// Main method
def main(args: Array[String]): Unit = {
    val fname = "data/file_handling.txt"
    val fSource = Source.fromFile(fname)
    try {
        // getLines() yields one line at a time
        for (line <- fSource.getLines()) {
            println(line)
        }
    } finally {
        fSource.close()
    }
}

import scala.io.Source
main: (args: Array[String])Unit


In [6]:
main(Array(""))

Hello, This is Suryakant
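
When the lines are needed after the file has been closed, one possible pattern is to collect them into a `List` inside `try` and close the source in `finally`. The sketch below assumes a hypothetical helper `readLines` and a temporary file created only for the demonstration.

```scala
import java.io.{File, PrintWriter}
import scala.io.Source

// Hypothetical helper: returns all lines of a file and always closes the source
def readLines(path: String): List[String] = {
  val source = Source.fromFile(path)
  try {
    source.getLines().toList // toList forces the lazy iterator before the file is closed
  } finally {
    source.close()
  }
}

// Create a small temporary file so the sketch is self-contained
val sample = File.createTempFile("read_demo", ".txt")
val sampleWriter = new PrintWriter(sample)
try sampleWriter.write("first line\nsecond line") finally sampleWriter.close()

val sampleLines = readLines(sample.getPath)
sampleLines.foreach(println)
```

Calling `toList` matters: returning the bare iterator from `getLines()` would hand back a reader over a source that `finally` has already closed.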


* From a terminal, we can preview the data using the `view` command, which opens the file read-only in `vi`/Vim.

```terminal
view file_name
```

* We can also select a limited number of lines from a file by calling the `take()` method on the line iterator.

In [7]:
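// getLines() returns a lazy, single-pass iterator over the lines of the file
val orderItems = Source.fromFile("data/retail_db/order_items/part-00000").getLines()

// Print only the first 10 lines
orderItems.take(10).foreach(println)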

1,1,957,1,299.98,299.98
2,2,1073,1,199.99,199.99
3,2,502,5,250.0,50.0
4,2,403,1,129.99,129.99
5,4,897,2,49.98,24.99
6,4,365,5,299.95,59.99
7,4,502,3,150.0,50.0
8,4,1014,4,199.92,49.98
9,5,957,1,299.98,299.98
10,5,365,5,299.95,59.99


orderItems: Iterator[String] = <iterator>


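* We can count the lines of a file using the `size` method. Because `orderItems` is an iterator, `size` counts only the lines that have not been consumed yet and exhausts the iterator, so the lines already printed through `take(10)` are not included in the count.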

In [8]:
orderItems.size

res4: Int = 172188
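
Since `getLines` returns an iterator, operations such as `size` consume it. The sketch below illustrates this with a small in-memory iterator standing in for a file's lines; the helper `freshLines` and its values are hypothetical. It also shows one option for previewing and counting independently: materialising the lines into a `List` first, which trades memory for repeatable access.

```scala
// In-memory stand-in for the lines of a file; the values are illustrative only
def freshLines(): Iterator[String] = Iterator("row1", "row2", "row3", "row4", "row5")

// An iterator is single-pass: counting it walks to the end and leaves nothing behind
val once = freshLines()
val counted = once.size
val leftOver = once.hasNext

// A List can be previewed and counted as often as needed
val allLines = freshLines().toList
val preview = allLines.take(2)
val total = allLines.size

println(s"counted=$counted leftOver=$leftOver preview=$preview total=$total")
// prints "counted=5 leftOver=false preview=List(row1, row2) total=5"
```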


* The `split()` method splits a string around each occurrence of the given delimiter and returns an array of the resulting substrings.

In [9]:
"1,2,3,4,5".split(",")

res5: Array[String] = Array(1, 2, 3, 4, 5)
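
Applied to a line of the order items file, `split()` can be combined with conversions such as `toInt` and `toDouble`. This is only a sketch: the row is copied from the preview above, while the field meanings (an id in position 0, a subtotal in position 4) are assumptions, since the notebook does not document the columns.

```scala
// One row copied from the order_items preview
val row = "1,1,957,1,299.98,299.98"

// split returns Array[String]; each field still needs converting
val fields = row.split(",")

// Assumed meanings, for illustration only
val orderItemId = fields(0).toInt
val subtotal = fields(4).toDouble

println(s"fields=${fields.length}, id=$orderItemId, subtotal=$subtotal")
// prints "fields=6, id=1, subtotal=299.98"

// A String argument is treated as a regular expression, so "." must be escaped;
// the Char overload splits on the literal character
val byRegex = "a.b.c".split("\\.")
val byChar = "a.b.c".split('.')
```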


* We can access an element of a sequence by passing its zero-based index in parentheses.

In [10]:
List(1, 2, 3, 4, 5)(0)

res6: Int = 1
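
Indexing past the end of a sequence throws an `IndexOutOfBoundsException`. A small sketch of index access together with `lift`, which returns an `Option` instead of throwing; the variable names are illustrative.

```scala
val numbers = List(1, 2, 3, 4, 5)

// Indexing starts at zero, so the last element sits at length - 1
val first = numbers(0)
val last = numbers(numbers.length - 1)

// lift wraps the result in Some, or gives None when the index is out of range
val safeHit = numbers.lift(2)
val safeMiss = numbers.lift(10)

println(s"$first $last $safeHit $safeMiss")
// prints "1 5 Some(3) None"
```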
