# 15. Text Files

## `StreamReader` and `StreamWriter`
---

The `StreamReader` and `StreamWriter` classes belong to the `System.IO` namespace:

In [1]:
using System.IO;

<br>

<br>

### `StreamReader` Class for Reading a Text File

While $C\#$ provides several ways to read files, not all are easy and intuitive to use. This is why we will use the `System.IO.StreamReader` class, which provides the easiest way to read a text file, as it resembles reading from the console.

<br>

#### Identifying a Text File for Reading

Let's say we have the following file, `Fish.txt`, residing in some `ExampleText` directory, which we would like to read:

```
One Fish
Two Fish
Red Fish
Blue Fish
```

<br>

#### Full vs. Relative Paths

When searching for the file, we can use either: 
- a **full path** (e.g. `C:\PathToDirectory\ExampleText\Fish.txt`) 
- a **relative path** to the directory from which the application was started (e.g. `ExampleText\Fish.txt`).

When specifying a path, do not forget to **escape the backslashes**, since a backslash inside a regular string literal starts an escape sequence.    
In $C\#$, you can do this in two ways:

In [2]:
// By escaping the backslashes
string fullPathEscapedSlashes = "C:\\PathToDirectory\\ExampleText\\Fish.txt";

In [3]:
// By using a Verbatim string
string fullPathVerbatim = @"C:\PathToDirectory\ExampleText\Fish.txt";

Although relative paths are harder to work with, because you have to account for a project directory structure that may change over the life of the project, it is **highly recommended to avoid full paths**.

Using the full path to a file is bad practice because it ties your application to one particular environment. If you move it to another computer, you will need to correct the paths to the files. However, if you use a path relative to the current directory (e.g. `ExampleText\Fish.txt`), your program will be easily portable.
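One way to keep a relative path portable is to assemble it with `Path.Combine`, which inserts the directory separator of the current platform instead of a hard-coded backslash. The following is a minimal sketch that reuses the `ExampleText` directory and `Fish.txt` file from above; the variable names are illustrative only, and the printed paths depend on where the program is started:

```c#
using System;
using System.IO;

// Build the relative path from its parts; Path.Combine inserts
// the directory separator appropriate for the current platform
string relativePath = Path.Combine( "ExampleText", "Fish.txt" );

// Resolve it against the current working directory to see
// which file the program would actually try to open
string resolvedPath = Path.GetFullPath( relativePath );

Console.WriteLine( $"Relative: {relativePath}" );
Console.WriteLine( $"Resolved: {resolvedPath}" );
Console.WriteLine( $"Exists:   {File.Exists( resolvedPath )}" );
```

Printing the resolved path is a quick way to diagnose a "file not found" error caused by the application starting in an unexpected directory.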

<br>

#### Reading a Text File Line by Line and Manually Closing the File

In [4]:
// First, we open a stream to the input file 
// by creating an instance of the StreamReader class 
// which reads from our desired input source
StreamReader reader = new StreamReader( @"ExampleText\Fish.txt" );


// We then initialize a variable which will keep track of 
// the current line number 
int lineNumber = 1;


// We may now read in the first line of the input
string currentLine = reader.ReadLine();


// From here, 
// Iterating While the current line still has valid data to be read:
while( currentLine != null )
{

    // Print the current line number, along with the line itself,
    // taking care to simultaneously increment the line number afterwards
    Console.WriteLine( $"Line {lineNumber++}: {currentLine}" );


    // Overwrite the old line with the next line
    currentLine = reader.ReadLine();

}


// After all that,
// The file has been successfully read.
// Close the file stream to avoid resource conflicts
reader.Close();

Line 1: One Fish
Line 2: Two Fish
Line 3: Red Fish
Line 4: Blue Fish
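
The same loop is often written more compactly by reading and testing the line inside the loop condition. The sketch below is one possible variant, not a replacement for the cell above. So that it runs on its own, it assumes a throwaway copy of `Fish.txt` in the temporary directory, created with `File.WriteAllLines` purely as setup:

```c#
using System;
using System.IO;

// Assumption: a throwaway copy of Fish.txt in the temp directory,
// created here only so that the sketch is self-contained
string filePath = Path.Combine( Path.GetTempPath(), "Fish.txt" );
File.WriteAllLines( filePath, new[] { "One Fish", "Two Fish", "Red Fish", "Blue Fish" } );

StreamReader reader = new StreamReader( filePath );
int lineNumber = 1;
string currentLine;

// Read a line, store it, and stop as soon as ReadLine() returns null,
// which signals the end of the file
while( ( currentLine = reader.ReadLine() ) != null )
{
    Console.WriteLine( $"Line {lineNumber++}: {currentLine}" );
}

// Close the file stream to avoid resource conflicts
reader.Close();
```

This prints the same four numbered lines as the cell above, with only one call to `ReadLine()` in the source.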


<br>

#### Reading a Text File Line by Line and Automatically Closing the File

Very often, novice programmers forget to call the `Close()` method, which leaves the file locked for as long as the stream stays open. This leaks resources and can lead to very unpleasant effects such as hanging programs, misbehavior, and strange errors.

To avoid this issue, it is advisable to wrap the stream in a `using` statement, which **closes the file for us** when the block is exited, even if an exception is thrown, as generalized below:

```c#
using ( streamObjectName )
{
    // read the stream object and do something with it
}
```

Below, we'll see how implementing the `using()` construct simplifies the process by **automatically closing the file**:

In [5]:
public void ReadThenClose( string filePath )
{

    // First, we open a stream to the input file 
    // by creating an instance of the StreamReader class 
    // which reads from our desired input source
    StreamReader reader = new StreamReader( filePath );


    // Invoke the using construct to ensure 
    // that the file stream will be closed after processing the data
    using ( reader )
    {

        // We then initialize a variable which will keep track of 
        // the current line number 
        int lineNumber = 1;


        // We may now read in the first line of the input
        string currentLine = reader.ReadLine();


        // From here, 
        // Iterating While the current line still has valid data to be read:
        while( currentLine != null )
        {

            // Print the current line number, along with the line itself,
            // taking care to simultaneously increment the line number afterwards
            Console.WriteLine( $"Line {lineNumber++}: {currentLine}" );


            // Overwrite the old line with the next line
            currentLine = reader.ReadLine();

        }


        // After all that,
        // The file has been successfully read

    }

    // Here, we can be sure that
    // The file stream has been closed automatically

}

In [6]:
ReadThenClose( @"ExampleText\Fish.txt" );

Line 1: One Fish
Line 2: Two Fish
Line 3: Red Fish
Line 4: Blue Fish
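
What the `using` construct does on our behalf can be sketched with an explicit `try`/`finally` block: the cleanup code in `finally` runs whether or not an exception was thrown while reading. This is a rough hand-written equivalent, assuming a throwaway file named `UsingSketch.txt` in the temporary directory (a hypothetical name chosen for this illustration):

```c#
using System;
using System.IO;

// Assumption: a throwaway file in the temp directory,
// created here only so that the sketch is self-contained
string filePath = Path.Combine( Path.GetTempPath(), "UsingSketch.txt" );
File.WriteAllText( filePath, "One Fish" + Environment.NewLine );

StreamReader reader = new StreamReader( filePath );

try
{
    // Work with the stream as usual
    Console.WriteLine( reader.ReadLine() );   // prints "One Fish"
}
finally
{
    // Runs whether or not the try block threw an exception;
    // Dispose() closes the reader and the underlying file
    reader.Dispose();
}

// The file is no longer held open, so it can be deleted
File.Delete( filePath );
Console.WriteLine( File.Exists( filePath ) );   // prints "False"
```

The `using` form is preferable in practice because it is shorter and the cleanup cannot be forgotten.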


<br>

<br>

### Character Encodings

In memory, everything is **stored in binary form**.    
   
This means that it is necessary for text files to be represented digitally, so that they can be stored in memory, as well as on the hard disk. This process is called **encoding** files, or more correctly encoding the characters stored in text files.

The **encoding process** consists of **replacing the text characters** (*letters*, *digits*, *punctuation*, etc.) with **specific sequences of binary values**. You can imagine this as a large table in which each character corresponds to a certain **value** (**sequence of bytes**).
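
To make the "table lookup" idea concrete, the short sketch below encodes a string into bytes and decodes it back again. It uses the UTF-8 encoding through the `Encoding.UTF8` property of `System.Text.Encoding`; the variable names are illustrative:

```c#
using System;
using System.Text;

string text = "Fish";

// Encode: each character is replaced by its byte value(s)
byte[] bytes = Encoding.UTF8.GetBytes( text );
Console.WriteLine( BitConverter.ToString( bytes ) );   // prints "46-69-73-68"

// Decode: the same table, read in the other direction
string decoded = Encoding.UTF8.GetString( bytes );
Console.WriteLine( decoded );   // prints "Fish"
```

Here each of the four letters maps to a single byte, shown in hexadecimal.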

#### The Unicode Standard

The **Unicode Standard** assigns a **code point** (a number) to each character in every supported script. A **Unicode Transformation Format** (**UTF**) is a way to encode that code point.

All character encoding classes in .NET inherit from the `System.Text.Encoding` class, an abstract class that defines the functionality common to all character encodings. An instance for each of the following Unicode encodings can be obtained through its `GetEncoding()` method:

- **UTF-8**, which represents each code point as a sequence of **one to four bytes**

In [7]:
Encoding.GetEncoding("UTF-8")

| Preamble | BodyName | EncodingName | HeaderName | WebName | WindowsCodePage | IsBrowserDisplay | IsBrowserSave | IsMailNewsDisplay | IsMailNewsSave | IsSingleByte | EncoderFallback | DecoderFallback | IsReadOnly | CodePage |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| System.Text.UTF8Encoding+UTF8EncodingSealed | utf-8 | Unicode (UTF-8) | utf-8 | utf-8 | 1200 | True | True | True | True | False | EncoderReplacementFallback  DefaultString: �  MaxCharCount: 1 | DecoderReplacementFallback  DefaultString: �  MaxCharCount: 1 | True | 65001 |


<br>

- **UTF-16**, which represents each code point as a sequence of **one or two 16-bit integers**

In [8]:
Encoding.GetEncoding("UTF-16")

| Preamble | BodyName | EncodingName | HeaderName | WebName | WindowsCodePage | IsBrowserDisplay | IsBrowserSave | IsMailNewsDisplay | IsMailNewsSave | IsSingleByte | EncoderFallback | DecoderFallback | IsReadOnly | CodePage |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| System.Text.UnicodeEncoding | utf-16 | Unicode | utf-16 | utf-16 | 1200 | False | True | False | False | False | EncoderReplacementFallback  DefaultString: �  MaxCharCount: 1 | DecoderReplacementFallback  DefaultString: �  MaxCharCount: 1 | True | 1200 |


<br>

- **UTF-32**, which represents each code point as a **32-bit integer**

In [9]:
Encoding.GetEncoding("UTF-32")

| Preamble | BodyName | EncodingName | HeaderName | WebName | WindowsCodePage | IsBrowserDisplay | IsBrowserSave | IsMailNewsDisplay | IsMailNewsSave | IsSingleByte | EncoderFallback | DecoderFallback | IsReadOnly | CodePage |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| System.Text.UTF32Encoding | utf-32 | Unicode (UTF-32) | utf-32 | utf-32 | 1200 | False | False | False | False | False | EncoderReplacementFallback  DefaultString: �  MaxCharCount: 1 | DecoderReplacementFallback  DefaultString: �  MaxCharCount: 1 | True | 12000 |
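
The difference between the three formats shows up in how many bytes each one needs for the same code point. The sketch below counts them for four sample characters chosen for illustration: the letter A (U+0041), e with acute accent (U+00E9), the euro sign (U+20AC), and the grinning face emoji (U+1F600). It uses the `Encoding.UTF8`, `Encoding.Unicode` (UTF-16), and `Encoding.UTF32` properties:

```c#
using System;
using System.Text;

// Sample characters, written as escape sequences
string[] samples = { "A", "\u00E9", "\u20AC", "\U0001F600" };

foreach( string sample in samples )
{
    // Number of bytes each encoding needs for this one code point
    int utf8Bytes  = Encoding.UTF8.GetByteCount( sample );
    int utf16Bytes = Encoding.Unicode.GetByteCount( sample );
    int utf32Bytes = Encoding.UTF32.GetByteCount( sample );

    // The numeric value of the code point itself
    int codePoint = char.ConvertToUtf32( sample, 0 );

    Console.WriteLine(
        $"U+{codePoint:X4}: UTF-8 = {utf8Bytes}, UTF-16 = {utf16Bytes}, UTF-32 = {utf32Bytes}"
    );
}

// Prints:
// U+0041: UTF-8 = 1, UTF-16 = 2, UTF-32 = 4
// U+00E9: UTF-8 = 2, UTF-16 = 2, UTF-32 = 4
// U+20AC: UTF-8 = 3, UTF-16 = 2, UTF-32 = 4
// U+1F600: UTF-8 = 4, UTF-16 = 4, UTF-32 = 4
```

UTF-8 grows from one to four bytes, UTF-16 uses one or two 16-bit units (two or four bytes), and UTF-32 always uses four bytes.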


<br>

Using this convention, we may optionally specify the character encoding of any given **Stream Object** as generalized below: 

```c#
StreamReader readerName = new StreamReader(
    @"PathTo\SomeFile.txt",
    [Encoding.GetEncoding("SomeEncoding")]
);
```
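
As a concrete instance of the generalized form above, the sketch below saves a short text as UTF-16 and then reads it back while passing the matching encoding to the `StreamReader` constructor. The file name `EncodingSketch.txt` and its location in the temporary directory are assumptions made so the example is self-contained; `File.WriteAllText` is used only to prepare the file:

```c#
using System;
using System.IO;
using System.Text;

// Assumption: a throwaway file in the temp directory, saved as UTF-16
string filePath = Path.Combine( Path.GetTempPath(), "EncodingSketch.txt" );
File.WriteAllText( filePath, "Blue Fish", Encoding.Unicode );

// Pass the matching encoding as the second constructor argument
StreamReader reader = new StreamReader( filePath, Encoding.GetEncoding( "UTF-16" ) );

using( reader )
{
    Console.WriteLine( reader.ReadLine() );                // prints "Blue Fish"
    Console.WriteLine( reader.CurrentEncoding.WebName );   // prints "utf-16"
}
```

Reading a file with an encoding that does not match the one it was saved in typically produces garbled characters rather than an error, so it is worth stating the encoding explicitly when it is known.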

<br>

<br>

### `StreamWriter` Class for Writing to a Text File

The `StreamWriter` class is part of the `System.IO` namespace and is used exclusively for working with text data.   
   
It resembles the `StreamReader` class, but instead of methods for reading, it offers similar methods for **writing to a text file**. Unlike byte-oriented streams, `StreamWriter` accepts text and encodes it into bytes *before* writing it to the desired destination. `StreamWriter` also lets us set a preferred **character encoding** at the time it is created, and features an additional optional **boolean** parameter which indicates whether to **append to** or **overwrite** a file that already exists. 

<br>

We can create an instance of the class as generalized below:

```c#
// The encoding may only be supplied together with the append flag
StreamWriter writerName = new StreamWriter(
    @"PathTo\SomeFile.txt",
    [ [append:] trueOrFalse ],
    [ [encoding:] Encoding.GetEncoding("SomeEncoding") ]
);
```
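
The effect of the append flag can be seen in a small sketch that writes to the same file twice, first overwriting and then appending, using the `StreamWriter( path, append, encoding )` constructor. The file name `WriterSketch.txt` and its location in the temporary directory are assumptions for this illustration, and `File.ReadAllLines` is used only to check the result:

```c#
using System;
using System.IO;
using System.Text;

// Assumption: a throwaway file in the temp directory
string filePath = Path.Combine( Path.GetTempPath(), "WriterSketch.txt" );

// append: false -> create the file, or overwrite it if it already exists
using( StreamWriter writer = new StreamWriter( filePath, append: false, encoding: Encoding.UTF8 ) )
{
    writer.WriteLine( "One Fish" );
}

// append: true -> keep the existing content and add to the end
using( StreamWriter writer = new StreamWriter( filePath, append: true, encoding: Encoding.UTF8 ) )
{
    writer.WriteLine( "Two Fish" );
}

// Read the file back to confirm that both lines are present
foreach( string line in File.ReadAllLines( filePath ) )
{
    Console.WriteLine( line );
}

// Prints:
// One Fish
// Two Fish
```

Running the sketch a second time gives the same two lines, because the first writer starts the file over before the second one appends to it.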

<br>

#### Writing to a New Text File Line by Line and Manually Closing the File

Let's suppose we wanted to write a file named `OneThruTwenty.txt` which **lists every number from 1 to 20**, each on its own line. We may implement that as follows:

In [10]:
// First, we open a stream to the output file 
// by creating an instance of the StreamWriter class 
// which will overwrite the specified file, should it be found to exist,
// or create it, should it not be found to exist 
StreamWriter writer = new StreamWriter( 
    @"ExampleText\OneThruTwenty.txt"
);


// From here, 
// Iterating For every integer from 1 to 20:
for( int currentInteger = 1; currentInteger <= 20; currentInteger++ )
{

    // Write the current integer to a new line in the text file
    writer.WriteLine( currentInteger );

}


// After all that,
// The file has been successfully written.
// Close the file stream to flush any buffered text and avoid resource conflicts
writer.Close();

<br>

After the writing operations are concluded, we may deploy our `ReadThenClose()` method which utilizes `StreamReader` to verify that `OneThruTwenty.txt` was successfully written to:

In [11]:
ReadThenClose( @"ExampleText\OneThruTwenty.txt" );

Line 1: 1
Line 2: 2
Line 3: 3
Line 4: 4
Line 5: 5
Line 6: 6
Line 7: 7
Line 8: 8
Line 9: 9
Line 10: 10
Line 11: 11
Line 12: 12
Line 13: 13
Line 14: 14
Line 15: 15
Line 16: 16
Line 17: 17
Line 18: 18
Line 19: 19
Line 20: 20


<br>

#### Appending Lines to an Existing Text File and Automatically Closing the File

Now, suppose that, rather than creating a new file, we wanted to **add new lines to an existing text file**, without overwriting the lines of text already present.

For example, say we wanted to add the following lines to the `Fish.txt` file:

In [12]:
public string[] linesToAppend = 
{
    "False Fish",
    "True Fish",
    "Me Fish",
    "You Fish"
}; 

<br>

We may implement our desired objective, while simultaneously ensuring to close the file automatically, as follows:

In [13]:
// This time, we will also specify the optional 'append' parameter
// to indicate that we choose not to overwrite the existing data 
StreamWriter writer = new StreamWriter( 
    @"ExampleText\Fish.txt",
    append: true
);


// Invoke the using construct to ensure 
// that the file stream will be closed after processing the data
using( writer )
{

    // From here, 
    // Iterating For Each line in the array of lines to append:
    foreach( string currentLine in linesToAppend )
    {

        // Write the current line to a new line in the text file
        writer.WriteLine( currentLine );

    }

}

// Here, we can be sure that
// The file stream has been closed automatically

<br>

After the writing operations are concluded, we may deploy our `ReadThenClose()` method which utilizes `StreamReader` to verify that `Fish.txt` was successfully appended:

In [14]:
ReadThenClose( @"ExampleText\Fish.txt" );

Line 1: One Fish
Line 2: Two Fish
Line 3: Red Fish
Line 4: Blue Fish
Line 5: False Fish
Line 6: True Fish
Line 7: Me Fish
Line 8: You Fish
