# Reading from a File

A 'file' is what your computer can save and read as **data**.

An incredible amount of data is available in files. 

Files can contain weather data, traffic data, socioeconomic data, literary works, and more. Reading from a file is particularly useful in data analysis applications, but it’s also applicable to any situation in which you want to analyze or modify information stored in a file. 

![](images/cabinet.gif)

## File types

You probably noticed that files are typically of the form `name.suffix`. The `suffix` part tells us about the type of the file. Here are some file ending examples:

* `.txt`    Plain text files
* `.html`   Website files
* `.csv`    Comma-separated values
* `.rtf`    Rich-text formatting
* `.doc`    Document files
* `.pdf`    PDF files

We will get back to this later. For now let's look at a plain text file (`.txt`):

In [None]:
%%bash
cat bones_in_london.txt | head 

Text files can either be read from or written to. Let's have a look at reading first.

## Reading an Entire File

When you want to work with the information in a text file, the first step is to read the file into memory. You can read the entire contents of a file, or you can work through the file one line at a time.

To begin, we need a file with a few lines of text in it. Let’s start with a file that contains the Sherlock Holmes story _"A Study in Scarlet"_:

In [15]:
file_name = 'a_study_in_scarlet.txt'
with open(file_name) as file_pointer:
    contents = file_pointer.read()
    print(contents[:700])





                               A STUDY IN SCARLET

                               Arthur Conan Doyle







                                Table of contents

         Part I
        Mr. Sherlock Holmes
        The Science Of Deduction
        The Lauriston Garden Mystery
        What John Rance Had To Tell
        Our Advertisement Brings A Visitor
        Tobias Gregson Shows What He Can Do
        Light In The Darkness

         Part II
        On The Great Alkali Plain
        The Flower Of Utah
        John Ferrier Talks With The Prophet
        A Flight For Life
        The Avenging Angels
        A Continuation Of The Reminiscences Of John Watson, M.D.
        The Conclusion








Let’s start by looking at the `open()` function. To do any work with a file, even just printing its contents, you first need to open the file to access it. The `open()` function needs one argument: the name of the file you want to open. Python looks for this file in the current working directory, which is usually the directory from which you started the program or notebook.

The `open()` function returns an object representing the file. Here, `open(file_name)` returns an object representing `a_study_in_scarlet.txt`. Python stores this object in `file_pointer`, which we will work with later in the program.

The keyword `with` denotes a *Context Manager*, which essentially wraps a block of code and performs an action at the end of the block, no matter how it exits. In this case, it closes the file once access to it is no longer needed. Notice how we call `open()` in this program but not `close()`. You could open and close the file by calling `open()` and `close()`, but if a bug in your program prevents the `close()` statement from being executed, the file may never close. This may seem trivial, but improperly closed files can cause data to be lost or corrupted. And if you call `close()` too early in your program, you’ll find yourself trying to work with a closed file (a file you can’t access), which leads to more errors. It is not always easy to know exactly when you should close a file, but with the structure shown here, Python will figure that out for you. All you have to do is open the file and work with it as desired, trusting that Python will close it automatically when the time is right.
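To make the role of `with` concrete, here is a minimal sketch that compares it with a manual `try`/`finally`. The file name `demo.txt` and the temporary directory are assumptions made only for this illustration, so that the example can run anywhere.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'demo.txt')
with open(demo_path, 'w') as f:
    f.write('first line\nsecond line\n')

# Manual version: close() has to be guaranteed with try/finally.
manual_file = open(demo_path)
try:
    manual_contents = manual_file.read()
finally:
    manual_file.close()

# Context-manager version: the file is closed when the block ends,
# even if an exception is raised inside the block.
with open(demo_path) as managed_file:
    managed_contents = managed_file.read()

# Both file objects report that they are closed.
print(manual_file.closed, managed_file.closed)
```

Both variants read the same text; the `with` version simply needs less code to get the same guarantee.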

Once we have a file object representing `a_study_in_scarlet.txt`, we use the `read()` method in the third line of our program to read the entire contents of the file and store it as one long string in `contents`. Because the story is long, we print only the first 700 characters (`contents[:700]`); printing `contents` itself would give us the **entire** text file back.

If you print the complete contents, the only difference between the output and the original file is an extra blank line at the end of the output. The blank line appears because the text ends with a newline character and `print()` adds another one. If you want to remove the extra blank line, you can call `rstrip()` on the string before printing it.
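A minimal sketch of what `rstrip()` removes: the string returned by `read()` ends with the final newline character of the file, and `print()` adds one more. The file name `two_lines.txt` is made up for this illustration.

```python
import os
import tempfile

# Write a small file whose last line ends with a newline character.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'two_lines.txt')
with open(demo_path, 'w') as f:
    f.write('line one\nline two\n')

with open(demo_path) as f:
    contents = f.read()

# repr() makes the newline characters visible.
print(repr(contents))
print(repr(contents.rstrip()))
```

Note that `rstrip()` removes all trailing whitespace, not only the newline.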

# Files and folders

Files are stored inside directories (or folders). So a directory structure could look like this:


    python/
    ├── counting_words_project
    │   ├── find_caps_words.py
    │   └── text_analysis.py
    └── hello_world_project
        └── hello.py
        
To find `hello.py` you would have to enter the `python` folder and then the `hello_world_project` folder.

## File Paths

When you pass a simple filename like `a_study_in_scarlet.txt` to the `open()` function, Python looks in the current working directory. If you start your `.py` program from the directory in which it is stored, that is the directory of the program file.

Sometimes, depending on how you organize your work, the file you want to open is not located in the same directory as your program file. Therefore, you can use *relative* and *absolute file paths* as arguments to the `open()` function.

### Relative file paths 

A relative file path tells Python to look for a given location relative to the current working directory. On Linux and macOS, you would write:

```python
with open('text_files/filename.txt') as file_object:
    pass
```

This tells Python to look for the file `filename.txt` in the folder `text_files`, which is assumed to be located inside your current working directory. On Windows systems, paths are conventionally written with a backslash (`\`) instead of a forward slash (`/`). Because the backslash starts an escape sequence in Python strings (`\f` is a form feed), write such paths as raw strings:

```python
with open(r'text_files\filename.txt') as file_object:
    pass
```

### Absolute file paths 
An absolute file path tells the Python interpreter exactly where a file is located, regardless of the current working directory. On Linux and macOS, absolute paths look like:

```python
file_path = '/tmp/text_files/filename.txt' 
with open(file_path) as file_object:
    pass
```

and on Windows they look like this:

```python
# Raw string: otherwise '\U' would be read as the start of an escape sequence.
file_path = r'C:\Users\<username>\AppData\Local\Temp\text_files\filename.txt'
with open(file_path) as file_object:
    pass
```

Note that on Windows, when you are using Git Bash, paths are translated from Windows style to Unix style. That is, a path like `C:\Users\helge\AppData\Local\Temp\a_study_in_scarlet.txt` is represented by `/c/Users/helge/AppData/Local/Temp/a_study_in_scarlet.txt`.
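As a hedged alternative to writing separators by hand, the standard library module `pathlib` can build paths for you. The sketch below reuses the folder and file names from the examples above; using `pathlib` here is a suggestion of this note, not something the examples above depend on.

```python
from pathlib import Path, PureWindowsPath

# Relative path: interpreted against the current working directory.
relative_path = Path('text_files') / 'filename.txt'
print(relative_path.is_absolute())

# Absolute path: the same location, spelled out from the root.
absolute_path = Path.cwd() / relative_path
print(absolute_path.is_absolute())

# A Windows-style path can be inspected on any platform.
# The raw string keeps the backslashes from being read as escapes.
windows_path = PureWindowsPath(
    r'C:\Users\helge\AppData\Local\Temp\a_study_in_scarlet.txt')
print(windows_path.name)
print(windows_path.as_posix())
```

`open()` accepts `Path` objects directly, so `open(relative_path)` works the same way as passing a string.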

## Reading Line by Line

When you are reading a file, you will often want to examine each line of the file. You might be looking for certain information in the file, or you might want to modify the text in the file in some way. 

You can use a `for` loop on the file object to examine each line from a file one at a time.

In [21]:
with open('a_study_in_scarlet.txt') as file_pointer:
    for line in file_pointer:
        print(line)









                               A STUDY IN SCARLET



                               Arthur Conan Doyle















                                Table of contents



         Part I

        Mr. Sherlock Holmes

        The Science Of Deduction

        The Lauriston Garden Mystery

        What John Rance Had To Tell

        Our Advertisement Brings A Visitor

        Tobias Gregson Shows What He Can Do

        Light In The Darkness



         Part II

        On The Great Alkali Plain

        The Flower Of Utah

        John Ferrier Talks With The Prophet

        A Flight For Life

        The Avenging Angels

        A Continuation Of The Reminiscences Of John Watson, M.D.

        The Conclusion







































                                      PART I



                   (Being a reprint from the reminiscences of

                              John H. Watson, M.D.,

                      late of the Army Medical Department.)













     and occasionally applying his tape to the walls in an equally

     incomprehensible manner. In one place he gathered up very carefully a

     little pile of grey dust from the floor, and packed it away in an

     envelope. Finally, he examined with his glass the word upon the wall,

     going over every letter of it with the most minute exactness. This

     done, he appeared to be satisfied, for he replaced his tape and his

     glass in his pocket.



     "They say that genius is an infinite capacity for taking pains," he

     remarked with a smile. "It's a very bad definition, but it does apply

     to detective work."



     Gregson and Lestrade had watched the manoeuvres of their amateur

     companion with considerable curiosity and some contempt. They

     evidently failed to appreciate the fact, which I had begun to

     realize, that Sherlock Holmes' smallest actions were all directed

     towards some definite and practical end.



     "What do you think o

     body of his victim into the empty house. As to the candle, and the

     blood, and the writing on the wall, and the ring, they may all be so

     many tricks to throw the police on to the wrong scent."



     "Well done!" said Holmes in an encouraging voice. "Really, Gregson,

     you are getting along. We shall make something of you yet."



     "I flatter myself that I have managed it rather neatly," the

     detective answered proudly. "The young man volunteered a statement,

     in which he said that after following Drebber some time, the latter

     perceived him, and took a cab in order to get away from him. On his

     way home he met an old shipmate, and took a long walk with him. On

     being asked where this old shipmate lived, he was unable to give any

     satisfactory reply. I think the whole case fits together uncommonly

     well. What amuses me is to think of Lestrade, who had started off

     upon the wrong scent. I am afraid he won't make much of--W

     overcome with astonishment, and on joining him they were affected in

     the same way by the sight which met their eyes.



     On the little plateau which crowned the barren hill there stood a

     single giant boulder, and against this boulder there lay a tall man,

     long-bearded and hard-featured, but of an excessive thinness. His

     placid face and regular breathing showed that he was fast asleep.

     Beside him lay a little child, with her round white arms encircling

     his brown sinewy neck, and her golden haired head resting upon the

     breast of his velveteen tunic. Her rosy lips were parted, showing the

     regular line of snow-white teeth within, and a playful smile played

     over her infantile features. Her plump little white legs terminating

     in white socks and neat shoes with shining buckles, offered a strange

     contrast to the long shrivelled members of her companion. On the

     ledge of rock above this strange couple there stood th

     His first thought was that the prostrate figure was that of some

     wounded or dying man, but as he watched it he saw it writhe along the

     ground and into the hall with the rapidity and noiselessness of a

     serpent. Once within the house the man sprang to his feet, closed the

     door, and revealed to the astonished farmer the fierce face and

     resolute expression of Jefferson Hope.



     "Good God!" gasped John Ferrier. "How you scared me! Whatever made

     you come in like that."



     "Give me food," the other said, hoarsely. "I have had no time for

     bite or sup for eight-and-forty hours." He flung himself upon the

     cold meat and bread which were still lying upon the table from his

     host's supper, and devoured it voraciously. "Does Lucy bear up well?"

     he asked, when he had satisfied his hunger.



     "Yes. She does not know the danger," her father answered.



     "That is well. The house is watched on every side. That is why I

 


     fifty who can reason synthetically for one who can reason

     analytically."



     "I confess," said I, "that I do not quite follow you."



     "I hardly expected that you would. Let me see if I can make it

     clearer. Most people, if you describe a train of events to them, will

     tell you what the result would be. They can put those events together

     in their minds, and argue from them that something will come to pass.

     There are few people, however, who, if you told them a result, would

     be able to evolve from their own inner consciousness what the steps

     were which led up to that result. This power is what I mean when I

     talk of reasoning backwards, or analytically."



     "I understand," said I.



     "Now this was a case in which you were given the result and had to

     find everything else for yourself. Now let me endeavour to show you

     the different steps in my reasoning. To begin at the beginning. I

     approached the hous

When printing each line, we find even more blank lines. These blank lines appear because an invisible newline character is at the end of each line in the text file. The `print()` function adds its own newline each time we call it, so we end up with two newline characters at the end of each line: one from the file and one from `print()`. Again, calling `rstrip()` on each line before printing it eliminates these extra blank lines.

In [22]:
with open('a_study_in_scarlet.txt') as file_pointer:
    for line in file_pointer:
        print(line.rstrip())





                               A STUDY IN SCARLET

                               Arthur Conan Doyle







                                Table of contents

         Part I
        Mr. Sherlock Holmes
        The Science Of Deduction
        The Lauriston Garden Mystery
        What John Rance Had To Tell
        Our Advertisement Brings A Visitor
        Tobias Gregson Shows What He Can Do
        Light In The Darkness

         Part II
        On The Great Alkali Plain
        The Flower Of Utah
        John Ferrier Talks With The Prophet
        A Flight For Life
        The Avenging Angels
        A Continuation Of The Reminiscences Of John Watson, M.D.
        The Conclusion



















                                      PART I

                   (Being a reprint from the reminiscences of
                              John H. Watson, M.D.,
                      late of the Army Medical Department.)





          CHAPTER I
          Mr. Sherlock Holmes


     In t

     hopped over. There is no mystery about it at all. I am simply
     applying to ordinary life a few of those precepts of observation and
     deduction which I advocated in that article. Is there anything else
     that puzzles you?"

     "The finger nails and the Trichinopoly," I suggested.

     "The writing on the wall was done with a man's forefinger dipped in
     blood. My glass allowed me to observe that the plaster was slightly
     scratched in doing it, which would not have been the case if the
     man's nail had been trimmed. I gathered up some scattered ash from
     the floor. It was dark in colour and flakey--such an ash as is only
     made by a Trichinopoly. I have made a special study of cigar
     ashes--in fact, I have written a monograph upon the subject. I
     flatter myself that I can distinguish at a glance the ash of any
     known brand, either of cigar or of tobacco. It is just in such
     details that the skilled detective differs from the Gregson and

     sickish, in spite of my twenty years' experience. From under the door
     there curled a little red ribbon of blood, which had meandered across
     the passage and formed a little pool along the skirting at the other
     side. I gave a cry, which brought the Boots back. He nearly fainted
     when he saw it. The door was locked on the inside, but we put our
     shoulders to it, and knocked it in. The window of the room was open,
     and beside the window, all huddled up, lay the body of a man in his
     nightdress. He was quite dead, and had been for some time, for his
     limbs were rigid and cold. When we turned him over, the Boots
     recognized him at once as being the same gentleman who had engaged
     the room under the name of Joseph Stangerson. The cause of death was
     a deep stab in the left side, which must have penetrated the heart.
     And now comes the strangest part of the affair. What do you suppose
     was above the murdered man?"

     I felt a creep

     "Why don't you say some yourself?" the child asked, with wondering
     eyes.

     "I disremember them," he answered. "I hain't said none since I was
     half the height o' that gun. I guess it's never too late. You say
     them out, and I'll stand by and come in on the choruses."

     "Then you'll need to kneel down, and me too," she said, laying the
     shawl out for that purpose. "You've got to put your hands up like
     this. It makes you feel kind o' good."

     It was a strange sight had there been anything but the buzzards to
     see it. Side by side on the narrow shawl knelt the two wanderers, the
     little prattling child and the reckless, hardened adventurer. Her
     chubby face, and his haggard, angular visage were both turned up to
     the cloudless heaven in heartfelt entreaty to that dread being with
     whom they were face to face, while the two voices--the one thin and
     clear, the other deep and harsh--united in the entreaty for mercy and
     forg

     in different directions. Their concluding words had evidently been
     some form of sign and countersign. The instant that their footsteps
     had died away in the distance, Jefferson Hope sprang to his feet, and
     helping his companions through the gap, led the way across the fields
     at the top of his speed, supporting and half-carrying the girl when
     her strength appeared to fail her.

     "Hurry on! hurry on!" he gasped from time to time. "We are through
     the line of sentinels. Everything depends on speed. Hurry on!"

     Once on the high road they made rapid progress. Only once did they
     meet anyone, and then they managed to slip into a field, and so avoid
     recognition. Before reaching the town the hunter branched away into a
     rugged and narrow footpath which led to the mountains. Two dark
     jagged peaks loomed above them through the darkness, and the defile
     which led between them was the Eagle Cañon in which the horses were
     awaiting

     were, of a third person, who was sure to betray him. Lastly,
     supposing one man wished to dog another through London, what better
     means could he adopt than to turn cabdriver. All these considerations
     led me to the irresistible conclusion that Jefferson Hope was to be
     found among the jarveys of the Metropolis.

     "If he had been one there was no reason to believe that he had ceased
     to be. On the contrary, from his point of view, any sudden chance
     would be likely to draw attention to himself. He would, probably, for
     a time at least, continue to perform his duties. There was no reason
     to suppose that he was going under an assumed name. Why should he
     change his name in a country where no one knew his original one? I
     therefore organized my Street Arab detective corps, and sent them
     systematically to every cab proprietor in London until they ferreted
     out the man that I wanted. How well they succeeded, and how quickly I
     t

## Making a List of Lines from a File

When you use `with`, the file object returned by `open()` is only available inside the `with` block that contains it. If you want to retain access to a file’s contents outside the `with` block, you can store the file’s lines in a list inside the block and then work with that list.

The following example stores the lines of `a_study_in_scarlet.txt` in a list inside the `with` block and then prints the lines outside the `with` block.

In [23]:
filename = 'a_study_in_scarlet.txt'

with open(filename) as file_pointer:
    lines = file_pointer.readlines()

for line in lines:
    print(line.rstrip())

     irretrievably ruined, but with permission from a paternal government



## Working with a File’s Contents

After you have read a file into memory, you can do whatever you want with that data, so let’s briefly explore some lines of the Sherlock Holmes story. First, we’ll build a single string from twenty lines of the file (`lines[100:120]`), with the trailing whitespace of each line removed.

**Warning!** When Python reads from a text file, it interprets all text in the file as a string. If you read in a number and want to work with that value in a numerical context, you will have to convert it to an integer using the `int()` function or convert it to a float using the `float()` function.
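A small sketch of that conversion, assuming a file with one number per line. The file `numbers.txt` and its contents are invented for this illustration.

```python
import os
import tempfile

# Create a throwaway file with one number per line.
tmp_dir = tempfile.mkdtemp()
numbers_path = os.path.join(tmp_dir, 'numbers.txt')
with open(numbers_path, 'w') as f:
    f.write('10\n20\n3.5\n')

with open(numbers_path) as f:
    raw_values = [line.strip() for line in f]

# Every value read from the file is still a string.
print(raw_values)

# Convert explicitly before doing arithmetic.
whole_numbers = [int(value) for value in raw_values[:2]]
decimal_number = float(raw_values[2])
print(sum(whole_numbers) + decimal_number)
```

Calling `int()` on a string such as `'3.5'` raises a `ValueError`, which is why the last value is converted with `float()` instead.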

Python has no inherent limit to how much data you can work with; you can work with as much data as your system’s memory can handle.

In [None]:
filename = 'a_study_in_scarlet.txt'

with open(filename) as file_pointer:
    lines = file_pointer.readlines()

story_string = ''
for line in lines[100:120]:
    story_string += line.rstrip()

print(story_string) 
print(len(story_string))

# A quick note on binary files

Not all files can be read as text. Let's try to open an image file in Python:

In [24]:
with open('images/cabinet.gif') as file_pointer:
    for line in file_pointer:
        print(line)

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe0 in position 6: invalid continuation byte

Uh-oh. What happened? As you know, images are not text. Neither are `.pdf`, `.docx` or `.xlsx` files.

If you try to open these files as text, you run into problems. They are stored as binary digits (0s and 1s) in a very specific order.

## But... what about text files?

Here is the million-dollar question: are text files binary?

For text files you **always** have characters and only characters. That's pretty much the idea of *text*.

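In practice that means a text file is stored as bytes too, but every byte sequence in it can be decoded into a character using an encoding such as UTF-8.

For `.pdf` files, the order in which bytes are stored matters a great deal.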
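If you do need the raw bytes of a non-text file, you can open it in binary mode by adding `'b'` to the mode. The sketch below is an illustration under assumptions: the file `example.bin` and its eight bytes are made up, and they only mimic the start of a GIF file.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
binary_path = os.path.join(tmp_dir, 'example.bin')

# Made-up payload: the GIF signature followed by the byte 0xe0,
# which is not valid UTF-8 in this position.
payload = b'GIF89a\xe0\x01'
with open(binary_path, 'wb') as f:
    f.write(payload)

# Binary mode returns a bytes object and performs no decoding.
with open(binary_path, 'rb') as f:
    raw_bytes = f.read()
print(raw_bytes)

# Reading the same file as UTF-8 text fails.
try:
    with open(binary_path, encoding='utf-8') as f:
        f.read()
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
print(decode_failed)
```

Binary mode is the safe choice whenever you only want to copy or inspect bytes without interpreting them as characters.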

## Recap

* Files and folders
* Reading from files
  * In Python:
    ```python
    with open('file.txt') as file_object:
        ...
    ```
  * Or on the command line: `cat file.txt`
* Difference between text and binary files

# Writing to a File

One of the simplest ways to save data is to write it to a file. When you write text to a file, the output will still be available after you close the terminal containing your program’s output. You can examine output after a program finishes running, and you can share the output files with others as well. You can also write programs that read the text back into memory and work with it again later.

## Writing to an Empty File

To write text to a file, you need to call `open()` with a second argument telling Python that you want to write to the file. To see how this works, let’s write a simple message and store it in a file instead of printing it to the screen.

The call to `open()` in the following example has two arguments. The first argument is still the name of the file we want to open. The second argument, `'w'`, tells the Python interpreter that we want to open the file in write mode. You can open a file in *read mode* (`'r'`), *write mode* (`'w'`), *append mode* (`'a'`), or a mode that allows you to *read and write* to the file (`'r+'`). If you omit the mode argument, Python opens the file in read-only mode by default.

The `open()` function automatically creates the file you are writing to if it does not already exist. However, be careful opening a file in write mode (`'w'`) because if the file does exist, Python will erase the file before returning the file object.

In [25]:
# Reading
filename = 'a_study_in_scarlet.txt'
with open(filename) as file_pointer:
    lines = file_pointer.readlines()

# Processing
story_string = ''
for line in lines[100:120]:
    story_string += line.strip()

# Writing
filename = 'sherlock_copy.txt'
with open(filename, 'w') as file_object:
    file_object.write(story_string)

In [26]:
%%bash
cat sherlock_copy.txt

irretrievably ruined, but with permission from a paternal governmentto spend the next nine months in attempting to improve it.I had neither kith nor kin in England, and was therefore as free asair--or as free as an income of eleven shillings and sixpence a daywill permit a man to be. Under such circumstances, I naturallygravitated to London, that great cesspool into which all the loungersand idlers of the Empire are irresistibly drained. There I stayed forsome time at a private hotel in the Strand, leading a comfortless,meaningless existence, and spending such money as I had, considerablymore freely than I ought. So alarming did the state of my financesbecome, that I soon realized that I must either leave the metropolisand rusticate somewhere in the country, or that I must make acomplete alteration in my style of living. Choosing the latteralternative, I began by making up my mind to leave the hotel, and totake up my quarters in some less pretentious and less expensivedomicile.On the v

In [27]:
import platform


if platform.system() == 'Windows':
    newline = ''
else:
    newline = None
    
with open('sherlock_copy.txt', 'w', newline=newline) as file_object:
    file_object.write(story_string)

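That is, on Windows you can pass `newline=''` when writing a file so that Python writes each `\n` unchanged. Without the argument, Python translates `\n` to the Windows line ending `\r\n`.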

##### Windows

```python
with open('sherlock_copy.txt', 'w', newline='') as file_object:
    file_object.write(story_string)
```

##### MacOS, Linux, other Unixes

```python
with open('sherlock_copy.txt', 'w') as file_object:
    file_object.write(story_string)
```
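To see what the `newline` argument does independently of your operating system, the following sketch writes the same text twice and then inspects the bytes on disk. The file names are invented for this illustration, and `newline='\r\n'` is used only to imitate Windows line endings on any platform.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
text = 'first\nsecond\n'

# newline='' writes the '\n' characters exactly as they are.
untranslated_path = os.path.join(tmp_dir, 'untranslated.txt')
with open(untranslated_path, 'w', newline='') as f:
    f.write(text)

# newline='\r\n' translates every '\n' to a Windows-style line ending.
windows_style_path = os.path.join(tmp_dir, 'windows_style.txt')
with open(windows_style_path, 'w', newline='\r\n') as f:
    f.write(text)

# Read both files back in binary mode to see the bytes on disk.
with open(untranslated_path, 'rb') as f:
    untranslated_bytes = f.read()
with open(windows_style_path, 'rb') as f:
    windows_style_bytes = f.read()

print(untranslated_bytes)
print(windows_style_bytes)
```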

## Writing Multiple Lines

The `write()` method does not add any newlines to the text you write. So if you write more than one line without including newline characters, your file may not look the way you want it to.

In [None]:
filename = 'a_study_in_scarlet.txt'
with open(filename) as file_pointer:
    lines = file_pointer.readlines()
    
story_string = ''
for line in lines[100:120]:
    story_string += line.strip()

filename = 'sherlock_copy.txt'
with open(filename, 'w') as file_object:
    file_object.write(story_string[:50] + '\n')
    file_object.write(story_string[50:100])

In [None]:
%%bash
cat sherlock_copy.txt

## Appending to a File

If you want to add content to a file instead of writing over existing content, you can open the file in append mode. When you open a file in append mode, Python does not erase the file before returning the file object. Any lines you write to the file will be added at the end of the file. If the file does not exist yet, Python will create an empty file for you.
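The difference between the two modes can be sketched with a throwaway file. The name `notes.txt` and the temporary directory are assumptions for this illustration only.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
notes_path = os.path.join(tmp_dir, 'notes.txt')

# Write mode creates the file.
with open(notes_path, 'w') as f:
    f.write('first\n')

# Append mode keeps the existing text and adds to the end.
with open(notes_path, 'a') as f:
    f.write('second\n')
with open(notes_path) as f:
    after_append = f.read()

# Write mode on an existing file erases its previous contents.
with open(notes_path, 'w') as f:
    f.write('third\n')
with open(notes_path) as f:
    after_write = f.read()

print(repr(after_append))
print(repr(after_write))
```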

In [None]:
filename = 'sherlock_copy.txt'
with open(filename, 'a') as file_object:
    file_object.write(story_string[:50] + '\n')
    file_object.write(story_string[50:100] + '\n')
    file_object.write(story_string[150:200] + '\n')

In [None]:
%%bash
cat sherlock_copy.txt

## Exercises!!

  1. Write a program `print_file_contents.py` that opens the file `script1.trtl` and prints every line that it contains to the screen.
  2. Write a new program `print_file_contents_split.py`, which:
    - Opens the file `script1.trtl`, 
    - `split`s every line on the space, 
    - Converts the number value, i.e., the value following the space to an integer, and
    - Prints the two values on separate lines.
    
    Example:
    
    If the file `script1.trtl` contains
    ```
Walk 10
Turn 20
Walk 30
Turn 40
Walk 50
Turn 60
Walk 70
```
    Then your program shall print:
    ```
Walk
10
Turn
20
Walk
30
Turn
40
Walk
50
Turn
60
Walk
70
```    
  3. Extend your first program `print_file_contents.py` so that it reads a file given to it via a command line argument. That is, if it is called via:
  
  ```python print_file_contents.py turtle_program.trtl``` 
  
  it shall print:
```
Walk 10
Turn 20
Walk 30
Turn 40
Walk 50
Turn 60
Walk 70
```
  
  

## The CSV File Format

One simple way to store data in a text file is to write the data as a series of values separated by commas, called comma-separated values. The resulting files are called CSV files. 
For example, here are two lines of population data from Copenhagen in CSV format:

```csv
2015,1,0,5100,614
2015,1,0,5104,2
```


This population data is from the City of Copenhagen. Each row includes the year of reference (2015 in the first row), a code for the neighborhood of the city, the age of the corresponding citizens, a code for their nationality, and finally the number of persons of that nationality and age.


The dataset is described in more detail here: http://data.kk.dk/dataset/befolkningen-efter-ar-bydel-alder-og-statsborgerskab
and the nationality codes are detailed here: http://www.dst.dk/da/Statistik/dokumentation/Times/forebyggelsesregistret/statkode.aspx



CSV files are simple. For example, CSV files:
  * Do not have types for their values—everything is a string
  * Do not have settings for font size or color
  * Do not have multiple worksheets
  * Cannot specify cell widths and heights
  * Cannot have merged cells
  * Cannot have images or charts embedded in them
  
The advantage of CSV files is simplicity. CSV files are widely supported by many types of programs, can be viewed in text editors, and are a straightforward way to represent spreadsheet data. The CSV format is exactly as advertised: It is just a text file of comma-separated values.

However, it is recommended to use a proper library, such as `csv`, to read CSV files instead of reading them as plain text files.
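One reason for this recommendation is quoting. The sketch below compares a naive `split(',')` with `csv.reader` on a single made-up row whose last value contains a comma; the row is an assumption for illustration and is not taken from the Copenhagen dataset.

```python
import csv
import io

# One invented row: the last value contains a comma and is therefore quoted.
csv_text = '2015,1,0,5100,"614,5"\n'

# Splitting on every comma breaks the quoted value into two pieces.
naive_fields = csv_text.strip().split(',')

# csv.reader understands the quotes and keeps the value together.
parsed_fields = next(csv.reader(io.StringIO(csv_text)))

print(naive_fields)
print(parsed_fields)
```

The naive split yields six pieces, while the `csv` module returns the five values the row actually contains.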

### Parsing the CSV File Headers
Python’s `csv` module in the standard library parses the lines in a CSV file and allows us to quickly extract the values we are interested in. Let’s start by examining the first line of the file, which contains a series of headers for the data.

In [None]:
# Befolkningen efter år, bydel, alder og statsborgerskab
# http://data.kk.dk/dataset/befolkningen-efter-ar-bydel-alder-og-statsborgerskab
import csv


filename = 'befkbhalderstatkode.csv'
with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)
    print(header_row)


### Printing the Headers and Their Positions
To make it easier to understand the file header data, print each header and its position, i.e. its index in the header row.

To read data from a CSV file with the `csv` module, you need to create a Reader object, see the call to `csv.reader()` in the following. A Reader object lets you iterate over lines in the CSV file.

In [None]:
filename = 'befkbhalderstatkode.csv'
with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)

    for index, column_header in enumerate(header_row): 
        print(index, column_header)

### Reading Data from Reader Objects in a `for` Loop

For large CSV files, you will want to use the Reader object in a `for` loop.

In [None]:
filename = 'befkbhalderstatkode.csv'
with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)

    for row in reader:
        print(f'Row #{reader.line_num} {row}')
        
        # The following line is only for the example in class
        # as the file is quite big...
        if reader.line_num > 50:
            break

### Extracting and Reading Data

Now that we know which columns of data we need, let’s read in some of that data.

We make an empty set called `ages` and then loop through the remaining rows in the file. The reader object continues from where it left off in the CSV file and automatically returns each line following its current position. Because we have already read the header row, the loop will begin at the second line where the actual data begins. On each pass through the loop, we add the value at index 2, the third column, which stores the age, to the set.

In [None]:
ages = set()

filename = 'befkbhalderstatkode.csv'
with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)

    for row in reader:
        # Note: convert to int, otherwise we would store strings!
        ages.add(int(row[2]))
        
print(sorted(ages))
print(max(ages))

### Writing Data to CSV Files

A Writer object lets you write data to a CSV file. To create a Writer object, you use the `csv.writer()` function.

First, call `open()` and pass it `'w'` to open a file in write mode. This will create the object you can then pass to `csv.writer()` to create a Writer object.

On Windows, you’ll also need to pass a blank string for the `open()` function’s `newline` keyword argument. If you forget to set the `newline` argument, the rows in `output.csv` will be double-spaced, because the Writer object already ends each row with `\r\n` and Windows would translate the `\n` once more. Passing `newline=''` is harmless on other platforms, and the documentation of the `csv` module recommends it everywhere.

The `writerow()` method for Writer objects takes a list argument. Each value in the list is placed in its own cell in the output CSV file. The return value of `writerow()` is the number of characters written to the file for that row (including newline characters). Notice how the Writer object automatically escapes the comma in the value `'614,5'` with double quotes in the CSV file. The `csv` module saves you from having to handle these special cases yourself.
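To look at the quoting and the return value without touching the disk, the following sketch writes one row into an in-memory buffer. Using `io.StringIO` instead of a real file is an assumption made for this illustration.

```python
import csv
import io

# An in-memory text buffer stands in for the output file.
buffer = io.StringIO()
output_writer = csv.writer(buffer)

# writerow() passes on the return value of the buffer's write() call.
characters_written = output_writer.writerow(
    ['2015', '1', '0', '5100', '614,5'])

# repr() shows the quotes around '614,5' and the row terminator.
print(repr(buffer.getvalue()))
print(characters_written)
```

The printed row ends with `\r\n`, the default line terminator of Writer objects, which is also why the `newline` argument of `open()` matters when writing to a real file.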

In [None]:
import csv
import platform


if platform.system() == 'Windows':
    newline=''
else:
    newline=None
    
with open('output.csv', 'w', newline=newline) as output_file:
    output_writer = csv.writer(output_file)
    
    output_writer.writerow(['2015', '1', '0', '5100', '614,5'])
    output_writer.writerow(['2015', '1', '0', '5104', '2,3'])
    output_writer.writerow(['2015', '1', '0', '5106', '1'])
    output_writer.writerow(['2015', '1', '0', '5110', '1'])

In [None]:
%%bash
cat output.csv

In [None]:
import csv

with open('output.csv', 'w', newline='') as output_file:
    output_writer = csv.writer(output_file, delimiter='\t', quotechar='|')
    output_writer.writerow(['2015', '1', '0', '5100', '614\t5'])
    output_writer.writerow(['2015', '1', '0', '5104', '2,3'])
    output_writer.writerow(['2015', '1', '0', '5106', '1'])
    output_writer.writerow(['2015', '1', '0', '5110', '1'])