## Handling files in Python

Working with files in Python is very similar to working with a book in the physical world. When you want to get something from a book you first have to open it. You can then read it in its natural order, skip some lines and pages, or write in it.

The same rules apply to files in Python. You can opt to open a file by referencing its name and then indicating whether you want to read from the file or write to it.

A file in Python is a named location used to permanently store some information.

Let's start off by opening a file.

Before we can read from or write to a file, we first need to open it. You can open a file using the built-in open() function:

In [5]:
file = open('file.txt')
file

<_io.TextIOWrapper name='file.txt' mode='r' encoding='cp1252'>

From this output we can see that if we don't specify a mode, the default is read ('r').

We can also specify the mode in which we want to open our file: for example, whether we want to read from the file, append new text to it or write to it. Let's see what happens when we open the file in write mode ('w'). Be aware that opening an existing file in 'w' mode erases its current contents.

In [6]:
file = open('file.txt','w')
file

<_io.TextIOWrapper name='file.txt' mode='w' encoding='cp1252'>
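Because 'w' mode empties an existing file as soon as it is opened, it is worth seeing this once on a file you don't care about. The sketch below is only an illustration: the temporary directory and the name truncate_demo.txt are made up for this example and are not part of the notebook.

```python
import os
import tempfile

# Hypothetical scratch file, so the example never touches file.txt.
path = os.path.join(tempfile.mkdtemp(), "truncate_demo.txt")

# Create the file with some text in it.
with open(path, "w", encoding="utf-8") as f:
    f.write("some existing text")
size_before = os.path.getsize(path)

# Re-opening in 'w' mode empties the file, even though we write nothing.
with open(path, "w", encoding="utf-8") as f:
    pass
size_after = os.path.getsize(path)

print(size_before, size_after)
```

If you want to keep what is already in the file, open it in 'a' (append) or 'r+' (read and write) mode instead.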

You can also specify the encoding. The default depends on the platform you are using. If you are using Windows the default will often be 'cp1252', whereas if you are using Linux the default will usually be 'utf-8'.
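Since the default encoding differs between platforms, one way to get the same behaviour everywhere is to pass the encoding explicitly. Here is a minimal sketch, assuming a throwaway file called encoding_demo.txt in a temporary directory (both are invented for this example):

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "encoding_demo.txt")

# Write and read with the same explicit encoding instead of the platform default.
with open(path, "w", encoding="utf-8") as f:
    f.write("caf\u00e9")  # the word "cafe" with an accented e

with open(path, "r", encoding="utf-8") as f:
    text = f.read()
    used_encoding = f.encoding  # the encoding this file object is using

print(used_encoding)
```

Using the same encoding for writing and reading means accented characters come back unchanged, whichever operating system runs the code.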

Now let's try and read a few characters from the file. We can read by specifying the number of characters we want to read from the file we have opened. Let us try and read the first three characters from our file:

In [20]:
file = open('file.txt')
file.read(3)

'Two'
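Each call to read() continues from where the previous one stopped, and seek(0) moves the position back to the start. The sketch below illustrates this on a small scratch file; the file name read_demo.txt is an assumption made for the example.

```python
import os
import tempfile

# Hypothetical scratch file containing the first sentence of the story.
path = os.path.join(tempfile.mkdtemp(), "read_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("Two men went bear hunting.")

with open(path, encoding="utf-8") as f:
    first = f.read(3)    # the first three characters
    second = f.read(4)   # the next four characters, not the first four
    f.seek(0)            # jump back to the beginning of the file
    again = f.read(3)    # the first three characters once more

print(repr(first), repr(second), repr(again))
```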

We can also choose to read the entire file in one go. We can do this by leaving out the number inside the brackets that specifies how many characters we want to read:

In [22]:
file = open('file.txt')
file.read()

'Two men went bear hunting. While one stayed in the cabin, the other went out looking for a bear. He soon found a huge bear, shot at it but only wounded it. The enraged bear charged toward him, he dropped his rifle and started running for the cabin as fast as he could.\nHe ran pretty fast but the bear was just a little faster and gained on him with every step. Just ashe reached the open cabin door, he tripped and fell flat. Too close behind to stop, the bear tripped over him and went rolling into the cabin.\nThe man jumped up, closed the cabin door and yelled to his friend inside, "You skin this one while I go and get another one!"'

Once you have opened the file you can also opt to use a loop to read the file line by line. This can work either with a while statement or a for loop.

With a while statement you can read each line in the file until there are no lines remaining. Let's look at a practical example of this.

In [36]:
file = open('file.txt')
while True:
    line = file.readline()
    if len(line) == 0:
        break
    print(line,end="")
file.close()

Two men went bear hunting. While one stayed in the cabin, the other went out looking for a bear. He soon found a huge bear, shot at it but only wounded it. The enraged bear charged toward him, he dropped his rifle and started running for the cabin as fast as he could.
He ran pretty fast but the bear was just a little faster and gained on him with every step. Just ashe reached the open cabin door, he tripped and fell flat. Too close behind to stop, the bear tripped over him and went rolling into the cabin.
The man jumped up, closed the cabin door and yelled to his friend inside, "You skin this one while I go and get another one!"

With this while loop we start by opening the file 'file.txt'. Once it is open, we read one line at a time using the readline() method, which returns the next line of the file including its trailing newline. We then check the length of the line with the built-in len() function, which returns the number of items in a sequence such as a string, list or tuple. At the end of the file readline() returns an empty string, so when the length is 0 we break out of the while loop to ensure we don't keep reading once we have run out of lines.
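You may wonder whether an empty line in the middle of the file would stop the loop early. It does not, because a blank line is returned as '\n', which has length 1. Here is a small sketch of the same loop, using a made-up scratch file named blank_line_demo.txt:

```python
import os
import tempfile

# Hypothetical scratch file with a blank line in the middle.
path = os.path.join(tempfile.mkdtemp(), "blank_line_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("first line\n\nthird line\n")

lines = []
with open(path, encoding="utf-8") as f:
    while True:
        line = f.readline()
        if line == "":       # only happens at the end of the file
            break
        lines.append(line)   # a blank line arrives as "\n", not as ""

print(lines)
```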

Generally speaking it is good practice to close a file once we are done using it.

If you are using a with statement to open and read from your file, you do not need to close your file, as the with statement will take care of closing the file for you. Here is an example:

In [37]:
with open('file.txt') as file:
    for line in file:
        print(line)

Two men went bear hunting. While one stayed in the cabin, the other went out looking for a bear. He soon found a huge bear, shot at it but only wounded it. The enraged bear charged toward him, he dropped his rifle and started running for the cabin as fast as he could.

He ran pretty fast but the bear was just a little faster and gained on him with every step. Just ashe reached the open cabin door, he tripped and fell flat. Too close behind to stop, the bear tripped over him and went rolling into the cabin.

The man jumped up, closed the cabin door and yelled to his friend inside, "You skin this one while I go and get another one!"
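The blank lines in this output appear because every line read from the file already ends with a newline and print() adds another one. One possible way around this is to strip the newline before printing. The sketch below also checks that the file really is closed once the with block ends; the scratch file with_demo.txt is an assumption made for the example.

```python
import os
import tempfile

# Hypothetical scratch file with two short lines.
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("line one\nline two\n")

with open(path, encoding="utf-8") as f:
    # rstrip("\n") removes the trailing newline that each line carries.
    stripped = [line.rstrip("\n") for line in f]

# The with statement has already closed the file for us.
closed_after_with = f.closed

for line in stripped:
    print(line)
```

Another option is to keep the lines as they are and call print(line, end=""), as in the while loop shown earlier.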


### Writing in a file

Now let's imagine a scenario where you want to write some text to the file using the write() method. You may either want to create a new file and write some text to it or to write to an existing file.

Let's start off by opening our current file and writing new text to it. In this case you are trying to append text to an existing file; to do this, pass 'a' as the mode in your open() call. If you are a bit confused about this, you might find this resource helpful: http://book.pythontips.com/en/latest/open_function.html

In [38]:
with open('file.txt','a') as file:
    file.write('How would you respond if you were the friend')


Let's recap on what we just did. We opened the file file.txt with the intention of appending some new text to it. We want to add this new line at the end of the file.
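Note that write() does not add a newline for you, so appended text continues directly after the last character already in the file. If you want the new text on its own line, you can include '\n' yourself. A minimal sketch, using a made-up scratch file named append_demo.txt:

```python
import os
import tempfile

# Hypothetical scratch file standing in for an existing file.
path = os.path.join(tempfile.mkdtemp(), "append_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("original text")

# 'a' mode keeps the existing content and writes at the end of the file.
with open(path, "a", encoding="utf-8") as f:
    f.write("\nappended text")  # the leading \n starts a new line

with open(path, encoding="utf-8") as f:
    content = f.read()

print(content)
```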

**What if I want to prepend a line of text to the beginning of a file?**
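'a' and 'a+' modes only allow you to append text to the end of the file; the pointer moves to the end of the file before any writing is done. If you want to write at some other point than the end of the file, you can open it in 'r+' mode, which places the pointer at the beginning. Note that writing in 'r+' mode overwrites the characters already at that position rather than inserting in front of them, as the output of this example shows: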

In [50]:
with open('file.txt','r+') as file:
    file.write('This is not based on a real life:')
    file.seek(0)
    content =  file.read()
    print(content)

This is not based on a real life: 
 stayed in the cabin, the other went out looking for a bear. He soon found a huge bear, shot at it but only wounded it. The enraged bear charged toward him, he dropped his rifle and started running for the cabin as fast as he could.
He ran pretty fast but the bear was just a little faster and gained on him with every step. Just ashe reached the open cabin door, he tripped and fell flat. Too close behind to stop, the bear tripped over him and went rolling into the cabin.
The man jumped up, closed the cabin door and yelled to his friend inside, "You skin this one while I go and get another one!"How would you respond if you were the friend
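If you want to keep everything that was already in the file and put new text in front of it, one common approach is to read the old content first and then write the new text followed by the old content. This is a sketch under stated assumptions rather than the only way to do it; the scratch file prepend_demo.txt and the header text are invented for the example.

```python
import os
import tempfile

# Hypothetical scratch file standing in for an existing file.
path = os.path.join(tempfile.mkdtemp(), "prepend_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("Two men went bear hunting.\n")

with open(path, "r+", encoding="utf-8") as f:
    original = f.read()   # remember the existing content
    f.seek(0)             # move back to the start of the file
    # The new content is longer than the old one, so nothing stale is left behind.
    f.write("Header line\n" + original)

with open(path, encoding="utf-8") as f:
    content = f.read()

print(content)
```

This reads the whole file into memory, which is fine for small text files like the one used here.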


Writing to a new file is a lot easier. To create a new text file we need to specify the name of the new file in our open() call and then select the write mode ('w') to write text to this new file.

In [53]:
with open('newfile.txt','w') as file:
    file.write('This is a new file \n We are adding spaces between each sentence using backslash n.')


When we write new text to the file we can use \n to add a line break between sentences.

Let's take a look at what our file looks like

In [55]:
with open('newfile.txt') as file:
    for line in file:
        print(line)

This is a new file 

 We are adding spaces between each sentence using backslash n.
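The second line above starts with a space because the string we wrote had a space after \n. If you have several sentences, one way to avoid stray spaces is to keep them in a list and join them with '\n'. The sketch below assumes a scratch file called lines_demo.txt and two example sentences of our own:

```python
import os
import tempfile

# Hypothetical scratch file and example sentences.
path = os.path.join(tempfile.mkdtemp(), "lines_demo.txt")
sentences = ["This is a new file.", "Each sentence is on its own line."]

# Join the sentences with newlines and end the file with a final newline.
with open(path, "w", encoding="utf-8") as f:
    f.write("\n".join(sentences) + "\n")

# splitlines() returns the lines without their trailing newlines.
with open(path, encoding="utf-8") as f:
    read_back = f.read().splitlines()

print(read_back)
```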


We strongly recommend that you read these sources to further develop your understanding of file handling in Python

Sources:
* i.) Chapter 13, How to Think Like a Computer Scientist: Learning with Python 3 - http://openbookproject.net/thinkcs/python/english3e/files.html
* ii.) Working with file I/O in Python - https://dbader.org/blog/python-file-io

## Introduction to Pandas

The name Pandas is commonly expanded as "Python Data Analysis Library". From the name you can probably already guess that this library is very important for work involving data analysis, data cleanup and data exploration. For this portion of the notebook we are going to focus on introducing what the Pandas library can do and its background, and we will use a real dataset to give you a practical understanding of how you can use the library.

In [31]:
pandas data structures
loading data
subsetting and filtering
calculating summary statistics 



In [33]:
list(file)

['Unnamed: 0',
 'Unnamed: 0.1',
 'miles',
 'location',
 'name',
 'sex',
 'w',
 'l',
 'd',
 'division',
 'from',
 'to',
 'players_links',
 'global_id',
 '86',
 '87',
 '88',
 '89',
 '90',
 '91',
 '92',
 '93',
 '94',
 '95',
 '96',
 '97',
 '98',
 '99',
 '100',
 'date1',
 'firstBoxerRating1',
 'firstBoxerWeight1',
 'JudgeID1',
 'Links1',
 'location1',
 'metadata1',
 'numberofrounds1',
 'outcome1',
 'rating1',
 'referee1',
 'secondBoxer1',
 'secondBoxerLast61',
 'secondBoxerRating1',
 'secondBoxerRecord1',
 'secondBoxerWeight1',
 'titles1',
 'date2',
 'firstBoxerRating2',
 'firstBoxerWeight2',
 'JudgeID2',
 'Links2',
 'location2',
 'metadata2',
 'numberofrounds2',
 'outcome2',
 'rating2',
 'referee2',
 'secondBoxer2',
 'secondBoxerLast62',
 'secondBoxerRating2',
 'secondBoxerRecord2',
 'secondBoxerWeight2',
 'titles2',
 'date3',
 'firstBoxerRating3',
 'firstBoxerWeight3',
 'JudgeID3',
 'Links3',
 'location3',
 'metadata3',
 'numberofrounds3',
 'outcome3',
 'rating3',
 'referee3',
 'secondBox