# Reading from and writing to files
Reading from and writing to files, known as input/output (I/O for short), is an essential feature of many programming languages. In Python, I/O operations involve three broad steps: 
1. Opening the file
2. Reading or writing to the file (or both)
3. Closing the file

Opening the file is done via the built-in `open` function. In its simplest form, the function takes two arguments: the path of the file we would like to open (either absolute, e.g. C:/users/..., or relative to the current working directory, which is usually the folder containing the script or notebook), and a mode. The most common modes are: 
1. `r` (reading) is the mode used to read a file. The file must already exist
2. `w` (writing) is the mode used to write to a file. If the file doesn't exist, it is created. If the file already exists, **it is overwritten**
3. `a` (appending) is used to append to a file. If the file doesn't exist, it is created. 
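
As a minimal sketch of how the three modes behave, the cell below works on a scratch file in a temporary directory. The file name `modes.txt` and the use of `tempfile` are assumptions made only so the example can run anywhere without touching real files.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not part of the course data)
path = os.path.join(tempfile.mkdtemp(), 'modes.txt')

# 'w' creates the file, or empties it if it already exists
f = open(path, 'w')
f.write('first\n')
f.close()

# 'a' keeps the existing content and adds to the end
f = open(path, 'a')
f.write('second\n')
f.close()

# 'r' reads the file back
f = open(path, 'r')
contents = f.read()
f.close()
print(contents)

# 'r' does not create anything: opening a missing file raises an error
missing_raises = False
try:
    open(os.path.join(tempfile.mkdtemp(), 'missing.txt'), 'r')
except FileNotFoundError:
    missing_raises = True
print(missing_raises)
```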

## 1. Reading from a file

In [1]:
#Open the file with the open function
#This function returns a new object which represents a virtual file
file = open('misc/sample.txt', 'r')

type(file)

_io.TextIOWrapper

In [2]:
#Let's read the contents of the file
contents = file.read()
print(contents)

Did you know the average person falls asleep in 7 minutes?
Did you know 8% of people have an extra rib?


In [3]:
#Let's now close the file
#Unless you close it, the file remains open, and other software may be unable to modify or delete it (notably on Windows)
file.close()

In [4]:
#An alternative way is to read line by line, iterating over the file
file = open('misc/sample.txt', 'r')

for line in file: 
    print(line)

#never forget to close the file! 
file.close()

Did you know the average person falls asleep in 7 minutes?

Did you know 8% of people have an extra rib?
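
The blank lines in the output above appear because every line read from the file still ends with a newline character, and `print` adds one of its own. One possible way to avoid this is to strip the trailing newline first. The sketch below recreates the two sentences in a temporary file (an assumption, so that it runs without `misc/sample.txt`).

```python
import os
import tempfile

# Hypothetical copy of the sample file, written to a temporary directory
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
file = open(path, 'w')
file.write('Did you know the average person falls asleep in 7 minutes?\n')
file.write('Did you know 8% of people have an extra rib?\n')
file.close()

file = open(path, 'r')
lines = []
for line in file:
    # Each line ends with '\n'; rstrip('\n') removes it before printing
    clean = line.rstrip('\n')
    lines.append(clean)
    print(clean)
file.close()
```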


In [5]:
#You can also read one line at a time with readline(): each call returns the next line and moves the file cursor forward
#For now, just be aware this exists, and look into it further if you ever need to handle multi-gigabyte files
file = open('misc/sample.txt', 'r')

line = file.readline()
print(line)

file.close()

Did you know the average person falls asleep in 7 minutes?
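
To illustrate how the cursor moves, here is a small sketch that calls `readline` repeatedly until the end of the file. It relies on the fact that `readline` returns an empty string once nothing is left, whereas a blank line inside the file is returned as `'\n'`. The temporary file and its contents are made up for the example.

```python
import os
import tempfile

# Hypothetical three-line file; the second line is intentionally blank
path = os.path.join(tempfile.mkdtemp(), 'three_lines.txt')
file = open(path, 'w')
file.write('first line\n\nthird line\n')
file.close()

file = open(path, 'r')
count = 0
while True:
    line = file.readline()
    # An empty string (not '\n') signals the end of the file
    if line == '':
        break
    count += 1
file.close()

print(count)
```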



## 2. Writing to files

In [6]:
#Create (or overwrite an existing) file in the 'w' mode
file = open('misc/output.txt', 'w')

#Write into the file using the write method
file.write("I'm going to make him an offer he can't refuse.")

#Finally, close the file
file.close()

In [7]:
#Be aware that the 'w' mode overwrites any existing file 
file = open('misc/output.txt', 'w')

#Write into the file using the write method
file.write("This file is not going to include the first sentence I wrote above")

#Finally, close the file
file.close()

In [8]:
#Want to check that? 
#Let's read the file
file = open('misc/output.txt', 'r')
print(file.read())
file.close()

This file is not going to include the first sentence I wrote above


## 3. Appending to a file
As we said earlier, the `w` mode **overwrites** any existing content as soon as you call the `open` function. If you want to append to an existing file (or create it if it doesn't exist), use the `a` mode. 

In [9]:
file = open('misc/output-2.txt', 'w')
file.write('This is the first line of the text')
file.write('\n') #write a newline character
file.write('This is the second line of the text')
file.write('\n') #write a newline character
file.close()

#Now let's add (append) some more content to that file
file = open('misc/output-2.txt', 'a')
file.write('This is the third line of the text')
file.close()

#Let's inspect the contents of the file
file = open('misc/output-2.txt', 'r')
print(file.read())
file.close()

This is the first line of the text
This is the second line of the text
This is the third line of the text
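
Note that `write` never adds a newline for you, which is why the cell above writes `'\n'` explicitly. When the content is already in a list, one option is to loop over it and append the newline to each item. This is only a sketch: the list contents and the temporary file are invented for the example.

```python
import os
import tempfile

# Hypothetical output file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), 'lines.txt')

lines = ['first', 'second', 'third']

file = open(path, 'w')
for text in lines:
    # Add the newline ourselves, since write() does not
    file.write(text + '\n')
file.close()

# Read the file back and split it into lines (without the newlines)
file = open(path, 'r')
read_back = file.read().splitlines()
file.close()

print(read_back)
```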


## 4. The context manager
The above is intended to show you how to read from and write to a file. In practice, we will almost always use the following syntax, which does not require us to close the file (it is closed automatically at the end of the `with` block).

In [10]:
with open('misc/sample.txt', 'r') as file:
    print(file.read())

Did you know the average person falls asleep in 7 minutes?
Did you know 8% of people have an extra rib?


**Why do we use the context manager?**
If an error occurs in your script between the moment you open a file and the moment you close it, the file may be left open by the system. Have you ever tried deleting a file (Excel, Word) which was open at the same time? To avoid this problem, use a context manager: this is an alternative syntax that makes sure the file is closed, regardless of whether an error occurs or not. 

In [11]:
#Try this carefully
file = open('misc/fail.txt', 'w')
file.write("This will certainly fail")
file.write("\n")
file.write("1 divided by 0 is {}".format(1/0))

#This is useless, as the code will crash on the above line
file.close()

#If you now try to delete the file (via your file browser), it may tell you that you can't (this is typical on Windows)
#The file is still held open by the Python process, even though your code crashed! 

ZeroDivisionError: division by zero

In [12]:
with open('misc/output.txt', 'w') as file: 
    file.write("This will certainly fail")
    file.write("\n")
    file.write("1 divided by 0 is {}".format(1/0))
    
#As above, the script failed. 
#But the file was closed anyway

ZeroDivisionError: division by zero

In [13]:
#The above is roughly equivalent to the below: 
file = open('misc/output.txt', 'w')
try: 
    file.write("This will certainly fail")
    file.write("\n")
    file.write("1 divided by 0 is {}".format(1/0))
finally: 
    #The finally block always runs, error or not; the error itself is still raised afterwards
    file.close()

ZeroDivisionError: division by zero
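
One way to check this for yourself is to inspect the `closed` attribute of the file object after the error. The sketch below catches the error so the cell can keep running, and writes to temporary files (both are choices made for this example, not part of the cells above).

```python
import os
import tempfile

folder = tempfile.mkdtemp()  # hypothetical scratch folder

# With a context manager: the file is closed even though an error occurred
try:
    with open(os.path.join(folder, 'with.txt'), 'w') as file:
        file.write("This will certainly fail\n")
        1 / 0
except ZeroDivisionError:
    pass
closed_with = file.closed

# Without a context manager: the close() call is never reached
file2 = open(os.path.join(folder, 'plain.txt'), 'w')
try:
    file2.write("This will certainly fail\n")
    1 / 0
    file2.close()
except ZeroDivisionError:
    pass
closed_plain = file2.closed

print(closed_with, closed_plain)

# Clean up the file that was left open
file2.close()
```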

## Practice
### Problem 1
You operate a metal warehouse, where producers and consumers can collect and deposit metal inventories. Every time a client comes to the warehouse, your system records the amount (in tons) of metal that entered (with a `+` sign) or left the warehouse (with a `-` sign), and prints it to a file (see `data/warehouse.txt`). Using this logfile: 
1. Print the first three lines, and the last three lines
2. How many different metals does your metal warehouse store?
3. How many tons of copper entered the warehouse? How many tons of aluminium left the warehouse? 
4. Which metals had net withdrawals? 


### Problem 2
**Auction the land**. You are the owner of a huge swath of land (1000km by 1000km) in North Dakota, rich in shale oil. You have decided to auction your territory. You received a list of bids (see `data/bids.txt`), which your intern compiled in a single text file. Each bid is for a rectangular cut of your territory, defined by a width and a height (in km), starting at a particular set of coordinates (distance from the left edge, distance down) relative to the upper-leftmost point of your territory. 

Each bid contains the following information: 
- Bid unique identifier (e.g. #124)
- Kilometres from the left edge (i.e. to the right of the reference point)
- Kilometres down from the reference point
- Kilometres wide
- Kilometres tall

For example, bid `#123 @ 2,5: 4x3` is Dolan's bid (#123), starting 2km from the left edge and 5km down, and is 4km wide and 3km tall. Assuming `X` is your reference point, this bid would be for the area marked with `O` in the diagram below:

```
  0123456789
 0X.........
 1..........
 2..........
 3..........
 4..........
 5..OOOO....
 6..OOOO....
 7..OOOO....
 8..........
 9..........
```
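
As a starting point, a single bid string could be broken into its five numbers with a regular expression. This is only a sketch: the helper name `parse_bid`, the dictionary keys, and the assumption that every line in `data/bids.txt` follows exactly the format of the example are all hypothetical.

```python
import re

# Pattern for the example format '#123 @ 2,5: 4x3' (assumed to hold for every line)
BID_PATTERN = re.compile(r'#(\d+) @ (\d+),(\d+): (\d+)x(\d+)')

def parse_bid(line):
    """Turn one bid line into a dictionary of integers."""
    match = BID_PATTERN.match(line.strip())
    if match is None:
        raise ValueError('Unexpected bid format: {!r}'.format(line))
    bid_id, left, down, width, height = (int(group) for group in match.groups())
    return {'id': bid_id, 'left': left, 'down': down,
            'width': width, 'height': height}

bid = parse_bid('#123 @ 2,5: 4x3')
area = bid['width'] * bid['height']  # surface in square kilometres
print(bid, area)
```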

Using the file, answer the following: 
- How many bids did you receive? 
- What is the total surface bid? What is then the average bid size (expressed in square kilometres)?
- How many bids are taller than they are wide? What is the largest bid? 
- You receive a call from bidder `#451`. He is wondering whether there are any other bidders for the area (or part thereof) he wants to buy. 
- Compute the number of square km for which there are overlapping bids.

### Problem 3
You are a young financial analyst at a private bank, and need to automate security analysis. Using the historical price (total return) series of the S&P 500 stock index (see `data/SP500TR.txt`), perform the following analysis.
1. Create a dictionary where you map dates to values. Each key (date) in the dictionary should be a tuple of the form (yyyy, mm, dd). Values should be stored as floating-point numbers.
2. Compute end-of-quarter values. What is the largest consecutive number of quarterly increases? How about quarterly decreases? 
3. Compute the daily percentage returns. 
4. What is the largest single-day index percentage increase? What is the largest single-day index percentage decrease?
5. "Sell in May and go away". You want to test the adage. How often does the month of May contribute positively to the index? Does the answer change if the index posted positive returns up to the first trading day of May?
6. Where would the index stand on 3 June 2019 if an investor missed the single-best trading day each year?
7. Compute a new series with the 20-day moving average. Compute a new series with the 200-day moving average. 
8. Backtest a strategy where you hold the index only when the 20-day moving average is above the 200-day moving average. What is your total return over the tradeable period?


### Problem 4
You are a detective at the Federal Bureau of Investigation (FBI). You have been tipped off by an informant about a money-laundering scheme possibly involving hundreds of Bitcoin accounts. Your informant provided you with a list of large bitcoin transactions that took place this week.

Using the list of transactions in `data/transactions.txt`: 
1. How many unique account numbers are in the file? 
2. What is the average size of transaction? 
3. Which three accounts had the largest number of individual outgoing transactions? Which three accounts had the largest aggregate outgoing transfers? Does one account stand out in particular? 
4. Compile a list of accounts to which this suspicious account sent bitcoins.  
5. You decide to investigate further and try to follow the money. Can you identify the ultimate beneficiary of the transactions that the suspicious account initiated?