# Read and Write files
At this point, you have all (hopefully) shown in the homework that you know how to access the Wikipedia API. Moreover, using the code from the last class or today's exercise, you should be able to scrape a simple web page such as WikiLeaks or a Google Scholar search. Now the question is: what do you do next with the data you acquired? All the data is stored in a variable in the Notebook, but you would probably like to write it out either to your local machine or to Google Drive. Once we are done with saving the data, we will show you how to load it back from either Google Drive or your local machine.

## Write the data
Let's start with a really simple example. Imagine that, for some strange reason, we stored the whole text of the WikiLeaks article on Fishrot in a variable called `text`. The code below does just that.




In [None]:
## import packages
import requests
from bs4 import BeautifulSoup 

## get the response
response = requests.get('https://wikileaks.org/fishrot/')

In [None]:
## convert to Beautiful Soup object
html = BeautifulSoup(response.content, 'html.parser')
## collect the text of every matching element in a list
text = []
for p in html.select('div.leak-content'):
  text.append(p.text)

## join the pieces into a single string, one piece per line
text = "\n".join(text)

In [None]:
print(text)

So having the above code in the Notebook is a cool thing. Most likely it will work the next time we want to use it. However, it would be a bit pointless to execute the whole code every time we want to analyze the text about Fishrot. One obvious way around this would be to print the text and then copy and paste it into the desired file. Although this is not the most effective way of doing it, let's think for a moment about what this process looks like. What does your friend do when they want to save a text from the Internet to a file? They usually proceed in the following way:
1. Copy the text
2. Open a file
3. Paste the text
4. Save the file

So in Python, you do a very similar thing. Let's say we want to save this `text` variable:

In [None]:
## We start with opening the file
file = open('fishrot3.txt', mode='w')

## Then we have to put there something
file.write(text)

## And at last we close the file
file.close()

Actually, the last step is important, although in Colab it might behave a bit differently (I mean it might not raise an error). In normal Python, if you do not close the file it will explode. Not really, but the text you wrote may still be sitting in a buffer rather than on disk, so you might end up with an error when you try to load the file, or with a file that has nothing in it. However, there is a much smarter and more popular way of writing something out to a file.

In [None]:
with open('fishrot.txt', 'w') as file:
  file.write(text)

What happened above? Using the `with` statement, we told Python to open a file called `fishrot.txt` in write mode and assign it to a variable `file`. Then we wrote the variable `text` to `fishrot.txt` and closed the file. The good thing about the `with` statement is that as soon as you leave the indented block, the file is safely closed.
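A quick way to convince yourself that `with` really closes the file is to look at the `closed` attribute of the file object. The sketch below is only an illustration: the file name `with_demo.txt` and the sample string are made up for this example, and the file is placed in a temporary folder so that nothing in your workspace is touched.

```python
import os
import tempfile

# Hypothetical file in a temporary folder, used only for this demonstration
demo_path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')

with open(demo_path, 'w') as file:
    file.write('some text')
    inside = file.closed   # the file is still open inside the block

outside = file.closed      # leaving the block has closed the file

print(inside, outside)     # prints: False True
```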

OK, so far so good. But our problem is not solved yet, because we have only moved our data from the notebook to our workspace, and we still do not have it in a place where we can access it easily. If we want to download it to our local machine, we have two fairly easy options: we can either click on the file and press download, or download it from Python. To do it from Python, we just load an additional package and use a simple function called `download`.

In [None]:
## We need to load package files to do it in Python
from google.colab import files
## And then just use function download
files.download('fishrot.txt') 

This is all very good, but how do we move the file we just created to our Google Drive? Imagine that you are not working on your own computer but would like to save the results on your Google Drive without actually downloading them. Here again there are two ways to do it, but this time we will focus on the Python way since, in the end, it is just easier.

In [None]:
## First, we need to load package drive
from google.colab import drive

## And then just use function mount to connect workspace with Google Drive
drive.mount('/content/drive')

If everything went well, after you execute the code, you should see `Mounted at /content/drive`. It just means that the connection between your Google Drive and Colab workspace has been established. To write something on your Google Drive you just need to add the following path to the name of your file: `/content/drive/My Drive/`.

In [None]:
## Define the path
gd_path = '/content/drive/My Drive/'
## Write the file to a file in Google Drive
with open(gd_path+'fishrot.txt', 'w') as file:
  file.write(text)
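Gluing the folder and the file name together with `+` works only as long as the folder ends with a slash. A slightly more forgiving variant, sketched below, uses `os.path.join`, which adds the separator only when it is missing. The helper `save_text` is a hypothetical name invented for this example, and a temporary folder stands in for the Google Drive path so the code also runs outside Colab.

```python
import os
import tempfile

def save_text(folder, filename, text):
    """Hypothetical helper: write text to folder/filename and return the full path."""
    path = os.path.join(folder, filename)   # adds the separator only if needed
    with open(path, 'w') as file:
        file.write(text)
    return path

# A temporary folder stands in for '/content/drive/My Drive/' here
folder = tempfile.mkdtemp()
saved_path = save_text(folder, 'fishrot.txt', 'some text')

# Check that the file really ended up where we expected it
print(os.path.exists(saved_path))
```

In Colab you would pass `gd_path` as the folder after mounting the drive.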

OK, but what happens if we want to add something to our file? Would it be enough to just download another article and save it to the same file? Actually, there are two answers to that question: yes and no. If we write something to our file before closing it, then yes, it is appended to the end of the file. However, if we close the file and open it again in mode `w`, we overwrite the information that was there. So are we doomed, unable to append anything to our files? Obviously not. The function `open` also allows us to open a file, put the cursor at the end of it, and write something there. We just need to use mode `a` instead of `w`.
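The difference between mode `w` and mode `a` is easiest to see on a tiny file. The following is a minimal sketch, assuming a throwaway file called `append_demo.txt` in a temporary folder; the file name and the strings are made up for this illustration.

```python
import os
import tempfile

# Hypothetical file in a temporary folder, used only for this demonstration
demo_path = os.path.join(tempfile.mkdtemp(), 'append_demo.txt')

# Mode 'w' creates the file (or empties an existing one)
with open(demo_path, 'w') as file:
    file.write('first article\n')

# Mode 'a' keeps the old content and writes at the end
with open(demo_path, 'a') as file:
    file.write('second article\n')

with open(demo_path, 'r') as file:
    appended = file.read()

# Opening with 'w' again throws the previous content away
with open(demo_path, 'w') as file:
    file.write('third article\n')

with open(demo_path, 'r') as file:
    overwritten = file.read()

print(repr(appended))      # 'first article\nsecond article\n'
print(repr(overwritten))   # 'third article\n'
```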

**Exercise 1.** Download two articles from WikiLeaks and save the content into one file called `wikileaks.txt`. Please, move the file to your Google Drive (ideally without downloading it and uploading it again).

In [None]:
## Exercise 1.

import requests
from bs4 import BeautifulSoup 

links = ['https://wikileaks.org/popeorders/',
         'https://wikileaks.org/fishrot/']

## Write both articles straight to Google Drive (the drive must be mounted first)
with open('/content/drive/My Drive/wikileaks.txt', 'w') as file:
  for link in links:
    response = requests.get(link)
    html = BeautifulSoup(response.content, 'html.parser')
    text = []
    for p in html.select('div.leak-content'):
      text.append(p.text)
    text = "\n".join(text)
    ## End each article with a newline so the two do not run together
    file.write(text + "\n")


## Read a file
Now everyone should have the file called `wikileaks.txt` saved safely on their Google Drive. But how do we load it into Python? Yes, you guessed right: it is a similar process to writing it out. First we need to open the file, and then read its contents. We are again going to use the `with` statement instead of opening and closing the file by hand.

In [None]:
with open('/content/drive/My Drive/wikileaks.txt', 'r') as file:
  text = file.read()

Yes, it is as simple as that. However, before we move on to writing more interesting data than just simple strings, let's first see whether we can read the text not as a string but as a list. You remember that we added the end-of-line character (`\n`) when we saved the file, right? Let's try to read the file line by line instead of reading it as one long string.

In [None]:
with open('/content/drive/My Drive/wikileaks.txt', 'r') as file:
  lines = file.readlines()

lines
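To see how `read` and `readlines` differ without scrolling through a whole article, it helps to try them on a two-line file. The sketch below assumes a throwaway file `lines_demo.txt` in a temporary folder (the name and its content are invented for this example). It also shows a third option, looping over the file object directly, which is handy when you want to strip the trailing `\n` from each line.

```python
import os
import tempfile

# Hypothetical two-line file in a temporary folder
demo_path = os.path.join(tempfile.mkdtemp(), 'lines_demo.txt')
with open(demo_path, 'w') as file:
    file.write('first line\nsecond line\n')

# read() returns the whole file as one string
with open(demo_path, 'r') as file:
    as_string = file.read()

# readlines() returns a list with one string per line, '\n' included
with open(demo_path, 'r') as file:
    as_lines = file.readlines()

# Looping over the file gives the same lines one by one;
# rstrip('\n') removes the end-of-line character
with open(demo_path, 'r') as file:
    stripped = [line.rstrip('\n') for line in file]

print(repr(as_string))   # 'first line\nsecond line\n'
print(as_lines)          # ['first line\n', 'second line\n']
print(stripped)          # ['first line', 'second line']
```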

This is again very good, but what about uploading a file from your local machine? Again, you can either use the graphical interface or do it in Python: either press Upload or just type the following code.

In [None]:
## We need to load package files to do it in Python
from google.colab import files
## And then just use function upload
files.upload()

## JSON

Let's now move to something more interesting, namely writing and loading files in the `json` format. You must remember that `json` looks a lot like a mapping, also known as a `dictionary`, right? So let's first create a dictionary so we have some real data.