# File handling

```py
with open(path, option) as name:
    statements
    ...
```

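Options (the `mode` argument of `open`)
- "r" - read (the default; raises an error if the file doesn't exist)
- "a" - append to the end of a file (creates the file if it doesn't exist)
- "w" - write (creates the file if it doesn't exist, overwrites the contents if it does)
- "x" - create a new file, error if it already exists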
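
The differences between the modes are easiest to see by trying them. The sketch below is only an illustration: it works in a temporary directory so no real files are touched, and the file names `demo.txt` and `missing.txt` are placeholders.

```python
import os
import tempfile

# Work in a throwaway directory; the file names are made up for this demo
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.txt")
    missing_path = os.path.join(tmp_dir, "missing.txt")

    # "r" needs an existing file
    try:
        open(missing_path, "r")
        r_failed = False
    except FileNotFoundError:
        r_failed = True

    # "x" creates the file ...
    with open(demo_path, "x") as f:
        f.write("first\n")

    # ... and fails if the file is already there
    try:
        open(demo_path, "x")
        x_failed = False
    except FileExistsError:
        x_failed = True

    # "a" adds to the end of the existing content
    with open(demo_path, "a") as f:
        f.write("second\n")
    with open(demo_path, "r") as f:
        after_append = f.read()

    # "w" empties the file before writing
    with open(demo_path, "w") as f:
        f.write("third\n")
    with open(demo_path, "r") as f:
        after_write = f.read()

print(r_failed, x_failed)
print(repr(after_append))
print(repr(after_write))
```

Note that "w" silently discards whatever was in the file, so "x" is the safer choice when an existing file must not be overwritten.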

In [3]:
path = "../Data/quotes.txt"

with open(path, "r") as f:
    text = f.read()

print(text)

  If     we     knew what it was      we were doing, it would not be called research,          would it?     - Albert Einstein

Time is a drug. Too       much of it kills you.  -  Terry Pratchett


 An expert is a person who       has made all the mistakes that           can be made in a          very narrow field - Niels Bohr

   Everything must be made as simple as possible. But not simpler. - Albert Einstein     


  Nothing in life                is to be feared, it is only to be understood. Now is the time to understand more, so that we may fear less. - Marie  Curie  

If I have seen further     it is by standing on the shoulders of Giants. - Isaac Newton


## Cleaning up quotes.txt

- Inspect the txt file manually (some prankster has added random noise in the form of extra whitespace and newlines)
- Remove leading and trailing whitespace
- Remove excess whitespace between words
- Add quote numbers

In [24]:
import re

with open(path, "r") as f_read, open("../Data/quotes_clean.txt", "w") as f_write:
    quote_number = 1
    
    # Loops through each line in the text file
    for quote in f_read:
        quote = quote.strip(" \n") # Removes leading and trailing spaces and newlines
        quote = re.sub(" +", " ", quote) # RegEx: replaces each run of 1 or more spaces with a single space
        
        # Write to the new file only if the line is not empty
        if quote != "":
            f_write.write(f"{quote_number}. {quote}\n")
            print(f"{quote_number}. {quote}")
            quote_number += 1

1. If we knew what it was we were doing, it would not be called research, would it? - Albert Einstein
2. Time is a drug. Too much of it kills you. - Terry Pratchett
3. An expert is a person who has made all the mistakes that can be made in a very narrow field - Niels Bohr
4. Everything must be made as simple as possible. But not simpler. - Albert Einstein
5. Nothing in life is to be feared, it is only to be understood. Now is the time to understand more, so that we may fear less. - Marie Curie
6. If I have seen further it is by standing on the shoulders of Giants. - Isaac Newton
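
The same cleanup can also be done without `re`. The sketch below is an alternative, not the approach used in the cell above: it relies on `str.split()` with no arguments, which splits on any run of whitespace (tabs included) and ignores leading and trailing whitespace. The list `raw_lines` stands in for the file and holds two lines copied from quotes.txt plus an empty one.

```python
# Two noisy lines copied from quotes.txt, plus an empty one
raw_lines = [
    "Time is a drug. Too       much of it kills you.  -  Terry Pratchett\n",
    "\n",
    "   Everything must be made as simple as possible. But not simpler. - Albert Einstein     \n",
]

cleaned = []
for line in raw_lines:
    # split() with no arguments splits on any run of whitespace
    # and drops leading/trailing whitespace, including "\n"
    words = line.split()
    if words:  # skip lines that were empty or only whitespace
        cleaned.append(" ".join(words))

# enumerate(..., start=1) supplies the quote numbers
for quote_number, quote in enumerate(cleaned, start=1):
    print(f"{quote_number}. {quote}")
```

With a real file, `raw_lines` would be replaced by the open file object, which yields one line at a time.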


## Pick out the authors

- Find the quote lines (each one starts with its quote number)
- Extract first name and last name (the last two words of the line)
- Join into full name
- Get unique values

In [25]:
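with open("../Data/quotes_clean.txt", "r") as f_quotes:
    # Keeps only the numbered quote lines (they start with a digit), so blank
    # lines and an "Authors:" line from an earlier run are skipped
    # Strips away "\n"
    quotes = [quote.strip("\n") for quote in f_quotes if quote[:1].isdigit()]

# The last two words of each quote are the first name and last name
authors = [quote.split()[-2:] for quote in quotes]
print(authors)

# Set - gives the unique elements
authors = set([" ".join(author) for author in authors])
print(authors)

# Appends after reading is finished; sorted() gives a stable order
# and join() avoids a trailing ", "
with open("../Data/quotes_clean.txt", "a") as f_append:
    f_append.write("\nAuthors: " + ", ".join(sorted(authors)) + "\n")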

[['Albert', 'Einstein'], ['Terry', 'Pratchett'], ['Niels', 'Bohr'], ['Albert', 'Einstein'], ['Marie', 'Curie'], ['Isaac', 'Newton']]
{'Terry Pratchett', 'Isaac Newton', 'Albert Einstein', 'Niels Bohr', 'Marie Curie'}
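
Taking the last two words works here because every author in the file has exactly two names. A possible variation, assuming each line ends with `" - "` followed by the author, is to split once from the right on that separator. In the sketch below the third line is hypothetical (it is not in the file) and is only there to show a three-word name.

```python
quotes = [
    "2. Time is a drug. Too much of it kills you. - Terry Pratchett",
    "6. If I have seen further it is by standing on the shoulders of Giants. - Isaac Newton",
    "7. Placeholder quote. - First Middle Last",  # hypothetical line, not in the file
]

authors = set()
for quote in quotes:
    # rsplit(" - ", 1) splits only once, at the last " - " in the line,
    # so everything after it is treated as the author
    _, author = quote.rsplit(" - ", 1)
    authors.add(author)

print(sorted(authors))
```

A line without the `" - "` separator would raise a `ValueError` in the unpacking step, so this variation only fits files where the format is known to be consistent.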


In [20]:
name = [["Pontus", "Gillenang"]]
" ".join(name[0])

'Pontus Gillenang'