## Reading Files with Open

In this section, we will use Python's built-in open function to create a file object and obtain the data from a .txt file.

We can open the file example.txt as follows:

In [1]:
file1 = open('example.txt','r')

We can now use the file object to obtain information about the file.
- We can use the data attribute name to get the name of the file:

In [2]:
file1.name

'example.txt'

We can see what mode the object is in using the data attribute mode:

In [3]:
file1.mode

'r'

You should always close the file object using the method close:

In [4]:
file1.close()

We can use a with statement to avoid needing to close the file manually:

In [7]:
with open('example.txt','r') as file1:
    file_stuff = file1.read()
    print(file_stuff)

print(file1.closed) # Prints True if the file is closed
print(file_stuff)

This is example.txt
True
This is example.txt


The with statement runs everything within the indented block and then automatically closes the file.
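
The automatic close also applies when the block ends with an error. The following is a minimal sketch of that behaviour; the file with_demo.txt and the simulated ValueError are assumptions made for illustration, not part of the original example.

```python
import os
import tempfile

# Hypothetical file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')
with open(path, 'w') as setup_file:
    setup_file.write('This is with_demo.txt\n')

try:
    with open(path, 'r') as file1:
        file_stuff = file1.read()
        # Simulate a failure part-way through the block.
        raise ValueError('simulated failure')
except ValueError:
    pass

# The file was closed even though the block raised an exception.
print(file1.closed)
print(file_stuff)
```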

We can output every line as an element in a list using the method readlines()

In [8]:
with open('example.txt','r') as file1:
    file_stuff = file1.readlines()
    print(file_stuff)

['This is example.txt\n', 'This is line 2\n', 'This is line 3']


We can use the method readline() to read the first line in the file. 

In [10]:
with open('example.txt','r') as file1:
    file_stuff = file1.readline()
    print(file_stuff)

This is example.txt



The first time it is called, it will save the first line in the variable 'file_stuff'. The second time it's called, it will save the second line, and so on.
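
As a small sketch of repeated calls, the code below writes a hypothetical three-line file (lines_demo.txt, an assumption for illustration) and calls readline four times. Once the end of the file is reached, readline returns an empty string.

```python
import os
import tempfile

# Hypothetical three-line file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'lines_demo.txt')
with open(path, 'w') as setup_file:
    setup_file.write('This is line 1\nThis is line 2\nThis is line 3')

with open(path, 'r') as file1:
    first = file1.readline()      # first line, including its newline
    second = file1.readline()     # second line
    third = file1.readline()      # third line (no trailing newline in this file)
    after_end = file1.readline()  # nothing left to read

print(repr(first))
print(repr(second))
print(repr(third))
print(repr(after_end))
```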

We can pass the maximum number of characters to read as an argument to the methods 'readline' and 'read'. readline never reads past the end of the current line, whereas read continues across lines and, with no argument, reads the entire file.

In [17]:
with open('example.txt','r') as file1:
    file_stuff = file1.readline(2)
    print(file_stuff)
    file_stuff = file1.readline(16)
    print(file_stuff)

Th
is is example.tx
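
To contrast the two methods, here is a minimal sketch using a hypothetical file chars_demo.txt (an assumption for illustration). readline stops at the end of the line even when asked for more characters, while read carries on into the following lines.

```python
import os
import tempfile

# Hypothetical three-line file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'chars_demo.txt')
with open(path, 'w') as setup_file:
    setup_file.write('AB\nCD\nEF')

with open(path, 'r') as file1:
    # Asks for up to 100 characters but stops at the end of the first line.
    from_readline = file1.readline(100)
    # Reads exactly 4 characters, continuing past the newline after 'CD'.
    from_read = file1.read(4)

print(repr(from_readline))
print(repr(from_read))
```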


### Writing Files with open
We can use Python's open function to get a file object to create a text file. We can apply the method write to write data to that file.

We can create the file example2.txt as follows:

In [21]:
file1 = open('example2.txt','w')
file1.close()

When using the 'w' mode, if the file already exists in the directory, it will be overwritten. Otherwise, it will be created. As before, we can use the with statement:

In [24]:
with open('example2.txt','w') as file1:
    file1.write('This is line A\n')
    file1.write('This is line B\n')

This creates the file example2.txt and writes the two lines above to it. Each call to the write method adds its string after the text already written while the file is open.
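
The overwriting behaviour of 'w' mode can be checked with a short sketch. The file overwrite_demo.txt and its contents are assumptions for illustration: opening the same path in 'w' mode a second time discards what was written the first time.

```python
import os
import tempfile

# Hypothetical file, created only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'overwrite_demo.txt')

with open(path, 'w') as file1:
    file1.write('old line 1\n')
    file1.write('old line 2\n')

# Opening in 'w' mode again empties the file before writing.
with open(path, 'w') as file1:
    file1.write('new line\n')

with open(path, 'r') as file1:
    contents = file1.read()

print(repr(contents))
```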

We can set the mode to append using 'a'. If the file already exists, its contents are kept and each call to write adds to the end of the file; if it does not exist, it is created.

In [26]:
with open('example2.txt','a') as file1:
    file1.write('This is line C\n')
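
A minimal sketch of append mode follows, using a hypothetical file append_demo.txt (an assumption for illustration). The first open creates the file because it does not exist yet, and the second adds to the end without removing the earlier line.

```python
import os
import tempfile

# Hypothetical path in a fresh temporary directory; the file does not exist yet.
path = os.path.join(tempfile.mkdtemp(), 'append_demo.txt')

# 'a' creates the file when it is missing.
with open(path, 'a') as file1:
    file1.write('This is line A\n')

# Opening with 'a' again keeps the existing text and writes after it.
with open(path, 'a') as file1:
    file1.write('This is line B\n')

with open(path, 'r') as file1:
    lines = file1.readlines()

print(lines)
```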

We can copy one file to another as follows:

In [27]:
with open('example2.txt','r') as readfile:
    with open('example2copy.txt','w') as writefile:
        for line in readfile:
            writefile.write(line)

- First, we open example2.txt for reading and interact with it via the file object readfile.
- Then, we create a new file, example2copy.txt, and interact with it via the file object writefile.
- The for loop takes one line at a time from readfile and writes it to the new file.
- This repeats until the end of the file is reached; both files are closed when their with blocks exit.
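
The same steps can be wrapped in a reusable function. This is only a sketch: the name copy_text_file and the demo file names are hypothetical, and it opens both files in a single with statement instead of nesting two.

```python
import os
import tempfile


def copy_text_file(source_path, target_path):
    """Copy a text file line by line (hypothetical helper for illustration)."""
    # Both files are closed when this single with block exits.
    with open(source_path, 'r') as readfile, open(target_path, 'w') as writefile:
        for line in readfile:
            writefile.write(line)


# Hypothetical source file, created only for this illustration.
folder = tempfile.mkdtemp()
source = os.path.join(folder, 'source_demo.txt')
target = os.path.join(folder, 'target_demo.txt')
with open(source, 'w') as setup_file:
    setup_file.write('This is line A\nThis is line B\n')

copy_text_file(source, target)

with open(target, 'r') as check_file:
    copied = check_file.read()

print(repr(copied))
```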

### Loading data with Pandas

In [28]:
import pandas as pd

One way pandas allows you to work with data is with a DataFrame.

For this example, we will create a DataFrame out of a dictionary:

In [31]:
songs = {
    'Artist':['Michael Jackson','AC/DC','Pink Floyd','Whitney Houston','Meat Loaf','Eagles','Bee Gees','Fleetwood Mac'],
    'Album':['Thriller','Back in Black','The Dark Side of the Moon','The Bodyguard','Bat Out of Hell','Their Greatest Hits','Saturday Night Fever','Rumours'],
    'Released':[1982,1980,1973,1992,1977,1976,1977,1977],
    'Length':['00:42:19','00:42:11','00:42:49','00:57:44','00:46:33','00:43:08','01:15:54','00:40:01'],
    'Genre':['pop, rock, R&B','hard rock','progressive rock','R&B, soul, pop','hard rock, progressive rock','rock, soft rock, folk rock','disco','soft rock'],
    'Music Recording Sales (millions)':[46.0,26.1,24.2,27.4,20.6,32.2,20.6,27.9],
    'Claimed Sales (millions)':[65,50,45,44,43,42,40,40],
    'Released.1':['30-Nov-82','25-Jul-80','01-Mar-73','17-Nov-92','21-Oct-77','17-Feb-76','15-Nov-77','04-Feb-77'],
    
}
songs_frame = pd.DataFrame(songs)
songs_frame.head()

|   | Album | Released | Length |
|---|---|---|---|
| 0 | Thriller | 1982 | 00:42:19 |
| 1 | Back in Black | 1980 | 00:42:11 |
| 2 | The Dark Side of the Moon | 1973 | 00:42:49 |
| 3 | The Bodyguard | 1992 | 00:57:44 |
| 4 | Bat Out of Hell | 1977 | 00:46:33 |


The keys in the dictionary become the column headers, and each value is a list holding that column's entries, one per row.
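
The mapping from dictionary to table can be checked on a smaller example. This sketch uses a hypothetical two-column dictionary named albums (an assumption for illustration) and inspects the resulting column names and shape.

```python
import pandas as pd

# Hypothetical, smaller dictionary: each key is a column, each list its entries.
albums = {
    'Album': ['Thriller', 'Back in Black', 'Rumours'],
    'Released': [1982, 1980, 1977],
}
albums_frame = pd.DataFrame(albums)

column_names = list(albums_frame.columns)  # taken from the dictionary keys
shape = albums_frame.shape                 # (number of rows, number of columns)
released = albums_frame['Released'].tolist()

print(column_names)
print(shape)
print(released)
```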

#### Working with data in Pandas
- unique() - determines the unique elements in the column of a DataFrame

In [32]:
songs_frame['Released'].unique()

array([1982, 1980, 1973, 1992, 1977, 1976], dtype=int64)
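
As a short sketch of what unique() does, the code below applies it to a standalone Series built from the same release years (the variable names are assumptions for illustration). Values are returned in order of first appearance, and nunique() gives the count of distinct values.

```python
import pandas as pd

# Same release years as the 'Released' column above, as a standalone Series.
released = pd.Series([1982, 1980, 1973, 1992, 1977, 1976, 1977, 1977],
                     name='Released')

unique_years = released.unique().tolist()  # duplicates dropped, first-seen order kept
distinct_count = released.nunique()        # number of distinct values

print(unique_years)
print(distinct_count)
```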