## Working with Files

open() returns a file object, and is most commonly used with two arguments: open(filename, mode).

In [13]:
f = open('/tmp/workfile', 'w')

# write some data to the file
f.write("This is the first line of the file!\n")
f.write("This is lthe second line...")

# close the file when done.
f.close()

The first argument is a string containing the filename. The second argument is another string containing a few characters describing the way in which the file will be used. mode can be 'r' when the file will only be read, 'w' for writing only (an existing file with the same name will be erased), or 'a' for appending, in which case any data written to the file is automatically added to the end. 'r+' opens the file for both reading and writing. The mode argument is optional; 'r' will be assumed if it’s omitted.
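The effect of the 'w', 'a', and default 'r' modes can be seen in a minimal sketch like the one below. The scratch path and the variable names are made up for illustration; the with blocks simply close each file when the indented code finishes.

```python
import os
import tempfile

# Hypothetical scratch file; any writable path would do.
mode_path = os.path.join(tempfile.mkdtemp(), "modes.txt")

# 'w' creates the file, or erases it if it already exists.
with open(mode_path, "w") as out:
    out.write("first\n")

# 'a' keeps the existing contents and adds to the end.
with open(mode_path, "a") as out:
    out.write("second\n")

# Omitting the mode is the same as passing 'r'.
with open(mode_path) as src:
    after_append = src.read()

# Opening with 'w' again discards everything written so far.
with open(mode_path, "w") as out:
    out.write("replaced\n")

with open(mode_path, "r") as src:
    after_rewrite = src.read()

print(repr(after_append))
print(repr(after_rewrite))
```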

Normally, files are opened in text mode, which means that you read and write strings from and to the file, and those strings are encoded in a specific encoding. If encoding is not specified, the default is platform dependent (see [open()](https://docs.python.org/3/library/functions.html#open)). Appending 'b' to the mode opens the file in binary mode: the data is then read and written in the form of bytes objects. This mode should be used for all files that don’t contain text.
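As a small illustration of the difference, the sketch below writes a string in text mode with an explicit encoding and then reads the same file back in binary mode. The file name and sample text are assumptions chosen for the example.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
enc_path = os.path.join(tempfile.mkdtemp(), "encoded.txt")

# Text mode: str in, str out. The encoding is stated explicitly
# instead of being left to the platform default.
with open(enc_path, "w", encoding="utf-8") as out:
    out.write("café")

# Binary mode: the same file comes back as a bytes object.
with open(enc_path, "rb") as src:
    raw = src.read()

# Text mode again: the bytes are decoded back into a str.
with open(enc_path, "r", encoding="utf-8") as src:
    decoded = src.read()

print(type(raw), len(raw))          # the accented letter takes two bytes in UTF-8
print(type(decoded), len(decoded))
```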

In text mode, the default when reading is to convert platform-specific line endings (\n on Unix, \r\n on Windows) to just \n. When writing in text mode, the default is to convert occurrences of \n back to platform-specific line endings. This behind-the-scenes modification to file data is fine for text files, but will corrupt binary data like that in JPEG or EXE files. Be very careful to use binary mode when reading and writing such files.
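The translation can be made visible on any platform by forcing Windows-style endings with the newline argument of open() and then inspecting the file in binary mode. This is only a sketch; the scratch file name is an assumption.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
nl_path = os.path.join(tempfile.mkdtemp(), "newlines.txt")

# newline="\r\n" makes text mode translate every \n to \r\n on write,
# which mimics the Windows default regardless of where this runs.
with open(nl_path, "w", newline="\r\n") as out:
    out.write("one\ntwo\n")

# Binary mode shows the bytes exactly as they are stored.
with open(nl_path, "rb") as src:
    stored = src.read()

# Text mode with default newline handling converts \r\n back to \n.
with open(nl_path, "r") as src:
    seen = src.read()

print(stored)
print(repr(seen))
```

The same translation applied to a JPEG or EXE file would alter any byte that happens to look like a line ending, which is why such files need binary mode.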

It is good practice to use the with keyword when dealing with file objects. The advantage is that the file is properly closed after its suite finishes, even if an exception is raised at some point. Using with is also much shorter than writing equivalent try-finally blocks:

In [14]:
with open('/tmp/workfile') as f:
    read_data = f.read()
    
print(read_data)

This is the first line of the file!
This is lthe second line...


If you’re not using the with keyword, then you should call f.close() to close the file and immediately free up any system resources used by it. If you don’t explicitly close a file, Python’s garbage collector will eventually destroy the object and close the open file for you, but the file may stay open for a while. Another risk is that different Python implementations will do this clean-up at different times.
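For comparison, here is a rough sketch of the try-finally form next to the with form. Both leave the file closed; the scratch path and variable names are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
tf_path = os.path.join(tempfile.mkdtemp(), "workfile")

with open(tf_path, "w") as out:
    out.write("some text\n")

# Manual version: close() sits in finally, so it runs even if read() raises.
manual = open(tf_path)
try:
    manual_data = manual.read()
finally:
    manual.close()
closed_after_finally = manual.closed

# with version: the file is closed when the indented block ends.
with open(tf_path) as managed:
    with_data = managed.read()
closed_after_with = managed.closed

print(closed_after_finally, closed_after_with)
```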

After a file object is closed, either by a with statement or by calling f.close(), attempts to use the file object will automatically fail.

In [11]:
f.read()

ValueError: I/O operation on closed file.

### Methods of File Objects

In [17]:
f = open('/tmp/workfile', 'r+')

To read a file’s contents, call f.read(size), which reads some quantity of data and returns it as a string (in text mode) or bytes object (in binary mode). size is an optional numeric argument. When size is omitted or negative, the entire contents of the file will be read and returned; it’s your problem if the file is twice as large as your machine’s memory. Otherwise, at most size characters (in text mode) or size bytes (in binary mode) are read and returned. If the end of the file has been reached, f.read() will return an empty string ('') in text mode or an empty bytes object (b'') in binary mode.
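One common use of the size argument is to read a file in fixed-size pieces until the empty string signals the end. The sketch below uses its own scratch file and its own variable names so that it does not disturb the f opened above; the chunk size of 4 is arbitrary.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
chunk_path = os.path.join(tempfile.mkdtemp(), "chunks.txt")

with open(chunk_path, "w") as out:
    out.write("abcdefghij")

chunks = []
with open(chunk_path) as src:
    while True:
        chunk = src.read(4)   # at most 4 characters per call in text mode
        if chunk == "":       # an empty string means end of file
            break
        chunks.append(chunk)

print(chunks)
```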

For reading lines from a file, you can loop over the file object. This is memory efficient, fast, and leads to simple code:

In [18]:
for line in f:
    print(line, end='')

This is the first line of the file!
This is lthe second line...

f.tell() returns an integer giving the file object’s current position in the file. In binary mode this is the number of bytes from the beginning of the file; in text mode it is an opaque number.

To change the file object’s position, use f.seek(offset, from_what). The position is computed from adding offset to a reference point; the reference point is selected by the from_what argument. A from_what value of 0 measures from the beginning of the file, 1 uses the current file position, and 2 uses the end of the file as the reference point. from_what can be omitted and defaults to 0, using the beginning of the file as the reference point.

In [19]:
f.close()  # close the text-mode file object opened earlier before reusing the name
f = open('/tmp/workfile', 'rb+')

In [20]:
f.write(b'0123456789abcdef')

16

In [21]:
f.seek(5)      # Go to the 6th byte in the file

5

In [22]:
f.read(1)

b'5'

In [23]:
f.seek(-3, 2)  # Go to the 3rd byte before the end

60

In [24]:
f.read(1)

b'.'

In text files (those opened without a b in the mode string), only seeks relative to the beginning of the file are allowed (the exception being seeking to the very end of the file with seek(0, 2)), and the only valid offset values are those returned by f.tell(), or zero. Any other offset value produces undefined behaviour.
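The permitted text-mode seeks can be sketched as follows: an offset obtained from tell(), a jump to the very end, and zero. The scratch file, its contents, and the variable names are assumptions for illustration.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
seek_path = os.path.join(tempfile.mkdtemp(), "lines.txt")

with open(seek_path, "w", encoding="utf-8") as out:
    out.write("first\nsecond\nthird\n")

with open(seek_path, encoding="utf-8") as src:
    src.readline()               # consume "first\n"
    bookmark = src.tell()        # opaque value, only meant to be passed to seek()
    second = src.readline()

    src.seek(bookmark)           # allowed: the offset came from tell()
    second_again = src.readline()

    src.seek(0, 2)               # allowed: jump to the very end of the file
    at_end = src.read()

    src.seek(0)                  # allowed: zero, measured from the beginning
    first = src.readline()

print(repr(second), repr(second_again), repr(at_end), repr(first))
```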

File objects have some additional methods, such as isatty() and truncate(), which are less frequently used; consult the Library Reference for a complete guide to file objects.
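As a brief illustration of those two methods, the sketch below checks whether a regular file is attached to a terminal and then cuts it down to four bytes. The scratch file and variable names are assumptions.

```python
import os
import tempfile

# Hypothetical scratch file for the demonstration.
trunc_path = os.path.join(tempfile.mkdtemp(), "digits.txt")

with open(trunc_path, "w") as out:
    out.write("0123456789")

with open(trunc_path, "r+") as fh:
    is_terminal = fh.isatty()    # a regular file on disk is not a terminal
    new_size = fh.truncate(4)    # resize the file to 4 bytes; returns the new size

with open(trunc_path) as src:
    remaining = src.read()

print(is_terminal, new_size, repr(remaining))
```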

In [25]:
f.close()

### CSV Exercise

In [1]:
# use this data to create, read, modify and write a CSV file:

data = """Mar,13,22:23:16,localhost,systemd[1898]:,Starting,Accessibilityservices bus... 
Mar,13,22:23:16,localhost,dbus-daemon[1916]:,[session,uid=1000pid=1916] Successfully activated service 'org.a11y.Bus' 
Mar,13,22:23:16,localhost,systemd[1898]:,Started,Accessibilityservices bus. 
Mar,13,22:23:16,localhost,at-spi-bus-launcher[2087]:,dbus-daemon[2092]:,"Activatingservice name='org.a11y.atspi.Registry' requested by ':1.0' (uid=1000 pid=1905 comm=""mate-session "" label=""unconfined_u:unconfined_r:unconfi"
Mar,13,22:23:16,localhost,dbus-daemon[1916]:,[session,"uid=1000pid=1916] Activating service name='ca.desrt.dconf' requested by ':1.28' (uid=1000 pid=1905 comm=""mate-session "" label=""unconfined_u:unconfined_r:unconfined_t:s"
Mar,13,22:23:16,localhost,dbus-daemon[1916]:,[session,uid=1000pid=1916] Successfully activated service 'ca.desrt.dconf' 
Mar,13,22:23:16,localhost,at-spi-bus-launcher[2087]:,dbus-daemon[2092]:,Successfullyactivated service 'org.a11y.atspi.Registry' 
Mar,13,22:23:16,localhost,at-spi-bus-launcher[2087]:,SpiRegistry,daemonis running with well-known name - org.a11y.atspi.Registry 
Mar,13,22:23:16,localhost,gnome-keyring-daemon[1895]:,failed,tounlock login keyring on startup 
Mar,13,22:23:17,localhost,rtkit-daemon[676]:,Successfully,madethread 2124 of process 2124 (/usr/bin/pulseaudio) owned by '1000' high priority at nice level -11. 
Mar,13,22:23:17,localhost,dbus-daemon[1916]:,[session,uid=1000pid=1916] Activating via systemd: service name='org.gtk.vfs.UDisks2VolumeMonitor' unit='gvfs-udisks2-volume-monitor.service' requested by ':1.35' (uid=1000 pid
Mar,13,22:23:17,localhost,systemd[1898]:,Starting,Virtualfilesystem service - disk device monitor... 
Mar,13,22:23:17,localhost,pulseaudio[2124]:,[pulseaudio],alsa-util.c:Disabling timer-based scheduling because running inside a VM. 
Mar,13,22:23:17,localhost,pulseaudio[2124]:,[pulseaudio],sink.c:Default and alternate sample rates are the same. 
Mar,13,22:23:17,localhost,dbus-daemon[1916]:,[session,uid=1000pid=1916] Successfully activated service 'org.gtk.vfs.UDisks2VolumeMonitor' 
Mar,13,22:23:17,localhost,systemd[1898]:,Started,Virtualfilesystem service - disk device monitor. 
Mar,13,22:23:17,localhost,dbus-daemon[1916]:,[session,uid=1000pid=1916] Activating via systemd: service name='org.gtk.vfs.MTPVolumeMonitor' unit='gvfs-mtp-volume-monitor.service' requested by ':1.35' (uid=1000 pid=2116 co
Mar,13,22:23:17,localhost,rtkit-daemon[676]:,Successfully,madethread 2134 of process 2124 (/usr/bin/pulseaudio) owned by '1000' RT at priority 5. 
Mar,13,22:23:17,localhost,pulseaudio[2124]:,[pulseaudio],alsa-util.c:Disabling timer-based scheduling because running inside a VM. 
Mar,13,22:23:17,localhost,systemd[1898]:,Starting,Virtualfilesystem service - Media Transfer Protocol monitor... 
Mar,13,22:23:17,localhost,dbus-daemon[1916]:,[session,uid=1000pid=1916] Successfully activated service 'org.gtk.vfs.MTPVolumeMonitor' 
Mar,13,22:23:17,localhost,systemd[1898]:,Started,Virtualfilesystem service - Media Transfer Protocol monitor. 
Mar,13,22:23:17,localhost,dbus-daemon[1916]:,[session,uid=1000pid=1916] Activating via systemd: service name='org.gtk.vfs.GPhoto2VolumeMonitor' unit='gvfs-gphoto2-volume-monitor.service' requested by ':1.35' (uid=1000 pid
Mar,13,22:23:17,localhost,systemd[1898]:,Starting,Virtualfilesystem service - digital camera monitor... 
Mar,13,22:23:17,localhost,rtkit-daemon[676]:,Successfully,madethread 2139 of process 2124 (/usr/bin/pulseaudio) owned by '1000' RT at priority 5. 
Mar,13,22:23:17,localhost,dbus-daemon[1916]:,[session,uid=1000pid=1916] Successfully activated service 'org.gtk.vfs.GPhoto2VolumeMonitor' 
Mar,13,22:23:17,localhost,systemd[1898]:,Started,Virtualfilesystem service - digital camera monitor. 
Mar,13,22:23:17,localhost,dbus-daemon[1916]:,[session,uid=1000pid=1916] Activating via systemd: service name='org.gtk.vfs.AfcVolumeMonitor' unit='gvfs-afc-volume-monitor.service' requested by ':1.35' (uid=1000 pid=2116 co
Mar,13,22:23:17,localhost,systemd[1898]:,Starting,Virtualfilesystem service - Apple File Conduit monitor... 
Mar,13,22:23:17,localhost,dbus-daemon[684]:,[system],"Activatingvia systemd: service name='org.bluez' unit='dbus-org.bluez.service' requested by ':1.92' (uid=1000 pid=2124 comm=""/usr/bin/pulseaudio --start --log-target=sys"
"""

In [15]:
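def main():
    # split the raw text into lines; splitlines() leaves no empty entry at the end
    data_read = data.splitlines()

    # the with statement closes mod.csv even if a write fails
    with open("mod.csv", "w") as f:
        for line in data_read:
            # swap every comma for a space, including commas inside quoted fields
            rep_str = line.replace(',', ' ')
            f.write(rep_str + '\n')

if __name__ == '__main__':
    main()

print(__name__)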

__main__
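The exercise can also be approached with the standard library's csv module, which handles quoted fields and embedded commas. The following is only a sketch under stated assumptions: it uses two rows copied from the exercise data (trailing spaces dropped) instead of the full data string, and the file names, the choice of columns to keep, and the header names are all made up for illustration.

```python
import csv
import os
import tempfile

# Two rows copied from the exercise data; the full string would be handled the same way.
sample = (
    "Mar,13,22:23:16,localhost,systemd[1898]:,Started,Accessibilityservices bus.\n"
    "Mar,13,22:23:16,localhost,gnome-keyring-daemon[1895]:,failed,tounlock login keyring on startup\n"
)

# Hypothetical file names inside a scratch directory.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "log.csv")
dst_path = os.path.join(workdir, "log_modified.csv")

# Create: newline="" is what the csv module expects for files it reads or writes.
with open(src_path, "w", newline="") as out:
    out.write(sample)

# Read: each row becomes a list of strings.
with open(src_path, newline="") as src:
    rows = list(csv.reader(src))

# Modify: keep time, host, and process, and strip the trailing colon from the process.
modified = [[row[2], row[3], row[4].rstrip(":")] for row in rows]

# Write: add a header row, then the modified rows.
with open(dst_path, "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["time", "host", "process"])
    writer.writerows(modified)

# Read the result back to check it.
with open(dst_path, newline="") as src:
    result = list(csv.reader(src))

print(result)
```

Unlike replacing every comma with a space, this keeps the file a valid CSV and leaves commas inside quoted fields untouched.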


In [11]:
!ls

 00.ipynb   09.ipynb			"Ken's testing 123.ipynb"
 01.ipynb   10.ipynb			 Manish_Mypythonnotes.ipynb
 02.ipynb   11.ipynb			 mod.csv
 03.ipynb   12.ipynb			 Mypythonnotes.ipynb
 04.ipynb  "Andy's 2nd notebook.ipynb"	"Prasath's Notebook.ipynb"
 05.ipynb  "Andy's Notebook.ipynb"	"Suhento's Notebook.ipynb"
 06.ipynb   AS_Notebook.ipynb		 Untitled.ipynb
 07.ipynb  "Dai's Workbook.ipynb"
 08.ipynb  "Harry's Notebook.ipynb"


In [16]:
!cat mod.csv

Mar 13 22:23:16 localhost systemd[1898]: Starting Accessibilityservices bus... 
Mar 13 22:23:16 localhost dbus-daemon[1916]: [session uid=1000pid=1916] Successfully activated service 'org.a11y.Bus' 
Mar 13 22:23:16 localhost systemd[1898]: Started Accessibilityservices bus. 
Mar 13 22:23:16 localhost at-spi-bus-launcher[2087]: dbus-daemon[2092]: "Activatingservice name='org.a11y.atspi.Registry' requested by ':1.0' (uid=1000 pid=1905 comm=""mate-session "" label=""unconfined_u:unconfined_r:unconfi"
Mar 13 22:23:16 localhost dbus-daemon[1916]: [session "uid=1000pid=1916] Activating service name='ca.desrt.dconf' requested by ':1.28' (uid=1000 pid=1905 comm=""mate-session "" label=""unconfined_u:unconfined_r:unconfined_t:s"
Mar 13 22:23:16 localhost dbus-daemon[1916]: [session uid=1000pid=1916] Successfully activated service 'ca.desrt.dconf' 
Mar 13 22:23:16 localhost at-spi-bus-launcher[2087]: dbus-daemon[2092]: Successfullyactivated service 'org.a11y.atspi.Registry' 
Mar 13 22:23: