# Python File I/O
*   File I/O
*   Installing Libraries

## File I/O

A file operation session usually takes place in the following order:

1. Open a file
  * A file can be opened in different modes
2. Perform file operations:
  * Read, write, or append
3. Close the file
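
The three steps above can be sketched as follows. This is only a minimal illustration: the file name `session_demo.txt` and the temporary directory are assumptions, chosen so that the example does not touch any existing file.

```python
import os
import tempfile

# Assumption: work in a throwaway directory so no real file is affected.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "session_demo.txt")  # hypothetical file name

f = open(path, "w", encoding="utf-8")   # 1. open the file (write mode)
f.write("hello\n")                      # 2. perform a file operation
f.close()                               # 3. close the file

f = open(path, "r", encoding="utf-8")   # 1. open it again, this time for reading
content = f.read()                      # 2. read the whole file
f.close()                               # 3. close the file

print(repr(content))
```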


### Open a file
Python has a built-in function open() to open a file. 
* This function returns a file object, also called a handle.
* The file object is used to read or modify the file.

```
>>> f = open("test.txt")    # open file in current directory
>>> f = open("C:/Python33/README.txt")  # specifying full path
```

### Mode of file open
We can specify the **mode** while opening a file. 
* With mode
    * we specify whether we want to read 'r', write 'w' or append 'a' to the file.
    * We also specify if we want to open the file in text mode or binary mode.
* In **text mode**, we get strings when reading from the file.
* The **binary mode** returns bytes.
    * This is the mode to be used when dealing with non-text files like image or exe files.
*  The default is reading in text mode. 
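
To make the difference between text mode and binary mode concrete, here is a small sketch. The file name `modes_demo.txt` and the sample string are assumptions for illustration only.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")  # hypothetical name

f = open(path, "w", encoding="utf-8")
f.write("caf\u00e9")                      # "café": 4 characters
f.close()

f = open(path, "r", encoding="utf-8")     # text mode: we get a string back
text = f.read()
f.close()

f = open(path, "rb")                      # binary mode: we get bytes back
raw = f.read()
f.close()

print(type(text), len(text))   # a str with 4 characters
print(type(raw), len(raw))     # bytes, 5 of them (the accented letter takes 2 bytes in UTF-8)
```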


Python File Modes

Mode | Description
--- | --- 
'r'	| open for reading (default)
'w'	| open for writing, truncating the file first
'x'	| open for exclusive creation, failing if the file already exists
'a'	| open for writing, appending to the end of the file if it exists
'b'	| binary mode
't'	| text mode (default)
'+'	| open a disk file for updating (reading and writing)
'U'	| universal newlines mode (deprecated; removed in Python 3.11)
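
The 'x' and 'a' modes from the table are easy to mix up with 'w', so a short sketch may help. The file name `log_demo.txt` is hypothetical and the file lives in a temporary directory.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "log_demo.txt")  # hypothetical name

f = open(path, "x", encoding="utf-8")   # 'x': create the file; fails if it exists
f.write("first\n")
f.close()

try:
    open(path, "x", encoding="utf-8")   # the file exists now, so this raises
    second_create_failed = False
except FileExistsError:
    second_create_failed = True

f = open(path, "a", encoding="utf-8")   # 'a': keep existing content, add at the end
f.write("second\n")
f.close()

f = open(path, "r", encoding="utf-8")
lines = f.read().splitlines()
f.close()

print(second_create_failed, lines)
```

Opening the same file with 'w' instead of 'a' would have discarded the line "first".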


Example:
```
f = open("test.txt")      # equivalent to 'r' or 'rt'
f = open("test.txt",'w')  # write in text mode
f = open("img.bmp", "w+b") # open for binary read/write with truncation
f = open("img.bmp",'r+b') # open for binary read/write without truncation
```
Also, when working with files in text mode, it is highly recommended to specify the encoding type.

```
f = open("test.txt", mode='r',encoding='utf-8')
```
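
The following sketch shows why the encoding matters: the same bytes decode to different text depending on the encoding used for reading. The file name and the choice of 'latin-1' as the "wrong" encoding are assumptions for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "encoding_demo.txt")  # hypothetical name
original = "na\u00efve"   # "naïve", contains one non-ASCII character

f = open(path, mode="w", encoding="utf-8")
f.write(original)
f.close()

f = open(path, mode="r", encoding="utf-8")     # same encoding: the text round-trips
same = f.read()
f.close()

f = open(path, mode="r", encoding="latin-1")   # different encoding: the text is garbled
garbled = f.read()
f.close()

print(same == original, garbled == original)
```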

### Closing a file
When we have finished processing a file, we need to close it.
* Closing a file frees up the resources tied to the file.  
* This is done using the file object's close() method.

Python has a garbage collector to clean up unreferenced objects.  However, we must not rely on it to close the file.

```
f = open("test.txt", encoding = 'utf-8')
# perform file operations
f.close()
```

### Open a file using try and finally block

The following is a safer way to open, use and close a file.
If an exception interrupts execution during file access, the "finally" clause still closes the file properly.
```
f = open("myfile.txt")   # open before "try": if this fails, there is nothing to close
try:
    # perform file operations
    ...
finally:
    f.close()
```

### Open a file with `with` compound statement
An even better way to use a file is the "with" block.  
* When exiting the "with" block, the system will close the file automatically.
* We don't have to close the file explicitly.

```
with open("myfile.txt") as f:
    # perform file operations
    ...
```
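
We can check the automatic closing ourselves. The sketch below assumes a hypothetical file `with_demo.txt` in a temporary directory and raises a simulated error inside the second block.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")  # hypothetical name

with open(path, "w", encoding="utf-8") as f:
    f.write("data\n")
closed_after_block = f.closed      # the file is closed once the block ends

try:
    with open(path, encoding="utf-8") as g:
        raise ValueError("simulated failure while processing")  # illustrative error
except ValueError:
    pass
closed_after_error = g.closed      # also closed when the block exits via an exception

print(closed_after_block, closed_after_error)
```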

  

### Information about a file and its current status

Attribute or method | Description
---|---
closed | Is the file closed?
encoding | The encoding of the file
errors | The Unicode error-handling mode
fileno() | File descriptor number
isatty() | Is the file an interactive stream?
mode   | The mode of the file
name   | The name of the file
newlines | The newline convention(s) encountered while reading
readable() | Is the file readable?
seekable() | Is the file seekable?
tell()   | Return the current cursor position of the file
writable() | Is the file writable?



```
# Open a file
file = open('Test1.txt', "w+")
print(type(file))

print("The name of the file I just created is: ", file.name)
print("In what mode is this file opened: ", file.mode)
print("Is the file writable?", file.writable())
print("Is the file closed? ", file.closed)
file.close()
print("Is the file closed? ", file.closed)
```

Output:
```
<class '_io.TextIOWrapper'>
The name of the file I just created is:  Test1.txt
In what mode is this file opened:  w+
Is the file writable? True
Is the file closed?  False
Is the file closed?  True
```


## Reading and writing to a file

### Writing
In order to write into a file in Python, we need to open it in write 'w', append 'a' or exclusive creation 'x' mode.

We need to be careful with the 'w' mode, as it overwrites the file if it already exists. All previous data is erased.

Writing a string (or a sequence of bytes, for binary files) is done using the write() method. This method returns the number of characters written to the file (or the number of bytes, in binary mode).
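
As a small sketch of the return value of write(), the example below writes the same content once in text mode and once in binary mode. The file names are hypothetical.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
sample = "\u00e9t\u00e9\n"   # "été" followed by a newline

with open(os.path.join(workdir, "text_demo.txt"), "w", encoding="utf-8") as f:
    n_chars = f.write(sample)                   # counts characters

with open(os.path.join(workdir, "bytes_demo.bin"), "wb") as f:
    n_bytes = f.write(sample.encode("utf-8"))   # counts bytes

print(n_chars, n_bytes)
```

The two numbers differ because each accented letter is one character but two bytes in UTF-8.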

```
# Write to a file
file = open('Test1.txt', 'w+')

file.write('1234567890\n')

file.close()
```

```
#!ls -l
#!cat Test1.txt
#!rm Test1.txt
#!ls -l

import os
print(os.listdir())
#os.remove('Test1.txt')
#print(os.listdir())
```

Output:
```
['.config', 'Test1.txt', 'sample_data']
```


### Reading

To read a file in Python, we must open the file in reading mode.

There are various methods available for this purpose. We can use the read(size) method to read up to size characters (or bytes, in binary mode). If the size parameter is not specified, it reads and returns everything up to the end of the file.
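
A common use of read(size) is to process a file in fixed-size chunks. The sketch below is one way to do it, assuming a hypothetical file `chunks_demo.txt` and a chunk size of 4 characters.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "chunks_demo.txt")  # hypothetical name
with open(path, "w", encoding="utf-8") as f:
    f.write("1234567890")

chunks = []
with open(path, "r", encoding="utf-8") as f:
    while True:
        chunk = f.read(4)     # at most 4 characters per call
        if chunk == "":       # an empty string means end of file
            break
        chunks.append(chunk)

print(chunks)   # ['1234', '5678', '90']
```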


```
# read the file
file = open('Test1.txt', 'r')
x1 = file.read(3)  # read 3 characters
x2 = file.read()   # read till the end of the file
file.close()

print("Type of x1 is: ", type(x1))
print("Content of x1 is: ", x1)
print("Content of x2 is: ", x2)
```

Output:
```
Type of x1 is:  <class 'str'>
Content of x1 is:  123
Content of x2 is:  4567890

```



### seek() and tell()

We can see that the read() method returns the newline as '\n'. Once the end of the file is reached, we get an empty string on further reading.

* We can change our current file cursor (position) using the seek() method. 
* The tell() method returns the current cursor position in the file. In binary mode this is the number of bytes from the beginning of the file; in text mode it is an opaque number that is only meant to be passed back to seek().

```
# read the file
f = open('Test1.txt', 'r')
print("Read the whole file: ", f.read())
print("The position of the file: ", f.tell())
print("Read again: ", f.read())
print("Change position to beginning: ", f.seek(0))
print("The position of the file: ", f.tell())
print("Read again: ", f.read())
f.close()
```

Output:
```
Read the whole file:  1234567890

The position of the file:  11
Read again:  
Change position to beginning:  0
The position of the file:  0
Read again:  1234567890

```
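
seek() also accepts a second argument that says what the offset is relative to. The sketch below uses binary mode, where offsets are plain byte counts; the file name `seek_demo.bin` is hypothetical, and the constants come from the os module.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "seek_demo.bin")  # hypothetical name
with open(path, "wb") as f:
    f.write(b"1234567890")

with open(path, "rb") as f:
    f.seek(3)                    # 3 bytes from the start (os.SEEK_SET is the default)
    a = f.read(2)                # b"45"
    f.seek(2, os.SEEK_CUR)       # skip 2 bytes forward from the current position
    b = f.read(1)                # b"8"
    f.seek(-3, os.SEEK_END)      # 3 bytes before the end of the file
    c = f.read()                 # b"890"
    end_position = f.tell()      # 10

print(a, b, c, end_position)
```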



### More about reading

Different ways to read:

* read()
* readline()
* readlines()
* for line in file:


```
# readline() vs readlines()
file  = open('Test2.txt', 'w+')
file.write('1234\n')
file.write('2234\n')
file.write('3234\n')
file.write('4234\n')
print("cursor position after write:", file.tell())
file.close()  # make sure the data is written before reopening

file = open('Test2.txt') # open in read mode
print("cursor position after reopen:", file.tell())
line = file.readline()
print('The first line of this file is:', line)
```

Output:
```
cursor position after write: 20
cursor position after reopen: 0
The first line of this file is: 1234

```



```
# Now we reset the file pointer
file.seek(0)
# Read all the lines
lines = file.readlines()
file.close()
print('The result of readlines() operation is: ', lines)
print('The type of lines is: ', type(lines))

# You can also access individual lines
print("Line 0 is:", lines[0])
print("line 2 is:", lines[2])
```

Output:
```
The result of readlines() operation is:  ['1234\n', '2234\n', '3234\n', '4234\n']
The type of lines is:  <class 'list'>
Line 0 is: 1234

line 2 is: 3234

```



```
# When you want to process every line sequentially, a for loop is even more convenient
file = open('Test2.txt')
for line in file:
    print(line, end = '')
file.close()
```

Output:
```
1234
2234
3234
4234
```


```
# Or, even more succinctly:
for line in open('Test2.txt'):
    print(line, end = '')
```

Output:
```
1234
2234
3234
4234
```


### Python File Methods

We list all the Python file methods below:

Method	| Description
--- | ---
close()	| Close an open file. It has no effect if the file is already closed.
detach() |	Separate the underlying binary buffer from the TextIOBase and return it.
fileno()	| Return an integer number (file descriptor) of the file.
flush()	| Flush the write buffer of the file stream.
isatty()	| Return True if the file stream is interactive.
read(n)	| Read at most n characters from the file. Reads till the end of the file if n is negative or None.
readable()	| Returns True if the file stream can be read from.
readline(n=-1)	| Read and return one line from the file. Reads at most n characters (bytes in binary mode) if specified.
readlines(n=-1)	| Read and return a list of lines from the file. If n is specified, no further lines are read once the total size read so far exceeds n bytes/characters.
seek(offset, whence=SEEK_SET)	| Change the file position to offset bytes, relative to whence (start, current position, or end).
seekable()	| Returns True if the file stream supports random access.
tell()	| Returns the current file location.
truncate(size=None)	| Resize the file stream to size bytes. If size is not specified, resize to current location.
writable()	| Returns True if the file stream can be written to.
write(s)	| Write string s to the file and return the number of characters written.
writelines(lines)	| Write a list of lines to the file.
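
A few of the less familiar methods from the table can be tried out as follows. This is only a sketch: the file name `methods_demo.txt` is hypothetical, `newline=""` is passed so that the byte counts are the same on every platform, and `os.path.getsize()` is used here only to observe the file size.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "methods_demo.txt")  # hypothetical name

with open(path, "w", encoding="utf-8", newline="") as f:
    f.writelines(["one\n", "two\n", "three\n"])  # writelines() adds no newlines itself
    f.flush()                                    # push buffered data to the operating system
    size_before = os.path.getsize(path)          # 14 bytes on disk after the flush
    f.truncate(4)                                # keep only the first 4 bytes ("one\n")

size_after = os.path.getsize(path)

with open(path, encoding="utf-8") as f:
    remaining = f.read()

print(size_before, size_after, repr(remaining))
```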

## Manage directory and files

To help with the management of files and directories, we introduce a very important module called "os", which contains many useful functions for accessing the operating system.

```
import os

# Get the path of the current working directory
print(os.getcwd())

# Get the path of the current working directory as a bytes object
print(os.getcwdb())
```

Output:
```
C:\Users\cheean yu\Desktop\Data Science\week2
b'C:\\Users\\cheean yu\\Desktop\\Data Science\\week2'
```


  


```
# list current directory
print(os.listdir())

# list a particular directory
print(os.listdir("/content/sample_data"))
```

Output:
```
['.config', 'another_test', 'Test2.txt', 'sample_data']
['anscombe.json', 'README.md', 'california_housing_train.csv', 'mnist_test.csv', 'mnist_train_small.csv', 'california_housing_test.csv']
```


```
# change working directory
os.chdir("/content/sample_data")
print(os.getcwd())
```

Output:
```
/content/sample_data
```


```
# Making a new directory
os.chdir("/content")
os.mkdir('testdir')

# create an empty file (and close it right away)
open("test.txt", "w").close()

print(os.listdir())
```

Output:
```
['.config', 'another_test', 'test.txt', 'Test2.txt', 'testdir', 'sample_data']
```


```
# rename a directory or a file
os.rename('testdir', 'another_testdir')
os.rename('test.txt', 'another_test.txt')
print(os.listdir())
```

Output:
```
['.config',
 'another_test.txt',
 'another_test',
 'Test2.txt',
 'another_testdir',
 'sample_data']
```

```
# remove a directory (it must be empty)
os.rmdir('another_testdir')

# remove a file
os.remove('another_test.txt')
print(os.listdir())
```

Output:
```
['.config', 'another_test', 'Test2.txt', 'sample_data']
```

```
# How to remove the whole directory tree
import shutil

#shutil.rmtree('mytree')   # assume a directory tree 'mytree' exists.
os.listdir()
```

You are encouraged to find out more about the 'os' and 'shutil' modules using the dir() and help() functions.
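
To see the difference between os.rmdir() and shutil.rmtree() without risking real data, we can build a small tree inside a temporary directory first. The directory and file names below ('mytree', 'sub', 'deeper', 'note.txt') are assumptions for illustration.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                      # throwaway parent directory
tree = os.path.join(base, "mytree")            # hypothetical directory tree
os.makedirs(os.path.join(tree, "sub", "deeper"))
with open(os.path.join(tree, "sub", "note.txt"), "w", encoding="utf-8") as f:
    f.write("temporary\n")

try:
    os.rmdir(tree)            # rmdir() only removes empty directories
    rmdir_failed = False
except OSError:
    rmdir_failed = True

shutil.rmtree(tree)           # removes the directory and everything below it
tree_exists = os.path.exists(tree)

print(rmdir_failed, tree_exists)
```

Because shutil.rmtree() deletes everything below the given path without asking, it is worth double-checking the path before calling it.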

Note: We have introduced a basic set of file I/O functions. There are other sets of functions, which you are welcome to explore whenever necessary.