In [1]:
import os

In [2]:
os.getcwd()

'C:\\Users\\dell\\Data Science\\Python'

# Open

In [20]:
help(open)

Help on built-in function open in module io:

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)
    Open file and return a stream.  Raise OSError upon failure.
    
    file is either a text or byte string giving the name (and the path
    if the file isn't in the current working directory) of the file to
    be opened or an integer file descriptor of the file to be
    wrapped. (If a file descriptor is given, it is closed when the
    returned I/O object is closed, unless closefd is set to False.)
    
    mode is an optional string that specifies the mode in which the file
    is opened. It defaults to 'r' which means open for reading in text
    mode.  Other common values are 'w' for writing (truncating the file if
    it already exists), 'x' for creating and writing to a new file, and
    'a' for appending (which on some Unix systems, means that all writes
    append to the end of the file regardless of the current seek position

In [12]:
file = open('FileManagement', 'r', encoding = 'utf-8')
file

<_io.TextIOWrapper name='FileManagement' mode='r' encoding='utf-8'>

# Stream Objects

The **`open()`** function returns a stream object, which has methods and attributes for getting information about and manipulating a stream of characters.

In [15]:
dir(file)

['_CHUNK_SIZE',
 '__class__',
 '__del__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__enter__',
 '__eq__',
 '__exit__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__lt__',
 '__ne__',
 '__new__',
 '__next__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '_checkClosed',
 '_checkReadable',
 '_checkSeekable',
 '_checkWritable',
 '_finalizing',
 'buffer',
 'close',
 'closed',
 'detach',
 'encoding',
 'errors',
 'fileno',
 'flush',
 'isatty',
 'line_buffering',
 'mode',
 'name',
 'newlines',
 'read',
 'readable',
 'readline',
 'readlines',
 'reconfigure',
 'seek',
 'seekable',
 'tell',
 'truncate',
 'writable',
 'write',
 'write_through',
 'writelines']

In [16]:
file.name

'FileManagement'

In [17]:
file.encoding

'utf-8'

In [18]:
file.mode

'r'

# Write

In [3]:
file = open('FileManagement', 'w') #r, w, a, r+, br, bw, ba, br+

In [4]:
file.write('This is a sample for practising File Management in Python\nYou do not need to know about anything')

96

In [5]:
file.close()

# Character Encoding

Note the **`encoding`** parameter of the **`open()`** function. The write example above left it out, so Python fell back to a default encoding that can differ between platforms; it is safer to pass it explicitly. Files don’t contain strings, they contain bytes. Reading a “string” from a text file only works because you told Python what encoding to use to read a stream of bytes and convert it to a string. Writing text to a file presents the same problem in reverse. You can’t write characters to a file; characters are an abstraction. In order to write to the file, Python needs to know how to convert your string into a sequence of bytes. The only way to be sure it’s performing the correct conversion is to specify the encoding parameter when you open the file for writing.
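As a rough illustration of why the encoding matters, the sketch below writes a short string, then looks at the bytes that ended up on disk. The file name `encoding_demo.txt` and the temporary directory are assumptions made so the example does not touch any real file.

```python
import os
import tempfile

text = 'S\u00e3o Paulo'  # 'São Paulo': 9 characters, one of them non-ASCII

# Hypothetical scratch file in a temporary directory, so nothing real is overwritten.
path = os.path.join(tempfile.mkdtemp(), 'encoding_demo.txt')

# Writing: Python encodes the string to bytes with the encoding we name.
with open(path, 'w', encoding='utf-8') as f:
    f.write(text)

# Reading raw bytes shows what is actually stored on disk.
with open(path, 'rb') as f:
    raw = f.read()

# Reading as text with the same encoding gives the original string back.
with open(path, 'r', encoding='utf-8') as f:
    round_trip = f.read()

# Decoding the same bytes with a different encoding yields different characters.
wrong = raw.decode('latin-1')

print(len(text), len(raw))                 # character count vs byte count
print(round_trip == text, wrong == text)
```

The character count and the byte count differ because the non-ASCII character needs two bytes in UTF-8, and decoding with the wrong encoding does not raise an error here; it just returns different text.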

# Read

In [6]:
file = open('FileManagement', 'r')

In [7]:
for line in file:
    print(line)

This is a sample for practising File Management in Python

You do not need to know about anything


In [8]:
file.close()

## Close

Open files consume system resources, and depending on the file mode, other programs may not be able to access them. **It’s important to close files as soon as you’re finished with them.**

Remember to close the file after reading or writing

<code>file.close()</code>

or we can use <span style = 'color:red'>with</span>

In [10]:
with open('FileManagement', 'r') as file:
    for line in file:
        print(line)

This is a sample for practising File Management in Python

You do not need to know about anything


In [11]:
with open('FileManagement', 'a') as file:
    file.write('VN Pikachu\n')
    file.write('Tank Cao')

In [12]:
with open('FileManagement', 'r') as file:
    for line in file:
        print(line)

This is a sample for practising File Management in Python

You do not need to know about anythingVN Pikachu

Tank Cao


Check if a file is closed: **`file.closed`**

In [21]:
file = open('FileManagement', 'r')
print(file.closed)
file.close()
file.closed

False


True

<hr>

①	You can’t read from a closed file; that raises a ValueError exception.  
②	You can’t seek in a closed file either.  
③	There’s no current position in a closed file, so the tell() method also fails.  
④	Perhaps surprisingly, calling the close() method on a stream object whose file has been closed does not raise an exception. It’s just a no-op.  
⑤	Closed stream objects do have one useful attribute: the closed attribute will confirm that the file is closed.

<hr>

The **`with`** statement creates a runtime context. In these examples, the stream object acts as a context manager. Python creates the stream object `file` and tells it that it is entering a runtime context. When the with code block is completed, Python tells the stream object that it is exiting the runtime context, and the stream object calls its own close() method. This happens even if the block exits because of an exception.
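A minimal sketch of that behaviour: the block below raises an exception on purpose and then checks that the stream was still closed. An in-memory `io.StringIO` stream stands in for a real file, which is an assumption made so the example runs without any file on disk.

```python
import io

# An in-memory text stream stands in for a real file here (an assumption made
# so the sketch runs anywhere); it supports the same context-manager protocol.
stream = io.StringIO('first line\nsecond line\n')

try:
    with stream as f:
        first = f.readline()
        raise RuntimeError('something went wrong mid-block')
except RuntimeError:
    pass

# The stream was closed on the way out of the block, despite the exception.
print(first.strip(), stream.closed)
```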

In [22]:
file.read()

ValueError: I/O operation on closed file.

In [23]:
file.tell()

ValueError: I/O operation on closed file.

In [24]:
file.seek(3)

ValueError: I/O operation on closed file.

In [25]:
file.closed

True

# Reading in one go

## as list 

In [13]:
lines = open('FileManagement', 'r').readlines()
lines

['This is a sample for practising File Management in Python\n',
 'You do not need to know about anythingVN Pikachu\n',
 'Tank Cao']

## as string

In [56]:
txt = open('FileManagement', 'r')
txt.read()

'This is a sample for practising File Management in Python\nYou do not need to know about anythingVN Pikachu\nTank Cao'

You can also read just a chunk of text

In [57]:
txt.seek(0)
txt.read(3) #read 3 characters, starting from 0-th byte

'Thi'

## line by line

In [1]:
file = open('FileManagement', 'r')
file.readline()

'This is a sample for practising File Management in Python\n'

In [2]:
file.readline()

'You do not need to know about anythingVN Pikachu\n'

In [3]:
file.readline()

'Tank Cao'

In [4]:
file.readline()

''

In [5]:
file.readline()

''
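The cells above show that `readline()` returns an empty string once the end of the file is reached. The sketch below uses that as a loop condition and compares the result with plain iteration over the stream. `io.StringIO` is used in place of a file on disk, which is an assumption made for portability.

```python
import io

# io.StringIO stands in for a file opened in text mode (an assumption so the
# sketch needs no file on disk).
stream = io.StringIO('line 1\nline 2\nline 3')

# Explicit loop: keep calling readline() until it returns ''.
collected = []
while True:
    line = stream.readline()
    if line == '':
        break
    collected.append(line)

# Equivalent, more idiomatic form: the stream is its own iterator and
# yields one line at a time.
stream.seek(0)
iterated = [line for line in stream]

print(collected == iterated)
```

A blank line inside the file is returned as `'\n'`, not `''`, so the empty string is an unambiguous end-of-file signal.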

# Resetting the file's current pointer

In [15]:
with open('FilePointer', 'w') as file:
    file.write('0123456789')

In [16]:
open('FilePointer', 'r').read()

'0123456789'

## Seek

file.seek(n): move the pointer to the n<sup>th</sup> byte

In [23]:
with open('FilePointer', 'r') as file:
    file.seek(3)
    print(file.read())

3456789


## Tell

file.tell(): return the byte offset the pointer is currently at

<b style = 'color:red'>NOTE: in text mode, read(3) reads 3 characters, not 3 bytes. In UTF-8, a Chinese character takes more than one byte.</b>

e.g. "a是"  
The character a takes 1 byte (offset 0). The Chinese character 是 takes 3 bytes in UTF-8 (offsets 1 to 3).  
seek(2) therefore lands in the middle of 是, and a following read() typically fails with a UnicodeDecodeError, because the bytes from offset 2 onward do not start a valid character.
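The difference between counting characters and counting bytes can be checked directly. The sketch below writes a three-character string to a scratch file (the file name and temporary directory are assumptions) and reads the first two units in text mode and in binary mode.

```python
import os
import tempfile

text = 'a\u662fb'  # 'a', the Chinese character U+662F, then 'b'

# Hypothetical scratch file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'chars_vs_bytes.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write(text)

# Text mode: read(2) returns two characters.
with open(path, 'r', encoding='utf-8') as f:
    two_chars = f.read(2)

# Binary mode: read(2) returns two bytes, which cuts the second character apart.
with open(path, 'rb') as f:
    two_bytes = f.read(2)

# Decoding the cut-off bytes fails, which is why a byte offset that lands
# inside a character cannot be used as a starting point for reading text.
try:
    two_bytes.decode('utf-8')
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

print(len(two_chars), len(two_chars.encode('utf-8')), decoded_ok)
```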

In [25]:
with open('FilePointer', 'r') as file:
    #first, read the first 3 characters to move the pointer to index 3
    print(file.read(3))
    #print the current position of the pointer
    print(file.tell())

012
3


<hr>

To change the file object’s position, use **`f.seek(offset, whence)`**. The position is computed from adding offset to a reference point; the reference point is selected by the whence argument. A whence value of 0 measures from the beginning of the file, 1 uses the current file position, and 2 uses the end of the file as the reference point. whence can be omitted and defaults to 0, using the beginning of the file as the reference point.

<b style = 'color:red'>In text files (those opened without a b in the mode string), only seeks relative to the beginning of the file are allowed (the exception being seeking to the very end of the file with seek(0, 2)).</b>
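The restriction on text files can be observed with a small experiment. This is a sketch under the assumption of a scratch file named `seek_demo.txt`; it uses the `os.SEEK_SET`, `os.SEEK_CUR` and `os.SEEK_END` constants, which are the named equivalents of the whence values 0, 1 and 2.

```python
import io
import os
import tempfile

# Hypothetical scratch file holding ten ASCII digits.
path = os.path.join(tempfile.mkdtemp(), 'seek_demo.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write('0123456789')

# Text mode: a nonzero seek relative to the current position is rejected.
with open(path, 'r', encoding='utf-8') as f:
    f.read(5)
    try:
        f.seek(-2, os.SEEK_CUR)
        text_seek_allowed = True
    except io.UnsupportedOperation:
        text_seek_allowed = False
    # Seeking from the start of the file is allowed.
    f.seek(3, os.SEEK_SET)
    from_start = f.read()

# Binary mode: all three reference points work.
with open(path, 'rb') as f:
    f.read(5)
    f.seek(-2, os.SEEK_CUR)   # back to offset 3
    after_cur = f.read(2)
    f.seek(-3, os.SEEK_END)   # three bytes before the end
    after_end = f.read()

print(text_seek_allowed, from_start, after_cur, after_end)
```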

In [77]:
file = open('FileManagement', 'br') #open in binary mode

file.read()

b'This is a sample for practising File Management in Python\r\nYou do not need to know about anythingVN Pikachu\r\nTank Cao'

In [78]:
#move pointer to beginning
file.seek(0)

0

In [79]:
#read the first 5 bytes
file.read(5)

b'This '

In [80]:
file.tell()

5

In [81]:
#move the pointer 2 bytes backward, relative to the current position
file.seek(-2, 1)

3

In [82]:
file.tell()

3

In [84]:
#move the pointer to 10 bytes before the end of the file
file.seek(-10, 2)

107

In [85]:
file.read()

b'\r\nTank Cao'

## Read and write to the same file

In [28]:
with open('FilePointer', 'r+') as file:
    print(file.read())
    file.write('\nI love you, VN Pikachu~')
    print(file.tell())
    print(file.read())

0123456789
I love you, VN Pikachu~
60



In [29]:
with open('FilePointer', 'r') as file:
    print(file.read())

0123456789
I love you, VN Pikachu~
I love you, VN Pikachu~


# Saving and Loading data with Pickle

## Saving

In [31]:
import pickle

In [30]:
data = {'Meomeo888' : 'Commander', 'VN Pikachu' : 'Deputy Commander', 'Tank Cao' : 'Deputy Commander'}


First, create a file to store the data, with the extension <b>pkl</b>.

Remember to open the file in <code>binary mode</code>, because pickle writes bytes, not text.

In [35]:
file_storage = open('storage.pkl', 'bw')


Then use <b>pickle.dump(data, file)</b> to store <code>data</code> in <code>file</code>.

In [36]:
pickle.dump(data, file_storage)

In [37]:
#close file
file_storage.close()

## Loading

In [40]:
with open('storage.pkl', 'br') as file_data:
    init_data = pickle.load(file_data)
    print(init_data)
    

{'Meomeo888': 'Commander', 'VN Pikachu': 'Deputy Commander', 'Tank Cao': 'Deputy Commander'}
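The save and load steps above can be combined into one round trip that uses `with` for both, so neither file is left open. This is a sketch; the file name `storage_demo.pkl` and the temporary directory are assumptions chosen to avoid overwriting `storage.pkl`.

```python
import os
import pickle
import tempfile

data = {'Meomeo888': 'Commander', 'VN Pikachu': 'Deputy Commander'}

# Hypothetical file name inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'storage_demo.pkl')

# 'wb' (the same mode as 'bw'): pickle produces bytes, so the file must be binary.
with open(path, 'wb') as f:
    pickle.dump(data, f)

with open(path, 'rb') as f:
    restored = pickle.load(f)

# The restored object is equal to the original but is a separate copy.
print(restored == data, restored is data)
```

Only load pickle files that come from a source you trust, because unpickling can execute arbitrary code.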


# Exercises

In [41]:
txt = '''Chicago Sun 01:52
Columbus Sun 02:52
Riyadh  Sun 10:52
Copenhagen  Sun 08:52
Kuwait City Sun 10:52
Rome    Sun 08:52
Dallas  Sun 01:52
Salt Lake City  Sun 01:52
San Francisco Sun 00:52
Amsterdam Sun 08:52
Denver Sun 01:52
San Salvador Sun 01:52
Detroit Sun 02:52
Las Vegas Sun 00:52
Santiago    Sun 04:52
Anchorage Sat 23:52
Ankara Sun 10:52
Lisbon  Sun 07:52
São Paulo   Sun 05:52
Dubai   Sun 11:52
London  Sun 07:52
Seattle Sun 00:52
Dublin  Sun 07:52
Los Angeles Sun 00:52
Athens  Sun 09:52
Edmonton Sun 01:52
Madrid  Sun 08:52
Shanghai Sun 15:52
Atlanta Sun 02:52
Frankfurt   Sun 08:52
Singapore Sun 15:52
Auckland Sun 20:52
Halifax Sun 03:52
Melbourne Sun 18:52
Stockholm   Sun 08:52
Barcelona   Sun 08:52
Miami   Sun 02:52
Minneapolis Sun 01:52
Sydney Sun 18:52
Beirut  Sun 09:52
Helsinki    Sun 09:52
Montreal    Sun 02:52
Berlin  Sun 08:52
Houston Sun 01:52
Moscow  Sun 10:52
Indianapolis    Sun 02:52   
Boston  Sun 02:52
Tokyo   Sun 16:52
Brasilia Sun 05:52
Istanbul Sun 10:52
Toronto Sun 02:52
Vancouver   Sun 00:52
Brussels    Sun 08:52
Jerusalem   Sun 09:52
New Orleans Sun 01:52
Vienna  Sun 08:52
Bucharest   Sun 09:52
Johannesburg    Sun 09:52
New York    Sun 02:52
Warsaw  Sun 08:52
Budapest    Sun 08:52
Oslo    Sun 08:52
Washington DC   Sun 02:52
Ottawa  Sun 02:52
Winnipeg    Sun 01:52
Cairo   Sun 09:52
Paris   Sun 08:52
Calgary Sun 01:52
Kathmandu   Sun 13:37
Philadelphia    Sun 02:52
Zurich  Sun 08:52
Cape Town   Sun 09:52
Phoenix Sun 00:52       
Prague  Sun 08:52       
Casablanca  Sun 07:52
Reykjavik   Sun 07:52'''

In [42]:
with open('cities_and_times.txt', 'w') as file:
    file.write(txt)

In [43]:
with open('cities_and_times.txt', 'r') as file:
    print(file.readlines())

['Chicago Sun 01:52\n', 'Columbus Sun 02:52\n', 'Riyadh  Sun 10:52\n', 'Copenhagen  Sun 08:52\n', 'Kuwait City Sun 10:52\n', 'Rome    Sun 08:52\n', 'Dallas  Sun 01:52\n', 'Salt Lake City  Sun 01:52\n', 'San Francisco Sun 00:52\n', 'Amsterdam Sun 08:52\n', 'Denver Sun 01:52\n', 'San Salvador Sun 01:52\n', 'Detroit Sun 02:52\n', 'Las Vegas Sun 00:52\n', 'Santiago    Sun 04:52\n', 'Anchorage Sat 23:52\n', 'Ankara Sun 10:52\n', 'Lisbon  Sun 07:52\n', 'São Paulo   Sun 05:52\n', 'Dubai   Sun 11:52\n', 'London  Sun 07:52\n', 'Seattle Sun 00:52\n', 'Dublin  Sun 07:52\n', 'Los Angeles Sun 00:52\n', 'Athens  Sun 09:52\n', 'Edmonton Sun 01:52\n', 'Madrid  Sun 08:52\n', 'Shanghai Sun 15:52\n', 'Atlanta Sun 02:52\n', 'Frankfurt   Sun 08:52\n', 'Singapore Sun 15:52\n', 'Auckland Sun 20:52\n', 'Halifax Sun 03:52\n', 'Melbourne Sun 18:52\n', 'Stockholm   Sun 08:52\n', 'Barcelona   Sun 08:52\n', 'Miami   Sun 02:52\n', 'Minneapolis Sun 01:52\n', 'Sydney Sun 18:52\n', 'Beirut  Sun 09:52\n', 'Helsinki    

In [55]:
import re

In [65]:

with open('cities_and_times.txt', 'r') as file:
    res = [re.split(r'\s+|:', line.strip()) for line in file]  # raw string avoids the invalid '\s' escape
    print(res)

[['Chicago', 'Sun', '01', '52'], ['Columbus', 'Sun', '02', '52'], ['Riyadh', 'Sun', '10', '52'], ['Copenhagen', 'Sun', '08', '52'], ['Kuwait', 'City', 'Sun', '10', '52'], ['Rome', 'Sun', '08', '52'], ['Dallas', 'Sun', '01', '52'], ['Salt', 'Lake', 'City', 'Sun', '01', '52'], ['San', 'Francisco', 'Sun', '00', '52'], ['Amsterdam', 'Sun', '08', '52'], ['Denver', 'Sun', '01', '52'], ['San', 'Salvador', 'Sun', '01', '52'], ['Detroit', 'Sun', '02', '52'], ['Las', 'Vegas', 'Sun', '00', '52'], ['Santiago', 'Sun', '04', '52'], ['Anchorage', 'Sat', '23', '52'], ['Ankara', 'Sun', '10', '52'], ['Lisbon', 'Sun', '07', '52'], ['São', 'Paulo', 'Sun', '05', '52'], ['Dubai', 'Sun', '11', '52'], ['London', 'Sun', '07', '52'], ['Seattle', 'Sun', '00', '52'], ['Dublin', 'Sun', '07', '52'], ['Los', 'Angeles', 'Sun', '00', '52'], ['Athens', 'Sun', '09', '52'], ['Edmonton', 'Sun', '01', '52'], ['Madrid', 'Sun', '08', '52'], ['Shanghai', 'Sun', '15', '52'], ['Atlanta', 'Sun', '02', '52'], ['Frankfurt', 'Sun',
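The split above breaks multi-word city names such as "Salt Lake City" into separate items. If the goal is to keep each city name whole, one possible alternative is to split from the right instead. The helper name `parse_line` and the three sample lines are assumptions for illustration, not part of the original exercise.

```python
lines = [
    'Chicago Sun 01:52',
    'Salt Lake City  Sun 01:52',
    'Kuwait City Sun 10:52',
]

def parse_line(line):
    """Split one 'city day HH:MM' line, keeping the city name whole."""
    # rsplit works from the right: the last two fields are the day and the
    # time, and everything before them is the city.
    city, day, clock = line.strip().rsplit(maxsplit=2)
    hour, minute = clock.split(':')
    return [city.strip(), day, hour, minute]

parsed = [parse_line(line) for line in lines]
print(parsed)
```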

In [66]:
with open('city_time_storage.pkl', 'bw') as storage_file:
    pickle.dump(res, storage_file)

In [67]:
with open('city_time_storage.pkl', 'br') as storage_file:
    print(pickle.load(storage_file))

[['Chicago', 'Sun', '01', '52'], ['Columbus', 'Sun', '02', '52'], ['Riyadh', 'Sun', '10', '52'], ['Copenhagen', 'Sun', '08', '52'], ['Kuwait', 'City', 'Sun', '10', '52'], ['Rome', 'Sun', '08', '52'], ['Dallas', 'Sun', '01', '52'], ['Salt', 'Lake', 'City', 'Sun', '01', '52'], ['San', 'Francisco', 'Sun', '00', '52'], ['Amsterdam', 'Sun', '08', '52'], ['Denver', 'Sun', '01', '52'], ['San', 'Salvador', 'Sun', '01', '52'], ['Detroit', 'Sun', '02', '52'], ['Las', 'Vegas', 'Sun', '00', '52'], ['Santiago', 'Sun', '04', '52'], ['Anchorage', 'Sat', '23', '52'], ['Ankara', 'Sun', '10', '52'], ['Lisbon', 'Sun', '07', '52'], ['São', 'Paulo', 'Sun', '05', '52'], ['Dubai', 'Sun', '11', '52'], ['London', 'Sun', '07', '52'], ['Seattle', 'Sun', '00', '52'], ['Dublin', 'Sun', '07', '52'], ['Los', 'Angeles', 'Sun', '00', '52'], ['Athens', 'Sun', '09', '52'], ['Edmonton', 'Sun', '01', '52'], ['Madrid', 'Sun', '08', '52'], ['Shanghai', 'Sun', '15', '52'], ['Atlanta', 'Sun', '02', '52'], ['Frankfurt', 'Sun',

# Binary File

A binary stream object has no **`encoding`** attribute. That makes sense, right? You’re reading (or writing) bytes, not strings, so there’s no conversion for Python to do. What you get out of a binary file is exactly what you put into it, no conversion necessary.

In [44]:
image_file = open('success.jpg', 'br')
image_file

<_io.BufferedReader name='success.jpg'>

In [45]:
image_file.name

'success.jpg'

In [46]:
image_file.mode

'rb'

In [47]:
#a binary stream has no encoding attribute; uncommenting the next line raises AttributeError
#image_file.encoding

①	Like text files, you can read binary files a little bit at a time. But there’s a crucial difference…  
②	…you’re reading bytes, not strings. Since you opened the file in binary mode, the read() method takes the number of bytes to read, not the number of characters.  
③	That means that there’s never an unexpected mismatch between the number you passed into the read() method and the position index you get out of the tell() method. The read() method reads bytes, and the seek() and tell() methods track the number of bytes read. For binary files, they’ll always agree.
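The points above can be reproduced without an image file. The sketch below writes six arbitrary bytes to a scratch file (the file name `bytes_demo.bin` and the byte values are assumptions) and checks that `read()` and `tell()` agree in binary mode.

```python
import os
import tempfile

# Hypothetical scratch file containing six arbitrary bytes.
path = os.path.join(tempfile.mkdtemp(), 'bytes_demo.bin')
payload = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10])
with open(path, 'wb') as f:
    f.write(payload)

with open(path, 'rb') as f:
    has_encoding = hasattr(f, 'encoding')  # binary streams do no decoding
    chunk = f.read(3)                      # three bytes, returned as bytes
    position = f.tell()                    # byte offset after the read
    rest = f.read()

print(has_encoding, type(chunk).__name__, position, chunk + rest == payload)
```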

In [48]:
data = image_file.read(3)
data

b'\xff\xd8\xff'

In [49]:
image_file.tell()

3

In [50]:
image_file.seek(0)

0

In [51]:
image = image_file.read()

# Stream objects from non-file sources

In [59]:
import io
#io.StringIO lets you treat a string as a text file
file = io.StringIO('VN Pikachu is the best') #create a stream object from a text
 
file.read()

'VN Pikachu is the best'

In [60]:
file.close()
file.closed

True

In [64]:
#io.BytesIO lets you treat a bytes object as a binary file
binary_file = io.BytesIO(b'I LOVE U') #create a byte stream object from a bytes literal
binary_file.read()

b'I LOVE U'
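One common use of these in-memory streams is to exercise code that expects a file without creating one. The sketch below is an illustration under that assumption; the function `count_words` is a hypothetical helper, not part of the `io` module.

```python
import io

def count_words(stream):
    """Count whitespace-separated words in any readable text stream."""
    return sum(len(line.split()) for line in stream)

# Reading: the function treats this exactly like a file opened with open().
source = io.StringIO('VN Pikachu is the best\nTank Cao agrees\n')
total = count_words(source)

# Writing: output accumulates in memory and is retrieved with getvalue().
sink = io.StringIO()
sink.write('words: ')
sink.write(str(total))
report = sink.getvalue()

# BytesIO works the same way for bytes.
buffer = io.BytesIO()
buffer.write(b'I LOVE ')
buffer.write(b'U')
raw = buffer.getvalue()

print(report, raw)
```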

# JSON

In [86]:
import json

In [94]:
#serializing:convert data to string representation
file = open('JSON_FILE', 'w')
data = {'VN pikachu': 35, 'Tank Cao': 31}

json.dump(data, file)
file.close()

In [95]:
#deserializing: convert string representation to data
json.load(open('JSON_FILE', 'r'))

{'VN pikachu': 35, 'Tank Cao': 31}
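The same round trip can be written with `with` so both files are closed automatically, and with `json.dumps` / `json.loads` when no file is involved. This is a sketch; the file name `demo.json` and the temporary directory are assumptions chosen to avoid overwriting `JSON_FILE`.

```python
import json
import os
import tempfile

data = {'VN pikachu': 35, 'Tank Cao': 31}

# Hypothetical file name inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'demo.json')

# JSON is text, so the file is opened in text mode with an explicit encoding.
with open(path, 'w', encoding='utf-8') as f:
    json.dump(data, f)

with open(path, 'r', encoding='utf-8') as f:
    restored = json.load(f)

# dumps/loads do the same conversion to and from a str, without a file.
as_text = json.dumps(data)
from_text = json.loads(as_text)

print(as_text)
print(restored == data, from_text == data)
```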

In [92]:
json.load