# File Objects in Python
## By Allen Huang

1. The Basics

### 1 The Basics

### 1.1 Read the files

Ways to read a file:
    1. a context manager, with open(...) (recommended; the file is closed automatically)
    2. a plain open() call (we need to explicitly close the file)
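
The difference between the two approaches can be sketched as follows. This is a minimal illustration, not part of the original notebook: it writes its own scratch file (the `demo.txt` name and the temporary directory are assumptions) so that it runs anywhere.

```python
import os
import tempfile

# Hypothetical scratch file so the sketch does not depend on testfile.txt
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as f:
    f.write("hello\n")

# Option 1: context manager, which closes the file even if an error occurs
with open(path, "r") as f:
    first = f.read()
closed_after_with = f.closed

# Option 2: plain open(), so we close the file ourselves;
# try/finally makes sure close() still runs if read() fails
f2 = open(path, "r")
try:
    second = f2.read()
finally:
    f2.close()
closed_after_close = f2.closed
```

Both options read the same text and leave the file closed; the context manager simply needs less code to get there.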

In [1]:
! pwd

/Users/hkmac/Desktop/Carzy Allen Github/Python


In [2]:
!ls
# testfile.txt is the file we will work with

Common Problem with Strings.ipynb Libraries
Data Structure.ipynb              Modules
File Objects.ipynb                OOP.ipynb
Formatting.ipynb                  Package Management
Function.ipynb                    Variable Scope.ipynb
Generators.ipynb                  testfile.txt
Jupyter Notebook.ipynb


In [7]:
# "r" (read) is the default mode
f = open("testfile.txt", "r")
# the file is now open, so we can print its name and mode
print(f.name)
print(f.mode)
# a file opened this way must be closed explicitly once we are done with it
f.close()

testfile.txt
r


1. write to a file (creates it, or overwrites existing content): f = open("test.txt", "w")
2. append to a file (existing content is kept): f = open("test.txt", "a")
3. read and write an existing file: f = open("test.txt", "r+")
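
As a rough illustration of how these three modes behave, the sketch below runs them one after another on a scratch file. The temporary directory and the strings written are assumptions made for the example.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "test.txt")

# "w" creates the file, or truncates it if it already exists
with open(path, "w") as f:
    f.write("first\n")

# "a" keeps the existing content and writes at the end
with open(path, "a") as f:
    f.write("second\n")

# "r+" opens an existing file for both reading and writing
with open(path, "r+") as f:
    before = f.read()     # after this read, the position is at the end of the file
    f.write("third\n")    # so this write lands after the existing text

with open(path, "r") as f:
    after = f.read()
```

Note that "r+" raises `FileNotFoundError` if the file does not exist, whereas "w" and "a" create it.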

In [9]:
# Using a context manager
# It lets us work with the file inside this block; once we leave the block, the file is closed for us automatically
# Reading files:
with open("testfile.txt", "r") as f:
    pass

In [10]:
f

<_io.TextIOWrapper name='testfile.txt' mode='r' encoding='UTF-8'>

In [12]:
# the file is closed, so we can no longer read from it
print(f.closed)

True
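
To make the consequence concrete, here is a small sketch of what happens when we try to read from a file after its `with` block has ended. The scratch file is an assumption for the example; the point is only that the read raises `ValueError`.

```python
import os
import tempfile

# Hypothetical scratch file so the sketch is self-contained
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as f:
    f.write("some text")

with open(path, "r") as f:
    pass  # leaving the block closes the file

try:
    f.read()              # reading a closed file is an error
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

The variable `f` still exists after the block, but the file object it refers to is closed, so any further I/O on it fails.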


In [16]:
# read the entire file into a single string
with open("testfile.txt", "r") as f:
    f_content = f.read()
    print(f_content)

The is a txt file:
1.first line
2.second line
3.third line
4.最后一道防线！


In [19]:
# instead of one big string, we can get the lines separately:
# readlines() reads all of the lines into a list
with open("testfile.txt", "r") as f:
    f_contentline = f.readlines()
    print(f_contentline)

['The is a txt file:\n', '1.first line\n', '2.second line\n', '3.third line\n', '4.最后一道防线！']


In [25]:
with open("testfile.txt", "r") as f:
    for i in range(5):
        # each call to readline() returns the next line of the file
        f_line = f.readline()
        # end='' stops print from adding its own newline after each line
        print(f_line, end = '')

The is a txt file:
1.first line
2.second line
3.third line
4.最后一道防线！

In [27]:
# Iterating through the file
# get one line at a time
with open("testfile.txt", "r") as f:
    for line in f:
        print(line, end = '')

The is a txt file:
1.first line
2.second line
3.third line
4.最后一道防线！

In [31]:
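# Printing by characters:
# read(n) returns at most n characters; print the first 20 characters of the file
with open("testfile.txt", "r") as f:
    f_contents = f.read(20)
    print(f_contents)

    # the next call continues from where the previous read stopped
    f_contents = f.read(20)
    print(f_contents)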

The is a txt file:
1
.first line
2.second


In [34]:
# Iterating through small chunks:
with open("testfile.txt", "r") as f:
    size_to_read = 10
    f_contents = f.read(size_to_read)
    while len(f_contents) > 0:
        print(f_contents, end = '*')
        f_contents = f.read(size_to_read)

The is a t*xt file:
1*.first lin*e
2.second* line
3.th*ird line
4*.最后一道防线！*

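In the previous example, each call to read(20) stored the next 20 characters in f_contents. Here we use a while loop that reads 10 characters at a time until the end of the file is reached. At that point read() returns an empty string, so len(f_contents) is 0 and the loop exits. The file itself is closed when the with block ends.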

In [38]:
# For example:
with open("testfile.txt", "r") as f:
    size_to_read = 30
    f_contents1 = f.read(size_to_read)
    print(f_contents1)
    print(len(f_contents1))
    print('----divider----')
    f_contents2 = f.read(size_to_read)
    print(f_contents2)
    print(len(f_contents2))
    print('----divider----')
    f_contents3 = f.read(size_to_read)
    print(f_contents3)
    print(len(f_contents3))
    print('----divider----')
    f_contents4 = f.read(size_to_read)
    print(f_contents4)
    print(len(f_contents4))

The is a txt file:
1.first lin
30
----divider----
e
2.second line
3.third line
4
30
----divider----
.最后一道防线！
8
----divider----

0
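
The read-until-empty pattern can also be written with the two-argument form of `iter()`, which keeps calling a function until it returns a sentinel value. The sketch below is one possible variation, not the notebook's own approach; the scratch file and its two-line content are assumptions.

```python
import os
import tempfile
from functools import partial

# Hypothetical scratch file holding the first two lines of the sample text
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
original = "The is a txt file:\n1.first line\n"
with open(path, "w", encoding="utf-8") as f:
    f.write(original)

chunks = []
with open(path, "r", encoding="utf-8") as f:
    # iter(callable, sentinel) calls f.read(10) repeatedly and stops
    # as soon as it returns '' (the end-of-file signal)
    for chunk in iter(partial(f.read, 10), ""):
        chunks.append(chunk)
```

The last chunk is shorter than 10 characters because the file runs out, and joining all chunks gives back the original text.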


In [39]:
# see our current position in the file
with open("testfile.txt", "r") as f:
    size_to_read = 30
    f_contents = f.read(size_to_read)
    print(f.tell())

30


In [43]:
# what if we want the second read to start somewhere other than position 10?
with open("testfile.txt", "r") as f:
    size_to_read = 10
    f_contents = f.read(size_to_read)
    print(f_contents, end = '\n')
    # move back to position 6 before reading again
    f.seek(6)
    f_contents = f.read(size_to_read)
    print(f_contents)

The is a t
 a txt fil
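
A common way to combine the two calls is to save the value returned by `tell()` as a bookmark and pass it to `seek()` later. The following sketch assumes a one-line scratch file; in text mode it is safest to seek only to 0 or to a value that `tell()` returned.

```python
import os
import tempfile

# Hypothetical scratch file so the sketch is self-contained
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("The is a txt file:\n")

with open(path, "r", encoding="utf-8") as f:
    head = f.read(10)
    bookmark = f.tell()     # remember where the first read stopped
    rest = f.read()         # read everything after the bookmark
    f.seek(bookmark)        # jump back to the bookmark
    rest_again = f.read()
    f.seek(0)               # rewind to the start of the file
    head_again = f.read(10)
```

Reading after `seek(bookmark)` returns the same text as the first pass, and `seek(0)` lets us read the file from the beginning again.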


### 1.2 Write the files

To write to a file, you must open it in a writable mode ("w", "a", or "r+").

In [1]:
import os

In [2]:
os.chdir('/Users/hkmac/Desktop/Carzy_Allen_Github/Data_and_Testfile')

In [3]:
os.getcwd()

'/Users/hkmac/Desktop/Carzy_Allen_Github/Data_and_Testfile'

In [6]:
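# Writing:
# "w" creates the file if it does not exist; if it does exist, its contents are overwritten
with open("testfile2.txt", "w") as f:
    f.write("Test")
    # jump to position 10; the second write starts there, leaving a gap after the first "Test"
    f.seek(10)
    f.write("Test")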
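
To see what seeking past the end before writing actually produces, we can read the result back in binary mode. This sketch repeats the same writes on a scratch file (an assumed name in a temporary directory). On common platforms the six skipped positions are filled with null bytes, though that padding is the operating system's behavior rather than something the code writes itself.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "testfile2.txt")

with open(path, "w") as f:
    f.write("Test")   # occupies positions 0-3
    f.seek(10)        # skip ahead, past the current end of the file
    f.write("Test")   # occupies positions 10-13

# Read the raw bytes to inspect the file, including the gap
with open(path, "rb") as f:
    raw = f.read()

gap = raw[4:10]       # whatever the OS placed in the skipped positions
```

The file ends up 14 bytes long: the two words plus the six-position gap between them.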

In [5]:
# Copying files:
with open("testfile.txt", "r") as rf:
    with open("test_copy.txt", "w") as wf:
        # here we have two files open: rf for reading the original file and wf for writing our copy
        for line in rf:
            wf.write(line)

In [7]:
# Copy an image file
# We need to read and write bytes instead of text,
# so we open both files in binary mode
with open("cat.jpg", "rb") as rf:
    with open("cat_copy.jpg", "wb") as wf:
        for line in rf:
            wf.write(line)

In [None]:
# Copying the image with chunks:
with open("cat.jpg", "rb") as rf:
    with open("cat_copy.jpg", "wb") as wf:
        chunk_size = 4096
        rf_chunk = rf.read(chunk_size)
        while len(rf_chunk) > 0:
            wf.write(rf_chunk)
            rf_chunk = rf.read(chunk_size)
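
For comparison, the standard library's `shutil.copyfileobj` performs the same kind of chunked copy between two open file objects. The sketch below is an alternative to the manual loop, not a replacement for it; the file names and the stand-in byte data are assumptions so that the example does not need `cat.jpg`.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "source.bin")   # hypothetical stand-in for cat.jpg
dst = os.path.join(workdir, "copy.bin")

# 10,240 bytes of stand-in binary data
payload = bytes(range(256)) * 40
with open(src, "wb") as f:
    f.write(payload)

# Copy in chunks of 4096 bytes; both files are opened in binary mode
with open(src, "rb") as rf, open(dst, "wb") as wf:
    shutil.copyfileobj(rf, wf, 4096)

with open(dst, "rb") as f:
    copied = f.read()
```

Comparing `copied` with `payload` is a quick way to confirm that a binary copy is byte-for-byte identical.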