## Working with Files

### Read a text file with `read()`, `readline()` and `readlines()`
*read()* - Read the whole file as a single string  
*readline()* - Read one line at a time  
*readlines()* - Read all the lines into a list  

In [1]:
fh = open("sample.txt", mode='r')  # open the file for reading; raises FileNotFoundError if it does not exist

data = fh.read()                   # read the whole file in one go; not the preferred method if the file is huge
fh.close()                         # close the file handle

print(data[:60])                   # print the first 60 characters

Docker containers and microservices have become more or less


In [2]:
print(f'Total Characters in file: {len(data)}')

Total Characters in file: 10307


### Using `readline()` method

In [11]:
cnt = 0

fh = open("sample.txt", mode='r')       # open the file for reading
while True:                             # loop until the end of the file is reached
    line = fh.readline()                # read one line at a time
    if not line:                        # readline() returns '' at end of file, so stop looping
        break
    
    cnt = cnt+1                         # a line was read, so increase the count by 1

print(f'Total lines in file: {cnt}')    # print the total count

fh.close()

Total lines in file: 146


### Using `readlines()` method

In [13]:
fh = open("sample.txt", mode='r')  # open the file for reading
lines = fh.readlines()                # read all lines at once and store them in a list

print(f'Total lines in file: {len(lines)}')    # print the total count

fh.close()

Total lines in file: 146


In [9]:
type(lines)

list

In [10]:
lines[:3]

['Docker containers and microservices have become more or less analogous to powering the cloud\n',
 'No conversation about the cloud is without these terms and advocates for microservices tend to pitch it as a remedy for large monolithic ailments\n',
 "It's also a solution to come up to speed with the rest of the new software out there that is being developed currently\n"]

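To compare the three methods side by side without depending on `sample.txt`, here is a minimal sketch that writes a small temporary file first. The file name `demo.txt` and its three lines are made up for illustration.

```python
import os
import tempfile

# Hypothetical three-line file created only for this demonstration
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(demo_path, mode="w") as fh:
    fh.write("alpha\nbeta\ngamma\n")

with open(demo_path, mode="r") as fh:
    whole = fh.read()            # one string holding the entire file

with open(demo_path, mode="r") as fh:
    first = fh.readline()        # only the first line, newline included

with open(demo_path, mode="r") as fh:
    all_lines = fh.readlines()   # list with one string per line

print(repr(whole))
print(repr(first))
print(all_lines)
```

Each line returned by `readline()` and `readlines()` keeps its trailing `\n`, which is why the list shown above contains it.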
### Close the file automatically with a `with` statement

In [14]:
with open("sample.txt", mode='r') as fh: 
    lines = fh.readlines()                # read all lines at once and store them in a list

print(f'Total lines in file: {len(lines)}')    # print the total count

Total lines in file: 146
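One reason to prefer `with` is that the file is closed even if the code inside the block raises an exception. The sketch below simulates such a failure; the temporary file and the `ValueError` are made up for illustration.

```python
import os
import tempfile

# Hypothetical file created only for this demonstration
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(demo_path, mode="w") as fh:
    fh.write("one line\n")

try:
    with open(demo_path, mode="r") as fh:
        fh.read()
        raise ValueError("simulated failure while processing")
except ValueError:
    pass                          # the with block has already closed the file

closed_after_error = fh.closed    # the name fh is still bound after the block
print(closed_after_error)
```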


### Move the read pointer to a specific location with `seek()`
Once you have read the whole file through a file handle, the file pointer sits at the end of the file. Reading from the same handle again returns an empty string. To read the data again, move the pointer back to the start of the file.  

In [30]:
fh = open("data.csv")  # open the file (default mode 'r'); nothing is read yet
data = fh.read()

data_2 = fh.read()

In [31]:
print(len(data))
print(len(data_2))

589657
0


In [33]:
fh = open("data.csv")  # open the file (default mode 'r'); nothing is read yet
data = fh.read()

fh.seek(0,0)   # move the read pointer back to the very start
data_2 = fh.read()  

print(len(data))
print(len(data_2))

589657
589657
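`seek()` is not limited to the start of the file. In text mode the offsets that are safe to pass are `0` and values previously returned by `tell()`, so a common pattern is to remember a position and return to it later. A minimal sketch, using a made-up temporary file:

```python
import os
import tempfile

# Hypothetical three-line file created only for this demonstration
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(demo_path, mode="w") as fh:
    fh.write("alpha\nbeta\ngamma\n")

with open(demo_path, mode="r") as fh:
    first = fh.readline()
    mark = fh.tell()              # remember where the second line starts
    second = fh.readline()
    fh.seek(mark)                 # jump back to the remembered position
    second_again = fh.readline()  # reads the same line a second time

print(repr(second), repr(second_again))
```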


### A few file handle attributes

In [29]:
fh = open("data.csv")  # open the file (default mode 'r')


print(f"file name: {fh.name}")
print(f"file closed or not: {fh.closed}")
print(f"file opening mode: {fh.mode}") 

fh.close()
print(f"file closed or not: {fh.closed}")

file name: data.csv
file closed or not: False
file opening mode: r
file closed or not: True


### Write to a file with `write()`
`write()` writes a string to a file. When the file is opened with `mode='w'`, an existing file is overwritten, and a new file is created if it does not exist. 

In [24]:
# Writing to a new file -   

with open('test.txt', mode='w') as fh:
    fh.write("this is the first line. \n")                     # write a single line
    fh.write("this is the second line. \n")
    
    fh.writelines(["third line\n", 'forth line\n'])            # write multiple lines; writelines() adds no newlines, so each string ends with \n

In [25]:
# reading the same file 

with open('test.txt', mode='r') as fh:
    print(fh.read())

this is the first line. 
this is the second line. 
third line
forth line
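The overwrite behaviour comes from the mode passed to `open()`, not from `write()` itself. The sketch below contrasts `mode='w'` (truncate) with `mode='a'` (append); the temporary file name is made up for illustration.

```python
import os
import tempfile

# Hypothetical file created only for this demonstration
demo_path = os.path.join(tempfile.mkdtemp(), "modes.txt")

with open(demo_path, mode="w") as fh:   # creates the file
    fh.write("first\n")

with open(demo_path, mode="w") as fh:   # 'w' again: the old content is discarded
    fh.write("second\n")

with open(demo_path, mode="a") as fh:   # 'a': new text is added at the end
    fh.write("third\n")

with open(demo_path, mode="r") as fh:
    content = fh.read()

print(content)
```

Only `second` and `third` remain, because the second `open(..., mode='w')` emptied the file before writing.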



### Preferred way to read a file
When you need to process a file line by line, iterate over the file object directly. Only one line is held in memory at a time.

In [26]:
count = 0
with open("sample.txt") as file:         
    for line in file:                   # this will read the file line by line
        count = count+1

print(f"Total lines: {count}")           # No file.close() statement required. 

Total lines: 146
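The same loop combines well with `enumerate()` when the line number is needed too. A small sketch on a made-up temporary file:

```python
import os
import tempfile

# Hypothetical three-line file created only for this demonstration
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(demo_path, mode="w") as fh:
    fh.write("alpha\nbeta\ngamma\n")

numbered = []
with open(demo_path) as file:
    for number, line in enumerate(file, start=1):
        # rstrip() drops the trailing newline (and other trailing whitespace)
        numbered.append(f"{number}: {line.rstrip()}")

print(numbered)
```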


### Read a file with a `try/except` block
What if the file does not exist, or some other problem occurs while accessing it? Wrapping the read in `try/except` lets the program handle the error instead of crashing.  

In [28]:

file_name = input("Enter the file name: ")

try:
    count = 0
    with open(file_name) as fh:
        for line in fh:                 # read the file line by line
            count = count+1

except FileNotFoundError:
    print("File does not exist.")

except Exception as e:
    print("Unspecified error occurred. - ", type(e))
    
else:
    print(f"Total lines: {count}")

Enter the file name:  sample


File does not exist.
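The same pattern can be wrapped in a function so the caller decides what to do when the file is missing. This is one possible sketch; the name `count_lines` and the choice to return `None` for a missing file are assumptions made for illustration.

```python
import os
import tempfile


def count_lines(file_name):
    """Return the number of lines in file_name, or None if it does not exist."""
    try:
        with open(file_name) as fh:
            return sum(1 for _ in fh)   # count lines without storing them
    except FileNotFoundError:
        return None


# Hypothetical files used only for this demonstration
work_dir = tempfile.mkdtemp()
existing = os.path.join(work_dir, "two_lines.txt")
with open(existing, mode="w") as fh:
    fh.write("one\ntwo\n")

found = count_lines(existing)
missing = count_lines(os.path.join(work_dir, "no_such_file.txt"))
print(found, missing)
```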


### File Methods

In [39]:
fh = open("data.csv")  # open the file (default mode 'r'); nothing is read yet
data = fh.read()       # read all the characters from the file

print(f'Total Char:  {len(data)}')
print(f'Current offset: {fh.tell()}')              # current offset in bytes, which can differ from the character count

Total Char:  589657
Current offset: 592718
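The offset above (592718) is larger than the character count (589657) by 3061, which matches the line count reported further down. That would be consistent with Windows `\r\n` line endings being translated to `\n` on read, although the notebook does not state this. Multi-byte characters cause the same kind of gap, as this sketch on a made-up temporary file shows. It opens the file with `newline=""` so that no newline translation takes place.

```python
import os
import tempfile

# Hypothetical file created only for this demonstration
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(demo_path, mode="w", encoding="utf-8", newline="") as fh:
    fh.write("caf\u00e9\n")       # 'é' takes two bytes in UTF-8

with open(demo_path, mode="r", encoding="utf-8", newline="") as fh:
    data = fh.read()
    char_count = len(data)        # number of characters read
    byte_offset = fh.tell()       # position in the underlying file, in bytes

size_on_disk = os.path.getsize(demo_path)
print(char_count, byte_offset, size_on_disk)
```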


In [38]:
fh.seek(0,0)                     #  Resetting offset at beginning of file

data_lines = fh.readlines()      # Read all the lines from file

print(f'Total Lines: {len(data_lines)}')

Total Lines: 3061


In [40]:
# Count occurrences of a substring

data.count('Dollars')

3060

In [42]:
# Total lines in file, counted as newline characters
data.count("\n")

3061

In [43]:
# Total lines in file 
len(data_lines)

3061

In [44]:
data_lines[:3]

['Series_reference,Period,Data_value,Suppressed,STATUS,UNITS,Magnitude,Subject,Group,Series_title_1,Series_title_2,Series_title_3,Series_title_4,Series_title_5\n',
 'BDCQ.SF1AA2CA,2016.06,1116.386,,F,Dollars,6,Business Data Collection - BDC,Industry by financial variable,Sales (operating income),Forestry and Logging,Current prices,Unadjusted,\n',
 'BDCQ.SF1AA2CA,2016.09,1070.874,,F,Dollars,6,Business Data Collection - BDC,Industry by financial variable,Sales (operating income),Forestry and Logging,Current prices,Unadjusted,\n']

The files you work with will not always live in the same directory as your Python program.  
To build and traverse file paths, we use the `pathlib` module.

In [1]:
import pathlib as p 

In [2]:
p.Path.home()  # User's home directory

WindowsPath('C:/Users/Atul')

In [3]:
p.Path.cwd()   # User's current working directory

WindowsPath('D:/Linux-Shared/DATA/mydata/github/Python/python_workshop/class9')

In [4]:
# Create path variable

test_dir = p.Path("D:/Linux-Shared/")
test_dir

WindowsPath('D:/Linux-Shared')

In [12]:
test_dir = p.Path.cwd() / "python_out"
test_dir

WindowsPath('D:/Linux-Shared/DATA/mydata/github/Python/python_workshop/class9/python_out')

In [13]:
# check whether the path exists
test_dir.exists()

False

In [15]:
# Create the directory; do nothing if it already exists

test_dir.mkdir(exist_ok=True)

In [16]:
test_dir.exists()

True
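A `Path` object can also read and write small files directly and expose parts of the path. The sketch below works inside a temporary directory so it does not touch real data; the names `python_out` and `note.txt` are used only for illustration.

```python
import pathlib as p
import tempfile

# Temporary base directory so the example leaves the working directory alone
base = p.Path(tempfile.mkdtemp())

out_dir = base / "python_out"           # join path parts with the / operator
out_dir.mkdir(exist_ok=True)

note = out_dir / "note.txt"
note.write_text("hello from pathlib\n")  # open, write and close in one call
content = note.read_text()               # open, read and close in one call

print(content)
print(note.name, note.suffix, note.parent.name)
```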

### File Operations   

With the help of the `os` library we can perform many OS-specific operations on files, such as those mentioned below: 

#### Check Existence with `exists()`

In [1]:
import os 

os.path.exists('sample.txt')

True

In [2]:
os.path.exists('sample')

False

#### Check type with `isfile()`

In [3]:
os.path.isfile('sample.txt')

True

In [4]:
os.path.isfile('sample_2.txt')

False

#### Change name with `rename()`

In [7]:
os.rename('sample.txt', 'sam.txt')           # source file name, target file name

#### Delete a file with `remove()`

In [8]:
os.remove('test.txt')

In [9]:
os.path.isfile('test.txt')

False
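Both `os.rename()` and `os.remove()` raise `FileNotFoundError` when the source file is missing, so it can help to check first. A minimal sketch inside a temporary directory, with made-up file names:

```python
import os
import tempfile

# Hypothetical files created only for this demonstration
work_dir = tempfile.mkdtemp()
old_name = os.path.join(work_dir, "old.txt")
new_name = os.path.join(work_dir, "new.txt")

with open(old_name, mode="w") as fh:
    fh.write("data\n")

os.rename(old_name, new_name)            # source file name, target file name
renamed = os.path.exists(new_name) and not os.path.exists(old_name)

if os.path.isfile(new_name):             # remove only if the file is there
    os.remove(new_name)
removed = not os.path.exists(new_name)

print(renamed, removed)
```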

### Directory Operations

#### Create a directory with `mkdir()`

In [10]:
os.mkdir('stories')

In [11]:
os.path.exists('stories')

True

In [12]:
os.path.isfile('stories')

False

In [13]:
os.path.isdir('stories')

True
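`os.mkdir()` creates a single directory and raises `FileExistsError` if it already exists. For nested directories, or when the directory may already be there, `os.makedirs()` with `exist_ok=True` is a common alternative. A sketch inside a temporary directory; the `drafts` subdirectory is made up for illustration.

```python
import os
import tempfile

# Temporary base directory so the example leaves the working directory alone
work_dir = tempfile.mkdtemp()
stories = os.path.join(work_dir, "stories")
nested = os.path.join(stories, "drafts")

os.makedirs(nested, exist_ok=True)   # creates 'stories' and 'stories/drafts'
os.makedirs(nested, exist_ok=True)   # second call is a no-op, not an error

entries = os.listdir(stories)        # names inside 'stories'
print(entries)

os.rmdir(nested)                     # rmdir() removes an empty directory
still_there = os.path.isdir(nested)
print(still_there)
```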