**Opening and Closing Files**

+ Python, together with its associated packages, can read and write many file formats.
+ Supported file types include txt, csv, tsv, xls, xlsx, doc, docx, dat, output, and sql, plus some specialized formats such as R, .dta, and sas7bdat.
+ Here we cover some of these options; the rest are covered in the session on pandas.
+ To open a file, call the built-in open() function with the file name and the mode of opening (read, write, append, etc.).
+ open() returns a file handle:
+ handle = open(filename, mode)
+ Once the work is done, always close the file; otherwise other programs might not be able to access it.
+ A file is closed with its close() method.

Syntax:

```python
file_handle = open(<filename>, <mode>)
```
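
A minimal sketch of the open, use, close pattern described above. The file name `demo_open_close.txt` and the temporary directory are assumptions made for illustration, so the example does not touch any real file.

```python
import os
import tempfile

# Hypothetical file inside a throwaway temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_open_close.txt")

handle = open(demo_path, "w")   # open() returns a file handle
try:
    handle.write("hello\n")
finally:
    handle.close()              # release the file even if the write fails

print(handle.closed)
```

Wrapping the work in `try`/`finally` is one way to make sure `close()` always runs.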

In [2]:
pwd

'C:\\Users\\Subba Reddy Yeruva\\Desktop\\Python Classes'

In [3]:
f = open('abc.txt', 'r')

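<b>Modes:</b><br>
<u>Text Modes</u>

    r or rt - read mode; raises FileNotFoundError (a subclass of OSError/IOError) if the file does not exist
    w or wt - write mode; creates the file if it does not exist and truncates it if it does
    a or at - append mode; like write mode, but writing starts at the end of the file

    r+ or rt+ - read and write; the file must exist and is not truncated
    w+ or wt+ - write and read; the file is created or truncated
    a+ or at+ - append and read

<u>Binary Modes</u>

    rb - binary read
    wb - binary write
    ab - binary append
    rb+ - read and write in binary; the file must exist and is not truncated
    wb+ - write and read in binary; the file is created or truncated
    ab+ - read and append in binary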
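
The behaviour of the most common modes can be checked with a short experiment. This is only a sketch: the file name `modes_demo.txt` and the temporary directory are assumptions, not files from this session.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")  # hypothetical file

# 'r' on a missing file raises FileNotFoundError.
try:
    open(path, "r")
    missing_raised = False
except FileNotFoundError:
    missing_raised = True

with open(path, "w") as fh:   # 'w' creates the file
    fh.write("one\n")
with open(path, "w") as fh:   # 'w' again truncates the existing content
    fh.write("two\n")
with open(path, "a") as fh:   # 'a' keeps the content and writes at the end
    fh.write("three\n")

with open(path, "r") as fh:   # text mode returns str
    content = fh.read()
with open(path, "rb") as fh:  # binary mode returns bytes
    raw = fh.read()

print(missing_raised, repr(content), type(raw).__name__)
```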

In [None]:
# Files - regular files

# Text Files

In [4]:
import os
os.getcwd()

'C:\\Users\\Subba Reddy Yeruva\\Desktop\\Python Classes'

In [5]:
pwd

'C:\\Users\\Subba Reddy Yeruva\\Desktop\\Python Classes'

In [6]:
#os.chdir('C:\\Users\\syeruva4\\Desktop\\Python Classes\\myfile.txt')

In [7]:
# f is a file handle.
f = open('abc.txt')

In [8]:
print (type(f))
print (f)

<class '_io.TextIOWrapper'>
<_io.TextIOWrapper name='abc.txt' mode='r' encoding='cp1252'>


In [None]:
# modes
# r -> read the file.
# w -> write into the file.
# NOTE: if the file exists, w overwrites it; if it doesn't exist, w creates it.
# a -> append. Appends data to the end of the file.
# b -> binary.
# rb, wb, ab
# r+ -> read and write the same file

In [50]:
print (dir(f))

['_CHUNK_SIZE', '__class__', '__del__', '__delattr__', '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__lt__', '__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '_checkClosed', '_checkReadable', '_checkSeekable', '_checkWritable', '_finalizing', 'buffer', 'close', 'closed', 'detach', 'encoding', 'errors', 'fileno', 'flush', 'isatty', 'line_buffering', 'mode', 'name', 'newlines', 'read', 'readable', 'readline', 'readlines', 'seek', 'seekable', 'tell', 'truncate', 'writable', 'write', 'writelines']


In [None]:
# reading a file.
print (help(f.read))

In [55]:
#f.seek(0)

0

In [9]:
print (f.read(6))

first 


In [10]:
print(f.read())

line 
second line 
third line
4th line 


In [11]:
print(f.read())




In [12]:
# f.tell -> current position
# help() prints its text and returns None, so wrapping it in print() also prints "None"
print (help(f.tell))

Help on built-in function tell:

tell() method of _io.TextIOWrapper instance
    Return current stream position.

None


In [13]:
print (f.tell())

48


In [None]:
# f.seek --> New absolute position
print (help(f.seek))


In [14]:
print (f.seek(0))


0


In [15]:
print (f.tell())

0


In [16]:
print(f.read())

first line 
second line 
third line
4th line 
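
The interplay of `read`, `tell` and `seek` shown above can be reproduced on a small file. The sketch below uses binary mode and a hypothetical file `seek_demo.txt`, so that positions are plain byte offsets. In text mode `tell()` returns an opaque number; the `48` seen earlier is likely a byte count, since the four lines hold 45 characters and Windows stores each of the three `\n` as `\r\n`.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "seek_demo.txt")  # hypothetical file
payload = b"first line \r\nsecond line \r\n"               # 27 bytes in total

with open(path, "wb") as fh:
    fh.write(payload)

with open(path, "rb") as fh:
    head = fh.read(6)            # read the first 6 bytes
    pos_after_head = fh.tell()   # position is now 6
    fh.read()                    # read the rest of the file
    pos_at_end = fh.tell()       # position equals the file size
    fh.seek(0)                   # jump back to the beginning
    pos_after_seek = fh.tell()
    again = fh.read(5)           # reading starts from the beginning again

print(head, pos_after_head, pos_at_end, pos_after_seek, again)
```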


In [None]:
# f.readline
print (help(f.readline))

In [17]:
f.seek(0)
print (f.readline(2))

fi


In [None]:
# f.readlines
print (help(f.readlines))

In [18]:
f.seek(0)
my_strings = f.readlines()

In [19]:
my_strings

['first line \n', 'second line \n', 'third line\n', '4th line ']

In [None]:
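# f.readinto is available only on binary file objects (for example, files opened with 'rb')
#print (help(f.readinto))
# f.xreadlines existed only in Python 2; in Python 3, iterate over the file object instead
#print (help(f.xreadlines))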

In [20]:
g = open('myfile1.txt','w')

In [21]:
# g.write
g.write("This is my 1st line.\nThis is my 2nd line.\nThis is my 3rd line.\nThis is my fourth line.\n")

87

In [None]:
# flush
print (help(g.flush))
# close
print (help(g.close))

In [22]:
g.flush()

In [23]:
help(g.flush)

Help on built-in function flush:

flush() method of _io.TextIOWrapper instance
    Flush write buffers, if applicable.
    
    This is not implemented for read-only and non-blocking streams.
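
A small sketch of what `write` and `flush` do, using a hypothetical file `flush_demo.txt` in a temporary directory. The size before flushing is printed but not relied on, because whether data is still buffered at that point is an implementation detail.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "flush_demo.txt")  # hypothetical file

out = open(path, "w")
written = out.write("buffered text\n")  # write() returns the number of characters written
size_before = os.path.getsize(path)     # often 0: the text may still sit in the buffer
out.flush()                             # hand the buffered data to the operating system
size_after = os.path.getsize(path)      # the data is now visible on disk
out.close()

print(written, size_before, size_after)
```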



In [24]:
g

<_io.TextIOWrapper name='myfile1.txt' mode='w' encoding='cp1252'>

In [25]:
g.close()


In [68]:
g

<_io.TextIOWrapper name='myfile1.txt' mode='w' encoding='cp1252'>

In [26]:
# closed
print (g.closed)

True


In [27]:
g.write("hey i am writing into the file.")

ValueError: I/O operation on closed file.

In [28]:
# conditional flow
if g.closed:
    print ("the file is closed")
else:
    print (g.write("hey i am writing into the file."))

the file is closed


In [72]:
# Exceptions
try:
    g.write("hey i am writing into the file.")
except ValueError:
    print ("Sorry!! the file is closed.")
else:
    # runs only when the write above succeeded; do not write a second time here
    print ("write succeeded")
finally:
    g.close()

Sorry!! the file is closed.


In [75]:
pwd

'C:\\Users\\Subba Reddy Yeruva\\Desktop\\Python Classes'

In [29]:
# with
with open('myfile1.txt','a') as h:
    h.write("hey i am writing into the file.")

In [30]:
print (h)

<_io.TextIOWrapper name='myfile1.txt' mode='a' encoding='cp1252'>
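
The handle printed above still exists after the `with` block, but it has already been closed. A minimal sketch, assuming a hypothetical file `with_demo.txt`, shows that the file is closed both on normal exit and when an exception escapes the block.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")  # hypothetical file

with open(path, "w") as fh:
    fh.write("inside the block\n")
closed_after_block = fh.closed      # the with statement closed the file

try:
    with open(path, "a") as fh2:
        fh2.write("more\n")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass
closed_after_error = fh2.closed     # closed even though the block raised

print(closed_after_block, closed_after_error)
```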


In [31]:
# h.writelines -> writes every string in a list; it does not add newlines
print (my_strings)
h = open('myfile1.txt','a')
h.writelines(my_strings)

['first line \n', 'second line \n', 'third line\n', '4th line ']


In [32]:
my_strings

['first line \n', 'second line \n', 'third line\n', '4th line ']

In [82]:
# close() needs parentheses; a bare h.close only references the method and leaves the file open
h.close()

In [None]:
# next(f)   --> reads the file line by line (this was f.next() in Python 2)
# At EOF    --> raises the exception StopIteration
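
A short sketch of line-by-line reading. An in-memory `io.StringIO` object stands in for an open text file here (an assumption made so the example needs no file on disk); real file handles behave the same way under `next()` and `for`.

```python
import io

# In-memory stand-in for an open text file.
fh = io.StringIO("first line\nsecond line\n")

first = next(fh)     # each call returns the next line
second = next(fh)

try:
    next(fh)         # nothing left to read
    hit_eof = False
except StopIteration:
    hit_eof = True

fh.seek(0)
# A for loop calls next() internally and stops cleanly at end of file.
lines = [line.rstrip("\n") for line in fh]

print(repr(first), repr(second), hit_eof, lines)
```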

In [86]:
path = 'C:/Users/Subba Reddy Yeruva/Desktop/Python Classes/myfile.txt'



days_file = open(path,'r')
days = days_file.read()


new_path = 'C:/Users/Subba Reddy Yeruva/Desktop/Python Classes/new_days.txt'
new_days = open(new_path,'w')

title = 'Days of the Week\n'
new_days.write(title)
print(title)

new_days.write(days)
print(days)

days_file.close()
new_days.close()

Days of the Week

line1
line2 
line3 


In [None]:
# How do you update a file based on some criteria?
# There are many options.
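
One possible option, sketched under assumptions: read all lines, change those that match a criterion, and write the result back. The file `days_demo.txt`, its contents, and the criterion (lines starting with "s") are all hypothetical.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "days_demo.txt")  # hypothetical file
with open(path, "w") as fh:
    fh.write("monday\ntuesday\nsunday\n")

# Step 1: read the whole file into a list of lines.
with open(path) as fh:
    lines = fh.read().splitlines()

# Step 2: transform only the lines that meet the criterion.
updated = [line.upper() if line.startswith("s") else line for line in lines]

# Step 3: overwrite the file with the updated lines.
with open(path, "w") as fh:
    fh.write("\n".join(updated) + "\n")

with open(path) as fh:
    result = fh.read()
print(result)
```

This approach suits files that fit comfortably in memory; very large files are usually processed line by line into a second file instead.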

# CSV files

In [35]:
import pandas as pd

df = pd.read_csv('attendees.csv',header=None)
df

           0  1               2
0     Bharat  1   abc@gmail.com
1       venu  2  abc1@gmail.com
2    sandeep  3  abc2@gmail.com
3    karthik  4  abc3@gmail.com
4  narasimha  5  abc4@gmail.com



In [36]:
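import csv

# newline='' is the setting the csv module recommends for CSV files;
# the with block closes the file when the loop is done
with open('attendees.csv', newline='') as f:
    csv_f = csv.reader(f)
    for row in csv_f:
        print (row)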

['Bharat', '1', 'abc@gmail.com']
['venu', '2', 'abc1@gmail.com']
['sandeep', '3', 'abc2@gmail.com']
['karthik', '4', 'abc3@gmail.com']
['narasimha', '5', 'abc4@gmail.com']


In [37]:
import csv

with open('attendees.csv', newline='') as f:
    csv_f = csv.reader(f)
    for row in csv_f:
        print (row[2])    # third column: the email address

abc@gmail.com
abc1@gmail.com
abc2@gmail.com
abc3@gmail.com
abc4@gmail.com


In [38]:

import csv

attendee_emails = []

with open('attendees.csv', newline='') as f:
    csv_f = csv.reader(f)
    for row in csv_f:
        attendee_emails.append(row[2])    # collect the email column

print (attendee_emails)


['abc@gmail.com', 'abc1@gmail.com', 'abc2@gmail.com', 'abc3@gmail.com', 'abc4@gmail.com']
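
Indexing with `row[2]` works, but named columns can be easier to read. The sketch below uses `csv.DictReader` with explicit `fieldnames`, because the attendee data has no header row. The column names `name`, `id` and `email` are assumptions, and an in-memory `io.StringIO` holding two of the rows shown above stands in for `attendees.csv`.

```python
import csv
import io

# In-memory stand-in for attendees.csv (two rows copied from the output above).
sample = io.StringIO("Bharat,1,abc@gmail.com\nvenu,2,abc1@gmail.com\n")

# fieldnames supplies the column names, since the data has no header row.
reader = csv.DictReader(sample, fieldnames=["name", "id", "email"])
emails = [row["email"] for row in reader]

print(emails)
```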


In [39]:
import csv

# Email column of the first file
with open('attendees.csv', newline='') as f:
    attendee_emails1 = [row[2] for row in csv.reader(f)]

# Email column of the second file
with open('attendees2.csv', newline='') as f:
    attendee_emails2 = [row[2] for row in csv.reader(f)]

attendee_emails11 = set(attendee_emails1)
attendee_emails22 = set(attendee_emails2)

# B - A: emails present in the second file but not in the first
second_year_attendees = attendee_emails22.difference(attendee_emails11)

print (second_year_attendees)


{'abc6@gmail.com', 'abc5@gmail.com'}
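
The same set logic can answer related questions, such as who attended both times. This is a sketch on in-memory data: the two `io.StringIO` objects stand in for the two CSV files, and the row for "ravi" is invented purely for illustration.

```python
import csv
import io

# In-memory stand-ins for the two attendee files; the "ravi" row is hypothetical.
year1 = io.StringIO("Bharat,1,abc@gmail.com\nvenu,2,abc1@gmail.com\n")
year2 = io.StringIO("venu,2,abc1@gmail.com\nravi,6,abc5@gmail.com\n")

def email_set(stream):
    """Return the set of values found in the third column."""
    return {row[2] for row in csv.reader(stream)}

first_year = email_set(year1)
second_year = email_set(year2)

new_attendees = second_year - first_year        # B - A, same as difference()
returning_attendees = second_year & first_year  # in both files

print(new_attendees, returning_attendees)
```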


In [40]:
import csv

# newline='' prevents the extra blank line after every row that appears on Windows
with open("dummy.csv", 'wt', newline='') as f:
    writer = csv.writer(f)
    writer.writerow( ('Title 1', 'Title 2', 'Title 3') )
    for i in range(10):
        writer.writerow( (i+1, chr(ord('a') + i), '08/%02d/07' % (i+1)) )

with open("dummy.csv", 'rt') as f:
    print (f.read())

Title 1,Title 2,Title 3
1,a,08/01/07
2,b,08/02/07
3,c,08/03/07
4,d,08/04/07
5,e,08/05/07
6,f,08/06/07
7,g,08/07/07
8,h,08/08/07
9,i,08/09/07
10,j,08/10/07




In [41]:
        
import csv

#csv.register_dialect('pipes', delimiter='|')

# newline='' prevents the extra blank line after every row that appears on Windows
with open("dummy1.csv", 'wt', newline='') as f:
    writer = csv.writer(f, delimiter="|")
    writer.writerow( ('Title 1', 'Title 2', 'Title 3') )
    for i in range(10):
        writer.writerow( (i+1, chr(ord('a') + i), '08/%02d/07' % (i+1)) )

with open("dummy1.csv", 'rt') as f:
    print (f.read())

Title 1|Title 2|Title 3
1|a|08/01/07
2|b|08/02/07
3|c|08/03/07
4|d|08/04/07
5|e|08/05/07
6|f|08/06/07
7|g|08/07/07
8|h|08/08/07
9|i|08/09/07
10|j|08/10/07
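
As an alternative to passing `delimiter="|"` each time, a named dialect can be registered once and reused for both writing and reading. The sketch below writes to an in-memory `io.StringIO` buffer instead of `dummy1.csv`, which is an assumption made to keep the example self-contained.

```python
import csv
import io

# Register a reusable dialect named 'pipes' that uses '|' as the delimiter.
csv.register_dialect("pipes", delimiter="|")

buffer = io.StringIO(newline="")   # in-memory stand-in for a file opened with newline=''
writer = csv.writer(buffer, dialect="pipes")
writer.writerow(("Title 1", "Title 2", "Title 3"))
writer.writerow((1, "a", "08/01/07"))

buffer.seek(0)
# Reading with the same dialect splits the fields on '|' again.
rows = list(csv.reader(buffer, dialect="pipes"))

print(rows)
```

Note that the reader returns every field as a string, so the `1` written above comes back as `'1'`.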




# SAS files Handling

In [None]:
#!pip install sas7bdat    # to install sas7bdat
# Import only the reader class instead of using a wildcard import
from sas7bdat import SAS7BDAT

# To convert a SAS file in to a text file

In [None]:
data = SAS7BDAT('a120926s1d.sas7bdat')

In [None]:
data.convert_file('sas_data.txt','\t')

In [None]:
sf = open('sas_data.txt')
print (sf.read())
sf.seek(0)

In [None]:
print(sf.readline())

In [None]:
with SAS7BDAT('a120926s1d.sas7bdat') as f:
    for row in f:
        print (row)
    f.seek(0)

In [None]:
# Introducing pandas here only to help understand the data
import pandas as pd

In [None]:
with SAS7BDAT('a120926s1d.sas7bdat') as g:
    df = g.to_data_frame()   #to data frame conversion 

In [None]:
df.head(5)

In [None]:
df.dtypes

In [None]:
#import pandas as pd

#data = pd.read_sas('a120926s1d.sas7bdat')
#pd.DataFrame.to_csv(data, 'sastocsv_file.csv')

In [None]:
# XML data
# The BeautifulSoup package is one option.
# It is not covered at this stage as part of processing the data.

# Database Connection

In [None]:
# Please note that the commands below are for illustration purposes only

In [27]:
import MySQLdb as mdb

ImportError: No module named 'MySQLdb'

In [28]:
!pip install MySQLdb

Collecting MySQLdb


  Could not find a version that satisfies the requirement MySQLdb (from versions: )
No matching distribution found for MySQLdb

# Note: on PyPI the MySQLdb module is distributed in the package named mysqlclient,
# so the install command would be: pip install mysqlclient


In [None]:
# Select Query
# pyodbc package for ODBC connector
import MySQLdb as mdb
con = mdb.connect('localhost','user52','user52','dbname')     # 10.0.1    # pseudo name # ASFGPROD
# connection = mdb.connect(Mysqlserver,user,password,database)
cur = con.cursor()
cur.execute('select user()')    # select * from table where a = 1
my_user = cur.fetchone()[0].split('@')[0]    # 'user@host' -> 'user'
print ("the user i am connected to is {}".format(my_user))

In [None]:
df2 = pd.read_sql("select * from table where a = 1",con)

In [None]:
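# Insert Query
# Assumes the students table from the "Create table" cell already exists;
# the row values are placeholders for illustration.
import MySQLdb as mdb
con = mdb.connect('localhost','user52','user52','dbname')
# connection = mdb.connect(Mysqlserver,user,password,database)
cur = con.cursor()
cur.execute("insert into students (name, gender) values (%s, %s)", ("venu", "male"))
con.commit()    # make the insert permanent
con.close()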

In [None]:
# Create table

import MySQLdb as mdb
con = mdb.connect('localhost','user52','user52','dbname')
# connection = mdb.connect(Mysqlserver,user,password,database)
cur = con.cursor()
cur.execute("create table students (name varchar(10),gender varchar(6))")
con.close()
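
Because the MySQL cells above need a running server, the same connect, cursor, execute, fetch flow is sketched here with the standard-library `sqlite3` module and an in-memory database. This is a swapped-in stand-in rather than the MySQL setup from these notes: both modules follow the Python DB-API, but `sqlite3` uses `?` placeholders where MySQLdb uses `%s`. The inserted row is hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL server.
con = sqlite3.connect(":memory:")
cur = con.cursor()

cur.execute("create table students (name varchar(10), gender varchar(6))")

# Parameterized insert: values are passed separately from the SQL text.
cur.execute("insert into students (name, gender) values (?, ?)", ("venu", "male"))
con.commit()

cur.execute("select name, gender from students where gender = ?", ("male",))
rows = cur.fetchall()   # list of tuples, one per matching row

con.close()
print(rows)
```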

In [None]:
#conn = pyodbc.connect("DRIVER={NetezzaSQL};SERVER=<myserver>;PORT=<myport>;DATABASE=<mydbschema>;
#                        UID=<user>;PWD=<password>;")


In [None]:
# Other connector packages: pyodbc (ODBC), jaydebeapi (JDBC)