# **Persistence**

Reading from and writing to files is more efficient because a program can
* read large amounts of data from disk, avoiding tedious manual data entry
* save results to disk for ease of future reference
  

3 file modes
* **r**ead from file
* **w**rite to file
* **a**ppend to an existing file

**Reading**

* open the file in read mode
* read mode is the default if no mode is given
* if the file does not exist, a FileNotFoundError is raised




In [None]:
infile = open("DATA.TXT", 'r')
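
If the file might be missing, the error can be caught instead of letting the program stop. A minimal sketch, assuming a hypothetical file name `NO_SUCH_FILE.TXT` placed in a fresh temporary folder so that it is guaranteed not to exist:

```python
import os
import tempfile

# Hypothetical path: a file name inside a new, empty temporary folder,
# so the file cannot exist yet
filename = os.path.join(tempfile.mkdtemp(), "NO_SUCH_FILE.TXT")

try:
    infile = open(filename, 'r')
except FileNotFoundError:
    # raised by open() when the file is missing in read mode
    found = False
    print("File not found:", filename)
else:
    # only runs when open() succeeded
    found = True
    infile.close()
```

The `found` flag is only there to make the outcome easy to check; a real program might ask the user for another file name instead.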

* each line is read in as a text string (including the newline character at the end)
* use assignment (and string slicing operations) to extract values into variables


In [None]:
# read one line
line = infile.readline()

# read all lines
lines = infile.readlines()
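
The sketch below shows one way the two calls and string slicing might fit together. The file `SAMPLE.TXT` and its contents are made up for illustration and are created in a temporary folder first, so the example runs on its own:

```python
import os
import tempfile

# Create a small sample file (hypothetical contents, for illustration only)
path = os.path.join(tempfile.mkdtemp(), "SAMPLE.TXT")
outfile = open(path, 'w')
outfile.write("Lim Ah Seng 17\n")
outfile.write("Tan Ah Lian 18\n")
outfile.close()

infile = open(path, 'r')
line = infile.readline()    # first line, including the trailing "\n"
rest = infile.readlines()   # all remaining lines, as a list of strings
infile.close()

line = line.strip()         # remove the newline character
name = line[0:11]           # characters 0 to 10 hold the name
age = int(line[12:14])      # characters 12 to 13 hold the age
print(name, age)
```

Note that `readlines()` continues from where `readline()` stopped, so `rest` holds only the second line.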

**File input example**

Read from a text file NUMBERS.DAT which contains one integer per line. Compute and display the average of the numbers.


In [None]:
infile = open("NUMBERS.DAT", 'r')

lines = infile.readlines()

n = 0
total = 0
for num in lines:
    total = total + int(num)
    n = n + 1 

print(total / n)

infile.close()
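
Data files often contain a blank line (for example at the end), and `int()` fails on a blank line. A slightly more defensive sketch of the same average computation, assuming a made-up NUMBERS.DAT that is created in a temporary folder for the demonstration:

```python
import os
import tempfile

# Hypothetical sample data, including a blank line in the middle
path = os.path.join(tempfile.mkdtemp(), "NUMBERS.DAT")
outfile = open(path, 'w')
outfile.write("10\n20\n\n30\n")
outfile.close()

infile = open(path, 'r')
n = 0
total = 0
for line in infile:         # a file can be looped over line by line
    line = line.strip()
    if line == "":          # skip blank lines
        continue
    total = total + int(line)
    n = n + 1
infile.close()

if n > 0:                   # avoid dividing by zero for an empty file
    average = total / n
    print(average)
else:
    print("No numbers found")
```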

**Writing**



* open the file in write mode
* if file does not exist, a new file will be created
* **BE CAREFUL:** if file already exists, it will be overwritten (all previous contents gone!)


In [None]:
outfile = open("DATA.TXT", 'w')


* write() is similar to print(), except that it does not add a newline character automatically (hence the explicit \n)
* data is written as a text string, so numbers must be converted with str() first


In [None]:
# write name and age to output file
outfile.write("Lim Ah Seng")
outfile.write(",")  # separator, so that name and age do not run together
outfile.write("17")
outfile.write("\n") # newline character
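
To compare write() with print() directly, the sketch below writes one record with each. The file path is a hypothetical one inside a temporary folder, and the comma separator is an assumption made for this example:

```python
import os
import tempfile

# Hypothetical output file in a temporary folder
path = os.path.join(tempfile.mkdtemp(), "DATA.TXT")

name = "Lim Ah Seng"
age = 17

outfile = open(path, 'w')
# write() takes a single string: convert the number and add "\n" yourself
outfile.write(name + "," + str(age) + "\n")
# print() converts values, joins them with sep and adds "\n" automatically
print("Tan Ah Lian", 18, sep=",", file=outfile)
outfile.close()

infile = open(path, 'r')
contents = infile.read()    # whole file as one string
infile.close()
print(contents)
```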

**Appending**

* open the file in append mode
* if file does not exist, a new file will be created
* if file already exists, new content will be added to the end of the file


In [None]:
outfile = open("DATA.TXT", 'a')
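
The difference between write mode and append mode is easiest to see side by side. A minimal sketch, using a hypothetical file in a temporary folder so that no real data is overwritten:

```python
import os
import tempfile

# Hypothetical file in a temporary folder
path = os.path.join(tempfile.mkdtemp(), "DATA.TXT")

outfile = open(path, 'w')   # file does not exist yet: it is created
outfile.write("first\n")
outfile.close()

outfile = open(path, 'w')   # file exists: old contents are discarded
outfile.write("second\n")
outfile.close()

outfile = open(path, 'a')   # file exists: new content goes at the end
outfile.write("third\n")
outfile.close()

infile = open(path, 'r')
lines = infile.readlines()
infile.close()
print(lines)
```

Only the last two lines survive, because the second open in write mode wiped out "first".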

**Closing**

* remember to close all files at the end of your program
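* closing flushes any data still buffered in memory to disk (especially important for write/append)
* closing releases the system resources held by the open file
* closing unlocks the file so that other programs can access it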

In [None]:
infile.close()
outfile.close()
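
Python's with statement closes the file automatically, which removes the risk of forgetting close(). A short sketch, again using a hypothetical file in a temporary folder:

```python
import os
import tempfile

# Hypothetical file in a temporary folder
path = os.path.join(tempfile.mkdtemp(), "DATA.TXT")

with open(path, 'w') as outfile:
    outfile.write("Lim Ah Seng\n")
# leaving the with block closes the file, even if an error occurred inside it
print(outfile.closed)

with open(path, 'r') as infile:
    first = infile.readline()
print(infile.closed)
```

The `closed` attribute of a file object reports whether the file has been closed.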

## **Flat Files**

**Fixed length records**  
```
<id><name>      <dob>     <email>              <mobile>
1   Lim Ah Seng 1995-01-01limahseng@hotmail.com12345678
2   Tan Ah Lian 1995-12-31tanahlian@yahoo.com  87654321
```

**Variable length records**  
```
id,name,dob,email,mobile
1,Lim Ah Seng,1995-01-01,limahseng@hotmail.com,12345678
2,Tan Ah Lian,1995-12-31,tanahlian@yahoo.com,87654321
```

**Fixed length records (Read)**
* read line as text string and use slicing

In [None]:
# read all lines
records = infile.readlines()
 
# process each line
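for record in records:
  # slice positions follow the fixed-length sample above
  # (id: 4, name: 12, dob: 10, email: 21, mobile: 8 characters)
  rid = record[0:4].strip()
  name = record[4:16].strip()
  dob = record[16:26]
  email = record[26:47].strip()
  mobile = record[47:55]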
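
Instead of hard-coding every slice, the field widths can be kept in a list. The helper `split_fixed` below is a hypothetical name introduced for this sketch, and the widths (4, 12, 10, 21, 8) are assumed from the fixed-length sample layout shown earlier:

```python
def split_fixed(record, widths):
    """Cut a fixed-length record into fields of the given widths."""
    fields = []
    start = 0
    for width in widths:
        # take the next 'width' characters and drop the padding spaces
        fields.append(record[start:start + width].strip())
        start = start + width
    return fields

# Field widths assumed from the sample layout: id, name, dob, email, mobile
WIDTHS = [4, 12, 10, 21, 8]

record = "2   Tan Ah Lian 1995-12-31tanahlian@yahoo.com  87654321\n"
fields = split_fixed(record, WIDTHS)
print(fields)
```

If the layout changes, only the `WIDTHS` list needs to be updated.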


**Fixed length records (Write)**
* write each line as a text string, using formatting to pad every field to its fixed width
* remember to include the newline character

In [None]:
# pad id to 4, name to 12 and email to 21 characters to match the sample layout
outfile.write("{0:<4}{1:12s}{2}{3:21s}{4}\n".format(1, "Lim Ah Seng", "1995-01-01", "limahseng@hotmail.com", "12345678"))
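
The padding can also be done with a small helper that works for any list of widths. `format_fixed` is a hypothetical name used only in this sketch, and the widths are assumed from the fixed-length sample layout shown earlier:

```python
def format_fixed(values, widths):
    """Build one fixed-length line: each value is cut or padded to its width."""
    parts = []
    for value, width in zip(values, widths):
        text = str(value)[:width]        # cut values that are too long
        parts.append(text.ljust(width))  # pad short values with spaces
    return "".join(parts) + "\n"

# Field widths assumed from the sample layout: id, name, dob, email, mobile
WIDTHS = [4, 12, 10, 21, 8]

line = format_fixed(
    [2, "Tan Ah Lian", "1995-12-31", "tanahlian@yahoo.com", "87654321"],
    WIDTHS)
print(line, end="")
print(len(line))   # 55 characters of data plus the newline
```

Cutting over-long values keeps every record the same length, at the cost of losing the extra characters, so whether that is acceptable depends on the data.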

**Variable length records**
* delimited file, e.g. CSV (comma-separated values)
* use Python's csv module


In [None]:
import csv

with open("person1.csv", 'r', newline="") as infile:
    records = csv.reader(infile, delimiter=",")
    # each record is a list of strings, one item per column
    for record in records:
        print("ID:", record[0])
        print("Name:", record[1])
        print("DOB:", record[2])
        print("Email:", record[3])
        print("Mobile:", record[4])

In [None]:
import csv

with open("uperson.csv", "w", newline="") as outfile:
    writer = csv.writer(outfile, delimiter=",")
    writer.writerows([
        (1, 'Lim Ah Seng', '1995-01-01', 'limahseng@hotmail.com', '12345678'),
        (2, 'Tan Ah Lian', '1995-12-31', 'tanahlian@yahoo.com', '87654321'),
    ])
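
When the file starts with a header row (as in the variable length sample above), csv.DictReader lets each field be accessed by column name instead of by position. A minimal round-trip sketch, assuming a hypothetical file `person.csv` in a temporary folder:

```python
import csv
import os
import tempfile

# Hypothetical file in a temporary folder
path = os.path.join(tempfile.mkdtemp(), "person.csv")

with open(path, 'w', newline="") as outfile:
    writer = csv.writer(outfile)
    writer.writerow(["id", "name", "dob", "email", "mobile"])  # header row
    writer.writerow([1, "Lim Ah Seng", "1995-01-01",
                     "limahseng@hotmail.com", "12345678"])

with open(path, 'r', newline="") as infile:
    reader = csv.DictReader(infile)   # first row supplies the keys
    people = list(reader)             # one dictionary per data row

print(people[0]["name"])
print(people[0]["id"])
```

Every value comes back as a string, including the id, so convert with int() where a number is needed.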

## **JSON**
* JavaScript Object Notation


In [None]:
import json

# Python data (can be int, float, str, bool, None, list, dict, tuple)
pdata = [{'a': 1, 'c': 3, 'b': 2}]
print(pdata)

[{'a': 1, 'c': 3, 'b': 2}]


In [None]:
# encode (python to json)
encoded = json.dumps(pdata)
encoded

'[{"a": 1, "c": 3, "b": 2}]'

In [None]:
# encode (python to json with sorting)
encoded = json.dumps(pdata, sort_keys=True)
encoded

'[{"a": 1, "b": 2, "c": 3}]'

In [None]:
# encode (python to json with sorting and formatting)
encoded = json.dumps(pdata, sort_keys=True, indent=2)
print(encoded)

[
  {
    "a": 1,
    "b": 2,
    "c": 3
  }
]


In [None]:
# JSON data (string)
jdata = '[{"a": 1, "b": 2, "c": 3}]'

# decode (json to python)
decoded = json.loads(jdata)
print(decoded)
type(decoded)

[{'a': 1, 'b': 2, 'c': 3}]


list

In [None]:
type(decoded[0])

dict

In [None]:
decoded[0]

{'a': 1, 'b': 2, 'c': 3}

In [None]:
for key, value in decoded[0].items():
  print(key, value)

a 1
b 2
c 3


In [None]:
decoded[0]['d'] = 4
decoded[0]

{'a': 1, 'b': 2, 'c': 3, 'd': 4}
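
The examples above encode to and decode from strings. To persist the data, the json module also provides dump() and load(), which work directly with an open file. A short sketch, assuming a hypothetical file `data.json` in a temporary folder:

```python
import json
import os
import tempfile

# Hypothetical file in a temporary folder
path = os.path.join(tempfile.mkdtemp(), "data.json")

pdata = [{'a': 1, 'b': 2, 'c': 3}]

# encode (python to json) straight into a file
with open(path, 'w') as outfile:
    json.dump(pdata, outfile, sort_keys=True, indent=2)

# decode (json to python) straight from a file
with open(path, 'r') as infile:
    loaded = json.load(infile)

print(loaded)
print(loaded == pdata)
```

Note the naming: dumps() and loads() end in "s" for string, while dump() and load() take a file object.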

In [None]:
fout = open('DATA.txt', 'a') # a - append, w - write
data = 'Good morning!'
fout.write(data + '\n') # write the greeting first
for i in range(1,10):
  fout.write(str(i) + '\n') # numbers 1 to 9, one per line
fout.close()

In [None]:
fin = open('DATA.txt', 'r') # r - read
lines = fin.readlines()
fin.close()
lines

['Good morning!\n',
 '1\n',
 '2\n',
 '3\n',
 '4\n',
 '5\n',
 '6\n',
 '7\n',
 '8\n',
 '9\n']