## Files

## Common file operations
```python
output = open(r'C:\spam', 'w')  # Create output file ('w' means write)
input = open('data', 'r')  # Create input file ('r' means read)
aString = input.read()  # Read entire file into a single string
aString = input.read(N)  # Read up to the next N characters (or bytes)
aString = input.readline()  # Read next line (including \n newline) into a string
aList = input.readlines()  # Read entire file into a list of line strings (with \n)
output.write(aString)  # Write a string of characters (or bytes) into the file
output.writelines(aList)  # Write all line strings in a list into the file
output.flush()  # Flush output buffer to disk without closing
output.close()  # Manual close (also done when the file object is collected)
anyFile.seek(N)  # Change file position to offset N for the next operation
for line in open('data'): print(line, end='')  # File iterators read line by line
open('f.txt', encoding='latin-1')  # Text file with an explicit Unicode encoding
open('f.bin', 'rb')  # Binary file: content is read as bytes
codecs.open('f.txt', encoding='utf-8')  # Unicode text via the codecs module (needs "import codecs")
```
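
The operations listed above can be combined into a short round trip. This is a minimal sketch, assuming a throwaway file named `demo.txt` inside a temporary directory (both are hypothetical names chosen for illustration, not part of the original notes):

```python
import os
import tempfile

# Hypothetical scratch location so the sketch does not touch real files
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

output = open(path, 'w')
output.write('first line\n')                           # write one string
output.writelines(['second line\n', 'third line\n'])   # write a list of line strings
output.flush()                                         # push buffered text to disk
output.close()                                         # close explicitly

infile = open(path, 'r')
head = infile.read(5)        # read the next 5 characters
rest = infile.readline()     # read the remainder of the current line
infile.seek(0)               # move back to the start of the file
lines = infile.readlines()   # read every line into a list
infile.close()

print(repr(head), repr(rest))
print(lines)
```

Note that `read(5)` and `readline()` each continue from where the previous call stopped, which is why `seek(0)` is needed before reading the whole file again.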

## Using Files
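- File iterators are best for reading lines.
- Content is strings, not objects: data must be converted to a string before it is written and converted back after it is read.
- Files are buffered and seekable.
- Calling close is often optional, because files are closed automatically when they are collected.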
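
To illustrate the point that file content is strings rather than objects, here is a small sketch. It assumes a hypothetical scratch file named `numbers.txt` in a temporary directory; the file name and variable names are made up for the example:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'numbers.txt')  # hypothetical scratch file

f = open(path, 'w')
try:
    f.write(42)              # not a string: a text file rejects it
except TypeError as exc:
    error = str(exc)         # keep the message for inspection
f.write(str(42) + '\n')      # convert to a string first, then write
f.close()

f = open(path)
text = f.read()              # the data comes back as a string
f.close()

number = int(text)           # convert back to an integer manually
print(repr(text), number)
```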

In [1]:
myfile = open('./files/file.txt', 'w') # open file for text output "write"
myfile.write("Hello file\n")
myfile.write('goodbye text file\n')
myfile.close()

In [2]:
myfile = open('./files/file.txt', 'r') # open file for read text
myfile.readline()

'Hello file\n'

In [3]:
myfile.readline()

'goodbye text file\n'

In [4]:
myfile.readline() # at end of file: returns an empty string

''

In [5]:
open('./files/file.txt').read() # read the whole file at once

'Hello file\ngoodbye text file\n'

In [6]:
print(open('./files/file.txt').read())

Hello file
goodbye text file



In [7]:
myfile = open('./files/file.txt')
for line in myfile:
    print(line)

Hello file

goodbye text file



### Text and Binary Files
The file type is determined by the second argument to open, the mode string: an included 'b' means binary.

> Text files represent content as normal str strings, perform Unicode encoding and decoding automatically, and perform end-of-line translation by default.

> Binary files represent content as a special bytes string type and allow programs to access file content unaltered.

In [8]:
data = open('./files/file.bin', 'rb').read()
data

b'0b110101010101010010'
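
The difference between the two modes shows up most clearly with non-ASCII text. The following is a rough sketch, assuming a hypothetical scratch file `unicode.txt` and UTF-8 as the chosen encoding (neither appears in the original notes):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'unicode.txt')  # hypothetical scratch file

# Text mode: a str goes in and is encoded to bytes on its way to disk
with open(path, 'w', encoding='utf-8') as f:
    f.write('sp\xc4m')             # the 4-character string 'spÄm'

# Text mode decodes the stored bytes back into a str
with open(path, encoding='utf-8') as f:
    text = f.read()

# Binary mode returns the raw bytes exactly as stored
with open(path, 'rb') as f:
    raw = f.read()

print(type(text), len(text))       # str, 4 characters
print(type(raw), len(raw))         # bytes, 5 bytes: 'Ä' takes two bytes in UTF-8
```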

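## Storing Python objects in files: Conversions
Writing objects into a text file requires converting them to strings first; reading them back requires converting the strings to objects again.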

In [13]:
a, b, c = 1, 2, 1
l = "spam"
e = {'a':1, 'b':0}
o = [1,2,3,4,5,6,7,8,9,0]

F = open('./files/datafile.txt', 'w')

## To write an object into a file, it must first be converted to a string

F.write(l + '\n')
F.write('%s and %s and %s' % (a, b, c))
F.write("\n")
F.write(str(o) + '\n' + str(e) )
F.close()

In [14]:
chars = open('./files/datafile.txt').read()
chars

"spam\n1 and 2 and 1\n[1, 2, 3, 4, 5, 6, 7, 8, 9, 0]\n{'a': 1, 'b': 0}"

In [16]:
F = open('./files/datafile.txt')
line = F.readline()
line

'spam\n'

In [17]:
line.rstrip()

'spam'

In [24]:
F.close()

In [23]:
for line in open('./files/datafile.txt'):
    print(line, end='+')

spam
+1 and 2 and 1
+[1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
+{'a': 1, 'b': 0}+
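
Reading is the reverse of writing: the lines come back as strings and have to be converted by hand. As a hedged sketch, the two literal strings below stand in for lines read from the data file written above, and the variable names are made up for the example:

```python
# Lines as readline() would return them from the data file written earlier
line1 = '1 and 2 and 1\n'
line2 = '[1, 2, 3, 4, 5, 6, 7, 8, 9, 0]\n'

# Split on the ' and ' separator, then convert each piece with int()
# (int() ignores surrounding whitespace, including the trailing newline)
numbers = [int(part) for part in line1.split(' and ')]

# Drop the newline and brackets, split on commas, convert each piece
items = [int(part) for part in line2.strip().strip('[]').split(',')]

print(numbers)
print(items)
```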

### Eval
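When objects are stored in a file as strings, eval can convert such a string back into an object: it treats the string as a Python expression and runs it. Because eval will run any expression it is given, it should only be used on strings you trust. For general object storage, pickle is usually the more suitable tool.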

In [26]:
eval('[1,2,3]') #  convert to any data object type

[1, 2, 3]

In [30]:
eval('{"a":1,"b":2}')

{'a': 1, 'b': 2}
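
Since eval runs whatever expression it receives, a more restrictive alternative is sometimes preferred. The sketch below swaps in `ast.literal_eval` from the standard library, which is not used in the original notes; it accepts only Python literals and refuses everything else:

```python
import ast

# literal_eval handles plain literals: numbers, strings, lists, dicts, and so on
items = ast.literal_eval('[1,2,3]')
table = ast.literal_eval('{"a":1,"b":2}')

# A function call is not a literal, so it is rejected rather than executed
try:
    ast.literal_eval('__import__("os").getcwd()')
    rejected = False
except ValueError:
    rejected = True

print(items, table, rejected)
```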

## Pickle
The pickle module is a more advanced tool that allows us to store almost any Python object in a file directly, with no to-or-from string conversion requirement on our part. It is like a super-general data formatting and parsing utility.

This performs **object serialization**, converting objects to and from strings of bytes, but requires very little work on our part. Pickle internally translates our object to a bytes string form.

Also check out the shelve module, which uses pickle to store Python objects in a file accessed by key.

In [31]:
dictionary = {
    "name": "John",
    "age": 20
}

F = open('./files/datafile.pkl', 'wb') # getting ready pickle file

import pickle
pickle.dump(dictionary, F) # dump the dictionary into the file F that we just opened
F.close()

In [33]:
F = open("./files/datafile.pkl", 'rb') # opening the pickle file as read binary
F.read()

b'\x80\x04\x95\x1b\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x04name\x94\x8c\x04John\x94\x8c\x03age\x94K\x14u.'

In [36]:
F = open("./files/datafile.pkl", "rb")
E = pickle.load(F) # load the object back from the file
E

{'name': 'John', 'age': 20}
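
A pickle file is not limited to a single object. The following is a minimal sketch, assuming a hypothetical scratch file `objects.pkl`, that stores two objects one after the other and also shows the in-memory `dumps`/`loads` pair:

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'objects.pkl')  # hypothetical scratch file

record = {'name': 'John', 'age': 20}
scores = [1, 2, 3]

with open(path, 'wb') as f:        # pickle files are opened in binary mode
    pickle.dump(record, f)         # first object
    pickle.dump(scores, f)         # second object, stored after the first

with open(path, 'rb') as f:
    record_back = pickle.load(f)   # objects come back in the order written
    scores_back = pickle.load(f)

# In-memory variant: dumps and loads work with a bytes string instead of a file
blob = pickle.dumps(record)
copy = pickle.loads(blob)

print(record_back, scores_back, type(blob))
```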

### Storing Objects in JSON Format
The pickle module shown above translates nearly arbitrary Python objects to a Python-specific format.
JSON does not support as broad a range of Python object types as pickle, but its portability is an advantage in some contexts, and it represents another way to serialize a specific category of objects for storage and transmission. Moreover, because JSON is so close to Python dictionaries and lists in syntax, the translation to and from Python objects is trivial and is automated by the json standard library module.

In [37]:
name = dict(fist="Jane", last="Smith")
rec = dict(name=name, job=['dev', 'mgr'], age=20)
rec

{'name': {'fist': 'Jane', 'last': 'Smith'}, 'job': ['dev', 'mgr'], 'age': 20}

In [38]:
import json

# json dump string method
json.dumps(rec)

'{"name": {"fist": "Jane", "last": "Smith"}, "job": ["dev", "mgr"], "age": 20}'

In [39]:
s = json.dumps(rec)
s

'{"name": {"fist": "Jane", "last": "Smith"}, "job": ["dev", "mgr"], "age": 20}'

In [40]:
# JSON load string
o = json.loads(s)
o

{'name': {'fist': 'Jane', 'last': 'Smith'}, 'job': ['dev', 'mgr'], 'age': 20}

In [41]:
# JSON dump to file

json.dump(rec, fp = open('./files/textjson.txt', 'w'), indent=7)
print(open('./files/textjson.txt').read())

{
       "name": {
              "fist": "Jane",
              "last": "Smith"
       },
       "job": [
              "dev",
              "mgr"
       ],
       "age": 20
}


In [43]:
# JSON load file

dictionary = json.load(open('./files/textjson.txt'))
dictionary

{'name': {'fist': 'Jane', 'last': 'Smith'}, 'job': ['dev', 'mgr'], 'age': 20}
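
The narrower type support of JSON mentioned above can be seen in a round trip. This short sketch uses made-up sample values and shows three cases: tuples, non-string dictionary keys, and sets:

```python
import json

# Tuples are written as JSON arrays, so they come back as lists
pair = json.loads(json.dumps((1, 2)))

# JSON object keys are always strings, so integer keys come back as str
keyed = json.loads(json.dumps({1: 'one'}))

# Types with no JSON equivalent, such as sets, are rejected by default
try:
    json.dumps({1, 2, 3})
    set_ok = True
except TypeError:
    set_ok = False

print(pair, keyed, set_ok)
```

Pickle would return the tuple, the integer key, and the set unchanged, which is the trade-off against JSON's portability.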

## Storing Packed Binary Data: struct
The struct module knows how to both compose and parse packed binary data. In a sense this is another data conversion tool that interprets strings in files as binary data.
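
As an illustration of composing and parsing packed binary data, here is a minimal sketch. The format string `'>i4sh'`, the sample values, and the scratch file `data.bin` are assumptions chosen for the example:

```python
import os
import struct
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'data.bin')  # hypothetical scratch file

# '>i4sh': big-endian, a 4-byte integer, a 4-byte string, a 2-byte integer
packed = struct.pack('>i4sh', 7, b'spam', 8)

with open(path, 'wb') as f:     # packed data is bytes, so use binary mode
    f.write(packed)

with open(path, 'rb') as f:
    raw = f.read()

values = struct.unpack('>i4sh', raw)   # back to a tuple of Python objects

print(packed)
print(values)
```

The same format string must be used for packing and unpacking, since the bytes themselves carry no description of their layout.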

### Context Manager
A context manager (the with statement) ensures that the file will be closed, and its output flushed to disk if needed, automatically on exit, instead of relying on the auto-close during garbage collection.

In [48]:
with open('./files/file.txt') as myfile:
    for line in myfile:
        print(line)

Hello file

goodbye text file



In [50]:
# The try/finally statement can provide similar functionality, but at the cost of extra code to write.

myfile = open('./files/datafile.txt')
try:
    for l in myfile:
        print(l)
finally:
    myfile.close()

spam

1 and 2 and 1

[1, 2, 3, 4, 5, 6, 7, 8, 9, 0]

{'a': 1, 'b': 0}
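
The guarantee matters most when something goes wrong inside the block. The sketch below assumes a hypothetical scratch file `log.txt` and a simulated error; it checks that the file is closed and that the text written before the error reached the disk:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'log.txt')  # hypothetical scratch file

try:
    with open(path, 'w') as f:
        f.write('partial output\n')
        raise RuntimeError('simulated failure')   # error inside the with block
except RuntimeError:
    pass

closed_after_error = f.closed    # the with statement closed the file on exit

with open(path) as f:
    saved = f.read()             # text written before the error was flushed

print(closed_after_error, repr(saved))
```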
