# Working with files and data serialization

When you work with data, you often want to write it to a file so it is still available later, or open a file to read the data and do something with it. In this notebook we'll cover several common file types. So let's get coding!

## Contents

0. Install packages
1. Writing to a text file
2. Reading from a text file
3. JSON
4. YAML
5. Pickle
6. Creating a simple credentials file 

## 0. Install packages

In [2]:
%pip install pyyaml 

Note: you may need to restart the kernel to use updated packages.


In [1]:
# we will use glob to list all files of a certain file type
import glob
png_files = glob.glob('*.png')
png_files

['anatomy of an array.png',
 'opencv.png',
 'OpenCV_rgb.png',
 'plantuml.png',
 'temp1.png',
 'road2.png',
 'plot.png']

Another way to do this is with **ls**

In [2]:
ls *.txt 

NOAA_data.txt            input.txt                requirements.txt
XML_file.txt             leegmelden_plantuml.txt  servers.txt
a_file.txt               plantuml.txt             untitled.txt
accounts2.txt            plantuml_complex.txt
demo.txt                 plantuml_complex2.txt
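The bare ls command works inside IPython and Jupyter, but not in a plain Python script. As a hedged alternative, the sketch below lists files by pattern with pathlib from the standard library. The file names and the temporary folder are made up for illustration, so nothing in the notebook folder is touched.

```python
import tempfile
from pathlib import Path

# Hypothetical files in a temporary folder, created only for this example.
with tempfile.TemporaryDirectory() as tmp_dir:
    folder = Path(tmp_dir)
    for name in ('b.txt', 'a.txt', 'plot.png'):
        (folder / name).touch()

    # Path.glob yields Path objects; sorting makes the order predictable.
    txt_names = sorted(path.name for path in folder.glob('*.txt'))

print(txt_names)
```

To list the files in the current folder instead, Path('.').glob('*.txt') can be used in the same way.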


## 1. Writing to a text file

In [3]:
pwd

'/Users/michielbontenbal/Documents/GitHub/Notebooks'

In [7]:
# this script creates a .txt file with five account records, one per line
with open('accounts.txt', mode='w') as accounts:
    accounts.write('100 Jones 24.98\n')
    accounts.write('200 Doe 345.67\n')
    accounts.write('300 White 0.00\n')
    print('400 Stone -42.16', file=accounts)  # print() also adds the newline for us
    print('500 Rich 224.62', file=accounts)
# Python closes the file automatically at the end of the with block

In [8]:
from glob import glob
my_txt_files = glob('*.txt')
print(my_txt_files)

['demo.txt', 'XML_file.txt', 'requirements.txt', 'plantuml_complex2.txt', 'servers.txt', 'leegmelden_plantuml.txt', 'accounts2.txt', 'untitled.txt', 'accounts.txt', 'plantuml.txt', 'a_file.txt', 'plantuml_complex.txt']


In [10]:
#display the file using %pycat
%pycat accounts.txt
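Mode 'w' creates the file or empties it if it already exists, so rerunning the writing cell never produces duplicate records. To add records to an existing file, mode 'a' (append) can be used instead. A minimal sketch, assuming a hypothetical file 'accounts_demo.txt' in a temporary folder:

```python
import tempfile
from pathlib import Path

# Hypothetical file in a temporary folder, so accounts.txt itself is not changed.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / 'accounts_demo.txt'

    # 'w' creates the file, or truncates it if it already exists.
    with open(demo_path, mode='w') as accounts:
        accounts.write('100 Jones 24.98\n')

    # 'a' appends to the end of the file instead of overwriting it.
    with open(demo_path, mode='a') as accounts:
        accounts.write('200 Doe 345.67\n')

    with open(demo_path) as accounts:
        lines = accounts.read().splitlines()

print(lines)
```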

In [12]:
# open the text file again with Python
with open('accounts.txt') as text_file:
    file_content = text_file.read()
print(file_content)
print(type(file_content))

100 Jones 24.98
200 Doe 345.67
300 White 0.00
400 Stone -42.16
500 Rich 224.62

<class 'str'>


### 1b. Converting a Python list to a .txt file

In [5]:
# a script to convert a Python list to a .txt file
a_list = ["abc", "def", "ghi"]
with open("a_file.txt", "w") as textfile:
    textfile.write('header\n')
    for element in a_list:
        textfile.write(element + "\n")
# the with block closes the file, even if an error occurs

In [7]:
#check the file with pycat
%pycat /Users/michielbontenbal/Documents/GitHub/Notebooks/a_file.txt
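The loop over the list can also be replaced by writelines, which accepts any iterable of strings but does not add newlines itself. The sketch below writes the same list and reads it back into a list again. The file name 'a_file_demo.txt' and the temporary folder are assumptions for illustration.

```python
import tempfile
from pathlib import Path

a_list = ["abc", "def", "ghi"]

with tempfile.TemporaryDirectory() as tmp_dir:
    list_path = Path(tmp_dir) / 'a_file_demo.txt'  # hypothetical file name

    with open(list_path, 'w') as textfile:
        textfile.write('header\n')
        # writelines does not add line endings, so we add '\n' to each element.
        textfile.writelines(f'{element}\n' for element in a_list)

    # Read the file back: the first line is the header, the rest are the elements.
    with open(list_path) as textfile:
        header, *restored = textfile.read().splitlines()

print(header, restored)
```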

## 2. Reading Data from a text file

In [13]:
with open('accounts.txt', mode='r') as accounts:
    # print three column headers, each 10 characters wide: < aligns left, > aligns right
    print(f'{"Account":<10}{"Name":<10}{"Balance":>10}')
    for record in accounts:  # for each line in the file do the following
        account, name, balance = record.split()
        print(f'{account:<10}{name:<10}{balance:>10}')

Account   Name         Balance
100       Jones          24.98
200       Doe           345.67
300       White           0.00
400       Stone         -42.16
500       Rich          224.62


In [14]:
with open('accounts.txt') as input_file:
    line = input_file.readline()
    while line:
        line = line.strip()
        print(line)
        line = input_file.readline()

100 Jones 24.98
200 Doe 345.67
300 White 0.00
400 Stone -42.16
500 Rich 224.62
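Everything read from a text file is a string, so numbers have to be converted before you can calculate with them. The sketch below parses the same five records into typed values and sums the balances. To stay self-contained it reads from an in-memory io.StringIO object instead of 'accounts.txt'; that substitution is an assumption of this example.

```python
import io

# The same records as in accounts.txt, held in memory for this example.
records_text = (
    '100 Jones 24.98\n'
    '200 Doe 345.67\n'
    '300 White 0.00\n'
    '400 Stone -42.16\n'
    '500 Rich 224.62\n'
)

accounts = []
for record in io.StringIO(records_text):  # iterates line by line, like a file
    account, name, balance = record.split()
    # Convert the text fields to numbers so we can calculate with them.
    accounts.append((int(account), name, float(balance)))

total = sum(balance for _, _, balance in accounts)
print(f'Total balance: {total:.2f}')
```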


## 3. JSON 
source: https://www.programiz.com/python-programming/json

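In Python 3:
 - json.loads takes a JSON string as input and returns the matching Python object (a dictionary for a JSON object).
 - json.dumps takes a Python object such as a dictionary as input and returns a JSON string.
 - json.load reads JSON from an open file and returns the matching Python object.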
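The relationship between the string functions can be sketched as a round trip. The dictionary below is a made-up example; the last lines show that the result of json.loads depends on the JSON text and is not always a dictionary.

```python
import json

person_dict = {"name": "Bob", "languages": ["English", "French"]}

# dumps: Python object -> JSON string
person_json = json.dumps(person_dict)
print(type(person_json))

# loads: JSON string -> Python object
restored = json.loads(person_json)
print(restored == person_dict)

# A JSON array becomes a Python list, not a dictionary.
numbers = json.loads('[1, 2, 3]')
print(type(numbers))
```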

In [26]:
from glob import glob
my_jsons = glob('*.json')
print(my_jsons)

['edgeimpulse.json', 'plane.json', 'person.json', 'petstore_openapi3.json']


In [8]:
import json
# note: json.dumps does not read a file; it serializes the string 'person.json' itself
data = json.dumps('person.json')
print(data)

"person.json"


In [9]:
# to open the json file use json.load
import json

with open('person.json') as f:
    data = json.load(f)

# Output: {'name': 'Bob', 'languages': ['English', 'French']}
print(data)

{'name': 'Bob', 'languages': ['English', 'French']}
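Writing works the other way around with json.dump, which takes a Python object and an open file. A minimal sketch, assuming a hypothetical file 'person_demo.json' in a temporary folder; indent=2 only makes the file easier to read.

```python
import json
import tempfile
from pathlib import Path

person = {"name": "Bob", "languages": ["English", "French"]}

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = Path(tmp_dir) / 'person_demo.json'  # hypothetical file name

    # dump: Python object -> JSON text written to a file
    with open(json_path, 'w') as f:
        json.dump(person, f, indent=2)

    file_text = json_path.read_text()

    # load: JSON text read from a file -> Python object
    with open(json_path) as f:
        reloaded = json.load(f)

print(file_text)
print(reloaded == person)
```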


In [10]:
import json

person = '{"name": "Michiel", "languages": ["English", "French", "Italian"]}'
person_dict = json.loads(person)

# Output: {'name': 'Michiel', 'languages': ['English', 'French', 'Italian']}
print(person_dict)

# Output: English and French
print(person_dict['languages'][0] + " and " + person_dict['languages'][1])
print(type(person_dict))

{'name': 'Michiel', 'languages': ['English', 'French', 'Italian']}
English and French
<class 'dict'>


In [11]:
import json
with open('edgeimpulse.json') as f:
    data = json.load(f)

list_item_0 = data['result']['bounding_boxes'][0]
print(list_item_0)
print(list_item_0['value'], list_item_0['label'])
print(list_item_0['label'])
#type(list_item_0)

{'height': 252, 'label': 'zebra', 'value': 0.9409910440444946, 'width': 201, 'x': 13, 'y': 38}
0.9409910440444946 zebra
zebra


In [15]:
import json
# a JSON string with duplicate keys: json.loads keeps the last value for each key
json_str = '{"a":  1, "a":  2, "a":  3, "a": 4, "b": 1, "b": 2, "c": 0}'
json_obj = json.loads(json_str)
print("Unique Key in the JSON:")
print(json_obj)

Unique Key in the JSON:
{'a': 4, 'b': 2, 'c': 0}
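Silently keeping the last value can hide mistakes in a file. If you would rather get an error, json.loads accepts an object_pairs_hook argument that receives every key/value pair of an object before it becomes a dictionary. The function find_duplicate_keys below is a hypothetical helper written for this sketch, not part of the json module.

```python
import json
from collections import Counter

def find_duplicate_keys(pairs):
    """Hypothetical hook: gets the (key, value) pairs of one JSON object."""
    counts = Counter(key for key, _ in pairs)
    duplicates = sorted(key for key, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f'duplicate keys: {duplicates}')
    return dict(pairs)

json_text = '{"a": 1, "a": 2, "b": 1}'

try:
    json.loads(json_text, object_pairs_hook=find_duplicate_keys)
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)

# Without duplicates the hook simply returns a normal dictionary.
clean = json.loads('{"a": 1, "b": 2}', object_pairs_hook=find_duplicate_keys)
print(clean)
```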


## 4. YAML

Pypi: https://pypi.org/project/PyYAML/

source: https://zetcode.com/python/yaml/

Use the following file, items.yaml:

raincoat: 1
coins: 5
books: 23
spectacles: 2
chairs: 12
pens: 6

In [1]:
#a script to load data from a .yaml file
import yaml

with open('items.yaml') as f:
    data = yaml.load(f, Loader=yaml.FullLoader)
print(data)
print(type(data))

{'raincoat': 1, 'coins': 5, 'books': 23, 'spectacles': 2, 'chairs': 12, 'pens': 6}
<class 'dict'>


In [47]:
import yaml

with open('data.yaml') as f:
    docs = yaml.load_all(f, Loader=yaml.FullLoader)
    for doc in docs:
        for k, v in doc.items():
            print(k, "->", v)

cities -> ['Bratislava', 'Kosice', 'Trnava', 'Moldava', 'Trencin']
companies -> ['Eset', 'Slovnaft', 'Duslo Sala', 'Matador Puchov']


## 5. Pickle

Pickle is Python's native object serialization module.  

Pickle official information: https://docs.python.org/3/library/pickle.html
Pickle tutorial: https://www.datacamp.com/community/tutorials/pickle-python-tutorial

#### Pickle vs JSON
- Pickle pros: native to Python
- JSON pros: faster, interoperable with other languages, and more secure (unpickling data from an untrusted source can execute arbitrary code)
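What "native to Python" means in practice can be sketched with a record that contains a set and a tuple. The record itself is a made-up example. Pickle restores both types exactly, while JSON has no set type and turns tuples into lists.

```python
import json
import pickle

record = {"tags": {"red", "yellow"}, "point": (1, 2)}

# pickle round-trips Python-specific types such as set and tuple.
restored = pickle.loads(pickle.dumps(record))
print(restored == record)

# json has no set type, so json.dumps raises TypeError for this record.
try:
    json.dumps(record)
    json_error = None
except TypeError as err:
    json_error = type(err).__name__
print(json_error)

# A tuple is written as a JSON array and comes back as a list.
point_via_json = json.loads(json.dumps(record["point"]))
print(point_via_json)
```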

In [7]:
# Save a dictionary into a pickle file.
import pickle

favorite_color = {"lion": "yellow", "kitty": "red"}
with open("save.pkl", "wb") as f:  # 'wb' = write in binary mode
    pickle.dump(favorite_color, f)

In [8]:
# Open the pickle file and assign to a new variable
with open("save.pkl", "rb") as f:  # 'rb' = read in binary mode
    favorite_color_new = pickle.load(f)
favorite_color_new

{'lion': 'yellow', 'kitty': 'red'}

## 6. Credentials file

First make a separate text file called 'dummy_credentials.py' as follows:

username = "xy" <br>
password = "abcd"

In [4]:
#using a dummy_credentials.py file
import dummy_credentials
username = dummy_credentials.username
password = dummy_credentials.password
print(username, password)

xy abcd
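Importing a .py file runs whatever code is inside it. As a possible alternative, the same dummy values can be kept in a data file and read with json.load, as covered in the JSON section. The file name 'dummy_credentials.json' and the temporary folder are assumptions for this sketch.

```python
import json
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    cred_path = Path(tmp_dir) / 'dummy_credentials.json'  # hypothetical file name

    # Normally you would create this file by hand; here we write the dummy values.
    cred_path.write_text(json.dumps({"username": "xy", "password": "abcd"}))

    with open(cred_path) as f:
        credentials = json.load(f)

username = credentials["username"]
password = credentials["password"]
print(username, password)
```

Whichever format you pick, it is a good idea to keep a file with real credentials out of version control.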
