# Working with files


## Contents

0. Install packages
1. Writing to a text file
2. Reading from a text file
3. JSON
4. YAML
5. Pickle
6. XML to JSON
6a. JSON to UBL-XML
7. Credentials file

## 0. Install packages

In [2]:
%pip install pyyaml 

Note: you may need to restart the kernel to use updated packages.


In [4]:
# Use glob to list all files of a given type
import glob
png_files = glob.glob('*.png')
png_files

['anatomy of an array.png', 'opencv.png', 'plot.png', 'road2.png']

#### Another way to list files is the ls command

In [7]:
ls *.txt 

 Volume in drive C is OS
 Volume Serial Number is 2818-58FB

 Directory of C:\Users\31653\Documents\GitHub\Notebooks

06/02/2021  17:07                84 accounts.txt
23/06/2021  20:11                84 demo.txt
26/07/2021  15:45                59 servers.txt
17/04/2021  21:30                 0 untitled.txt
17/04/2021  21:31                16 untitled1.txt
               5 File(s)            243 bytes
               0 Dir(s)  286.964.838.400 bytes free
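The output of `ls` differs per platform (the listing above is what Windows prints). As a portable alternative, here is a minimal sketch using `pathlib` from the standard library. The helper name `list_files` and the throwaway folder are assumptions made for this illustration only.

```python
import tempfile
from pathlib import Path


def list_files(folder, pattern):
    """Return (name, size in bytes) for files in folder matching pattern, sorted by name."""
    return sorted(
        (p.name, p.stat().st_size)
        for p in Path(folder).glob(pattern)
        if p.is_file()
    )


# Work in a temporary folder so the example does not depend on existing files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "accounts.txt").write_bytes(b"100 Jones 24.98\n")
    Path(tmp, "plot.png").write_bytes(b"")
    txt_files = list_files(tmp, "*.txt")

print(txt_files)
```

Calling `list_files('.', '*.txt')` in the notebook folder would list the same files as the `ls *.txt` cell.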


## 1. Writing to a text file

In [1]:
# This script creates accounts.txt with five lines, one account record per line
with open('accounts.txt', mode='w') as accounts:
    accounts.write('100 Jones 24.98\n')
    accounts.write('200 Doe 345.67\n')
    accounts.write('300 White 0.00\n')
    print('400 Stone -42.16', file=accounts)  # print() also adds the newline
    print('500 Rich 224.62', file=accounts)
# Leaving the with block closes the file automatically

## 2. Reading from a text file

In [2]:
with open('accounts.txt', mode='r') as accounts:
    print(f'{"Account":<10}{"Name":<10}{"Balance":>10}')  # three column headers, each 10 characters wide, aligned left (<) or right (>)
    for record in accounts:  # for each line in the file, do the following
        account, name, balance = record.split()
        print(f'{account:<10}{name:<10}{balance:>10}')

Account   Name         Balance
100       Jones          24.98
200       Doe           345.67
300       White           0.00
400       Stone         -42.16
500       Rich          224.62
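The cell above keeps every field as a string. If you want to calculate with the balances, the fields need converting first. The sketch below is one way to do that, assuming every non-empty line has exactly three whitespace-separated fields. The helper name `read_accounts` is hypothetical, and the sample file is written to a temporary folder so the example runs on its own.

```python
import tempfile
from decimal import Decimal
from pathlib import Path


def read_accounts(path):
    """Parse 'account name balance' lines into (int, str, Decimal) tuples."""
    records = []
    with open(path) as accounts:
        for record in accounts:
            if not record.strip():
                continue  # skip blank lines
            account, name, balance = record.split()
            records.append((int(account), name, Decimal(balance)))
    return records


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp, "accounts.txt")
    path.write_text(
        "100 Jones 24.98\n200 Doe 345.67\n300 White 0.00\n"
        "400 Stone -42.16\n500 Rich 224.62\n"
    )
    records = read_accounts(path)

# Decimal avoids the rounding noise that float arithmetic adds to money amounts.
total = sum(balance for _, _, balance in records)
print(f"{len(records)} accounts, total balance {total}")
```

This prints `5 accounts, total balance 553.11`.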


In [20]:
with open('accounts.txt') as input_file:
    line = input_file.readline()
    while line:
        line = line.strip()
        print(line)
        line = input_file.readline()

100 Jones 24.98
200 Doe 345.67
300 White 0.00
400 Stone -42.16
500 Rich 224.62


In [26]:
with open('accounts.txt') as input_file:
    with open('demo.txt', 'w') as output_file:
        line = input_file.readline()
        while line:
            print(line.count('Doe'))
            line = line.strip()
            print(line)
            output_file.write(line+ '\n')
            line = input_file.readline()

0
100 Jones 24.98
1
200 Doe 345.67
0
300 White 0.00
0
400 Stone -42.16
0
500 Rich 224.62


In [24]:
line.count('Doe')

0
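The result is `0` because the loop has already finished: `line` now holds the empty string that `readline()` returns at the end of the file. To count a word across the whole file, sum the per-line counts instead. A minimal sketch follows; the helper name `count_in_file` is hypothetical and the sample file is created in a temporary folder.

```python
import tempfile
from pathlib import Path


def count_in_file(path, word):
    """Count how often word occurs in the file, summed line by line."""
    with open(path) as input_file:
        return sum(line.count(word) for line in input_file)


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp, "accounts.txt")
    path.write_text("100 Jones 24.98\n200 Doe 345.67\n300 White 0.00\n")
    doe_count = count_in_file(path, "Doe")
    smith_count = count_in_file(path, "Smith")

print(doe_count, smith_count)
```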

In [22]:
ls *.txt

 Volume in drive C is OS
 Volume Serial Number is 2818-58FB

 Directory of C:\Users\31653\Documents\GitHub\Notebooks

06/02/2021  17:07                84 accounts.txt
23/06/2021  19:52                 0 demo.txt
17/04/2021  21:30                 0 untitled.txt
17/04/2021  21:31                16 untitled1.txt
               4 File(s)            100 bytes
               0 Dir(s)  281.730.273.280 bytes free


## 3. JSON 
source: https://www.programiz.com/python-programming/json

In [14]:
import json

person = '{"name": "Michiel", "languages": ["English", "French", "Italian"]}'
person_dict = json.loads(person)

# Output: {'name': 'Michiel', 'languages': ['English', 'French', 'Italian']}
print(person_dict)

# Output: English and French
print(person_dict['languages'][0] + " and " + person_dict['languages'][1])

{'name': 'Michiel', 'languages': ['English', 'French', 'Italian']}
English and French


In [26]:
import json

with open('person.json') as f:
    data = json.load(f)

# Output: {'name': 'Bob', 'languages': ['English', 'French']}
print(data)

{'name': 'Bob', 'languages': ['English', 'French']}


In [47]:
import json
with open('edgeimpulse.json') as f:
    data=json.load(f)
    
list_item_0 =data['result']['bounding_boxes'][0]
print(list_item_0)
print(list_item_0['value'], list_item_0['label'])
print(list_item_0['label'])
#type(list_item_0)

{'height': 252, 'label': 'zebra', 'value': 0.9409910440444946, 'width': 201, 'x': 13, 'y': 38}
0.9409910440444946 zebra
zebra


In [24]:
import json
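with open('edgeimpulse.json') as f:
    # This fails: json.dumps() turns a Python object into a string and cannot take a file object.
    # Read the file with json.load(f) first, then pass the result to json.dumps().
    data=json.dumps(f)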
    
print(data)


TypeError: Object of type TextIOWrapper is not JSON serializable
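The error comes from mixing up the four `json` functions: `load` and `dump` work on files, `loads` and `dumps` work on strings. The sketch below shows a full round trip. The data is hypothetical, loosely shaped like the bounding-box result shown earlier, and the file lives in a temporary folder.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical data, loosely shaped like the bounding-box result shown earlier.
result = {"result": {"bounding_boxes": [{"label": "zebra", "value": 0.94, "x": 13, "y": 38}]}}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp, "example.json")

    # json.dump writes a Python object to an open file.
    with open(path, "w") as f:
        json.dump(result, f, indent=2)

    # json.load reads the file back into Python objects.
    with open(path) as f:
        data = json.load(f)

# json.dumps takes the loaded object (not the file) and returns a string.
text = json.dumps(data, sort_keys=True)
print(text)
```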

## 4. YAML

Pypi: https://pypi.org/project/PyYAML/

source: https://zetcode.com/python/yaml/

Use the following files: items.yaml and data.yaml. The contents of items.yaml:

raincoat: 1
coins: 5
books: 23
spectacles: 2
chairs: 12
pens: 6

In [4]:
import yaml

with open('items.yaml') as f:
    
    data = yaml.load(f, Loader=yaml.FullLoader)
    print(data)

{'raincoat': 1, 'coins': 5, 'books': 23, 'spectacles': 2, 'chairs': 12, 'pens': 6}


In [5]:
type(data)

dict

data.yaml:
cities:
  - Bratislava
  - Kosice
  - Trnava
  - Moldava
  - Trencin
---
companies:
  - Eset
  - Slovnaft
  - Duslo Sala
  - Matador Puchov

In [14]:
#!/usr/bin/env python3

import yaml

with open('data.yaml') as f:
    
    docs = yaml.load_all(f, Loader=yaml.FullLoader)

    for doc in docs:
        
        for k, v in doc.items():
            print(k, "->", v)
           

cities -> ['Bratislava', 'Kosice', 'Trnava', 'Moldava', 'Trencin']
companies -> ['Eset', 'Slovnaft', 'Duslo Sala', 'Matador Puchov']
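For plain data like the files above, PyYAML also offers `safe_load`, `safe_load_all` and `safe_dump`, which handle only standard YAML tags and are the usual choice for input you do not fully trust. The sketch below assumes PyYAML is installed (see section 0) and uses in-memory strings instead of files, with a shortened version of the sample data.

```python
import yaml  # provided by the PyYAML package

items = {"raincoat": 1, "coins": 5, "books": 23}

# safe_dump serializes plain Python data to a YAML string.
text = yaml.safe_dump(items, sort_keys=False)
print(text)

# safe_load parses a single YAML document.
loaded = yaml.safe_load(text)

# safe_load_all iterates over documents separated by '---'.
multi = "cities:\n  - Bratislava\n---\ncompanies:\n  - Eset\n"
docs = list(yaml.safe_load_all(multi))
print(docs)
```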


## 5. Pickle

Pickle is Python's native object serialization module.  

- Pickle official documentation: https://docs.python.org/3/library/pickle.html
- Pickle tutorial: https://www.datacamp.com/community/tutorials/pickle-python-tutorial

#### Pickle vs JSON
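- Pickle pros: native to Python, can serialize almost any Python object
- JSON pros: faster, interoperable with other languages, more secure (unpickling untrusted data can run arbitrary code)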

In [3]:
import pickle

In [7]:
# Save a dictionary into a pickle file.
import pickle

favorite_color = {"lion": "yellow", "kitty": "red"}
with open("save.pkl", "wb") as f:  # the with block closes the file after writing
    pickle.dump(favorite_color, f)

In [8]:
# Open the pickle file and assign to a new variable
with open("save.pkl", "rb") as f:
    favorite_color_new = pickle.load(f)
favorite_color_new

{'lion': 'yellow', 'kitty': 'red'}
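To make the Pickle versus JSON comparison concrete, the sketch below serializes a dictionary containing a `set`, a type that pickle handles and JSON does not. The `seen` field is added here purely for illustration and is not part of the original example.

```python
import json
import pickle

# The "seen" set is a hypothetical extra field added for this illustration.
favorite_color = {"lion": "yellow", "kitty": "red", "seen": {1, 2, 3}}

# pickle round-trips the dictionary, including the set, unchanged.
restored = pickle.loads(pickle.dumps(favorite_color))

# json has no set type, so json.dumps raises TypeError.
try:
    json.dumps(favorite_color)
    json_ok = True
except TypeError:
    json_ok = False

print(restored == favorite_color, json_ok)
```

This prints `True False`. The flexibility has a cost: only unpickle data that comes from a source you trust.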

## 6. XML to JSON

In [1]:
%pip install xmltodict

Note: you may need to restart the kernel to use updated packages.


In [2]:
import xmltodict
import json

xml='''<website>
        <name>Codespeedy</name>
        <article>Related to programming</article>
        <message>You can learn easily from codespeedy</message>
    </website>'''

my_dict=xmltodict.parse(xml)
json_data=json.dumps(my_dict)
print(json_data)

{"website": {"name": "Codespeedy", "article": "Related to programming", "message": "You can learn easily from codespeedy"}}


In [3]:
my_xml='''<emp_bank_tx>
         <ebnkt_bank_def_id>BVKZ</ebnkt_bank_def_id>
         <ebnkt_employee_id>99999901</ebnkt_employee_id>
         <ebnkt_transaction_date>2021-03-01</ebnkt_transaction_date>
         <ebnkt_transaction_type>3</ebnkt_transaction_type>
         <ebnkt_amount>56h00</ebnkt_amount>
         <ebnkt_reason_code>ECALC</ebnkt_reason_code>
        </emp_bank_tx>'''

In [4]:
import xmltodict
import json

xml=my_xml

my_dict=xmltodict.parse(xml)
json_data=json.dumps(my_dict)
print(json_data)

{"emp_bank_tx": {"ebnkt_bank_def_id": "BVKZ", "ebnkt_employee_id": "99999901", "ebnkt_transaction_date": "2021-03-01", "ebnkt_transaction_type": "3", "ebnkt_amount": "56h00", "ebnkt_reason_code": "ECALC"}}
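If installing `xmltodict` is not an option, a similar result for simple documents can be had with the standard library. The sketch below swaps in `xml.etree.ElementTree` and only handles flat XML (a root element with text-only children, no attributes or repeated tags). The helper name `flat_xml_to_dict` is hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def flat_xml_to_dict(xml_text):
    """Convert a one-level XML document into {root_tag: {child_tag: text}}."""
    root = ET.fromstring(xml_text)
    return {root.tag: {child.tag: child.text for child in root}}


xml = """<website>
    <name>Codespeedy</name>
    <article>Related to programming</article>
</website>"""

json_data = json.dumps(flat_xml_to_dict(xml))
print(json_data)
```

For nested elements, attributes or repeated tags, `xmltodict` as used above remains the simpler route.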


## 6a. JSON to UBL-XML
source: https://json-to-ubl-xml-transformer.readthedocs.io/en/latest/installation.html
source: https://github.com/dimitern/json_to_ubl_xml_transformer

In [1]:
!pip install json_to_ubl_xml_transformer

Collecting json_to_ubl_xml_transformer
  Downloading json_to_ubl_xml_transformer-0.2.1-py2.py3-none-any.whl (9.1 kB)
Installing collected packages: json-to-ubl-xml-transformer
Successfully installed json-to-ubl-xml-transformer-0.2.1


In [2]:
import json_to_ubl_xml_transformer

## 7. Credentials file

First make a separate text file called 'credentials.py' as follows:

username = "xy"
password = "abcd"

In [4]:
import credentials
username = credentials.username
password = credentials.password
print(username, password)

xy abcd
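This works because `credentials.py` is an ordinary Python module: `import` finds it as long as its folder is on `sys.path`, which includes the notebook's working directory. The sketch below reproduces the mechanism in a temporary folder. The module name `demo_credentials` is hypothetical, chosen so it does not clash with a real `credentials.py`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Write a small settings module; the name demo_credentials is hypothetical.
    Path(tmp, "demo_credentials.py").write_text('username = "xy"\npassword = "abcd"\n')

    sys.path.insert(0, tmp)  # make the folder importable
    importlib.invalidate_caches()  # make sure the new file is noticed
    try:
        creds = importlib.import_module("demo_credentials")
    finally:
        sys.path.remove(tmp)

username, password = creds.username, creds.password
print(username)
```

Keep the real `credentials.py` out of version control, for example by listing it in `.gitignore`.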
