* Common File Operations
* Getting Started with File I/O
* Open File in read-only mode
* Convert Data into Collection
* Create list of tuples
* Create list of dicts
* Filter for invalid sales data
* Compute Net Sale Revenue for valid sale
* Compute Total Net Revenue
* Overview of writing to files
* Write list of tuples to File
* Exercise and Solution

### Common File Operations
These are the common file operations covered in this section. The examples use text files that contain comma-separated data.
* Read data from file.
* Create File and write data into it.
* Append data to existing file.
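
The three operations above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical scratch file created in a temporary directory (the path and its contents are made up for the example). It uses the `with` statement, which closes the file automatically when the block ends.

```python
import os
import tempfile

# Hypothetical scratch file so the example does not touch real data
path = os.path.join(tempfile.mkdtemp(), 'demo.csv')

# Create the file and write data into it ('w' truncates an existing file)
with open(path, 'w') as f:
    f.write('1,1000.0\n')

# Append data to the existing file ('a' keeps the current content)
with open(path, 'a') as f:
    f.write('2,500.0\n')

# Read data from the file ('r' is the default mode)
with open(path) as f:
    content = f.read()

print(content)
```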

### Getting Started with File I/O
The steps for performing file I/O in Python are as follows.
* Create a file object using `open`, passing the relative or absolute path as a string. By default, the file is opened in read-only mode.
* Use methods such as `read` to read the content of the file into a Python object (`str` for text files).
* Use methods such as `write` on the file object to write to a file. For this, the file must be opened in write mode.

In [None]:
# Open File in read-only mode (default)
# Relative Path: data/sales/part-00000

f = open('data/sales/part-00000') # same as open('path', 'r')

In [None]:
type(f)

In [None]:
dir(f)

In [None]:
f.readable()

In [None]:
f.read() # reads data from file into memory as str object
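
One detail worth knowing about `read`: it consumes the file from the current position to the end, so a second call on the same file object returns an empty string. The sketch below demonstrates this on a hypothetical scratch file (not the sales data) and shows `seek(0)` as one way to start over without reopening.

```python
import os
import tempfile

# Hypothetical scratch file with two short lines
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(path, 'w') as f:
    f.write('a,b\n1,2\n')

f = open(path)
first = f.read()    # the whole file as one str
second = f.read()   # '' because the position is already at the end
f.seek(0)           # move the position back to the start of the file
third = f.read()    # the whole file again
f.close()

print(repr(first), repr(second), repr(third))
```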

In [None]:
# Convert Data into Collection

f = open('data/sales/part-00000')
data = f.read()

In [None]:
data

In [None]:
recs = data.splitlines() # similar to data.split('\n'), but without a trailing empty string
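
`splitlines` and `split('\n')` differ when the data ends with a newline, which is common for text files. A small sketch with made-up content:

```python
# Hypothetical file content that ends with a newline character
data = 'sale_id,sale_amount\n1,1000.0\n2,500.0\n'

lines_a = data.splitlines()   # no empty string at the end
lines_b = data.split('\n')    # the final '\n' produces a trailing ''

print(lines_a)
print(lines_b)
```

The trailing empty string from `split('\n')` would break later steps such as `int(s[0])`, so `splitlines` is the safer choice here.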

In [None]:
recs

In [None]:
recs[0]

In [None]:
recs[1:]

In [None]:
# Create list of tuples (using loops)

f = open('data/sales/part-00000')
recs = f.read().splitlines()[1:]

In [None]:
s = recs[0].split(',')

In [None]:
(int(s[0]), int(s[1]), float(s[2]), int(s[3]) if s[3] != '' else None)

In [None]:

sales_tuples = []
for rec in recs:
    s = rec.split(',')
    sales_tuples.append((int(s[0]), int(s[1]), float(s[2]), int(s[3]) if s[3] != '' else None))
    
sales_tuples
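
The same conversion can be written with a helper function and a list comprehension. This is only a sketch: `parse_sale` is a hypothetical name, and the two sample records are made up to mirror the four-column layout used above.

```python
def parse_sale(rec):
    """Convert one comma-separated record into a typed tuple."""
    s = rec.split(',')
    # An empty fourth field is stored as None
    fourth = int(s[3]) if s[3] != '' else None
    return (int(s[0]), int(s[1]), float(s[2]), fourth)

# Hypothetical sample records (the second has an empty fourth field)
recs = ['1,101,1000.0,10', '2,102,500.0,']

sales_tuples = [parse_sale(rec) for rec in recs]
print(sales_tuples)
```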

In [None]:
# Create list of tuples (using csv)

import csv

In [None]:

f = open('data/sales/part-00000')
recs = f.read().splitlines()[1:]

In [None]:
# csv.reader yields each record as a list of strings (no type conversion)
for rec in csv.reader(recs):
    print(tuple(rec))
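
One reason to prefer `csv.reader` over `str.split(',')` is that it understands quoted fields. The sketch below uses a made-up record with a quoted field that contains a comma; the sales data in this section may not contain such fields.

```python
import csv

# Hypothetical record whose second field is quoted and contains a comma
line = '1,"Smith, John",1000.0'

naive = line.split(',')            # splits inside the quoted field
parsed = next(csv.reader([line]))  # keeps the quoted field together

print(naive)
print(parsed)
```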

In [None]:
# Create list of dicts (using loops)

f = open('data/sales/part-00000')
data = f.read().splitlines()

In [None]:
header = data[0]

In [None]:
header

In [None]:
recs = data[1:]

In [None]:
recs

In [None]:
columns = header.split(',')

In [None]:
s = recs[0].split(',')

In [None]:
row = (int(s[0]), int(s[1]), float(s[2]), int(s[3]) if s[3] != '' else None)

In [None]:
dict(zip(columns, row))

In [None]:
sales_dicts = []
for rec in recs:
    s = rec.split(',')
    row = (int(s[0]), int(s[1]), float(s[2]), int(s[3]) if s[3] != '' else None)
    sales_dicts.append(dict(zip(columns, row)))
    
sales_dicts

In [None]:
# Create list of dicts (using csv)
f = open('data/sales/part-00000')
data = f.read().splitlines()

In [None]:
for rec in csv.DictReader(data):
    print(rec)
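
`csv.DictReader` also accepts the file object directly, so the `read().splitlines()` step is optional. The sketch below assumes a hypothetical scratch file; the column name `sale_rep_id` is invented for illustration, while the other three names are the ones used in this section. Because a `DictReader` can only be iterated once, it is wrapped in `list` here.

```python
import csv
import os
import tempfile

# Hypothetical stand-in for the sales file
path = os.path.join(tempfile.mkdtemp(), 'part-00000')
with open(path, 'w', newline='') as f:
    f.write('sale_id,sale_rep_id,sale_amount,commission_pct\n')
    f.write('1,101,1000.0,10\n')
    f.write('2,102,500.0,\n')

# The csv documentation recommends newline='' for files passed to the csv module
with open(path, newline='') as f:
    sales_dicts = list(csv.DictReader(f))

# All values are strings; an empty field is an empty string
print(sales_dicts)
```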

In [None]:
# Filter for invalid sales data (using list of dicts)
# A sale is invalid when commission_pct is missing or negative
f = open('data/sales/part-00000')
data = f.read().splitlines()
sales = csv.DictReader(data)

In [None]:
sales_filtered = []

for sale in sales:
    if (sale['commission_pct'] == '' or int(sale['commission_pct']) < 0):
        sales_filtered.append(sale)

sales_filtered
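
The validity rule can be pulled into a small function so that the same check serves both the invalid and the valid case. This is a sketch with a hypothetical helper name (`is_invalid`) and made-up records.

```python
def is_invalid(sale):
    """A sale is invalid when commission_pct is missing or negative."""
    pct = sale['commission_pct']
    return pct == '' or int(pct) < 0

# Hypothetical records in the shape produced by csv.DictReader
sales = [
    {'sale_id': '1', 'sale_amount': '1000.0', 'commission_pct': '10'},
    {'sale_id': '2', 'sale_amount': '500.0', 'commission_pct': ''},
    {'sale_id': '3', 'sale_amount': '250.0', 'commission_pct': '-5'},
]

invalid_sales = [sale for sale in sales if is_invalid(sale)]
valid_sales = [sale for sale in sales if not is_invalid(sale)]

print([sale['sale_id'] for sale in invalid_sales])
print([sale['sale_id'] for sale in valid_sales])
```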

In [None]:
# Compute Net Sale Revenue for valid sale
# sale_amount - commission_amount
# commission_amount = (sale_amount * commission_pct) / 100

f = open('data/sales/part-00000')
data = f.read().splitlines()
sales = csv.DictReader(data)

In [None]:

sales_net_revenue = []

for sale in sales:
    if not (sale['commission_pct'] == '' or int(sale['commission_pct']) < 0):
        sale_amount = float(sale['sale_amount'])
        commission_amount = (sale_amount * int(sale['commission_pct'])) / 100
        sales_net_revenue.append((int(sale['sale_id']), sale_amount - commission_amount))

sales_net_revenue
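
To make the formula concrete, here is a worked sketch with made-up numbers. The function name `net_revenue` is hypothetical; the arithmetic follows the formula stated in the comments above.

```python
def net_revenue(sale_amount, commission_pct):
    """sale_amount minus the commission, where commission_pct is a percentage."""
    commission_amount = (sale_amount * commission_pct) / 100
    return sale_amount - commission_amount

# Hypothetical valid sales as (sale_id, sale_amount, commission_pct)
valid_sales = [(1, 1000.0, 10), (4, 400.0, 20)]

sales_net_revenue = [(sale_id, net_revenue(amount, pct)) for sale_id, amount, pct in valid_sales]
total_net_revenue = sum(sale[1] for sale in sales_net_revenue)

print(sales_net_revenue)   # [(1, 900.0), (4, 320.0)]
print(total_net_revenue)   # 1220.0
```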

In [None]:
# Compute Total Net Revenue
sum(sale[1] for sale in sales_net_revenue)

In [None]:
# Overview of writing to files
# Open file in write mode
# Write Python object to file
# Close the file

f = open('data/sales/dummy1.csv', 'w')

In [None]:
l = [1, 2, 3, 4]

In [None]:
for i in l:
    f.write(str(i))
    f.write('\n')

In [None]:
f.close()
# Open the file in a file explorer or text editor to validate its contents

In [None]:
# Using Context Manager (read)

with open('data/sales/part-00000') as f:
    data = f.read().splitlines()

data

In [None]:
# Using Context Manager (write)

l = [1, 2, 3, 4]
with open('data/sales/dummy2.csv', 'w') as f:
    for i in l:
        f.write(str(i))
        f.write('\n')
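
The benefit of the context manager is that the file is closed when the `with` block ends, even if an error occurs inside it. A minimal sketch, assuming a hypothetical scratch file and a simulated error:

```python
import os
import tempfile

# Hypothetical scratch file
path = os.path.join(tempfile.mkdtemp(), 'dummy.csv')

with open(path, 'w') as f:
    f.write('1\n')
closed_after_with = f.closed   # True: no explicit f.close() needed

try:
    with open(path) as g:
        raise ValueError('simulated failure while processing')
except ValueError:
    pass
closed_after_error = g.closed  # True: closed despite the exception

print(closed_after_with, closed_after_error)
```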

In [None]:
# Write list of tuples to File (using csv)
f = open('data/sales/part-00000')
data = f.read().splitlines()
sales = csv.DictReader(data)

In [None]:
sales_net_revenue = []

for sale in sales:
    if not (sale['commission_pct'] == '' or int(sale['commission_pct']) < 0):
        sale_amount = float(sale['sale_amount'])
        commission_amount = (sale_amount * int(sale['commission_pct'])) / 100
        sales_net_revenue.append((int(sale['sale_id']), sale_amount - commission_amount))

sales_net_revenue

In [None]:
help(csv.writer)

In [None]:
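# newline='' is recommended by the csv documentation to avoid extra blank lines on some platforms
with open('data/sales/sale_revenue.csv', 'w', newline='') as sales_f:
    writer = csv.writer(sales_f)
    writer.writerows(sales_net_revenue)
    
# Review the result by opening the file in a file explorer or text editor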
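
Instead of opening the file by hand, the output can also be checked by reading it back. The sketch below writes to a hypothetical temporary path, adds a header row with made-up column names (`sale_id`, `net_revenue`), and reads the result with `csv.reader`.

```python
import csv
import os
import tempfile

# Hypothetical results in the same shape as sales_net_revenue
sales_net_revenue = [(1, 900.0), (4, 320.0)]
path = os.path.join(tempfile.mkdtemp(), 'sale_revenue.csv')

with open(path, 'w', newline='') as sales_f:
    writer = csv.writer(sales_f)
    writer.writerow(['sale_id', 'net_revenue'])  # header row (names are assumptions)
    writer.writerows(sales_net_revenue)          # one line per tuple

# Read the file back; every value comes back as a string
with open(path, newline='') as sales_f:
    rows = list(csv.reader(sales_f))

print(rows)
```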

* Exercise: Read, Process and Write (using File I/O and csv)
  * Read sales data from `data/sales/part-00000` using `csv.reader`.
  * Filter for valid sales (`commission_pct` present and not negative) with `sale_amount` <= 500.
  * Write the filtered data to `data/sales/filtered.csv` using `csv.writer`.
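
One possible solution is sketched below. It is not the only approach. To stay self-contained it works on a hypothetical copy of the data in a temporary directory (the sample rows and the `sale_rep_id` column name are made up), it assumes the sale amount is the third column and the commission percentage the fourth, and it chooses to keep the header row in the output.

```python
import csv
import os
import tempfile

# Hypothetical stand-ins for data/sales/part-00000 and data/sales/filtered.csv
base_dir = tempfile.mkdtemp()
src = os.path.join(base_dir, 'part-00000')
dst = os.path.join(base_dir, 'filtered.csv')
with open(src, 'w', newline='') as f:
    f.write(
        'sale_id,sale_rep_id,sale_amount,commission_pct\n'
        '1,101,1000.0,10\n'
        '2,102,500.0,\n'
        '3,103,250.0,-5\n'
        '4,104,400.0,20\n'
    )

def is_valid(rec):
    """Valid when the commission percentage (4th field) is present and not negative."""
    return rec[3] != '' and int(rec[3]) >= 0

# Read: skip the header, then keep valid sales with sale amount <= 500
with open(src, newline='') as f:
    reader = csv.reader(f)
    header = next(reader)
    filtered = [rec for rec in reader if is_valid(rec) and float(rec[2]) <= 500]

# Write: header first, then the filtered records
with open(dst, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerows(filtered)

print(filtered)
```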