### Part One: The I/O Functions

#### Note: You may not need to use all the contents here in the Web Analytics course, but it is always a good idea to learn them.
We will cover some basic I/O functions in this lab.

#### Read Keyboard Input

In [1]:
# Ask for user input and print it

# input_str will store the user input (input() always returns a string)
input_str = input("Enter something: ")  # Python 3.x: input(); Python 2.x: raw_input()
print("The input is {}".format(input_str))

Enter something: haha
The input is haha
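
Because input() always returns a string, numeric input has to be converted before you can do arithmetic with it. The following is a minimal sketch of one way to do that; the helper name `to_int_or_none` is hypothetical and not part of the lab.

```python
def to_int_or_none(text):
    """Return text converted to an int, or None if it is not a whole number."""
    try:
        return int(text.strip())
    except ValueError:
        # int() raises ValueError for strings such as "haha" or "3.5"
        return None


# In a notebook you could call: to_int_or_none(input("Enter a number: "))
print(to_int_or_none("42"))    # 42
print(to_int_or_none("haha"))  # None
```

Returning None instead of raising lets the calling code decide whether to ask the user again.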


#### Open and Close Files

Cheat Sheet for Access Modes:
- "r": Opens the file in read-only mode, starting at the beginning of the file. This is the default mode for the open() function.
- "w": Opens the file in write-only mode. Any existing file with the same name is emptied (overwritten) as soon as it is opened. A new file is created if one with the same name doesn't exist.
- "a": Opens the file for appending. The pointer is placed at the end of the file, so existing content is kept. A new file is created if one with the same name doesn't exist.
- "r+": Opens an existing file for reading and writing, placing the pointer at the beginning of the file. It raises an error if the file doesn't exist.
- "a+": Opens a file for both appending and reading. The pointer starts at the end of the file, and a new file is created if one with the same name doesn't exist.
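
To see the difference between "w" and "a" in practice, here is a small sketch that writes to a throwaway file in a temporary directory. The file name `modes_demo.txt` is an assumption made for this illustration, and the `with` statement it uses is the context manager covered later in this lab.

```python
import os
import tempfile

# Work in a temporary directory so no real files are touched
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "modes_demo.txt")  # hypothetical file name

# "w" creates the file (or empties an existing one)
with open(path, "w") as f:
    f.write("first\n")

# "a" keeps the existing content and writes at the end
with open(path, "a") as f:
    f.write("second\n")

# "r" reads from the beginning of the file
with open(path, "r") as f:
    after_append = f.read()

# Opening with "w" again discards everything written so far
with open(path, "w") as f:
    f.write("third\n")

with open(path, "r") as f:
    after_overwrite = f.read()

print(repr(after_append))
print(repr(after_overwrite))
```

After the append step the file holds both lines; after the second "w" only the last line remains.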

In [2]:
# Open a file in write-only mode; a new file is created if the specified file does not exist
file = open("text.txt", "w")
print("File Name: " + str(file.name))
print("File Close or Not: " + str(file.closed))
print("File Access Mode: " + str(file.mode))

File Name: text.txt
File Close or Not: False
File Access Mode: w


In [3]:
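# Close files
# Why close? Closing writes any buffered data to disk (until then, your edits may not
# appear in the file) and frees the system resources held by the file object.
file.close()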

#### Read and Write to Files

In [4]:
# Open and write to files

# Step 1: open/create the file you want to write to
f = open("text.txt", "a")  # Append mode: pointer placed at the end of the file

# Step 2: use the write() function to add contents to the file in "append" mode
f.write("Let's add some numbers!\n")

# Use a for loop to append 10 numbers: 0 ~ 9 
for i in range(10):
    f.write(str(i)+"\n")

# Step 3: close the file
f.close()

In [5]:
# Read files (mode "r" is enough when you only need to read)
f = open("text.txt", "r")
print(f.read())

# Close your file!!
f.close()

Let's add some numbers!
0
1
2
3
4
5
6
7
8
9



#### Best Practice to Work with Files: Use Context Manager

In [6]:
# Open a .txt file using a context manager
with open('text.txt', 'r') as f:
    # f_string = f.read()  # read the entire file as a single string
    f_list = f.readlines()  # read the entire file as a list of lines
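
The main benefit of the context manager is that the file is closed automatically when the `with` block ends, even if an error occurs inside it. The sketch below checks this on a throwaway file; the file name and the simulated error are assumptions made for illustration.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name

# Normal case: the file is closed as soon as the block ends
with open(path, "w") as f:
    f.write("hello\n")
closed_after_with = f.closed

# Error case: the file is still closed, although the block raised an exception
try:
    with open(path, "r") as g:
        raise RuntimeError("simulated failure while reading")
except RuntimeError:
    pass
closed_after_error = g.closed

print(closed_after_with, closed_after_error)  # True True
```

With a plain open()/close() pair, the same error would skip the close() call unless you wrapped it in try/finally yourself.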

In [7]:
# Convert a text file to a list in which each element corresponds to one line of the text file
f_content = []
with open('text.txt', 'r') as f:
    for line in f:
        f_content.append(line.strip())
f_content

["Let's add some numbers!", '0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

#### Another Example: Working with JSON Files Using a Context Manager

In [8]:
# import json package to get started
import json

# read file
with open('kids.json', 'r') as f:
    obj = json.load(f)  # load json file to a Python object
    print(type(obj))

<class 'dict'>


In [9]:
# TODO: Open the kids.json file to see its structure, then run the code below to see what each line does
obj['Children']  # shows the value associated with the key "Children"

[{'id': 1, 'name': 'Pete', 'age': 3, 'parents': 'human'},
 {'id': 2, 'name': 'Andy', 'age': 2, 'parents': 'martian'},
 {'id': 3, 'name': 'Helen', 'age': 4, 'parents': 'subterrnean'}]

In [10]:
type(obj['Children']) 

list

In [11]:
obj['Children'][0]  # shows the first element in the list 

{'id': 1, 'name': 'Pete', 'age': 3, 'parents': 'human'}

In [12]:
type(obj['Children'][0])

dict

In [13]:
obj['Children'][0]['id']

1

In [30]:
# show values
for kid in obj['Children']:
    # kid is <dict> type
    print("id: " + str(kid['id']))
    print("name: " + str(kid['name']))
    print("age: " + str(kid['age']))

id: 1
name: Pete
age: 3
id: 2
name: Andy
age: 2
id: 3
name: Helen
age: 4


In [15]:
with open('new_text.json', 'w') as f:
    json.dump(obj, f)
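
json.load() and json.dump() work with files; their counterparts json.loads() and json.dumps() work with strings, which is handy for inspecting data quickly. A minimal sketch, using a hypothetical record shaped like one entry of obj['Children']:

```python
import json

# Hypothetical record, shaped like one entry of obj['Children']
kid = {"id": 1, "name": "Pete", "age": 3}

# dumps() returns a JSON string; indent=2 makes it easier to read
text = json.dumps(kid, indent=2)
print(text)

# loads() parses a JSON string back into Python objects
restored = json.loads(text)
print(restored == kid)  # True
```

The same indent argument can be passed to json.dump() if you want the file on disk to be human-readable.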

### Part Two: The OS Module (Optional)
The OS module allows us to interact with the underlying operating system in several different ways:
- Navigate the file system
- Get information from files
- Rename and delete files
- ...

To start off, import the os module. Since it is a built-in module, you do not need to install anything.

In [16]:
# import os package
import os

#### Rename files

In [17]:
# Rename a file from text.txt to text2.txt
os.rename("text.txt", "text2.txt")

#### Delete Files

In [18]:
# Delete a file 
os.remove('text2.txt')

#### Create New Directory in the Current Directory

In [19]:
# Create a new directory aka folder in the current working directory
os.mkdir("test")

#### Change Directory

In [20]:
# Change the working directory to the one specified in ()
os.chdir("/Users/mandili/Desktop") 

#### Get the Current Working Directory

In [21]:
# Shows the current working directory
os.getcwd()

'/Users/mandili/Desktop'

#### Delete Directory

In [22]:
os.rmdir('/Users/mandili/Desktop/Web Analytics TA Fall 2020/test')
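
The cells above depend on files and paths that exist on one particular machine. If you want to try the same operations safely, the following sketch repeats them inside a temporary directory; using tempfile and os.path here is an assumption of this example, not part of the original lab.

```python
import os
import tempfile

base = tempfile.mkdtemp()  # throwaway directory for the experiment
old_path = os.path.join(base, "text.txt")
new_path = os.path.join(base, "text2.txt")
sub_dir = os.path.join(base, "test")

# Create a small file to work with
with open(old_path, "w") as f:
    f.write("hello\n")

os.rename(old_path, new_path)  # rename text.txt to text2.txt
os.mkdir(sub_dir)              # create a sub-directory
listing = sorted(os.listdir(base))
print(listing)  # ['test', 'text2.txt']

# Check that the file exists before deleting it to avoid a FileNotFoundError
if os.path.exists(new_path):
    os.remove(new_path)
os.rmdir(sub_dir)              # rmdir() only removes empty directories
remaining = os.listdir(base)
os.rmdir(base)
```

Building paths with os.path.join() keeps the code independent of the operating system's path separator.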

### Part Three: Import Data from Various File Types
You will learn how to open, read and write data into flat files using Python. Some common file types you may encounter later on are .csv, .json, .txt, and .xlsx. 

We will work with the pandas package to convert files into a pandas dataframe.

Quick note about pandas: it is a Python library for data manipulation and analysis. Please refer to the official documentation to learn more:
https://pandas.pydata.org/

#### Work with .csv/.txt

In [23]:
import pandas as pd

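# csv - comma separated values
# df_csv = pd.read_csv(r'Path where the CSV file is stored\File name.csv')

# Note: read_csv() treats the first line of the file as the header by default.
# In the output below, the first airport ended up as the column names, which suggests
# this file has no header line; passing header=None (plus names=[...]) would avoid that.
df_csv = pd.read_csv(r'Web Analytics TA Fall 2020/airports.csv')
df_csv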

Unnamed: 0,Key West Nas /Boca Chica Field (private U. S. Navy ),US,67,NQX
0,A L Mangham Jr. Regional,US,67,OCH
1,AAF Heliport,US,67,AYE
2,Aberdeen Regional,US,67,ABR
3,Abilene Regional,US,67,ABI
4,Abraham Lincoln Capital,US,67,SPI
...,...,...,...,...
2355,Downtown Heliport,VI,4,JCD
2356,Henry E Rohlsen,VI,4,STX
2357,SPB,VI,4,SSB
2358,SPB,VI,4,SPB
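
In the output above, the first airport appears where the column names should be, which suggests the file has no header line. The sketch below shows how header=None and names= handle that case. It reads two sample rows (copied from the output above) from an in-memory string instead of the real file, and the column names are hypothetical because the original file does not state them.

```python
import io

import pandas as pd

# Two sample rows in the same layout as the output above, with no header line
raw = "Aberdeen Regional,US,67,ABR\nAbilene Regional,US,67,ABI\n"

# Hypothetical column names; the real file does not provide any
cols = ["name", "country", "code", "iata"]

# header=None tells pandas that the first line is data, not column names
df_demo = pd.read_csv(io.StringIO(raw), header=None, names=cols)
print(df_demo.shape)  # (2, 4)
print(df_demo)
```

Without header=None, the first of the two rows would be used as the header and the DataFrame would contain only one row.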


In [24]:
# txt - plain text files usually contain data which is properly comma/space separated
# df_txt = pd.read_csv('Path where the TXT file is stored\File name.txt', sep=" ", header=None, names=["a", "b", "c"])
# Specify column names in names 

df_txt = pd.read_csv('Web Analytics TA Fall 2020/new_purchases.txt', 
                     sep='\t', header=None,  
                     names=["data", "time", "city", "topic", "amount", "payment type"])
df_txt

Unnamed: 0,data,time,city,topic,amount,payment type
0,2012-01-01,09:00,San Jose,Men's Clothing,214.05,Amex
1,2012-01-01,09:00,Fort Worth,Women's Clothing,153.57,Visa
2,2012-01-01,09:00,San Diego,Music,66.08,Cash
3,2012-01-01,09:00,Pittsburgh,Pet Supplies,493.51,Discover
4,2012-01-01,09:00,Omaha,Children's Clothing,235.63,MasterCard
...,...,...,...,...,...,...
802,2012-01-01,09:38,St. Louis,DVDs,159.51,Visa
803,2012-01-01,09:38,Milwaukee,Sporting Goods,74.16,Visa
804,2012-01-01,09:38,San Bernardino,Toys,394.52,Cash
805,2012-01-01,09:38,Sacramento,Video Games,97.16,Visa


#### Work with .json

In [25]:
df = pd.read_json(r'Web Analytics TA Fall 2020/kids.json', orient='records')
df  # You can see the result is not exactly what we wanted...

Unnamed: 0,Children
0,"{'id': 1, 'name': 'Pete', 'age': 3, 'parents':..."
1,"{'id': 2, 'name': 'Andy', 'age': 2, 'parents':..."
2,"{'id': 3, 'name': 'Helen', 'age': 4, 'parents'..."


In [26]:
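# pd.json_normalize() flattens a list of dicts into a table (available since pandas 1.0;
# the older "from pandas.io.json import json_normalize" import is deprecated)
with open('Web Analytics TA Fall 2020/kids.json', 'r') as f:
    obj = json.load(f)
df = pd.json_normalize(obj['Children'])
df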

Unnamed: 0,id,name,age,parents
0,1,Pete,3,human
1,2,Andy,2,martian
2,3,Helen,4,subterrnean
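
If you do not have kids.json at hand, the same flattening step can be tried on an in-memory dictionary. This is a minimal sketch that assumes pandas 1.0 or newer, where the function is available as pd.json_normalize; the data is a shortened copy of the structure shown earlier.

```python
import pandas as pd

# Shortened, in-memory copy of the kids.json structure
data = {
    "Children": [
        {"id": 1, "name": "Pete", "age": 3, "parents": "human"},
        {"id": 2, "name": "Andy", "age": 2, "parents": "martian"},
    ]
}

# Option 1: pass the list of records directly
df_kids = pd.json_normalize(data["Children"])

# Option 2: pass the whole object and name the key that holds the records
df_same = pd.json_normalize(data, record_path="Children")

print(df_kids)
```

Both calls produce one row per child and one column per key.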


#### Work with .xlsx/.xls (Excel)

In [27]:
# df = pd.read_excel (r'Path where the Excel file is stored\File name.xlsx')
df = pd.read_excel(r'Web Analytics TA Fall 2020/kid.xlsx')
df
# if you have a specific Excel sheet that you’d like to import, specify it in 'sheet_name'
# df = pd.read_excel (r'Path where the Excel file is stored\File name.xlsx', sheet_name='your Excel sheet name')

Unnamed: 0,id,name,age,parents
0,1,Pete,3,human
1,2,Andy,2,martian
2,3,Helen,4,subterrnean


#### Something extra: Print string to text files

In [28]:
Amount = 30
Total = 100.34
with open("output.txt", "w") as f:
    # f.write("Amount: {}".format(Amount))
    f.write("Total: {} Amount: {}".format(Total, Amount))
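
In Python 3.6 and later, f-strings are a more compact alternative to str.format(), and a format specification such as `:.2f` controls the number of decimal places. The sketch below writes a formatted line to a throwaway file and reads it back; the temporary path and lowercase variable names are assumptions made for this example.

```python
import os
import tempfile

amount = 30
total = 100.34

# :.2f formats the float with exactly two decimal places
line = f"Total: {total:.2f} Amount: {amount}\n"

# Write to a throwaway location so that no existing output.txt is overwritten
path = os.path.join(tempfile.mkdtemp(), "output.txt")
with open(path, "w") as f:
    f.write(line)

# Read the file back to confirm what was written
with open(path) as f:
    read_back = f.read()

print(read_back)
```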

#### Other useful links to learn more about Python
Official Python documentation: https://www.python.org/doc/

W3Schools: https://www.w3schools.com/python/

Python Video Tutorial (Programming with Mosh): https://www.youtube.com/watch?v=_uQrJ0TkZlc

Happy Learning!