## Lesson 7b - Taking Input, Reading and Writing Files



### Table of Contents

* Taking Input
* Reading Files
* Writing Files

<a id="input"></a>

### Taking Input

`input()` and `argv` are two different ways to take input from the user in Python scripts. Although `argv` doesn't work well with Jupyter notebooks, we will cover both because they can be useful in Python scripts.

#### `input()`

Newer versions of Jupyter Notebook support this kind of input. In notebooks, however, we mostly just 'hard code' the value for a variable.

In [1]:
num = int(input("enter the number"))

enter the number666


In [2]:
print(num)

666


In [3]:
type(num)

int

In [None]:
# 'hard code' the value for a variable
y = 6
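Because `input()` always returns a string, `int(input(...))` raises a `ValueError` when the user types something that is not a whole number. The following is a minimal sketch of one way to guard against that. The helper name `read_int` and its `reader` and `max_tries` parameters are hypothetical, introduced here only so the function can be tried without a keyboard.

```python
def read_int(prompt, reader=input, max_tries=3):
    """Ask for a whole number, retrying when the text is not a valid int.

    `reader` defaults to the built-in input(); passing another callable
    (an assumption made for this sketch) lets us simulate a user.
    """
    for _ in range(max_tries):
        text = reader(prompt)
        try:
            return int(text.strip())
        except ValueError:
            print(f"{text!r} is not a whole number, please try again.")
    raise ValueError(f"no valid number after {max_tries} tries")


# Simulate a user who first types "abc" and then "666".
answers = iter(["abc", "666"])
num = read_int("enter the number: ", reader=lambda prompt: next(answers))
print(num, type(num))
```

In a script or notebook you would simply call `read_int("enter the number: ")` and type the value yourself.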

#### `argv`

When you import the special variable `argv` from the `sys` module (`from sys import argv`), it allows you to pass strings, numbers, and filenames to your Python code from the command line. Every value arrives as a string, so numbers must be converted, for example with `int()`. It doesn't work in Jupyter notebooks, however, so you'll have to use a workaround: comment out the `argv` calls and hard code the values you would have passed. Later, after selecting "Download as > Python (.py)", you can open the .py file and uncomment the `argv` calls. Either way, it's a good idea to define all your variables and file paths at the start of your notebook.

![argv.png](attachment:argv.png)
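As a rough sketch of the workaround described above, the value can be taken from `sys.argv` when one is supplied and fall back to a hard-coded default otherwise. The function name `pick_filename` and the script name `myscript.py` are hypothetical. The lists passed in below only imitate what `sys.argv` looks like in a script.

```python
import sys


def pick_filename(args, default="sales_data.csv"):
    """Return the first command-line argument, or `default` if none is given.

    `args` is expected to look like sys.argv: args[0] is the script name
    and the user-supplied values start at args[1].
    """
    if len(args) > 1:
        return args[1]
    return default


# For a script run as `python myscript.py other.csv`, sys.argv would be
# ['myscript.py', 'other.csv'].
print(pick_filename(["myscript.py", "other.csv"]))
print(pick_filename(["myscript.py"]))

# In the exported .py file you would uncomment this line:
# filename = pick_filename(sys.argv)
```

The last line stays commented out in a notebook because there `sys.argv` holds the kernel's own arguments rather than yours.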

### Reading Files

* Python provides built-in functions for creating, writing, and reading files.
* We can read in a text file using `open()` and then print or use it all at once or one line at a time.
* Note that when we read lines from a file, the file handle object (called a `TextIOWrapper`) moves its position forward past them. Those lines are not returned again by later reads unless we reopen the file or rewind it with `seek(0)`.
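A file object keeps track of a current position, which is why content that has already been read is not returned a second time. The short sketch below illustrates this with a throwaway file (the name `demo.txt` and its content are made up for the example) rather than `sales_data.csv`.

```python
import os
import tempfile

# Create a small temporary file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as f:
    f.write("first line\nsecond line\n")

with open(path) as f:
    first_read = f.read()    # reads everything; the position is now at the end
    second_read = f.read()   # nothing is left, so this is an empty string
    f.seek(0)                # move the position back to the start of the file
    third_read = f.read()    # the full content is available again

print(repr(first_read))
print(repr(second_read))
print(first_read == third_read)
```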

### File Access Modes
Access modes govern the type of operations possible on the opened file, that is, how the file will be used once it's opened. The six text access modes used most often in Python are:

* **Read Only (`'r'`)**: Opens a text file for reading. The handle is positioned at the beginning of the file. If the file does not exist, a `FileNotFoundError` (an I/O error) is raised. This is also the default mode in which a file is opened.
* **Read and Write (`'r+'`)**: Opens the file for reading and writing. The handle is positioned at the beginning of the file. Raises a `FileNotFoundError` if the file does not exist.
* **Write Only (`'w'`)**: Opens the file for writing. For an existing file, the data is truncated and overwritten. The handle is positioned at the beginning of the file. Creates the file if it does not exist.
* **Write and Read (`'w+'`)**: Opens the file for reading and writing. For an existing file, the data is truncated and overwritten. The handle is positioned at the beginning of the file. Creates the file if it does not exist.
* **Append Only (`'a'`)**: Opens the file for writing. The file is created if it does not exist. The handle is positioned at the end of the file. The data being written is inserted at the end, after the existing data.
* **Append and Read (`'a+'`)**: Opens the file for reading and writing. The file is created if it does not exist. The handle is positioned at the end of the file. The data being written is inserted at the end, after the existing data.
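The differences between some of these modes can be checked with a small experiment. This is only a sketch: the file names `modes_demo.txt` and `missing.txt` are made up, and the files live in temporary directories so nothing in your working folder is touched.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# 'w' creates the file (or empties an existing one) and writes from the start.
with open(path, "w") as f:
    f.write("one\n")

# 'a' keeps the existing data and adds new data at the end.
with open(path, "a") as f:
    f.write("two\n")

# 'r' only allows reading.
with open(path, "r") as f:
    after_append = f.read()

# 'w' again: the previous content is discarded before writing.
with open(path, "w") as f:
    f.write("three\n")

with open(path) as f:  # no mode given, so the default 'r' is used
    after_rewrite = f.read()

# 'r' on a file that does not exist raises FileNotFoundError.
try:
    open(os.path.join(tempfile.mkdtemp(), "missing.txt"), "r")
    missing_error = None
except FileNotFoundError as err:
    missing_error = type(err).__name__

print(repr(after_append))
print(repr(after_rewrite))
print(missing_error)
```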

### Syntax to open a file in a certain mode


```
File_object = open("File_Name","Access_Mode")
```

In [1]:
# save file name to be used
filename = 'sales_data.csv'

#### Read a file all at once

`read()`: Returns what was read as a string. With an argument `n`, it reads at most `n` characters (in text mode); if `n` is not specified, it reads the entire file.

In [5]:


txt = open(filename)

print("Content of file %r:" % filename)

#print(txt.read())
print(txt.read(10))

Content of file 'sales_data.csv':
Date,Custo


In [6]:
# close the file
txt.close()

In [11]:
type(txt)

_io.TextIOWrapper

#### Read one line at a time

`readline()`: Reads one line of the file and returns it as a string. If `n` is specified, it reads at most `n` characters, but never more than one line, even if `n` exceeds the length of the line.

![image.png](attachment:image.png)

In [9]:
txt = open(filename)
# read at most 5 characters of the line
txt.readline(5)

'Date,'

In [10]:
txt.readline()

'Customer_Age,Age_Group,Customer_Gender,Country,State,Product_Category,Sub_Category,Product,Order_Quantity,Unit_Cost,Unit_Price,Profit,Cost,Revenue\n'

In [11]:
txt.readline()

'11/26/2013,19,Youth (<25),M,Canada,British Columbia,Accessories,Bike Racks,Hitch Rack - 4-Bike,8,45,120,590,360,950\n'

In [12]:
txt.readline()

'11/26/2015,19,Youth (<25),M,Canada,British Columbia,Accessories,Bike Racks,Hitch Rack - 4-Bike,8,45,120,590,360,950\n'

In [13]:
txt.close()

#### Read lines as a list

`readlines()`: Reads all the lines and returns them as a list in which each line is a string element.

![image.png](attachment:image.png)

In [14]:
txt = open(filename)

# read all the lines
txt.readlines() 

['Date,Customer_Age,Age_Group,Customer_Gender,Country,State,Product_Category,Sub_Category,Product,Order_Quantity,Unit_Cost,Unit_Price,Profit,Cost,Revenue\n',
 '11/26/2013,19,Youth (<25),M,Canada,British Columbia,Accessories,Bike Racks,Hitch Rack - 4-Bike,8,45,120,590,360,950\n',
 '11/26/2015,19,Youth (<25),M,Canada,British Columbia,Accessories,Bike Racks,Hitch Rack - 4-Bike,8,45,120,590,360,950\n',
 '3/23/2014,49,Adults (35-64),M,Australia,New South Wales,Accessories,Bike Racks,Hitch Rack - 4-Bike,23,45,120,1366,1035,2401\n',
 '3/23/2016,49,Adults (35-64),M,Australia,New South Wales,Accessories,Bike Racks,Hitch Rack - 4-Bike,20,45,120,1188,900,2088\n',
 '5/15/2014,47,Adults (35-64),F,Australia,New South Wales,Accessories,Bike Racks,Hitch Rack - 4-Bike,4,45,120,238,180,418\n',
 '5/15/2016,47,Adults (35-64),F,Australia,New South Wales,Accessories,Bike Racks,Hitch Rack - 4-Bike,5,45,120,297,225,522\n',
 '5/22/2014,47,Adults (35-64),F,Australia,Victoria,Accessories,Bike Racks,Hitch Rack - 

In [15]:
txt.close()

#### Open in a `with` block. Then use `for` loop, `read()`, `readline()`, or `readlines()`.

In [16]:
with open(filename, 'r') as f:
    for line in f:
        # Remove any whitespace (including the newline) at the end of the line
        line = line.rstrip()
        print(line)

Date,Customer_Age,Age_Group,Customer_Gender,Country,State,Product_Category,Sub_Category,Product,Order_Quantity,Unit_Cost,Unit_Price,Profit,Cost,Revenue
11/26/2013,19,Youth (<25),M,Canada,British Columbia,Accessories,Bike Racks,Hitch Rack - 4-Bike,8,45,120,590,360,950
11/26/2015,19,Youth (<25),M,Canada,British Columbia,Accessories,Bike Racks,Hitch Rack - 4-Bike,8,45,120,590,360,950
3/23/2014,49,Adults (35-64),M,Australia,New South Wales,Accessories,Bike Racks,Hitch Rack - 4-Bike,23,45,120,1366,1035,2401
3/23/2016,49,Adults (35-64),M,Australia,New South Wales,Accessories,Bike Racks,Hitch Rack - 4-Bike,20,45,120,1188,900,2088
5/15/2014,47,Adults (35-64),F,Australia,New South Wales,Accessories,Bike Racks,Hitch Rack - 4-Bike,4,45,120,238,180,418
5/15/2016,47,Adults (35-64),F,Australia,New South Wales,Accessories,Bike Racks,Hitch Rack - 4-Bike,5,45,120,297,225,522
5/22/2014,47,Adults (35-64),F,Australia,Victoria,Accessories,Bike Racks,Hitch Rack - 4-Bike,4,45,120,199,180,379
5/22/2016,47,Adu

In [None]:
with open(filename, 'r') as f:
    lines = f.read()

In [33]:
lines

'Date,Customer_Age,Age_Group,Customer_Gender,Country,State,Product_Category,Sub_Category,Product,Order_Quantity,Unit_Cost,Unit_Price,Profit,Cost,Revenue\n11/26/2013,19,Youth (<25),M,Canada,British Columbia,Accessories,Bike Racks,Hitch Rack - 4-Bike,8,45,120,590,360,950\n11/26/2015,19,Youth (<25),M,Canada,British Columbia,Accessories,Bike Racks,Hitch Rack - 4-Bike,8,45,120,590,360,950\n3/23/2014,49,Adults (35-64),M,Australia,New South Wales,Accessories,Bike Racks,Hitch Rack - 4-Bike,23,45,120,1366,1035,2401\n3/23/2016,49,Adults (35-64),M,Australia,New South Wales,Accessories,Bike Racks,Hitch Rack - 4-Bike,20,45,120,1188,900,2088\n5/15/2014,47,Adults (35-64),F,Australia,New South Wales,Accessories,Bike Racks,Hitch Rack - 4-Bike,4,45,120,238,180,418\n5/15/2016,47,Adults (35-64),F,Australia,New South Wales,Accessories,Bike Racks,Hitch Rack - 4-Bike,5,45,120,297,225,522\n5/22/2014,47,Adults (35-64),F,Australia,Victoria,Accessories,Bike Racks,Hitch Rack - 4-Bike,4,45,120,199,180,379\n5/22/20
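Once a file is open in a `with` block, `readline()` and a `for` loop can be combined to separate the header from the data rows. The sketch below is an illustration only: it writes the first three lines of the output shown above to a temporary file (the name `sales_sample.csv` is made up) and splits each line on commas. This naive split assumes no field contains a comma itself. For real CSV files, the `csv` module or pandas is the safer choice.

```python
import os
import tempfile

# A tiny stand-in for sales_data.csv, using the first lines shown above.
sample = (
    "Date,Customer_Age,Age_Group,Customer_Gender,Country,State,"
    "Product_Category,Sub_Category,Product,Order_Quantity,Unit_Cost,"
    "Unit_Price,Profit,Cost,Revenue\n"
    "11/26/2013,19,Youth (<25),M,Canada,British Columbia,Accessories,"
    "Bike Racks,Hitch Rack - 4-Bike,8,45,120,590,360,950\n"
    "11/26/2015,19,Youth (<25),M,Canada,British Columbia,Accessories,"
    "Bike Racks,Hitch Rack - 4-Bike,8,45,120,590,360,950\n"
)
path = os.path.join(tempfile.mkdtemp(), "sales_sample.csv")
with open(path, "w") as f:
    f.write(sample)

with open(path, "r") as f:
    # The first line holds the column names.
    header = f.readline().rstrip("\n").split(",")
    # The loop continues from the current position, i.e. the second line.
    rows = [line.rstrip("\n").split(",") for line in f]

print(len(header), "columns,", len(rows), "data rows")
print(rows[0][header.index("Country")])
```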

<a id="writing"></a>

### Writing Files

We can write text to a file using `write()`. Note that `write()` does not add a newline at the end, so we write `'\n'` ourselves.

In [34]:
outfile = 'copy_of_sale_data.txt'

In [35]:
# some text to write
line1 = "I am a student of SDS, CUBE!"
line2 = "I have been in Beijing for the last two years,"
line3 = "SDS is situated in Fengtai, Beijing, China."

#### Write the most basic way

In [29]:
target = open(outfile, 'w')   # use 'a' or 'a+' instead to append at the end

target.write(line1)
target.write('\n')
target.write(line2)
target.write('\n')
target.write(line3)
target.write('\n')
target.close()

In [30]:
type(target)

_io.TextIOWrapper

#### Write in a `with` block

Again, we can use `with` to simplify things (avoid having to `close()` the file).

In [36]:
with open(outfile, 'w') as target:
    target.write(line1)
    target.write('\n')
    target.write(line2)
    target.write('\n')
    target.write(line3)
    target.write('\n')
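When the lines are already in a list, `writelines()` can replace the repeated `write()` calls. This is a minimal sketch with made-up text and a temporary output file (`demo_output.txt` is a hypothetical name). Like `write()`, `writelines()` does not add newlines for us.

```python
import os
import tempfile

lines = ["first line", "second line", "third line"]
outfile = os.path.join(tempfile.mkdtemp(), "demo_output.txt")

with open(outfile, "w") as target:
    # writelines() writes each string as-is, so we append '\n' ourselves.
    target.writelines(line + "\n" for line in lines)

# Read the file back to check what was written.
with open(outfile) as f:
    written = f.read()

print(written)
```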

#### Reading and writing to files with Pandas.

In [17]:
# import required packages
import pandas as pd

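# header=None treats the first line as data, so the column names show up as row 0;
# omit it to use the first line of the file as the column header
df = pd.read_csv(filename, header=None)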

In [43]:
df

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14
0,Date,Customer_Age,Age_Group,Customer_Gender,Country,State,Product_Category,Sub_Category,Product,Order_Quantity,Unit_Cost,Unit_Price,Profit,Cost,Revenue
1,11/26/2013,19,Youth (<25),M,Canada,British Columbia,Accessories,Bike Racks,Hitch Rack - 4-Bike,8,45,120,590,360,950
2,11/26/2015,19,Youth (<25),M,Canada,British Columbia,Accessories,Bike Racks,Hitch Rack - 4-Bike,8,45,120,590,360,950
3,3/23/2014,49,Adults (35-64),M,Australia,New South Wales,Accessories,Bike Racks,Hitch Rack - 4-Bike,23,45,120,1366,1035,2401
4,3/23/2016,49,Adults (35-64),M,Australia,New South Wales,Accessories,Bike Racks,Hitch Rack - 4-Bike,20,45,120,1188,900,2088
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
195,1/1/2014,24,Youth (<25),M,Australia,New South Wales,Accessories,Bike Racks,Hitch Rack - 4-Bike,8,45,120,475,360,835
196,1/1/2016,24,Youth (<25),M,Australia,New South Wales,Accessories,Bike Racks,Hitch Rack - 4-Bike,6,45,120,356,270,626
197,3/1/2014,24,Youth (<25),F,Australia,Victoria,Accessories,Bike Racks,Hitch Rack - 4-Bike,2,45,120,100,90,190
198,3/1/2016,24,Youth (<25),F,Australia,Victoria,Accessories,Bike Racks,Hitch Rack - 4-Bike,4,45,120,199,180,379


#### Write with Pandas to comma-separated values or tab-separated values

The pandas DataFrame provides the `to_csv()` method to write (export) a DataFrame to a delimited text file, comma-separated by default, along with the header and index.

In [34]:
df.to_csv('woodchuck_pandas.csv')

By default, the CSV file is created with a comma delimiter, a column header, and the row index. You can change this behavior by supplying parameters to the method. `to_csv()` takes multiple optional parameters, as shown in the syntax below. The exact list depends on the pandas version (for example, `line_terminator` was renamed `lineterminator` in newer releases).

# to_csv() Syntax
df.to_csv(path_or_buf=None, sep=',', na_rep='', float_format=None, 
columns=None, header=True, index=True, index_label=None, mode='w', encoding=None, 
compression='infer', quoting=None, quotechar='"', line_terminator=None, 
chunksize=None, date_format=None, doublequote=True, escapechar=None, 
decimal='.', errors='strict', storage_options=None)

In [35]:
df.to_csv('student.csv', sep='\t')

In [39]:
df.to_csv('student.csv', sep=',')

In [19]:
# creating a DataFrame
students = {'Student': ['Amit', 'Cody',
                        'Darren', 'Drew'],
            'RollNumber': [1, 5, 10, 15],
            'Grade': ['A', 'C', 'F', 'B']}
df = pd.DataFrame(students,
                  columns =['Student', 'RollNumber',
                            'Grade'])
 
# saving as a CSV file
df.to_csv('Students.csv', sep ='\t')
 
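# loading the file with the default comma separator;
# the file is tab-separated, so each row is read as a single column
# (pass sep='\t' to read_csv to split it into columns)
new_df = pd.read_csv('Students.csv')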
 
# displaying the new DataFrame
print('Data from Students.csv:')
print(new_df.head())

Data from Students.csv:
  \tStudent\tRollNumber\tGrade
0                0\tAmit\t1\tA
1                1\tCody\t5\tC
2             2\tDarren\t10\tF
3               3\tDrew\t15\tB
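The tabs in the output above appear because the file was written with `sep='\t'` but read back with the default comma separator. A minimal sketch of a matching round trip follows. It assumes pandas is installed, writes to a temporary file (`students.tsv` is a made-up name), and also passes `index=False` so the row index is not saved as an extra column.

```python
import os
import tempfile

import pandas as pd

students = pd.DataFrame({
    "Student": ["Amit", "Cody", "Darren", "Drew"],
    "RollNumber": [1, 5, 10, 15],
    "Grade": ["A", "C", "F", "B"],
})

path = os.path.join(tempfile.mkdtemp(), "students.tsv")

# Write tab-separated values and leave out the row index.
students.to_csv(path, sep="\t", index=False)

# Read with the same separator so the columns are split correctly.
restored = pd.read_csv(path, sep="\t")

print(restored.columns.tolist())
print(restored)
```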
