## File Handling
So far we have seen different Python data types. We usually store our data in files of different formats. In addition to handling files, we will also see different file formats (.txt, .json, .xml, .csv, .tsv, .xlsx) in this section. First, let us get familiar with handling files using the most common file format, .txt.

File handling is an important part of programming which allows us to create, read, update and delete files. In Python, we use the built-in open() function to handle files.

### Syntax
##### open('filename', mode)  # mode is a combination of r, a, w, x, t, b and decides whether we read, write or update

- "r" - Read - Default value. Opens a file for reading, raises an error if the file does not exist
- "a" - Append - Opens a file for appending, creates the file if it does not exist
- "w" - Write - Opens a file for writing, creates the file if it does not exist and erases its content if it does
- "x" - Create - Creates the specified file, raises an error if the file already exists
- "t" - Text - Default value. Text mode
- "b" - Binary - Binary mode (e.g. images)
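
As a rough illustration of the modes listed above, the sketch below works inside a temporary directory and shows which modes raise an error and what binary mode returns. The file name `modes_demo.txt` is made up for this example and is not one of the files used elsewhere in this section.

```python
import os
import tempfile

# 'modes_demo.txt' is a hypothetical file name used only for this sketch.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'modes_demo.txt')

    # 'r' raises FileNotFoundError because the file does not exist yet
    try:
        with open(path, 'r') as f:
            f.read()
        read_error = None
    except FileNotFoundError as err:
        read_error = type(err).__name__

    # 'x' creates the file; a second 'x' on the same path raises FileExistsError
    with open(path, 'x') as f:
        f.write('hello')
    try:
        with open(path, 'x') as f:
            f.write('again')
        create_error = None
    except FileExistsError as err:
        create_error = type(err).__name__

    # adding 'b' to the mode returns bytes instead of str
    with open(path, 'rb') as f:
        raw = f.read()

print(read_error)
print(create_error)
print(type(raw), raw)
```

The mode letters can be combined, so 'rb' means "read in binary mode" and 'rt' is the same as the default 'r'.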

### Opening Files for Reading
The default mode of open() is reading in text mode, so we do not have to specify 'r' or 'rt'. I have created and saved a file named sample.txt in the working directory. Let us see how it is opened:

In [115]:
f = open('sample.txt')
print(f)

<_io.TextIOWrapper name='sample.txt' mode='r' encoding='cp1252'>


As you can see in the example above, I printed the opened file and it gave some information about it: its name, mode and encoding. An opened file has different reading methods: read(), readline() and readlines(). An opened file has to be closed with the close() method.
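
One common way to make sure a file is always closed is the with statement, which the writing examples later in this section also use. The sketch below is only an illustration: it creates its own small file in a temporary directory, and the name `with_demo.txt` is made up for the example.

```python
import os
import tempfile

# 'with_demo.txt' is a hypothetical file name used only for this sketch.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'with_demo.txt')
    with open(path, 'w') as f:
        f.write('first line\nsecond line\n')

    # The file is closed automatically when the with block ends,
    # even if an error is raised inside the block.
    with open(path) as f:
        text = f.read()
        closed_inside = f.closed   # still open here

    closed_after = f.closed        # closed once the block has ended

print(closed_inside, closed_after)
print(text)
```

With this pattern there is no need to call close() by hand.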

read(): reads the whole text as a string. If we want to limit the number of characters read, we can pass an int to the method, as in read(number).

In [117]:
txt = f.read()
print(type(txt))
print(txt)
f.close()

<class 'str'>
What is Lorem Ipsum?
Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. 

It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.


##### Instead of printing all the text, let us print the first 10 characters of the text file.

In [123]:
f = open('sample.txt')
txt = f.read(10)
print(type(txt))
print(txt)
f.close()

<class 'str'>
What is Lo
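
Each call to read(number) continues from where the previous call stopped, so a file can also be processed in fixed-size pieces. The following is a minimal sketch under assumptions: it writes its own short file (`chunks_demo.txt`, a made-up name) and uses a chunk size of 4 characters purely for illustration.

```python
import os
import tempfile

# 'chunks_demo.txt' and the chunk size of 4 are assumptions for this sketch.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'chunks_demo.txt')
    with open(path, 'w') as f:
        f.write('What is Lorem Ipsum?')

    chunks = []
    with open(path) as f:
        while True:
            chunk = f.read(4)    # read at most 4 characters
            if chunk == '':      # an empty string means the end of the file
                break
            chunks.append(chunk)

print(chunks)
```

Joining the chunks back together gives the original text, which is a quick way to check that nothing was skipped.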


##### readline(): read a single line (the first line when called right after opening the file)

In [125]:
f = open('sample.txt')
line = f.readline()
print(type(line))
print(line)
f.close()

<class 'str'>
What is Lorem Ipsum?



##### splitlines(): get all the lines as a list

In [20]:
f = open('sample.txt')
lines = f.read().splitlines()
print(type(lines))
print(lines)
f.close()

<class 'list'>
['What is Lorem Ipsum?', "Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. ", '', 'It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.']
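
The readlines() method mentioned earlier also returns a list, but unlike read().splitlines() it keeps the newline character at the end of each line. Looping over the file object itself is a third option that reads one line at a time. A small sketch, assuming a made-up file `lines_demo.txt` created just for this comparison:

```python
import os
import tempfile

# 'lines_demo.txt' is a hypothetical file name used only for this sketch.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'lines_demo.txt')
    with open(path, 'w') as f:
        f.write('line one\nline two\nline three\n')

    # readlines() keeps the trailing '\n' on every line
    with open(path) as f:
        with_newlines = f.readlines()

    # read().splitlines() drops the line endings
    with open(path) as f:
        without_newlines = f.read().splitlines()

    # iterating over the file reads it line by line
    with open(path) as f:
        stripped = [line.rstrip('\n') for line in f]

print(with_newlines)
print(without_newlines)
print(stripped)
```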


### Opening Files for Writing and Updating
To write to an existing file, we must add a mode as parameter to the open() function:

- "a" - append - will append to the end of the file; if the file does not exist, it creates a new file.
- "w" - write - will overwrite any existing content; if the file does not exist, it creates a new file.

In [127]:
#append some text to the file we have been reading

with open('sample.txt','a') as f:
    f.write('This text has to be appended at the end of the line')

In [131]:
#The method below creates a new file, if the file does not exist
with open('sample1.txt','w') as f:
    f.write('This text will be written in a newly created file')
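
To make the difference between the two modes concrete, the sketch below writes to the same file with 'w', then 'a', then 'w' again and reads the content back after each step. It uses a temporary directory and a made-up file name (`write_demo.txt`) so that the files from the examples above are not modified.

```python
import os
import tempfile

# 'write_demo.txt' is a hypothetical file name used only for this sketch.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'write_demo.txt')

    # 'w' creates the file and writes the first line
    with open(path, 'w') as f:
        f.write('first\n')

    # 'a' keeps the existing content and adds to the end
    with open(path, 'a') as f:
        f.write('second\n')
    with open(path) as f:
        after_append = f.read()

    # 'w' on an existing file erases the old content first
    with open(path, 'w') as f:
        f.write('replaced\n')
    with open(path) as f:
        after_write = f.read()

print(repr(after_append))
print(repr(after_write))
```

Note that write() does not add a newline by itself, so the '\n' has to be part of the string when each piece of text should start on its own line.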

### Deleting Files
If we want to remove a file, we use the os module.

In [133]:
import os
os.remove('sample1.txt')

If the file does not exist, os.remove() raises an error (FileNotFoundError), so it is good to check first with a condition like this:

In [53]:
import os
if os.path.exists('sample1.txt'):
    os.remove('sample1.txt')
else:
    print('The file does not exist')

The file does not exist
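
The existence check is not the only way to avoid the error. Two alternatives are sketched here, under the assumption that Python 3.8 or newer is available for the second one: catching FileNotFoundError, or using Path.unlink(missing_ok=True) from the pathlib module. The file name `delete_demo.txt` is made up for the example.

```python
import os
import tempfile
from pathlib import Path

# 'delete_demo.txt' is a hypothetical file name used only for this sketch.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'delete_demo.txt')

    # Option 1: try to remove the file and handle the error if it is missing
    try:
        os.remove(path)
        message = 'File removed'
    except FileNotFoundError:
        message = 'The file does not exist'

    # Option 2: Path.unlink(missing_ok=True) ignores a missing file (Python 3.8+)
    Path(path).write_text('temporary')
    Path(path).unlink(missing_ok=True)   # removes the file
    Path(path).unlink(missing_ok=True)   # no error the second time
    still_there = os.path.exists(path)

print(message)
print(still_there)
```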


## File Types

#### File with txt Extension
A file with the txt extension is a very common way to store data, and we have covered it in the previous section. Let us move on to the JSON file.

#### File with json Extension
JSON stands for JavaScript Object Notation. It is a text format: a JSON document looks like a stringified JavaScript object or Python dictionary, with strings written in double quotes.

In [64]:
# dictionary
person_dct= {
    "name":"Asabeneh",
    "country":"Finland",
    "city":"Helsinki",
    "skills":["JavaScrip", "React","Python"]
}
# JSON: a string form of a dictionary (JSON strings must use double quotes)
person_json = '{"name": "Asabeneh", "country": "Finland", "city": "Helsinki", "skills": ["JavaScrip", "React", "Python"]}'

# we use triple quotes and split it over multiple lines to make it more readable
person_json1 = '''{
    "name":"Asabeneh",
    "country":"Finland",
    "city":"Helsinki",
    "skills":["JavaScrip", "React","Python"]
}'''

print(person_dct)
print(person_json)
print(person_json1)

{'name': 'Asabeneh', 'country': 'Finland', 'city': 'Helsinki', 'skills': ['JavaScrip', 'React', 'Python']}
{"name": "Asabeneh", "country": "Finland", "city": "Helsinki", "skills": ["JavaScrip", "React", "Python"]}
{
    "name":"Asabeneh",
    "country":"Finland",
    "city":"Helsinki",
    "skills":["JavaScrip", "React","Python"]
}


#### Changing JSON to Dictionary
To change a JSON string to a dictionary, we first import the json module and then use its loads() function.

In [67]:
import json
# JSON
person_json = '''{
    "name": "Asabeneh",
    "country": "Finland",
    "city": "Helsinki",
    "skills": ["JavaScrip", "React", "Python"]
}'''
# let's change JSON to dictionary
person_dct = json.loads(person_json)
print(type(person_dct))
print(person_dct)
print(person_dct['name'])

<class 'dict'>
{'name': 'Asabeneh', 'country': 'Finland', 'city': 'Helsinki', 'skills': ['JavaScrip', 'React', 'Python']}
Asabeneh
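
When loads() parses a JSON string, JSON values are converted to the matching Python types, and text that is not valid JSON raises json.JSONDecodeError. The sketch below illustrates both points; the keys `is_student`, `age` and `scores` are invented for this example and are not part of the person data above.

```python
import json

# The extra keys here are hypothetical, chosen to show the type conversion.
data = json.loads('{"name": "Asabeneh", "is_student": false, "age": null, "scores": [1, 2.5]}')
print(data)
print(type(data['is_student']), type(data['age']), type(data['scores']))

# Single quotes are not valid JSON, so loads() raises json.JSONDecodeError
try:
    json.loads("{'name': 'Asabeneh'}")
    error_name = None
except json.JSONDecodeError as err:
    error_name = type(err).__name__

print(error_name)
```

JSON false becomes Python False, null becomes None, and a JSON array becomes a list.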


#### Changing Dictionary to JSON
To change a dictionary to a JSON string, we use the dumps() function from the json module.

In [135]:
import json
# python dictionary
person = {
    "name": "Asabeneh",
    "country": "Finland",
    "city": "Helsinki",
    "skills": ["JavaScrip", "React", "Python"]
}
# let's convert it to  json
person_json = json.dumps(person, indent=4) # indent could be 2, 4, 8. It beautifies the json
print(type(person_json))
print(person_json)

<class 'str'>
{
    "name": "Asabeneh",
    "country": "Finland",
    "city": "Helsinki",
    "skills": [
        "JavaScrip",
        "React",
        "Python"
    ]
}


#### Saving as JSON File
We can also save our data as a json file. To write a json file, we use the json.dump() function, which takes the dictionary and the output file object, plus optional arguments such as ensure_ascii and indent. In the code below, we open the file with utf-8 encoding and use an indentation of 4. Setting ensure_ascii=False writes non-ASCII characters as they are instead of escaping them, and indentation makes the json file easy to read.

In [137]:
with open('json_example.json', 'w', encoding='utf-8') as f:
    json.dump(person, f, ensure_ascii=False, indent=4)
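
To read such a file back, the json module offers json.load(), the file-based counterpart of loads(). The following is a minimal round-trip sketch: it repeats the person dictionary so that it runs on its own, and it writes into a temporary directory instead of the working directory, which is an assumption made only to keep the example self-contained.

```python
import json
import os
import tempfile

person = {
    "name": "Asabeneh",
    "country": "Finland",
    "city": "Helsinki",
    "skills": ["JavaScrip", "React", "Python"]
}

# A temporary directory is used here so that no file is left behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'json_example.json')

    # write the dictionary to a json file
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(person, f, ensure_ascii=False, indent=4)

    # read the json file back into a dictionary
    with open(path, encoding='utf-8') as f:
        loaded = json.load(f)

print(type(loaded))
print(loaded == person)
```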

#### File with csv Extension
CSV stands for comma-separated values. CSV is a simple file format used to store tabular data, such as the content of a spreadsheet or a database table. CSV is a very common data format in data science.

In [139]:
import csv

data = [
    {'name': 'Nikhil', 'branch': 'COE', 'year': 2, 'cgpa': 9.0},
    {'name': 'Sanchit', 'branch': 'COE', 'year': 2, 'cgpa': 9.1},
    {'name': 'Aditya', 'branch': 'IT', 'year': 2, 'cgpa': 9.3},
    {'name': 'Sagar', 'branch': 'SE', 'year': 1, 'cgpa': 9.5},
    {'name': 'Prateek', 'branch': 'MCE', 'year': 3, 'cgpa': 7.8},
    {'name': 'Sahil', 'branch': 'EP', 'year': 2, 'cgpa': 9.1}
]

with open('csv_example.csv', 'w', newline='') as csvfile:
    fieldnames = ['name', 'branch', 'year', 'cgpa']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(data)


In [141]:
with open('csv_example.csv', newline='') as f:
    csv_reader = csv.reader(f, delimiter=',')  # we use the reader function to read a csv file
    line_count = 0
    for row in csv_reader:
        if line_count == 0:
            # the first row holds the column names
            print(f'Column names are: {", ".join(row)}')
            line_count += 1
        else:
            print(
                f'\t Student Name is {row[0]}. His branch is {row[1]}. He is in year {row[2]}. His CGPA is {row[3]}')
            line_count += 1
    print(f'Number of lines: {line_count}')

Column names are: name, branch, year, cgpa
	 Student Name is Nikhil. His branch is COE. He is in year 2. His CGPA is 9.0
	 Student Name is Sanchit. His branch is COE. He is in year 2. His CGPA is 9.1
	 Student Name is Aditya. His branch is IT. He is in year 2. His CGPA is 9.3
	 Student Name is Sagar. His branch is SE. He is in year 1. His CGPA is 9.5
	 Student Name is Prateek. His branch is MCE. He is in year 3. His CGPA is 7.8
	 Student Name is Sahil. His branch is EP. He is in year 2. His CGPA is 9.1
Number of lines: 7
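
Since the file was written with csv.DictWriter, it can also be read with its counterpart, csv.DictReader, which uses the header row as keys so that columns are accessed by name instead of by index. A small sketch, assuming a shortened copy of the student data written to a temporary file (`csv_demo.csv` is a made-up name):

```python
import csv
import os
import tempfile

# A shortened copy of the student data, used only for this sketch.
rows = [
    {'name': 'Nikhil', 'branch': 'COE', 'year': 2, 'cgpa': 9.0},
    {'name': 'Sagar', 'branch': 'SE', 'year': 1, 'cgpa': 9.5},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'csv_demo.csv')

    with open(path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=['name', 'branch', 'year', 'cgpa'])
        writer.writeheader()
        writer.writerows(rows)

    # DictReader returns one dictionary per data row, keyed by the header
    with open(path, newline='') as f:
        students = list(csv.DictReader(f))

# every value read from a csv file is a string, so numbers must be converted
years = [int(student['year']) for student in students]

for student in students:
    print(student['name'], student['branch'], student['cgpa'])
print(years)
```

The header row is consumed by DictReader, so the list contains only the data rows.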


#### File with xlsx Extension
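To write Excel files we can use the xlsxwriter package, which has to be installed first. Note that xlsxwriter only creates .xlsx files; it does not read existing ones. Package installation with pip is covered separately; here we simply run the install command in a notebook cell.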

In [96]:
pip install xlsxwriter

Collecting xlsxwriter
  Downloading XlsxWriter-3.2.0-py3-none-any.whl.metadata (2.6 kB)
Downloading XlsxWriter-3.2.0-py3-none-any.whl (159 kB)
Installing collected packages: xlsxwriter
Successfully installed xlsxwriter-3.2.0
Note: you may need to restart the kernel to use updated packages.


In [143]:
# import xlsxwriter module
import xlsxwriter

workbook = xlsxwriter.Workbook('Example2.xlsx')
worksheet = workbook.add_worksheet()

# Start from the first cell.
# Rows and columns are zero indexed.
row = 0
column = 0

content = ["ankit", "rahul", "priya", "harshita",
           "sumit", "neeraj", "shivam"]

# iterate through the content list
for item in content:
    # write the item into the current row of the first column
    worksheet.write(row, column, item)

    # move down one row on each iteration
    row += 1

workbook.close()


In [101]:
# import xlsxwriter module
import xlsxwriter

workbook = xlsxwriter.Workbook('Example3.xlsx')

# By default worksheet names in the spreadsheet will be
# Sheet1, Sheet2 etc., but we can also specify a name.
worksheet = workbook.add_worksheet("My sheet")

# Some data we want to write to the worksheet.
scores = (
    ['ankit', 1000],
    ['rahul', 100],
    ['priya', 300],
    ['harshita', 50],
)

# Start from the first cell. Rows and
# columns are zero indexed.
row = 0
col = 0

# Iterate over the data and write it out row by row.
for name, score in scores:
    worksheet.write(row, col, name)
    worksheet.write(row, col + 1, score)
    row += 1

workbook.close()