# Nested JSON to CSV Conversion

Our job is to convert a JSON file to CSV format. There are several reasons to do this: CSV files are easy to read in a spreadsheet application such as Google Sheets or MS Excel, and they are easy to work with in data analysis tasks. CSV is also a widely accepted format for tabular data because it is easier for humans to scan than JSON.

### Approach

- The first step is to read the JSON file as a Python dict object, so that we can use the dict methods on it. The read_json() function does this: it takes the file path (including the extension) as a parameter and returns the contents of the JSON file as a Python dict.
- We normalize the dict object using the normalize_json() function. It checks each key-value pair in the dict. If a value is itself a dict, the function joins the outer key and each nested key with an underscore (for example, article and title become article_title). Only one level of nesting is flattened.
- The CSV data is created by the generate_csv_data() function. It joins the column names with commas to form the header row, joins the values of the record the same way, and ends each row with a newline ('\n' in Python).
- In the final step, we write the CSV data generated in the earlier step to a preferred location provided through the filepath parameter.


In [1]:
# Example: nested JSON to CSV conversion without pandas
import json

def read_json(filename: str) -> dict:
    try:
        with open(filename, "r") as f:
            data = json.loads(f.read())
    except Exception as e:
        raise Exception(f"Reading {filename} file encountered an error") from e

    return data

def normalize_json(data: dict) -> dict:
    new_data = dict()
    for key, value in data.items():
        if not isinstance(value, dict):
            new_data[key] = value
        else:
            for k, v in value.items():
                new_data[key + "_" + k] = v

    return new_data

def generate_csv_data(data: dict) -> str:
    
    # defining CSV columns in a list to maintain the order
    csv_columns = list(data.keys())

    # generate the first row of csv
    csv_data = ",".join(csv_columns) + "\n"

    #generate the single record present
    new_row = list()
    for col in csv_columns:
        new_row.append(str(data[col]))

    # Concatenate the record with the column information  
    # in CSV format 
    csv_data += ",".join(new_row) + "\n"
  
    return csv_data 
  
  
def write_to_file(data: str, filepath: str) -> bool:

    try:
        with open(filepath, "w+") as f:
            f.write(data)
    except Exception as e:
        raise Exception(f"Saving data to {filepath} encountered an error") from e

    # Signal success, matching the declared bool return type
    return True
  
  
def main(): 
    # Read the JSON file as python dictionary 
    data = read_json(filename="article.json") 
  
    # Normalize the nested python dict 
    new_data = normalize_json(data=data) 
  
    # Pretty print the new dict object 
    print("New dict:", new_data) 
  
    # Generate the desired CSV data  
    csv_data = generate_csv_data(data=new_data) 
  
    # Save the generated CSV data to a CSV file 
    write_to_file(data=csv_data, filepath="data.csv") 
  
  
if __name__ == '__main__': 
    main() 

New dict: {'article_id': 3214507, 'article_link': 'http://sample.link', 'published_on': '17-Sep-2020', 'source': 'moneycontrol', 'article_title': 'IT stocks to see a jump this month', 'article_category': 'finance', 'article_image': 'http://sample.img', 'article_sentiment': 'neutral'}
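
One caveat with joining values by hand, as generate_csv_data() does: a value that itself contains a comma, a quote, or a newline produces a malformed row. A minimal sketch of the same step built on the standard csv module, which quotes such values automatically, might look like this. The function name generate_csv_data_safe and the sample record are illustrative assumptions, not part of the original code.

```python
import csv
import io


def generate_csv_data_safe(data: dict) -> str:
    """Build CSV text for a single flat record, quoting values where needed."""
    buffer = io.StringIO()
    # lineterminator="\n" matches the newline used by the hand-written version
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(data.keys())    # header row
    writer.writerow(data.values())  # the single data row
    return buffer.getvalue()


# Hypothetical record whose title contains a comma
sample = {"article_id": 3214507, "article_title": "IT stocks, up this month"}
csv_text = generate_csv_data_safe(sample)
print(csv_text)
```

The title is wrapped in double quotes in the output, so a CSV reader still sees two fields in the data row.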


The same can be achieved through the use of Pandas Python library.

### Approach

- The first step is to read the JSON file as a Python dict object, so that we can use the dict methods on it. The read_json() function does this: it takes the file path (including the extension) as a parameter and returns the contents of the JSON file as a Python dict.
- We normalize the dict object using the normalize_json() function. It checks each key-value pair in the dict. If a value is itself a dict, the function joins the outer key and each nested key with an underscore.
- Rather than building each CSV record by hand, we use pandas.DataFrame(). It takes the dict object and builds a DataFrame from it. One thing worth noting in the code below: every value in the “new_data” dict is a scalar, so pandas cannot tell how many rows to create, and passing index=[0] tells it to build a single row. Alternatively, each value could be wrapped in a list, because when a dict is passed to pandas.DataFrame() the values are normally lists in which each item is the value of one row for that key (column name). Here, we have a single row.
- We use the pandas.DataFrame.to_csv() method, which takes the path and filename of the output file and saves the DataFrame created in the previous step as CSV.
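
To see why the single-row case needs special handling, here is a minimal sketch. The record below is a shortened, hypothetical version of the flattened dict; the two options shown are common ways to build a one-row DataFrame, not the only ones.

```python
import pandas as pd

# Hypothetical flattened record with scalar values only
record = {"article_id": 3214507, "source": "moneycontrol"}

# pd.DataFrame(record) alone raises ValueError, because pandas cannot
# infer how many rows to build from scalar values.
try:
    pd.DataFrame(record)
    scalar_error = None
except ValueError as exc:
    scalar_error = exc

# Option 1: supply an explicit one-element index
df_index = pd.DataFrame(record, index=[0])

# Option 2: wrap the record in a list, one dict per row
df_list = pd.DataFrame([record])

print(df_index)
print(df_list)
```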


In [2]:
# Example: JSON to CSV conversion using Pandas

import json 
import pandas as pd


def read_json(filename: str) -> dict: 
    try:
        with open(filename, "r") as f:
            data = json.loads(f.read())
    except Exception as e:
        raise Exception(f"Error reading {filename}: {str(e)}")
    return data


def normalize_json(data: dict) -> dict: 

	new_data = {} 
	for key, value in data.items(): 
		if not isinstance(value, dict): 
			new_data[key] = value 
		else: 
			for k, v in value.items(): 
				new_data[f"{key}_{k}"] = v 
	return new_data 


def main(): 
	# Read the JSON file as python dictionary 
	data = read_json(filename="article.json") 

	# Normalize the nested python dict 
	new_data = normalize_json(data=data) 

	print("New dict:", new_data, "\n") 

	# Create a pandas dataframe 
	dataframe = pd.DataFrame(new_data, index=[0]) 

	# Write to a CSV file 
	dataframe.to_csv("article.csv", index= False) 


if __name__ == '__main__': 
	main() 


New dict: {'article_id': 3214507, 'article_link': 'http://sample.link', 'published_on': '17-Sep-2020', 'source': 'moneycontrol', 'article_title': 'IT stocks to see a jump this month', 'article_category': 'finance', 'article_image': 'http://sample.img', 'article_sentiment': 'neutral'} 



In [3]:
import pandas as pd
import json

json_file_path = 'article.json' 
with open(json_file_path, 'r') as file:
    data = json.load(file)

df = pd.json_normalize(data)

csv_file_path = 'article.csv' 
df.to_csv(csv_file_path, index=False)

print(f"Nested JSON data has been successfully flattened and saved to {csv_file_path}.")


Nested JSON data has been successfully flattened and saved to article.csv.
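
By default pd.json_normalize() joins nested keys with a dot, whereas the hand-written normalize_json() uses an underscore. A small sketch of the sep parameter, using a shortened, hypothetical record in place of article.json:

```python
import pandas as pd

# Hypothetical nested record shaped like the article example
data = {
    "article_id": 3214507,
    "source": "moneycontrol",
    "article": {"title": "IT stocks to see a jump this month", "category": "finance"},
}

df_default = pd.json_normalize(data)              # nested keys joined with "."
df_underscore = pd.json_normalize(data, sep="_")  # nested keys joined with "_"

print(df_default.columns.tolist())
print(df_underscore.columns.tolist())
```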


In [4]:
import json
import pandas as pd
from pandas import json_normalize


def read_json(filename: str) -> dict:
    """Read JSON file and return its data."""
    try:
        with open(filename, "r") as f:
            data = json.loads(f.read())
    except Exception as e:
        raise Exception(f"Error reading {filename}: {str(e)}")
    return data


def normalize_json(data: dict) -> pd.DataFrame:
    """Flatten nested JSON data to match CSV format."""
    base_data = {
        "article_id": data.get("article_id"),
        "article_link": data.get("article_link"),
        "published_on": data.get("published_on"),
        "source": data.get("source"),
        "article_title": data["article"].get("title"),
        "article_category": data["article"].get("category"),
        "article_image": data["article"].get("image"),
        "article_sentiment": data["article"].get("sentiment"),
    }

    flattened_data = []

    for test_entry in data.get("test", []):
        row = base_data.copy()
        row["test_test"] = test_entry.get("test")
        row["test_test2"] = test_entry.get("test2")
        row["test_address"] = test_entry.get("address", [None])[0]

        if "names" in test_entry:
            for name_entry in test_entry["names"]:
                row_copy = row.copy()
                row_copy["test_names_name"] = name_entry.get("name")
                row_copy["test_names_age"] = name_entry.get("age")
                row_copy["Rank"] = 1
                flattened_data.append(row_copy)
        else:
            row["test_names_name"] = None
            row["test_names_age"] = None
            row["Rank"] = 1
            flattened_data.append(row)

    dataframe = pd.DataFrame(flattened_data)

    column_order = [
        "article_id", "article_link", "published_on", "source", "article_title",
        "article_category", "article_image", "article_sentiment", "test_test",
        "test_test2", "test_address", "Rank", "test_names_name", "test_names_age",
    ]
    dataframe = dataframe.reindex(columns=column_order, fill_value=None)

    return dataframe


def main():
    data = read_json(filename="article.json")

    dataframe = normalize_json(data)

    dataframe.to_csv("article.csv", index=False)
    print("CSV file 'article.csv' created successfully!")


if __name__ == "__main__":
    main()


CSV file 'article.csv' created successfully!
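
The hand-written flattening above repeats the article-level fields for every entry of the test list. For simple cases, pd.json_normalize() can do something similar with its record_path and meta parameters. The sketch below assumes a reduced, hypothetical document (no address or names fields), so it illustrates the idea rather than reproducing the function above.

```python
import pandas as pd

# Hypothetical document: top-level fields plus a list of records under "test"
data = {
    "article_id": 3214507,
    "source": "moneycontrol",
    "article": {"title": "IT stocks to see a jump this month"},
    "test": [
        {"test": "a1", "test2": "b1"},
        {"test": "a2", "test2": "b2"},
    ],
}

# One output row per element of data["test"]; the meta fields are repeated
# on every row. record_prefix keeps record columns distinct from meta columns.
df = pd.json_normalize(
    data,
    record_path="test",
    meta=["article_id", "source", ["article", "title"]],
    record_prefix="test_",
    sep="_",
)
print(df)
```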


## Convert N-nested JSON to CSV

Any depth of nesting and any number of records in a JSON document can be handled with minimal code using the “json_normalize()” function in pandas. 

<strong>Syntax:</strong>


    json_normalize(data)



### Approach

- The first step is to read the JSON file as a Python dict object, so that we can use the dict methods on it. The read_json() function does this: it takes the file path (including the extension) as a parameter and returns the contents of the JSON file as a Python dict.  
- We iterate over each JSON object in the details array. In each iteration we normalize the object into a temporary dataframe and append it to the output dataframe.  
- Once done, the columns are renamed for readability. In the console output, the “grad_major” column was named “education.graduation.major” before renaming. This is because “json_normalize()” builds each column name from the full path of keys, which avoids duplicate column names: “education” is the first level, “graduation” the second, and “major” the third level of the JSON nesting. The column “education.graduation.major” is therefore renamed to “grad_major”.   
- After renaming the columns, the to_csv() method saves the pandas dataframe object as CSV to the provided file location.


In [6]:
# Example: Converting n-nested JSON to CSV
import json 
import pandas as pd


def read_json(filename: str) -> dict: 

	try: 
		with open(filename, "r") as f: 
			data = json.loads(f.read()) 
	except Exception as e: 
		raise Exception(f"Reading {filename} file encountered an error") from e 

	return data 


def create_dataframe(data: list) -> pd.DataFrame: 

	# Declare an empty dataframe to append records 
	dataframe = pd.DataFrame() 

	# Looping through each record 
	for d in data: 
		
		# Normalize the column levels 
		record = pd.json_normalize(d) 
		
		# Append it to the dataframe 
		dataframe = pd.concat([dataframe, record], ignore_index=True)

	return dataframe 


def main(): 
	# Read the JSON file as python dictionary 
	data = read_json(filename="details.json") 

	# Generate the dataframe for the array items in 
	# details key 
	dataframe = create_dataframe(data=data['details']) 

	# Renaming columns of the dataframe 
	print("Normalized Columns:", dataframe.columns.to_list()) 

	dataframe.rename(columns={ 
		"results.school": "school", 
		"results.high_school": "high_school", 
		"results.graduation": "graduation", 
		"education.graduation.major": "grad_major", 
		"education.graduation.minor": "grad_minor"
	}, inplace=True) 

	print("Renamed Columns:", dataframe.columns.to_list()) 

	# Convert dataframe to CSV 
	dataframe.to_csv("details.csv", index=False) 


if __name__ == '__main__': 
	main() 


Normalized Columns: ['id', 'name', 'age', 'results.school', 'results.high_school', 'results.graduation', 'education.graduation.major', 'education.graduation.minor']
Renamed Columns: ['id', 'name', 'age', 'school', 'high_school', 'graduation', 'grad_major', 'grad_minor']
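
pd.json_normalize() also accepts a list of dicts directly, so the loop with pd.concat() is not strictly required. A minimal sketch, with sample records reconstructed from the column names and values shown in this document (the exact contents of details.json are an assumption):

```python
import pandas as pd

# Hypothetical records shaped like the entries of the "details" array
details = [
    {
        "id": "STU001", "name": "Amit Pathak", "age": 24,
        "results": {"school": 85, "high_school": 75, "graduation": 70},
        "education": {"graduation": {"major": "Computers", "minor": "Sociology"}},
    },
    {
        "id": "STU002", "name": "Yash Kotian", "age": 32,
        "results": {"school": 80, "high_school": 58, "graduation": 49},
        "education": {"graduation": {"major": "Biology", "minor": "Chemistry"}},
    },
]

# json_normalize accepts a list of dicts and returns one row per dict
dataframe = pd.json_normalize(details)
print(dataframe.columns.to_list())
```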


# XML to JSON to CSV Conversion

### XML to JSON

A <b>JSON</b> file is a file that stores simple data structures and objects in JavaScript Object Notation (JSON) format, which is a standard data interchange format. It is primarily used for transmitting data between a web application and a server. A JSON object contains data in the form of a key/value pair. The keys are strings and the values are the JSON types. Keys and values are separated by a colon. Each entry (key/value pair) is separated by a comma. JSON files are lightweight, text-based, human-readable, and can be edited using a text editor.

<b>XML</b> is a markup language designed to store data. It is case sensitive. XML lets you define your own markup elements and so create a customized markup language; it has no predefined tags. The basic unit of XML is the element. XML simplifies data sharing, data transport, platform changes, and data availability. The extension of an XML file is .xml.

Both JSON and XML file formats are used for transferring data between client and server. 

They serve the same purpose, but each does so in its own way.

### Using xmltodict and json module

To handle the JSON file format, Python provides a module named json.

<b>STEP 1:</b> Install xmltodict module using pip or any other python package manager  

pip install xmltodict

<b>STEP 2:</b> Import the json and xmltodict modules using the import keyword 

import json
import xmltodict

<b>STEP 3:</b> Read the XML file. Here, “data_dict” is the variable that holds the XML data after it has been converted to a dictionary. 

with open("xml_file.xml") as xml_file:
    data_dict = xmltodict.parse(xml_file.read())

<b>STEP 4:</b> Convert the dictionary into a JSON string and store it in a variable. JSON objects are surrounded by curly braces { } and are written as key/value pairs.

    json.loads() takes a JSON string and returns a Python object. 
    json.dumps() takes a Python object and returns a JSON string. 
    Here the input is the Python dictionary data_dict and the output is a JSON string, so we use json.dumps() 

json_data = json.dumps(data_dict)

Here, json_data is the variable used to store the generated JSON string.
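
To make the direction of the two calls concrete, here is a short round trip using only the standard library. The small dictionary stands in for the output of xmltodict.parse() and is purely illustrative.

```python
import json

# A Python dict, standing in for the output of xmltodict.parse()
data_dict = {"note": {"to": "Tove", "from": "Jani"}}

json_text = json.dumps(data_dict, indent=2)  # dict -> JSON string
restored = json.loads(json_text)             # JSON string -> dict

print(type(json_text).__name__, type(restored).__name__)
```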

<b>STEP 5:</b> Write the json_data to output file 

with open("data.json", "w") as json_file:
    json_file.write(json_data)

In [8]:
# Program to convert an xml
# file to json file

# import the json module (standard library) and
# the xmltodict module (third-party, installed with pip)
import json
import xmltodict


# open the input xml file and read
# data in form of python dictionary 
# using xmltodict module
with open("test.xml") as xml_file:
	
	data_dict = xmltodict.parse(xml_file.read())
	# xml_file.close()
	
	# generate the JSON string using json.dumps() 
	# from the parsed dictionary
	
	json_data = json.dumps(data_dict)
	
	# Write the json data to output 
	# json file
	with open("data.json", "w") as json_file:
		json_file.write(json_data)
		# json_file.close()


### XML to CSV

### Approach

- Import module
- Declare the rows and columns for the data to be arranged in the CSV file
- Load xml file
- Parse xml file
- Write each row to csv file one by one
- Save csv file


In [9]:
# Importing the required libraries 
import xml.etree.ElementTree as Xet 
import pandas as pd 

cols = ["name", "phone", "email", "date", "country"] 
rows = [] 

# Parsing the XML file 
xmlparse = Xet.parse('sample.xml') 
root = xmlparse.getroot() 
for i in root: 
	name = i.find("name").text 
	phone = i.find("phone").text 
	email = i.find("email").text 
	date = i.find("date").text 
	country = i.find("country").text 

	rows.append({"name": name, 
				"phone": phone, 
				"email": email, 
				"date": date, 
				"country": country}) 

df = pd.DataFrame(rows, columns=cols) 

# Writing dataframe to csv without the index column,
# so that reading it back does not add "Unnamed: 0" columns
df.to_csv('output.csv', index=False) 

df_from_csv = pd.read_csv('output.csv')
df_from_csv

name,phone,email,date,country
John Doe,123-456-7890,john.doe@example.com,2025-01-22,USA
Jane Smith,987-654-3210,jane.smith@example.com,2025-01-20,Canada
Tom Harris,555-123-4567,tom.harris@example.com,2025-01-18,UK
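
The parsing loop above calls .text on the result of find(), which raises AttributeError when a record lacks one of the expected tags. A hedged sketch of a more forgiving variant uses findtext() with a default value; the inline XML and the reduced column list are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML in which the second record has no <phone> element
xml_text = """
<contacts>
  <contact><name>John Doe</name><phone>123-456-7890</phone></contact>
  <contact><name>Jane Smith</name></contact>
</contacts>
"""

cols = ["name", "phone"]
root = ET.fromstring(xml_text)

# findtext() returns the default instead of raising when a tag is absent
rows = [{col: item.findtext(col, default="") for col in cols} for item in root]
print(rows)
```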


# Chunking Datasets with Pandas

When working with massive datasets, attempting to load an entire file at once can overwhelm system memory and cause crashes. Pandas provides an efficient way to handle large files by processing them in smaller, memory-friendly chunks using the chunksize parameter.

### Using chunksize parameter in read_csv()

For instance, suppose you have a CSV file that is too large to fit into memory. If the file contains 1,000,000 (10 lakh) rows, you can load it in chunks of 10,000 rows instead, so the file is processed in 100 chunks of 10,000 rows each:

In [10]:
import pandas as pd

# Load a large CSV file in chunks of 10,000 rows
for chunk in pd.read_csv('large_file.csv', chunksize=10000):
    print(chunk.shape) # process the shape of each chunk 


(10000, 10)
(10000, 10)
(10000, 10)
...

This example demonstrates how to use chunksize parameter in the read_csv function to read a large CSV file in chunks, rather than loading the entire file into memory at once.<br>
<b>How to Use the chunksize Parameter?
</b>
- Specify the Chunk Size: You define the number of rows to be read at a time using the chunksize parameter.
- Iterate Over Chunks: The read_csv function returns a TextFileReader object, which is an iterator that yields DataFrames representing each chunk.
- Process Each Chunk: Perform operations like filtering, aggregation, or transformation on each chunk before moving to the next. After processing all chunks, combine the results if necessary using methods like concat().
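
The three steps above can be sketched as follows. The small generated file, its column names, and the filter condition are assumptions chosen so that the example runs on its own.

```python
import os
import tempfile

import pandas as pd

# Build a small stand-in for large_file.csv (hypothetical columns)
path = os.path.join(tempfile.mkdtemp(), "large_file.csv")
pd.DataFrame({"col_1": range(100), "col_2": range(100, 200)}).to_csv(path, index=False)

filtered_parts = []
for chunk in pd.read_csv(path, chunksize=25):
    # Keep only the rows of interest from this chunk
    filtered_parts.append(chunk[chunk["col_1"] % 10 == 0])

# Combine the per-chunk results into one DataFrame
result = pd.concat(filtered_parts, ignore_index=True)
print(result.shape)
```

Only the filtered rows are kept between iterations, so peak memory depends on the chunk size and the size of the result, not on the size of the file.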

### Loading a massive file in smaller chunks: Examples
#### Example 1: Handling large files and creating a consolidated output file incrementally.

In [11]:
import pandas as pd

# Load a large CSV file in chunks of 10,000 rows
for chunk in pd.read_csv('large_file.csv', chunksize=10000):
    # Process the chunk (e.g., save it to a separate file)
    chunk.to_csv('chunk_file.csv', index=False, mode='a', header=False)

The input file large_file.csv has 1,000,000 rows, so this loop will:

- Process the file in 100 chunks of 10,000 rows each.
- Append each chunk to chunk_file.csv until the entire file is saved.

Parameters:

- index=False: Excludes the index column from being written to the file.
- mode='a': Appends each chunk to the file instead of overwriting it.
- header=False: Skips writing the header (column names) for every chunk, assuming the header is written once in the destination file.
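
If the destination file does not already contain a header, one possible variant writes it with the first chunk only. This is a sketch under that assumption; the temporary directory and the small generated input are stand-ins for the real files.

```python
import os
import tempfile

import pandas as pd

# Small stand-in for large_file.csv (hypothetical data)
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "large_file.csv")
target = os.path.join(workdir, "chunk_file.csv")
pd.DataFrame({"col_1": range(50), "col_2": range(50)}).to_csv(source, index=False)

for i, chunk in enumerate(pd.read_csv(source, chunksize=20)):
    first = i == 0
    # First chunk: create the file and write the header.
    # Later chunks: append rows only.
    chunk.to_csv(target, index=False, mode="w" if first else "a", header=first)

copied = pd.read_csv(target)
print(copied.shape)
```

Opening the file with mode="w" for the first chunk also means that rerunning the loop replaces the output instead of appending to the previous run.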

#### Example 2: Load the dataset and Get insights on it


In [12]:
import pandas as pd

# Load only the header of the CSV to get column names
columns = pd.read_csv('large_file.csv', nrows=0).columns
print(columns)


Index(['col_1', 'col_2', 'col_3', 'col_4', 'col_5', 'col_6', 'col_7', 'col_8',
       'col_9', 'col_10'],
      dtype='object')


Parameters:

- nrows=0: Tells pd.read_csv to load no data rows but still read the header (column names).
- .columns: Retrieves the column names as an Index object.

Why This is Efficient:

- Avoids loading unnecessary data chunks into memory.
- Quickly provides the column names regardless of file size.
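
Beyond column names, simple statistics can also be gathered without holding the whole file in memory. As a hedged sketch, the mean of one column can be computed from a running sum and a running row count; the generated file and the column name are assumptions.

```python
import os
import tempfile

import pandas as pd

# Small stand-in for large_file.csv (hypothetical data)
path = os.path.join(tempfile.mkdtemp(), "large_file.csv")
pd.DataFrame({"col_1": range(1, 101)}).to_csv(path, index=False)

total = 0.0
count = 0
for chunk in pd.read_csv(path, chunksize=30):
    # Only two running numbers are kept between chunks
    total += chunk["col_1"].sum()
    count += len(chunk)

mean_col_1 = total / count
print(mean_col_1)
```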

#### How to Use Generators for Efficiency?

In [13]:
def read_large_file(file_path, chunk_size):
    for chunk in pd.read_csv(file_path, chunksize=chunk_size):
        yield chunk

for data_chunk in read_large_file('large_file.csv', 1000):
    # Process each data_chunk
    print(data_chunk.head())


      col_1     col_2     col_3     col_4     col_5     col_6     col_7  \
0  0.343203  0.825934  0.996324  0.409696  0.861135  0.767284  0.989489   
1  0.541168  0.308143  0.736459  0.511735  0.857131  0.739075  0.678633   
2  0.293255  0.689175  0.720272  0.042944  0.871663  0.218757  0.952930   
3  0.361888  0.767964  0.503981  0.315089  0.339170  0.930220  0.343695   
4  0.904959  0.806983  0.405558  0.978283  0.235744  0.909572  0.769065   

      col_8     col_9    col_10  
0  0.809386  0.105403  0.352833  
1  0.632395  0.525117  0.224253  
2  0.047677  0.548381  0.419413  
3  0.881994  0.040874  0.860752  
4  0.418893  0.399925  0.681327  
         col_1     col_2     col_3     col_4     col_5     col_6     col_7  \
1000  0.418520  0.462736  0.130971  0.369982  0.190686  0.255958  0.120098   
1001  0.388253  0.138324  0.596368  0.551290  0.200304  0.611152  0.151039   
1002  0.148895  0.882713  0.136761  0.400144  0.568857  0.214193  0.777318   
1003  0.694204  0.308583  0.85913
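
Because a generator is lazy, the caller decides how many chunks are actually produced. As a small sketch (the stand-in file and the use of itertools.islice are illustrative choices), the following takes only the first two chunks:

```python
import itertools
import os
import tempfile

import pandas as pd


def read_large_file(file_path, chunk_size):
    for chunk in pd.read_csv(file_path, chunksize=chunk_size):
        yield chunk


# Small stand-in for large_file.csv (hypothetical data)
path = os.path.join(tempfile.mkdtemp(), "large_file.csv")
pd.DataFrame({"col_1": range(1000)}).to_csv(path, index=False)

# Take only the first two chunks; later chunks are never requested
first_two = list(itertools.islice(read_large_file(path, 100), 2))
print([len(c) for c in first_two])
```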

# CSV module in Python

CSV (Comma Separated Values) is a simple file format used to store tabular data, such as a spreadsheet or database. A CSV file stores tabular data (numbers and text) in plain text. Each line of the file is a data record. Each record consists of one or more fields, separated by commas. The use of the comma as a field separator is the source of the name for this file format.

## Reading a CSV file

In [14]:
# importing csv module
import csv

# csv file name
filename = "details.csv"

# initializing the titles and rows list
fields = []
rows = []

# reading csv file
with open(filename, 'r') as csvfile:
    # creating a csv reader object
    csvreader = csv.reader(csvfile)

    # extracting field names through first row
    fields = next(csvreader)

    # extracting each data row one by one
    for row in csvreader:
        rows.append(row)

    # line_num counts the lines read so far, including the header row
    print("Total no. of rows: %d" % (csvreader.line_num))

# printing the field names
print('Field names are:' + ', '.join(field for field in fields))

# printing first 5 rows
print('\nFirst 5 rows are:\n')
for row in rows[:5]:
    # parsing each column of a row
    for col in row:
        print("%10s" % col, end=" ")
    print('\n')


Total no. of rows: 5
Field names are:id, name, age, school, high_school, graduation, grad_major, grad_minor

First 5 rows are:

    STU001 Amit Pathak         24         85         75         70  Computers  Sociology 

    STU002 Yash Kotian         32         80         58         49    Biology  Chemistry 

    STU003 Aanchal Singh         28         90         70         65        Art         IT 

    STU004 Juhi Vadia         23         95         89         83         IT     Social 



## Reading CSV Files Into a Dictionary With csv 

In [15]:
import csv

# Open the CSV file for reading
with open('details.csv', mode='r') as file:
    # Create a CSV reader with DictReader
    csv_reader = csv.DictReader(file)

    # Initialize an empty list to store the dictionaries
    data_list = []

    # Iterate through each row in the CSV file
    for row in csv_reader:
        # Append each row (as a dictionary) to the list
        data_list.append(row)

# Print the list of dictionaries
for data in data_list:
    print(data)


{'id': 'STU001', 'name': 'Amit Pathak', 'age': '24', 'school': '85', 'high_school': '75', 'graduation': '70', 'grad_major': 'Computers', 'grad_minor': 'Sociology'}
{'id': 'STU002', 'name': 'Yash Kotian', 'age': '32', 'school': '80', 'high_school': '58', 'graduation': '49', 'grad_major': 'Biology', 'grad_minor': 'Chemistry'}
{'id': 'STU003', 'name': 'Aanchal Singh', 'age': '28', 'school': '90', 'high_school': '70', 'graduation': '65', 'grad_major': 'Art', 'grad_minor': 'IT'}
{'id': 'STU004', 'name': 'Juhi Vadia', 'age': '23', 'school': '95', 'high_school': '89', 'graduation': '83', 'grad_major': 'IT', 'grad_minor': 'Social'}
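
As the output shows, DictReader returns every value as a string, including numbers such as age. A minimal sketch of converting one column while reading, using an inline stand-in for details.csv with only three of its columns:

```python
import csv
import io

# Inline stand-in for details.csv (three of the columns shown above)
csv_text = "id,name,age\nSTU001,Amit Pathak,24\nSTU002,Yash Kotian,32\n"

records = []
for row in csv.DictReader(io.StringIO(csv_text)):
    row["age"] = int(row["age"])  # DictReader yields every value as str
    records.append(row)

print(records)
```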


## Writing to a CSV file 

In [16]:
# importing the csv module
import csv
# field names
fields = ['Name', 'Branch', 'Year', 'CGPA']
# data rows of csv file
rows = [['Nikhil', 'COE', '2', '9.0'],
        ['Sanchit', 'COE', '2', '9.1'],
        ['Aditya', 'IT', '2', '9.3'],
        ['Sagar', 'SE', '1', '9.5'],
        ['Prateek', 'MCE', '3', '7.8'],
        ['Sahil', 'EP', '2', '9.1']]
# name of csv file
filename = "university_records.csv"
# writing to csv file
with open(filename, 'w', newline='') as csvfile:
    # creating a csv writer object
    csvwriter = csv.writer(csvfile)
    # writing the fields
    csvwriter.writerow(fields)
    # writing the data rows
    csvwriter.writerows(rows)


## Writing a dictionary to a CSV file 

In [17]:
# importing the csv module
import csv

# my data rows as dictionary objects
mydict = [{'branch': 'COE', 'cgpa': '9.0',
           'name': 'Nikhil', 'year': '2'},
          {'branch': 'COE', 'cgpa': '9.1',
           'name': 'Sanchit', 'year': '2'},
          {'branch': 'IT', 'cgpa': '9.3',
           'name': 'Aditya', 'year': '2'},
          {'branch': 'SE', 'cgpa': '9.5',
           'name': 'Sagar', 'year': '1'},
          {'branch': 'MCE', 'cgpa': '7.8',
           'name': 'Prateek', 'year': '3'},
          {'branch': 'EP', 'cgpa': '9.1',
           'name': 'Sahil', 'year': '2'}]

# field names
fields = ['name', 'branch', 'year', 'cgpa']

# name of csv file
filename = "university_records.csv"

# writing to csv file
with open(filename, 'w', newline='') as csvfile:
    # creating a csv dict writer object
    writer = csv.DictWriter(csvfile, fieldnames=fields)

    # writing headers (field names)
    writer.writeheader()

    # writing data rows
    writer.writerows(mydict)


<b>JSON organizes data in the following ways:</b>

<strong>JSON Syntax Rules</strong>

- Data is in name/value pairs
- Data is separated by commas
- Curly braces hold objects
- Square brackets hold arrays


<b>JSON Data - A Name and a Value
</b><br>
JSON data is written as name/value pairs, just like JavaScript object properties.

A name/value pair consists of a field name (in double quotes), followed by a colon, followed by a value:
"firstName":"John"

JSON names require double quotes. JavaScript names do not.

<b>JSON Objects
</b><br>
JSON objects are written inside curly braces.

Just like in JavaScript, objects can contain multiple name/value pairs:
{"firstName":"John", "lastName":"Doe"}

<b>JSON Arrays
</b><br>
JSON arrays are written inside square brackets.

Just like in JavaScript, an array can contain objects:
"employees":[
    {"firstName":"John", "lastName":"Doe"},
    {"firstName":"Anna", "lastName":"Smith"},
    {"firstName":"Peter", "lastName":"Jones"}
]

In the example above, the value of "employees" is an array. It contains three objects.

Each object is a record of a person (with a first name and a last name).
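
In Python, these structures map onto built-in types: a JSON object becomes a dict and a JSON array becomes a list. A short sketch, with the employees array wrapped in braces so that it forms a complete JSON document:

```python
import json

# The employees array from above, wrapped in braces to form a complete JSON object
text = """
{"employees": [
    {"firstName": "John", "lastName": "Doe"},
    {"firstName": "Anna", "lastName": "Smith"},
    {"firstName": "Peter", "lastName": "Jones"}
]}
"""

doc = json.loads(text)
employees = doc["employees"]                       # JSON array -> Python list
first_names = [e["firstName"] for e in employees]  # each JSON object -> Python dict
print(first_names)
```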

<b> Organizing Data in CSV</b>
- The full form of CSV is Comma-separated values.
- Comma-separated value is a simple yet powerful file format to store and exchange data.
- Values are separated using commas in this plain text file format.
- CSV files are often used to store data in a tabular format, such as spreadsheets or databases.

<b>Features of CSV files
</b>
- Delimiters: Delimiters are used to separate values in a row. The most common delimiter is the comma but there are other delimiters such as tabs or semicolons.
- Quotes: Quotes make it possible to store any type of data in a CSV file.
- Header rows: Header rows help to identify the columns in the file
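
A brief sketch of how delimiters and quotes interact, using the standard csv module. The sample rows are hypothetical; the semicolon is just one of the alternative delimiters mentioned above.

```python
import csv
import io

rows = [["name", "city"], ["Nikhil", "Delhi; NCR"]]  # hypothetical data

buffer = io.StringIO()
# Use a semicolon as the delimiter; fields containing it are quoted
csv.writer(buffer, delimiter=";", lineterminator="\n").writerows(rows)
text = buffer.getvalue()
print(text)

# Reading back requires the same delimiter
parsed = list(csv.reader(io.StringIO(text), delimiter=";"))
```

The value containing a semicolon is written inside double quotes, which is what lets the reader recover it as a single field.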

<b> Organizing Data in XML</b>

XML is a software- and hardware-independent tool for storing and transporting data.

What is XML?

- XML stands for eXtensible Markup Language
- XML is a markup language much like HTML
- XML was designed to store and transport data
- XML was designed to be self-descriptive

- An XML file is a plain text file that contains data marked up using XML syntax.
- The data in an XML file is organized using tags and attributes, and can be read by a variety of applications and platforms.
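
To illustrate tags and attributes, here is a minimal sketch using the standard xml.etree.ElementTree module. The XML snippet is a made-up example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML: <employee> is a tag, id is an attribute
xml_text = '<employees><employee id="1"><firstName>John</firstName></employee></employees>'

root = ET.fromstring(xml_text)
employee = root.find("employee")

print(root.tag)                        # name of the root element
print(employee.attrib["id"])           # attribute value (always a string)
print(employee.findtext("firstName"))  # text inside a child element
```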