# __Reading and Writing CSV Files__

Let's look at how to read and write CSV files. There are several ways to do this: Python's built-in `csv` library, NumPy, and pandas.


## Method 1 - Using `csv` Library

In [1]:
import csv
# Opened this way, the file stays open until csvFile.close() is called
csvFile = open('Sample_File.csv', newline='')
csvReader = csv.reader(csvFile)

- `csvReader` is the reader object returned by `csv.reader()`
- Check the type of the object with `type(csvReader)`
- This step is optional, but it helps confirm that the reader was created before you iterate over it
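
The type check described above can be sketched as follows. This is a minimal example that assumes a small stand-in file written to a temporary directory (the name `demo.csv` is hypothetical), so it runs without `Sample_File.csv`.

```python
import csv
import os
import tempfile

# Hypothetical stand-in for Sample_File.csv, written to a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.csv')
    with open(path, 'w', newline='') as f:
        f.write('Name,Age,Gender\nMark,24,Male\n')

    with open(path, newline='') as csv_file:
        csv_reader = csv.reader(csv_file)
        reader_type = type(csv_reader)   # the reader class provided by the csv module
        first_row = next(csv_reader)     # each row comes back as a list of strings

print(reader_type)
print(first_row)
```

The reader is lazy: it only pulls lines from the file as you iterate, so the file must still be open while you read from it.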

Now let us print the contents of the file using a for loop:
- Declare a loop variable that iterates through the reader
- Print each row

In [2]:
# Reading a CSV file
with open('Sample_File.csv', 'r') as file:
    csv_reader = csv.reader(file)
    for row in csv_reader:
        print(row)

['Name', 'Age', 'Gender']
['Mark', '24', 'Male']
['Manish', '30', 'Male']
['Mary', '28', 'Female']
['Mike', '44', 'Male']
['Becky', '18', 'Female']


### Step 2: Distinguish the Header and the Rows in the CSV File

- Open the CSV file for reading using the **with** statement and the **open()** function
- Read the CSV file using the **reader()** function
- Initialize a **count** variable to track the row number
- Traverse through the lines in csvReader
- Check if the **count** is 0 and print the line as header if it is
- Else, print it as a row
- Increment the **count** by one



In [3]:
with open('Sample_File.csv','r') as csvFile:
    csvReader = csv.reader(csvFile)
    count = 0
    for line in csvReader:
        if count == 0:
            print('Header: '+str(line))
        else:
            print('Row: '+str(line))
        count+=1

Header: ['Name', 'Age', 'Gender']
Row: ['Mark', '24', 'Male']
Row: ['Manish', '30', 'Male']
Row: ['Mary', '28', 'Female']
Row: ['Mike', '44', 'Male']
Row: ['Becky', '18', 'Female']
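
As an alternative to counting rows, the header can be pulled off with `next()` before looping. The sketch below assumes an in-memory stand-in (`io.StringIO`) holding the first few lines of the sample data, so it does not depend on the file being present.

```python
import csv
import io

# In-memory stand-in for the first rows of Sample_File.csv
sample = io.StringIO('Name,Age,Gender\nMark,24,Male\nManish,30,Male\n')

reader = csv.reader(sample)
header = next(reader)   # consume the first line as the header
rows = list(reader)     # everything that remains is data

print('Header:', header)
for row in rows:
    print('Row:', row)
```

This avoids the manual counter, at the cost of raising `StopIteration` if the file is completely empty.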


### Step 3: Write a New Row to the CSV File

- Open the CSV file in append mode (`'a'`) using the **with** statement and the **open()** function, so existing rows are kept
- Create a writer object using the **writer()** function
- Write a new row to the CSV file using the **writerow()** function, passing the row to be written as its argument


In [4]:
with open('Sample_File.csv', 'a', newline='') as csvFile:
    csvWriter = csv.writer(csvFile)
    csvWriter.writerow(['Caitlin',27,'Female'])
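
To see the effect of append mode, the sketch below writes a small file, appends one row, and reads everything back. It assumes a hypothetical file `demo.csv` in a temporary directory rather than the real `Sample_File.csv`.

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.csv')   # hypothetical file name

    # Create a small file with a header and one row
    with open(path, 'w', newline='') as f:
        csv.writer(f).writerows([['Name', 'Age', 'Gender'], ['Mark', 24, 'Male']])

    # 'a' adds to the end of the file instead of truncating it
    with open(path, 'a', newline='') as f:
        csv.writer(f).writerow(['Caitlin', 27, 'Female'])

    # Read the file back to confirm the row was appended
    with open(path, newline='') as f:
        appended_rows = list(csv.reader(f))

print(appended_rows)
```

Passing `newline=''` lets the `csv` module control line endings itself, which avoids blank lines between rows on Windows.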

### Building a CSV

In [5]:
# Writing to a CSV file
data = [['Name', 'Age'], ['John', 30], ['Jane', 25], ['Bob', 35]]

with open('output.csv', 'w', newline='') as file:
    csv_writer = csv.writer(file)
    csv_writer.writerows(data)
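
One thing to keep in mind when reading such a file back: the `csv` module returns every field as a string, so numbers have to be converted explicitly. The sketch below round-trips the same `data` through an in-memory buffer (an assumption made to keep the example free of disk files).

```python
import csv
import io

data = [['Name', 'Age'], ['John', 30], ['Jane', 25], ['Bob', 35]]

# Write to an in-memory buffer instead of output.csv
buffer = io.StringIO(newline='')
csv.writer(buffer).writerows(data)

# Rewind and read the rows back
buffer.seek(0)
header, *records = csv.reader(buffer)

# Ages were written as integers but come back as strings
ages = [int(age) for _name, age in records]

print(header)
print(records)
print(ages)
```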


## Method 2 - Using `NumPy`

### A. Using `loadtxt`

In [6]:
import numpy as np

# for numerical data only
array = np.loadtxt('num_data.csv', delimiter=',')
array

array([  9., 167.,  30.,  71., 102.,  87., 229., 227.,  91.,  78.,  84.,
       177.,  91.,  26., 226.,   7.,  92., 215., 185.,  86., 119., 191.,
        21., 182.,  53.,  70., 185., 129., 232.,  28., 106., 136.,  62.,
       168., 139., 154., 149., 125., 188.,  29., 180.,   7., 166., 228.,
        72.,  25., 114., 135.,  54., 208.,  33., 228., 125.,  82.,  69.,
        53., 218.,  90.,  52., 153., 101., 157., 160.,   9., 187.,  20.,
        93.,  42.,  65., 225.,  56.,  55.])

In [7]:
# for a CSV file that contains text
arr = np.loadtxt(
    'Sample_File.csv',
    delimiter=',',
    dtype=str,
    # skiprows=1,  # uncomment to skip the first row (header)
)
arr

array([['Name', 'Age', 'Gender'],
       ['Mark', '24', 'Male'],
       ['Manish', '30', 'Male'],
       ['Mary', '28', 'Female'],
       ['Mike', '44', 'Male'],
       ['Becky', '18', 'Female'],
       ['Caitlin', '27', 'Female']], dtype='<U7')
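
When only the numeric column is needed, `skiprows` and `usecols` can be combined so that `loadtxt` returns numbers directly. This is a minimal sketch that assumes an in-memory copy of the first few sample rows.

```python
import io

import numpy as np

# In-memory stand-in for the first rows of Sample_File.csv
sample = io.StringIO('Name,Age,Gender\nMark,24,Male\nManish,30,Male\nMary,28,Female\n')

# Skip the header line and read only column 1 (Age) as floats
ages = np.loadtxt(sample, delimiter=',', skiprows=1, usecols=1)

print(ages)
print(ages.dtype)
```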

### B. Using `genfromtxt`

In [8]:
import numpy as np

# Reading a CSV file
data = np.genfromtxt(
    'Sample_File.csv',
    delimiter=',',
    dtype=str,
    # encoding=None,
)
print(data)


[['Name' 'Age' 'Gender']
 ['Mark' '24' 'Male']
 ['Manish' '30' 'Male']
 ['Mary' '28' 'Female']
 ['Mike' '44' 'Male']
 ['Becky' '18' 'Female']
 ['Caitlin' '27' 'Female']]


In [9]:
# to skip the header 
data[1:]

array([['Mark', '24', 'Male'],
       ['Manish', '30', 'Male'],
       ['Mary', '28', 'Female'],
       ['Mike', '44', 'Male'],
       ['Becky', '18', 'Female'],
       ['Caitlin', '27', 'Female']], dtype='<U7')
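
Instead of slicing the header off, `genfromtxt` can use it for field names via `names=True`, and `dtype=None` asks NumPy to infer a type per column. The sketch below assumes an in-memory copy of two sample rows; the variable name `table` is illustrative.

```python
import io

import numpy as np

# In-memory stand-in for the first rows of Sample_File.csv
sample = io.StringIO('Name,Age,Gender\nMark,24,Male\nManish,30,Male\n')

# names=True takes field names from the header; dtype=None infers each column's type
table = np.genfromtxt(sample, delimiter=',', names=True, dtype=None, encoding='utf-8')

print(table.dtype.names)
print(table['Age'])    # integer column
print(table['Name'])   # string column
```

The result is a structured array, so columns are selected by name rather than by position.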

### Saving a NumPy Array as a CSV File

In [10]:
arr = np.array([[1,2,3],[4,5,6]])
np.savetxt('sample_numpy.csv',arr,delimiter=',')
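
By default `np.savetxt` writes every value in scientific notation, even for an integer array. The sketch below compares the default with `fmt='%d'`, writing to in-memory buffers (an assumption made so nothing is written to disk).

```python
import io

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Default format: floating point in scientific notation
default_buf = io.StringIO()
np.savetxt(default_buf, arr, delimiter=',')

# fmt='%d' writes plain integers
int_buf = io.StringIO()
np.savetxt(int_buf, arr, delimiter=',', fmt='%d')

print(default_buf.getvalue())
print(int_buf.getvalue())
```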

### Saving as a NumPy File

In [11]:
arr = np.array([1,2,3,4,6,7])

In [12]:
# as a numpy file
np.save('file.npy',arr)

In [13]:
arr = np.load('file.npy')

In [22]:
arr

array([1, 2, 3, 4, 6, 7])
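
Unlike CSV, the `.npy` format stores the array's dtype and shape along with the values. The following sketch illustrates this with a hypothetical 2-D `int32` array saved to a temporary directory.

```python
import os
import tempfile

import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'matrix.npy')   # hypothetical file name
    np.save(path, matrix)
    restored = np.load(path)

# dtype and shape survive the round trip
print(restored.dtype, restored.shape)
```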

## Method 3 - Using `pandas`

In [14]:
import pandas as pd

df = pd.read_csv('Sample_File.csv')

In [15]:
df

      Name  Age  Gender
0     Mark   24    Male
1   Manish   30    Male
2     Mary   28  Female
3     Mike   44    Male
4    Becky   18  Female
5  Caitlin   27  Female
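
Unlike the `csv` module, `read_csv` infers a type for each column, so numeric columns can be used in calculations right away. A minimal sketch, assuming an in-memory copy of two sample rows:

```python
import io

import pandas as pd

# In-memory stand-in for the first rows of Sample_File.csv
sample = io.StringIO('Name,Age,Gender\nMark,24,Male\nManish,30,Male\n')

df = pd.read_csv(sample)

print(df.dtypes)              # Age is parsed as an integer column
mean_age = df['Age'].mean()
print(mean_age)
```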


### Add a New Row

In [16]:
df.loc[len(df.index)] = ['Ganesh',20,'Male']

In [17]:
df

      Name  Age  Gender
0     Mark   24    Male
1   Manish   30    Male
2     Mary   28  Female
3     Mike   44    Male
4    Becky   18  Female
5  Caitlin   27  Female
6   Ganesh   20    Male
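
Another way to add rows, which may be more convenient when several rows arrive at once, is `pd.concat`. The sketch below builds a small stand-in DataFrame by hand (an assumption, so it runs without the CSV file).

```python
import pandas as pd

# Small stand-in for the DataFrame loaded from Sample_File.csv
df = pd.DataFrame({
    'Name': ['Mark', 'Manish'],
    'Age': [24, 30],
    'Gender': ['Male', 'Male'],
})

# Build the new row(s) as their own DataFrame, then concatenate
new_rows = pd.DataFrame([{'Name': 'Ganesh', 'Age': 20, 'Gender': 'Male'}])
df = pd.concat([df, new_rows], ignore_index=True)   # ignore_index renumbers 0..n-1

print(df)
```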


### Write the Updated DataFrame Back to the CSV File

In [18]:
# index=False skips writing the index that pandas creates internally
df.to_csv('Sample_File.csv', index=False)
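
To illustrate what `index=False` changes, the sketch below writes a one-row stand-in DataFrame both ways. It relies on `to_csv` returning the CSV text as a string when no path is given.

```python
import io

import pandas as pd

df = pd.DataFrame({'Name': ['Mark'], 'Age': [24]})

with_index = df.to_csv()                 # index written as an unnamed first column
without_index = df.to_csv(index=False)   # only the data columns

# Reading the first version back produces an extra 'Unnamed: 0' column
reloaded = pd.read_csv(io.StringIO(with_index))

print(with_index)
print(without_index)
print(list(reloaded.columns))
```

Without `index=False`, each write-then-read cycle would add another unnamed index column to the file.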

## JSON Data

In [19]:
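from io import StringIO

import pandas as pd

# Example JSON data
json_data = '''
{
    "name": "John Doe",
    "age": 30,
    "city": "New York"
}
'''

# Load JSON data into a DataFrame
# (wrapped in StringIO because passing a literal JSON string is deprecated in pandas 2.1+)
df = pd.read_json(StringIO(json_data), orient='index')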

# Transpose the DataFrame for a more structured view
df = df.transpose()

# Print the DataFrame
print(df)


       name age      city
0  John Doe  30  New York


In [20]:
# Transpose again to go back to one row per key
df = df.transpose()

# Print the DataFrame
df

             0
name  John Doe
age         30
city  New York


In [21]:
from io import StringIO

import pandas as pd

# Example JSON data with multiple layers
json_data = '''
{
    "person": {
        "name": "John Doe",
        "age": 30,
        "address": {
            "city": "New York",
            "state": "NY"
        },
        "hobbies": ["reading", "traveling"]
    }
}
'''

# Load JSON data into a DataFrame
# (wrapped in StringIO because passing a literal JSON string is deprecated in pandas 2.1+)
df = pd.read_json(StringIO(json_data), orient='index')

# Transpose the DataFrame for a more structured view
df = df.transpose()

# Print the DataFrame
df


                                      person
name                                John Doe
age                                       30
address  {'city': 'New York', 'state': 'NY'}
hobbies                 [reading, traveling]
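
In the output above, the nested `address` object stays a single dictionary inside one cell. If separate columns are wanted, one option is `pd.json_normalize`, which flattens nested objects into dotted column names. The sketch below reuses the same JSON and is only one possible approach; the variable names `record` and `flat` are illustrative.

```python
import json

import pandas as pd

json_data = '''
{
    "person": {
        "name": "John Doe",
        "age": 30,
        "address": {
            "city": "New York",
            "state": "NY"
        },
        "hobbies": ["reading", "traveling"]
    }
}
'''

# Parse the JSON text and pick out the nested record
record = json.loads(json_data)['person']

# Nested dictionaries become dotted columns such as 'address.city'
flat = pd.json_normalize(record)

print(flat.columns.tolist())
print(flat.loc[0, 'address.city'])
```

Lists such as `hobbies` are left as a single cell; only nested objects are expanded.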
