#### Part 12: JSON and Excel Operations in Pandas

In this notebook, we'll explore:
- Working with JSON data in pandas
- Different JSON orientation options
- Date handling in JSON
- Working with Excel files

##### Setup
First, let's import the necessary libraries:

In [None]:
import pandas as pd
import numpy as np
from io import StringIO

##### 1. JSON Operations

### 1.1 Basic JSON Conversion

Let's start by creating a DataFrame and converting it to JSON:

In [None]:
dfj = pd.DataFrame(np.random.randn(5, 2), columns=list('AB'))

json = dfj.to_json()

json
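
To check that the output is usable, the string can be read back with `pd.read_json`. The sketch below wraps it in `StringIO` (imported in the setup cell), because recent pandas versions warn when a literal JSON string is passed directly. The names `frame`, `json_text` and `restored` are illustrative and not part of the notebook.

```python
from io import StringIO

import numpy as np
import pandas as pd

frame = pd.DataFrame(np.random.randn(5, 2), columns=list("AB"))
json_text = frame.to_json()

# Wrap the string in a file-like object before handing it to read_json
restored = pd.read_json(StringIO(json_text))

# Floats are rounded on output (double_precision defaults to 10),
# so compare with a tolerance instead of exact equality
print(np.allclose(frame.to_numpy(), restored.to_numpy(), atol=1e-8))
```

The comparison uses a tolerance because `to_json` does not write floats at full precision by default.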

### 1.2 Orient Options

The `orient` argument controls the layout of the resulting JSON string or file. Let's create a DataFrame and a Series to demonstrate:

In [None]:
dfjo = pd.DataFrame(dict(A=range(1, 4), B=range(4, 7), C=range(7, 10)),
                   columns=list('ABC'), index=list('xyz'))

dfjo

In [None]:
sjo = pd.Series(dict(x=15, y=16, z=17), name='D')

sjo

#### Column Oriented (default for DataFrame)
Serializes the data as nested JSON objects with column labels acting as the primary index:

In [None]:
dfjo.to_json(orient="columns")

#### Index Oriented (default for Series)
Similar to column-oriented, but the index labels are now primary:

In [None]:
dfjo.to_json(orient="index")

In [None]:
sjo.to_json(orient="index")

#### Record Oriented
Serializes the data to a JSON array of column -> value records; index labels are not included:

In [None]:
dfjo.to_json(orient="records")

In [None]:
sjo.to_json(orient="records")

#### Value Oriented
A bare-bones option that serializes to nested JSON arrays of values only; column and index labels are not included:

In [None]:
dfjo.to_json(orient="values")

#### Split Oriented
Serializes to a JSON object containing separate entries for values, index and columns. For a Series, the name is also included:

In [None]:
dfjo.to_json(orient="split")

In [None]:
sjo.to_json(orient="split")
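
The orientations are easier to compare side by side once each string is parsed back into plain Python objects. The following is a small sketch on a two-row frame; `frame`, `series` and `layouts` are illustrative names, and the standard-library `json` module is imported under the alias `jsonlib` only because this notebook reuses the name `json` for output strings.

```python
import json as jsonlib  # aliased: the notebook uses `json` as a variable name

import pandas as pd

frame = pd.DataFrame({"A": [1, 2], "B": [3, 4]}, index=["x", "y"])
series = pd.Series({"x": 15, "y": 16}, name="D")

# Parse every orientation back into dicts/lists to see its structure
layouts = {
    orient: jsonlib.loads(frame.to_json(orient=orient))
    for orient in ("columns", "index", "records", "values", "split")
}
for orient, parsed in layouts.items():
    print(f"{orient:>8}: {parsed}")

# For a Series, "records" keeps only the values and "split" adds the name
series_records = jsonlib.loads(series.to_json(orient="records"))
series_split = jsonlib.loads(series.to_json(orient="split"))
print(series_records)
print(series_split)
```

Only `columns`, `index` and `split` keep the index labels, which matters if the JSON has to be read back into an identical frame.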

### 1.3 Date Handling

#### Writing in ISO date format:

In [None]:
dfd = pd.DataFrame(np.random.randn(5, 2), columns=list('AB'))

dfd['date'] = pd.Timestamp('20130101')

# Sort the columns in descending order (axis must be given by keyword)
dfd = dfd.sort_index(axis=1, ascending=False)

json = dfd.to_json(date_format='iso')

json
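
ISO strings also survive a round trip: by default `pd.read_json` tries to convert columns with date-like names (a column called `date` is one of them) back to datetimes. The sketch below assumes that default behaviour; `frame`, `iso_text` and `restored` are illustrative names.

```python
from io import StringIO

import pandas as pd

frame = pd.DataFrame({"date": [pd.Timestamp("2013-01-01")] * 2, "A": [0.5, 1.5]})
iso_text = frame.to_json(date_format="iso")

# read_json converts the "date" column back to a datetime dtype
restored = pd.read_json(StringIO(iso_text))
print(restored.dtypes)
```

For columns whose names are not recognised as date-like, the `convert_dates` argument of `read_json` accepts an explicit list of column names.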

#### Writing in ISO date format, with microseconds:

In [None]:
json = dfd.to_json(date_format='iso', date_unit='us')

json
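
The effect of `date_unit` is easiest to see on a timestamp that actually has sub-second digits. This is a minimal sketch with an illustrative one-row frame; the `jsonlib` alias again avoids clashing with the notebook's `json` variable.

```python
import json as jsonlib

import pandas as pd

frame = pd.DataFrame({"date": [pd.Timestamp("2013-01-01 12:30:45.123456")]})

# Serialize the same timestamp at second, millisecond and microsecond precision
stamps = {
    unit: jsonlib.loads(frame.to_json(date_format="iso", date_unit=unit))["date"]["0"]
    for unit in ("s", "ms", "us")
}
for unit, text in stamps.items():
    print(f"{unit:>2}: {text}")
```

Coarser units drop the extra fractional digits, so pick the unit that matches the precision the consumer needs.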

##### 2. Excel Operations

### 2.1 Reading Excel Files

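pandas offers two ways to read Excel files: the `ExcelFile` class, which opens the workbook once and is convenient when reading several sheets, and the top-level `read_excel` function: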

In [None]:
# This is a code example - you would need an actual Excel file to run this
# Using the ExcelFile class
'''
data = {}
with pd.ExcelFile('path_to_file.xls') as xls:
    data['Sheet1'] = pd.read_excel(xls, 'Sheet1', index_col=None,
                                   na_values=['NA'])
    data['Sheet2'] = pd.read_excel(xls, 'Sheet2', index_col=None,
                                   na_values=['NA'])
'''

In [None]:
# Equivalent using the read_excel function
'''
data = pd.read_excel('path_to_file.xls', ['Sheet1', 'Sheet2'],
                     index_col=None, na_values=['NA'])
'''
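
A runnable variant of the two approaches is sketched below. It writes a throwaway `.xlsx` workbook to a temporary directory first (pandas can no longer write `.xls`), and it assumes the optional `openpyxl` package is installed; if it is not, the round trip is skipped. The file name `demo.xlsx` and all variable names are hypothetical.

```python
import importlib.util
import os
import tempfile

import pandas as pd

first = pd.DataFrame({"a": [1, 2]})
second = pd.DataFrame({"b": [3, 4]})

via_class, via_function = None, None
if importlib.util.find_spec("openpyxl") is not None:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.xlsx")  # hypothetical throwaway workbook
        with pd.ExcelWriter(path, engine="openpyxl") as writer:
            first.to_excel(writer, sheet_name="Sheet1", index=False)
            second.to_excel(writer, sheet_name="Sheet2", index=False)

        # ExcelFile opens the workbook once and reuses it for every sheet
        with pd.ExcelFile(path) as xls:
            via_class = {
                name: pd.read_excel(xls, sheet_name=name) for name in xls.sheet_names
            }

        # read_excel with a list of sheets builds the same dictionary in one call
        via_function = pd.read_excel(path, sheet_name=["Sheet1", "Sheet2"])
    print(list(via_class), list(via_function))
else:
    print("openpyxl is not installed; skipping the Excel round trip")
```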

### 2.2 Using xlrd.book.Book Object

In [None]:
# ExcelFile can also be called with an xlrd.book.Book object
'''
import xlrd
xlrd_book = xlrd.open_workbook('path_to_file.xls', on_demand=True)
with pd.ExcelFile(xlrd_book) as xls:
    df1 = pd.read_excel(xls, 'Sheet1')
    df2 = pd.read_excel(xls, 'Sheet2')
'''

### 2.3 Specifying Sheets

The `sheet_name` argument allows specifying the sheet or sheets to read:
- Default value is 0, indicating to read the first sheet
- Pass a string to refer to the name of a particular sheet
- Pass an integer to refer to the index of a sheet (0-based)
- Pass a list of strings or integers to return a dictionary of specified sheets
- Pass None to return a dictionary of all available sheets

In [None]:
# Examples (these are code examples - you would need actual Excel files)
'''
# Returns a DataFrame
pd.read_excel('path_to_file.xls', 'Sheet1', index_col=None, na_values=['NA'])

# Using the sheet index
pd.read_excel('path_to_file.xls', 0, index_col=None, na_values=['NA'])

# Using all default values
pd.read_excel('path_to_file.xls')

# Using None to get all sheets
pd.read_excel('path_to_file.xls', sheet_name=None)

# Using a list to get multiple sheets
pd.read_excel('path_to_file.xls', sheet_name=['Sheet1', 3])
'''
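
The different `sheet_name` forms can be tried without an existing file by writing a small workbook first. The sketch below assumes `openpyxl` is installed and skips the demonstration otherwise; `demo.xlsx`, `results` and the sheet contents are made up for illustration.

```python
import importlib.util
import os
import tempfile

import pandas as pd

results = None
if importlib.util.find_spec("openpyxl") is not None:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.xlsx")  # hypothetical throwaway workbook
        with pd.ExcelWriter(path, engine="openpyxl") as writer:
            pd.DataFrame({"a": [1, 2]}).to_excel(writer, sheet_name="Sheet1", index=False)
            pd.DataFrame({"b": [3, 4]}).to_excel(writer, sheet_name="Sheet2", index=False)

        results = {
            "default": pd.read_excel(path),                           # first sheet
            "by name": pd.read_excel(path, sheet_name="Sheet2"),      # DataFrame
            "by position": pd.read_excel(path, sheet_name=1),         # DataFrame
            "list": pd.read_excel(path, sheet_name=["Sheet1", 1]),    # dict
            "all": pd.read_excel(path, sheet_name=None),              # dict
        }
    for label, value in results.items():
        keys = list(value) if isinstance(value, dict) else list(value.columns)
        print(f"{label:>11}: {type(value).__name__} {keys}")
else:
    print("openpyxl is not installed; skipping the sheet_name demonstration")
```

When a list is passed, the dictionary keys are the specifiers exactly as given, so mixing names and positions yields mixed key types.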

### 2.4 Reading a MultiIndex

`read_excel` can read a MultiIndex index by passing a list of columns to `index_col` and a MultiIndex column by passing a list of rows to `header`.

In [None]:
# Example of creating and reading a MultiIndex DataFrame with Excel
df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [5, 6, 7, 8]},
                 index=pd.MultiIndex.from_product([['a', 'b'], ['c', 'd']]))

df

In [None]:
# This would write to an Excel file and then read it back
'''
df.to_excel('path_to_file.xlsx')
df = pd.read_excel('path_to_file.xlsx', index_col=[0, 1])
'''

If the index has level names, they will be parsed as well:

In [None]:
df.index = df.index.set_names(['lvl1', 'lvl2'])
df

In [None]:
# This would write to an Excel file and then read it back
'''
df.to_excel('path_to_file.xlsx')
df = pd.read_excel('path_to_file.xlsx', index_col=[0, 1])
'''
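
The round trip with named levels can be checked end to end against a temporary file. This is a sketch that assumes `openpyxl` is available (it is skipped otherwise); `demo.xlsx`, `frame` and `restored` are illustrative names.

```python
import importlib.util
import os
import tempfile

import pandas as pd

frame = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [5, 6, 7, 8]},
    index=pd.MultiIndex.from_product([["a", "b"], ["c", "d"]], names=["lvl1", "lvl2"]),
)

restored = None
if importlib.util.find_spec("openpyxl") is not None:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.xlsx")  # hypothetical throwaway workbook
        frame.to_excel(path, engine="openpyxl")
        # The first two columns of the sheet hold the two index levels
        restored = pd.read_excel(path, index_col=[0, 1])
    print(list(restored.index.names))
    print(restored)
else:
    print("openpyxl is not installed; skipping the MultiIndex round trip")
```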