In [1]:
import pandas as pd
# Raw string so the backslashes in the Windows path are not treated as escape sequences
filename = r'C:\MyLearn\DataSet\FileHandling\sample.xlsx'

# Data Loading From Excel Files

Pandas can read Excel files (extensions: .xlsx, .xls). To read an Excel file as a DataFrame, use the pandas read_excel() function.

You can read the first sheet, a specific sheet, multiple sheets, or all sheets. Pandas converts each sheet into a DataFrame, which is a tabular structure.

### Libraries for Using Excel with Pandas

**Reading and Writing Excel Files**

Several Python packages work with Excel files on any Python platform and require neither Windows nor Excel. They are fast, reliable, and open source:

* **openpyxl**: The recommended package for reading and writing Excel 2010 files (i.e., .xlsx)
* **xlsxwriter**: An alternative package for writing data, formatting information and, in particular, charts in the Excel 2010 format (i.e., .xlsx)
* **pyxlsb**: This package allows you to read Excel files in the xlsb format.
* **pylightxl**: This package allows you to read xlsx and xlsm files and write xlsx files.
* **xlrd**: This package is for reading data and formatting information from older Excel files (i.e., .xls)
* **xlwt**: This package is for writing data and formatting information to older Excel files (i.e., .xls)
* **xlutils**: This package collects utilities that require both xlrd and xlwt, including the ability to copy and modify or filter existing Excel files.

**Note**: In general, these use cases are now covered by **openpyxl**.
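
As a rough illustration of how one of these packages plugs into pandas, the sketch below names openpyxl explicitly through the engine argument when writing and reading an .xlsx file. It assumes pandas and openpyxl are installed; the file name engine_demo.xlsx and the sample values are hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample workbook, written to a temporary directory.
sample_path = Path(tempfile.mkdtemp()) / "engine_demo.xlsx"

frame = pd.DataFrame({"A": [11, 21, 31], "B": [12, 22, 32]})

# Write the workbook with openpyxl, then read it back with the same engine.
frame.to_excel(sample_path, index=False, engine="openpyxl")
loaded = pd.read_excel(sample_path, engine="openpyxl")

print(loaded)
```

When engine is omitted, pandas picks a suitable installed package based on the file extension, so naming it is mostly useful when you want to be explicit.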


##### Writing Excel Add-Ins
The following products can be used to write Excel add-ins in Python. Unlike the reader and writer packages, they require an installation of Microsoft Excel.

* **PyXLL**: PyXLL is a commercial product that enables writing Excel add-ins in Python with no VBA. Python functions can be exposed as worksheet functions (UDFs), macros, menus and ribbon toolbars.
* **xlwings**: xlwings is an open-source library to automate Excel with Python instead of VBA and works on Windows and macOS: you can call Python from Excel and vice versa and write UDFs in Python (Windows only). xlwings PRO is a commercial add-on with additional functionality.


***

### Read Excel

Pass the path or URL of the Excel file as the first argument. If the workbook has multiple sheets, pandas reads only the first one by default and returns it as a DataFrame.


In [2]:
df = pd.read_excel(filename)

In [3]:
df

|  | Unnamed: 0 | Sheet | A | B | C |
|---|---|---|---|---|---|
| 0 | One | Sheet1 | 11 | 12 | 13 |
| 1 | Two | Sheet1 | 21 | 22 | 23 |
| 2 | Three | Sheet1 | 31 | 32 | 33 |
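
The "Unnamed: 0" column in this output typically appears when the first column of the sheet has a blank header cell, for example because a DataFrame index was saved with the data. The sketch below builds a hypothetical workbook shaped like the sample file (the real sample.xlsx is not available here, so its contents are an assumption) and shows how index_col=0 turns that column into the index. It assumes pandas and openpyxl are installed.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical stand-in for the sample workbook.
sample_path = Path(tempfile.mkdtemp()) / "index_demo.xlsx"

frame = pd.DataFrame(
    {"Sheet": ["Sheet1"] * 3, "A": [11, 21, 31], "B": [12, 22, 32], "C": [13, 23, 33]},
    index=["One", "Two", "Three"],
)
# The index is written as the first column with a blank header cell.
frame.to_excel(sample_path, sheet_name="Sheet1")

# Default read: the blank header becomes a column named "Unnamed: 0".
default_read = pd.read_excel(sample_path)
print(default_read.columns.tolist())  # ['Unnamed: 0', 'Sheet', 'A', 'B', 'C']

# index_col=0 uses the first column as the row index instead.
indexed_read = pd.read_excel(sample_path, index_col=0)
print(indexed_read.index.tolist())  # ['One', 'Two', 'Three']
```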


***

### Get sheet

Use the sheet_name argument to choose which sheet to read. Its value can be a zero-based position (0 is the first sheet) or a sheet name.

In [4]:
df_sheet1 = pd.read_excel(filename, sheet_name=0)
df_sheet2 = pd.read_excel(filename, sheet_name='Sheet2')

In [5]:
df_sheet1

|  | Unnamed: 0 | Sheet | A | B | C |
|---|---|---|---|---|---|
| 0 | One | Sheet1 | 11 | 12 | 13 |
| 1 | Two | Sheet1 | 21 | 22 | 23 |
| 2 | Three | Sheet1 | 31 | 32 | 33 |


In [6]:
df_sheet2

|  | Unnamed: 0 | Sheet | A | B | C |
|---|---|---|---|---|---|
| 0 | One | Sheet2 | 11 | 12 | 13 |
| 1 | Two | Sheet2 | 21 | 22 | 23 |
| 2 | Three | Sheet2 | 31 | 32 | 33 |
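
If you do not know the sheet names in advance, one option is to open the workbook with pd.ExcelFile and inspect its sheet_names attribute before choosing a sheet. The following is a minimal sketch under the assumption that pandas and openpyxl are installed; the two-sheet workbook and the make_frame helper are hypothetical and only mimic the sample file.

```python
import tempfile
from pathlib import Path

import pandas as pd


def make_frame(sheet_label):
    """Build a small hypothetical table tagged with the sheet it belongs to."""
    return pd.DataFrame(
        {"Sheet": [sheet_label] * 3, "A": [11, 21, 31], "B": [12, 22, 32], "C": [13, 23, 33]},
        index=["One", "Two", "Three"],
    )


sample_path = Path(tempfile.mkdtemp()) / "sheets_demo.xlsx"

# Write two sheets into one workbook.
with pd.ExcelWriter(sample_path) as writer:
    for name in ("Sheet1", "Sheet2"):
        make_frame(name).to_excel(writer, sheet_name=name)

# Open the workbook once, list its sheets, then read one of them by position.
with pd.ExcelFile(sample_path) as workbook:
    sheet_names = workbook.sheet_names
    second = pd.read_excel(workbook, sheet_name=1, index_col=0)

print(sheet_names)  # ['Sheet1', 'Sheet2']
print(second)
```

Reading through an ExcelFile object also avoids reopening the file when several sheets are read one after another.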


***

### Load multiple sheets

You can also pass a list to sheet_name. Its elements can be zero-based positions, sheet names, or a mix of both.

The result is a dictionary: each key is the position or sheet name you specified, and each value is the corresponding DataFrame. Recent pandas versions return a plain dict, while older versions returned an OrderedDict.

In [7]:
df_multi_sheet = pd.read_excel(filename, sheet_name=['Sheet1','Sheet2'])

In [8]:
df_multi_sheet

{'Sheet1':   Unnamed: 0   Sheet   A   B   C
 0        One  Sheet1  11  12  13
 1        Two  Sheet1  21  22  23
 2      Three  Sheet1  31  32  33,
 'Sheet2':   Unnamed: 0   Sheet   A   B   C
 0        One  Sheet2  11  12  13
 1        Two  Sheet2  21  22  23
 2      Three  Sheet2  31  32  33}

In [9]:
type(df_multi_sheet)

dict

In [10]:
df_multi_sheet.keys()

dict_keys(['Sheet1', 'Sheet2'])

In [11]:
df_multi_sheet.get('Sheet1')

|  | Unnamed: 0 | Sheet | A | B | C |
|---|---|---|---|---|---|
| 0 | One | Sheet1 | 11 | 12 | 13 |
| 1 | Two | Sheet1 | 21 | 22 | 23 |
| 2 | Three | Sheet1 | 31 | 32 | 33 |


In [12]:
df_multi_sheet.get('Sheet2')

|  | Unnamed: 0 | Sheet | A | B | C |
|---|---|---|---|---|---|
| 0 | One | Sheet2 | 11 | 12 | 13 |
| 1 | Two | Sheet2 | 21 | 22 | 23 |
| 2 | Three | Sheet2 | 31 | 32 | 33 |
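
The keys of the returned dictionary mirror whatever you put in the list, so a position stays an integer key and a name stays a string key. The sketch below illustrates this with a mixed list; it assumes pandas and openpyxl are installed, and the workbook it builds is a hypothetical stand-in for the sample file.

```python
import tempfile
from pathlib import Path

import pandas as pd

sample_path = Path(tempfile.mkdtemp()) / "multi_demo.xlsx"

# Hypothetical two-sheet workbook shaped like the sample file.
with pd.ExcelWriter(sample_path) as writer:
    for name in ("Sheet1", "Sheet2"):
        pd.DataFrame(
            {"Sheet": [name] * 3, "A": [11, 21, 31]},
            index=["One", "Two", "Three"],
        ).to_excel(writer, sheet_name=name)

# Mix a zero-based position with a sheet name in the same list.
sheets = pd.read_excel(sample_path, sheet_name=[0, "Sheet2"], index_col=0)

print(list(sheets.keys()))  # [0, 'Sheet2']
for key, sheet_frame in sheets.items():
    print(key, sheet_frame.shape)
```

Because the key for the first sheet is the integer 0 rather than 'Sheet1', look it up with sheets[0] in this case.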


***

### Load all sheets

If the sheet_name argument is None, all sheets are read and returned as a dictionary keyed by sheet name.

In [13]:
df_all_sheet = pd.read_excel(filename, sheet_name=None)

In [14]:
df_all_sheet.keys()

dict_keys(['Sheet1', 'Sheet2'])

In [15]:
df_all_sheet.get('Sheet1')

|  | Unnamed: 0 | Sheet | A | B | C |
|---|---|---|---|---|---|
| 0 | One | Sheet1 | 11 | 12 | 13 |
| 1 | Two | Sheet1 | 21 | 22 | 23 |
| 2 | Three | Sheet1 | 31 | 32 | 33 |
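
Once every sheet is loaded, a common follow-up is to stack them into a single DataFrame. The sketch below does this with pd.concat, which accepts the dictionary directly and uses its keys as an outer index level. It assumes pandas and openpyxl are installed; the workbook and the level names "sheet" and "row" are hypothetical choices for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

sample_path = Path(tempfile.mkdtemp()) / "all_sheets_demo.xlsx"

# Hypothetical two-sheet workbook shaped like the sample file.
with pd.ExcelWriter(sample_path) as writer:
    for name in ("Sheet1", "Sheet2"):
        pd.DataFrame(
            {"Sheet": [name] * 3, "A": [11, 21, 31], "B": [12, 22, 32], "C": [13, 23, 33]},
            index=["One", "Two", "Three"],
        ).to_excel(writer, sheet_name=name)

# sheet_name=None loads every sheet into a dict keyed by sheet name.
all_sheets = pd.read_excel(sample_path, sheet_name=None, index_col=0)

# Stack the sheets; the dict keys become the outer level of a MultiIndex.
combined = pd.concat(all_sheets, names=["sheet", "row"])

print(combined)
print(combined.loc[("Sheet2", "Two"), "A"])  # 21
```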


***