# read_xls_with_pandas
Read an Excel file using pandas.

## dependencies
To read Excel files, pandas may need the following packages, if your installation does not already include them:
1. `xlrd >= 0.9.0` for Excel support (note that `xlrd` 2.0 and later reads only `.xls` files), as in https://stackoverflow.com/q/48066517/9475509, and/or
2. `openpyxl` (used for `.xlsx` files), as in https://github.com/spyder-ide/spyder/issues/18071.
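
Before installing anything, it can help to see which of these packages are already importable. The following is a minimal sketch using only the standard library; the helper name `installed_engines` is hypothetical and not part of pandas.

```python
from importlib.util import find_spec

# Package names as used by both pip and import.
ENGINES = ("xlrd", "openpyxl")


def installed_engines(names=ENGINES):
    """Return a dict mapping each package name to True if it can be imported."""
    return {name: find_spec(name) is not None for name in names}


# Prints a dict with one True/False entry per package.
print(installed_engines())
```

A `False` entry means the corresponding `pip install` step is still needed.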

Install them from the command line.

For `xlrd` use `pip install xlrd`.

```
PS L:\home\py-jupyter-notebook> pip install xlrd
Collecting xlrd
  Downloading xlrd-2.0.1-py2.py3-none-any.whl (96 kB)
     ------------------------------------- 96.5/96.5 kB 791.7 kB/s eta 0:00:00
Installing collected packages: xlrd
Successfully installed xlrd-2.0.1
WARNING: There was an error checking the latest version of pip.
```

And for `openpyxl` use `pip install openpyxl`.

```
PS L:\home\py-jupyter-notebook> pip install openpyxl
Collecting openpyxl
  Downloading openpyxl-3.0.10-py2.py3-none-any.whl (242 kB)
     ----------------------------------- 242.1/242.1 kB 380.8 kB/s eta 0:00:00
Collecting et-xmlfile
  Downloading et_xmlfile-1.1.0-py3-none-any.whl (4.7 kB)
Installing collected packages: et-xmlfile, openpyxl
Successfully installed et-xmlfile-1.1.0 openpyxl-3.0.10
```

pip itself may also need to be upgraded first, using `pip install --upgrade pip`.

```
PS L:\home\py-jupyter-notebook> pip install --upgrade pip
Requirement already satisfied: pip in c:\users\sparisoma viridi\appdata\local\packages\pythonsoftwarefoundation.python.3.10_qbz5n2kfra8p0\localcache\local-packages\python310\site-packages (22.1)
Collecting pip
  Downloading pip-22.3.1-py3-none-any.whl (2.1 MB)
     --------------------------------------- 2.1/2.1 MB 350.3 kB/s eta 0:00:00
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 22.1
    Uninstalling pip-22.1:
      Successfully uninstalled pip-22.1
Successfully installed pip-22.3.1
```

## files
![](files.png)
The Excel files are in the same folder as the Jupyter Notebook file `read_xls_with_pandas.ipynb`.
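
Because the files sit next to the notebook, relative names such as `"sample_0.xlsx"` are enough. As a rough sketch, the Excel files in a folder can be listed with `pathlib`; the helper name `list_excel_files` is an assumption made for illustration.

```python
from pathlib import Path


def list_excel_files(folder="."):
    """Return the sorted names of .xls and .xlsx files directly inside folder."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in {".xls", ".xlsx"}
    )


# Lists the Excel files in the current working directory.
print(list_excel_files())
```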

## import pandas

In [1]:
# import pandas library
import pandas as pd

### sample_0.xlsx
![](sample_0.png)

In [2]:
df0 = pd.read_excel("sample_0.xlsx")
print(df0)

     x   y
0    0   2
1    1   3
2    2   4
3    3   5
4    4   6
5    5   7
6    6   8
7    7   9
8    8  10
9    9  11
10  10  12
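
The call above lets pandas choose the reader on its own. If you prefer to be explicit, `pd.read_excel` accepts an `engine` argument. The sketch below picks an engine name from the file extension; the mapping and the helper name `pick_engine` are assumptions for illustration, not something pandas provides.

```python
from pathlib import Path

# Assumed mapping from file extension to the `engine` argument of pd.read_excel.
ENGINE_BY_SUFFIX = {".xlsx": "openpyxl", ".xlsm": "openpyxl", ".xls": "xlrd"}


def pick_engine(filename):
    """Return an engine name for filename, or None to let pandas decide."""
    return ENGINE_BY_SUFFIX.get(Path(filename).suffix.lower())


print(pick_engine("sample_0.xlsx"))  # openpyxl

# Usage (needs the file and the chosen engine to be installed):
# df0 = pd.read_excel("sample_0.xlsx", engine=pick_engine("sample_0.xlsx"))
```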


### sample_1.xlsx
![](sample_1.png)

In [3]:
df1 = pd.read_excel("sample_1.xlsx")
print(df1)

   Unnamed: 0  0  1  2   3   4
0           0  0  0  0   0   0
1           1  0  1  2   3   4
2           2  0  2  4   6   8
3           3  0  3  6   9  12
4           4  0  4  8  12  16
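
The `Unnamed: 0` column appears because the first header cell in the sheet is empty, so that column really holds row labels. One way to handle it is to turn the column into the index. The sketch below rebuilds a stand-in for `df1` by hand from the printed output (assuming the numeric headers are read as integers) so that it runs without the Excel file.

```python
import pandas as pd

# Hypothetical stand-in for df1, rebuilt by hand from the printed output.
df1_like = pd.DataFrame(
    {
        "Unnamed: 0": [0, 1, 2, 3, 4],
        0: [0, 0, 0, 0, 0],
        1: [0, 1, 2, 3, 4],
        2: [0, 2, 4, 6, 8],
        3: [0, 3, 6, 9, 12],
        4: [0, 4, 8, 12, 16],
    }
)

# Use the unnamed first column as the index and drop its placeholder name.
table = df1_like.set_index("Unnamed: 0").rename_axis(None)

# Look up row label 3, column label 4.
print(table.loc[3, 4])  # 12
```

The same result should be obtainable at read time with `pd.read_excel("sample_1.xlsx", index_col=0)`.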


### sample_2.xlsx
![](sample_2.png)

In [4]:
df2 = pd.read_excel("sample_2.xlsx")
print(df2)

   x1      y1  Unnamed: 2  x2      y2
0   0   0.125         NaN   0   0.125
1   1   2.125         NaN   1   5.125
2   2   4.125         NaN   2  10.125
3   3   6.125         NaN   3  15.125
4   4   8.125         NaN   4  20.125
5   5  10.125         NaN   5  25.125
6   6  12.125         NaN   6  30.125
7   7  14.125         NaN   7  35.125
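
Here the empty spreadsheet column between the two tables becomes `Unnamed: 2`, filled with `NaN`. A minimal sketch of removing it and splitting the two tables follows; it uses a shortened, hand-built stand-in for `df2` (first three rows only) so that it runs without the Excel file.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the first three rows of df2.
df2_like = pd.DataFrame(
    {
        "x1": [0, 1, 2],
        "y1": [0.125, 2.125, 4.125],
        "Unnamed: 2": [np.nan, np.nan, np.nan],
        "x2": [0, 1, 2],
        "y2": [0.125, 5.125, 10.125],
    }
)

# Drop every column that contains only NaN values.
cleaned = df2_like.dropna(axis="columns", how="all")

# Split the remaining columns into the two original tables.
left = cleaned[["x1", "y1"]]
right = cleaned[["x2", "y2"]]

print(list(cleaned.columns))  # ['x1', 'y1', 'x2', 'y2']
```

Alternatively, the `usecols` argument of `pd.read_excel` accepts Excel column letters, so something like `usecols="A:B,D:E"` could skip the empty column while reading, assuming the tables occupy those columns.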


### sample_3.xlsx
![](sample_3.png)

In [5]:
df3 = pd.read_excel("sample_3.xlsx")
print(df3)

    Unnamed: 0 Unnamed: 1 Unnamed: 2  Unnamed: 3  Unnamed: 4 Unnamed: 5  \
0          NaN          0          1         2.0         3.0          4   
1          NaN          0          1         2.0         3.0          4   
2          NaN        NaN        NaN         NaN         NaN        NaN   
3          NaN        NaN        NaN         NaN         NaN        NaN   
4          NaN        NaN        NaN         NaN         NaN          x   
5          NaN        NaN        NaN         NaN         NaN          1   
6          NaN        NaN        NaN         NaN         NaN          3   
7          NaN        NaN        NaN         NaN         NaN          5   
8          NaN        NaN        NaN         NaN         NaN        NaN   
9          NaN          a          b         NaN         NaN        NaN   
10         NaN          1          2         NaN         NaN        NaN   
11         NaN          3          4         NaN         NaN        NaN   
12         NaN          5

+ Notice that `NaN` indicates an empty cell.
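
When a sheet holds several small tables separated by empty cells, one option is to read everything and then slice out each table. The sketch below is only an illustration: it rebuilds the first nine rows of the `Unnamed: 5` column by hand from the printed output, and the row positions used for slicing are read off that output rather than detected automatically.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for rows 0 to 8 of the "Unnamed: 5" column of df3.
raw = pd.DataFrame({"Unnamed: 5": [4, 4, np.nan, np.nan, "x", 1, 3, 5, np.nan]})

# isna() marks the empty cells with True.
empty_mask = raw["Unnamed: 5"].isna()
print(int(empty_mask.sum()))  # 3

# Slice the rows that hold the small table (positions 4 to 7 here).
block = raw.iloc[4:8, 0]

# Promote the first cell to a column name and keep the rest as data.
table = pd.DataFrame({block.iloc[0]: block.iloc[1:].to_list()})
print(table)
```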