In [1]:
!pip install xlsxwriter



# Problem: Pandas overwrites formulas and formatting

First, open 'week_05_in-class_activity.xlsx' in Excel and look at its contents: the fourth column ('gnarly') is computed by a formula.

In [2]:
import xlsxwriter
import pandas
print('Pandas',pandas.__version__)

Pandas 0.23.4


We can read the file with Pandas, which gives us the cell values (the results of the formulas, not the formulas themselves)

In [3]:
df=pandas.read_excel('week_05_in-class_activity.xlsx')

In [4]:
df.head()

Unnamed: 0,cool,awesome,rad,gnarly
0,9,15,2,12.0
1,7,50,2,28.5
2,10,35,1,22.5
3,3,48,1,25.5
4,10,11,1,10.5


We can manipulate the data in Pandas

In [5]:
df['fantastic']=df['cool']+2*df['rad']

In [6]:
df.head()

Unnamed: 0,cool,awesome,rad,gnarly,fantastic
0,9,15,2,12.0,13
1,7,50,2,28.5,11
2,10,35,1,22.5,12
3,3,48,1,25.5,5
4,10,11,1,10.5,12


After making our changes, write the updated dataframe back to Excel

In [7]:
df.to_excel('tmp_overwrite.xlsx', engine='xlsxwriter')

Problem: the new Excel file contains only values; the formulas and plots from the original file are gone

Happily, there's an alternative (more Excel-centric) approach: openpyxl


_References_
* Page 268 of Automate the Boring Stuff with Python
* https://medium.com/aubergine-solutions/working-with-excel-sheets-in-python-using-openpyxl-4f9fd32de87f
* https://yagisanatode.com/2017/11/18/copy-and-paste-ranges-in-excel-with-openpyxl-and-python-3/
* http://zetcode.com/python/openpyxl/
* https://openpyxl.readthedocs.io/en/stable/tutorial.html
* https://openpyxl.readthedocs.io/en/stable/usage.html
* https://stackoverflow.com/questions/40385689/add-a-new-sheet-to-a-existing-workbook-in-python
* https://www.pythonexcel.com/openpyxl-load-workbook-function.php

# install library

Make the library available on your computer

In [8]:
!pip install openpyxl



# load library

Make the library available to the Python kernel running in this Jupyter notebook

In [9]:
import openpyxl

# load data

In [10]:
wb = openpyxl.load_workbook('week_05_in-class_activity.xlsx')

# an Excel file is composed of one or more sheets
wb.sheetnames

['Sheet1']
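To see why this matters for formulas, here is a minimal sketch that builds a small stand-in workbook (hypothetical data and file name, not the class file) and loads it two ways. By default `load_workbook` returns a formula cell's formula text; with `data_only=True` it returns the value Excel last cached for that cell instead.

```python
import os
import tempfile

import openpyxl

# Build a tiny stand-in workbook; the cell contents are made up for illustration.
wb_formula = openpyxl.Workbook()
ws_formula = wb_formula.active
ws_formula["A1"] = 2
ws_formula["B1"] = 3
ws_formula["C1"] = "=A1+B1"  # stored as a formula string

formula_path = os.path.join(tempfile.mkdtemp(), "formula_demo.xlsx")
wb_formula.save(formula_path)

# Default load: the formula comes back as the text that was entered.
as_formula = openpyxl.load_workbook(formula_path).active["C1"].value

# data_only=True: return the cached result instead of the formula.
# openpyxl does not evaluate formulas, so a file that Excel has never
# opened and saved has no cached result to return.
as_value = openpyxl.load_workbook(formula_path, data_only=True).active["C1"].value

print(as_formula)
print(as_value)
```

For a file that was last saved by Excel itself, such as the class spreadsheet, `data_only=True` should return the calculated numbers instead.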

In [11]:
# select the one available sheet; assign to a variable
sheet = wb['Sheet1']

type(sheet)

openpyxl.worksheet.worksheet.Worksheet

# explore the data

In [12]:
# what is at the first row and first column in the worksheet?

sheet.cell(row=1, column=1)

<Cell 'Sheet1'.A1>

This approach addresses a cell by integer row and column numbers; both start at 1, not 0

In [13]:
sheet.cell(row=1, column=1).value

'cool'

Because locations are integers, we can loop over them with range() to read a run of cells

In [14]:
for row_indx in range(1,5):
    print(row_indx, sheet.cell(row=row_indx, column=1).value)

1 cool
2 9
3 7
4 10
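The loop above hard-codes the last row. As a sketch of one alternative, the worksheet's `max_row` attribute can supply the upper bound, and `iter_rows` can walk the cells directly. The workbook below is a stand-in built in memory whose values only mimic the first column of the class file.

```python
import openpyxl

# Stand-in worksheet: a header plus five values in column A.
wb_rows = openpyxl.Workbook()
ws_rows = wb_rows.active
for value in ["cool", 9, 7, 10, 3, 10]:
    ws_rows.append([value])  # each append() writes the next row

# Integer access, bounded by the sheet's own last row.
first_column = [
    ws_rows.cell(row=row_indx, column=1).value
    for row_indx in range(1, ws_rows.max_row + 1)
]

# iter_rows yields one tuple of cells per row; skip the header with min_row=2.
data_only_rows = [row[0].value for row in ws_rows.iter_rows(min_row=2, max_col=1)]

print(first_column)
print(data_only_rows)
```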


# create a new worksheet

In [15]:
# https://openpyxl.readthedocs.io/en/stable/tutorial.html
new_ws = wb.create_sheet("example") 

In [16]:
print(wb.sheetnames)

['Sheet1', 'example']


In [17]:
# https://stackoverflow.com/questions/31395058/how-to-write-to-a-new-cell-in-python-using-openpyxl
new_ws.cell(row=2, column=2).value = 'a string'

The following is an entirely distinct way of accessing cells

In [29]:
new_ws["A1"] = "=SUM(1, 1)"
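The two addressing styles refer to the same cells. As a small sketch, the helpers in `openpyxl.utils` convert between column letters and column numbers, so a name such as "B2" can be built from integers. The worksheet here is a fresh in-memory one, not the notebook's `new_ws`.

```python
import openpyxl
from openpyxl.utils import column_index_from_string, get_column_letter

wb_addr = openpyxl.Workbook()
ws_addr = wb_addr.active

# Write with integer coordinates...
ws_addr.cell(row=2, column=2).value = "a string"

# ...and read the same cell back by name.
cell_name = get_column_letter(2) + str(2)  # column 2 is "B", so this is "B2"
value_by_name = ws_addr[cell_name].value

# The reverse conversion: column letter to 1-based column number.
column_number = column_index_from_string("D")

print(cell_name, value_by_name, column_number)
```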

# play with ranges of cells

Slicing by cell name returns a tuple of rows, where each row is itself a tuple of cells (not a list)

In [19]:
# https://openpyxl.readthedocs.io/en/stable/_modules/openpyxl/worksheet/cell_range.html
new_ws['A1':'A5']

((<Cell 'example'.A1>,),
 (<Cell 'example'.A2>,),
 (<Cell 'example'.A3>,),
 (<Cell 'example'.A4>,),
 (<Cell 'example'.A5>,))

In [20]:
type(new_ws['A1':'A5'])

tuple

A tuple is a collection which is ordered and immutable -- https://www.w3schools.com/python/python_tuples.asp

https://www.geeksforgeeks.org/tuples-in-python/

In [21]:
new_ws['A1':'A5'][0]

(<Cell 'example'.A1>,)

In [22]:
type(new_ws['A1':'A5'][0])

tuple

In [23]:
new_ws['A1':'A5'][0][0]

<Cell 'example'.A1>

In [24]:
type(new_ws['A1':'A5'][0][0])

openpyxl.cell.cell.Cell

The following returns a tuple of tuples and then we access the zeroth entry of the zeroth tuple

In [25]:
new_ws['A1':'A5'][0][0].value

'=SUM(1, 1)'

As with the integer-based access method, we can loop over the tuples

In [26]:
# each item is a one-cell row tuple, hence the [0]; this also overwrites the formula in A1
for each_cell in new_ws['A1':'A5']:
    each_cell[0].value='bob'
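The same pattern extends to ranges wider than one column. In this sketch a nested loop visits every cell of a 3-row by 2-column range, and tuple unpacking replaces the `[0]` index for a single-column range. The worksheet is a fresh in-memory stand-in, and each cell is filled with its own coordinate purely so the result is easy to inspect.

```python
import openpyxl

wb_range = openpyxl.Workbook()
ws_range = wb_range.active

# Outer loop: rows of the range. Inner loop: cells within that row.
for row in ws_range["A1":"B3"]:
    for cell in row:
        cell.value = cell.coordinate  # e.g. "A1", "B1", ...

# For a one-column range each row holds exactly one cell,
# so "(cell,)" unpacks it directly.
column_a = [cell.value for (cell,) in ws_range["A1":"A3"]]

# Each row of the two-column range is a tuple of two cells.
row_lengths = [len(row) for row in ws_range["A1":"B3"]]

print(column_a)
print(row_lengths)
```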

# save our work to a new file

In [27]:
wb.save('new_file.xlsx')
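Returning to the original problem, the sketch below repeats the Pandas exercise (adding a 'fantastic' column equal to cool + 2*rad) with openpyxl, then checks that an existing formula survives the save. The workbook is a stand-in created on the fly: the file names are hypothetical, and the `=AVERAGE(...)` formula is only an assumption about what the 'gnarly' column might contain.

```python
import os
import tempfile

import openpyxl

workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "source.xlsx")            # hypothetical name
out_path = os.path.join(workdir, "with_new_column.xlsx")   # hypothetical name

# Stand-in for the class file: three data columns plus one formula column.
wb_src = openpyxl.Workbook()
ws_src = wb_src.active
ws_src.append(["cool", "awesome", "rad", "gnarly"])
ws_src.append([9, 15, 2, "=AVERAGE(A2:B2)"])  # assumed formula, for illustration
ws_src.append([7, 50, 2, "=AVERAGE(A3:B3)"])
wb_src.save(src_path)

# Load, add the new column next to the existing ones, and save under a new name.
wb_rt = openpyxl.load_workbook(src_path)
ws_rt = wb_rt.active
new_col = ws_rt.max_column + 1
ws_rt.cell(row=1, column=new_col).value = "fantastic"
for row_indx in range(2, ws_rt.max_row + 1):
    cool = ws_rt.cell(row=row_indx, column=1).value
    rad = ws_rt.cell(row=row_indx, column=3).value
    ws_rt.cell(row=row_indx, column=new_col).value = cool + 2 * rad
wb_rt.save(out_path)

# Reload the output: the formula text is still there, next to the new values.
ws_check = openpyxl.load_workbook(out_path).active
print(ws_check["D2"].value, ws_check["E2"].value, ws_check["E3"].value)
```

One caveat: the openpyxl documentation notes that it does not read every item in an Excel file, so charts and images in an existing workbook can still be lost when it is opened and saved. Saving under a new file name, as above, keeps the original intact.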