Embed charts from xlsx files #65

Closed
wants to merge 8 commits into from

@vfig vfig commented Dec 9, 2013

This code adds the ability to embed a chart from an Excel spreadsheet into a presentation. The first chart in the spreadsheet is used, and the spreadsheet itself is also embedded so that the chart remains fully editable.

Example usage:

# Create a presentation with one slide
from pptx import Presentation
from pptx.util import Inches
prs = Presentation()
slide = prs.slides.add_slide(prs.slidelayouts[5])

# Add the chart to a slide (can use a file-like object or a filename)
slide.shapes.add_chart_from_spreadsheet("chart.xlsx",
    Inches(0.5), Inches(1.75), Inches(9), Inches(5))

# Save the presentation
prs.save("chart.pptx")

Multiple charts may be added to a presentation, but each must be embedded independently. This is due to limitations (AFAICT imposed by PowerPoint and not by ISO/IEC 29500) that prevent multiple chart shapes from referencing a single embedded chart, or multiple embedded charts from referencing a single embedded spreadsheet. The latter is possible with linked spreadsheets, but is not implemented in this code.

@scanny
Owner

scanny commented Dec 9, 2013

Hi Andy, this looks very interesting, thanks for reaching out :).

Couple things to start off:

First, for various reasons, chief among them a fairly extensive retirement of learning-curve-related technical debt over the past few months, the develop branch is ahead of master by 430-odd commits. So there would be some serious merging to do to incorporate this.

Before we think about what that might look like, can you explain for me what the particular user scenario is for this? Like maybe: "I have a bunch of charts I've developed in Excel and I want to add one or more of them into a presentation." or that sort of thing. And an idea of how it would be done from the PowerPoint UI would be very helpful as well, just so I see what procedure is being automated.

I'm definitely interested in the package having a general-purpose capability to add charts. It's actually crept up the backlog to where I was thinking to take it on in the first half of next year. I don't think this is that functionality exactly, but I'd like to understand better how it fits in with that. One of the challenges inherent in a start-from-scratch capability is having to build an Excel spreadsheet to embed to hold the data, and it looks like your work involves a way of doing that from an externally-built worksheet, so that could be an interesting incremental step that could be built upon for additional features.

If you'll let me know a bit more about what it does and how you've used it we can take things from there :)

@vfig
Author

vfig commented Dec 9, 2013

The scenario for which I wrote this was quite simple: we needed to be able
to automate the generation of presentations that included charts. It was
desirable, but not essential, that the charts would remain customisable and
editable.

The actual approach was determined from the specs and experimentation with
PowerPoint. From what I could see, we had only a few options:

  • Embed an image of the chart
  • Embed a vector version of the chart (SVG or metafile, presumably)
  • Embed a DrawingML chart
  • Embed an Excel chart as an OLE object.

The first two obviously fail at being editable, so I looked into the latter
two as preferable. Using an OLE object provided a generally worse
experience for users editing the spreadsheet (more cumbersome to customise,
and slower to render), and DrawingML looked feasible. Ultimately the
deciding factor was this:

“For WordprocessingML and PresentationML documents, the data for a chart is
not stored in the Chart part directly. Instead, it shall be stored in an
embedded SpreadsheetML package targeted by an Embedded Package part
specified by that Chart part.” — ISO/IEC 29500-1:2011(E), §14.2.1 p131

If we wanted to use a DrawingML chart, it had to embed a spreadsheet.
Getting python-pptx to do this would be a huge addition: not just APIs for
defining a chart, but also everything to be able to create a simple
spreadsheet.

Now another part of our project required us to also generate Excel
spreadsheets containing charts, and we’d already used XlsxWriter
(http://xlsxwriter.readthedocs.org) to do that. Given we had this
functionality already available, it made more sense for our purposes to use
that library to generate the charts.

In summary, what add_chart_from_spreadsheet does is:

  • Open the spreadsheet package, enumerate the charts, and copy the xml
    for the first one.
  • Add the spreadsheet as an embedded package part (in ppt/embeddings)
  • Add a chart part for the chart and a rel to the embedded
    spreadsheet—the chart part’s contents are the copied xml with the addition
    of an element with the relId.
  • Add a rel from the slide to the chart part.
  • Add a chart shape with that relId
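
The first and third steps above can be sketched with the standard library alone. This is a minimal illustration and not the code from this pull request: the function names are hypothetical, "first chart" is taken to mean the lowest-numbered `xl/charts/chartN.xml` part (the patch may instead follow relationships), and the added element is assumed to be `c:externalData`, placed before any `c:printSettings`, `c:userShapes` or `c:extLst` child as the chart schema orders them.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

C_NS = "http://schemas.openxmlformats.org/drawingml/2006/chart"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
ET.register_namespace("c", C_NS)
ET.register_namespace("r", R_NS)

# chartSpace children that the schema orders after externalData
_AFTER_EXTERNAL_DATA = {
    "{%s}%s" % (C_NS, name) for name in ("printSettings", "userShapes", "extLst")
}


def first_chart_xml(xlsx_file):
    """Return the XML bytes of the lowest-numbered chart part in an xlsx.

    Assumption: chart parts use the conventional xl/charts/chartN.xml names.
    """
    with zipfile.ZipFile(xlsx_file) as pkg:
        charts = []
        for name in pkg.namelist():
            match = re.fullmatch(r"xl/charts/chart(\d+)\.xml", name)
            if match:
                charts.append((int(match.group(1)), name))
        if not charts:
            raise ValueError("no chart parts found in spreadsheet")
        return pkg.read(min(charts)[1])


def with_external_data(chart_xml, rel_id):
    """Return chart XML with a c:externalData element carrying rel_id.

    rel_id is the id of the relationship from the chart part to the
    embedded spreadsheet part.
    """
    root = ET.fromstring(chart_xml)
    ext = ET.Element("{%s}externalData" % C_NS, {"{%s}id" % R_NS: rel_id})
    children = list(root)
    # Insert before the first child that must follow externalData, else append.
    index = next(
        (i for i, child in enumerate(children) if child.tag in _AFTER_EXTERNAL_DATA),
        len(children),
    )
    root.insert(index, ext)
    return ET.tostring(root, encoding="utf-8")
```

Usage would be along the lines of `with_external_data(first_chart_xml("chart.xlsx"), "rId1")`. Note that ElementTree rewrites namespace prefixes it has not been told about, so a real implementation would likely use a parser that preserves them.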

This is the default format that PowerPoint uses. When you “Insert Chart”,
you get this same embedded spreadsheet setup. Similarly, if you copy and
paste a chart from Excel to PowerPoint, it’s the same except the
spreadsheet is linked and not embedded.
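
One way to see the embedded versus linked difference is to read the relationships of each chart part in a saved .pptx. The sketch below is an illustration under assumptions: `chart_relationships` is a hypothetical helper, and it relies on the conventional `ppt/charts/_rels/` location for chart relationship parts. A relationship with `TargetMode="External"` points outside the package (linked); otherwise the target is a part inside it, such as a workbook under `ppt/embeddings`.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

PKG_RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def chart_relationships(pptx_file):
    """Map each chart part name to a list of (target, is_external) tuples.

    Assumption: chart parts live in ppt/charts/, so their relationship
    parts are found in ppt/charts/_rels/.
    """
    result = {}
    with zipfile.ZipFile(pptx_file) as pkg:
        for name in pkg.namelist():
            head, tail = posixpath.split(name)
            if head != "ppt/charts/_rels" or not tail.endswith(".xml.rels"):
                continue
            # "chart1.xml.rels" describes the part "ppt/charts/chart1.xml"
            chart_part = "ppt/charts/" + tail[: -len(".rels")]
            root = ET.fromstring(pkg.read(name))
            result[chart_part] = [
                (rel.get("Target"), rel.get("TargetMode") == "External")
                for rel in root.iter("{%s}Relationship" % PKG_RELS_NS)
            ]
    return result
```

Calling `chart_relationships("chart.pptx")` on a file produced by “Insert Chart” should show an internal target for the workbook; this lists every relationship of the chart part, not only the data source.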

I’ve been vaguely aware of all the restructuring you’d been doing, but not
following it. We wanted a stable version for our project, so were using
0.2.6. I appreciate this means our patch won’t merge at all! But the diffs
are not very extensive—I don’t imagine porting them over would be too hard.

Andy.

@scanny
Owner

scanny commented Dec 10, 2013

Thanks Andy, that makes sense of it for me :)

I'll have a closer look over the weekend and see how to fit things together.

@tooh

tooh commented Feb 2, 2014

I'm very interested in this feature. Is there any update?

@scanny
Owner

scanny commented Feb 4, 2014

@tooh Hi Peter, no plans to work on this in the near future. It's a bit of a big nut and there are a lot of other items on the backlog. I kind of vaguely expect to get around to charts in about a year given the current backlog and velocity. Sorry I didn't have better news :)

@scanny scanny added this to the later milestone Mar 30, 2014
@scanny scanny added the chart label Jun 15, 2014
@scanny
Owner

scanny commented Sep 14, 2014

Charting capability was added in v0.5.0, pushed today. All the basics are there and it's quite functional. The relevant docs pages are here.

Special thanks to you Andy @adurdin. I used your code for the initial spike and, with it, was able to get something working in a single working session. There were about 100 commits between then and now to get all the details ironed out, but it was a big confidence builder at the critical moment as I was getting started to be able to see a chart show up in PowerPoint so soon after I began :)

@scanny scanny closed this Sep 14, 2014