python-pptx output cannot be opened in Keynote (resubmit of #66) #80

Closed
tvanyo opened this Issue Jan 29, 2014 · 21 comments

3 participants

tvanyo commented Jan 29, 2014

I saw issue 66 and thought I would contribute some debugging files.

I created a pptx with python-pptx and confirmed it wouldn't open in Keynote. I then saved it under another name in PowerPoint, and this new file could be opened in Keynote. I installed opc-diag and generated the output.

github n00b, so now I need to figure out how to upload the files.

dskrad commented Feb 1, 2014

I noticed that the MIME type of pptx files generated by the module is "application/octet-stream" while files made by MS PowerPoint have the MIME type "application/vnd.ms-powerpoint". Perhaps this bug can be fixed.

Owner

scanny commented Feb 2, 2014

Where are you finding the MIME type? As far as I know the file does not contain a MIME-type of its own. In general, this sort of thing is applied externally as far as I've ever known.

dskrad commented Feb 2, 2014

file -i test.pptx

I ran into trouble when I was trying to serve the file over the web via Apache. The web server was not sending the file for some reason, or the browser was rejecting it. So I checked my /etc/mime.types and all the docx, pptx, etc. formats were declared. Then I ran the above command and discovered that the MIME type was octet-stream, which is not the same as the MIME type of files that MS PowerPoint outputs. I thought this was tangentially related to this issue of Keynote not being able to open the file until it was opened and saved by PowerPoint under a new file name.

dskrad commented Feb 2, 2014

Oh... And the file with correct mime type was successfully served over Apache web server.

Owner

scanny commented Feb 2, 2014

I'm pretty sure the MIME type for .pptx files is application/vnd.openxmlformats-officedocument.presentationml.presentation. The one you mention above is appropriate for the older .ppt files.

Does your Apache instance have a MIME mapping explicitly for the .pptx extension?

On the file call, I think you might mean file -I test.pptx, equivalent to file --mime test.pptx. On my machine (Mac) this returns application/zip; charset=binary for all Microsoft Office files.
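The application/zip result is consistent with how the format is built: a .pptx file is a ZIP container, so a tool that inspects only the leading bytes sees a ZIP signature and nothing that is specific to PowerPoint. A minimal sketch of that check follows. The in-memory package is a stand-in with placeholder part bodies (an assumption for illustration, not a valid presentation), and the helper names are hypothetical.

```python
import io
import zipfile


def build_stand_in_package() -> bytes:
    """Build a tiny ZIP in memory that mimics the outer layout of a .pptx."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # Placeholder part bodies; a real package holds full XML documents.
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("ppt/presentation.xml", "<presentation/>")
    return buf.getvalue()


def looks_like_zip(data: bytes) -> bool:
    """Return True when the data starts with the ZIP local-file-header signature."""
    return data[:4] == b"PK\x03\x04"


package = build_stand_in_package()
print(looks_like_zip(package))
print(zipfile.ZipFile(io.BytesIO(package)).namelist())
```

Any MIME type more specific than this has to come from outside the file, for example from an extension mapping on the web server.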

dskrad commented Feb 2, 2014

I'm not sure about the difference between older and newer Office documents with regard to the MIME type.

My Apache instance does have entries for the newest office document versions (exactly as you stated above).

You are correct about
file -I
This is the Mac implementation of file

But on Linux, Ubuntu and Raspbian, it is
file -i

I did a bit of research about MIME types (aka Content type) and "octet-stream" is the default catch-all type assigned when none is explicitly set. I didn't find any way to explicitly set the type so I'm not sure this is a bug. But maybe some python or office document guru may have the answer.

Thanks for the awesome module!! It is really powerful and useful.

Owner

scanny commented Feb 2, 2014

Are you serving the file directly from a stream, like StringIO or bytes[]? When you do that sort of thing you have to set the Content-Type header in the response to the proper MIME type yourself. At least I always have.
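As a rough illustration of setting the header by hand, here is a minimal WSGI sketch that serves presentation bytes held in memory. The function name make_pptx_app and the default filename are hypothetical; the stream is assumed to already contain a complete .pptx (with python-pptx that would presumably come from saving the presentation to a file-like object, which this sketch does not do itself).

```python
import io

PPTX_MIME = (
    "application/vnd.openxmlformats-officedocument"
    ".presentationml.presentation"
)


def make_pptx_app(stream: io.BytesIO, filename: str = "deck.pptx"):
    """Return a WSGI app that serves the bytes held in `stream`."""
    body = stream.getvalue()

    def app(environ, start_response):
        headers = [
            # Nothing infers the type for an in-memory stream, so set it here.
            ("Content-Type", PPTX_MIME),
            ("Content-Length", str(len(body))),
            ("Content-Disposition", f'attachment; filename="{filename}"'),
        ]
        start_response("200 OK", headers)
        return [body]

    return app
```

For a quick local trial, the returned app could be passed to wsgiref.simple_server.make_server from the standard library.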

If it's a file on the hard drive with a .pptx extension and Apache isn't applying the registered MIME mapping I'm afraid I'm out of ideas :)

Glad you find the module useful :)

dskrad commented Feb 4, 2014

So far, as it stands, the Apache server is applying the appropriate "application/vnd.openxmlformats-officedocument.presentationml.presentation" mime type. I can download the pptx with a desktop browser (Chrome) and I am getting the right response header (mime type) in the developer console. The only issue is that when I try to open the file with mobile Safari (iOS 7) the pptx never actually downloads or opens.

Thanks for all your help. I will do more extensive testing and let you know what I find. I would ultimately like to use your module in a script which will generate a pptx file and then serve it up over http (either Apache or directly -- I use woof.py for convenience when going from my Ubuntu desktop or raspberry pi (raspbian) to my Mac). I am not sure what the issue is.

Owner

scanny commented Feb 4, 2014

I think the key question will be whether it behaves differently with a .pptx file saved by PowerPoint itself than with one generated by python-pptx.

If there is no difference, the problem is elsewhere, perhaps on the web server.

If they do behave differently, you can use opc-diag to compare two versions of the same file, the one generated directly by python-pptx and the same file loaded and saved by PowerPoint and that should identify the difference. If there is a difference we can bring them to be consistent.
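The idea behind such a comparison can be sketched with the standard library alone. This is not how opc-diag is implemented; it is a simplified, hypothetical stand-in that compares two packages part by part at the byte level, so it will also flag differences that are only XML formatting.

```python
import io
import zipfile


def make_package(parts: dict) -> bytes:
    """Zip a {part name: text} mapping into an in-memory package."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, text in parts.items():
            zf.writestr(name, text)
    return buf.getvalue()


def read_parts(package: bytes) -> dict:
    """Map each part name in the package to its raw bytes."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}


def diff_packages(left: bytes, right: bytes) -> dict:
    """Report parts present on one side only and parts whose bytes differ."""
    a, b = read_parts(left), read_parts(right)
    return {
        "only_left": sorted(a.keys() - b.keys()),
        "only_right": sorted(b.keys() - a.keys()),
        "changed": sorted(n for n in a.keys() & b.keys() if a[n] != b[n]),
    }
```

For real files, the two byte strings would be read from the python-pptx output and from the copy re-saved by PowerPoint; the names listed under "changed" are the parts worth inspecting by hand.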

Owner

scanny commented Feb 6, 2014

Hi David, I actually have a pretty good idea what the diagnosis is for the original problem, not opening in Keynote. I expect Apple uses the same code to open it in iOS, so I expect the same update will fix both problems. It sounds like the MIME-type problem is put to bed, so I'll limit the scope of this one to the original problem.

It might be a little while before I get to this one, so if you want to dig in yourself for a temporary fix let me know and I'll share the gory details :)

dskrad commented Feb 6, 2014

I would be willing to try and help. Let me know what you think may be the root of the bug. Thanks.

Owner

scanny commented Feb 7, 2014

@dskrad Okay, I was digging into this to identify where to point you and actually identified the problem in the process. Two new lines of code do the trick to fix it on my side. For whatever reason, Keynote is insisting on the presence of a <p:clrMapOvr> element that is optional according to the spec. Whatever :) I'm happy to oblige in the spirit of platform inclusiveness. Besides, Apple has been pretty good to me over the years :)

Are you able to test it if I push a fix to GitHub? I'd prefer not to push a new PyPI release until we can verify the fix.

dskrad commented Feb 7, 2014

I am willing to test the new version. I would first have to pip uninstall python-pptx, then pull from the repository and run setup.py? If I am wrong, let me know. Sorry, but I am still a novice in some ways regarding Python.

dskrad commented Feb 7, 2014

Wow. I just added those tags to my "broken" file and ran

opc repackage broken clr_fix.pptx

and all is well.

Keynote will open and import, I can open with Preview on my Mac, and I can download and preview on my iOS device. Pretty wild. I guess the mime type issue was a red herring.

Thanks! Waiting for the commit.

dskrad commented Feb 7, 2014

In the name of simplifying, I found that just adding the self-closing tag <p:clrMapOvr/> fixed it as well. No need to actually define anything between two tags.

Owner

scanny commented Feb 7, 2014

Ah, great, that's enough of a confirmation test for me, I'll push a new release this afternoon :)

dskrad commented Feb 7, 2014

Okay, I have actually spent too much time today going over this. I would not recommend using the single self-closing tag, as the specification does not allow it to be empty.

In order for iOS and Keynote to preview or import the pptx, the following must precede the closing </p:sld> tag:

<p:clrMapOvr>
  <a:masterClrMapping/>
</p:clrMapOvr>

Unfortunately, I found this makes the file unreadable for PowerPoint for Mac 2008. The fix was not intuitive. In slide1.xml, the <p:sp xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main"> tags must be changed to <p:sp>

and the xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" must be moved into the opening <p:sld> tag so it reads as follows:

<p:sld xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" 
xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main">

I found that the file is then readable by both MS-PPT and Keynote and iOS. Hopefully this doesn't require too much coding for you. Thanks!!
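The manual edit described above can be approximated in a few lines with the standard library. This is only a sketch and not the change that went into python-pptx: the function name ensure_clr_map_ovr is hypothetical, and the sample slide is a stripped-down stand-in. The new element is placed directly after <p:cSld>, which for a minimal slide is also just before the closing </p:sld> tag.

```python
import xml.etree.ElementTree as ET

NS_A = "http://schemas.openxmlformats.org/drawingml/2006/main"
NS_P = "http://schemas.openxmlformats.org/presentationml/2006/main"

# Keep the conventional a:/p: prefixes when serializing.
ET.register_namespace("a", NS_A)
ET.register_namespace("p", NS_P)


def ensure_clr_map_ovr(slide_xml: str) -> str:
    """Add <p:clrMapOvr><a:masterClrMapping/></p:clrMapOvr> if it is missing."""
    root = ET.fromstring(slide_xml)
    if root.find(f"{{{NS_P}}}clrMapOvr") is None:
        ovr = ET.Element(f"{{{NS_P}}}clrMapOvr")
        ET.SubElement(ovr, f"{{{NS_A}}}masterClrMapping")
        children = list(root)
        c_sld = root.find(f"{{{NS_P}}}cSld")
        # Insert right after <p:cSld>; fall back to the end if it is absent.
        position = children.index(c_sld) + 1 if c_sld is not None else len(children)
        root.insert(position, ovr)
    return ET.tostring(root, encoding="unicode")


# Stripped-down stand-in for slide1.xml (assumption for illustration).
SLIDE = (
    f'<p:sld xmlns:p="{NS_P}">'
    "<p:cSld><p:spTree/></p:cSld>"
    "</p:sld>"
)
print(ensure_clr_map_ovr(SLIDE))
```

A side effect worth noting: ElementTree writes all namespace declarations on the root element when serializing, so xmlns:a ends up on <p:sld> rather than on individual <p:sp> elements, which matches the placement described above.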

Owner

scanny commented Feb 7, 2014

Actually I cleaned that up too while I was in there, so we should be good :)

Owner

scanny commented Feb 7, 2014

Ok, I pushed a new release, v0.3.2, want to update and give it a try?

$ pip install -U --pre python-pptx

should do the trick.

dskrad commented Feb 7, 2014

Bravo!

  • Opens in PowerPoint (Mac 2008)
  • Opens in Keynote (Mac and iOS)
  • Opens in iOS (Safari)
  • Served up by woof.py web server

Think we can call it closed. 👍

Owner

scanny commented Feb 7, 2014

Great, thanks David :)

@scanny scanny closed this Feb 7, 2014

scanny pushed a commit that referenced this issue Feb 11, 2014

@Progi1984 Progi1984 referenced this issue in PHPOffice/PHPPresentation Oct 3, 2014

Closed

Make a PPTX compatible with Mac Keynote #46
