Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AddEmbeddedPackagePart is saved to wrong path #1318

Closed
Asbjoedt opened this issue Jan 24, 2023 · 9 comments
Closed

AddEmbeddedPackagePart is saved to wrong path #1318

Asbjoedt opened this issue Jan 24, 2023 · 9 comments

Comments

@Asbjoedt
Copy link
Contributor

Asbjoedt commented Jan 24, 2023

Describe the bug

ISSUE 1

You can iterate all EmbeddedPackageParts by doing the below. It iterates WorksheetParts and then each EmbeddedPackagePart.

using (SpreadsheetDocument spreadsheet = SpreadsheetDocument.Open(filepath, true))
{
    IEnumerable<WorksheetPart> worksheetParts = spreadsheet.WorkbookPart.WorksheetParts;
    foreach (WorksheetPart worksheetPart in worksheetParts)
    {
        IEnumerable<EmbeddedPackagePart> embedpackages = worksheetPart.EmbeddedPackageParts;
        foreach (EmbeddedPackagePart part in embedpackages)
        {

        }
    }
}

You can delete any EmbeddedPackagePart by deleting it at the WorksheetPart level like this.

worksheetPart.DeletePart(part);

However, if you want to AddEmbeddedPackagePart you can do so also at the WorksheetPart level:

EmbeddedPackagePart embedpack = worksheetPart.AddEmbeddedPackagePart(contentType);

But the new EmbeddedPackagePart will be saved to this URI where a new embeddings folder will be created:

xl/worksheets/embeddings/package.bin

If you use Excel, all EmbeddedPackageParts are saved in this folder instead:

xl/embeddings/package.bin

Is this correct behaviour? Should Open XML SDK also save the part to xl/embeddings?

ISSUE 2:

Excel saves the EmbeddedPackagePart with original extension i.e. .xlsx.

Open XML SDK saves EmbeddedPackagePart with .bin extension.

Observed behavior

  1. EmbeddedPackagePart is saved to xl/worksheets/embeddings

  2. EmbeddedPackagePart is saved with .bin extension.

Expected behavior

  1. EmbeddedPackagePart should be saved to xl/embeddings

  2. EmbeddedPackagePart should be saved with the original extension of the package i.e. .xlsx.

Desktop (please complete the following information):

  • OS: Windows 11
  • Office version: 2212
  • .NET Target: .Net latest version
  • DocumentFormat.OpenXml Version: 2.19
@Asbjoedt
Copy link
Contributor Author

Asbjoedt commented Jan 25, 2023

Hi @twsouthwick

I made a mistake trying to PR to v2.20 so I pulled it again. I think you may want my PR for v3 or another version than 2.20.

You can look at the code here 9f7ed97
Let me know, if you want it and I can PR in the branch you want it.

@twsouthwick
Copy link
Member

Hi @Asbjoedt, I would prefer any feature requests against main and then we can decide if it's worth it to back port.

This issue looks to be a schema issue similar to #1292.

/cc @tomjebo @hunyu @mikeebowen

@tomjebo
Copy link
Collaborator

tomjebo commented Jan 26, 2023

@Asbjoedt do you have test code including the embedded contents to be added? I will need to verify all the behavior here.

  1. I would have to test this to verify the behavior but I suspect that Excel would be ok with the path to the part as long as the relationship pointed correctly to where it lives. The ISO 29500 standard and our implementer notes don't specify a path. They give examples but they specify that the package must be the target of the http://purl.oclc.org/ooxml/officeDocument/relationships/package relationship. If it does need correction because Excel doesn't accept what the SDK uses or we just decide it really has to match Excel's address, we do this on the backend/generation phase. If Excel (or Word or PowerPoint) don't care what the path is, then we could use a feature or just extra parameter on part creation to allow the coder to specify a new path.

  2. As far as the .xlsx, we shouldn't use .bin as this is an Office Open XML package and not an OLE object. This would also be done on the backend/generation phase in our package.xml CT_EmbeddedPackagePart definition.

<sigh, another edit> for (2) above, it won't be just in the backend generation. It will have to be decided based on the content type passed into AddEmbeddedPackagePart().

@Asbjoedt
Copy link
Contributor Author

Asbjoedt commented Jan 26, 2023

@tomjebo

  1. I think imitating Excel to the widest possible extent as long as it does not go against the OOXML standards, is best. It is confusing to have to look for embeddings multiple places based on whether you used Excel or Open XML SDK. However, you are right that technically as long as the relationship paths are correctly written, Excel can read the data.

Also, another reason came to mind. When you AddImagePart in the SDK, they are saved to xl/media/ and not xl/worksheets/media/.

  1. I agree. As a solution, I suggest to imitate the ImagePartType, that is already in the codebase and I reused this code yesterday for a fix for embedded packages, do take a look and compare it with the ImagePartType stuff: 9f7ed97
    I am not fully sure, it completes the fix, but it goes a long way to fix it.

Here's code for you to test the behaviour on .xlsx files.

public void EmbedObject(string filepath)
{
    List<EmbeddedPackagePart> packages = new List<EmbeddedPackagePart>();

    using (SpreadsheetDocument spreadsheet = SpreadsheetDocument.Open(filepath, true))
    {
        IEnumerable<WorksheetPart> worksheetParts = spreadsheet.WorkbookPart.WorksheetParts;
        foreach (WorksheetPart worksheetPart in worksheetParts)
        {
            packages = worksheetPart.EmbeddedPackageParts.Distinct().ToList();
            foreach (EmbeddedPackagePart part in packages)
            {
                // Create new EmbeddedPackage of contentType .xlsx
                EmbeddedPackagePart new_EmbeddedPackagePart = worksheetPart.AddEmbeddedPackagePart("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");

                // Feed data to the new package from an .xlsx file
                var new_Stream = new MemoryStream();
                using (FileStream file = new FileStream(filepath_to_embed, FileMode.Open, FileAccess.Read))
                {
                    file.CopyTo(new_Stream);
                }
                new_EmbeddedPackagePart.FeedData(new_Stream);
                new_Stream.Dispose();
            }
        }
    }
}

I recommend you create your own test sample, where you i.e. create two .xlsx spreadsheets and embed one of the .xlsx into the other as an object. You should try the embedding process first using Excel and then Open XML SDK.

@tomjebo
Copy link
Collaborator

tomjebo commented Jan 26, 2023

@Asbjoedt thanks for the additional details and explanation. I can create a sample to test this and verify the behavior. Regarding (1) above, yes, I agree that we should imitate Excel also. The SDK is not only built around the ISO standard but around the Office behaviors. So good point.

@Asbjoedt
Copy link
Contributor Author

Asbjoedt commented Apr 4, 2023

@tomjebo. It would be great if this fix could get into 2.20, if possible.

@tomjebo
Copy link
Collaborator

tomjebo commented Apr 5, 2023

@Asbjoedt I'm working on getting this into v2.20.0.

@twsouthwick
Copy link
Member

@tomjebo this was included in v2.20.0 right? Can this be closed?

@tomjebo
Copy link
Collaborator

tomjebo commented Apr 16, 2023

@twsouthwick, @Asbjoedt we can probably close this issue now.

See #1406. The path behavior that you observed is indeed a deviation from what Excel actually does. However, as you noted the package is not broken because of this deviation. But in #1406, I'd like to investigate using features to "enhance" the SDK to do this mimicking behavior. So we can track the "enhancement" there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants