Unrecognized property type #32

Closed
OutOfThisPlanet opened this issue Nov 14, 2018 · 22 comments

@OutOfThisPlanet

OutOfThisPlanet commented Nov 14, 2018

Hi Ironfede,

We have been using your library in a project at work where I am grabbing the "Creation Date" property from old format office files that contain it (doc, xls, ppt, for example - pub and vsd files don't seem to have the property).

We found that it works really well, but some files throw an "Unrecognized property type" exception.
We can't figure out why.

When we check the files with "SSView", we can see the date exists.
Similarly, the date exists within the Windows File Explorer properties.
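For context, the creation date in a SummaryInformation stream is normally stored as a VT_FILETIME value: 8 little-endian bytes counting 100-nanosecond intervals since 1601-01-01 UTC. The following is a minimal sketch of that conversion, assuming a modern .NET runtime; `FileTimeSketch` and `FromFileTimeBytes` are hypothetical names for illustration, not part of OpenMcdf:

```csharp
using System;
using System.Buffers.Binary;

static class FileTimeSketch
{
    // Hypothetical helper (not OpenMcdf API): converts the 8 raw bytes of a
    // VT_FILETIME property value into a UTC DateTime.
    public static DateTime FromFileTimeBytes(ReadOnlySpan<byte> raw)
    {
        if (raw.Length < 8)
            throw new ArgumentException("A FILETIME value needs 8 bytes.", nameof(raw));

        // FILETIME is stored little-endian regardless of the host CPU.
        long fileTime = BinaryPrimitives.ReadInt64LittleEndian(raw);
        return DateTime.FromFileTimeUtc(fileTime);
    }
}
```

This only covers the value itself; locating the property inside the stream is what the library does.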

Our test code is simple:

using System;
using OpenMcdf;
using OpenMcdf.Extensions;
using OpenMcdf.Extensions.OLEProperties;
using OpenMcdf.Extensions.OLEProperties.Interfaces;

namespace ConsoleApp4
{
    class Program
    {
        static void Main(string[] args)
        {
            string file = @"c:\temp\_Test2.doc";

            CompoundFile cf;

            try
            {
                cf = new CompoundFile(file);
            }
            catch (CFFileFormatException cfe)
            {
                Console.WriteLine(file + " isn't a valid OLE Storage file. " + cfe.Message);
                Console.ReadKey();
                return;
            }

            int numDirectories = cf.GetNumDirectories();

            for (int i = 0; i < numDirectories; i++)
            {
                Console.WriteLine(cf.GetNameDirEntry(i));
            }

            CFStream stream = cf.RootStorage.GetStream("\u0005SummaryInformation");

            PropertySetStream ps = CFStreamExtension.AsOLEProperties(stream);

            int count = 0;

            foreach (PropertyIdentifierAndOffset propId in ps.PropertySet0.PropertyIdentifierAndOffsets)
            {
                Console.WriteLine(count + ": " + propId.PropertyIdentifier.GetDescription());
                count++;
            }

            count = 0;

            foreach (ITypedPropertyValue prop in ps.PropertySet0.Properties)
            {
                Console.WriteLine(count + ": " + prop.GetType() + ": " + prop.PropertyValue);
                count++;
            }
            Console.ReadKey();
        }
    }
}

The issue is thrown on the following line:

PropertySetStream ps = CFStreamExtension.AsOLEProperties(stream);

We can't figure out whether the issue is with the file, our code, or the library...

In one case, a file that was not working suddenly started working after we changed some document properties (removed the author). Weird.
Attempting to do this on another file had no effect.

Attempting to find the original age of a document will help us with our document retention automation in SharePoint. This isn't our SharePoint code, it's just a console app to test pulling back the values from the stream.
With GDPR being a thing nowadays, handling all these old files is suddenly also a thing too.

Can you help please?

Thanks for your excellent work!

@ironfede ironfede self-assigned this Nov 17, 2018
@ironfede ironfede added the bug label Nov 17, 2018
ironfede added a commit that referenced this issue Nov 21, 2018
…nd SummaryInfo in .AsOLEProperties extension method. Structured Storage Explorer has default OLE_PROPERTY flag on. (Beta feature). Some ole property is not supported yet.
@ironfede
Owner

I've added some enhancements to OLE Properties handling that should hopefully allow parsing of DocumentSummaryInfo and SummaryInfo sets. Please consider OLE properties still in a beta stage: not all property types are supported, and this feature needs thorough unit testing before it can be considered production-ready. Nevertheless, it's a useful extension, so please let me know if there are other issues, if possible attaching an example file to analyze.
Best Regards,
Federico

@ironfede
Owner

ironfede commented Nov 21, 2018

Extension NuGet package 2.2.1.3 published.

@nemecben

nemecben commented Nov 23, 2018 via email

@OutOfThisPlanet
Author

Thank you so much Federico, I'll go and get the new package now and test :)

Much appreciated!

@OutOfThisPlanet
Author

OutOfThisPlanet commented Nov 23, 2018

Hi Federico,

Unfortunately, this did not fix our issue.

I have attached a .ppt file (in a zip) that contains a date which OpenMCDF seemingly cannot read.

Some properties seem to have changed in this new version:

"PropertyValue" is now "Value", for example.

For some reason, I am now getting this error message.

System.MissingMethodException: 'Method not found: 'UInt32 OpenMcdf.Extensions.OLEProperties.PropertyIdentifierAndOffset.get_PropertyIdentifier()'.'

I'll "Repair" my Visual Studio.

image

image

_Test.zip

@ironfede ironfede reopened this Nov 23, 2018
@ironfede
Owner

Thank you for this test case. I've found the issue (a missing ClipboardData property type) and it will be fixed as soon as possible.
Please take into account that the API is not stable yet, so expect some required changes due to refactoring.

@OutOfThisPlanet
Author

Hi Federico,

Thanks for looking at it again. I appreciate your efforts very much.

ironfede added a commit that referenced this issue Nov 26, 2018
… a PARTIAL commit: it should work for common case, OLE properties ---->read only<----. User defined properties not supported yet, Array properties not supported.
@ironfede ironfede added this to the 2.3.0.0 milestone Nov 26, 2018
@ironfede
Owner

ironfede commented Nov 26, 2018

Please take a look at the current codebase (no NuGet package yet) to see if this partial commit fixes the reported issue.
The API is being refactored (OLE properties sub-project at https://github.com/ironfede/openmcdf/projects/1#card-15194412 ), so some client code changes will probably be required.

@Numpsy
Contributor

Numpsy commented Nov 27, 2018

Hi,
Not sure whether to put this here or in a new issue, but I get the 'unrecognized property type' exception when trying to read a SummaryInformation stream from a Word document which contains standard properties (e.g. Author) whose type is VT_LPWSTR rather than VT_LPSTR.

I can attach a sample file if that would be useful?

Thanks.
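
For readers following along, the two string types differ in how the length prefix and the characters are encoded. As I understand the MS-OLEPS layout, VT_LPSTR (0x001E) carries a byte count followed by characters in the property set's code page, while VT_LPWSTR (0x001F) carries a count of 16-bit characters followed by UTF-16LE data. Below is a minimal sketch under those assumptions, targeting a modern .NET runtime; `OleStringSketch` and `Decode` are hypothetical names, not OpenMcdf API:

```csharp
using System;
using System.Buffers.Binary;
using System.Text;

static class OleStringSketch
{
    public const ushort VT_LPSTR = 0x001E;   // code-page string
    public const ushort VT_LPWSTR = 0x001F;  // UTF-16LE string

    // Hypothetical helper: decodes the payload that follows the 4-byte type
    // header of a typed property value. The returned string still contains
    // the terminating null character stored in the file.
    public static string Decode(ushort vt, ReadOnlySpan<byte> payload, Encoding codePage)
    {
        // Both layouts start with a 4-byte little-endian count.
        int count = checked((int)BinaryPrimitives.ReadUInt32LittleEndian(payload));
        ReadOnlySpan<byte> body = payload.Slice(4);

        switch (vt)
        {
            case VT_LPSTR:
                // Count is a size in bytes, terminator included.
                return codePage.GetString(body.Slice(0, count));
            case VT_LPWSTR:
                // Count is a number of 16-bit characters, terminator included.
                return Encoding.Unicode.GetString(body.Slice(0, count * 2));
            default:
                throw new NotSupportedException("Unrecognized property type 0x" + vt.ToString("X4"));
        }
    }
}
```

A reader that only knows the VT_LPSTR case would hit the default branch for files like the one described above.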

@ironfede
Owner

Hi,
@Numpsy, please attach the file; I'll use those samples in unit tests if that's OK with you.
I'm progressively adding property types and trying to cover the whole MS-OLEPS specification.
The milestone for the OpenMcdf extensions is 2.3.0.0.

@Numpsy
Contributor

Numpsy commented Nov 27, 2018

This file has Author and Keywords properties of type LPWSTR.

wstr_presets.zip

@ironfede
Owner

LPWSTR support added.
Work in progress...

@ironfede
Owner

Please take into account that the OLEProperties container still does not support write methods (they throw NotImplementedException to avoid issues).

@ironfede
Owner

ironfede commented Dec 6, 2018

@Numpsy, please let me know if the current code base closes this issue.
Thank you!

@Numpsy
Contributor

Numpsy commented Dec 7, 2018

Hi,

I gave it a quick try and I can get the property values now (no exception any more), but it looks like there might be spurious null characters on the ends of the strings?

[image: extra_null]

looks good otherwise though.

@ironfede
Owner

ironfede commented Dec 7, 2018

Thanks @Numpsy. Yes, OLE strings have null termination AND a size field, so for the moment I think it's better if the client application applies a post-filter to handle them in its preferred way. I will introduce some kind of configuration parameter to specify how to handle null characters.
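
A client-side post-filter of this kind can be very small. The following is a minimal sketch, not part of OpenMcdf; `OleStringCleanup` and `TrimTrailingNulls` are hypothetical names chosen for illustration:

```csharp
using System;

static class OleStringCleanup
{
    // Hypothetical client-side filter: removes only the trailing null
    // characters left over from the stored terminator and padding.
    // Null characters inside the string are left untouched.
    public static string TrimTrailingNulls(string value)
    {
        return value == null ? null : value.TrimEnd('\0');
    }
}
```

It could be applied to each string value after reading it from the property set, leaving non-string values as they are.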

@ironfede ironfede closed this as completed Dec 7, 2018
@OutOfThisPlanet
Author

Seems like this is a thread hijack to me!

Can I ask why this bug report has been closed?

@ironfede
Owner

ironfede commented Dec 7, 2018 via email

@ironfede ironfede reopened this Dec 7, 2018
@OutOfThisPlanet
Author

We are currently trying to compile from source, however it's not yet compiling. Looking into it.
Previously, we used Nuget to add the extension.

From our perspective, we don't have a working solution currently.

Will update when we successfully compile and test.

Sorry for slow reply, I've been away.

@OutOfThisPlanet
Author

OutOfThisPlanet commented Dec 7, 2018

WooooHoooo! It works! :)

Fede, you are a hero! :)

Please let us know if this gets made into a nuget package, as I fear that using our compiled DLLs may not be updateable.

@ironfede
Owner

ironfede commented Dec 9, 2018

@nullldata, I'm going to close the issue.
Please let me know if it's OK.
The NuGet package will be released when the OLE properties read/write project is closed as "Production ready", so I think it will take some time to reach the 2.3.0.0 milestone... stay tuned ;-)
Thank you for your reports and for your patience.

@OutOfThisPlanet
Author

@ironfede Sure, no problem.

Thanks again! :)


4 participants