Document metadata properties #33
Comments
Sure ... if you make it optional
I just released a new package: https://www.nuget.org/packages/IFilterTextReader/1.6.1
New package works great, thanks!
@Sicos1977 and @mguinness, the only problem with this is that metadata properties can be duplicated, e.g. Names: foo. In this scenario, the dictionary throws a "key already exists" exception. I'll also log a separate issue for this.
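One way around the duplicate-key failure is to stop assuming one value per property name and collect every value under its name instead. The sketch below is a language-neutral illustration in Python, not IFilterTextReader code: the function name `group_properties` and the sample pairs (including the second `Names` value) are hypothetical. In .NET the same idea would roughly correspond to a dictionary whose values are lists.

```python
from collections import defaultdict


def group_properties(pairs):
    """Collect (name, value) pairs into a mapping of name -> list of values.

    A repeated name appends to the existing list instead of failing,
    and the values keep the order in which they were read.
    """
    grouped = defaultdict(list)
    for name, value in pairs:
        grouped[name].append(value)
    return dict(grouped)


# Hypothetical sample: "Names" appears twice, as in the report above.
sample = [("Names", "foo"), ("Names", "bar"), ("Title", "Report")]
properties = group_properties(sample)
print(properties)
```

The trade-off is that callers always get a list back, even for properties that only ever have one value.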
Out of interest, what is the output of filtdump for an example file? I imagine the tags are coming from different sections in the file. Changing the field type to
@mguinness - sorry I didn't get back to this sooner. In this case it's the same section, but the 'different sections' case is also a problem:
CHUNK: ---------------------------------------------------------------
VALUE: ---------------------------------------------------------------
CHUNK: ---------------------------------------------------------------
VALUE: ---------------------------------------------------------------
Now, whilst we changed to <string, object> (and I'm going to look at this again soon), I seem to recall thinking that including the schema in the output would be useful. I'm pretty sure I found that <string becomes 'Names', so if one purpose is to allow an application to filter on a specific property, the metadata property output doesn't let you tell apart the same name coming from different paths when there is a conflict. For example, I have a case where we have System.Title, title and Title. One of them is dc:title, another is TestSchema:Title, and presumably System.Title is the default document title outside the metadata. I think this is the issue you were hinting at?
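The name conflict described above could be avoided by keying each property on its schema as well as its name. The following is a hypothetical Python sketch of that idea, not the library's API: the `qualified_key` helper, the tuple layout and the sample values are all made up for illustration.

```python
def qualified_key(schema, name):
    """Return 'schema:name', or the bare name when no schema is known."""
    return f"{schema}:{name}" if schema else name


# Hypothetical (schema, name, value) triples for the three titles mentioned above.
raw = [
    ("dc", "title", "Title from Dublin Core"),
    ("TestSchema", "Title", "Title from custom schema"),
    (None, "System.Title", "Default document title"),
]

by_key = {}
for schema, name, value in raw:
    # setdefault keeps repeated keys as a list rather than overwriting them.
    by_key.setdefault(qualified_key(schema, name), []).append(value)

for key in sorted(by_key):
    print(key, by_key[key])
```

With qualified keys an application can filter on one specific schema's title without the three variants colliding.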
Thanks for the reply. The example you cited seems more like an array of names. Can you upload a small example document?
@mguinness - it was indeed an array of names. Sample image uploaded below (hopefully GitHub doesn't modify it):
When the includeProperties option is set to true, the metadata is included in the output. Would it be possible to expose a new property on the FilterReader class as a dictionary? I can put a PR together if you have no objections.