fn:InNameTree usage inconsistent with docs #49

Open
plaisted opened this issue Mar 12, 2023 · 2 comments

plaisted commented Mar 12, 2023

The fn:InNameTree predicate is used in several places in ways that do not match its documentation. The main discrepancies are that it is applied to rows whose type is not string, and that it appears to test whether something exists as a value in the name tree rather than whether the key exists as an index in the tree.

fn:InNameTree documentation:

  • key is a reference to a PDF name-tree which use PDF strings as indices. Names trees are complex PDF data structures that use strings as indices.
  • Asserts that the current row (key or array element) and which must be a PDF string exists in the specified name-tree.
  • Note that this predicate is not for use with dictionaries that support arbitrary key names or number-trees!
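
To make the documented semantics concrete, here is a minimal Python sketch of an index lookup in a name tree. It assumes a simplified model in which plain dicts stand in for PDF dictionaries: intermediate nodes carry `Kids`, leaf nodes carry `Names` as a flat `[key1, value1, key2, value2, ...]` array, and `Limits` holds the smallest and largest key under a node. The function name `name_tree_has_index` and the sample data are hypothetical and not part of the Arlington model. Python string ordering stands in for PDF string ordering here.

```python
def name_tree_has_index(node: dict, key: str) -> bool:
    """Return True if `key` is used as an index (string key) under `node`."""
    limits = node.get("Limits")
    if limits is not None and not (limits[0] <= key <= limits[1]):
        return False  # key lies outside the range covered by this subtree

    # Names alternates key, value, key, value, ... so keys sit at even offsets.
    names = node.get("Names", [])
    if key in names[0::2]:
        return True

    # Otherwise descend into any child nodes.
    return any(name_tree_has_index(kid, key) for kid in node.get("Kids", []))


# Hypothetical tree with two leaf nodes under a root.
tree = {
    "Kids": [
        {"Limits": ["Alpha", "Beta"],
         "Names": ["Alpha", {"Type": "Filespec"}, "Beta", {"Type": "Filespec"}]},
        {"Limits": ["Gamma", "Gamma"],
         "Names": ["Gamma", {"Type": "Filespec"}]},
    ]
}

print(name_tree_has_index(tree, "Beta"))   # True
print(name_tree_has_index(tree, "Delta"))  # False
```

Under this reading only a string can ever match, which is why applying the predicate to dictionary or name rows is surprising.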

Inconsistent usage:

  • ArrayOfIndirectFileSpecifications.tsv - [fn:InNameTree(parent::RichMediaContent::Assets)] (special case) for the catch-all values. The row type is dictionary, so this seems to mean that the current row's value must be present as a value in the tree, rather than its key being present as an index.
  • PageObject.tsv - fn:IsRequired(fn:InNameTree(trailer::Catalog::Names::Pages) || fn:InNameTree(trailer::Catalog::Names::Templates)) (required values). The row type is name, so this seems to test whether the parent object of the current row (the page) is present as a value in either tree, in which case the field is required.
  • Target.tsv - fn:IsRequired((@R==C) && fn:InNameTree(trailer::Catalog::Names::EmbeddedFiles)) (required values). The row type is string, but this does not make sense as a required-value test, because the condition can only be evaluated if the value already exists. I am not sure which value it is meant to test; I haven't compared it against the PDF spec yet.
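
The gap between the documented reading and the apparent usage can be sketched in a few lines of Python. This is only an illustration under assumptions: the name tree is flattened to a list of `(index, value)` pairs, plain dicts stand in for PDF objects, object identity stands in for "same indirect object", and the helper names `is_index` and `is_value` are hypothetical.

```python
# Flattened stand-in for a name tree such as trailer::Catalog::Names::Pages.
page = {"Type": "Page"}
other_page = {"Type": "Page"}  # same content, but a different object
names_pages = [("Cover", page)]


def is_index(entries, candidate) -> bool:
    """Documented reading: candidate must be a string used as an index."""
    return isinstance(candidate, str) and any(k == candidate for k, _ in entries)


def is_value(entries, candidate) -> bool:
    """Apparent usage: candidate is one of the objects the tree points to."""
    return any(v is candidate for _, v in entries)


print(is_index(names_pages, "Cover"))     # True
print(is_index(names_pages, page))        # False: a dictionary is never an index
print(is_value(names_pages, page))        # True
print(is_value(names_pages, other_page))  # False
```

With a dictionary-typed row, the documented check can never succeed, so the existing usages only make sense under the value-membership reading.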

It may be useful to add further predicates here, or at least to break out these scenarios in the InNameTree docs:

  • InNameTree(treeReference, key)
  • InNameTree(treeReference) -> current row value is implicitly used as key
  • InNameTreeValues(treeReference, value) -> checks for values rather than indices / keys
  • InNameTreeValues(treeReference) -> current row value is implicitly used as value
@petervwyatt
Member

Sorry for my slow reply. I completely agree with the issue as you reported it.

My current thinking is to go with two distinct predicates, such as fn:InNameTreeValues(...) and fn:InNameTreeIndex(...), where "index" means the string that is looked up in the name-tree (and must be a string!) and "value" is the object that gets indexed (which may itself be a string object or any other kind of object). For simplicity, explicitly repeating the current row's key name / array index won't hurt, as it also means extracted data is more standalone-ish.
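
One way to picture the two proposed predicates is the Python sketch below. It is an illustration under assumptions, not the Arlington implementation: the name tree is flattened to a mapping from index strings to values, the function names only mirror the proposed predicate names, the explicit `obj` and `key` parameters model repeating the current row's key name instead of relying on implicit context, and the sample keys and data are made up.

```python
from typing import Any, Mapping


def in_name_tree_index(tree: Mapping[str, Any], obj: Mapping[str, Any], key: str) -> bool:
    """True if obj[key] is a string that is an index of the tree."""
    candidate = obj.get(key)
    if not isinstance(candidate, str):
        return False  # an index must be a string
    return candidate in tree


def in_name_tree_values(tree: Mapping[str, Any], obj: Mapping[str, Any], key: str) -> bool:
    """True if obj[key], of any object type, is one of the tree's values."""
    if key not in obj:
        return False
    return any(value == obj[key] for value in tree.values())


# Hypothetical data: one file specification indexed by a string.
file_spec = {"Type": "Filespec", "F": "report.pdf"}
embedded_files = {"report.pdf": file_spec}

print(in_name_tree_index(embedded_files, {"N": "report.pdf"}, "N"))     # True
print(in_name_tree_index(embedded_files, {"N": file_spec}, "N"))        # False
print(in_name_tree_values(embedded_files, {"FS": file_spec}, "FS"))     # True
print(in_name_tree_values(embedded_files, {"FS": "report.pdf"}, "FS"))  # False
```

Because the key name is passed explicitly, each call can be read on its own, without knowing which row it came from.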

@petervwyatt petervwyatt added the bug Something isn't working label Apr 1, 2023
@petervwyatt petervwyatt self-assigned this Apr 1, 2023

plaisted commented Apr 3, 2023

Makes sense to me...

I also added a few other questions / comments regarding IsPresent and Special Cases here: #38 (comment)

I can create a separate issue if that's better.
