Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cross-references broken when ref includes special characters (e.g., hyphens or underscores) #444

Closed
smcgu opened this issue May 29, 2024 · 6 comments · Fixed by #452
Closed
Assignees
Labels
bug Something isn't working

Comments

@smcgu
Copy link

smcgu commented May 29, 2024

Describe the bug

References that include hyphens or underscores are broken in the generated DOCX.

To Reproduce

Steps to reproduce the behavior:

  1. Add evidence file with friendly name that includes hyphens/underscores (e.g., "example-ptc")
  2. Add evidence to finding {{.example-ptc}}
  3. Add reference to evidence {{.ref example-ptc}}
  4. Generate DOCX report
  5. Observe the the cross reference is broken {REF _Refexample-ptc \h}
  6. Removing the hyphen {REF _Refexampleptc \h} and updating the reference (F9) fixes the broken reference

Expected Behavior

Special characters that would break DOCX XML should be stripped when generating the report. Or, they should be blocked from being used in the UI.

Server Specs:

  • OS: Ubuntu 22.04
  • Docker: v26.1.3, build b72abbb
  • Docker Compose: v2.27.0
  • Ghostwriter: v4.1, released 3 April 2024

Additional context

I'm working on some other figure caption code so I may have some upcoming PRs that may cover this. But, documenting this behavior here in case other priorities come up and for feedback on if this reproducible.

@smcgu smcgu added the bug Something isn't working label May 29, 2024
@smcgu smcgu changed the title Cross-references broken when ref includes special changes (e.g., hyphens or underscores) Cross-references broken when ref includes special characters (e.g., hyphens or underscores) May 29, 2024
@ColonelThirtyTwo
Copy link
Collaborator

As a workaround, you can use {{.ref exampleptc}}.

The current code strips non-ASCII-alphanumeric characters from the evidence name when generating it's ref, but the {{.ref}} tag does not do that, hence the mismatch. We could change it to strip the name similarly, but if I'm reading the docx spec right, we could also use the full name in double quotes and just escape " and \ characters. @chrismaddalena Would appreciate your thoughts.

@chrismaddalena
Copy link
Collaborator

@ColonelThirtyTwo Yeah, let's try using quotes. If it works, that's the better solution. Escaping quotes is easy enough, and it should be unlikely that someone uses double quotes in a name anyway. This would also avoid a situation where someone could have evidence1 and evidence-1 and have their reference to the second one end up pointing to the first if we stripped the -.

@smcgu
Copy link
Author

smcgu commented May 29, 2024

Speaking of duplicate evidence files, there does not appear to be any checks to block that. I'm able to create two evidence items with the same name. So, quoting the evidence names wouldn't help here.

I think this indicates that there are two issues:

  1. Special characters are treated differently for figure naming and cross-reference naming. One strips the special characters and the other does not, respectively.
  2. There are no checks to prevent evidence naming duplicates upstream in the Ghostwriter UI or when stripping them when generating the report. Currently, a user can create multiple evidence items with the same friendly name.

@smcgu
Copy link
Author

smcgu commented May 29, 2024

Could we concatenate the evidence friendly name and the evidence ID when generating the DOCX? This should prevent naming collisions, even if multiple evidence items have the same friendly name.

For example:

  • "example-evidence", id 10 -> "example-evidence10"
  • "example-evidence", id 11 -> "example-evidence11"

@smcgu
Copy link
Author

smcgu commented May 29, 2024

Name+ID could still theoretically experience collisions but I think a user would really have to go out of their way (or, be incredibly [un]lucky) to encounter it.

For example:

  • "example-evidence1", id: 10 -> "example-evidence110"
  • "example-evidence", id: 110 -> "example-evidence110"

@ColonelThirtyTwo
Copy link
Collaborator

Keep in mind that captions made with {{.caption <name>}} can be referenced with {{.ref}} as well, so it's not as simple as looking up the evidence by the name passed to {{.ref}}. You could even reference a bookmark in the template with {{.ref}} if you knew the internal reference name.

But I can see name collisions being an issue moving forward, especially with extra fields attempting to generate an evidence figure multiple times.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants