Skip to content
This repository was archived by the owner on Jun 24, 2025. It is now read-only.

Conversation

maphew
Copy link
Contributor

@maphew maphew commented Nov 16, 2024

Prefer HTML title tag over filename during import

When importing HTML files, extract and use the title from the <title> tag if available, falling back to the filename only when no title tag is found.

Primary motivaton is handling of titles with special characters that can't be represented in filenames.

I'm inexperienced with javascript and typescript. I relied on Claude Sonnet 3.5 via Codeium to build this. The included test hasn't been run as I've been unsuccessful so far at putting together a complete development environment on Windows.

When importing HTML files, extract and use the title from the <title> tag
if available, falling back to the filename only when no title tag is found.

This improves handling of titles with special characters that can't be
represented in filenames.
@perfectra1n
Copy link
Member

Gotta say, code looks decent even from an LLM lol.

"TSError: ⨯ Unable to compile TypeScript:
spec/services/import/single.spec.ts:137:25 - error TS2339: Property 'content' does not exist on type 'BNote'."
"$ npx tsx spec/services/import/single.spec.ts
/home/mhw/dev/tnext/spec/services/import/single.spec.ts:6
describe('HTML Import', () => {
^
ReferenceError: describe is not defined
    at <anonymous> (/home/mhw/dev/tnext/spec/services/import/single.spec.ts:6:1)

Node.js v18.17.1"
claude sez: "I see the issue. The tests are failing because we need to set up the CLS (Continuation Local Storage) context that Trilium uses for transactions. Let me check how this is handled in other tests:

I've updated the test file to:

I've updated the test file to:

Use an in-memory database for testing
Set up necessary database tables in beforeEach
Wrap all database operations in cls.init() and sql.transactional()
Handle async operations properly
claude:
I've updated the test file to:

Add the entity_changes table with all required fields
Add missing fields to the notes table (isDeleted)
Add missing fields when inserting the parent note
@maphew
Copy link
Contributor Author

maphew commented Nov 17, 2024

I set up an Ubuntu WSL container to run npm and tests within on this branch. The last 5 commits are me basically letting Claude do whatever it wanted with prompts of the last error message. Things really started to go off the rails after importSinglefile (14d7e3e) and I suspect rest of the commits are not worth much!

A root cause of confusion is me not knowing how to run tests in node anyway. For what it's worth, these are the test commands attempted:

npx jasmine spec/services/import/single.spec.ts
npx tsx spec/services/import/single.spec.ts
npx ts-node --esm spec/services/import/single.spec.ts
npx ts-node --esm --compilerOptions '{"target":"es2022"}' spec/services/import/single.spec.ts
NODE_OPTIONS="--loader tsx" jasmine spec/services/import/single.spec.ts

@eliandoran eliandoran changed the title Importing: prefer html title over filename Importing single HTML file: prefer html title over filename Nov 28, 2024
@eliandoran
Copy link
Contributor

@maphew , I've removed the test for now although it seemed very promising just because we don't have a well defined test suite mechanism unfortunately.

Either way, when testing the PR using an HTML file generated by Trilium I've noticed it was not working, prefering the file name. The reason is that the <h1> would get converted to <h2> by the sanitization feature. I've moved the code around so now it should work quite good.

I've also fixed two more issues that were unrelated to the PR itself:

  1. We didn't remove the existing h1 because, although the title now matched, the regex did not support attributes in the HTML element.
  2. The title of the HTML would show up in the text since it was allowed as a tag in the sanitizer.

@eliandoran eliandoran merged commit 21a5481 into TriliumNext:develop Nov 28, 2024
@maphew
Copy link
Contributor Author

maphew commented Nov 28, 2024

thank you @eliandoran!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants