Add scripts to generate a single PDF from the documentation #964

Merged (14 commits) on Aug 16, 2023

Conversation

Collaborator

szarnyasg commented Aug 10, 2023

This PR contains scripts that produce a single PDF from the documentation using Pandoc and XeLaTeX.

The single-file-document/README.md contains instructions on how to use the scripts from the shell natively or via Docker.

szarnyasg marked this pull request as ready for review August 14, 2023 14:30
single-file-document/concatenate_to_single_file.py (outdated)
doc_body = re.sub(r"^---$", "", doc_body, flags=re.MULTILINE)

# add path labels to headers at the beginning of the file
path_label = full_path \
Member

Does pandoc validate these labels at all?

Collaborator Author

Yes, e.g.

# Hello {#my%world}

Gets translated to:

<h1 id="hello-myworld">Hello {#my%world}</h1>

with the % removed. That said, we could do our own sanitization to be sure that we only use characters in [-:a-zA-Z0-9] for the labels.
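
A minimal sketch of the sanitization suggested above, assuming that every run of disallowed characters is collapsed into a single hyphen. The function name `sanitize_label` and the replacement policy are hypothetical and not taken from the PR's scripts.

```python
import re

# Anything outside [-:a-zA-Z0-9] is treated as disallowed in a label.
_DISALLOWED = re.compile(r"[^-:a-zA-Z0-9]+")


def sanitize_label(raw: str) -> str:
    """Return a label that only uses characters in [-:a-zA-Z0-9].

    Assumption: each run of disallowed characters becomes one hyphen,
    and hyphens at either end of the result are trimmed.
    """
    return _DISALLOWED.sub("-", raw).strip("-")


print(sanitize_label("my%world"))           # my-world
print(sanitize_label("docs/api/c/api.md"))  # docs-api-c-api-md
```

Replacing rather than deleting the offending characters keeps labels derived from different paths less likely to collide, but either policy satisfies the character restriction.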

doc_body = doc_body.replace("### Pages in this Section", "")
doc_body = doc_body.replace("### More", "")

# drop lines containing "---", pandoc interprets these as h2 headers
Member

Can we configure this instead? --- should be horizontal breaks

Collaborator Author

The documentation uses them here:
https://github.com/duckdb/duckdb-web/blob/master/docs/api/c/api.md#duckdb_open_ext
Like so:

### `duckdb_open`
---
Creates a new database or opens an existing database file stored at the the given path.
If no path is given a new in-memory database is created instead.
The instantiated database should be closed with 'duckdb_close'

#### Syntax
---
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">duckdb_state</span> <span class="k">duckdb_open</span>(<span class="k">
</span>  <span class="kt">const</span> <span class="kt">char</span> *<span class="k">path</span>,<span class="k">
</span>  <span class="kt">duckdb_database</span> *<span class="k">out_database
</span>);
</code></pre></div></div>
#### Parameters
---
* `path`

If we indeed interpret them as horizontal breaks, we'll get this:
[Image: Screenshot 2023-08-15 at 16 10 49]

This does not seem right to me. (Let's ignore for the time being that the Syntax section, which is pure HTML, is not yet rendered.)
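
To illustrate what the line-dropping step in the diff does, here is a small sketch that applies the same `re.sub` call to an excerpt of the example above. The wrapper name `drop_dash_lines` and the sample string are hypothetical; only the regular expression and flag come from the diff context.

```python
import re


def drop_dash_lines(doc_body: str) -> str:
    """Blank out lines that consist solely of '---'.

    With re.MULTILINE, ^ and $ match at every line boundary, so only
    lines that are exactly three hyphens are affected.
    """
    return re.sub(r"^---$", "", doc_body, flags=re.MULTILINE)


sample = "#### Parameters\n---\n* `path`\n"
print(repr(drop_dash_lines(sample)))
# '#### Parameters\n\n* `path`\n'
```

The newline after each matched line is kept, so the header is followed by an empty line instead of the `---` line. Hyphens that appear inside a longer line are left alone.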

Collaborator

carlopi commented Aug 15, 2023

One question I have, not strictly related to this PR, is about the naming of the single-file documentation.

I would consider having the name encode versioning somehow, even just having duckdb-latest.pdf vs duckdb-0.8.1.pdf vs duckdb-0.9.0.pdf, so that when you download them you can still tell them apart from the name.

And I would consider (later!) whether they should be available to be downloaded via SQL (this requires both some PRAGMA and deciding where to store them, somewhere inside ~/.duckdb/ like ~/.duckdb/docs/duckdb-0.8.1.pdf or equivalent).
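
The naming scheme proposed above could be sketched as follows. Both helper names (`pdf_file_name`, `local_docs_path`) are hypothetical, and the `~/.duckdb/docs/` location is only the example directory mentioned in this comment, not an existing DuckDB convention.

```python
from pathlib import Path
from typing import Optional


def pdf_file_name(version: Optional[str] = None) -> str:
    """Build a PDF name that encodes the version, or 'latest' if none is given."""
    suffix = version if version else "latest"
    return f"duckdb-{suffix}.pdf"


def local_docs_path(version: Optional[str] = None) -> Path:
    """Hypothetical local storage location under ~/.duckdb/docs/."""
    return Path.home() / ".duckdb" / "docs" / pdf_file_name(version)


print(pdf_file_name())         # duckdb-latest.pdf
print(pdf_file_name("0.8.1"))  # duckdb-0.8.1.pdf
```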

szarnyasg merged commit 0f37cc3 into duckdb:master Aug 16, 2023
1 check passed
szarnyasg deleted the single-pdf branch August 16, 2023 06:39