docs: move Hugo docs into repo under docs/ #5000

Merged
Yicong-Huang merged 13 commits into apache:main from Ma77Ball:feat/Docs on May 12, 2026

@Ma77Ball
Contributor

@Ma77Ball Ma77Ball commented May 9, 2026

What changes were proposed in this PR?

Adds the website documentation into the main repository under a new docs/ folder. The docs cover overview, getting started, tutorials, concepts, the operator reference, contribution guidelines, examples, and security.

Any related issues, documentation, or discussions?

Closes: #4984

How was this PR tested?

Copied the files from the Apache Texera website GitHub repo and ran a diff check against it.

Was this PR authored or co-authored using generative AI tooling?

No

@github-actions github-actions Bot added the feature and docs labels May 9, 2026
@Ma77Ball
Contributor Author

Ma77Ball commented May 9, 2026

/request-review @Yicong-Huang

@github-actions github-actions Bot requested a review from Yicong-Huang May 9, 2026 09:46
@chenlica chenlica self-requested a review May 9, 2026 15:59
@chenlica
Contributor

chenlica commented May 9, 2026

@Ma77Ball I will review it.

@Yicong-Huang
Contributor

@Ma77Ball can you update the PR description? It is a bit misleading.

@Ma77Ball
Contributor Author

Ma77Ball commented May 9, 2026

I updated it to exclude the part about maintaining consistency with the website, as that will be a separate pr. Please let me know if any other changes are needed.

@Ma77Ball
Contributor Author

/unrequest-review @Yicong-Huang

@github-actions github-actions Bot removed the request for review from Yicong-Huang May 10, 2026 01:28
@chenlica
Contributor

The PR looks good to me. @Yicong-Huang Can you verify?

@Yicong-Huang
Contributor

Yicong-Huang commented May 10, 2026

There is no way to review such a big PR. As it is doc-related and won't affect prod code, I'm fine with merging it as it is. We can always fix small things later.

But a question is how is this doc generated? How do we know if that's synced with codebase?

@Yicong-Huang
Contributor

Yicong-Huang commented May 10, 2026

> I updated it to exclude the part about maintaining consistency with the website, as that will be a separate pr. Please let me know if any other changes are needed.

Do your answers match the questions in the PR template?

> How was this PR tested?
>
> Closes: #4984

I don't think this makes sense?

@Ma77Ball
Contributor Author

Ma77Ball commented May 10, 2026

@Yicong-Huang I misunderstood the description issue. My bad, I fixed it now.

Also, to address the other topic: it's synced with the codebase. I developed the website with Hugo, which uses Markdown (.md) files. I used the existing wiki from this repo and placed it in the docs folder under content.

@Yicong-Huang
Contributor

Yicong-Huang commented May 11, 2026

> But a question is how is this doc generated? How do we know if that's synced with codebase?

@Ma77Ball how about this question? Is there a test to verify the docs are synced with the codebase?

@Ma77Ball
Contributor Author

Ma77Ball commented May 11, 2026

Verifying and syncing docs with the Texera website is a separate issue (#5001) I raised; should I raise that first, then this pr?

Or, if you mean whether it matches the wiki docs: in my understanding, the goal is to remove the wiki docs and keep only the website docs (at least after this PR is done, #4435). @chenlica can comment on this.

@Ma77Ball
Contributor Author

Ma77Ball commented May 11, 2026

@Yicong-Huang here is a report to help with the review of what is included in the docs:

Wiki sync status: all 23 pages

Comparison between apache/texera.wiki and the docs/ folder in this PR.

Legend: ✅ in PR (present under docs/) · ❌ not in PR (with reason)

✅ In the PR (18)

| # | Wiki page | Location in docs/ |
|---|---|---|
| 1 | Apache-License-header.md | contribution-guidelines/apache-license-header.md |
| 2 | Build,-Run-and-Configure-micro‐services-in-local-development-environment.md | contribution-guidelines/micro-services-local-dev.md |
| 3 | Create-Dataset,-upload-data-to-it-and-use-it-in-Workflow.md | tutorials/create-dataset-upload-data.md |
| 4 | Guide-for-Developers.md | contribution-guidelines/guide-for-developers.md |
| 5 | Guide-for-how-to-use-Texera.md | tutorials/guide-for-how-to-use-texera.md |
| 6 | Guide-to-enable-the-LLM‐based-Texera-agent.md | tutorials/guide-to-enable-llm-agent.md |
| 7 | Guide-to-Frontend-Development-(new-gui).md | contribution-guidelines/guide-to-frontend-development.md |
| 8 | Guide-to-Implement-a-Java-Native-Operator.md | contribution-guidelines/guide-to-implement-java-operator.md |
| 9 | Guide-to-Implement-a-Python-Native-Operator-(converting-from-a-Python-UDF).md | contribution-guidelines/guide-to-implement-python-operator.md |
| 10 | Guide-to-launch-Lakekeeper-as-the-RESTCatalog-Service-for-Texera's-workflow-result-storage.md | tutorials/guide-to-launch-lakekeeper.md |
| 11 | Guide-to-Use-a-Python-UDF.md | tutorials/guide-to-use-python-udf.md |
| 12 | How-to-run-Texera-on-local-Kubernetes.md | getting-started/run-on-kubernetes.md |
| 13 | Installing-Apache-Texera-using-Docker.md | getting-started/installing-using-docker.md (updated this pass: added ANTHROPIC_API_KEY section + --profile examples in uninstall/troubleshooting) |
| 14 | Install-Texera.md | getting-started/install-texera.md |
| 15 | Making-Contributions.md | contribution-guidelines/making-contributions.md |
| 16 | Migrate-a-Jupyter-Notebook-to-a-Texera-Workflow.md | tutorials/migrate-jupyter-notebook.md |
| 17 | Past-GUI-screenshots.md | getting-started/past-gui-screenshots.md (added this pass) |
| 18 | [VOTE]-Release-Apache-Texera-(incubating)-Email-Template.md | contribution-guidelines/release-email-template.md |

❌ Not in the PR (5)

| # | Wiki page | Reason |
|---|---|---|
| 19 | Installing-Texera-on-a-Single-Node.md | Wiki file is 0 bytes (empty placeholder); nothing to port. |
| 20 | temporary‐install‐using‐docker‐compose.md | Overlaps with installing-using-docker.md. Unique bits (LLM API key, --profile examples) merged into entry #13 above. |
| 21 | Home.md | GitHub wiki landing/link-hub page. Hugo builds the site landing from _index.md + overview.md, so a Home.md would be redundant. |
| 22 | Deploying-Texera-on-Amazon-Web-Services(AWS).md | Cloud-deployment guide. Still in development. |
| 23 | Deploying-Texera-on-Google-Cloud-Platform-(GCP).md | Cloud-deployment guide. Still in development. |

Totals

| Bucket | Count |
|---|---|
| Wiki pages compared | 23 |
| ✅ In PR | 18 |
| ❌ Not in PR (with reason) | 5 |
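
A comparison like the report above can be reproduced with a short script. The following is a minimal sketch, not the method actually used for the report: `coverage_report`, the `mapping` dictionary, and the `docs_root` argument are hypothetical names, and the mapping from wiki page to docs path is assumed to be maintained by hand.

```python
from pathlib import Path


def coverage_report(wiki_pages, mapping, docs_root):
    """Split wiki pages into those present under docs_root and those missing.

    wiki_pages: iterable of wiki file names (e.g. "Home.md").
    mapping:    dict from wiki file name to a path relative to docs_root
                (hypothetical, hand-maintained).
    docs_root:  directory that holds the ported Markdown files.
    """
    present, missing = {}, []
    for page in sorted(wiki_pages):
        target = mapping.get(page)
        # A page counts as "in the PR" only if its mapped file really exists.
        if target is not None and (Path(docs_root) / target).is_file():
            present[page] = target
        else:
            missing.append(page)
    return present, missing
```

Pages that land in `missing` still need a human-written reason (empty placeholder, merged elsewhere, still in development); the script only finds them.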

@Yicong-Huang
Contributor

Yicong-Huang commented May 11, 2026

No, I mean syncing source code and the doc.

Say in the future I open a new PR adding a new operator but forget to add a doc section: is there a mechanism to alert me? Similarly, say I open a PR to change the CSV operator's parameters: is there a mechanism to alert me?

We want to make sure the docs are in sync with the source code. To do that, usually we use source code to generate docs, and use CI to verify that the docs are generated from source.

What we have now does not look like it was generated from source code. This will easily drift in the future. In fact, I don't know if the docs are aligned with what we have in source code right now.
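
The kind of CI check described above could look roughly like the following. This is a minimal sketch under stated assumptions, not an existing Texera mechanism: it assumes the operator reference is a Markdown page with one level-2 heading per operator, and that the list of operator names in the source code is supplied by the caller (how to extract it from the codebase is not covered in this thread). All function names are hypothetical. It only detects added or removed operators, not parameter changes.

```python
import re


def documented_operators(reference_md: str) -> set[str]:
    """Collect operator names from the level-2 headings of a Markdown page."""
    # "## Name" matches; "# Title" and "### Subsection" do not.
    pattern = re.compile(r"^##[ \t]+(.+)$", re.MULTILINE)
    return {m.group(1).strip() for m in pattern.finditer(reference_md)}


def find_drift(source_ops, doc_ops):
    """Return (operators missing from the docs, doc entries with no operator)."""
    source_ops, doc_ops = set(source_ops), set(doc_ops)
    return sorted(source_ops - doc_ops), sorted(doc_ops - source_ops)


def check(source_ops, reference_md: str) -> int:
    """Print any drift and return a process exit code (0 = in sync)."""
    undocumented, stale = find_drift(source_ops, documented_operators(reference_md))
    for name in undocumented:
        print(f"missing docs for operator: {name}")
    for name in stale:
        print(f"docs mention unknown operator: {name}")
    return 1 if undocumented or stale else 0
```

A CI job would call `check(...)` and fail the build on a non-zero return value, which gives the alert asked for when a new operator is added without a doc section.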

@Ma77Ball
Contributor Author

That is a valid point. I can raise a new issue to discuss the details and open a PR to resolve this issue. Or I can raise it as part of this PR.

@Yicong-Huang
Contributor

> That is a valid point. I can raise a new issue to discuss the details and open a PR to resolve this issue. Or I can raise it as part of this PR.

Glad we agree on this. I can still approve and merge this PR, just to unblock the process, but we need to solve the sync issue between docs and source code eventually. For now, I will treat the docs as detached from the source code, and new PRs will not need to modify the docs until we sync them at some point.

Contributor

@Yicong-Huang Yicong-Huang left a comment

Approved for a temporary status. Docs and source code are not synced/aligned atm. We will treat docs as a blob that is unrelated to source code for now.

@chenlica chenlica removed their request for review May 12, 2026 02:46
@chenlica
Contributor

The PR number 5000 is a big milestone!

@Yicong-Huang Yicong-Huang enabled auto-merge (squash) May 12, 2026 03:58
@Yicong-Huang Yicong-Huang merged commit 7879c2a into apache:main May 12, 2026
13 checks passed
@Ma77Ball Ma77Ball deleted the feat/Docs branch May 12, 2026 19:36
Labels

docs (Changes related to documentations), feature

Development

Successfully merging this pull request may close these issues.

Move Hugo docs into apache/texera

3 participants