Why does the html obtained in buildEnd contain &ZeroWidthSpace; #3364

Closed
4 tasks done
rxliuli opened this issue Dec 20, 2023 · 12 comments
Labels
bug: pending triage Maybe a bug, waiting for confirmation

rxliuli commented Dec 20, 2023

Describe the bug

I am trying to generate an RSS feed for a website built with VitePress, but I found that the HTML obtained from ContentData['html'] in buildEnd contains &ZeroWidthSpace;. I want to confirm whether this is a bug or by design.

Reproduction

https://stackblitz.com/edit/vitepress-rss-generate?file=docs%2F.vitepress%2Fconfig.ts&file=docs%2F.vitepress%2Fdist%2Findex.html

Expected behavior

The HTML obtained in buildEnd should not contain &ZeroWidthSpace;, the same as the final output HTML.

System Info

  System:
    OS: Linux 5.0 undefined
    CPU: (8) x64 Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
    Memory: 0 Bytes / 0 Bytes
    Shell: 1.0 - /bin/jsh
  Binaries:
    Node: 18.18.0 - /usr/local/bin/node
    Yarn: 1.22.19 - /usr/local/bin/yarn
    npm: 9.4.2 - /usr/local/bin/npm
    pnpm: 8.10.5 - /usr/local/bin/pnpm
  npmPackages:
    vitepress: latest => 1.0.0-rc.32

Additional context

I also confirmed that the RSS feed of Vue's official blog has this problem when displayed in Inoreader and Feedly.

Validations

rxliuli added the bug: pending triage label Dec 20, 2023

brc-dd (Member) commented Dec 20, 2023

That's an HTML entity. Use a parser library to convert it to Unicode characters (for example, the decode method of https://www.npmjs.com/package/html-entities), and maybe chain the result with .replace(/[\u0000-\u001F\u007F-\u009F\u061C\u200E\u200F\u202A-\u202E\u2066-\u2069]/g, "").
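
A dependency-free sketch of the cleanup step, for readers who only need to drop this one entity. The character class quoted above does not appear to include U+200B (zero-width space), so the character would likely survive that replace after decoding; the helper below targets it directly. The name stripZeroWidthSpaces is hypothetical and not part of VitePress or html-entities, and the assumption is that the entity appears in one of the spellings listed in the regex.

```typescript
// Hypothetical helper (not part of VitePress or html-entities).
// Matches a zero-width space written as a named entity, a decimal or
// hexadecimal numeric entity, or the literal U+200B character.
const ZERO_WIDTH_SPACE = /&ZeroWidthSpace;|&#8203;|&#x200B;|\u200B/gi;

// Returns the HTML with every zero-width space removed.
function stripZeroWidthSpaces(html: string): string {
  return html.replace(ZERO_WIDTH_SPACE, "");
}
```

For general entity handling, the decode method of html-entities suggested above remains the broader option; this sketch only covers the single character discussed in this issue.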

rxliuli (Author) commented Dec 20, 2023

@brc-dd Of course I could just delete them anyway; I only wanted to make sure whether this is something VitePress expects or a bug.

brc-dd (Member) commented Dec 20, 2023

It's expected behavior. We need something there to pass a11y tests.

rxliuli (Author) commented Dec 20, 2023

@brc-dd By the way, when generating the RSS feed, if a page contains images, the image link in the HTML obtained in buildEnd is not the final hashed link, such as cover.A4Q5uAxl.jpg

Is there a solution to this problem? Maybe I need to scan the dist directory to get the final HTML after the files have actually been written?

brc-dd (Member) commented Dec 20, 2023

Can you elaborate?

rxliuli (Author) commented Dec 20, 2023

> Can you elaborate?

updated ⬆️

brc-dd (Member) commented Dec 20, 2023

Ah weird. This should be the final link in buildEnd. I'll take a look.

brc-dd reopened this Dec 20, 2023
brc-dd closed this as completed Dec 20, 2023

brc-dd (Member) commented Dec 20, 2023

Ah no, you're using createContentLoader. It doesn't return SSR'd HTML. You need to create a list and store data from transformHtml and generate the feed from that in buildEnd. It should be something like this - #520 (comment) (first argument of transformHtml is the rendered HTML)
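
The approach described above could be sketched roughly as follows. This is a minimal illustration, not the code from the linked comment: the hooks object only mimics the shape of the transformHtml and buildEnd options of a VitePress config so that the sketch has no dependencies, and the names RenderedPage, FeedEntry, renderedPages and hooks are hypothetical. In a real config the two functions would be properties of the config object, and buildEnd would write the feed to disk instead of returning entries.

```typescript
// One rendered page captured during the build (hypothetical type).
interface RenderedPage {
  id: string;   // second argument of transformHtml, used here as a page key
  html: string; // first argument of transformHtml: the rendered HTML
}

// One feed entry produced at the end of the build (hypothetical type).
interface FeedEntry {
  guid: string;
  content: string;
}

// Module-level list that transformHtml fills and buildEnd reads.
const renderedPages: RenderedPage[] = [];

// Plain object shaped like the two config hooks; an assumption for
// illustration, not an import from vitepress.
const hooks = {
  // Called once per page: store the rendered HTML for later.
  transformHtml(code: string, id: string): void {
    renderedPages.push({ id, html: code });
  },
  // Called after all pages are rendered: build the feed entries from the
  // stored HTML, which already contains the final asset links.
  buildEnd(): FeedEntry[] {
    return renderedPages.map((page) => ({
      guid: page.id,
      content: page.html,
    }));
  },
};
```

How a page id maps to a public URL, and which part of the rendered document belongs in the feed, depend on the site and are left out here.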

rxliuli (Author) commented Dec 20, 2023

> Ah no, you're using createContentLoader. It doesn't return SSR'd HTML. You need to create a list and store data from transformHtml and generate the feed from that in buildEnd. It should be something like this - #520 (comment) (first argument of transformHtml is the rendered HTML)

Thank you, I solved it. In the end, I split the pages into those with images and those without. For pages with images, I used node-html-parser to re-parse the HTML; otherwise, I used the HTML in ContentData directly (most pages have no images).
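
A rough sketch of that split, under stated assumptions: the function names containsImage and pickFeedHtml are hypothetical, the image check is a simple regex rather than node-html-parser, and renderedHtml stands for whatever HTML the re-parse of the built page yields (the thread does not show that step).

```typescript
// Hypothetical check: does the HTML contain at least one <img> tag?
function containsImage(html: string): boolean {
  return /<img\b/i.test(html);
}

// Chooses which HTML goes into the feed for one page.
// loaderHtml: the html field from ContentData (asset links not final).
// renderedHtml: HTML recovered from the built page, if available.
function pickFeedHtml(loaderHtml: string, renderedHtml?: string): string {
  if (containsImage(loaderHtml) && renderedHtml !== undefined) {
    return renderedHtml; // page has images: prefer the final asset links
  }
  return loaderHtml; // no images: the loader HTML is good enough
}
```

Since most pages have no images, this keeps the more expensive re-parse limited to the few pages that need it.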

brc-dd (Member) commented Dec 20, 2023

Yeah that could work too. Or if you can, try to store images in the public directory. That way their path won't change.

rxliuli (Author) commented Dec 20, 2023

> Yeah that could work too. Or if you can, try to store images in the public directory. That way their path won't change.

Yes, I noticed that the official Vue blog does this. But in my scenario, I run multiple processes on the local Markdown source files, and VitePress (building the website) is just one of them, so I need the Markdown files to use ordinary file references.


By the way, I also submitted a PR to the Vue blog to fix the original problem of this issue. ref: vuejs/blog#21

brc-dd (Member) commented Dec 20, 2023

Ah I don’t have access to the blog repo. Someone else will get back to you on that PR.

github-actions bot locked as resolved and limited conversation to collaborators Dec 28, 2023