docs(ja): Update WebAssembly/JavaScript_interface/Global and its subpages #14909

Merged
merged 1 commit into from
Aug 5, 2023

Conversation

bsmth
Member

@bsmth bsmth commented Aug 4, 2023

Test PR to reproduce the build errors seen in #14765

@bsmth bsmth requested a review from a team as a code owner August 4, 2023 19:39
@bsmth bsmth requested review from mfuji09 and removed request for a team August 4, 2023 19:39
@github-actions github-actions bot added the l10n-ja Issues related to Japanese content. label Aug 4, 2023
@bsmth
Member Author

bsmth commented Aug 4, 2023

CI fails here:

  # Use the GitHub API to get the list of changed files
  # documentation: https://docs.github.com/rest/commits/commits#compare-two-commits
  DIFF_DOCUMENTS=$(gh api repos/{owner}/{repo}/compare/ba5878ee9ac3f9d3374d82186771a07c1abbf364...5843766c13a3eda14723df4b4ac226aa8548828f \
    --jq '.files | .[] | select(.status|IN("added", "modified", "renamed", "copied", "changed")) | .filename')
  
  # filter out files that are not markdown files
  GIT_DIFF_CONTENT=$(echo "${DIFF_DOCUMENTS}" | egrep -i "^files/.*\.md$" | xargs)
  echo "GIT_DIFF_CONTENT=${GIT_DIFF_CONTENT}" >> $GITHUB_ENV
  
  # filter out files that are not attachments
  # note that we should get the absolute path of the changed attachments
  GIT_DIFF_FILES=$(echo "${DIFF_DOCUMENTS}" | egrep -i "^files/.*\.(png|jpeg|jpg|gif|svg|webp)$" | xargs readlink -e | xargs)
  echo "GIT_DIFF_FILES=${GIT_DIFF_FILES}" >> $GITHUB_ENV
  shell: /usr/bin/bash -e {0}
  env:
    BASE_SHA: ba5878ee9ac3f9d3374d82186771a07c1abbf364
    HEAD_SHA: 5843766c13a3eda14723df4b4ac226aa8548828f
    GITHUB_TOKEN: ***
transform: short source buffer

Failing:

gh api repos/mdn/translated-content/compare/ba5878ee9ac3f9d3374d82186771a07c1abbf364...5843766c13a3eda14723df4b4ac226aa8548828f

The message "transform: short source buffer" is a Go error (ErrShortSrc in the golang.org/x/text/transform package): https://go.googlesource.com/text/+/23ae387dee1f90d29a23c0e87ee0b46038fbed0e/transform/transform.go#23

Looks like a gh api problem.


If you run:

gh api repos/mdn/translated-content/compare/ba5878ee9ac3f9d3374d82186771a07c1abbf364...5843766c13a3eda14723df4b4ac226aa8548828f > output.json
tail output.json

You can see it chokes at line 22 at the (�) char:

Unicode の「置換文字」 U+FFFD, (%
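
To confirm where the saved response stops, the last bytes of output.json can be dumped as hex. This is only a minimal sketch and was not run in this thread: last_bytes_hex is a hypothetical helper name, and it assumes tail, od, and tr are available. If the output was cut off right before the character, the dump should end with 28 (the opening parenthesis) and should not be followed by ef bf bd, the UTF-8 encoding of U+FFFD.

```shell
# Hypothetical helper (not part of gh or the MDN workflow).
# Prints the last N bytes of a file as one lowercase hex string.
# $1: path to the file, $2: number of trailing bytes (defaults to 8).
last_bytes_hex() {
  # od prints space-separated hex bytes; tr strips the separators.
  tail -c "${2:-8}" "$1" | od -An -tx1 | tr -d ' \t\n'
}
```

For example, last_bytes_hex output.json 16 would show the final 16 bytes of the file written by the command above.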

It can also be reproduced by fetching the commit that introduced the character:

gh api repos/mdn/translated-content/commits/7144ca0458a33571ade432951827e61443d85ff5 > output.json
tail output.json
# ...
g に存在する対になっていないサロゲートコードポイントは、ブラウザーが Unicode の「置換文字」 U+FFFD, (%
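
Since any diff that contains the character seems to trigger the failure, it may help to list which files in a local checkout still contain it. The following is a minimal sketch, assuming a local clone and a grep that supports -r; list_files_with_replacement_char is a hypothetical name, and the path passed to it is up to the caller.

```shell
# Hypothetical helper (not part of the MDN tooling).
# Lists every file under the given paths that contains the UTF-8
# encoding of U+FFFD (bytes EF BF BD, written here as octal escapes).
list_files_with_replacement_char() {
  # LC_ALL=C makes grep match raw bytes regardless of the locale.
  # -r: recurse, -l: print file names only, -F: fixed-string match.
  LC_ALL=C grep -rlF -- "$(printf '\357\277\275')" "$@"
}
```

A call such as list_files_with_replacement_char files/ja (an example path) prints one matching file per line. grep exits with status 1 when nothing matches, which matters in scripts that run with -e.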

@mfuji09
Collaborator

mfuji09 commented Aug 5, 2023

@bsmth
Thank you very much for identifying the cause of the problem!

However, the Unicode character was carried over from the English version as it was at that time (before the commit mdn/content@f284782#diff-ab009fdab2d2d37768c947522209ddd397af896272bc13e84e7d61e0151f7cc8).

This PR removes the character (matching the commit above in the content repo). Do you know what I should do to resolve the error? Should I force the merge?

@bsmth
Member Author

bsmth commented Aug 5, 2023

My pleasure. Any pull request whose diff includes this character will fail in the same way in the future. We might want to replace occurrences of it with an HTML entity (for example &#xFFFD;) so that it displays properly in the rendered docs but is parsed correctly by the gh api tool. Open to opinions on that.

This PR removes it so other branches updating from main won't experience this problem, so I believe it's safe to merge. What do you think?
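
The entity replacement floated above could be sketched roughly as follows. This is an illustration under assumptions, not an agreed fix: escape_replacement_char is a hypothetical name, and the numeric entity &#xFFFD; is one possible spelling of U+FFFD.

```shell
# Hypothetical filter (not part of the MDN tooling).
# Reads text on stdin and writes it to stdout with every literal
# U+FFFD (UTF-8 bytes EF BF BD) replaced by the entity &#xFFFD;.
escape_replacement_char() {
  # LC_ALL=C makes sed operate on raw bytes.
  # "\&" is a literal ampersand; a bare "&" would mean "the match".
  LC_ALL=C sed "s/$(printf '\357\277\275')/\&#xFFFD;/g"
}
```

It could be used as escape_replacement_char < index.md > index.md.new, reviewing the result before replacing the original file.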

@mfuji09
Collaborator

mfuji09 commented Aug 5, 2023

@bsmth
I agree. Again, thank you!

@mfuji09 mfuji09 merged commit 63873b4 into mdn:main Aug 5, 2023
4 of 7 checks passed
@yin1999
Member

yin1999 commented Aug 5, 2023

I would also like to take a deeper look into the GitHub CLI. I think we have a chance to fix this so that future CI runs will handle this special character correctly :)

@bsmth bsmth deleted the workflow-test branch August 11, 2023 07:13
@bsmth
Member Author

bsmth commented Aug 11, 2023

I would also like to take a deeper look into the GitHub CLI. I think we have a chance to fix this so that future CI runs will handle this special character correctly :)

True. We might want to open an issue on the repo reporting that gh api does not handle fetching patches containing this character well with the encoding it currently uses.

@yin1999
Member

yin1999 commented Aug 11, 2023

repos/mdn/translated-content/commits/7144ca0458a33571ade432951827e61443d85ff5

I can now confirm that the issue is caused by a dependency of the GitHub CLI:

Repro code

It is caused by the sanitizer used by the transformer.

image

I'll dig further into this issue :)

Labels
l10n-ja Issues related to Japanese content.

3 participants