Improve URL scheme #1205
Comments
@woju could you help here?
woju
Sep 23, 2015
Member
If we are manipulating URIs, can we also put /en/ somewhere in the path,
preferably at the beginning?
Currently all pages are in English, but in the future that may change
and someone may like to translate Qubes manual.
Second question: if we are changing URIs, should we rename source files
in the repo to reflect respective URI?
> I think it would look cleaner and more professional if all of our URLs
> were lowercase and used only the characters `a-z`, `0-9`, `-`, and
> possibly `_`:
If answer to the first question is yes, will we include lower-case
characters from respective script? (Polish example: ąćęłńóśźż, but not ĄĆĘŁŃÓŚŹŻ)
> (...) I'm not sure if there's a relatively easy, programmatic way to change all the files. (...)
Yes, there can be any time, but results have to be checked manually.
Saves keystrokes, but probably not eyegazing. I can provide you with
tool, but I don't have time to go through all the pages, so you'd have
to promise to point out all the errors which will be left.
Just in case you wonder what happens when automatic scripts go
unchecked, see our current source. :)
andrewdavidwong
Sep 24, 2015
Member
> If we are manipulating URIs, can we also put /en/ somewhere in the path, preferably at the beginning? Currently all pages are in English, but in the future that may change and someone may like to translate Qubes manual.
Yes, good idea.
For example:
https://www.qubes-os.org/doc/en/split-gpg/
Right?
> Second question: if we are changing URIs, should we rename source files in the repo to reflect respective URI?
Yes, I was planning on changing, e.g., SplitGpg.md to split-gpg.md.
> If answer to the first question is yes, will we include lower-case characters from respective script? (Polish example: ąćęłńóśźż, but not ĄĆĘŁŃÓŚŹŻ)
This I'm not so sure about. Naming source files using non-ASCII characters could cause compatibility issues with certain file systems, couldn't it?
> Yes, there can be any time, but results have to be checked manually. Saves keystrokes, but probably not eyegazing. I can provide you with tool, but I don't have time to go through all the pages, so you'd have to promise to point out all the errors which will be left.
No problem, I can do the manual checking.
woju
Sep 24, 2015
Member
> No problem, I can do the manual checking.
OK. I pushed processed repo to woju/qubes-doc and the tools are in
woju/qubesos.github.io. Check them out and merge as you like.
> For example:
> https://www.qubes-os.org/doc/en/split-gpg/
> Right?
I don't know, probably /en/doc/split-gpg/, because we may like to
translate press releases or whatever. @bnvk, what's your opinion?
>> Second question: if we are changing URIs, should we rename source files in the repo to reflect respective URI?
> Yes, I was planning on changing, e.g., SplitGpg.md to split-gpg.md.
OK. There is a tool in qubesos.github.io/_utils/camel2hyphen.pl which
processes the file path. Didn't do that yet.
>> If answer to the first question is yes, will we include lower-case
>> characters from respective script? (Polish example: ąćęłńóśźż, but
>> not ĄĆĘŁŃÓŚŹŻ)
> This I'm not so sure about. Naming source files using non-ASCII
> characters could cause compatibility issues with certain file systems,
> couldn't it?
I don't know. @marmarek, will the offline docs reside in dom0 with
support for UTF-8, or usb stick with FAT16/32?
marmarek
Sep 24, 2015
Member
On Thu, Sep 24, 2015 at 12:12:25PM -0700, Wojtek Porczyk wrote:
> I don't know. @marmarek, will the offline docs reside in dom0 with
> support for UTF-8, or usb stick with FAT16/32?
Most likely some VM. But I'd still avoid non-ASCII characters in file
names and URLs.
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
bnvk
Sep 25, 2015
A big YES to making the URLs nicer and non-camelCased, great call :-)
I'm not sure how best to do multi-language support with Jekyll (or whether it's possible), unless it's just simple copies of all the markdown files, scoped inside subfolders en, de, pl.
In that case, if we are going to do the whole site (not just the docs), then /en/doc/split-gpg makes the most sense, but if just the docs, then /doc/en/split-gpg is preferred, I guess!
andrewdavidwong
Sep 26, 2015
Member
> OK. I pushed processed repo to woju/qubes-doc and the tools are in woju/qubesos.github.io. Check them out and merge as you like.
Thank you! Could you give me some example commands using these scripts? I'm trying to figure out how to use them on my own, but I haven't been very successful so far.
(I would just use your already-processed repo, but I had to sort a bunch of unsorted doc pages and clean up the developer documentation after you created it.)
woju
Sep 28, 2015
Member
>> OK. I pushed processed repo to woju/qubes-doc and the tools are in
>> woju/qubesos.github.io. Check them out and merge as you like.
> Thank you! Could you give me some example commands using these
> scripts? I'm trying to figure out how to use them on my own, but
> I haven't been very successful so far.
Sure. First of all, cd qubesos.github.io. Then:
find _doc/ -name '*.md' | while IFS= read -r file; do echo "$file"; _utils/rewrite-camel-permalinks.pl < "$file" > /tmp/relink; cat < /tmp/relink > "$file"; done
Now all files have permalink: in lower-hyphen convention, though with
some caveats: for example, VPN is rewritten as v-p-n, because the algorithm
does not catch fully capitalised words. The old permalink is added as the
first redirect. The results now have to be checked manually, and any
permalink can be rewritten by hand (no need to add another redirect). Then run:
find _doc/ -name '*.md' | while IFS= read -r file; do echo "$file"; _utils/get-redirects.pl < "$file"; done > /tmp/redirects
Now /tmp/redirects contains a list of all redirects, $redirect_from $permalink, one redirect per line. The filename /tmp/redirects is
important, because it is hardcoded in the next tool. Finally, the command:
find _doc/ -name '*.md' | while IFS= read -r file; do echo "$file"; _utils/redirect-links.pl < "$file" > /tmp/relink; cat < /tmp/relink > "$file"; done
It rewrites all links in [link](uri) format.
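The link-rewriting step described above can be sketched in Python. This is only an illustration of the idea, not the actual `redirect-links.pl` tool: the function names `load_redirects` and `rewrite_links` are hypothetical, and the sketch assumes the redirect list holds one `old new` pair per line, as described for /tmp/redirects.

```python
import re

# Matches inline Markdown links of the form [text](uri).
# Reference-style links are not handled in this sketch.
LINK_RE = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")


def load_redirects(lines):
    """Parse '<redirect_from> <permalink>' pairs, one per line.

    Lines that do not consist of exactly two fields are skipped.
    """
    redirects = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            old, new = parts
            redirects[old] = new
    return redirects


def rewrite_links(text, redirects):
    """Replace every link target found in the redirect map."""
    def replace(match):
        label, uri = match.group(1), match.group(2)
        # Unknown targets are left untouched.
        return "[{}]({})".format(label, redirects.get(uri, uri))
    return LINK_RE.sub(replace, text)


redirects = load_redirects(["/doc/SplitGpg/ /doc/split-gpg/"])
print(rewrite_links("See [Split GPG](/doc/SplitGpg/).", redirects))
```

Because the map is keyed on the old permalink rather than on a second regular expression, any permalink corrected by hand before the redirect list is generated is picked up automatically.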
The order of the commands is important, since rewriting links in page
content depends on redirect_from:, not on another regexp. This allows
for manual correction (which would otherwise have to be done twice,
creating an opportunity for error) and gets rid of the https -> http
redirect at the same time.
If you need to rewrite just one file, just pipe it to the standard input
of the respective tool.
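The `v-p-n` caveat mentioned above can be reproduced, and avoided, with a small Python sketch. This is an assumption about how such a conversion might behave, not a port of the Perl tools in `_utils`; both function names are hypothetical.

```python
import re


def naive_camel_to_hyphen(name):
    """Insert a hyphen before every capital letter except the first.

    This splits acronyms letter by letter: 'VPN' becomes 'v-p-n'.
    """
    return re.sub(r"(?<!^)([A-Z])", r"-\1", name).lower()


def camel_to_hyphen(name):
    """Convert CamelCase to lower-hyphen while keeping acronyms whole."""
    # Boundary between a lowercase letter or digit and a capital:
    # 'SplitGpg' -> 'Split-Gpg'
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1-\2", name)
    # Boundary between an acronym and a following capitalised word:
    # 'VPNSetup' -> 'VPN-Setup'
    s = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1-\2", s)
    return s.lower()


for example in ("SplitGpg", "VPN", "VPNSetup"):
    print(example, naive_camel_to_hyphen(example), camel_to_hyphen(example))
```

Even with the acronym rule, the output would still need the manual check described above, since names that mix acronyms and words in unusual ways can split in unexpected places.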
marmarek added the C: doc label Oct 5, 2015
Is this ticket completed?
No, I haven't had time to do this yet.
andrewdavidwong
Oct 11, 2015
Member
Thank you, @woju! Your tools were extremely helpful. There are just two more places where we should make changes:
- Markdown file names (same as URI change, i.e., from `CamelCase` to `lowercase-hyphen-separated`).
- Page titles (add spaces in-between capitalized words, e.g., `AntiEvilMaid` to `Anti Evil Maid`).
Can your tools be tweaked to make these changes as well?
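For the page-title change, a minimal Python sketch might look like the following. It is illustrative only: `looks_camel_cased` and `suggest_title` are hypothetical names, and the suggestion is meant as a starting point for manual review rather than an automatic fix.

```python
import re

# A title counts as CamelCase here if it consists of two or more
# capitalised words run together, e.g. 'AntiEvilMaid'.
CAMEL_TITLE_RE = re.compile(r"^(?:[A-Z][a-z0-9]+){2,}$")


def looks_camel_cased(title):
    """Return True if the whole title is a single CamelCase token."""
    return bool(CAMEL_TITLE_RE.match(title))


def suggest_title(title):
    """Insert a space wherever a lowercase letter or digit meets a capital."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", title)


print(looks_camel_cased("AntiEvilMaid"), suggest_title("AntiEvilMaid"))
```

Titles that are already spaced, single words, or pure acronyms are not flagged by this check.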
andrewdavidwong
Oct 11, 2015
Member
I've made all the changes (except, of course, the ones mentioned in my last message).
However, after looking at the results, I wonder if we really want to prepend /en/ to every subpage. Realistically, how likely is it that any content will get translated? Will enough of it get translated for it to make sense to have both https://www.qubes-os.org/en/ and https://www.qubes-os.org/de/, for example? And even if so, how quickly before the non-English versions go out of date and out of sync with the more current English version? (Also, consider the security implications. If we have only one <language> speaker in the community who volunteers to translate the pages into <language>, we may not be able to tell whether misinformation is being inserted (deliberately or mistakenly) into the translated pages.)
I can certainly see the benefit of going with /en/doc/ rather than /doc/en/, as @woju suggested, because it allows us to translate things like press releases. But one of the main disadvantages is that now every page with any kind of English on it (in other words, every page) gets redirected from the bare URL to the /en/ version. So, for example, any external site which links to https://www.qubes-os.org/downloads/ is getting redirected to https://www.qubes-os.org/en/downloads/.
So, even if we stick with using /en/ for some pages, it probably makes sense to exempt certain pages, such as:
/
/downloads/
/hcl/
/screenshots/
/people/
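The exemption idea above can be expressed as a small rule. The following Python sketch is hypothetical and is not how the Jekyll site is configured; `EXEMPT_PATHS` and `localized_path` are names invented for illustration, and the exempt set is simply the list proposed above.

```python
# Paths that would keep their bare URL under this proposal.
EXEMPT_PATHS = {"/", "/downloads/", "/hcl/", "/screenshots/", "/people/"}


def localized_path(path, lang="en"):
    """Return the path a request would end up on under the proposed scheme."""
    prefix = "/{}/".format(lang)
    # Exempt pages and already-prefixed pages are left alone.
    if path in EXEMPT_PATHS or path.startswith(prefix):
        return path
    return "/{}{}".format(lang, path)


print(localized_path("/downloads/"))
print(localized_path("/doc/split-gpg/"))
```

With such a rule, external links to the exempt pages would not be redirected at all, while documentation pages would still gain the language prefix.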
marmarek added C: website and removed C: doc labels Oct 12, 2015
added a commit to woju/qubesos.github.io that referenced this issue Oct 13, 2015
added a commit to woju/qubesos.github.io that referenced this issue Oct 13, 2015
woju
Oct 13, 2015
Member
On Sun, Oct 11, 2015 at 12:51:12AM -0700, Axon wrote:
> I've made all the changes (except, of course, the ones mentioned in my last message).
Here are the tools for you: woju/qubesos.github.io@master.
Usage: go to qubesos.github.io/_doc (it is important to be in this
directory) and launch ../_utils/rename_to_permalink.py (no shell loop
this time...). This script will run "git mv" on every *.md file it finds
beneath the directory, based on the last segment of the permalink.
The tool will not rename directories; it will just point them out as
warnings. Currently there are only four, so that can be done manually.
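The renaming rule described above (new file name taken from the last segment of the permalink) can be sketched as follows. This is not the actual `rename_to_permalink.py`; the helper names are hypothetical, and the sketch assumes the front matter is delimited by `---` lines and contains a single-line `permalink:` entry. It only computes the target name and does not call `git mv`.

```python
import posixpath
import re

PERMALINK_RE = re.compile(r"^permalink:\s*(\S+)\s*$", re.MULTILINE)


def front_matter(text):
    """Return the YAML front matter between the leading '---' lines, or ''."""
    if not text.startswith("---"):
        return ""
    parts = text.split("---", 2)
    return parts[1] if len(parts) == 3 else ""


def target_filename(text):
    """Derive '<last-permalink-segment>.md', or None if there is no usable permalink."""
    match = PERMALINK_RE.search(front_matter(text))
    if match is None:
        return None
    last = posixpath.basename(match.group(1).rstrip("/"))
    return last + ".md" if last else None


page = "---\nlayout: doc\ntitle: SplitGpg\npermalink: /doc/split-gpg/\n---\n\nBody text\n"
print(target_filename(page))
```

A wrapper could compare the result with the current file name and rename only when they differ, leaving directories for manual handling as described above.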
I didn't want to automagically change titles, because some of them are
wrong anyway, so the second tool, ../_utils/find_camel_title.py, will
only list all the files which have a wrong title. Please change them
manually, maybe to something other, like equivalent of
inside.
> However, after looking at the results, I wonder if we really want to
> prepend /en/ to every subpage. Realistically, how likely is it that
> any content will get translated? Will enough of it get translated for
> it to make sense to have both https://www.qubes-os.org/en/ and
> https://www.qubes-os.org/de/, for example? And even if so, how
> quickly before the non-English versions go out of date and out of sync
> with the more current English version?
Because of this problem I don't think putting /en as the main directory in
_doc was a good idea. I think keeping everything in one tree and just
naming things *.en.md, *.pl.md etc. would be a better idea, because then
it would be easier to see which files went out of sync just from the
GitHub tree listing.
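With a `*.en.md` / `*.pl.md` naming scheme, a simple check over the file names could list pages that lack a translation. The Python sketch below is an assumption about how such a check might look, not an existing tool; `missing_translations` is a hypothetical name, and it only detects missing files, not stale content.

```python
import re
from collections import defaultdict

# 'split-gpg.en.md' -> page 'split-gpg', language 'en'
LANG_FILE_RE = re.compile(r"^(?P<page>.+)\.(?P<lang>[a-z]{2})\.md$")


def missing_translations(filenames, languages):
    """Map each page to the sorted list of languages it is missing."""
    wanted = set(languages)
    found = defaultdict(set)
    for name in filenames:
        match = LANG_FILE_RE.match(name)
        if match:
            found[match.group("page")].add(match.group("lang"))
    return {
        page: sorted(wanted - langs)
        for page, langs in found.items()
        if wanted - langs
    }


files = ["split-gpg.en.md", "split-gpg.pl.md", "vpn.en.md"]
print(missing_translations(files, ["en", "pl"]))
```

Detecting translations that exist but lag behind the English text would need commit dates or similar metadata, which this sketch does not attempt.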
> (Also, consider the security implications. If we have only one
> <language> speaker in the community who volunteers to translate the
> pages into <language>, we may not be able to tell whether
> misinformation is being inserted (deliberately or mistakenly) into the
> translated pages.)
I don't know, but it seems a valid concern. @rootkovska, what do you
think about this? Maybe you could appoint maintainer of each language
version, who will be personally responsible?
> I can certainly see the benefit of going with /en/doc/ rather than
> /doc/en/, as @woju suggested, because it allows us to translate
> things like press releases. But one of the main disadvantages is that
> now every page with any kind of English on it (in other words, every
> page) gets redirected from the bare URL to the /en/ version. So, for
> example, any external site which links to
> https://www.qubes-os.org/downloads/ is getting redirected to
> https://www.qubes-os.org/en/downloads/.
> So, even if we stick with using /en/ for some pages, it probably makes
> sense to exempt certain pages, such as:
> /downloads/
> /hcl/
> /screenshots/
> /people/
Downloads should be localised more than anything else: there should be
a big green „Download” button in as many languages as possible. As to HCL
and people/team, I don't know. As for screenshots, @bnvk, could you
weigh in?
andrewdavidwong
Oct 14, 2015
Member
> Here are the tools for you: woju/qubesos.github.io@master.
> [...]
Thank you, @woju!
> Because of this problem I don't think putting /en as the main directory in _doc was a good idea. I think keeping everything in one tree and just naming things *.en.md, *.pl.md etc. would be a better idea, because then it would be easier to see which files went out of sync just from the GitHub tree listing.
I agree with not having everything in en/, but I wonder if we should just leave the current files as just .md, then add language codes to any translated files.
Then there's the question of how the language should be represented in the URL. I'd like to get people's opinions about the pros and cons of these two options:
de.qubes-os.org/page/
vs.
qubes-os.org/de/page/
(The first way is how Wikipedia handles it.)
Also, one possibility is to have the English version without any language code, then insert a language code for any translated pages (similar to what I said above about the .md files).
andrewdavidwong commented Sep 22, 2015
Currently, many of our page URLs use CamelCase, which is an artifact of the old TracWiki system:
I think it would look cleaner and more professional if all of our URLs were lowercase and used only the characters `a-z`, `0-9`, `-`, and possibly `_`:
The website is already set up to handle redirects, so that's not a problem. However, I'm not sure if there's a relatively easy, programmatic way to change all the files. We would want to change the `yaml` frontmatter from this:
to this:
for each file/page.
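The proposed character set for URLs can be checked mechanically. The Python sketch below is only an illustration of that rule; `is_clean_path` is a hypothetical helper, and it assumes the underscore is allowed.

```python
import re

# One path segment made only of a-z, 0-9, '-' and '_'.
SEGMENT_RE = re.compile(r"^[a-z0-9_-]+$")


def is_clean_path(path):
    """Return True if every non-empty segment uses only the allowed characters."""
    segments = [segment for segment in path.split("/") if segment]
    return all(SEGMENT_RE.match(segment) for segment in segments)


for candidate in ("/doc/split-gpg/", "/doc/SplitGpg/"):
    print(candidate, is_clean_path(candidate))
```

Such a check would also flag non-ASCII letters, which matches the preference expressed in the thread for ASCII-only file names and URLs.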