Table renaming #423

Open
bertfrees opened this Issue Oct 11, 2017 · 9 comments

bertfrees commented Oct 11, 2017

Sooner or later we have to clean up the table names, because they are a mess. We should have a uniform naming scheme and file extensions that make sense.

On the other hand, we should avoid renaming files unless it is really needed, because some applications depend on the file names. I therefore propose to use this issue to plan which renames we want to do, and to apply them all at once later, when we have figured out a solution for this problem.

Related:


This is the tentative list of renames:

| old name | new name | comment |
|---|---|---|
| afr-za-g1.ctb | ? | |
| ar-ar-g1.utb | ? | |
| ar-fa.utb | ? | |
| ar.tbl | ? | merge with ar-ar-g1.utb |
| as-in-g1.utb | ? | |
| as.tbl | ? | merge with as-in-g1.utb |
| aw-in-g1.utb | ? | |
| awa.tbl | ? | merge with aw-in-g1.utb |
| be-in-g1.utb | ? | |
| bengali.cti | ? | |
| bg.ctb | ? | |
| bg.tbl | ? | merge with bg.ctb |
| bh.ctb | ? | |
| bh.tbl | ? | merge with bh.ctb |
| bn.tbl | ? | merge with be-in-g1.utb |
| bo.ctb | ? | |
| bo.tbl | ? | merge with bo.ctb |
| boxes.ctb | ? | |
| br-in-g1.utb | ? | |
| bra.tbl | ? | merge with br-in-g1.utb |
| braille-patterns.cti | ? | |
| ca-chardefs.cti | ? | |
| ca-g1.ctb | ? | |
| ca.tbl | ? | merge with ca-g1.ctb |
| chardefs.cti | ? | |
| chr-us-g1.ctb | ? | |
| ckb-chardefs.cti | ? | |
| ckb-g1.ctb | ? | |
| ckb-translation.cti | ? | embed in ckb-g1.ctb |
| ckb.tbl | ? | merge with ckb-g1.ctb |
| compress.cti | ? | |
| controlchars.cti | ? | |
| corrections.cti | ? | |
| countries.cti | ? | |
| cs-chardefs.cti | ? | |
| cs-g1.ctb | ? | |
| cs-translation.cti | ? | embed in cs-g1.ctb |
| cs.tbl | ? | merge with cs-g1.ctb |
| cy-cy-g1.utb | ? | |
| cy-cy-g2.ctb | ? | |
| cy.tbl | ? | merge with cy-cy-g2.ctb |
| Cz-Cz-g1.utb | ? | |
| da-dk-6miscChars.cti | ? | |
| da-dk-g08.ctb | ? | |
| da-dk-g16-lit.ctb | ? | |
| da-dk-g16.ctb | ? | |
| da-dk-g18.ctb | ? | |
| da-dk-g26-lit.ctb | ? | |
| da-dk-g26.ctb | ? | |
| da-dk-g26l-lit.ctb | ? | |
| da-dk-g26l.ctb | ? | |
| da-dk-g28.ctb | ? | |
| da-dk-g28l.ctb | ? | |
| da-dk-octobraille.dis | ? | |
| da-dk.dis | ? | |
| da-lt.ctb | ? | |
| da.tbl | ? | merge with da-dk-g26.ctb |
| de-ch-accents.cti | ? | |
| de-ch-g0.utb | de-CH-x-g0.tbl | |
| de-ch-g1.ctb | de-CH-x-g1.tbl | |
| de-ch-g2.ctb | de-CH-x-g2.tbl | |
| de-chardefs6.cti | ? | |
| de-chardefs8.cti | ? | |
| de-chess.ctb | ? | |
| de-de-accents.cti | ? | |
| de-de-comp8.ctb | de-DE-x-comp8.tbl | |
| de-de-g0.utb | de-DE-x-g0.tbl | |
| de-de-g1.ctb | de-DE-x-g1.tbl | |
| de-de-g2.ctb | de-DE-x-g2.tbl | |
| de-de.dis | ? | |
| de-eurobrl6.dis | ? | |
| de-eurobrl6u.dis | ? | |
| de-g0-core.uti | ? | |
| de-g1-core.cti | ? | |
| de-g2-core.cti | ? | |
| de.tbl | de-x-g2.tbl | alias for de-DE-x-g2.tbl |
| de_CH.tbl | ? | merge with de-CH-x-g2.ctb |
| de_DE.tbl | ? | merge with de-DE-x-g2.ctb; generalize locale to de |
| devanagari.cti | ? | |
| digits6Dots.uti | ? | |
| digits6DotsPlusDot6.uti | ? | |
| digits8Dots.uti | ? | |
| dra.ctb | ? | |
| dra.tbl | ? | merge with dra.ctb |
| el.ctb | ? | |
| el.tbl | ? | merge with el.ctb |
| en-GB-g2.ctb | ? | |
| en-chess.ctb | ? | |
| en-gb-comp8.ctb | ? | |
| en-gb-g1.utb | ? | |
| en-in-g1.ctb | ? | |
| en-ueb-chardefs.uti | ? | |
| en-ueb-g1.ctb | ? | make this the default for locale en |
| en-ueb-g2.ctb | ? | make this the default for locale en |
| en-ueb-math.ctb | ? | |
| en-us-brf.dis | ? | |
| en-us-comp6.ctb | ? | |
| en-us-comp8-ext.utb | ? | |
| en-us-comp8.ctb | ? | |
| en-us-compbrl.ctb | ? | |
| en-us-g1.ctb | ? | |
| en-us-g2.ctb | ? | |
| en-us-interline.ctb | ? | |
| en-us-mathtext.ctb | ? | |
| en.tbl | ? | merge with en-us-g2.ctb; change locale to en-US |
| en_AS.tbl | ? | merge with UEBC-g2.ctb |
| en_CA.ctb | ? | |
| en_CA.tbl | ? | merge with en_CA.ctb |
| en_GB.tbl | ? | merge with en-GB-g2.ctb |
| en_US-comp8-ext.tbl | ? | merge with en-us-comp8-ext.utb |
| en_US.tbl | ? | merge with en-us-g2.ctb |
| eo-g1-x-system.ctb | ? | |
| eo-g1.ctb | ? | |
| eo.tbl | ? | merge with eo-g1.ctb |
| es-chardefs.cti | ? | |
| Es-Es-G0.utb | ? | |
| Es-Es-g1.utb | ? | |
| es-g1.ctb | ? | |
| es-new.dis | ? | |
| es-old.dis | ? | |
| es-translation.cti | ? | embed in es-g1.ctb |
| es.tbl | ? | merge with es-g1.ctb |
| et-g0.utb | ? | |
| et.ctb | ? | |
| et.tbl | ? | merge with et.ctb |
| ethio-g1.ctb | ? | |
| eurodefs.cti | ? | |
| fa-ir-comp8.ctb | ? | |
| fa-ir-g1.utb | ? | |
| fi-fi-8dot.ctb | ? | |
| fi-fi.ctb | ? | |
| fi.tbl | ? | merge with fi-fi.ctb |
| fi.utb | ? | |
| fi1.ctb | ? | |
| fi2.ctb | ? | |
| fr.tbl | ? | merge with Fr-Fr-g2.ctb |
| fr-2007.ctb | ? | |
| fr-bfu-comp6.utb | ? | |
| fr-bfu-comp68.cti | ? | |
| fr-bfu-comp8.utb | ? | |
| fr-bfu-g2.ctb | ? | |
| fr_CA.tbl | ? | merge with Fr-Ca-g2.ctb |
| fr-ca-g1.utb | ? | |
| Fr-Ca-g2.ctb | ? | |
| fr_FR.tbl | ? | delete |
| fr-fr-g1.utb | ? | |
| Fr-Fr-g2.ctb | ? | |
| ga-g1.utb | ? | |
| ga-g2.ctb | ? | |
| ga.tbl | ? | merge with ga-g2.ctb |
| gd.ctb | ? | |
| gd.tbl | ? | merge with gd.ctb |
| gez.tbl | ? | merge with ethio-g1.ctb |
| gon.ctb | ? | |
| gon.tbl | ? | merge with gon.ctb |
| gr-bb.ctb | ? | |
| gr-pl-comp8.uti | ? | |
| gu-in-g1.utb | ? | |
| gu.tbl | ? | merge with gu-in-g1.utb |
| gujarati.cti | ? | |
| gurumuki.cti | ? | |
| haw-us-g1.ctb | ? | |
| he.ctb | ? | |
| he.tbl | ? | merge with he.ctb |
| hi-in-g1.utb | ? | |
| hi.tbl | ? | merge with hi-in-g1.utb |
| hr-chardefs.cti | ? | |
| hr-comp8.tbl | ? | merge with hr-comp8.utb |
| hr-comp8.utb | ? | |
| hr-digits.uti | ? | |
| hr-g1.ctb | ? | |
| hr-g1.tbl | ? | merge with hr-g1.ctb |
| hr-translation.cti | ? | embed in hr-g1.ctb |
| hu-backtranslate-correction.dis | ? | |
| hu-chardefs.cti | ? | |
| hu-exceptionwords.cti | ? | |
| hu-hu-comp8.ctb | ? | |
| hu-hu-g1.ctb | ? | |
| hu-hu-g2.ctb | ? | |
| hu-hu-g2_exceptions.cti | ? | |
| hu.tbl | ? | merge with hu-hu-g1.ctb |
| hy.ctb | ? | |
| hy.tbl | ? | merge with hy.ctb |
| hyph_brl_da_dk.dic | ? | move to Liblouisutdml (#334) |
| hyph_cs_CZ.dic | ? | move to Liblouisutdml (#334) |
| hyph_da_DK.dic | ? | |
| hyph_de_DE.dic | ? | move to Liblouisutdml (#334) |
| hyph_en_US.dic | ? | move to Liblouisutdml (#334) |
| hyph_eo.dic | ? | move to Liblouisutdml (#334) |
| hyph_es_ES.dic | ? | move to Liblouisutdml (#334) |
| hyph_fr_FR.dic | ? | move to Liblouisutdml (#334) |
| hyph_hu_HU.dic | ? | move to Liblouisutdml (#334) |
| hyph_it_IT.dic | ? | move to Liblouisutdml (#334) |
| hyph_nb_NO.dic | ? | move to Liblouisutdml (#334) |
| hyph_nl_NL.dic | ? | move to Liblouisutdml (#334) |
| hyph_nn_NO.dic | ? | move to Liblouisutdml (#334) |
| hyph_pl_PL.dic | ? | move to Liblouisutdml (#334) |
| hyph_pt_PT.dic | ? | move to Liblouisutdml (#334) |
| hyph_ru.dic | ? | move to Liblouisutdml (#334) |
| hyph_sv_SE.dic | ? | move to Liblouisutdml (#334) |
| IPA.utb | ? | |
| is-chardefs6.cti | ? | |
| is-chardefs8.cti | ? | |
| is.ctb | ? | |
| is.tbl | ? | merge with is-chardefs6.cti |
| it-it-comp6.utb | ? | |
| it-it-comp8.utb | ? | |
| it.tbl | ? | merge with it-it-comp6.utb |
| iu-ca-g1.ctb | ? | |
| ka-in-g1.utb | ? | |
| kannada.cti | ? | |
| kh-in-g1.utb | ? | |
| kha.tbl | ? | merge with kh-in-g1.utb |
| kn.tbl | ? | merge with ka-in-g1.utb |
| ko-2006-g1.ctb | ? | |
| ko-2006-g2.ctb | ? | |
| ko-2006.cti | ? | |
| ko-chars.cti | ? | |
| ko-g1-rules.cti | ? | |
| ko-g1.ctb | ? | |
| ko-g2-rules.cti | ? | |
| ko-g2.ctb | ? | |
| ko.cti | ? | |
| kok.ctb | ? | |
| kok.tbl | ? | merge with kok.ctb |
| kru.ctb | ? | |
| kru.tbl | ? | merge with kru.ctb |
| ks-in-g1.utb | ? | |
| latinLetterDef6Dots.uti | ? | |
| latinLetterDef8Dots.uti | ? | |
| litdigits6Dots.uti | ? | |
| litdigits6DotsPlusDot6.uti | ? | |
| loweredDigits6Dots.uti | ? | |
| loweredDigits8Dots.uti | ? | |
| lt.ctb | ? | |
| lt.tbl | ? | merge with lt.ctb |
| lv.tbl | ? | merge with Lv-Lv-g1.utb |
| Lv-Lv-g1.utb | ? | |
| malayalam.cti | ? | |
| mao-nz-g1.ctb | ? | |
| marburg.ctb | ? | |
| marburg_edit.ctb | ? | |
| marburg_single_cell_defs.cti | ? | |
| marburg_unicode_defs.cti | ? | |
| ml-in-g1.utb | ? | |
| ml.tbl | ? | merge with ml-in-g1.utb |
| mn-MN.utb | ? | |
| mn-in-g1.utb | ? | |
| mni.tbl | ? | merge with mn-in-g1.utb |
| mr-in-g1.utb | ? | |
| mr.tbl | ? | merge with mr-in-g1.utb |
| mt.ctb | ? | |
| mt.tbl | ? | merge with mt.ctb |
| mun.ctb | ? | |
| mun.tbl | ? | merge with mun.ctb |
| mwr.ctb | ? | |
| mwr.tbl | ? | merge with mwr.ctb |
| ne.ctb | ? | |
| ne.tbl | ? | merge with np-in-g1.utb |
| nemeth.ctb | ? | move to Liblouisutdml (#321) |
| nemeth_edit.ctb | ? | move to Liblouisutdml (#321) |
| nemethdefs.cti | ? | |
| nl-BE-g0.utb | ? | |
| nl-BE.dis | ? | |
| nl-NL-g0.utb | ? | |
| nl-chardefs.uti | ? | |
| nl-g0.uti | ? | |
| nl.tbl | ? | merge with nl-NL-g0.utb |
| nl_BE.tbl | ? | merge with nl-BE-g0.utb |
| nl_NL.tbl | ? | merge with nl-NL-g0.utb |
| no-no-8dot-fallback-6dot-g0.utb | ? | |
| no-no-8dot.utb | ? | |
| no-no-chardefs6.uti | ? | |
| no-no-comp8.ctb | ? | |
| no-no-g0.utb | ? | |
| no-no-g1.ctb | ? | |
| no-no-g2.ctb | ? | |
| no-no-g3.ctb | ? | |
| no-no-generic.ctb | ? | |
| no-no-generic.dis | ? | |
| no-no-latinLetterDef6Dots_diacritics.uti | ? | |
| no-no.dis | ? | |
| no.tbl | ? | merge with no-no-g3.ctb |
| np-in-g1.utb | ? | |
| or-in-g1.utb | ? | |
| or.tbl | ? | merge with or-in-g1.utb |
| oriya.cti | ? | |
| pa.tbl | ? | merge with pu-in-g1.utb |
| pi.ctb | ? | |
| pi.tbl | ? | merge with pi.ctb |
| pl-pl-comp8.ctb | ? | |
| Pl-Pl-g1.utb | ? | |
| pl.tbl | ? | merge with Pl-Pl-g1.utb |
| printables.cti | ? | |
| pt-pt-comp8.ctb | ? | |
| pt-pt-g1.utb | ? | |
| pt-pt-g2.ctb | ? | |
| pt.tbl | ? | merge with pt-pt-g2.ctb |
| pu-in-g1.utb | ? | |
| ro.ctb | ? | |
| ro.tbl | ? | merge with ro.ctb |
| ru-chardefs.cti | ? | |
| ru-compbrl.ctb | ? | |
| ru-letters.dis | ? | |
| ru-litbrl.ctb | ? | |
| ru-ru-g1.utb | ? | |
| ru-ru.dis | ? | |
| ru.ctb | ? | |
| ru.tbl | ? | merge with ru-ru-g1.utb |
| sa-in-g1.utb | ? | |
| sa.tbl | ? | merge with sa-in-g1.utb |
| sd.tbl | ? | merge with si-in-g1.utb |
| se-se.ctb | ? | |
| se-se.dis | ? | |
| Se-Se-g1.utb | ? | |
| si-in-g1.utb | ? | |
| sin.cti | ? | |
| sin.utb | ? | |
| sk-chardefs.cti | ? | |
| sk-g1.ctb | ? | |
| sk-sk-g1.utb | ? | |
| sk-sk.utb | ? | |
| sk-translation.cti | ? | embed in sk-g1.ctb |
| sk.tbl | ? | merge with sk-sk-g1.utb |
| sl-si-comp8.ctb | ? | |
| sl-si-g1.utb | ? | |
| sl.tbl | ? | merge with sl-si-g1.utb |
| sot-za-g1.ctb | ? | |
| spaces.ctb | spaces.tbi | |
| sr-chardefs.cti | ? | |
| sr-g1.ctb | ? | |
| sr.tbl | ? | merge with sr-g1.ctb |
| sv-1989.ctb | ? | |
| sv-1996.ctb | ? | |
| sv.tbl | ? | merge with Se-Se-g1.utb |
| ta-ta-g1.ctb | ? | |
| ta.ctb | ? | |
| ta.tbl | ? | merge with ta.ctb |
| tamil.cti | ? | |
| te-in-g1.utb | ? | |
| te.tbl | ? | merge with te-in-g1.utb |
| telugu.cti | ? | |
| text_nabcc.dis | ? | |
| tr-g1.ctb | ? | |
| tr.ctb | ? | |
| tr.tbl | ? | merge with tr.ctb |
| tsn-za-g1.ctb | ? | |
| UEBC-g1.utb | ? | delete? (replace with en-ueb-g1.utb) (#468) |
| UEBC-g2.ctb | ? | delete? (replace with en-ueb-g2.ctb) (#468) |
| ukchardefs.cti | ? | |
| ukmaths.ctb | ? | move to Liblouisutdml (#321) |
| ukmaths_edit.ctb | ? | move to Liblouisutdml (#321) |
| ukmaths_single_cell_defs.cti | ? | move to Liblouisutdml? |
| ukmaths_unicode_defs.cti | ? | |
| uni-text.dis | ? | |
| unicode-braille.utb | ? | |
| unicode.dis | ? | |
| unicodedefs.cti | ? | |
| ur-pk-g1.utb | ? | |
| ur-pk-g2.ctb | ? | |
| us-table.dis | ? | |
| vi-g1.ctb | ? | |
| vi.ctb | ? | |
| vi.tbl | ? | merge with vi.ctb |
| wiskunde-chardefs.cti | ? | move to Liblouisutdml (#321) |
| wiskunde-translation.cti | ? | move to Liblouisutdml (#321) |
| wiskunde.ctb | ? | |
| zh-chn.ctb | ? | |
| zh-hk.ctb | ? | |
| zh-tw.ctb | ? | |
| zh_CHN.tbl | ? | merge with zh-chn.ctb |
| zh_HK.tbl | ? | merge with zh-hk.ctb |
| zh_TW.tbl | ? | merge with zh-tw.ctb |
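
One possible way to keep applications that depend on the old file names working is a lookup from old name to new name, applied in one go when the renames land. The following is only a sketch in Python: it uses just the rows of the list that already have a proposed new name, and `RENAMES` and `resolve_table_name` are hypothetical names, not part of liblouis.

```python
# Rows from the tentative list that already have a proposed new name.
RENAMES = {
    "de-ch-g0.utb": "de-CH-x-g0.tbl",
    "de-ch-g1.ctb": "de-CH-x-g1.tbl",
    "de-ch-g2.ctb": "de-CH-x-g2.tbl",
    "de-de-comp8.ctb": "de-DE-x-comp8.tbl",
    "de-de-g0.utb": "de-DE-x-g0.tbl",
    "de-de-g1.ctb": "de-DE-x-g1.tbl",
    "de-de-g2.ctb": "de-DE-x-g2.tbl",
    "de.tbl": "de-x-g2.tbl",
    "spaces.ctb": "spaces.tbi",
}


def resolve_table_name(name: str) -> str:
    """Return the proposed new name for an old table name.

    Names without an entry (still '?' in the list) are returned unchanged.
    """
    return RENAMES.get(name, name)
```

A caller that still asks for `de-ch-g0.utb` would then be pointed at `de-CH-x-g0.tbl`, while undecided names pass through untouched.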

rimas-kudelis commented Oct 12, 2017

I would suggest consistently using lowercase for the language tag and uppercase for the country tag (when one is used).

egli commented Oct 12, 2017

Good effort, thanks @bertfrees. The metadata tables should be merged with the "real tables". And the above overview table should probably go into a wiki page.

bertfrees commented Oct 12, 2017

I think I'd like to try a naming scheme close to the RFC 5646 standard. I'm thinking about putting the braille-specific tags like "g0" etc. under the so-called "extension subtags" or "private-use subtags". For example:

de-CH-x-g0.tbl

The -x- denotes the beginning of the braille subtags. For custom extensions you are supposed to use "x", but since the only officially registered extensions are "t" and "u", we could in principle also use e.g. "b".

de-CH-b-g0.tbl
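
To make the `-x-` variant of the scheme concrete, here is a rough Python sketch of a parser for such file names. It is a simplification of RFC 5646, not a full implementation: it assumes a 2-3 letter lowercase language, an optional 2-letter uppercase region (the casing suggested earlier in this thread), and treats everything after `-x-` as braille subtags of 1 to 8 alphanumeric characters each. The names `TABLE_NAME` and `parse_table_name` are made up for this example.

```python
import re

# Simplified pattern: language[-REGION][-x-braille-subtags].extension
TABLE_NAME = re.compile(
    r"^(?P<language>[a-z]{2,3})"
    r"(?:-(?P<region>[A-Z]{2}))?"
    r"(?:-x-(?P<braille>[a-z0-9]{1,8}(?:-[a-z0-9]{1,8})*))?"
    r"\.(?P<ext>[a-z]+)$"
)


def parse_table_name(filename):
    """Split a file name into its parts, or return None if it does not fit."""
    m = TABLE_NAME.match(filename)
    if m is None:
        return None
    braille = m.group("braille")
    return {
        "language": m.group("language"),
        "region": m.group("region"),  # None when no region is given
        "braille": braille.split("-") if braille else [],
        "ext": m.group("ext"),
    }
```

With this pattern `de-CH-x-g0.tbl` and `de-x-g2.tbl` are accepted, while an old-style name such as `de-ch-g0.utb` or `Fr-Fr-g2.ctb` is rejected.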

bertfrees commented Oct 18, 2017

Something to consider is supporting multiple #+locale tags in a table, so that it matches queries for any of those locales. This way we could possibly eliminate some "alias" tables.
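
As an illustration of the idea, the Python sketch below collects every `#+locale` value of a table and matches a query against any of them. This is not how liblouis resolves tables today: the `#+key:value` line syntax is assumed here, and `read_metadata`, `matches_locale` and the example table content are all hypothetical.

```python
def read_metadata(table_text):
    """Collect '#+key:value' lines; repeated keys keep every value."""
    metadata = {}
    for line in table_text.splitlines():
        if not line.startswith("#+"):
            continue  # ordinary comment or translation rule
        key, sep, value = line[2:].partition(":")
        if sep:
            metadata.setdefault(key.strip(), []).append(value.strip())
    return metadata


def matches_locale(table_text, locale):
    """True if any of the table's #+locale values equals the query."""
    return locale in read_metadata(table_text).get("locale", [])


# Hypothetical table content: one table answering queries for both locales.
EXAMPLE_TABLE = """\
#+locale:de
#+locale:de-DE
include de-de-g2.ctb
"""
```

In this sketch a single table answers queries for both `de` and `de-DE`, which is the job an alias table such as de.tbl would otherwise do.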

bertfrees referenced this issue in nlbdev/pipeline Oct 19, 2017

[liblouis] Make more tables for uncontracted braille discoverable
Notably for languages that have a table for contracted braille as
well.

Also make UEB the default for "en". In commit fb78caa I claimed that I
had made en-ueb-g1 discoverable, but that's not true. I only made
en-us-g1 discoverable. That is now fixed.

BueVest commented Nov 2, 2017

What about extensions: the utb/uti/ctb/cti scheme is currently used to varying extents. Do you propose to use tbl for all tables?

bertfrees commented Nov 2, 2017

Yes. tbl for main tables and tbi for tables intended for inclusion only.
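
A rough Python sketch of how the two new extensions could be guessed from the old ones. It assumes that .cti/.uti files are include-only and .ctb/.utb files are main tables, which does not always hold (the list above proposes spaces.ctb become spaces.tbi), so the guess can be overridden. `proposed_extension` is a hypothetical helper, and files such as .dis or .dic are left alone because this issue does not assign them a new extension.

```python
from pathlib import PurePath

MAIN = {".ctb", ".utb", ".tbl"}      # assumed to be main tables
INCLUDE = {".cti", ".uti", ".tbi"}   # assumed to be include-only


def proposed_extension(filename, include_only=None):
    """Guess '.tbl' or '.tbi'; pass include_only to override the guess."""
    old = PurePath(filename).suffix
    if old not in MAIN | INCLUDE:
        return None  # not a translation table extension covered here
    if include_only is None:
        include_only = old in INCLUDE
    return ".tbi" if include_only else ".tbl"
```

For example, `proposed_extension("spaces.ctb", include_only=True)` gives `.tbi`, matching the one include-only rename already in the list.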

bertfrees commented Nov 14, 2017

Here is some more explanation about the .tbl tables, because it hasn't really been explained anywhere else yet.

The current state is that there are two sets of tables: the existing tables, and the tables with extension .tbl. The goal is to end up with only the new set. Both sets should eventually become equivalent, and as soon as the .tbl tables are all named according to the new scheme, the old tables can be removed.

The .tbl tables were initially added to support table lookups based on metadata (from DAISY Pipeline) without touching the existing tables. It was kind of a quick hack, and it hasn't been officially announced. More .tbl tables have been added by others since, but on the other hand metadata has been added to tables in the old set as well, so there is no longer such a clear distinction between the two sets.

A pattern that does remain is that .tbl tables consist of only metadata and include rules, and that every .tbl table has a corresponding version in the old table set. We should maintain this pattern. Existing tables should be renamed as little as possible. .tbl tables may be renamed at will, and these renames should not be mentioned in the NEWS. We won't announce the new table set officially at all until the old set is removed, and when that happens we'll put the file name mapping in the NEWS.

I'm still unsure whether metadata changes should be mentioned in the NEWS.
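
The pattern described above (a .tbl table holds only metadata and include rules) could be checked mechanically. A minimal Python sketch, assuming that metadata and comments are lines starting with `#` and that an include rule is a line whose first word is `include`; `is_thin_table` is a hypothetical helper, not an existing liblouis tool.

```python
def is_thin_table(table_text):
    """True if every line is blank, a comment/metadata line, or an include rule."""
    for line in table_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line, comment, or '#+' metadata
        if stripped.split()[0] != "include":
            return False  # any other rule breaks the pattern
    return True
```

Run over all .tbl files, a check like this would flag tables that have picked up translation rules of their own.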

BueVest commented Nov 14, 2017

bertfrees commented Nov 14, 2017

No, I'm not saying that. The Danish tables, for example, have metadata in the original tables, and that's fine. We don't need to add a .tbl file for each original just for the sake of it.

The file names don't have to correspond. As I said above, the old tables should be renamed as little as possible (to avoid frustrating users) and the new tables should be named according to the new naming scheme. The naming scheme is the topic of this GitHub issue.
