Proposal: "language" to be testable #110

Open
f-mb opened this Issue Sep 1, 2013 · 14 comments

f-mb commented Sep 1, 2013

Following #55, which makes "language" a CSL variable.

It would be a good idea to make "language" testable within "choose", like locator or position. By testing the language, we could change title rendering. Also, for legal_case, bill and legislation, testing the language could help determine the citation format (and jurisdiction), especially between common-law and civil-law citation.

As you know, capitalization rules vary from one language to another, even within the same citation style (e.g., McGill French or Lluelles).
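To make the idea concrete, here is a minimal Python sketch of the branching that a language test inside "choose" would perform. It is only an illustration of the logic, not CSL syntax and not how any processor implements it; the function names and the branch labels are hypothetical.

```python
# Hypothetical sketch: none of these names come from CSL or any processor.

def primary_subtag(language):
    """Return the lowercase primary subtag of a tag like 'fr-CA', or None."""
    if not language:
        return None
    return language.split("-")[0].lower()


def choose_branch(item, branches, default):
    """Pick a rendering branch from the item's language, like an if/else-if/else chain."""
    return branches.get(primary_subtag(item.get("language")), default)


# Assumed, made-up branch labels standing in for rendering rules in a style.
TITLE_BRANCHES = {
    "en": "title with English capitalization",
    "fr": "title with French capitalization",
}

french_case = {"type": "legal_case", "language": "fr-CA"}
untagged = {"type": "legal_case"}

print(choose_branch(french_case, TITLE_BRANCHES, "title as entered"))
print(choose_branch(untagged, TITLE_BRANCHES, "title as entered"))
```

Items without a language value fall through to the default branch, which is what keeps such a test backward compatible with existing data.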

I looked into the CSL schema and it doesn't seem that hard (it also seems backward compatible).

Any opinions/objections/ideas? (@fbennett @adam3smith)

PS: I don't know whether I should post this on the xbiblio-devel list, or whether this is the right place.

Owner

adam3smith commented Sep 1, 2013

I don't think anyone disagrees that having language parseable would be nice
(and MLZ does that), but it's pretty demanding because it doesn't just
require a CSL change, but also standardized input from the client, so I
don't see that happening very quickly.
Jurisdiction is better handled separately from language. Ideally both
fields should probably be done with a selection menu, the way MLZ handles
jurisdiction. Since language does exist in most clients, that's tricky
because they need to decide what to do with existing entries.

f-mb commented Sep 1, 2013

A CSL change would be a first step. But why would clients need to change to standardize the field? Complete standardization would be difficult, because one document could be in several languages.
For now, Zotero can collect and parse its language field via the "locale" condition inside a choose element, but this is not valid CSL :-(

I agree with you. Jurisdiction would need to be handled separately, as in MLZ. But that would require a new field. It could be standardized... That's why I thought of language: each jurisdiction can be mapped to a language, except for some international organizations, and their citation format is the French format most of the time.

As for MLZ, I know it has specific fields for lawyers, but it will require a lot of improvement and development around jurisdictions and styles before I can use it and recommend it to other people. For now, I'm about to convert a lot of Québec law librarians and Ph.D. students to Zotero and CSL... That's a first step. ;-) I'll check MLZ this winter.

Owner

adam3smith commented Sep 1, 2013

Zotero can't parse the language field - you can just test whether it's
present or not and you can print its literal content. You can't test for
its content. So softening the specs would hardly help for the types of
issues you describe.
If we're going to change the CSL specs so that the language is actually
parseable - i.e. that CSL can distinguish between a work in English and one
in French and one in German - we need to require the format for that
input, so yes, the field would need to be standardized. And before we do
that we would want to be certain that what we require can actually be done
on the client side. And we'd want to think about things like how to handle
documents in multiple languages, etc.

My main reason for referring to MLZ is that this is where legal support in
CSL is headed. We may end up doing some things differently than Frank in
csl-m, but by and large we'll want to take advantage of the massive amounts
of work/thought he has put into this rather than re-invent the wheel
piecemeal.

Owner

rmzelle commented Sep 1, 2013

(generally the xbiblio mailing list is the best place to discuss things like these, since it's followed by more people than the CSL GitHub issue trackers)

f-mb commented Sep 1, 2013

@adam3smith: OK, I see.
Zotero can parse the language field via the variable "locale", but this is rejected by the CSL validator. I tested it in the test pane. I understand the CSL team doesn't want a language field/variable freely entered by the user.

Nevertheless, even though "language" is now a variable in CSL 1.0.1, the CSL validator refuses the use of language as a variable (variable="language") in a condition or for text. I just tried it with @simonster's validator, and Zotero refuses to import it.

Yeah, Frank has done a very good job with MLZ (!), and with the "MLZ book". We could look into implementing some of the CSL-m features (even if I understand that CSL is now a standard and needs to consider the clients using CSL!).

@rmzelle Do you want me to forward this thread to the mailing list?

Owner

adam3smith commented Sep 1, 2013

> Zotero can parse language field via the variable "locale"

Right, it can indeed. That's code from MLZ; the citation processor is the same.

> Nevertheless, if "language" is now a variable in CSL 1.0.1, the CSL validator refuse the use of language as variable (variable="language") in a condition or for text. I just tried it with @simonster's validator, and Zotero refuses to import it.

Yes, that's on purpose. As a conditional it's not very useful, and as a printed variable we want to discourage it, as the goal is to use language codes (like fr-CA) rather than names in the language field in the medium to long run.

Owner

bdarcus commented Sep 1, 2013

Maybe you can just summarize the topic on the ML in a sentence or two, and
include the link to the issue?

Ultimately, you need to make a judgement call about whether content goes
here, or on the ML. I'd say if there's some detail of decision-making that
someone later really wants to be able to find in one place, put it here.
But if it's more conversational, put it on the ML.

f-mb commented Sep 1, 2013

@adam3smith Thanks. Now I know why/how "locale" is parsed ;-)

I understand the goal of using language codes. Nevertheless, letting the use of "locale" validate (since the processor already supports it) would help style development. The only way would be to filter: if the value is a standard code, it is parsed; if not, it is ignored, because we can never be sure that all software using CSL has a standardized language field.
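The filter described above could look roughly like the following Python sketch. It assumes a deliberately simplified notion of "standard code" (a two- or three-letter primary subtag, optionally followed by hyphen-separated subtags); it checks only the shape of the value and does not validate against any language registry. The function name `parse_language` is hypothetical.

```python
import re

# Simplified shape of a language tag such as "fr" or "fr-CA".
# Assumption: this is a shape check only, not full validation against a registry.
_TAG_SHAPE = re.compile(r"[A-Za-z]{2,3}(-[A-Za-z0-9]{1,8})*")


def parse_language(value):
    """Return a normalized tag if the value looks like a language code, else None."""
    if not isinstance(value, str):
        return None
    candidate = value.strip()
    if not _TAG_SHAPE.fullmatch(candidate):
        return None  # free text such as "French" is ignored, not guessed at
    primary, _, rest = candidate.partition("-")
    return primary.lower() + ("-" + rest if rest else "")


for raw in ["fr-CA", "FR", "French", "", None]:
    print(repr(raw), "->", parse_language(raw))
```

With a filter like this, a free-text entry simply behaves as if no language were set, so styles that test the language would still render something sensible for unstandardized data.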

I guess any style using the variable "locale" will be rejected by the official CSL repository...

@bdarcus I will. Thanks.

Owner

adam3smith commented Sep 1, 2013

citeproc as used in Zotero and Mendeley isn't the only processor - there are at least two other major processors (the closed-source one in Papers and citeproc-hs) that we have to look out for. Plus, if we now allow a syntax that we later want to change or disallow, that creates its own set of problems.
As for how this should work - my preferred version would be to require the clients to handle that entirely and at their discretion, and to have the processors only accept standard codes, which we can then use in citation styles.

I'm sorry, I know your quick success in getting the field mappings into Zotero was encouraging, but it was also highly atypical. Getting something like this thought through and implemented takes time. If you want to have these features quickly you won't get around MLZ.

f-mb commented Sep 1, 2013

OK thanks. I understand.

Anyway, thank you for your quick answers!

Owner

rmzelle commented Sep 5, 2013

Sorry to chime in a bit late.

Just to summarize: the CSL 1.0.1 specification mentions that title casing should not be used for non-English items (see http://citationstyles.org/downloads/specification.html#non-english-items ). Whether an item is deemed English or not depends on the value of the optional "language" field of the item metadata (which needs to be an IETF language tag as defined in RFC 1766). But the "language" field does not map to a CSL variable (partly because we figured that nobody wants to print a language tag in a citation).
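As a rough illustration of that rule, the Python sketch below decides whether title casing should apply to an item. It assumes, based on one reading of the linked section of the specification, that the style's default-locale sets the fallback and that an item's own language tag overrides it when present; the function names are hypothetical and this is not code from any CSL processor.

```python
def primary_subtag(tag):
    """Lowercase primary subtag of an IETF-style tag, e.g. 'en' for 'en-GB'."""
    return tag.split("-")[0].lower()


def is_english_item(item_language, default_locale=None):
    """Decide whether title casing should apply to an item.

    Assumption: an item with a language tag is English only if its primary
    subtag is 'en'; an item without one follows the style's default locale,
    and a style without a default locale is treated as English.
    """
    if item_language:
        return primary_subtag(item_language) == "en"
    return default_locale is None or primary_subtag(default_locale) == "en"


print(is_english_item(None))              # no tag, no default locale
print(is_english_item("fr-CA"))           # tagged as French
print(is_english_item("en-GB", "fr-FR"))  # English item in a French style
print(is_english_item(None, "de-DE"))     # untagged item in a German style
```

Note that this is the only language-dependent decision described here, and it happens inside the processor; a style author cannot branch on it.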

The current support for modifying citations based on the individual item's language is clearly rather limited, and I agree we should improve on it in future versions of CSL. The discussion actually goes back to 2010, when Frank first proposed to add a conditional test to cs:choose for "language": http://xbiblio-devel.2463403.n2.nabble.com/Proposal-test-condition-for-quot-language-quot-td5775633.html . In a private email exchange around that time, Frank and I concluded that relying on cs:choose had an important limitation: Frank mentioned that the delimiter used to separate cites (e.g. "Doe 2000" and "Smith 1999") within a citation ("(Doe 2000, Smith 1999)") can be dependent on the language of the cited items. In CSL this delimiter is set on cs:layout, which is outside the scope of any cs:choose element. Frank solved this in Multilingual Zotero by introducing multiple cs:layout elements in his fork of CSL. See http://citationstylist.org/docs/citeproc-js-csl.html#multilingual-layouts
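The scoping problem can be sketched in Python: the delimiter joins whole cites, so it has to be chosen at the level of the layout rather than inside the per-item rendering. The sketch below is hypothetical throughout. The layout table, its values, and the rule of selecting the layout from the first cite's language are assumptions made for illustration, not a description of how Multilingual Zotero or any processor actually chooses.

```python
# Hypothetical layouts keyed by primary language subtag; None is the fallback.
# The prefix, suffix and delimiter values are invented for illustration.
LAYOUTS = {
    "fr": {"prefix": "(", "suffix": ")", "delimiter": " ; "},
    None: {"prefix": "(", "suffix": ")", "delimiter": ", "},
}


def render_citation(cites, layouts):
    """Join already-rendered cites using a layout chosen by language.

    cites: non-empty list of (rendered_text, language_tag_or_None) pairs.
    Assumption: the first cite's language selects the layout.
    """
    first_language = cites[0][1]
    key = first_language.split("-")[0].lower() if first_language else None
    layout = layouts.get(key, layouts[None])
    body = layout["delimiter"].join(text for text, _ in cites)
    return layout["prefix"] + body + layout["suffix"]


print(render_citation([("Doe 2000", "en"), ("Smith 1999", "en")], LAYOUTS))
print(render_citation([("Doe 2000", "fr-CA"), ("Smith 1999", "fr")], LAYOUTS))
```

A per-item conditional could change how "Doe 2000" itself is rendered, but it never sees the join between cites, which is why a conditional alone cannot cover this case.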

@ajlyon later tried to revive the discussion at http://xbiblio-devel.2463403.n2.nabble.com/Bringing-back-up-language-td6290371.html (without much success, since the thread quickly went off-topic). In that thread, I mention the reason why I haven't been pushing for incorporation of Frank's solution in official CSL: by solely relying on cs:locale elements, there is no straightforward way to specify item-language-specific settings for options that only exist on cs:style, cs:citation, and cs:bibliography. I'm not sure this is a showstopper, but it is a clear limitation that should be given sufficient thought. There may be other solutions to this problem that don't suffer from this limitation (I haven't yet come up with any good alternatives, though).

Member

fbennett commented Sep 5, 2013

Thanks for this excellent summary of the background Rintze (@rmzelle) -- and my apologies for being so late to respond in this thread.

As it happens, I have just recently started working on the first style for MLZ that will support Japanese and English references. The base style I'm working from was prepared by a local librarian, and there is a circle of interested users, so the end product should receive a thorough review. The work has turned up a few small processor bugs related to multilingual processing already. When the dust settles on the work, we should have a better idea of how well the solutions I'm trying out in MLZ will work in the wild.

Multilingual is a big step, and we do need to tread carefully when contemplating extensions to official CSL. But it's moving forward by degrees.

f-mb closed this Jul 10, 2016

rmzelle reopened this Jul 11, 2016

Owner

rmzelle commented Jul 11, 2016

(I think we'd want to keep this issue open)

f-mb commented Jul 11, 2016

OK ;)
