TEI using outmoded ISO 5218 for sex value attribute #426
I have long argued that ISO 5218 was inadequate for recording sex, not only because of the precedence of "male" in the numbering (although I've heard people interpret it the other way around--female is double the value of male--but that's a derailment), but because the values "unknown" and "not applicable" (which presumably can only refer to things like an anonymous blogger and a robot, respectively) are completely inadequate for representing the many other sexes and sexual-identities possible, including intersex, genderqueer, gender-neutral, fluid, trans* and many others. I did ask around various online queer communities if there were other proposed open standards for representing sex more inclusively, but no one could think of any. (The problem being, of course, that any such standard would be inadequate.) I agree very strongly with Melissa that we need a discussion about how to improve this situation, both at the TEI level, where we have a chance to improve the situation in the short term, and at ISO (which will no doubt take a lot longer). Failing any other external standard to adopt, I suggest the datatype of Original comment by: @gabrielbodard |
I have been arguing for this for the past 6 years, since the presentation of P5, and I don't buy the retrospective argument that 2 is assigned to a woman because it is twice as good as a man. The TEI should not be an accomplice to the sexism of ISO. Agreed with Melissa and Gabby. Original comment by: sf_user_epierazzo |
I agree with Gabby's proposed solution to go to data.enumerated for sex/ Original comment by: @martindholmes |
My feeling is that we should stick to the principle that we re-use external standards where at all possible. There isn't an obvious other contender to a categorisation of sex than ISO, so if that's inadequate, lobby ISO, not TEI. Why do people care? If you want to ignore the normalization to ISO, use the body of the element to say whatever is needed. If you want to use another normalization schema, redefine data.sex in your customization as usual. By the way, I don't regard TEI as "them" or "you". It's "we" and "our" standard. Similarly, ISO. Original comment by: @sebastianrahtz |
I don't seriously make the argument that '2' is better than '1' because it is more. When I've said that it is to point out how silly I find it to make the assumption that a numbering system of 1 and 2 somehow implies precedence or order (especially when '9' is also used). IMHO, I don't think it truly 'assigns women to be secondary to men', just like if ISO 5218 was expanded to have, let's say, trans ppl as '3' that they would be considered tertiary and below women. I'm sorry that you find it offensive. I've always felt that it is just a different number. It is simply an agreed machine-processable label -- yes people can get offended by that, but that isn't inherent to the number itself but the interpretations people place on it. I'm not saying those interpretations aren't real or don't have weight, validity or consequences. But that has to be balanced against using some ad hoc linguistically-specific system which is why we moved away from 'm', 'f', 'u', and 'x'. There are many, many, other possible ways that people could record the information about sex and/or gender using TEI should they wish to do so on point of principle. (Using I'm not saying the TEI shouldn't change this, but I would instead be trying to get ISO to redefine the standard in some appropriate way and then TEI would, happily and without argument, implement that. No one has suggested what possible values we should use, so discussion would need to develop a clear proposal. Part of the problem is that any numerical system has the perception of ordering, and alphabetic ones have linguistic culture-specific assumptions that we'd prefer to shy away from if possible.
One possibility suggested to me was to code for chromosome types XY for males and XX for females (which nicely gives a way to deal with sex chromosome abnormalities like Klinefelter syndrome), however, while this may work for biological sex identification it does not deal with gender and/or sexual identification such as the list proposed by Gabby. I honestly do not know what the right values should be. If I was encoding texts where it was felt important to have more than the four categories, as I suggested I would probably use a <taxonomy> with a range of values suitable to the task. [Since Gabby has already done some research in this area, I'm assigning the ticket to him to make sure we don't lose sight of it. Marking it as group 'RED' at the moment because we'd need a clear proposal to discuss and it isn't at all clear what that proposal should be.] Original comment by: @jamescummings |
Original comment by: @jamescummings |
I agree that TEI is not--and shouldn't be--in the business of creating new standards, but we are currently in the business of recommending the use of existing standards, and we should be careful that the standards we recommend are fit for the purpose our users are going to employ them for. It's clear that for several reasons ISO 5218 is not fit. I do have a concrete proposal, in fact: change the datatype of person/ Those of us who care can also petition ISO to improve or remove the inadequate standard (but good luck with that), or work with other communities to come up with a rival standard. It's not TEI's business to do that, though. Original comment by: @gabrielbodard |
Having thought about this a bit more, Gabby's proposal (changing data.sex to data.enumerated) won't work transparently, because data.enumerated is data.name, and data.name is an XML name which cannot begin with a digit, and so ISO 5218 numerical values would become invalid. They could be prefixed with a letter, of course, but such a change would break backwards compatibility. Original comment by: @martindholmes |
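The compatibility problem described in the comment above can be illustrated with a short Python sketch. It assumes a deliberately simplified, ASCII-only version of the XML Name rule (the real production also admits many non-ASCII characters); the pattern and the function name `is_ascii_xml_name` are illustrative, not taken from the TEI schemas.

```python
import re

# Simplified, ASCII-only approximation of the XML Name production:
# the first character must be a letter, underscore, or colon; digits,
# hyphens, and full stops are only allowed after the first character.
XML_NAME_ASCII = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")


def is_ascii_xml_name(value: str) -> bool:
    """Return True if the whole value matches the simplified Name pattern."""
    return XML_NAME_ASCII.fullmatch(value) is not None


# The four ISO 5218 codes as they appear in an attribute value.
ISO_5218_CODES = ["0", "1", "2", "9"]

for code in ISO_5218_CODES:
    print(code, is_ascii_xml_name(code))

# A letter-prefixed form and a plain word both pass the check.
print("f2", is_ascii_xml_name("f2"))
print("female", is_ascii_xml_name("female"))
```

Under this approximation none of the numeric codes is a valid name, which is why prefixing them with a letter would be needed and would break existing documents.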
That's a problem. This isn't the first time that the datatype of data.enumerated has turned out to be a problem (cf discussion of datatype of Why does data.enumerated need to start with an alphabetic character anyway? Original comment by: @gabrielbodard |
The discussion Melissa has pointed to advocates an open-ended approach in which there are some suggested values ("male", "female") but the category is open so that users can express their sexuality or gender in a way that suits them. One option would be to create a new attribute with open data.enumerated values, and suggest some values. This could coexist alongside the Naming this attribute would be problematic. Original comment by: @martindholmes |
It seems decidedly retrograde to go back to an open list of arbitrary tokens made up by each project as it deems fit. Either let's use tokens from a recognized authority or convention (as we do for dates and times, for example), or let's use the "pointer to a classification" system which we espouse elsewhere. So just as we say hand="#hand1", let's say sex="#sex1", where "sex1" is the ID Using arbitrary magic codes is the worst possible solution. (my survey of people, asking them if they find the 1 and 2 thing offensive, so far yields more or less equal numbers of "i have no idea what you're talking about", "oh yes that old chestnut, but there are far more important issues to solve", and "it's an unordered set of arbitrary tokens, what's the issue") Original comment by: @sebastianrahtz |
I was amused to read that Sweden used to/uses a citizen identifier where "The number uses ten digits, YYMMDD-NNGC. The first six give the birth date in YYMMDD format. Digits seven to nine (NNG) are used to make the number unique, where digit nine (G) is odd for men and even for women. " Scotland does something similar. It raises the possibility that one could use any old numbering system, but follow the convention of "odd is sort of like men and even is sort of like women" (leaving 0 or negative numbers for other uses). That would allow one to use 100 for women and 101 for men. A trivial function will return the ISO equivalent for those that want to map to it. http://en.wikipedia.org/wiki/National_identification_number is fascinating reading :-} Original comment by: @sebastianrahtz |
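The "trivial function" mentioned in the comment above might look something like the following Python sketch. The odd/even rule comes from the comment; the function name `to_iso_5218` and the choice to map zero and negative codes to 0 ("not known") are assumptions made only for illustration, since the comment leaves those values "for other uses".

```python
def to_iso_5218(code: int) -> int:
    """Map a hypothetical odd/even project code to an ISO 5218 value.

    Positive odd codes map to 1 (male) and positive even codes to
    2 (female), following the convention described in the comment.
    Assumption: zero and negative codes map to 0 (not known).
    """
    if code <= 0:
        return 0
    return 1 if code % 2 == 1 else 2


# The example values from the comment: 100 for women, 101 for men.
print(to_iso_5218(100))
print(to_iso_5218(101))
```

A project wanting finer distinctions would need its own rule for the reserved codes; this sketch only covers the mapping back to the four ISO values.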
Thanks for your comments on this - and glad to see that some are taking this seriously (when someone says that they are offended at something, it is generally useful to believe that they are offended, rather than telling them that they can't possibly be offended, or that there are better things to be doing.) I'm following this discussion with interest (although markup isn't my forte) - I agree that using an outmoded standard, just because it is a standard, isn't a useful approach. Fwiw, I'd be interested in working with someone on petitioning the ISO about this, if anyone else is willing to join forces. Original comment by: @melissaterras |
ISO standards have a very detailed and carefully designed process, to make sure they don't just hang on for ever. This one was last examined and renewed by due process in 2004. I don't think the right process is to "petition ISO", however. They don't make standards, they merely publish the work done by their working groups, which are composed of representatives of the national standard bodies, i.e. the BSI in our case. So I'd suggest contacting BSI, and finding out when the next examination is due, and which the relevant committee is. On a quick browse of 5218, it is very clear that this isn't a group of people sitting down and making up codes; it is (as is often the case) formalizing existing processes in member countries. My other investigation suggests that the convention odd=male, even=female is probably the origin of it. It may well be, then, a very uphill task indeed to argue for a revision. I cannot see which of the lengthy and detailed replies on the ticket is not taking it seriously, by the way. I would not agree that "using an outmoded standard, just because it is a standard, isn't a useful approach.". I'd argue that it is a great deal better than having no interchange of information at all. It is pretty obvious, isn't it, that the standard of doing our calendar based on the supposed birth of a Jewish prophet in a religion which is a minority worldwide is outmoded - but it's jolly useful! Original comment by: @sebastianrahtz |
I think there are 6 answers here, some backward compatible and some not:
Original comment by: @sebastianrahtz |
I will encourage the rest of tei-council to comment on this (and if it seems reasonable later to draw attention to this ticket on TEI-L). To spell out my proposed solution for those who don't want to use a numerical <person ana="#idOfBiologicalSexualCategory #idOfGenderIdentification">...</person> which would then point to a taxonomy with categories with the appropriate IDs. I include the (debatable) bio vs genderIdentification here simply to highlight that such an approach allows multiple vectors however the encoders feel would be useful to categorise their taxonomies. If we adopted Of his suggestions:
Of all of them 1. is easiest, but may not really solve the problem of offence generated, just apologise for it while shifting the blame to ISO. I realise that isn't very satisfactory, but it at least recognises the problem while causing the minimal side-effects in backwards compatibility for the community. Original comment by: @jamescummings |
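The pointer-to-taxonomy approach in the comment above (a `person` whose `ana` attribute points at category IDs) can be sketched in Python with the standard library. The fragment below is a stripped-down illustration, not a valid TEI document: the category IDs, descriptions, and the helper `resolve_ana` are all hypothetical, and a real file would place the taxonomy inside the TEI header.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical fragment: one taxonomy with two categories, and one
# person whose @ana points at both of them.
SAMPLE = f"""
<TEI xmlns="{TEI_NS}">
  <taxonomy>
    <category xml:id="sexIntersex"><catDesc>intersex</catDesc></category>
    <category xml:id="genderNonBinary"><catDesc>non-binary</catDesc></category>
  </taxonomy>
  <person xml:id="p1" ana="#sexIntersex #genderNonBinary"/>
</TEI>
"""


def resolve_ana(root):
    """Return {person id: [category descriptions]} by following @ana pointers."""
    ns = {"tei": TEI_NS}
    # Index every category description by its xml:id.
    descriptions = {
        cat.get(XML_ID): cat.findtext("tei:catDesc", namespaces=ns)
        for cat in root.iter(f"{{{TEI_NS}}}category")
    }
    result = {}
    for person in root.iter(f"{{{TEI_NS}}}person"):
        # @ana holds whitespace-separated pointers of the form "#id".
        pointers = person.get("ana", "").split()
        result[person.get(XML_ID)] = [
            descriptions.get(pointer.lstrip("#")) for pointer in pointers
        ]
    return result


root = ET.fromstring(SAMPLE.strip())
print(resolve_ana(root))
```

Because each pointer is resolved independently, an encoder can attach as many classification vectors to a person as the project's taxonomy defines.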
I would vote for 1. as well, maybe referring to James' good proposal of using Original comment by: @laurentromary |
I agree with Laurent. Promoting/explaining the use of Original comment by: @sebastianrahtz |
Using None of this addresses the central problem of ISO 5218 and that our use of it may be offensive. I think there is a choice (before Of the two I would prefer a) but with an explanation of possible problems and limitations added in chapters and reference pages, and use of something like Original comment by: @jamescummings |
I think we need a slightly more coherent approach than is being discussed here. I suggest:
For those who wish to continue using ISO 5218 in the meantime, the only difference is that the attribute they are using is mildly disrecommended in favour of Brief additional prose to point out the problems with ISO 5218 would be welcome. I don't think we want to discuss "how to represent non-binary sex" especially, as again that's not our place. How to use taxonomies other than ISO (whatever the reason for your dissatisfaction for it) would be essential, however. Original comment by: @gabrielbodard |
I'd simplify this to
and forget I don't think our deprecation mechanisms are enough to have enough effect. If we are prepared to consider this issue against Birnbaum, rename Original comment by: @sebastianrahtz |
<personGrp> would also need any new attribute (it currently has A new attribute class for Original comment by: @martindholmes |
Martin: This is why I suggested we replace both person/ Why does att.sex sound ridiculous? Sebastian: I still think that renaming to Original comment by: @gabrielbodard |
Gabby: I do like your solution, but it is a bit disruptive compared with Sebastian's. Re att.sex: I was thinking it should be an adjective, and I couldn't think of an acceptable one. Original comment by: @martindholmes |
I am OK with renaming Original comment by: @sebastianrahtz |
Sorry to come to this particular party a bit late. Here are my views:
Original comment by: @lb42 |
Lou:
(5. Side issue: presumably you could have a listPerson broken up into 1 sub-listPerson containing all the men, another containing only women, and a third containing all the intersex athletes at the 2028 Olympics; wouldn't it be useful to be able to attach Original comment by: @gabrielbodard |
Carina Zona has sent me some links to more inclusive standards for recording sex, some of which have a certain amount of real-world use. Most interesting is: > http://transhealth.ucsf.edu/trans?page=lib-data-collection <- the Other links she recommended included: I don't think any of these schemas/standards are ready for use as a replacement for ISO 5218 at this moment, but some may be the basis for a competing standard or a modified ISO proposal at some point. More immediately relevant right now, it might be worth pointing to one or two of these in the Guidelines discussion of what alternatives there are to the binary distinction forced by Original comment by: @gabrielbodard |
The first link does look interesting. It breaks down sex/gender identification into two distinct questions: 1) How does someone self-identify, and 2) what sex was assigned to the person at birth. The options for the latter fit with ISO (although that doesn't address the offence of ordinal precedence). The first question is concerned with how people self-identify when asked, and this is an enumeration with the additional option of supplying a new value. I think this distinction fits with our proposed distinction between Original comment by: @martindholmes |
Original comment by: @gabrielbodard |
At the TEI Council meeting in Brown, 2013-04, we agreed to change the datatype of person/ In the meantime (and in another ticket) Syd is going to suggest changing data.enumerated to data.word so that we can use that here and values such as "0", "1" will remain valid (currently a data.enumerated is data.name which has to begin with an alphabetic character, and would therefore break backward-compatibility). The datatype of data.sex may therefore eventually be changed to data.enumerated or similar, but all values valid against data.word will remain valid. Original comment by: @gabrielbodard |
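To see why a word-style datatype keeps the existing numeric values valid, here is a rough Python sketch. It assumes that a data.word-like token is one or more characters, each of which is a Unicode letter, number, punctuation mark, or symbol; that definition is an approximation for illustration and should be checked against the actual TEI datatype. The function name `is_word_like` is hypothetical.

```python
import unicodedata


def is_word_like(value: str) -> bool:
    """Approximate a data.word-style token.

    Assumption: a valid token is non-empty and every character belongs
    to a Unicode letter (L), number (N), punctuation (P), or symbol (S)
    category, so whitespace and control characters are rejected.
    """
    return bool(value) and all(
        unicodedata.category(ch)[0] in "LNPS" for ch in value
    )


# Existing ISO 5218 codes and word-like alternatives are all accepted.
for token in ["0", "1", "2", "9", "female", "intersex"]:
    print(token, is_word_like(token))

# A value containing a space is rejected under this approximation.
print("not known", is_word_like("not known"))
```

Under this reading both "0" and "1" and project-defined words pass the same check, which matches the backward-compatibility goal stated in the comment.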
Original comment by: @gabrielbodard |
Original comment by: @gabrielbodard |
Done at revision [r11913]. Original comment by: @gabrielbodard |
Great to see some movement on this! thanks for taking it forward - appreciated. Original comment by: @melissaterras |
This issue was originally assigned to SF user: gabrielbodard |
TEI uses ISO 5218:2004 to assign sexuality of persons in a document (with attributes being given as 1 for male, 2 for female, 9 for non-applicable, and 0 for unknown). This is an outmoded and problematic representation of sexuality, and in particular formally assigns women to be secondary to men.
There are other discussions online regarding how best to tackle sexuality in markup, and the problems in using ISO 5218 - see the w3c lists here: http://lists.w3.org/Archives/Public/public-contacts-coord/2010JulSep/0010.html .
I would like to see TEI move away from enshrining women as the second sex in their markup - as Steven Ramsay tweeted:
<author>Simone de Beauvoir</author> <sex value="2">female</sex> sigh
Can a discussion be had about how best to achieve this? Your current approach is both outmoded and offensive.
best,
Melissa
Original comment by: @melissaterras