Confusing/offensive use of "gender" in starwars dataset #4456

@dhicks

The starwars sample dataset includes a gender variable, which currently takes four values ("female," "male," "hermaphrodite," and "none") plus NA. There are two problems with this variable and the way it is coded.

The first problem is that the codes refer to characters' biological sex, not their gender. (For background on the sex/gender distinction, see here: https://en.wikipedia.org/wiki/Sex_and_gender_distinction.) For example, Jabba the Hutt is consistently referred to using masculine pronouns (he/him/his; see https://en.wikipedia.org/wiki/Jabba_the_Hutt). But Hutts are biologically hermaphroditic, with both male and female sex organs (https://starwars.fandom.com/wiki/Sexes/Legends; https://starwars.fandom.com/wiki/Hutt/Legends). So Jabba has a masculine gender, but he's not biologically male.

Fixing this problem is simply a matter of renaming the variable to sex, though of course this would break many existing examples that use the dataset.

The second problem is that, outside of specifically biological contexts, "hermaphrodite" is an offensive term for intersex people. Specifically, the term has been used historically to pathologize and medicalize intersex people (see Anne Fausto-Sterling's Sexing the Body or this Daily Beast piece: https://www.thedailybeast.com/dont-call-them-hermaphrodites). It also continues to be used to objectify intersex people as sexual curiosities. For example, many of the top results for a web search of the term are for porn videos; examples can be seen at https://duckduckgo.com/?q=hermaphrodite&atb=v17&ia=web, though obviously the results at that link are NSFW.

While the term is appropriate in the context of describing the sex of Star Wars characters, it's easy for this context to be lost when the dataset is used for examples. The result can be examples that are problematic or even outright offensive. (Here's the Twitter thread that prompted me to write up this issue: https://twitter.com/danieljhicks/status/1144656643906404352. I also want to note that Mara handled this really well!)

I suggest that this second problem can be addressed by switching to the adjectival form. "Hermaphroditic" carries more technical connotations than the noun form of the term; a web search turns up mostly dictionaries, specifically biological discussions, and discussions of intersex people, without any porn: https://duckduckgo.com/?q=hermaphroditic&atb=v17&ia=web. The "none" value, which could be confused with NA, could also be replaced with the more descriptive "asexual," though that value is potentially confusing in its own way, because the term is also used for a sexual orientation.
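To make the two proposed changes concrete, here is a minimal sketch of the rename and the recode. It is written in Python rather than R purely for illustration, and it uses a few made-up rows, not the real dataset: the "Character A/B/C" rows are hypothetical placeholders, and the names `rows`, `RECODE`, and `recode_row` are my own. It is not a proposed patch.

```python
# Hypothetical rows for illustration only; this is NOT the real starwars data.
# Only the Jabba row reflects the coding discussed in this issue.
rows = [
    {"name": "Jabba the Hutt", "gender": "hermaphrodite"},
    {"name": "Character A", "gender": "none"},
    {"name": "Character B", "gender": "female"},
    {"name": "Character C", "gender": None},  # stands in for NA
]

# Proposed value changes: noun form -> adjectival form, "none" -> "asexual".
RECODE = {"hermaphrodite": "hermaphroditic", "none": "asexual"}


def recode_row(row):
    """Return a copy of the row with `gender` renamed to `sex` and recoded."""
    new_row = {key: value for key, value in row.items() if key != "gender"}
    value = row["gender"]
    # Values without a proposed replacement (including missing) pass through.
    new_row["sex"] = RECODE.get(value, value)
    return new_row


recoded = [recode_row(row) for row in rows]
for row in recoded:
    print(row)
```

Missing values are deliberately left missing, so the recoded "asexual" value stays distinct from NA.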

Labels: tidy-dev-day (🤓 Tidyverse Developer Day, rstd.io/tidy-dev-day), wip (work in progress)
