
<<< Previous | Next >>>

Level of impact II, continued

Ramifications of (re)producing categories

Decisions on the categories and boundaries scholars use shape our:

  • Datasets
  • Catalogues
  • Maps
  • Algorithms

Categories are key to digital tools in many ways: the classification systems used by libraries and archives, the tags used on websites, the methods of categorization informing algorithms, and the spatial divisions on a map. The production and reproduction of these categories shape how things, people, and places are identified and grouped together, and also what is searchable, findable, and foregrounded.

A comic from Postcolonial #DH No. 28 by Adeline Koh: "Wikipedia and the politics of gender categorization." In the image, a bunch of white men stand to the left behind a roped-off area, and a bunch of people of color and women stand to the right. A white male facing the people to the right says to them, "I'm sorry, there just isn't any more space in the main wikipedia 'American Novelist' category. Maybe you could join the 'American Woman Novelist' category?"

Image source: A comic by Adeline Koh from #DHPoco: Postcolonial Digital Humanities, shared here with her permission.

Bias in, bias out: systems of oppression and inputting human bias

Human beings are making decisions that inform how these groupings are being made, and human "beliefs are embedded in the design and concept of technological systems" (Broussard 2018, page 67). Technology is not unbiased, but rather will inevitably represent the decisions of its human creators who each create from their own situated standpoints—personally, socially, and historically.

The United States, for example, is a society shaped by dominant systems of oppression such as white supremacy, settler colonialism, and cis-hetero-patriarchy (see the Glossary for definitions). These oppressive systems affect, sometimes intentionally and sometimes not, the decisions people make when they create digital platforms or tools (or anything, for that matter!) and the parameters of what is permitted, supported, or funded to be created and shared. They thus also shape the output that results from these digital platforms and tools and from their computations.

"Writing a presentation on library cataloging and classification & realizing that there is nothing serendipitous about serendipitous browsing. As with everything else, someone decided where the book you found on the shelf would land. Someone with biases because we all have them" (Jenna Freedman @zinelib on Twitter, shared here with her permission. Also see: Jenna Freedman, "Library Cataloging and Classification: Reifying the Default," 2018)

"Human beings are developing the digital platforms we use, and as I present evidence of the recklessness and lack of regard that is often shown to women and people of color in some of the output of these systems, it will become increasingly difficult for technology companies to separate their systematic and inequitable employment practices, and the far-right ideological bents of some of their employees, from the products they make for the public." (Safiya Umoja Noble, Algorithms of Oppression: How Search Engines Reinforce Racism, 2018, page 2; see also Meredith Broussard, Artificial Unintelligence: How Computers Misunderstand the World, chapter six, 2018)

In the fields of Artificial Intelligence (AI) and Data Science, the phrase "junk in, junk out" (Eric David Halsey, 2017) describes the fact that predictive models take into account the data provided to them by their human creators and then extrapolate to possible futures. Often this data is incomplete, faulty, or messy, so the results are considered "junk" because the data input to the model was "junk."

While decisions made by algorithms are often presented as free of the personal value judgments that a loan officer or judge might apply to loan candidates or in determining the length of a prisoner's sentence, for example, they can still reproduce the bias evident in the data the algorithm is trained on. Many scholars and activists have also critiqued the use of existing data on policing, arrests, and recidivism in algorithms that try to predict future criminal behavior. Because the data being input to the model is based on past policing practices that include the over-policing of communities of color and low-income people, that data is biased against those groups and thus will reproduce the existing bias in its predictions of future activities. For a deeper dive, you can read more in this article (Maurice Chammah, 2016) that shows how predictive policing is not value-free and unbiased.

A graphic with the words 'bias in' next to an arrow pointing into the left side of a box, and then another arrow pointing out of the box on the right side towards the words 'bias out'

Image source: Created by author in MS PowerPoint.
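
The "bias in, bias out" mechanism described above can be sketched with a toy example. This is only an illustration under invented assumptions, not a real predictive policing system: the neighborhood names, the shared offense rate, the detection shares, and both function names are hypothetical. Both neighborhoods are assumed to have exactly the same underlying offense rate, and the only difference is how heavily each was policed in the past.

```python
# Toy illustration with invented numbers (not real data).
# Assumption: both neighborhoods have the SAME underlying offense rate.
TRUE_OFFENSE_RATE = 0.05

# Assumption: share of offenses that end up recorded as arrests,
# reflecting how heavily each neighborhood was policed in the past.
historical_detection = {"neighborhood_a": 0.8, "neighborhood_b": 0.2}
population = {"neighborhood_a": 10_000, "neighborhood_b": 10_000}


def recorded_arrests(detection, population):
    """Arrest records reflect where police looked, not only what happened."""
    return {
        name: round(population[name] * TRUE_OFFENSE_RATE * detection[name])
        for name in population
    }


def predicted_risk(arrests, population):
    """A deliberately naive 'model': predicted risk = recorded arrest rate."""
    return {name: arrests[name] / population[name] for name in population}


arrests = recorded_arrests(historical_detection, population)
risk = predicted_risk(arrests, population)

for name in population:
    print(name, "recorded arrests:", arrests[name], "predicted risk:", risk[name])
```

In this sketch, neighborhood_a ends up with 400 recorded arrests and neighborhood_b with 100, so the naive model rates neighborhood_a as higher risk even though the underlying behavior was assumed to be identical. If those predictions were then used to direct future patrols, the gap in the records would widen rather than correct itself.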

Attempts to "resist the hierarchy"

A question to consider:

Can categorical hierarchies and existing bias be resisted through digital projects? If such resistance is possible, how can it be achieved?

As scholars, we have a responsibility to think critically about how we do or do not reproduce existing biases in the canons we reference, the data we use, and the conclusions we reach. Some projects that have tried to produce new, less-biased representations include:


Let's analyze and discuss a case study.

Check out the Interference Archive (IA) website, read this brief article and discuss:

  • What kinds of materials does IA host, and does it hold the rights to them?
  • In reference to the article, how does IA see itself as “resisting the hierarchy”?
  • What levels of impact does IA aim to take into account?

Discuss with your table, then share as a group.
