This repository serves as a space to collaborate towards the implementation of better practices to address the class imbalance problem in healthcare datasets.

natalianorori/HealthDataSharingIsCaring

Health Data Sharing Is Caring

Welcome to the repository for the "Health Data Sharing is Caring" session, which was part of Mozilla Festival's 2019 Openness space.

This repository serves as a space to collaborate towards the implementation of better practices to address the lack of representation of vulnerable populations in healthcare datasets. The repository contains the code, session plan, materials and other resources used in the session, as well as notes, learnings and future ideas for the project.

All session materials are safely preserved in Zenodo; you can download them here.

A brief introduction to data imbalances and data bias in healthcare

Open Data has the potential to revolutionize global health, but realizing that potential depends on the data used to train algorithms. If the training data is misrepresentative of the variability present in the population, the resulting models are prone to reinforcing inequality and bias. In healthcare, this could lead to dangerous outcomes, including misdiagnoses, missed diagnoses or poor treatment plans.

Algorithms are needed to advance healthcare, and open data offers the possibility of producing more powerful models, but unfortunately, the medical datasets openly available for use by data scientists and researchers are notoriously biased.
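One way to make the idea of a misrepresentative dataset concrete is to compare each group's share of a dataset with its share of a reference population. The snippet below is a minimal sketch of that check, not a method used in the session: the function name `representation_gap`, the `group` field, and all of the numbers are hypothetical and made up for illustration.

```python
from collections import Counter


def representation_gap(records, group_key, population_shares):
    """Compare each group's share of a dataset with its share of a reference population.

    Returns a dict mapping group -> (dataset_share, population_share, ratio).
    A ratio below 1 means the group is underrepresented in the dataset.
    """
    counts = Counter(record[group_key] for record in records)
    total = sum(counts.values())
    gaps = {}
    for group, population_share in population_shares.items():
        dataset_share = counts.get(group, 0) / total if total else 0.0
        ratio = dataset_share / population_share if population_share else float("nan")
        gaps[group] = (dataset_share, population_share, ratio)
    return gaps


# Entirely made-up toy data: 100 records and assumed population shares.
toy_records = (
    [{"group": "A"}] * 80
    + [{"group": "B"}] * 15
    + [{"group": "C"}] * 5
)
toy_population = {"A": 0.5, "B": 0.3, "C": 0.2}

for group, (ds, pop, ratio) in representation_gap(
    toy_records, "group", toy_population
).items():
    print(f"{group}: dataset share {ds:.2f}, population share {pop:.2f}, ratio {ratio:.2f}")
```

In this toy example, groups B and C appear in the dataset far less often than in the assumed population, which is the kind of gap a model trained on the data could end up reinforcing. A real analysis would also need trustworthy population figures and careful definitions of the groups.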

Repo Goal

The goal of this repository is to serve as a channel to:

  1. Raise awareness of the underrepresentation of minorities in health datasets: the lack of available data about certain communities affects disease control plans and accountability, and therefore global health.

  2. Discuss how balanced healthcare datasets might help us develop more accurate diagnostic tools as well as avoid discrimination and inequality in healthcare.

Why is this important?

Global issues need a global approach. For us to truly experience and take advantage of Open Data and emergent technologies, we need to make sure to take into account the needs of all. Healthcare datasets can help us improve global health, but they also come with great risks if misused. It is important to raise awareness of these risks and find viable ways to prevent them.

Mozfest learnings

During the 90-minute Mozfest session, participants were invited to interact with 8 posters inspired by the red pill/blue pill analogy from The Matrix. In each poster, the blue pill contained information on the advantages that open datasets offer to healthcare, and the red pill contained less discussed but crucial facts that prevent us from seeing the real benefits of technologies applied to healthcare.

The 25 session participants shared their thoughts, ideas and questions on post-its, which were later discussed and turned into session learnings that are now distributed as issues on this repository.

We divided participant contributions into two main groups:

  1. Design thinking questions, heavily inspired by OpenCon's do-a-thon challenges.

  2. Possible solutions, recommendations and routes of action.

You can read about our learnings and submit your questions, challenges and solutions in the issues section of the repo. We invite session participants and anyone else interested to keep the conversation going. The end goal of this project is to share our learnings and findings with fellow researchers, institutions, or anyone interested in working towards making science more open.

You can also access the session participants' thoughts on each poster and add your own here. We highly encourage you to add your ideas; our project would be nothing without participants like yourself!

Want to run your own session?

We'd be happy to help you remix the session! We have uploaded a remixable session plan here. Please feel free to adapt it to your needs and do get in touch if you need any assistance during the process.

If you decide to run your own session, let us know by signing up here.

Want to contribute?

This repository is the product of collaboration between session attendees, participants, and people interested in the project who could not attend the live session.

We are seeking contributors to help us expand on the ideas in GitHub issues. If anything you've read seems interesting to you, we invite you to contribute by commenting on existing issues or creating new ones.

If you contributed, let us know here. If you were in the Mozfest session, write your name here for us to credit you.

If you want to keep up to date with the repo, please consider subscribing to issues you consider interesting.

Email Natalia at natalianorori@gmail.com or tweet at @natalianorori if you have any questions, would like to chat, or want to replicate the session in your community or institution.

"Health Data Sharing is Caring" was co-facilitated by Stefano Vrizzi and Natalia Norori. All the materials in this repository are licensed CC-BY 4.0.

Session participants

Sakthi Anand, Yash Chaubey, Vishal Pandey, Bala Subramaniyan, Gopala Krishna, Anne Clinio, Athina Tzovara, Micah Vandegrift, Damian O.Eke, Stefano Vrizzi, Joscha Jäger, George Ogoh, Natalia Norori
