Published on January 13, 2026. By Marília Prata, mpwolke.

In [1]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import feature_extraction, linear_model, model_selection, preprocessing
import plotly.graph_objs as go
import plotly.offline as py
import plotly.express as px

#Ignore warnings
import warnings
warnings.filterwarnings('ignore')

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

/kaggle/input/med-gemma-impact-challenge/Hackathon dataset.txt


### This is a Hackathon with no provided dataset.

![](https://media.licdn.com/dms/image/v2/D4D12AQGhSiCszCM1pQ/article-cover_image-shrink_720_1280/B4DZjquOqnGsAI-/0/1756284653366?e=2147483647&v=beta&t=w8YC9jZ0Zj9-U91zo3TC91ZLgCK6Xjl_3J3xD2V1QwU)

## Competition Citation: The MedGemma Impact Challenge

@misc{med-gemma-impact-challenge,

    author = {Fereshteh Mahvar and Yun Liu and Daniel Golden and Fayaz Jamil and Sunny Jansen and Can Kirmizi and Rory Pilgrim and David F. Steiner and Andrew Sellergren and Richa Tiwari and Sunny Virmani and Liron Yatziv and Rebecca Hemenway and Yossi Matias and Ronit Levavi Morad and Avinatan Hassidim and Shravya Shetty and María Cruz},
    
    title = {The MedGemma Impact Challenge},
    year = {2026},
    howpublished = {\url{https://kaggle.com/competitions/med-gemma-impact-challenge}},
    note = {Kaggle}
}

### Overview

"Google has released open-weight models specifically designed to help developers more efficiently create novel healthcare and life sciences applications. MedGemma and the rest of HAI-DEF collection.
Whether you’re building apps to streamline workflows, support patient communication, or facilitate diagnostics, your solution should demonstrate how these tools can enhance healthcare."

https://www.kaggle.com/competitions/med-gemma-impact-challenge

## Health AI Developer Foundations

Authors: Atilla P. Kiraly, Sebastien Baur, Kenneth Philbrick, Fereshteh Mahvar, Liron Yatziv, Tiffany Chen, Bram Sterling, Nick George, Fayaz Jamil, Jing Tang, Kai Bailey, Faruk Ahmed, Akshay Goel, Abbi Ward, Lin Yang, Andrew Sellergren, Yossi Matias, Avinatan Hassidim, Shravya Shetty, Daniel Golden, Shekoofeh Azizi, David F. Steiner, Yun Liu, Tim Thelin, Rory Pilgrim, Can Kirmizibayrak

"Robust medical Machine Learning (ML) models have the potential to revolutionize healthcare by accelerating clinical research, improving workflows and outcomes, and producing novel insights or capabilities. Developing such ML models from scratch is cost prohibitive and requires substantial compute, data, and time (e.g., expert labeling). To address these challenges, the authors introduced **Health AI Developer Foundations (HAI-DEF)**, a suite of pre-trained, domain-specific foundation models, tools, and recipes to accelerate building ML for health applications."

"The models cover various modalities and domains, including radiology (X-rays and computed tomography), histopathology, dermatological imaging, and audio. These models provide domain specific embeddings that facilitate AI development with less labeled data, shorter training times, and reduced computational costs compared to traditional approaches."

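To make the idea of building on domain-specific embeddings concrete, here is a minimal sketch of a linear probe trained on precomputed embeddings. Everything in it is an assumption for illustration: the embeddings are synthetic random vectors rather than real HAI-DEF outputs, and the 512-dimension size, the binary labels, and the variable names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical stand-in for precomputed embeddings: 200 examples, 512 dims.
# Real embeddings would come from one of the foundation models.
EMBEDDING_DIM = 512
labels = rng.integers(0, 2, size=200)
embeddings = rng.normal(size=(200, EMBEDDING_DIM))
# Shift the positive class so that the synthetic task is learnable.
embeddings[labels == 1] += 0.5

X_train, X_test, y_train, y_test = train_test_split(
    embeddings, labels, test_size=0.25, random_state=0, stratify=labels
)

# The only model trained on top of the frozen embeddings is a linear classifier.
probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)
accuracy = probe.score(X_test, y_test)
print(f"Held-out accuracy on synthetic data: {accuracy:.2f}")
```

Because only the small classifier is trained, this kind of setup needs far less compute than training an image or audio encoder from scratch, which is the trade-off the quoted passage describes.
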
### MODELS

**CXR Foundation**

"CXR Foundation is a set of 3 models, all using an EfficientNet-L2 image encoder backbone. The three models learned representations of CXRs by leveraging both the image data and the clinically relevant information available in corresponding radiology reports."

**Path Foundation**

"Path Foundation is a Vision Transformer (ViT) encoder for histopathology image patches trained with self-supervised learning."

**Derm Foundation**

"Derm Foundation is a BiT ResNet-101x3 image encoder trained using a two-stage approach on over 16K natural and dermatology images."

**HeAR**

"HeAR is a ViT audio encoder trained using a Masked Autoencoder (MAE) approach. The model learns to reconstruct masked spectrogram patches, capturing rich acoustic representations of health-related sounds like coughs and breathing patterns."

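As a rough illustration of the masking step described above, the sketch below cuts a spectrogram into patches and hides a random subset of them. It is not HeAR's implementation: the spectrogram is random noise, and the shape, patch size, and mask ratio are hypothetical values chosen only so the example runs. The encoder and decoder themselves are left out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration only.
spectrogram = rng.normal(size=(64, 96))   # (frequency bins, time frames)
PATCH = 16                                # square patch side
MASK_RATIO = 0.75                         # fraction of patches hidden

n_freq, n_time = spectrogram.shape
rows, cols = n_freq // PATCH, n_time // PATCH

# Cut the spectrogram into non-overlapping patches, then flatten each one.
patches = (
    spectrogram.reshape(rows, PATCH, cols, PATCH)
    .transpose(0, 2, 1, 3)
    .reshape(rows * cols, PATCH * PATCH)
)

# Randomly choose which patches are hidden from the encoder.
num_masked = int(MASK_RATIO * len(patches))
order = rng.permutation(len(patches))
masked_idx, visible_idx = order[:num_masked], order[num_masked:]

visible_patches = patches[visible_idx]    # what an encoder would see
targets = patches[masked_idx]             # what a decoder would reconstruct
print(visible_patches.shape, targets.shape)
```
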
**CT Foundation**

"CT Foundation provides embeddings suitable for downstream classification tasks. The underlying model is VideoCoCa, a video-text model designed for efficient transfer learning from 2D Contrastive Captioners (CoCa)."

### Limitations

"The models were developed with a focus on **classification tasks**, and **prognosis tasks will need to be further evaluated**. **Image segmentation** and generation tasks are also currently **not supported**. Further, specific requirements such as smaller models (e.g. for on-device applications on a mobile device) or lower latency will need other techniques such as distillation to the target model size of interest."

https://arxiv.org/pdf/2411.15128

https://developers.google.com/health-ai-developer-foundations

https://github.com/Google-Health/google-health/blob/master/health_acoustic_representations/README.md

https://github.com/Google-Health/imaging-research/tree/master/ct-foundation

### I tried to run some of the demos, but I failed.

In [6]:
!pip install lifelines

Collecting lifelines
  Downloading lifelines-0.30.0-py3-none-any.whl.metadata (3.2 kB)
Collecting autograd-gamma>=0.3 (from lifelines)
  Downloading autograd-gamma-0.5.0.tar.gz (4.0 kB)
  Preparing metadata (setup.py) ... done
Collecting formulaic>=0.2.2 (from lifelines)
  Downloading formulaic-1.2.1-py3-none-any.whl.metadata (7.0 kB)
Collecting interface-meta>=1.2.0 (from formulaic>=0.2.2->lifelines)
  Downloading interface_meta-1.3.0-py3-none-any.whl.metadata (6.7 kB)
Downloading lifelines-0.30.0-py3-none-any.whl (349 kB)
Downloading formulaic-1.2.1-py3-none-any.whl (117 kB)
Downloading interface_meta-1.3.0-py3-none-any.whl (14 kB)
Building wheels for collected packages: autograd-gamma
  Building wheel for autograd

In [9]:
!pip install loss

Collecting loss
  Downloading loss-0.1.2.tar.gz (9.6 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: loss
  Building wheel for loss (pyproject.toml) ... done
  Created wheel for loss: filename=loss-0.1.2-py3-none-any.whl size=8823 sha256=8cefbb982b3cf9ad85b853ed0bdd78cd73685d162358a9009587fd303a264012
  Stored in directory: /root/.cache/pip/wheels/63/34/0e/18eacfee607cd72ad43dfd807ea8bea1c41d537bb8c8713342
Successfully built loss
Installing collected packages: loss
Successfully installed loss-0.1.2


In [11]:
!pip install network

Collecting network
  Downloading network-0.1.tar.gz (2.8 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: network
  Building wheel for network (setup.py) ... done
  Created wheel for network: filename=network-0.1-py3-none-any.whl size=3138 sha256=5f8622a7ef40fad338d853e1000115cf117bd5b5d94d57edaedcfce6bb4f39fd
  Stored in directory: /root/.cache/pip/wheels/3a/9a/a4/341d3b109494a43a5cdd444ca83be3a4bfe8c1267ad9f85332
Successfully built network
Installing collected packages: network
Successfully installed network-0.1


In [12]:
import lifelines
import numpy as np
import tensorflow as tf

import loss
import network

NUM_EXAMPLES = 64
SEQUENCE_LENGTH = 2
PATCH_SIZE = 128
NUM_EPOCHS = 32

In [18]:
!pip install cluster_utils

Collecting cluster_utils
  Downloading cluster_utils-3.0.0-py3-none-any.whl.metadata (10 kB)
Collecting al-smart-settings (from cluster_utils)
  Downloading al_smart_settings-1.2.0-py2.py3-none-any.whl.metadata (2.3 kB)
Downloading cluster_utils-3.0.0-py3-none-any.whl (84 kB)
Downloading al_smart_settings-1.2.0-py2.py3-none-any.whl (10 kB)
Installing collected packages: al-smart-settings, cluster_utils
Successfully installed al-smart-settings-1.2.0 cluster_utils-3.0.0


In [20]:
!pip install data utils

Collecting data
  Downloading data-0.4.tar.gz (7.0 kB)
  Preparing metadata (setup.py) ... done
Collecting utils
  Downloading utils-1.0.2.tar.gz (13 kB)
  Preparing metadata (setup.py) ... done
Collecting funcsigs (from data)
  Downloading funcsigs-1.0.2-py2.py3-none-any.whl.metadata (14 kB)
Downloading funcsigs-1.0.2-py2.py3-none-any.whl (17 kB)
Building wheels for collected packages: data, utils
  Building wheel for data (setup.py) ... done
  Created wheel for data: filename=data-0.4-py3-none-any.whl size=7226 sha256=4d80539829f1c592889f6565917369847c6892fe4269673b859f8fa4c1b3a162
  Stored in directory: /root/.cache/pip/wheels/d2/d3/10/d5fe9bc9dcb197ea289baccca92a25f2f95135235a92ca1b11
  Building wheel for utils (setup.py) ... done
  Created wheel for utils: filename=utils-1.0.2-py2.py3-none-any.whl size=13906 sha256=df90d1d74d7ba26f3fd3aa457b4d6c708a90720cda5c1ab5da21c0161d83aa8a
  Stored in directory: /root/.cache/pip/wheels/15/0c/b3

In [35]:
import math
import sklearn

import cluster_utils
#import data_utils  #No module named 'data_utils'

## hear_demo.ipynb (attempt)

"Health Acoustics Representations (HeAR) is a machine learning (ML) model that produces embeddings based on health acoustic data. The embeddings can be used to efficiently build AI models for health acoustic-related tasks (for example, identifying disease status from cough sounds, or measuring lung function using exhalation sounds made during spirometry), requiring less data and less compute than having to fully train a model without the embeddings or the pretrained model."

https://developers.google.com/health-ai-developer-foundations/hear

In [33]:
import concurrent.futures
import random

import google.auth
import google.auth.transport.requests

In [36]:
#https://github.com/Google-Health/google-health/blob/master/health_acoustic_representations/hear_demo.ipynb

#We don't have this json file

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/path/to/your/credentials/json/file'

In [38]:
!pip install api_utils

Collecting api_utils
  Downloading api-utils-2019.9.18-3.tar.gz (1.9 kB)
  Preparing metadata (setup.py) ... done
Discarding https://files.pythonhosted.org/packages/c6/6e/e45e92ad8d7328bf3fbbb1347f8e7fa652e39722075269a99231752eb1c2/api-utils-2019.9.18-3.tar.gz (from https://pypi.org/simple/api-utils/) (requires-python:>=3.6): Requested api_utils from https://files.pythonhosted.org/packages/c6/6e/e45e92ad8d7328bf3fbbb1347f8e7fa652e39722075269a99231752eb1c2/api-utils-2019.9.18-3.tar.gz has inconsistent version: expected '2019.9.18.post3', but metadata has '2019.9.18'
  Downloading api_utils-2019.9.18-3-py3-none-any.whl.metadata (1.1 kB)
Downloading api_utils-2019.9.18-3-py3-none-any.whl (17 kB)
Installing collected packages: api_utils
Successfully installed api_utils-2019.9.18


In [39]:
#https://github.com/Google-Health/google-health/blob/master/health_acoustic_representations/hear_demo.ipynb

# Environment variable `GOOGLE_APPLICATION_CREDENTIALS` must be set for these
# imports to work.
import api_utils

## Online predictions - With raw audio

In [40]:
#https://github.com/Google-Health/google-health/blob/master/health_acoustic_representations/hear_demo.ipynb

raw_audio = np.array([[random.random() for _ in range(32000)] for _ in range(4)])
embeddings = api_utils.make_prediction(
  endpoint_path=api_utils.RAW_AUDIO_ENDPOINT_PATH,
  instances=raw_audio,
)

AttributeError: module 'api_utils' has no attribute 'make_prediction'
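
The missing attribute suggests that the `api_utils` installed from PyPI is not the module the demo expects. That is an assumption drawn from the error, not something confirmed here. A minimal sketch for checking this before running a demo is shown below. `missing_attributes` is a hypothetical helper, and the standard-library `json` module stands in for `api_utils` so the example runs anywhere.

```python
import importlib


def missing_attributes(module_name, required):
    """Return the names in `required` that the imported module lacks."""
    module = importlib.import_module(module_name)
    # __file__ shows where the module was loaded from (site-packages vs. a local file).
    print(f"{module_name} loaded from: {getattr(module, '__file__', 'built-in')}")
    return [name for name in required if not hasattr(module, name)]


# `json` stands in for `api_utils`; `make_prediction` is the name the demo calls.
missing = missing_attributes("json", ["loads", "make_prediction"])
print("Missing:", missing)
```

If the printed path points into `site-packages` and the expected names are missing, the import has most likely picked up an unrelated package that happens to share the name.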

## If you have a lot of queries to run

Example with the raw-audio endpoint (202) using ThreadPoolExecutor.

In [43]:
#https://github.com/Google-Health/google-health/blob/master/health_acoustic_representations/hear_demo.ipynb

# 1000 batches of 4 clips. This is the format expected for the raw audio endpoint
instances = np.random.uniform(size=(1000, 4, 32000))  # update with your data

responses = {}

with concurrent.futures.ThreadPoolExecutor(max_workers=50) as executor:
  futures_to_batch_idx = {
    executor.submit(
        api_utils.make_prediction_with_exponential_backoff,
        api_utils.RAW_AUDIO_ENDPOINT_PATH,
        instance
    ): batch_idx
    for batch_idx, instance in enumerate(instances)
  }

  for future in concurrent.futures.as_completed(futures_to_batch_idx):
    batch_idx = futures_to_batch_idx[future]
    try:
      responses[batch_idx] = future.result()
    except Exception as e:
      print("An error occurred:", e)

AttributeError: module 'api_utils' has no attribute 'make_prediction_with_exponential_backoff'

## Patient communication

For the record, patient communication is one of the most difficult processes in healthcare, especially nowadays, when professionals spend so much time looking at their computers instead of dedicating time to their patients. That is one of patients' major complaints.

Therefore, it would be helpful if professionals improved their communication skills. This would support better diagnosis and prognosis, and more appropriate treatment choices.

Communication improves with practice and experience, and it involves the whole healthcare team. It is also an effort that professionals should engage in, no matter which Machine Learning model they adopt in their decisions.

![](https://www.worksure.org/wp-content/uploads/2024/07/Blog-153.png)

## After 1h 56m of failing to install packages, my only option is to Back Off

module 'api_utils' has no attribute 'make_prediction_with_exponential_**backoff'**

## Acknowledgements

https://github.com/Google-Health/google-health/blob/master/health_acoustic_representations/hear_demo.ipynb