<a href="https://colab.research.google.com/github/IyadSultan/educational/blob/main/Python_for_Pediatric_Oncology.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Python for Pediatric Oncology: Introductory Coding Workshop

# Variables
Variables are like labeled containers for information. In medicine, you might put a patient’s age in one field, weight in another – in Python, you’d store these values in variables. A variable has a name and holds some value (number, text, etc.). You can then use that name to retrieve or change the value. For example, let's create a variable for a patient's age and another for their name:

In [None]:
# Storing patient info in variables
patient_name = "Alice"
patient_age = 7
print("Patient name:", patient_name)
print("Patient age:", patient_age)


Patient name: Alice
Patient age: 7


## Data types
Python has several basic data types:
- Integers (int) for whole numbers (e.g., 7 for age).
- Floats (float) for decimals (e.g., 7.5 for weight in kg).
- Strings (str) for text (e.g., "Alice" for a name or "Stage I" for a disease stage).
- Booleans (bool) for True/False values (e.g., True for has fever, False for no fever).
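
To make the four types concrete, here is a minimal sketch that stores one value of each type and asks Python what type it is. The variable names and values are made up for illustration; type() is Python's built-in way to inspect a value's type.

```python
# Hypothetical example values, one per basic type
patient_age = 7            # int: whole number
patient_weight_kg = 22.5   # float: decimal number
disease_stage = "Stage I"  # str: text
has_fever = True           # bool: True/False

for value in (patient_age, patient_weight_kg, disease_stage, has_fever):
    # type(value).__name__ gives the short name of the type, e.g. "int"
    print(repr(value), "is of type", type(value).__name__)
```

Checking the type is a handy first step when a calculation behaves unexpectedly, for example when a number was accidentally stored as text.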

## Lists

In [None]:
patient_names = ["Alice", "Bob", "Charlie"]
temperatures = [36.6, 38.0, 39.4]  # in °C
print(patient_names)
print("First patient:", patient_names[0])


This creates a list of three names and a list of three temperature readings. patient_names[0] accesses the first element ("Alice") since indexing starts at 0 in Python. Lists are super useful to hold multiple related items (like records in a cohort).
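
A few other list operations come up constantly. The sketch below reuses the patient_names list from above; the added name "Dana" is made up for illustration.

```python
patient_names = ["Alice", "Bob", "Charlie"]

print("Number of patients:", len(patient_names))  # len() counts the items
print("Last patient:", patient_names[-1])         # index -1 means "last item"

# append() adds a new item to the end of the list
patient_names.append("Dana")  # "Dana" is a made-up name for illustration
print(patient_names)
```

Negative indexes count from the end, which is convenient when you do not know how long the list is.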

Analogy: Think of a variable as a labeled jar where you can put a piece of data. You might have a jar labeled "Age" containing the number 7. You can open it and replace it with 8 later. A list is like a pill organizer with multiple slots – each slot can hold a value, and you can access each by its position.

## Exercise 1
Generate a list of 10 random numbers (between 1 and 6) and print the first and last number

Make a variable hospital_name for your hospital and print it.

In [None]:
# prompt: Make a variable hospital_name and give it a value of "KHCC"

import random

# Generate a list of 10 random numbers between 1 and 6
random_numbers = [random.randint(1, 6) for _ in range(10)]

# Print the first and last numbers
print("First number:", random_numbers[0])
print("Last number:", random_numbers[-1])

# Make a variable hospital_name and give it a value of "KHCC"
hospital_name = "KHCC"
print("Hospital Name:", hospital_name)


Create a list ward_numbers with some ward/floor numbers and try printing the second item (index 1).

In [None]:
# prompt: Create a list ward_numbers with some ward/floor numbers and try printing the second item (index 1).

ward_numbers = [101, 202, 303, 404]
print("Second ward/floor number:", ward_numbers[1])


Second ward/floor number: 202


Change one element in your list (e.g., update a ward number) and print the list again to see the change.

In [None]:
# prompt: Change one element in your list by changing the first element to 102 and print the list again to see the change.

ward_numbers[0] = 102
ward_numbers


[102, 202, 303, 404]

# Loops and Conditional Logic

## Loops
In programming, loops let us repeat actions easily, and conditional statements (if/else) let us make choices based on data. These are the bread and butter of automating tasks.

For loops: A for-loop allows you to iterate over a sequence (like each item in a list) and do something with each item. Imagine doing morning rounds for each patient in a list – that's a loop in action (repeating a procedure for each patient). Let's say we have a list of patient temperatures and we want to check each for fever:



In [None]:
temperatures = [36.6, 38.0, 39.4]  # three patients' temperatures
for temp in temperatures:
    print("Current temp:", temp)


Current temp: 36.6
Current temp: 38.0
Current temp: 39.4


## If conditions
Now, just printing isn't that useful, so let's add a decision. We can use an if statement inside the loop to flag which temperatures indicate fever. For this example, we'll treat any temperature above 37.5°C as a fever. We can code that logic:


In [None]:
for temp in temperatures:
    if temp > 37.5:
        print(temp, "-> High fever! 🔥")
    else:
        print(temp, "-> Normal")


36.6 -> Normal
38.0 -> High fever! 🔥
39.4 -> High fever! 🔥


We introduced an if/else:

- If the condition temp > 37.5 is true, we execute the first block (print "High fever! 🔥").

- Otherwise (else), we execute the second block (print "Normal").

The 🔥 emoji is just for fun to highlight fevers.
(Yes, you can include emojis in Python strings!) This small snippet combined a loop and a conditional check: it went through each temperature and made a decision for each. You could imagine extending this, e.g., if there is a fever, alert the user or increment a counter of how many fevers were found.
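
The counter idea mentioned above could look like the sketch below. The name FEVER_THRESHOLD is our own choice; keeping the cutoff in one variable means it only has to be changed in one place.

```python
temperatures = [36.6, 38.0, 39.4]  # same readings as above, in °C
FEVER_THRESHOLD = 37.5             # cutoff used in this workshop

fever_count = 0
for temp in temperatures:
    if temp > FEVER_THRESHOLD:
        fever_count += 1  # add one each time a fever is found

print("Patients with fever:", fever_count, "of", len(temperatures))
```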

*Analogy*: Loops are like doing the same test on every sample in a batch, or like a nurse checking each patient in a ward one by one. If/else statements are like triaging: if a patient is critical, send to ICU, else send them to a normal ward.

## Exercise 2

Add a new temperature to the list (e.g., 37.0) and re-run the loop. Does it correctly label it as normal?

Change the fever threshold to 38.0 in the code (replace 37.5 with 38.0) and run again. Now 38.0°C would be considered normal (because it’s not greater than 38.0). This shows how a simple code change can adjust your logic.

Extra: Create a list of blood pressure readings and write a loop to print which ones are above a certain threshold (you decide the threshold).

In [None]:
# prompt: Create a list of blood pressure readings (100/50, 60/40, 110/60) and write a loop to print which ones have a mean arterial pressure (MAP) below 50.

blood_pressure_readings = ["100/50", "60/40", "110/60"]
threshold = 50  # Mean arterial pressure (MAP) threshold in mmHg

for reading in blood_pressure_readings:
    # Split the "systolic/diastolic" text into two integers
    systolic, diastolic = map(int, reading.split("/"))
    # Estimate MAP as one third systolic plus two thirds diastolic
    map_value = (1/3) * systolic + (2/3) * diastolic
    if map_value < threshold:
        print(f"Blood pressure reading {reading} is below the threshold (MAP: {map_value})")


Blood pressure reading 60/40 is below the threshold (MAP: 46.666666666666664)


# Functions (Reusable Code Blocks)

Functions are a way to package a set of instructions so you can reuse them with different inputs. Think of a function like a recipe or a medical protocol: you define it once, and then you can use it whenever needed, without rewriting all the steps. In Python, we define a function using the def keyword, give it a name, parameters (inputs) in parentheses, and a block of code. For example, let's define a simple function to categorize a Hodgkin lymphoma stage into "Early-stage" vs "Advanced-stage" disease:

In [None]:
def classify_stage(stage):
    if stage <= 2:
        return "Early-stage"
    else:
        return "Advanced-stage"


This function classify_stage takes one input (stage). Inside, it uses an if/else to decide the return value. If stage is 1 or 2, it returns "Early-stage"; if 3 or 4, it returns "Advanced-stage". Notice the indentation – Python uses indentation (spaces) to define blocks of code under the function and under the if/else. Colab will help by auto-indenting after you type a : and press Enter.

In [None]:
print(classify_stage(1))  # Stage I -> expected "Early-stage"
print(classify_stage(4))  # Stage IV -> expected "Advanced-stage"


Early-stage
Advanced-stage


We can call classify_stage for any stage number now, and it will consistently apply the rule we wrote. Functions can have multiple parameters as well. For instance, you might have a function to calculate Body Mass Index:

In [None]:
def calculate_bmi(weight_kg, height_m):
    bmi = weight_kg / (height_m ** 2)
    return bmi

print(calculate_bmi(50, 1.6))  # 50 kg, 1.6 m height


19.531249999999996


This would output a BMI value (around 19.5 for those numbers). The details of BMI aren't central here, but it shows how a function can take inputs and produce an output (using return). The return statement hands back the result to wherever the function was called.
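
Because the function returns its result, we can store it in a variable and format it for display. A minimal sketch, repeating calculate_bmi so the cell runs on its own:

```python
def calculate_bmi(weight_kg, height_m):
    # Same formula as above: weight divided by height squared
    return weight_kg / (height_m ** 2)

bmi = calculate_bmi(50, 1.6)  # store the returned value in a variable
print(f"BMI: {bmi:.1f}")      # :.1f shows the number with one decimal place
```

Formatting only changes how the number is shown; the variable bmi still holds the full-precision value for later calculations.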

*Analogy*: A function is like a diagnostic test – you provide a sample (input), and it gives you a result (output) after processing. Once a test is developed, any patient sample can go through it to get a result, without redesigning the test each time. By organizing code into functions, you make it more modular and readable. If you have a complex calculation you'll do often, define it as a function and then your main code becomes much cleaner (“just call the function”).

### Exercise 3

Define a function greet_doctor(name) that takes a name and returns a greeting like "Hello Dr. <name>, welcome!"

Test your function by calling it with a few different names.

Modify the classify_stage function to handle an edge case: if stage is 0 or not 1-4, have it return "Unknown stage" (hint: use an if before the others to check if stage not in [1,2,3,4]).
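
If you get stuck, here is one possible solution sketch (there are many valid ways to write it). The doctor's name used in the test call is made up.

```python
def greet_doctor(name):
    # Build the greeting text and hand it back to the caller
    return f"Hello Dr. {name}, welcome!"

def classify_stage(stage):
    # Check for invalid input first, before applying the staging rule
    if stage not in [1, 2, 3, 4]:
        return "Unknown stage"
    if stage <= 2:
        return "Early-stage"
    return "Advanced-stage"

print(greet_doctor("Smith"))
print(classify_stage(0))  # not a valid stage
print(classify_stage(2))
print(classify_stage(3))
```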

# Using Libraries (Importing Modules)

One of Python’s superpowers is its rich ecosystem of libraries (also called modules or packages). Libraries are like add-on toolkits that provide extra functions and capabilities. For example, there are libraries for scientific computing, for making plots, for machine learning, and much more. We'll use some libraries later for data analysis (like pandas and matplotlib).

To use a library, you first import it. Python has many built-in libraries (like math, datetime, etc.), and thousands of external ones you can install. In Colab, many common libraries (numpy, pandas, matplotlib, etc.) are pre-installed.

Let's see a quick example with the built-in math library:

In [None]:
import math
value = math.sqrt(16)
print("Square root of 16 is:", value)

In [None]:
from math import sqrt
value = sqrt(16)
print("Square root of 16 is:", value)


Square root of 16 is: 4.0
Square root of 16 is: 4.0


Both cells output Square root of 16 is: 4.0. In the first cell, import math brings in the whole math module, and we call the function as math.sqrt(). In the second, from math import sqrt imports just that one function, so we can call sqrt() directly. The math library has many other functions (try math.log, math.sin, etc.).

We’ll soon use pandas (for data handling) and matplotlib (for plotting). When we do, we’ll write import pandas as pd and import matplotlib.pyplot as plt – the as pd and as plt are just aliases to save typing.
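
The alias pattern works for any module. As a small sketch, here it is with Python's built-in statistics module, so nothing needs to be installed; the alias stats is simply a name we chose.

```python
import statistics as stats  # "stats" is an alias chosen to save typing

temperatures = [36.6, 38.0, 39.4]  # readings from the loops section, in °C

# After the import, the module is reached through its alias
print("Mean temperature:", round(stats.mean(temperatures), 1))
print("Median temperature:", stats.median(temperatures))
```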

*Note*: You only need to import a library once per notebook session, usually at the top of your notebook or code. After that, you can use its functions in any later cell.

Libraries allow you to accomplish complex tasks with just a few lines of code because others have developed and tested these tools for you. This is especially useful in medical data analysis – why write a statistical function from scratch if a trusted library can do it?

### Exercise 4

import random and then use random.choice([list]) to pick a random element from a list of your choice (e.g., random.choice of a list of patient names).

In [None]:
# prompt: import random and then use random.choice([list]) to pick a random element from a list of my patients (John, Samir, Helen)

import random

# Pick one name at random from the list
random_patient = random.choice(["John", "Samir", "Helen"])
random_patient


import datetime and use datetime.datetime.now() to get the current date and time (just to see an example of using a slightly larger library).

In [None]:
# prompt: import datetime and use datetime.datetime.now() to get the current date and time

import datetime

# Get the current date and time
current_datetime = datetime.datetime.now()
print("Current date and time:", current_datetime)


# Data Analysis & Visualization

One of the powerful things about Python is how it can help you make sense of data. In research or clinical practice, you might encounter datasets – for example, a list of patients with various attributes. We’ll simulate that now with a synthetic dataset for Hodgkin lymphoma patients.

*Scenario*: Imagine we have collected data on 20 pediatric Hodgkin lymphoma cases. For privacy and simplicity, we'll use synthetic (fake) data that resembles what you might see in a real registry, but randomly generated. This way we can practice without any real patient information.

Our dataset (let's call it df for DataFrame, a table-like data structure from the pandas library) has the following columns for each patient:
- PatientID – an identifier (1 through 20).
- Age – age of the patient in years.
- Gender – M or F.
- Stage – disease stage (1, 2, 3, or 4).
- B_symptoms – whether "B symptoms" (fever, night sweats, weight loss) are present (Yes/No).
- Outcome – outcome after treatment, e.g., Remission or Relapse.

Let's load this data into a pandas DataFrame and take a peek at the first few entries. (In a real scenario, you might load data from a CSV file with pd.read_csv, but here we'll create it directly for demonstration.)

In [None]:
import pandas as pd

# Create a synthetic dataset
data = {
    "PatientID": list(range(1, 21)),
    "Age": [19, 13, 17, 20, 16, 22, 13, 12, 20, 24, 15, 18, 20, 25, 14, 21, 15, 17, 23, 11],
    "Gender": ["F","M","M","F","M","F","F","M","F","F","M","F","F","M","M","M","F","M","F","F"],
    "Stage": [1,1,1,1,1,1,2,2,2,2,3,3,3,3,3,3,4,4,4,4],
    "B_symptoms": ["Yes","No","Yes","No","No","No","Yes","No","No","No","Yes","Yes","No","Yes","Yes","Yes","Yes","No","Yes","Yes"],
    "Outcome": ["Remission","Remission","Relapse","Relapse","Remission","Remission","Remission","Remission","Remission","Remission",
                "Remission","Remission","Remission","Remission","Remission","Remission","Relapse","Remission","Remission","Remission"]
}
df = pd.DataFrame(data)
df.head()


The above code uses a dictionary to define columns and then creates a DataFrame. The df.head() displays the first five rows.
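
For comparison, here is a minimal sketch of the pd.read_csv route mentioned earlier. To keep it runnable without a file, the CSV content is a tiny made-up string wrapped in io.StringIO; with a real file you would pass its path instead.

```python
import io
import pandas as pd

# A tiny made-up CSV, held in a string so no file is needed
csv_text = """PatientID,Age,Stage
1,19,1
2,13,1
3,17,1
"""

# pd.read_csv accepts a file path or a file-like object such as StringIO
small_df = pd.read_csv(io.StringIO(csv_text))
print(small_df.head())
print("Rows, columns:", small_df.shape)
```

The first line of the CSV becomes the column names, and each following line becomes one row.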

Now, with this dataset in df, we can start analyzing it. Pandas (pd) is a powerful library for data manipulation. You can think of df like an Excel table that we can query with code. Let's do some basic analysis:
- How many patients are there? (len(df) or df.shape gives the number of rows)
- What’s the average age?
- How many patients in each stage?
- What fraction had B symptoms? etc.

- First, verify the number of patients and columns of df

---



In [None]:
# prompt: First, verify the number of patients and columns of df

# Print the number of patients (rows)
print("Number of patients:", len(df))

# Print the number of columns
print("Number of columns:", len(df.columns))


Number of patients: 20
Number of columns: 6


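- Using pandas, we can get the mean, max, and min in one line each. For our synthetic data, for example, df["Age"].mean() gives the average age.

- Let's see how many patients are in each stage (df["Stage"].value_counts() counts them).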

- Create a bar chart to visualize the number of patients in each stage

In [None]:
# prompt: Create a bar chart to visualize the number of patients in each stage, make x axis labels: I, II, III, IV not 1,2,3,4, and make sure the order is correct

import matplotlib.pyplot as plt

# Count the number of patients in each stage
stage_counts = df['Stage'].value_counts().sort_index()

# Create the bar chart
plt.figure(figsize=(8, 6))
plt.bar(stage_counts.index.astype(str), stage_counts.values)
plt.xlabel("Stage")
plt.ylabel("Number of Patients")
plt.title("Number of Patients in Each Stage")
plt.xticks(stage_counts.index.astype(str), ['I', 'II', 'III', 'IV']) # Set x-axis labels
plt.show()
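
The summary statistics and per-stage counts mentioned in the prompts above can be sketched as follows. To keep the cell self-contained, it rebuilds only the Age and Stage columns under a separate name, stats_df (our own name, so the full df is not overwritten); in the notebook you can use df directly.

```python
import pandas as pd

# Age and Stage columns copied from the synthetic dataset above
stats_df = pd.DataFrame({
    "Age": [19, 13, 17, 20, 16, 22, 13, 12, 20, 24,
            15, 18, 20, 25, 14, 21, 15, 17, 23, 11],
    "Stage": [1, 1, 1, 1, 1, 1, 2, 2, 2, 2,
              3, 3, 3, 3, 3, 3, 4, 4, 4, 4],
})

# One line each for the basic statistics
print("Mean age:", stats_df["Age"].mean())
print("Oldest patient:", stats_df["Age"].max())
print("Youngest patient:", stats_df["Age"].min())

# value_counts() counts rows per stage; sort_index() orders them 1 to 4
stage_counts = stats_df["Stage"].value_counts().sort_index()
print(stage_counts)
```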


- Create a histogram showing the age distribution of our patients

- Make a table showing different outcomes of patients

- Make a pie chart showing patients' outcomes
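
One way these three tasks could be approached is sketched below. It rebuilds the Age and Outcome columns under a separate name, plot_df (our own name), so the cell runs on its own; the number of histogram bins and the figure sizes are arbitrary choices.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Age and Outcome columns copied from the synthetic dataset above
plot_df = pd.DataFrame({
    "Age": [19, 13, 17, 20, 16, 22, 13, 12, 20, 24,
            15, 18, 20, 25, 14, 21, 15, 17, 23, 11],
    "Outcome": (["Remission"] * 2 + ["Relapse"] * 2 + ["Remission"] * 12
                + ["Relapse"] + ["Remission"] * 3),
})

# Histogram: how ages are distributed across the cohort
plt.figure(figsize=(8, 6))
plt.hist(plot_df["Age"], bins=5, edgecolor="black")
plt.xlabel("Age (years)")
plt.ylabel("Number of Patients")
plt.title("Age Distribution")
plt.show()

# Table: number of patients with each outcome
outcome_counts = plot_df["Outcome"].value_counts()
print(outcome_counts)

# Pie chart: share of each outcome, labeled with whole-number percentages
plt.figure(figsize=(6, 6))
plt.pie(outcome_counts.values, labels=outcome_counts.index, autopct="%1.0f%%")
plt.title("Patient Outcomes")
plt.show()
```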

### Exercise 5

- Calculate the median age of patients (df["Age"].median()).

- What’s the average age for each Stage? Hint: Use df.groupby("Stage")["Age"].mean() – this groups data by stage and then calculates mean age in each group.

- Create a filter to see data for Stage 3 patients only: df[df["Stage"] == 3]. (You should see only rows where Stage is 3.)

- Modify the plotting code to visualize something else, e.g., replace "Stage" with "B_symptoms" to count how many had B symptoms vs not, or make a bar chart of Outcome counts.

Data analysis in Python becomes even more powerful with real datasets: you could import CSV files of clinical trials, perform statistical analysis (pandas can do basic stats; for advanced analysis you might use libraries like statsmodels or scikit-learn), and make publication-quality plots. Even in this small example, hopefully you see how quickly we got from raw data to meaningful info.

Real-world note: Python is extensively used in biomedical research for tasks like biostatistics, machine learning on patient data, and more. In fact, the pediatric Hodgkin lymphoma study listed in the references used Python (scikit-learn) to train models predicting treatment response. As you grow more comfortable, you could leverage those same libraries to explore predictive modeling on datasets (with guidance from a data scientist). But that’s beyond today’s intro – our goal here is to get you started and show what's possible.

# Building Simple Interactive Applications

So far, we've run code and seen output, which is already exciting. But Python (especially in Jupyter/Colab notebooks) can also create interactive widgets – little UI elements like sliders, dropdowns, buttons – that let you play with parameters without changing the code. This can turn a static analysis into a mini interactive app right inside the notebook.

Why is this useful? Because you can experiment in real-time. For example, adjust a slider to filter patients above a certain age and see the list update instantly, or input some values to get a calculated result (like a risk score). It makes data exploration and tool prototyping much more engaging.

We'll use the ipywidgets library’s interact function, which automatically creates controls for interacting with functions (ipywidgets.readthedocs.io). In Colab, ipywidgets is usually pre-installed, but you might need to enable it by running !pip install ipywidgets if it ever doesn’t work. Let's try a simple interactive example:

## Filtering patients by age with a slider.

In [None]:
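from ipywidgets import interact
from IPython.display import display  # display() is available by default in notebooks; imported here to be explicit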

def filter_by_min_age(min_age):
    # Filter the DataFrame to only include patients with Age >= min_age
    filtered = df[df["Age"] >= min_age]
    print(f"Patients with age >= {min_age}: {len(filtered)}")
    display(filtered)  # display the filtered table

# Create an interactive slider for min_age between 10 and 25
interact(filter_by_min_age, min_age=(10, 25, 1))


When you run this cell in Colab, it will show a slider (labeled “min_age”) ranging from 10 to 25. Because we gave no starting value, the slider starts near the middle of the range (17 here). As you move the slider, the filter_by_min_age function is called with the current slider value, and it prints how many patients meet the criteria and displays those patient rows.
- Try moving the slider to 15, 18, 21, etc., and watch the output update immediately.
- If you set min_age to 18, for example, it will show only patients aged 18 or older. You can quickly see how the cohort changes.


This little interactive tool could help you, say, focus on adolescent vs younger patients. You could similarly create filters for Stage or any other criterion. The interact function made it super easy – we just defined what we want to do (filter by age) and it handled the GUI part.

## A simple risk calculator widget.

Let's make a hypothetical risk assessment tool. Suppose (for the sake of example) we define “high risk” Hodgkin lymphoma patients as those with Stage III-IV or with B symptoms, and others are “standard risk”. We can create an interactive form that takes a patient's Stage and B symptoms status as inputs and outputs the risk category.

In [None]:
def assess_risk(stage, b_symptoms):
    if stage >= 3 or b_symptoms == "Yes":
        print(f"Stage {stage} with B symptoms({b_symptoms}) -> **High Risk**")
    else:
        print(f"Stage {stage} with B symptoms({b_symptoms}) -> Standard Risk")

interact(assess_risk,
         stage=(1,4,1),
         b_symptoms=["No", "Yes"])
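
Since assess_risk prints its result rather than returning it, the rule is hard to reuse or check. As a sketch, the same illustrative rule can be written as a function that returns the label, which lets us test every input combination in a loop. The name risk_category is our own, and the rule remains a teaching example, not a clinical criterion.

```python
def risk_category(stage, b_symptoms):
    """Return the workshop's illustrative risk label (not a clinical rule)."""
    if stage >= 3 or b_symptoms == "Yes":
        return "High Risk"
    return "Standard Risk"

# Check every combination of stage and B symptoms
for stage in [1, 2, 3, 4]:
    for b_symptoms in ["No", "Yes"]:
        print(f"Stage {stage}, B symptoms {b_symptoms} ->",
              risk_category(stage, b_symptoms))
```

A function like this can still be used with interact by printing its return value inside a small wrapper function.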


## Exercise 6

- Change the assess_risk logic: for example, maybe only Stage IV or Stage III with B symptoms is "High Risk" and others are "Standard". Implement that rule and test it.

- Add a third input to assess_risk for something like bulk_disease (Yes/No) and include that in your risk criteria (you'll need to update both the function and the interact call with a new parameter, e.g., bulk=["No","Yes"]).

- Make an interactive plot: use interact to pick a Stage number and plot the age distribution of patients with that stage

In [None]:
# prompt: Make an interactive plot: use interact to pick a Stage number and plot the age distribution of patients with that stage

import matplotlib.pyplot as plt
def plot_age_distribution_by_stage(stage):
    # Filter the DataFrame for the selected stage
    stage_df = df[df["Stage"] == stage]

    # Create the histogram
    plt.figure(figsize=(8, 6))
    plt.hist(stage_df["Age"], bins=5, edgecolor='black')  # Adjust bins as needed
    plt.xlabel("Age")
    plt.ylabel("Number of Patients")
    plt.title(f"Age Distribution for Stage {stage}")
    plt.show()

interact(plot_age_distribution_by_stage, stage=(1, 4, 1));


Interactive widgets are not only fun, but they also demonstrate the concept of building applications. In fact, there are tools like Streamlit or Dash that let you turn Python scripts into web apps for data analysis or clinical calculators with minimal code. What we did here in the notebook is a stepping stone towards that. You could prototype logic in a notebook with interact, and later a developer could help turn it into a full app for wider use.

#  Using the OpenAI API

No modern tech workshop would be complete without mentioning AI! In particular, Large Language Models (LLMs) like OpenAI's GPT-4 have tremendous potential in healthcare.

They can interpret and generate human-like text, which means they could help draft reports, explain medical terms to patients, triage information, and more (always with human oversight, of course). In this section, we'll introduce how to use the OpenAI API in Python. This will allow you to send a prompt to an AI model (like GPT-3.5 or GPT-4) and get a response, all from your code.

Imagine integrating a "clinical assistant" that can answer questions or summarize text as part of your programs – that’s what this enables.

**Important:** To actually run OpenAI API calls, you need an API key (from an OpenAI account) and the openai Python library. In a live workshop, we might just demonstrate this (since not everyone will have API keys ready). Here we'll show the code, and you can try it later if you obtain a key.

Steps to use OpenAI API in Colab:
1. Install the OpenAI library: In a Colab cell, run !pip install openai. (The ! tells Colab to run a shell command to install the package.)
2. Get your API key: Sign up on OpenAI’s website and get a personal API key from the dashboard. (Keep it secret, like a password.)
3. Use the API in code: Import the library, set your API key, and call the model with a prompt.

Let's see an example where we ask the AI a clinical question: how high-risk Hodgkin lymphoma is treated. The instructions argument sets the model's role, and input carries our question.

In [None]:
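import os
from openai import OpenAI

# Read the key from the OPENAI_API_KEY environment variable if it is set;
# otherwise fall back to the placeholder, which you must replace with your own key.
client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY", "YOUR API KEY"),
)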

response = client.responses.create(
    model="gpt-4o-mini",
    instructions="You are a medical assistant.",
    input="how do you treat high risk Hodgkin lymphoma",
)

print(response.output_text)

Medical use-cases: You can see how such capabilities could be useful – from generating easy-to-understand explanations, drafting clinical notes or patient letters, summarizing long research articles, to even brainstorming research ideas. ChatGPT can process vast amounts of medical data (like guidelines, articles) and assist medical professionals in making informed decisions or creating patient-friendly communications.

Of course, any AI output should be verified by a professional (you!). It's a tool to augment your work, not replace human expertise. Ethical note: When using patient data with AI, one must ensure privacy (never share identifiable information with an external API unless it's compliant and allowed). Also, be aware of the model's limitations – it might sometimes produce incorrect or nonsensical answers (so-called "hallucinations"), so use it with caution in critical settings.

That said, it’s a rapidly evolving field. From a coding perspective, integrating AI is just making API calls. We used OpenAI’s service here, but there are also local models and other platforms. The key takeaway is: with a few lines of Python, you can tap into incredibly powerful AI models. This bridges yet another gap – it means as clinicians you could automate some of the drudge work (like summarizing notes) and spend more time on decision-making.

### Exercise 7

Change the prompt to ask something else, e.g., "List 3 possible late side effects of chemotherapy for Hodgkin lymphoma in children." and see what it returns.

Try a creative prompt: "Invent a simple analogy to explain how immunotherapy works to a child."

If you get a chance, experiment with different models by changing the model argument, for example to "gpt-4o" if your account has access. Larger models tend to give more detailed and accurate responses, albeit slower and more expensive. Older model names such as "text-davinci-003" have been retired by OpenAI and will not work with this code.

Remember, you can also feed it chunks of text to summarize. If you had a long pathology report in a string report_text, you could do prompt = "Summarize the following report for me:\n" + report_text.
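
A sketch of that summarization idea is below. The helper names build_summary_prompt and summarize_report are our own, and the report text is made up. The second function reuses the same client.responses.create call shown earlier but is only defined here, not called, so the cell runs without an API key.

```python
def build_summary_prompt(report_text):
    """Combine a fixed instruction with the report to be summarized."""
    return "Summarize the following report for me:\n" + report_text

def summarize_report(client, report_text, model="gpt-4o-mini"):
    """Send the prompt with the same client.responses.create call as above."""
    response = client.responses.create(
        model=model,
        input=build_summary_prompt(report_text),
    )
    return response.output_text

# Made-up, non-identifiable text used only to show the prompt being built
report_text = "Lymph node biopsy. Findings consistent with classical Hodgkin lymphoma."
print(build_summary_prompt(report_text))
```

With a valid key, summarize_report(client, report_text) would return the model's summary as a string. Remember to remove any identifying details before sending real text.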

By now, we’ve covered a lot: from Python basics to data analysis to interactive widgets to AI integration. Take a moment to appreciate how far you've come in just an hour! 🎉

# Conclusion: Empowering Collaboration and Next Steps

In this 1-hour whirlwind tour, we've seen that even basic Python skills can open up a world of possibilities for healthcare professionals:
- You learned how to use variables, loops, and functions – the fundamental constructs that let you automate tasks and organize logic.
- You applied Python to a pediatric oncology dataset, performing data analysis and creating visualizations that turn raw numbers into insights. (Imagine doing this with your real patient data or research data – patterns and insights might emerge that could inform practice or hypotheses for study.)
- You built a mini interactive app in a notebook, demonstrating that you can quickly prototype tools (like a risk calculator or data filter) that react in real-time. This is the same concept behind many clinical decision support tools, just on a smaller scale.
- You got a glimpse of integrating AI through the OpenAI API, showing how advanced models can assist in generating human-like explanations or analyses, potentially speeding up documentation and research.

Throughout, we tried to keep things fun and relevant – using analogies (variables as jars, loops as rounds, functions as reusable protocols) and even emojis to keep the mood light. Learning to code can feel like learning a new language (it literally is!), but by relating it to what you already know in medicine, it becomes much more approachable.

Crucially, this knowledge empowers collaboration. Now that you understand the basics, you can better communicate with data scientists or software developers in your team. Instead of vaguely saying "I have data, I need insights", you can sketch out what you want – maybe even write a small script to illustrate it. You’ve seen how to manipulate data tables and produce graphs; you could do a quick analysis and then ask a statistician to double-check or suggest more sophisticated methods. By speaking a bit of the same language, projects become more efficient and less frustrating. As noted in a recent perspective, clinicians with data science skills can bridge the gap between clinical questions and technical solutions, working closely with statisticians and programmers to drive healthcare innovation (pmc.ncbi.nlm.nih.gov).

Even if you don’t write code daily, just knowing what's possible is powerful. You might come up with ideas like "what if we automate this tedious part of our workflow?" or "could we build a simple app for patients to report symptoms?" – and you'll have a sense of how Python could be part of the solution. You can prototype the idea or collaborate with someone who can. In essence, you're becoming a part of the new generation of clinicians who are fluent in both medicine and technology, sometimes called clinician-data scientists or clinicians who code. This dual literacy will be increasingly valuable as healthcare continues to digitize.

Next steps: If this workshop piqued your interest, here are some paths you can take:
- Practice, practice, practice: Try to reproduce what we did without looking at the answers. Modify things, break things (it's okay!), and see what errors come up – that's how you learn.
- Explore more data: Maybe take a public dataset (Kaggle has many health datasets) and apply the same analysis techniques. Or if you have de-identified data from a past research project, try analyzing it in Python.
- Learn with peers: Form a study group with interested colleagues. Share a Colab notebook and tackle a small project (e.g., analyze last year's patient admissions, or visualize some public health data).
- Advanced topics: When ready, venture into deeper waters – for example, visualization libraries like Seaborn or Plotly for more complex charts, machine learning with scikit-learn or TensorFlow (you could build a simple model to predict outcomes based on patient features), or bioinformatics libraries if relevant to your work.
- Leverage AI thoughtfully: Experiment with tools like GPT (as we did) to aid in writing or summarizing. There are even libraries like langchain that help build more complex AI-driven apps. But always be mindful of privacy and accuracy.
- Connect with the community: There are forums (like Stack Overflow, Reddit’s r/learnpython, etc.) and local meetups for "coding clinicians" or data science in medicine. You'll find you're not alone, and people are often eager to help beginners in tech.

Finally, remember that the goal isn't to turn you into a full-time programmer (unless you discover a hidden passion for it!), but to give you enough literacy to innovate and collaborate. Whether it's improving patient outcomes through data insights, automating a boring admin task, or simply communicating better with your IT team, your Python skills are a tool for empowerment. As you continue, you'll be contributing to a culture where healthcare and technology work hand-in-hand.

Thank you for participating in this workshop! We hope you found it engaging and enlightening. Keep experimenting with code, and don't be afraid – as the saying goes in programming, "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are (by definition) not smart enough to debug it." – which is a humorous way to remind us to keep things simple. You've started simply today, and you can build up from here. Happy coding, and all the best in bridging pediatric oncology with the power of Python!

References:
- Python’s popularity in science and medicine (uit.stanford.edu)
- Example of Python (scikit-learn) used in a pediatric Hodgkin lymphoma study (pubmed.ncbi.nlm.nih.gov)
- Jupyter interact function for easy interactive widgets (ipywidgets.readthedocs.io)
- Potential of AI (ChatGPT) to enhance healthcare decision support (pmc.ncbi.nlm.nih.gov)
- Clinician-data scientists collaborating with technical teams (pmc.ncbi.nlm.nih.gov)