# Meet Tekin: A Public Health Analyst on a Python Journey

Tekin works in a national public health lab. His work often involves tracking pathogens and managing data, but he’s tired of copying and pasting values into Excel sheets. One day, he hears about how Python can automate tasks, make data cleaning easier, and even help visualize complex trends from environmental samples. Curious, he decides to start learning Python.

In this notebook, you’ll follow Tekin as he begins his journey with Python. He’ll learn about variables, basic data types, and how to run simple commands. Each step connects the code to his real-life work, from storing the number of positive cases in a variable to seeing how `print()` shows him output immediately.

Let’s begin Tekin’s journey into the world of Python!

### Welcome to Python – Tekin’s First Step

Tekin sat at his desk, staring at a folder full of messy CSV files from recent wastewater surveillance. He thought, *“There must be a better way to handle all this data.”* That’s when he decided to learn Python — a programming language known for its simplicity and power.

Python is not just for software developers. It's used by public health experts, bioinformaticians, and data scientists around the world. Its clean and readable syntax makes it a perfect choice for someone like Tekin, who wants to spend less time on repetitive tasks and more time on interpreting meaningful trends.

---

### Why Tekin Chose Python

Here’s what convinced Tekin to give Python a try:

- **Easy to Read & Write** → Python’s syntax feels almost like English, which helped Tekin feel more confident as a beginner.
- **Versatile** → From automating tasks to analyzing pathogen data, Python is used in many parts of public health work.
- **Large Community & Libraries** → Tekin quickly found tutorials, examples, and tools specific to bioinformatics.
- **Cross-Platform** → Whether at the office or on his Linux-based laptop at home, Python runs everywhere.
- **Fast Prototyping** → He could quickly test out new ideas with just a few lines of code.

By the end of this course, Tekin aims to build a solid foundation in Python and apply it directly to real-world public health challenges, especially in microbiological surveillance and outbreak analysis.

---

### Let’s Get Started!

Tekin opened his first notebook and typed:

```python
print("Hello, World!")
```


In [2]:
# To run a cell, press Shift+Enter or click Run
print("Hello, World")

Hello, World


In [None]:
# TRY IT OUT BY YOURSELF AREA

# Python Data Types – Tekin Starts Storing Information

After printing his first "Hello, World!", Tekin felt encouraged.  
Now he wondered: *“How can I make Python remember information like a name or a lab result?”*  
That’s when he learned about **variables** – containers that store different types of data.

Let’s explore the basic data types Tekin started using:

---

## Strings (`str`)

Tekin’s first task was to store sample IDs, names, and messages.  
For this, he used **strings**, which are sequences of characters wrapped in quotes.  
Python allows single (`'`), double (`"`), or triple quotes (`'''` or `"""`). Triple-quoted strings can span several lines.

```python
# String examples
name = "Tekin"
greeting = 'Merhaba, Hola, Salut'
message = """This is a
multi-line string."""

print(name)
print(greeting)
print(message)
```


In [7]:
# Let's try it out

name = "Gültekin"

print(name)

Gültekin


In [None]:
# TRY IT OUT BY YOURSELF AREA
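
Because a string is a sequence of characters, Python can measure it and transform it. The sketch below shows the built-in `len()` function and two string methods, `.upper()` and `.count()`. The sample ID is a made-up value used only for illustration.

```python
# Hypothetical sample ID, used only for illustration
sample_id = "wwtp_zone3_ak2024"

print(len(sample_id))         # number of characters in the string
print(sample_id.upper())      # a new string with every letter in upper case
print(sample_id.count("_"))   # how many times "_" appears
print(sample_id)              # the original string is unchanged
```

String methods return a new value and leave the original string as it was, which is why the last line still prints the lower-case ID.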

## Floats (`float`)

Next, Tekin needed to store measurements — like temperature readings from wastewater or concentrations of specific markers.

For this, he used **floats**, which are numbers with decimal points.  
Unlike whole numbers, they can represent fractional values such as `36.5`.

```python
# Float examples
price = 19.99
temperature = 36.5

print(price)
print(temperature)
```


### Let’s Try It Out – Tekin Buys Fruit 

To practice working with floats, Tekin decided to model something simple: fruit prices from the market near his lab.

```python
# Price of an apple
price_of_an_apple = 20.99

# Price of an orange
price_of_an_orange = 10.99

print(price_of_an_apple)
print(price_of_an_orange)
```


In [None]:
# TRY IT OUT BY YOURSELF AREA
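
One thing worth knowing about floats: they are stored in binary, so a sum can sometimes show more decimal digits than expected. A minimal sketch, reusing the fruit prices from above and the built-in `round()` function to tidy the result:

```python
price_of_an_apple = 20.99
price_of_an_orange = 10.99

total = price_of_an_apple + price_of_an_orange
print(total)            # may show extra digits because of binary storage

rounded_total = round(total, 2)
print(rounded_total)    # rounded to two decimal places
```

The same idea applies to lab measurements: rounding is usually done once, when the value is reported, rather than at every step of a calculation.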

## Booleans (`bool`)

One day, Tekin wanted to track whether a wastewater sample exceeded a critical threshold for viral load.  
He learned that **booleans** are perfect for representing simple *yes/no* or *true/false* conditions.

A boolean can only have two values: `True` or `False`.

```python
# Boolean examples
is_threshold_exceeded = True
is_sample_contaminated = False

print(is_threshold_exceeded)
print(is_sample_contaminated)
```


In [16]:
is_cape_town_fun = True
is_cape_town_cold = False

# Let's print it out 

print(is_cape_town_fun)
print(is_cape_town_cold)

True
False


In [None]:
# TRY IT OUT BY YOURSELF AREA

## Summary of the Data Types – Tekin Organizes His Sample Info

After experimenting with different variable types, Tekin summarized what he had learned — this time, thinking in terms of wastewater surveillance.

- **Strings**: For storing sample IDs, location names, and lab notes  
  Example: `"Sample_AK2024"`, `'WWTP_Zone3'`
- **Integers**: For counting things like the number of positive detections  
  Example: `10`, `-5`, `2024`
- **Floats**: For recording measurements like viral concentrations  
  Example: `3.14`, `99.99`
- **Booleans**: For flagging conditions like threshold exceedance  
  Example: `True`, `False`
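
One way to check which type a value has is the built-in `type()` function. The sketch below applies it to one value of each type from the list above; the variable names are chosen only for illustration.

```python
sample_id = "Sample_AK2024"    # a string
positive_detections = 10       # an integer
viral_concentration = 99.99    # a float
threshold_exceeded = True      # a boolean

print(type(sample_id))            # <class 'str'>
print(type(positive_detections))  # <class 'int'>
print(type(viral_concentration))  # <class 'float'>
print(type(threshold_exceeded))   # <class 'bool'>
```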

---

In Python, **variables** are used to store data. Tekin learned that he didn’t need to declare the type — Python figures it out automatically!

He also discovered **f-strings**, which made it easy to include variables in text outputs. For example:

```python
sample_id = "WWTP_23"
viral_load = 245.7
print(f"Sample {sample_id} has a viral load of {viral_load} copies/mL.")
```


### Let’s Try Everything We’ve Learned – Tekin Summarizes a Sample

Before moving on, Tekin decided to review everything he had learned so far by creating a mock summary of a wastewater sample.

He used different variable types to represent various attributes of the sample:

```python
# Let's have a string first – the sample location
location = "Zone A - WWTP"

# Then let's add an integer – sample number
sample_number = 102

# Moving on to a float – viral concentration in copies/mL
viral_concentration = 367.5

# And finish it with a boolean – does it exceed threshold?
exceeds_threshold = True

# We can print all the values as below
print(f"Sample from {location}, ID: {sample_number}")
print(f"Viral concentration: {viral_concentration} copies/mL")
print(f"Exceeds safety threshold? {exceeds_threshold}")
```



In [None]:
# TRY IT OUT BY YOURSELF AREA
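
f-strings can also control how a number is displayed. After the variable name, a colon introduces a format specifier. The sketch below shows three common ones; the values are hypothetical.

```python
viral_concentration = 367.4567   # hypothetical measurement in copies/mL
total_reads = 50000              # hypothetical read count

two_decimals = f"{viral_concentration:.2f} copies/mL"
no_decimals = f"{viral_concentration:.0f} copies/mL"
with_separator = f"{total_reads:,} reads"

print(two_decimals)     # 367.46 copies/mL
print(no_decimals)      # 367 copies/mL
print(with_separator)   # 50,000 reads
```

The format specifier only changes the text that is displayed; the variable itself keeps its full value.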

## Basic Operations in Python – Tekin Calculates Lab Results

As Tekin progressed, he began using Python not just to store data — but to calculate with it.  
He discovered that Python supports various **operations** on numbers, which helped him process lab values efficiently.

Here are the basic **arithmetic operations** Tekin practiced, using sample-related scenarios:

```python
# Addition – combining two sample volumes
a = 10 + 5
print(a)  # Output: 15

# Subtraction – difference in sample counts between two days
b = 10 - 3
print(b)  # Output: 7

# Multiplication – estimating total viral load (copies/mL × volume)
c = 4 * 3
print(c)  # Output: 12

# Division – splitting a total volume into equal parts
d = 10 / 2
print(d)  # Output: 5.0

# Floor Division – calculating how many full containers fit
e = 10 // 3
print(e)  # Output: 3

# Modulus – finding leftover volume
f = 10 % 3
print(f)  # Output: 1

# Exponentiation – simulating exponential growth in viral replication
g = 2 ** 3
print(g)  # Output: 8
```


In [None]:
# TRY IT OUT BY YOURSELF AREA

In [3]:
# Let's have examples!

# Using variables in arithmetic operations
x = 20
y = 7

# Addition
sum_result = x + y
print(sum_result)  # Output: 27

# Subtraction
difference = x - y
print(difference)  # Output: 13

# Multiplication
product = x * y
print(product)  # Output: 140

# Division
quotient = x / y
print(quotient)  # Output: 2.857142857142857

# Floor Division
floor_div = x // y
print(floor_div)  # Output: 2

# Modulus (remainder)
modulus = x % y
print(modulus)  # Output: 6

# Exponentiation
power = x ** 2
print(power)  # Output: 400

# Complex arithmetic expression
result = (x + y) * 2 - (x // y)
print(result)  # Output: 52


# Concatenation
greeting = "Hello" + " " + "World"
print(greeting)  # Output: Hello World

# Repetition
repeat_text = "Cape " * 3
print(repeat_text)  # Output: Cape Cape Cape

# Checking substrings
sentence = "Cape Town is amazing"
print("Cape Town" in sentence)  # True
print("cold" in sentence)    # False


27
13
140
2.857142857142857
2
6
400
52
Hello World
Cape Cape Cape 
True
False


In [None]:
# TRY IT OUT BY YOURSELF AREA
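
Floor division and modulus are easiest to understand as a pair: one gives the number of full groups, the other gives what is left over. A small sketch with made-up volumes:

```python
total_volume_ml = 250     # hypothetical total sample volume
container_size_ml = 40    # hypothetical container capacity

full_containers = total_volume_ml // container_size_ml
leftover_ml = total_volume_ml % container_size_ml

print(full_containers)  # Output: 6
print(leftover_ml)      # Output: 10

# Putting the two results back together gives the original volume
print(full_containers * container_size_ml + leftover_ml)  # Output: 250
```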

## Comparison Operations – Tekin Compares Sample Results

Tekin often needed to compare results from different wastewater treatment plants or between different time points.  
Python’s **comparison operators** helped him quickly evaluate conditions and make decisions.

Comparison operators return a **boolean** value: `True` or `False`.

Here’s how Tekin used them:

```python
# Viral loads from two different days
viral_load_day1 = 10
viral_load_day2 = 5

print(viral_load_day1 > viral_load_day2)    # True – day 1's load is higher
print(viral_load_day1 < viral_load_day2)    # False – day 1 is not lower than day 2
print(viral_load_day1 == viral_load_day2)   # False – values are not equal
print(viral_load_day1 != viral_load_day2)   # True – values differ
print(viral_load_day1 >= viral_load_day2)   # True – day 1 is greater or equal
print(viral_load_day1 <= viral_load_day2)   # False – not less or equal
```


In [None]:
# Let's have examples

word1 = "apple"
word2 = "banana"

print(word1 == word2)   # False (Different words)
print(word1 != word2)   # True  (Not the same)
print(word1 < word2)    # True  ("apple" comes before "banana" alphabetically)
print(word2 > word1)    # True  ("banana" comes after "apple" alphabetically)


In [None]:
# TRY IT OUT BY YOURSELF AREA
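
A caveat on comparing strings: Python compares them character by character using each character's Unicode code point, so the order is only "alphabetical" when the letter case matches. Every upper-case letter from A to Z sorts before every lower-case letter. A short sketch, using `.lower()` as one possible way to compare without regard to case:

```python
uppercase_first = "Zebra" < "apple"
mixed_case = "apple" < "Banana"
same_case = "apple".lower() < "Banana".lower()

print(uppercase_first)  # True  ("Z" sorts before "a")
print(mixed_case)       # False ("a" sorts after "B")
print(same_case)        # True  (both lower case: "apple" before "banana")
```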

## Logical Operations – Tekin Combines Conditions

As Tekin’s scripts got more advanced, he started checking **multiple conditions** at once.  
For example, he wanted to flag samples that exceeded the viral threshold **and** came from high-risk zones.

Python’s **logical operators** allowed him to combine such conditions:

- `and`: True if both conditions are True  
- `or`: True if at least one condition is True  
- `not`: Reverses the boolean value

```python
# Example conditions for a wastewater sample
exceeds_threshold = True
is_from_high_risk_zone = False

print(exceeds_threshold and is_from_high_risk_zone)  # Output: False
print(exceeds_threshold or is_from_high_risk_zone)   # Output: True
print(not exceeds_threshold)                         # Output: False
```


In [None]:
# Checking if a DNA sequence contains both 'A' and 'T' bases
dna_sequence = "ATCGGCTA"

contains_A = "A" in dna_sequence
contains_T = "T" in dna_sequence

print(contains_A and contains_T)  # True (both bases are present)
print(contains_A or contains_T)   # True (at least one base is present)
print(not contains_A)             # False (sequence contains 'A')


In [None]:
# TRY IT OUT BY YOURSELF AREA

In [None]:
# Evaluating sequencing quality using logical operations
quality_score = 30
coverage = 80

is_high_quality = quality_score > 25
is_high_coverage = coverage > 50

print(is_high_quality and is_high_coverage)  # True (both conditions met)
print(is_high_quality or is_high_coverage)   # True (at least one condition met)
print(not is_high_quality)                   # False (quality score is high)


In [None]:
# TRY IT OUT BY YOURSELF AREA
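
In practice, the conditions are rarely typed in as `True` or `False`; they usually come from comparisons. The sketch below builds the flag described at the start of this section from measured values. The threshold and the zone names are assumptions made up for this example.

```python
viral_load = 367.5           # copies/mL, hypothetical measurement
threshold = 300.0            # hypothetical safety threshold
zone = "Zone A"              # where the sample was taken
high_risk_zone = "Zone A"    # hypothetical high-risk zone

exceeds_threshold = viral_load > threshold
is_from_high_risk_zone = zone == high_risk_zone

needs_follow_up = exceeds_threshold and is_from_high_risk_zone
print(needs_follow_up)  # Output: True
```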

## String Operations – Tekin Labels and Logs Samples

Tekin often had to generate **sample labels** and log messages while working with wastewater data.  
Python’s string operations made this easy — especially **concatenation** and **repetition**.

Here’s how Tekin used them:

```python
# String Concatenation – combining sample info
zone = "Zone A"
sample_id = "Sample_105"
label = zone + " - " + sample_id
print(label)  # Output: Zone A - Sample_105

# String Repetition – repeating alert messages
alert = "Threshold Exceeded! "
print(alert * 3)  # Output: Threshold Exceeded! Threshold Exceeded! Threshold Exceeded!
```


In [None]:
# Let's have examples!
# String Concatenation: Combining DNA sequences
dna_part1 = "ATGCGT"
dna_part2 = "AAGTCC"
full_sequence = dna_part1 + dna_part2
print(full_sequence)  # Output: ATGCGTAAGTCC

In [None]:
# TRY IT OUT BY YOURSELF AREA

In [None]:
# String Repetition: Simulating Microsatellite Repeats
repeat_unit = "ATG"
microsatellite = repeat_unit * 4
print(microsatellite)  # Output: ATGATGATGATG


In [None]:
# TRY IT OUT BY YOURSELF AREA

In [None]:
# Formatting protein sequence information
protein_name = "Hemoglobin"
sequence_length = 146

info = "Protein: " + protein_name + ", Length: " + str(sequence_length) + " amino acids"
print(info)  # Output: Protein: Hemoglobin, Length: 146 amino acids


In [None]:
# TRY IT OUT BY YOURSELF AREA
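
The `str()` call in the protein example is needed because `+` cannot join a string and a number directly; leaving it out raises a `TypeError`. An f-string does the conversion automatically, so the same text can be built in either of the two ways sketched here:

```python
protein_name = "Hemoglobin"
sequence_length = 146

# Concatenation: the integer must be converted with str() first
info_concat = "Protein: " + protein_name + ", Length: " + str(sequence_length) + " amino acids"

# f-string: the conversion happens inside the curly braces
info_fstring = f"Protein: {protein_name}, Length: {sequence_length} amino acids"

print(info_fstring)                 # Protein: Hemoglobin, Length: 146 amino acids
print(info_concat == info_fstring)  # True
```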

## Assignment Operators – Tekin Adjusts Sample Values

As Tekin worked with dynamic wastewater data, he often needed to update variables based on new measurements.  
Python’s **assignment operators** allowed him to modify variables directly and efficiently.

These operators combine assignment with arithmetic operations.

```python
# Initial viral load measurement
viral_load = 10

viral_load += 5  # New incoming data adds more load
print(viral_load)  # Output: 15

viral_load -= 3  # Adjustment after quality control
print(viral_load)  # Output: 12

viral_load *= 2  # Doubling the volume for concentration
print(viral_load)  # Output: 24

viral_load /= 4  # Averaging over 4 samples
print(viral_load)  # Output: 6.0
```


In [None]:
# Let's have examples!

# Tracking the number of DNA sequences processed
sequences_processed = 100

sequences_processed += 20  # 20 more sequences analyzed
print(sequences_processed)  # Output: 120

sequences_processed -= 10  # 10 sequences discarded due to low quality
print(sequences_processed)  # Output: 110


In [None]:
# TRY IT OUT BY YOURSELF AREA

In [None]:
# Adjusting read depth in sequencing analysis
read_depth = 30

read_depth *= 2  # Doubling the sequencing depth
print(read_depth)  # Output: 60

read_depth /= 3  # Reducing sequencing depth to a third
print(read_depth)  # Output: 20.0


In [None]:
# TRY IT OUT BY YOURSELF AREA

In [None]:
# Expanding a DNA motif with repetitive units
motif = "ATG"

motif *= 3  # Repeating the motif three times
print(motif)  # Output: ATGATGATG

In [None]:
# TRY IT OUT BY YOURSELF AREA
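
Each of these operators is shorthand for writing the variable on both sides of `=`. The sketch below shows the long and short forms side by side, and then three more operators that follow the same pattern (`//=`, `%=`, `**=`). The starting value is arbitrary.

```python
count = 10

count = count + 5   # long form
count += 5          # shorthand for the same kind of update
print(count)        # Output: 20

count //= 3         # floor division, then assign
print(count)        # Output: 6

count %= 4          # remainder, then assign
print(count)        # Output: 2

count **= 3         # exponentiation, then assign
print(count)        # Output: 8
```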


## Summary – Tekin’s Toolbox So Far

Tekin now had a solid foundation in Python and could already perform many tasks relevant to wastewater data analysis.

Here’s a quick recap of what he learned:

- **Arithmetic operations**: `+`, `-`, `*`, `/`, `//`, `%`, `**` → For calculations on sample values  
- **Comparison operations**: `>`, `<`, `==`, `!=`, `>=`, `<=` → To compare measurements  
- **Logical operations**: `and`, `or`, `not` → To combine multiple conditions  
- **String operations**: Concatenation (`+`), repetition (`*`) → For building sample labels and alerts  
- **Assignment operators**: `+=`, `-=`, `*=`, `/=` → To update values efficiently

With these tools, Tekin could already write meaningful scripts — but he was about to unlock even more power.



In [3]:
# Let's have a last complex example!

# Simulating DNA sequencing statistics

# Step 1: Initialize sequencing data
total_reads = 50000  # Total sequencing reads
high_quality_reads = 32000  # Reads with high quality
low_quality_reads = total_reads - high_quality_reads  # Remaining are low quality

# Step 2: Calculate the percentage of high-quality reads
high_quality_percentage = (high_quality_reads / total_reads) * 100

# Step 3: Check if sequencing quality is acceptable
is_high_quality = high_quality_percentage >= 70
is_low_quality = high_quality_percentage < 50

# Step 4: Adjust sequencing reads based on additional processing
total_reads += 5000  # New batch of reads added
high_quality_reads *= 1.1  # Assume a 10% improvement after filtering

# Step 5: Generate a sequencing report
report_title = "Sequencing Report"
separator = "=" * len(report_title)

report = report_title + "\n" + separator
report += f"\nTotal Reads: {total_reads}"
report += f"\nHigh-Quality Reads: {int(high_quality_reads)}"
report += f"\nHigh-Quality Percentage: {high_quality_percentage:.2f}%"
report += f"\nSequencing Quality Acceptable: {is_high_quality}"
report += f"\nWarning: Low Quality? {is_low_quality}"

# Step 6: Print the final report
print(report)



Sequencing Report
=================
Total Reads: 55000
High-Quality Reads: 35200
High-Quality Percentage: 64.00%
Sequencing Quality Acceptable: False
Warning: Low Quality? False


In [None]:
# TRY IT OUT BY YOURSELF AREA IF YOU TRUST YOURSELF! I BELIEVE IN YOU! YOU CAN DO IT! NO PAIN NO GAIN!

### Problem 1: Calculating Total Analysis Cost with Discount

**Scenario:**  
Tekin is working in a public health laboratory and receives multiple wastewater samples for analysis.  
Each type of analysis has a specific cost:

- **Bacterial analysis**: 5.99 TL per sample  
- **Viral analysis**: 4.49 TL per sample  

This week, Tekin receives:  
- **3.5 composite samples** for bacterial testing  
- **2 discrete samples** for viral testing  

If the **total cost exceeds 25 TL**, the lab applies a **10% discount** as part of a bulk analysis policy.

---

**Your Task:**  
Write a Python program that:

1. Calculates the total analysis cost.  
2. Applies a **10% discount** if the total exceeds 25 TL.  
3. Prints the **final cost**.


In [None]:
# TRY TO SOLVE IT!

In [None]:
# TRY TO SOLVE IT!

In [None]:
# TRY TO SOLVE IT!

In [None]:
# Prices per sample type
bacterial_analysis_price = 5.99   # e.g. E. coli test
viral_analysis_price = 4.49       # e.g. SARS-CoV-2 test

# Number of samples received
bacterial_samples = 3.5  # composite samples
viral_samples = 2        # discrete samples

# Step 1: Calculate total cost
total_cost = (bacterial_analysis_price * bacterial_samples) + (viral_analysis_price * viral_samples)

# Step 2: Apply 10% discount if total cost > 25 TL
if total_cost > 25:
    discount = total_cost * 0.10
    final_price = total_cost - discount
else:
    final_price = total_cost

# Step 3: Print final result
print(f"Total cost before discount: {total_cost:.2f} TL")
print(f"Final price after discount (if applicable): {final_price:.2f} TL")


In [None]:
# Solution


---

### Problem 2: DNA Base Count

**Scenario:**  
You are analyzing a **DNA sequence** and need to count the occurrences of **each nucleotide** (A, T, C, G).

**Your Task:**  
Write a Python program that:
1. Takes a **DNA sequence** as input.
2. Counts how many times each **nucleotide** appears (A, T, C, G).
3. Prints the counts.



In [None]:
# Solution

In [None]:
# Solution

In [None]:
# Solution to Problem 2

# Input DNA sequence
dna_sequence = input("Enter a DNA sequence: ").upper()

# Count occurrences of each nucleotide
count_A = dna_sequence.count("A")
count_T = dna_sequence.count("T")
count_C = dna_sequence.count("C")
count_G = dna_sequence.count("G")

# Print results
print(f"A: {count_A}")
print(f"T: {count_T}")
print(f"C: {count_C}")
print(f"G: {count_G}")
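
`input()` only works when someone is there to type a sequence. As an alternative sketch, the version below uses a hard-coded example sequence (made up for illustration) and `collections.Counter` from the standard library, which counts every character in one pass.

```python
from collections import Counter

# Hypothetical sequence, used instead of input() so the cell runs on its own
dna_sequence = "atcggcta".upper()

# Counter maps each character to the number of times it appears
counts = Counter(dna_sequence)

print(f"A: {counts['A']}")  # A: 2
print(f"T: {counts['T']}")  # T: 2
print(f"C: {counts['C']}")  # C: 2
print(f"G: {counts['G']}")  # G: 2
print(f"N: {counts['N']}")  # N: 0 (a missing key counts as zero)
```

Both approaches give the same counts; calling `.count()` four times is easier to read for a beginner, while `Counter` also reports unexpected characters if the sequence contains any.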


In [None]:
# Congratulations! You've finished the first module, let's move on to the next one!