# Conditional Statements (`if` Clauses) in Python – Tekin Makes Decisions with Code

As Tekin began building real-world tools for wastewater surveillance, he needed his scripts to make **decisions based on data** — for example, checking if a sample exceeds a viral threshold or determining the type of nucleic acid detected in a sequence.

In Python, conditional statements like `if`, `elif`, and `else` allow code to run **only when certain conditions are met**.

---

## Why Use Conditional Statements in Wastewater Surveillance?

Conditional logic is key for:

- Checking for **specific patterns** in sequences (e.g., start codon `"ATG"`)
- Flagging samples with **high viral loads**
- Determining whether a sample contains **RNA or DNA**
- Handling **ambiguous nucleotide codes**
- Applying **rules dynamically** based on metadata or measurements

Without `if` clauses, Tekin's code would always follow the same steps — unable to **react** to the input.  
Conditionals allowed him to **filter**, **classify**, and **respond** based on sample-specific data.

---

### Basic `if` Statement – Detecting a Start Codon

```python
sequence = "ATGCGT"

if "ATG" in sequence:
    print("Start codon found in viral sequence!")
```


In [3]:
# Let's have some examples, again!

dna_sequence = input("Enter a DNA sequence: ").upper()

if "ATG" in dna_sequence:
    print("Start codon found!")
else:
    print("No start codon present.")
    
# There are two branches in the code above: the `if` branch runs when "ATG"
# is in the input sequence, and the `else` branch runs in every other case.

Enter a DNA sequence: ATACAGATCAT
No start codon present.


In [None]:
# TRY IT OUT BY YOURSELF AREA

In [4]:
# Our lives are based on conditions, just like code! So let's dive a bit deeper.

# What if I want to identify a single nucleotide? Let's see.

nucleotide = input("Enter a nucleotide (A, T, C, G): ").upper()

if nucleotide == "A":
    print("Adenine detected")
elif nucleotide == "T":
    print("Thymine detected")
elif nucleotide == "C":
    print("Cytosine detected")
elif nucleotide == "G":
    print("Guanine detected")
else:
    print("Invalid nucleotide")
    
# Okay, it's time to run the code.

Enter a nucleotide (A, T, C, G): ATACATCAT
Invalid nucleotide


In [None]:
# TRY IT OUT BY YOURSELF AREA

## What If We Are Missing Important Conditions?

In the examples above, we only checked for simple conditions — like whether a nucleotide is `"A"` or `"T"`.  
But what if the **user enters something unexpected**?

For example:

- A number or symbol instead of a DNA base  
- An empty string  
- A lowercase input like `"atg"` instead of `"ATG"`  
- A sequence that contains **ambiguous bases** like `"N"` (any base) or `"R"` (A or G)  
- A sequence with **non-biological characters** like `"123@AGC!"`

In real-world bioinformatics and wastewater applications, data is often **noisy or inconsistent**, especially when copied from lab files or entered manually.

---

### Why Should Tekin Handle More Conditions?

Tekin realized that handling such cases made his code:

- **More robust** – Avoids crashes or wrong results due to bad input  
- **More user-friendly** – Gives clear feedback instead of errors  
- **Reusable** – Could be used by other teams without fear of breaking  
- **Scientifically accurate** – Ensures only valid data is processed

---

### Ideas for Extra Checks

Here are some conditions Tekin might want to add:

- Check if the sequence is **not empty**  
- Convert input to **uppercase** to ensure consistent comparison  
- Validate that the sequence only contains `"A"`, `"T"`, `"C"`, `"G"`  
- Warn the user if **invalid characters** are found  
- Count how many invalid bases are present  
- Reject input if it includes numbers or special characters

Adding such **conditional logic** made Tekin's code ready for the **messy realities** of molecular data and helped him build smarter analysis tools.
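
The checks listed above could fit together roughly as follows. This is a minimal sketch, not Tekin's actual code: the function name `check_dna_sequence` and its messages are hypothetical, and the list comprehension is a small preview of the loops covered in the next section.

```python
VALID_BASES = {"A", "T", "C", "G"}


def check_dna_sequence(raw_sequence):
    """Return a short report on whether raw_sequence looks like valid DNA.

    Hypothetical helper, written for illustration only.
    """
    # Normalise: strip surrounding whitespace and convert to uppercase
    sequence = raw_sequence.strip().upper()

    # Check 1: reject empty input
    if not sequence:
        return "Empty input: please enter a sequence."

    # Check 2: collect every character that is not A, T, C, or G
    invalid_chars = [base for base in sequence if base not in VALID_BASES]

    if invalid_chars:
        unique_invalid = "".join(sorted(set(invalid_chars)))
        return (
            f"Invalid sequence: {len(invalid_chars)} invalid character(s) "
            f"found ({unique_invalid})."
        )

    return f"Valid DNA sequence of length {len(sequence)}."


print(check_dna_sequence("atgcgt"))    # Valid DNA sequence of length 6.
print(check_dna_sequence("123@AGC!"))  # Invalid sequence: 5 invalid character(s) found (!123@).
print(check_dna_sequence(""))          # Empty input: please enter a sequence.
```

Returning a message instead of printing inside the function keeps the checks reusable: the caller decides whether to print the result, log it, or skip the sample.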


### Let's explore more examples!

In [5]:
# Prompt the user for input
nucleotide = input("Enter a nucleotide (A, T, C, G): ").upper()

# Check if the user entered a sequence instead of a single nucleotide
if len(nucleotide) > 1:
    print("You entered a sequence rather than a nucleotide!")
elif nucleotide == "A":
    print("Adenine detected")
elif nucleotide == "T":
    print("Thymine detected")
elif nucleotide == "C":
    print("Cytosine detected")
elif nucleotide == "G":
    print("Guanine detected")
else:
    print("Invalid nucleotide")


Enter a nucleotide (A, T, C, G): 123agaga
You entered a sequence rather than a nucleotide!


In [None]:
# TRY IT OUT BY YOURSELF AREA

### What do you see in the code block above? What is the difference?

### Let's look at another example.

In [6]:
sequence = input("Enter a DNA sequence: ").upper()

if "GAATTC" in sequence:
    print("EcoRI restriction site detected!")
else:
    print("No EcoRI restriction site found.")

Enter a DNA sequence: GAGATAGATAGATGATGA
No EcoRI restriction site found.


In [None]:
# TRY IT OUT BY YOURSELF AREA

## Summary – Tekin Learns to Control Flow

- **`if` statements** execute a block of code **only when a condition is true**  
- **`if-else` statements** allow the program to choose **between two paths**
- **`if-elif-else` statements** handle **multiple branching conditions**
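
The three forms can be compared side by side. The sketch below uses a viral-load measurement: the 25,000 copies/mL cut-off is the one used in the class example later in this notebook, while the 100,000 copies/mL tier and the function name `describe_viral_load` are assumptions made only for illustration.

```python
HIGH_THRESHOLD = 25_000        # copies/mL, taken from the class example in this notebook
VERY_HIGH_THRESHOLD = 100_000  # copies/mL, hypothetical extra tier


def describe_viral_load(viral_load):
    """Return a label for a viral load using an if-elif-else chain."""
    # Conditions are tested top to bottom, so the strictest check comes first
    if viral_load > VERY_HIGH_THRESHOLD:
        return "Very high"
    elif viral_load > HIGH_THRESHOLD:
        return "High"
    else:
        return "Normal"


viral_load = 32_000

# Plain `if`: the block runs only when the condition is true
if viral_load > HIGH_THRESHOLD:
    print("Flag this sample for follow-up.")

# `if-else`: exactly one of the two branches runs
if viral_load > HIGH_THRESHOLD:
    print("Above threshold")
else:
    print("Within normal range")

# `if-elif-else`: the first true condition wins
print(describe_viral_load(viral_load))  # High
```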

---

### Why Does This Matter for Bioinformatics and Wastewater Surveillance?

Tekin used conditional logic to:

- **Validate DNA sequences** (e.g., ensure only A, T, C, G are present)  
- **Detect start codons** or **restriction sites** in genomic data  
- **Classify GC content** of a sequence (e.g., high vs. low GC)  
- **Differentiate between DNA and RNA** based on base composition  
- **Flag samples with high viral load** based on thresholds  
- **Make decisions dynamically** in automated pipelines

These tools allowed Tekin to create smarter scripts that could adapt to real lab data — messy, unpredictable, and full of variation.
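
As one illustration of the list above, GC content can be computed and then classified with an `if-elif-else` chain. This is a sketch under stated assumptions: the 40% and 60% cut-offs and the names `gc_content` and `classify_gc` are hypothetical, chosen only to demonstrate the branching.

```python
def gc_content(sequence):
    """Return the percentage of G and C bases in a DNA sequence."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0  # avoid dividing by zero for empty input
    gc_count = sequence.count("G") + sequence.count("C")
    return 100 * gc_count / len(sequence)


def classify_gc(sequence, low_cutoff=40.0, high_cutoff=60.0):
    """Label a sequence as low, moderate, or high GC (cut-offs are illustrative)."""
    percent = gc_content(sequence)
    if percent < low_cutoff:
        return "Low GC"
    elif percent <= high_cutoff:
        return "Moderate GC"
    else:
        return "High GC"


print(classify_gc("ATGCGT"))      # Moderate GC (50%)
print(classify_gc("GGGCCCGCAT"))  # High GC (80%)
print(classify_gc("ATATATAAGC"))  # Low GC (20%)
```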

---

Next, we will explore **`for` loops** and how Tekin uses them to **iterate through sequences** in Python.


In [None]:
# CODE PLAYGROUND


# Introduction to Classes in Python – Tekin Organizes His Tools

As Tekin’s codebase expanded, he realized he needed a more structured way to group **related functions** and **data**.  
That's when he discovered **classes** — a core concept of **Object-Oriented Programming (OOP)** in Python.

A class is like a **template** for creating reusable objects.  
Each object (or "instance") contains **attributes (data)** and **methods (functions)** that operate on that data.

---

## Why Use Classes in Bioinformatics?

Tekin used classes to:

- Bundle related data and functions into reusable units  
- Represent biological concepts like **samples**, **sequences**, or **genes**  
- Build scalable tools for **automated analysis pipelines**  
- Reduce repetition and improve code readability

---

## Example: Representing Wastewater Samples with a Python Class, Step by Step

Below is a simple but powerful example of using a Python `class` to model real-life data from a **wastewater surveillance program**.

We define a class called `WastewaterSample` to store and analyze information about each sample.

---

### Step 1: Defining the Class and Constructor

```python
class WastewaterSample:
    def __init__(self, sample_id, viral_load):
        self.sample_id = sample_id
        self.viral_load = viral_load
        
```
### Step 2: A Method to Determine Risk Level

This method checks whether the viral load in the sample exceeds 25,000 copies/mL. It returns `True` if the load is above that threshold and `False` otherwise.

```python
    # Indented four spaces because it belongs inside the WastewaterSample class
    def is_high_risk(self):
        return self.viral_load > 25000
```

### Step 3: A Method to Return a Summary


```python

    def summary(self):
        status = "High" if self.is_high_risk() else "Normal"
        return f"Sample {self.sample_id}: {self.viral_load} copies/mL → {status}"

```


### Step 4: Creating and Using Sample Objects

```python

sample1 = WastewaterSample("WWTP_01", 18000)
sample2 = WastewaterSample("WWTP_02", 32000)

print(sample1.summary())  # Output: Sample WWTP_01: 18000 copies/mL → Normal
print(sample2.summary())  # Output: Sample WWTP_02: 32000 copies/mL → High

```


### Why Is This Useful?

Using a class like this allows Tekin to:

- Easily handle multiple samples
- Keep related data and methods together
- Build scalable code for future extensions (e.g., GC content, date tracking, sample metadata)

This is just the beginning of using Object-Oriented Programming (OOP) in bioinformatics and public health tools!
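
One possible direction for such an extension is sketched below, under stated assumptions: the threshold becomes a parameter instead of a fixed number, and a collection date is stored alongside the measurement. The class name `DatedWastewaterSample`, the `collection_date` and `threshold` parameters, and the made-up dates are all hypothetical and not part of the original example.

```python
from datetime import date


class DatedWastewaterSample:
    """Hypothetical extension of WastewaterSample with a date and adjustable threshold."""

    def __init__(self, sample_id, viral_load, collection_date, threshold=25000):
        self.sample_id = sample_id
        self.viral_load = viral_load            # copies/mL
        self.collection_date = collection_date  # a datetime.date object
        self.threshold = threshold              # copies/mL cut-off for "High"

    def is_high_risk(self):
        return self.viral_load > self.threshold

    def summary(self):
        status = "High" if self.is_high_risk() else "Normal"
        return (
            f"Sample {self.sample_id} ({self.collection_date.isoformat()}): "
            f"{self.viral_load} copies/mL → {status}"
        )


# Illustrative samples; the dates are invented for this sketch
samples = [
    DatedWastewaterSample("WWTP_01", 18000, date(2024, 1, 8)),
    DatedWastewaterSample("WWTP_02", 32000, date(2024, 1, 9)),
]

# Keep only the samples that exceed their threshold
high_risk = [s for s in samples if s.is_high_risk()]
for sample in high_risk:
    print(sample.summary())  # Sample WWTP_02 (2024-01-09): 32000 copies/mL → High
```

Storing the threshold on each object means different sites or targets could use different cut-offs without changing the class itself.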

In [None]:
# CODE PLAYGROUND

In [4]:
class WastewaterSample:
    def __init__(self, sample_id, viral_load):
        self.sample_id = sample_id
        self.viral_load = viral_load

    def is_high_risk(self):
        return self.viral_load > 25000

    def summary(self):
        status = "High" if self.is_high_risk() else "Normal"
        return f"Sample {self.sample_id}: {self.viral_load} copies/mL → {status}"


sample1 = WastewaterSample("WWTP_01", 18000)
sample2 = WastewaterSample("WWTP_02", 32000)

print(sample1.summary())  # Output: Sample WWTP_01: 18000 copies/mL → Normal
print(sample2.summary())  # Output: Sample WWTP_02: 32000 copies/mL → High


Sample WWTP_01: 18000 copies/mL → Normal
Sample WWTP_02: 32000 copies/mL → High


In [None]:
# CODE PLAYGROUND

In [6]:
# Class definition for BiologicalSample

class BiologicalSample:
    # The __init__ method is the constructor, called when a new object is created.
    def __init__(self, sample_id, sample_type):
        self.sample_id = sample_id  # Assign the sample ID to the instance
        self.sample_type = sample_type  # Assign the sample type to the instance

    # This method provides information about the sample.
    def sample_info(self):
        return f"Sample ID: {self.sample_id}, Sample Type: {self.sample_type}"

# Creating instances (objects) of the BiologicalSample class
sample1 = BiologicalSample("SAMP123", "Wastewater")  # Create a sample of type 'Wastewater'
sample2 = BiologicalSample("SAMP124", "Blood")  # Create a sample of type 'Blood'

# Printing the sample information by calling the sample_info method
print(sample1.sample_info())  # Output: Sample ID: SAMP123, Sample Type: Wastewater
print(sample2.sample_info())  # Output: Sample ID: SAMP124, Sample Type: Blood


Sample ID: SAMP123, Sample Type: Wastewater
Sample ID: SAMP124, Sample Type: Blood


In [7]:
# CODE PLAYGROUND

# Problem 1: Detecting Start and Stop Codons in a DNA Sequence
### Scenario
You are analyzing a DNA sequence and need to determine:
1. **If it contains a start codon (`ATG`)**.
2. **If it contains a stop codon (`TAA`, `TAG`, or `TGA`)**.

### Task
Write a Python program that:
- Takes a **DNA sequence** as input from the user.
- **Checks if a start codon (`ATG`) is present**.
- **Checks if any stop codon (`TAA`, `TAG`, `TGA`) is present**.
- Prints appropriate messages based on the findings.


In [None]:
# CODE PLAYGROUND


---

# Problem 2: Classifying DNA vs RNA
### Scenario
You receive a sequence from a researcher, but they didn't specify whether it's **DNA or RNA**. You need to classify it based on its nucleotide composition.

### Task
Write a Python program that:
- Takes a **sequence** as input.
- **Checks whether it contains "T" (Thymine)** → If so, it's **DNA**.
- **Checks whether it contains "U" (Uracil)** → If so, it's **RNA**.
- If it contains **both "T" and "U"**, print an **error message** (invalid sequence).
- If neither is found, print **"Unknown sequence type"**.

In [None]:
# CODE PLAYGROUND

In [None]:
# CODE PLAYGROUND: I still believe in you, so don't be afraid to try. This is the best way to learn!

In [None]:
## Solution for Problem 1: Detecting Start and Stop Codons


# Get user input
sequence = input("Enter a DNA sequence: ").upper()

# Check for start and stop codons
if "ATG" in sequence:
    print("Start codon found!")
else:
    print("No start codon found.")

if "TAA" in sequence or "TAG" in sequence or "TGA" in sequence:
    print("Stop codon found!")
else:
    print("No stop codon found.")


In [None]:
# CODE PLAYGROUND

In [None]:
## Solution for Problem 2: Classifying DNA vs RNA

# Get user input
sequence = input("Enter a sequence: ").upper()

# Check for DNA or RNA classification
contains_T = "T" in sequence
contains_U = "U" in sequence

if contains_T and contains_U:
    print("Invalid sequence: contains both T and U.")
elif contains_T:
    print("This is a DNA sequence.")
elif contains_U:
    print("This is an RNA sequence.")
else:
    print("Unknown sequence type.")


In [None]:
# CODE PLAYGROUND


## Any Questions?