# Getting Started with Google Colab

## Managing Runtimes in Google Colab

### What Does “Temporary Runtime” Mean?

When you open a notebook in Google Colab, you're given access to a temporary virtual machine in the cloud.

If any of the following happens:

*   You close the tab
*   You're inactive for too long
*   The session crashes

then all uploaded files, variables, and session state will be lost.

Real-World Example:
Imagine uploading my_data.csv using:

In [None]:
from google.colab import files
files.upload()


Saving my_data.csv to my_data (6).csv


{'my_data (6).csv': b'name,age,score,passed\nAlice,24,85,True\nBob,27,90,True\nCharlie,22,78,False\nDiana,32,92,True\nEthan,29,88,True\n'}

Then after a break, you come back to run:

In [None]:
df = pd.read_csv("my_data.csv")  # assumes pandas is already imported as pd (pandas is covered in a later section)


You’ll get:

FileNotFoundError: [Errno 2] No such file or directory: 'my_data.csv'
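
One way to make this failure easier to diagnose is to check whether the file exists before reading it. The sketch below uses only the standard library; the helper name `check_file` and its messages are hypothetical, not part of Colab or pandas.

```python
from pathlib import Path

def check_file(path):
    """Return a short status message for a file path (hypothetical helper)."""
    if Path(path).exists():
        return f"Found {path}"
    return f"{path} is missing. Re-upload it or read it from Google Drive."

# After a runtime reset, a previously uploaded file is no longer there.
message = check_file("definitely_missing_file.csv")
print(message)
```

Running a check like this at the top of a notebook turns a confusing traceback into a clear reminder to upload the file again.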


### Solution: Mount Google Drive (Persistent Storage)
Mounting Google Drive gives you stable access to files across sessions.

Demo Code: Mount Google Drive and Read a File

In [None]:
# Step 1: Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


This will prompt the user to sign in and authorize access.

In [None]:
# Step 2: Access a file inside your Google Drive
import pandas as pd
file_path = '/content/drive/MyDrive/my_data.csv'

df = pd.read_csv(file_path)
df.head()


Unnamed: 0,name,age,score,passed
0,Alice,24,85,True
1,Bob,27,90,True
2,Charlie,22,78,False
3,Diana,32,92,True
4,Ethan,29,88,True


Now, even if your session disconnects, the file stays safe in your Drive.

## File Access in Colab

### ❌ Common Mistake : File not uploaded or linked

In [None]:
# pandas comes preinstalled in Colab; run this only if the import below fails
%pip install pandas



In [None]:
import pandas as pd

In [None]:
# This will fail in Colab if the file isn’t uploaded or linked
df = pd.read_csv("D:\\my_data.csv")


FileNotFoundError: [Errno 2] No such file or directory: 'D:\\my_data.csv'

### Correct Way: Mount Google Drive

GPT prompt: I saved my_data.csv in my Google Drive (https://drive.google.com/drive/my-drive). What Python code reads this file in Google Colab?

In [None]:
from google.colab import drive
drive.mount('/content/drive')

import pandas as pd

# Example path (replace with your file's path)
file_path = '/content/drive/MyDrive/my_data.csv'

# Read the CSV file
df = pd.read_csv(file_path)

# Verify the data (optional)
print(df.head())


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
      name  age  score  passed
0    Alice   24     85    True
1      Bob   27     90    True
2  Charlie   22     78   False
3    Diana   32     92    True
4    Ethan   29     88    True


Tip for Reference:

Store your .csv, .ipynb, and result files inside MyDrive/Colab Notebooks or similar

Use Google Drive like a persistent hard drive when coding in Colab

# Python is Simple and Readable

## 1. Does not need curly braces or semicolons

 Python Code Example:

In [None]:
# Python does not need curly braces or semicolons
greeting = "Hello"
user_name = "Alex"
print(greeting + ", " + user_name)


Hello, Alex


Equivalent in Java:

```java
public class HelloWorld {
    public static void main(String[] args) {
        String greeting = "Hello";
        String user_name = "Alex";
        System.out.println(greeting + ", " + user_name);
    }
}
```


The comparison above highlights a major syntactic difference between Python and many other programming languages, especially the C-style languages listed in the table below.

### Languages that **DO** require curly braces `{}` and semicolons `;`

| Language       | Requires `{}` for blocks | Requires `;` to end statements                        |
|----------------|--------------------------|-------------------------------------------------------|
| **C**          | Yes                      | Yes                                                   |
| **C++**        | Yes                      | Yes                                                   |
| **Java**       | Yes                      | Yes                                                   |
| **JavaScript** | Yes                      | Optional (automatic semicolon insertion), but common  |
| **C#**         | Yes                      | Yes                                                   |
| **Swift**      | Yes                      | No (semicolons are optional)                          |


### Indentation is Structure

Instead of {} as in C or Java, Python uses indentation (4 spaces by convention) to mark code blocks.



Concept:
Python does not use {} to define code blocks (like C, Java, or JavaScript).
Instead, it uses indentation — typically 4 spaces — to define the scope of a block such as under if, for, def, etc.

Python Example: Proper Indentation

In [None]:
x = 10

if x > 5:
    print("x is greater than 5")
    print("This line is part of the if-block")

print("This line is outside the if-block")


x is greater than 5
This line is part of the if-block
This line is outside the if-block


Explanation:
The two indented print() lines are part of the if block (executed if x > 5).

The last print() line is not indented, so it is outside the block and always runs.
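
Blocks can also be nested, with each level adding one more step of indentation. A minimal sketch (the `messages` list is only there to make the result easy to inspect):

```python
x = 10
messages = []

if x > 5:
    messages.append("x is greater than 5")
    if x > 8:
        # Indented twice: runs only when both conditions are true
        messages.append("x is also greater than 8")

# Back at the left margin: always runs
messages.append("done")

print(messages)
```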



Python Example: Indentation Error

In [None]:
x = 10

if x > 5:
print("x is greater than 5")  # ❌ This will cause an IndentationError


IndentationError: expected an indented block after 'if' statement on line 3 (<ipython-input-21-fc431253f6f7>, line 4)

❌ Error: IndentationError: expected an indented block

Python requires consistent indentation after control structures.

### C / Java Equivalent (using {})

```c
int x = 10;

if (x > 5) {
    printf("x is greater than 5\n");
    printf("This line is part of the if-block\n");
}

printf("This line is outside the if-block\n");
```


| Language | How Blocks Are Defined    | Indentation Required  |
| -------- | ------------------------- | --------------------- |
| Python   | By indentation (4 spaces by convention) | ✅ Yes (enforced)      |
| C/Java   | By curly braces `{}`      | ❌ No (style optional) |


## 2. Case-Sensitivity

🔹Python is case-sensitive for **variables**

🔹**Keywords** (like if, for) and built-in function names (like print) are case-sensitive too: these must be written in **lowercase**

| Language Feature            | Python Behavior                                  | Why it Feels Simple           |
| --------------------------- | ------------------------------------------------ | ----------------------------- |
| Case sensitivity            |  Case-sensitive                                 |  Precise control over names |
| Keyword case consistency    |  Keywords are lowercase (except `True`, `False`, `None`) |  Easier to remember/read    |
| Variable naming flexibility |  No type prefixing needed (`strName`, `iCount`) |  Cleaner variable names     |


Keywords are reserved words with predefined meanings in Python's syntax. They cannot be used as identifiers (variable names, function names, etc.).
Examples include:
 def, if, else, for, while, return, class, and import.

Note that print is not a keyword. It is a built-in function, which is why misspelling it as Print raises a NameError rather than a SyntaxError.
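
To check whether a word is reserved, the standard library's `keyword` module can be asked directly. This is a small illustrative sketch; the variable names are arbitrary.

```python
import keyword

is_if_reserved = keyword.iskeyword("if")
is_capital_if_reserved = keyword.iskeyword("If")
is_print_reserved = keyword.iskeyword("print")

print(is_if_reserved)          # True
print(is_capital_if_reserved)  # False: keywords are case-sensitive
print(is_print_reserved)       # False: print is a built-in function
```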


In [None]:
# Correct
print("Hello, world!")  # lowercase 'print' is required

# Incorrect
# Print("Hi!")  ❌ This will raise: NameError: name 'Print' is not defined


Hello, world!


Python Code: Case Sensitivity in Variables

In [None]:
name = "Alice"
Name = "Bob"
NAME = "Charlie"

print(name)   # Output: Alice
print(Name)   # Output: Bob
print(NAME)   # Output: Charlie


Alice
Bob
Charlie


Explanation:
name, Name, and NAME are three separate variables in Python.

Even though they look similar, Python treats them as distinct because it is case-sensitive.

| Variable | Value     |
| -------- | --------- |
| `name`   | "Alice"   |
| `Name`   | "Bob"     |
| `NAME`   | "Charlie" |


❌ Common Beginner Mistake:
Trying to access name when only Name is defined:

In [None]:
Name = "Bob"
print(name)  # ❌ NameError: name 'name' is not defined


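Alice

Note: this cell printed Alice only because name was still defined by the earlier cell in the same session. In a fresh runtime, where only Name exists, the same code raises NameError: name 'name' is not defined.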


Rule of Thumb: Be consistent in variable casing — it's a best practice to stick to lowercase (e.g. user_name, total_price) unless using a specific naming convention (like CamelCase for classes).

 ## 3. Dynamically Typed in Python

What It Means:
In Python, you do not need to declare a variable’s type.

The type is inferred automatically at runtime based on the value you assign.

This is different from statically typed languages (like C, Java, or C++), where you must explicitly declare the type.

Python Code Example:
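
A minimal sketch of dynamic typing, using the built-in `type()` function to show the type changing as the same name is reassigned (the variable name `value` is arbitrary):

```python
value = 42            # no type declaration needed
print(type(value))    # <class 'int'>

value = "forty-two"   # the same name can later hold a string
print(type(value))    # <class 'str'>

value = 4.2           # or a float
print(type(value))    # <class 'float'>
```

In a statically typed language such as Java, reassigning an `int` variable to a string would be rejected at compile time.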

# print() Function

What it does:
The print() function displays output in the console.

It's one of the most commonly used built-in functions in Python.

It can print strings, numbers, variables, or even expressions.



## Example 1: Print a String

In [None]:
print("Hello, world!")


Hello, world!


Elaboration:
You must enclose text in quotation marks (" or ').

## Example 2: Print Numbers and Expressions

In [None]:
print(42)             # prints a number
print(3 + 7)          # prints the result of the expression


42
10


Elaboration:
You don’t need to convert numbers to strings to print them.

Python evaluates the expression 3 + 7 and prints 10.

## Example 3: Print Variables

In [None]:
name = "Alice"
age = 25
print(name)                   # Output: Alice
print("Age:", age)           # Output: Age: 25
print("Name:", name, "| Age:", age)  # Multiple values


Alice
Age: 25
Name: Alice | Age: 25


Elaboration:
Commas separate multiple values inside print() and add spaces automatically.

You can mix strings and variables without needing manual type conversion.
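
The space that `print()` puts between values is controlled by its `sep` parameter. A short sketch of how changing it affects the output:

```python
name = "Alice"
age = 25

print("Name:", name, "Age:", age)             # default separator is one space
print("Name:", name, "Age:", age, sep=" | ")  # custom separator between every value
print(name, age, sep="")                      # empty separator: values printed back to back
```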

## Common Error Example:

In [None]:
age = 25
# print("Age: " + age)  # ❌ TypeError: can only concatenate str (not "int") to str
print("Age: " + str(age))  # ✅ Convert int to string


Age: 25


## Summary

| Feature         | Behavior                             |
| --------------- | ------------------------------------ |
| `print("text")` | Prints a string                      |
| `print(2 + 3)`  | Evaluates and prints result          |
| `print(x, y)`   | Prints multiple values with spaces   |
| `print(x + y)`  | Concatenates strings or adds numbers |


# Text (String) vs Numbers


Understanding the difference between strings and numbers is foundational in Python programming. These two data types behave differently and are used in different contexts.

## 1. Strings (str): "Hello" or 'World'

Definition:
A string is a sequence of characters enclosed in quotes.

In [None]:
# Assign string values using either single or double quotes
greeting1 = "Hello"
greeting2 = 'World'

print(greeting1)  # Output: Hello
print(greeting2)  # Output: World


Hello
World


Elaboration:

Strings are sequences of characters enclosed in quotes.

Python accepts both 'single' and "double" quotes.

Use strings to represent textual data like names, labels, or messages.

## 2. Numbers — 123, 4.56

Definition:
Numbers represent quantitative values. They are not enclosed in quotes.

### Types

int: whole numbers (e.g., 1, 100)

float: decimal numbers (e.g., 3.14, 0.99)



In [None]:
# Assign numeric values
age = 25           # Integer
height = 1.75      # Float (decimal number)

print(age)         # Output: 25
print(height)      # Output: 1.75


25
1.75


**Elaboration:**

Numbers in Python are either:

int (integers like 5, 100)

float (decimals like 4.56, 0.99)

Used for calculations, measurements, etc.

### Use Cases

In Python, numbers (int and float) are fundamental data types used across many tasks. They are not just for math — they help control logic, track quantities, and represent real-world metrics.




Common Use Cases:

| Use Case                          | Description                                        |
| --------------------------------- | -------------------------------------------------- |
| **Math Operations**               | Perform calculations like addition, subtraction    |
| **Counters & Quantities**         | Track item counts, iteration numbers, loop indexes |
| **Measurements & Financial Data** | Represent height, weight, money, percentages       |


#### Math Operations

In [None]:
price = 100
discount = 15
final_price = price - discount
print("Final price:", final_price)  # Output: Final price: 85


Final price: 85


**Elaborations**

Python supports all basic operators: +, -, *, /, ** (power), // (floor division), % (modulus).

Useful in shopping cart logic, scoring systems, and budget estimations.
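
As a quick reference, the sketch below applies each of these operators to the same pair of numbers (the values 7 and 2 are arbitrary):

```python
a = 7
b = 2

print(a + b)   # 9    addition
print(a - b)   # 5    subtraction
print(a * b)   # 14   multiplication
print(a / b)   # 3.5  true division always returns a float
print(a // b)  # 3    floor division rounds down to a whole number
print(a % b)   # 1    remainder after division
print(a ** b)  # 49   a raised to the power b
```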

#### Counters & Quantities

Integers (int) are perfect for counting items (e.g., how many clicks, how many files).



In [None]:
students = 5
students += 1  # One more student joins
print("Total students:", students)  # Output: Total students: 6


Total students: 6


**Elaboration:**

students = 5
You’re defining a counter that starts at 5 — it represents a quantity (number of students currently enrolled, present, or counted).

students += 1
This is shorthand for students = students + 1.
It increments the counter — you’re increasing the quantity by 1, e.g., one more student joined the group.

print(...)
Displays the updated count.

**Counters & Quantities**

| Concept          | How It’s Shown in Code                                                                              |
| ---------------- | --------------------------------------------------------------------------------------------------- |
| **Counter**      | `students` is a variable tracking how many students there are                                       |
| **Incrementing** | `students += 1` adds to the count — a classic counter operation                                     |
| **Quantity**     | The variable stores a real-world numeric **value** that changes with events (like students joining) |
| **Dynamic**      | The number can go up (or down), simulating real-time changes in quantity                            |


**Real-World Analogies**


Like a click counter on a website:

Every time someone clicks "Join Class", students += 1.


Like counting items in a shopping cart:
cart_items += 1 each time a product is added.

Like tracking event sign-ups or attendance.

#### Measurements

In [None]:
height_m = 1.75
weight_kg = 68
bmi = weight_kg / (height_m ** 2)
print("BMI:", round(bmi, 2))  # Output: BMI: 22.2


BMI: 22.2


**What This Code Demonstrates:**

| Concept                   | How It’s Shown in the Code                              |
| ------------------------- | ------------------------------------------------------- |
| Measurements (real-world) | `height_m`, `weight_kg` represent physical units        |
| Integers and floats       | `height_m` is a float (`1.75`); `weight_kg` is an int (`68`) |
| Arithmetic operation      | `weight / height²` involves division and exponentiation |
| Mathematical expression   | `** 2` is exponentiation (height squared)               |
| Rounding output           | `round(..., 2)` rounds the result to at most 2 decimal places |
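
Note that `round()` returns a number, so trailing zeros are not shown. If a fixed number of decimals is wanted in the printed text, a format specification is one option. The sketch below compares the two; the names `rounded` and `formatted` are illustrative.

```python
height_m = 1.75
weight_kg = 68
bmi = weight_kg / (height_m ** 2)

rounded = round(bmi, 2)     # a number: 22.2
formatted = f"{bmi:.2f}"    # a string with exactly two decimals: "22.20"

print("Rounded:", rounded)
print("Formatted:", formatted)
```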


### Use + to concatenate strings: "Hi " + "there!"




**What is Concatenation?**

Concatenation means joining two strings end-to-end.

In Python, you can use the + operator to concatenate (combine) two or more strings into a single string.

The + operator combines strings without any space unless you include it yourself.

In [None]:
greeting = "Hi "
name = "there!"
full_message = greeting + name
print(full_message)  # Output: Hi there!


Hi there!


Breakdown of the Code:

| Line                  | Explanation                            |
| --------------------- | -------------------------------------- |
| `greeting = "Hi "`    | Creates a string with a trailing space |
| `name = "there!"`     | Another string variable                |
| `greeting + name`     | Joins both strings into `"Hi there!"`  |
| `print(full_message)` | Displays the result in the console     |


**Important Note:**

You cannot directly add a number and a string using +. For example:

In [None]:
age = 25
print("Age: " + age)  # ❌ This will raise a TypeError!


TypeError: can only concatenate str (not "int") to str

Instead, you need to convert the number to a string:

In [None]:
print("Age: " + str(age))  # ✅ Output: Age: 25


Age: 25


**Related Operators for Numbers:**

| Operator | Description | Example | Result |
| -------- | ----------- | ------- | ------ |
| `+`      | Add         | `5 + 2` | `7`    |
| `-`      | Subtract    | `5 - 2` | `3`    |
| `*`      | Multiply    | `5 * 2` | `10`   |
| `/`      | Divide      | `5 / 2` | `2.5`  |


These work on numbers, not strings. But + has dual use: addition for numbers, concatenation for strings.
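
The dual role of `+` is easiest to see side by side. In this small sketch the variable names are arbitrary:

```python
number_sum = 5 + 2                    # both operands are numbers, so + adds
text_join = "5" + "2"                 # both operands are strings, so + concatenates
converted_sum = int("5") + int("2")   # convert the strings to numbers first

print(number_sum)     # 7
print(text_join)      # 52
print(converted_sum)  # 7
```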

# Comments in Python


Python uses comments to help programmers explain what their code does. Comments are ignored during execution, but they are essential for code readability, documentation, and collaboration.

## Use # for Single-Line Comments

Single-line comments begin with a # symbol and extend to the end of the line. They're ideal for brief explanations or disabling a line of code during debugging.

In [None]:
# This is a single-line comment
name = "Alice"  # Store user's name
print(name)


Alice


**Elaboration**

Everything after the # on the same line is ignored by Python.

Use them to:

Explain the purpose of a line or variable.

Temporarily deactivate a line of code.

Leave reminders or TODOs for yourself or others.

## Use triple quotes ''' or """ for Multiline Documentation

Multiline comments or docstrings use triple quotes (''' or """) and are typically used to describe functions, classes, or large blocks of code.

In [None]:
"""
This program prints a welcome message
and displays the user's name.
"""

def greet():
    """This function prints a greeting message."""
    print("Welcome to the program!")

greet()


Welcome to the program!


**Elaboration**

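A triple-quoted string that is the first statement in a module, function, or class becomes its docstring. Anywhere else, an unassigned triple-quoted string is simply evaluated and discarded, so it behaves like a multiline comment.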

You can use them:

At the top of scripts for documentation.

Inside functions or classes for built-in help (help(function)).

To block-comment multiple lines temporarily.

Works with either ''' or """, but """ is recommended for official docstrings per PEP 257.
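
A docstring is stored on the object it documents and can be read through the `__doc__` attribute. The sketch below contrasts a real docstring with a string placed later in the function body; `greet_again` is a hypothetical function added only for comparison.

```python
def greet():
    """This function prints a greeting message."""
    print("Welcome to the program!")

# The docstring is stored on the function object
print(greet.__doc__)

def greet_again():
    print("Hello again!")
    """This string is not the first statement, so it is not a docstring."""

print(greet_again.__doc__)  # None
```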

**Summary Table**

| Type            | Syntax Example           | Use Case                               |
| --------------- | ------------------------ | -------------------------------------- |
| Single-line     | `# Print user's name`    | Quick notes or inline comments         |
| Multiline (doc) | `"""This function..."""` | Documenting functions, files, sections |


# Common Mistakes


## Missing Quotation Marks

Python requires strings to be enclosed in matching quotation marks — either single ' or double ". Missing one leads to a syntax error.

In [None]:
# ❌ Incorrect - missing closing quote
# print("Hello)

# ✅ Correct
print("Hello")


Hello


**Elaboration**

If one quote is missing, Python can't tell where the string ends.

Error example:

SyntaxError: unterminated string literal

Always check that every opening ' or " has a closing match.

## Forgetting to Close Parentheses

Python uses parentheses () for functions and expressions. Forgetting to close them causes a syntax error.

In [None]:
# ❌ Incorrect - missing closing parenthesis
# print("Hello"

# ✅ Correct
print("Hello")


Hello


**Elaboration**

Missing a closing ) breaks Python's ability to parse the code.


## Mixing Strings and Integers Without Type Conversion

You cannot combine strings and numbers with + directly. Python requires explicit type conversion.

In [None]:
age = 25

# ❌ Incorrect
# print("Age: " + age)  # TypeError

# ✅ Correct (convert int to string)
print("Age: " + str(age))

# ✅ Alternative (comma-separated)
print("Age:", age)


Age: 25
Age: 25


**Elaboration**

Strings and numbers are different types: '25' vs 25.

Trying "Age: " + 25 causes:

TypeError: can only concatenate str (not "int") to str


Fix it using str(), or use commas in print() (which automatically handles types).

**Summary Table**

| Mistake Type               | Error Triggered                            | Fix                                 |
| -------------------------- | ------------------------------------------ | ----------------------------------- |
| Missing quotation marks    | `SyntaxError: unterminated string literal` | Use matching opening/closing quotes |
| Missing parentheses        | `SyntaxError` (for example `'(' was never closed`; wording varies by Python version) | Ensure all `(` have matching `)`    |
| Mixing strings and numbers | `TypeError: can only concatenate str (not "int") to str` | Use `str()` or comma in `print()`   |


# Building Logic with Conditionals and Loops


## if Statements

### Code Example 1: Budget Check

In [None]:
price = 800
budget = 1000

if price <= budget:
    print("You can buy this item.")
else:
    print("Sorry, it's too expensive.")


You can buy this item.


### Code Example 2: Age Gate

In [None]:
age = int(input("Enter your age: "))

if age < 18:
    print("You're a child.")
else:
    print("You're an adult.")


Enter your age: 8
You're a child.


Note: input() always returns a string, so we use int() to convert it into a whole number before comparing.
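
If the text cannot be read as a whole number, `int()` raises a `ValueError`. One possible way to handle that is sketched below; `parse_age` is a hypothetical helper, and it takes plain strings instead of calling `input()` so it can run without prompting.

```python
def parse_age(text):
    """Convert text to an int, or return None if it is not a whole number.

    Hypothetical helper for illustration only.
    """
    try:
        return int(text)
    except ValueError:
        return None

print(parse_age("8"))      # 8
print(parse_age("eight"))  # None
print(parse_age("8.5"))    # None, because int() rejects decimal text
```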

### Common Mistake

In [None]:
# This will break!
age = input("Enter your age: ")

if age < 18:
    print("You're a student.")  # ❌ TypeError: '<' not supported between instances of 'str' and 'int'


Enter your age: 89


TypeError: '<' not supported between instances of 'str' and 'int'

Fix:

In [None]:
age = int(input("Enter your age: "))


Enter your age: 89


## Using if / elif / else to Handle Multiple Conditions

### Code Example 1: Age Classification



In [None]:
age = int(input("Enter your age: "))

if age < 18:
    print("Student")
elif age < 65:
    print("Working adult")
else:
    print("Senior")


Enter your age: 65
Senior


**Explanation:**

if age < 18: → only executes if the condition is True

elif age < 65: → only checks if the first condition was False

else: → catches all other remaining cases

**Example Outputs:**

Input: 15 → Output: Student

Input: 30 → Output: Working adult

Input: 70 → Output: Senior

### Code Example 2: Grade System

In [None]:
score = int(input("Enter your exam score: "))

if score >= 90:
    print("Grade: A")
elif score >= 80:
    print("Grade: B")
elif score >= 70:
    print("Grade: C")
elif score >= 60:
    print("Grade: D")
else:
    print("Grade: F")


Enter your exam score: 5
Grade: F


**Tips**

| Statement | Meaning                                                    |
| --------- | ---------------------------------------------------------- |
| `if`      | First condition (checked no matter what)                   |
| `elif`    | Additional condition (only checked if all above are False) |
| `else`    | Final fallback (runs if all previous are False)            |


### Common Mistake

In [None]:
# This won't work logically
if age < 18:
    print("Student")
if age < 65:
    print("Working adult")  # This will also run if age is under 18


Always use elif if your cases are mutually exclusive (non-overlapping).
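
To see that an if / elif / else chain yields exactly one result per input, it helps to test the boundary values. A minimal sketch, assuming the same age ranges as above and a hypothetical helper named `classify_age`:

```python
def classify_age(age):
    """Return exactly one label per age (hypothetical helper)."""
    if age < 18:
        return "Student"
    elif age < 65:
        return "Working adult"
    else:
        return "Senior"

# Check the values on either side of each boundary
for age in [17, 18, 64, 65]:
    print(age, "->", classify_age(age))
```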

**Tips: ChatGPT Prompt**

 "Write a Python program that prints the weather category: Cold (<10), Warm (10–25), Hot (>25). Use if/elif/else."

## For Loops - Best for Counting

Scenario:
You’re a teacher handing out 5 exam papers, one to each student in a row.

Each time you move to the next student, you:

Count the student (1, 2, 3, …)

Hand them a paper

Repeat until all 5 students have received one

How This Relates to a for Loop in Python:

In [None]:
for i in range(5):
    print("Handed paper to student", i + 1)


Handed paper to student 1
Handed paper to student 2
Handed paper to student 3
Handed paper to student 4
Handed paper to student 5


range(5) produces one number per student: 0, 1, 2, 3, 4

i is the student’s index (starts at 0, so we use i + 1 for natural counting)

The loop automates the repeated action of handing out a paper

 Message:
"A for loop is like doing the same task again and again — like handing out papers, putting stickers on folders, or checking names off a list. Instead of repeating code 5 times, you let the loop do it for you."

### Code Example 1: Count from 0 to 4

In [None]:
for i in range(5):
    print("Count:", i)


Count: 0
Count: 1
Count: 2
Count: 3
Count: 4


range(5) produces the sequence 0, 1, 2, 3, 4 (a range object that generates values on demand, not a list)

i takes on each value in the sequence, one at a time

 ### Code Example 2: Print Even Numbers Between 1 and 10

In [None]:
for number in range(2, 11, 2):
    print(number)


2
4
6
8
10


range(start, stop, step):

Start at 2, stop before 11, step by 2



### Code Example 3: Loop Through a List

In [None]:
fruits = ["apple", "banana", "cherry"]

for fruit in fruits:
    print("I like", fruit)


I like apple
I like banana
I like cherry
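
When both the item and its position are needed, the built-in `enumerate()` function supplies a counter alongside each item. A short sketch using the same list:

```python
fruits = ["apple", "banana", "cherry"]

# enumerate() yields (position, item) pairs; start=1 makes the count begin at 1
for position, fruit in enumerate(fruits, start=1):
    print(position, fruit)
```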


## While Loops - Best for Conditions

### Code Example: Basic Counter

In [None]:
i = 0
while i < 5:
    print("Count:", i)
    i += 1


Count: 0
Count: 1
Count: 2
Count: 3
Count: 4


Explanation:
i = 0: starting value

while i < 5: loop continues as long as i is less than 5

i += 1: increments i so the condition eventually becomes False and the loop ends

 ### Common Mistake: Infinite Loop

In [None]:
i = 0
while i < 5:
    print("Count:", i)
    # Forgot i += 1 ❌


This runs forever because i never changes → i < 5 stays True.

Always ensure the condition will eventually become False.

### GPT Exercise Prompt:

Write a while loop that asks for a password until the correct one is entered:

In [None]:
password = ""
while password != "secret123":
    password = input("Enter password: ")

print("Access granted.")


Enter password: 123
Enter password: 123
Enter password: secret123
Access granted.
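
The loop above keeps asking until the password matches, however long that takes. A common variation caps the number of attempts. The sketch below is illustrative only: `MAX_ATTEMPTS` and the `guesses` list (a stand-in for typed input, so the example runs without prompting) are assumptions, not part of the exercise.

```python
MAX_ATTEMPTS = 3                       # assumed limit, chosen for illustration
guesses = ["123", "abc", "secret123"]  # stand-in for what a user might type

attempts = 0
granted = False
while attempts < MAX_ATTEMPTS and not granted:
    guess = guesses[attempts]
    attempts += 1                      # the counter always moves forward
    granted = (guess == "secret123")

print("Access granted." if granted else "Too many attempts.")
```

Because `attempts` increases on every pass, the condition is guaranteed to become False even if the correct password is never entered.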


## range() Function Basics

### Example 1: range(stop)

Prints numbers from 0 to stop - 1

In [None]:
for i in range(5):
    print(i)


0
1
2
3
4


### Example 2: range(start, stop)

Starts from start and stops before stop

In [None]:
for i in range(2, 6):
    print(i)


2
3
4
5


### Example 3: range(start, stop, step)

Includes a step size to skip values

In [None]:
for i in range(1, 10, 2):
    print(i)


1
3
5
7
9


To view the values generated by range(), convert it to a list using list():

In [None]:
print(list(range(1, 10, 2)))
# Output: [1, 3, 5, 7, 9]


[1, 3, 5, 7, 9]
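
A range object does not store its values as a list, but it still supports length, indexing, and membership checks. The sketch below also shows a negative step, which counts down; the variable names are arbitrary.

```python
numbers = range(1, 10, 2)

print(numbers)        # range(1, 10, 2)
print(len(numbers))   # 5
print(numbers[0])     # 1
print(7 in numbers)   # True

# A negative step counts down; the stop value is still excluded
countdown = list(range(10, 0, -2))
print(countdown)      # [10, 8, 6, 4, 2]
```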


# Handling Real-World Data with pandas and CSVs


Why pandas?
pandas is a powerful library for:

*   Working with tables (like Excel)
*   Cleaning, filtering, and analyzing structured data

Think of a DataFrame as a spreadsheet you can program

In [None]:
import pandas as pd


## Loading Data into a DataFrame


### Option 1: Upload CSV file manually in Colab

In [None]:
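from google.colab import files
uploaded = files.upload()

# files.upload() returns a dict keyed by the saved file name, which Colab
# may rename (for example "my_data (1).csv") if that name already exists
file_name = next(iter(uploaded))
df = pd.read_csv(file_name)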


Saving my_data.csv to my_data (1).csv


### Option 2: Load from Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

df = pd.read_csv('/content/drive/MyDrive/my_data.csv')


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


Note: You must upload the file or mount Drive first. Local paths like C:\Users\... don’t work in Colab!

## Exploring DataFrames

### df.head()

In [None]:
df.head()     # Show top 5 rows



Unnamed: 0,name,age,score,passed
0,Alice,24,85,True
1,Bob,27,90,True
2,Charlie,22,78,False
3,Diana,32,92,True
4,Ethan,29,88,True


.head() shows the top rows to preview your data

### df.info()

In [None]:
df.info()     # Structure and data types


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   name    5 non-null      object
 1   age     5 non-null      int64 
 2   score   5 non-null      int64 
 3   passed  5 non-null      bool  
dtypes: bool(1), int64(2), object(1)
memory usage: 257.0+ bytes


.info() shows column names, null values, data types

### df.describe()

In [None]:
df.describe() # Summary statistics

Unnamed: 0,age,score
count,5.0,5.0
mean,26.8,86.6
std,3.962323,5.458938
min,22.0,78.0
25%,24.0,85.0
50%,27.0,88.0
75%,29.0,90.0
max,32.0,92.0


.describe() gives summary statistics for numeric columns, such as count, mean, std, min, max, and quartiles

## Editing and Manipulating Data

pandas allows flexible column creation, filtering, and cleanup.


### Step 0: Upload and Load CSV

In [None]:
from google.colab import files
uploaded = files.upload()

import pandas as pd
file_name = next(iter(uploaded))  # actual saved name, e.g. "my_data (3).csv"
df = pd.read_csv(file_name)
print("Original DataFrame:")
df.head()


Saving my_data.csv to my_data (3).csv
Original DataFrame:


Unnamed: 0,name,age,score,passed
0,Alice,24,85,True
1,Bob,27,90,True
2,Charlie,22,78,False
3,Diana,32,92,True
4,Ethan,29,88,True


Explanation:

files.upload() opens a file picker in Colab — students can upload their my_data.csv file.

We read it into a pandas DataFrame named df.

df.head() shows the first 5 rows so we can inspect the original data.

### Step 1: Add a New Column score_doubled

In [None]:
df["score_doubled"] = df["score"] * 2
print("After adding score_doubled:")
df.head()


After adding score_doubled:


Unnamed: 0,name,age,score,passed,score_doubled
0,Alice,24,85,True,170
1,Bob,27,90,True,180
2,Charlie,22,78,False,156
3,Diana,32,92,True,184
4,Ethan,29,88,True,176


Explanation:

Creates a column score_doubled by doubling each value in score.

Helps illustrate how pandas operates element-wise on columns and creates new data.

### Step 2: Drop the Column passed

In [None]:
df.drop("passed", axis=1, inplace=True)
print("After dropping 'passed' column:")
df.head()


After dropping 'passed' column:


Unnamed: 0,name,age,score,score_doubled
0,Alice,24,85,170
1,Bob,27,90,180
2,Charlie,22,78,156
3,Diana,32,92,184
4,Ethan,29,88,176


Explanation:
Removes the passed column from the DataFrame.

axis=1 specifies column removal, and inplace=True means "modify this DataFrame directly" instead of returning a new one.

Useful for cleaning out unnecessary or sensitive data.

### Step 3: Add a Boolean Column high_score

In [None]:
df["high_score"] = df["score"] > 70
print("After creating 'high_score' boolean column:")
df.head()


After creating 'high_score' boolean column:


Unnamed: 0,name,age,score,score_doubled,high_score
0,Alice,24,85,170,True
1,Bob,27,90,180,True
2,Charlie,22,78,156,True
3,Diana,32,92,184,True
4,Ethan,29,88,176,True


Explanation:

Creates a column high_score with True if score > 70, otherwise False.

Demonstrates how to generate boolean flags based on numeric conditions. In this dataset every score exceeds 70, so every row is True.

### Step 4: Review the Final DataFrame

In [None]:
print("Final DataFrame:")
df


Final DataFrame:


Unnamed: 0,name,age,score,score_doubled,high_score
0,Alice,24,85,170,True
1,Bob,27,90,180,True
2,Charlie,22,78,156,True
3,Diana,32,92,184,True
4,Ethan,29,88,176,True


Explanation:

Presents the updated DataFrame with all transformations applied.

Review the column names and values to ensure everything looks correct.

 Summary

| Action                             | pandas Method Used                        | Outcome Explanation                                       |
| ---------------------------------- | ----------------------------------------- | --------------------------------------------------------- |
| Add a column with transformed data | `df["score_doubled"] = df["score"] * 2`   | Computes new data and adds it to each row                 |
| Remove an existing column          | `df.drop("passed", axis=1, inplace=True)` | Removes unwanted data, modifies DataFrame in place        |
| Add boolean condition flags        | `df["high_score"] = df["score"] > 70`     | Adds a status column with True/False based on a condition |


## Saving Modified Data

In [None]:
# Save the modified DataFrame to a new CSV file
df.to_csv("updated_my_data.csv", index=False)

# Trigger download in Google Colab
from google.colab import files
files.download("updated_my_data.csv")



Explanation:

df.to_csv("updated_my_data.csv", index=False)

Saves the DataFrame into a new CSV file named updated_my_data.csv.

The index=False argument excludes the row index from the CSV file.

files.download(...)
Launches a browser-based download prompt so you can save the work to a local machine.
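To see what `index=False` changes, here is a small sketch that writes a hypothetical two-row DataFrame to CSV text in memory instead of to a file (calling `to_csv()` without a path returns the CSV as a string):

```python
import pandas as pd

# Hypothetical data used only to compare the two outputs
demo = pd.DataFrame({"name": ["Alice", "Bob"], "score": [85, 90]})

with_index = demo.to_csv()                # row index written as an unnamed first column
without_index = demo.to_csv(index=False)  # only the actual data columns

print(with_index)
print(without_index)
```

When a file saved with the index is read back with `pd.read_csv()`, that unnamed first column typically shows up as `Unnamed: 0`.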

# Using Libraries and Creating Your Own Functions


## Using numpy

NumPy (Numerical Python) is a foundational Python library for performing fast, efficient numerical operations on lists, arrays, and matrices.

It is optimized for scientific computing and is widely used in data science, machine learning, and statistics.

np.mean() is one of many functions NumPy provides for statistical analysis. It calculates the average (arithmetic mean) of a list or array of numbers.

### Code Example

In [None]:
import numpy as np

data = [1, 2, 3, 4, 5]
mean_val = np.mean(data)
print("Average is:", mean_val)


Average is: 3.0


Elaboration


import numpy as np:

This imports the NumPy library and gives it a short alias np, which is the standard convention.

data = [1, 2, 3, 4, 5]:

This creates a basic Python list containing five integers.

np.mean(data):

This calculates the average (mean) of the numbers in the list.

Formula used internally:

$$\frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3.0$$

print("Average is:", mean_val):

This prints the result in a readable format.
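As a quick check of the arithmetic above, the sketch below compares `np.mean()` with a manual sum-and-divide and with the `.mean()` method of a NumPy array. The variable names are illustrative only:

```python
import numpy as np

data = [1, 2, 3, 4, 5]

# Manual calculation: sum of the values divided by how many there are
manual_mean = sum(data) / len(data)

# NumPy on a plain Python list
numpy_mean = np.mean(data)

# NumPy on an array, using the array's own method
arr = np.array(data)
array_mean = arr.mean()

print(manual_mean, numpy_mean, array_mean)
```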

## Writing Functions


### A. Basic Function Syntax

In [None]:
def greet(name):
    print("Hello, " + name)


Explanation:

def: This keyword tells Python you’re defining a function.

greet: This is the function name.

name: This is a parameter — a placeholder for any value you pass in when calling the function.

print(...): This is the action the function performs.

The indented block under the def line is the function body — it only runs when the function is called.

### B. Calling the Function

In [None]:
greet("Juliana")  # Output: Hello, Juliana


Hello, Juliana


Explanation:

This line calls or invokes the greet() function.

The string "Juliana" is passed to the parameter name.

The function prints a greeting with the name inserted.

Key point: Defining a function does not execute it — it must be called.
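One way to observe the key point is to record each call in a list. This is a sketch only: the `calls` list is a hypothetical helper added for illustration and is not part of the original `greet()` function.

```python
calls = []  # hypothetical log, used only to observe when the function body runs

def greet(name):
    calls.append(name)       # record that the body actually ran
    print("Hello, " + name)

# Right after the definition, the body has not run yet
count_after_def = len(calls)

greet("Juliana")
greet("Alice")

# Each call runs the body once
count_after_calls = len(calls)

print(count_after_def, count_after_calls)
```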



### C. Returning a Value Instead of Printing

In [None]:
def scale_score(score):
    return score / 10

print(scale_score(87))  # Output: 8.7


8.7


Explanation:

return: This sends a result back to where the function was called, like a calculator.

score / 10: This calculation is performed inside the function.

print(...): This shows the returned result.

Use case: Return is better than print when you want to use the result later in calculations or other logic.
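The difference can be sketched by comparing two versions of the same function. `print_scaled` is a hypothetical name introduced here for contrast with `scale_score`:

```python
def print_scaled(score):
    print(score / 10)   # displays the value, but the function returns None

def scale_score(score):
    return score / 10   # hands the value back to the caller

printed = print_scaled(87)   # prints the value; printed is None
returned = scale_score(87)   # nothing printed; returned holds the number

# A returned value can be reused in later calculations
total = scale_score(87) + scale_score(93)

print(printed, returned, total)
```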

### D. Using Functions with DataFrames (.apply())

Create Data + Apply Custom Function

In [None]:
import pandas as pd

# Step 1: Create a sample dataset
data = {
    "name": ["Alice", "Bob", "Charlie", "Diana"],
    "score": [78, 85, 92, 66]
}

df = pd.DataFrame(data)

# Step 2: Define a custom function
def double(x):
    return x * 2

# Step 3: Apply the function to the 'score' column
df["score_doubled"] = df["score"].apply(double)

# Step 4: Show the updated DataFrame
print(df)


      name  score  score_doubled
0    Alice     78            156
1      Bob     85            170
2  Charlie     92            184
3    Diana     66            132


This setup ensures:

The DataFrame df is defined

The score column exists

The custom double function is applied safely with .apply()
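Roughly speaking, `.apply()` calls the function once for each value in the column. The sketch below, which uses a standalone Series with the same scores, compares it with calling `double()` by hand in a list comprehension:

```python
import pandas as pd

def double(x):
    return x * 2

scores = pd.Series([78, 85, 92, 66])

# .apply() runs double() on every element and returns a new Series
applied = scores.apply(double)

# The same result, built manually one value at a time
manual = [double(value) for value in scores]

print(applied.tolist() == manual)
```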

### Application Example: Converting Exam Scores to Percentages

Scenario:
You are analyzing student exam scores stored in a CSV file. The scores are out of 80 points, but your school’s grading system requires them to be shown as percentages out of 100.

You can write a function to scale the scores and use .apply() to convert every student's score.

In [None]:
import pandas as pd

# Sample data
data = {
    "name": ["Alice", "Bob", "Charlie"],
    "raw_score": [64, 72, 58]
}

df = pd.DataFrame(data)

# Define the conversion function
def to_percentage(raw_score):
    return round((raw_score / 80) * 100, 2)

# Apply the function to create a new column
df["score_percent"] = df["raw_score"].apply(to_percentage)

# Display the result
print(df)


      name  raw_score  score_percent
0    Alice         64           80.0
1      Bob         72           90.0
2  Charlie         58           72.5


Explanation:

to_percentage() is a reusable function that converts raw scores to percentages.

.apply(to_percentage) runs the function on every value in the raw_score column.

df["score_percent"] is a new column that stores the converted percentages.

Why It Matters:
This technique automates repetitive column-wise transformations.

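It avoids writing explicit for-loops and keeps your code cleaner. Note that .apply() still calls the function once per value, so for simple arithmetic a vectorized expression such as df["raw_score"] / 80 * 100 is usually faster.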

You can easily swap in a new function later without changing the structure of your DataFrame operations.
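For simple arithmetic like this conversion, pandas can also operate on the whole column at once. The sketch below compares `.apply(to_percentage)` with an equivalent column-wide expression; the name `df_demo` is used here only to avoid overwriting `df`:

```python
import pandas as pd

df_demo = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "raw_score": [64, 72, 58],
})

def to_percentage(raw_score):
    return round((raw_score / 80) * 100, 2)

# Element by element, using the custom function
via_apply = df_demo["raw_score"].apply(to_percentage)

# Whole column at once, using arithmetic on the Series itself
via_vector = (df_demo["raw_score"] / 80 * 100).round(2)

print(via_apply.tolist())
print(via_vector.tolist())
```

`.apply()` remains the more flexible option when the logic involves conditions or steps that do not map onto simple column arithmetic.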

### Summary Table

| Concept            | What It Does                      | Benefit                        |
| ------------------ | --------------------------------- | ------------------------------ |
| `def`              | Defines a function                | Encapsulates logic             |
| Parameters         | Accept inputs                     | Make the function reusable     |
| `return`           | Sends back a value                | Enables computation reuse      |
| `.apply(function)` | Applies function to column values | Efficient in data manipulation |


# Debugging and Error Handling in Python


## Example: TypeError

In [None]:
age = 25
# print("Age: " + age) ❌
print("Age: " + str(age))  # ✅


Age: 25


Explanation: You can’t use + to combine a string with an integer.
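A short sketch of the failing case and two common fixes. Catching the error with `try`/`except` is used here only so the example can run to the end:

```python
age = 25

# The failing version, wrapped so the error can be inspected instead of stopping the cell
try:
    "Age: " + age
except TypeError as err:
    message = str(err)

# Fix 1: convert the integer to a string first
via_str = "Age: " + str(age)

# Fix 2: use an f-string, which converts the value for you
via_fstring = f"Age: {age}"

print(message)
print(via_str)
print(via_fstring)
```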

## Example: FileNotFoundError

In [None]:
# ❌ Won’t work in Colab unless uploaded
df = pd.read_csv("updated_my_data.csv")


FileNotFoundError: [Errno 2] No such file or directory: 'updated_my_data.csv'

Fix:

In [None]:
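from google.colab import files
uploaded = files.upload()        # returns a dict keyed by the saved file name
filename = next(iter(uploaded))  # Colab may rename duplicates, e.g. "name (3).csv"
df = pd.read_csv(filename)  # ✅ Now it works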


Saving updated_my_data.csv to updated_my_data (3).csv


Colab runs on a remote machine, so it cannot read files from your local disk (for example, your C:\ drive) unless you upload them.
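Before reading a file, it can help to check what the runtime can actually see. The sketch below lists the current working directory and handles a missing file with `try`/`except`; the file name is hypothetical and assumed not to exist:

```python
import os
import pandas as pd

path = "definitely_missing_example.csv"  # hypothetical name, assumed not to exist

# Show the files available in the current working directory
print(os.listdir("."))

try:
    df_check = pd.read_csv(path)
    status = "loaded"
except FileNotFoundError:
    status = "missing"
    print("Could not find", path, "- upload it or check the spelling of the name.")
```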

## Debugging Tips with an Example

Let’s say you’re trying to write a function that calculates the square of a number and prints the result:

Faulty Code Example:

In [None]:
def square(x):
    return x**2

number = "5"
result = square(number)
print("Result is: " + result)



TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'

### Tip 1: Read the Full Error Message

Focus on the last line:

`TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'`

This tells you:

You are trying to use the ** operator (exponentiation) — or the function pow() — with two values where at least one of them has the wrong data type.

Specifically, you’re trying to raise a string (str) to a power, which is not allowed.
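The two parts of that last line, the error type and the message, can also be pulled apart in code. This is a sketch for illustration; in everyday debugging you would normally just read the traceback:

```python
def square(x):
    return x ** 2

try:
    square("5")  # a string is passed where a number is expected
except TypeError as err:
    error_name = type(err).__name__  # the kind of error
    error_text = str(err)            # the human-readable explanation

print(error_name + ": " + error_text)
```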

### Tip 2: Use print() to Track Variables

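Add print statements to inspect the types involved:

In [None]:
print(type(number))     # <class 'str'>
print(type(square(5)))  # <class 'int'>


<class 'str'>
<class 'int'>


Now you know:

number is a string, which is why ** fails inside square().

square() returns an integer when given a number, so the result cannot be joined to a string with +.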

### Tip 3: Run Step by Step

Instead of fixing everything at once, try:

In [None]:
# Step 1
number = "5"
print(type(number))  # str

# Step 2
number = int(number)  # Convert to integer

# Step 3
result = square(number)
print("Result is:", result)  # Use comma to avoid string concat issues


<class 'str'>
Result is: 25


### Tip 4: Ask AI to Help Interpret

Prompt to ChatGPT:

"I got this error: TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'. Here's my code: XXXXX. Why is this happening?"

### Final Correct Code:

In [None]:
def square(x):
    return x ** 2

number = "5"
number = int(number)  # Fix: convert to int

result = square(number)
print("Result is:", result)  # Fix: use comma or str(result)


Result is: 25
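The faulty snippet actually contained two separate type problems. The sketch below shows that fixing only the first one (converting `number`) would surface the second (joining a string and an integer with `+`), which is why the final code changes both lines:

```python
def square(x):
    return x ** 2

number = int("5")        # first fix: convert the string to an integer
result = square(number)  # now works

# Without the second fix, the original print line would still fail
try:
    text = "Result is: " + result
except TypeError as err:
    second_error = str(err)

# Second fix: convert the result to a string (or pass it to print() with a comma)
fixed_message = "Result is: " + str(result)

print(second_error)
print(fixed_message)
```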
