<a href="https://colab.research.google.com/github/cloudpedagogy/data-science-programming/blob/main/python-programming/02_Python_Basics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Python Basics


## Overview

This notebook introduces the fundamentals of Python: variables and types, operators, and basic input and output. The key concepts are:

**Variables and Types:**
1. Variable Declaration: Variables in Python can be declared without specifying their type. For example, `x = 5` assigns the value 5 to the variable `x`.
2. Data Types: Python supports various data types, including integers, floats, strings, booleans, lists, tuples, and dictionaries.
3. Type Conversion: You can convert between different data types using built-in functions such as `int()`, `float()`, `str()`, etc.
4. Variable Naming: Variable names should be descriptive, follow the naming conventions, and start with a letter or underscore.

**Operators:**
1. Arithmetic Operators: Addition (`+`), subtraction (`-`), multiplication (`*`), division (`/`), modulus (`%`), exponentiation (`**`), and floor division (`//`).
2. Comparison Operators: Equal to (`==`), not equal to (`!=`), greater than (`>`), less than (`<`), greater than or equal to (`>=`), and less than or equal to (`<=`).
3. Logical Operators: AND (`and`), OR (`or`), and NOT (`not`) for combining or negating conditions.
4. Assignment Operators: Assigning values to variables, such as `=`, `+=`, `-=`, `*=`, `/=`, `%=`, `**=`, and `//=`.

**Basic Input and Output:**
1. Printing Output: The `print()` function is used to display output on the console. For example, `print("Hello, World!")` will print the text "Hello, World!".
2. Reading Input: The `input()` function allows you to take user input. For example, `name = input("Enter your name: ")` prompts the user to enter their name and assigns it to the variable `name`.

These are just some of the basics when it comes to variables and types, operators, basic input, and output in Python. There are many more concepts and techniques to explore as you delve deeper into the language.

# Variables and Types


## Variable assignment

Variable assignment in Python refers to the process of assigning a value to a variable. It allows you to store and manipulate data by giving it a name. In Python, variables can hold values of different data types such as integers, floats, strings, and more.

Here's an example using the Pima Indian Diabetes dataset to demonstrate variable assignment:


In [None]:
import pandas as pd

# Load the Pima Indian Diabetes dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv"
column_names = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome"]
dataset = pd.read_csv(url, names=column_names)

# Assign values from the dataset to variables
pregnancies = dataset.loc[0, 'Pregnancies']
glucose = dataset.loc[0, 'Glucose']
outcome = dataset.loc[0, 'Outcome']

# Print the variable values
print("Pregnancies:", pregnancies)
print("Glucose:", glucose)
print("Outcome:", outcome)


In this example, we load the Pima Indian Diabetes dataset using the Pandas library. We then assign specific values from the dataset to variables using the assignment operator (=).

The variable `pregnancies` is assigned the value from the 'Pregnancies' column at the first row (index 0). Similarly, the variable `glucose` is assigned the value from the 'Glucose' column at the first row, and the variable `outcome` is assigned the value from the 'Outcome' column at the first row.

We then print the values of these variables using the `print()` function to display the assigned values from the dataset.
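
Assignment can also be explored without downloading the dataset. The following is a minimal sketch that uses made-up values (they are not read from the Pima data) to show reassignment, multiple assignment, and swapping. The names `first` and `second` are introduced purely for illustration.

```python
# Illustrative values only; they are not read from the dataset.
pregnancies = 6        # bind the name to an integer
glucose = 148.0        # bind another name to a float

# Reassignment: the same name can later refer to a value of another type.
glucose = "148"        # glucose now refers to a string

# Multiple assignment: unpack two values into two names in one statement.
first, second = 1, 2

# Swap the two variables without a temporary variable.
first, second = second, first

print(pregnancies, glucose, first, second)
```

Because a name is only a label for a value, reassigning `glucose` does not change the earlier float; the name simply points at a new object.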


## Basic data types (integer, float, string, boolean)

In Python, there are several basic data types that represent different kinds of values. The most commonly used basic data types are:

1. Integer (int): Represents whole numbers without decimal points. For example, 1, 5, -10.

2. Float (float): Represents floating-point numbers, which include decimal points. For example, 3.14, -2.5, 0.75.

3. String (str): Represents a sequence of characters enclosed in single quotes (' ') or double quotes (" "). For example, "Hello", 'Python', "42".

4. Boolean (bool): Represents a logical value that can be either True or False. This data type is particularly useful in making decisions or controlling the flow of a program.

Here's an example using the Pima Indian Diabetes dataset to demonstrate these basic data types:


In [None]:
import pandas as pd

# Load the Pima Indian Diabetes dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv"
column_names = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome"]
dataset = pd.read_csv(url, names=column_names)

# Access specific values with different data types from the dataset
pregnancies = dataset.loc[0, 'Pregnancies']  # Integer
glucose = dataset.loc[0, 'Glucose']  # Integer
bmi = dataset.loc[0, 'BMI']  # Float
diabetes_pedigree = dataset.loc[0, 'DiabetesPedigreeFunction']  # Float
name = "John"  # String
has_diabetes = bool(dataset.loc[0, 'Outcome'])  # Boolean (the column stores 0/1, so convert explicitly)

# Print the values and their data types
print("Pregnancies:", pregnancies, type(pregnancies))
print("Glucose:", glucose, type(glucose))
print("BMI:", bmi, type(bmi))
print("Diabetes Pedigree Function:", diabetes_pedigree, type(diabetes_pedigree))
print("Name:", name, type(name))
print("Has Diabetes:", has_diabetes, type(has_diabetes))


In this example, we load the Pima Indian Diabetes dataset using the Pandas library. We access specific values from the dataset to demonstrate different data types.

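The variables `pregnancies` and `glucose` store integer values from the 'Pregnancies' and 'Glucose' columns, respectively. The variables `bmi` and `diabetes_pedigree` store floating-point values from the 'BMI' and 'DiabetesPedigreeFunction' columns, respectively. Because these values are read from a pandas DataFrame, `type()` reports them as NumPy types (for example `numpy.int64` and `numpy.float64`), which behave like Python's built-in `int` and `float` in everyday use.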

The variable `name` stores a string value "John". We can directly assign a string value to a variable.

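The variable `has_diabetes` holds a boolean value that represents whether a person has diabetes or not. The 'Outcome' column itself stores the integers 0 and 1, so a value read directly from it is an integer until it is converted with `bool()`.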

We print the values along with their respective data types using the `type()` function to verify the data types of the variables.
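
For comparison, the four basic types can be inspected on plain Python literals, where `type()` reports the built-in names. This is a small sketch with illustrative values (not taken from the dataset); the helper names `type_names`, `is_number`, and `true_plus_one` are assumptions made for the example.

```python
# Literal values of the four basic types (illustrative values).
pregnancies = 6          # int
bmi = 33.6               # float
name = "John"            # str
has_diabetes = True      # bool

values = (pregnancies, bmi, name, has_diabetes)

# type(value).__name__ gives the short type name, such as "int".
type_names = [type(value).__name__ for value in values]
for value, type_name in zip(values, type_names):
    print(repr(value), "->", type_name)

# isinstance() is the usual way to test a type inside a condition.
is_number = isinstance(bmi, (int, float))

# bool is a subclass of int, so True behaves like 1 in arithmetic.
true_plus_one = True + 1
```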


## Type conversion


Type conversion, also known as type casting, in Python refers to the process of changing the data type of a variable from one type to another. Python provides built-in functions to convert variables from one type to another.

Here are some commonly used type conversion functions in Python:

1. `int()`: Converts a value to an integer.
2. `float()`: Converts a value to a floating-point number.
3. `str()`: Converts a value to a string.
4. `list()`: Converts a value to a list.
5. `tuple()`: Converts a value to a tuple.
6. `bool()`: Converts a value to a Boolean.

Here's an example using the Pima Indian Diabetes dataset to demonstrate type conversion:


In [None]:
import pandas as pd

# Load the Pima Indian Diabetes dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv"
column_names = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome"]
dataset = pd.read_csv(url, names=column_names)

# Convert the 'Age' column to integer (pandas already reads it as an integer column, so this only illustrates the syntax)
dataset['Age'] = dataset['Age'].astype(int)

# Convert the 'Outcome' column from integer to boolean
dataset['Outcome'] = dataset['Outcome'].astype(bool)

# Convert the 'Glucose' column from integer to string
dataset['Glucose'] = dataset['Glucose'].astype(str)

# Print the updated dataset with converted types
print(dataset.dtypes)


In this example, we load the Pima Indian Diabetes dataset using the Pandas library. We then perform type conversion on different columns of the dataset.

First, we convert the 'Age' column to integer using the `astype()` method and specifying the `int` type. pandas already reads 'Age' as an integer column, so this call mainly illustrates the syntax; on a float column it would discard the fractional part of each value.

Next, we convert the 'Outcome' column from integer to boolean using the `astype()` method and specifying the `bool` type. This converts all 0 values to False and non-zero values to True.

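Finally, we convert the 'Glucose' column from integer to string using the `astype()` method and specifying the `str` type. This converts all values in the 'Glucose' column to strings. Depending on the pandas version, `dtypes` reports such a column as `object` or as a dedicated string dtype.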

After performing the type conversions, we print the updated dataset using the `dtypes` attribute to see the data types of each column.
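
The built-in conversion functions listed at the start of this section work on single values as well. The sketch below applies each of them to illustrative literals (not dataset values) and shows one conversion that fails; the variable names are assumptions made for the example.

```python
age_text = "50"               # text, such as a value returned by input()
age = int(age_text)           # "50"   -> 50
bmi = float("33.6")           # "33.6" -> 33.6
truncated = int(33.6)         # 33: the fractional part is discarded
label = str(148)              # 148    -> "148"
flag = bool(0)                # False: zero is falsy, other numbers are truthy
nonempty = bool("False")      # True: any non-empty string is truthy
letters = list("abc")         # ['a', 'b', 'c']
pair = tuple([1, 2])          # (1, 2)

# int() rejects text that is not a valid integer literal.
try:
    int("3.14")
except ValueError as error:
    message = str(error)
    print("Conversion failed:", message)
```

To turn the text "3.14" into a whole number, one option is to convert in two steps: `int(float("3.14"))`.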


# Operators


## Arithmetic operators

Arithmetic operators in Python are used to perform mathematical operations on numeric values. These operators allow you to add, subtract, multiply, divide, and more. Here are the arithmetic operators in Python:

1. Addition (+): Adds two values together.
2. Subtraction (-): Subtracts the right operand from the left operand.
3. Multiplication (*): Multiplies two values.
4. Division (/): Divides the left operand by the right operand, returning a floating-point result.
5. Floor Division (//): Divides the left operand by the right operand and rounds down to the nearest whole number.
6. Modulus (%): Returns the remainder of the division of the left operand by the right operand.
7. Exponentiation (**): Raises the left operand to the power of the right operand.

Here's an example using the Pima Indian Diabetes dataset to demonstrate the use of arithmetic operators:


In [None]:
import pandas as pd

# Load the Pima Indian Diabetes dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv"
column_names = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome"]
dataset = pd.read_csv(url, names=column_names)

# Calculate the average glucose level
average_glucose = dataset['Glucose'].mean()

# Calculate the total pregnancies multiplied by the average glucose level
total_pregnancies = dataset['Pregnancies'].sum()
result = total_pregnancies * average_glucose

# Print the result
print("Total pregnancies multiplied by average glucose level:", result)


In this example, we load the Pima Indian Diabetes dataset using the Pandas library. We first calculate the average glucose level with the `mean()` method of the 'Glucose' column, then the total number of pregnancies with the `sum()` method of the 'Pregnancies' column. Finally, we use the multiplication operator (`*`) to multiply the total pregnancies by the average glucose level and store the product in the `result` variable. We print the result to see the output.
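
All seven arithmetic operators can be seen side by side on two plain numbers. This is a minimal sketch with illustrative operands `a` and `b` (assumed values, unrelated to the dataset).

```python
a, b = 17, 5   # illustrative operands

total = a + b         # 22
difference = a - b    # 12
product = a * b       # 85
quotient = a / b      # 3.4 (true division always returns a float)
floored = a // b      # 3
remainder = a % b     # 2
power = a ** 2        # 289

# Floor division rounds toward negative infinity, not toward zero,
# and the remainder takes the sign of the right operand.
negative_floor = -17 // 5       # -4
negative_remainder = -17 % 5    # 3

print(total, difference, product, quotient, floored, remainder, power)
```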


## Comparison operators

Comparison operators in Python are used to compare values and return a Boolean result (True or False) based on the comparison. These operators are commonly used to make decisions or control the flow of a program.

The comparison operators in Python are as follows:

1. Equal to (==): Checks if two values are equal.
2. Not equal to (!=): Checks if two values are not equal.
3. Greater than (>): Checks if the left operand is greater than the right operand.
4. Less than (<): Checks if the left operand is less than the right operand.
5. Greater than or equal to (>=): Checks if the left operand is greater than or equal to the right operand.
6. Less than or equal to (<=): Checks if the left operand is less than or equal to the right operand.

Here's an example using the Pima Indian Diabetes dataset to demonstrate the use of comparison operators:


In [None]:
import pandas as pd

# Load the Pima Indian Diabetes dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv"
column_names = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome"]
dataset = pd.read_csv(url, names=column_names)

# Filter the dataset to select records with BMI greater than or equal to 30
filtered_data = dataset[dataset['BMI'] >= 30]

# Print the filtered dataset
print(filtered_data)


In this example, we load the Pima Indian Diabetes dataset using the Pandas library. We then use a comparison operator (`>=`) to build a Boolean mask that is True for rows where the BMI (Body Mass Index) is greater than or equal to 30, and we index the dataset with that mask. The resulting `filtered_data` contains only the records that satisfy the condition. Finally, we print the filtered dataset to see the output.
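
On single values, each comparison produces one `True` or `False`. The sketch below uses an illustrative BMI value (an assumption, not a dataset row) and also shows chained comparisons and a common pitfall when comparing floats with `==`.

```python
import math

bmi = 33.6   # illustrative value

is_obese = bmi >= 30          # True
is_exactly_30 = bmi == 30     # False
differs = bmi != 30           # True

# Comparisons can be chained, as in mathematical notation.
in_overweight_range = 25 <= bmi < 30   # False

# Strings are compared character by character.
comes_first = "Age" < "BMI"   # True, because "A" sorts before "B"

# Floating-point rounding can make == surprising; math.isclose() is safer.
naive_equal = (0.1 + 0.2) == 0.3               # False
close_enough = math.isclose(0.1 + 0.2, 0.3)    # True

print(is_obese, in_overweight_range, naive_equal, close_enough)
```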


## Logical operators

Logical operators in Python are used to combine and evaluate multiple conditions or expressions. These operators are typically used in Boolean logic to make decisions or control the flow of a program. Python provides three logical operators:

1. AND (`and`): Returns True if both the left and right operands are True, otherwise returns False.

2. OR (`or`): Returns True if at least one of the left and right operands is True, otherwise returns False.

3. NOT (`not`): Returns the opposite Boolean value of the operand. If the operand is True, it returns False, and if the operand is False, it returns True.

Here's an example using the Pima Indian Diabetes dataset to demonstrate the use of logical operators:


In [None]:
import pandas as pd

# Load the Pima Indian Diabetes dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv"
column_names = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome"]
dataset = pd.read_csv(url, names=column_names)

# Check if a person has high glucose level and high BMI
high_glucose = dataset['Glucose'] > 140
high_bmi = dataset['BMI'] > 30
has_high_glucose_bmi = high_glucose & high_bmi

# Check if a person is either older than 50 or has diabetes
older_than_50 = dataset['Age'] > 50
has_diabetes = dataset['Outcome'] == 1
is_older_than_50_or_has_diabetes = older_than_50 | has_diabetes

# Print the results
print("Persons with high glucose and high BMI:")
print(dataset[has_high_glucose_bmi])

print("\nPersons older than 50 or have diabetes:")
print(dataset[is_older_than_50_or_has_diabetes])


In this example, we load the Pima Indian Diabetes dataset using the Pandas library. We then combine conditions on the dataset. Because each condition is a pandas Series of Boolean values rather than a single True or False, we use the element-wise operators `&` (AND) and `|` (OR) in place of the keywords `and` and `or`.

First, we check if a person has a high glucose level (greater than 140) and a high BMI (greater than 30). We create Boolean Series `high_glucose` and `high_bmi` based on these conditions. Then, we use the element-wise AND (`&`) operator to combine the two conditions and create a new Boolean Series `has_high_glucose_bmi`. This series will have `True` for the rows where both conditions are satisfied.

Next, we check if a person is either older than 50 or has diabetes. We create Boolean Series `older_than_50` and `has_diabetes` based on these conditions. Then, we use the element-wise OR (`|`) operator to combine the two conditions and create a new Boolean Series `is_older_than_50_or_has_diabetes`. This series will have `True` for the rows where at least one of the conditions is satisfied.

Finally, we print the subsets of the dataset where the conditions are satisfied using Boolean indexing (`dataset[condition]`). This allows us to filter the dataset based on the logical conditions and display the relevant rows.
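
The keywords `and`, `or`, and `not` apply to single values rather than to whole columns. The following sketch uses illustrative measurements for one hypothetical person (assumed values, not a dataset row) and also shows short-circuit evaluation.

```python
glucose, bmi, age = 148, 33.6, 50   # illustrative values for one person

high_risk = glucose > 140 and bmi > 30     # True and True   -> True
older_or_lean = age > 60 or bmi < 25       # False or False  -> False
not_high_risk = not high_risk              # False

# `or` returns the first truthy operand, which is handy for defaults.
default_name = "" or "unknown"             # "unknown"

# `and` stops at the first falsy operand, so 1 / 0 is never evaluated.
safe = False and (1 / 0)                   # False, no ZeroDivisionError

print(high_risk, older_or_lean, not_high_risk, default_name, safe)
```

Applying `and` or `or` to two pandas Series raises a `ValueError` because the truth value of a whole Series is ambiguous, which is why column-wise conditions use `&` and `|` instead.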


# Basic Input and Output



## Input function

The `input()` function in Python is used to read input from the user during runtime. It allows the user to enter data from the keyboard, which can then be assigned to a variable and used in the program. The `input()` function pauses the program execution and waits for the user to enter input.

Here's an example of using the `input()` function with the Pima Indian Diabetes dataset:


In [None]:
import pandas as pd

# Load the Pima Indian Diabetes dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv"
column_names = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome"]
dataset = pd.read_csv(url, names=column_names)

# Get user input for a specific column
column_name = input("Enter the column name to display: ")

# Check if the column exists in the dataset
if column_name in dataset.columns:
    # Display the values in the specified column
    column_values = dataset[column_name]
    print(column_values)
else:
    print("Invalid column name.")


In this example, after loading the Pima Indian Diabetes dataset using the Pandas library, we use the `input()` function to prompt the user to enter a column name they want to display from the dataset.

The user enters a column name, which is stored in the `column_name` variable. We then check if the entered column name exists in the dataset using the `in` operator to compare the input with the column names.

If the column name is found in the dataset, we display the values in that column by indexing the dataset with the `column_name`. If the column name is not found, we display an error message.

This allows the user to interactively specify a column from the dataset and view its values during runtime by entering the column name through the `input()` function.
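
A common refinement is to keep asking until the input is valid. The sketch below is one possible approach, not part of the original notebook: the function `ask_for_column` and its `reader` parameter are hypothetical names, and `reader` defaults to the built-in `input()` so that the demonstration can substitute scripted answers for a real keyboard.

```python
def ask_for_column(valid_columns, reader=input):
    """Prompt until the user names one of valid_columns, then return it."""
    while True:
        # input() always returns a string; strip() removes stray spaces.
        answer = reader("Enter the column name to display: ").strip()
        if answer in valid_columns:
            return answer
        print(f"'{answer}' is not valid. Choose from: {', '.join(valid_columns)}")


# Demonstration with scripted answers instead of keyboard input.
columns = ["Glucose", "BMI", "Age"]
scripted_answers = iter(["glucose", " BMI "])
chosen = ask_for_column(columns, reader=lambda prompt: next(scripted_answers))
print("Chosen column:", chosen)
```

The first scripted answer is rejected because the membership test is case-sensitive; the second is accepted once the surrounding spaces are stripped. In an interactive session, calling `ask_for_column(columns)` would read from the keyboard.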


## Print function
The `print()` function in Python is used to output data or information to the console or standard output. It allows you to display text, variables, or the result of expressions during program execution.

Here's an example using the Pima Indian Diabetes dataset to demonstrate the `print()` function:


In [None]:
import pandas as pd

# Load the Pima Indian Diabetes dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv"
column_names = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome"]
dataset = pd.read_csv(url, names=column_names)

# Print the dataset information
print("Dataset Information:")
dataset.info()  # info() prints its report directly and returns None, so it is not wrapped in print()

# Print the first 5 rows of the dataset
print("\nFirst 5 Rows:")
print(dataset.head())

# Print the summary statistics of the dataset
print("\nSummary Statistics:")
print(dataset.describe())


In this example, we load the Pima Indian Diabetes dataset using the Pandas library. We use the `print()` function to display different types of information related to the dataset.

First, we print the dataset information using `dataset.info()`. This provides an overview of the dataset, including the number of rows, column names, and data types of each column.

Next, we print the first 5 rows of the dataset using `dataset.head()`. This shows a glimpse of the data, displaying the top rows.

Finally, we print the summary statistics of the dataset using `dataset.describe()`. This provides statistical information such as count, mean, standard deviation, minimum, maximum, and quartile values for each numerical column in the dataset.

By using the `print()` function, we can output these different pieces of information to the console for analysis and understanding of the dataset.
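
`print()` also accepts the keyword arguments `sep`, `end`, and `file`, which control how its arguments are joined, what is written after them, and where the text goes. The sketch below uses illustrative values; the names `buffer` and `captured` are assumptions made for the example.

```python
import io

# Several arguments are separated by a single space by default.
print("Glucose:", 148)

# sep changes the separator placed between arguments.
print("Pregnancies", "Glucose", "BMI", sep=", ")

# end replaces the newline that print() normally appends.
print("Loading", end="...")
print("done")

# file sends the output to any file-like object instead of the console.
buffer = io.StringIO()
print("BMI", 33.6, sep="=", file=buffer)
captured = buffer.getvalue()
```

Here `captured` holds the text that would otherwise have appeared on the console, including the trailing newline.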


# Reflection points

Here's a list of reflection points with answers on the topics of Variables and Types, Operators, and Basic Input and Output in Python.

**Variables and Types:**
1. What is a variable in Python, and how is it used?
   - A variable is a named container that holds a value in memory. It allows you to store and manipulate data throughout your program.

2. How do you declare and assign a value to a variable in Python?
   - Python has no separate declaration step: you create a variable by writing its name, the assignment operator (`=`), and a value. For example: `name = "John"`. You can assign values of different types, such as strings, numbers, or Booleans, to variables.

3. What are the rules for naming variables in Python?
   - Variable names in Python must start with a letter or underscore (_) and can contain letters, numbers, and underscores. They cannot be reserved keywords such as `if` or `for`. They are case-sensitive, meaning `name` and `Name` are considered different variables.

4. How do you check the type of a variable in Python?
   - You can use the `type()` function to determine the type of a variable. For example: `type(age)` will return the type of the `age` variable.

**Operators:**
1. What are arithmetic operators in Python, and how are they used?
   - Arithmetic operators (+, -, *, /, //, %, **) are used to perform basic mathematical operations such as addition, subtraction, multiplication, division, floor division, modulus, and exponentiation.

2. How do you use comparison operators in Python?
   - Comparison operators (==, !=, >, <, >=, <=) are used to compare values and return Boolean results (True or False) based on the comparison.

3. What are logical operators in Python, and how do they work?
   - Logical operators (and, or, not) are used to combine or modify Boolean values. They allow you to perform logical operations such as conjunction, disjunction, and negation.

4. How do you use assignment operators in Python?
   - Augmented assignment operators (+=, -=, *=, /=, //=, %=, **=) modify the value of a variable and assign the result back to the variable in a single step. For example: `count += 1` is equivalent to `count = count + 1`.
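
The augmented assignment operators can be traced step by step on a single variable. This is a minimal sketch with an illustrative starting value; each comment shows the value of `count` after the statement runs.

```python
count = 10
count += 5     # 15
count -= 3     # 12
count *= 2     # 24
count /= 4     # 6.0  (true division turns the value into a float)
count **= 2    # 36.0
count //= 5    # 7.0
count %= 4     # 3.0

# += also works on strings, where it concatenates.
greeting = "Hello"
greeting += ", World!"

print(count, greeting)
```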

**Basic Input and Output:**
1. How do you take user input in Python?
   - You can use the `input()` function to prompt the user for input. The function waits for the user to enter a value and returns it as a string. For example: `name = input("Enter your name: ")`.

2. How do you display output to the console in Python?
   - You can use the `print()` function to display output to the console. It accepts one or more arguments and prints them to the console. For example: `print("Hello, World!")`.

3. How do you format output in Python?
   - You can use string formatting techniques such as f-strings or the `format()` method to format output in Python. These allow you to insert variables and format them according to specific patterns or placeholders.

4. How do you convert data types in Python?
   - You can use type-specific functions like `int()`, `float()`, `str()`, and `bool()` to convert data from one type to another. For example: `age = int(input("Enter your age: "))` converts the input value to an integer.
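
The formatting techniques mentioned above can be sketched as follows. The values are illustrative (not read from the dataset) and the variable names are assumptions made for the example.

```python
name = "John"
age = 50
bmi = 33.6

# f-string: expressions inside braces are evaluated and inserted.
fstring_line = f"{name} is {age} years old with a BMI of {bmi:.1f}"

# str.format(): the same result with positional placeholders.
format_line = "{} is {} years old with a BMI of {:.1f}".format(name, age, bmi)

# Format specifications control width, alignment, and precision.
padded = f"{age:>5}|{bmi:8.2f}|"    # right-align in 5 and 8 characters
percent = f"{0.348:.1%}"            # multiply by 100 and append a percent sign

print(fstring_line)
print(padded)
print(percent)
```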



# A quiz on Variables and Types


1. Which operator is used for variable assignment in Python?
   <br>a) =
   <br>b) ==
   <br>c) :=
   <br>d) =>

2. What is the data type of the variable "age" if it stores the value 25?
   <br>a) Integer
   <br>b) Float
   <br>c) String
   <br>d) Boolean

3. Which data type would you use to store a person's name?
   <br>a) Integer
   <br>b) Float
   <br>c) String
   <br>d) Boolean

4. Which data type would you use to store a person's height in meters?
   <br>a) Integer
   <br>b) Float
   <br>c) String
   <br>d) Boolean

5. How would you convert the variable "age" from an integer to a string?
   <br>a) str(age)
   <br>b) int(age)
   <br>c) float(age)
   <br>d) bool(age)

6. Suppose the variable "weight" is assigned the value 65.5. How would you convert it to an integer data type?
   <br>a) int(weight)
   <br>b) str(weight)
   <br>c) float(weight)
   <br>d) bool(weight)

7. What is the correct way to convert the string "3.14" to a float?
   <br>a) float("3.14")
   <br>b) int("3.14")
   <br>c) str(3.14)
   <br>d) bool("3.14")

8. Which data type would you use to store a True/False value?
   <br>a) Integer
   <br>b) Float
   <br>c) String
   <br>d) Boolean

9. Suppose the variable "is_diabetic" is assigned the value True. How would you convert it to an integer data type?
   <br>a) int(is_diabetic)
   <br>b) str(is_diabetic)
   <br>c) float(is_diabetic)
   <br>d) bool(is_diabetic)

10. How would you assign the value 42 to the variable "answer"?
    <br>a) answer = 42
    <br>b) answer == 42
    <br>c) answer := 42
    <br>d) answer => 42
---
Answers:
1. a) =
2. a) Integer
3. c) String
4. b) Float
5. a) str(age)
6. a) int(weight)
7. a) float("3.14")
8. d) Boolean
9. a) int(is_diabetic)
10. a) answer = 42
---

# A quiz on Operators


1. Arithmetic operators are used to perform mathematical calculations in Python. Which of the following is NOT an arithmetic operator?
   <br>a) +
   <br>b) /
   <br>c) %
   <br>d) &
   
2. Comparison operators are used to compare values in Python. Which of the following is the correct operator to check if two values are equal?
   <br>a) ==
   <br>b) !=
   <br>c) >=
   <br>d) >
   
3. Logical operators are used to combine multiple conditions in Python. Which of the following is the correct logical operator for "logical OR"?
   <br>a) &&
   <br>b) or
   <br>c) !
   <br>d) &
   
4. Consider the following code snippet:

   ```python
   x = 10
   y = 5
   z = 7
   
   result = (x > y) and (z < y)
   ```
   What will be the value of the `result` variable?
   <br>a) True
   <br>b) False
   
5. Consider the following code snippet:

   ```python
   a = 8
   b = 12
   
   result = (a % 3 == 0) or (b % 3 == 0)
   ```
   What will be the value of the `result` variable?
   <br>a) True
   <br>b) False
   
6. Consider the following code snippet:

   ```python
   a = 5
   b = 7
   
   result = (a < b) or (a > b)
   ```
   What will be the value of the `result` variable?
   <br>a) True
   <br>b) False
   
7. The Pima Indian dataset contains information about individuals, including their age, blood pressure, glucose level, etc. Which type of operators would you use to compare the age of two individuals in the dataset?
   <br>a) Arithmetic operators
   <br>b) Comparison operators
   <br>c) Logical operators
   
8. The Pima Indian dataset contains binary classification labels, such as whether an individual has diabetes or not. Which type of operators would you use to check if an individual has diabetes or not based on their label in the dataset?
   <br>a) Arithmetic operators
   <br>b) Comparison operators
   <br>c) Logical operators
---
Answers:

1. d) &

2. a) ==

3. b) or (Python spells logical OR as the keyword `or`; `||` is not valid Python)

4. b) False

5. a) True

6. a) True

7. b) Comparison operators

8. b) Comparison operators (for example, `Outcome == 1`)
---