<a href="https://colab.research.google.com/github/poepping/hello-world/blob/main/Introduction_Python_TLP.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Workshop Title: Introduction to Python Programming for Data Science**

**Duration: 3 hours**

**Objective**: To introduce beginners to the Python programming language with a focus on its applications in data science.

**Prerequisites**: No prior programming experience required.

---

### I. Introduction (15 minutes)
**Key Concepts**: Python, Jupyter Notebook

- Brief Introduction to the Workshop Topics
- Importance and Applications of Python in Data Science
- Introduction to Python
- Why Python for Data Science
- Setting up the Python Environment: Jupyter Notebook

---

### II. Python Basics (30 minutes)

**Key Concepts**: Variables, Data types, Arithmetic operations, String operations

- Introduction to Python Syntax
- Variables and Data Types in Python
- Basic Arithmetic Operations
- String Operations

---

### III. Control Structures in Python (30 minutes)

**Key Concepts**: Conditional statements, Loops

- Introduction to Control Structures
- Conditional Statements: if, elif, else
- Loops: for and while

---

### IV. Functions in Python (30 minutes)

**Key Concepts**: Function definition, Built-in functions

- Introduction to Functions in Python
- Defining and Calling Functions
- Basic Built-in Functions

---

### V. Python Data Structures (30 minutes)

**Key Concepts**: Lists, Tuples, Dictionaries

- Introduction to Python Data Structures
- Lists: Creation, Access, Modification
- Tuples: Creation, Access
- Dictionaries: Creation, Access, Modification

---

### VI. Introduction to Libraries (15 minutes)

**Key Concepts**: Importing libraries, Math, Random, Datetime

- Introduction to Python Libraries
- Importing Libraries
- Math, Random, and Datetime Libraries

---

### VII. Introduction to NumPy and pandas (30 minutes)

**Key Concepts**: NumPy arrays, pandas DataFrames

- Introduction to NumPy and pandas
- Creating and Manipulating NumPy Arrays
- Creating and Manipulating pandas DataFrames

---

### VIII. Conclusion and Next Steps (15 minutes)

- Recap of the Workshop
- Overview of More Advanced Python Topics
- Resources for Further Learning
- Q&A and Closing Remarks

# Section 2: Python Basics

## Data Types

In this part, we learn how to create variables in Python and assign values to them. Variables can be of various types, like **integer**, **float**, **string**, and **boolean**. We use the `print()` function to display the value of a variable and the `type()` function to determine its data type.

In [17]:
# Integer
x = 10
y = 5
print(x, y, x/y, type(x), type(y), type(x/y))

# Float
p = 3.14
print(p, type(p))

# String
s = "Hello python!"
print(s, type(s))

# Boolean
bt = True
bf = False
print(bt, bf, type(bt))

10 5 2.0 <class 'int'> <class 'int'> <class 'float'>
3.14 <class 'float'>
Hello python! <class 'str'>
True False <class 'bool'>


## Basic Arithmetic Operations

Next, we cover the basic arithmetic operations that you can perform in Python. This includes **addition (`+`), subtraction (`-`), multiplication (`*`), division (`/`), modulus (`%`), and exponentiation (`**`)**. These operations work as you'd expect from your mathematics classes.

In [30]:
# Addition
total = 5+3; print(total)   # named total because a variable called sum would shadow the built-in sum() function
sumxy = x + y; print(sumxy)

# Subtraction
difference = 10-7; print(difference)
diffxy = x - y; print(diffxy)

# Multiplication
product = 4 * 7; print(product, type(product))
product = 4 * 7.0; print(product, type(product))
product = x * y; print(product, type(product))

# Division
quotient = 22/7; print(quotient, type(quotient))

# Modulus
remainder = 10 % 3; print(remainder)

# Exponentiation
square = 7**2; print(square)

8
15
3
5
28 <class 'int'>
28.0 <class 'float'>
50 <class 'int'>
3.142857142857143 <class 'float'>
1
49
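
One detail worth noting from the output above: `/` always produces a float, even when both operands are integers. The sketch below contrasts it with floor division (`//`) and the built-in `divmod()`, neither of which appears in the cell above; the variable names are illustrative only.

```python
# True division always returns a float, even for two integers
true_div = 10 / 5            # 2.0

# Floor division discards the fractional part; two ints give an int
floor_div = 10 // 3          # 3

# divmod() returns the floor quotient and the remainder in one call
quotient, remainder = divmod(10, 3)

print(true_div, floor_div, quotient, remainder)
```

Floor division and modulus are often used together, for example to convert a number of minutes into hours and leftover minutes.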


## Basic String Operations

Finally, we explore several operations that can be performed on strings, which are sequences of characters. We see how to concatenate (join) strings using the `+` operator, repeat strings using the `*` operator, and access specific characters in a string via indexing (e.g., `s[0]`). We also learn how to get a substring from a string using slicing (e.g., `s[1:4]`), and determine the length of a string using the `len()` function.

In [63]:
# String Concatenation
greeting = "Hello" + " " + "Python"; print(greeting)
greeting = "Hello" + "Python"; print(greeting)

# String Repetition
laugh = "Ha"*5; print(laugh)

# String Indexing
first_letter = greeting[0]; sixth_letter = greeting[5]; print(greeting, first_letter, sixth_letter)  # indexing starts at 0, so index 5 is the sixth character

# String Slicing
python = greeting[5:11]; print(python)
hp = greeting[0]+greeting[5]; print(hp)

# String Length
slength = len(greeting); print(slength)

Hello Python
HelloPython
HaHaHaHaHa
HelloPython H P
Python
HP
11
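
Indexing and slicing have a few more forms that the cell above does not show. The following is a minimal sketch of negative indices and slices with an omitted start or end; the variable names are chosen for illustration.

```python
greeting = "HelloPython"

last_letter = greeting[-1]   # negative indices count from the end
first_five = greeting[:5]    # omitted start: slice from the beginning
rest = greeting[5:]          # omitted end: slice through the last character

print(last_letter, first_five, rest)

# Strings are immutable: an assignment such as greeting[0] = "J"
# would raise a TypeError, so we build a new string instead.
jello = "J" + greeting[1:]
print(jello)
```

Note that the end index of a slice is exclusive, which is why `greeting[:5]` and `greeting[5:]` split the string without overlapping.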


This section forms the foundation of Python programming. As you proceed with the workshop, you'll find that these concepts are integral to understanding and writing Python code, whether it's for simple tasks or complex data science projects.

# Section 3: Control Structures

## Conditional Statements

This part introduces conditional statements in Python, which allow us to execute certain pieces of code based on specific conditions. We learn about the **if statement**, which checks if a condition is true and executes a block of code if it is. We also learn about the else clause, which lets us specify a block of code to be executed if the condition in the if statement is false. Lastly, we cover the **elif** clause (short for "else if"), which allows us to check multiple conditions and execute a block of code as soon as one of the conditions evaluates to true.

In [59]:
# if statement
x = 7; y = -7
if x > 0:
  print("x is positive")
  if y < 0:
    print("y is negative")

# if-else statement
if x % 2 == 0:
  print("x is even")
else:
  print("x is odd")

# if-elif-else statement
color = "red"
if color == "blue":
  print("The color is blue")
elif color == "green":
  print("The color is green")
else:
  print("The color is neither blue nor green")


x is positive
y is negative
x is odd
The color is neither blue nor green
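
To see that an `if`/`elif` chain stops at the first condition that is true, consider the small sketch below. The `score` value and the grade thresholds are made up for illustration.

```python
score = 85  # hypothetical value

if score >= 90:
    grade = "A"
elif score >= 80:    # checked only because the first condition was False
    grade = "B"
elif score >= 70:    # also true for 85, but never reached: an earlier branch matched
    grade = "C"
else:
    grade = "F"

print(grade)
```

Because only one branch runs, the order of the conditions matters: testing `score >= 70` first would assign "C" to every score of 70 or above.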


## Loops

Next, we learn about loops in Python, which allow us to execute a block of code multiple times. We cover the for loop, which iterates over a sequence (like a list or a string) or other iterable objects. We also learn about the `range()` function, which generates a sequence of numbers that we can iterate over. Lastly, we cover the while loop, which continues to execute a block of code as long as a certain condition remains true.

### For Loops

In this part, we learn how to use the for loop to iterate over different types of sequences, including lists and strings. The for loop executes a block of code for each item in the sequence, making it extremely useful for tasks that involve processing collections of items, such as summing a list of numbers or processing each character in a text string.

We also introduce the concept of nested for loops, which involves placing one loop inside another, allowing for more complex iteration patterns.

In [76]:
# Iterating over a list
fruits = ["apple", "banana", "cherry"]
for fruit in fruits:
  print(fruit)

# Iterating over a string
for char in "Hee": print(char)
#for i in "Hee": print(i)

# Using the range function
for i in range(5): print(i)

# Using nested for loops
for i in range(3):
  for j in range(3):
    print(i,j, i/10, j/10)


apple
banana
cherry
H
e
e
0
1
2
3
4
0 0 0.0 0.0
0 1 0.0 0.1
0 2 0.0 0.2
1 0 0.1 0.0
1 1 0.1 0.1
1 2 0.1 0.2
2 0 0.2 0.0
2 1 0.2 0.1
2 2 0.2 0.2
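
The two tasks mentioned in the text, summing a list of numbers and processing each character of a string, can be sketched as follows. The list contents and the sample string are assumptions made for this example.

```python
# Summing a list of numbers with a for loop
numbers = [3, 8, 2, 7]
total = 0
for n in numbers:
    total += n           # add the current item to the running total
print(total)

# Processing each character in a string: count the vowels
vowel_count = 0
for char in "data science":
    if char in "aeiou":  # True when char is one of these five letters
        vowel_count += 1
print(vowel_count)
```

In practice the built-in `sum(numbers)` gives the same total in one call; the explicit loop is shown here to make the accumulation pattern visible.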


### While Loops

Next, we cover the while loop, which repeatedly executes a block of code as long as a given condition is true. This is useful for tasks where you don't know in advance how many times the loop should run (for example, when you're waiting for a certain condition to be met).

We also introduce the break and continue statements, which provide more control over the loop execution. The break statement allows us to exit the loop prematurely when a certain condition is met, while the continue statement allows us to skip the rest of the current loop iteration and immediately proceed to the next one.

In [83]:
# Basic while loop
counter = 0
while counter < 5:
  print(counter)
  counter += 1    # increment counter at end; counter = counter + 1;

# While loop with break statement
counter = 0
while counter < 5:
  if counter == 3:    # last value of counter to be printed will be 2
    break
  print(counter)
  counter +=1         # increment at end; the break stops the loop at 3, so only 0 through 2 are printed

# While loop with continue statement
print()
counter = 0
while counter < 5:
  counter +=1         # increment at start; values 1 through 5 used in loop
  if counter == 3:    # 3 is the only value that will not be printed
    continue          # skip back to top of while loop and continue
  print(counter)


0
1
2
3
4
0
1
2

1
2
4
5
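
As an illustration of a loop whose number of iterations is not known in advance, the sketch below halves a number until it is no longer greater than 1 and counts how many steps that takes. The starting value is arbitrary.

```python
value = 100
steps = 0

# We do not know beforehand how many halvings are needed,
# so we loop on the condition rather than on a fixed range.
while value > 1:
    value = value / 2
    steps += 1

print(steps, value)
```

If the condition never becomes false, a `while` loop runs forever, so it is worth checking that something inside the loop body moves the program toward the exit condition.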


# Section 4: Functions

Define a Function:

In this part, we learn how to define our own functions using the `def` keyword. A function is a reusable block of code that performs a specific task. Once a function is defined, we can call it by its name, followed by parentheses, which executes the code within the function.

In [95]:
# Defining a function
def greet(name):
  print(f"Hello, {name}?")

# Calling a function
greet("Alice")
greet("anyone")
greet("")

Hello, Alice?
Hello, anyone?
Hello, ?


Function with parameters:

Next, we learn how to define functions with parameters. Parameters are variables that are included in the function definition and that accept values when the function is called. The values that we provide to the function at the call are known as arguments.

In [None]:
# Defining a function with a parameter

# Calling a function with an argument
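
One possible way to fill in this cell is sketched below. The function name `print_sum` and its parameters `a` and `b` are hypothetical and only serve to show the difference between parameters (in the definition) and arguments (in the call).

```python
# Defining a function with two parameters, a and b
def print_sum(a, b):
    total = a + b
    print(f"{a} + {b} = {total}")

# Calling the function with arguments; 2 is bound to a, 3 to b
print_sum(2, 3)
print_sum(10, -4)
```

Arguments are matched to parameters by position here, so `print_sum(2, 3)` runs the function body with `a = 2` and `b = 3`.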


Function with return values:

Here, we learn how to return a value from a function using the return keyword. The returned value can be stored in a variable or used directly in an expression. This allows us to produce output from our functions that can be used elsewhere in our code.

In [93]:
# Defining a function with a return value
def square(num):
  return num ** 2

# Calling a function and storing its return value
result = square(7)
print(result)

49
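
A returned value can also be used directly in an expression, without storing it first. A minimal sketch, reusing the `square` function from the cell above (the other names are illustrative):

```python
def square(num):
    return num ** 2

# Using return values directly inside a larger expression
hypotenuse_squared = square(3) + square(4)
print(hypotenuse_squared)

# Passing one call's return value as the argument of another call
fourth_power = square(square(2))
print(fourth_power)
```

This is the main difference from a function that only prints: a printed value is shown on screen, while a returned value can be passed on to other code.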


Default and Keyword Arguments:

Lastly, we cover default arguments, which are parameters that have a default value provided in the function definition. This value is used if no argument is provided for that parameter when the function is called. We also learn about keyword arguments, which are arguments provided at the function call, specified by the parameter name. This allows us to call a function with arguments in any order.

In [105]:
# Defining a function with a default argument
def greet(name="Guest"):
  print(f"Hello, {name}!")

# Calling the function with and without an argument
greet("Alice")
greet()

# Defining a function whose parameters both have default values
def describe_pet(animal_type="dog", pet_name="Harry"):
  print(f"I have a {animal_type} named {pet_name}.")

# Calling the function with keyword arguments, with defaults only, and with positional arguments
describe_pet(animal_type="dog", pet_name="Harry")
describe_pet()
describe_pet("cockapoo", "Luna")

Hello, Alice!
Hello, Guest!
I have a dog named Harry.
I have a dog named Harry.
I have a cockapoo named Luna.
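
The calls above always pass the arguments in the order of the definition. The sketch below shows keyword arguments given in a different order and a call that overrides only one default. As an assumption for this example, this variant of `describe_pet` returns the sentence instead of printing it.

```python
def describe_pet(animal_type="dog", pet_name="Harry"):
    # Variant of the function above that returns the sentence
    return f"I have a {animal_type} named {pet_name}."

# Keyword arguments can be given in any order
reordered = describe_pet(pet_name="Luna", animal_type="cockapoo")
print(reordered)

# Overriding only one parameter; the other keeps its default
only_name = describe_pet(pet_name="Rex")
print(only_name)
```

With positional arguments the same swap would change the meaning: `describe_pet("Luna", "cockapoo")` would describe a "Luna" named "cockapoo".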


Add-on: Built-in Functions

In [106]:
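# Basic built-in functions
print(len("Python"))     # number of characters in the string
print(int(123))
#print(int("Python"))    # would raise a ValueError: "Python" cannot be converted to an integer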

6
123
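
A few more built-in functions come up constantly when working with data. The sketch below is a small sampler; the `scores` list is made up for illustration.

```python
scores = [88, 92.5, 79]

# Built-ins that summarise a collection
print(len(scores), min(scores), max(scores), sum(scores))

# Combining built-ins: the average, rounded to one decimal place
average = sum(scores) / len(scores)
print(round(average, 1))

# Converting between types
number = int("123") + 1      # string -> int
text = str(42) + "!"         # int -> string
print(number, text)
```

None of these need an import: built-in functions are available in every Python program.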


# Section 5: Python Data Structures

## Lists

Lists are ordered collections of items that are mutable, meaning we can add, remove, or change items after the list is created. In this part, we learn how to create a list, access its elements using indices, modify its elements, and add or remove elements using the `append()` and `remove()` methods, respectively.

In [160]:
# Creating a list
fruits = ["apple", "banana", "cherry"]; print(fruits, type(fruits))

# Accessing elements
print(fruits[0])  # print first element
print(fruits[-1]) # print last element

# Modifying elements
fruits[1] = "blueberry" ; print(fruits)

# Adding elements
fruits.append("dragonfruit"); print(fruits)

# Removing elements
fruits.remove("apple"); print(fruits)

['apple', 'banana', 'cherry'] <class 'list'>
apple
cherry
['apple', 'blueberry', 'cherry']
['apple', 'blueberry', 'cherry', 'dragonfruit']
['blueberry', 'cherry', 'dragonfruit']


In [127]:
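# More list operations: append() adds its argument as ONE element
my_list = [1, 2, 3, 4, 5]; print(my_list)
my_list.append(6); print(my_list)
my_list.append([7, 8, 9]); print(my_list)   # the whole list [7, 8, 9] becomes a single nested element
my_list.remove(1); print(my_list)           # removes the first occurrence of the value 1
# The list comprehension below raises a TypeError here, because the nested
# list [7, 8, 9] cannot be squared; it works once every element is a number.
#squares = [x**2 for x in my_list]; print(squares)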

[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 6, [7, 8, 9]]
[2, 3, 4, 5, 6, [7, 8, 9]]
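
The commented-out comprehension in the cell above fails because `append([7, 8, 9])` leaves a nested list inside `my_list`. A minimal sketch of one way around this, assuming the intent was to add the items individually, uses `extend()` instead:

```python
my_list = [1, 2, 3, 4, 5]

# extend() adds each item of the given list individually,
# so my_list stays a flat list of numbers
my_list.extend([6, 7, 8])

# List comprehension: build a new list by transforming each element
squares = [x**2 for x in my_list]
print(squares)

# A comprehension can also filter with an if clause
even_squares = [x**2 for x in my_list if x % 2 == 0]
print(even_squares)
```

A list comprehension is a compact alternative to a `for` loop that appends to an initially empty list.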


## Tuples

Tuples are similar to lists, but they are immutable, meaning we can't change their size or the values of their items once they're created. This makes tuples useful for grouping related data and ensuring it doesn't get changed. We learn how to create a tuple and access its elements.

In [159]:
# Creating a tuple; Notice round brackets ()
fruits = ("apple", "banana", "cherry"); print(fruits, type(fruits))

# Accessing elements
print(fruits[0])
print(fruits[-1])

# Tuples are immutable
# fruits[1] = "blueberry"  # This will raise an error

('apple', 'banana', 'cherry') <class 'tuple'>
apple
cherry
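
To observe the immutability without stopping the notebook, the sketch below wraps the assignment in `try`/`except`, a construct not covered in this workshop and used here only to catch the error. It also shows tuple unpacking, a common way to read all elements at once.

```python
fruits = ("apple", "banana", "cherry")

# Attempting to change an element raises a TypeError
try:
    fruits[1] = "blueberry"
except TypeError as err:
    message = str(err)
    print("Cannot modify a tuple:", message)

# Unpacking: assign each element to its own variable
first, second, third = fruits
print(first, third)
```

The tuple is unchanged after the failed assignment, which is exactly the guarantee that makes tuples useful for data that should stay fixed.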


## Dictionaries

Dictionaries are collections of key-value pairs, where each key is unique. Since Python 3.7 they preserve insertion order, but items are looked up by key rather than by position. This allows us to access, modify, add, or remove items using their keys, which can be of any immutable (more precisely, hashable) data type, such as a string or a number. We learn how to create a dictionary, access its elements using keys, modify its elements, and add or remove elements.

In [156]:
# Creating a dictionary ; Notice the curly brackets {}
student = {"name":"John", "age":21, "courses":["Math", "CompSci"]}

# Accessing elements
print(student["name"])

# Modifying elements
student["age"] = 22
print(student)

# Adding elements
student["grade"]=[90,95]; print(student)

# Removing elements
del student["age"]
print(student)
print(type(student))

John
{'name': 'John', 'age': 22, 'courses': ['Math', 'CompSci']}
{'name': 'John', 'age': 22, 'courses': ['Math', 'CompSci'], 'grade': [90, 95]}
{'name': 'John', 'courses': ['Math', 'CompSci'], 'grade': [90, 95]}
<class 'dict'>


In [138]:
# Viewing the keys, the values, and the key-value pairs of a dictionary
print(student.keys())
print(student.values())
print(student.items())


dict_keys(['name', 'courses', 'grade'])
dict_values(['John', ['Math', 'CompSci'], [90, 95]])
dict_items([('name', 'John'), ('courses', ['Math', 'CompSci']), ('grade', [90, 95])])
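
The `items()` view is most often used in a `for` loop. The sketch below iterates over the key-value pairs and also shows `get()`, which returns a fallback value instead of raising a `KeyError` when a key is missing. The dictionary is recreated here so that the example is self-contained.

```python
student = {"name": "John", "courses": ["Math", "CompSci"], "grade": [90, 95]}

# Looping over key-value pairs
for key, value in student.items():
    print(key, "->", value)

# "age" was deleted earlier, so student["age"] would raise a KeyError;
# get() returns the given default instead
age = student.get("age", "unknown")
print(age)

# Membership tests check the keys
has_name = "name" in student
has_age = "age" in student
print(has_name, has_age)
```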


# Section 6: Introduction to Libraries

Importing Libraries:

In Python, libraries are collections of functions and methods that allow you to carry out many actions without writing the code yourself. In this part, we learn how to import the `math` library and use its `sqrt()` function to calculate the square root of a number.

In [139]:
# Importing a library
import math

# Using a function from the library
print(math.sqrt(16))

4.0


Importing with Aliases:

Sometimes, for convenience and ease of use, libraries are imported using aliases. Here, we import the `random` library using `rnd` as an alias, and then use the `randint()` function to generate a random integer between 1 and 10 (both endpoints included).

In [150]:
# Importing a library with an alias
import random as rnd    #create a nickname

# Using a function from the library
print(rnd.randint(1, 10))


2


Importing Specific Functions:

When we only need a specific name from a library, we can import that alone. In this example, we import the `date` class from the `datetime` library and call its `today()` method to print today's date.

In [151]:
# Importing a specific class
from datetime import date

# Calling a method of the class
print(date.today())


2024-04-16


Exploring Library Documentation:

Understanding how to use a function is crucial when programming. The `help()` function provides a way to access the documentation of a function, which can provide useful information on how to use it. Here, we use `help()` to access the documentation for the `sqrt` function from the `math` library.

In [153]:
# Getting help on a function
help(math.sqrt)

Help on built-in function sqrt in module math:

sqrt(x, /)
    Return the square root of x.



# Section 7: Introduction to NumPy and pandas

## NumPy

NumPy (Numerical Python) is a powerful library for performing mathematical and logical operations on arrays. In this part, we learn how to create a NumPy array and perform operations on it. We also see how NumPy enables element-wise computations, which are highly efficient and useful for scientific computing.

In [155]:
# Importing numpy
import numpy as np

# Creating a numpy array
arr = np.array([1, 2, 3, 4, 5]) ; print(arr, type(arr))

[1 2 3 4 5] <class 'numpy.ndarray'>


In [158]:
# Performing operations on numpy arrays
print(arr * 2)
print(np.mean(arr), type(np.mean(arr)))

[ 2  4  6  8 10]
3.0 <class 'numpy.float64'>
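
Element-wise computation also applies to operations between two arrays of the same length. The sketch below shows this and contrasts it with a plain Python list, where `*` means repetition rather than multiplication; the array `other` is made up for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
other = np.array([10, 20, 30, 40, 50])

# Element-wise operations: each position is combined independently
added = arr + other
squared = arr ** 2
print(added)
print(squared)

# For comparison, a plain list is repeated by *, not multiplied
repeated = [1, 2, 3] * 2
print(repeated)
```

No explicit loop is needed for the array operations, which is what makes NumPy code both short and fast for numerical work.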


## pandas

pandas is a high-level data manipulation tool built on the NumPy package. Its key data structure is called the DataFrame, which allows you to manipulate and analyze data in a tabular form. In this section, we learn how to create a DataFrame, access its data, and apply descriptive statistics methods. pandas is essential for data manipulation and analysis in Python, and is commonly used in conjunction with other data science libraries.

In [205]:
# Importing pandas
import pandas as pd

# Creating a pandas DataFrame
data = {
    "name":["John", "Anna", "Peter", "Linda", "Anna"],
    "Age":[23, 21, 29, 33, 40]
}

df = pd.DataFrame(data)
print(df)

    name  Age
0   John   23
1   Anna   21
2  Peter   29
3  Linda   33
4   Anna   40


In [206]:
# Accessing data in DataFrame
print(df["name"])
print(df.loc[0])


0     John
1     Anna
2    Peter
3    Linda
4     Anna
Name: name, dtype: object
name    John
Age       23
Name: 0, dtype: object
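
Beyond a single column or a single row, it is often useful to pick out one cell or a range of rows. The sketch below rebuilds the same DataFrame so that it runs on its own and contrasts label-based `loc` with position-based `iloc`; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["John", "Anna", "Peter", "Linda", "Anna"],
    "Age": [23, 21, 29, 33, 40],
})

# One cell: row label 2, column "Age"
age_of_peter = df.loc[2, "Age"]
print(age_of_peter)

# Several columns at once: pass a list of column names
subset = df[["name", "Age"]]

# Row ranges: loc slices by label and INCLUDES the end label,
# iloc slices by position and EXCLUDES the end position
by_label = df.loc[0:2]       # rows with labels 0, 1, 2
by_position = df.iloc[0:2]   # rows at positions 0, 1
print(len(by_label), len(by_position))
```

With the default integer index, labels and positions coincide, so the inclusive versus exclusive end point is the main visible difference between `loc` and `iloc`.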


In [207]:
# Descriptive statistics
df.describe(include="all")

        name       Age
count      5       5.0
unique     4       NaN
top     Anna       NaN
freq       2       NaN
mean     NaN      29.2
std      NaN  7.694154
min      NaN      21.0
25%      NaN      23.0
50%      NaN      29.0
75%      NaN      33.0
max      NaN      40.0

In [208]:
# Adding a new column to the DataFrame
df["Score"] = [85, 90, 88, 92, 100]

In [209]:
# Saving DataFrame to a csv file
df.to_csv("DataFrame_data.csv", index=False)

In [210]:
# Reading a csv file to a DataFrame
df_from_csv = pd.read_csv("DataFrame_data.csv")
print(df_from_csv)

#compare:
df.to_csv("DataFrame_data.csv", index=True)
df_from_csv = pd.read_csv("DataFrame_data.csv")
print(); print(df_from_csv)


    name  Age  Score
0   John   23     85
1   Anna   21     90
2  Peter   29     88
3  Linda   33     92
4   Anna   40    100

   Unnamed: 0   name  Age  Score
0           0   John   23     85
1           1   Anna   21     90
2           2  Peter   29     88
3           3  Linda   33     92
4           4   Anna   40    100


In [211]:
# Accessing data using conditions
print(df[df["Age"]>25])


    name  Age  Score
2  Peter   29     88
3  Linda   33     92
4   Anna   40    100


In [213]:
# Grouping data
#df["Score"] = [85, 90, 92, 92, 100]
df["Age"] = [21, 21, 24, 24, 50]  # create new ages with some duplicates to see the effect of grouping
print(df); print()
df.groupby("Age")["Score"].mean()

    name  Age  Score
0   John   21     85
1   Anna   21     90
2  Peter   24     88
3  Linda   24     92
4   Anna   50    100



Age
21     87.5
24     90.0
50    100.0
Name: Score, dtype: float64

In [214]:
# Applying functions
df["NewAge"] = df["Age"].apply(lambda x:x+1)
print(df)

    name  Age  Score  NewAge
0   John   21     85      22
1   Anna   21     90      22
2  Peter   24     88      25
3  Linda   24     92      25
4   Anna   50    100      51


In this section, we expand our knowledge of pandas DataFrames. First, we learn how to add a new column to the DataFrame. Then, we cover how to save the DataFrame as a `csv` file using the `to_csv` method and how to read the csv file back into a DataFrame using the `read_csv` function.

We also delve into more complex operations on DataFrames, such as accessing data using conditions, grouping data by a certain column, and applying functions to columns. These operations are essential for data manipulation and analysis in pandas.

Finally, we use the `apply()` method to apply a function to each element in a DataFrame column. This is a powerful tool that allows us to perform complex transformations on our data. In this example, we use a `lambda` function to add 1 to each age in the 'Age' column.
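
For simple arithmetic such as adding 1, the same result can usually be obtained without `apply()`, because pandas columns support element-wise operations just like NumPy arrays. The sketch below compares both approaches on a small stand-alone DataFrame that mirrors the ages used above.

```python
import pandas as pd

df = pd.DataFrame({"Age": [21, 21, 24, 24, 50]})

# Using apply() with a lambda, as in the cell above
with_apply = df["Age"].apply(lambda x: x + 1)

# Vectorized alternative: operate on the whole column at once
vectorized = df["Age"] + 1

print(with_apply.tolist())
print(vectorized.tolist())
```

The vectorized form is typically shorter and faster; `apply()` remains useful when the transformation cannot be written as a simple column expression.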