# **APPLIED PYTHON 101**


---

## Context

The goal of this program is to teach **_data storytelling_** in context.

---

## Objectives
The main goal of this notebook is to extend and apply concepts covered in the prerequisite [Kaggle Python tutorial](https://www.kaggle.com/learn/python). If you have prior experience coding in Python, then you can skip straight to the exercises in **Part B**. If you are less familiar with Python, we strongly suggest exploring each section below and running the provided example code blocks. **Focus on understanding why each line of code works and how the output can be changed**: you can add a new block using the "+ Code" button to test your own modifications.

**Learning to code in Python is similar to learning to speak a new language. It will take time and hands-on practice.** Googling unknown terms and phrases is a useful strategy to help you quickly find answers and debug errors. Sites like [StackOverflow](https://stackoverflow.com/), [Towards Data Science](https://towardsdatascience.com/), and [Kaggle](https://www.kaggle.com/) usually have high-quality posts and troubleshooting suggestions, especially related to data analysis and machine learning applications.

Additionally, **we strongly encourage you to use the pertinent Moodle pages to discuss problems and points of uncertainty**. If you're stuck on a problem, it's very likely that someone else in the course is having, or has had, a similar issue.

---

**Note**: Additional (optional) background information is provided for you in the blue URLs.

**Note**: Since you'll need to modify the code in this Colab Notebook, please make a copy of the Notebook for your use. Go to **File > Save a copy in Drive** and then work on that copy.

---



# [S] Part A: Opening the black box: Data Structures to Functions
## Motivating Question: What happens when we need to write our own code???
Using pre-built packages is easy, but writing our own code from scratch can quickly become much more challenging. Here, we'll take a bottom-up approach, starting with the primitive data structures and operators, then the built-in data structures, which together form the foundation of Python code. This section is an application of the concepts covered in the Kaggle Python tutorial and requires a basic understanding of Python. Future modules will build upon the concepts highlighted here.


---


#### Organization of Python Code: Data Structures to Functions
- Packages or libraries
- Modules
- Classes
- **Functions**: a reusable, modular workflow in Python code
- **Scripts or code snippets**: single-use, simple code or code fragments
- **Data Structures**
  - **Primitive**
  - **Non-primitive**


---

**Note**: Machine Learning requires humans (i.e. specifically us as data analysts) to **translate human-understandable data**, like words, images, and audio, **into machine-understandable formats**, such as data structures, numbers, and network graphs. This notebook focuses on several basic data transformations and manipulations that can occur in Python.

## A.1 Quick Python review:

You are probably familiar with the primitive data structures from R, which are very similar to the Python primitives:
* integers
* floats
* strings
* Boolean (`True` and `False`)
* Special case: `None`



In [None]:
# Integers
4
4376

# Floating-point numbers
0.34
1.1352476
10.0

# Strings
'Blood pressure (BP) is the pressure exerted by circulating blood against the walls of blood vessels.'
'a'
'The minimum pressure for an adult between two heartbeats (diastolic pressure) is normally 80 millimeters of mercury (mmHg) above the surrounding atmospheric pressure.'
'120'

# Boolean
True
False

# Special Case
None

### A.1a Type Casting

We can convert between some data representations using type casting. Check the data type using the function `type()`.

**Note**: Excel data is notorious for mixing numerics and strings, which needs to be resolved prior to analysis.

In [None]:
# Normal BP is considered to be 127/79 mmHg in men and 122/77 mmHg in women

x = 127
y = int('122')

print(x, type(x))
print(y, type(y))

# Without the int() cast above, `y` would be the string '122' and `x + y` would
# raise a TypeError. Casting the other way, str(x) + '122', would concatenate
# the two values into '127122' instead of adding them.
z = (x + y)/2           # Compute the average BP across men and women
print(z, type(z))

class percentage_hg():
  def __init__(self, hg_value, name):
    self.hg_value = hg_value
    self.name = name
  # Every method takes `self` as its first parameter so it can reach the instance's data
  def to_decimal(self):
    return self.hg_value/100
  # Adds an external decimal hg value to this object's percentage hg value
  def add_float_to_me(self, float_val):
    return float_val*100 + self.hg_value
  # Adds an external percentage hg value to this object's percentage hg value
  def add_percentage(self, other_percentage):
    return other_percentage + self.hg_value

# An object instance of the class
my_hg = percentage_hg(z, 'Ganesh Ramakrishna')

127 <class 'int'>
122 <class 'int'>
124.5 <class 'float'>
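
The note about mixed numerics and strings can be illustrated with a minimal sketch. The list `raw_readings` and the helper `to_int_or_none` are hypothetical names introduced only for this illustration, and the sketch assumes that entries which cannot be cast should become `None`.

```python
# Hypothetical readings as they might arrive from a spreadsheet:
# a mix of ints, numeric strings, and a non-numeric entry.
raw_readings = [127, '122', ' 118 ', 'n/a']

def to_int_or_none(value):
    """Cast value to int; return None when the cast is not possible."""
    try:
        # int() accepts ints and numeric strings (surrounding whitespace is ignored)
        return int(value)
    except (TypeError, ValueError):
        return None

clean_readings = [to_int_or_none(v) for v in raw_readings]
print(clean_readings)

# The direction of the cast decides what `+` means
concatenated = str(127) + '122'   # string concatenation
added = 127 + int('122')          # integer addition
print(concatenated, added)
```

Whether unreadable entries should become `None`, be dropped, or raise an error depends on the analysis; the choice here is only one option.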


### A.1b Operators

Basic Python operators fall into 7 categories:

* Arithmetic Operators - `+`, `-`, etc.
* Relational Operators - `!=` "is not equal", `==` "is equal", `<`, `<=`, etc.
* Assignment Operators - `=`
* Logical Operators - `and`, `or` are used to combine relational operations
* Membership Operators - `in`, `not in` are used to check if A is in B
* Identity Operators - `is`, `is not` checks if A and B are the same object
* Bitwise Operators - (outside the scope of this course)


---

Python also uses a shortcut to perform arithmetic and assignment operations in one step, since this is a very common action when coding.



![picture](https://drive.google.com/uc?id=16SFddjb55H_ugjkVvtcqASDsqxI1i_Up)

In [None]:
# Expressions are a combination of variables, values, and operators that can be
# evaluated as a single value.
# For example, computing BMI requires the square of height in the denominator.
# Most babies born between 37 and 40 weeks weigh somewhere between 2.5 kg and 4 kg and are between 0.4 and 0.5 meters long

0.4**2      # Square of height
True and False
'kilogram' + 'gram' # Concatenates to 'kilogramgram', which might not be what you intend!
(2.5/(0.4**2) + 4/(0.5**2))/2 # Average of the BMIs at the two extremes for newborns between 37 and 40 weeks

# Statements do something (e.g. assign the value of an expression to a variable)
import math
result = 12<<2   # Appends two zeros (via a left bit shift) to the binary representation of the number 12
print(result)   # This amounts to multiplying the original number (12) by 4


48


In [None]:
import math

# Assignment statements link a variable to a Python object using `=`.
# Nearly everything in Python is an object.
child_count = 5
topic_name = 'Health Informatics!'
baby_weight_in_pounds = 2.20462 * 2.5 #2.5 kg in pounds
(1/3) * math.pi * (0.5)**3  # Volume of the ventilator to accommodate a newborn infant.
normalBP = True

# Special case: when basic arithmetic is used to modify the value of a variable
# and then assign the new value to the original variable
child_count = child_count + 1

# This operation can also be written using the following shortcut
child_count += 1

# Why does child_count = 7?
child_count



7
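
The same shortcut exists for the other arithmetic operators. The sketch below uses a hypothetical variable `dose_mg` (not part of the course data) to step through several of them; the comments give the value after each line.

```python
dose_mg = 10     # hypothetical starting value
dose_mg += 5     # same as dose_mg = dose_mg + 5   -> 15
dose_mg -= 3     # same as dose_mg = dose_mg - 3   -> 12
dose_mg *= 2     # same as dose_mg = dose_mg * 2   -> 24
dose_mg /= 4     # true division always gives a float -> 6.0
dose_mg **= 2    # exponentiation                  -> 36.0
dose_mg //= 5    # floor division                  -> 7.0
dose_mg %= 4     # remainder (modulo)              -> 3.0
print(dose_mg, type(dose_mg))
```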

It's very important to understand the difference between the identity operator `is` and the relational operator `==`. **They are different operations and using them incorrectly can lead to subtle, erroneous outputs in your analyses.**

* `is` returns `True` if two variables refer to the same object (stored in memory).
* `==` returns `True` if the objects referred to by the variables are equivalent in value.

In [None]:
from copy import copy

# Here, both `heights` and `heights_new` point to same object in memory --> the list [0.42, 0.45, 0.47]
heights = [0.42, 0.45, 0.47]
heights_new = heights

print(heights_new is heights )  # `heights` and `heights_new` refer to the same object?
print(heights_new == heights)   # `heights` and `heights_new` have the same values?

# Because `heights` and `heights_new` both refer to the same list, changes to one affect the other
heights[0] = 'Oops we overwrote the original data'

print(f'\nWe made a change to heights: {heights}')
print(f'But the changes also affect heights_new: {heights_new}')
print('because they refer to the same object in memory.\n')





True
True

We made a change to heights: ['Oops we overwrote the original data', 0.45, 0.47]
But the changes also affect heights_new: ['Oops we overwrote the original data', 0.45, 0.47]
because they refer to the same object in memory.



In [None]:
# Now, we make a copy of `heights` and assign it to `heights_new`.
# Keep in mind that we need 2x memory because there are 2 separate objects
heights = [0.42, 0.45, 0.47]
heights_new = copy(heights)

print(heights_new is heights)  # `heights_new` is NOT the same object as `heights`
print(heights_new == heights)  # `heights_new` has the same values as `heights`

# Since `heights` and `heights_new` refer to different objects...
heights[0] = 'Oops we overwrote the original data'
# We modified only the original and not the copy
print(f'\nWe made a change to heights: {heights}')  # See https://www.w3schools.com/python/python_string_formatting.asp for more on formatted printing
print(f'This does NOT affect heights_new: {heights_new}')
print('because they refer to 2 different objects in memory.')


False
True

We made a change to heights: ['Oops we overwrote the original data', 0.45, 0.47]
This does NOT affect heights_new: [0.42, 0.45, 0.47]
because they refer to 2 different objects in memory.
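
As a further sketch of the difference, two lists built separately are equal in value but are not the same object, and `is` is the usual way to test for `None`. The names `weights_a`, `weights_b`, and `describe` are hypothetical and used only for illustration.

```python
weights_a = [2.5, 3.1, 4.0]
weights_b = [2.5, 3.1, 4.0]   # built separately, so it is a distinct object

same_value = weights_a == weights_b    # compares the contents
same_object = weights_a is weights_b   # compares identity in memory
print(same_value, same_object)

def describe(reading=None):
    # `is None` checks identity against the single None object,
    # so a legitimate reading of 0 is not mistaken for "missing"
    if reading is None:
        return 'no reading recorded'
    return f'reading = {reading}'

print(describe())
print(describe(0))
```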


### A.1c Membership Operators

Recall that the membership operator `in` allows us to quickly check if an element is in a collection. List comprehension allows you to use a shorter syntax when you want to create a new list based on the values of an existing list depending on a specific condition. These two concepts can be combined into an efficient method for filtering or creating a new collection. You can use the same approach for Strings, Sets, and Dictionaries.

In [None]:
# To quickly check if an object is present in a collection, use the membership
# operator 'in'. As a conditional expression, this is formatted as `<part> in <whole>`
# and evaluates to either True or False

list1 = [0, 1, 2, 3, 4, 3, 2, 1, 0]
print(3 in list1)

name = 'bronchiolitis obliterans'
search_term = 'obliterans'
print(search_term in name)



True
True


In [None]:
# To quickly check if an object is present in a collection, use the membership
# operator 'in'. As a conditional expression, this is formatted as `<part> in <whole>`
# and evaluates to either True or False

list1 = ['patient', 'node', 'x_coord', 'y_coord', 'tumor', 'slide', 'center', 'split']
print('patient' in list1) # Checking if 'patient' is indeed an attribute in the list of attributes of Camelyon WILDs challenge

name = 'Applied Machine Learning'
search_term = 'Machine'
print(search_term in name)



True
True
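
Combining `in` with a list comprehension gives a compact filter. The sketch below reuses the attribute names from the cell above; the set `wanted` is a hypothetical selection, and `'age'` is deliberately absent from the attribute list.

```python
attributes = ['patient', 'node', 'x_coord', 'y_coord', 'tumor', 'slide', 'center', 'split']
wanted = {'patient', 'tumor', 'age'}   # hypothetical selection

# Keep only the attributes that appear in `wanted` (membership in a set)
present = [name for name in attributes if name in wanted]

# Membership also works on strings: keep names containing the substring 'coord'
coord_columns = [name for name in attributes if 'coord' in name]

# `not in` finds what is missing from the list
missing = [name for name in sorted(wanted) if name not in attributes]

print(present)
print(coord_columns)
print(missing)
```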


### A.1d Chaining and Relational Operators


In [None]:
n = 100

# The comparison in Line 4...
result1 = 1 < n and n < 200

# Is the same as the comparison in Line 7.
result2 = 1 < n < 200

print(f'Line 4 evaluates to {result1}, and Line 7 evaluates to {result2}')



Line 4 evaluates to True, and Line 7 evaluates to True


## A.2 Control or Logic Flow in Python code

### A.2a Review - IF/ELSE, WHILE, and FOR loops

Python is an interpreted programming language. The code is executed sequentially line by line. However, there are control structures that can cause the Python interpreter to skip or repeat parts (AKA blocks) of code. The most common ones are `if/else` statements, `while` loops, and `for` loops.

Let's use `if/else` as an example.

* The first line always contains a condition (often a comparison operation) that evaluates to either `True` or `False`.
* If `True`, the Python interpreter is directed to a new block of code (note the indentation!) and begins executing the new block. When it reaches the end of the new block, it returns to the next line in the original block.
* If `False`, then the interpreter skips the indented block and continues executing the current block.

Note: **Pay attention to the indentation of each line.** Blocks can be nested inside other blocks.

In [None]:
from random import randint

pid = 8
opd = "OPD1"

# If/Else Statement - careful with setting your condition: only 1 block will be
# executed in an if/else
if (pid % 2)==0:  #even pids go to OPD1
    opd = "OPD1"
else:
    opd = "OPD2"

print(f"Since patient id = {pid}: the patient can be routed to {opd}")


Since patient id = 8: the patient can be routed to OPD1


Let's say we need to perform a task multiple times. Here, we can use `while` loops.

* The first line always contains at least one condition that evaluates to either `True` or `False`.
* If `True`, the Python interpreter is directed to a new block of code (note the indentation!) and begins executing the new block. When it reaches the end of the new block, it jumps back to the first line and checks if the condition is still `True`.
* If `False`, the interpreter skips the indented block and continues to the next line.

In [None]:
from random import randint

# While Loop - careful with setting your stop condition: infinite loops can run forever!
id_1_to_swap = randint(20, 30)
id_2_to_swap = 1

# Decrease id_1_to_swap in steps of 1 and increase id_2_to_swap in steps of 2 until they swap roles.
# The loop stops as soon as id_1_to_swap == 0 OR id_2_to_swap >= 21
while id_1_to_swap > 0 and id_2_to_swap < 21:
    print(id_1_to_swap, id_2_to_swap)
    id_1_to_swap -= 1  # Here, we decrease id_1_to_swap by 1
    id_2_to_swap += 2  # Here, we increase id_2_to_swap by 2

print('We have exited the While loop!')

24 1
23 3
22 5
21 7
20 9
19 11
18 13
17 15
16 17
15 19
We have exited the While loop!


`For` loops work in a similar fashion; however, they loop over a fixed sequence of values rather than re-checking a condition.

* The first line specifies the number of iterations, usually by setting a range using the format `range(start value, stop value, step size)`.
* The stopping point is EXCLUDED.
* If the step size isn't specified, a default value of 1 is used.

For example, `range(5, 10)` would start at 5, increment by 1, and stop after reaching 9.

In [None]:
# For Loop - no conditional, will always enter the indented code block
#If start value is unspecified, the default is 0
for patient_id in range(7):
    print(patient_id)

# What happens if you don't specify the starting value?





0
1
2
3
4
5
6
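
To see how the start, stop, and step values interact, a range can be wrapped in `list()` to display its values. This is a small sketch; the variable names are chosen only for illustration.

```python
default_start = list(range(7))           # start defaults to 0, step to 1
explicit_start = list(range(5, 10))      # the stop value 10 is excluded
with_step = list(range(0, 10, 3))        # every third value
counting_down = list(range(10, 0, -2))   # a negative step counts down; 0 is excluded

print(default_start)
print(explicit_start)
print(with_step)
print(counting_down)
```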


`For` loops can also be formatted as:

for `my_variable_name` in `my_data_structure`

In [None]:
# For loops can also be used to iterate through a collection of Python objects
assorted = ['dermatology', 'epidermis', 'hypodermic', 'xeroderma',   # Here, we use a Python data structure called a `list`
           'Diabetes', '200 mg/dL or above', 'Prediabetes',         # The list can hold a mix of strings and numbers
           140, '140 – 199 mg/dL', 'Normal', '140 mg/dL or below']       # mg/dL stands for milligrams per deciliter, the concentration of glucose in blood


for elem in assorted:  # Each object in the list is temporarily assigned to the variable `elem`
    print(elem)       # Which allows us to "do something" with `elem` inside the FOR loop


dermatology
epidermis
hypodermic
xeroderma
Diabetes
200 mg/dL or above
Prediabetes
140
140 – 199 mg/dL
Normal
140 mg/dL or below


In [None]:
# This is a code snippet with 3 levels of indentation and 2 loops:
# - WHILE block
# - FOR block
# - IF block
# - ELIF block
# - ELSE block

x = 6
print(f"x = {x}: before WHILE\n")

while x > 0:
    #Range is being determined dynamically based on value of x
    for i in range(x):
        if i > 3:
            print(f"x = {x}, i = {i}: IF block")
        elif i <= 3 and i > 1:
            print(f"x = {x}, i = {i}:   ELIF block")
        else:
            print(f"x = {x}, i = {i}:       ELSE block")
    print()
    x -= 2
print(f"x = {x}: after WHILE\n")

# Can you trace the logic of each block?
# What happens if you change the conditional expressions?

x = 6: before WHILE

x = 6, i = 0:       ELSE block
x = 6, i = 1:       ELSE block
x = 6, i = 2:   ELIF block
x = 6, i = 3:   ELIF block
x = 6, i = 4: IF block
x = 6, i = 5: IF block

x = 4, i = 0:       ELSE block
x = 4, i = 1:       ELSE block
x = 4, i = 2:   ELIF block
x = 4, i = 3:   ELIF block

x = 2, i = 0:       ELSE block
x = 2, i = 1:       ELSE block

x = 0: after WHILE



### A.2b Functions

A function is a block of code which only runs when it is called. You can pass data (known as parameters, arguments, or input) into a function, which can then transform or modify the data and return an output or result.

This further complicates how code is evaluated. Normally, this occurs sequentially line by line. As we saw earlier, that sequential order can be modified by conditional loops and IF/ELSE blocks. Now, if a function call is present, that causes code evaluation to jump to the function before returning to the next sequential line.

For example, see the code block below. Code evaluation always starts on Line 1. But when we reach the function call on Line 26, which calls `appointment_date()`, the Python interpreter jumps to the function definition on Line 7 and evaluates the function body through Line 9, after which it jumps back to finish Line 26.


---


**Note**: There are several conventional methods for organizing Python code. The following layout is commonly used for shorter pieces of code, like scripts:
1. Import statements
2. Definitions for custom classes and functions
3. Variable assignments
4. The bulk of the code

The main reason for this is to enable easy identification and access to the parts that you might change later on.

In [None]:
# Import statements - usually doing this once per notebook is sufficient
from random import randint

cutoff_date = 10
# Sometimes we need a custom function
# This custom function filters appointments: a request is accepted if request_date < cutoff_date
def appointment_date(request_date):
    print('\tIn the function:', request_date)
    return request_date < cutoff_date


# Define our variables
patient_names = ['Mr. Inder  Pawar', 'Mrs. Jamuna  Ranganathan', 'Ms. Anahita  Patil', 'Mr. Tathagata  Magar']
request_dates = []

# Generate our data
for e in patient_names:
    day = randint(1, 30)
    request_dates.append(day)

print(request_dates)

# Apply our analysis
for i, date in enumerate(request_dates):
    print('For loop:', i)
    if appointment_date(date):
        print(patient_names[i], 'gets an appointment date.')
    else:
        print(patient_names[i], 'does not get an appointment date.')
    print('\tOut of the function:', date)

# Note the use of comments to outline what you are doing. This helps:
#   (1) plan out how to approach coding
#   (2) documents what you've done for future reference


[15, 1, 29, 18]
For loop: 0
	In the function: 15
Mr. Inder  Pawar does not get an appointment date.
	Out of the function: 15
For loop: 1
	In the function: 1
Mrs. Jamuna  Ranganathan gets an appointment date.
	Out of the function: 1
For loop: 2
	In the function: 29
Ms. Anahita  Patil does not get an appointment date.
	Out of the function: 29
For loop: 3
	In the function: 18
Mr. Tathagata  Magar does not get an appointment date.
	Out of the function: 18
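
As a variation on the cell above, the cut-off can be passed into the function as a parameter instead of being read from a global variable. This is only a sketch: the function name `is_before_cutoff` and its default value are assumptions made for illustration, and the sample days are the ones shown in the output above.

```python
def is_before_cutoff(request_day, cutoff_day=10):
    """Return True when request_day falls strictly before cutoff_day."""
    return request_day < cutoff_day

request_days = [15, 1, 29, 18]

# Call the function with the default cut-off, then with an explicit one
default_decisions = [is_before_cutoff(day) for day in request_days]
later_decisions = [is_before_cutoff(day, cutoff_day=20) for day in request_days]

print(default_decisions)
print(later_decisions)
```

Passing the cut-off explicitly makes the function easier to reuse and test, because its result then depends only on its inputs.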


### A.2c Variable Scope

When is a variable accessible by the code? This is determined by its **scope**.


*   Built-in
    * This is the widest scope (i.e. they can be called anywhere in any program without needing to be defined).
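    * Examples: built-in functions and names, such as `print`, `len`, `range`, etc. (Keywords such as `def`, `while`, and `True` are part of the language itself and are likewise always available.)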

*   Global scope
    * A function can rebind a global variable by declaring it with the keyword `global`
    * Variables defined outside of any function are accessible from anywhere within the specific program that defined them.
    * Examples: `patient_names` and `request_dates` are global variables

*   Enclosing scope
    * This is specific to nested functions, which are rare, complex cases and we will not cover this.

*   Local scope
    * Variables defined within a function
    * Only accessible during a function call

![picture](https://drive.google.com/uc?id=1HgXTUWyG0C6w8eR9izk1Qr80_PmqYS5I)

In [None]:
# Import statements - usually doing this once per notebook is sufficient
# Keywords like `import`, `def`, `class` are part of the language and are always available
from random import randint

cutoff_date = 10  # Global
# Sometimes we need a custom function
def appointment_date(request_date):
    start_date = 0  # This is a local variable
    print('\tIn the function:', request_date)
    return request_date < cutoff_date


# Define our variables.
# These are global variables
patient_names = ['Mr. Inder  Pawar', 'Mrs. Jamuna  Ranganathan', 'Ms. Anahita  Patil', 'Mr. Tathagata  Magar']
request_dates = []

# Generate our data
for e in patient_names:
    day = randint(1, 30)  # `day` is also global: a FOR block does not create a new scope in Python
    request_dates.append(day)

print(request_dates)


# Apply our "analysis"
for i, date in enumerate(request_dates):
    print('For loop:', i)
    print(patient_names[i], 'gets an appointment date.' if appointment_date(date) else 'does not get an appointment date.')
    print('\tOut of the function:', date)
    print()

# Points to ponder:
# What happens if you try to access the variables `request_date` or `start_date` outside of the function?
# Can you access the variable `request_dates` inside the function?

[20, 15, 23, 15]
For loop: 0
	In the function: 20
Mr. Inder  Pawar does not get an appointment date.
	Out of the function: 20

For loop: 1
	In the function: 15
Mrs. Jamuna  Ranganathan does not get an appointment date.
	Out of the function: 15

For loop: 2
	In the function: 23
Ms. Anahita  Patil does not get an appointment date.
	Out of the function: 23

For loop: 3
	In the function: 15
Mr. Tathagata  Magar does not get an appointment date.
	Out of the function: 15
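
The scope rules above can be checked with a short sketch. All names here (`cutoff_day`, `margin`, `is_early`, `try_to_change_cutoff`) are hypothetical and chosen only for illustration.

```python
cutoff_day = 10   # defined outside any function: global

def is_early(request_day):
    margin = 2    # defined inside the function: local
    # Reading a global variable inside a function works without any declaration
    return request_day < cutoff_day - margin

def try_to_change_cutoff():
    # Assigning inside a function creates a NEW local variable;
    # the global `cutoff_day` is left untouched
    cutoff_day = 25
    return cutoff_day

print(is_early(5))
print(try_to_change_cutoff(), cutoff_day)

# A local variable does not exist outside its function
try:
    print(margin)
    margin_visible = True
except NameError as err:
    margin_visible = False
    print('NameError:', err)
```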



## Part B. Exercises with Non-primitive Data Structures

**Note**: Stuck at any point? Look up the following resource [Lists, Tuples, Sets, Dictionaries](https://learning.oreilly.com/library/view/python-for-data/9781491957653/ch03.html#tut_data_structures).

Also, one of the key strengths of Python is the ability to use dynamic typing to transform data from one data structure to a more useful data structure. What are the key characteristics, features, and functions associated with each data structure?

### B.1 Lists
**Note**: Be familiar with slicing as this will be important for handling data in basic Python strings, NumPy arrays, and Pandas DataFrames in later modules.

![picture](https://drive.google.com/uc?id=1BbZQMwTyp6kg_ISVAY5sPnRBKWfrlaAx)


In [None]:
# A list is an ordered sequence of elements denoted by `[]`.
opd_list = [3, 7, 12, 13, 14, 25, 36, 37, 48, 59]

# We can access each element by its index, which starts at 0 -> n-1.
var1 = opd_list[0]
print(f'Using indexing (Line 5), the first element is {var1}\n')


# What about negative indexing?
# Note that while opd_list[::-1] traverses the list in reverse order,
# a single value inside the brackets is taken to be the index of an element,
# and the element with index -1 is the last element
var2 = opd_list[-1]
print(f'Using negative indexing (Line 13), the last element is {var2}\n')


# Slicing a list - Access a range of elements from [start : stop)
# Note: slicing INCLUDES the start index but EXCLUDES the end index.
slice1 = opd_list[2:5]
print(f'We can isolate part of a list via slicing (Line 19): {slice1}\n')


# If the slice starts at Index 0, the start position can be implicit
slice2 = opd_list[:5]  # equivalent to `opd_list[0:5]`

# If the slice ends with the last index, the end position can be implicit
slice3 = opd_list[5:]  # equivalent to `opd_list[5:10]`

print(f'In special cases, the start or end index can be implied: ')
print(f'See Line 24 for {slice2} which spans Index 0 through 4, excluding the end at Index 5')
print(f'and Line 27 for {slice3} which spans Index 5 through the end.\n')



Using indexing (Line 5), the first element is 3

Using negative indexing (Line 13), the last element is 59

We can isolate part of a list via slicing (Line 19): [12, 13, 14]

In special cases, the start or end index can be implied: 
See Line 24 for [3, 7, 12, 13, 14] which spans Index 0 through 4, excluding the end at Index 5
and Line 27 for [25, 36, 37, 48, 59] which spans Index 5 through the end.



In [None]:
opd_list = [3, 7, 12, 13, 14, 25, 36, 37, 48, 59]

# Slicing actually has a third value (step size), formatted [start : stop : step]
print('To skip every other value:')
skip_even = opd_list[::2]
print(skip_even)


# Using opd_list, how would you keep only the values at the odd indices?
print('\nWhat if we want the values at the odd indices:')

skip_odd = opd_list[1:-1:2]  # Modify this line: a stop of -1 excludes the last element (59)
print(skip_odd)


# A negative step can be used to reverse the values in a list
print('\nTo quickly reverse a list, use a negative step:')
print(opd_list[::])  # What start, stop, and step values are implied here?
print(opd_list[::-1])



To skip every other value:
[3, 12, 14, 36, 48]

What if we want the values at the odd indices:
[7, 13, 25, 37]

To quickly reverse a list, use a negative step:
[3, 7, 12, 13, 14, 25, 36, 37, 48, 59]
[59, 48, 37, 36, 25, 14, 13, 12, 7, 3]
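
The same `[start : stop : step]` syntax applies to strings, which is why slicing carries over to text handling. A minimal sketch, using the hypothetical variable `term`:

```python
term = 'bronchiolitis'

first_six = term[:6]       # characters at index 0 through 5
last_four = term[-4:]      # negative start: the last four characters
every_other = term[::2]    # step of 2: characters at even indices
reversed_term = term[::-1] # negative step: the string reversed

print(first_six, last_four, every_other, reversed_term)
```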


In [None]:
opd_list = [3, 7, 12, 13, 14, 25, 36, 37, 48, 59]
name_list = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

# We can use FOR loops to iterate through the elements in a list.
for item in opd_list: # Here we are iterating over values
    if item % 2 != 0: # If item is NOT even
        item += 100  # This only rebinds the loop variable `item`; opd_list itself is NOT modified. You could also print(f'Scheduling {item} today')
print(opd_list, '\n')  # The list is unchanged


# We can also iterate through a list by index.
# Note: Python function calls can be nested; they are evaluated from the innermost
# function to the outermost function. The function `len()` is given `opd_list` as
# an input. This returns the length of opd_list (i.e. 10) which is then used as an
# input for the function `range()`. This returns a sequence of integers from 0
# up to, but not including, the input value, 10.
for i in range(len(opd_list)):  #We are here iterating over indices
    if i % 2 == 1: opd_list[i] += 200
print(opd_list, '\n')


# Sometimes we need to access both the value and the index of an element,
# usually for cross-referencing or indexing into another list. The `enumerate()`
# function is a useful trick here:
for i, item in enumerate(opd_list):
    if 2 < item < 10:
        print(f'At index {i}, the value in opd_list is {item}')
        print(f'and the corresponding value in name_list is {name_list[i]}.\n')


# Points to ponder:
# 1. Why is using `range(len(list))` better than using `range(10)`?
#ANS: In case we modify the array (i.e. add elements) range(len(list)) will be automatically updated
# 2. Here, we chose to use a FOR loop. Could this be done using a WHILE loop?
#ANS: while i<len(opd_list): #Remember to have `i += 1` within the loop (Python has no `i++`)
# 3. How can we do this in 1 line of code using list comprehension? Say increasing odd values by 100
#See https://www.w3schools.com/python/python_lists_comprehension.asp
#new_opd_list = [n for n in opd_list] #Simply reproduces opd_list
#new_opd_list = [n+100 for n in opd_list if n%2==1 ] #Only picks up odd values+100 in opd_list

[3, 7, 12, 13, 14, 25, 36, 37, 48, 59] 

[3, 207, 12, 213, 14, 225, 36, 237, 48, 259] 

At index 0, the value in opd_list is 3
and the corresponding value in name_list is a.
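
The difference between iterating over values and iterating over indices matters when the goal is to change the list. The sketch below uses a shortened, hypothetical copy of the list to contrast the two approaches with a list comprehension.

```python
opd_list = [3, 7, 12, 13]   # shortened copy for illustration

# Iterating over values: `item` is rebound, the list is not touched
for item in opd_list:
    item += 100
after_value_loop = list(opd_list)

# Iterating with enumerate(): assigning through the index changes the list in place
for i, item in enumerate(opd_list):
    if item % 2 != 0:
        opd_list[i] = item + 100

# A list comprehension builds a NEW list and leaves its source unchanged
new_list = [n + 100 if n % 2 != 0 else n for n in [3, 7, 12, 13]]

print(after_value_loop)
print(opd_list)
print(new_list)
```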





---



#### Problem 1 (mild):

Find the smallest number in the list.

**Input:**

```
nums = [4, 16, -21, -49, 3, 821, 8, 27, -2, 74, -81, 0, 5]
```

**Expected Outputs:**

```
-81
```

In [None]:
# Note the keyword `def`. Write a function that RETURNS the smallest number in a list
# Hint: Consider using a FOR or WHILE loop
#min_val = float('inf')  # Start from a value larger than any number in the list
def find_smallest(my_list):
    # Enter your code here
    #for val in my_list:
      #if (val < min_val):


    return



#------------------------
# Now call your function using the list `nums` as a parameter
nums = [4, 16, -21, -49, 3, 821, 8, 27, -2, 74, -81, 0, 5]





---



#### Problem 2 (medium):

Given two lists, return a list of all elements that are common (without duplicates) between the two.

**Input:**

```
a = [2, 15, 8, 10, 1, 18]
b = [6, 2, 10, 19, 8, 4, 2, 20, 2]
```

**Expected Output:**
```
[2, 8, 10]
```

In [None]:
def common_ele(lista, listb):
    # Enter your code here



    return


#------------------------
# Now call your function using the lists `a` and `b` as parameters
a = [2, 15, 8, 10, 1, 18]
b = [6, 2, 10, 19, 8, 4, 2, 20, 2]






---



#### Problem 3 (spicy):
Given a list of characters, check if it is a palindrome or not.

> Palindrome: A palindrome is a word, number, phrase, or other sequence of characters which reads the same backward as forward, such as _madam_, _racecar_.

**Input:**

```
s1 = ['n', 'o', 'o', 'n']
s2 = ['k', 'a', 'y', 'a', 'k']
s3 = ['p', 'a', 'l', 'i', 'n', 'd', 'r', 'o', 'm', 'e']
```

**Expected Output:**

```
noon is a Palindrome
kayak is a Palindrome
palindrome is not a Palindrome
```

In [None]:
# Either a WHILE or FOR loop can work here
# Either print or return can work here
# To concatenate all the strings in a list, use this: ''.join(my_list_here)
def palindrome(chars):
    # Enter your code here
    #reverse = chars[::-1]
    #if reverse == chars:
      #return True

    return




#------------------------
# Now call your function using the inputs below
s1 = ['n', 'o', 'o', 'n']
s2 = ['k', 'a', 'y', 'a', 'k']
s3 = ['p', 'a', 'l', 'i', 'n', 'd', 'r', 'o', 'm', 'e']







---
### B.2 Sets


In [None]:
# A set is an unordered collection of UNIQUE elements denoted by `{}`. We will demonstrate using medical concepts and their ids accessible at https://browser.ihtsdotools.org/?
# Note: Sets are unordered; we cannot use an index to access an item.
set1 = {'Laparoscopy', 'Gastritis'} # Terms pertaining to the digestive system. Laparoscopy = procedure pertaining to the abdominal wall; Gastritis = disorder of the stomach.
set2 = {73632009, 4556007, '73632009', '4556007'} # The SNOMED CT id for Laparoscopy is 73632009 and for Gastritis it is 4556007. The integer and string forms are distinct elements.


# We can add elements using `.add()`
set1.add(73632009)
print(f'Added 73632009: {set1}\n')


# We cannot concatenate sets because they are unordered, instead we union sets
new_set = set1.union(set2)
new_set


Added 73632009: {73632009, 'Gastritis', 'Laparoscopy'}



{'4556007', 4556007, '73632009', 73632009, 'Gastritis', 'Laparoscopy'}

In [None]:
# If we know an element is in a set, we can explicitly remove it.
if 73632009 in set1:  # Recall the keyword 'in' from Operators
    set1.remove(73632009)
    print(f'Removed 73632009: {set1}\n')
else:
    print('73632009 was not found.\n')  # Reached only if the element is absent from the set


# Note: The IF/ELSE block above can be condensed using a ternary operator:
print (f'Remove 73632009: {set1}\n' if 73632009 in set1 else '73632009 was not found.\n')
#Note the connection with List comprehension which also allows for such condensation.
#https://www.w3schools.com/python/python_lists_comprehension.asp

# Or we can remove an arbitrary element with .pop()
if len(set1) > 0:
    var = set1.pop()
    print(f'We removed \'{var}\' from {set1}', '\n')


# Points to Ponder
# 1. Why is it important to check if an element is in a collection before removing it?
#ANS: set1.remove(elem) raises a KeyError if the element is not in the set! Please verify!
# 2. Why does the ternary expression report that `73632009` was not found?
#ANS: The IF block above had already removed it from the set
# 3. Why do we need to check the size/length of a collection before performing pop?
#ANS: pop removes an arbitrary element. The set must be non-empty, otherwise pop raises a KeyError


Removed 73632009: {'Gastritis', 'Laparoscopy'}

73632009 was not found.

We removed 'Gastritis' from {'Laparoscopy'} 
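
Beyond `.union()`, sets support other operations that are useful for comparing collections. The sketch below uses two small hypothetical sets of terms (not taken from SNOMED CT) and also contrasts `.discard()` with `.remove()`.

```python
digestive = {'Laparoscopy', 'Gastritis', 'Colonoscopy'}   # hypothetical terms
procedures = {'Laparoscopy', 'Colonoscopy', 'Biopsy'}     # hypothetical terms

in_both = digestive & procedures          # intersection: elements in both sets
only_digestive = digestive - procedures   # difference: in the first set but not the second
in_either = digestive | procedures        # union, same as digestive.union(procedures)

# .discard() silently ignores a missing element...
digestive.discard('Hepatitis')

# ...whereas .remove() raises a KeyError
try:
    digestive.remove('Hepatitis')
    remove_raised = False
except KeyError:
    remove_raised = True

print(in_both, only_digestive, in_either, remove_raised)
```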



---
#### Problem 1 (mild):

Given a list, check if it has duplicates. Hint: Consider type casting. What is a key feature of sets that is different from lists?

**Input:**

```
b = [31, 11, 1, 9, 72, 4, 3, 20, 31]
```

**Expected Output:**
```
True
```

In [None]:
#my_set = set()  # Note: {} would create an empty dictionary, not an empty set
def has_duplicates(list1):
    # Enter your code here
    #for elem in list1:
      #if(elem not in my_set):
        #my_set.add(elem)
      #else:
        #return True
    return

#------------------------
# Now call your function using the inputs below
b = [31, 11, 1, 9, 72, 4, 3, 20, 31]



---
#### Problem 2 (medium):
The set `s` originally contained the numbers from `1` to `n`, where `n` is the length of `nums`.

As we were moving our data from a set to a list, a data entry error occurred: one number in `nums` was duplicated, overwriting another value. In this case, `3` was duplicated and the value `5` was lost. Note that the list has the same length as the size (cardinality) of the set.

Given the erroneous list `nums` and the original data in `s`, can you write a function that identifies the duplicated number and the missing number?

**Input:**
```
s = {1, 2, 3, 4, 5}
nums = [1, 2, 3, 3, 4]
```
**Expected Output:**

```
3 5
```
**Explanation**: The first number that you return is the number that got duplicated; in this case it was `3`. Because `nums` now holds two `3`s, one value from the original set was displaced. The second number that you return is that lost value, here `5`.


---


**Note**: Your function should be a general solution (i.e. it should work on any input with the same type of error). For example:
```
s = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}
nums = [1, 2, 3, 4, 5, 6, 6, 7, 9, 10, 11]
```
**Expected Output:**

```
6 8
```



In [None]:
# Convert the list nums into a set
# Take the set difference s - nums to find the element of s that is missing from nums
# To find the duplicated number use a version of Problem 3 of Lists.

def fix_nums(set1, list1):
    # Enter your code here; the two assignments below are placeholders
    duplicate = None
    displaced = None
    # For the duplicated number, the logic is similar to the previous exercise, i.e. build a new set
    #for elem in set1:
      #if elem not in list1:
        #print(f'Missing {elem}')
        #displaced = elem



    # Note: you can return multiple objects
    return duplicate, displaced

#------------------------
# Now call your function using the inputs below
s = {1, 2, 3, 4, 5}
nums = [1, 2, 3, 3, 4]

# Note: you'll need to assign 2 variables to this function
dup, dis = fix_nums(s, nums)


---
#### Optional - Problem 3 (extra spicy):
(Extension of Problem 2)

Given an unordered list containing numbers that each repeat exactly _**K**_ times, plus one unique number, find that unique number.

For example: for K = 2, and for the given list below
```
list1 = [5, 6, 9, 6, 10, 5, 10]
```
`5`, `6`, and `10` repeat exactly `K = 2` times, but `9` is unique. So `9` is the answer.

**Input:**
```
K = 4
list1 = [5, 23, 6, 10, 10, 411, 23, 6, 11, 23, 6, 5, 411, 10, 2, 5, 10, 2, 2, 23, 6, 5, 411, 411, 2]
```

**Expected Output:**
```
11
```

> Hint: How can you leverage the fact that duplicate numbers always repeat exactly K times ?
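
One way to see why the hint helps, sketched as a short derivation. The symbols $S$ (the sum of the distinct repeated values) and $u$ (the unique number) are introduced here for illustration only.

$$
\begin{aligned}
\operatorname{sum}(\text{list1}) &= K \cdot S + u \\
\operatorname{sum}(\operatorname{set}(\text{list1})) &= S + u \\
K \cdot (S + u) - (K \cdot S + u) &= (K - 1) \cdot u
\end{aligned}
$$

For the `K = 2` example above, the distinct values sum to `5 + 6 + 9 + 10 = 30` and the full list sums to `51`, so `2 * 30 - 51 = 9`, which equals `(2 - 1) * 9`.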

In [None]:
# K * SUM(SET(LIST1)) - SUM(LIST1) = (K - 1) * UNIQUE_NUMBER
# UNIQUE_NUMBER = (K * SUM(SET(LIST1)) - SUM(LIST1)) / (K - 1)
# Please note that SUM(SET(LIST1)) etc. need to be implemented with some elaboration

# First find sum of elements in list1
# Then convert list1 to a set
# Use the inbuilt function "sum()" to get the total value of the elements in set
# Compute the unique_number using the formula above
def find_unique(k, my_list):
    # Enter your code here


    return

#------------------------
# Now call your function using the inputs below
K = 4
list1 = [5, 23, 6, 10, 10, 411, 23, 6, 11, 23, 6, 5, 411, 10, 2, 5, 10, 2, 2, 23, 6, 5, 411, 411, 2]





### B.3 Tuples

#### Problem 1 (mild):

Write a function that searches a given list of tuples and returns the tuple whose second element equals `n`.

**Input:**
```
n = 5.12
l = [('item1', 1.20), ('item2', 5.12), ('item3', 4.58)]
```

**Expected Output:**
```
('item2', 5.12)
```

In [None]:
# Loop over the list `l`
# Check if the second element matches n
# If it matches return that tuple.

def find_n(list1, find_me):

    return


#------------------------
# Now call your function using the inputs below
n = 5.12
l = [('item1', 1.20), ('item2', 5.12), ('item3', 4.58)]




#### Problem 2 (medium):
Write a function to reverse a tuple.

**Input:**

```
tup = (13, 24, 1, 5, 18)
```

**Expected Output:**
```
tup = (18, 5, 1, 24, 13)
```
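
Tuples are immutable, so a solution cannot rearrange the elements in place; it has to build a new tuple. The sketch below illustrates only the immutability part, using a hypothetical tuple `point` rather than the exercise input.

```python
# Illustrative tuple, not the exercise input
point = (1, 2, 3)

# Item assignment is not allowed on a tuple
try:
    point[0] = 99
except TypeError as err:
    print(f'TypeError: {err}')

# Building a new tuple is fine; the original is untouched
extended = point + (4,)
print(point)     # (1, 2, 3)
print(extended)  # (1, 2, 3, 4)
```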

In [None]:
# Note: this should work with tuples of any length
# Note: a key feature of tuples is that they are IMMUTABLE

def flip(tuple1):

    return

#------------------------
# Now call your function using the inputs below
tup = (13, 24, 1, 5, 18)





#### Optional - Problem 3 (extra spicy):

Given a list of integers `l` and a number `n`, find all good triplets.

A triplet `(l[i], l[j], l[k])` is **good** if the following conditions are true:

- `0 <= i < j < k < len(l)`
- `|l[i] - l[j]| <= n`
- `|l[j] - l[k]| <= n`
- `|l[i] - l[k]| <= n`

where `|x|` denotes the absolute value of `x`.

Return a list of all the **good** triplets, each as a tuple.

**Input:**

```
n = 3
l = [4, 1, 0, 1, 7, 9]
```

**Expected Output:**

```
[(4, 1, 1), (1, 0, 1)]
```


In [None]:
# Use 3 nested for loops with indices i, j, k
# Within each loop, access the elements of list list[i], list[j],list[k]
# Check if the triplet conditions are met
# Identify the good i,j,k combinations and form tuples
# Add them to a list to return

def good_trip(abs_value, list1):
    # Enter your code here


    return

#------------------------
# Now call your function using the inputs below
n = 3
l = [4, 1, 0, 1, 7, 9]




### B.4 Dictionaries

In [None]:
# A dictionary is an indexed collection of key:value pairs denoted by {}
my_dict = {   # key: value
  "cry(o)-": "Integumentary System",
  "muscul(o)-": "Muscular System",
  "lymph": "Blood, Lymphatic and Immune System"
}

# We can access a value by its key
x = my_dict["lymph"]
print(x)

# Values can be changed in a similar manner
my_dict["lymph"] = "Lymphatic and Immune System"
print(my_dict)

# And new elements can be added
my_dict["cardi"] = "Cardiovascular System"
print(my_dict)

# Or removed... what should be done before removing something from a collection?
# Ideally, we should first check that "lymph" is indeed present as a key in my_dict
my_dict.pop("lymph")
print(my_dict)
# Note that, in contrast to set pop, which removes an arbitrary element, dictionary pop removes the element specified by the key



Blood, Lymphatic and Immune System
{'cry(o)-': 'Integumentary System', 'muscul(o)-': 'Muscular System', 'lymph': 'Lymphatic and Immune System'}
{'cry(o)-': 'Integumentary System', 'muscul(o)-': 'Muscular System', 'lymph': 'Lymphatic and Immune System', 'cardi': 'Cardiovascular System'}
{'cry(o)-': 'Integumentary System', 'muscul(o)-': 'Muscular System', 'cardi': 'Cardiovascular System'}
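
As with sets, removing a key that is not present raises a `KeyError`. The sketch below shows a few common ways to guard against that. The dictionary `systems` is illustrative and only mirrors the shape of `my_dict`.

```python
# Illustrative dictionary mirroring the shape of my_dict
systems = {'muscul(o)-': 'Muscular System', 'cardi': 'Cardiovascular System'}

# pop() raises KeyError for a missing key...
try:
    systems.pop('lymph')
except KeyError as err:
    print(f'KeyError: {err}')

# ...unless a default is supplied as the second argument
removed = systems.pop('lymph', None)
print(removed)  # None

# Checking membership first works too
if 'cardi' in systems:
    removed = systems.pop('cardi')
print(removed)  # Cardiovascular System

# get() reads a value without raising on a missing key
print(systems.get('lymph', 'not found'))  # not found
```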


In [None]:
# Note: For dictionaries, we need to assign 2 variables: the first for the key
# and the second for the value.
for x, y in my_dict.items():
  print(x, y)




cry(o)- Integumentary System
muscul(o)- Muscular System
cardi Cardiovascular System
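
Besides `items()`, a dictionary exposes its keys and values separately, and `sorted()` can order the pairs by key. A minimal sketch, assuming an illustrative dictionary `systems` whose keys were inserted in non-alphabetical order:

```python
# Illustrative dictionary, inserted in non-alphabetical order
systems = {'muscul(o)-': 'Muscular System', 'cardi': 'Cardiovascular System'}

# keys() and values() give each side separately, in insertion order
print(list(systems.keys()))    # ['muscul(o)-', 'cardi']
print(list(systems.values()))  # ['Muscular System', 'Cardiovascular System']

# sorted() on items() orders the (key, value) pairs by key
ordered = sorted(systems.items())
for key, value in ordered:
    print(key, value)
# cardi Cardiovascular System
# muscul(o)- Muscular System
```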


#### Problem 1 (mild):

Convert the two lists into a dictionary.

**Input:**
```
keys = ['alpha', 'beta', 'charlie']
values = ['a', 'b', 'c']
```
**Expected Output:**

```
d = {'alpha': 'a', 'beta': 'b', 'charlie': 'c'}
```
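
If `zip()` is unfamiliar, the sketch below shows what it produces, using the SNOMED CT ids from the sets section instead of the exercise inputs. The names `codes`, `names`, and `pairs` are illustrative.

```python
# Illustrative lists, not the exercise inputs
codes = [73632009, 4556007]
names = ['Laparoscopy', 'Gastritis']

# zip() pairs up elements position by position
pairs = list(zip(codes, names))
print(pairs)  # [(73632009, 'Laparoscopy'), (4556007, 'Gastritis')]

# zip() stops at the shorter input
print(list(zip(codes, ['only one'])))  # [(73632009, 'only one')]
```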


In [None]:
# You can use the function `zip()` to create key-value tuple pairs
# Create an empty dictionary
# Use a FOR loop to fill the dictionary with values

# No need for a function, just write a code snippet for this question
keys = ['alpha', 'beta', 'charlie']
values = ['a', 'b', 'c']
#d = {}
#for key, value in zip(keys, values):
  #d[key] = value






#### Problem 2 (medium):

Write a function `histogram()` that takes a string and builds a frequency listing of the characters contained in it. Represent the frequency listing as a Python dictionary and print it in sorted order by key.

**Input:**
```
s = "oneredpaperclip"
```
**Output** (the dictionary itself, shown here in insertion order rather than sorted):
```
{'o': 1, 'n': 1, 'e': 3, 'r': 2, 'd': 1, 'p': 3, 'a': 1, 'c': 1, 'l': 1, 'i': 1}
```

In [None]:
# Collect all characters from string
# Count them
# Create the dictionary
# Enter the keys as letters and numbers as values.

def histogram(string1):
    # Enter your code here
    #Dictionary with key as character and count as value

    return


#------------------------
# Now call your function using the input below
s = "oneredpaperclip"








---


# Python 101: Putting it all together


---


We know that Python:


1.   is object-oriented
2.   encourages code modularity
3.   allows dynamic typing
4.   allows dynamic binding

**But what does all this really mean in practice?**


In [None]:
# Almost everything in Python is an object, meaning variables, functions, data
# structures, etc. can each be treated as one 'blob' without worrying about its
# parts. The inputs set_of_ids, dblist, drug_string1, and drug_string2 are all objects.
# They are mostly constructed from drugs and their ids in DrugBank (https://go.drugbank.com), a clinical drug database.
# Set comprehension: take every integer n in range(100, 10000, 7) that is divisible by 6 and store n * 3.
# https://www.w3schools.com/python/python_lists_comprehension.asp
set_of_ids = {n * 3 for n in range(100, 10000, 7) if n % 6 == 0}
# List comprehension: convert each id in set_of_ids to a string and prefix it with 'DB'.
dblist = ['DB' + str(m) for m in set_of_ids]

# Paraldehyde is a central nervous system depressant previously used to control convulsions due to various clinical causes, including tetanus, status epilepticus, and convulsive drugs.
drug_string1 = 'paraldehyde'
# Lexxel is a calcium channel blocker used to treat hypertension (https://go.drugbank.com/drugs/DB01023).
drug_string2 = 'lexxel'
# Carbocisteine is an expectorant mucolytic used in the relief of respiratory symptoms of COPD and other conditions associated with increased mucus viscosity.
s1 = ['c', 'a', 'r', 'b', 'o', 'c', 'i', 's', 't', 'e', 'i', 'n', 'e']
# Tylenol is the most commonly taken analgesic worldwide and is recommended as first-line therapy in pain conditions by the World Health Organization (WHO).
s2 = ['t', 'y', 'l', 'e', 'n', 'o', 'l']
# Xanax (alprazolam) is a drug used to treat anxiety disorders.
s3 = ['x', 'a', 'n', 'a', 'x']

# We can then create a list (another object) and populate it with these datasets
# without needing to consider the individual elements of each.
assorted_inputs = [set_of_ids, dblist, drug_string1, drug_string2, s1, s2, s3]

# An example of code modularity is grouping all the palindrome code into a single
# function. Here's a more complex palindrome function that can handle multiple
# input types (including the three types above, viz. sets, strings, and lists of characters):
def palindrome(data):  # Named `data` so that the built-in input() is not shadowed
    exit_flag = False
    for item in data:  # Works whether data is a set (set_of_ids), a list (dblist), etc.
                       # Homework: What will item correspond to if data is a dictionary?
        if type(item) is not str:  # Dynamic typing lets the name `item` refer to an int now and a str later
            item = str(item)  # In the example above, this is used for set_of_ids
        if len(item) == 1:
            item, exit_flag = data, True  # Dynamic binding allows (re-)assignment "on-the-fly"
        if item == item[::-1]:  # What if item is a string itself? What will this do?
            print(item)
            # ''.join(item) joins all items of a list into one string, using no separator:
            # https://www.w3schools.com/python/ref_string_join.asp
            print(item if type(item) is str else ''.join(item), 'is a Palindrome')
        if exit_flag: break
    print()

# Putting this all together, we can write flexible code that scales efficiently
# with data. The list `assorted_inputs` could contain hundreds of different data points.
# Instead of having to write code explicitly for each input, we just need 3 lines.
for n, data in enumerate(assorted_inputs):
    print(f'Dataset {n+1} of type {type(data)} contains {len(data)} items')
    palindrome(data)


# In the next modules, we'll cover advanced data structures that scale even more
# efficiently with "Big Data" (e.g. NumPy arrays and Pandas DataFrames).



Dataset 1 of type <class 'set'> contains 236 items
20502
20502 is a Palindrome

Dataset 2 of type <class 'list'> contains 236 items

Dataset 3 of type <class 'str'> contains 11 items

Dataset 4 of type <class 'str'> contains 6 items
lexxel
lexxel is a Palindrome

Dataset 5 of type <class 'list'> contains 13 items

Dataset 6 of type <class 'list'> contains 7 items

Dataset 7 of type <class 'list'> contains 5 items
['x', 'a', 'n', 'a', 'x']
xanax is a Palindrome
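
The palindrome check above relies on the slice `[::-1]` and on `''.join()`. The sketch below isolates both, using illustrative variables (`word`, `letters`, `ids`) that echo the inputs above.

```python
# Illustrative values echoing the inputs above
word = 'xanax'
letters = ['x', 'a', 'n', 'a', 'x']

# [::-1] returns a reversed copy of the same type as the original
print(word[::-1])       # xanax
print(letters[::-1])    # ['x', 'a', 'n', 'a', 'x']
print('tylenol'[::-1])  # lonelyt

# ''.join() glues a list of characters back into one string
print(''.join(letters))  # xanax

# Sets are unordered, so they support neither indexing nor slicing
ids = {1, 2, 3}
try:
    ids[::-1]
except TypeError as err:
    print(f'TypeError: {err}')
```

This is why `palindrome()` converts each element of a set to a string before slicing it, rather than slicing the set itself.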

